OXFORD GRADUATE TEXTS IN MATHEMATICS
Series Editors
R. COHEN  S.K. DONALDSON  S. HILDEBRANDT  T.J. LYONS  M.J. TAYLOR
Books in the series
1. Keith Hannabuss: An introduction to quantum theory
2. Reinhold Meise and Dietmar Vogt: Introduction to functional analysis
3. James G. Oxley: Matroid theory
4. N.J. Hitchin, G.B. Segal, and R.S. Ward: Integrable systems: twistors, loop groups, and Riemann surfaces
5. Wulf Rossmann: Lie groups: An introduction through linear groups
6. Qing Liu: Algebraic geometry and arithmetic curves
7. Martin R. Bridson and Simon M. Salamon (eds): Invitations to geometry and topology
8. Shmuel Kantorovitz: Introduction to modern analysis
9. Terry Lawson: Topology: A geometric approach
10. Meinolf Geck: An introduction to algebraic geometry and algebraic groups
11. Alastair Fletcher and Vladimir Markovic: Quasiconformal maps and Teichmüller theory
12. Dominic Joyce: Riemannian holonomy groups and calibrated geometry
13. Fernando Villegas: Experimental Number Theory
14. Péter Medvegyev: Stochastic Integration Theory
Stochastic Integration Theory
Péter Medvegyev
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto
With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
Published in the United States by Oxford University Press Inc., New York
© Péter Medvegyev, 2007
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
First published 2007
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Biddles Ltd., King's Lynn, Norfolk
ISBN 978–0–19–921525–6
1 3 5 7 9 10 8 6 4 2
To the memory of my father
Contents

Preface

1 Stochastic processes
   1.1 Random functions
      1.1.1 Trajectories of stochastic processes
      1.1.2 Jumps of stochastic processes
      1.1.3 When are stochastic processes equal?
   1.2 Measurability of Stochastic Processes
      1.2.1 Filtration, adapted, and progressively measurable processes
      1.2.2 Stopping times
      1.2.3 Stopped variables, σ-algebras, and truncated processes
      1.2.4 Predictable processes
   1.3 Martingales
      1.3.1 Doob's inequalities
      1.3.2 The energy equality
      1.3.3 The quadratic variation of discrete time martingales
      1.3.4 The downcrossings inequality
      1.3.5 Regularization of martingales
      1.3.6 The Optional Sampling Theorem
      1.3.7 Application: elementary properties of Lévy processes
      1.3.8 Application: the first passage times of the Wiener processes
      1.3.9 Some remarks on the usual assumptions
   1.4 Localization
      1.4.1 Stability under truncation
      1.4.2 Local martingales
      1.4.3 Convergence of local martingales: uniform convergence on compacts in probability
      1.4.4 Locally bounded processes

2 Stochastic Integration with Locally Square-Integrable Martingales
   2.1 The Itô–Stieltjes Integrals
      2.1.1 Itô–Stieltjes integrals when the integrators have finite variation
      2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales
      2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales
      2.1.4 Properties of the Itô–Stieltjes integral
      2.1.5 The integral process
      2.1.6 Integration by parts and the existence of the quadratic variation
      2.1.7 The Kunita–Watanabe inequality
   2.2 The Quadratic Variation of Continuous Local Martingales
   2.3 Integration when Integrators are Continuous Semimartingales
      2.3.1 The space of square-integrable continuous local martingales
      2.3.2 Integration with respect to continuous local martingales
      2.3.3 Integration with respect to semimartingales
      2.3.4 The Dominated Convergence Theorem for stochastic integrals
      2.3.5 Stochastic integration and the Itô–Stieltjes integral
   2.4 Integration when Integrators are Locally Square-Integrable Martingales
      2.4.1 The quadratic variation of locally square-integrable martingales
      2.4.2 Integration when the integrators are locally square-integrable martingales
      2.4.3 Stochastic integration when the integrators are semimartingales

3 The Structure of Local Martingales
   3.1 Predictable Projection
      3.1.1 Predictable stopping times
      3.1.2 Decomposition of thin sets
      3.1.3 The extended conditional expectation
      3.1.4 Definition of the predictable projection
      3.1.5 The uniqueness of the predictable projection, the predictable section theorem
      3.1.6 Properties of the predictable projection
      3.1.7 Predictable projection of local martingales
      3.1.8 Existence of the predictable projection
   3.2 Predictable Compensators
      3.2.1 Predictable Radon–Nikodym Theorem
      3.2.2 Predictable Compensator of locally integrable processes
      3.2.3 Properties of the Predictable Compensator
   3.3 The Fundamental Theorem of Local Martingales
   3.4 Quadratic Variation

4 General Theory of Stochastic Integration
   4.1 Purely Discontinuous Local Martingales
      4.1.1 Orthogonality of local martingales
      4.1.2 Decomposition of local martingales
      4.1.3 Decomposition of semimartingales
   4.2 Purely Discontinuous Local Martingales and Compensated Jumps
      4.2.1 Construction of purely discontinuous local martingales
      4.2.2 Quadratic variation of purely discontinuous local martingales
   4.3 Stochastic Integration With Respect To Local Martingales
      4.3.1 Definition of stochastic integration
      4.3.2 Properties of stochastic integration
   4.4 Stochastic Integration With Respect To Semimartingales
      4.4.1 Integration with respect to special semimartingales
      4.4.2 Linearity of the stochastic integral
      4.4.3 The associativity rule
      4.4.4 Change of measure
   4.5 The Proof of Davis' Inequality
      4.5.1 Discrete-time Davis' inequality
      4.5.2 Burkholder's inequality

5 Some Other Theorems
   5.1 The Doob–Meyer Decomposition
      5.1.1 The proof of the theorem
      5.1.2 Dellacherie's formulas and the natural processes
      5.1.3 The sub-, super-, and quasi-martingales are semimartingales
   5.2 Semimartingales as Good Integrators
   5.3 Integration of Adapted Product Measurable Processes
   5.4 Theorem of Fubini for Stochastic Integrals
   5.5 Martingale Representation

6 Itô's Formula
   6.1 Itô's Formula for Continuous Semimartingales
   6.2 Some Applications of the Formula
      6.2.1 Zeros of Wiener processes
      6.2.2 Continuous Lévy processes
      6.2.3 Lévy's characterization of Wiener processes
      6.2.4 Integral representation theorems for Wiener processes
      6.2.5 Bessel processes
   6.3 Change of Measure for Continuous Semimartingales
      6.3.1 Locally absolutely continuous change of measure
      6.3.2 Semimartingales and change of measure
      6.3.3 Change of measure for continuous semimartingales
      6.3.4 Girsanov's formula for Wiener processes
      6.3.5 Kazamaki–Novikov criteria
   6.4 Itô's Formula for Non-Continuous Semimartingales
      6.4.1 Itô's formula for processes with finite variation
      6.4.2 The proof of Itô's formula
      6.4.3 Exponential semimartingales
   6.5 Itô's Formula For Convex Functions
      6.5.1 Derivative of convex functions
      6.5.2 Definition of local times
      6.5.3 Meyer–Itô formula
      6.5.4 Local times of continuous semimartingales
      6.5.5 Local time of Wiener processes
      6.5.6 Ray–Knight theorem
      6.5.7 Theorem of Dvoretzky, Erdős and Kakutani

7 Processes with Independent Increments
   7.1 Lévy processes
      7.1.1 Poisson processes
      7.1.2 Compound Poisson processes generated by the jumps
      7.1.3 Spectral measure of Lévy processes
      7.1.4 Decomposition of Lévy processes
      7.1.5 Lévy–Khintchine formula for Lévy processes
      7.1.6 Construction of Lévy processes
      7.1.7 Uniqueness of the representation
   7.2 Predictable Compensators of Random Measures
      7.2.1 Measurable random measures
      7.2.2 Existence of predictable compensator
   7.3 Characteristics of Semimartingales
   7.4 Lévy–Khintchine Formula for Semimartingales with Independent Increments
      7.4.1 Examples: probability of jumps of processes with independent increments
      7.4.2 Predictable cumulants
      7.4.3 Semimartingales with independent increments
      7.4.4 Characteristics of semimartingales with independent increments
      7.4.5 The proof of the formula
   7.5 Decomposition of Processes with Independent Increments

Appendix
   A Results from Measure Theory
      A.1 The Monotone Class Theorem
      A.2 Projection and the Measurable Selection Theorems
      A.3 Cramér's Theorem
      A.4 Interpretation of Stopped σ-algebras
   B Wiener Processes
      B.1 Basic Properties
      B.2 Existence of Wiener Processes
      B.3 Quadratic Variation of Wiener Processes
   C Poisson processes

Notes and Comments
References
Index
Preface

I started to write this book a few years ago mainly because I wanted to understand the theory of stochastic integration. Stochastic integration theory is a very popular topic. The main reason for this is that the theory provides the necessary mathematical background for derivative pricing theory. Of course, many books purport to explain the theory of stochastic integration. Most of them concentrate on the case of Brownian motion, and a few of them discuss the general case. Though the first type of book is quite readable, somehow they disguise the main ideas of the general theory. On the other hand, the books concentrating on the general theory were, for me, a bit sketchy. I very often had quite serious problems trying to decode what the ideas of the authors were, and it took me a long time, sometimes days and weeks, to understand some basic ideas of the theory. I was nearly always able to understand the main arguments but, looking back, I think some simple notes and hints could have made my suffering shorter.

The theory of stochastic integration is full of non-trivial technical details. Perhaps from a student's point of view the best way to study and to understand measure theory and the basic principles of modern mathematical analysis is to study probability theory. Unfortunately, this is not true for the general theory of stochastic integration. The reason for this is very simple: the general theory of stochastic integration contains too much measure theory. Perhaps the best way to understand the limits of measure theory is to study the general theory of stochastic integration. I think this beautiful theory pushes modern mathematics to its very limits. On the other hand, despite the many technical details there are just a few simple ideas which make up the backbone of stochastic analysis.

1. The first one is, of course, martingales and local martingales. The basic concept of stochastic analysis is random noise.
But what is the right mathematical model for random noise? Perhaps the most natural idea would be the random walk, that is, processes with stationary and independent increments: the so-called Lévy processes, with mean value zero. But, unfortunately, this class of processes has some very unpleasant properties. Perhaps the biggest problem is that the sum of two Lévy processes is not in general a Lévy process. Modern mathematics is very much built on the idea of linearity. If there is not some very fundamental and very clear reason for it, then every reasonable class of mathematical objects should be closed under linear combinations. The concept of random noise comes
very much from applications. One of the main goals of mathematics is to build safe theoretical tools and, like other scientific instruments, mathematical tools should be both simple and safe, similar to computer tools. Most computer users never read the footnotes in computer manuals; they just have a general feeling about the limits of the software. It is the responsibility of the writer of the software to make the software work in a plausible way. If the behaviour of the software is not reasonable, then its use becomes dangerous, e.g. you could easily lose your files, or delete or modify something and make the computer behave unpredictably, etc. Likewise, if an applied mathematical theory cannot guarantee that the basic objects of the theory behave reasonably, then the theory is badly written, and as one can easily make hidden errors in it, its usage is dangerous. In our case, if the theory cannot guarantee that the sum of two random noises is again a random noise, then the theory is very dangerous from the point of view of sound applications.

The main reason for introducing martingales is that from the intuitive point of view they are very close to the idea of a random walk, but if we fix the amount of observable information they form a linear space. The issue of local martingales is a bit more tricky. Of course, it is local martingales, and not just actual martingales, that form the class of random noise. Without doubt, local martingales make life for a stochastic analyst very difficult. From an intuitive, applied point of view, local martingales and martingales are very close, and that is why it is easy to make mistakes. Therefore, in most cases the mathematical proofs have to be very detailed and cautious. On the other hand, local martingales form a large and stable class, so the resulting theory is very stable and simple to use. As in elementary algebra, most of the problems come from the fact that one cannot divide by zero.
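How well behaved martingales are under stopping — the mechanism behind the localization discussed in this book — can be checked exactly in discrete time. The following sketch is an illustration only (the walk, the barrier values, and the function name are choices of this note, not of the text): a simple symmetric random walk stopped on hitting one of two barriers stays a martingale, so its mean stays at zero at every step. Exact rational arithmetic makes the check free of rounding.

```python
from fractions import Fraction

def stopped_walk_mean(a, b, n_steps):
    """Exact mean of a simple symmetric random walk started at 0 and
    stopped on hitting -a or b.  The full distribution over positions is
    propagated exactly; mass absorbed at a barrier stays there.  By the
    optional stopping property the mean remains 0 at every step."""
    dist = {0: Fraction(1)}
    for _ in range(n_steps):
        nxt = {}
        for pos, p in dist.items():
            if pos in (-a, b):                  # already stopped: frozen
                nxt[pos] = nxt.get(pos, Fraction(0)) + p
            else:                               # still running: +/-1, prob 1/2 each
                for step in (-1, 1):
                    q = pos + step
                    nxt[q] = nxt.get(q, Fraction(0)) + p / 2
        dist = nxt
    return sum(pos * p for pos, p in dist.items())

print(stopped_walk_mean(2, 3, 50))  # -> 0, exactly
```

Each branching step replaces a position by the average of its two successors, so the mean is preserved exactly even though the barriers −a and b are asymmetric.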
In stochastic analysis most of the problems come from the fact that not every local martingale is a martingale, and therefore one can take expected values only with care. Is there some intuitive idea why one should introduce local martingales? Perhaps, yes. First of all one should realize that not really martingales, but uniformly integrable martingales, are the objects of the theory. If we observe a martingale up to a fixed, finite moment of time we get a uniformly integrable martingale, but most of the natural moments of time are special random variables. The measurement of the time-line is, in some sense, very arbitrary. Traditionally we measure it with respect to some physical, astronomical movements. For some processes this coordinate system is rather arbitrary. It is more natural, for example, to say 'after lunch I called my friend' than to say 'I called my friend at twenty-three past one, and sometimes at twenty-two past one, depending on the amount of food my wife gave me'. Of course the moment of time after lunch is a random variable with respect to the coordinate system generated by the relative position of the earth and the sun, but as a basis for observing my general habits this random time, 'after lunch', is the natural point of orientation. So, in some ways, it is very natural to say that a process is a random noise if one can define a sequence of random moments, so-called stopping times, τ_0 < τ_1 < …, such that if we observe the random noise up to τ_k the truncated processes are uniformly integrable martingales, which is exactly the definition of local martingales. The idea that local martingales are the good
mathematical models for random noise comes from the fact that sometimes we want to perturb the measurement of the time-line in an order-preserving way, and we want the class of 'random noise processes' to be invariant under these transformations.

2. The second most important concept is quadratic variation. One can think of stochastic analysis as the mathematical theory of quadratic variation. In classical analysis one can define an integral only when the integrator has bounded variation. Even in this case, one can define two different concepts of integration. One is the Lebesgue–Stieltjes type of integration and the other is the Riemann–Stieltjes concept of integration. If the integrand is continuous, then the two concepts agree. It is easy to see that if the integrand is left-continuous, and in the Riemann–Stieltjes type integrals one may choose only the starting point of the sub-intervals of the partitions as test-point, then these approximating sums of Riemann–Stieltjes type will converge and the limits are equal to the Lebesgue–Stieltjes integrals. One may ask whether one can extend this trick to some more general class of integrators. The answer is yes. It turns out that the same concept works if the integrators are local martingales. There is just one new element: the convergence of the approximating sums holds only in probability. If the integrators are local martingales, or if they have finite variation, then for this integral the so-called integration by parts formula is valid. In this formula, the most notable factor is the quadratic co-variation [X, Y](t). If, for example, X is continuous and Y has finite variation then [X, Y](t) = 0, but in general [X, Y](t) ≠ 0. As the stochastic integrals are defined only by convergence in probability, the random variable [X, Y](t) is defined only up to a measure-zero set. This implies that the trajectories of the process t → [X, Y](t) are undefined.
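The left-endpoint approximating sums just described can be tried numerically. The sketch below (all parameters and names are illustrative choices, not from the text) simulates one Wiener trajectory and forms the left-endpoint Riemann–Stieltjes sum of ∫ W dW. Two facts are visible: on any path the sum equals (W(T)² − Σ(ΔW)²)/2 by pure telescoping algebra, and the discrete quadratic variation Σ(ΔW)² is close to T when the mesh is fine — which is exactly why the Itô integral of W dW is (W(T)² − T)/2 rather than the classical W(T)²/2.

```python
import random

random.seed(0)
T, n = 1.0, 100_000
dt = T / n

# one simulated Wiener trajectory on [0, T]
dW = [random.gauss(0.0, dt ** 0.5) for _ in range(n)]
W = [0.0]
for d in dW:
    W.append(W[-1] + d)

# left-endpoint (Ito-type) Riemann-Stieltjes sum of the integral of W dW
left_sum = sum(W[i] * dW[i] for i in range(n))

# discrete quadratic variation: close to T for a fine mesh
qv = sum(d * d for d in dW)

# pathwise algebraic identity: left_sum = (W(T)^2 - qv) / 2
print(abs(left_sum - 0.5 * (W[-1] ** 2 - qv)) < 1e-6)  # True
print(abs(qv - T) < 0.05)                              # True
```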
One can exert quite a lot of effort to show that there is a right-continuous process with limits from the left, denoted by [X, Y], such that for every t the value of [X, Y] at time t is a version of the random variable [X, Y](t). The key observations in the proof of this famous theorem are that XY − [X, Y] is a local martingale, that [X, Y] is the only process for which this property holds, and that the jump process of [X, Y] is ΔXΔY. The integration by parts formula is the prototype of Itô's formula, which is the main analytical tool of stochastic analysis. Perhaps it is not without interest to emphasize that the main difficulty in proving this famous formula, in the general case of discontinuous processes, is to establish the existence of the quadratic variation. It is worth mentioning that it is relatively easy to show the existence of the quadratic variation for the so-called locally square-integrable martingales. It is nearly trivial to show the existence of the quadratic variation when the trajectories of the process have finite variation. Hence, it is not so difficult to prove the existence of [X] = [X, X] if the process X has a decomposition X = V + H where the trajectories of V have finite variation and H is a so-called locally square-integrable martingale. The main problem is that we do not know that every local martingale has this decomposition! To
prove that this decomposition exists one should show the Fundamental Theorem of Local Martingales, which is perhaps the most demanding result of the theory.

3. The third most important concept of the theory is predictability. There are many interrelated objects in the theory modified by the adjective predictable. Perhaps the simplest and most intuitive one is the concept of a predictable stopping time. Stopping times describe the occurrence of random events. The occurrence of a random event is predictable if there is a sequence of other events which announces the predictable event. That is, a stopping time τ is predictable if there is a sequence of stopping times (τ_n) with τ_n ↑ τ and τ_n < τ whenever τ > 0. This definition is very intuitive and appealing. If τ is a predictable stopping time, then one can say that the event [τ, ∞) = {(t, ω) : τ(ω) ≤ t} ⊆ R+ × Ω is also predictable. The σ-algebra generated by this type of predictable random interval is called the σ-algebra of predictable events. One should agree that this definition of predictability is in some sense very close to the intuitive idea of predictability. Quite naturally, a stochastic process is called predictable if it is measurable with respect to the σ-algebra of the predictable events. It is an important and often useful observation that the σ-algebra of predictable events is the same as the σ-algebra generated by the left-continuous adapted processes. Recall that a process is called adapted when its value at every moment of time is measurable with respect to the σ-algebra representing the amount of information available at that time. The values of left-continuous processes are at least infinitesimally predictable. One of the most surprising facts of stochastic integration theory is that in the general case the integrands of stochastic integrals should be predictable. Although it looks like a very deep mathematical observation, one should also admit that this is a very natural result.
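For reference, the definitions in this paragraph can be set out in display form (a restatement of the text only; the symbol 𝒫 for the predictable σ-algebra is the usual notation, not introduced here by the book):

```latex
% tau is a predictable stopping time when it is announced by a sequence
% of stopping times:
\tau_n \uparrow \tau, \qquad \tau_n < \tau \ \text{on } \{\tau > 0\}.

% the predictable random interval generated by tau:
[\tau, \infty) = \{(t, \omega) \in \mathbb{R}_+ \times \Omega : \tau(\omega) \le t\}.

% the sigma-algebra of predictable events, equivalently generated by the
% left-continuous adapted processes:
\mathcal{P} = \sigma\bigl( [\tau, \infty) : \tau \text{ predictable} \bigr)
            = \sigma\bigl( X : X \text{ adapted and left-continuous} \bigr).
```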
The best interpretation of stochastic integrals is that they are the net results of continuous-time trading or gaming processes. Everybody knows that in a casino one should play a trading strategy only if one decides about the stakes before the random events generating the gains occur. This means that the playing strategy should be predictable.

An important concept related to predictability is the concept of the predictable compensator. If one has a risky stochastic process X, one can ask whether there is a compensator P for the risk of the process X. The compensator should be 'simpler' than the process itself. Generally it is assumed that P is monotone or at least has finite variation. The compensator P should be predictable, and one should assume that X − P is a totally random process, that is, X − P is a local martingale. This is of course a very general setup, but it appears in most of the applications of stochastic analysis. For a process X there are many compensators, that is, there are many processes Y such that X − Y is a local martingale. Perhaps the simplest one is X itself. But it is very important that the predictable
compensator of X, if it exists and if it has finite variation, is in fact unique. The reason for this is that every predictable local martingale is continuous, and if the trajectories of a continuous local martingale have finite variation then the local martingale is constant.

4. Stochastic integration theory is built on probability theory. Therefore every object of the theory is well-defined only almost surely, and this means that stochastic integrals are also defined only almost surely. In classical integration theory, one first defines the integral over some fixed set and then defines the integral function. In stochastic integration theory this approach does not work, as it is entirely non-trivial how one can construct the integral process from the almost surely defined separate integrals. Therefore, in stochastic integration theory one immediately defines the integral processes, so stochastic integrals are processes and not random variables.

5. There are basically two types of local martingales: continuous and purely discontinuous ones. The canonical examples of continuous local martingales are the Wiener processes, and the simplest purely discontinuous local martingales are the compensated Poisson processes. Every local martingale which has trajectories with finite variation is purely discontinuous, but there are purely discontinuous local martingales with infinite variation. Every local martingale has a unique decomposition L = L(0) + L^c + L^d, where L^c is a continuous local martingale and L^d is a purely discontinuous local martingale. A very important property of purely discontinuous local martingales is that they are sums of their continuously compensated single jumps. A process S_i is, by definition, a single jump if there is a stopping time τ such that every trajectory of S_i is constant before and after the random jump-time τ.
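The compensated Poisson process just mentioned is also the simplest worked example of a predictable compensator: for a Poisson process N with intensity λ, the compensator is the deterministic (hence predictable, finite-variation) process λt, and N(t) − λt is a martingale. A seeded simulation sketch (intensity, horizon, and sample count are illustrative choices of this note):

```python
import random

def poisson_count(lam, T, rng):
    """Number of arrivals of a Poisson(lam) process in [0, T],
    simulated via i.i.d. exponential inter-arrival gaps."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(lam)
        if t > T:
            return count
        count += 1

lam, T, n_paths = 1.5, 2.0, 20_000
rng = random.Random(42)

# N(T) - lam*T is the compensated process; the compensator lam*t
# removes the drift, so the sample mean should be near 0
samples = [poisson_count(lam, T, rng) - lam * T for _ in range(n_paths)]
mean_compensated = sum(samples) / n_paths
print(abs(mean_compensated) < 0.1)  # True: mean is statistically zero
```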
The single jumps obviously have trajectories with finite variation, and as the compensators P_i, by definition, also have finite variation, the compensated single jumps L_i = S_i − P_i also have trajectories with finite variation. Of course this does not imply that the trajectories of L, as infinite sums, should also have finite variation. If L is a purely discontinuous local martingale and L = Σ_i L_i, where the L_i are continuously compensated single jumps, then one can think about the stochastic integral with respect to L as the sum of the stochastic integrals with respect to the L_i. Every L_i has finite variation so, in this case, the stochastic integral, as a pathwise integral, is well-defined, and if the integrand is predictable then the integral is a local martingale. Of course one should restrict the class of integrands, as one has to guarantee the convergence of the sum of the already defined integrals. If the integrand is predictable then the stochastic integral with respect to a purely discontinuous local martingale is a sum of local martingales. Therefore it is also a local martingale.

6. The stochastic integral with respect to continuous local martingales is a bit more tricky. The fundamental property of stochastic integrals with respect to local martingales is that the resulting process is also a local martingale. The intuition behind this observation is that the basic interpretation of stochastic integration is that it is the cumulative gain of an investment process in a randomly changing price process. Every moment of time we decide about the size
of our investment, this is the integrand, and our short-term gains are the product of our investment and the change of the random price-integrator. Our total gain is the sum of the short-term gains. If we can choose our strategy only in a predictable way, it is quite natural to assume that our cumulative gain process will also be totally random. That is, if the investment strategy is predictable and the random integrator price process is a local martingale, then the net, cumulative gain process is also a local martingale. What is the quadratic variation of the resulting gain process? If H • L denotes the integral of H with respect to the local martingale L, then one should guarantee the very natural identity [H • L] = H² • [L], where the right-hand side H² • [L] denotes the classical pathwise integral of H² with respect to the increasing process [L]. The identity is really very natural, as [L] describes the 'volatility' of L along the time-line, and if at every moment of time we hold H pieces of L then our short-term change in 'volatility' will be (HΔL)² ≈ H²·Δ[L]. So our aggregated 'volatility' is Σ H²Δ[L] ≈ H² • [L]. It is a very nice observation that there is just one continuous local martingale, denoted by H • L, for which [H • L, N] = H • [L, N] holds for every continuous local martingale N. The stochastic integral with respect to a local martingale L is the sum of two integrals: the integral H • L^c with respect to the continuous part and the integral H • L^d with respect to the purely discontinuous part of L.

7. As there are local martingales which have finite variation, one can ask whether the new and the classical definitions are the same or not. The answer is that if the integrand is predictable the two concepts of integration coincide. This allows us to further generalize the concept of stochastic integration.
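In discrete time the identity [H • L] = H² • [L] is pure algebra, term by term: (H_i ΔL_i)² = H_i²(ΔL_i)². A minimal check with arbitrary numbers (all values below are illustrative):

```python
# Discrete-time version of [H . L] = H^2 . [L]: for the gain process
# G_n = sum_i H_i dL_i, the quadratic variation sum_i (dG_i)^2 equals
# sum_i H_i^2 (dL_i)^2 exactly, path by path.
H  = [0.5, -1.0, 2.0, 0.0, 1.5]    # arbitrary (predictable) stakes
dL = [1.0, -2.0, 0.5, 3.0, -1.0]   # arbitrary increments of the integrator

dG  = [h * d for h, d in zip(H, dL)]              # increments of the gain H . L
lhs = sum(x * x for x in dG)                      # [H . L]
rhs = sum(h * h * d * d for h, d in zip(H, dL))   # H^2 . [L]
print(lhs == rhs)  # True: the identity is algebraic in discrete time
```

The continuous-time statement is then a limit of exactly such sums, which is why it deserves the name 'very natural'.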
We say that a process S is a semimartingale if S = L + V, where L is a local martingale and V is adapted and has finite variation. One can define the integral with respect to S as the sum of the integrals with respect to L and with respect to V. A fundamental problem is that in the discontinuous case, as we have local martingales with finite variation, the decomposition is not unique. But as for processes with finite variation the two concepts of integration coincide, this definition of the stochastic integral with respect to semimartingales is well-defined.

In the first chapter of the book we introduce the basic definitions and some of the elementary theorems of martingale theory. In the second chapter we give an elementary introduction to stochastic integration theory. Our introduction is built on the concept of Itô–Stieltjes integration. In the third chapter we shall discuss the structure of local martingales and in Chapter Four we shall discuss the general theory of stochastic integration. In Chapter Six we prove Itô's formula. In Chapter Seven we apply the general theory to the classical theory of processes with independent increments.

Finally, it is a pleasure to thank those who have helped me to write this book. In particular I would like to thank Tamás Badics from the University of Pannonia and Petrus Potgieter from the University of South Africa for their efforts. They read most of the book, and without their help perhaps I would not have been able
to finish the book. I wish to thank István Dancs and János Száz from Corvinus University for support and help. I would like to express my gratitude to the Magyar Külkereskedelmi Bank for their support.

Budapest, 2006
[email protected] medvegyev.uni-corvinus.hu
1 STOCHASTIC PROCESSES
In this chapter we first discuss the basic definitions of the theory of stochastic processes. Then we discuss the simplest properties of martingales, the Martingale Convergence Theorem and the Optional Sampling Theorem. In the last section of the chapter we introduce the concept of localization.
1.1 Random functions
Let us fix a probability space (Ω, A, P). As in probability theory, we refer to the set of real-valued (Ω, A)-measurable functions as random variables. We assume that the space (Ω, A, P) is complete, that is, all subsets of measure-zero sets are also measurable. This assumption is not a serious restriction, but it is a bit surprising that we need it. We shall need this assumption many times, for example when we prove that the hitting times¹ of Borel measurable sets are stopping times². When we prove this we shall use the so-called Projection Theorem³, which is valid only when the space (Ω, A, P) is complete. We shall also use the Measurable Selection Theorem⁴ several times, which is again valid only when the measure space is complete. Let us remark that all applications of the completeness assumption are connected to the Predictable Projection Theorem, which is the main tool in the discussion of discontinuous semimartingales.

In the theory of stochastic processes, random variables very often have infinite value. Hence the image space of the measurable functions is not R but the set of extended real numbers R̄ = [−∞, ∞]. The most important examples of random variables with infinite value are stopping times. Stopping times give the random time of the occurrence of observable events. If for a certain outcome ω the event never occurs, it is reasonable to say that the value of the stopping time for this ω is +∞.

1 See: Definition 1.26, page 15.
2 See: Definition 1.21, page 13.
3 See: Theorem A.12, page 550.
4 See: Theorem A.13, page 551.
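A hypothetical helper (the function name and its arguments are illustrative, not from the text) makes the +∞ convention concrete: the first-passage time of a discrete path over a level returns the first index at which the level is reached, and +∞ when it never is.

```python
import math

def first_passage(path, level):
    """Index of the first time the (discrete) path reaches `level`;
    math.inf if it never does -- the stopping-time convention
    tau(omega) = +infinity when the observed event never occurs."""
    for t, x in enumerate(path):
        if x >= level:
            return t
    return math.inf

print(first_passage([0, 1, 3, 2], 2))  # -> 2: value 3 >= 2 at index 2
print(first_passage([0, -1, -2], 2))   # -> inf: the level is never reached
```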
1.1.1 Trajectories of stochastic processes
In the most general sense a stochastic process is a function X(t, ω) such that for any fixed parameter t the mapping ω → X(t, ω) is a random variable on (Ω, A, P). The set of possible time parameters Θ is some subset of the extended real numbers. In the theory of continuous-time stochastic processes Θ is an interval, generally Θ = R+ := [0, ∞), but sometimes Θ = [0, ∞] or Θ = (0, ∞) is also possible. If we do not say explicitly what the domain of definition of the stochastic process is, then Θ is R+. It is very important to append some remarks to this definition. In probability theory the random variables are equivalence classes, which means that the random variables X(t) are defined only up to measure zero sets. This means that in general X(t, ω) is meaningless for a fixed ω. If the possible values of the time parameter t are countable then we can select one element from each equivalence class X(t) and fix a measure zero set, and outside of this set the expressions X(t, ω) are meaningful. But this is impossible if Θ is not countable5. Therefore, we shall always assume that X(t) is a function already carefully selected from its equivalence class. To put it another way: when one defines a stochastic process, one should fix the space of possible trajectories, and the stochastic processes are function-valued random variables defined on the space (Ω, A, P).

Definition 1.1 Let us fix the probability space (Ω, A, P) and the set of possible time parameters6 Θ. The function X defined on Θ × Ω is a stochastic process over Θ × Ω if for every t ∈ Θ it is measurable on (Ω, A, P) in its second variable.

Definition 1.2 If we fix an outcome ω ∈ Ω then the function t → X(t, ω) defined over Θ is the trajectory or realization of X corresponding to the outcome ω. If all7 the trajectories of the process X have a certain property then we say that the process itself has this property.
For example, if all the trajectories of X are continuous then we say that X is continuous, if all the trajectories of X have finite variation then we say that X has finite variation, etc. Recall that in probability theory the role of the space (Ω, A, P) is a bit problematic. All the relevant questions of probability theory are related to the joint distributions of random variables and the whole theory is independent of the specific space carrying the random variables having these joint distributions. 5 This is what the author prefers to call the revenge of the zero sets. This is very serious and it will make our life quite difficult. The routine solution to this challenge is that all the processes which we are going to discuss have some sort of continuity property. In fact, we shall nearly always assume that the trajectories of the stochastic processes are regular, that is at every point all the trajectories have limits from both sides and they are either right- or left-continuous. As we want to guarantee that the martingales have proper trajectories we shall need the so-called usual assumptions. 6 In most of the applications Θ is the time parameter. Sometimes the natural interpretation of Θ is not the time but some spatial parameter. See: Example 1.126, page 90. In continuous ‘time’ theory of stochastic processes Θ is an interval in the half-line R+ . 7 Not almost all trajectories. See: Definition 1.8, page 6, Example 1.11, page 8.
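To make the two-variable picture of Definitions 1.1 and 1.2 concrete, here is a small Python sketch (our illustration, not part of the text; all names are invented). A simulated symmetric random walk is stored over a finite grid of times and outcomes, from which both "facets" can be read off: fixing ω gives a trajectory, fixing t gives a random variable.

```python
import random

# A finite sketch of Definitions 1.1-1.2 (our toy model): a simulated symmetric
# random walk X stored as a function of the pair (t, omega).
def make_process(n_paths, n_times, seed=0):
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        x, path = 0, [0]
        for _ in range(n_times - 1):
            x += rng.choice((-1, 1))   # one increment of the walk
            path.append(x)
        paths.append(path)
    return paths

X = make_process(n_paths=3, n_times=5)
trajectory = X[0]                      # t -> X(t, omega): one realization
variable_at_2 = [p[2] for p in X]      # omega -> X(2, omega): a random variable
assert trajectory[0] == 0 and len(variable_at_2) == 3
```

In this discrete picture the "revenge of the zero sets" of footnote 5 does not arise: with finitely many time points every version of the process can be evaluated pointwise.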
Of course it is not sufficient to define the distributions alone. For instance, it is very important to clarify the relation between the lognormal and the normal distribution, and we can do this only when we refer directly to random variables. Hence, somehow, we should assume that there is a measure space carrying the random variables with the given distributions: if ξ has normal distribution then exp(ξ) has lognormal distribution. This is a very simple and very important relation which is not directly evident from the density functions. The existence of a space (Ω, A, P) enables us to use the power of measure theory in probability theory, but the specific structure of (Ω, A, P) is highly irrelevant. The space (Ω, A, P) contains the ‘causes’, but we see only the ‘consequences’. We never observe the outcome ω. We can see only its consequence ξ(ω). As the space (Ω, A, P) is irrelevant, one can define it in a ‘canonical way’. In probability theory, generally, Ω := R, A := B(R) and P is the measure generated by the distribution function of ξ, or in the multidimensional case Ω := Rn and A := B(Rn). In both cases Ω is the space of all possible realizations. Similarly, in the theory of stochastic processes the only entities which one can observe are the trajectories. Sometimes it is convenient if Ω is the space of possible trajectories. In this case we say that Ω is given in its canonical form. It is worth emphasizing that in probability theory there is no advantage at all in using any specific representation. In the theory of stochastic processes the relevant questions are related to time, and all the information about the time should be somehow coded in Ω. Hence it is very plausible to assume that the elements of Ω are not just abstract objects which somehow describe the information about the timing of certain events, but are also functions over the set of possible time values.
That is, in the theory of stochastic processes, the canonical model is not just one of the possible representations: it is very often the right model to discuss certain problems.

1.1.2 Jumps of stochastic processes
Of course, the theory of stochastic processes is an application of mathematical analysis. Hence the basic mathematical tool of the theory of stochastic processes is measure theory. To put it another way, perhaps one of the most powerful applications of measure theory is the theory of stochastic processes. But measure theory is deeply sequential, related on a fundamental level to countable objects. We can apply measure theory to continuous-time stochastic processes only if we restrict the trajectories of the stochastic processes to ‘countably determined functions’.

Definition 1.3 Let I ⊆ R be an interval and let Y be an arbitrary topological space. We say that the function f : I → Y is regular if at any point t ∈ I, where it is meaningful, f has left-limits

f(t−) := f−(t) := lim_{s↑t} f(s) ∈ Y
and right-limits

f(t+) := f+(t) := lim_{s↓t} f(s) ∈ Y.
We say that f is right-regular if it is regular and right-continuous. We say that f is left-regular if it is regular and left-continuous. If f is a real-valued function, that is, if Y = R in the above definition, then the existence of limits means that the function has finite limits. As, in this book, stochastic processes are mainly real-valued, to make the terminology as simple as possible we shall always assume that regular processes have finite limits. If the process X is regular and if t is an interior point of Θ then, as the limits are finite, it is meaningful to define the jump

∆X(t) := X(t+) − X(t−)

of X at t. It is not too important, but a bit confusing, that somehow one should fix the definition of jumps of the regular processes at the endpoints of the time interval Θ. If Θ = R+ then what is the jump of the function χΘ at t = 0? Is it zero or one?

Definition 1.4 We do not know anything about X before t = 0, so by definition we shall assume that X(0−) := X(0). Therefore for any right-regular process on R+

∆X(0) := X(0+) − X(0−) = 0.    (1.1)

In a similar way, if, for example, Θ := [0, 1) and X := χΘ, then X is right-regular and does not have a jump at t = 1. Observe that in both examples the trajectories were continuous functions on Θ, so it is a bit strange to say that the jump process of a continuous process is not zero8. It is not entirely irrelevant how we define the jump process at t = 0. If we consider the process F := χ_{R+} as a distribution function of a measure, then how much is the integral ∫_{[0,1]} 1 dF? We shall assume that the distribution functions are right-regular and not left-regular. By definition9 ∫_0^1 1 dF is the integral over (0, 1], and as F is right-regular the measure of (0, 1] is F(1) − F(0) = 0, so ∫_0^1 1 dF = 0. According to our convention one can think that

∫_{[0,1]} 1 dF = F(1) − F(0−) = F(1) − F(0) = 1 − 1 = 0.

On the other hand one can correctly argue that

∫_{[0,1]} 1 dF = ∫_R χ([0, 1]) dF = 1.

8 One can take another approach. In general: what is the value of an undefined variable? If X is the value process of a game and τ is some exit strategy, then what is the value of the game if we never exit from the game, that is if τ = ∞? It is quite reasonable to say that in this case the value of the game is zero. Starting from this example one can say that once a variable is undefined we shall assume that its value is zero. If one uses this approach then X(0−) := 0 and ∆X(0) = X(0+).
9 In measure theory one can very often find the convention ∫_a^b f dµ := ∫_{[a,b)} f dµ. We shall assume that the integrator processes are right- and not left-continuous, so we shall use the convention ∫_a^b f dµ := ∫_{(a,b]} f dµ.
To avoid these types of problems we shall never include the set {t = 0} in the domain of integration. The regular functions have many interesting properties. We shall very often use the next propositions:

Proposition 1.5 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. For any c > 0 the number of jumps in [a, b] bigger in absolute value than c is finite. The number of jumps of f is at most countable.

Proof. The second part of the proposition is an easy consequence of the first part. Assume that there is an infinite number of points (tn) in [a, b] for which |∆f(tn)| ≥ c. As [a, b] is compact, one can assume that tn → t∗. Obviously we can assume that tn ≤ t∗ or t∗ ≤ tn for an infinite number of points. Hence we can assume that tn ↑ t∗. But f has a left-limit at t∗, so if x, y < t∗ are close enough to t∗ then |f(x) − f(y)| ≤ c/4. If tn is close enough to t∗ and x < tn < y are close enough to tn and to t∗ then

c ≤ |f(tn+) − f(tn−)| ≤ |f(tn+) − f(y)| + |f(y) − f(x)| + |f(x) − f(tn−)| ≤ (3/4)c,

which is impossible.

Proposition 1.6 If a function f is real-valued and regular then it is bounded on any compact interval.

Proof. Fix a finite closed interval [a, b]. If f were not bounded on [a, b] then there would be a sequence (tn) for which |f(tn)| ≥ n. As [a, b] is compact one could assume that tn → t∗. We could also assume that e.g. tn ↑ t∗, and therefore f(tn) → f(t∗−) ∈ R, which is impossible.
Proposition 1.7 Let f be a real-valued regular function defined on a finite and closed interval [a, b]. If the jumps of f are smaller than c then for any ε > 0 there is a δ such that

|f(t′) − f(t″)| < c + ε whenever |t′ − t″| ≤ δ.

Proof. If such a δ were not available then for some δn ↓ 0 for all n there would be t′n, t″n such that |t′n − t″n| ≤ δn and

|f(t′n) − f(t″n)| ≥ c + ε.    (1.2)

As [a, b] is compact, one could assume that t′n → t∗ and t″n → t∗ for some t∗. Notice that, except for a finite number of indexes, (t′n) and (t″n) are on different sides of t∗, since if, for instance, for an infinite number of indexes t′n, t″n ≥ t∗, then for some subsequences t′nk ↓ t∗ and t″nk ↓ t∗, and as f is regular lim_{k→∞} f(t′nk) = lim_{k→∞} f(t″nk), which contradicts (1.2). So we can assume that t′n ↑ t∗ and t″n ↓ t∗. Using again the regularity of f, one has |∆f(t∗)| ≥ c + ε, which contradicts the assumption |∆f| ≤ c.

1.1.3 When are stochastic processes equal?
A stochastic process X has three natural ‘facets’. The first one is the process itself, which is the two-dimensional ‘view’. We shall refer to this as X(t, ω) or just as X. With the first notation we want to emphasize that X is a function of two variables. For instance, the different concepts of measurability, like predictability or progressive measurability, characterize X as a function of two variables. We shall often use the notation X(t), or sometimes Xt, which denotes the random variable ω → X(t, ω), that is, the random variable belonging to the moment t. Similarly we shall use the symbols X(ω), or Xω as well, which refer to the trajectory belonging to ω, that is, X(ω) is the ‘facet’ t → X(t, ω) of X.

Definition 1.8 Let X and Y be two stochastic processes on the probability space (Ω, A, P).
1. The process X is a modification of the process Y if for all t ∈ Θ the variables X(t) and Y(t) are almost surely equal, that is, for all t ∈ Θ

P(X(t) = Y(t)) := P({ω : X(t, ω) = Y(t, ω)}) = 1.

By this definition, the set of outcomes ω where X(t, ω) ≠ Y(t, ω) can depend on t ∈ Θ.
2. The processes X and Y are indistinguishable if there is a set N ⊆ Ω which has probability zero, and whenever ω ∉ N then X(ω) = Y(ω), that is, X(t, ω) = Y(t, ω) for all t ∈ Θ and ω ∉ N.
Proposition 1.9 Assume that the realizations of X and Y are almost surely continuous from the left, or they are almost surely continuous from the right. If X is a modification of Y then X and Y are indistinguishable.

Proof. Let N0 be the set of outcomes where X and Y are not left-continuous or right-continuous. Let (rk) be the set of rational points10 in Θ and let

Nk := {X(rk) ≠ Y(rk)} := {ω : X(rk, ω) ≠ Y(rk, ω)}.

X is a modification of Y, hence P(Nk) = 0 for all k. Therefore if N := ∪_{k=0}^∞ Nk then P(N) = 0. If ω ∉ N then X(rk, ω) = Y(rk, ω) for all k, hence as the trajectories X(ω) and Y(ω) are continuous from the same side, X(t, ω) = Y(t, ω) for all t ∈ Θ. Therefore outside N obviously X(ω) = Y(ω), that is, X and Y are indistinguishable.

Example 1.10 With modification one can change the topological properties of trajectories.
In the definition of stochastic processes one should always fix the analytic properties, like continuity, regularity, differentiability etc., of the trajectories. It is not a great surprise that with modification one can dramatically change these properties. For example, let (Ω, A, P) := ([0, 1], B, λ) and Y(t, ω) ≡ 0. The trajectories of Y are continuous. If χQ is the characteristic function of the rational numbers and X(t, ω) := χQ(t + ω), then for all ω the trajectories of X are nowhere continuous, but X is a modification of Y. From the example it is also obvious that it is possible for X to be a modification of Y but for X and Y not to be indistinguishable. If X and Y are stochastic processes then, unless we explicitly say otherwise, X = Y means that X and Y are indistinguishable.
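The gap between modification and indistinguishability can also be seen numerically. Below is a hedged sketch of a standard companion construction (ours, not the χQ example of the text): on ([0, 1], B, λ) take Y ≡ 0 and X(t, ω) := χ_{t=ω}. For each fixed t the exceptional set is the single point {t}, a null set, so X is a modification of Y; yet every trajectory of X differs from the zero trajectory somewhere.

```python
import random

# A standard companion to Example 1.10 (our construction, not the text's):
# on (Omega, A, P) = ([0, 1], B, lambda) take Y identically 0 and
# X(t, omega) = 1 if t == omega and 0 otherwise.
def X(t, omega):
    return 1.0 if t == omega else 0.0

t = 0.5
rng = random.Random(42)
# the event {omega = t} has probability zero; discard it if it ever occurs
omegas = [w for w in (rng.random() for _ in range(1000)) if w != t]

# For the fixed t, X(t) = Y(t) = 0 almost surely, so X is a modification of Y:
assert all(X(t, w) == 0.0 for w in omegas)
# Yet every trajectory of X differs from the zero trajectory at the point t = omega,
# so X and Y are not indistinguishable:
assert all(X(w, w) == 1.0 for w in omegas)
```

The point of the simulation is exactly the order of quantifiers in Definition 1.8: the null set where X(t) ≠ Y(t) moves with t, and its union over the uncountably many time points covers all of Ω.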
1.2 Measurability of Stochastic Processes
As we have already mentioned, the theory of stochastic processes is an application of measure theory. On the one hand this remark is almost unnecessary, as measure theory is the cornerstone of every serious application of mathematical analysis. On the other hand it is absolutely critical how one defines the class of

10 Recall that Θ is an interval in R. If X and Y are left-continuous then left-continuity is meaningless at the left endpoint of Θ, so if Θ has a left endpoint then we assume that this left endpoint is part of (rk). Similarly, when X and Y are right-continuous and Θ has a right endpoint then we assume that this endpoint is in (rk).
measurable functions which one can use in stochastic analysis. Every stochastic process is a function of two variables, so it is natural to assume that every process is product measurable.

Example 1.11 An almost surely continuous process is not necessarily product measurable.
Let (Ω, A, P) := ([0, 1], B, λ) and let E be a subset of [0, 1] which is not Lebesgue measurable. The process

X(t, ω) := 0 if ω ≠ 0,    X(t, ω) := χE(t) if ω = 0

is almost surely continuous. X is not product measurable, as by Fubini’s theorem product measurability implies partial measurability, but if ω = 0 then t → X(t, ω) is not measurable. Although the example is trivial it is not without interest. Processes X and Y are considered to be equal if they are indistinguishable. So in theory it can happen that X is product measurable and X = Y but Y is not product measurable. To avoid these types of measurability problems we should, for example, assume that the different objects of stochastic analysis, like martingales, local martingales, semimartingales etc., are right-regular and not just almost surely right-regular. Every trajectory of a Wiener process should be continuous, but it can happen that it starts only almost surely from zero.

1.2.1 Filtration, adapted, and progressively measurable processes
A fundamental property of time is its ‘irreversibility’. This property of time is expressed with the introduction of the filtration.

Definition 1.12 Let us fix a probability space (Ω, A, P). For every t ∈ Θ let us select a σ-algebra Ft ⊆ A in such a way that whenever s < t then Fs ⊆ Ft. The correspondence t → Ft is called a filtration and we shall denote this correspondence by F. The quadruplet (Ω, A, P, F) is called a stochastic basis.

With the filtration F one can define the σ-algebras

Ft+ := ∩_{s>t} Fs,
Ft− := σ(∪_{s<t} Fs).11

Let w be a Wiener process, let F^w denote the filtration generated by w, and let F be the event that there is an ε > 0 such that on the interval [0, ε] the trajectory w(ω) is zero. Obviously F = ∪n Fn, where Fn is the set of outcomes ω for which w(ω) is zero on the interval [0, 1/n]. Fn is measurable, as it is equal to the set

{w(rn) = 0, rn ∈ [0, 1/n] ∩ Q}.

Obviously P(Fn) = 0, therefore P(F) = 0. By definition w(0) ≡ 0, therefore F0^w = {Ω, ∅}. Hence F ∉ F0^w. If t > 0 and 1/n ≤ t, then obviously Fn ∈ Ft^w, therefore ∪_{1/n≤t} Fn ∈ Ft^w. On the other hand for every t > 0 evidently ∪_{1/n≤t} Fn = F, since obviously ∪_{1/n≤t} Fn ⊆ F, and if ω ∈ F then ω ∈ Fn ⊆ ∪_{1/n≤t} Fn for some index n. Hence F ∈ ∩_{t>0} Ft^w =: F0+^w, that is, F0^w ≠ F0+^w. Let us remark that, as we shall see later, if N is the collection of sets with

11 One can observe that the interpretation of Ft− is intuitively quite appealing, but the interpretation of Ft+ looks a bit unclear. It is intuitively not obvious what type of information one can get in an infinitesimally short time interval after t, or to put it in another way, it is not too clear how one can get Ft ≠ Ft+. Therefore from an intuitive point of view it is not a great surprise that we shall generally assume that Ft = Ft+.
measure-zero in A then the filtration Ft := σ(Ft^w ∪ N) is right-continuous, so this extended F satisfies the usual conditions12. The σ-algebra F0^w = {Ω, ∅} is complete, which implies that to make F right-continuous one should add to the σ-algebra Ft^w all the null sets from A, or at least the null sets of Ft^w for all t; it is not sufficient to complete the σ-algebras Ft^w separately.

Definition 1.14 We say that a process X is adapted to the filtration F if X(t) is measurable with respect to Ft for every t. A set A ⊆ Θ × Ω is adapted if the process χA is adapted.

In the following we shall fix a stochastic basis (Ω, A, P, F), and if we do not say otherwise we shall always assume that all stochastic processes are adapted with respect to the filtration F of the stochastic basis. It is easy to see that the adapted sets form a σ-algebra.

Example 1.15 If Ft ≡ {∅, Ω} for all t then only the deterministic processes are adapted. If Ft ≡ A for all t then every product measurable stochastic process is adapted.
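Adaptedness is easiest to see in a finite model. The following Python sketch (our toy setup, not from the text: Ω = {0, 1}³, three coin tosses, with Fn generated by the first n tosses) checks the defining property directly: a process is adapted precisely when X(n, ·) is constant on every set of outcomes that share the first n tosses.

```python
from itertools import product

# A finite sketch of adaptedness (our toy setup: Omega = {0,1}^3, F_n generated
# by the first n tosses). A process X(n, omega) is adapted precisely when
# X(n, .) is constant on every cylinder set fixed by the first n coordinates.
def is_adapted(X, horizon=3):
    for n in range(horizon + 1):
        seen = {}
        for omega in product((0, 1), repeat=3):
            key = omega[:n]                    # the information available at time n
            seen.setdefault(key, set()).add(X(n, omega))
        if any(len(values) > 1 for values in seen.values()):
            return False
    return True

def running_sum(n, omega):
    return sum(omega[:n])                      # uses only the past: adapted

def total_sum(n, omega):
    return sum(omega)                          # peeks at the future: not adapted

assert is_adapted(running_sum)
assert not is_adapted(total_sum)
```

This is the discrete shadow of Definition 1.14: measurability of X(t) with respect to Ft means exactly "no dependence on information that Ft cannot distinguish".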
The concept of adapted processes is a dynamic generalization of partial measurability. The dynamic generalization of product measurability is progressive measurability:

Definition 1.16 A set A ⊆ Θ × Ω is progressively measurable if for all t ∈ Θ

A ∩ ([0, t] × Ω) ∈ Rt := B([0, t]) × Ft,

that is, for all t the restriction of A to [0, t] × Ω is measurable with respect to the product σ-algebra Rt := B([0, t]) × Ft. The progressively measurable sets form a σ-algebra R. We say that a process X is progressively measurable if it is measurable with respect to R.

It is clear from the definition that every progressively measurable process is adapted.

Example 1.17 Adapted process which is not progressively measurable.
12 See: Proposition 1.103, page 67.
Let Ω := Θ := [0, 1] and let Ft := A be the σ-algebra generated by the finite subsets of Ω. If D := {t = ω} then the function X := χD is obviously adapted. We prove that it is not product measurable. Assume that {X = 1} = D ∈ B(Θ) × A. By the definition of product measurability Y := [0, 1/2] × Ω ∈ B(Θ) × A. So if D ∈ B(Θ) × A then Y ∩ D ∈ B(Θ) × A. Therefore by the projection theorem13 [0, 1/2] ∈ A, which is impossible. Therefore D ∉ B(Θ) × A. If Ft := A for all t then X is adapted but not progressively measurable.

Example 1.18 Every adapted, continuous from the left and every adapted, continuous from the right process is progressively measurable14.
Assume, for example, that X is adapted and continuous from the right. Fix a t and let 0 = t_0^(n) < t_1^(n) < … < t_k^(n) = t be a partition of [0, t]. Let us define the processes

Xn(s) := X(0) if s = 0,    Xn(s) := X(t_k^(n)) if s ∈ (t_{k−1}^(n), t_k^(n)].

As X is adapted, Xn is measurable with respect to the σ-algebra Rt := B([0, t]) × Ft. If the sequence of partitions (t_k^(n)) is infinitesimal, that is, if

lim_{n→∞} max_k (t_k^(n) − t_{k−1}^(n)) = 0,

then as X is right-continuous Xn → X. Therefore the restriction of X to [0, t] is Rt-measurable. Hence X is progressively measurable.

Example 1.19 If X is regular then ∆X is progressively measurable.
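The approximation in Example 1.18 can be sketched numerically (our discretization with invented names; a dyadic partition plays the role of the infinitesimal sequence). For a right-continuous trajectory, taking the value at the right endpoint of each partition interval converges pointwise, which is exactly where right-continuity enters: the chosen grid points decrease to s.

```python
import math

# Sketch of the approximation used in Example 1.18 (our discretization): for a
# right-continuous trajectory x on [0, t], X_n takes the value of x at the right
# endpoint of each dyadic partition interval, and X_n(s) -> x(s) pointwise.
def approx(x, s, t, n):
    """X_n(s) := x(t_k) for s in (t_{k-1}, t_k], dyadic partition of [0, t]."""
    if s == 0:
        return x(0.0)
    k = math.ceil(s * 2 ** n / t)          # the index with t_{k-1} < s <= t_k
    return x(k * t / 2 ** n)

def x(s):                                   # a right-continuous step trajectory
    return 0.0 if s < 0.3 else 1.0

errors = [abs(approx(x, 0.29, 1.0, n) - x(0.29)) for n in range(1, 12)]
# the error vanishes once the partition separates the point 0.29 from the jump at 0.3
assert errors[0] == 1.0 and errors[-1] == 0.0
```

For a left-continuous process one would symmetrically take the left endpoints; for a merely measurable process no such pointwise scheme is available, which is why the example needs one-sided continuity.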
Like the product measurability, the progressive measurability is also a very mild assumption. It is perhaps the mildest measurability concept one can use in stochastic analysis. The main reason why one should introduce this concept is the following much-used observation:

Proposition 1.20 Assume that V is a right-regular, adapted process and assume that every trajectory of V has finite variation on every finite interval [0, t].
1. If for every ω the trajectories X(ω) are integrable on any finite interval with respect to the measure generated by V(ω) then the parametric integral process

Y(t, ω) := ∫_0^t X(s, ω) V(ds, ω) := ∫_{(0,t]} X(s, ω) V(ds, ω)    (1.4)

13 If P(N) = 0 when N is countable and P(N) = 1 otherwise, then the probability space (Ω, A, P) is complete.
14 In particular, if X(t, ω) is measurable in ω and continuous in t then X is product measurable.
forms a right-regular process and ∆Y = X · ∆V.
2. If additionally X is progressively measurable then Y is adapted.

Proof. The first statement of the proposition is a direct consequence of the Dominated Convergence Theorem. Observe that to prove the second statement one cannot directly apply Fubini’s theorem, but one can easily adapt its usual proof. Let H denote the set of bounded processes for which Y(t) in (1.4) is Ft-measurable. As the measure of finite intervals is finite, H is a linear space, it contains the constant process X ≡ 1, and if 0 ≤ Hn ∈ H and Hn ↑ H and H is bounded then by the Monotone Convergence Theorem H ∈ H. This implies that H is a λ-system. If C ∈ Ft and s1, s2 ≤ t, and B := (s1, s2] × C, then as V is adapted the integral

∫_0^t χB dV = χC (V(s2) − V(s1))
is Ft-measurable. These processes form a π-system, hence by the Monotone Class Theorem H contains the processes which are measurable with respect to the σ-algebra generated by the processes χC χ((s1, s2]). As C ∈ Ft, this π-system generates the σ-algebra of the product measurable sets B((0, t]) × Ft. X is progressively measurable, so its restriction to (0, t] is (B((0, t]) × Ft)-measurable. Hence the proposition is true if X is bounded. From this the general case follows from the Dominated Convergence Theorem.

What is the intuitive idea behind progressive measurability? Generally the filtration F is generated by some process X. Recall that if Z := (ξα)_{α∈A} is a set of random variables and X := σ(ξα : α ∈ A) denotes the σ-algebra generated by them, then X = ∪_{S⊆A} X_S, where the subsets S are arbitrary countable subsets of A and for any S the set X_S denotes the σ-algebra generated by the countably many variables (ξ_{αi})_{αi∈S} of Z, that is, X_S := σ(ξ_{αi} : αi ∈ S). By this structure of the generated σ-algebras, Ft^X contains all the information one can obtain by observing X up to time t countably many times. If a process Y is adapted with respect to F^X then Y reflects the information one can obtain from countably many observations of X. But sometimes, like in (1.4), we want information which depends on an uncountable number of observations of the underlying random source. In these cases one needs progressive measurability!

1.2.2 Stopping times
After filtration, stopping time is perhaps the most important concept of the theory of stochastic processes. As stopping times describe the moments when certain random events occur, it is not a great surprise that most of the relevant questions of the theory are somehow related to stopping times. It is important that not every random time is a stopping time. Stopping times are related to events described by the filtration of the stochastic base15 . At every time t one can observe only the events of the probability space (Ω, Ft , P). If τ is a random time then at time t one cannot observe the whole τ . One can observe only the random variable τ ∧ t! By definition τ is a stopping time if τ ∧ t is an (Ω, Ft , P)-random variable for all t. Definition 1.21 Let Ω be the set of outcomes and let F be a filtration on Ω. Let τ : Ω → Θ ∪ {∞}. 1. The function τ is a stopping time if for every t ∈ Θ {τ ≤ t} ∈ Ft . We denote the set of stopping times by Υ. 2. The function τ is a weak stopping time if for every t ∈ Θ {τ < t} ∈ Ft . Example 1.22 Almost-surely zero functions and stopping times.
Assume that the probability space (Ω, A, P) is complete and for every t the σ-algebra Ft contains the measure-zero sets of A. If N ⊆ Ω is a measure-zero set and the function τ ≥ 0 is zero on the complement of N, then τ is a stopping time, as for all t ≥ 0 we have {τ > t} ⊆ N, so {τ > t} is a measure-zero set and hence {τ > t} ∈ Ft; therefore {τ ≤ t} ∈ Ft. In a similar way, if σ ≥ 0 is almost surely +∞ then σ is a stopping time. These examples are special cases of the following: if (Ω, A, P, F) satisfies the usual conditions and τ is a stopping time and σ ≥ 0 is almost surely equal to τ, then σ is also a stopping time. We shall see several times that in the theory of stochastic processes the time axis is not symmetric. The filtration defines an orientation on the real axis.

15 If we travel from a city to the countryside then the moment when we arrive at the first pub after we leave the city is a stopping time, but the time when we arrive at the last pub before we leave the city is not a stopping time. In a similar way, when X is a stochastic process the first time X is zero is a stopping time, but the last time it is zero is not a stopping time. One of the most important random times which is generally not a stopping time is the moment when X reaches its maximum on a certain interval. See: Example 1.110, page 73.
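The "first pub / last pub" remark of the footnote has a one-line computational shadow (our discrete-time illustration; the names are invented): deciding the first visit to a set uses only the past of the path, while the last visit needs the whole future.

```python
import math

# Discrete-time sketch of the footnote's "first pub / last pub" remark
# (our illustration). A path is observed at the times 0, 1, 2, ...
def first_zero(path):
    """First visit to zero: decidable at time t from path[:t+1] alone."""
    for t, x in enumerate(path):
        if x == 0:
            return t
    return math.inf

def last_zero(path):
    """Last visit to zero: needs the whole trajectory -- not a stopping time."""
    return max((t for t, x in enumerate(path) if x == 0), default=math.inf)

path = [1, 0, 2, 0, 3]
assert first_zero(path) == 1 and last_zero(path) == 3
# Changing the future leaves first_zero alone but can move last_zero:
assert first_zero([1, 0, 2, 0, 0]) == 1
assert last_zero([1, 0, 2, 0, 0]) == 4
```

In the language of Definition 1.21: {first_zero ≤ t} is determined by the first t + 1 observations, i.e. by Ft, while {last_zero ≤ t} is not.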
An elementary but very important consequence of this orientation is the following proposition:

Proposition 1.23 Every stopping time is a weak stopping time. If the filtration F is right-continuous then every weak stopping time is a stopping time.

Proof. As the filtration F is increasing, if τ is a stopping time then for all n

{τ ≤ t − 1/n} ∈ F_{t−1/n} ⊆ Ft.
1 τ ≤t− n
∈ Ft .
On the other hand if F is right-continuous that is if Ft+ = Ft then 1 {τ ≤ t} = ∩n τ < t + ∈ ∩n Ft+1/n Ft+ = Ft . n The right-continuity of the filtration is used in the next proposition as well. Proposition 1.24 If τ and σ are stopping times then τ ∧ σ and τ ∨ σ are also stopping times. If (τ n ) is an increasing sequence of stopping times then τ lim τ n n→∞
is a stopping time. If the filtration F is right-continuous and (τ n ) is a decreasing sequence of stopping times then τ lim τ n n→∞
is a stopping time. Proof. If τ and σ are stopping times then {τ ∧ σ ≤ t} = {τ ≤ t} ∪ {σ ≤ t} ∈ Ft , {τ ∨ σ ≤ t} = {τ ≤ t} ∩ {σ ≤ t} ∈ Ft . If τ n τ then for all t {τ ≤ t} = ∩n {τ n ≤ t} ∈ Ft . If τ n τ then for all t c
{τ ≥ t} = ∩n {τn ≥ t} = ∩n {τn < t}^c ∈ Ft,
that is, {τ < t} = ∪n {τn < t} ∈ Ft. Hence τ is a weak stopping time. If the filtration F is right-continuous then τ is a stopping time.

Corollary 1.25 If the filtration F is right-continuous and (τn) is a sequence of stopping times then

sup_n τn,    inf_n τn,    lim sup_{n→∞} τn,    lim inf_{n→∞} τn

are stopping times.

The next definition concretizes the abstract definition of stopping times:

Definition 1.26 If Γ ⊆ R+ × Ω then the expression

τΓ(ω) := inf {t : (t, ω) ∈ Γ}
(1.5)
is called the début of the set Γ. If B ⊆ Rn and X is a vector-valued stochastic process then

τB(ω) := inf {t : X(t, ω) ∈ B}
(1.6)
is called the hitting time of the set B. If B ⊆ R and X is a stochastic process and if Γ := {X ∈ B}, then τΓ = τB, which means that every hitting time is a special début.

Example 1.27 The most important hitting times are the random functions τa(ω) := inf {t : X(t, ω) R a}, where R is one of the relations ≥, >, ≤, <.

… τ := inf {t > σ : X(t) ∈ B}. The set

Γ := {(t, ω) : X(t, ω) ∈ B} ∩ {(t, ω) : t > σ(ω)}

is progressively measurable, since by the progressive measurability of X the first set in the intersection is progressively measurable, and the characteristic function of the other set is adapted and left-continuous, hence it is also progressively measurable. By the theorem above, if (Ω, A, P, F) satisfies the usual conditions then the expression τ = τΓ := inf {t : (t, ω) ∈ Γ} is a stopping time.

16 See: Theorem A.12, page 550.
17 It can happen that (s, ω) ∈ Γ for all s > t, but (t, ω) ∉ Γ. In this case τΓ(ω) = t, but ω ∉ projΩ(Γ ∩ [0, t) × Ω); therefore in the proof we used the right-continuity of the filtration.
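In discrete time the measurability subtleties surrounding the début disappear, and the hitting time (1.6) becomes a plain first-entry scan. The following is a hedged numerical sketch (our grid-based approximation with invented names), not a substitute for the Projection Theorem machinery above:

```python
import math

# Purely numerical illustration of the hitting time (1.6) (our grid-based
# approximation): the first grid time at which a sampled trajectory lies in B.
def hitting_time(times, path, in_B):
    for t, x in zip(times, path):
        if in_B(x):
            return t
    return math.inf                     # the trajectory never reaches B

times = [k / 100 for k in range(101)]
path = [4.0 * t * (1.0 - t) for t in times]        # a continuous trajectory on [0, 1]
tau = hitting_time(times, path, lambda x: x >= 1.0)
assert tau == 0.5                       # the closed set [1, oo) is first hit at t = 1/2
assert hitting_time(times, [0.5 * x for x in path], lambda x: x >= 1.0) == math.inf
```

The value +∞ for a path that never enters B matches the convention of Section 1.1: a stopping time takes the value +∞ on the outcomes where the observed event never occurs.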
Corollary 1.30 If the stochastic base (Ω, A, P, F) satisfies the usual conditions, the process X is progressively measurable and B is a Borel set, then the hitting times

τ0 := 0,    τ_{n+1} := inf {t > τn : X(t) ∈ B}
are stopping times. Example 1.31 If X is not progressively measurable then the hitting times of Borel sets are not necessarily stopping times.
Let X := χD be the adapted but not progressively measurable process in Example 1.17. The hitting time of the set B := {1} is obviously not a stopping time, as {τB ≤ 1/2} = [0, 1/2] ∉ A = F_{1/2}. The main advantage of the above construction is its generality. An obvious disadvantage of the theorem just proved is that it builds on the Projection Theorem. Very often we do not need the generality of the above construction and we can construct stopping times without referring to the Projection Theorem.

Example 1.32 Construction of stopping times without the Projection Theorem.
1. If the set B is closed and X is a continuous, adapted process then one can easily prove that the hitting time (1.6) is a stopping time. As the trajectories are continuous, the sets K(t, ω) := X([0, t], ω) are compact for every outcome ω. As B is closed, K(t, ω) ∩ B = ∅ if and only if the distance between the two sets is positive. Therefore K(t, ω) ∩ B = ∅ if and only if τB(ω) > t. As the trajectories are continuous, X([0, t] ∩ Q, ω) is dense in the set K(t, ω). As the metric is a continuous function,

{τB ≤ t} = {K(t) ∩ B ≠ ∅} = {d(K(t), B) = 0} = {ω : inf {d(X(s, ω), B) : s ≤ t, s ∈ Q} = 0}.

X(s) is Ft-measurable for a fixed s ≤ t, hence as x → d(x, B) is continuous, d(X(s), B) is also Ft-measurable. The infimum of a countable number of measurable functions is measurable, hence {τB ≤ t} ∈ Ft.
2. We prove that if B is open, the trajectories of X are right-continuous and adapted, and the filtration F is right-continuous, then the hitting time (1.6) is a stopping time. It is sufficient to prove that {τB < t} ∈ Ft for all t. As the trajectories are right-continuous and as B is open, X(s, ω) ∈ B if and only if
there is an ε > 0 such that X(u, ω) ∈ B whenever u ∈ [s, s + ε). From this
{τ_B < t} = ∪_{s∈Q∩[0,t)} {X(s) ∈ B} ∈ Ft.
3. In a similar way one can prove that if X is left-continuous and adapted, F is right-continuous, and B is open, then the hitting time τ_B is a stopping time.
4. If the filtration is right-continuous, and X is a right- or left-continuous adapted process, then for any number c the first passage time τ = inf{t : X(t) > c} is a stopping time.
5. If B is open and the filtration is not right-continuous, then even for continuous processes the hitting time τ_B is not necessarily a stopping time[18]. If X(t, ω) = t · ξ(ω), where ξ is a Gaussian random variable, and Ft is the filtration generated by X, then F0 = {∅, Ω}, and the hitting time τ_B of the set B = {x > 0} is
τ_B(ω) = 0 if ξ(ω) > 0, and τ_B(ω) = ∞ if ξ(ω) ≤ 0.
Obviously {τ_B ≤ 0} ∉ F0, so τ_B is not a stopping time.
6. Finally we show that if σ is an arbitrary stopping time, X is a right-regular, adapted process and c > 0, then the first passage time
τ(ω) = inf{t > σ(ω) : |ΔX(t, ω)| ≥ c}
is a stopping time. Let us fix an outcome ω and let us assume that ∞ > t_n ↘ τ(ω), where |ΔX(t_n, ω)| ≥ c. The trajectory X(ω) is right-regular, therefore the jumps which are bigger than c do not have an accumulation point. Hence for all indexes n large enough the sequence t_n is already constant, that is τ(ω) = t_n > σ(ω), so |ΔX(τ(ω))| = |ΔX(t_n)| ≥ c for some n. This means that |ΔX(τ)| ≥ c on the set {τ < ∞} and on the set {σ < ∞} one has τ > σ. Let A(t) = ([0, t] ∩ Q) ∪ {t}. We prove that τ(ω) ≤ t if and only if for all n ∈ N one can find a pair q_n, p_n ∈ A(t) for which
σ(ω) < p_n < q_n < p_n + 1/n
[18] The reason for this is clear, as the event {τ_B = t} may contain outcomes ω for which the trajectory hits the set B only just after t; therefore one should investigate the events {τ_B < t}.
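The rational-grid characterization in part 1 above can be illustrated numerically: for a continuous path and a closed set B, the hitting time can be detected by scanning countably many time points, since the infimum of d(X(s), B) over a dense grid recovers the infimum over all s ≤ t. The following is only a sketch; the path X(t) = t and the set B = [1, 2] are illustrative choices, not taken from the text.

```python
import math

def hitting_time(path, dist_to_B, T, n_grid):
    """First grid time s in [0, T] with d(path(s), B) = 0 (grid approximation)."""
    for k in range(n_grid + 1):
        s = T * k / n_grid
        if dist_to_B(path(s)) == 0.0:
            return s
    return math.inf

# Illustrative example: X(t) = t and the closed set B = [1, 2],
# so the path first touches B at t = 1.
path = lambda t: t
dist_to_B = lambda x: max(0.0, 1.0 - x) + max(0.0, x - 2.0)

tau = hitting_time(path, dist_to_B, T=2.0, n_grid=200)
print(tau)  # 1.0
```

For sets the path only grazes, one would replace the exact test `== 0.0` by a small tolerance and refine the grid, exactly in the spirit of taking the infimum over rational times.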
MEASURABILITY OF STOCHASTIC PROCESSES
19
and
|X(p_n, ω) − X(q_n, ω)| ≥ c − 1/n.   (1.7)
One implication is evident: if τ(ω) ≤ t, then, as the jumps bigger than c do not have accumulation points, |ΔX(s, ω)| ≥ c for some σ(ω) < s ≤ t. Hence by the regularity of the trajectories one can construct the necessary sequences. On the other hand, let us assume that the sequences (p_n), (q_n) exist. Without loss of generality one can assume that (p_n) and (q_n) are convergent. Let σ(ω) ≤ s ≤ t be the common limit point of these sequences. If p_n ≥ s for an infinite number of indexes, then in any right neighbourhood of s there is an infinite number of intervals [p_n, q_n] on which X changes by more than c/2 > 0, which is impossible as X is right-continuous. Similarly, q_n ≤ s only for a finite number of indexes, as otherwise p_n < q_n ≤ s for an infinite number of indexes, which is impossible as X(ω) has left limits. This means that for indexes n big enough σ(ω) < p_n ≤ s ≤ q_n. Taking the limit in (1.7), |ΔX(s, ω)| ≥ c and hence τ(ω) ≤ s ≤ t. Using this property one can easily prove that
{τ ≤ t} = ∩_{n∈N} ∪_{p,q∈A(t), p<q<p+1/n} ({σ < p} ∩ {|X(p) − X(q)| ≥ c − 1/n}) ∈ Ft.

If r ∈ Q, r ≤ t, then
{σ > r > τ} = {σ > r} ∩ {τ < r} = {σ ≤ r}^c ∩ {τ < r} ∈ Ft.
From this
{σ ≤ τ} ∩ {σ ≤ t} = {σ > τ}^c ∩ {σ ≤ t} = (∪_{r∈Q} {σ > r > τ})^c ∩ {σ ≤ t} = (∪_{r∈Q, r≤t} {σ > r > τ})^c ∩ {σ ≤ t} ∈ Ft.
Hence by the definition of Fσ one has {σ ≤ τ} ∈ Fσ. On the other hand
{τ ≤ σ} ∩ {σ ≤ t} = {σ ≤ t} ∩ {τ ≤ t} ∩ {τ ∧ t ≤ σ ∧ t} ∈ Ft,
since the first two sets, by the definition of stopping times, are in Ft and the two random variables in the third set are Ft-measurable. Hence {τ ≤ σ} ∈ Fσ.

Proposition 1.35 If X is progressively measurable and τ is an arbitrary stopping time then the stopped variable X_τ is Fτ-measurable, and the truncated process X^τ is progressively measurable.

Proof. The first part of the proposition is an easy consequence of the second as, if B is a Borel measurable set and X^τ is adapted, then for all s
{X_τ ∈ B} ∩ {τ ≤ s} = {X(τ ∧ s) ∈ B} ∩ {τ ≤ s} = {X^τ(s) ∈ B} ∩ {τ ≤ s} ∈ Fs,
that is, in this case the stopped variable X_τ is Fτ-measurable. Therefore it is sufficient to prove that if X is progressively measurable then X^τ is also progressively measurable. Let
Y(t, ω) = 1 if t < τ(ω), and Y(t, ω) = 0 if t ≥ τ(ω).
Y is right-regular. τ is a stopping time so {Y(t) = 0} = {τ ≤ t} ∈ Ft. Hence Y is adapted, therefore it is progressively measurable[20]. Obviously if τ(ω) > 0 then[21]
Z(t, ω) = ∫_{(0,t]} X(s, ω) Y(ds, ω) = 0 if t < τ(ω), and Z(t, ω) = −X(τ(ω), ω) if t ≥ τ(ω).
As X is progressively measurable, Z is adapted[22] and also right-regular, so it is again progressively measurable. As X^τ = XY − Z + X(0)χ(τ = 0), X^τ is obviously progressively measurable.

Corollary 1.36 If G = σ(X(τ) : X is right-regular and adapted) then G = Fτ.

Proof. As every right-regular and adapted process is progressively measurable, G ⊆ Fτ. If A ∈ Fτ then the process X(t) = χ_A χ(τ ≤ t) is right-regular and by

[20] See: Example 1.18, page 11.
[21] If τ(ω) = 0 then Z(ω) = 0.
[22] See: Proposition 1.20, page 11.
the definition of Fτ,
{X(t) = 1} = A ∩ {τ ≤ t} ∈ Ft.
Hence X is adapted. Obviously X(τ) = χ_A. Therefore Fτ ⊆ G.

1.2.4
Predictable processes
The class of progressively measurable processes is too large. As we have already remarked, the interesting stochastic processes have regular trajectories. There are two types of regular processes: some of them have left- and some of them have right-continuous trajectories. It is a bit surprising that there is a huge difference between these two classes. But one should recall that the trajectories are not just functions: the time parameter has an obvious orientation: the time line is not symmetric, the time flows from left to right. Definition 1.37 Let (Ω, A, P, F) be a stochastic base, and let us denote by P the σ-algebra of the subsets of Θ × Ω generated by the adapted, continuous processes. The sets in the σ-algebra P are called predictable. A process X is predictable if it is measurable with respect to P. Example 1.38 A deterministic process is predictable if and only if its single trajectory is a Borel-measurable function.
Obviously we call a process X deterministic if it does not depend on the random parameter ω; more exactly, a process X is called deterministic if it is a stochastic process on (Ω, {Ω, ∅}). If A = {Ω, ∅} then the set of continuous stochastic processes is equivalent to the set of continuous functions, and the σ-algebra generated by the continuous functions is equivalent to the σ-algebra of the Borel measurable sets on Θ.

The set of predictable processes is closed under the usual operations of analysis[23]. The most important and specific operation related to stochastic processes is truncation:

Proposition 1.39 If τ is an arbitrary stopping time and X is a predictable stochastic process then the truncated process X^τ is also predictable.

Proof. Let L be the set of bounded stochastic processes X for which X^τ is predictable. It is obvious that L is a λ-system. If X is continuous then X^τ is also continuous, hence the π-system of the bounded continuous processes is in L. From the Monotone Class Theorem it is obvious that L contains the set of bounded predictable processes. If X is an arbitrary predictable process then

[23] Algebraic and lattice type operations, usual limits etc.
X_n = Xχ(|X| ≤ n) is a predictable bounded process and therefore X_n^τ is also predictable. As X_n^τ → X^τ, X^τ is obviously predictable.

To discuss the structure of the predictable processes let us introduce some notation:

Definition 1.40 If τ and σ are stopping times then one can define the random intervals
{(t, ω) ∈ [0, ∞) × Ω : σ(ω) R1 t R2 τ(ω)},
where R1 and R2 are one of the relations < or ≤. One can define four random intervals [σ, τ], [σ, τ), (σ, τ] and (σ, τ), where the meaning of these notations is obvious.

One should emphasize that, in the definition of the stochastic intervals, the value of the time parameter t is always finite. Therefore if τ(ω) = ∞ for some ω then (∞, ω) ∉ [τ, τ]. In measure theory we are used to the fact that the σ-algebras generated by the different types of intervals are the same. In R or in R^n one can construct every type of interval from any other type of interval with a countable number of set operations. For random intervals this is not true! For example, if we want to construct the semi-closed random interval [0, τ) from random closed segments [0, σ] then we need a sequence of stopping times (σ_n) for which σ_n ↗ τ and σ_n < τ. If there is such a sequence[24] then of course [0, σ_n] ↗ [0, τ), that is, in this case [0, τ) is in the σ-algebra generated by the closed random segments. But for an arbitrary stopping time τ such a sequence does not exist. If τ is a stopping time and c > 0 is a constant, then τ − c is generally not a stopping time! On the other hand if c > 0 then τ + c is always a stopping time, hence as [0, τ] = ∩_n [0, τ + 1/n) the closed random intervals [0, τ] are in the σ-algebra generated by the intervals [0, σ). This shows again that in the theory of stochastic processes the time line is not symmetric!

Definition 1.41 Y is a predictable simple process if there is a sequence of stopping times 0 = τ_0 < τ_1 < … < τ_n < … such that
Y = η_0 χ({0}) + Σ_i η_i χ((τ_i, τ_{i+1}])   (1.8)

[24] If for τ there is a sequence of stopping times σ_n ↗ τ, σ_n ≤ τ and σ_n < τ on the set {τ > 0} then we shall say that τ is a predictable stopping time. Of course the main problem is that not every stopping time is predictable. See: Definition 3.5, page 182. The simplest examples are the jumps of the Poisson processes. See: Example 3.7, page 183.
where η_0 is F0-measurable and the η_i are Fτ_i-measurable random variables. If the stopping times (τ_i) are constant then we say that Y is a predictable step process.

Now we are ready to discuss the structure of predictable processes.

Proposition 1.42 Let X be a stochastic process on Θ = [0, ∞). The following statements are equivalent[25]:
1. X is predictable.
2. X is measurable with respect to the σ-algebra generated by the adapted left-regular processes.
3. X is measurable with respect to the σ-algebra generated by the adapted left-continuous processes.
4. X is measurable with respect to the σ-algebra generated by the predictable step processes.
5. X is measurable with respect to the σ-algebra generated by the predictable simple processes.

Proof. Let P1, P2, P3, P4 and P5 denote the σ-algebras in the proposition. Obviously it is sufficient to prove that these five σ-algebras are equal.
1. Obviously P1 ⊆ P2 ⊆ P3.
2. Let X be one of the processes generating P3, that is let X be a left-continuous, adapted process. As X is adapted,
X_n(t) = X(0)χ({0}) + Σ_k X(k/2^n) χ((k/2^n, (k+1)/2^n])
is a predictable step process. As X is left-continuous, obviously X_n → X, so X is P4-measurable, hence P3 ⊆ P4.
3. Obviously P4 ⊆ P5.
4. Let F ∈ F0 and let f_n be continuous functions such that f_n(0) = 1 and f_n is zero on the interval [1/n, ∞). If X_n = f_n χ_F then X_n is obviously P1-measurable, therefore the process
χ_F χ({0}) = lim_{n→∞} X_n
[25] Let us recall that by definition X(0−) = X(0). Therefore if ξ is an arbitrary F0-measurable random variable then the process X = ξχ({0}) is adapted and left-regular, so if Z is predictable then Z + X is also predictable. Hence we cannot generate P without the measurable rectangles {0} × F, F ∈ F0. If one wants to avoid these sets then one should define the predictable processes on the open half-line (0, ∞). This is not necessarily a bad idea, as the predictable processes are the integrands of stochastic integrals, and we shall always integrate only on the intervals (0, t], so in the applications the value of these processes at t = 0 is entirely irrelevant.
is also P1-measurable. If η_0 is an F0-measurable random variable then η_0 is a limit of F0-measurable step functions, therefore the process η_0 χ({0}) is P1-measurable. This means that the first term in (1.8) is P1-measurable. Let us now discuss the second kind of term in (1.8). Let τ be an arbitrary stopping time. If
X_n(t, ω) = 1 if t ≤ τ(ω); 1 − n(t − τ(ω)) if τ(ω) < t < τ(ω) + 1/n; 0 if t ≥ τ(ω) + 1/n,
then X_n has continuous trajectories, and it is easy to see that X_n is adapted. Therefore
χ([0, τ]) = lim_{n→∞} X_n ∈ P1.
If σ ≤ τ is another stopping time then
χ((σ, τ]) = χ([0, τ] \ [0, σ]) = χ([0, τ]) − χ([0, σ]) ∈ P1.
If F ∈ Fσ then
σ_F(ω) = σ(ω) if ω ∈ F, and σ_F(ω) = ∞ if ω ∉ F
is also a stopping time as {σ F ≤ t} = {σ ≤ t} ∩ F ∈ Ft . If σ ≤ τ then Fσ ⊆ Fτ , therefore not only σ F but τ F is also a stopping time. χF χ ((σ, τ ]) = χ ((σ F , τ F ]) ∈ P1 . If η is Fσ -measurable, then η is a limit of step functions, hence if η is Fσ measurable and σ ≤ τ then the process ηχ ((σ, τ ]) is P1 -measurable. By the definition of the predictable simple processes every predictable simple process is P1 -measurable. Hence P5 ⊆ P1 . Corollary 1.43 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and (σ, τ ] generate the σ-algebra of the predictable sets. Corollary 1.44 If Θ = [0, ∞) then the random intervals {0} × F, F ∈ F0 and [0, τ ] generate the σ-algebra of the predictable sets. Definition 1.45 Let T denote the set of measurable rectangles {0} × F,
F ∈ F0
and the sets
(s, t] × F,  F ∈ Fs.
The sets in T are called predictable rectangles. Corollary 1.46 If Θ = [0, ∞) then the predictable rectangles generate the σ-algebra of predictable sets. It is quite natural to ask what the difference is between the σ-algebras generated by the right-regular and by the left-regular processes. Definition 1.47 The σ-algebra generated by the adapted, right-regular processes is called the σ-algebra of the optional sets. A process is called optional if it is measurable with respect to the σ-algebra of the optional sets. As every continuous process is right-regular so the σ-algebra of the optional sets is never smaller than the σ-algebra of the predictable sets P. Example 1.48 Adapted, right-regular process which is not predictable.
The simplest example of a right-regular process which is not predictable is the Poisson process. Unfortunately, at the present moment it is a bit difficult to prove26 . The next example is ‘elementary’. Let Ω [0, 1] and for all t let Ft
F_t = σ(B([0, t]) ∪ {(t, 1]}) if t < 1, and F_t = B([0, 1]) if t ≥ 1.
If s ≤ t then Fs ⊆ Ft, and hence F is a filtration. It is easy to see that the random function τ(ω) = ω is a stopping time. Let A = [τ] = [τ, τ] be the graph of τ, which is the diagonal of the closed square [0, 1]^2. 1. Let us show that A is optional. It is easy to see that the process X_n = χ([τ, τ + 1/n)) is right-continuous. X_n is adapted as {X_n(t) = 1} =
τ ≤t y} so, as [0, τ ] = {(t, ω) : t ≤ τ (ω)} {(t, ω) : t ≤ ω} = = {(x, y) : x ≤ y} , obviously R ∩ [0, τ ] = ∅ = (∅ × Ω) ∩ [0, τ ] . By the structure of Fs the interval (s, 1] is an atom of Fs . Hence if F ∩ (s, 1] = ∅, then (s, 1] ⊆ F , hence for some B ∈ B ([0, s]) R (s, t] × F = (s, t] × (B ∪ (s, 1]) . So R ∩ [0, τ ] = (s, t] × (B ∪ (s, 1]) ∩ {(x, y) : x ≤ y} = = ((s, t] × (s, 1]) ∩ {(x, y) : x ≤ y} = = ((s, t] × Ω) ∩ [0, τ ] and therefore in both cases the intersection has representation of type B × Ω. This remains true if we take the rectangles of type {0} × F, F ∈ F0 . As 27 If we draw Ω on the y-axis and we draw on the time line the x-axis then [τ , τ ] is the line y = x, [0, τ ] is the upper triangle. In the following argument F is under the diagonal hence the whole rectangle R is under the diagonal.
MARTINGALES
29
the generation and the restriction of the σ-algebras are interchangeable operations:
P ∩ [0, τ] = σ(T) ∩ [0, τ] = σ(T ∩ [0, τ]) = σ((B × Ω) ∩ [0, τ]) = σ(B × Ω) ∩ [0, τ] = (B([0, 1]) × Ω) ∩ [0, τ],
which is exactly (1.9).
4. As the left-regular χ([0, τ]) is adapted and χ([τ, τ]) is not predictable, the right-regular, adapted process χ([0, τ)) = χ([0, τ]) − χ([τ, τ]) is also not predictable.
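The dyadic discretization used in the proof of Proposition 1.42 can be sketched numerically: a left-continuous path X is approximated by the predictable step process X_n taking the value X(k/2^n) on (k/2^n, (k+1)/2^n], and left-continuity makes the approximation converge. The concrete path X(t) = t^2 and the evaluation grid are illustrative choices, not from the text.

```python
import math

def step_value(X, t, n):
    """Value at time t > 0 of the n-th dyadic predictable step process:
    X_n(t) = X(k/2^n) for t in (k/2^n, (k+1)/2^n]."""
    k = math.ceil(t * 2 ** n) - 1
    return X(k / 2 ** n)

X = lambda t: t * t                      # an illustrative left-continuous path
ts = [i / 1000 for i in range(1, 1001)]  # evaluation points in (0, 1]

def sup_error(n):
    return max(abs(X(t) - step_value(X, t, n)) for t in ts)

# The sup error shrinks as the dyadic resolution grows.
print(sup_error(3) > sup_error(6))  # True
```

Note that X_n uses the value at the *left* endpoint of each interval: the approximating process is known strictly before the interval begins, which is exactly what makes it predictable.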
1.3
Martingales
In this section we introduce and discuss some important properties of continuous-time martingales. As martingales are stochastic processes one should fix the properties of their trajectories. We shall assume that the trajectories of the martingales are right-regular. The right-continuity of martingales is essential in the proof of the Optional Sampling Theorem, which describes one of the most important properties of martingales. There are a lot of good books on martingales, so we will not try to prove the theorems in their most general form. We shall present only those results from martingale theory which we shall use in the following. The presentation below is a bit redundant. We could have first proved the Downcrossing Inequality and from it we could have directly proved the Martingale Convergence Theorem. But I don't think that it is a waste of time and paper to show these theorems from different angles.

Definition 1.49 Let us fix a filtration F. The adapted process X is a submartingale if
1. the trajectories of X are right-regular,
2. for any time t the expected value of X^+(t) is finite[28],
3. if s < t, then E(X(t) | Fs) ≥ X(s) almost surely.

[28] Some authors, see: [53], assume that if X is a submartingale then X(t) is integrable for all t. If we need this condition then we shall say that X is an integrable submartingale. The same remark holds for supermartingales as well. Of course martingales are always integrable.
We say that X is a supermartingale if −X is a submartingale. X is a martingale if X is a supermartingale and a submartingale at the same time. This means that
1. the trajectories of X are right-regular,
2. for any time t the expected value of X(t) is finite,
3. if s < t, then E(X(t) | Fs) = X(s) almost surely.
The conditional expectation is always a random variable, that is, the conditional expectation E(X(t) | Fs) is always an equivalence class. As X is a stochastic process, X(s) is a function and not an equivalence class. Hence the two sides in the definition can be equal only in the almost sure sense. Generally we shall not emphasize this, and we shall use the simpler =, ≥ and ≤ relations. If X is a martingale, g is a convex function[29] on R and E(g(X(t))^+) < ∞ for all t, then the process Y(t) = g(X(t)) is a submartingale, as by Jensen's inequality
g(X(s)) = g(E(X(t) | Fs)) ≤ E(g(X(t)) | Fs).
In particular, if X is a martingale, p ≥ 1, and |X(t)|^p is integrable for all t, then the process |X|^p is a submartingale. If X is a submartingale, g is convex and increasing, and Y(t) = g(X(t)) is integrable, then Y is a submartingale, as in this case
E(g(X(t)) | Fs) ≥ g(E(X(t) | Fs)) ≥ g(X(s)).
In particular, if X is a submartingale, then X^+ is also a submartingale.

1.3.1
Doob’s inequalities
The most well-known inequalities of the theory of martingales are Doob's inequalities. First we prove the discrete-time versions, and then we discuss the continuous-time cases.

Proposition 1.50 (Doob's inequalities, discrete-time) Let X = (X_k, F_k)_{k=1}^n be a non-negative submartingale.
1. If λ ≥ 0, then
λ P( max_{1≤k≤n} X_k ≥ λ ) ≤ E(X_n).   (1.10)
2. If p > 1, then[30]
‖ max_{1≤k≤n} X_k ‖_p ≤ p/(p − 1) ‖X_n‖_p = q ‖X_n‖_p.   (1.11)

[29] Convex functions are continuous so g(X) is adapted.
[30] Of course, as usual, 1/p + 1/q = 1.
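The maximal inequality (1.10) can be sanity-checked by Monte Carlo for the non-negative submartingale X_k = |S_k|, where S is a symmetric random walk. The walk, horizon, threshold and seed below are illustrative choices; the simulation only confirms that the estimated left-hand side stays below the estimated E(X_n).

```python
import random

random.seed(0)
n, sims, lam = 20, 20000, 6.0
count_max_ge = 0     # estimates P(max_k X_k >= lam)
sum_Xn = 0.0         # estimates E(X_n)
for _ in range(sims):
    s, m = 0, 0
    for _ in range(n):
        s += random.choice((-1, 1))
        m = max(m, abs(s))
    count_max_ge += (m >= lam)
    sum_Xn += abs(s)

lhs = lam * count_max_ge / sims   # lambda * P(max >= lambda)
rhs = sum_Xn / sims               # E(X_n)
print(lhs <= rhs)  # True: (1.10) holds with considerable slack here
```

For this walk E(X_n) is about sqrt(2n/pi), so the inequality holds with a wide margin and the sampling noise cannot flip the comparison.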
Proof. Let us remark that both inequalities estimate the size of the maximum of the non-negative submartingale.
1. Let λ > 0 and let
A_1 = {X_1 ≥ λ},  A_k = { max_{1≤i<k} X_i < λ ≤ X_k },  A = ∪_{k=1}^n A_k = { max_{1≤k≤n} X_k ≥ λ }.

2. If p > 1, then
‖ sup_{t∈Θ} |X(t)| ‖_p ≤ p/(p − 1) sup_{t∈Θ} ‖X(t)‖_p.   (1.16)
3. If Θ is closed and b is the finite or infinite right endpoint of Θ then under the conditions above
λ P( sup_{t∈Θ} X(t) ≥ λ ) ≤ ‖X^+(b)‖_1,   (1.17)
λ^p P( sup_{t∈Θ} |X(t)| ≥ λ ) ≤ ‖X(b)‖_p^p,
‖ sup_{t∈Θ} |X(t)| ‖_p ≤ p/(p − 1) ‖X(b)‖_p.   (1.18)
We shall very often use the following corollary of (1.16):

Corollary 1.54 If X is a martingale and p > 1, then
X* = sup_{t∈Θ} |X(t)| ∈ Lp(Ω),
that is
(X*)^p = ( sup_{t∈Θ} |X(t)| )^p = sup_{t∈Θ} |X(t)|^p ∈ L1(Ω),
if and only if X is bounded in Lp(Ω).

Definition 1.55 If p ≥ 1, then Hp will denote the space of martingales X for which
‖ sup_t |X(t)| ‖_p < ∞.
Hp also denotes the equivalence classes of these martingales, where two martingales are equivalent whenever they are indistinguishable.

Definition 1.56 If X ∈ H2, then we shall say that X is a square-integrable martingale.

If ‖sup_t |X_n(t) − X(t)|‖_p → 0 then for a subsequence
sup_t |X_{n_k}(t) − X(t)| → 0 almost surely,
hence if X_n is right-regular for every n, then X is almost surely right-regular. From the definition of the Hp spaces it is trivial that for all p ≥ 1 the Hp martingales are uniformly integrable. From these the next observation is obvious:

Proposition 1.57 Hp as a set of equivalence classes with the norm
‖X‖_{Hp} = ‖ sup_t |X(t)| ‖_p   (1.19)
is a Banach space. If p > 1 then by Corollary 1.54 X ∈ Hp if and only if X is bounded in Lp(Ω).

1.3.2
The energy equality
An important elementary property of martingales is the following:

Proposition 1.58 (Energy equality) Let X be a martingale and assume that X(t) is square-integrable for all t. If s < t then
E((X(t) − X(s))^2) = E(X^2(t)) − E(X^2(s)).

Proof. The difference of the two sides is
d = 2 · E(X(s) · (X(s) − X(t))).
As s < t, by the martingale property
d_n = 2 · E(X(s)χ(|X(s)| ≤ n) · (X(s) − X(t)))
= 2 · E(E(X(s)χ(|X(s)| ≤ n) · (X(s) − X(t)) | Fs))
= 2 · E(X(s)χ(|X(s)| ≤ n) · E(X(s) − X(t) | Fs))
= 2 · E(X(s)χ(|X(s)| ≤ n) · 0) = 0.
As X(s), X(t) ∈ L2(Ω), obviously |X(s) · (X(s) − X(t))| is integrable. Hence one can use the Dominated Convergence Theorem on both sides, so d = lim_{n→∞} d_n = 0.
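The energy equality can be verified exactly for the symmetric random walk martingale X_k = ξ_1 + … + ξ_k by enumerating all ±1 paths, each carrying probability 2^{-t}. The horizon and the two time points are illustrative choices.

```python
from itertools import product

s, t = 3, 6
paths = list(product((-1, 1), repeat=t))  # all 2^t equally likely paths

def X(path, k):
    """Value of the walk after k steps along the given path."""
    return sum(path[:k])

# E((X_t - X_s)^2) versus E(X_t^2) - E(X_s^2), computed exactly.
lhs = sum((X(p, t) - X(p, s)) ** 2 for p in paths) / len(paths)
rhs = (sum(X(p, t) ** 2 for p in paths) - sum(X(p, s) ** 2 for p in paths)) / len(paths)
print(lhs, rhs)  # 3.0 3.0
```

Both sides equal t − s = 3, reflecting that the increments are orthogonal to the past, which is exactly the cancellation used in the proof above.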
Corollary 1.59 If X ∈ H2 then there is a random variable, denoted by X(∞), such that X(∞) ∈ L2(Ω, F∞, P) and
X(t) = E(X(∞) | Ft) almost surely   (1.20)
for every t, and in L2(Ω)-convergence
lim_{t→∞} X(t) = X(∞).
Proof. Let t_n ↗ ∞ be arbitrary. By the energy equality the sequence ‖X(t_n)‖_2^2 is increasing, and by the definition of H2 it is bounded from above. Also by the energy equality, if n > m then
‖X(t_n) − X(t_m)‖_2^2 = ‖X(t_n)‖_2^2 − ‖X(t_m)‖_2^2,
hence (X(t_n)) is a Cauchy sequence in L2(Ω). As L2(Ω) is complete, the sequence (X(t_n)) is convergent in L2(Ω). It is obvious from the construction that the limit X(∞) as an object in L2(Ω) is unique, that is, X(∞) ∈ L2(Ω) is independent of the sequence (t_n). X is a martingale, so if s ≥ 0 then almost surely
X(t) = E(X(t + s) | Ft).
In probability spaces L1-convergence follows from L2-convergence, and as the conditional expectation is continuous in L1(Ω), letting s → ∞,
X(t) = E( lim_{s→∞} X(t + s) | Ft ) = E(X(∞) | Ft) almost surely.
Example 1.60 Wiener processes and the structure of the square-integrable martingales.
Let u < ∞ and let w be a Wiener process on the interval Θ = [0, u]. As w has independent increments, for every t ≤ u
E(w(u) | Ft) = E(w(u) − w(t) | Ft) + E(w(t) | Ft) = w(t).
On the half-line R+, w is not bounded in L2(Ω), that is, if Θ = R+ then w ∉ H2, and of course the representation (1.20) does not hold.

Proposition 1.61 Let X be a martingale and let p ≥ 1. If for some random variable X(∞)
X(t) → X(∞) in Lp(Ω),
then
X(t) → X(∞) almost surely
and
X(t) = E(X(∞) | Ft) almost surely, t ≥ 0.   (1.21)
Proof. As the conditional expectation is continuous in L1(Ω), letting s → ∞ in the relation
X(t) = E(X(t + s) | Ft) almost surely, t ≥ 0,
(1.21) follows. For an arbitrary s the increment N(u) = X(u + s) − X(s) is a martingale with respect to the filtration G_u = F_{s+u}. Let
β(s) = sup_{u≥0} |X(u + s) − X(∞)| ≤ sup_u |N(u)| + |X(s) − X(∞)|.
X is right-regular, so it is sufficient to take the supremum over the rational numbers, so β(s) is measurable. Moreover
sup_u ‖N(u)‖_p ≤ ‖X(s) − X(∞)‖_p + sup_u ‖X(u + s) − X(∞)‖_p.
Let ε > 0 be arbitrary. As X(s) → X(∞) in Lp, if s is large enough then the right-hand side is less than ε. By Doob's and by Markov's inequalities
P(β(s) > 2δ) ≤ P(|X(s) − X(∞)| > δ) + P( sup_u |N(u)| > δ )
≤ ‖X(s) − X(∞)‖_p^p / δ^p + (ε/δ)^p.
Therefore if s → ∞ then β(s) → 0 in probability. Every stochastically convergent sequence has an almost surely convergent subsequence. By the definition of β(s), if β(s_k) → 0 almost surely then X(t) → X(∞) almost surely.

Corollary 1.62 If X ∈ H2 then there is a random variable X(∞) ∈ L2(Ω) such that X(t) → X(∞), where the convergence holds in L2(Ω) and almost surely.

1.3.3
The quadratic variation of discrete time martingales
Our goal is to extend the result just proved to spaces Hp , p ≥ 1. The main tool of stochastic analysis is the so-called quadratic variation. Let us first investigate the quadratic variation of discrete-time martingales. Proposition 1.63 (Austin) Let Z denote the set of integers. Let X = (Xn , Fn )n∈Z be a martingale over Z, that is let us assume that Θ = Z. If X
is bounded in L1(Ω) then the 'quadratic variation' of X is almost surely finite:
Σ_{n=−∞}^{∞} (X_{n+1} − X_n)^2 < ∞ almost surely.   (1.22)
Proof. As X is bounded in L1(Ω) there is a k < ∞ such that ‖X_n‖_1 ≤ k for all n ∈ Z. Let X* = sup_n |X_n|. |X| is a non-negative submartingale, so by Doob's inequality
P(X* ≥ p) ≤ k/p,
therefore X* is almost surely finite. Fix a number p and define the continuously differentiable, convex function
f(t) = t^2 if |t| ≤ p, and f(t) = 2p|t| − p^2 if |t| > p.
As f is convex, the expression
g(s_1, s_2) = f(s_2) − f(s_1) − (s_2 − s_1) f′(s_1)
is non-negative. If |s_1|, |s_2| ≤ p then
g(s_1, s_2) = s_2^2 − s_1^2 − (s_2 − s_1) · 2s_1 = (s_2 − s_1)^2.
By the definition of f obviously f(t) ≤ 2p|t|. Therefore
E(f(X_n)) ≤ 2p E(|X_n|) ≤ 2pk.   (1.23)
By the elementary properties of the conditional expectation
E((X_{n+1} − X_n) f′(X_n)) = E(E((X_{n+1} − X_n) f′(X_n) | F_n)) = E(f′(X_n) E(X_{n+1} − X_n | F_n)) = 0
for all n ∈ Z. From this and from (1.23), using the definition of g, for all n
2pk ≥ E(f(X_n)) ≥ E(f(X_n) − f(X_{−n}))
= Σ_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i))
= Σ_{i=−n}^{n−1} E(f(X_{i+1}) − f(X_i) − (X_{i+1} − X_i) f′(X_i))
= Σ_{i=−n}^{n−1} E(g(X_i, X_{i+1})).
By the Monotone Convergence Theorem
E( Σ_{n∈Z} (X_{n+1} − X_n)^2 χ(X* ≤ p) ) = E( Σ_{n∈Z} g(X_n, X_{n+1}) χ(X* ≤ p) )
≤ E( Σ_{n∈Z} g(X_n, X_{n+1}) ) = Σ_{n∈Z} E(g(X_n, X_{n+1})) ≤ 2pk.
As X* is almost surely finite, Σ_{n∈Z} (X_{n+1} − X_n)^2 is almost surely convergent.
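The properties of the auxiliary function f used in the proof can be checked numerically: its convexity gap g is non-negative everywhere and reduces to the squared increment inside [−p, p]. The value of p and the grid of test points are illustrative choices.

```python
p = 3.0

def f(t):
    # f(t) = t^2 for |t| <= p, linear growth 2p|t| - p^2 outside
    return t * t if abs(t) <= p else 2 * p * abs(t) - p * p

def fprime(t):
    # derivative of f; f is C^1 since both pieces match at |t| = p
    return 2 * t if abs(t) <= p else (2 * p if t > 0 else -2 * p)

def g(s1, s2):
    # convexity gap (Bregman-type difference) used in the proof
    return f(s2) - f(s1) - (s2 - s1) * fprime(s1)

pts = [i / 4 for i in range(-40, 41)]
ok_nonneg = all(g(a, b) >= -1e-9 for a in pts for b in pts)
ok_square = all(abs(g(a, b) - (b - a) ** 2) < 1e-9
                for a in pts for b in pts if abs(a) <= p and abs(b) <= p)
print(ok_nonneg, ok_square)  # True True
```

The key point exploited in the proof is visible here: truncating the walk on {X* ≤ p} turns the uniformly controlled quantity g into the actual squared increments.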
Corollary 1.64 Let X = (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L1(Ω) and (τ_n) is an increasing sequence of stopping times then almost surely
Σ_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞.   (1.24)
Proof. For every m let us introduce the bounded stopping times τ_n^m = τ_n ∧ m. By the discrete-time version of the Optional Sampling Theorem[32]
X^m = (X(τ_n^m), F_{τ_n^m})_n
is a martingale, and therefore from the proof of the previous proposition
2pk ≥ E( Σ_{n=1}^{∞} (X(τ_{n+1}^m) − X(τ_n^m))^2 χ(X* ≤ p) ).
If m → ∞ then by Fatou's lemma
E( Σ_{n=1}^{∞} (X(τ_{n+1}) − X(τ_n))^2 χ(X* ≤ p) ) ≤ 2pk,
from which (1.24) is obvious.

[32] See: Lemma 1.83, page 49.
Corollary 1.65 Let X = (X_n, F_n) be a martingale over the natural numbers N. If X is bounded in L1(Ω) then there is a variable X∞ such that |X∞| < ∞ and
lim_{n→∞} X_n = X∞ almost surely.

Proof. Assume that for some ε > 0, on a set A of positive measure,
lim sup_{p,q→∞} |X_p − X_q| ≥ 2ε.   (1.25)
Let τ_0 = 1, and let
τ_{n+1} = inf{m ≥ τ_n : |X_m − X_{τ_n}| ≥ ε}.
Obviously τ_n is a stopping time for all n and the sequence (τ_n) is increasing. On the set A, |X(τ_{n+1}) − X(τ_n)| ≥ ε for every n. By (1.24) almost surely Σ_{n=0}^{∞} (X(τ_{n+1}) − X(τ_n))^2 < ∞, which is impossible.
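A classical illustration of almost sure martingale convergence is the Pólya urn: the fraction of red balls is a bounded, hence L1(Ω)-bounded, martingale, so by the convergence theorem its trajectories settle down. The seed, urn size and horizon below are illustrative choices; the simulation only exhibits one converging trajectory.

```python
import random

random.seed(1)
red, black = 1, 1
fractions = []
for n in range(6000):
    # draw a ball proportionally and return it with one extra of the same colour
    if random.random() < red / (red + black):
        red += 1
    else:
        black += 1
    fractions.append(red / (red + black))

tail = fractions[5000:]
osc = max(tail) - min(tail)  # late oscillation of the trajectory
print(0.0 <= min(tail) and max(tail) <= 1.0, osc < 0.1)
```

Across many runs the limiting fraction is random (for this urn it is uniformly distributed on [0, 1]), but each individual trajectory converges, which is what the vanishing late oscillation shows.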
Corollary 1.66 If X = (X_n, F_n) is a non-negative martingale then there exists a finite, non-negative variable X∞ such that X∞ ∈ L1(Ω) and almost surely X_n → X∞.

Proof. X is non-negative and the expected value of X_n is the same for all n, hence X is obviously bounded in L1(Ω). So the almost sure limit X_n → X∞ exists. By Fatou's lemma
X(0) = E(X_n | F0) = lim_{n→∞} E(X_n | F0) ≥ E( lim inf_{n→∞} X_n | F0 ) = E(X∞ | F0) ≥ 0,
and therefore X∞ ∈ L1(Ω).

Corollary 1.67 Assume that Θ = R+. If X is a uniformly integrable martingale then there is a variable X(∞) ∈ L1(Ω) such that X(t) → X(∞), where the convergence holds in L1(Ω) and almost surely. For all t
X(t) = E(X(∞) | Ft) almost surely.   (1.26)
Proof. Every uniformly integrable set is bounded in L1, so if t_n ↗ ∞, then there is an X(∞) such that X(t_n) → X(∞) almost surely. By the uniform integrability the convergence holds in L1(Ω) as well. Obviously X(∞) as an equivalence class is independent of (t_n). The relation (1.26) is an easy consequence of the L1(Ω)-continuity of the conditional expectation.
Corollary 1.68 Assume that p ≥ 1 and Θ = R+. If X ∈ Hp then there is a variable X(∞) ∈ Lp(Ω) such that X(t) → X(∞), where the convergence holds in Lp(Ω) and almost surely. For all t
X(t) = E(X(∞) | Ft) almost surely.   (1.27)
Proof. If the measure is finite and p ≤ q then Lq ⊆ Lp. Hence if p ≥ 1 and X ∈ Hp then X ∈ H1, so if t_n ↗ ∞ then there is a variable X(∞) such that X(t_n) → X(∞) almost surely. As by the definition of the Hp spaces |X(t)|^p ≤ sup_s |X(s)|^p ∈ L1(Ω), X(∞) ∈ Lp(Ω) and by the Dominated Convergence Theorem the convergence holds in Lp(Ω) as well. Obviously X(∞), as an equivalence class, is independent of (t_n). The relation (1.27) is an easy consequence of the L1(Ω)-continuity of the conditional expectation.

Theorem 1.69 (Lévy's convergence theorem) If (F_n) is an increasing sequence of σ-algebras, ξ ∈ L1(Ω) and F∞ = σ(∪_n F_n), then
X_n = E(ξ | F_n) → E(ξ | F∞),
where the convergence holds in L1(Ω) and almost surely.

Proof. Let X_n = E(ξ | F_n). As
E(|X_n|) = E(|E(ξ | F_n)|) ≤ E(E(|ξ| | F_n)) = E(|ξ|) < ∞,
X = (X_n, F_n) is an L1(Ω)-bounded martingale. Therefore X_n → X∞ almost surely. After the proof we shall prove as a separate lemma that the sequence (X_n) is uniformly integrable, hence X_n → X∞ in L1. If A ∈ F_n and m ≥ n, then
∫_A X_m dP = ∫_A ξ dP,
hence as X_m → X∞ in L1,
∫_A X∞ dP = ∫_A ξ dP,  A ∈ ∪_n F_n.   (1.28)
As X∞ and ξ are integrable, it is easy to see that the sets A for which (1.28) is true form a λ-system. As (F_n) is increasing, ∪_n F_n is obviously a π-system. Therefore by the Monotone Class Theorem (1.28) is true if A ∈ F∞ = σ(∪_n F_n). X∞ is obviously F∞-measurable, hence X∞ = E(ξ | F∞) almost surely.
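Lévy's theorem can be illustrated with dyadic σ-algebras on [0, 1]: F_n is generated by the dyadic intervals of length 2^{-n}, so E(ξ | F_n) is the block average of ξ, and the L1 distance to ξ shrinks as the σ-algebras grow. The function ξ(ω) = ω^2 and the grid resolution are illustrative choices; everything is computed on a fine uniform grid standing in for [0, 1].

```python
N = 2 ** 12                                      # grid resolution
xi = [((i + 0.5) / N) ** 2 for i in range(N)]    # xi sampled at cell centres

def cond_exp(values, n):
    """E(xi | F_n) for the dyadic sigma-algebra: blockwise averages."""
    block = len(values) // 2 ** n
    out = []
    for b in range(2 ** n):
        chunk = values[b * block:(b + 1) * block]
        avg = sum(chunk) / block
        out.extend([avg] * block)
    return out

def l1_error(n):
    approx = cond_exp(xi, n)
    return sum(abs(a - x) for a, x in zip(approx, xi)) / N

# The L1 distance to xi decreases as the sigma-algebras refine.
print(l1_error(2) > l1_error(5) > l1_error(8))  # True
```

This is the convergence E(ξ | F_n) → E(ξ | F∞) = ξ of the theorem, since here σ(∪_n F_n) separates all the grid cells.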
Lemma 1.70 If ξ ∈ L1, and (F_α)_{α∈A} is an arbitrary set of σ-algebras, then the set of random variables
X_α = E(ξ | F_α),  α ∈ A,   (1.29)
is uniformly integrable.

Proof. By Markov's inequality, for every α
P(|X_α| ≥ n) ≤ (1/n) E(E(|ξ| | F_α)) = (1/n) E(|ξ|).
Therefore for any δ there is an n_0 such that if n ≥ n_0, then P(|X_α| ≥ n) < δ. As for ξ ∈ L1(Ω) the integral function A ↦ ∫_A |ξ| dP is absolutely continuous, for arbitrary ε > 0 there is a δ such that if P(A) < δ, then ∫_A |ξ| dP < ε. Hence if n is large enough, then
∫_{{|X_α|>n}} |X_α| dP ≤ ∫_{{|X_α|>n}} E(|ξ| | F_α) dP = ∫_{{|X_α|>n}} |ξ| dP < ε,
which means that the set (1.29) is uniformly integrable. 1.3.4
The downcrossings inequality
Let X be an arbitrary adapted stochastic process and let a < b. Let us fix a point of time t, and let S = {s_0 < s_1 < · · · < s_m} be a finite set of moments in the time interval [0, t). Let[33]
τ_0 = inf{s ∈ S : X(s) > b} ∧ t.
By induction define
τ_{2k+1} = inf{s ∈ S : s > τ_{2k}, X(s) < a} ∧ t,
τ_{2k} = inf{s ∈ S : s > τ_{2k−1}, X(s) > b} ∧ t.
It is easy to check that τ_k is a stopping time for all k. It is easy to see that if X is an integrable submartingale then, almost surely, the inequality
τ_{2k} ≤ τ_{2k+1} < t

[33] If the set after inf is empty, then the infimum is by definition +∞.
is impossible, as in this case X(τ_{2k}) > b, X(τ_{2k+1}) < a and by the submartingale property[34]
b < E(X(τ_{2k})) ≤ E(X(τ_{2k+1})) < a,
which is impossible. We say that a function f downcrosses the interval [a, b] if there are points u < v with f(u) > b and f(v) < a. By definition f has n downcrossings with thresholds a, b on the set S if there are points in S
u_1 < v_1 < u_2 < v_2 < · · · < u_n < v_n
with f(u_k) > b, f(v_k) < a. Let us denote by D_S^{a,b} the number of downcrossings of X over a < b in the set S. Obviously
{D_S^{a,b} ≥ n} = {τ_{2n−1} < t} ∈ Ft,
and hence D_S^{a,b} is Ft-measurable. We show that
χ(D_S^{a,b} ≥ n) ≤ ( Σ_{k=0}^{m} (X(τ_{2k}) − X(τ_{2k+1})) + (X(t) − b)^+ ) / (n(b − a)).   (1.30)
Recall that m is the number of points in S. Therefore the maximum number of possible downcrossings is obviously m. If we have more than n downcrossings then the first n terms in the sum are bigger than b − a. For every trajectory all but the last non-zero term of the sum are positive, as they are all not smaller than b − a > 0. There are two possibilities: in the last non-zero term either τ_{2k+1} < t or τ_{2k+1} = t. In the first case X(τ_{2k}) − X(τ_{2k+1}) > b − a > 0. In the second case still X(τ_{2k}) > b, therefore in this case X(τ_{2k}) − X(τ_{2k+1}) > b − X(t). Of course b − X(t) can be negative. This is the reason why we added the correction term (X(t) − b)^+ to the sum. If b − X(t) < 0 then
X(τ_{2k}) − X(τ_{2k+1}) + (X(t) − b)^+ = X(τ_{2k}) − X(τ_{2k+1}) + X(t) − b = X(τ_{2k}) − X(t) + X(t) − b = X(τ_{2k}) − b > 0,

[34] See: Lemma 1.83, page 49.
44
STOCHASTIC PROCESSES
which means that (1.30) always holds. Taking the expectation on both sides P
DSa,b
m k=0
≥n ≤E
+
(X(τ 2k ) − X(τ 2k+1 )) + (X(t) − b) n(b − a)
=
1 E (X(τ 2k ) − X(τ 2k+1 )) + n (b − a) k=0
1 + + E (X(t) − b) . n (b − a) m
=
Now assume that X is an integrable submartingale. As t ≥ τ 2k+1 ≥ τ 2k by the discrete Optional Sampling Theorem35 E (X(τ 2k ) − X(τ 2k+1 )) ≤ 0, so
P DSa,b ≥ n ≤
+ E (X(t) − b) n (b − a)
.
If the number of points of S increases by refining S then the number of downcrossings DSa,b does not decrease. If S is an arbitrary countable set then the number of downcrossings in S is the supremum of the downcrossings of the finite subsets of S. With the Monotone Convergence Theorem we get the following important inequality: Theorem 1.71 (Downcrossing inequality) If X is an integrable submartingale and S is an arbitrary finite or countable subset of the time interval [0, t) then
E ((X(t) − b)+ ) P DSa,b ≥ n ≤ . n (b − a) In particular
P DSa,b = ∞ = 0. There are many important consequences of this inequality. The first one is a generalization of the martingale convergence theorem. Corollary 1.72 (Submartingale convergence theorem) Let X (Xn , Fn ) be a submartingale over the natural numbers N. If X is bounded in L1 (Ω) then 35 See:
Lemma 1.83, page 49.
MARTINGALES
45
there is a variable X∞ ∈ L1 (Ω) such that a.s.
lim Xn = X∞ .
(1.31)
n→∞ a.s.
Proof. If Xn → X∞ then by Fatou’s lemma E (|X∞ |) ≤ lim inf E (|Xn |) ≤ k < ∞ n→∞
and X∞ ∈ L1 (Ω). Let a < b be rational thresholds, and let Sm {1, 2, . . . , m}. As E (|Xm |) ≤ k for all m
P DSa,b ≥n ≤ m
+ E (Xm − b) n (b − a)
≤
k . n(b − a)
If m ∞ then for all n
P DNa,b = ∞ ≤ P DNa,b ≥ n ≤
k , n(b − a)
which implies that P DNa,b = ∞ = 0. The convergence in (1.31) easily follows from the next lemma: Lemma 1.73 Let (cn ) be an arbitrary sequence of real numbers. If for every a < b rational thresholds the number of downcrossings of the sequence (cn ) is finite then the (finite or infinite) limit limn→∞ cn exists. Proof. The lim supn cn and the lim inf n cn extended real numbers always exist. If lim inf cn < a < b < lim sup cn n→∞
n→∞
then the number of the downcrossings of (cn ) is infinite. Definition 1.74 Let ξ ∈ L1 (Ω) and let Xn E (ξ | Fn ) , n ∈ N. Assume that the sequence of σ-algebras (Fn ) is decreasing, that is Fn+1 ⊆ Fn for all n ∈ N. These type of sequences are called reversed martingales. If Y−n Xn for all n ∈ N and G−n Fn then Y = (Yn , Gn ) is martingale over the parameter set Θ = {−1, −2, . . .}. If (Xn , Fn ) is a reversed martingale then one can assume that Xn = E (X0 | Fn ) for all n. If X is a continuous-time martingale and tn t∞ then the sequence (X(tn ), Ftn )n is a reversed martingale.
46
STOCHASTIC PROCESSES
Theorem 1.75 (L´ evy) If (Fn ) is a decreasing sequence of σ-algebras, X0 ∈ L1 (Ω) and F∞ ∩n Fn then Xn E (X0 | Fn ) → E (X0 | F∞ ) , where the convergence holds in L1 (Ω) and almost surely. Proof. As (Xn ) is uniformly integrable36 , it is sufficient to prove that (Xn ) is almost surely convergent. Let a < b be rational thresholds. On the set A
lim inf Xn < a < b < lim sup Xn n→∞
n→∞
the number of downcrossings is infinite. As n → X−n is a martingale on Z, the probability of A is zero. Hence a.s.
lim inf Xn = lim sup Xn . n→∞
1.3.5
n→∞
Regularization of martingales
Recall that, by definition, every continuous-time martingale is right-regular. Let F be an arbitrary filtration, and let ξ ∈ L1 (Ω). In discrete-time the sequence Xn E(ξ | Fn ) is a martingale as for every s < t a.s.
E (X(t) | Fs ) E (E(ξ | Ft ) | Fs ) = E (ξ | Fs ) X(s). In continuous-time X is not necessarily a martingale as the trajectories of X are not necessarily right-regular. Definition 1.76 A stochastic process X has martingale structure if E (X (t)) is finite for every t and a.s.
E (X(t) | Fs ) = X(s) for all s < t. Our goal is to show that if the filtration F satisfies the usual conditions then every stochastic process with martingale structure has a modification which is a 36 See:
Lemma 1.70, page 42.
MARTINGALES
47
martingale. The proof depends on the following simple lemma: Lemma 1.77 If X has a martingale structure then there is an Ω0 ⊆ Ω with P(Ω0 ) = 1, such that for every trajectory X(ω) with ω ∈ Ω0 and for every rational threshold a < b the number of downcrossings over the rational numbers a,b is finite. In particular if ω ∈ Ω0 then for every t ∈ Θ the (finite or infinite) DQ limits lim X(s, ω),
st, s∈Q
lim X(s, ω)
st, s∈Q
exist. Proof. The first part of the lemma is a direct consequence of the downcrossings inequality. If limn X(sn , ω) does not exist for some sn t then for some rational / Ω0 . thresholds a < b the number of downcrossings of (X(sn , ω)) is infinite, so ω ∈ Assume that X has a martingale structure. Let Ω0 ⊆ Ω be the subset in the lemma above. (t, ω) X
0 if ω ∈ / Ω0 . limst,s∈Q X(s, ω) if ω ∈ Ω0
(1.32)
is right-regular. Let t < s, ε > 0. We show that X (s, ω) ≤ X (t, ω) − X (tn , ω) + X (t, ω) − X
(s, ω) . + |X (tn , ω) − X (sn , ω)| + X (sn , ω) − X
As for an arbitrary ω ∈ Ω0 the number of ε/3 downcrossings of X over the Q is finite, so one can assume that in a right neighbourhood (t.t + u) of t for every tn , sn ∈ Q |X (tn , ω) − X (sn , ω)|
sup X + (t) = 0, t∈[0,1]
t∈I
1
that is, without the regularity of the trajectories Doob’s inequality does not hold. Of course Y ≡ 0 is regular modification of X, and for Y Doob’s inequality holds. 1.3.6
The Optional Sampling Theorem
As a first step let us prove the discrete-time version of the Optional Sampling Theorem40 . Lemma 1.83 Let X = (Xn , Fn ) be a discrete-time, integrable submartingale. If τ 1 and τ 2 are stopping times and for some p < ∞ P (τ 1 ≤ τ 2 ) = P (τ 2 ≤ p) = 1, then X(τ 1 ) ≤ E (X(τ 2 ) | Fτ 1 ) and E (X0 ) ≤ E (X(τ 1 )) ≤ E (X(τ 2 )) ≤ E (Xp ) . If X is a martingale then in both lines above equality holds everywhere. 40 The reader should observe that we have already used this lemma several times. Of course the proof of the lemma is independent of the results above.
50
STOCHASTIC PROCESSES
Proof. Let τ 1 ≤ τ 2 ≤ p and ϕk χ (τ 1 < k ≤ τ 2 ) . Observe that {ϕk = 1} = {τ 1 < k, τ 2 ≥ k} = c
= {τ 1 ≤ k − 1} ∩ {τ 2 ≤ k − 1} ∈ Fk−1 . By the assumptions Xk is integrable for all k, so Xk − Xk−1 is also integrable, therefore the conditional expectation of the variable Xk − Xk−1 with respect to the σ-algebra Fk−1 exists. ϕk is bounded, hence p E (η) E ϕk [Xk − Xk−1 ] = k=1
=
p
E (E (ϕk [Xk − Xk−1 ] | Fk−1 )) =
k=1
=
p
E (ϕk E (Xk − Xk−1 | Fk−1 )) ≥ 0.
k=1
If τ 1 (ω) = τ 2 (ω) for some outcome ω, then ϕk (ω) = 0 for all k, hence η (ω) 0. If τ 1 (ω) < τ 2 (ω), then η (ω) X (τ 1 (ω) + 1) − X (τ 1 (ω)) + X (τ 1 (ω) + 2) − X (τ 1 (ω) + 1) + . . . + X (τ 2 (ω)) − X (τ 2 (ω) − 1) , which is X (τ 2 (ω)) − X (τ 1 (ω)). Therefore E (η) = E (X (τ 2 ) − X (τ 1 )) ≥ 0. Xk is integrable for all k, therefore E (X (τ 1 )) and E (X (τ 2 )) are finite. By the finiteness of these expected values E (X (τ 2 ) − X (τ 1 )) = E (X (τ 2 )) − E (X (τ 1 )) , hence E (X (τ 2 )) ≥ E (X (τ 1 )) . Let A ∈ Fτ 1 ⊆ Fτ 2 , and let us define the variables τ k (ω) if ω ∈ A . τ ∗k (ω) p + 1 if ω ∈ /A
(1.33)
MARTINGALES
51
τ ∗1 and τ ∗2 are stopping times since if n ≤ p, then {τ ∗k ≤ n} = A ∩ {τ k ≤ n} = A ∩ {τ k ≤ n} ∈ Fn . By (1.33) E (X
(τ ∗2 ))
=
X (τ 2 ) dP+ Ac
A
X (p + 1) dP ≥ E (X (τ ∗1 )) =
X (τ 1 ) dP+
=
X (p + 1) dP. Ac
A
As Xp+1 is integrable one can cancel inequality so
Ac
X (p + 1) dP from both sides of the
X (τ 2 ) dP ≥
A
X (τ 1 ) dP. A
X (τ 1 ) is Fτ 1 -measurable and therefore E (X (τ 2 ) | Fτ 1 ) ≥ X (τ 1 ) . To prove the continuous-time version of the Optional Sampling Theorem we need some technical lemmas: Lemma 1.84 If τ is a stopping time, then there is a sequence of stopping times (τ n ) such that τ n has finite number of values41 , τ < τ n for all n and τn τ. (n)
Proof. Divide the interval [0, n) into n2n equal parts. Ik Let τ n (ω)
k/2n +∞
if otherwise
[(k − 1) /2n , k/2n ).
ω ∈ τ −1 (Ik ) (n)
.
(n)
Obviously τ < τ n . At every step the subintervals Ik are divided equally, and (n) (n) the value of τ n on τ −1 (Ik ) is always the right endpoint of the interval Ik . Therefore τ n τ . τ is a stopping time, hence, using that, every stopping time is a weak stopping time τ 41 τ
n (ω)
−1
(n) Ik
= +∞ is possible.
=
k τ< n 2
k−1 ∩ τ< 2n
c ∈ Fk/2n .
52
STOCHASTIC PROCESSES
Therefore
i τn ≤ n 2
=
k τn = n 2
k≤i
=
(n) τ −1 Ik ∈ Fi/2n . k≤i
The possible values of τ n are among the dyadic numbers i/2n and therefore τ n is a stopping time. Lemma 1.85 If (τ n ) is a sequence of stopping times and τ n τ then Fτ n + Fτ + . If τ n > τ and τ n τ then Fτ n Fτ + . Proof. Recall that by definition A ∈ Fρ+ if A ∩ {ρ ≤ t} ∈ Ft+ for every t. If A ∈ Fρ+ , then A ∩ {ρ < t} =
n
1 A∩ ρ≤t− n
∈ ∪n F(t−1/n)+ ⊆ Ft .
1. Let A ∈ Fρ+ and let ρ ≤ σ. A ∩ {σ ≤ t} = A ∩ {ρ ≤ t} ∩ {σ ≤ t} ∈ Ft+ as A ∩ {ρ ≤ t} ∈ Ft+ and {σ ≤ t} ∈ Ft . From this it is easy to see that Fτ + ⊆ ∩n Fτ n + . If A ∈ ∩n Fτ n + , then as τ n τ A ∩ {τ < t} = A
(∪n {τ n < t}) =
(A ∩ {τ n < t}) ∈ Ft .
n
So A ∩ {τ ≤ t} =
n
1 A∩ τ τ be a finitevalued approximating sequence42 . As τ is bounded there is an N large enough that τ (n) ≤ N . By the first lemma X(τ (n) ) = E (X(N ) | Fτ (n) ) .
(1.35)
As τ (n) > τ , by the last lemma ∩n Fτ (n) = Fτ + . So by the definition of the conditional expectation (1.35) means that X(τ (n) )dP = X(N )dP, A ∈ Fτ + . A
A
X(N ) is integrable therefore the sequence X(τ (n) ) is uniformly integrable43
by (1.35). By the right-continuity of the martingales X (τ ) = limn→∞ X τ (n) , so if A ∈ Fτ + then (n) X(N )dP = lim X(τ )dP = lim X(τ (n) )dP = n→∞
A
=
A
A n→∞
X(τ )dP. A
As X (τ ) is Fτ -measurable and Fτ ⊆ Fτ + , X (τ ) = E (X (N ) | Fτ + ) . If X is uniformly integrable then one can assume that X is a martingale on [0, ∞]. There is a continuous bijective time transformation f between the intervals [0, ∞] and [0, 1]. During this transformation the properties of X and τ do not change, but f (τ ) will be bounded, so using the same argument as above one can prove that X (τ ) = E (X (∞) | Fτ + ) . Finally if τ 1 ≤ τ 2 , then as Fτ 1 + ⊆ Fτ 2 + E (X (τ 2 ) | Fτ 1 + ) = E (E (X (N ) | Fτ 2 + ) | Fτ 1 + ) = = E (X (N ) | Fτ 1 + ) = X (τ 1 ) , where if X is uniformly integrable, then N ∞. 42 See: 43 See:
Lemma 1.84, page 51. Lemma 1.70, page 42.
54
STOCHASTIC PROCESSES
Corollary 1.87 If X is a non-negative martingale and τ 1 ≤ τ 2 , then X(τ 1 ) ≥ E (X(τ 2 ) | Fτ 1 + ) .
(1.36)
Proof. First of all let us remark, that as X is a non-negative martingale X(∞) is meaningful 44 , and if n ∞ then X (τ ∧ n) → X (τ ) for every stopping time τ . Let G σ ∪n F(τ ∧n)+ . Obviously G ⊆ Fτ + . Let A ∈ Fτ + . A ∩ {τ ≤ n} ∩ {τ ∧ n ≤ t} = A ∩ {τ ≤ t ∧ n} ∈ Ft+ , therefore A ∩ {τ ≤ n} ∈ F(τ ∧n)+ . So A ∩ {τ < ∞} ∈ G. Also A ∩ {τ > n} ∩ {τ ∧ n ≤ t} = A ∩ {t ≥ τ > n} ∈ Ft+ so A ∩ {τ > n} ∈ F(τ ∧n)+ . Hence A ∩ {τ = ∞} = A ∩ (∩n {τ > n}) ∈ G, therefore G = Fτ + . Let n1 ≤ n2 . By the Optional Sampling Theorem
X(τ 1 ∧ n1 ) = E X(τ 2 ∧ n2 ) | F(τ 1 ∧n1 )+ . X(τ 2 ∧ n2 ) ∈ L1 (Ω) and therefore by L´evy’s theorem X(τ 1 ) = E (X(τ 2 ∧ n2 ) | Fτ 1 + ) . By Fatou’s lemma X(τ 1 ) = lim E (X(τ 2 ∧ n2 ) | Fτ 1 + ) ≥ E n2 →∞
lim X(τ 2 ∧ n2 ) | Fτ 1 +
n2 →∞
=
= E (X(τ 2 ) | Fτ 1 + ) .
Proposition 1.88 (Optional Sampling Theorem for submartingales) Let τ 1 ≤ τ 2 bounded stopping times. If X is an integrable submartingale then X (τ 1 ) and X (τ 2 ) are integrable and X (τ 1 ) ≤ E (X (τ 2 ) | Fτ 1 ) .
(1.37)
The inequality also holds if τ 1 ≤ τ 2 are arbitrary stopping times and X can be extended as an integrable submartingale to [0, ∞]. Proof. The proof of the proposition is nearly the same as the proof in the martingale case. Again it is sufficient to prove the inequality in the bounded 44 See:
Corollary 1.66, page 40.
MARTINGALES (n)
55
(n)
case. Assume that τ 1 ≤ τ 2 ≤ K and let (τ 1 )n and (τ 2 )n be the finite-valued (n) (n) approximating sequences of τ 1 and τ 2 . By the construction τ 1 ≤ τ 2 , so by the first lemma of the subsection (n) (n) X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
(n)
By the right-continuity of submartingales X(τ k ) → X(τ k ) and therefore one should prove that the convergence holds in L1 (Ω), that is, one should prove the (n) uniform integrability of the sequences (X(τ k )). Since in this case one can take the limits under the integral signs therefore X(τ 1 )dP ≤ X(τ 2 )dP, F ∈ Fτ 1 + . F
F
As X(τ 1 ) is Fτ 1 + -measurable by the definition of the conditional expectation X (τ 1 ) = E (X (τ 1 ) | Fτ 1 ) ≤ E (X (τ 2 ) | Fτ 1 + ) . This means that (1.37) holds. Let us prove that the sequence uniformly integrable.
(n) X(τ k ) is
1. As X is submartingale, X + is also submartingale, therefore from the finite Optional Sampling Theorem
(n) ≤ E X + (K) | Fτ (n) . 0 ≤ X+ τ k k
The right-hand side is uniformly integrable45 , so the left-hand side is also uniformly integrable. (n)
2. Let Xn X(τ k ). By the finite Optional Sampling Theorem (Xn ) is obviously an integrable reversed submartingale. Let n > m. As (Xn ) is a reversed submartingale 0≤ Xn− dP = − Xn dP = Xn dP − E(Xn ) ≤ {Xn− ≥N } {Xn− ≥N } {Xn− 0, then σ τ ∧ N is a bounded stopping time. If π were not right but left-continuous then one could not apply the Optional Sampling Theorem: if P were left-continuous then P (σ) = 0, and E (π (0)) = 0 = E (−λσ) = E (P (σ) − λσ) = E (π (σ)) . Let w be a Wiener process and let τ a inf {t : w(t) = a}
MARTINGALES
57
be the first passage time of an a = 0. As w is not uniformly integrable and τ a is unbounded, one cannot apply the Optional Sampling Theorem: almost surely46 a.s. τ a < ∞, hence w (τ a ) = a. Therefore E (w (τ a )) = E (a) = a = 0 = E (w (0)) .
Example 1.90 The exponential martingales of Wiener processes are not uniformly integrable.
Let w be a Wiener process. If the so-called exponential martingale X (t) exp (w (t) − t/2) were uniformly integrable, then for every stopping time one could apply the Optional Sampling Theorem. X is a non-negative martingale, therefore there is47 a random variable X (∞) such that almost surely X(t) → X(∞). For almost all trajectories of w the set {w = 0} is unbounded48 , therefore w(σ n ) = 0 for some sequence σ n ∞. Therefore σ
σn
a.s. n X (∞) = lim X (σ n ) lim exp w (σ n ) − = lim exp − = 0. n→∞ n→∞ n→∞ 2 2 a.s.
Since X(0) = 1, X(∞) = 0 and X is continuous, if a < 1 then almost surely τ a inf {t : X(t) = a} < ∞. a.s.
That is X (τ a ) = a. So if a < 1, then E (X (0)) = 1 > a = E (X (τ a )) . Hence X is not uniformly integrable. Proposition 1.91 (Martingales and conservation of the expected value) Let X be an adapted and right-regular process. X is a martingale if and only if X(τ ) ∈ L1 (Ω)
and
E (X(τ )) = E (X(0))
for all bounded stopping times τ . This property holds for every stopping time τ if and only if X is a uniformly integrable martingale. 46 See:
Proposition B.7, page 564. Corollary 1.66, page 40. 48 See: Corollary B.8, page 565. 47 See:
58
STOCHASTIC PROCESSES
Proof. If X is a martingale, or uniformly integrable martingale, then by the Optional Sampling Theorem the proposition holds. Let s < t and let A ∈ Fs . It is easy to check that τ = tχAc + sχA
(1.38)
is a bounded stopping time. By the assumption of the proposition E (X(0)) = E (X(τ )) = E (X(t)χAc ) + E (X(s)χA ) . As τ ≡ t is also a stopping time, E (X(0)) = E (X(t)) = E (X(t)χAc ) + E (X(t)χA ) . Comparing the two equations E (X(s)χA ) = E (X(t)χA ) , that is E (X(s) | Fs ) = E (X(t) | Fs ) . As X is adapted, X(s) is Fs -measurable so X(s) = E (X(t) | Fs ). If one can apply the property E (X(τ )) = E (X(0)) for every stopping time τ then one can apply it for the stopping time τ ≡ ∞ as well. Hence X (∞) exists and in (1.38) t = ∞ is possible, hence X(s) = E (X(∞) | Fs ) , so X is uniformly integrable49 . Corollary 1.92 (Conservation of the martingale property under truncation) If X is a martingale and τ is a stopping time then the truncated process X τ is also a martingale. Proof. If X is right-regular then the truncated process X τ is also right-regular. By Proposition 1.35 X τ is adapted. Let φ be a bounded stopping time. As υ φ ∧ τ is a bounded stopping time by Proposition 1.91 E (X τ (φ)) = E (X(υ)) = E (X(0)) = E (X τ (0)) and therefore X τ is a martingale. 1.3.7
Application: elementary properties of L´ evy processes
L´evy processes are natural generalizations of Wiener and Poisson processes. Let us fix a stochastic base space (Ω, A, P, F) and assume that Θ = [0, ∞). Definition 1.93 Let X be an adapted stochastic process. X is a process with independent increments with respect to the filtration F if 49 See:
Lemma 1.70, page 42 .
MARTINGALES
59
1. X (0) = 0, 2. X is right-regular, 3. whenever s < t then the increment X (t) − X (s) is independent of the σ-algebra Fs . A process X with independent increments is a L´evy process, if it has stationary or homogeneous increments that is for every t and for every h > 0 the distribution of the increment X(t + h) − X(t) is the same as the distribution of X(h) − X(0). By definition every L´evy process and every process with independent increments has right-regular trajectories. This topological assumption is very important as it is not implied by the other assumptions: Example 1.94 Not every process starting from zero and having stationary and independent increments is a L´evy process.
Let Ω be arbitrary and A = Ft = {∅, Ω} and let (xα )α be a Hamel basis of R over the rational numbers. For every t let X(t) be the sum of the coordinates of t in the Hamel basis. Obviously X(t + s) = X(t) + X(s) so X has stationary and independent increments. But as X is highly discontinuous50 it does not have a modification which is a L´evy process. Example 1.95 The sum of two L´evy processes is not necessarily a L´evy process51 .
We show that even the sum of two Wiener processes is not a Wiener process. The present counter example is very important as it shows that, although the L´evy processes are the canonical and most important examples of semimartingales, they are not the right objects from the point of view of the theory. The sum of two semimartingales52 is a semimartingale and the same is true for martingales or for local martingales. But it is not true for L´evy processes! 1. Let Ω be the set of two-dimensional continuous functions R+ → R2 with the property f (0) = (0, 0). Let P1 be a measure on the Borel σ-algebra of Ω for which the canonical stochastic process X (ω, t) = ω (t) is a two-dimensional Wiener process with correlation coefficient 1. In the same way let P2 be the measure on Ω under which X is a Wiener process with correlation coefficient −1. Let P (P1 + P2 )/2. It is easy to see that the coordinate processes w1 (t) and 50 The
image space of X is the rational numbers! example depends on results which we shall prove later. So the reader can skip the example at the first reading. 52 We shall introduce the definitions of semimartingales and local martingales later. 51 The
60
STOCHASTIC PROCESSES
w2 (t) are Wiener processes. On the other hand, a simple calculation shows that the distribution of Z w1 + w2 is not Gaussian. Z is continuous and every continuous L´evy process is a linear combination of a Wiener process and a linear trend53 , therefore, as Z is not a Gaussian process it cannot be a L´evy process. 2. The next example is bit more technical, but very similar: Let w be a Wiener t process with respect to some filtration F. Let X (t) 0 sign (w) dw, where the integral, of course, is an Itˆ o integral. The quadratic variation of X is
t
2
(sign (w)) d [w] =
[X] (t) = 0
t
1ds = t 0
so by L´evy’s characterization theorem54 the continuous local martingale X is also a Wiener process55 with respect to F. If Z w + X = 1 • w + sign (w (s)) • w = (1 + sign (w (s))) • w then Z is a continuous martingale with respect to F with zero expected value. [Z] (t) =
t
2
(1 + sign (w)) d [w] = 0
t
2
(1 + sign (w (s))) ds 0
so Z is not a Wiener process. As in the first example, every continuous L´evy process is a linear combination of a Wiener process and a linear trend, therefore, as Z is not a Wiener process it cannot be a L´evy process. During the proof of the next proposition, we shall need the next very useful simple observation: Lemma 1.96 ξ 1 and ξ 2 are independent vector-valued random variables if and only if ϕ = ϕ 1 · ϕ2 , where ϕ1 is the Fourier transform of ξ 1 and ϕ2 is the Fourier transform of ξ 2 and ϕ is the Fourier transform of the joint distribution of (ξ 1 , ξ 2 ). Proof. If ξ 1 and ξ 2 are independent then the decomposition obviously holds. The other implication is an easy consequence of the Monotone Class Theorem: 53 See:
Theorem 6.11, page 367. Theorem 6.13, page 368. 55 See: Example 6.14, page 370. 54 See:
MARTINGALES
61
fix a vector v and let L be the set of bounded functions u for which E (u (ξ 1 ) · exp (i (v, ξ 2 ))) = E (u (ξ 1 )) · E (exp (i (v, ξ 2 ))) . L is obviously a λ-system. Under the conditions of the lemma L contains the π-system of the functions u (x) = exp (i (u, x)) , so it contains the characteristic functions of the sets of the σ-algebra generated by these exponential functions. Therefore it is easy to see that for every Borel measurable set B E (χB (ξ 1 ) · exp (i (v, ξ 2 ))) = P (ξ 1 ∈ B) · E (exp (i (v, ξ 2 ))) . Now let L be the set of bounded functions v for which E (χB (ξ 1 ) · v (ξ 2 )) = P (ξ 1 ∈ B) · E (v (ξ 2 )) . With the same argument as above, by the Monotone Class Theorem for any Borel measurable set D, one can choose v = χD . So P (ξ 1 ∈ B, ξ 2 ∈ D) = E (χB (ξ 1 ) · χD (ξ 2 )) = P (ξ 1 ∈ B) · P (ξ 2 ∈ D) therefore, by independent.
definition,
the
random
vectors
ξ1
and
ξ2
are
Proposition 1.97 For an adapted process X the increments are independent if and only if the σ-algebra Gt generated by the increments X (u) − X (v) ,
u≥v≥t
is independent of Ft for every t. Proof. To make the notation as simple as possible let X (t0 ) denote an arbitrary Ft0 -measurable random variable. Let 0 = t−1 ≤ t = t0 ≤ t1 ≤ t2 ≤ . . . ≤ tn . We show that if X has independent increments then the random variables X(t0 ), X(t1 ) − X(t0 ), X(t2 ) − X(t1 ), . . . , X(tn ) − X(tn−1 )
(1.39)
are independent. To prove this one should prove that the Fourier transform of the joint distribution of the variables in (1.39) is the product of the Fourier
62
STOCHASTIC PROCESSES
transforms of the distributions of these increments: n uj [X(tj ) − X(tj−1 )] = ϕ(u) E exp i
j=0
= E E exp i
= E exp i
= E exp i
= E exp i
= E exp i
n j=0
n−1
uj ∆X(tj ) E (exp (iun ∆X(tn ))) =
uj ∆X(tj ) ϕtn ,tn−1 (un ) =
j=0 n−1
uj ∆X(tj ) | Ftn−1 =
j=0 n−1
uj ∆X(tj ) E exp (iun ∆X(tn )) | Ftn−1 =
j=0 n−1
uj ∆X(tj ) ϕtn ,tn−1 (un ) = · · · =
j=0
=
n !
ϕtj ,tj−1 (uj ).
j=0
Of course this means that the σ-algebra generated by a finite number of increments is independent of Ft for any t. As the union of σ-algebras generated by finite number of increments is a π-system, with the uniqueness of the extension of the probability measures from π-systems one can prove that the σ-algebra generated by the increments is independent of Ft . Let us denote by ϕt the Fourier transform of X(t). As X has stationary and independent increments, for every u ϕt+s (u) E (exp (iuX(t + s))) = = E (exp (iu (X(t + s) − X (t))) exp (iuX(t))) = = E (exp (iu (X(t + s) − X (t)))) · E (exp (iuX(t))) = = E (exp (iuX(s))) · E (exp (iuX(t))) ϕt (u) · ϕs (u), therefore ϕt+s (u) = |ϕt (u)| · |ϕs (u)| .
(1.40)
MARTINGALES
63
As |ϕt (u)| ≤ 1 for all u and as |ϕ0 (u)| = 1 from Cauchy’s functional equation |ϕt (u)| = exp (t · c(u)) . This implies that ϕt (u) is never zero. Let h > 0. ϕt (u) − ϕt+h (u) = |ϕt (u)| 1 − ϕt+h (u) ≤ ϕt (u) ≤ |1 − ϕh (u)| . X is right-continuous so if h 0 then by the Dominated Convergence Theorem, using that X (0) = 0 lim ϕh (u) = ϕ0 (u) = 1.
h0
So ϕt (u) is right-continuous. If t > 0 then ϕt (u) − ϕt−h (u) = ϕt−h (u) 1 − ϕt (u) ≤ ϕt−h (u) ≤ |1 − ϕh (u)| → 0, so ϕt (u) is also left-continuous. Hence ϕt (u) is continuous in t. Therefore E(exp(iu∆X(t))) = lim E(exp(iu(X(t) − X(t − h)))) = h0
= lim
h0
ϕt (u) = 1, ϕt−h (u)
so ∆X(t) = 0 almost surely. a.s.
a.s.
Hence for some subsequence X (tnk ) → X (t). This implies that X (t−) = X (t). Therefore one can make the next important observation:
Proposition 1.98 If X is a L´evy process then ϕt (u) = 0 for every u and the probability of a jump at time t is zero for every t. This implies that every L´evy process is continuous in probability. We shall need the following generalization: Proposition 1.99 If X is a process with independent increments and X is continuous in probability then ϕt (u) ϕ(u, t) E (exp (iuX (t))) is never zero.
64
STOCHASTIC PROCESSES
Proof. Let us fix the parameter u. As X is continuous in probability ϕ(u, t) is continuous in t. Let t0 (u) inf {t : ϕ (u, t) = 0} . One should prove that t0 (u) = ∞. By definition X (0) = 0 therefore ϕ (u, 0) = 1 and as ϕ (u, t) is continuous in t obviously t0 (u) > 0. Let h (u, s, t) E (exp (iu (X (t) − X (s)))) . X has independent increments, so if s < t then ϕ (u, t) = ϕ (u, s) h (u, s, t) .
(1.41)
By the right-regularity of X ϕ (u, t0 (u)) = 0. As X (t) has limits from the left if t0 (u) < ∞ then ϕ (u, t0 (u) −) is well-defined. We show that it is not zero. By (1.41) if s < t0 (u) < ∞ then ϕ (u, t0 (u) −) = ϕ (u, s) h (u, s, t0 (u) −) . ϕ (u, s) = 0 by the definition of t0 (u), so if ϕ (u, t0 (u) −) = 0 then h (u, s, t0 (u) −) = 0 for every s < t0 (u). 0=
lim h (u, s, t0 (u) −) =
st0 (u)
=
lim E (exp (iuX (t0 (u) −) − iuX (s))) =
st0 (u)
= E (exp (0)) = 1, which is impossible. Therefore 0 = ϕ (u, t0 (u)) = ϕ (u, t0 (u) −) = 0, which is impossible since ϕ is continuous. Let us recall the following simple observation: Proposition 1.100 Let ψ be a complex-valued, continuous curve defined on R. If ψ (t) = 0 for every t then it has a logarithm that is there is a continuous curve φ with the property that ψ = exp (φ). If φ1 (t0 ) = φ2 (t0 ) for some point t0 and ψ = exp (φ1 ) = exp (φ2 ) for some continuous curves φ1 and φ2 then φ1 = φ2 .
MARTINGALES
65
Proof. The proposition and its proof is quite well-known, so we just sketch it: 1. ψ = 0, so if ψ = exp (φ1 ) = exp (φ2 ) then 1=
ψ exp (φ1 ) = = exp (φ1 − φ2 ) . ψ exp (φ2 )
Hence for all t φ1 (t) = φ2 (t) + 2πin (t) , where n (t) is a continuous integer-valued function. As n (t0 ) = 0 obviously n ≡ 0, so φ1 = φ2 . 2. The complex series ln (1 + z) =
∞
n+1
(−1)
n=1
zn n
is convergent if |z| < 1. On the real line exp (ln (1 + z)) = 1 + z.
(1.42)
As ln (1 + z) is analytic (1.42) holds for every |z| < 1. To simplify notation as much as possible let us assume that t0 = 0 and ϕ (t0 ) = 1 and let us assume that we are looking for a curve with φ (t0 ) = 0. From (1.42) there is an r > 0 that ψ (t) ln (ϕ (t)) is well-defined for |t| < r. 3. Let a be the infimum and let b be the supremum of the endpoints of closed intervals where one can define a φ. If an a and bn b and φ is defined on [an , bn ] then by the first point of the proof φ (t) is well-defined on (a, b). Let assume that b < ∞. As ψ (b) = 0 we can define the curve θ (t) ψ (b + t) /ψ (b). Applying the part of the proposition just proved for some r > 0 ψ (t) = exp ( (t)) , ψ (b)
|b − t| < r,
with (b) = 0. Let t ∈ (b − r, b). As the range of the complex exponential function is C\ {0} there is a z ∈ C with ψ (b) = exp (z). exp (φ (t)) = ψ (b) exp ( (t)) = exp (z + (t)) . Hence φ (t) = z + (t) + 2nπi. With z + (t) + 2nπi one can easily continue φ to (a, b + r). This contradiction shows that one can define φ for the whole R.
66
STOCHASTIC PROCESSES
ϕ1 (u) E (exp (iuX (1))) is non-zero and by the Dominated Convergence Theorem it is obviously continuous in u. By the observation just proved ϕ1 (u) = exp (log ϕ1 (u)) exp(φ(u)), where by definition φ(0) = 0. From this by (1.40) ϕn (u) = exp(nφ(u)) and ϕ1/n (u) = exp(n−1 φ(u)) for every n ∈ N. Hence if r is a rational number then ϕr (u) = exp(rφ(u)). By the just proved continuity in t t ∈ R+ .
ϕt (u) = exp (tφ(u)) ,
(1.43)
L´evy processes are not martingales but we can use martingale theory to investigate their properties. The key tool is the so-called exponential martingale of X. Let us define the process Zt (u, ω) Z (t, u, ω)
exp (iuX(t, ω)) . ϕt (u)
(1.44)
ϕt (u) is continuous in t for every fixed u, and therefore Zt (u, ω) is a right-regular stochastic process. Let t > s. E (Zt (u) | Fs ) E =E =
exp (iuX (t)) | Fs ϕt (u)
=
exp (iu (X (t) − X (s))) exp (iuX (s)) | Fs ϕt−s (u) ϕs (u)
exp (iuX (s)) E (exp (iu (X (t) − X (s)))) = ϕs (u) ϕt−s (u)
= Zs (u)
E (exp (iuX (t − s))) = ϕt−s (u)
= Zs (u) · 1 Zs (u) , therefore Zt (u) is a martingale in t for any fixed u. Definition 1.101 Zt (u) is called the exponential martingale of X. Example 1.102 The exponential martingale of a Wiener process. If w is a Wiener process then Zt (u, ω)
exp (iuw(t)) u2 = exp iuw(t) + t . exp(−tu2 /2) 2
=
MARTINGALES
67
If instead of the Fourier transform we normalize with the Laplace transform, then56 exp (uw(t)) u2 = exp uw(t) − t . exp(tu2 /2) 2
Let X be a L´evy process and assume that the filtration is generated by X. Denote this filtration by F X . Obviously F X does not necessarily contain the measure-zero sets57 , so F X does not satisfy the usual conditions. Let N denotes the collection of measure-zero sets and let us introduce the so-called augmented filtration: Ft σ (σ (X (s) : s ≤ t) ∪ N ) .
(1.45)
It is a bit surprising, but for every L´evy process the augmented filtration satisfies the usual conditions. That is, for L´evy processes the augmented filtration F is always right-continuous58 : Proposition 1.103 If X is a L´evy process then (1.45) is right-continuous that is Ft = Ft+ . Proof. Let us take the exponential martingale of X. If t < w < s then exp (iuX (w)) Zw (u) = E (Zs (u) | Fw ) E ϕw (u)
exp (iuX (s)) | Fw , ϕs (u)
therefore Zw (u) ϕs (u) exp (iuX (w))
ϕs (u) = E (exp (iuX (s)) | Fw ) . ϕw (u)
If w t then from the continuity of ϕt and from the right-continuity of X, with L´evy’s theorem59 exp (iuX (t))
ϕs (u) a.s. = E (exp (iuX (s)) | Ft+ ) . ϕt (u)
As exp (iuX (t)) is Ft -measurable, and Zt (u) is a martingale exp (iuX (t)) 56 See:
Example 1.118, page 82. Example 1.13, page 9. 58 See: Example 1.13, page 9. 59 See: Theorem 1.75, page 46. 57 See:
ϕs (u) a.s. = E (exp (iuX (s)) | Ft ) . ϕt (u)
68
STOCHASTIC PROCESSES
Therefore a.s.
E (exp (iuX (s)) | Ft ) = E (exp (iuX (s)) | Ft+ ) .
(1.46)
This equality can be extended to multidimensional trigonometric polynomials. For example, if t < w ≤ s1 ≤ s2 and η u1 X (s1 ) + u2 X (s2 ) then, as X(s2 ) − X (s1 ) is independent of Fs1 : E (exp (iη) | Fw ) = E (exp (iu1 X (s1 )) · exp (iu2 X (s2 )) | Fw ) = = E (exp (i(u1 + u2 )X (s1 )) · E (exp (iu2 (X(s2 ) − X (s1 ))) | Fs1 ) | Fw ) = = E (exp (i(u1 + u2 )X (s1 )) · E (exp (iu2 (X(s2 ) − X (s1 )))) | Fw ) =
= E exp (i (u1 + u2 ) X (s1 )) · ϕs2 −s1 (u2 ) | Fw = = ϕs2 −s1 (u2 ) · E (exp (i (u1 + u2 ) X (s1 )) | Fw ) = = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zw (u1 + u2 ) . If w t then by the right-continuity of Zs and by L´evy’s theorem60 a.s.
E (exp (iη) | Ft+ ) = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zt (u1 + u2 ) . On the other hand with the same calculation if w = t a.s.
E (exp (iη) | Ft ) = ϕs2 −s1 (u2 ) · ϕs1 (u1 + u2 ) · Zt (u1 + u2 ) . Therefore a.s.
E (exp (iη) | Ft ) = E (exp (iη) | Ft+ ) . That is if sk > t then
E exp i
uk X(sk )
| Ft+
a.s.
= E exp i
k
uk X(sk )
| Ft
.
(1.47)
k
If sk ≤ t then equation (1.47) trivially holds. Hence if L is the set of bounded functions f for which a.s.
E (f (X (s1 ) , . . . , X (sn )) | Ft+ ) = E (f (X (s1 ) , . . . , X (sn )) | Ft ) then L contains the π-system of the trigonometric polynomials. L is trivially a λsystem, therefore, by the Monotone Class Theorem, L contains the characteristic functions of the sets of the σ-algebra generated by the trigonometric polynomials. 60 See:
Theorem 1.75, page 46.
MARTINGALES
69
That is, if B ∈ B(R^n) then one can write the characteristic functions χ_B in place of f. The collection Z of sets A for which a.s.

    E(χ_A | F_{t+}) = E(χ_A | F_t)

is also a λ-system which contains the sets of the π-system

    ∪_n σ((X(s_k))_{k=1}^n, s_k ≥ 0).

Again, by the Monotone Class Theorem, Z contains the σ-algebra

    F_∞^0 := σ(X(s) : s ≥ 0).

If A ∈ F_{t+} := ∩_n F_{t+1/n} then A ∈ F_∞ := σ(F_∞^0 ∪ N). Therefore there is an A′ ∈ F_∞^0 ⊆ Z with χ_{A′} a.s.= χ_A. As A′ ∈ Z,

    χ_A a.s.= E(χ_A | F_{t+}) a.s.= E(χ_{A′} | F_{t+}) a.s.= E(χ_{A′} | F_t).

Hence up to a measure-zero set χ_A is equal to the F_t-measurable function E(χ_{A′} | F_t). As F_t contains all the measure-zero sets, χ_A is F_t-measurable, that is, A ∈ F_t.

In a similar way one can prove the next proposition:

Proposition 1.104 If X is a process with independent increments and X is continuous in probability then (1.45) is right-continuous, that is F_t = F_{t+}.

Example 1.105 One cannot drop the condition of independent increments. If ζ ≅ N(0, 1) and X(t, ω) := tζ(ω), then the trajectories of X are continuous and X has stationary increments. If F is the augmented filtration, then F_0 = σ(N), and if t > 0, then F_t = σ(σ(ζ), N); hence F is not right-continuous.

Example 1.106 The augmentation is important: if w is a Wiener process then F_t^w := σ(w(s) : s ≤ t) is not necessarily right-continuous⁶¹.
From now on we shall assume that the filtration of every Lévy process satisfies the usual assumptions.

61 See: Example 1.13, page 9.
Proposition 1.107 If the process X is left-continuous then the filtration F_t^X := σ(X(s) : s ≤ t) is left-continuous. This remains true for the augmented filtration.

Proof. Let F_{t−}^X := σ(∪_{s<t} F_s^X).

If t > n then {τ_n ≤ t} = {τ ≤ n}. From (1.51), by the definition of the stopped σ-algebra, A_n := A ∩ {τ ≤ n} ∈ F_{τ_n}. As τ_n is bounded, by (1.49)

    ∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP = P(A_n)·φ_t(u).    (1.52)
From (1.50) and by the Dominated Convergence Theorem

    ∫_A exp(iuX*(t)) dP = ∫_A lim_{n→∞} χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} ∫_A χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} ∫_{A_n} exp(iu(X(τ_n + t) − X(τ_n))) dP =
    = lim_{n→∞} P(A_n)·φ_t(u) = P(A)·φ_t(u) = P(A)·∫_Ω exp(iuX(t)) dP.
2. If A := Ω then the equation above means that the Fourier transform of X*(t) is φ_t. That is, the distribution of X*(t) and X(t) is the same. Let L be the set of bounded functions f for which, for all A ∈ F_τ,

    ∫_A f(X*(t)) dP = P(A)·∫_Ω f(X*(t)) dP.

Obviously L is a λ-system, and L contains the π-system of the trigonometric polynomials

    x ↦ exp(iux),  u ∈ R.

By the Monotone Class Theorem, L contains the functions f := χ_B with B ∈ B(R). Therefore for every A ∈ F_τ and B ∈ B(R)

    ∫_A χ_B(X*(t)) dP = P(A ∩ {X*(t) ∈ B}) = P(A)·∫_Ω χ_B(X*(t)) dP = P(A)·P(X*(t) ∈ B).

So X*(t) is independent of F_τ.

3. One should prove that X* has stationary and independent increments. If σ := τ + t and

    X**(h) := X(σ + h) − X(σ),
then, using the part of the proposition already proved for the stopping time σ,

    X*(t + h) − X*(t) = (X(τ + t + h) − X(τ)) − (X(τ + t) − X(τ)) = X(σ + h) − X(σ) = X**(h) ≅ X(h),

which is independent of t; therefore X* has stationary increments. Also, by the already proved part of the proposition, X*(t + h) − X*(t) = X**(h) is independent of F_σ ⊇ F_t^*. Obviously X*(0) = 0 and X* is right-regular, therefore X* is a process with independent increments.

4. Now we prove that X and X* have the same distribution. Let 0 = t_0 < t_1 < … < t_n be arbitrary. As we proved,

    X*(t_k) − X*(t_{k−1}) ≅ X*(t_k − t_{k−1}) ≅ X(t_k − t_{k−1}) ≅ X(t_k) − X(t_{k−1}).

As the increments are independent, (X*(t_k) − X*(t_{k−1}))_{k=1}^n has the same distribution as (X(t_k) − X(t_{k−1}))_{k=1}^n. This implies that (X(t_k))_{k=1}^n has the same distribution as (X*(t_k))_{k=1}^n, which, by the Monotone Class Theorem, implies that X* and X have the same distribution.
5. As we proved, X* is a process with independent increments, so F_t^* is independent of the σ-algebra G_t^* generated by the increments⁶⁴

    X*(u) − X*(v),  u ≥ v ≥ t.

So, as a special case, the set {X*(t) : t ≥ 0} is independent of F_0^* = F_τ.

Example 1.110 Random times which are not stopping times.
Let a > 0 and let w be a Wiener process.

1. Let

    γ_a := sup{0 ≤ s ≤ a : w(s) = 0} = inf{s ≥ 0 : w(a − s) = 0}.

64 See: Proposition 1.97, page 61.
Obviously γ_a is F_a-measurable, so it is a random time. As P(w(a) = 0) = 0, almost surely γ_a < a. Assume that γ_a is a stopping time. In this case, by the strong Markov property,

    w*(t) := w(t + γ_a) − w(γ_a)

is also a Wiener process. It is easy to see that if w* is a Wiener process then ŵ(t) := t·w*(1/t) is also a Wiener process⁶⁵. As every one-dimensional Wiener process almost surely returns to the origin⁶⁶, with the strong Markov property it is easy to prove that ŵ returns to the origin almost surely after any time t. This means that there is a sequence t_n ↓ 0 with t_n > 0 such that almost surely w*(t_n) = 0. But this is impossible, as almost surely w* does not have a zero on the interval (0, a − γ_a].

2. Let

    β_a := max{w(s) : 0 ≤ s ≤ a},  ρ_a := inf{0 ≤ s ≤ a : w(s) = β_a}.

We show that ρ_a is not a stopping time. As P(w(a) − w(a/2) < 0) = 1/2, P(ρ_a < a) > 0. If ρ_a were a stopping time, then by the strong Markov property

    w*(t) := w(t + ρ_a) − w(ρ_a)

would be a Wiener process. But this is impossible, as with positive probability the interval (0, a − ρ_a] is not empty and on this interval w* cannot have a positive value.

An important consequence of the strong Markov property is the following:

Proposition 1.111 If the size of the jumps of a Lévy process X is smaller than a constant c > 0, that is |∆X| ≤ c, then on any interval [0, t] the moments of X are uniformly bounded. That is, for each m there is a constant K(m, t) such that

    E(|X^m(s)|) ≤ K(m, t),  s ∈ [0, t].
Proof. One may assume that the stopping time⁶⁷

    τ_1 := inf{t : |X(t)| > c}

65 See: Corollary B.10, page 566.
66 See: Corollary B.8, page 565.
67 Recall that F satisfies the usual assumptions. See: Example 1.32, page 17.
is finite, as by the zero-one law the set of outcomes ω where τ_1(ω) = ∞ has probability 0 or 1. If with probability one τ_1(ω) = ∞ then X is uniformly bounded, hence in this case the proposition holds. Then define the stopping time

    τ_2 := inf{t : |X*(t)| > c} + τ_1 = inf{t : |X(t + τ_1) − X(τ_1)| > c} + τ_1.

In a similar way let us define τ_3, etc. By the strong Markov property the variables {X*(t) : t ≥ 0} are independent of the σ-algebra F_{τ_1}. The variable

    τ_2 − τ_1 = inf{t ≥ 0 : |X*(t)| > c}

is measurable with respect to the σ-algebra generated by the variables {X*(t) : t ≥ 0}, hence τ_2 − τ_1 is independent of F_{τ_1}. In general τ_n − τ_{n−1} is independent of F_{τ_{n−1}}. Also, by the strong Markov property, for all n the distribution of τ_n − τ_{n−1} is the same as the distribution of τ_1. Therefore if τ_0 := 0, then using the independence of the variables (τ_k − τ_{k−1}),

    E(exp(−τ_n)) = E(exp(−Σ_{k=1}^n (τ_k − τ_{k−1}))) = (E(exp(−τ_1)))^n =: q^n,

where 0 < q ≤ 1. If q = 1 then almost surely τ_1 = 0, which by the right-continuity implies that |X(0)| ≥ c > 0, which, by the definition of Lévy processes, is not the case; so q < 1. As the jumps are smaller than c,

    |X(τ_1)| ≤ |X(τ_1−)| + |∆X(τ_1)| ≤ |X(τ_1−)| + c ≤ 2c.

In the same way it is easy to see that in general

    sup_t |X^{τ_n}(t)| = sup{|X(t)| : t ∈ [0, τ_n]} ≤ 2nc.

Therefore by Markov's inequality

    P(|X(t)| > 2nc) ≤ P(τ_n < t) = P(exp(−τ_n) > exp(−t)) ≤ E(exp(−τ_n))/exp(−t) ≤ exp(t)·q^n.
As q < 1,

    L(m) := Σ_{n=0}^∞ [2(n + 1)c]^m·q^n < ∞,
so

    E(|X(t)|^m) ≤ Σ_{n=0}^∞ [2(n + 1)c]^m · P(|X(t)| > 2nc) ≤
    ≤ exp(t)·Σ_{n=0}^∞ [2(n + 1)c]^m·q^n = exp(t)·L(m),
from which the proposition is evident.

One can generalize these observations.

Proposition 1.112 (Strong Markov property for processes with independent increments) Let X be a process with independent increments and assume that X is continuous in probability. Let D([0, ∞)) denote the space of right-regular functions over [0, ∞) and let H be the σ-algebra over D([0, ∞)) generated by the coordinate functionals. If f is a non-negative H-measurable functional⁶⁸ over D([0, ∞)), then for every stopping time τ < ∞

    E(f(X*) | F_τ) = E(f(X_s^*))|_{s=τ},

where X_s^*(t) := X(s + t) − X(s).

Proof. Let φ(u, t) be the Fourier transform of X(t). As X is continuous in probability, φ(u, t) ≠ 0 and

    Z(u, t) := exp(iuX(t))/φ(u, t)

is a martingale⁶⁹. Let τ be a bounded stopping time. By the Optional Sampling Theorem

    E(Z(u, τ + t) | F_τ) = Z(u, τ).

φ(u, τ + t) is F_τ-measurable. Therefore

    E(exp(iuX*(t)) | F_τ) := E(exp(iu(X(τ + t) − X(τ))) | F_τ) =    (1.53)
    = φ(u, τ + t)/φ(u, τ) = (φ(u, s + t)/φ(u, s))|_{s=τ} =
    = E(exp(iu(X(t + s) − X(s))))|_{s=τ} =
    = E(exp(iu·X_s^*(t)))|_{s=τ}.

68 It is easy to see that f(X) = g(X(t_1), X(t_2), …) where g is an R^∞ → R Borel measurable function and (t_k) is a countable sequence in R_+. The canonical example is f(X) := sup_{s≤t} |X(s)|.
69 See: Proposition 1.99, page 63.
If τ is not bounded then τ_n := τ ∧ n is a bounded stopping time. Let

    h(s) := E(exp(iu(X(s + t) − X(s)))).

As τ < ∞,

    X(τ_n + t) − X(τ_n) → X(τ + t) − X(τ),

so by the Dominated Convergence Theorem h(τ_n) → h(τ). If A ∈ F_τ then A ∩ {τ ≤ n} ∈ F_{τ_n}, therefore

    ∫_A χ(τ ≤ n)·exp(iu(X(τ_n + t) − X(τ_n))) dP = ∫_A χ(τ ≤ n)·h(τ_n) dP.

By the Dominated Convergence Theorem one can take the limit n → ∞. Hence in (1.53) we can drop the condition that τ is bounded. With the Monotone Class Theorem one can prove that for any Borel measurable set B

    E(χ_B(X*(t)) | F_τ) = E(χ_B(X_s^*(t)))|_{s=τ}.

In the usual way, using multi-dimensional trigonometric polynomials and the Monotone Class Theorem several times, one can extend the relation to every H-measurable and bounded function. Finally one can prove the proposition with the Monotone Convergence Theorem.

Corollary 1.113 Under the same conditions as above

    E(f(X*) | τ = s) = E(f(X_s^*)).

Let us remark that if X is a Lévy process then the distribution of X_s^* is the same as the distribution of X for every s, so

    E(f(X*) | F_τ) = E(f(X))
for every τ < ∞. If f(X) := exp(i·Σ_{k=1}^n u_k X(t_k)) then

    E(exp(i·Σ_{k=1}^n u_k X*(t_k)) | F_τ) = E(exp(i·Σ_{k=1}^n u_k X(t_k))).
The right-hand side is deterministic, which implies that (X*(t_1), X*(t_2), …, X*(t_n)) is independent of F_τ and has the same distribution as (X(t_1), X(t_2), …, X(t_n)).

Proposition 1.114 If X is a process with independent increments, X is continuous in probability, and the jumps of X are bounded by some constant c, then all the moments of X are uniformly bounded on any finite interval; that is, for every t

    E(|X^m(s)|) ≤ K(m, t) < ∞,  s ∈ [0, t].
Proof. Let us fix t. X has right-regular trajectories, so on any finite interval the trajectories are bounded. Therefore sup_{s≤2t} |X(s)| < ∞. Hence if b is sufficiently large then

    P(sup_{s≤2t} |X(s)| > b/2) < q < 1.

Let τ := inf{s : |X(s)| > a} ∧ 2t. By the definition of τ

    {τ < t} ⊆ {sup_{s≤t} |X(s)| > a} ⊆ {τ ≤ t}.
If for some ω

    ω ∈ {sup_{s≤t} |X(s)| > a} \ {τ < t},

then sup_{s<t} |X(s, ω)| ≤ a and sup_{s≤t} |X(s, ω)| > a, so the process X has a jump at (t, ω), which by the stochastic continuity of X has probability zero. As the size of the jumps is bounded, by the right-continuity

    sup_{s≤τ} |X(s)| ≤ sup_{s≤τ} |X(s−)| + sup_{s≤τ} |∆X(s)| ≤ a + c.
We show that this implies that

    {sup_{s≤t} |X(s)| > a + b + c} ⊆ {sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b}.

If

    sup_{s≤t} |X(s)| > a + b + c,

then obviously sup_{s≤t} |X(s)| > a, hence τ ≤ t; so if sup_{s≤t} |X(τ + s) − X(τ)| ≤ b, then

    sup_{s≤t} |X(s)| ≤ sup_{s≤τ} |X(s)| + sup_{s≤t} |X(τ + s) − X(τ)| ≤ a + b + c,

which is impossible. If u ≤ t, then

    sup_{s≤t} |X(u + s) − X(u)| ≤ 2·sup_{s≤2t} |X(s)|.
Therefore if u ≤ t, then

    {sup_{s≤t} |X(u + s) − X(u)| > b} ⊆ {sup_{s≤2t} |X(s)| > b/2}.

Let F be the distribution function of τ. By the just proved strong Markov property

    P(sup_{s≤t} |X(s)| > a + b + c) ≤
    ≤ P(sup_{s≤t} |X(s)| > a, sup_{s≤t} |X(τ + s) − X(τ)| > b) =
    = P(τ < t, sup_{s≤t} |X(τ + s) − X(τ)| > b) =
    = ∫_{[0,t)} P(sup_{s≤t} |X(τ + s) − X(τ)| > b | τ = u) dF(u) =
    = ∫_{[0,t)} P(sup_{s≤t} |X(u + s) − X(u)| > b) dF(u) ≤
    ≤ P(sup_{s≤2t} |X(s)| > b/2) · P(τ < t) =
    = q·P(τ < t) ≤ q·P(sup_{s≤t} |X(s)| > a).
From this, for an arbitrary n,

    P(sup_{s≤t} |X(s)| > n(b + c)) ≤ q^n.

Hence

    E(|X(t)|^m) ≤ E(sup_{s≤t} |X(s)|^m) ≤ Σ_{n=1}^∞ (n(b + c))^m·q^{n−1} < ∞.
We shall return to Lévy processes in section 7.1. If the reader is interested only in Lévy processes then they can continue reading there.

1.3.8 Application: the first passage times of the Wiener processes
In this subsection we present some applications of the Optional Sampling Theorem. Let w be a Wiener process. We shall discuss some properties of the first passage times

    τ_a := inf{t : w(t) = a}.    (1.54)
The set {a} is closed and w is continuous, hence τ_a is a stopping time⁷⁰. Recall that⁷¹ almost surely

    lim sup_{t→∞} w(t) = ∞,  lim inf_{t→∞} w(t) = −∞.    (1.55)

Therefore, as w is continuous, τ_a is almost surely finite.

Example 1.115 The martingale convergence theorem does not hold in L¹(Ω).
Let w be a Wiener process and let X := w + 1. Let τ be the first passage time of zero for X, that is, let

    τ := inf{t : X(t) = 0} = τ_{−1} := inf{t : w(t) = −1}.

70 See: Example 1.32, page 17.
71 See: Proposition B.7, page 564.
As X is a martingale, X^τ is a non-negative martingale. By the martingale convergence theorem for non-negative martingales⁷², if t ↗ ∞ then X^τ(t) is almost surely convergent. As we remarked, τ is almost surely finite, therefore obviously X^τ(∞) = 0. By the Optional Sampling Theorem

    ‖X^τ(t)‖_1 = ‖X(τ ∧ t)‖_1 = E(X(τ ∧ t)) = E(X(0)) = 1

for any t. Hence the convergence does not hold in L¹(Ω).

Example 1.116 If a < 0 < b and τ_a and τ_b are the respective first passage times of some Wiener process w, then

    P(τ_a < τ_b) = b/(b − a),  P(τ_b < τ_a) = −a/(b − a).
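These formulas have an exact discrete counterpart that is convenient for a quick numerical check: for a simple symmetric random walk started at 0 and absorbed at −A and B, the same optional sampling argument (applied to the walk S_n and to S_n² − n) gives P(hit −A first) = B/(A + B) and expected absorption time A·B. A minimal Monte Carlo sketch; the barriers and sample size below are arbitrary choices:

```python
import random

random.seed(42)

A, B = 1, 2            # absorbing barriers at -A and B; the walk starts at 0
n_paths = 20_000

hits_lower = 0
total_steps = 0
for _ in range(n_paths):
    x, steps = 0, 0
    while -A < x < B:
        x += random.choice((-1, 1))   # one +-1 step of the walk
        steps += 1
    hits_lower += (x == -A)
    total_steps += steps

p_lower = hits_lower / n_paths       # should be close to B/(A+B) = 2/3
mean_time = total_steps / n_paths    # should be close to A*B = 2
```

The same estimates can be produced for the Wiener process itself by simulating paths on a fine time grid, at the cost of a small discretization bias at the barriers.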
By (1.55), with probability one the trajectories of w are unbounded. Therefore, as w starts from the origin, the trajectories of w finally leave the interval [a, b]. So

    P(τ_a < τ_b) + P(τ_b < τ_a) = 1.

If τ := τ_a ∧ τ_b then w^τ is a bounded martingale. Hence one can use the Optional Sampling Theorem. Obviously w^τ_τ is either a or b, hence

    E(w^τ_τ) = a·P(τ_a < τ_b) + b·P(τ_b < τ_a) = E(w^τ(0)) = 0.

We have two equations with two unknowns. Solving this system of linear equations, one can easily deduce the formulas above.

Example 1.117 Let a < 0 < b and let τ_a and τ_b be the respective first passage times of some Wiener process w. If τ := τ_a ∧ τ_b, then E(τ) = |ab|.

With direct calculation it is easy to see that the process w²(t) − t is a martingale. From this it is easy to show that the process X(t) := (w(t) − a)(b − w(t)) + t is also a martingale. By the Optional Sampling Theorem

    |ab| = −ab = E(X(0)) = E(X(τ ∧ n)) = E((w(τ ∧ n) − a)(b − w(τ ∧ n))) + E(τ ∧ n).

72 See: Corollary 1.66, page 40.
If n ↗ ∞ then by the Monotone and by the Dominated Convergence Theorems the limit of the right-hand side is E(τ).

Example 1.118 Let w be a Wiener process. The Laplace transform of the first passage time τ_a is

    L(s) := E(exp(−sτ_a)) = exp(−|a|·√(2s)),  s ≥ 0.    (1.56)
Let a > 0. For every u the process

    X(t) := exp(u·w(t) − t·u²/2)

is a martingale⁷³, so the truncated process X^{τ_a} is also a martingale. If u ≥ 0, then

    0 ≤ X^{τ_a}(t) = exp(u·w(τ_a ∧ t) − u²(τ_a ∧ t)/2) ≤ exp(au),

hence X^{τ_a} is a bounded martingale. Every bounded martingale is uniformly integrable, therefore one can apply the Optional Sampling Theorem. So

    E(X^{τ_a}_{τ_a}) = E(exp(ua − u²τ_a/2)) = E(X^{τ_a}(0)) = 1.

Hence

    E(exp(−u²τ_a/2)) = exp(−ua).

If u := √(2s) ≥ 0 then

    L(s) := E(exp(−sτ_a)) = exp(−a·√(2s)).

If a < 0 then, repeating the calculations for the Wiener process −w,

    L(s) = exp(−|a|·√(2s)).
Example 1.119 The Laplace transform of the first passage time τ̃_a of the reflected Wiener process |w| is

    L̃(s) := E(exp(−s·τ̃_a)) = 1/cosh(a·√(2s)),  s ≥ 0.    (1.57)

73 See: (1.44), page 66.
By definition τ̃_a := inf{t : |w(t)| = a}. Let

    X(t) := ((exp(uw(t)) + exp(−uw(t)))/2)·exp(−u²t/2) = cosh(uw(t))·exp(−u²t/2).

X is the sum of two martingales, hence it is a martingale. X^{τ̃_a} ≤ cosh(ua), therefore one can again apply the Optional Sampling Theorem:

    E(X^{τ̃_a}_{τ̃_a}) = E(cosh(ua)·exp(−u²τ̃_a/2)) = 1,

therefore

    E(exp(−u²τ̃_a/2)) = 1/cosh(ua).

If u := √(2s) then

    E(exp(−s·τ̃_a)) = 1/cosh(a·√(2s)).

Example 1.120 The density function of the distribution of the first passage time τ_a of a Wiener process is

    f(x) = |a|·(2πx³)^{−1/2}·exp(−a²/(2x)).    (1.58)
By the uniqueness of the Laplace transform it is sufficient to prove that the Laplace transform of (1.58) is exp(−|a|·√(2s)). By the definition of the Laplace transform

    L(s) := ∫₀^∞ exp(−sx)·f(x) dx,  s ≥ 0.

If F denotes the distribution function of (1.58) then

    F(x) := ∫₀^x f(t) dt = 2·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du,    (1.59)
since if we substitute t := xa²/u², then

    F(x) = ∫_a^∞ (au³/(a³·√(2πx³)))·exp(−u²/(2x))·(2xa²·u^{−3}) du = 2·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du.

Integrating by parts and using that F(0) = 0, if s > 0 then

    L(s) = [exp(−sx)·F(x)]₀^∞ + ∫₀^∞ s·exp(−sx)·F(x) dx = s·∫₀^∞ exp(−sx)·F(x) dx.
By (1.59)

    L(s) = 2s·∫₀^∞ exp(−sx)·∫_a^∞ (1/√(2πx))·exp(−u²/(2x)) du dx.

Fix s and let us take L(s) as a function of a. Let us denote this function by g(a). We show that if a > 0 then g(a) satisfies the differential equation

    d²g(a)/da² = 2s·g(a).    (1.60)
The integrand is non-negative, so by Fubini's theorem one can change the order of the integration, so

    g(a) = 2s·∫_a^∞ ∫₀^∞ exp(−sx)·(1/√(2πx))·exp(−u²/(2x)) dx du.
As

    ∫₀^∞ (1/√(2πx))·exp(−sx) dx = (1/√(2πs))·Γ(1/2) = 1/√(2s).

If s > 0 and z := s + it then

    z^{1/2} = exp(½·log z) = exp(½·ln|z|)·exp(i·½·arg z) =
    = ⁴√(s² + t²)·(cos(arctan(t/s)/2) + i·sin(arctan(t/s)/2)).
The complex Laplace transform is continuous, so

    φ(t) = L(−it) =
    = lim_{s↓0} exp(−a·√2·⁴√(s² + t²)·(cos(arctan(−t/s)/2) + i·sin(arctan(−t/s)/2))) =
    = exp(−a·√(2|t|)·(cos(−(π/4)·sgn t) + i·sin(−(π/4)·sgn t))) =
    = exp(−a·√|t|·(1 − i·sgn t)).
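Formula (1.59) identifies the law of τ_a explicitly: P(τ_a ≤ x) = 2(1 − Φ(a/√x)), which is exactly the distribution function of a²/Z² for a standard normal variable Z. So τ_a can be sampled exactly, and the closed form just derived can be checked by simulation. A hedged sketch; the sample size and the evaluation point t are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
a, t = 1.0, 1.0

Z = rng.standard_normal(400_000)
Z = Z[Z != 0.0]                 # guard against a (measure-zero) division by zero
tau = a**2 / Z**2               # exact samples of tau_a via (1.59)

emp = np.exp(1j * t * tau).mean()                              # empirical E(exp(i*t*tau_a))
theory = np.exp(-a * np.sqrt(abs(t)) * (1 - 1j * np.sign(t)))  # closed form above
```

The empirical characteristic function and the closed form agree to Monte Carlo accuracy.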
Example 1.122 The maximum process of a Wiener process. Let w be a Wiener process, and let us introduce the maximum process

    S(t) := sup_{s≤t} w(s) = max_{s≤t} w(s).

We show that for every a ≥ 0 and t ≥ 0

    P(S(t) ≥ a) = P(τ_a ≤ t) = 2·P(w(t) ≥ a) = P(|w(t)| ≥ a).    (1.61)
The first and last equalities are trivial. We prove the second one: recall that the density function of the distribution of τ_a is

    (d/dt)·P(τ_a ≤ t) =: F′(t) = f(t) = (a/√(2πt³))·exp(−a²/(2t)).

w(t) ≅ N(0, t), so

    U(t) := 2·P(w(t) ≥ a) = 2·(1 − Φ(a/√t)) = (2/√(2π))·∫_{a/√t}^∞ exp(−u²/2) du.

Differentiating with respect to t,

    (d/dt)·U(t) = (a/√(2π))·exp(−a²/(2t))·t^{−3/2},

hence the derivatives of P(τ_a ≤ t) and 2·P(w(t) ≥ a) with respect to t are the same. The two functions are equal if t = 0, therefore 2·P(w(t) ≥ a) = P(τ_a ≤ t) for every t.
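Equality (1.61) can also be checked by direct path simulation. The maximum over a discrete time grid slightly underestimates S(t), so a fine grid and a generous tolerance are needed; all sizes below are arbitrary choices:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
n_paths, n_steps, T, a = 2_000, 4_000, 1.0, 0.8
dt = T / n_steps

# Brownian increments and the grid approximation of S(T) = sup_{s<=T} w(s)
incr = rng.standard_normal((n_paths, n_steps)) * sqrt(dt)
S = np.cumsum(incr, axis=1).max(axis=1)

p_emp = (S >= a).mean()
Phi = 0.5 * (1 + erf(a / sqrt(T) / sqrt(2.0)))
p_theory = 2 * (1 - Phi)        # reflection principle, formula (1.61)
```

The empirical exceedance probability of the running maximum matches 2·P(w(T) ≥ a) up to grid bias and Monte Carlo noise.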
Example 1.123 The density function of S(t) := sup_{s≤t} w(s) is

    f(x) = (2/√(2πt))·exp(−x²/(2t)),  x > 0.

By (1.61), P(S(t) ≥ x) = 2·(1 − Φ(x/√t)). Differentiating we get the formula.

Example 1.124 If w is a Wiener process then

    E(sup_{s≤1} |w(s)|) = √(π/2),  E(sup_{s≤1} w(s)) = √(2/π).
Let

    S(t) := sup_{s≤t} |w(s)| = max_{s≤t} |w(s)|,  τ_a := inf{t : |w(t)| = a}.

If x > 0, then⁷⁴

    P(S(t) ≤ x) = P(max_{s≤t} x·|w(s/x²)| ≤ x) = P(max_{s≤t} |w(s/x²)| ≤ 1) =
    = P(max_{s≤t/x²} |w(s)| ≤ 1) = P(τ_1 ≥ t/x²) = P(1/√τ_1 ≤ x/√t).

If σ > 0, then

    √(2/π)·∫₀^∞ exp(−x²/(2σ²)) dx = σ.

74 Recall that s ↦ x·w(s/x²) is also a Wiener process.
The expected value depends only on the distribution, so by Fubini's theorem and by (1.57)

    E(S(1)) = E(1/√τ_1) = E(√(2/π)·∫₀^∞ exp(−x²τ_1/2) dx) =
    = √(2/π)·∫₀^∞ E(exp(−x²τ_1/2)) dx = √(2/π)·∫₀^∞ (1/cosh x) dx =
    = 2·√(2/π)·∫₀^∞ (exp(x)/(exp(2x) + 1)) dx = 2·√(2/π)·∫₁^∞ (1/(y² + 1)) dy =
    = 2·√(2/π)·(π/4) = √(π/2).

In a similar way, if S denotes the supremum of w then

    E(S(1)) = E(1/√τ_1) = E(√(2/π)·∫₀^∞ exp(−x²τ_1/2) dx) =
    = √(2/π)·∫₀^∞ E(exp(−x²τ_1/2)) dx = √(2/π)·∫₀^∞ exp(−x) dx = √(2/π).
One can prove the last relation with (1.61) as well:

    E(S(1)) = E(|w(1)|) = √(2/π)·∫₀^∞ x·exp(−x²/2) dx = √(2/π).
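Both expectations can be approximated from the same simulated paths. The grid maximum biases both estimates slightly downward, so the tolerances below are loose; path count and grid size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 2_000, 4_000
dt = 1.0 / n_steps

# Brownian paths on [0, 1]
w = np.cumsum(rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt), axis=1)

mean_sup_abs = np.abs(w).max(axis=1).mean()   # should approach sqrt(pi/2) ~ 1.2533
mean_sup = w.max(axis=1).mean()               # should approach sqrt(2/pi) ~ 0.7979
```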
Example 1.125 The intersection of a two-dimensional Wiener process with a line has Cauchy distribution.

Let w_1 and w_2 be independent Wiener processes, and let us consider the line⁷⁵ L := {x = a} where a > 0. The two-dimensional process w(t) := (w_1(t), w_2(t)) meets L the first time at

    τ_a := inf{t : w_1(t) = a}.

75 The Wiener processes are invariant under rotation, so the result is true for an arbitrary line. One can generalize the result to an arbitrary dimension. In the general case, we are investigating the distribution of the intersection of the Wiener processes with hyperplanes.
What is the distribution of the y coordinate, that is, what is the distribution of w_2(τ_a)?

1. For an arbitrary u the process t ↦ u^{−1}·w_1(u²t) is also a Wiener process, hence the distribution of its maximum process is the same as the distribution of the maximum process of w_1. Let us denote this maximum process by S_1. With u := √x,

    P(τ_a ≥ x) = P(S_1(x) ≤ a) = P(√x·S_1(1) ≤ a) = P(a²/S_1²(1) ≥ x).

w intersects L at w_2(τ_a). τ_a is σ(w_1)-measurable, and as w_1 and w_2 are independent, that is, the σ-algebras σ(w_2) and σ(w_1) are independent, τ_a is independent of w_2. We show that

    w_2(τ_a) ≅ √τ_a·w_2(1),    (1.62)

that is, the distribution of w_2(τ_a) is the same as the distribution of √τ_a·w_2(1). Using the independence of τ_a and w_2,

    P(w_2(τ_a) ≤ x | τ_a = t) = P(w_2(t) ≤ x) = P(√t·w_2(1) ≤ x),

and

    P(√τ_a·w_2(1) ≤ x | τ_a = t) = P(√t·w_2(1) ≤ x).

Integrating both equations by the distribution of τ_a we get (1.62). Hence

    w_2(τ_a) ≅ √τ_a·w_2(1) ≅ (a/S_1(1))·w_2(1) ≅ (a/|w_1(1)|)·w_2(1).

w_1(1) and w_2(1) are independent with distribution N(0, 1). Therefore w_2(τ_a) has a Cauchy distribution.

2. One can also prove the relation with Fourier transforms. Let us calculate the
Fourier transform of w_2(τ_a). The Fourier transform of N(0, 1) is exp(−t²/2). By the independence of τ_a and w_2 and by (1.56), if G denotes the distribution function of τ_a,

    φ(t) := E(exp(itw_2(τ_a))) = ∫₀^∞ E(exp(itw_2(τ_a)) | τ_a = u) dG(u) =
    = ∫₀^∞ E(exp(itw_2(u))) dG(u) = ∫₀^∞ exp(−(t²/2)·u) dG(u) =
    = E(exp(−(t²/2)·τ_a)) = L(t²/2) = exp(−a·√t²) = exp(−a|t|),

which is the Fourier transform of a Cauchy distribution.

Example 1.126 The process of first passage times of Wiener processes.
Let w be a Wiener process and let us define the hitting times

    τ_a := inf{t : w(t) = a},  σ_a := inf{t : w(t) > a}.

w is continuous, the set {x > a} is open, hence σ_a is a weak stopping time. As the augmented filtration of w is right-continuous, σ_a is a stopping time⁷⁶. w has continuous trajectories, so obviously τ_a ≤ σ_a. As the trajectories of w can contain 'peaks and flat segments', it can happen that for some outcomes τ_a is strictly smaller than σ_a. As we shall immediately see, almost surely τ_a = σ_a. One can define the stochastic processes

    T(a, ω) := τ_a(ω),  S(a, ω) := σ_a(ω)

with a ∈ R_+. It is easy to see that T and S have strictly increasing trajectories. If a_n ↗ a then w(τ_{a_n}) = a_n ↗ a, hence obviously τ_{a_n} ↗ τ_a, so T is left-continuous. On the other hand, it is easy to see that if a_n ↘ a, then σ_{a_n} ↘ σ_a, hence S is right-continuous. It is also easy to see that T(a+, ω) = S(a, ω) and S(a−, ω) = T(a, ω) for all ω. Obviously τ_a and σ_a are almost surely finite. By the strong Markov property of w,

    w*(t) := w(τ_a + t) − w(τ_a)

is also a Wiener process. {τ_a < σ_a} is in the set

    {w*(t) ≤ 0 on some interval [0, r], r ∈ Q}.

As w* is a Wiener process, it is not difficult to prove⁷⁷ that if r > 0 then

    P(w*(t) ≤ 0, ∀t ∈ [0, r]) = 0.

76 See: Example 1.32, page 17.
77 See: Corollary B.12, page 566.
Hence

    P(τ_a ≠ σ_a) = P(τ_a < σ_a) = 0

for every a. Therefore S is a right-continuous modification of T. Obviously, if b > a and τ*_{b−a} is the first passage time of w* to b − a, then τ_b − τ_a = τ*_{b−a}. By the strong Markov property τ*_{b−a} is independent of F_{τ_a}. Therefore T(b) − T(a) is independent of F_{τ_a}. In general, one can easily prove that T, and therefore S, have independent increments with respect to the filtration G_a := F_{τ_a}. Obviously S(0) = 0, hence S is a Lévy process with respect to the filtration G.

1.3.9 Some remarks on the usual assumptions
The usual assumptions are crucial conditions of stochastic analysis. Without them very few statements of the theory would hold. The most important objects of stochastic analysis are related to stopping times, as these objects express the timing of events. The main tool of stochastic analysis is measure theory. In measure theory, objects are defined up to measure-zero sets. From a technical point of view it is therefore not a great surprise that we want to guarantee that every random time which is almost surely equal to a stopping time should also be a stopping time. The definition of a stopping time is very natural: at time t one can observe only τ ∧ t, so we should assume τ ∧ t to be F_t-measurable for every t. Hence if τ and τ′ are almost surely equal and they differ on a set N, then every subset of N should also be F_t-measurable. This implies that one should add all the measure-zero sets and all their subsets to the filtration⁷⁸.

The right-continuity of the filtration is more problematic; it assumes that somehow we can foresee the events of the near future. At first sight it seems natural; in our usual experience we always have some knowledge about the near future. Our basic experience is speed and momentum, and these objects are by definition the derivatives of the trajectories. By definition, differentiability means that the right-derivative is equal to the left-derivative, and the left-derivative depends on the past and the present. So in our differentiable world we always know the right-derivative, hence—infinitesimally—we can always see the future. But in stochastic analysis we are interested in objects which are non-differentiable. Recall that for a continuous process the hitting time of a closed set is a stopping time⁷⁹. At the moment that we hit a closed set we know that we are in the set. But what about the hitting times⁸⁰ of open sets? We hit an open set at its boundary, and when we hit it we are generally still outside the set. Recall that the hitting time of an open set is a stopping time only when the filtration is right-continuous⁸¹. That is, when we hit the boundary of an open set—by the

78 See: Example 6.37, page 386.
79 See: Example 1.32, page 17.
80 See: Definition 1.26, page 15.
81 See: Example 1.32, page 17.
right-continuity of the filtration—we can ask for some extra information about the future which tells us whether we shall really enter the set or not. This is, of course, a very strong assumption. If we want to go to a restaurant and we are at the door, we know that we shall enter the restaurant. But a Wiener process can easily turn back at the door. One of the most surprising statements of the theory is that the augmented filtration of a Lévy process is right-continuous. This is true not only for Lévy processes, but under more general conditions⁸². It is important to understand the reason behind this phenomenon. The event that a one-dimensional Wiener process hits the boundary of an open set without actually entering the set itself has zero⁸³ probability! And in general the right-continuity of an augmented filtration means that all the events which need some insight into the future⁸⁴ have zero probability. We cannot see the future, we are just ignoring the irrelevant information!
1.4 Localization
Localization is one of the most frequently used concepts of mathematical analysis. For example, if f is a continuous function on R, then of course generally f is not integrable on the whole real line. But this is not a problem at all. We can still talk about the integral function F(x) := ∫₀ˣ f(t) dt of f. The functions of calculus are generally not integrable, they are just locally integrable. In real analysis we say that a certain property holds locally if it holds on every compact subset of the underlying topological space⁸⁵. On the real line it is enough to ask that the property holds on every closed, bounded interval; in particular, for any t the property should hold on the interval [0, t]. Very often, as in the case of local integrability, it is sufficient to ask that the property should hold on some intervals [0, t_n] where t_n ↗ ∞. In stochastic analysis we should choose the upper bounds t_n in a measurable way with respect to the underlying filtration. This explains the next definition:

Definition 1.127 Let X be a family of processes. We say that a process X is locally in X if there is a sequence of stopping times (τ_n) for which almost surely⁸⁶ τ_n ↗ ∞, and the truncated processes X^{τ_n} belong to X for every n. The sequence (τ_n) is called a localizing sequence of X. X_loc denotes the set of processes locally belonging to X.

A specific problem of the definition above is that with localization one cannot modify the value of the variable X(0), since every truncated process X^{τ_n} at the

82 This is true e.g. for so-called Feller processes, which form an important subclass of the Markov processes.
83 See: Example 1.126, page 90, Corollary B.12, page 566. But see: Example 6.10, page 364.
84 Like sudden jumps of the Poisson processes.
85 Generally the topological space is locally compact.
86 Almost surely and not everywhere! See: Proposition 1.130, page 94.
time t = 0 has the same value X(0). To overcome this problem some authors⁸⁷, instead of using X^{τ_n}, use the process X^{τ_n}·χ(τ_n > 0) in the definition of the localization, or instead of X they localize the process X − X(0). In most cases it does not matter how we define the localization. First of all we shall use the localization procedure to define the different classes of local martingales. From the point of view of stochastic analysis one can always assume that every local martingale is zero at time t = 0, as our final goal is to investigate the class of semimartingales, and the semimartingales have the representation

    X(0) + L + V,

where L is a local martingale, zero at time t = 0. Just to fix the ideas, we shall later explicitly concretize the definitions in the cases of local martingales and locally bounded processes. In both cases we localize the processes X − X(0).

1.4.1 Stability under truncation
It is quite natural to ask for which types of processes X one has (X_loc)_loc = X_loc.

Definition 1.128 We say that a space of processes X is closed or stable under truncation, or closed under stopping, if whenever X ∈ X then X^τ ∈ X for an arbitrary stopping time τ.

It is an important consequence of this property that if X is closed under truncation, X_k ∈ X_loc and (τ_n^{(k)}) are the localizing sequences of the processes X_k, then τ_n := ∧_{k=1}^m τ_n^{(k)} for any finite m is a common localizing sequence of the first m processes. That is, if X is closed under truncation, then for a finite number of processes we can always assume that they have a common localizing sequence. From the definition it is clear that if X is closed under truncation, then X_loc is also closed under truncation as, if (τ_n) is a localizing sequence of X and τ is an arbitrary stopping time, then (τ_n) is obviously a localizing sequence of the truncated process X^τ.

Example 1.129 M, the space of uniformly integrable martingales, H², the space of the square-integrable martingales, and K, the set of bounded processes are closed under truncation.

It is obvious from the definition that K is closed under truncation. By the Optional Sampling Theorem if M ∈ M, then M^τ ∈ M. As

    ‖X^τ(t)‖₂² ≤ E(sup_t |X(t)|²)
    α_n := { ∞ if |ξ| ≤ n; 0 if |ξ| > n },  β_n := { ∞ if |η| ≤ n; 0 if |η| > n }

are stopping times. Obviously ρ_n := τ_n ∧ σ_n ∧ α_n ∧ β_n is a stopping time and ρ_n ↗ ∞, so (ρ_n) is a localizing sequence.

    Z^{ρ_n} = (ξX + ηY)^{ρ_n} = χ(|ξ| ≤ n)·ξX^{ρ_n} + χ(|η| ≤ n)·ηY^{ρ_n}.    (1.63)

As X^{ρ_n}, Y^{ρ_n} ∈ M, and as χ(|ξ| ≤ n)·ξ and χ(|η| ≤ n)·η are bounded F_0-measurable variables, obviously Z^{ρ_n} ∈ M and therefore Z is a local martingale. Let us observe that in line (1.63) we used that X, Y ∈ L, that is, X(0) = Y(0) = 0. If in the definition of local martingales one had used the simpler X ∈ M_loc definition, then in this proposition one should have assumed ξ and η to be bounded.

90 See: Lemma 1.70, page 42.
Lemma 1.70, page 42.
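Although the text is purely measure-theoretic, the truncation operation and the common localizing sequence can be mimicked on a discretized path. A minimal sketch (assuming numpy; grid indices stand in for stopping times, and the path is illustrative data):

```python
import numpy as np

def truncate(path, tau):
    """X^tau: the path frozen from the (grid-index) stopping time tau onwards."""
    out = path.copy()
    out[tau:] = path[tau]
    return out

rng = np.random.default_rng(0)
X = np.cumsum(rng.standard_normal(1000))   # a discretized sample path

tau1, tau2 = 400, 700
common = min(tau1, tau2)                   # tau1 ∧ tau2, a common stopping time

# truncating twice equals truncating at the minimum: (X^tau1)^tau2 = X^(tau1 ∧ tau2)
assert np.array_equal(truncate(truncate(X, tau1), tau2), truncate(X, common))
```

Taking the minimum over finitely many stopping times is exactly the construction of the common localizing sequence above.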
STOCHASTIC PROCESSES
One can observe that in the definition of local martingales we used the class of uniformly integrable martingales and not the class of martingales. If L^{τ_n} is a martingale for some τ_n, then L^{τ_n ∧ n} ∈ M, so the class of local martingales is the same as the class of 'locally uniformly integrable martingales'. Very often we prove theorems first for uniformly integrable martingales and then extend them by localization to local martingales. In most cases one can use the same method if one wants to extend a result from uniformly integrable martingales just to martingales. An important subclass of local martingales is the space of locally square-integrable martingales:

Definition 1.135 X is a locally square-integrable martingale if X − X(0) ∈ H²_loc.
Example 1.136 Every martingale with square-integrable values is a locally square-integrable martingale.

By definition a martingale X is square-integrable if X(t) ∈ L²(Ω) for every t. In this case X(0) ∈ L²(Ω), therefore X(t) − X(0) ∈ L²(Ω) for all t, so again one can assume that X(0) = 0. If τ_n := n then (τ_n) is a localizing sequence. By Doob's inequality

‖sup_t |X^{τ_n}(t)|‖_2 = ‖sup_{t≤n} |X(t)|‖_2 ≤ 2 · ‖X(n)‖_2 < ∞,

so X^{τ_n} ∈ H² and therefore X ∈ H²_loc.
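Doob's inequality used above is easy to probe by simulation. A minimal sketch, assuming numpy, with a symmetric random walk as a stand-in martingale (the constants are the ones from the inequality, nothing more):

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_paths = 200, 20_000
X = np.cumsum(rng.choice([-1.0, 1.0], size=(n_paths, n_steps)), axis=1)

# Doob: || sup_{t<=n} |X(t)| ||_2 <= 2 ||X(n)||_2, i.e. E(sup^2) <= 4 E(X(n)^2)
lhs = np.mean(np.max(np.abs(X), axis=1) ** 2)
rhs = 4.0 * np.mean(X[:, -1] ** 2)
assert lhs <= rhs
```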
Example 1.137 Every continuous local martingale is locally square-integrable91.

Let X be a continuous local martingale and let (τ_n) be a localizing sequence of X. As X is continuous, σ_n := inf{t : |X(t)| ≥ n} is a stopping time. If ρ_n := τ_n ∧ σ_n then ρ_n ↗ ∞ and |X^{ρ_n}| ≤ n by the continuity of X, so X^{ρ_n} is a bounded, hence square-integrable, martingale. Therefore X ∈ H²_loc.

Example 1.138 Martingales which are not in H²_loc.

91 One can easily generalize this example: if the jumps of X are bounded then X is in H²_loc. See: Proposition 1.152, page 107.
LOCALIZATION
Let us denote by σ(N) the σ-algebra generated by the measure-zero sets. Let

F_t := { σ(N) if t < 1; A if t ≥ 1 }

and let ξ ∈ L¹(Ω) but ξ ∉ L²(Ω). Let us also assume that E(ξ) = 0. F satisfies the usual conditions, hence X(t) := E(ξ | F_t) is a martingale. X(0) = 0 almost surely, hence not only X ∈ M_loc but also X ∈ L. On the other hand X ∉ H²_loc as, if the stopping time τ is not almost surely smaller than 1, then almost surely τ ≥ 1, hence X^τ(t) = ξ ∉ L²(Ω) for all t ≥ 1.

It is a quite natural, but wrong, guess that local martingales are just martingales with poor integrability. Local martingales are far more mysterious objects.

Example 1.139 An integrable local martingale which is not a martingale.
Let Ω := C[0, ∞), that is, let Ω be the set of continuous functions defined on the half-line R₊. Let X be the canonical coordinate process, that is, if ω ∈ Ω then let X(t, ω) := ω(t), and let the filtration F be the filtration generated by X. Let P be the probability measure on Ω for which X is a Wiener process starting from the point 1. Let

τ_0 := inf{t : X(t) = 0}.

Let us define the measure Q(t) on the σ-algebra F_t with the Radon–Nikodym derivative

dQ(t)/dP := X(t ∧ τ_0) = X(t)χ(t < τ_0) + X(τ_0)χ(t ≥ τ_0) = X(t)χ(t < τ_0).

As the truncated martingales are martingales, X^{τ_0} is a martingale under the measure P. Hence

E(X(t ∧ τ_0) | F_s) = X(s ∧ τ_0).

The measures (Q(t))_{t≥0} are consistent: if s < t and F ∈ F_s ⊆ F_t, then

Q(s)(F) = ∫_F (dQ(s)/dP) dP = ∫_F X(s ∧ τ_0) dP = ∫_F X(t ∧ τ_0) dP = ∫_F (dQ(t)/dP) dP = Q(t)(F).
In particular

Q(t)(Ω) = ∫_Ω X(t ∧ τ_0) dP = ∫_Ω X(0) dP = 1,

so Q(t) is a probability measure for every t. The space C[0, ∞) is a Kolmogorov-type measure space, so on the Borel sets of C[0, ∞) there is a probability measure Q which, restricted to F_t, is Q(t). {τ_0 ≤ t} ∈ F_t for every t, so, as X(τ_0) = 0 almost surely,

Q(τ_0 ≤ t) = Q(t)(τ_0 ≤ t) = ∫_Ω χ(τ_0 ≤ t) X(τ_0 ∧ t) dP = ∫_Ω χ(τ_0 ≤ t) X(τ_0) dP = 0,
so Q(τ_0 = ∞) = 1, that is, X is almost surely never zero under Q. Hence X > 0 under Q, so under Q the process Y := 1/X is almost surely well-defined.

1. As a first step let us show that Y is not a martingale under Q. To show this it is sufficient to prove that the Q-expected value of Y decreases to zero. As P(τ_0 < ∞) = 1, if t ↗ ∞ then

E_Q(Y(t)) = ∫_Ω Y(t) dQ = ∫_Ω (1/X(t)) dQ(t) = ∫_Ω (1/X(t)) χ(t < τ_0) X(t) dP = ∫_Ω χ(t < τ_0) dP = P(t < τ_0) → 0.
2. Now we prove that Y is a local martingale under Q. Let ε > 0 and let τ_ε := inf{t : X(t) = ε}. X is continuous, therefore if ε ↘ 0 then τ_ε(ω) ↗ τ_0(ω) for every outcome ω. Since Q(τ_0 = ∞) = 1, obviously τ_ε ↗ ∞ Q-almost surely92. Let us show that under Q the truncated process Y^{τ_ε} is a martingale. Almost surely 0 < Y^{τ_ε} ≤ 1/ε, hence Y^{τ_ε} is almost surely bounded, hence uniformly integrable. One should only prove that Y^{τ_ε} is a martingale under Q. If s < t and F ∈ F_s, then, as τ_ε < τ_0,

∫_F Y^{τ_ε}(t) dQ = ∫_F (1/X(t ∧ τ_ε)) dQ(t) =    (1.64)
= ∫_F (1/X(t ∧ τ_ε)) X(t ∧ τ_0) dP =
= ∫_F [χ(t < τ_ε)/X(t) + χ(t ≥ τ_ε)/X(τ_ε)] X(t) χ(t < τ_0) dP =
= ∫_F [χ(t < τ_ε) + (X(t)/ε) χ(τ_0 > t ≥ τ_ε)] dP =
= (1/ε) ∫_F [ε + (X^{τ_0}(t) − ε) χ(t ≥ τ_ε)] dP.

Let us prove that M(t) := (X^{τ_0}(t) − ε) χ(t ≥ τ_ε) is a martingale under P. If σ is a bounded stopping time, then, as τ_ε < τ_0, by the elementary properties of the conditional expectation93 and by the Optional Sampling Theorem

E(M(σ)) = E((X^{τ_0}(σ) − ε) χ(σ ≥ τ_ε)) =
= E(E((X^{τ_0}(σ) − ε) χ(σ ≥ τ_ε) | F_{σ∧τ_ε})) =
= E(E(X^{τ_0}(σ) − ε | F_{σ∧τ_ε}) χ(σ ≥ τ_ε)) =
= E(E(X(τ_0 ∧ σ) − ε | F_{σ∧τ_ε}) χ(σ ≥ τ_ε)) =
= E((X(σ ∧ τ_ε) − ε) χ(σ ≥ τ_ε)) =
= E((X(τ_ε) − ε) χ(σ ≥ τ_ε)) = 0,

which means that M is really a martingale94. As M is a martingale under P, in the last integral of (1.64) one can substitute s in the place of t, so, calculating backwards,

∫_F Y^{τ_ε}(t) dQ = ∫_F (1/X(t ∧ τ_ε)) dQ = ∫_F (1/X(s ∧ τ_ε)) dQ = ∫_F Y^{τ_ε}(s) dQ,

that is, Y^{τ_ε} is a martingale under Q. Therefore (τ_{1/n}) localizes Y under Q.

92 Let us recall that by the definition of the localizing sequence it is sufficient if the localizing sequence converges just almost surely to infinity.
93 See: Proposition 1.34, page 20.
94 See: Proposition 1.91, page 57.
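The probability P(t < τ_0) appearing in step 1 has the closed form 2Φ(1/√t) − 1 by the reflection principle; this classical fact is quoted here only as a cross-check. A rough simulation sketch under these assumptions (numpy; discrete monitoring slightly overestimates survival):

```python
import numpy as np
from math import erf, sqrt

def survival(t):
    """P(tau_0 > t) for a Wiener process started from 1 (reflection principle)."""
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return 2.0 * Phi(1.0 / sqrt(t)) - 1.0

rng = np.random.default_rng(2)
n_paths, n_steps, T = 5_000, 1_000, 4.0
dt = T / n_steps
W = 1.0 + np.cumsum(sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
est = np.mean(np.all(W > 0.0, axis=1))   # fraction of paths never hitting 0 on [0, T]

# E_Q(Y(t)) = P(t < tau_0) decreases to zero, so Y cannot be a Q-martingale
assert survival(1.0) > survival(4.0) > survival(100.0)
assert abs(est - survival(T)) < 0.05
```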
Example 1.140 An L²(Ω)-bounded local martingale which is not a martingale95.
Let w be a standard Wiener process in R³, and let X(t) := w(t) + u, where u ≠ 0 is a fixed vector. By the elementary properties of Wiener processes96, if t → ∞ then

R(t) := ‖X(t)‖₂ → ∞.    (1.65)

With a direct calculation it is easy to check that on R³ \ {0} the function

g(x) := 1/‖x‖₂ = 1/√(x₁² + x₂² + x₃²)

is harmonic, that is97,

∆g := ∂²g/∂x₁² + ∂²g/∂x₂² + ∂²g/∂x₃² = 0.

Hence by Itô's formula98 M := 1/R is a local martingale. The density function of X(t) is

f_t(x) := (1/√(2πt)³) exp(−‖x − u‖₂²/(2t)).
If t ≥ 1 then f_t is uniformly bounded, say f_t ≤ c, so if t ≥ 1 then obviously

E(M²(t)) = ∫_{R³} (1/‖x‖₂²) f_t(x) dλ³(x) ≤ c · ∫_{‖x‖₂≤1} (1/‖x‖₂²) dλ³(x) + ∫_{‖x‖₂>1} f_t(x) dλ³(x) ≤ c · I + 1,

since outside the unit ball 1/‖x‖₂² ≤ 1 and f_t is a density. Evidently the integral can diverge only around x = 0, so it remains to show that

I := ∫_{‖x‖₂≤1} (1/‖x‖₂²) dλ³(x) = Σ_k ∫_{G(k)} (1/‖x‖₂²) dλ³(x)
95 We shall use several results which we shall prove later, so one can skip this example during the first reading.
96 See: Proposition B.7, page 564, and Corollary 6.9, page 363.
97 Here ∆ denotes the Laplace operator.
98 See: Theorem 6.2, page 353. As n = 3, almost surely X(t) ≠ 0, hence we can use the formula. See: Theorem 6.7, page 359.
is finite, where

G(k) := { x : 1/2^{k+1} < ‖x‖₂ ≤ 1/2^k }.

As 2^k G(k) = G(0), using the transformation T(x) := 2^k x,

∫_{G(k)} (1/‖x‖₂²) dλ³(x) = (1/2^{3k}) ∫_{G(0)} (2^{2k}/‖x‖₂²) dλ³(x) = 2^{−k} ∫_{G(0)} (1/‖x‖₂²) dλ³(x).

Hence

I = Σ_{k=0}^∞ 2^{−k} ∫_{G(0)} (1/‖x‖₂²) dλ³(x) < ∞.
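The dyadic-shell argument can be verified exactly: in spherical coordinates each shell integral has a closed form, and the geometric series sums to 4π. A small pure-Python check (the closed form 4π(2^{−k} − 2^{−(k+1)}) follows from ∫ r^{−2} · 4πr² dr):

```python
from math import pi

def shell_integral(k):
    """Integral of 1/||x||^2 over G(k) = {2^-(k+1) < ||x|| <= 2^-k}: the integrand
    times the sphere's surface area is constant, giving 4*pi*dr in the radius."""
    return 4.0 * pi * (2.0 ** -k - 2.0 ** -(k + 1))

# the scaling identity of the text: shell k carries 2^-k times the mass of shell 0
for k in range(10):
    assert abs(shell_integral(k) - 2.0 ** -k * shell_integral(0)) < 1e-15

# hence I = sum over shells = 4*pi < infinity
I = sum(shell_integral(k) for k in range(60))
assert abs(I - 4.0 * pi) < 1e-12
```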
It is easy to show that E(M²(t)) is continuous in t, therefore it is bounded on [0, 1]. Hence E(M²(t)) is bounded on R₊. By (1.65) M(t) → 0 almost surely. M is bounded in L²(Ω), therefore it is uniformly integrable, so M(t) → 0 in L¹(Ω) as well. If M were a martingale then

0 < M(t) = E(M(∞) | F_t) = E(0 | F_t) = 0,

which is impossible.

As the uniformly integrable local martingales are not necessarily martingales, even the next, nearly trivial, observation is very useful:

Proposition 1.141 Every non-negative local martingale is a supermartingale.

Proof: Let M = M(0) + L be a non-negative local martingale. Observe that M(0) ≥ 0 is not necessarily integrable, so one cannot assume that M(t) is integrable; as M ≥ 0, the conditional expectations below are still meaningful. As L ∈ L there is a localizing sequence (τ_n) such that L^{τ_n} ∈ M for all n. If t > s, then, as M ≥ 0, by Fatou's lemma

E(M(t) | F_s) = E(liminf_{n→∞} M^{τ_n}(t) | F_s) ≤ liminf_{n→∞} E(M^{τ_n}(t) | F_s) =
= M(0) + liminf_{n→∞} E(L^{τ_n}(t) | F_s) =
= M(0) + liminf_{n→∞} L^{τ_n}(s) = M(s).

Corollary 1.142 If M ∈ L and M ≥ 0 then M = 0.

Proof: As M is a supermartingale, 0 ≤ E(M(t)) ≤ E(M(0)) = 0 for all t ≥ 0, so almost surely M(t) = 0.
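Example 1.140 is easy to explore numerically. The sketch below assumes numpy; the closed form E(M(t)) = erf(1/√(2t)) (for u = (1, 0, 0)) is the classical Newtonian potential of a spherical Gaussian cloud and is quoted only as a cross-check:

```python
import numpy as np
from math import erf, sqrt

def mean_M(t, n=200_000, seed=3):
    """Monte Carlo estimate of E(M(t)) = E(1 / ||w(t) + u||), u = (1, 0, 0)."""
    rng = np.random.default_rng(seed)
    x = sqrt(t) * rng.standard_normal((n, 3))
    x[:, 0] += 1.0
    return float(np.mean(1.0 / np.linalg.norm(x, axis=1)))

for t in (1.0, 4.0, 9.0):
    # E(M(t)) = erf(1/sqrt(2t)) decreases to 0, so M is a strict supermartingale
    assert abs(mean_M(t) - erf(1.0 / sqrt(2.0 * t))) < 0.01

assert mean_M(1.0) > mean_M(9.0)
```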
The most striking and puzzling feature of local martingales is that even uniform integrability is not sufficient to guarantee that local martingales are proper martingales. The reason is the following: if Γ is a set of stopping times, then the uniform integrability of the family (X(t))_{t≥0} does not guarantee the uniform integrability of the stopped family (X(τ))_{τ∈Γ}. This cannot happen if the local martingale belongs to the so-called class D.

Definition 1.143 A process X belongs to the Dirichlet–Doob class99, shortly X is in class D, if the set

{X(τ) : τ < ∞ is an arbitrary finite-valued stopping time}

is uniformly integrable. We shall also denote by D the set of processes in class D.

Proposition 1.144 Let L be a local martingale. L is in class D if and only if L ∈ M, that is, if L is a uniformly integrable martingale.

Proof: Recall that we constructed a non-negative L²(Ω)-bounded local martingale which is not a proper martingale.

1. Let L ∈ D be a local martingale. As τ = 0 is a stopping time, by the definition of D, L(0) is integrable, so one can assume that L ∈ L. If (τ_n) is a localizing sequence of L then

L(τ_n ∧ s) = L^{τ_n}(s) = E(L^{τ_n}(t) | F_s) = E(L(τ_n ∧ t) | F_s).

τ_n ↗ ∞, hence the sequences (L(τ_n ∧ s))_n and (L(τ_n ∧ t))_n converge to L(s) and L(t). By uniform integrability the convergence L(τ_n ∧ t) → L(t) holds in L¹(Ω) as well. By the L¹-continuity of the conditional expectation

L(s) = E(L(t) | F_s),

hence L is a martingale100. Obviously the set {L(t)}_t ⊆ {L(τ)}_τ is uniformly integrable, so L ∈ M.

2. The reverse implication is obvious: if L is a uniformly integrable martingale then by the Optional Sampling Theorem L(τ) = E(L(∞) | F_τ) for every stopping time τ, hence the family (L(τ))_τ is uniformly integrable101.

99 In [77] on page 244 class D is called the Dirichlet class. [74] on page 107 remarks that class D stands for Doob's class and that the definition was introduced by P.A. Meyer in 1963.
100 Observe that it is enough to assume that {L(τ)}_τ is uniformly integrable for the set of bounded stopping times τ.
101 See: Lemma 1.70, page 42.
Corollary 1.145 If a process X is dominated by an integrable variable then X ∈ D; hence if X is a local martingale and X is dominated by an integrable variable102, then X ∈ M.

Example 1.146 Let us assume that L has independent increments. If X := exp(L), then X is a local martingale if and only if X is a martingale.
One should only prove that if X is a local martingale then X is a martingale. By the definition of processes with independent increments L(0) = 0, hence X(0) = 1. X is a non-negative local martingale, so it is a supermartingale103. If m(t) denotes the expected value of X(t), then by the supermartingale property 1 ≥ m(t) > 0. Let us prove that M(t) := X(t)/m(t) is a martingale. As L has independent increments, if t > s then

m(t) := E(X(t)) = E(X(s)) E(exp(L(t) − L(s))) = m(s) E(exp(L(t) − L(s))).

From this

E(M(t) | F_s) := E(exp(L(t))/m(t) | F_s) =
= E(exp(L(t) − L(s) + L(s))/m(t) | F_s) =
= (exp(L(s))/m(t)) E(exp(L(t) − L(s)) | F_s) =
= (exp(L(s))/m(t)) E(exp(L(t) − L(s))) =
= exp(L(s))/m(s) = M(s),

hence M is a martingale. For arbitrary T < ∞, on the interval [0, T] the martingale M is uniformly integrable, that is, M is in class D. As on the interval [0, T]

0 ≤ X = Mm ≤ M,

X is also in class D. Therefore X ∈ D and X is a local martingale on [0, T]. This means that X is a martingale on [0, T] for every T, hence X is a martingale on R₊.

102 See: Davis’ inequality, Theorem 4.62, page 277.
103 See: Proposition 1.141, page 101.
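For L = w a Wiener process, which has independent increments, m(t) = E(exp(w(t))) = e^{t/2}, and the normalization used in the example can be watched at work. A minimal simulation sketch, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_steps, T = 100_000, 64, 2.0
dt = T / n_steps
w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
t = dt * np.arange(1, n_steps + 1)

M = np.exp(w - t / 2.0)      # X/m with X = exp(w) and m(t) = e^{t/2}
# the normalized process keeps constant expectation 1, as a martingale must
assert np.all(np.abs(M.mean(axis=0) - 1.0) < 0.05)
```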
If a process has independent increments and the expected value of the process is zero, then it is obviously a martingale. Therefore martingales are generalizations of random walks. From an intuitive point of view one can also think of local martingales as generalized random walks, as we shall later prove the following, somewhat striking, theorem:

Theorem 1.147 Assume that the stochastic base satisfies the usual conditions. If a local martingale has independent increments then it is a true martingale104.

1.4.3 Convergence of local martingales: uniform convergence on compacts in probability
Let X be an arbitrary space of processes. In X_loc it is very natural to define the topology by localization: X_m → X if X and the elements of the sequence (X_m) have a common localizing sequence (τ_n) and for every n, in the topology of X,

lim_{m→∞} X_m^{τ_n} = X^{τ_n}.

Let us assume105 that (X_m) and X are in H^p_loc. In H^p one should define the topology with the norm

‖X‖_{H^p} := ‖sup_s |X(s)|‖_p.

If τ_n ↗ ∞ and t < ∞, then for every δ > 0 one can find an n such that P(τ_n ≤ t) < δ. Let ε > 0 be arbitrary. If

A := { sup_{s≤t} |X_m(s) − X(s)| > ε },

then

P(A) = P((τ_n ≤ t) ∩ A) + P((τ_n > t) ∩ A) ≤
≤ P(τ_n ≤ t) + P((τ_n > t) ∩ A) ≤ δ + P((τ_n > t) ∩ A) ≤
≤ δ + P(sup_{s≤t} |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε) ≤
≤ δ + P(sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε).

By Markov's inequality stochastic convergence follows from convergence in L^p(Ω). Therefore if lim_{m→∞} X_m^{τ_n} = X^{τ_n} in H^p, then

lim_{m→∞} P(sup_s |X_m^{τ_n}(s) − X^{τ_n}(s)| > ε) = 0.

This implies that for every ε > 0 and for every t

lim_{m→∞} P(sup_{s≤t} |X_m(s) − X(s)| > ε) = 0.

104 Of course the main point is that a local martingale with independent increments has finite expected value. See: Theorem 7.97, page 545.
105 It is an important consequence of the Fundamental Theorem of Local Martingales that every local martingale is in H¹_loc. See: Corollary 3.59, page 221.
Hence one should expect that the next definition is very useful106:

Definition 1.148 We say that the sequence of stochastic processes (X_n) converges uniformly on compacts in probability to the process X if for arbitrary107 t and ε > 0

lim_{n→∞} P(sup_{s≤t} |X_n(s) − X(s)| > ε) = 0.

Let X be a stochastic process and for a > 0 let

τ_a := inf{t : |X(t)| > a}.

If X is right-regular then |X(τ_a)| ≥ a, but as X can reach the level a with a jump, it can happen that for certain outcomes |X(τ_a)| > a. For right-continuous processes one can only use the estimation

|X(τ_a)| ≤ a + |∆X(τ_a)|.

As the jump |∆X(τ_a)| can be arbitrarily large, X is not necessarily bounded on the random interval

[0, τ_a] := {(t, ω) : 0 ≤ t ≤ τ_a(ω) < ∞}.    (1.66)

On the other hand, let us assume that X is left-continuous. If τ_a(ω) > 0 and |X(τ_a(ω), ω)| > a for some outcome ω, then by the left-continuity one could decrease the value of τ_a(ω), which by definition is impossible. Hence |X(τ_a)| ≤ a on the set {τ_a > 0}. This means that if X is left-continuous and X(0) = 0 then X is bounded on the random interval (1.66). These observations are the core of the next two propositions:

Proposition 1.151 If the filtration is right-continuous then every left-regular process is locally bounded.

Proof: Let X be left-regular. The process X − X(0) is also left-regular, so one can assume that X(0) = 0. Define the random times

τ_n := inf{t : |X(t)| > n}.

The filtration is right-continuous and X is left-regular, so τ_n is a stopping time109. As X(0) = 0, if τ_n(ω) = 0 then |X(τ_n)| ≤ n. If τ_n(ω) > 0 then |X(τ_n(ω), ω)| > n is impossible, as in this case, by the left-continuity of X, one could decrease τ_n(ω).

108 See: Proposition 1.6, page 5.
109 See: Example 1.32, page 17.
Hence the truncated process X^{τ_n} is bounded. Let us show that τ_n ↗ ∞, that is, let us show that the sequence (τ_n) is a localizing sequence. Obviously (τ_n) is never decreasing. If for some outcome ω the sequence (τ_n(ω)) were bounded, then one could find a bounded sequence (t_n) for which |X(t_n, ω)| > n. Let (t_{n_k})_k be a monotone, convergent subsequence of (t_n). If t_{n_k} → t*, then |X(t_{n_k}, ω)| → ∞, which is impossible as X has finite left and right limits.

Proposition 1.152 If the filtration is right-continuous and the jumps of the right-regular process X are bounded, then X is locally bounded.

Proof: We can again assume that X(0) = 0. Assume that |∆X| ≤ a. As in the previous proposition, if τ_n := inf{t : |X(t)| > n}, then (τ_n) is a localizing sequence and |X(τ_n−)| ≤ n, therefore

|X^{τ_n}| ≤ n + |∆X(τ_n)| ≤ n + a.
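The first-exit times of Propositions 1.151 and 1.152 are easy to realize on a discretized path. A minimal sketch (assuming numpy; grid indices play the role of stopping times):

```python
import numpy as np

rng = np.random.default_rng(5)
X = np.cumsum(0.05 * rng.standard_normal(5000))   # a continuous-looking sample path

def first_exit(path, level):
    """tau := inf{ k : |path[k]| > level }; len(path) if the level is never exceeded."""
    hits = np.flatnonzero(np.abs(path) > level)
    return int(hits[0]) if hits.size else len(path)

for n in (1.0, 2.0, 3.0):
    tau = first_exit(X, n)
    before = X[:tau]
    # strictly before the exit time the path stays within the level
    assert before.size == 0 or np.max(np.abs(before)) <= n
```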
Example 1.153 In the previous propositions one cannot drop the condition of regularity.

The process

X(t) := { 1/t if t > 0; 0 if t = 0 }

is continuous from the left but not regular, and it is obviously not locally bounded. The process

X(t) := { 1/(1 − t) if t < 1; 0 if t ≥ 1 }

is continuous from the right, but it is also not locally bounded.
2 STOCHASTIC INTEGRATION WITH LOCALLY SQUARE-INTEGRABLE MARTINGALES

In this chapter we shall present a relatively simple introduction to stochastic integration theory. Our main simplifying assumption is that the integrators are locally square-integrable martingales. Every continuous process is locally bounded, hence the space H²_loc contains the continuous local martingales. In most of the applications the integrator is continuous, therefore in this chapter we shall mainly concentrate on the continuous case. As we shall see, the slightly more general case, when the integrator is in H²_loc, is nearly the same as the continuous one.

The central concept of this chapter is the quadratic variation [X]. We shall show that if X is a continuous local martingale then [X] is continuous and increasing, and X² − [X] is also a local martingale. It is a crucial observation that in the continuous case these properties characterize the quadratic variation. When the integrator X is discontinuous then the quadratic variation [X] is also discontinuous. As in the continuous case, X² − [X] is still a local martingale, but this property does not characterize the quadratic variation for local martingales in general. The jump process of the quadratic variation satisfies the identity ∆[X] = (∆X)², and [X] is the only right-continuous, increasing process for which X² − [X] is a local martingale and the identity ∆[X] = (∆X)² holds.

When the integrators are continuous one can define the stochastic integral for progressively measurable integrands. The main difference between the continuous and the H²_loc case is that in the discontinuous case we should take into account the jumps of the integral. Because of this extra burden, in the discontinuous case one can define the stochastic integral only when the integrands are predictable.

In the first part of the chapter we shall introduce the so-called Itô–Stieltjes integral. We shall use the existence theorem of the Itô–Stieltjes integral to prove the existence of the quadratic variation. After this, we present the construction
of the stochastic integral when the integrators are continuous local martingales. At the end of the chapter we briefly discuss the difference between the continuous and the H²_loc case.

In the present chapter we assume that the filtration is right-continuous and that if N ∈ A has probability zero, then N ∈ F_s for all s. But we shall not need the assumption that (Ω, A, P) is complete.
2.1 The Itô–Stieltjes Integrals
In this section we introduce the simplest concept of stochastic integration, which I prefer to call Itô–Stieltjes integration. Every integral is basically a limit of certain approximating sums. The meaning of the integral is generally obvious for the finite approximations, and by definition the integral operator extends the meaning of the finite sums to some more complicated infinite objects. In stochastic integration theory we have two stochastic processes: the integrator X and the integrand Y. As in elementary analysis, let us fix an interval [a, b] and let

∆_n : a = t_0^{(n)} < t_1^{(n)} < ··· < t_{m_n}^{(n)} = b    (2.1)

be a partition of [a, b]. For a fixed partition ∆_n let us define the finite approximating sum

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)})),

where the test points τ_k^{(n)} have been chosen in some way from the time subintervals [t_{k−1}^{(n)}, t_k^{(n)}]. If the integrator X is the price of some risky asset then X(t_k^{(n)}) − X(t_{k−1}^{(n)}) is the change of the price during the time interval [t_{k−1}^{(n)}, t_k^{(n)}], and if Y(τ_k^{(n)}) is the number of assets one holds during this time period, then S_n is the net change of the value of the portfolio during the whole time period [a, b]. If

lim_{n→∞} max_k (t_k^{(n)} − t_{k−1}^{(n)}) = 0,

then the sequence of partitions (∆_n) is called infinitesimal. In this section we say that the integral ∫_a^b Y dX exists if for any infinitesimal sequence of partitions of [a, b] the sequence of approximating sums (S_n) is convergent and the limit is independent of the partitions (∆_n). The main problem is the following: under which conditions and in which sense does the limit lim_{n→∞} S_n exist? Generally we can only guarantee that the approximating sequence (S_n) is convergent in probability, and for the existence of the integral we should assume that the test points τ_k^{(n)} have been chosen in a very restricted way. That is, we should assume that τ_k^{(n)} = t_{k−1}^{(n)}. This type of integral we shall call the Itô–Stieltjes integral of Y against X. Perhaps the most important and most unusual point in the theory is that we should restrict the choice of the test points τ_k^{(n)}. The simplest example showing why this is necessary follows:

Example 2.1 Let w be a Wiener process. Try to define the integral ∫_a^b w dw!
Consider the approximating sums

S_n := Σ_k w(t_k^{(n)}) (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))

and

I_n := Σ_k w(t_{k−1}^{(n)}) (w(t_k^{(n)}) − w(t_{k−1}^{(n)})).

In the first case τ_k^{(n)} := t_k^{(n)} and in the second case τ_k^{(n)} := t_{k−1}^{(n)}. Obviously

S_n − I_n = Σ_k (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))²,

which is the approximating sum of the quadratic variation of the Wiener process. As we will prove1, if n → ∞ then in L²(Ω)-norm

lim_{n→∞} (S_n − I_n) = b − a ≠ 0,

that is, the limit of the approximating sums depends on the choice of the test points τ_k^{(n)}. As the interpretation of the stochastic integral is basically the net gain of some gambling process, it is quite reasonable to choose τ_k^{(n)} as t_{k−1}^{(n)}: one should decide about the size of a portfolio before the prices change, since it is quite unrealistic to assume that one can decide about the size of an investment after the new prices have already been announced. It is very simple to see that

I_n = (1/2) Σ_k (w²(t_k^{(n)}) − w²(t_{k−1}^{(n)})) − (1/2) Σ_k (w(t_k^{(n)}) − w(t_{k−1}^{(n)}))²,

1 See: Example 2.27, page 129, Theorem B.17, page 571.
2
(n) w(tk−1 )
=
ˆ THE ITO–STIELTJES INTEGRALS
111
hence lim In =
n→∞
=
1 & 2 'b 1 w (t) a − (b − a) = 2 2 1 1 2 w (b) − w2 (a) − (b − a) , 2 2
and similarly lim Sn =
n→∞
=
2.1.1
1 & 2 'b 1 w (t) a − (b − a) + (b − a) = 2 2 1 1 2 w (b) − w2 (a) + (b − a) . 2 2
Itˆ o–Stieltjes integrals when the integrators have finite variation
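Before specializing to finite-variation integrators, it is instructive to replay Example 2.1 numerically. A minimal sketch, assuming numpy: the left- and right-endpoint sums satisfy the algebraic identities of Example 2.1 exactly, and their difference, the quadratic-variation sum, concentrates around b − a:

```python
import numpy as np

rng = np.random.default_rng(6)
a, b, n = 0.0, 1.0, 20_000
t = np.linspace(a, b, n + 1)
w = np.concatenate([[0.0], np.cumsum(np.sqrt(np.diff(t)) * rng.standard_normal(n))])
dw = np.diff(w)

S = np.sum(w[1:] * dw)      # right-endpoint test points
I = np.sum(w[:-1] * dw)     # left-endpoint (Ito-Stieltjes) test points
qv = np.sum(dw ** 2)        # quadratic-variation sum

# exact algebraic identities from Example 2.1
assert abs(I - (0.5 * (w[-1] ** 2 - w[0] ** 2) - 0.5 * qv)) < 1e-9
assert abs(S - (0.5 * (w[-1] ** 2 - w[0] ** 2) + 0.5 * qv)) < 1e-9
# the quadratic-variation sum is close to b - a, so S and I differ in the limit
assert abs(qv - (b - a)) < 0.1
```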
Integration theory is quite simple when the trajectories of the integrator X have finite variation on every finite interval. As a point of departure it is worth recalling a classical theorem from elementary analysis. The following simple proposition is well known; it is just a parametrized version of one of the most important existence theorems of the calculus.

Proposition 2.2 (Existence of Riemann–Stieltjes integrals) Let us fix a finite time interval [a, b]. If the trajectories of the integrator X have finite variation and the integrand Y is continuous, then for all outcomes ω the limit of the approximating sums

S_n := Σ_{k=1}^{m_n} Y(τ_k^{(n)}) (X(t_k^{(n)}) − X(t_{k−1}^{(n)}))    (2.2)

exists and it is independent of the choice of the infinitesimal sequence of partitions (2.1) and of the choice of the test points τ_k^{(n)} ∈ [t_{k−1}^{(n)}, t_k^{(n)}].

Proof. As the trajectories Y(ω) are continuous on [a, b], they are uniformly continuous, and therefore for any ε > 0 there is a δ(ω) > 0 such that if |t′ − t′′| < δ(ω) then2

|Y(t′, ω) − Y(t′′, ω)| < ε / (2 · Var(X(ω), a, b)).    (2.3)

2 We can assume that Var(X(ω), a, b) > 0, otherwise X(ω) is constant on [a, b] and the integral trivially exists.
If all partitions of [a, b] are finer than δ(ω)/2, that is, if for all n

max_k (t_k^{(n)} − t_{k−1}^{(n)}) < δ(ω)/2,

then by (2.3)

0 ≤ |S_i − S_j| = |Σ_k Y(τ_k^{(i)})(X(t_k^{(i)}) − X(t_{k−1}^{(i)})) − Σ_l Y(τ_l^{(j)})(X(t_l^{(j)}) − X(t_{l−1}^{(j)}))| =
= |Σ_r (Y(θ_r^{(i)}) − Y(θ_r^{(j)}))(X(s_r) − X(s_{r−1}))| ≤
≤ max_r |Y(θ_r^{(i)}) − Y(θ_r^{(j)})| · Σ_r |X(s_r) − X(s_{r−1})| ≤
≤ max_r |Y(θ_r^{(i)}) − Y(θ_r^{(j)})| · Var(X, a, b) ≤ ε,

where (s_r) is any partition containing the points (t_k^{(i)}) and (t_l^{(j)}), and θ_r^{(i)} and θ_r^{(j)} are the original test points τ_k^{(i)} and τ_k^{(j)} corresponding to [s_{r−1}, s_r], respectively. So for any ω, (S_n(ω)) is a Cauchy sequence, so for all ω the limit

(∫_a^b Y dX)(ω) := lim_{n→∞} S_n(ω)

exists. If (S_p) and (S_q) are two different approximating sequences generated by different infinitesimal sequences of partitions of [a, b], or they belong to different choices of test points, and

I_n := { S_p if n = 2p; S_q if n = 2q − 1 },

then by the argument just presented (I_n) also has a limit, which is of course the common limit of (S_p) and (S_q). Hence the limit depends neither on the infinitesimal sequence of partitions (t_k^{(n)}) nor on the way of choosing the test points (τ_k^{(n)}).

Definition 2.3 If the value of the integral is independent of the choice of the test points (τ_k^{(n)}), then the integral is called the Riemann–Stieltjes integral of Y against X. Of course the integral is denoted by ∫_a^b Y dX.
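Proposition 2.2 and Definition 2.3 can be illustrated with a smooth finite-variation integrator. In the sketch below (assuming numpy; X = sin and Y = exp are arbitrary illustrative choices, with ∫₀¹ eᵗ d(sin t) = ∫₀¹ eᵗ cos t dt) both endpoint choices converge to the same value:

```python
import numpy as np

def rs_sum(Y, X, grid, use_left):
    """sum_k Y(tau_k)(X(t_k) - X(t_{k-1})) with tau_k the left or right endpoint."""
    tau = grid[:-1] if use_left else grid[1:]
    return float(np.sum(Y(tau) * np.diff(X(grid))))

X, Y = np.sin, np.exp                      # smooth X => finite variation on [0, 1]
exact = (np.e * (np.sin(1.0) + np.cos(1.0)) - 1.0) / 2.0   # integral of e^t cos t

grid = np.linspace(0.0, 1.0, 2001)
left, right = rs_sum(Y, X, grid, True), rs_sum(Y, X, grid, False)
# with a finite-variation integrator and continuous integrand the test points do not matter
assert abs(left - exact) < 1e-2 and abs(right - exact) < 1e-2
```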
Example 2.4 If Y and X have common points of discontinuity then the Riemann–Stieltjes integral ∫_a^b Y dX does not exist.

If

Y(t) := { 0 if t ≤ 0; 1 if t > 0 }    and    X(t) := { 0 if t < 0; 1 if t ≥ 0 },

then the Riemann–Stieltjes integral ∫_{−1}^{1} X dY does not exist: if τ_k^{(n)} ≤ 0 for the subinterval containing t = 0 then S_n = 0, otherwise S_n = 1. Observe that if the test point τ_k^{(n)} is the left endpoint of the subinterval, then S_n = 0, hence the so-called Itô–Stieltjes integral3 is zero.

Our goal is to extend the integral to discontinuous integrands. As a first step, we extend the integral to regular integrands. As we saw in the previous example, even for left-regular integrands we cannot choose the test points τ_k^{(n)} arbitrarily.

Definition 2.5 If the value of the test point τ_k^{(n)} is always the left endpoint of the subinterval [t_{k−1}^{(n)}, t_k^{(n)}], that is, if τ_k^{(n)} = t_{k−1}^{(n)} for all k, then the integral is called the Itô–Stieltjes integral of Y against X. Of course the Itô–Stieltjes integrals are also denoted by ∫_a^b Y dX.

Example 2.6 If f is a simple predictable jump, that is,

f(t) := { c₁ if t ≤ t₀; c₂ if t > t₀ },

then for any regular function g the Itô–Stieltjes integral is

∫_a^b f dg = c₁ (g(t₀+) − g(a)) + c₂ (g(b) − g(t₀+)).    (2.4)

If f is a simple jump, that is,

f(t) := { c₁ if t < t₀; c₃ if t = t₀; c₂ if t > t₀ },

then for any right-regular function g the Itô–Stieltjes integral is again (2.4).

3 See the definition below.
If t₀ = b then by definition g(t₀+) = g(b+) := g(b), so in this case (2.4) is obvious. Let (t_k^{(n)}) be an infinitesimal sequence of partitions. By the definition of the integral

S_n := Σ_k f(t_{k−1}^{(n)}) (g(t_k^{(n)}) − g(t_{k−1}^{(n)})) = c₁ (g(t_j^{(n)}) − g(a)) + c₂ (g(b) − g(t_j^{(n)})),

where t₀ ∈ [t_{j−1}^{(n)}, t_j^{(n)}). If n → ∞ then t_j^{(n)} ↘ t₀+, and as g is regular the limit lim_n S_n exists and is equal to the formula given. Assume now that g is right-regular. If t₀ ≠ t_{j−1}^{(n)} then the approximating sums do not change. If t₀ = t_{j−1}^{(n)} then

S_n = c₁ (g(t₀) − g(a)) + c₃ (g(t_j^{(n)}) − g(t₀)) + c₂ (g(b) − g(t_j^{(n)})).

g is right-continuous at t₀, so g(t_j^{(n)}) − g(t₀) → 0, hence the limit is again the same as in the previous case.

One can easily generalize the example above4:

Lemma 2.7 If every trajectory of the integrand Y is a step function with a finite number of jumps and X is a right-continuous process, then for arbitrary a < b the Itô–Stieltjes integral ∫_a^b Y dX exists and the approximating sums converge for every outcome ω.

Example 2.8 If f is a simple spike, that is, if

f(t) := { c if t = t₀; 0 if t ≠ t₀ },

then for any right-continuous integrator the Itô–Stieltjes integral of f is zero.
The approximating sum is

S_n = { 0 if t₀ ≠ t_j^{(n)} for every j; c · (g(t_{j+1}^{(n)}) − g(t_j^{(n)})) if t₀ = t_j^{(n)} }.

4 Let us observe that the Itô–Stieltjes integral is, trivially, additive.
In the first case of course lim_n S_n = 0; in the second case, as g is right-continuous,

lim_{n→∞} S_n = c · lim_{n→∞} (g(t_{j+1}^{(n)}) − g(t_j^{(n)})) = c · lim_{n→∞} (g(t_{j+1}^{(n)}) − g(t₀)) = 0.
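The computations of Examples 2.6 and 2.8, and the contrast with the Lebesgue–Stieltjes value, can be replayed in plain Python (the concrete values t₀ = 0.5, c₁ = 2, c₂ = 3, c = 7 are arbitrary illustrative choices):

```python
def ito_sum(f, g, grid):
    """Ito-Stieltjes sum with left-endpoint test points."""
    return sum(f(grid[k - 1]) * (g(grid[k]) - g(grid[k - 1]))
               for k in range(1, len(grid)))

t0, c1, c2, c = 0.5, 2.0, 3.0, 7.0
step  = lambda t: c1 if t <= t0 else c2      # simple predictable jump (Example 2.6)
spike = lambda t: c if t == t0 else 0.0      # simple spike (Example 2.8)
g     = lambda t: 0.0 if t < t0 else 1.0     # right-continuous, one jump at t0

grid = [k / 1000.0 for k in range(1001)]     # t0 = 0.5 is a grid point
# formula (2.4): c1 (g(t0+) - g(a)) + c2 (g(b) - g(t0+)) = c1 here
assert abs(ito_sum(step, g, grid) - c1) < 1e-12
# the spike integrates to zero, while the Lebesgue-Stieltjes value is f(t0) dg(t0) = c
assert ito_sum(spike, g, grid) == 0.0
assert spike(t0) * 1.0 == c
```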
Observe that if g has bounded variation, then g defines a signed measure on R. b The Lebesgue–Stieltjes integral is a f dg = f (t0 )∆g(t0 ) which is different from the Itˆo–Stieltjes integral. Later5 we shall show that for left-regular processes the Lebesgue–Stieltjes and the Itˆ o–Stieltjes integrals are equal but, as in this case f is not left-regular, the theorem is not applicable6 . We shall very often use the following simple observation: Proposition 2.9 (The existence of the Itˆ o–Stieltjes integral) If the integrator X is right-continuous7 and it has finite variation and the integrand Y is b regular then for any time interval [a, b] the Itˆ o–Stieltjes integral a Y dX exists and for all outcome ω the approximating sequences In (ω)
(n) (n) (n) Y (tk−1 , ω) X(tk , ω) − X(tk−1 , ω)
k
are convergent. Proof. The proof is similar to the proof of the existence of Riemann–Stieltjes integrals. Fix an outcome ω and let (In ) be the sequence of the approximating sums. Fix an ε > 0 and an outcome ω. By the regularity of Y (ω) there are only a finite number of jumps bigger than8 c Let J
ε . 4 · Var (X) (a, b, ω)
∆Y · χ (|∆Y | ≥ c) and Z Y − J. (J)
1. Let us denote by (In ) the approximating sums formed with J. As Y is regular the number of ‘big jumps’ on every trajectory is finite. X is right-continuous, b hence by the previous lemma the integral a J (ω) dX (ω) exists for any ω. Hence if i and j are big enough, then ε (J) (J) Ii (ω) − Ij (ω) ≤ . 2 5 It is an easy consequence of the Dominated Convergence Theorem. See: Theorem 2.88, page 174. See also the properties of the stochastic integral on page 434. 6 Recall that the Riemann–Stieltjes integral b f dg does not exist. a 7 If X is not right-continuous then we should assume that Y is left-regular. 8 See: Proposition 1.5, page 5. We can assume that Var (X (ω) , a, b) > 0 otherwise X (ω) is constant on [a, b] and the proposition is trivially satisfied.
116
STOCHASTIC INTEGRATION
2. Finally let us define the approximating sums In(Z)
(n) (n) (n) Z(tk−1 , ω)X (tk , ω) − X(tk−1 , ω) .
k
The jumps of Z are smaller than c and Z is regular, hence9 there is a δ(ω) such that if |s − t| ≤ δ(ω) then |Z(s, ω) − Z(t, ω)| ≤ 2c. (n) (n) If maxk tk − tk−1 ≤ δ(ω)/2 for all n ≥ N then as in the case of the ordinary Riemann–Stieltjes integral ε (Z) (Z) Ii (ω) − Ij (ω) ≤ 2c · Var (X (ω) , a, b) ≤ . 2 3. Adding up the two inequalities above if i and j are sufficiently large then (J) (J) |Ii (ω) − Ij (ω)| ≤ Ii (ω) − Ij (ω) + (Z) (Z) + Ii (ω) − Ij (ω) ≤ ε.
(2.5)
This means that $(I_n(\omega))$ is a Cauchy sequence for any $\omega$. The rest of the proof is the same as the last part of the proof of the previous proposition.

Example 2.10 The Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal.

One should emphasize that, as $X$ has bounded variation, one can also define the pathwise Lebesgue–Stieltjes integral of $Y$ with respect to the measures generated by the trajectories of $X$. If $Y$ is left-continuous then
$$Y = \lim_{n\to\infty} \sum_k Y(t_{k-1}^{(n)})\,\chi\left( \left( t_{k-1}^{(n)}, t_k^{(n)} \right] \right),$$
so by the Dominated Convergence Theorem the two integrals are equal. But in general the Itô–Stieltjes and the Lebesgue–Stieltjes integrals are not equal. If
$$Y(t) = X(t) \triangleq \begin{cases} 0 & \text{if } t < 1/2, \\ 1 & \text{if } t \ge 1/2, \end{cases}$$

9. See: Proposition 1.7, page 6.
THE ITÔ–STIELTJES INTEGRALS
then the measure generated by $X$ is the Dirac measure $\delta_{1/2}$, so the Lebesgue–Stieltjes integral over $(0,1]$ is one, while the Itô–Stieltjes integral is zero¹⁰.

2.1.2 Itô–Stieltjes integrals when the integrators are locally square-integrable martingales

Perhaps the most important stochastic processes are the Wiener processes. As the trajectories of Wiener processes almost surely do not have finite variation¹¹, we cannot apply the previous construction when the integrator is a Wiener process.

Theorem 2.11 (Fisk) Let $L$ be a continuous local martingale. If the trajectories of $L$ have finite variation then for almost all outcomes $\omega$ the trajectories of $L$ are constant functions.

Proof. Consider the local martingale $M \triangleq L - L(0)$. It is sufficient to prove that $M = 0$. Let $V \triangleq \operatorname{Var}(M)$ and let $(\rho_n)$ be a localizing sequence of $M$. As the variation of a continuous function is continuous,
$$\upsilon_n(\omega) \triangleq \inf\{t : |M(t,\omega)| \ge n\} \quad\text{and}\quad \kappa_n(\omega) \triangleq \inf\{t : V(t,\omega) \ge n\}$$
are stopping times. Hence $\tau_n \triangleq \upsilon_n \wedge \kappa_n \wedge \rho_n$ is also a stopping time. Obviously $\tau_n \nearrow \infty$, hence if $M^{\tau_n} = 0$ for all $n$ then $M$ is zero on $[0,\tau_n]$ for all $n$, and therefore $M$ is zero on $\cup_n [0,\tau_n] = \mathbb{R}^+ \times \Omega$, so $M = 0$. As the trajectories of $M^{\tau_n}$ and $V^{\tau_n}$ are bounded, one can assume that $M$ and $V \triangleq \operatorname{Var}(M)$ are bounded. Let $(t_k^{(n)})$ be an arbitrary infinitesimal sequence of partitions of $[0,t]$. By the energy identity¹², if $u > v$, then
$$\mathbf{E}\left( \left( M(u) - M(v) \right)^2 \right) = \mathbf{E}\left( M^2(u) - M^2(v) \right), \tag{2.6}$$
hence, as $M(0) = 0$,
$$\mathbf{E}\left( M^2(t) \right) = \mathbf{E}\left( M^2(t) \right) - \mathbf{E}\left( M^2(0) \right) = \mathbf{E}\left( \sum_k M^2(t_k^{(n)}) - M^2(t_{k-1}^{(n)}) \right) = \mathbf{E}\left( \sum_k \left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right)^2 \right).$$

10. See: Example 2.6, page 113.
11. See: Theorem B.17, page 571.
12. See: Proposition 1.58, page 35.
$V$ is bounded, hence $V \triangleq \operatorname{Var}(M) \le c$. Therefore
$$\mathbf{E}\left( M^2(t) \right) \le \mathbf{E}\left( \sum_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \cdot \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) \le \mathbf{E}\left( V(t) \cdot \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) \le c \cdot \mathbf{E}\left( \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right).$$
The trajectories of $M$ are continuous, hence they are uniformly continuous on $[0,t]$, so
$$\lim_{n\to\infty} \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| = 0.$$
On the other hand,
$$\max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \le V(t) \le c,$$
so we can use the Dominated Convergence Theorem:
$$\lim_{n\to\infty} \mathbf{E}\left( \max_k \left| M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right| \right) = 0.$$
Hence $M(t) \overset{a.s.}{=} 0$ for every $t$. The trajectories of $M$ are continuous, and therefore¹³ for almost all outcomes $\omega$ one has $M(t,\omega) = 0$ for all $t$.

This means that when the integrators are continuous local martingales we need another approach. First we prove two very simple lemmata:

Lemma 2.12 Let $(M_k, \mathcal{F}_k)$ be a discrete-time martingale and let $(N_k)$ be an $\mathbf{F} \triangleq (\mathcal{F}_k)$-adapted process. If the variables $N_{k-1} \cdot (M_k - M_{k-1})$ are integrable then the sequence
$$Z_0 \triangleq 0, \qquad Z_n \triangleq \sum_{k=1}^n N_{k-1} \cdot (M_k - M_{k-1})$$

13. See: Proposition 1.9, page 7.
is an $\mathbf{F}$-martingale. Specifically, if $N$ is uniformly bounded and $M$ is an arbitrary discrete-time martingale, then $Z$ is a martingale.

Proof. By the assumptions $N_{k-1} \cdot (M_k - M_{k-1})$ is integrable, hence if $k-1 \ge m$ then
$$\mathbf{E}\left( N_{k-1}(M_k - M_{k-1}) \mid \mathcal{F}_m \right) = \mathbf{E}\left( \mathbf{E}\left( N_{k-1}(M_k - M_{k-1}) \mid \mathcal{F}_{k-1} \right) \mid \mathcal{F}_m \right) = \mathbf{E}\left( N_{k-1}\,\mathbf{E}\left( M_k - M_{k-1} \mid \mathcal{F}_{k-1} \right) \mid \mathcal{F}_m \right) = \mathbf{E}\left( N_{k-1} \cdot 0 \mid \mathcal{F}_m \right) = 0,$$
from which the lemma is evident.

Lemma 2.13 Let $(M_k, \mathcal{F}_k)$ be a discrete-time $L^2(\Omega)$-valued martingale. If $|N_k| \le c$ is an $\mathbf{F}$-adapted sequence and
$$Z_0 \triangleq 0, \qquad Z_n \triangleq \sum_{k=1}^n N_{k-1} \cdot (M_k - M_{k-1}),$$
then
$$\|Z_n\|_2^2 \le c^2\left( \|M_n\|_2^2 - \|M_0\|_2^2 \right).$$

Proof. By the previous lemma $(Z_n)$ is a martingale, so by the energy equality
$$\|Z_n\|_2^2 = \sum_{k=1}^n \left\| N_{k-1}(M_k - M_{k-1}) \right\|_2^2.$$
Using the energy equality again,
$$\|Z_n\|_2^2 \le c^2 \sum_{k=1}^n \|M_k - M_{k-1}\|_2^2 = c^2 \sum_{k=1}^n \left( \|M_k\|_2^2 - \|M_{k-1}\|_2^2 \right) = c^2\left( \|M_n\|_2^2 - \|M_0\|_2^2 \right).$$

First we prove the existence of the integral for continuous integrands.

Proposition 2.14 (Existence of Itô–Stieltjes integrals for continuous integrands) If $X \in \mathcal{H}^2$ and $Y$ is adapted and continuous on a finite interval
$[a,b]$, then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums
$$I_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right)$$
converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation.

1. The basic, but not entirely correct, trick is that as $Y$ is continuous it is uniformly continuous; hence if $I_n$ and $I_m$ are two approximating sums of the integral, then by the previous lemma, writing $(t_k)$ for the union of the two partitions and $t'_{k-1}$, $t''_{k-1}$ for the points of the respective partitions lying directly below $t_{k-1}$,
$$\|I_n - I_m\|_2 = \left\| \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) - \sum_k Y(t_{k-1}^{(m)})\left( X(t_k^{(m)}) - X(t_{k-1}^{(m)}) \right) \right\|_2 = \left\| \sum_k \left( Y(t'_{k-1}) - Y(t''_{k-1}) \right)\left( X(t_k) - X(t_{k-1}) \right) \right\|_2 \le c\sqrt{\|X(b)\|_2^2 - \|X(a)\|_2^2}.$$
Of course, the main problem with this estimation is that one cannot guarantee that for any fixed partition
$$\left| Y(t'_{k-1},\omega) - Y(t''_{k-1},\omega) \right| \le c \tag{2.7}$$
for every $\omega$. What one can show is that if the partitions $(t_k^{(n)})$ and $(t_k^{(m)})$ are sufficiently fine then, outside of an event with small probability, the estimation (2.7) is valid. That is the reason why one can prove only that the approximating sums converge in probability and not in $L^2(\Omega)$.

2. To give the correct proof, fix an $\alpha$ and a $\beta$ and let
$$c \triangleq \sqrt{ \frac{\beta\alpha^2}{2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)} }.$$
For every $\delta > 0$ let us define the modulus of continuity of $Y$:
$$M_\delta(\omega,u) \triangleq \sup\left\{ |Y(t,\omega) - Y(s,\omega)| : |t-s| \le \delta,\ t,s \in [a,u] \right\}.$$
As $Y$ is continuous, one can calculate the supremum over rational $s$ and $t$, so $M_\delta$ is adapted; and as $Y$ is continuous, obviously $M_\delta$ is also continuous.
$Y$ is continuous, so every trajectory of $Y$ is uniformly continuous on $[a,b]$; hence for every $\omega$
$$\lim_{\delta\searrow 0} M_\delta(\omega,b) = 0.$$
This means that if $\delta$ is sufficiently small then
$$\mathbf{P}(M_\delta(b) \ge c) \le \frac{\beta}{2}.$$
Fix this $\delta$ and define the stopping time
$$\tau \triangleq \inf\{u : M_\delta(u) \ge c\} \wedge b.$$
As $\tau$ is a stopping time, $Z \triangleq Y^\tau$ is adapted, and if $|x-y| \le \delta$ then $|Z(x) - Z(y)| \le c$. Let
$$I_n^{(Z)} \triangleq \sum_k Z(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right).$$
If the partitions $(t_k^{(i)})$ and $(t_k^{(j)})$ are finer than $\delta/2$ then by the previous lemma
$$\left\| I_i^{(Z)} - I_j^{(Z)} \right\|_2^2 \le c^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right) = \frac{\beta\alpha^2}{2}.$$
Let $A \triangleq \{M_\delta(b) \ge c\}$. It is easy to see that $Z = Y$ on $A^c$. By Chebyshev's inequality
$$\mathbf{P}(|I_i - I_j| > \alpha) = \mathbf{P}\left( \{|I_i - I_j| > \alpha\} \cap A \right) + \mathbf{P}\left( \{|I_i - I_j| > \alpha\} \cap A^c \right) \le \mathbf{P}(A) + \mathbf{P}\left( \left\{ \left| I_i^{(Z)} - I_j^{(Z)} \right| > \alpha \right\} \cap A^c \right) \le \frac{\beta}{2} + \mathbf{P}\left( \left| I_i^{(Z)} - I_j^{(Z)} \right| > \alpha \right) \le \frac{\beta}{2} + \frac{\left\| I_i^{(Z)} - I_j^{(Z)} \right\|_2^2}{\alpha^2} \le \frac{\beta}{2} + \frac{c^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)}{\alpha^2} = \frac{\beta}{2} + \frac{\beta}{2} = \beta.$$
Hence $(I_n)$ is convergent in probability.

Now we generalize the theorem to regular integrands.
Proposition 2.15 (The existence of the Itô–Stieltjes integral for $\mathcal{H}^2$ integrators) If on a finite interval $[a,b]$ the adapted stochastic process $Y$ is regular and $X \in \mathcal{H}^2$ then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the Itô-type approximating sums converge in probability.

Proof. The proof is similar to the proof of the existence of the integral when the integrator has finite variation. Let $(I_n)$ be an approximating sequence of the integral $\int_a^b Y\,dX$. Fix an $\varepsilon$ and a $\beta$, and let again
$$c \triangleq \sqrt{ \frac{\beta\varepsilon^2}{48\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)} }, \qquad J \triangleq \Delta Y\,\chi(|\Delta Y| \ge c), \qquad Z \triangleq Y - J.$$

1. As the trajectories of $Y$ are regular, for any $\omega$ the trajectory $Y(\omega)$ has a finite number of jumps which are larger than $c$. $X \in \mathcal{H}^2$ and by definition $X$ is right-continuous, hence the integral $\int_a^b J\,dX$ exists. As it converges for every outcome $\omega$, it converges stochastically as well, so if $i$ and $j$ are big enough then
$$\mathbf{P}\left( \left| I_i^{(J)} - I_j^{(J)} \right| > \frac{\varepsilon}{2} \right) \le \frac{\beta}{3}.$$

2. The jumps of $Z$ are smaller than $c$. As in the continuous case¹⁴, if $\delta > 0$ is small enough then there is a stopping time $\tau$ such that
$$\mathbf{P}(\tau < b) \triangleq \mathbf{P}(A) \le \frac{\beta}{3}$$
and if $|x-y| \le \delta$ then $|Z(x) - Z(y)| \le 2c$ on the random interval $[a,\tau]$. If $V \triangleq Z^\tau$ then $|V(x) - V(y)| \le 2c$ whenever $|x-y| \le \delta$. If the partitions $(t_k^{(i)})$ and $(t_k^{(j)})$ are finer than $\delta/2$ then, again as in the continuous case,
$$\left\| I_i^{(V)} - I_j^{(V)} \right\|_2^2 \le (2c)^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right).$$
By Chebyshev's inequality
$$\mathbf{P}\left( \left| I_i^{(V)} - I_j^{(V)} \right| > \frac{\varepsilon}{2} \right) \le \frac{(2c)^2\left( \|X(b)\|_2^2 - \|X(a)\|_2^2 \right)}{(\varepsilon/2)^2} = \frac{\beta}{3}.$$

14. See: Proposition 1.7, page 6.
3. If $i$ and $j$ are big enough, then
$$\mathbf{P}(|I_i - I_j| > \varepsilon) \le \mathbf{P}\left( \left| I_i^{(J)} - I_j^{(J)} \right| > \frac{\varepsilon}{2} \right) + \mathbf{P}\left( \left| I_i^{(Z)} - I_j^{(Z)} \right| > \frac{\varepsilon}{2} \right) \le \frac{\beta}{3} + \mathbf{P}(A) + \mathbf{P}\left( A^c \cap \left\{ \left| I_i^{(Z)} - I_j^{(Z)} \right| > \frac{\varepsilon}{2} \right\} \right) \le \frac{2\beta}{3} + \mathbf{P}\left( \left| I_i^{(V)} - I_j^{(V)} \right| > \frac{\varepsilon}{2} \right) \le \beta.$$
This means that $(I_n)$ is a Cauchy sequence in probability and hence it converges in probability.

Corollary 2.16 Let $Y$ be an adapted, regular process on a finite interval $[a,b]$. If $X \in \mathcal{H}^2_{loc}$ then the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums converge in probability.

Proof. Assume that $X \in \mathcal{H}^2_{loc}$ and let $(\tau_n)$ be a localizing sequence of $X$. As $\tau_n \nearrow \infty$, for any $\beta > 0$ if $s$ is big enough then $\mathbf{P}(\tau_s \le b) < \beta/2$. Let
$$I_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right), \qquad S_n \triangleq \sum_k Y(t_{k-1}^{(n)})\left( X^{\tau_s}(t_k^{(n)}) - X^{\tau_s}(t_{k-1}^{(n)}) \right).$$
For any $\alpha > 0$
$$\mathbf{P}(|I_n - I_m| > \alpha) \le \mathbf{P}(\tau_s \le b) + \mathbf{P}(|I_n - I_m| > \alpha,\ \tau_s \ge b) \le \frac{\beta}{2} + \mathbf{P}(|S_n - S_m| > \alpha).$$
As $X^{\tau_s} \in \mathcal{H}^2$, by the previous proposition $\mathbf{P}(|S_n - S_m| > \alpha) \to 0$. Hence $(I_n)$ is a stochastic Cauchy sequence, so it is convergent in probability.
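Though not part of the argument, the Itô-type sums above can be watched converging in a simulation (a sketch; the Brownian grid, the seed, and the choice $Y = X = w$ on $[0,1]$ are illustrative assumptions, with the known limit $\int_0^1 w\,dw = (w^2(1)-1)/2$):

```python
import numpy as np

# Left-endpoint (Ito-type) approximating sums for the integral of w dw on [0, 1].
rng = np.random.default_rng(0)
n_fine = 2**14
dw = rng.normal(0.0, np.sqrt(1.0 / n_fine), size=n_fine)
w = np.concatenate([[0.0], np.cumsum(dw)])          # Brownian path, w(0) = 0

def ito_sum(step):
    """Left-endpoint approximating sum over the subgrid of the given step."""
    wk = w[::step]
    return float(np.sum(wk[:-1] * np.diff(wk)))

exact = 0.5 * (w[-1]**2 - 1.0)                      # known limit: (w(1)^2 - 1)/2
errors = [abs(ito_sum(s) - exact) for s in (1024, 64, 4, 1)]
print(errors)                                       # the error typically shrinks
```

On the finest grid the sum equals $\tfrac{1}{2}(w^2(1) - \sum_k (\Delta w_k)^2)$ exactly, which is the algebraic identity driving the convergence.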
2.1.3 Itô–Stieltjes integrals when the integrators are semimartingales
As we can integrate with respect to processes with finite variation and with respect to locally square-integrable martingales, the next definition is very natural:

Definition 2.17 An adapted process $X$ is called a semimartingale if $X$ has a decomposition
$$X = X(0) + V + H, \tag{2.8}$$
where $V$ is a right-continuous, adapted process with finite variation, $H \in \mathcal{H}^2_{loc}$, and $V(0) = H(0) = 0$.
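A discrete-time sketch may make the decomposition concrete (a hypothetical illustration, not from the text: a drifted random walk, where the predictable drift plays the role of $V$ and the centred walk the role of $H$):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, n = 0.3, 1000
steps = mu + rng.normal(size=n)              # i.i.d. increments with mean mu
X = np.concatenate([[0.0], np.cumsum(steps)])

# discrete analogue of X = X(0) + V + H:
V = mu * np.arange(n + 1)                    # predictable, increasing (finite variation)
H = X - X[0] - V                             # centred walk: a martingale

assert np.allclose(X, X[0] + V + H)          # the decomposition is exact
assert np.all(np.diff(V) > 0)                # V is increasing
```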
It is important to emphasize that at the moment we do not know too much about the class of semimartingales. As there are martingales which are not locally square-integrable, it is not even evident from the definition that every martingale is a semimartingale. Later we shall prove that every local martingale is a semimartingale in the above sense¹⁵. We shall also prove that every integrable sub- and supermartingale is a semimartingale¹⁶. Therefore the class of semimartingales is a very broad one. Every continuous local martingale is locally square-integrable¹⁷, therefore in the continuous case we can use the following definition:

Definition 2.18 An adapted continuous stochastic process $X$ is called a continuous semimartingale if $X$ has a decomposition (2.8) where $H$ is a continuous local martingale and $V$ is a continuous, adapted process with finite variation.

Proposition 2.19 If $X$ is a continuous semimartingale then the decomposition (2.8) is unique.

Proof. If $X = X(0) + H_1 + V_1$ and $X = X(0) + H_2 + V_2$ then $H_1 - H_2 = V_2 - V_1$ is a continuous local martingale having finite variation. Hence by Fisk's theorem¹⁸ $H_1 - H_2 = V_2 - V_1 = 0$.

Example 2.20 For discontinuous semimartingales the decomposition (2.8) is not necessarily unique.

15. This is the so-called Fundamental Theorem of Local Martingales. See: Theorem 3.57, page 220.
16. This is a direct consequence of the so-called Doob–Meyer decomposition. See: Proposition 5.11, page 303.
17. See: Example 1.137, page 96.
18. See: Theorem 2.11, page 117.
The simplest example is the compensated Poisson process. If $\pi$ is a Poisson process with parameter $\lambda$ then the compensated Poisson process $X(t) \triangleq \pi(t) - \lambda t$ is in $\mathcal{H}^2_{loc}$, and the trajectories of $X$ on any finite interval have finite variation. So $H \triangleq X$, $V \triangleq 0$ and $H \triangleq 0$, $V \triangleq X$ are both proper decompositions of $X$.

Almost surely convergent sequences are convergent in probability, therefore one can easily prove the following theorem:

Theorem 2.21 (Existence of Itô–Stieltjes integrals) If $X$ is a semimartingale and $Y$ is a regular and adapted process then for any finite interval $[a,b]$ the Itô–Stieltjes integral $\int_a^b Y\,dX$ exists and the approximating sums converge in probability. The value of the integral is independent of the value of the jumps of $Y$; that is, for any regular $Y$,
$$\int_a^b Y\,dX = \int_a^b Y_-\,dX = \int_a^b Y_+\,dX.$$

Proof. We have already proved the first part of the theorem. Let $(I_n)$ be the sequence of approximating sums for $\int_a^b Y\,dX$ and let $(S_n)$ be the sequence of approximating sums when the integrand is $Y_-$. We need to prove that
$$I_n - S_n = \sum_k \left( Y(t_{k-1}^{(n)}) - Y_-(t_{k-1}^{(n)}) \right)\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) \xrightarrow{\ \mathbf{P}\ } 0. \tag{2.9}$$
Observe that the situation is very similar to that in the proof of Proposition 2.15. We can separate the big jumps and the small jumps and apply the same argument as above¹⁹.

Example 2.22 Wiener integrals.
The simplest case of stochastic integration is the so-called Wiener integral: the integrator is a Wiener process $w$ and the integrand is a deterministic function $f$. If $f$ is regular then $f$, as a stochastic process, is adapted and regular, hence by the above theorem the expression $\int_a^b f(s)\,dw(s)$ is meaningful. The increments of a Wiener process are independent. As the sum of independent normally distributed variables is again normally distributed,
$$\sum_i f(t_{i-1}^{(n)})\left( w(t_i^{(n)}) - w(t_{i-1}^{(n)}) \right) \cong N\left( 0, \sum_i f^2(t_{i-1}^{(n)})\left( t_i^{(n)} - t_{i-1}^{(n)} \right) \right).$$

19. See: Example 2.8, page 114.
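This normality of the approximating sums can be checked by simulation (a sketch; the choice $f(s) = s$ on $[0,1]$, the grid, and the seed are arbitrary assumptions, giving variance $\sum_i f^2(t_{i-1})\Delta t_i \approx \int_0^1 s^2\,ds = 1/3$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 512, 4000
t = np.linspace(0.0, 1.0, n + 1)
f = t[:-1]                                   # f evaluated at the left endpoints
dw = rng.normal(0.0, np.sqrt(1.0 / n), size=(reps, n))
samples = dw @ f                             # sums f(t_{i-1}) (w(t_i) - w(t_{i-1}))

# predicted distribution: approximately N(0, 1/3)
print(samples.mean(), samples.var())
```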
Stochastic convergence implies convergence in distribution, hence
$$\int_a^b f\,dw \cong N\left( 0, \int_a^b f^2(t)\,dt \right),$$
where $N(\mu,\sigma^2)$ denotes the normal distribution with expected value $\mu$ and variance $\sigma^2$.

2.1.4 Properties of the Itô–Stieltjes integral
The next properties of the Itô–Stieltjes integral are obvious:

Proposition 2.23 If $X_1$, $X_2$ and $X$ are semimartingales, $Y_1$, $Y_2$ and $Y$ are adapted regular processes, and $\alpha$ and $\beta$ are constants, then:
1. $\alpha\int_a^b Y_1\,dX + \beta\int_a^b Y_2\,dX \overset{a.s.}{=} \int_a^b (\alpha Y_1 + \beta Y_2)\,dX$.
2. $\int_a^b Y\,d(\alpha X_1 + \beta X_2) \overset{a.s.}{=} \alpha\int_a^b Y\,dX_1 + \beta\int_a^b Y\,dX_2$.
3. If $a < c < b$, then $\int_a^b Y\,dX \overset{a.s.}{=} \int_a^c Y\,dX + \int_c^b Y\,dX$.
4. If $Y_1\chi_A$ is an equivalent modification of $Y_2\chi_A$ for some $A \subseteq \Omega$ then the integrals $\int_a^b Y_1\,dX$ and $\int_a^b Y_2\,dX$ are almost surely equal on $A$.

Since the approximating sums converge only in probability, it is important to note that the Itô–Stieltjes integral is defined only as an equivalence class. In the following we shall not distinguish between functions and equivalence classes, so when it is not important to emphasize this difference we shall use the simpler sign $=$ instead of $\overset{a.s.}{=}$.

2.1.5 The integral process

Let us briefly investigate the integral process
$$(Y \bullet X)(t) \triangleq \int_a^t Y\,dX.$$
We have defined the stochastic integral only for fixed time intervals. On every time interval the definition determines the value of the stochastic integral up to a measure-zero set, hence the properties of the integral process $t \mapsto (Y \bullet X)(t)$ are unclear. It is not a stochastic process, just an indexed set of random variables! When does it have a version which is a martingale? Assume that $X \in \mathcal{H}^2$ and that $Y$ is adapted. Assume also that $Y$ is uniformly bounded, that is, $|Y| \le c$ for some constant $c$. As the filtration $\mathbf{F}$ is right-continuous, the right-regular process $Y_+$ is also adapted. As we have seen²⁰, for every $t \in [a,b]$
$$\|I_n(t)\|_2^2 \triangleq \mathbf{E}\left( \left( \sum_k Y_+(t_{k-1}^{(n)} \wedge t)\left( X(t_k^{(n)} \wedge t) - X(t_{k-1}^{(n)} \wedge t) \right) \right)^2 \right) \le c^2\left( \mathbf{E}\left( X^2(b) \right) - \mathbf{E}\left( X^2(a) \right) \right) \triangleq K,$$
hence the sequence
$$I_n(t) \triangleq \sum_k Y_+(t_{k-1}^{(n)} \wedge t)\left( X(t_k^{(n)} \wedge t) - X(t_{k-1}^{(n)} \wedge t) \right)$$
is bounded in $L^2(\Omega)$, so the sequence of the approximating sums is uniformly integrable; hence not only
$$I_n(t) \xrightarrow{\ \mathbf{P}\ } (Y \bullet X)(t) \quad\text{but also}\quad I_n(t) \xrightarrow{\ L^1\ } (Y \bullet X)(t).$$
It is easy to see²¹ that if $s < t$ then $\mathbf{E}(I_n(t) \mid \mathcal{F}_s) = I_n(s)$. As $I_n(t) \xrightarrow{L^1} \int_a^t Y\,dX$, using the $L^1(\Omega)$-continuity of the conditional expectation operator,
$$\mathbf{E}\left( \int_a^t Y\,dX \,\Big|\, \mathcal{F}_s \right) = \int_a^s Y\,dX.$$
Observe that $I_n(t)$ is right-regular, so $I_n$ is a martingale for every $n$. As $I_m - I_n$ is a martingale, by Doob's inequality, for any $\lambda > 0$,
$$\lambda\,\mathbf{P}\left( \sup_t |I_n(t) - I_m(t)| \ge \lambda \right) \le \|I_n(b) - I_m(b)\|_1.$$
$(I_n(b))$ is convergent in $L^1(\Omega)$, so
$$\sup_t |I_n(t) - I_m(t)| \xrightarrow{\ \mathbf{P}\ } 0,$$
hence for a subsequence
$$\sup_t |I_{n_k}(t) - I_{m_k}(t)| \xrightarrow{\ a.s.\ } 0, \tag{2.10}$$
so except for a measure-zero set the continuity-type properties of the trajectories of $(I_n)$ are preserved, and we get the following proposition:

Proposition 2.24 If $Y$ is an adapted, regular, and uniformly bounded process and $X \in \mathcal{H}^2$, then the integral process
$$(Y \bullet X)(t) \triangleq \int_a^t Y\,dX, \qquad t \ge a,$$
has a version which is a martingale. If $(I_n)$ is the sequence of approximating sums then for every $t$
$$\sup_{a\le s\le t} |I_n(s) - (Y \bullet X)(s)| \xrightarrow{\ \mathbf{P}\ } 0. \tag{2.11}$$

20. See: Lemma 2.13, page 119.
21. See: Lemma 2.12, page 118.
If $X$ is continuous and bounded then $Y \bullet X$ has a continuous version.

Let us emphasize that in the argument above the set of exceptional points $N$ in (2.10) is in $\mathcal{F}_b$. Of course we should define the integral process on $N$ as well, and we should guarantee that the integral process is adapted. We can do this only when we assume that $N \in \mathcal{F}_s$ for all $s \le b$. This assumption is part of the usual conditions. Observe that in the continuous case we do not explicitly use the right-continuity of the filtration. On the other hand, this is a very uninteresting remark since, in most cases²², if we add the measure-zero sets to the filtration then the augmented filtration is right-continuous.

2.1.6 Integration by parts and the existence of the quadratic variation
One of the most important concepts of stochastic analysis is the quadratic variation. The main reason to introduce the Itô–Stieltjes integral is that from the existence theorem of the Itô–Stieltjes integral one can easily deduce the existence of the quadratic variation of semimartingales.

Definition 2.25 Let $U$ and $V$ be stochastic processes on $[a,b]$. If for every infinitesimal sequence of partitions $(t_k^{(n)})$ of $[a,b]$ the sequence
$$Q_n \triangleq \sum_k \left( U(t_k^{(n)}) - U(t_{k-1}^{(n)}) \right)\left( V(t_k^{(n)}) - V(t_{k-1}^{(n)}) \right)$$
is convergent in probability, then the limit $\lim_{n\to\infty} Q_n$ is called the quadratic co-variation of $U$ and $V$. The quadratic co-variation of $U$ and $V$ on $[a,b]$ is denoted by $[U,V]_a^b$. If $V = U$ then $[U,U]_a^b \triangleq [U]_a^b$ is called the quadratic variation of $U$. Of course, in stochastic convergence,
$$[U]_a^b \triangleq \lim_{n\to\infty} \sum_k \left( U(t_k^{(n)}) - U(t_{k-1}^{(n)}) \right)^2.$$

22. E.g. if the filtration is generated by a Lévy process. See: Proposition 1.103, page 67.
Example 2.26 If the trajectories of $X$ are continuous and the trajectories of $V$ have finite variation then $[X,V]_a^b \overset{a.s.}{=} 0$ for any interval $[a,b]$.

By the continuity assumption, the trajectories of $X$ are uniformly continuous on the compact interval $[a,b]$. Hence if $\max_k (t_k^{(n)} - t_{k-1}^{(n)}) \to 0$ then for every $\omega$
$$\lim_{n\to\infty} \max_k \left| X(t_k^{(n)},\omega) - X(t_{k-1}^{(n)},\omega) \right| = 0.$$
Therefore, as $\operatorname{Var}(V,a,b) < \infty$,
$$|Q_n| \le \sum_k \left| X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right| \left| V(t_k^{(n)}) - V(t_{k-1}^{(n)}) \right| \le \max_k \left| X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right| \operatorname{Var}(V,a,b) \to 0.$$
Example 2.27 If $w$ is a Wiener process²³ then $[w]_0^t \overset{a.s.}{=} t$. If $\pi$ is a Poisson process then $[\pi]_0^t \overset{a.s.}{=} \pi(t)$.

If $\pi$ is a Poisson process then for any $\omega$ the number of jumps on any finite interval $[0,t]$ is finite, so for any $\omega$ one can assume that, for a fine enough partition, every subinterval contains at most one jump; hence $Q_n(t,\omega)$ is the number of jumps of the trajectory $\pi(\omega)$ during the time interval $[0,t]$. So evidently $Q_n(t,\omega) = \pi(t,\omega)$.

Proposition 2.28 (Integration By Parts Formula) If $M$ and $N$ are semimartingales then:
1. For any finite interval $[a,b]$ the quadratic co-variation $[M,N]_a^b$ exists.
2. The following integration by parts formula holds:
$$(MN)(b) - (MN)(a) = \int_a^b M_-\,dN + \int_a^b N_-\,dM + [M,N]_a^b. \tag{2.12}$$

23. See: Theorem B.17, page 571.
Proof. By definition semimartingales are right-regular processes, so the processes $M_-$ and $N_-$ are well-defined left-regular processes. For any partition $(t_k^{(n)})$ of $[a,b]$ let us define the approximating sums
$$\sum_k M(t_{k-1}^{(n)})\,\Delta N(t_k^{(n)}) + \sum_k N(t_{k-1}^{(n)})\,\Delta M(t_k^{(n)}) + \sum_k \Delta M(t_k^{(n)})\,\Delta N(t_k^{(n)}),$$
where $\Delta M(t_k^{(n)}) \triangleq M(t_k^{(n)}) - M(t_{k-1}^{(n)})$, and similarly for $N$. With elementary calculation, for all $k$,
$$M(t_k^{(n)})N(t_k^{(n)}) - M(t_{k-1}^{(n)})N(t_{k-1}^{(n)}) = M(t_{k-1}^{(n)})\left( N(t_k^{(n)}) - N(t_{k-1}^{(n)}) \right) + N(t_{k-1}^{(n)})\left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right) + \left( M(t_k^{(n)}) - M(t_{k-1}^{(n)}) \right)\left( N(t_k^{(n)}) - N(t_{k-1}^{(n)}) \right).$$
Adding up over $k$, on the left side one gets a telescopic sum which adds up to
$$M(b)N(b) - M(a)N(a),$$
which is the expression on the left-hand side of (2.12). The approximating sums on the right-hand side converge to the Itô–Stieltjes integrals
$$\int_a^b M\,dN = \int_a^b M_-\,dN \quad\text{and}\quad \int_a^b N\,dM = \int_a^b N_-\,dM,$$
so $[M,N]_a^b$ exists and the formula (2.12) holds.

Example 2.29 The jumps of independent Poisson processes.
Let $N_1$ and $N_2$ be two Poisson processes with respect to the same filtration²⁴ $\mathbf{F}$. For $s \ge 0$ let
$$U_i(s,t) \triangleq \frac{\exp(-sN_i(t))}{\mathbf{E}(\exp(-sN_i(t)))}, \qquad i = 1,2,$$
be the exponential martingales defined by the Laplace transforms of the Poisson processes. By the Integration By Parts Formula
$$U_1(s_1,t)\,U_2(s_2,t) - 1 = \int_0^t U_1(s_1,r-)\,U_2(s_2,dr) + \int_0^t U_2(s_2,r-)\,U_1(s_1,dr) + [U_1(s_1),U_2(s_2)](t).$$
It is easy to see that $U_1$ and $U_2$ are bounded martingales with respect to $\mathbf{F}$ for any $s \ge 0$ on any finite interval $[0,t]$. As they are also $\mathbf{F}$-adapted, the stochastic integrals are martingales²⁵, and therefore the expected values of the stochastic integrals are zero. So
$$\mathbf{E}\left( U_1(s_1,t)\,U_2(s_2,t) \right) - 1 = \mathbf{E}\left( [U_1(s_1),U_2(s_2)](t) \right).$$
By the definition of $U_1$ and $U_2$
$$\mathbf{E}\left( \exp\left( -\sum_{i=1}^2 s_i N_i(t) \right) \right) = \prod_{i=1}^2 \mathbf{E}\left( \exp(-s_i N_i(t)) \right)$$
if and only if
$$\mathbf{E}\left( [U_1(s_1),U_2(s_2)](t) \right) = 0. \tag{2.13}$$
That is, $N_1(t)$ and $N_2(t)$ are independent if and only if (2.13) holds²⁶. As the Laplace transform is continuous in time,
$$\Delta U_i(s,r) = \frac{\exp(-sN_i(r)) - \exp(-sN_i(r-))}{\mathbf{E}(\exp(-sN_i(r)))} \le 0,$$
and it is easy to see that
$$[U_1(s_1),U_2(s_2)](t) = \sum_{r\le t} \Delta U_1(s_1,r)\,\Delta U_2(s_2,r) \ge 0.$$
Therefore its expected value is zero if and only if it is almost surely zero. Hence $N_1(t)$ and $N_2(t)$ are independent if and only if, with probability one, $N_1$ and $N_2$ do not have common jumps on the interval $[0,t]$.

24. That is, $N_1$ and $N_2$ are counting Lévy processes with respect to the same filtration.
25. See: Proposition 2.24, page 128.
26. One can easily modify the proof of Lemma 1.96 on page 60.
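In discrete time the Integration By Parts Formula is a purely algebraic telescoping identity, which can be verified exactly on arbitrary sequences standing in for the two semimartingales (a sketch; the random paths are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
M = np.cumsum(rng.normal(size=201))          # two arbitrary discrete "paths"
N = np.cumsum(rng.normal(size=201))
dM, dN = np.diff(M), np.diff(N)

lhs = M[-1] * N[-1] - M[0] * N[0]
# discrete analogue of the right-hand side of (2.12)
rhs = np.sum(M[:-1] * dN) + np.sum(N[:-1] * dM) + np.sum(dM * dN)
assert np.isclose(lhs, rhs)                  # exact for every realization
```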
The next property of the quadratic co-variation is obvious:

Proposition 2.30 If $M$, $N$ and $U$ are arbitrary semimartingales and $\xi$ and $\eta$ are $\mathcal{F}_0$-measurable random variables, then for any interval $[a,b]$
$$[\xi M + \eta N, U]_a^b \overset{a.s.}{=} \xi\,[M,U]_a^b + \eta\,[N,U]_a^b.$$
Specifically,
$$[M+N] \overset{a.s.}{=} [M] + 2[M,N] + [N].$$
Example 2.31 If $X = X(0) + L + V$ is a continuous semimartingale then $[X]_a^b \overset{a.s.}{=} [L]_a^b$ for any interval $[a,b]$, where $L$ is the continuous local martingale part of $X$.

As $V$ and $L$ are continuous and the trajectories of $V$ have finite variation, $[V]_a^b \overset{a.s.}{=} 0$ and $[V,L] \overset{a.s.}{=} 0$. By the additivity:
$$[X]_a^b \triangleq [X(0) + L + V]_a^b \overset{a.s.}{=} [L+V]_a^b \overset{a.s.}{=} [L]_a^b + 2[L,V]_a^b + [V]_a^b \overset{a.s.}{=} [L]_a^b.$$
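The example can be checked numerically (a sketch; the smooth finite-variation perturbation $V(t) = \sin 2\pi t$, the grid, and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2**14
tt = np.linspace(0.0, 1.0, n + 1)
L = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), size=n))])
V = np.sin(2 * np.pi * tt)                   # smooth, hence finite variation
X = L + V

def qv(path):
    return float(np.sum(np.diff(path)**2))

print(qv(X), qv(L), qv(V))                   # qv(V) is tiny, qv(X) is close to qv(L)
```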
Example 2.32 Assume that $F$ is a deterministic, right-regular function with finite variation. If $w$ is a Wiener process then
$$\int_0^t w(s)\,dF(s) \cong N\left( 0, \int_0^t \left( F(t) - F(s) \right)^2\,ds \right).$$

$w$ is continuous and $F$ has finite variation, therefore $[w,F] = 0$. By the integration by parts formula
$$w(t)F(t) = \int_0^t w\,dF + \int_0^t F_-\,dw,$$
hence
$$\int_0^t w\,dF = w(t)F(t) - \int_0^t F_-\,dw = \int_0^t F(t)\,dw - \int_0^t F_-\,dw = \int_0^t \left( F(t) - F(s-) \right)\,dw(s).$$
The last integral is a Wiener integral, so
$$\int_0^t w\,dF \cong N\left( 0, \int_0^t \left( F(t) - F(s-) \right)^2\,ds \right) = N\left( 0, \int_0^t \left( F(t) - F(s) \right)^2\,ds \right).$$
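A simulation sketch of the example (assumptions, not from the text: $F(s) = s$ and $t = 1$, so $\int_0^1 w\,dF$ is just the time average of $w$ and the predicted law is $N(0, \int_0^1 (1-s)^2\,ds) = N(0, 1/3)$):

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 256, 4000
dw = rng.normal(0.0, np.sqrt(1.0 / n), size=(reps, n))
w_paths = np.cumsum(dw, axis=1)              # w evaluated on the grid k/n

samples = w_paths.mean(axis=1)               # approximates the integral of w ds
print(samples.mean(), samples.var())         # compare with (0, 1/3)
```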
As we have remarked, if $X$ has finite variation and $Y$ is continuous then²⁷ $[X,Y] = 0$. Hence in this case the integration by parts formula is
$$XY - X(0)Y(0) = Y \bullet X + X_- \bullet Y.$$
In fact, for this formula we do not need the continuity of $Y$. Observe that, as $X$ has finite variation, every trajectory of $X$ defines a measure on $\mathbb{R}^+$. Let $Y$ be an arbitrary semimartingale, and let $\Delta Y$ denote the jumps of $Y$. We show that in this case
$$[Y,X] = \Delta Y \bullet X,$$
where the integral is the Lebesgue–Stieltjes integral defined by the trajectories of $X$. If $U \triangleq \Delta Y\,\chi(|\Delta Y| \ge \varepsilon)$ are the jumps of $Y$ which are bigger than $\varepsilon$ then, as the number of such jumps on every finite interval is finite,
$$[Y,X] = [Y-U,X] + [U,X] = [Y-U,X] + \sum \Delta Y\,\chi(|\Delta Y| \ge \varepsilon)\,\Delta X = [Y-U,X] + \Delta Y\,\chi(|\Delta Y| \ge \varepsilon) \bullet X.$$
The jumps of the regular process $Z \triangleq Y - U$ are smaller than $\varepsilon$, hence if the partition of the interval $[a,b]$ is fine enough then²⁸
$$\left| Z(t_k^{(n)},\omega) - Z(t_{k-1}^{(n)},\omega) \right| \le 2\varepsilon$$
for any $\omega$. Therefore if $n \to \infty$
$$\left| \sum_k \left( Z(t_k^{(n)}) - Z(t_{k-1}^{(n)}) \right)\left( X(t_k^{(n)}) - X(t_{k-1}^{(n)}) \right) \right| \le 2\varepsilon\,\operatorname{Var}(X,a,b) \to 0.$$
As $X$ has finite variation and the integral is a Lebesgue–Stieltjes integral, one can use the Dominated Convergence Theorem. From this theorem, for every trajectory,
$$\Delta Y\,\chi(|\Delta Y| \ge \varepsilon) \bullet X \to \Delta Y \bullet X = \sum \Delta Y\,\Delta X,$$

27. See: Example 2.26, page 129.
28. See: Proposition 1.7, page 6.
assuming, of course, that for every trajectory, on every finite interval, $|\Delta Y|$ is integrable. But this has to be true, as the trajectories of $Y$ are regular, so on every finite interval every trajectory of $Y$ is bounded²⁹.

Proposition 2.33 If $X$ is right-continuous and has finite variation and $Y$ is an arbitrary semimartingale, then
$$[X,Y] = \sum \Delta Y\,\Delta X = \Delta Y \bullet X, \tag{2.14}$$
therefore³⁰
$$XY - X(0)Y(0) = Y_- \bullet X + X_- \bullet Y + [X,Y] = Y_- \bullet X + X_- \bullet Y + \Delta Y \bullet X = Y \bullet X + X_- \bullet Y,$$
where the integral with respect to $X$ is a Lebesgue–Stieltjes integral and the integral with respect to $Y$ is an Itô–Stieltjes integral.

2.1.7 The Kunita–Watanabe inequality

In the construction of the stochastic integral below we shall use the following simple inequality:

Proposition 2.34 (Kunita–Watanabe inequality) If $X$, $Y$ are product measurable processes, $M$, $N$ are semimartingales, $a \le b \le \infty$ and $V \triangleq \operatorname{Var}([M,N])$, then
$$\int_a^b |XY|\,dV \overset{a.s.}{\le} \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}. \tag{2.15}$$
Remark first that the meaning of the proposition is not entirely clear, as it is not clear what the meaning of $[M]$, $[N]$ and $[M,N]$ is. So far we have defined the quadratic variation only for fixed time intervals; the quadratic variation for every time interval is defined as a limit in stochastic convergence, and hence the quadratic variation on any interval is defined only up to a measure-zero set. If $X$ is a semimartingale then for every $t$ one can define $[X](t) \triangleq [X]_0^t$, but this $[X]$ is not a stochastic process, since for a fixed $\omega$ and $t$ the value of $[X](t,\omega)$ is undefined. Of course, if $t$ is restricted to the set of rational numbers then we can collect the corresponding measure-zero sets into just one measure-zero set, but it is unclear how one can extend this process to the irrational values of $t$, as at the moment we have not proved any continuity property of the quadratic variation. Observe that we do not know anything about integral processes; in particular, we do not know when they will be martingales. If the integral process is a semimartingale then, by definition, it has a right-continuous version, so by (2.12) the quadratic variation also has a right-continuous version. One of the goals of the later developments will be to provide a right-continuous version of the quadratic variation process or, which is the same, to prove some martingale-type properties of the stochastic integral. So, to prove the inequality, up to the end of the section we assume that there are processes $[M]$, $[N]$ and $[M,N]$ which are right-continuous and which for any $t$ provide a version of the related quadratic variation. In this case $[M](\omega)$, $[N](\omega)$ and $\operatorname{Var}([M,N],\omega)$ are increasing, right-continuous functions for every $\omega$, hence they define a measure, and for every $\omega$ the integrals in (2.15) are defined as Lebesgue–Stieltjes integrals.

Proof. It is sufficient to prove the proposition for finite $a$ and $b$. One can prove the case $b = \infty$ by the Monotone Convergence Theorem. Also by the Monotone Convergence Theorem one can assume that $X$ and $Y$ are bounded. It suffices to prove the inequality with $\left| \int_a^b XY\,d[M,N] \right|$ on the left-hand side, since to prove (2.15) one can replace $Y$ by
$$\widetilde{Y} \triangleq Y \cdot \operatorname{sgn}(XY)\,\frac{dV}{d[M,N]}.$$

29. See: Proposition 1.6, page 5.
30. Observe that the Lebesgue–Stieltjes integral $Y \bullet X$ exists: the trajectories of $Y$ are regular, hence they are bounded on every finite interval.
1. First assume that $X = 1$ and $Y = 1$. In this case the inequality is
$$\left| [M,N]_a^b \right| \overset{a.s.}{\le} \sqrt{[M]_a^b}\,\sqrt{[N]_a^b}. \tag{2.16}$$
Fix a $u$ and a $v$. The proof of (2.16) is nearly the same as the proof of the classical Cauchy–Schwarz inequality. It is easy to see that for all rational numbers $r$
$$0 \overset{a.s.}{\le} [M + rN]_u^v = [M,M]_u^v + 2r\,[M,N]_u^v + r^2\,[N,N]_u^v \triangleq Ar^2 + Br + C.$$
Hence there is a measure-zero set $Z$ such that on the complement of $Z$ the inequality above is true for all rational, and therefore all real, $r$. Hence, as in the proof of the Cauchy–Schwarz inequality, $B^2 - 4AC \overset{a.s.}{\le} 0$, so (2.16) holds with $a = u$ and $b = v$. Unifying the measure-zero sets one can easily prove (2.16) for
all rational $u$ and $v$. By the assumption above the quadratic variation is right-continuous, so the relation (2.16) holds for every real $a = u$ and $b = v$.

2. Let $(t_k)$ be a partition of $[a,b]$ and assume that $X$ and $Y$ are constant on every subinterval $(t_{k-1}, t_k]$. We are integrating by trajectory, so
$$\left| \int_a^b XY\,d[M,N] \right| \le \sum_k |X(t_k)Y(t_k)| \left| [M,N]_{t_k}^{t_{k+1}} \right| \le \sum_k |X(t_k)Y(t_k)| \sqrt{[M]_{t_k}^{t_{k+1}}}\,\sqrt{[N]_{t_k}^{t_{k+1}}}.$$
Using the Cauchy–Schwarz inequality we can continue:
$$\left| \int_a^b XY\,d[M,N] \right| \le \sqrt{\sum_k X^2(t_k)\,[M]_{t_k}^{t_{k+1}}}\,\sqrt{\sum_k Y^2(t_k)\,[N]_{t_k}^{t_{k+1}}} = \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}.$$
3. Using standard measure theory one can easily prove³¹ that if $\mu$ is a finite, regular measure on the real line and $g$ is a bounded Borel measurable function, then there is a sequence of step functions
$$s_n \triangleq \sum_i c_i\,\chi\left( \left( t_i^{(n)}, t_{i+1}^{(n)} \right] \right)$$
such that $s_n \to g$ almost surely in $\mu$. As $\mu$ is finite and $g$ is bounded, $s_n \to g$ in $L^2(\mu)$.

4. We prove that the Kunita–Watanabe inequality holds for every outcome for which (2.16) holds for every real $a$ and $b$. Fix the process $Y$ and an outcome $\omega$, and consider the set of processes $X$ for which the inequality holds for this $\omega$. Let $s_n \to X(\omega)$ be a sequence of step functions. By (2.16) the measure generated by $[M,N](\omega)$ is absolutely continuous with respect to the measure generated by $[M](\omega)$. Hence $s_n \to X(\omega)$ almost surely in $[M,N](\omega)$. Therefore, by the Dominated Convergence Theorem, using that $X$ and $Y$ are bounded, $a$ and $b$ are finite, and that the convergence holds almost everywhere in $[M,N](\omega)$ and in $L^2([M](\omega))$,
$$\left| \int_a^b XY\,d[M,N] \right| \le \sqrt{\int_a^b X^2\,d[M]}\,\sqrt{\int_a^b Y^2\,d[N]}$$

31. Use Lusin's theorem [80], page 56, and the uniform continuity of continuous functions on compact sets.
for the outcome $\omega$. If $X$ is product measurable then by Fubini's theorem every trajectory of $X$ is Borel measurable. Hence if $X$ is product measurable then inequality (2.15) holds for almost all outcomes $\omega$.

5. Now we fix $X$ and repeat the argument for $Y$.

Corollary 2.35 If $q, p \ge 1$ and $1/p + 1/q = 1$, then
$$\mathbf{E}\left( \int_0^\infty |XY|\,d\operatorname{Var}([M,N]) \right) \le \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_p \left\| \sqrt{\int_0^\infty Y^2\,d[N]} \right\|_q.$$
Proof. By Hölder's inequality and by (2.15),
$$\mathbf{E}\left( \int_0^\infty |XY|\,d\operatorname{Var}([M,N]) \right) \le \mathbf{E}\left( \sqrt{\int_0^\infty X^2\,d[M]}\,\sqrt{\int_0^\infty Y^2\,d[N]} \right) \le \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_p \left\| \sqrt{\int_0^\infty Y^2\,d[N]} \right\|_q.$$
Corollary 2.36 If $M$ and $N$ are semimartingales then
$$|[M,N]| \le \sqrt{[M]}\,\sqrt{[N]}, \tag{2.17}$$
$$[M+N]^{1/2} \le [M]^{1/2} + [N]^{1/2}$$
and
$$[M+N] \le 2\left( [M] + [N] \right).$$

Proof. The first inequality is just the Kunita–Watanabe inequality when $X = Y = 1$. Next,
$$[M+N] = [M] + 2[M,N] + [N] \le [M] + 2\sqrt{[M]}\,\sqrt{[N]} + [N] = \left( [M]^{1/2} + [N]^{1/2} \right)^2,$$
from which the second inequality is obvious. In a similar way,
$$[M+N] \le [M] + 2\sqrt{[M]}\,\sqrt{[N]} + [N] \le [M] + \left( [M] + [N] \right) + [N] = 2\left( [M] + [N] \right).$$
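In discrete time the Kunita–Watanabe inequality reduces to the Cauchy–Schwarz inequality for the increments, which can be checked directly (a sketch; all sequences are arbitrary stand-ins for the processes in (2.15)):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
dM = rng.normal(size=n)                      # increments of two arbitrary paths
dN = rng.normal(size=n)
X = rng.uniform(-1.0, 1.0, size=n)           # bounded "integrands"
Y = rng.uniform(-1.0, 1.0, size=n)

# discrete analogue of (2.15): dVar([M,N]) = |dM dN|, d[M] = dM^2, d[N] = dN^2
lhs = np.sum(np.abs(X * Y) * np.abs(dM * dN))
rhs = np.sqrt(np.sum(X**2 * dM**2)) * np.sqrt(np.sum(Y**2 * dN**2))
assert lhs <= rhs + 1e-12                    # Cauchy-Schwarz
```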
2.2 The Quadratic Variation of Continuous Local Martingales
The following proposition is the starting point of our construction of the stochastic integral process.

Proposition 2.37 (Simple Doob–Meyer decomposition) If $M$ is a uniformly bounded, continuous martingale, then:
1. The quadratic variation $[M](t) \triangleq [M]_0^t$ exists.
2. $[M]$ has a version which is increasing and continuous.
3. For this version $M^2 - [M]$ is a martingale.
4. $[M]$ is indistinguishable from any increasing, continuous process $P$ for which $P(0) = 0$ and $M^2 - P$ is a martingale.

If $(t_k^{(n)})$ is an infinitesimal sequence of partitions of $[0,t]$ then
$$\sup_{s\le t} |Q_n(s) - [M](s)| \xrightarrow{\ \mathbf{P}\ } 0 \tag{2.18}$$
for any $t$, where
$$Q_n(s) \triangleq \sum_k \left( M(t_k^{(n)} \wedge s) - M(t_{k-1}^{(n)} \wedge s) \right)^2.$$
Proof. By the Integration By Parts Formula for any t M 2 (t) − M 2 (0) = 2
t
M dM + [M ] (t) = 2 · (M • M ) (t) + [M ] (t) . 0
As M is continuous and uniformly bounded the integral process M • M has a version which is a continuous martingale32 , therefore as M 2 is continuous [M ] M 2 − M 2 (0) − 2 · M • M is continuous, and by Proposition 2.24 M 2 − [M ] = M 2 (0) + 2 · (M • M ) 32 See:
Proposition 2.24, page 128.
is a martingale. $[M](t)$ is a version of the quadratic variation $[M]_0^t$ for any t. For any rational numbers $p \leq q$ we have $[M]_0^p \leq [M]_0^q$ almost surely. Taking the union of the measure-zero sets and using the continuity of $[M]$ we can construct a version which is increasing. If P is another continuous, increasing process for which $P(0) = 0$ and $M^2 - P$ is a martingale, then $N \triangleq P - [M]$ is also a continuous martingale and $N(0) = 0$. As N is the difference of two increasing processes, the trajectories of N have finite variation. By Fisk's theorem (see: Theorem 2.11, page 117) $N = 0$, so P is indistinguishable from $[M]$. The convergence (2.18) is a simple consequence of (2.11).

First we extend the proposition to continuous local martingales. In order to do it we need the following rule:

Proposition 2.38 Under the assumptions of the previous proposition, if τ is an arbitrary stopping time then $[M^\tau] = [M]^\tau$.
Proof. As $(M^\tau)^2 = (M^2)^\tau$,

$$(M^\tau)^2 - [M]^\tau = (M^2)^\tau - [M]^\tau = \left( M^2 - [M] \right)^\tau.$$

Stopped martingales are martingales, hence $(M^2 - [M])^\tau$ is a martingale. $[M]^\tau$ is increasing, so by the uniqueness of the quadratic variation $[M^\tau] = [M]^\tau$.

Proposition 2.39 If M is a continuous local martingale then there is one and only one continuous, increasing process $[M]$ such that:

1. $[M](0) = 0$ and
2. $M^2 - [M]$ is a continuous local martingale.
For any t, if $(t_k^{(n)})$ is an infinitesimal sequence of partitions of $[0,t]$ then

$$\sup_{s \leq t} \left| Q_n(s) - [M](s) \right| \xrightarrow{p} 0 \qquad (2.19)$$

where

$$Q_n(s) \triangleq \sum_k \left( M(t_k^{(n)} \wedge s) - M(t_{k-1}^{(n)} \wedge s) \right)^2.$$

Proof. Let M be a continuous local martingale and let $(\sigma_n)$ be a localizing sequence of M. As M is continuous, the hitting times

$$\upsilon_n \triangleq \inf\{ t : |M(t)| \geq n \}$$
are stopping times. Stopped martingales are martingales, so if instead of $\sigma_n$ we take the localizing sequence $\tau_n \triangleq \sigma_n \wedge \upsilon_n$ then the processes $M_n \triangleq M^{\tau_n}$ are bounded martingales.

1. As $M_n$ is a bounded, continuous martingale, $[M_n]$ is an increasing process and $M_n^2 - [M_n]$ is a continuous martingale. By the previous proposition

$$[M_{n+1}]^{\tau_n} = \left[ M_{n+1}^{\tau_n} \right] = [M_n],$$

hence $[M_n] = [M_{n+1}]$ on the interval $[0, \tau_n]$. As $\tau_n \nearrow \infty$, one can define the process $[M]$ as the 'union' of the processes $[M_n]$, that is

$$[M](t, \omega) \triangleq [M_n](t, \omega), \qquad t \leq \tau_n(\omega).$$

Evidently $[M]$ is continuous, increasing and $[M](0) = 0$. Of course

$$\left( M^2 - [M] \right)^{\tau_n} = (M^{\tau_n})^2 - [M]^{\tau_n} = M_n^2 - [M_n],$$

which is a martingale, hence $M^2 - [M]$ is a local martingale.

2. Assume that $A(0) = 0$ and $M^2 - A$ is a continuous local martingale for some continuous, increasing process A.
3. Finally, let us prove (2.19). Fix ε, δ, t > 0 and (tk )k . Let Qn be (m) the approximating sum for [M ] and let Qn be the approximating sum for [Mm ]. A sup |Qn (s) − [M ] (s)| > ε , s≤t
(m)
A
(m) sup Qn (s) − [Mm ] (s) > ε . s≤t
As τ m ∞, for m large enough P (τ m ≤ t) ≤ δ/2 and P A(m) ≤ δ/2. Obviously P (A) = P (A ∩ (τ m ≤ t)) + P (A ∩ (τ m > t)) ≤ ≤ P ((τ m ≤ t)) + P (A ∩ (τ m > t)) ≤
$$\leq \frac{\delta}{2} + P(A \cap (\tau_m > t)) = \frac{\delta}{2} + P\left( A^{(m)} \cap (\tau_m > t) \right) \leq \frac{\delta}{2} + P\left( A^{(m)} \right) \leq \frac{\delta}{2} + \frac{\delta}{2},$$
hence (2.19) holds.

Proposition 2.40 If M and N are continuous local martingales then $[M,N]$ is the only continuous process with finite variation on finite intervals for which:

1. $[M,N](0) = 0$ and
2. $MN - [M,N]$ is a continuous local martingale.

For any infinitesimal sequence of partitions $(t_k^{(n)})$ of $[0,t]$

$$\sup_{s \leq t} \left| Q_n(s) - [M,N](s) \right| \xrightarrow{p} 0 \qquad (2.20)$$

where

$$Q_n(s) \triangleq \sum_k \left( M(t_k \wedge s) - M(t_{k-1} \wedge s) \right) \left( N(t_k \wedge s) - N(t_{k-1} \wedge s) \right).$$
Proof. From Fisk's theorem the uniqueness of $[M,N]$ is again trivial: if $MN - A$ and $MN - B$ are continuous local martingales for some A and B, then $A - B$ is a continuous local martingale with finite variation, so $A - B$ is constant. As $A(0) = B(0) = 0$, obviously $A = B$. Since

$$MN = \frac{1}{4}\left( (M+N)^2 - (M-N)^2 \right),$$

it is easy to see that Proposition 2.39 can be applied to

$$[M,N] \triangleq \frac{1}{4}\left( [M+N] - [M-N] \right) \qquad (2.21)$$

in order to show that $MN - [M,N]$ is a continuous local martingale and that (2.20) holds.

Definition 2.41 If for some process X there is a process P such that $X - P$ is a local martingale, then we say that P is a compensator of X. If P is continuous then we say that P is a continuous compensator of X. If P is predictable then we say that P is a predictable compensator of X, etc.

So far we have proved that if M is a continuous local martingale then $[M]$ is the only increasing, continuous compensator of $M^2$. It is important to emphasize that this property of $[M]$ holds only for continuous local martingales.
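The convergence (2.18) is easy to observe numerically. The following sketch (an illustration, not part of the text; grid sizes and the random seed are arbitrary choices) simulates a Wiener path on ever finer partitions of $[0,1]$ and computes the approximating sums $Q_n$; since $[w](t) = t$ for a Wiener process, the sums should concentrate around 1:

```python
import math
import random

def brownian_path(t, n, rng):
    """Simulate a Wiener process on n equal steps over [0, t]."""
    dt = t / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def approximating_sum(path):
    """Q_n: the sum of squared increments along the partition."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

rng = random.Random(0)
# For a Wiener process [w](1) = 1, so Q_n should approach 1 as the mesh shrinks.
qs = {n: approximating_sum(brownian_path(1.0, n, rng)) for n in (10, 100, 10_000)}
```

For the Wiener process one can compute that the variance of $Q_n$ on $[0,1]$ is $2/n$, so for $n = 10\,000$ the sum is typically within a couple of percent of 1, in line with (2.18).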
Example 2.42 Quadratic variation of the compensated Poisson processes.

Let π be a Poisson process with parameter λ. The increments of π are independent and the expected value of $\pi(t)$ is λt, hence the compensated process $\nu(t) \triangleq \pi(t) - \lambda t$ is a martingale. We show that $\nu^2(t) - \lambda t$ is also a martingale, that is: λt is a continuous, increasing compensator for $\nu^2$.

$$E\left( \nu^2(t) - \lambda t \mid \mathcal{F}_s \right) = \nu^2(s) + 2\nu(s)E(\nu(t) - \nu(s) \mid \mathcal{F}_s) + E\left( (\nu(t) - \nu(s))^2 \mid \mathcal{F}_s \right) - \lambda t.$$

The increments of π are independent, hence the conditional expectation is a real expectation. Given that the increments are stationary,

$$2\nu(s)E(\nu(t) - \nu(s) \mid \mathcal{F}_s) = 2\nu(s)E(\nu(t-s)) = 0,$$

$$E\left( (\nu(t) - \nu(s))^2 \mid \mathcal{F}_s \right) = E\left( \nu^2(t-s) \right) = \lambda(t-s),$$

hence

$$E\left( \nu^2(t) - \lambda t \mid \mathcal{F}_s \right) = \nu^2(s) + \lambda(t-s) - \lambda t = \nu^2(s) - \lambda s.$$
If we partition the interval $[0,t]$, then if $Q_n^{(\nu)}$ is the sequence of approximating sums for $[\nu]$ and $Q_n^{(\pi)}$ is the one for $[\pi]$, then

$$Q_n^{(\nu)} = Q_n^{(\pi)} - 2\lambda \sum_k \left( \pi(t_k^{(n)}) - \pi(t_{k-1}^{(n)}) \right)\left( t_k^{(n)} - t_{k-1}^{(n)} \right) + \lambda^2 \sum_k \left( t_k^{(n)} - t_{k-1}^{(n)} \right)^2.$$
It is easy to see that if $\max_k \left( t_k^{(n)} - t_{k-1}^{(n)} \right) \to 0$ then the limit of $Q_n^{(\pi)}$ is the process π. The limits of the other expressions are zero. Hence $[\nu] = \pi$.

Proposition 2.43 If M, N and U are continuous local martingales and ξ and η are $\mathcal{F}_0$-measurable random variables then

$$[\xi M + \eta N, U] = \xi[M,U] + \eta[N,U].$$

Proof. $MU - [M,U]$ and $NU - [N,U]$ are local martingales, hence $(M+N)U - ([M,U] + [N,U])$ is also a local martingale, and by the uniqueness property of
the quadratic co-variation

$$[M+N, U] = [M,U] + [N,U].$$

In a similar way: $MU - [M,U]$ is a local martingale and ξ is $\mathcal{F}_0$-measurable, hence $\xi(MU - [M,U])$ is also a local martingale, so again by the uniqueness property of the quadratic co-variation $[\xi M, U] = \xi[M,U]$.

Proposition 2.44 If M and N are continuous local martingales then

$$[M,N] = [M - M(0), N - N(0)] = [M - M(0), N].$$

Proof. Obviously $[M - M(0), N] = [M,N] - [M(0), N]$. As $M(0)$ is $\mathcal{F}_0$-measurable, $M(0)N$ is a continuous local martingale. Hence $[M(0), N] = 0$.

Proposition 2.45 (Stopping rule for quadratic variation) Let τ be an arbitrary stopping time.

1. If M is a continuous local martingale then $[M^\tau] = [M]^\tau$.
2. If M and N are continuous local martingales then $[M^\tau, N^\tau] = [M,N]^\tau = [M^\tau, N]$.

Proof. $[M^\tau]$ is the only continuous, increasing process A for which $A(0) = 0$ and $(M^\tau)^2 - A$ is a continuous local martingale. $M^2 - [M]$ is a continuous local martingale, hence

$$\left( M^2 - [M] \right)^\tau = (M^2)^\tau - [M]^\tau = (M^\tau)^2 - [M]^\tau$$

is a continuous local martingale, hence by the uniqueness $[M]^\tau = [M^\tau]$. From (2.21) and from the first part of the proof

$$[M^\tau, N^\tau] = \frac{1}{4}\left( [(M+N)^\tau] - [(M-N)^\tau] \right) = \frac{1}{4}\left( [M+N]^\tau - [M-N]^\tau \right) = [M,N]^\tau.$$

If U and V are martingales and τ is a stopping time, then for any bounded stopping time σ, by the Optional Sampling Theorem,

$$E\left( (U^\tau \cdot (V - V^\tau))(\sigma) \right) = E\left( U(\tau \wedge \sigma) \cdot E(V(\sigma) - V(\tau \wedge \sigma) \mid \mathcal{F}_{\tau \wedge \sigma}) \right) = E(U(\tau \wedge \sigma) \cdot 0) = 0,$$
hence $U^\tau(V - V^\tau)$ is a martingale. From this it is easy to prove with localization that $M^\tau(N - N^\tau)$ is a local martingale, hence

$$M^\tau N - [M,N]^\tau = M^\tau N - M^\tau N^\tau + M^\tau N^\tau - [M,N]^\tau = M^\tau(N - N^\tau) + \left( (MN)^\tau - [M,N]^\tau \right)$$

is also a local martingale. From the uniqueness of the quadratic co-variation

$$[M^\tau, N] = [M,N]^\tau = [M^\tau, N^\tau].$$
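The identity $[\nu] = \pi$ obtained in Example 2.42 above can also be checked by simulation. The sketch below (an illustration, not part of the text) approximates the Poisson increments on a fine grid by Bernoulli draws, a standard discretization assumption, and compares the approximating sum for $[\nu]$ with the jump count $\pi(t)$:

```python
import random

def compensated_poisson_qv(lam, t, n, seed=1):
    """Approximate [nu](t) for nu(s) = pi(s) - lam*s on an n-cell grid.

    Each cell of length dt carries a jump with probability lam*dt,
    a Bernoulli approximation of the Poisson increment."""
    rng = random.Random(seed)
    dt = t / n
    jumps = 0        # pi(t), the number of jumps
    q = 0.0          # approximating sum for [nu](t)
    for _ in range(n):
        k = 1 if rng.random() < lam * dt else 0
        jumps += k
        inc = k - lam * dt          # increment of the compensated process
        q += inc * inc
    return q, jumps

q, jumps = compensated_poisson_qv(lam=3.0, t=5.0, n=200_000)
```

The cross terms and the $\lambda^2 (\Delta t)^2$ terms of Example 2.42 vanish as the mesh shrinks, so `q` lands very close to the integer `jumps`, exactly as $[\nu] = \pi$ predicts.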
Example 2.46 If M and N are independent and they are continuous local martingales with respect to their own filtrations then $[M,N] = 0$.

Let $\mathcal{F}^M$ and $\mathcal{F}^N$ be the filtrations generated by M and N. Let $\mathcal{F}_s$ be the σ-algebra generated by the sets

$$A \cap B, \qquad A \in \mathcal{F}_s^M, \quad B \in \mathcal{F}_s^N.$$

We shall prove that if M and N are independent martingales then MN is a martingale under the filtration $\mathcal{F}$. As M and N are martingales, $M(t)$ and $N(t)$ are integrable. $M(t)$ and $N(t)$ are independent for any t, hence the product $M(t)N(t)$ is also integrable. If $F \triangleq A \cap B$ with $A \in \mathcal{F}_s^M$ and $B \in \mathcal{F}_s^N$ then

$$E(MN(t)\chi_F) = E(M(t)\chi_A N(t)\chi_B) = E(M(t)\chi_A)E(N(t)\chi_B) = E(M(s)\chi_A)E(N(s)\chi_B) = E(MN(s)\chi_F),$$

which by the uniqueness of the extension of finite measures can be extended to every $F \in \mathcal{F}_s$. Hence MN is an $\mathcal{F}$-martingale, so $[M,N] = 0$. The quadratic co-variation is independent of the filtration (here we directly used the definition of the quadratic variation as the limit of the approximating sums), so $[M,N] = 0$ under the original filtration. If M and N are local martingales with respect to their own filtrations, then the localized processes are independent martingales. Hence if $(\tau_n)$ is a common localizing sequence then $[M,N]^{\tau_n} = [M^{\tau_n}, N^{\tau_n}] = 0$. Hence $[M,N] = 0$.

Proposition 2.47 Let M be a continuous local martingale. M is indistinguishable from a constant if and only if the quadratic variation $[M]$ is zero.
Proof. If M is a constant then $M^2$ is also a constant, hence $M^2$ is a local martingale (see: Definition 1.131, page 94), so $[M] = 0$. On the other hand, if $[M] = 0$ then $M^2 - [M] = M^2$ is a local martingale. The proposition follows from the next proposition.

Proposition 2.48 M and $M^2$ are continuous local martingales if and only if M is a constant.

Proof. If M is constant then M and $M^2$ are local martingales. On the other hand,

$$(M - M(0))^2 = M^2 - 2 \cdot M \cdot M(0) + M^2(0).$$

Since M and $M^2$ are local martingales and $M(0)$ is $\mathcal{F}_0$-measurable, $(M - M(0))^2$ is also a local martingale. Let $(\tau_n)$ be a localizing sequence for $(M - M(0))^2$. By the martingale property

$$E\left( (M^{\tau_n}(t) - M^{\tau_n}(0))^2 \right) = E\left( (M^{\tau_n}(0) - M^{\tau_n}(0))^2 \right) = 0,$$

hence for any t almost surely $M(t \wedge \tau_n) = M(0)$. Therefore for any t almost surely

$$M(t) = \lim_{n \to \infty} M(t \wedge \tau_n) = M(0).$$

The local martingales are right-regular, therefore M is indistinguishable from $M(0)$.

Corollary 2.49 Let $a \leq b < \infty$. A continuous local martingale M is constant on $[a,b]$ if and only if $[M]$ is constant on $[a,b]$.

Proof. If $\tau_n \nearrow \infty$ then a process X is constant on an interval $[a,b]$ if and only if $X^{\tau_n}$ is constant on $[a,b]$ for all n. Using this fact and that $[M^{\tau_n}] = [M]^{\tau_n}$, one can assume that M is a martingale.

1. Define the stochastic process $N(t) \triangleq M(t+a) - M(a)$. N is trivially a martingale for the filtration $\mathcal{G}_t \triangleq \mathcal{F}_{t+a}$, $t \geq 0$.

$$N^2(t) - ([M](t+a) - [M](a)) = M^2(t+a) - ([M](t+a) - [M](a)) - 2M(t+a)M(a) + M^2(a).$$
Obviously $M^2(t+a) - ([M](t+a) - [M](a))$ is a $\mathcal{G}$-martingale. $M(t+a)$ is also a $\mathcal{G}$-martingale, hence $-2M(t+a)M(a) + M^2(a)$ is obviously a $\mathcal{G}$-local martingale, hence by the uniqueness of the quadratic variation

$$[N](t) = [M](t+a) - [M](a).$$

2. M is constant on the interval $[a,b]$ if and only if N is zero on the interval $[0, b-a]$. As we proved, N is constant on $[0, b-a]$ if and only if $[N] = 0$ on $[0, b-a]$. Hence M is constant on $[a,b]$ if and only if $[M]$ is constant on $[a,b]$.

We summarize the statements above in the following proposition:

Proposition 2.50 $[M,N]$ is a symmetric bilinear form and $[M] \geq 0$. $[M] = 0$ if and only if M is constant. This is also true on any half-line $[a, \infty)$ if instead of $[M,N]$ we use the increments $[M,N] - [M,N](a)$.
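Two of the facts above are easy to probe with simulated paths. In the sketch below (an illustration, not part of the text; step counts and the seed are arbitrary), the polarization identity (2.21) holds exactly, term by term, for the discrete approximating sums, and for two independent Wiener paths the co-variation sum is near zero, as in Example 2.46:

```python
import math
import random

def brownian(n, dt, rng):
    """A Wiener path on n equal steps."""
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def qv(p):
    """Approximating sum for the quadratic variation."""
    return sum((b - a) ** 2 for a, b in zip(p, p[1:]))

def cov(p, q):
    """Approximating sum for [M, N] along a common partition."""
    return sum((b - a) * (d - c)
               for a, b, c, d in zip(p, p[1:], q, q[1:]))

rng = random.Random(2)
n, dt = 10_000, 1.0 / 10_000
m = brownian(n, dt, rng)
nn = brownian(n, dt, rng)          # independent of m

# Polarization: per increment, ((dm+dn)^2 - (dm-dn)^2)/4 == dm*dn exactly.
plus = [a + b for a, b in zip(m, nn)]
minus = [a - b for a, b in zip(m, nn)]
polar = (qv(plus) - qv(minus)) / 4.0
```

Here `polar` agrees with `cov(m, nn)` up to floating-point rounding, and both are close to zero because the two paths are independent.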
2.3 Integration when Integrators are Continuous Semimartingales
In this section we introduce a simple construction of the stochastic integral when the integrator X is a continuous semimartingale and the integrand Y is progressively measurable (see: [78]). Every continuous semimartingale has a unique decomposition of type $X = X(0) + L + V$, where V is continuous and has finite variation and L is a continuous local martingale. The integration with respect to V is a simple measure-theoretic exercise: $V(\omega)$ generates a σ-finite measure on $\mathbb{R}_+$ for every ω. Every progressively measurable process is product measurable, hence all trajectories $Y(\omega)$ are measurable. For every ω and for every t one can define the pathwise integral

$$(Y \bullet V)(t, \omega) \triangleq \int_0^t Y(s, \omega) \, V(ds, \omega),$$

where the integrals are simple Lebesgue integrals (see: Proposition 1.20, page 11). The main problem is how to define the stochastic integral with respect to the local martingale part L!
2.3.1 The space of square-integrable continuous local martingales
Recall the definition and some elementary properties of square-integrable martingales:

Definition 2.51 As before, $\mathcal{H}^2$ is the space of $L^2(\Omega)$-bounded martingales on $\mathbb{R}_+$ (that is, if M is a martingale then $M \in \mathcal{H}^2$, i.e. M is square-integrable, if and only if $\sup_t \|M(t)\|_2 < \infty$). Let $\mathcal{G}^2 \triangleq \mathcal{H}_c^2$ denote the space of $L^2(\Omega)$-bounded, continuous martingales. Let

$$\mathcal{H}_0^2 \triangleq \left\{ M \in \mathcal{H}^2 : M(0) = 0 \right\}, \qquad \mathcal{G}_0^2 \triangleq \left\{ M \in \mathcal{G}^2 : M(0) = 0 \right\}.$$
The elements of $\mathcal{H}^2$, $\mathcal{G}^2$, $\mathcal{H}_0^2$ and $\mathcal{G}_0^2$ are equivalence classes: $M_1$ and $M_2$ are in the same equivalence class if they are indistinguishable.

Proposition 2.52 $M \in \mathcal{H}^2$ if and only if $\sup_t M^2(t) \in L^1(\Omega)$. $\left( \mathcal{H}^2, \|\cdot\|_{\mathcal{H}^2} \right)$ is a Hilbert space, where

$$\|M\|_{\mathcal{H}^2} \triangleq \|M(\infty)\|_2 = \lim_{t \to \infty} \|M(t)\|_2.$$

The set of continuous square-integrable martingales $\mathcal{G}^2$ is a closed subspace of $\mathcal{H}^2$.

Proof. The first statement follows from Doob's inequality (see: Corollary 1.54, page 34). The relation $\|M(\infty)\|_2 = \lim_{t \to \infty} \|M(t)\|_2$ is obviously true as $M(t)$ converges to $M(\infty)$ in $L^2(\Omega)$ (see: Corollary 1.59, page 35), and the norm is a continuous function. In order to show that $\mathcal{G}^2$ is closed, let $(M_n)$ be a sequence of continuous square-integrable martingales and assume that $M_n \to M$ in $\mathcal{H}^2$. By Doob's inequality (see: line (1.18), page 34)

$$E\left( \sup_t |M_n(t) - M(t)|^2 \right) \leq 4\|M_n(\infty) - M(\infty)\|_2^2 \triangleq 4\|M_n - M\|_{\mathcal{H}^2}^2 \to 0.$$
From the $L^2$-convergence one has a subsequence for which almost surely

$$\sup_t |M_{n_k}(t) - M(t)| \to 0,$$

hence $M_{n_k}(t, \omega) \to M(t, \omega)$ uniformly in t for almost all ω. Hence $M(t, \omega)$ is continuous in t for almost all ω. So the trajectories of M are almost surely continuous, therefore $\mathcal{G}^2$ is closed.

Our direct goal is to prove that if M is a square-integrable martingale and $M(0) = 0$ then

$$\|M\|_{\mathcal{H}^2}^2 \triangleq \|M(\infty)\|_2^2 = E\left( M^2(\infty) \right) = E([M](\infty)).$$

To do this one should prove that $M^2 - [M]$ is not only a local martingale but a uniformly integrable martingale.

Proposition 2.53 (Characterization of square-integrable martingales) Let M be a continuous local martingale. The following statements are equivalent:

1. M is square-integrable,
2. $M(0) \in L^2(\Omega)$ and $E([M](\infty)) < \infty$.

In both cases $M^2 - [M]$ is a uniformly integrable martingale.

Proof. The proof of the equivalence of the statements is the following:

1. Let $(\tau_n)$ be a localizing sequence of the local martingale $M^2 - [M]$ and let $\sigma_n \triangleq \tau_n \wedge n$. By the martingale property of $(M^{\tau_n})^2 - [M^{\tau_n}]$
$$E\left( M^2(\sigma_n) - [M](\sigma_n) \right) = E\left( M^2(0) \right). \qquad (2.22)$$

As M is square-integrable,

$$M^2(\sigma_n) \leq \sup_t M^2(t) \in L^1(\Omega),$$

so by the Dominated Convergence Theorem

$$\lim_{n \to \infty} E\left( M^2(\sigma_n) \right) = E\left( \lim_{n \to \infty} M^2(\sigma_n) \right) = E\left( M^2(\infty) \right) < \infty.$$

$[M]$ is increasing, therefore by the Monotone Convergence Theorem and by (2.22)

$$E([M](\infty)) = \lim_{n \to \infty} E([M](\sigma_n)) = \lim_{n \to \infty} \left( E\left( M^2(\sigma_n) \right) - E\left( M^2(0) \right) \right) < \infty,$$
that is $[M](\infty) \in L^1(\Omega)$, and 1 implies 2. For every stopping time τ

$$\left| M^2 - [M] \right|(\tau) \leq \sup_t M^2(t) + \sup_t [M](t) = \sup_t M^2(t) + [M](\infty) \in L^1(\Omega),$$

hence the set $\left\{ M^2(\tau) - [M](\tau) \right\}_\tau$ is dominated by an integrable variable and therefore it is uniformly integrable. By this $M^2 - [M]$ is a class D local martingale, hence it is a uniformly integrable martingale (see: Proposition 1.144, page 102).

2. Let τ be an arbitrary stopping time. Let $(\sigma_n)$ be a localizing sequence of M. One can assume that $M^{\sigma_n} - M(0)$ is bounded (in the general case, when M is not necessarily continuous, one can assume that $M_-^{\sigma_n} - M(0)$ is bounded). Let $N \triangleq M^{\tau \wedge \sigma_n} - M(0)$. By the definition of the quadratic variation

$$N^2(t) = 2\int_0^t N_- \, dN + [N](t).$$

As $N_-$ is bounded, the Itô–Stieltjes integral defines a martingale (see: Proposition 2.24, page 128). So

$$E\left( N^2(t) \right) = E([N](t)) = E\left( \left[ M^{\tau \wedge \sigma_n} \right](t) \right) \leq E([M](\infty)).$$

Applying Fatou's lemma,

$$E\left( (M - M(0))^2(\tau) \right) \leq E([M](\infty)). \qquad (2.23)$$

By the second assumption of 2 the expected value on the right-hand side is finite, so the set of variables S of type $(M - M(0))(\tau)$ is bounded in $L^2(\Omega)$. Hence S is a uniformly integrable set, and therefore $M - M(0)$ is a class D local martingale and hence it is a martingale (see: Proposition 1.144, page 102). By (2.23) $M - M(0)$ is trivially bounded in $L^2(\Omega)$, that is $M - M(0) \in \mathcal{G}^2$. As $M(0) \in L^2(\Omega)$ by the first assumption of 2, obviously $M \in \mathcal{G}^2$.

Corollary 2.54 If $M \in \mathcal{G}^2$ and $\sigma \leq \tau$ are stopping times then
$$E\left( M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma \right) = E\left( [M](\tau) - [M](\sigma) \mid \mathcal{F}_\sigma \right) = E\left( (M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma \right),$$

specifically

$$E\left( M^2(\tau) \right) - E\left( M^2(0) \right) = E([M](\tau)). \qquad (2.24)$$
Proof. By the previous proposition $M^2 - [M]$ is a uniformly integrable martingale, hence if $\sigma \leq \tau$ then by the Optional Sampling Theorem

$$E\left( M^2(\tau) - [M](\tau) \mid \mathcal{F}_\sigma \right) = M^2(\sigma) - [M](\sigma),$$

from which the first equation follows. M is also uniformly integrable, hence again by the Optional Sampling Theorem $M(\sigma) = E(M(\tau) \mid \mathcal{F}_\sigma)$. So

$$E\left( (M(\tau) - M(\sigma))^2 \mid \mathcal{F}_\sigma \right) = E\left( M^2(\tau) + M^2(\sigma) - 2M(\sigma)M(\tau) \mid \mathcal{F}_\sigma \right) = E\left( M^2(\tau) \mid \mathcal{F}_\sigma \right) + M^2(\sigma) - 2M^2(\sigma) = E\left( M^2(\tau) - M^2(\sigma) \mid \mathcal{F}_\sigma \right).$$

Let M be a semimartingale. Let us define

$$\alpha_M(C) \triangleq E\int_0^\infty \chi_C \, d[M],$$

where the integral with respect to $[M]$ is the pathwise Lebesgue–Stieltjes integral generated by the increasing, right-regular process $[M]$ (tacitly we again assume that $[M]$ has a right-regular version). It is not entirely trivial that $\alpha_M$ is well-defined, that is, that the expression under the expected value is measurable. By the Monotone Convergence Theorem

$$E\int_0^\infty \chi_C \, d[M] = E\left( \lim_{n \to \infty} \int_0^n \chi_C \, d[M] \right).$$

As $\int_0^n \chi_C \, d[M]$ is measurable for every n (see: Proposition 1.20, page 11), the parametric integral under the expected value is measurable. Obviously $\alpha_M$ is a measure on $\mathcal{B}(\mathbb{R}_+) \times \mathcal{A}$.

Example 2.55 If $M \in \mathcal{G}^2$ and τ is a stopping time then $\alpha_M([0,\tau]) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right)$. If $M \in \mathcal{G}_0^2$ then
$$\|M\|_{\mathcal{H}^2}^2 \triangleq E\left( M^2(\infty) \right) = E([M](\infty)) = \alpha_M(\mathbb{R}_+ \times \Omega). \qquad (2.25)$$
If τ is an arbitrary random time then

$$\alpha_M([0,\tau]) \triangleq E\int_0^\infty \chi([0,\tau]) \, d[M] = E([M](\tau) - [M](0)) = E([M](\tau)).$$

By (2.24), for every stopping time

$$E([M](\tau)) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right),$$

hence

$$\alpha_M([0,\tau]) = E([M](\tau)) - E([M](0)) = E([M](\tau)) = E\left( M^2(\tau) \right) - E\left( M^2(0) \right).$$

If $M \in \mathcal{G}_0^2$ then $M(0) = 0$, hence by (2.24)
$$E\left( M^2(\infty) \right) = E\left( M^2(\infty) \right) - E\left( M^2(0) \right) = E([M](\infty)).$$

The other relations are consequences of the definitions.

Definition 2.56 $\alpha_M$ is called the Doléans measure generated by the quadratic variation of M (see: Definition 5.4, page 295).

2.3.2 Integration with respect to continuous local martingales
Let us start with the simplest case:

Definition 2.57 Let M be a continuous local martingale. Let $L^2(M)$ denote the space of equivalence classes of square-integrable and progressively measurable functions on the measure space $(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, that is, let $L^2(M) \triangleq L^2(\mathbb{R}_+ \times \Omega, \mathcal{R}, \alpha_M)$, where $\mathcal{R}$, as before, denotes the σ-algebra of progressively measurable sets. Let $\|\cdot\|_M$ denote the norm of the Hilbert space $L^2(M)$:

$$\|X\|_M \triangleq \sqrt{\int_{\mathbb{R}_+ \times \Omega} X^2 \, d\alpha_M} = \sqrt{E\int_0^\infty X^2 \, d[M]}.$$

Example 2.58 The space $L^2(w)$.
The quadratic variation of a Wiener process on an interval $[0,s]$ is s. Hence $\|X\|_w^2 = E\int_0^t X^2(s)\,ds$ on the interval $[0,t]$. If $t < \infty$ then $w \in L^2(w)$, since by Fubini's theorem

$$\|w\|_w^2 = E\int_0^t w^2(s)\,ds = \int_0^t E\left( w^2(s) \right) ds = \int_0^t s\,ds < \infty.$$
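The value $\|w\|_w^2 = \int_0^t s\,ds = t^2/2$ in Example 2.58 can be approximated by averaging the pathwise integral $\int_0^t w^2(s)\,ds$ over simulated paths. A Monte Carlo sketch (an illustration, not part of the text; with $t = 1$ the target value is $1/2$, and the path count and grid are arbitrary choices):

```python
import math
import random

def integral_of_w_squared(steps, rng):
    """One sample of the pathwise integral of w^2 over [0, 1]
    via a left-endpoint Riemann sum."""
    dt = 1.0 / steps
    w, total = 0.0, 0.0
    for _ in range(steps):
        total += w * w * dt
        w += rng.gauss(0.0, math.sqrt(dt))
    return total

rng = random.Random(3)
paths = 2_000
estimate = sum(integral_of_w_squared(400, rng) for _ in range(paths)) / paths
```

The sample mean `estimate` approximates $\int_0^1 E(w^2(s))\,ds = 1/2$.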
The main result of this section is the following:

Proposition 2.59 (Stochastic integration and quadratic variation) If M is a continuous local martingale and $X \in L^2(M)$ then there is a unique process in $\mathcal{G}_0^2$, denoted by $X \bullet M$, such that for every $N \in \mathcal{G}^2$

$$[X \bullet M, N] = X \bullet [M,N]. \qquad (2.26)$$

If we denote $X \bullet M$ by $\int_0^t X\,dM$ then (2.26) can be written as

$$\left[ \int_0^t X\,dM, N \right] = \int_0^t X\,d[M,N].$$

Proof. We divide the proof into several steps: we prove that $X \bullet M$ exists and that the definition of $X \bullet M$ is correct, that is, the process $X \bullet M$ is unique.

1. The proof of uniqueness is easy. If $I_1$ and $I_2$ are two processes in $\mathcal{G}_0^2$ satisfying (2.26) then $[I_1, N] = [I_2, N]$ for all $N \in \mathcal{G}_0^2$. Hence $[I_1 - I_2, N] = 0$ for all $N \in \mathcal{G}_0^2$. As $I_1 - I_2 \in \mathcal{G}_0^2$,

$$[I_1 - I_2, I_1 - I_2] \triangleq [I_1 - I_2] = 0,$$

hence $I_1 - I_2$ is constant (see: Proposition 2.47, page 144). As $I_1 - I_2 \in \mathcal{G}_0^2$, $I_1 - I_2 = 0$, so $I_1 = I_2$.

2. Now we prove the existence of $X \bullet M$. Assume first that $N \in \mathcal{G}_0^2$. By the Kunita–Watanabe inequality (see: Corollary 2.35, page 137) and by the formula (2.25),
$$E\int_0^\infty |X| \, d[M,N] \leq \left\| \sqrt{\int_0^\infty X^2\,d[M]} \right\|_2 \left\| \sqrt{\int_0^\infty d[N]} \right\|_2 = \|X\|_M \sqrt{E([N](\infty))} = \|X\|_M \|N\|_{\mathcal{H}^2}. \qquad (2.27)$$
Observe that $\|X\|_M \|N\|_{\mathcal{H}^2} < \infty$, hence $\int_0^\infty X\,d[M,N]$ is almost surely finite. So the right-hand side of (2.26) is well-defined. By the bilinearity of the quadratic co-variation,

$$N \mapsto E\int_0^\infty X\,d[M,N]$$

is a continuous linear functional on the Hilbert space $\mathcal{G}_0^2$. As every continuous linear functional on a Hilbert space has a scalar product representation, there is an $X \bullet M \in \mathcal{G}_0^2$ such that for every $N \in \mathcal{G}_0^2$

$$E\int_0^\infty X\,d[M,N] = (X \bullet M, N) \triangleq E\left( (X \bullet M)(\infty) N(\infty) \right). \qquad (2.28)$$

3. The main part of the proof is to show that for $X \bullet M$ the identity (2.26) holds. Define the process

$$S \triangleq (X \bullet M)N - X \bullet [M,N].$$

To prove (2.26) we show that S is a continuous martingale, hence by the uniqueness of the quadratic co-variation $[X \bullet M, N] = X \bullet [M,N]$!

First observe that S is adapted: $(X \bullet M)N$ is a product of two martingales, that is, the product of two adapted processes. X is progressively measurable by the definition of $L^2(M)$, so the integral $\int_0^t X\,d[M,N]$ is also adapted (see: Proposition 1.20, page 11).

S is continuous: by the construction $(X \bullet M)N$ is a product of two continuous functions, so it is continuous, and since M and N are continuous the quadratic variation $[M,N]$ is also continuous. Therefore the integral $\int_0^t X\,d[M,N]$, as a function of t, is continuous.

Finally, to show that S is a martingale one should prove that (see: Proposition 1.91, page 57)

$$E(S(\tau)) = E(S(0)) = 0 \qquad (2.29)$$

for every bounded stopping time τ. By definition $X \bullet M$ is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem

$$(X \bullet M)(\tau) = E\left( (X \bullet M)(\infty) \mid \mathcal{F}_\tau \right).$$
Using that $N^\tau \in \mathcal{G}_0^2$ and (2.28),

$$E(S(\tau)) \triangleq E\left( (X \bullet M)(\tau)N(\tau) - \int_0^\tau X\,d[M,N] \right) =$$
$$= E\left( (X \bullet M)(\tau)N(\tau) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( E\left( (X \bullet M)(\infty) \mid \mathcal{F}_\tau \right) N(\tau) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( E\left( (X \bullet M)(\infty)N(\tau) \mid \mathcal{F}_\tau \right) \right) - E\int_0^\tau X\,d[M,N] =$$
$$= E\left( (X \bullet M)(\infty)N^\tau(\infty) \right) - E\int_0^\infty X\,d[M, N^\tau] = 0.$$

Therefore (2.29) holds.

4. Finally, if $N \in \mathcal{G}^2$ then $N - N(0) \in \mathcal{G}_0^2$, hence

$$[X \bullet M, N] = [X \bullet M, N - N(0)] = X \bullet [M, N - N(0)] = X \bullet [M,N].$$
Proposition 2.60 (Stopping rule for stochastic integrals) If M is an arbitrary continuous local martingale, $X \in L^2(M)$ and τ is an arbitrary stopping time then

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M = (X \bullet M)^\tau = X^\tau \bullet M^\tau. \qquad (2.30)$$

Proof. By (2.26) and by the stopping rule for the quadratic variation, if $N \in \mathcal{G}^2$

$$[(X \bullet M)^\tau, N] = [X \bullet M, N]^\tau = (X \bullet [M,N])^\tau = X \bullet [M,N]^\tau = X \bullet [M^\tau, N] = [X \bullet M^\tau, N].$$

By the bilinearity of the quadratic variation,

$$[(X \bullet M)^\tau - X \bullet M^\tau, N] = 0, \qquad N \in \mathcal{G}^2,$$

from which $[(X \bullet M)^\tau - X \bullet M^\tau] = 0$, that is

$$(X \bullet M)^\tau = X \bullet M^\tau.$$
If $X \in L^2(M)$ then trivially $\chi([0,\tau])X \in L^2(M)$. For every $N \in \mathcal{G}^2$

$$[X \bullet M^\tau, N] = X \bullet [M^\tau, N] = X \bullet [M,N]^\tau = (\chi([0,\tau])X) \bullet [M,N] = [(\chi([0,\tau])X) \bullet M, N],$$

hence again

$$X \bullet M^\tau = (\chi([0,\tau])X) \bullet M.$$

Using stopping rule (2.30) we can extend the stochastic integral to the space $L^2_{loc}(M)$.

Definition 2.61 Let M be a continuous local martingale. The space $L^2_{loc}(M)$ is the set of progressively measurable processes X for which there is a localizing sequence of stopping times $(\tau_n)$ such that

$$E\int_0^\infty X^2\,d[M^{\tau_n}] = E\int_0^{\tau_n} X^2\,d[M] = E\int_0^\infty \chi([0,\tau_n])X^2\,d[M] = \int_{(0,\infty)\times\Omega} \chi([0,\tau_n])X^2\,d\alpha_M < \infty.$$
Example 2.62 If M is a continuous local martingale and X is locally bounded then $X \in L^2_{loc}(M)$.

One can assume that $X(0) = 0$, as obviously every $\mathcal{F}_0$-measurable constant process is in $L^2_{loc}$. As M is continuous, $M \in \mathcal{H}^2_{loc}$. Let $(\tau_n)$ be a common localizing sequence of X and M. $M^{\tau_n} \in \mathcal{H}^2$ (see: Proposition 2.53, page 148), so $[M^{\tau_n}](\infty) \in L^1(\Omega)$. Therefore

$$E\int_0^\infty X^2\,d[M^{\tau_n}] \leq \sup_{t \leq \tau_n} X^2(t) \cdot E\left( [M^{\tau_n}](\infty) \right) < \infty.$$

Proposition 2.63 If M is a continuous local martingale then for every $X \in L^2_{loc}(M)$ there is a process, denoted by $X \bullet M$, such that

1. $(X \bullet M)(0) = 0$ and $X \bullet M$ is a continuous local martingale,
2. for every continuous local martingale N

$$[X \bullet M, N] = X \bullet [M,N]. \qquad (2.31)$$
$X \bullet M$ is unambiguously defined by (2.31), that is, $X \bullet M$ is the only continuous local martingale for which (2.31) holds for every continuous local martingale N.

Proof. M is a continuous local martingale, so it is locally bounded, hence $M \in \mathcal{H}^2_{loc}$. Assume that $X \in L^2_{loc}(M)$ and let $(\tau_n)$ be such a localizing sequence of X for which $E\int_0^\infty X^2\,d[M^{\tau_n}] < \infty$, that is, let $X \in L^2(M^{\tau_n})$. Consider the integrals $I_n \triangleq X \bullet M^{\tau_n}$.

$$I_{n+1}^{\tau_n} \triangleq \left( X \bullet M^{\tau_{n+1}} \right)^{\tau_n} = X \bullet \left( M^{\tau_{n+1}} \right)^{\tau_n} = X \bullet M^{\tau_n} = I_n,$$

hence $I_{n+1}$ and $I_n$ are equal on $[0, \tau_n]$. One can define the integral process $X \bullet M$ unambiguously if for all n the value of $X \bullet M$ is by definition $I_n$ on the interval $[0, \tau_n]$. By the stopping rule for stochastic integrals it is obvious from the construction that $X \bullet M$ is independent of the localizing sequence $(\tau_n)$. Obviously $(X \bullet M)(0) = 0$ and $X \bullet M$ is continuous. Trivially

$$(X \bullet M)^{\tau_n} \triangleq \left( X \bullet M^{\tau_n} \right)^{\tau_n} = X \bullet M^{\tau_n}$$

and $X \bullet M^{\tau_n} \in \mathcal{G}_0^2$, hence $(X \bullet M)^{\tau_n}$ is a uniformly integrable martingale, so $X \bullet M$ is a local martingale.

We should prove (2.31). Let $(\tau_n)$ be such a localizing sequence that $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$. As $X \in L^2(M^{\tau_n})$ and $N^{\tau_n} \in \mathcal{G}^2$, by the stopping rule for the quadratic variation (see: Proposition 2.45, page 143)

$$[X \bullet M, N]^{\tau_n} = \left[ (X \bullet M)^{\tau_n}, N^{\tau_n} \right] \triangleq [X \bullet M^{\tau_n}, N^{\tau_n}] = X \bullet [M^{\tau_n}, N^{\tau_n}] = X \bullet [M,N]^{\tau_n} = (X \bullet [M,N])^{\tau_n},$$
2 2 E (X • M ) (∞) X • M H2 = XM E
2
0
54 See:
Proposition 2.45, page 143.
∞
X d [M ] . 2
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
157
Proof. Using the definition of the norm in H2 and (2.25), by (2.31)
2 2 X • M H2 E (X • M ) (∞) = E ([X • M ] (∞)) E ([X • M, X • M ] (∞)) = ∞ =E Xd [X • M, M ] = E 0
∞
Xd (X • [M ]) .
0
In the right-hand side of the identity [X • M, M ] = X • [M, M ]. The integral is taken pathwise, hence ∞ 2 X • M H2 = E Xd (X • [M ]) = 0
∞
=E 0
2 X 2 d [M ] XM ,
and hence the mapping X → X • M is an isometry. 1
Example 2.65 The standard deviation of
0
√ wdw is 1/ 2.
The integral is meaningful and as on finite intervals w ∈ L2 (w) the integral 1 process w • w is a martingale. Hence the expected value of the integral 0 wdw is zero. By Itˆo’s isometry and by Fubini’s theorem 2 1 1 1
2 wdw w (s) ds = E w2 (s) ds = =E E 0
0
0
1
=
sds = 0
1 . 2
√ Hence the standard deviation is 1/ 2. We can calculate the standard deviation in the following way as well:
2
t
wdw
−
0
wdw 0
is a martingale, hence 2 1 E wdw =E 0
1
wdw
E
0
1
0
w2 d [w] = 0
1
wdw,
0
=E using (2.26) directly.
t
1
1
wdw 0
1 E w2 (s) ds = , 2
=
158
STOCHASTIC INTEGRATION
Proposition 2.66 If M is a continuous local martingale and X ∈ L2loc (M ) then [X • M ] = X 2 • [M ] .
(2.32)
Proof. By simple calculation using (2.31), and that on the right-hand side of (2.31), we have a pathwise integral [X • M ] [X • M, X • M ] = X • [M, X • M ] = = X • (X • [M, M ]) = X 2 • [M ] . Corollary 2.67 If M is a continuous local martingale and X is a progressively measurable process then X ∈ L2loc (M ) if and only if for all t almost surely
t
X 2 d [M ] X 2 • [M ] (t) < ∞.
(2.33)
0
Proof. The quadratic variation [X • M ], like every quadratic variation, is almost surely finite, hence if X ∈ L2loc (M ) then by (2.32), (2.33) holds. On the other hand, assume that (2.33) holds. For all n let us define the stopping times t τ n inf t : X 2 d [M ] ≥ n . 0
As [M ] is continuous, X 2 • [M ] is also continuous, hence
τn
X 2 d [M ] ≤ n,
0
that is X ∈ L2 (M τ n ) , hence X ∈ L2loc (M ) , so the space L2loc (M ) contains all the R-measurable processes, for which (2.33) holds for all t. Corollary 2.68 Assume that M is a local martingale and X ∈ L2loc (M ). If on an interval [a, b] 1. X (t, ω) = 0 for all ω or 2. M (t, ω) = M (a, ω) , then X • M is constant on [a, b]. Proof. The integral X 2 •[M, M ] is a pathwise integral, hence under the assumptions X 2 • [M, M ] is constant on [a, b]. As [X • M ] = X 2 • [M ] , the local martingale X • M is constant on55 [a, b]. 55 See:
Proposition 2.47, page 144.
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
159
Proposition 2.69 (Stopping rule for stochastic integrals) If M is a continuous local martingale, X ∈ L2loc (M ) and τ is an arbitrary stopping time then τ
(X • M ) = χ ([0, τ ]) X • M = X τ • M τ = X • M τ .
(2.34)
Proof. Let τ be an arbitrary stopping time. If X ∈ L2loc (M ) , then as |χ ([0, τ ]) X| ≤ |X| trivially χ ([0, τ ]) X ∈ L2loc (M ). Using the analogous properties of the L2 (M ) integrals τ τn
((X • M ) )
τ
τ
τ
= ((X • M ) n ) (X • M τ n ) = = χ ([0, τ ]) X • M τ n τn
(χ ([0, τ ]) X • M )
.
The proof of the other parts of (2.34) are analogous. Proposition 2.70 (Linearity) X • M is bilinear, that is if α1 and α2 are constants then X • (α1 M1 + α2 M2 ) = α1 (X • M1 ) + α2 (X • M2 ) and (α1 X1 + α2 X2 ) • M = α1 (X1 • M ) + α2 (X2 • M ) when all the expressions are meaningful. In these relations if two integrals are meaningful then the third one is meaningful. Proof. If X ∈ L2loc (M1 ) ∩ L2loc (M2 ) then for all t
$$\int_0^t X^2\,d[M_1] < \infty \quad \text{and} \quad \int_0^t X^2\,d[M_2] < \infty.$$

Obviously, by the Kunita–Watanabe inequality (see: Corollary 2.36, page 137),

$$[M_1 + M_2] \leq 2([M_1] + [M_2]),$$

hence

$$\int_0^t X^2\,d[M_1 + M_2] \leq 2\left( \int_0^t X^2\,d[M_1] + \int_0^t X^2\,d[M_2] \right) < \infty,$$
therefore $X \in L^2_{loc}(M_1 + M_2)$. From the linearity of the pathwise integration and from the bilinearity of the quadratic variation,

$$[X \bullet (\alpha_1 M_1 + \alpha_2 M_2), N] = X \bullet [\alpha_1 M_1 + \alpha_2 M_2, N] = X \bullet (\alpha_1 [M_1, N] + \alpha_2 [M_2, N]) = \alpha_1 X \bullet [M_1, N] + \alpha_2 X \bullet [M_2, N] = [\alpha_1 X \bullet M_1 + \alpha_2 X \bullet M_2, N],$$

from which the linearity of the integral in the integrator is evident. The linearity in the integrand is also evident, as

$$[(\alpha_1 X_1 + \alpha_2 X_2) \bullet M, N] = (\alpha_1 X_1 + \alpha_2 X_2) \bullet [M,N] = \alpha_1 X_1 \bullet [M,N] + \alpha_2 X_2 \bullet [M,N] = [\alpha_1 X_1 \bullet M, N] + [\alpha_2 X_2 \bullet M, N] = [\alpha_1 X_1 \bullet M + \alpha_2 X_2 \bullet M, N].$$

The remark about the integrability is evident from the trivial linearity of the space $L^2_{loc}(M)$.

Proposition 2.71 (Associativity) If $X \in L^2(M)$ then $Y \in L^2(X \bullet M)$ if and only if $XY \in L^2(M)$. If $X \in L^2_{loc}(M)$ then $Y \in L^2_{loc}(X \bullet M)$ if and only if $XY \in L^2_{loc}(M)$. In both cases

$$(YX) \bullet M = Y \bullet (X \bullet M). \qquad (2.35)$$
Proof. Using the construction of the stochastic integral and given that the associativity formula (2.35) is valid for pathwise integration,
\[ [X \bullet M] = [X \bullet M, X \bullet M] = X \bullet [M, X \bullet M] = X \bullet (X \bullet [M, M]) = X^2 \bullet [M, M]. \]
By the associativity of the pathwise integration for non-negative integrands
\[ E\left(\int_0^{\infty} Y^2\,d[X \bullet M]\right) = E\left(\int_0^{\infty} Y^2\,d\left(\int_0^s X^2\,d[M]\right)\right) = E\left(\int_0^{\infty} Y^2 X^2\,d[M]\right), \]
hence $YX \in L^2(M)$ if and only if $Y \in L^2(X \bullet M)$. If $X \in L^2(M)$, then by the Kunita–Watanabe inequality, for almost all $\omega$ the trajectory $X(\omega)$ is integrable
INTEGRATION WITH CONTINUOUS SEMIMARTINGALES
with respect to $[M, N](\omega)$. If $XY \in L^2(M)$ then, using (2.26) again,
\[ [(YX) \bullet M, N] = (YX) \bullet [M, N] = \int_0^t YX\,d[M, N] = \int_0^t Y\,d\left(\int_0^s X\,d[M, N]\right) = Y \bullet (X \bullet [M, N]). \tag{2.36} \]
Using (2.26) and that $Y \in L^2(X \bullet M)$,
\[ Y \bullet (X \bullet [M, N]) = Y \bullet [X \bullet M, N] = [Y \bullet (X \bullet M), N]. \]
Comparing this with line (2.36),
\[ [(YX) \bullet M, N] = [Y \bullet (X \bullet M), N]. \]
Hence by the uniqueness of the stochastic integral $(YX) \bullet M = Y \bullet (X \bullet M)$. To prove the general case, observe that $XY \in L^2_{\mathrm{loc}}(M)$ if and only if for some localizing sequence $(\tau_n)$
\[ E\left(\chi([0,\tau_n]) X^2 Y^2 \bullet [M]\right) < \infty. \]
As
\[ \chi([0,\tau_n]) Y^2 \bullet (X^2 \bullet [M]) = \chi([0,\tau_n]) Y^2 X^2 \bullet [M], \]
$XY \in L^2_{\mathrm{loc}}(M)$ if and only if $Y \in L^2_{\mathrm{loc}}(X \bullet M)$. Let $(\tau_n)$ be a common localizing sequence for $M$ and $X \bullet M$. If $Y \in L^2_{\mathrm{loc}}(X \bullet M)$ then evidently
\[ Y \in L^2\left((X \bullet M)^{\tau_n}\right) = L^2\left(X \bullet M^{\tau_n}\right). \]
So
\[ (Y \bullet (X \bullet M))^{\tau_n} = Y \bullet (X \bullet M)^{\tau_n} = Y \bullet (X \bullet M^{\tau_n}) = (YX \bullet M^{\tau_n}) = (YX \bullet M)^{\tau_n}, \]
from which the associativity is evident. ∎
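The associativity can also be seen at the level of the approximating sums, where $(YX) \bullet M = Y \bullet (X \bullet M)$ holds term by term. A short numerical sketch (not from the text; a simulated Brownian integrator with hypothetical left-point integrands):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
dt = 1.0 / n

dM = rng.normal(0.0, np.sqrt(dt), n)       # increments of a Brownian motion M
M = np.concatenate([[0.0], np.cumsum(dM)])

X = np.cos(M[:-1])                         # adapted left-point integrands
Y = M[:-1] ** 2

d_XM = X * dM                              # increments of the integral X . M
lhs = np.cumsum(Y * d_XM)                  # Y . (X . M)
rhs = np.cumsum((Y * X) * dM)              # (YX) . M
gap = float(np.max(np.abs(lhs - rhs)))
```

The two discrete sums coincide up to floating-point rounding, which is the discrete shadow of (2.35).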
2.3.3 Integration with respect to semimartingales
We can again extend the definition of stochastic integration, this time to semimartingales:

Definition 2.72 Let $X = X(0) + L + V$ be a continuous semimartingale. If for some process $Y$ the integrals $Y \bullet L$ and $Y \bullet V$ are meaningful, then the stochastic integral $Y \bullet X$ of $Y$ with respect to $X$ is by definition the sum
\[ Y \bullet X \triangleq Y \bullet L + Y \bullet V. \]
Remember that by Fisk's theorem the decomposition $X = X(0) + L + V$ is unique, hence the integral is well-defined.

Proposition 2.73 The most important properties of the stochastic integral $Y \bullet X$ are the following:
1. $Y \bullet X$ is bilinear, that is,
\[ Y \bullet (\alpha_1 X_1 + \alpha_2 X_2) = \alpha_1 (Y \bullet X_1) + \alpha_2 (Y \bullet X_2) \]
and
\[ (\alpha_1 Y_1 + \alpha_2 Y_2) \bullet X = \alpha_1 (Y_1 \bullet X) + \alpha_2 (Y_2 \bullet X), \]
assuming that all the expressions are meaningful. If two integrals are meaningful then the third is meaningful.
2. For all locally bounded processes $Y, Z$: $Z \bullet (Y \bullet X) = (ZY) \bullet X$.
3. For every stopping time $\tau$
\[ (Y \bullet X)^{\tau} = (Y\chi([0,\tau])) \bullet X = Y \bullet X^{\tau}. \]
4. If the integrator $X$ is a local martingale, or if $X$ has bounded variation on finite intervals, then the same is true for the integral process $Y \bullet X$.
5. $Y \bullet X$ is constant on any interval where either $Y = 0$ or $X$ is constant.
6. $[Y \bullet X, Z] = Y \bullet [X, Z]$ for any continuous semimartingale $Z$.

2.3.4 The Dominated Convergence Theorem for stochastic integrals

A crucial property of every integral is that under some conditions one can swap the order of taking the limit and the integration:

Proposition 2.74 (Dominated Convergence Theorem for stochastic integrals) Let $X$ be a continuous semimartingale, and let $(Y_n)$ be a sequence of
progressively measurable processes. Assume that $(Y_n(t,\omega))$ converges to $Y_\infty(t,\omega)$ in every point $(t,\omega)$. If there is an integrable process $Y$ such that⁵⁷ $|Y_n| \le Y$ for all $n$, then $Y_n \bullet X \to Y_\infty \bullet X$, where the convergence is uniform in probability on every compact interval, that is,
\[ \sup_{s \le t} |(Y_n \bullet X)(s) - (Y_\infty \bullet X)(s)| \xrightarrow{\,p\,} 0 \]
for all $t \ge 0$.
Proof. One can prove the proposition separately when $X$ has finite variation and when $X$ is a local martingale. It is sufficient to prove the proposition when $Y_\infty \equiv 0$.
1. First, assume that $X$ has finite variation. In this case the integrability of $Y$ means that for every $t$
\[ \int_0^t |Y|\,d\mathrm{Var}(X) < \infty. \]
As $|Y_n| \le Y$, for every $\omega$ the trajectory $Y_n(\omega)$ is also integrable on every interval $[0,t]$. Applying the classical Dominated Convergence Theorem for every trajectory individually, for all $s \le t$
\[ \left|\int_0^s Y_n\,dX\right| \le \int_0^t |Y_n|\,d\mathrm{Var}(X) \to 0. \]
Hence the integral, as a function of the upper bound, converges uniformly to zero. Pointwise convergence on a finite measure space implies convergence in measure, so when the integrator has finite variation the proposition holds.
2. Let $X$ be a local martingale. $Y$ is integrable with respect to $X$, hence by definition $Y \in L^2_{\mathrm{loc}}(X)$. Let $\varepsilon, \delta > 0$ be arbitrary, and let $(\tau_n)$ be a localizing sequence of $Y$. To make the notation simpler, let us denote by $\sigma$ a $\tau_n$ for which $P(\tau_n < t) \le \delta/2$. By the stopping rule $(Y_n \bullet X)^{\sigma} = Y_n \bullet X^{\sigma}$, that is, if $s \le \sigma(\omega)$ then
\[ (Y_n \bullet X)(s,\omega) = (Y_n \bullet X^{\sigma})(s,\omega). \]
If
\[ A \triangleq \left\{ \sup_{s \le t} |Y_n \bullet X|(s) > \varepsilon \right\}, \qquad A_{\sigma} \triangleq \left\{ \sup_{s \le t} |Y_n \bullet X^{\sigma}|(s) > \varepsilon \right\}, \]

⁵⁷ The integrability of $Y$ depends on the integrator $X$. If $X$ is a local martingale, then by definition this means that $Y \in L^2_{\mathrm{loc}}(X)$.
then
\[ P(A) = P((\sigma < t) \cap A) + P((\sigma \ge t) \cap A) \le P(\sigma < t) + P((t \le \sigma) \cap A) \le \frac{\delta}{2} + P(A_{\sigma}). \]
Since $Y \in L^2(X^{\sigma})$, obviously $Y_n \in L^2(X^{\sigma})$. Hence by the classical Dominated Convergence Theorem, as $Y_n \to 0$ and $|Y_n| \le Y$,
\[ \|Y_n\|^2_{X^{\sigma}} \triangleq E\left(\int_0^{\infty} Y_n^2\,d[X^{\sigma}]\right) = E\left(\int_0^{\infty} \chi([0,\sigma]) Y_n^2\,d[X]\right) = E\left(\int_0^{\sigma} Y_n^2\,d[X]\right) \to 0, \]
that is, $Y_n \to 0$ in $L^2(X^{\sigma})$. By Itô's isometry the correspondence $Z \mapsto Z \bullet X^{\sigma}$ is an $L^2(X^{\sigma}) \to \mathcal{H}^2$ isometry⁵⁸. Hence $Y_n \bullet X^{\sigma} \xrightarrow{\mathcal{H}^2} 0$. By Doob's inequality⁵⁹
\[ E\left(\sup_{s \le \infty} |Y_n \bullet X^{\sigma}|^2(s)\right) \le 4E\left((Y_n \bullet X^{\sigma})^2(\infty)\right) = 4\|Y_n \bullet X^{\sigma}\|^2_{\mathcal{H}^2} \to 0. \]
By Markov's inequality, stochastic convergence follows from the $L^2(\Omega)$-convergence, hence
\[ P(A_{\sigma}) = P\left(\sup_{s \le t} |Y_n \bullet X^{\sigma}|(s) > \varepsilon\right) \to 0. \]
Hence for $n$ large enough
\[ P(A) = P\left(\sup_{s \le t} |Y_n \bullet X|(s) > \varepsilon\right) \le \delta. \qquad\blacksquare \]
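For the finite-variation half of the argument the statement reduces to the classical dominated convergence theorem applied path by path. A small deterministic sketch (not from the text; the hypothetical choices are $X(t)=t$ and $Y_n(t)=\sin(nt)/n$, dominated by $Y \equiv 1$) shows the uniform convergence of the integral processes to zero:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 10_001)
dX = np.diff(t)                             # X(t) = t has finite variation

sups = []
for n in (1, 2, 4, 8, 16):
    Yn = np.sin(n * t[:-1]) / n             # |Yn| <= 1 and Yn -> 0 pointwise
    integral = np.concatenate([[0.0], np.cumsum(Yn * dX)])
    sups.append(float(np.max(np.abs(integral))))   # sup over [0,1] of |(Yn . X)(s)|
```

The suprema shrink with $n$, exactly the uniform-in-the-upper-bound convergence used in step 1 of the proof.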
2.3.5 Stochastic integration and the Itô–Stieltjes integral

As we mentioned, every integral is in some sense the limit of certain approximating sums. From the construction above it is not clear in which sense the integral $X \bullet M$ is a limit of the approximating sums.

⁵⁸ See: Itô's isometry, Proposition 2.64, page 156.
⁵⁹ See: line (1.17), page 34, and Proposition 2.52, page 147.
Lemma 2.75 If $X$ is a continuous semimartingale and
\[ Y \triangleq \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}]) \]
is an integrable, non-negative predictable simple process⁶⁰ then
\[ (Y \bullet X)(t) = \sum_i \eta_i \cdot (X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)). \]
Proof. If $\sigma \le \tau$ are stopping times, then using the linearity and the stopping rule
\[ \chi((\sigma,\tau]) \bullet X = (\chi([0,\tau]) - \chi([0,\sigma])) \bullet X = (1 \bullet X)^{\tau} - (1 \bullet X)^{\sigma} = X^{\tau} - X^{\sigma}. \]
Hence the formula holds with $\eta \equiv 1$. It is easy to check that if $F \in \mathcal{F}_{\sigma} \subseteq \mathcal{F}_{\tau}$ then
\[ \sigma_F(\omega) \triangleq \begin{cases} \sigma(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases}, \qquad \tau_F(\omega) \triangleq \begin{cases} \tau(\omega) & \text{if } \omega \in F \\ \infty & \text{if } \omega \notin F \end{cases} \]
are also stopping times, hence
\[ (\chi_F \chi((\sigma,\tau])) \bullet X = \chi((\sigma_F, \tau_F]) \bullet X = X^{\tau_F} - X^{\sigma_F} = \chi_F (X^{\tau} - X^{\sigma}), \]
hence the formula is valid if $\eta = \chi_F$, $F \in \mathcal{F}_{\sigma}$. If $\eta$ is an $\mathcal{F}_{\sigma}$-measurable step function, then since the integral is linear one can write $\eta$ in the place of $\chi_F$. It is easy to show that for any $\mathcal{F}_{\sigma}$-measurable function $\eta$ the process $\eta\chi((\sigma,\tau])$ is integrable with respect to $X$, hence using the Dominated Convergence Theorem one can prove the formula when $\eta$ is an arbitrary $\mathcal{F}_{\sigma}$-measurable function. As $Y \ge 0$,
\[ 0 \le Y_n \triangleq \sum_{i=1}^{n} \eta_i \chi((\tau_i, \tau_{i+1}]) \le Y. \]
The general case follows from the Dominated Convergence Theorem and from the linearity of the integral. ∎

Corollary 2.76 If $X$ is a continuous semimartingale, $\tau_n \nearrow \infty$ and $Y \triangleq \sum_i \eta_i \cdot \chi((\tau_i, \tau_{i+1}])$ is a predictable simple process then
\[ \int_0^t Y\,dX \triangleq (Y \bullet X)(t) = \sum_i \eta_i \cdot (X(\tau_{i+1} \wedge t) - X(\tau_i \wedge t)). \]

⁶⁰ See: Definition 1.41, page 24.
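Before turning to the proof, the telescoping formula of the corollary can be compared with a direct Riemann–Stieltjes approximation of $\int_0^t Y\,dX$. The sketch below uses hypothetical data, not from the text: the smooth finite-variation integrator $X(s)=s^2$ and four deterministic intervals.

```python
import numpy as np

def X(s):                                    # a smooth finite-variation integrator
    return s ** 2

taus = np.array([0.0, 0.2, 0.5, 0.9, 1.5])   # stopping times tau_0 < ... < tau_4
etas = np.array([1.0, -2.0, 0.5, 3.0])       # eta_i held on (tau_i, tau_{i+1}]

def simple_integral(t):
    # (Y . X)(t) = sum_i eta_i (X(tau_{i+1} ^ t) - X(tau_i ^ t))
    return float(np.sum(etas * (X(np.minimum(taus[1:], t)) - X(np.minimum(taus[:-1], t)))))

def Y(s):
    # the left-continuous step function sum_i eta_i chi((tau_i, tau_{i+1}])(s)
    idx = np.clip(np.searchsorted(taus, s, side="left") - 1, 0, len(etas) - 1)
    return etas[idx] * (s > taus[0]) * (s <= taus[-1])

def riemann(t, n=200_000):
    # crude Riemann-Stieltjes sum of int_0^t Y dX on a fine grid
    s = np.linspace(0.0, t, n + 1)
    return float(np.sum(Y(s[1:]) * np.diff(X(s))))

gaps = [abs(simple_integral(t) - riemann(t)) for t in (0.1, 0.35, 0.7, 1.0, 2.0)]
```

The telescoping evaluation and the fine-grid sum agree to within the discretization error.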
Proof. As $\tau_n \nearrow \infty$, $Y$ is left-continuous and has right-hand limits. So $Y$ is locally bounded on $[0,\infty)$ and therefore $Y^{\pm}$ are integrable. ∎

Proposition 2.77 If $X$ is a continuous semimartingale and $Y$ is a left-continuous, adapted and locally bounded process, then $(Y \bullet X)(t)$ is the Itô–Stieltjes integral for every $t$. The convergence of the approximating sums is uniform in probability on every compact interval. The partitions of the intervals can be random as well.

Proof. More precisely, let $\tau_k^{(n)} \le \tau_{k+1}^{(n)} \nearrow \infty$ be a sequence of stopping times. For each $t$ let
\[ \sum_k Y(\tau_k^{(n)}) \left( X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t) \right) \]
be the sequence of Itô-type approximating processes. Assume that for each $\omega$
\[ \lim_{n\to\infty} \max_k \left( \tau_{k+1}^{(n)}(\omega) - \tau_k^{(n)}(\omega) \right) = 0. \]
Define the locally bounded simple predictable processes
\[ Y^{(n)} \triangleq \sum_k Y(\tau_k^{(n)}) \chi\left(\left(\tau_k^{(n)}, \tau_{k+1}^{(n)}\right]\right). \]
As we saw,
\[ \left(Y^{(n)} \bullet X\right)(t) = \sum_k Y(\tau_k^{(n)}) \left( X(\tau_{k+1}^{(n)} \wedge t) - X(\tau_k^{(n)} \wedge t) \right). \]
$Y$ is continuous from the left, hence in every point $Y^{(n)} \to Y$. Let $K(t) \triangleq \sup_{s \le t} |Y(s)|$.
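The left-point (Itô-type) approximating sums can be tested on the classical example $\int_0^1 B\,dB = (B(1)^2 - 1)/2$. The following numerical sketch is not part of the text; it assumes a simulated Brownian path on a deterministic grid, a special case of the random partitions above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
dt = 1.0 / n

dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate([[0.0], np.cumsum(dB)])

# Ito-type approximating sum with left-point evaluation Y(tau_k) = B(t_k)
approx = float(np.sum(B[:-1] * dB))
exact = 0.5 * (B[-1] ** 2 - 1.0)   # closed form of the Ito integral of B dB over [0, 1]
err = abs(approx - exact)
```

Note that evaluating the integrand at the left endpoint is essential: a right-point or midpoint rule converges to a different (Stratonovich-type) limit.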
If $m > n$ then by Davis' inequality
\[ E\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) \le C \cdot E\left(\sqrt{\left[\sum_{i=n}^{m} L_i\right](\infty)}\right) = C \cdot E\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right). \]
As $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$, by the Dominated Convergence Theorem
\[ \lim_{n,m\to\infty} E\left(\sqrt{\sum_{i=n}^{m} [L_i](\infty)}\right) = 0, \]
which implies that
\[ \lim_{n,m\to\infty} E\left(\sup_t \left|\sum_{i=n}^{m} L_i(t)\right|\right) = 0. \]
LOCAL MARTINGALES AND COMPENSATED JUMPS
As $L^1(\Omega)$ is complete, $\sup_t |\sum_{i=1}^{n} L_i(t)|$ is convergent in $L^1(\Omega)$. From the convergence in $L^1(\Omega)$ one has a subsequence which is almost surely convergent, therefore there is a process $L$ such that for almost all $\omega$
\[ \lim_{k\to\infty} \sup_t \left| \sum_{i=1}^{n_k} L_i(t,\omega) - L(t,\omega) \right| = 0. \]
$L$ is obviously right-regular, and of course $\sum_{i=1}^{n} L_i$ converges to $L$ uniformly in $L^1(\Omega)$, that is,
\[ \lim_{n\to\infty} E\left(\sup_t \left| \sum_{i=1}^{n} L_i(t) - L(t) \right|\right) = 0. \]
Again by Davis' inequality
\[ E\left(\sup_t |L_i(t)|\right) \le C \cdot E\left([L_i]^{1/2}(\infty)\right) < \infty, \]
hence $L_i$ is a class $D$ local martingale, hence it is a martingale. From the convergence in $L^1(\Omega)$ it follows that $L \triangleq \sum_{i=1}^{\infty} L_i$ is also a martingale. As
\[ E\left(\sup_t |L(t)|\right) \le E\left(\sup_t \left|\sum_{i=1}^{n} L_i(t)\right|\right) + E\left(\sup_t \left| L - \sum_{i=1}^{n} L_i \right|(t)\right) < \infty, \]
the limit $L$ is in class $D$, that is, $L$ is a uniformly integrable martingale.

Now let us assume that $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+_{\mathrm{loc}}$. In this case there is a localizing sequence $(\tau_k)$ for which
\[ \sqrt{\sum_{n=1}^{\infty} [L_n^{\tau_k}]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+. \]
Observe that $(\tau_k)$ is a common localizing sequence for all $L_n$, that is, $\sqrt{[L_n^{\tau_k}]} \in \mathcal{A}$ for all $n$. Observe also that by Davis' inequality $L_n^{\tau_k} \in \mathcal{M}$ for every $n$ and $k$. By the first part of the proof for every $k$ there is an $L^{(k)} \in \mathcal{M}$ such that $\sum_{n=1}^{\infty} L_n^{\tau_k} = L^{(k)}$. Obviously $(L^{(k+1)})^{\tau_k} = L^{(k)}$, so one can define an $L \in \mathcal{L}$ for which $L^{\tau_k} = L^{(k)}$. Let us fix an $\varepsilon$ and a $\delta$. As $\tau_k \nearrow \infty$, for every $t < \infty$ there is
GENERAL THEORY OF STOCHASTIC INTEGRATION
an $n$ such that $P(\tau_k \le t) \le \delta/2$ whenever $k \ge n$. In the usual way, for $k \ge n$
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le P(\tau_k \le t) + P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right). \]
The first probability is smaller than $\delta/2$; the second probability is
\[ P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\ \tau_k > t\right), \]
which is smaller than
\[ P\left(\sup_s\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right). \]
As $L_n^{\tau_k} \to L^{\tau_k}$ uniformly in $L^1(\Omega)$, by Markov's inequality
\[ P\left(\sup_s\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right) \to 0, \]
from which one can easily show that for $n$ large enough
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) < \delta, \]
that is, $\sum_{k=1}^{n} L_k \xrightarrow{\mathrm{ucp}} L$, which means that on every compact interval, in the topology of uniform convergence in probability,
\[ \lim_{n\to\infty} \sum_{k=1}^{n} L_k = \sum_{k=1}^{\infty} L_k = L. \qquad\blacksquare \]
Theorem 4.27 (Parseval's identity) Under the conditions of the theorem above, for every $t$
\[ \lim_{n\to\infty} \sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)} = 0 \tag{4.3} \]
and
\[ [L](t) \overset{\mathrm{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t), \tag{4.4} \]
where in both cases the convergence holds in probability.

Proof. By Davis' inequality
\[ E\left(\sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}\right) \le \frac{1}{c} \cdot E\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right|\right). \]
If $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$ then by the theorem just proved
\[ \lim_{m\to\infty} E\left(\sup_{s \le t}\left|L(s) - \sum_{n=1}^{m} L_n(s)\right|\right) = 0. \]
By Markov's inequality convergence in $L^1(\Omega)$ implies convergence in probability, therefore if $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+$ then (4.3) holds. Let $\sqrt{\sum_{n=1}^{\infty} [L_n]} \in \mathcal{A}^+_{\mathrm{loc}}$ and let $(\tau_k)$ be a localizing sequence of $\sqrt{\sum_{n=1}^{\infty} [L_n]}$. Let us fix an $\varepsilon$ and a $\delta$. As $\tau_k \nearrow \infty$, for every $t < \infty$ there is a $q$ such that $P(\tau_k \le t) \le \delta/2$ whenever $k \ge q$. In the usual way, for $k \ge q$
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le P(\tau_k \le t) + P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right). \]
Obviously
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon,\ \tau_k > t\right) = P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon,\ \tau_k > t\right) \le P\left(\sup_{s \le t}\left|L^{\tau_k}(s) - \sum_{k=1}^{n} L_k^{\tau_k}(s)\right| > \varepsilon\right). \]
By the stopping rule of the quadratic variation
\[ \sqrt{\sum_{n=1}^{\infty} [L_n^{\tau_k}]} = \sqrt{\sum_{n=1}^{\infty} [L_n]^{\tau_k}} = \left(\sqrt{\sum_{n=1}^{\infty} [L_n]}\right)^{\tau_k} \in \mathcal{A}^+, \]
so by the first part of the proof, if $n$ is large enough,
\[ P\left(\sup_{s \le t}\left|L(s) - \sum_{k=1}^{n} L_k(s)\right| > \varepsilon\right) \le \frac{\delta}{2} + \frac{\delta}{2}, \]
that is, (4.3) holds in the general case. By the Kunita–Watanabe inequality²⁶
\[ \left|\sqrt{[L](t)} - \sqrt{\left[\sum_{k=1}^{n} L_k\right](t)}\right| \le \sqrt{\left[L - \sum_{k=1}^{n} L_k\right](t)}. \]
This implies that
\[ [L](t) = \lim_{n\to\infty}\left[\sum_{k=1}^{n} L_k\right](t) = \lim_{n\to\infty}\sum_{k=1}^{n} [L_k](t) = \sum_{k=1}^{\infty} [L_k](t), \]
where the convergences hold in probability. ∎

4.2.1 Construction of purely discontinuous local martingales
The cornerstone of the construction of the general stochastic integral is the next proposition:

Proposition 4.28 Let $H$ be a progressively measurable process. There is one and only one purely discontinuous local martingale $L \in \mathcal{L}$ for which $\Delta L = H$ if and only if
1. the set $\{H \ne 0\}$ is thin,
2. ${}^pH = 0$ and
3. $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$.

Proof. By the definition of the thin sets, for every $\omega$ there exists just a countable number of points where the trajectory $H(\omega)$ is not zero. Hence the sum $\sum H^2(t) \triangleq \sum_{s \le t} H^2(s)$ is meaningful. Observe that from the condition $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$ it implicitly follows that $H(0) = 0$.

²⁶ See: Corollary 2.36, page 137.
1. The uniqueness of $L$ is obvious, as if purely discontinuous local martingales have the same jumps then they are indistinguishable²⁷.
2. If $H \triangleq \Delta L$ for some $L \in \mathcal{L}$ then ${}^pH = {}^p(\Delta L) = 0$, and as $(\Delta L)^2 = \Delta[L]$ and $[L]$ is increasing,
\[ \sum H^2 = \sum (\Delta L)^2 \le \sum (\Delta L)^2 + [L]^c = [L]. \]
Since²⁸ $\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}$, obviously $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$, so the conditions are necessary.
3. Let us assume that $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$, and let the sequence of stopping times $(\rho_m)$ be exhausting²⁹ for the thin set $\{H \ne 0\}$. We can assume that each $\rho_m$ is either totally inaccessible or predictable. For every stopping time $\rho_m$ let us define a simple jump process which jumps at $\rho_m$ and for which the value of the jump is $H(\rho_m)$:
\[ N_m \triangleq H(\rho_m)\chi([\rho_m,\infty)). \]
It is worth emphasizing that it is possible that $\cup_m [\rho_m] \ne \{H \ne 0\}$. That is, the inclusion $\{H \ne 0\} \subseteq \cup_m [\rho_m]$ can be proper, but
\[ \cup_m \{\Delta N_m \ne 0\} = \{H \ne 0\}. \]
$N_m$ is right-regular, $H$ is progressively measurable, hence the stopped variables $H(\rho_m)$ are $\mathcal{F}_{\rho_m}$-measurable and so $N_m$ is adapted. As $\sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}$,
\[ |N_m| \le \sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}} \]
for every $m$, hence $N_m$ has locally integrable variation, so it has a compensator $N_m^p$.
4. We show that $N_m^p$ is continuous. If $\rho_m$ is predictable then the graph $[\rho_m]$ of $\rho_m$ is a predictable set³⁰, so using property 6. of the predictable

²⁷ See: Corollary 4.7, page 228.
²⁸ See: line (3.20), page 222.
²⁹ See: Proposition 3.22, page 189.
³⁰ See: Corollary 3.34, page 199.
compensator³¹, up to indistinguishability,
\[ \Delta(N_m^p) = {}^p(\Delta N_m) = {}^p(H(\rho_m)\chi([\rho_m])) = {}^p(H\chi([\rho_m])) = ({}^pH)\chi([\rho_m]) = 0 \cdot \chi([\rho_m]) = 0. \]
Hence $N_m^p$ is continuous. Let $\rho_m$ be totally inaccessible. As above,
\[ \Delta(N_m^p) = {}^p(\Delta N_m) = {}^p(H\chi([\rho_m])). \]
$\rho_m$ is totally inaccessible and therefore $P(\rho_m = \sigma) = 0$ for every predictable stopping time $\sigma$, hence if $\sigma$ is predictable then
\[ {}^p(H\chi([\rho_m]))(\sigma) = \mathbf{E}\left(H\chi([\rho_m])(\sigma) \mid \mathcal{F}_{\sigma-}\right) = \mathbf{E}(0 \mid \mathcal{F}_{\sigma-}) = 0. \]
By the definition of the predictable projection $\Delta(N_m^p) = 0$.
5. Let $L_m \triangleq N_m - N_m^p \in \mathcal{L}$ be the compensated jumps. As the compensators are continuous and have finite variation, if $i \ne j$ then
\[ [L_i, L_j] = [N_i, N_j] = 0, \]
and
\[ \sqrt{\sum_k [L_k]} = \sqrt{\sum_k [N_k]} = \sqrt{\sum H^2} \in \mathcal{A}^+_{\mathrm{loc}}. \]
Hence³² there is an $L \in \mathcal{L}$ for which $L = \sum_k L_k$. As the convergence is uniform in probability, there is a subsequence for which the convergence is almost surely uniform. Hence up to indistinguishability
\[ \Delta L = \Delta \sum_k L_k = \sum_k \Delta L_k = H. \]
Observe that in the last step we have used the fact that
\[ \{H \ne 0\} = \cup_m \{\Delta N_m \ne 0\} = \cup_m \{\Delta L_m \ne 0\}. \]
6. Let us prove that $L$ is purely discontinuous. Let $M$ be a continuous local martingale. Obviously $[L_k, M] = 0$. Therefore by the inequality of Kunita and

³¹ See: page 217.
³² See: Theorem 4.26, page 236.
Watanabe³³ and by (4.3),
\[ |[M, L]| \le \left|\left[M, L - \sum_{k=1}^{n} L_k\right]\right| + \left|\left[M, \sum_{k=1}^{n} L_k\right]\right| = \left|\left[M, L - \sum_{k=1}^{n} L_k\right]\right| \le \sqrt{[M]}\sqrt{\left[L - \sum_{k=1}^{n} L_k\right]} \to 0, \]
which implies that $[M, L] = 0$, that is, $M$ and $L$ are orthogonal. Hence $L$ is purely discontinuous. ∎

Definition 4.29 The following definitions are useful:
1. We say that a process $X$ is a single jump if there is a stopping time $\rho$ and an $\mathcal{F}_{\rho}$-measurable random variable $\xi$ such that $X = \xi\chi([\rho,\infty))$.
2. We say that a process $X$ is a compensated single jump if there is a single jump $Y$ for which $X = Y - Y^p$.
3. We say that $X$ is a continuously compensated single jump if $Y^p$ in 2. is continuous.

Proposition 4.30 (The structure of purely discontinuous local martingales) If $L \in \mathcal{L}$ is a purely discontinuous local martingale then, in the topology of uniform convergence in probability on compact intervals,
\[ L = \sum_{k=1}^{\infty} L_k, \]
where for all $k$:
1. $L_k \in \mathcal{L}$ is a continuously compensated single jump,
2. the jumps of $L_k$ are jumps of $L$,
3. if $i \ne j$ then $[L_i, L_j] = 0$, that is, $L_i$ and $L_j$ are strongly orthogonal,
4. $[L_k] = (\Delta L(\rho_k))^2 \chi([\rho_k,\infty))$, where $\rho_k$ denotes the stopping time of $L_k$,
5. if $i \ne j$ then the graphs $[\rho_i]$ and $[\rho_j]$ are disjoint.
If $\sqrt{[L]} \in \mathcal{A}^+$ then the convergence holds in the topology of uniform convergence in $L^1(\Omega)$.

Proof. It is sufficient to remark that if $L \in \mathcal{L}$ is purely discontinuous then the jump process of $L$ satisfies the conditions of the above proposition³⁴. ∎

³³ See: Corollary 2.36, page 137.
³⁴ See: Proposition 4.28, page 240.
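The building blocks above can be illustrated numerically: a single jump at a totally inaccessible (exponential) time, compensated continuously, is centred. The following Monte Carlo sketch is not from the text; the rate, horizon and sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, paths = 1.0, 1.0, 200_000

rho = rng.exponential(1.0 / lam, paths)     # totally inaccessible jump times
N = (rho <= t).astype(float)                # single jump N = chi([rho, oo)) evaluated at t
Np = lam * np.minimum(rho, t)               # its continuous compensator at time t
L = N - Np                                  # continuously compensated single jump L(t)

mean_L = float(np.mean(L))
```

The sample mean of $L(t)$ is statistically indistinguishable from zero, consistent with $L = N - N^p$ being a martingale.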
4.2.2 Quadratic variation of purely discontinuous local martingales

In this subsection we return to the investigation of the quadratic variation.

Definition 4.31 We say that $M$ is a pure quadratic jump process if
\[ [M] = \sum (\Delta M)^2. \tag{4.5} \]

Example 4.32 Every $V \in \mathcal{V}$ is a pure quadratic jump process³⁵.

By (2.14)
\[ [V, V] = \sum \Delta V \Delta V = \sum (\Delta V)^2. \]

Theorem 4.33 (Quadratic variation of purely discontinuous local martingales) A local martingale $L \in \mathcal{L}$ is a pure quadratic jump process if and only if it is purely discontinuous.

Proof. Let $L \in \mathcal{L}$.
1. If $L$ is purely discontinuous, then by the structure of purely discontinuous local martingales³⁶ $L = \sum_k L_k$, where
\[ [L_k, L_j] = \begin{cases} 0 & \text{if } k \ne j \\ (\Delta L(\rho_k))^2 \chi([\rho_k,\infty)) & \text{if } k = j \end{cases}. \]
By Parseval's identity (4.4), for every $t$
\[ [L](t) \overset{\mathrm{a.s.}}{=} \sum_{k=1}^{\infty} [L_k](t) = \sum_{s \le t} (\Delta L)^2(s). \]
As both sides of the equation are right-regular, $[L]$ and $\sum_{s \le t} (\Delta L)^2$ are indistinguishable.
2. If $L$ is a pure quadratic jump process, then
\[ [L] = \sum (\Delta L)^2. \]

³⁵ See: Proposition 2.33, page 134.
³⁶ See: Proposition 4.30, page 243.
Let $L = L^c + L^d$ be the decomposition of $L \in \mathcal{L}$. As $L^c$ is continuous³⁷,
\[ [L] = [L^c + L^d] = [L^c] + 2[L^c, L^d] + [L^d] = [L^c] + [L^d]. \]
By the part of the theorem already proved,
\[ [L^d] = \sum (\Delta L^d)^2 = \sum (\Delta L^d + \Delta L^c)^2 = \sum (\Delta L)^2. \]
Hence $[L^c] = 0$, therefore $L^c = 0$ and so $L = L^d$. ∎

Corollary 4.34 If $X$ is a purely discontinuous local martingale then for every local martingale $Y$
\[ [X, Y] = \sum \Delta X \Delta Y. \tag{4.6} \]
Proof. Obviously
\[ [X, Y] = [X, Y^c + Y^d] = [X, Y^c] + [X, Y^d]. \]
By the definition of the orthogonality $[X, Y^c]$ is a local martingale. $\Delta[X, Y^c] = \Delta X \Delta Y^c = 0$, hence $[X, Y^c]$ is continuous. $[X, Y^c] \in \mathcal{V} \cap \mathcal{L}$, so by Fisk's theorem $[X, Y^c] = 0$. As the purely discontinuous local martingales form a linear space,
\[ [X, Y^d] = \frac{1}{4}\left(\left[X + Y^d\right] - \left[X - Y^d\right]\right) = \frac{1}{4}\left(\sum (\Delta X + \Delta Y^d)^2 - \sum (\Delta X - \Delta Y^d)^2\right) = \sum \Delta X \Delta Y^d = \sum \Delta X (\Delta Y^d + \Delta Y^c) = \sum \Delta X \Delta Y. \qquad\blacksquare \]

Proposition 4.35 (Quadratic variation of semimartingales) For every semimartingale $X$
\[ [X] = [X^c] + \sum (\Delta X)^2, \tag{4.7} \]
where, as before³⁸, $X^c$ denotes the continuous part of the local martingale part of $X$. More generally, if $X$ and $Y$ are semimartingales then
\[ [X, Y] = [X^c, Y^c] + \sum \Delta X \Delta Y. \tag{4.8} \]

³⁷ See: Corollary 4.10, page 229.
³⁸ See: Definition 4.23, page 235.
Proof. Recall³⁹ that every semimartingale $X$ has a decomposition
\[ X = X(0) + X^c + H + V, \]
where $X^c$ is a continuous local martingale, $V \in \mathcal{V}$ and $H$ is a purely discontinuous local martingale. By simple calculation
\[ [X] = [X^c] + [V] + [H] + 2[X^c, H] + 2[X^c, V] + 2[H, V]. \]
As $X^c$ is continuous and $V$ has finite variation, $[X^c, V] = 0$. $H$ is purely discontinuous and $X^c$ is continuous, hence by (4.6) $[X^c, H] = 0$. Therefore
\[ [X] = [X^c] + [V] + [H] + 2[H, V]. \]
Every process with finite variation is a pure quadratic jump process, so
\[ [V] = \sum (\Delta V)^2. \]
$H$ is purely discontinuous, hence it is also a pure quadratic jump process, so
\[ [H] = \sum (\Delta H)^2. \]
As $V$ has finite variation, by (2.14)
\[ [H, V] = \sum \Delta H \Delta V. \]
Therefore
\[ [V] + [H] + 2[H, V] = \sum (\Delta H + \Delta V)^2 = \sum (\Delta X)^2, \]
so (4.7) holds. The proof of the general case is similar. ∎

Corollary 4.36 If $X$ is a semimartingale then $[X^c] = [X]^c$. More generally, if $X$ and $Y$ are semimartingales then $[X^c, Y^c] = [X, Y]^c$.
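Formula (4.7) can be observed on a discretized path: the realized quadratic variation on a fine grid approximates $[X^c](t)$ plus the sum of the squared jumps. The sketch below is not from the text; it assumes the hypothetical semimartingale $X = B + (N - \lambda t)$, a Brownian motion plus a compensated Poisson process.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T, lam = 200_000, 1.0, 3.0
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), n)                      # continuous-part increments
n_jumps = int(rng.poisson(lam * T))
jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
dN = np.zeros(n)
np.add.at(dN, np.minimum((jump_times / dt).astype(int), n - 1), 1.0)

dX = dB + dN - lam * dt                                   # increments of X = B + (N - lam t)
realized_qv = float(np.sum(dX ** 2))                      # fine-grid approximation of [X](T)
predicted = T + float(np.sum(dN ** 2))                    # [X^c](T) + sum of squared jumps
err = abs(realized_qv - predicted)
```

The compensator, having finite variation and being continuous, contributes nothing in the limit; only $[B](T) = T$ and the squared jumps survive.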
4.3 Stochastic Integration With Respect To Local Martingales

Recall that so far we have defined the stochastic integral with respect to local martingales only when the integrator $Y$ was locally square-integrable. In fact, in this case the construction of the stochastic integral is nearly the same as the construction when the integrator is a continuous local martingale. The only

³⁹ See: Theorem 4.19, page 232.
STOCHASTIC INTEGRATION WITH RESPECT TO LOCAL MARTINGALES
difference is that when $Y \in \mathcal{H}^2_{\mathrm{loc}}$ one can integrate only predictable processes, and one has to consider the condition for the jumps of the integral, $\Delta(X \bullet Y) = X\Delta Y$, as well. Recall that if $Y \in \mathcal{H}^2_{\mathrm{loc}}$ then a predictable process $X$ is integrable if and only if
\[ X \in L^2_{\mathrm{loc}}(Y) \triangleq \left\{ Z : Z^2 \bullet [Y] \in \mathcal{A}^+_{\mathrm{loc}} \right\}. \]
In this case $X \bullet Y \in \mathcal{H}^2_{\mathrm{loc}}$. Observe that the condition $X \in L^2_{\mathrm{loc}}(Y)$ is very natural. If $M$ is a local martingale then $M \in \mathcal{H}^2_{\mathrm{loc}}$ if and only if⁴⁰ $[M] \in \mathcal{A}^+_{\mathrm{loc}}$. As $[X \bullet Y] = X^2 \bullet [Y]$, obviously $X \bullet Y \in \mathcal{H}^2_{\mathrm{loc}}$ if and only if $X \in L^2_{\mathrm{loc}}(Y)$. As $\Delta(X \bullet Y) = X\Delta Y$, if $Y$ is continuous then $X \bullet Y$ is also continuous. Let $Y = Y(0) + Y^c + Y^d$ be the decomposition of $Y$ into continuous and purely discontinuous local martingales. As $[Y] \in \mathcal{A}^+_{\mathrm{loc}}$ and as
\[ [Y] = [Y^c] + [Y^d], \tag{4.9} \]
it is obvious that $[Y^c], [Y^d] \in \mathcal{A}^+_{\mathrm{loc}}$. This immediately implies that $Y^c$ and $Y^d$ are in $\mathcal{H}^2_{\mathrm{loc}}$. From (4.9) it is also clear that $X \in L^2_{\mathrm{loc}}(Y)$ if and only if $X \in L^2_{\mathrm{loc}}(Y^c)$ and $X \in L^2_{\mathrm{loc}}(Y^d)$. This implies that $X \bullet Y^c$ and $X \bullet Y^d$ exist, and obviously
\[ X \bullet Y = X \bullet Y^c + X \bullet Y^d. \]
By the construction $X \bullet Y^c$ is continuous. Observe that $X \bullet Y^d$ is a purely discontinuous local martingale, as for any continuous local martingale $L$
\[ [X \bullet Y^d, L] = X \bullet [Y^d, L] = X \bullet 0 = 0, \]
that is, $X \bullet Y^d$ is strongly orthogonal to every continuous local martingale. The goal of this section is to extend the integration to the case when the integrator is an arbitrary local martingale. To do this one should define the stochastic integral for every purely discontinuous local martingale. From the integration procedure we expect the following properties:
1. If $L \in \mathcal{L}$ is purely discontinuous then $X \bullet L \in \mathcal{L}$ should also be purely discontinuous.
2. Purely discontinuous local martingales are uniquely determined by their jumps⁴¹, hence it is sufficient to prescribe the jumps of $X \bullet L$: it is very natural to ask that the formula $\Delta(X \bullet L) = X\Delta L$ should hold.

⁴⁰ See: Proposition 3.64, page 223.
⁴¹ See: Corollary 4.7, page 228.
3. We have proved⁴² that $[L]^{1/2} \in \mathcal{A}^+_{\mathrm{loc}}$ for any local martingale $L$; therefore, if $X \bullet L$ is a purely discontinuous local martingale, then the expression
\[ \sqrt{[X \bullet L]} = \sqrt{\sum (X \Delta L)^2} \]
should have locally integrable variation.
4. If $L \in \mathcal{L}$ then ${}^p(\Delta L) = 0$. By the jump condition, if $X$ is predictable then
\[ {}^p(\Delta(X \bullet L)) = {}^p(X \cdot \Delta L) = X \cdot ({}^p(\Delta L)) = X \cdot 0 = 0, \]
from which one can expect that one can guarantee only for predictable integrands $X$ that $X \bullet L \in \mathcal{L}$ and $\Delta(X \bullet L) = X\Delta L$.

4.3.1 Definition of stochastic integration
Assume that $L \in \mathcal{L}$ is a purely discontinuous local martingale. As $L$ is a local martingale, ${}^p(\Delta L)$ is finite and ${}^p(\Delta L) = 0$. If $H$ is a predictable real-valued process then, as ${}^p(\Delta L)$ is finite⁴³,
\[ {}^p(H\Delta L) = H({}^p(\Delta L)) = 0, \]
hence if
\[ \sqrt{\sum H^2 (\Delta L)^2} \in \mathcal{A}^+_{\mathrm{loc}}, \]
then there is one and only one purely discontinuous local martingale⁴⁴, denoted by $H \bullet L$, for which $\Delta(H \bullet L) = H\Delta L$. If one expects the properties
\[ H\Delta L = \Delta(H \bullet L) \quad\text{and}\quad (H \bullet L)^d = H \bullet L^d \]
from the stochastic integral $H \bullet L$, then this definition is the only possible one for $H \bullet L$.

Definition 4.37 If $L$ is a purely discontinuous local martingale then $H \bullet L$ is the stochastic integral of $H$ with respect to $L$.

Definition 4.38 If $L = L(0) + L^c + L^d$ is a local martingale and $H$ is a predictable process for which $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{\mathrm{loc}}$, then
\[ H \bullet L \triangleq H \bullet L^c + H \bullet L^d. \]
$H \bullet L$ is the stochastic integral of $H$ with respect to $L$.

⁴² See: (3.20), page 222.
⁴³ See: Proposition 3.37, page 201.
⁴⁴ See: Proposition 4.28, page 240.
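Definition 4.38 with $L^c = 0$ can be exercised on a compensated Poisson process: for the predictable integrand $H(s) = s$ one has $(H \bullet L)(T) = \sum_{\rho_i \le T} \rho_i - \lambda \int_0^T s\,ds$, which has zero mean. The following Monte Carlo sketch (hypothetical parameters, not from the text):

```python
import numpy as np

rng = np.random.default_rng(5)
lam, T, paths = 1.0, 1.0, 100_000

counts = rng.poisson(lam * T, paths)          # number of jumps on [0, T] per path
# sum of jump times per path (jump times of a Poisson process are uniform given the count)
total = np.array([rng.uniform(0.0, T, int(k)).sum() for k in counts])
HL = total - lam * T ** 2 / 2.0               # (H . L)(T) with H(s) = s

mean_HL = float(np.mean(HL))
```

The sample mean is statistically indistinguishable from zero, reflecting that $H \bullet L$ is again a (local) martingale.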
Example 4.39 If $X \in \mathcal{V}$ is predictable⁴⁵ and $L$ is a local martingale then
\[ \Delta X \bullet L = \sum \Delta X \Delta L. \]
1. The trajectories of $L$ are right-regular, therefore they are bounded on finite intervals⁴⁶. As $X \in \mathcal{V}$, obviously $\Delta L \bullet X$ exists and $\sum \Delta X \Delta L = \Delta L \bullet X$. $X$ is predictable and right-regular, therefore it is locally bounded⁴⁷. As $\mathrm{Var}(X)$ is also predictable and right-regular, it is also locally bounded.
2. $|\Delta X| \le \mathrm{Var}(X)$, which implies that $\Delta X \bullet L$ is well-defined. Let $L = L(0) + L^c + L^d$ be the decomposition of $L$. For any local martingale $N$, $\Delta X \bullet [L^c, N] = 0$, hence $\Delta X \bullet L^c = 0$. Therefore one can assume that $L$ is purely discontinuous.
\[ \sum |\Delta X \Delta L| \le \sqrt{\sum (\Delta X)^2}\sqrt{\sum (\Delta L)^2} \le \sqrt{[X]}\sqrt{[L]} < \infty. \]
Obviously $\Delta(\sum \Delta X \Delta L) = \Delta X \Delta L$. As $\sum \Delta X \Delta L$ has finite variation, if it is a local martingale then it is a purely discontinuous local martingale. Therefore we should prove that $\sum \Delta X \Delta L$ is a local martingale, hence we should prove that $\Delta L \bullet X$ is a local martingale.
3. With localization one can assume that $X$ and $\mathrm{Var}(X)$ are bounded. As $X$ and $\mathrm{Var}(X)$ are bounded,
\[ |\Delta L| \bullet \mathrm{Var}(X) = \sum |\Delta X||\Delta L| \le \sqrt{\sum (\Delta X)^2}\sqrt{[L]} \le \sqrt{\sup |X| \cdot \mathrm{Var}(X)}\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}. \]
Hence with further localization we can assume that $\Delta L \bullet X \in \mathcal{A}$. If $\tau$ is a stopping time then
\[ E((\Delta L \bullet X)(\tau)) = E((\Delta L \bullet X^{\tau})(\infty)). \]
As $X^{\tau}$ is also predictable⁴⁸, one should prove that if $\Delta L \bullet X \in \mathcal{A}$ and $X$ is predictable, then $E((\Delta L \bullet X)(\infty)) = 0$. By Dellacherie's formula⁴⁹, using that

⁴⁵ If $X$ is not predictable then $\Delta X$ is also not predictable, so $\Delta X \bullet L$ is undefined.
⁴⁶ See: Proposition 1.6, page 5.
⁴⁷ See: Proposition 3.35, page 200.
⁴⁸ See: Proposition 1.39, page 23.
⁴⁹ See: Proposition 5.9, page 301.
$L$ is a local martingale, hence ${}^p(\Delta L) = 0$:
\[ E((\Delta L \bullet X)(\infty)) = E\left(({}^p(\Delta L) \bullet X)(\infty)\right) = 0. \]
That is, $\Delta L \bullet X = \sum \Delta X \Delta L$ is a local martingale.

4.3.2 Properties of stochastic integration
Let us discuss the properties of stochastic integration with respect to local martingales:
1. If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}^+_{\mathrm{loc}}$ then the definition is meaningful and $H \bullet L \in \mathcal{L}$. Specifically, every locally bounded predictable process is integrable⁵⁰. For any local martingale $L$
\[ [L] = [L^c] + \sum (\Delta L)^2. \tag{4.10} \]
The integral $H^2 \bullet [L^c]$ is finite, hence the integral $H \bullet L^c$ exists⁵¹. By (4.10)
\[ \sqrt{H^2 \bullet [L^d]} = \sqrt{H^2 \bullet \sum (\Delta L)^2} = \sqrt{\sum (H\Delta L)^2} \in \mathcal{A}_{\mathrm{loc}}, \]
hence $H \bullet L^d$ is also meaningful. Both integrals are local martingales, hence the sum $H \bullet L \triangleq H \bullet L^c + H \bullet L^d$ is also a local martingale. The second observation easily follows from the relation $\sqrt{[L]} \in \mathcal{A}^+_{\mathrm{loc}}$.
2. $H\Delta L = \Delta(H \bullet L)$.
3. $(H \bullet L)^c = H \bullet L^c$ and $(H \bullet L)^d = H \bullet L^d$.
4. $[H \bullet L] = H^2 \bullet [L]$:
\[ [H \bullet L] = [(H \bullet L)^c] + \sum (\Delta(H \bullet L))^2 = H^2 \bullet [L^c] + \sum (H\Delta L)^2 = H^2 \bullet [L^c] + H^2 \bullet [L^d] = H^2 \bullet [L]. \]
5. $H \bullet L$ is the only process in $\mathcal{L}$ for which
\[ [H \bullet L, N] = H \bullet [L, N] \]
holds for every $N \in \mathcal{L}$. By the inequality of Kunita and Watanabe
\[ |H| \bullet \mathrm{Var}([L, N]) \le \sqrt{H^2 \bullet [L]}\sqrt{[N]}, \]

⁵⁰ $\sqrt{[M]} \in \mathcal{A}^+_{\mathrm{loc}}$ for any local martingale $M$, hence the present construction of $H \bullet L$ is maximal in $H$; that is, if one wants to extend the definition of the stochastic integral to a broader class of integrands $H$, then $H \bullet L$ will not necessarily be a local martingale.
⁵¹ See: Corollary 2.67, page 158.
hence the integral $H \bullet [L, N]$ is meaningful. Therefore
\[ [H \bullet L, N] = [(H \bullet L)^c, N^c] + [(H \bullet L)^d, N^d] = [H \bullet L^c, N^c] + \sum H\Delta L\Delta N = H \bullet [L^c, N^c] + [H \bullet L^d, N^d] = H \bullet [L^c, N^c] + H \bullet [L^d, N^d] = H \bullet [L, N]. \]
If $H \bullet [L, N] = [Y, N]$ for some local martingale $Y$, then $[Y - H \bullet L, N] = 0$. Hence if $N \triangleq Y - H \bullet L$ then $[Y - H \bullet L] = 0$. $Y - H \bullet L$ is a local martingale, therefore⁵² $Y - H \bullet L = 0$.
6. If $\tau$ is an arbitrary stopping time and $H \bullet L$ exists, then
\[ H \bullet L^{\tau} = (H \bullet L)^{\tau} = (\chi([0,\tau])H) \bullet L. \]
If $\sqrt{H^2 \bullet [L]} \in \mathcal{A}_{\mathrm{loc}}$, then trivially
\[ \sqrt{H^2 \bullet [L^{\tau}]} = \sqrt{\chi([0,\tau])H^2 \bullet [L]} \in \mathcal{A}_{\mathrm{loc}}, \]
so the integrals above exist. By the stopping rule of the quadratic variation, if $N \in \mathcal{L}$,
\[ [(H \bullet L)^{\tau}, N] = [(H \bullet L), N]^{\tau} = (H \bullet [L, N])^{\tau} = H \bullet [L, N]^{\tau} = H \bullet [L^{\tau}, N] = [H \bullet L^{\tau}, N], \]
hence by the bilinearity of the quadratic variation
\[ [(H \bullet L)^{\tau} - H \bullet L^{\tau}, N] = 0, \qquad N \in \mathcal{L}, \]
from which
\[ (H \bullet L)^{\tau} = H \bullet L^{\tau}. \]
For arbitrary $N \in \mathcal{L}$
\[ [H \bullet L^{\tau}, N] = H \bullet [L^{\tau}, N] = H \bullet [L, N]^{\tau} = (\chi([0,\tau])H) \bullet [L, N] = [(\chi([0,\tau])H) \bullet L, N], \]
hence again $H \bullet L^{\tau} = (\chi([0,\tau])H) \bullet L$ from Property 5.

⁵² See: Proposition 2.82, page 170.
7. The integral is linear in the integrand. By elementary calculation
\[ \sqrt{(H_1 + H_2)^2 \bullet [L]} \le \sqrt{H_1^2 \bullet [L]} + \sqrt{H_2^2 \bullet [L]}, \]
hence if $H_1 \bullet L$ and $H_2 \bullet L$ exist then the integral $(H_1 + H_2) \bullet L$ also exists. When the integrator is continuous the integral is linear. The linearity of the purely discontinuous part is a simple consequence of the relation
\[ (H_1 + H_2)\Delta L = H_1\Delta L + H_2\Delta L. \]
The proof of the homogeneity is analogous.
8. The integral is linear in the integrator. By the inequality of Kunita and Watanabe⁵³
\[ [L_1 + L_2] \le 2([L_1] + [L_2]), \]
hence if the integrals $H \bullet L_1$ and $H \bullet L_2$ exist then $H \bullet (L_1 + L_2)$ also exists. The decomposition of the local martingales into continuous and purely discontinuous martingales is unique, so $(L_1 + L_2)^c = L_1^c + L_2^c$ and $(L_1 + L_2)^d = L_1^d + L_2^d$. For continuous local martingales we have already proved the linearity; the linearity of the purely discontinuous part is evident from the relation $\Delta(L_1 + L_2) = \Delta L_1 + \Delta L_2$.
9. If $H \triangleq \sum_i \xi_i \chi((\tau_i, \tau_{i+1}])$ is an adapted simple process then
\[ (H \bullet L)(t) = \sum_i \xi_i (L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t)). \tag{4.11} \]
By the linearity it is sufficient to calculate the integral just for one jump. For the continuous part we have already deduced the formula. For the discontinuous part it is sufficient to remark that if $\xi_i$ is $\mathcal{F}_{\tau_i}$-measurable and $L$ is a purely discontinuous local martingale, then $\xi_i (L(\tau_{i+1} \wedge t) - L(\tau_i \wedge t))$ is a purely discontinuous local martingale⁵⁴, with jumps $\xi_i \chi((\tau_i, \tau_{i+1}])\Delta L$.
10. Assume that the integral $H \bullet L$ exists. The integral $K \bullet (H \bullet L)$ exists if and only if the integral $(KH) \bullet L$ exists. In this case
\[ (KH) \bullet L = K \bullet (H \bullet L). \]
Let us remark that, as the integrals are pathwise integrals with respect to processes with finite variation,
\[ \sqrt{K^2 \bullet (H^2 \bullet [L])} = \sqrt{(KH)^2 \bullet [L]}. \]
Corollary 2.36, page 137. space of purely discontinuous local martingales is closed under stopping.
STOCHASTIC INTEGRATION WITH RESPECT TO LOCAL MARTINGALES
253
K • (H • L) exists if and only if ) " " 2 K 2 • [H • L] = K 2 • (H 2 • [L]) = (KH) • [L] ∈ A+ loc , from which the first part is evident. If N is an arbitrary local martingale then [K • (H • L) , N ] = K • [H • L, N ] = KH • [L, N ] = = [KH • L, N ] , from which the second part is evident. 11. If τ is an arbitrary stopping time then τ
H • Lτ = (χ ([0, τ ]) H) • L = (H • L) . If N is an arbitrary local martingale, then τ
[H • Lτ , N ] = H • [Lτ , N ] = H • [L, N ] = = Hχ ([0, τ ]) • [L, N ] = = [Hχ ([0, τ ]) • L, N ] = τ
τ
= (H • [L, N ]) = [H • L, N ] = τ
= [(H • L) , N ] , from which the property is evident. 12. The Dominated Convergence Theorem is valid, that is if (Hn ) is a sequence of predictable processes, Hn → H∞ and there is a predictable process H, for which the integral H • L exists and |Hn | ≤ H then the integrals Hn • L also exist and Hn • L → H∞ • L, where the convergence is uniform in probability on the compact time-intervals. As Hn2 • [L] ≤ H 2 • [L] for all n ≤ ∞ the integrals Hn • L exist. By Davis’ inequality, for every stopping time τ %
2 τ E sup |((Hn − H∞ ) • L ) (t)| ≤ C · E (Hn − H∞ ) • [L] (∞) .
τ
t
"
τ H 2 • [L] m (∞) < ∞, hence by There is a localizing sequence (τ m ), that E the classical Dominated Convergence Theorem E
) 2 τ (Hn − H∞ ) • [L] m (∞) → 0
GENERAL THEORY OF STOCHASTIC INTEGRATION
hence

    sup_t |((H_n − H_∞) • L^{τ_m})(t)| → 0 in L¹,

from which, as in the continuous case⁵⁵, one can guarantee the uniform convergence in probability on every compact interval.

13. The definition of the integral is unambiguous, that is, if L ∈ V ∩ L then the two possible concepts of integration give the same result. This is trivial from Proposition 2.89.

14. If X is left-continuous and locally bounded then (X • L)(t) is an Itô–Stieltjes integral for every t, where the convergence of the approximating sums is uniform in probability on every compact interval. The approximating partitions can be random as well. The proof is the same as in the continuous case⁵⁶.
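The elementary formula (4.11) and the stopping rule in property 11 can be checked mechanically on a single discrete path. The sketch below uses assumed toy data (a deterministic path L sampled at integer times, deterministic times τ_i and values ξ_i), so it only illustrates the pathwise algebra, not the measurability assumptions.

```python
# Discrete sketch of formula (4.11) and of the stopping rule in property 11.
# All inputs below are assumed toy data.
taus = [0, 2, 5, 8]                  # tau_0 < tau_1 < ... (assumed deterministic)
xis  = [1.0, -2.0, 0.5]              # xi_i, "known at time tau_i"
L    = [0, 1, -1, 2, 1, 3, 2, 4, 5]  # a path of the integrator, L[t] = L(t)

def H_dot_L(path, t):
    """(H . L)(t) = sum_i xi_i (L(tau_{i+1} ^ t) - L(tau_i ^ t)), eq. (4.11)."""
    return sum(x * (path[min(b, t)] - path[min(a, t)])
               for x, a, b in zip(xis, taus, taus[1:]))

stop = 4                                         # a (deterministic) stopping time
L_stopped = [L[min(t, stop)] for t in range(len(L))]

# Property 11 pathwise: H . (L^tau) = (H . L)^tau.
for t in range(len(L)):
    assert H_dot_L(L_stopped, t) == H_dot_L(L, min(t, stop))
print("property 11 holds pathwise:", [H_dot_L(L, t) for t in range(len(L))])
```

Since H is constant between the τ_i, the stopped and unstopped transforms agree term by term, which is exactly why property 11 needs no extra argument for simple integrands.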
4.4 Stochastic Integration With Respect To Semimartingales
Recall the definition of stochastic integration with respect to semimartingales:

Definition 4.40 If the semimartingale X has a decomposition

    X = X(0) + L + V,    V ∈ V, L ∈ L,

for which the integrals H • L and H • V exist, then H • X ≜ H • L + H • V.

By Proposition 2.89 the next statement is trivial⁵⁷:

Proposition 4.41 For predictable integrands the definition is unambiguous, that is, the integral is independent of the decomposition of the integrator.

Proposition 4.42 If X and Y are arbitrary semimartingales and the integrals U • X and V • Y exist, then

    [U • X, V • Y] = UV • [X, Y].

Proof. Let X ≜ X_L + X_V and Y ≜ Y_L + Y_V be the decompositions of X and Y. Then

    [U • X, V • Y] = [U • X_L, V • Y_L] + [U • X_L, V • Y_V] + [U • X_V, V • Y_L] + [U • X_V, V • Y_V].

⁵⁵ See: Proposition 2.74, page 162. ⁵⁶ See: Proposition 2.77, page 166. ⁵⁷ See: Subsection 2.4.3, page 176.
For integrals with respect to local martingales

    [U • X_L, V • Y_L] = UV • [X_L, Y_L].

In the three other expressions one factor has finite variation, hence the quadratic co-variation is the sum of the products of the jumps⁵⁸. For example

    [U • X_L, V • Y_V] = Σ ∆(U • X_L) ∆(V • Y_V) = Σ (U ∆X_L)(V ∆Y_V).

On the other hand, for the same reason,

    UV • [X_L, Y_V] = UV • Σ ∆X_L ∆Y_V = Σ UV ∆X_L ∆Y_V,

hence [U • X_L, V • Y_V] = UV • [X_L, Y_V]. One can finish the proof with the same calculation for the other terms.

Observe that the existence of the integral H • X means that for some decomposition X = X(0) + L + V one can define the integral; it does not mean that in every decomposition of X the two integrals are meaningful. Observe also that with this definition we have extended the class of integrable processes even for local martingales: it is possible that the integral H • L, as an integral with respect to the local martingale L, does not exist, but L has a decomposition L = L(0) + M + V, M ∈ L, V ∈ V, for which H is integrable with respect to M and V. Of course in this general case we cannot guarantee that⁵⁹ H • L ∈ L.

Example 4.43 If the integrand is not locally bounded then the stochastic integral with respect to a local martingale is not necessarily a local martingale.
Let M be a compound Poisson process, where P(ξ_k = ±1) = 1/2 for the jumps ξ_k. M is a martingale and the trajectories of M are not continuous. Let τ₁ be the time of the first jump of M and let

    X(t, ω) ≜ (1/t) · χ((0, τ₁(ω)])(t).

⁵⁸ See: line (2.14), page 134. ⁵⁹ See: Example 4.43, page 255.
X is predictable but it is not locally bounded. As the trajectories of M have finite variation, the pathwise stochastic integral

    (X • M)(t, ω) = ∫_{(0,t]} (1/s) χ((0, τ₁(ω)])(s) dM(s, ω) ≜ L(t, ω),

which equals 0 if t < τ₁(ω) and ξ₁(ω)/τ₁(ω) if τ₁(ω) ≤ t, is meaningful. We prove that L is not a local martingale. If (ρ_k) were a localizing sequence for L then L^{ρ₁} would be a uniformly integrable martingale. Hence for the stopping time σ ≜ ρ₁ ∧ t

    E(L(σ)) = E(L(ρ₁ ∧ t)) = E(L^{ρ₁}(t)) = E(L(0)) = 0.

Therefore it is sufficient to prove that for any finite stopping time σ ≠ 0

    E(|L(σ)|) = ∞.    (4.12)

Let σ be a finite stopping time with respect to the filtration F generated by M. Then

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP ≥ ∫_Ω (1/τ₁) χ(τ₁ ≤ σ ∧ τ₁) dP.

Hence to prove (4.12) one can assume that σ ≤ τ₁. In this case σ is F_{τ₁}-measurable, hence it is independent of the variables (ξ_n). So one can assume that σ is a stopping time for the filtration generated by the point process part of M. By the formula for the representation of stopping times of point processes⁶⁰

    σ = φ₀ χ(σ < τ₁) + Σ_{n=1}^∞ χ(τ_n ≤ σ < τ_{n+1}) φ_n(τ₀, …, τ_n)
      = Σ_{n=0}^∞ χ(τ_n ≤ σ < τ_{n+1}) φ_n(τ₀, …, τ_n)
      = φ₀ χ(σ < τ₁) + χ(σ ≥ τ₁) φ₁(τ₁).

From this {τ₁ ≤ φ₀} ⊆ {τ₁ ≤ σ}. If φ₀ > 0 then, using that τ₁ has an exponential distribution,

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP ≥ ∫_Ω (1/τ₁) χ(τ₁ ≤ φ₀) dP = ∫_0^{φ₀} (1/x) λ exp(−λx) dx = ∞.

⁶⁰ See: Proposition C.6, page 581.
Since σ ≠ 0 and F₀ = {∅, Ω}, we have {σ ≤ 0} = ∅. Hence σ > 0, so if φ₀ = 0 then σ ≥ τ₁. Hence again

    E(|L(σ)|) = ∫_Ω (1/τ₁) χ(τ₁ ≤ σ) dP = ∫_Ω (1/τ₁) dP = ∞.

By the definition of the integral it is clear that if a process H is integrable with respect to semimartingales X₁ and X₂, then H is integrable with respect to aX₁ + bX₂ for any constants a, b and

    H • (aX₁ + bX₂) = a(H • X₁) + b(H • X₂).

Observe that by the above definitions the other additivity of the integral, that is, the relation

    (H₁ + H₂) • X = H₁ • X + H₂ • X,

is not clear. Our direct goal in the following two subsections is to prove this additivity property of the integral.

4.4.1 Integration with respect to special semimartingales
Recall that by definition S is a special semimartingale if it has a decomposition

    S = S(0) + V + L,    V ∈ V, L ∈ L,    (4.13)
where V is predictable.

Theorem 4.44 (Characterization of special semimartingales) Let S be a semimartingale. The next statements are equivalent:
1. S is a special semimartingale, i.e. there is a decomposition (4.13) where V is predictable.
2. There is a decomposition (4.13) where V ∈ A_loc.
3. For all decompositions (4.13), V ∈ A_loc.
4. S*(t) ≜ sup_{s≤t} |S(s) − S(0)| ∈ A⁺_loc.

Proof. We prove the equivalence of the statements backwards.
1. Let us assume that the last statement holds, and let S = S(0) + V + L be a decomposition of S. Let L*(t) ≜ sup_{s≤t} |L(s)|. L* is in⁶¹ A⁺_loc, hence from the assumption of the fourth statement

    V*(t) ≜ sup_{s≤t} |V(s)| ≤ S*(t) + L*(t) ∈ A⁺_loc.

⁶¹ See: Example 3.3, page 181.
The process Var(V)₋ is increasing and continuous from the left, hence it is locally bounded, so Var(V)₋ ∈ A⁺_loc. As

    Var(V) ≤ Var(V)₋ + ∆(Var(V)) ≤ Var(V)₋ + 2V*,

we have Var(V) ∈ A⁺_loc, hence the third condition holds.
2. From the third condition the second one follows trivially.
3. If V ∈ A_loc in the decomposition S = S(0) + V + L, then V^p, the predictable compensator of V, exists. V − V^p is a local martingale, hence

    S = S(0) + V^p + (V − V^p + L)

is a decomposition where V^p ∈ V is predictable, so S is a special semimartingale.
4. Let us assume that S(0) = 0, so S = V + L. If V*(t) ≜ sup_{s≤t} |V(s)|, then as V* ≤ Var(V),

    S* ≤ V* + L* ≤ Var(V) + L*.

L* ∈ A⁺_loc, so it is sufficient to prove that if V ∈ V is predictable then Var(V) ∈ A⁺_loc; for this it is sufficient to prove that Var(V) is locally bounded. V is continuous from the right, hence when one calculates Var(V) it suffices to use partitions with dyadic rational points, so if V is predictable then Var(V) is also predictable. Var(V) is right-continuous and predictable, hence it is locally bounded⁶².

Example 4.45 X ∈ V is a special semimartingale if and only if X ∈ A_loc. A compound Poisson process is a special semimartingale if and only if the expected value of the distribution of the jumps is finite.

The first remark is evident from the theorem. Recall that a compound Poisson process has locally integrable variation if and only if the distribution of the jumps has finite expected value⁶³.

Example 4.46 If a semimartingale S is locally bounded then S is a special semimartingale.

Example 4.47 If a semimartingale S has bounded jumps then S is a special semimartingale⁶⁴.
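To make Example 4.45 concrete, here is a sketch of the canonical decomposition in the compound Poisson case, assuming intensity λ and a jump distribution with finite mean μ ≜ E(ξ_k) (notation assumed, not from the text):

```latex
X(t) = \sum_{k=1}^{N(t)} \xi_k
     = \underbrace{\Bigl(\sum_{k=1}^{N(t)} \xi_k - \lambda\mu t\Bigr)}_{L\,\in\,\mathcal{L}}
     + \underbrace{\lambda\mu t}_{V\,\in\,\mathcal{V}\ \text{predictable}} .
```

When E|ξ_k| = ∞ the compensator λμt is not defined, the variation of X is not locally integrable, and by Theorem 4.44 no decomposition with predictable V can exist.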
⁶² See: Proposition 3.35, page 200. ⁶³ See: Example 3.2, page 180. ⁶⁴ See: Proposition 1.152, page 107.
Example 4.48 Decomposition of continuous semimartingales.
Recall that by definition S is a continuous semimartingale if S has a decomposition S = S(0) + V + L, where V ∈ V, L ∈ L and V and L are continuous⁶⁵. Let S now be a semimartingale and let us assume that its trajectories are continuous. As S is continuous it is locally bounded, so S is a special semimartingale. By the theorem just proved, S has a decomposition S(0) + V + L, where V ∈ V is predictable and L ∈ L. As S is continuous, L is also predictable, hence it is continuous⁶⁶. This implies that V is also continuous. This means that S is a continuous semimartingale.

The stochastic integral X • Y is always a semimartingale. One can ask: when is it a special semimartingale?

Theorem 4.49 (Integration with respect to special semimartingales) Let X be a special semimartingale. Assume that for a predictable process H the integral H • X exists. Let X ≜ X(0) + A + L be the canonical decomposition of X. H • X is a special semimartingale if and only if the integrals H • A and H • L exist and H • L is a local martingale. In this case the canonical decomposition of H • X is exactly H • A + H • L.

Proof. Let us first remark that if U and W are predictable, W ∈ V, and the integral U • W exists, then it is predictable. This is obviously true if

    U ≜ χ((s, t]) χ_F,    F ∈ F_s,

as⁶⁷

    U • W = χ_F (W^t − W^s) = (χ_F χ((s, ∞))) • (W^t − W^s).

The general case follows from the Monotone Class Theorem.

Assume that the integral⁶⁸ Z ≜ H • X ≜ H • V + H • M exists and it is a special semimartingale. Let Z ≜ B + N be the canonical decomposition of Z. B ∈ A_loc and B is predictable. χ(|H| ≤ n) is bounded and predictable, hence the integral

    χ(|H| ≤ n) • Z ≜ χ(|H| ≤ n) • B + χ(|H| ≤ n) • N

⁶⁵ See: Definition 2.18, page 124. ⁶⁶ See: Proposition 3.40, page 205. ⁶⁷ See: Proposition 1.39, page 23. ⁶⁸ With some decomposition X = X(0) + V + M.
exists. χ(|H| ≤ n) is bounded and B ∈ A_loc, hence χ(|H| ≤ n) • B ∈ A_loc. As χ(|H| ≤ n) and B are predictable, χ(|H| ≤ n) • B is also predictable. Let H_n ≜ Hχ(|H| ≤ n). H_n is bounded and predictable, hence the integral

    H_n • X ≜ H_n • A + H_n • L

is meaningful. H_n • A ∈ A_loc, H_n • A is predictable and H_n • L ∈ L, so H_n • X is a special semimartingale with canonical decomposition H_n • A + H_n • L. By the associativity rule of integration with respect to local martingales and processes with finite variation, and by the linearity in the integrator,

    χ(|H| ≤ n) • Z ≜ χ(|H| ≤ n) • (H • X) ≜ χ(|H| ≤ n) • (H • V + H • M)
      = χ(|H| ≤ n) • (H • V) + χ(|H| ≤ n) • (H • M)
      = (χ(|H| ≤ n)H) • V + (χ(|H| ≤ n)H) • M ≜ (χ(|H| ≤ n)H) • X ≜ H_n • X = H_n • A + H_n • L.

The canonical decomposition of special semimartingales is unique, hence

    χ(|H| ≤ n) • B = H_n • A,    χ(|H| ≤ n) • N = H_n • L.

As we have seen,

    (χ(|H| ≤ n)H)² • [L] ≜ H_n² • [L] = [H_n • L] = [χ(|H| ≤ n) • N] = χ(|H| ≤ n) • [N] ≤ [N].

√[N] ∈ A⁺_loc, so by the Monotone Convergence Theorem √(H² • [L]) ∈ A⁺_loc, and therefore the integral H • L ∈ L exists; by the Dominated Convergence Theorem N = H • L. Similarly H • A exists, it is in A_loc, and H • A = B. If H and A are predictable then H • A is predictable, hence the other implication is evident.

Corollary 4.50 Let L be a local martingale and let us assume that the integral H • L exists. H • L is a local martingale if and only if sup_{s≤t} |(H • L)(s)| is locally integrable, that is,

    sup_{s≤t} |(H • L)(s)| ∈ A⁺_loc.
Proof. As sup_{s≤t} |M(s)| is locally integrable⁶⁹ for every local martingale M ∈ L, one should only prove that if sup_{s≤t} |(H • L)(s)| is locally integrable then H • L is a local martingale. X ≜ L is a special semimartingale with canonical decomposition X = L + 0. Hence H • L is a local martingale if and only if Y ≜ H • L is a special semimartingale. But as Y(0) = 0, the process Y is a special semimartingale⁷⁰ if and only if sup_{s≤t} |Y(s)| ∈ A⁺_loc.

4.4.2 Linearity of the stochastic integral
The most important property of every integral is the linearity in the integrand. Now we are ready to prove this important property:

Theorem 4.51 (Additivity of stochastic integration) Let X be an arbitrary semimartingale. If H₁ and H₂ are predictable processes and the integrals H₁ • X and H₂ • X exist, then for arbitrary constants a and b the integral (aH₁ + bH₂) • X exists and

    (aH₁ + bH₂) • X = a(H₁ • X) + b(H₂ • X).    (4.14)

Proof. Let

    B ≜ {|∆X| > 1, |∆(H₁ • X)| > 1, |∆(H₂ • X)| > 1}

be the set of the 'big jumps'. Observe that

    ∆(H_i • X) ≜ ∆(H_i • V_i + H_i • L_i) = ∆(H_i • V_i) + ∆(H_i • L_i) = H_i ∆V_i + H_i ∆L_i = H_i ∆X,

so

    B = {|∆X| > 1, |H₁ ∆X| > 1, |H₂ ∆X| > 1}.

Obviously, for an arbitrary ω the section B(ω) does not have an accumulation point. Let us separate the 'big jumps' from X: let

    X̂(t) ≜ Σ_{s≤t} ∆X(s) χ_B(s),    X̃ ≜ X − X̂.

Observe that, by the simple structure of B, X̂ ∈ V and the integrals H_k • X̂ are simple sums, so they exist. By the construction of the stochastic integral

⁶⁹ See: Example 3.3, page 181. ⁷⁰ See: Theorem 4.44, page 257.
H_k • X̃ also exists⁷¹. As the jumps of X̃ are bounded, X̃ is a special semimartingale⁷². Since

    ∆(H_k • X̃) = H_k ∆X̃ = H_k ∆(X − X̂) = H_k ∆X χ_{Bᶜ},

the jumps of H_k • X̃ are also bounded, and therefore the processes H_k • X̃ are also special semimartingales. Let X̃ = X̃(0) + A + L be the canonical decomposition of X̃. By the previous theorem the integrals H_k • A and H_k • L also exist. Integration with respect to local martingales and with respect to processes with finite variation is additive, hence

    (H₁ + H₂) • A = H₁ • A + H₂ • A,    (H₁ + H₂) • L = H₁ • L + H₂ • L,

which of course means that the integrals on the left-hand side exist. The integrals H_k • X̂ are ordinary sums, hence

    (H₁ + H₂) • X̂ = H₁ • X̂ + H₂ • X̂.

Adding up the three lines above and using that the integral is additive in the integrator we get (4.14). The homogeneity of the integral is obvious from the definition of the integral.

4.4.3 The associativity rule

Like additivity, the associativity rule is also not directly evident from the definition of the stochastic integral.

Theorem 4.52 (Associativity rule) Let X be an arbitrary semimartingale and let us assume that the integral H • X exists. The integral K • (H • X) exists if and only if the integral (KH) • X exists. In this case

    K • (H • X) = (KH) • X.

⁷¹ H² • [L̃] ≤ H² • [L] and Var(Ṽ) ≤ Var(V)! ⁷² See: Example 4.47, page 258.
Proof. Assume that K is integrable with respect to the semimartingale Y ≜ H • X. Let B again be the set of the 'big jumps', that is,

    B ≜ {|∆X| > 1, |∆Y| > 1, |∆(K • Y)| > 1}.

As in the previous subsection, for every ω the section B(ω) is a discrete set. Let us define the processes

    X̂ ≜ Σ χ_B ∆X,    X̃ ≜ X − X̂,    Ŷ ≜ Σ χ_B ∆Y,    Ỹ ≜ Y − Ŷ.

Using the formula for the jumps of the integrals and the additivity of the integral in the integrator,

    Ỹ = Y − Ŷ = H • X − H • X̂ = H • X̃.

As the jumps of X̃ are bounded, X̃ is a special semimartingale. Let X̃ = X̃(0) + A + L be the canonical decomposition of X̃. For the same reason Ỹ is also a special semimartingale, and as we saw above the canonical decomposition of Ỹ is

    Ỹ = H • X̃ = H • A + H • L.

The integral K • Ŷ on any finite interval is a finite sum, hence if K • Y exists then K • Ỹ also exists. Since

    ∆(K • Ỹ) = K ∆Ỹ = K ∆Y χ_{Bᶜ},

the jumps of K • Ỹ are bounded, so K • Ỹ is also a special semimartingale. Therefore the integrals K • (H • A) and K • (H • L) exist and K • (H • L) is a local martingale. By the associativity rule for local martingales and for processes with finite variation

    K • (H • A) = (KH) • A,    K • (H • L) = (KH) • L.
Adding up the corresponding lines,

    K • Y = K • Ỹ + K • Ŷ = K • (H • A + H • L) + K • (H • X̂)
          = (KH) • A + (KH) • L + (KH) • X̂ = (KH) • X̃ + (KH) • X̂ = (KH) • X.

The proof of the reverse implication is similar. Assume that the integrals Y ≜ H • X and (KH) • X exist, and let

    B ≜ {|∆X| > 1, |∆Y| > 1, |∆((KH) • X)| > 1}.

In this case

    H • X̃ = H • A + H • L,
    (KH) • X̃ = (KH) • A + (KH) • L = K • (H • A) + K • (H • L),

where of course the integrals exist. (KH) • X̂ is again a simple sum, therefore

    (KH) • X = (KH) • X̃ + (KH) • X̂
             = K • (H • A) + K • (H • L) + K • (H • X̂)
             = K • (H • A + H • L + H • X̂)
             = K • (H • (A + L + X̂)) = K • (H • X).
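Both linearity (Theorem 4.51) and the associativity rule can be sanity-checked on a single discrete finite-variation path, where every integral reduces to a sum of products. The data below are assumed toy inputs; the check illustrates only the pathwise algebra.

```python
# Discrete sanity check of (KH) . X = K . (H . X) on one path (toy data).
dX = [0.0, 1.0, -2.0, 0.5, 3.0, -1.0]   # increments of X at times 0..5
H  = [0.0, 2.0, -1.0, 0.0, 1.5, 4.0]    # integrand values at those times
K  = [0.0, 1.0, 3.0, -2.0, 0.5, 1.0]

def integrate(integrand, increments):
    """Pathwise integral: returns the increments of (G . Z)."""
    return [g * dz for g, dz in zip(integrand, increments)]

d_HX = integrate(H, dX)                                  # increments of H . X
lhs  = integrate(K, d_HX)                                # K . (H . X)
rhs  = integrate([k * h for k, h in zip(K, H)], dX)      # (KH) . X
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("associativity increments:", lhs)
```

The whole difficulty of Theorem 4.52 is, of course, that for a general semimartingale the integral is not this pathwise sum; the discrete identity only shows what the theorem asserts.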
4.4.4 Change of measure
In this subsection we discuss the behaviour of the stochastic integral when we change the measure on the underlying probability space.

Definition 4.53 Let P and Q be two probability measures on a measurable space (Ω, A), and let us fix a filtration F. If Q is absolutely continuous with respect to P on the measurable space (Ω, F_t) for every t, then we say that Q is locally absolutely continuous with respect to P. In this case we shall use the notation Q ≪_loc P.
If Q ≪_loc P then one can define the Radon–Nikodym derivatives

    Λ(t) ≜ dQ(t)/dP(t),

where Q(t) is the restriction of Q and P(t) is the restriction of P to F_t. If s < t and F ∈ F_s then

    ∫_F Λ(t) dP = ∫_F (dQ(t)/dP(t)) dP = Q(t)(F) = Q(s)(F) = ∫_F (dQ(s)/dP(s)) dP = ∫_F Λ(s) dP.

If the filtration F satisfies the usual conditions then the process Λ has a modification which is a martingale. As Λ(t) is defined only up to a set of measure zero, one can assume that the Radon–Nikodym process Λ is a martingale.
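The density process is easy to see on a finite example. The sketch below (all parameters assumed) builds Λ(t) = dQ(t)/dP(t) on a two-period binary tree by summing both measures over the atoms of F_t, and checks the martingale property under P together with the change-of-measure identity E_Q(f) = E_P(Λ f).

```python
from itertools import product

p_P, p_Q = 0.5, 0.7   # assumed up-probabilities under P and Q

omega = list(product([0, 1], repeat=2))           # two i.i.d. coin flips
P = {w: p_P**sum(w) * (1 - p_P)**(2 - sum(w)) for w in omega}
Q = {w: p_Q**sum(w) * (1 - p_Q)**(2 - sum(w)) for w in omega}

def Lam(t, w):
    """Lambda(t) = dQ(t)/dP(t): ratio of the measures on the F_t-atom of w."""
    atom = [v for v in omega if v[:t] == w[:t]]
    return sum(Q[v] for v in atom) / sum(P[v] for v in atom)

# Martingale property under P: E_P(Lambda(2) | F_1) = Lambda(1) on every atom.
for a in (0, 1):
    atom = [v for v in omega if v[0] == a]
    pa = sum(P[v] for v in atom)
    cond = sum(P[v] * Lam(2, v) for v in atom) / pa
    assert abs(cond - Lam(1, atom[0])) < 1e-12

# Change of measure: E_Q(f) = E_P(Lambda(2) f) for F_2-measurable f.
f = lambda w: 3 * w[0] - w[1]
assert abs(sum(Q[w] * f(w) for w in omega)
           - sum(P[w] * Lam(2, w) * f(w) for w in omega)) < 1e-12
print("density process checks passed")
```

Restricting a measure to F_t corresponds exactly to summing over the F_t-atoms, which is why Λ(t) is constant on each atom.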
Lemma 4.54 If Q ≪_loc P and σ is a bounded stopping time then Λ(σ) is the Radon–Nikodym derivative dQ/dP on the σ-algebra F_σ. If Λ is uniformly integrable then this is true for any stopping time σ.

Proof. If σ is a bounded stopping time and σ ≤ t, then by the Optional Sampling Theorem, since Λ is a martingale,

    Λ(σ) = E(Λ(t) | F_σ).

That is, if F ∈ F_σ ⊆ F_t then

    ∫_F Λ(σ) dP = ∫_F Λ(t) dP = Q(t)(F) = Q(F).
As Λ is not always a uniformly integrable martingale⁷³, the lemma is not valid for an arbitrary stopping time σ. Since Λ is non-negative, Λ(t) → Λ(∞) almost surely, where Λ(∞) ≥ 0 is an integrable⁷⁴ variable. By Fatou's lemma

    Λ(t) = E(Λ(N) | F_t) = liminf_{N→∞} E(Λ(N) | F_t) ≥ E(liminf_{N→∞} Λ(N) | F_t) = E(Λ(∞) | F_t).

Hence the extended process is a non-negative, integrable supermartingale on [0, ∞]. By the Optional Sampling Theorem for Submartingales⁷⁵, if σ ≤ τ are

⁷³ See: Example 6.34, page 384. ⁷⁴ See: Corollary 1.66, page 40. ⁷⁵ See: Proposition 1.88, page 54.
arbitrary stopping times then

    Λ(σ) ≥ E(Λ(τ) | F_σ).    (4.15)

Let us introduce the stopping time τ ≜ inf{t : Λ(t) = 0}. Let L be a local martingale and let

    U ≜ ∆L(τ) χ([τ, ∞)).

As L is a local martingale, U ∈ A_loc, so U has a compensator U^p. With this notation we have the following theorem:
Proposition 4.55 Let Q ≪_loc P and let

    Λ(t) ≜ dQ(t)/dP(t).

Then Λ⁻¹ is meaningful and right-regular⁷⁶ under Q. If L is a local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ] + U^p

is a local martingale⁷⁷ under the measure Q.

Proof. We divide the proof into several steps.
1. First we show that Λ > 0 almost surely under Q. Let τ ≜ inf{t : Λ(t) = 0}. Λ is right-continuous, so if τ(ω) < ∞ then Λ(τ(ω), ω) = 0. If 0 ≤ q ∈ ℚ then τ + q ≥ τ. Hence by (4.15)

    Λ(τ) χ(τ < ∞) ≥ χ(τ < ∞) · E(Λ(τ + q) | F_τ) = E(Λ(τ + q) χ(τ < ∞) | F_τ).

Taking expected values,

    0 ≥ E(Λ(τ + q) χ(τ < ∞)) ≥ 0.

⁷⁶ That is, Λ⁻¹ is almost surely finite and right-regular with respect to Q; that is, Λ > 0 a.s. with respect to Q. In this case Λ⁻¹ = Λ̃ under Q. See: (4.18). ⁷⁷ More precisely, L̃ is indistinguishable from a local martingale under Q.
Hence Λ(τ + q) = 0 almost surely on the set {τ < ∞} for any q ∈ ℚ. As Λ is right-continuous, outside a set with P-measure zero, if τ(ω) ≤ t < ∞ then Λ(t, ω) = 0. Therefore

    Q(t)({Λ(t) = 0}) = ∫_{{Λ(t)=0}} (dQ(t)/dP) dP = ∫_{{Λ(t)=0}} Λ(t) dP = 0,

so Λ(t) > 0 almost surely with respect to Q(t). Hence

    Q(Λ(t) = 0 for some t) = Q(τ < ∞) = Q(∪_n {Λ(n) = 0}) ≤ Σ_{n=1}^∞ Q(Λ(n) = 0) = Σ_{n=1}^∞ Q(n)(Λ(n) = 0) = 0.
Hence Λ⁻¹ is meaningful and Λ⁻¹ > 0 almost surely under Q. We prove that Λ₋ is also almost surely positive with respect to Q. Let

    ρ ≜ inf{t : Λ₋(t) = 0},    ρ_n ≜ inf{t : Λ(t) ≤ 1/n}.

As Λ is right-regular, Λ(ρ_n) ≤ 1/n. Obviously, on the set {ρ < ∞},

    lim_{n→∞} Λ(ρ_n) = Λ(ρ−) = 0.

By (4.15), for any positive rational number q,

    Λ(ρ_n) χ(ρ_n < ∞) ≥ E(Λ(ρ_n + q) χ(ρ_n < ∞) | F_{ρ_n}).

Taking expected values,

    1/n ≥ E(Λ(ρ_n + q) χ(ρ_n < ∞)) ≥ 0.

By Fatou's lemma E(Λ((ρ + q)−) χ(ρ < ∞)) = 0. Hence for every q ≥ 0, almost surely,

    Λ((ρ + q)−) χ(ρ < ∞) = 0.    (4.16)
Hence, outside a set with P-measure zero, if ρ(ω) ≤ t < ∞ then Λ₋(t, ω) = 0; hence if ρ(ω) < t < ∞ then Λ(t, ω) = 0. Therefore τ(ω) ≤ ρ(ω) and

    Q(t)({Λ₋(t) = 0}) ≤ Q(t)({ρ ≤ t}) = ∫_{{ρ≤t}} Λ(t) dP ≤ ∫_{{τ≤t}} Λ(t) dP = 0.

With the same argument as above one can easily prove that Q(Λ₋(t) = 0 for some t) = 0. If for some ω the trajectories Λ(ω) and Λ₋(ω) are positive, then, as Λ(ω) is right-regular, Λ⁻¹(ω) is also right-regular; therefore it is bounded on any finite interval⁷⁸. Hence if V ∈ V then Λ⁻¹ • V is well-defined and Λ⁻¹ • V ∈ V under Q.

2. Assume that for some right-regular, adapted process N the product NΛ is a local martingale under P. We show that N is a local martingale under Q. Let σ be a stopping time and let us assume that the truncated process (ΛN)^σ is a martingale⁷⁹ under P. If F ∈ F_{σ∧t} and r ≥ t, then
    ∫_F N^σ(t) dQ = ∫_F N^σ(t) Λ^σ(t) dP = ∫_F N^σ(r) Λ^σ(r) dP = ∫_F N^σ(r) dQ.

Hence N^σ is a martingale under Q with respect to the filtration (F_{σ∧t})_t. We show that it is a martingale under Q with respect to the filtration F. Let ρ be a bounded stopping time under F. We show that τ ≜ ρ ∧ σ is a stopping time under (F_{σ∧t})_t. One should show that {ρ ∧ σ ≤ t} ∈ F_{σ∧t}. By definition this means that {ρ ∧ σ ≤ t} ∩ {σ ∧ t ≤ r} ∈ F_r. If t ≤ r then this is true, as ρ ∧ σ and σ ∧ t are stopping times. If t > r then the set above is {σ ≤ r} ∈ F_r. By the Optional Sampling Theorem, using that τ ≜ ρ ∧ σ is a stopping time under (F_{σ∧t})_t and N^σ is a Q-martingale under this filtration,

    ∫_Ω N^σ(0) dQ = ∫_Ω N^σ(τ) dQ = ∫_Ω N^σ(ρ) dQ.

⁷⁸ See: Proposition 1.6, page 5. ⁷⁹ See: Lemma 4.54, page 265.
This implies that N^σ is a martingale under Q. Hence N is a local martingale under Q.

3. To simplify the notation let L(0) = 0, from which L̃(0) = 0. Integrating by parts,

    LΛ = L₋ • Λ + Λ₋ • L + [L, Λ].    (4.17)

Λ and L are local martingales under P, so the stochastic integrals on the right-hand side are local martingales under P. Let

    ã ≜ a⁻¹ if a > 0, and ã ≜ 0 if a = 0,    (4.18)

and let

    A ≜ Λ̃ • [L, Λ].    (4.19)
A is almost surely finite under Q, as Λ > 0 and Λ₋ > 0 almost surely under Q. But we are now defining A under P, and with positive probability Λ̃ can be unbounded on some finite intervals under P; hence we do not know that A is well-defined under P. To solve this problem, let us observe that (ρ_n) in (4.16) is a localizing sequence under Q, and one can localize L̃. So it is sufficient to prove that (L̃)^{ρ_n} = (L^{ρ_n})˜ is a local martingale under Q for every n. For L^{ρ_n} the integral (4.19) is well-defined, so one can assume that A is finite. Again integrating by parts, noting that Λ is right-continuous,

    ΛA = A₋ • Λ + Λ₋ • A + [A, Λ] = A₋ • Λ + Λ₋ • A + Σ ∆A ∆Λ
       = A₋ • Λ + Λ₋ • A + ∆Λ • A = A₋ • Λ + Λ • A
       = A₋ • Λ + ΛΛ̃ • [L, Λ] = A₋ • Λ + χ(Λ > 0) • [L, Λ].

Finally⁸⁰

    ΛU^p = U^p₋ • Λ + Λ₋ • U^p + [U^p, Λ] = U^p₋ • Λ + Λ₋ • U^p + Σ ∆U^p ∆Λ
         = U^p₋ • Λ + Λ₋ • U^p + ∆U^p • Λ = U^p • Λ + Λ₋ • U^p
         = U^p • Λ − Λ₋ • (U − U^p) + Λ₋ • U.

The stochastic integrals with respect to local martingales are local martingales, and the sum of local martingales is a local martingale, so

    ΛL̃ = ΛL − ΛA + ΛU^p = local martingale + [L, Λ] − χ(Λ > 0) • [L, Λ] + Λ₋ • U.

Observe that the last terms are

    χ(Λ = 0) • [L, Λ] + Λ₋ • U = χ(t ≥ τ) • [L, Λ] + Λ₋(τ) ∆L(τ) χ(t ≥ τ)
      = χ(t ≥ τ) ∆L(τ) ∆Λ(τ) + Λ₋(τ) ∆L(τ) χ(t ≥ τ) = Λ(τ) ∆L(τ) χ(t ≥ τ) = 0,

where we have used that [L, Λ] is constant⁸¹ on {t ≥ τ} and that Λ(τ) = Λ₋(τ) + ∆Λ(τ) = 0 on {τ < ∞}. Hence ΛL̃ is a local martingale under P. So by the second part of the proof L̃ is a local martingale under Q.

⁸⁰ See: Example 4.39, page 249.
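A minimal one-period discrete sketch makes the correction term in Proposition 4.55 visible (everything below is an assumed toy model, not the book's construction): under P the jump of L is ±1 with probability 1/2, Q tilts the up-probability to q, Λ never vanishes, so U^p = 0 and the claim reduces to L̃ = L − Λ⁻¹ • [L, Λ] having Q-mean zero.

```python
# One-period sanity check of Proposition 4.55 on a two-point space (toy model).
q = 0.7                                   # assumed Q up-probability
P = {+1: 0.5, -1: 0.5}
Q = {+1: q, -1: 1 - q}
Lam0 = 1.0
Lam1 = {w: Q[w] / P[w] for w in P}        # density dQ(1)/dP(1)

def Ltilde1(w):
    dL, dLam = w, Lam1[w] - Lam0          # jumps of L and of Lambda
    # L - Lambda^{-1} . [L, Lambda]; the integrand is evaluated at the jump time
    return w - (1.0 / Lam1[w]) * dL * dLam

EQ = sum(Q[w] * Ltilde1(w) for w in P)
assert abs(EQ) < 1e-12                    # Ltilde is a Q-martingale over one step
print("one-period Girsanov check passed:", EQ)
```

Note that L itself is *not* a Q-martingale here (E_Q(L(1)) = 2q − 1 ≠ 0 for q ≠ 1/2); the compensating term Λ⁻¹ • [L, Λ] removes exactly this drift.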
Corollary 4.56 Let Q ≪_loc P and P ≪_loc Q, that is, let us assume that Q ∼_loc P. If

    Λ(t) ≜ dQ(t)/dP(t),

then Λ > 0. If L is a local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.

Corollary 4.57 Let Q ≪_loc P. If

    Λ(t) ≜ dQ(t)/dP(t)

and L is a continuous local martingale under the measure P, then the integral Λ⁻¹ • [L, Λ] has finite variation on compact intervals under the measure Q and

    L̃ ≜ L − Λ⁻¹ • [L, Λ]

is a local martingale under the measure Q.

⁸¹ See: Corollary 2.49, page 145.
If V ∈ V under P and Q ≪_loc P, then obviously V ∈ V under Q. Hence the proof of the following observation is trivial:

Corollary 4.58 If X is a semimartingale under P and Q ≪_loc P, then X is a semimartingale under Q.

Let V ∈ V and assume that the integral H • V exists under the measure P. By definition this means that the pathwise integrals (H • V)(ω) exist almost surely under P. If Q ≪_loc P then the integral H • V exists under the measure Q as well, and the values of the two processes are almost surely the same under Q. It is not too surprising that this is true for any semimartingale.

Proposition 4.59 Let X be an arbitrary semimartingale and let H be a predictable process. Assume that the integral H • X exists under the measure P. If Q ≪_loc P then the integral H • X exists under the measure Q as well, and the two integral processes are indistinguishable under the measure Q.

Proof. By the remark above it is obviously sufficient to prove the proposition when X ∈ L under P. It is also sufficient to prove that for every T > 0 the two integrals exist on the interval [0, T] and they are almost surely equal.

1. Let X = Xᶜ + Xᵈ be the decomposition of X into continuous and purely discontinuous local martingales. As the time horizon is finite, Λ is a uniformly integrable martingale. Recall that if L is a local martingale under the measure P then

    L̃ ≜ L − Λ⁻¹ • [L, Λ] + U^p    (4.20)
is a local martingale under the measure Q, and if L is continuous then U^p can be dropped. Now

    X − Λ⁻¹ • [X, Λ] + U^p = Xᶜ + Xᵈ − Λ⁻¹ • [Xᶜ + Xᵈ, Λ] + U^p
      = (Xᶜ − Λ⁻¹ • [Xᶜ, Λ]) + (Xᵈ − Λ⁻¹ • [Xᵈ, Λ] + U^p).

By (4.20) the processes

    X̃ᶜ ≜ Xᶜ − Λ⁻¹ • [Xᶜ, Λ]    and    X̃ᵈ ≜ Xᵈ − Λ⁻¹ • [Xᵈ, Λ] + U^p

are local martingales under the measure Q. Xᶜ is continuous, hence the quadratic co-variation [Xᶜ, Λ] is also continuous⁸². Hence X̃ᶜ is continuous. If W and V

⁸² See: line (3.19), page 222.
are pure quadratic jump processes then

    [W + V] = [W] + 2[W, V] + [V] = Σ (∆W)² + 2Σ ∆W ∆V + Σ (∆V)² = Σ (∆(W + V))²,

hence W + V is also a pure quadratic jump process. Processes with finite variation are pure quadratic jump processes⁸³, hence X̃ᵈ is a pure quadratic jump process under P. Under the change of measure the quadratic variation does not change, hence X̃ᵈ is a pure quadratic jump process under Q. Hence X̃ᵈ is a purely discontinuous local martingale under Q. We want to show that H • X̃ exists under Q; this means that H • X̃ exists on (0, t] for every t. To prove this one need only prove that the integrals H • X̃ᶜ and H • X̃ᵈ exist under Q.

2. X̃ᶜ is a continuous local martingale, hence H • X̃ᶜ exists under Q if and only if … Λ(τ)) ≤ c.

5. Z₋ is locally bounded. Let (ρ_n) be a localizing sequence of Z₋ and let

    τ_n ≜ inf{s : Λ(s) > n} ∧ ρ_n ∧ n.    (4.21)
τ_n is a bounded stopping time and if s < τ_n(ω) then Λ(s, ω) ≤ n. Hence, using the estimate just proved,

    E_Q(Z(τ_n−)) = E(Z(τ_n−) dQ/dP) = E(Z(τ_n−) E(dQ/dP | F_{τ_n})) = E(Z(τ_n−) Λ(τ_n))
      ≤ k_n · E(Λ(τ_n)) = k_n · E(χ(τ_n > 0) Λ(τ_n) + χ(τ_n = 0) Λ(τ_n)) ≤ k_n · (n + E(Λ(0))) < ∞.

⁸⁴ See: Corollary 1.87, page 54.
6. We show that ∆U^p = 0. The stopping time τ can be covered by its predictable and totally inaccessible parts, so one can assume that τ is either totally inaccessible or predictable. If τ is predictable then χ([τ]) is predictable, therefore

    ∆(U^p) = ^p(∆U) = ^p(∆X(τ) χ([τ])) = ^p(∆X · χ([τ])) = (^p∆X) · χ([τ]) = 0 · χ([τ]) = 0.

If τ is totally inaccessible then P(τ = σ) = 0 for every predictable stopping time σ, hence

    ^p(∆Xχ([τ]))(σ) ≜ E((∆Xχ([τ]))(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆U^p = ^p(∆Xχ([τ])) = 0. Therefore in both cases ∆U^p = 0.
7. X̃ᵈ is purely discontinuous, hence [X̃ᵈ] = Σ (∆X̃ᵈ)² and

    ∆X̃ᵈ = ∆Xᵈ − Λ̃ ∆[Xᵈ, Λ] + ∆U^p.

Since ∆U^p = 0,

    ∆X̃ᵈ = ∆Xᵈ − Λ̃ · ∆Xᵈ ∆Λ = ∆Xᵈ (1 − Λ̃ · ∆Λ) = ∆Xᵈ (χ(Λ = 0) + Λ̃ · Λ₋).

√(H² • [Xᵈ]) ∈ A_loc under P. One can assume that τ_n localizes √(H² • [Xᵈ]) in (4.21). Therefore one may assume that

    E(√((H² • [Xᵈ])(τ_n))) < ∞,

hence

    E(|∆Xᵈ(τ_n) H(τ_n)|) ≤ E(√((H² • [Xᵈ])(τ_n))) < ∞.

Using this,
    E_Q(√(∆(H² • [X̃ᵈ])(τ_n))) = E(√(H² (∆X̃ᵈ)²)(τ_n) Λ(τ_n)) = E(|H ∆X̃ᵈ|(τ_n) Λ(τ_n))
      = E(|H ∆Xᵈ|(τ_n) (χ(Λ = 0) + Λ̃ Λ₋)(τ_n) Λ(τ_n))
      = E(|H ∆Xᵈ|(τ_n) (Λ̃ Λ₋ Λ)(τ_n))
      ≤ E(|H(τ_n) ∆Xᵈ(τ_n)| Λ(τ_n−)) ≤ n · E(|∆Xᵈ(τ_n) H(τ_n)|) < ∞.
8. As √(x + y) ≤ √x + √y,

    E_Q(Z(τ_n)) ≜ E_Q(√((H² • [X̃ᵈ])(τ_n))) ≤ E_Q(Z(τ_n−)) + E_Q(√(∆(H² • [X̃ᵈ])(τ_n))) < ∞.
Therefore Z ∈ A_loc under the measure Q.

9. Let us consider the decomposition

    X = X̃ + (Λ̃ • [X, Λ] − U^p) ≜ X̃ + A − U^p

and let us assume that the integral H • X exists under the measure P. As the integral H • X̃ exists under Q, one should prove that the Lebesgue–Stieltjes integrals H • A and H • U^p also exist. By the inequality of Kunita and Watanabe

    ∫₀ᵀ |H| dVar(A) = ∫₀ᵀ |H| Λ̃ dVar([X, Λ]) ≤ √(∫₀ᵀ |H|² Λ̃ d[X]) √(∫₀ᵀ Λ̃ d[Λ]) = √(∫₀ᵀ Λ̃ d(|H|² • [X])) √(∫₀ᵀ Λ̃ d[Λ]).

Λ > 0 and Λ₋ > 0 almost surely under Q, that is, almost all trajectories of Λ and Λ₋ are positive⁸⁵, hence Λ̃ has regular trajectories almost surely under Q. Hence almost surely the trajectories of Λ̃ are bounded on every finite interval, therefore the expression ∫₀ᵀ Λ̃ d[Λ] is finite. Similarly, as H • X exists, R ≜ |H|² • [X] ∈ V, hence Λ̃ • R is finite under Q. That is, for every trajectory ∫₀ᵀ |H| dVar(A) < ∞, hence H • A exists under Q. Let σ be a stopping time in a localizing sequence of √(H² • [X]). Then

    E((|H| • U^p)(σ)) = E((|H| • U)(σ)) ≤ E(√((H² • [X])(σ))) < ∞.

Hence H • U^p is almost surely finite under P, so it is almost surely finite under Q. Therefore the integral H • X exists under Q.

10. Let us denote by (P) H • X and by (Q) H • X the value of H • X under P and under Q respectively. Let us denote by H the set of processes H for

⁸⁵ See: Proposition 4.55, page 266.
which (P) H • X and (Q) H • X are indistinguishable under Q. From the Dominated Convergence Theorem and from the linearity of the stochastic integral it is obvious that H is a λ-system which contains the π-system of the elementary processes. From the Monotone Class Theorem it is clear that H contains all the bounded predictable processes.

11. If H_n ≜ Hχ(|H| ≤ n) then H_n is bounded, hence the value of the integral (P) H_n • X is Q-almost surely equal to the integral (Q) H_n • X. As H • X exists under P and under Q, by the Dominated Convergence Theorem, uniformly in probability on compact intervals, (P) H_n • X → (P) H • X and (Q) H_n • X → (Q) H • X. Stochastic convergence under P implies⁸⁶ stochastic convergence under Q, hence (P) H • X = (Q) H • X almost surely under Q.

Let us prove some consequences of the proposition. During the construction of the stochastic integral we emphasized that we cannot define the integral pathwise. But this does not mean that the integral is not determined by the trajectories of the integrator and the integrand.

Corollary 4.60 Let X and X′ be semimartingales. Assume that for the predictable processes H and H′ the integrals H • X and H′ • X′ exist. If

    A ≜ {ω : H(ω) = H′(ω)} ∩ {ω : X(ω) = X′(ω)},

then the processes H • X and H′ • X′ are indistinguishable on A.

Proof. One may assume that P(A) > 0. Define the measure

    Q(B) ≜ P(A ∩ B) / P(A).

Obviously Q ≪ P. The processes H, H′ and X, X′ are indistinguishable under Q. Hence the processes (Q) H • X and (Q) H′ • X′ are indistinguishable under Q. By the proposition above, under Q, up to indistinguishability,

    (P) H • X = (Q) H • X = (Q) H′ • X′ = (P) H′ • X′,

which means that (P) H • X = (P) H′ • X′ on A.

The proof of the following corollary is similar:

Corollary 4.61 Let X be a semimartingale and let us assume that the integral H • X exists. If on a set B the trajectories of X have finite variation, then almost surely on B the trajectories of H • X are equal to the pathwise integrals of H with respect to X.

⁸⁶ A sequence is stochastically convergent if and only if every subsequence of the sequence has another subsequence which is almost surely convergent to the same fixed random variable.
THE PROOF OF DAVIS’ INEQUALITY
4.5
277
The Proof of Davis’ Inequality
In this section we prove the following inequality: Theorem 4.62 (Davis’ inequality) There are positive constants c and C such that for any local martingale L ∈ L and for any stopping time τ
"
" [L] (τ ) ≤ E sup |L (t)| ≤ C · E [L] (τ ) . c·E t≤τ
Example 4.63 In the inequality one cannot write |L| (τ ) in the place of supt≤τ |L|.
If w is a Wiener process and τ inf {t : w (t) = 1} then L wτ is a martingale. E (L (t)) = 0 for every t, hence
L (t)1 = E (|L(t)|) = 2E L+ (t) ≤ 2. On the other hand if t → ∞ " √
√ τ ∧t →E τ . [L] (t) = E 1
The density function87 of τ is
1 exp − f (x) = √ 3 2x 2x π √ hence the expected value of τ is 1
E
,
x > 0,
1 exp − dx = 2x 2x3 π 0 ∞ 1 1 1 √ exp − = dx = 2x 2π x 0 ∞ u
1 1 exp − du = ∞. =√ 2 2π 0 u
√ τ =
∞
√
x√
1
If σ is an arbitrary stopping time then in place of L one can write Lσ in the inequality. On the other hand if for some localizing sequence σ n ∞ the inequality is true for all Lσn then by the Monotone Convergence Theorem it is true for L as well. By the Fundamental Theorem of Local Martingales L ∈ L has a 2 decomposition L = H + A where H ∈ Hloc and A ∈ Aloc . With localization 2 one can assume that H ∈ H and A ∈ A. L− is left-regular, hence it is locally 87 See:
(1.58) on page 83.
278
GENERAL THEORY OF STOCHASTIC INTEGRATION
bounded, so with further localization of the inequality one can assume that L− is bounded. It suffices to prove the inequality on any finite time horizon [0, T ]. It is
suffi(n) is an cient to prove the inequality for finite, discrete-time horizons: If tk infinitesimal sequence of partitions of [0, T ] then trivially
(n) E sup L tk E sup |L (t)| . (n)
t≤T
tk ≤T
Recall that as L(0) = 0 at any time t the quadratic variation [L] is the limit in probability of the sequence (n)
[L]
(t)
2 ( (n) (n) L tk ∧ t − L tk−1 ∧ t = k
= L2 (t) − 2
(
(n) (n) (n) L tk−1 ∧ t L tk ∧ t − L tk−1 ∧ t .
k
If Yn (t)
(n) (n) (n) L tk−1 ∧ t χ tk−1 ∧ t, tk ∧ t ,
k
then the sum in the above expression is (Yn • L) (t). Obviously Yn → L− and |Yn (t)| ≤ sup |L− (s)| ≤ k. s≤t
Repeating the proof of the Dominated Convergence Theorem we prove that for all t (Yn • L) (t) → (L− • L) (t) in L1 (Ω). As (Yn ) is uniformly bounded, by Itˆ o’s isometry the convergence Yn • H → L− • H holds in H2 and therefore L2
(Yn • H) (t) → (L− • H) (t). Obviously |(Yn • A) (t) − (L− • A) (t)| ≤ 2k · Var (A) (t) . As A ∈ A by the classical Dominated Convergence Theorem L1
(Yn • A) (t) → (L− • A) (t) .
THE PROOF OF DAVIS’ INEQUALITY
279
Therefore, as we said, L1
(Yn • A) (T ) → (L− • A) (T ) . (n)
L1
Hence [L] (T ) → [L] (T ) , so by Jensen’s inequality ) ) "
" (n) E ≤ E [L](n) (T ) − [L] (T ) ≤ [L] (T ) − E [L] (T ) % (n) ≤E [L] (T ) − [L] (T ) ≤
%
(n) ≤ E [L] (T ) − [L] (T ) → 0. This means that if the inequality holds in discrete-time then it is true in continuous-time. 4.5.1
Discrete-time Davis’ inequality
Up to the end of this section we assume that if M is a martingale then M (0) = 0. Definition 4.64 Let us first introduce some notation. For any sequence M (Mn ) ∆Mn Mn − Mn−1 . If M (Mn ) is a discrete-time martingale then (∆Mn ) is the martingale difference of M . [M ]n
n k=1
Mn∗
2
(∆Mk ) =
n
2
(Mk − Mk−1 )
k=1
sup |Mk | k≤n
for any n. If n is the maximal element in the parameter set or n = ∞ then we drop the subscript n. With this notation the discrete-time Davis’ inequality has the following form: Theorem 4.65 (Discrete-time Davis’ inequality ) There are positive constants c and C such that for every discrete-time martingale M for which M (0) = 0 "
"
c·E [M ] ≤ E (M ∗ ) ≤ C · E [M ] .
280
GENERAL THEORY OF STOCHASTIC INTEGRATION
The proof of the discrete-time Davis’ inequality is a simple but lengthy88 calculation. Let us first prove two lemmas: Lemma 4.66 Let M (Mn , Fn ) be a martingale and let V (Vn , Fn−1 ) be a predictable sequence89 , for which |∆Mn | |Mn − Mn−1 | ≤ Vn . If λ > 0 and 0 < δ < β − 1 then
" P M ∗ > βλ, [M ] ∨ V ∗ ≤ δλ ≤ P
"
[M ] > βλ, M ∗ ∨ V ∗ ≤ δλ ≤
2δ 2 (β − δ − 1)
2 P (M
∗
> λ) ,
" 9δ 2 [M ] > λ . P 2 β −δ −1 2
Proof. The proof of the two inequalities are similar. 1. Let us introduce the stopping times µ inf {n : |Mn | > λ} , ν inf {n : |Mn | > βλ} , ) σ inf n : [M ]n ∨ Vn+1 > δλ . For every j c
Fj {µ < j ≤ ν ∧ σ} = {µ < j} ∩ {ν ∧ σ < j} ∈ Fj−1 , hence if Hn
n
∆Mj χFj ,
j=1
then n ∆Mj χFj | Fn−1 ) = E (Hn | Fn−1 ) E( j=1
=
n−1 j=1
88 And 89 That
boring. is Vn is Fn−1 -measurable.
∆Mj χFj + E(∆Mn χFn | Fn−1 ) =
THE PROOF OF DAVIS’ INEQUALITY
=
n−1
281
∆Mj χFj + χFn E(∆Mn | Fn−1 ) =
j=1
=
n−1
∆Mj χFj Hn−1 ,
j=1
therefore (Hn ) is a martingale. By the assumptions of the lemma |∆Mj | ≤ Vj , hence by the definition of σ
2 [H]n ≤ [M ]σ = [M ]σ−1 + (∆Mσ ) χ (σ < ∞) + [M ]σ χ (σ = ∞) ≤
≤ [M ]σ−1 + Vσ2 χ (σ < ∞) + [M ]σ χ (σ = ∞) ≤ ≤ 2δ 2 λ2 . {M ∗ ≤ λ} = {µ = ∞} hence on this set H = 0 so [H] = 0. Therefore E ([H]) = E ([H] χ (M ∗ > λ) + [H] χ (M ∗ ≤ λ)) = = E ([H] χ (M ∗ > λ)) ≤ 2δ 2 λ2 P (M ∗ > λ) . Observe that Fj ∩ {ν < ∞, σ = ∞} = {µ < j ≤ ν} ∩ {ν < ∞, σ = ∞} hence on the set {ν < ∞, σ = ∞} Hn = Mν∧n − Mµ∧n . On {ν < ∞} obviously supn |Mν∧n | ≥ λβ. On {σ = ∞} by definition V ∗ ≤ δλ, hence |Mµ | = |Mµ−1 + ∆Mµ | ≤ λ + δλ. This implies that on the set {ν < ∞, σ = ∞} H ∗ = sup |Mν∧n − Mµ∧n | > λβ − λ (δ + 1) = λ (β − (1 + δ)) . n
282
GENERAL THEORY OF STOCHASTIC INTEGRATION
By Doob’s inequality90 using the definition of ν and σ
" P1 P M ∗ > βλ, [M ] ∨ V ∗ ≤ δλ = = P (ν < ∞, σ = ∞) ≤
2 E H∞
∗
≤ P (H > λ (β − (1 + δ))) ≤ ≤ ≤
E ([H]) λ (β − 1 − δ) 2
2
2
λ2 (β − 1 − δ)
≤
≤
2δ 2 λ2 P (M ∗ > λ) 2
λ2 (β − (1 + δ))
=
2δ 2 (β − 1 − δ)
2 P (M
∗
> λ) ,
which is the first inequality. 2. Analogously, let us introduce the stopping times ) µ inf n : [M ]n > λ ,
) ν inf n : [M ]n > βλ ,
σ inf {n : Mn∗ ∨ Vn+1 > δλ} . Again for all j let Fj {µ < j ≤ ν ∧ σ } . As Fj ∈ Fj−1 Gn
n
∆Mj χFj
j=1
is again a martingale. If µ ≥ σ then G∗ = 0. Hence if σ < ∞ then G∗ = G∗ χ (µ < σ ) ≤
≤ Mµ∗ + Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ χ (µ < σ ) =
= Mσ∗ −1 + Mσ∗ −1 + ∆Mσ∗ χ (µ < σ ) ≤
≤ Mσ∗ −1 + Mσ∗ −1 + Vσ χ (µ < σ ) ≤ ≤ δλ + δλ + δλ = 3δλ. 90 See:
line (1.14), page 33.
THE PROOF OF DAVIS’ INEQUALITY
283
If σ = ∞ then of course σ − 1 is meaningless, but in this case obviously
∗ Mµ + Mσ∗ χ (µ < σ ) ≤ 2δλ, so in this case the inequality G∗ ≤ 3δλ still holds. On the set
" [M ] ≤ λ =
{µ = ∞} obviously G∗ = 0. "
"
2 2 2 E (G∗ ) = E (G∗ ) χ [M ] > λ + (G∗ ) χ [M ] ≤ λ = "
"
2 [M ] > λ ≤ 9δ 2 λ2 P [M ] > λ . = E (G∗ ) χ On the set {ν < ∞, σ = ∞} [G]n = [M ]ν ∧n − [M ]µ ∧n . By this using that ν < ∞ and σ = ∞ 2
2
[G] > (βλ) − [M ]µ −1 − (∆Mµ ) ≥ 2
2
≥ (βλ) − λ2 − (Vµ ) ≥
2 ≥ (βλ) − 1 + δ 2 λ2 . By Markov’s inequality and by the energy identity91 "
P2 P [M ] > βλ, M ∗ ∨ V ∗ ≤ δλ = = P (ν < ∞, σ = ∞) ≤
≤ P [G] > λ2 β 2 − 1 + δ 2 ≤
E ([G]) =
λ β − 1 + δ2
2 ∗ 2 ) E (G E G ≤ 2 2 ≤ = 2 2 λ β − 1 + δ2 λ β − 1 + δ2 "
9δ 2 P [M ] > λ . ≤ 2 2 β − (1 + δ) 2
2
Lemma 4.67 Let M (Mn , Fn ) be a martingale and let assume that M0 = 0. If dj ∆Mj Mj − Mj−1 ,
aj dj χ |dj | ≤ 2d∗j−1 − E dj χ |dj | ≤ 2d∗j−1 | Fj−1 ,
bj dj χ |dj | > 2d∗j−1 − E dj χ |dj | > 2d∗j−1 | Fj−1 E G2 = ∞ then the inequality is true, otherwise one can use Proposition 1.58 on page 35. 91 If
284
GENERAL THEORY OF STOCHASTIC INTEGRATION
then the sequences Gn
n
aj
and
j=1
Hn
n
bj ,
j=1
are F-martingales, M = G + H and |aj | ≤ 4d∗j−1 , ∞
(4.22)
dj χ |dj | > 2d∗j−1 ≤ 2d∗ ,
(4.23)
j=1 ∞
E (|bj |) ≤ 4E (d∗ ) .
(4.24)
j=1
Proof. As M0 = 0 n
dj
j=1
n
∆Mj = Mn − M0 = Mn .
j=1
One should only prove the three inequalities, since from this identity the other parts of the lemma are obvious92 . 1. (4.22) is evident. / . 2. |dj | + 2d∗j−1 ≤ 2 |dj | on |dj | > 2d∗j−1 , hence ∞ ∞
dj χ |dj | > 2d∗j−1 ≤ 2 |dj | − 2d∗j−1 χ |dj | > 2d∗j−1 ≤ j=1
j=1
≤2
∞
d∗j − d∗j−1 = 2d∗ ,
j=1
which is exactly (4.23). 3. ∞ j=1
E (|bj |) ≤
∞
E |dj | χ |dj | > 2d∗j−1 +
j=1
+
∞
E E dj χ |dj | > 2d∗j−1 | Fj−1 .
j=1 92 For any sequence (ξ , F ) E (ξ | F n n−1 ) = 0 if and only if (ξ n , Fn ) n n difference sequence.
is a martingale
THE PROOF OF DAVIS’ INEQUALITY
285
If in the second sum we bring the absolute value into the conditional expectation, then ∞ ∞
E (|bj |) ≤ 2E |dj | χ |dj | > 2d∗j−1 . j=1
j=1
By (4.23) the expression in the conditional expectation is not larger than 2d∗ , from which (4.24) is evident. The proof of the discrete-time Davis’ inequality: Let M = H + G be n the decomposition of the previous lemma. Gn j=1 aj is a martingale, |aj | ≤ 4d∗j−1 , hence by the first lemma, if λ > 0 and 0 < δ < β − 1, then
" P G∗ > βλ, [G] ∨ 4d∗ ≤ δλ ≤ P
"
[G] > βλ, G∗ ∨ 4d∗ ≤ δλ ≤
2δ 2 (β − δ − 1)
2 P (G
∗
> λ) ,
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Hence for any λ > 0 P (G∗ > βλ) ≤ P +
"
[G] > δλ + P (4d∗ > δλ) + 2δ 2
(β − δ − 1)
2 P (G
∗
> λ) ,
and P
"
[G] > βλ ≤ P (G∗ > δλ) + P (4d∗ > δλ) + +
" 9δ 2 [G] > λ . P β 2 − δ2 − 1
Integrating w.r.t. λ and using that if ξ ≥ 0 then ∞ ∞ E (ξ) = 1 − F (x)dx = P(ξ > x)dx, 0
0
one has that ∗
E (G ) ≤ β
E
"
[G] +
δ +
2δ 2 (β − δ − 1)
4E (d∗ ) + δ 2 E (G
∗
),
286
GENERAL THEORY OF STOCHASTIC INTEGRATION
and E
"
[G]
β
≤
E (G∗ ) 4E (d∗ ) + + δ δ "
9δ 2 + 2 E [G] . β − δ2 − 1
For the stopped martingale Gn the expected values in the inequalities are finite, hence one can reorder the inequalities
2
1 2δ − 2 β (β − δ − 1)
E (G∗n ) ≤
E
"
[G]n
δ
+
4E (∆Mn∗ ) . δ
and
1 9δ 2 − 2 β β − δ2 − 1
E
"
E (G∗ ) 4E (∆M ∗ ) n n + . [G]n < δ δ
If δ is small enough then the constants on the left-hand side are positive, hence we can divide by them. Hence if n ∞ then by the Monotone Convergence Theorem "
∗ E (G∗ ) ≤ A1 E [G] + A2 E (∆M ) , "
∗ E [G] ≤ B1 E (G∗ ) + B2 E (∆M ) . By the second lemma E (M ∗ ) ≤ E (G∗ + H ∗ ) ≤ ≤ E (G∗ ) + E (|bj |) ≤ E (G∗ ) + 4E (d∗ ) ≤ "
j
∗ ∗ ≤ A1 E [G] + A2 E (∆M ) + 4E (∆M ) , "
" "
E [M ] ≤ E [G] + [H] ≤ " "
≤E [G] + E (|bj |) ≤ E [G] + 4E (d∗ ) ≤ j
∗ ∗ ≤ B1 E (G ) + B2 E (∆M ) + 4E (∆M ) . ∗
THE PROOF OF DAVIS’ INEQUALITY
287
As G = M − H by the second lemma again E (G∗ ) ≤ E (M ∗ ) + E (H ∗ ) ≤ E (M ∗ ) +
∗ ≤ E (M ∗ ) + 4E (∆M )
∞
E (|bj |) ≤
j=1
and E
"
∞
"
"
"
[G] ≤ E [M ] + E [H] ≤ E [M ] + E (|bj |) ≤
"
∗ ≤E [M ] + 4E (∆M ) .
j=1
From this with simple calculation E (M ∗ ) ≤ A1 E
"
"
∗ [M ] + A3 E (∆M ) ≤ A · E [M ] ,
and E
"
∗ [M ] ≤ B1 E (M ∗ ) + B3 E (∆M ) ≤ B · E (M ∗ ) ,
from which Davis’ inequality already follows, trivially. 4.5.2
Burkholder’s inequality
One can extend Davis’ inequality in such a way that instead of the L1 (Ω)-norm one can write the Lp (Ω)-norm for every p ≥ 1. Theorem 4.68 (Burkholder’s inequality) For any p > 1 there are constants cp and Cp , such that for every local martingale L ∈ L and for every stopping time τ " " cp [L] (τ ) ≤ sup |L (t)| ≤ Cp [L] (τ ) . p
t≤τ
p
p
During the proof of the inequality we shall use the next result: Lemma 4.69 Let A be a right-regular, non-negative, increasing, adapted process and let ξ be a non-negative random variable. Assume that almost surely for every t E (A (∞) − A (t) | Ft ) ≤ E (ξ | Ft )
(4.25)
288
GENERAL THEORY OF STOCHASTIC INTEGRATION
and ∆A (t) ≤ ξ. Then for every p ≥ 1 A (∞)p ≤ 2p ξp .
(4.26)
Proof. A is increasing, so for every n χ (A (t) ≥ n) (A (∞) − A (t)) = (A ∧ n) (∞) − (A ∧ n) (t) . So if (4.26) holds for some A then it holds for A ∧ n. Hence one can assume that A is bounded, since otherwise we can replace A with A ∧ n and in (4.26) one can take n ∞. If ξ is not integrable then the inequality trivially holds. Hence one can assume that ξ is integrable. 1. As ξ is integrable E (ξ | Ft ) is a uniformly integrable martingale. As A is bounded E (A (∞) − A (t) − ξ | Ft ) = E (A (∞) | Ft ) − E (ξ | Ft ) − A (t) is a uniformly integrable, non-positive supermartingale. By the Optional Sampling Theorem for every stopping time τ E (A (∞) − A (τ ) | Fτ ) ≤ E (ξ | Fτ ) .
(4.27)
Let x > 0 and let τ x inf {t : A (t) ≥ x} . Obviously A (τ x −) ≤ x. By (4.27) E ((A (∞) − x) χ (x < A (∞))) ≤ E ((A (∞) − x) χ (τ x < ∞)) = ≤ E ((A (∞) − A (τ x −)) χ (τ x < ∞)) = = E ((A (∞) − A (τ x )) χ (τ x < ∞)) + E (∆A (τ x ) χ (τ x < ∞)) ≤ ≤ E (ξχ (τ x < ∞)) + E (ξχ (τ x < ∞)) ≤ ≤ 2E (ξχ (x ≤ A (∞))) .
THE PROOF OF DAVIS’ INEQUALITY
2. With inequality
simple
calculation
using
Fubini’s
theorem
and
289
H¨ older’s
p
A (∞)p E (Ap (∞)) = pE (Ap (∞)) − (p − 1) E (Ap (∞)) = A(∞) p−2 = p (p − 1) E A (∞) x dx 0
− p (p − 1) E
p−1
x
=
A(∞)
(A (∞) − x) x
= p (p − 1) E
dx
0
= p (p − 1)
A(∞)
p−2
dx
=
0 ∞
E ((A (∞) − x) χ (x < A (∞))) xp−2 dx ≤
0
∞
≤ 2p (p − 1)
E (ξχ (x ≤ A (∞))) xp−2 dx =
0
A(∞)
= 2p (p − 1) E
p−2
ξx
dx
= 2p · E ξAp−1 (∞) ≤
0 p−1
≤ 2p · ξp A (∞)p
. p−1
If A (∞)p > 0 then we can divide both sides by A (∞)p inequality trivially holds.
, otherwise the
Proof of Burkholder’s inequality: Let L be a local martingale. Let B ∈ Ft and let N χB (L − Lt ). N is a local martingale so by Davis’ inequality c·E
"
"
[N ] (∞) ≤ E sup |N (s)| ≤ C · E [N ] (∞) , s
which immediately implies that c·E
"
[L − Lt ] (∞) | Ft ≤ E sup |L − Ls | | Ft ≤ s
≤C ·E
"
[L − Lt ] (∞) | Ft .
Let L∗ (t) sups≤t |L (s)|. Since " " " " " [L] (∞) − [L] (t) ≤ [L] (∞) − [L] (t) = [L − Lt ] (∞) ≤ [L] (∞)
290
GENERAL THEORY OF STOCHASTIC INTEGRATION
and L∗ (∞) − L∗ (s) ≤ sup L − Lt (s) ≤ 2L∗ (∞) s
if A (t)
" [L] (t)
and ξ c−1 2 · L• (∞)
or if A (t) L∗ (t)
and ξ C
" [L] (∞)
then estimation (4.25) in the lemma holds. Without loss of generality one can assume that the constants in the definition of ξ are larger than one. Since for every constant k ≥ 1 " " ∆L∗ ≤ |∆L| = ∆ [L] ≤ k · [L] (∞) " " ∆ [L] ≤ ∆ [L] = |∆L| ≤ k · 2L∗ (∞) in both cases we get that ∆A ≤ ξ. Hence A (∞)p ≤ 2p ξp which is just the two sides of Burkholder’s inequality. p/2
p Corollary 4.70 If L ∈ L and p ≥ 1 then L ∈ Hloc if and only if [L]
∈ Aloc .
Corollary 4.71 If M is a local martingale and for some p ≥ 1 for every sequence of infinitesimal partitions of the interval [0, t] (n)
[M ]
Lp
(t) → [M ] (t) ,
then M ∗ (t) sup |M (s)| ∈ Lp (Ω) s≤t
that is M ∈ Hp on the interval [0, t]. (n)
Proof. Let (Mn ) be a discrete-time of M . If [M ] (t) is con approximation (n) p vergent in L (Ω), then K supn [M ] (t) < ∞. By the Davis–Burkholder p
inequality and by Jensen’s inequality ) % (n) sup |Mn | (s) ≤ Cp [M ](n) (t) ≤ Cp ] (t) [M ≤ L < ∞. s≤t
p
p
p
THE PROOF OF DAVIS’ INEQUALITY
291
For a subsequence sup |Mn | sup |M | , hence by the Monotone Convergence Theorem M ∗ (t)p ≤ L < ∞. Corollary 4.72 If q ≥ 1 and L ∈ Hq is purely discontinuous then L is the Hq -sum of its compensated jumps. Proof. Let us denote by (ρk ) the stopping times exhausting the jumps of L. Let L ∈ Hq be purely discontinuous and let L = Lk where Nk H (ρk ) χ ([ρk , ∞)) and Lk N − Nkp are the the compensated jumps of L. Recall that the convergence holds in the topology of uniform convergence in probability93 . L ∈ Hq so q/2 by Burkholder’s inequality [L] ∈ A and as the compensator Nkp is continuous q/2
[Lk ]
q
(∞) = (∆L (ρk )) ≤ q/2
≤ [L]
2
q/2
(∆L) (∞)
≤
(∞) ∈ L1 (Ω) .
This implies that Lk ∈ Hq . Hq is a vector space hence Yn n > m then ≤ sup Yn − Ym Hq |Y (t) − Y (t)| n m t
n k=1
Lk ∈ Hq . If
q
" ≤ Cp [Yn − Ym ] (∞) = q 2 = Cp (∆L) (s)χ (B \B ) n m , s q
where Bn ∪nk=1 [ρk ].
)
2
(∆L) is in Lq (Ω). Therefore if n, m → ∞ then Yn − Ym Hq → 0.
So (Yn ) is convergent in Hq . Convergence in Hq implies uniform convergence in Hq
probability so obviously Yn → L.
93 See:
Proposition 4.30, page 243.
5 SOME OTHER THEOREMS In this chapter we shall discuss some further theorems from the general theory of stochastic processes. First we shall prove the so-called Doob–Meyer decomposition. By the Doob–Meyer decomposition every integrable submartingale is a semimartingale. We shall also prove the theorem of Bichteler and Dellacherie, which states that the semimartingales are the only ‘good integrators’.
5.1
The Doob–Meyer Decomposition
If A ∈ A+ and M ∈ M then X A + M is a class D submartingale. Since if τ is a finite valued stopping time then |A (τ )| = |A (τ ) − A (0)| ≤ Var (A) (∞) ∈ L1 (Ω) ,
(5.1)
hence the set {X (τ ) : τ < ∞ is a stopping time} is uniformly integrable. The central observation of the stochastic analysis is that the reverse implication is also true: Theorem 5.1 (Doob–Meyer decomposition) If a submartingale X is in class D then X has a decomposition X = X (0) + M + A, where A ∈ A+ , M ∈ M and A is predictable. Up to indistinguishability this decomposition is unique. 5.1.1
The proof of the theorem
We divide the proof into several steps. The proof of the uniqueness is simple. If X (0) + M1 + A1 = X (0) + M2 + A2 292
THE DOOB–MEYER DECOMPOSITION
293
are two decompositions of X then M1 − M2 = A2 − A1 . A2 − A1 is a predictable martingale, hence it is continuous1 . As A2 − A1 has finite variation by Fisk’s theorem2 A1 = A2 , hence M1 = M2 . The proof of the existence is a bit more complicated. Definition 5.2 We say that a supermartingale P is a potential 3 , if 1. P is non-negative and 2. limt→∞ E (P (t)) = 0. Proposition 5.3 (Riesz’s decomposition) If X is a class D submartingale then X has a decomposition X = X (0) + M − P
(5.2)
where P is a class D potential and M is a uniformly integrable martingale. Up to indistinguishability this decomposition is unique. Proof. As X is in class D the set {X (t) : t ≥ 0} is uniformly integrable, hence it is bounded in L1 (Ω). Hence
sup E X + (t) ≤ sup E (|X (t)|) < K. t
t
By the submartingale convergence theorem4 the limit lim X (t) = X (∞) ∈ L1 (Ω)
t→∞
exists. Let us define the variables M (t) E (X (∞) | Ft ). As the filtration satisfies the usual conditions M has a version which is a uniformly integrable martingale. The process P M − X is in class D since it is the difference of two processes of class D. By the submartingale property P (s) M (s) − X (s) ≥ E (M (t) | Fs ) − E (X (t) | Fs ) = = E (M (t) − X (t) | Fs ) . a.s.
If t → ∞, then M (t) − X (t) → 0 and as (M (t) − X (t))t is uniformly integrable the convergence holds in L1 (Ω) as well. By the L1 (Ω)-continuity of the 1 See:
Corollary 3.40, page 205. Theorem 2.11. page 117. 3 Recall that the expected value of the supermartingales is decreasing. 4 See: Corollary 1.72, page 44. 2 See:
294
SOME OTHER THEOREMS
conditional expectation the right-hand side of the inequality almost surely goes a.s.
to zero, that is P (s) ≥ 0. E (P (s)) = E (M (s)) − E (X (s)) → E (M (∞)) − E (X (∞)) = 0, hence P is a potential. Assume that the decomposition is not unique. Let Pi , Mi , i = 1, 2 be two decompositions of X. In this case (P1 − P2 ) (t) = M1 (t) − M2 (t) = E (M1 (∞) − M2 (∞) | Ft ) . L
By the definition of the potential Pi (t) →1 0. Hence if t → ∞, then 0 = E (M1 (∞) − M2 (∞) | F∞ ) = M1 (∞) − M2 (∞) , hence M1 = M2 , so P1 = P2 . It is sufficient to proof the Doob–Meyer decomposition for the potential part of the submartingale. One should prove that if P is a class D potential, then there is one and only one N ∈ M and a predictable process A ∈ A+ for which P = N − A. If it holds then substituting −P = −N + A into line (5.2) we get the needed decomposition of X. From the definition of the potential E (A (t)) = E (N (t)) − E (P (t)) ≤ E (N (∞)) . A ∈ A+ , so A is increasing. 0 = A (0) ≤ A (t) A (∞) where E (A (∞)) < ∞. L1
Hence by the Monotone Convergence Theorem A (t) → A (∞). By the definition L1
of the potential P (t) → P (∞) = 0, hence A (∞) = N (∞). So to prove the theorem it is sufficient to prove that there is a predictable process A ∈ A+ and N ∈ M such that P (t) + A (t) = N (t) = E (N (∞) | Ft ) = E (A (∞) | Ft ) , which holds if there is an A ∈ A+ such that P (t) = E (A (∞) − A (t) | Ft ) . By the definition of the conditional expectation it is equivalent to E (χF (A (∞) − A (t))) = E (χF P (t)) = E (χF (P (t) − P (∞))) ,
F ∈ Ft .
THE DOOB–MEYER DECOMPOSITION
295
Observe that S −P is a submartingale and S (∞) = 0, hence the previous line is equivalent to E (χF (A (∞) − A (t))) = E (χF (S (∞) − S (t))) ,
F ∈ Ft .
(5.3)
For an arbitrary process X on the set of predictable rectangles (s, t] × F,
F ∈ Fs
let us define the set function µX ((s, t] × F ) E (χF (X (t) − X (s))) . Recall5 that the predictable rectangles and the sets {0} × F, F ∈ F0 generate the σ-algebra of the predictable sets P. Let µX ({0} × F ) 0,
F ∈ F0 .
Definition 5.4 If a set function µX has a unique extension to the σ-algebra P which is a measure on P then µX is called6 the Dol´eans type measure of X. Observe that the sets in (5.3) are in the σ-algebra generated by the predictable rectangles. Hence to prove the Doob–Meyer decomposition one should prove the following: Proposition 5.5 If S ∈ D is a submartingale then there is a predictable process A ∈ A+ such that the measure µS of S on the predictable sets is generated by A, that is there is a predictable process A ∈ A+ such that µA (Y ) = µS (Y ) ,
Y ∈ P.
(5.4)
As a first step we prove that µS is really a measure on P. Proposition 5.6 If S is a class D submartingale then the Dol´eans type measure µS of S can be extended from the semi-algebra of the predictable rectangles to the σ-algebra of the predictable sets. Proof. Denote by C the semi-algebra of the predictable rectangles. We want to use Carath´eodory’s extension theorem. To do this we should prove that µS is a measure on C. As S is a submartingale µS is non-negative. µS is trivially additive, hence µS is monotone on C. For all C ∈ C, using that µS is monotone 5 See: 6 See:
Corollary 1.44, page 26. Definition 2.56, page 151.
296
SOME OTHER THEOREMS
and (0, ∞] ∈ C, µS (C) ≤ µS ([0, ∞]) = µS ({0} × Ω) + µS ((0, ∞]) = = µS ((0, ∞]) E (S (∞) − S (0)) ≤ ≤ E (|S (∞)|) + E (|S (0)|) < ∞. Observe that in the last line we used that S is uniformly integrable and therefore S (∞) and S (0) are integrable. As µS is finite it is sufficient to prove that whenever Cn ∈ C, and Cn ∅, then µS (Cn ) 0. Let ε > 0 be arbitrary. If (s, t] × F ∈ C then 1 1 s + , t × F ⊆ s + , t × F ⊆ (s, t] × F. n n S is a submartingale so for every F ∈ Fs 1 E χF S s + − S (s) ≥ 0, n 1 E χF c S s + − S (s) ≥ 0. n S is uniform integrable, hence for the sum of the two sequences above 1 1 − S (s) = E lim S s + − S (s) = lim E S s + n→∞ n→∞ n n = E (S (s+) − S (s)) = 0, hence
1 lim E χF S s + − S (s) =0 n→∞ n so lim µS
n→∞
s+
1 1 , t × F lim E χF S (t) − S s + = n→∞ n n = E (χF [S (t) − S (s)]) µS ((s, t] × F ) .
Hence for every Cn ∈ C there are sets Kn and Bn ∈ C such that Bn ⊆ Kn ⊆ Cn , and for all ω the sections Kn (ω) of Kn are compact and µS (Cn ) < µS (Bn ) + ε2−n .
(5.5)
THE DOOB–MEYER DECOMPOSITION
297
Let us introduce the decreasing sequence Ln ∩k≤n Bk . C is a semi-algebra, hence Ln ∈ C for every n. Let Ln and B n be the sets in which we close the time intervals of Ln and Bn . Ln ⊆ B n ⊆ Kn ⊆ Cn ∅, We prove that if / . γ n (ω) inf {t : (t, ω) ∈ Ln } = min t : (t, ω) ∈ Ln < ∞ then γ n (ω) ∞ for all ω. Otherwise γ n (ω) ≤ K for some ω and K < ∞ and (γ n (ω) , ω) ∈ Ln . The sets [0, K] ∩ Ln (ω) are compact and γ n (ω) ∈ [0, K] ∩ Ln (ω) for all n. Hence their intersection is non-empty. Let γ ∞ be in the intersection. Then (γ ∞ , ω) ∈ Ln for all n so (γ ∞ , ω) ∈ ∩n Ln , which is impossible. Let S = S(0) + M − P be the decomposition of S, where P is the potential part of S. As M is uniformly integrable E(M (∞)) = E(M (γ n )). Therefore µS (Ln ) ≤ E(S(∞) − S(γ n )) = E(P (γ n )). As P is in class D (P (γ n ∧ t)) is uniformly integrable for every t, so as γ n ∞ lim E(P (γ n ∧ t)) = E(P (t)).
n→∞
Using that P is a supermartingale lim sup E(P (γ n )) ≤ lim sup E(P (γ n ∧ t)) = E(P (t)). n→∞
n→∞
As lim E(P (t)) = 0
t→∞
obviously µS (Ln ) → 0. By (5.5) µS (Ln ) ≤ E (S (γ n ) − S (∞)) → 0. By (5.5) c
µS (Cn \ Ln ) µS (Cn ∩ (∩k≤n Bk ) ) = µS (Cn ∩ (∪k≤n Bkc )) ≤ ≤
n k=1
µS (Cn \ Bk ) ≤
n k=1
µS (Ck \ Bk ) ≤ ε,
298
SOME OTHER THEOREMS
hence lim sup µS (Cn ) ≤ lim sup µS (Cn \ Ln ) + lim sup µS (Ln ) ≤ ε. n→∞
n→∞
n→∞
Now we can finish the proof of the Doob–Meyer decomposition. Let us recall that by (5.4) one should prove that there is a predictable process A such that Y ∈ P.
µA (Y ) = µS (Y ) ,
(5.6)
To construct A let us extend µS from P to the product measurable subsets of R+ × Ω with the definition µ (Y ) µS ( Y ) p
p R+ ×Ω
χY dµS .
(5.7)
Observe that as p χY is well-defined the set function µ (Y ) is also well-defined. If Y1 and Y2 are disjoint then by the additivity of the predictable projection µ (Y1 ∪ Y2 ) µS ( (Y1 ∪ Y2 ))
p
p
p
=
R+ ×Ω
= R+ ×Ω
p
R+ ×Ω
χY1 ∪Y2 dµS =
χY1 + χY2 dµS =
χY1 +
p
χY2 dµS =
= µS (p Y1 ) + µS (p Y2 ) µ (Y1 ) + µ (Y2 ) , so µ is additive. It is clear from the Monotone Convergence Theorem for the predictable projection that µ is σ-additive. Hence µ is a measure. µ is absolutely continuous, since if Y ⊆ R+ ×Ω is a negligible set, then there is a set N ⊆ Ω with probability zero that Y can be covered by the random intervals [0, τ n ] where τ n (ω)
n 0
if ω ∈ N . if ω ∈ /N
As P (N ) = 0 and as the usual conditions hold τ n is a stopping time for every n. Hence the intervals [0, τ n ] are predictable, and their Dol´eans-measure is obviously zero. So µ (Y ) ≤
n
µ ([0, τ n ]) =
n
µS ([0, τ n ]) = 0.
THE DOOB–MEYER DECOMPOSITION
299
By the generalized Radon–Nikodym theorem7 we can represent µ with a predictable8 process A ∈ A+ . Hence for all predictable Y µA (Y ) = µ (Y ) µS (p Y ) = µS (Y ) therefore for this A (5.6) holds. 5.1.2
Dellacherie’s formulas and the natural processes
In some applications of the Doob–Meyer decomposition it is more convenient to assume that in the decomposition the increasing process A is natural. Definition 5.7 We say that a process V ∈ V is natural if for every non-negative, bounded martingale N
t
N dV
E
t
=E
N− dV
0
.
(5.8)
0
Recall that for local martingales p N = N− , hence (5.8) can be written as
t
N dV
E
t p
=E
0
N dV
.
0
Proposition 5.8 (Dellacherie’s formula) If V ∈ A+ is natural then for every non-negative, product measurable process X
∞
E
XdV 0
∞
=E
p
XdV
,
(5.9)
0
where the two sides exist or do not exist in the same time. Proof. If η is non-negative, bounded random variable and X η · χ ((s, t]) then E
∞
XdV
= E (η (V (t) − V (s))) =
0
(n)
(n) =E η V tk − V tk−1 =
8 See:
=
k
(n) (n) E E η V tk − V tk−1 | Ft(n) =
k 7 See:
Proposition 3.49, page 208. Proposition 3.51, page 211.
k
300
SOME OTHER THEOREMS
=E
(n) (n) E η | Ft(n) V tk − V tk−1
k
k
E
M
(n) tk
(n) (n) V tk − V tk−1 .
k
By our general assumption the filtration satisfies the usual conditions so M (t) E (η | Ft ) has a version which is a bounded, non-negative martingale. If (n) (n) max tk − tk−1 → 0 k
then using that M , as every martingale, is right-continuous, Mn
(n) (n) (n) χ tk−1 , tk → M. M tk
k
η is bounded and V ∈ A+ , hence the sum behind the expected value is dominated by an integrable variable, so by the Dominated Convergence Theorem
∞
XdV
E 0
= lim E n→∞
=E
=E
lim
n→∞
lim
n→∞
k
M
(n) tk
(n) (n) V tk − V tk−1
k
M
(n) tk
t
(n) (n) V tk − V tk−1
Mn dV
=E
s
t
lim Mn dV
s n→∞
=
=
t
=E
M dV
.
s
Remember that if X η · χI then9 p
X
p
(η · χI ) = M− · χI .
Using that V is natural E
∞
XdV
=E
0
t
M dV s
=E
=E
M− dV s
∞
M− χ ((s, t]) dV 0
t
=
=E
∞
p
XdV
.
0
Hence for this special X (5.9) holds. These processes form a π-system. The bounded processes for which (5.9) is true is a λ-system, hence by the Monotone 9 See:
Corollary 3.43, page 206.
THE DOOB–MEYER DECOMPOSITION
301
Class Theorem one can extend (5.9) to the bounded processes which are measurable with respect to the σ-algebra generated by the processes X η · χ ((s, t]), hence (5.9) is true if X is a bounded product measurable process. To prove the proposition it is sufficient to apply the Monotone Convergence Theorem. Proposition 5.9 (Dellacherie’s formula) If A ∈ V and A is predictable then for any non-negative, product measurable process X ∞ ∞ p E XdA = E XdA , 0
0
where the two sides exist or do not exist in the same time. Proof. If A is predictable then Var (A) is also predictable. Therefore we can assume that A is increasing. In this case the expressions in the expectations exist and they are non-negative. Define the process σ (t, ω) inf {s : A (s, ω) ≥ t} . As A is increasing σ (t, ω) is increasing and right-continuous in t for any fixed ω. As the usual conditions hold σ t , as a function of ω is a stopping time for any fixed t. Observe that as A is right-continuous [σ t ] ⊆ {A ≥ t} , so as A is predictable Graph (σ t ) = [σ t ] = [0, σ t ] ∩ {A ≥ t} ∈ P, hence σ t is a predictable stopping time10 . By the definition of the predictable projection E (X (σ t ) χ (σ t < ∞)) = E ( p X (σ t ) χ (σ t < ∞)) . Let us remark, that for every non-negative Borel measurable function f ∞ ∞ f (u) dA (u) = f (σ t ) χ (σ t < ∞) dt. 0
0
To see this let us remark that A is right-continuous and increasing hence {t ≤ A (v)} = {σ t ≤ v} . So if f χ ([0, v]) then as A (0) = 0 ∞ ∞ f dA = A (v) = χ (t ≤ A (v)) dt = 0
=
0 ∞
0 10 See:
χ (σ t ≤ v) dt =
Corollary 3.34, page 199.
0
∞
f (σ t ) χ (σ t < ∞) dt.
(5.10)
302
SOME OTHER THEOREMS
One can prove the general case in the usual way. As σ t is predictable and as σ (t, ω) is product measurable by Fubini’s theorem ∞ ∞ XdA = E X (σ t ) χ (σ t < ∞) dt = E 0
0
∞
=
E (X (σ t ) χ (σ t < ∞)) dt =
0
∞
=
E ( p X (σ t ) χ (σ t < ∞)) dt =
0
∞
=E
p
XdA .
0
Theorem 5.10 (Dol´ eans) A process V ∈ A+ is natural if and only if V is predictable. Proof. If V is natural, then by the first formula of Dellacherie if p X = p Y , then µV (X) = µV (Y ), hence by the uniqueness of the representation of µV V is predictable11 . To see the other implication assume that V is predictable. By the second formula of Dellacherie for every product measurable process X
∞
XdV
E
∞
=E
0
p
XdV
.
0
If N is a local martingale then12 p N = N− , hence V is natural. Dellacherie’s formulas have an interesting consequence. When the integrator is a continuous local martingale then the stochastic integral is meaningful whenever the integrand is progressively measurable. By Dellacheries’s formulas even in this case the set of all possible integral processes is the same as the set of integral processes when the integrands are just predictable. Assume first 2 that X ∈ L2 (M ). By Jensen’s inequality ( p X) ≤ p X 2 , hence by the second Dellacherie’s formula p X ∈ L2 (M ). [M, N ] is continuous, hence it is predictable also by Dellacherie’s formula
∞
E
Xd [M, N ] = E
0
∞
p
Xd [M, N ] .
0
Hence during the definition of the stochastic integral the linear functionals N → E
∞
Xd [M, N ] ,
0 11 See: 12 See:
Proposition 3.51, page 211. Proposition 3.38, page 204.
N → E 0
∞
p
Xd [M, N ]
THE DOOB–MEYER DECOMPOSITION
303
coincide. Hence X • M = p X • M , and with localization if X ∈ L2loc (M ) then X ∈ L2loc (M ) and X • M = p X • M .
p
5.1.3
The sub- super- and the quasi-martingales are semimartingales
The main problem with the definition of the semimartingales is that it is very formal. An important consequence of the Doob–Meyer decomposition is that we can show some nontrivial examples of semimartingales. The most important direct application of the Doob–Meyer decomposition is the following:

Proposition 5.11 Every integrable¹³ sub- and supermartingale X is a semimartingale.

Proof. To make the notation simple we shall assume that X(0) = 0.

1. Let us first assume that X is an integrable submartingale. Let τ be an arbitrary stopping time. We prove that, as in the case of martingales, $X^\tau$ is also a submartingale. Let s < t and $A\in\mathcal F_s$. Let us define the bounded stopping time
$$
\sigma\triangleq(\tau\wedge t)\chi_{A^c}+(\tau\wedge s)\chi_A.
$$
As X is integrable one can use the Optional Sampling Theorem, hence as $\sigma\le\tau\wedge t$
$$
E(X(\sigma))=E\bigl(X(\tau\wedge t)\chi_{A^c}+X(\tau\wedge s)\chi_A\bigr)\le E(X(\tau\wedge t))=E\bigl(X^\tau(t)\chi_{A^c}+X^\tau(t)\chi_A\bigr),
$$
therefore
$$
E\bigl(X^\tau(s)\chi_A\bigr)\le E\bigl(X^\tau(t)\chi_A\bigr),
$$
which means that
$$
X^\tau(s)\le E\bigl(X^\tau(t)\mid\mathcal F_s\bigr),
$$
that is, $X^\tau$ is a submartingale.

2. If the submartingale X is in class D then by the Doob–Meyer decomposition X is a semimartingale. One should prove that there is a localizing sequence $(\tau_n)$ for which $X^{\tau_n}$ is in class D for all n. Then, as the Doob–Meyer decomposition is unique, the decomposition $L_{n+1}+V_{n+1}$ of $X^{\tau_{n+1}}$ on the interval $[0,\tau_n]$ is indistinguishable from the decomposition $L_n+V_n$ of $X^{\tau_n}$. From this it is clear that X has the decomposition
$$
L+V\triangleq\lim_n L_n+\lim_n V_n,
$$
where L is a local martingale and V has finite variation.

3. Let us define the bounded stopping times $\tau_n\triangleq\inf\{t : |X(t)|>n\}\wedge n$. As X is integrable, by the Optional Sampling Theorem $X(\tau_n)\in L^1(\Omega)$. For all t
$$
|X^{\tau_n}(t)|\le n+|X(\tau_n)|\in L^1(\Omega),
$$
hence $X^{\tau_n}$ is a class D submartingale. Obviously $\tau_n\le\tau_{n+1}$. Assume that for some ω the sequence $(\tau_n(\omega))$ is bounded. In this case $\tau_n(\omega)\nearrow\tau_\infty(\omega)<\infty$, so there is an N such that if n ≥ N then $\tau_n(\omega)<n$. Hence $|X(\tau_n(\omega))|\ge n$ by the definition of $\tau_n$, therefore the sequence $(X(\tau_n(\omega)))$ is not convergent, which is a contradiction, as by the right-regularity of submartingales X has a finite left limit at $\tau_\infty(\omega)$.

The semimartingales form a linear space, therefore if X = Y − Z, where Y and Z are integrable, non-negative supermartingales, then X is also a semimartingale. Let us extend X to t = ∞: by definition let $X(\infty)\triangleq Y(\infty)\triangleq Z(\infty)\triangleq 0$. As Y and Z are non-negative, after this extension they remain supermartingales¹⁴. Hence one can assume that Y, Z and X are defined on [0, ∞]. Let
$$
\Delta : 0=t_0<t_1<\ldots<t_n<t_{n+1}=\infty \tag{5.11}
$$
be an arbitrary decomposition of [0, ∞]. Let us define the expression
$$
\sup_\Delta E\left(\sum_{i=0}^n \bigl|E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right), \tag{5.12}
$$

¹³ That is X(t) is integrable for every t.
¹⁴ Observe that we used the non-negativity assumption.

SOME OTHER THEOREMS
where one should calculate the supremum over all possible subdivisions (5.11).
$$
E\left(\sum_i \bigl|E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)\le E\left(\sum_i \bigl|E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)+E\left(\sum_i \bigl|E(Z(t_i)-Z(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right).
$$
Y is a supermartingale, hence
$$
E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})=Y(t_i)-E(Y(t_{i+1})\mid\mathcal F_{t_i})\ge 0.
$$
Therefore one can drop the absolute value. By the simple properties of the conditional expectation, using the assumption that Y is integrable,
$$
E\left(\sum_{i=0}^n \bigl|E(Y(t_i)-Y(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right)=E(Y(0))-E(Y(\infty))=E(Y(0))<\infty.
$$
Applying the same to Z one can easily see that if X has the just mentioned decomposition then the supremum (5.12) is finite.

Definition 5.12 We say that the integrable¹⁵, adapted, right-regular process X is a quasi-martingale if the supremum in (5.12) is finite.

Proposition 5.13 (Rao) An integrable, right-regular process X defined on $\mathbb R_+$ is a quasi-martingale if and only if it has a decomposition X = Y − Z, where Y and Z are non-negative supermartingales.

Proof. We have already proved one implication. We should only show that every quasi-martingale has the mentioned decomposition. X is defined on $\mathbb R_+$, hence as above we shall assume that $X(\infty)\triangleq 0$. Let us fix an s. For any decomposition $\Delta : t_0=s<t_1<t_2<\ldots$ of [s, ∞] let us define the two variables
$$
C_\Delta^\pm(s)\triangleq E\left(\sum_i \bigl(E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr)^\pm\,\Big|\,\mathcal F_s\right).
$$

¹⁵ That is X(t) is integrable for every t.
The variables $C_\Delta^\pm(s)$ are $\mathcal F_s$-measurable. Let $(\Delta_n)$ be an infinitesimal¹⁶ sequence of partitions of [s, ∞], and let us assume that $\Delta_n\subseteq\Delta_{n+1}$, that is, let us assume that we get $\Delta_{n+1}$ by adding further points to $\Delta_n$. We shall prove that the sequences $C_{\Delta_n}^\pm(s)$ are almost surely convergent and the limits are almost surely finite. First we prove that if the partition $\Delta'$ is finer than $\Delta$, then
$$
C_\Delta^\pm(s)\le C_{\Delta'}^\pm(s), \tag{5.13}
$$
which will imply the convergence. By the quasi-martingale property the set of variables $C_\Delta^\pm(s)$ is bounded in $L^1(\Omega)$. From the Monotone Convergence Theorem it is obvious that $C_{\Delta_n}^\pm(s)\nearrow\infty$ cannot hold on a set which has positive measure. To prove (5.13) let us assume that the new point t is between $t_i$ and $t_{i+1}$. Let us introduce the variables
$$
\xi\triangleq E(X(t_i)-X(t)\mid\mathcal F_{t_i}),\qquad
\eta\triangleq E(X(t)-X(t_{i+1})\mid\mathcal F_t),\qquad
\zeta\triangleq E(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i}).
$$
As $\zeta=\xi+E(\eta\mid\mathcal F_{t_i})$, by Jensen's inequality
$$
\zeta^+\le\xi^+ +\bigl(E(\eta\mid\mathcal F_{t_i})\bigr)^+\le\xi^+ +E(\eta^+\mid\mathcal F_{t_i}),
$$
hence
$$
E(\zeta^+\mid\mathcal F_s)\le E(\xi^+\mid\mathcal F_s)+E(\eta^+\mid\mathcal F_s),
$$
from which the inequality (5.13) is trivial. Let us introduce the variables
$$
C^\pm(s)\triangleq\lim_{n\to\infty}C_{\Delta_n}^\pm(s).
$$
Obviously $C^\pm(s)$ is integrable and $\mathcal F_s$-measurable. Let us observe that the variables $C_{\Delta_n}^\pm(s)$ are defined up to a measure-zero set, hence the variables $C^\pm(s)$ are also defined up to a measure-zero set. For arbitrary partitions $\Delta_n=(t_i^{(n)})$, as $X(\infty)\triangleq 0$ and as X is adapted,
$$
C_{\Delta_n}^+(s)-C_{\Delta_n}^-(s)=E\left(\sum_i E\bigl(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}}\bigr)\,\Big|\,\mathcal F_s\right)=\sum_i E\bigl(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_s\bigr)=
$$
$$
=E(X(s)\mid\mathcal F_s)-E(X(\infty)\mid\mathcal F_s)\stackrel{\text{a.s.}}{=}X(s).
$$

¹⁶ As the length of [s, ∞] is infinite, this property means that we map [0, ∞] order-preservingly onto [0, 1] and then require that $(\Delta_n)_n$ be infinitesimal on [0, 1].
This remains valid after we take the limit, hence for all s
$$
C^+(s)-C^-(s)\stackrel{\text{a.s.}}{=}X(s). \tag{5.14}
$$
Let us assume that t is in $\Delta_n$ for all n. As s < t,
$$
E\bigl(C_{\Delta_n}^\pm(t)\mid\mathcal F_s\bigr)=E\left(\sum_{t_i^{(n)}\ge t}\bigl(E(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}})\bigr)^\pm\,\Big|\,\mathcal F_s\right)\le
$$
$$
\le E\left(\sum_i\bigl(E(X(t_i^{(n)})-X(t_{i+1}^{(n)})\mid\mathcal F_{t_i^{(n)}})\bigr)^\pm\,\Big|\,\mathcal F_s\right)=C_{\Delta_n}^\pm(s),
$$
from which, taking the limit and using the Monotone Convergence Theorem for the conditional expectation,
$$
E\bigl(C^\pm(t)\mid\mathcal F_s\bigr)\le C^\pm(s). \tag{5.15}
$$
Let $(\Delta_n)$ be an infinitesimal sequence of partitions of [0, ∞]. Let S be the union of the points in $(\Delta_n)$. Obviously S is dense in $\mathbb R_+$. By the above, $C^\pm$ are supermartingales on S. As S is countable, on S one can define the trajectories of $C^\pm$ up to a measure-zero set. By the supermartingale property, except on a measure-zero set N, for every t the limits
$$
D^\pm(t,\omega)\triangleq C^\pm(t+,\omega)\triangleq\lim_{s\searrow t,\,s\in S}C^\pm(s,\omega)
$$
exist and $D^\pm(t)$ is right-regular. X is also right-regular, hence from (5.14) on $N^c$ for every t ≥ 0
$$
D^+(t)-D^-(t)=X(t).
$$
$D^\pm(t)$ is $\mathcal F_{t+1/n}$-measurable for all n, hence $D^\pm(t)$ is $\mathcal F_{t+}$-measurable. As F satisfies the usual conditions, $D^\pm(t)$ is $\mathcal F_t$-measurable, that is, the processes $D^\pm$ are adapted. If $s_n\searrow t$ and $s_n\in S$, then the sequence $(C^\pm(s_n))$ is a reversed supermartingale. Hence for the $L^1(\Omega)$ convergence of $(C^\pm(s_n))$ it is necessary and sufficient that the sequence is bounded in $L^1(\Omega)$. By the supermartingale property, as $(s_n)$ is decreasing, the expected value of $(C^\pm(s_n))_n$ is increasing. By the quasi-martingale property the variables $C^\pm(0)$ are integrable, hence by the non-negativity the sequences $(C^\pm(s_n))$ are bounded in $L^1(\Omega)$. Hence they are convergent in $L^1(\Omega)$. From this $D^\pm(t)$ is integrable for all t. The conditional expectation is continuous in $L^1(\Omega)$, therefore one can take the limit in (5.15) inside the conditional expectation. Hence the processes $D^\pm$ are integrable supermartingales on $\mathbb R_+$.

Corollary 5.14 Every quasi-martingale is a semimartingale.
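The conditional variation (5.12) can be computed exactly in a finite discrete model. The following sketch is our own illustration (the two-step binomial model and the drift are not from the text): it checks the telescoping identity used in the proof above, namely that for a supermartingale Y the conditional increments are non-negative and sum in expectation to E(Y(0)) − E(Y(∞)).

```python
from itertools import product

# Finite two-step model: Omega = {-1, +1}^2, uniform probability.
paths = list(product([-1, 1], repeat=2))

def Y(path, i):
    # An illustrative supermartingale: a random walk minus the drift i/2.
    return sum(path[:i]) - 0.5 * i

def cond_exp(f, i):
    # E(f | F_i): average f over all paths sharing the first i coordinates.
    out = {}
    for p in paths:
        atom = [q for q in paths if q[:i] == p[:i]]
        out[p] = sum(f(q) for q in atom) / len(atom)
    return out

# Conditional variation of Y along the partition 0 < 1 < 2, as in (5.12);
# for a supermartingale the absolute values are redundant.
var = 0.0
for i in range(2):
    ce = cond_exp(lambda q, i=i: Y(q, i) - Y(q, i + 1), i)
    var += sum(abs(ce[p]) for p in paths) / len(paths)

EY0 = sum(Y(p, 0) for p in paths) / len(paths)
EY2 = sum(Y(p, 2) for p in paths) / len(paths)
print(var, EY0 - EY2)  # prints: 1.0 1.0
```

As the proof shows, refining the partition can only increase this quantity, and for a supermartingale it is capped by E(Y(0)) − E(Y(∞)) regardless of the partition.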
5.2
Semimartingales as Good Integrators
The definition of the semimartingales is quite artificial. In this section we present an important characterization of the semimartingales. We shall prove that the only class of integrators for which one can define a stochastic integral with reasonable properties is the class of the semimartingales. Recall the following definition:

Definition 5.15 Process E is a predictable step process if
$$
E=\sum_{i=0}^n\xi_i\chi\bigl((t_i,t_{i+1}]\bigr),
$$
where $0=t_0<t_1<\ldots<t_{n+1}$ and the $\xi_i$ are $\mathcal F_{t_i}$-measurable random variables.

If X is an arbitrary process then the only reasonable definition of the stochastic integral E • X is¹⁷
$$
(E\bullet X)(t)=\sum_i\xi_i\bigl(X(t_{i+1}\wedge t)-X(t_i\wedge t)\bigr).
$$
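The formula for (E • X)(t) translates directly into code. A minimal sketch for a single trajectory (function names are our own, and the deterministic trajectory is purely illustrative):

```python
def elementary_integral(t, times, xi, X):
    """(E . X)(t) for the predictable step process
    E = sum_i xi[i] * chi((times[i], times[i+1]]),
    where X is one trajectory given as a function of time."""
    total = 0.0
    for i, x in enumerate(xi):
        a, b = times[i], times[i + 1]
        # Each step contributes xi_i * (X(t_{i+1} ^ t) - X(t_i ^ t)).
        total += x * (X(min(b, t)) - X(min(a, t)))
    return total

# Sanity check: integrating E = 1 on (0, 2] against X(t) = t^2
# reproduces X(t) - X(0) for t <= 2, and X(2) - X(0) afterwards.
val = elementary_integral(1.5, [0.0, 2.0], [1.0], lambda s: s * s)
print(val)  # prints: 2.25
```

Note that the integral is capped at the right endpoint of the last interval, reflecting that (E • X)(t) is by definition the integral on (0, t].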
For an arbitrary stochastic process X this definition obviously makes the integral linear over the linear space of the predictable step processes. On the other hand, it is reasonable to say that a linear mapping is an integral only if the correspondence has some continuity property. Let us define the topology of uniform convergence in (t, ω) among the predictable step processes, and let us equip the random variables with the topology of stochastic convergence.

Definition 5.16 We say that the process X is a good integrator if for every t the correspondence E ↦ (E • X)(t) is a continuous, linear mapping from the space of predictable step processes to the set of random variables.

Observe that the required continuity property is very weak, as on the domain of definition we have a very strong, and on the image space a very weak, topology. As the integral is linear, it is continuous if and only if it is continuous at E = 0. This means that if a sequence of step processes converges uniformly to zero then for any t the integral on the interval (0, t] converges stochastically to zero.

¹⁷ See: Theorem 2.88, page 174, line (4.11), page 252. Recall that by definition (E • X)(t) is the integral on (0, t].
Theorem 5.17 (Bichteler–Dellacherie) An adapted, right-regular process X is a semimartingale if and only if it is a good integrator.

Proof. If X is a semimartingale, then by the Dominated Convergence Theorem it is obviously a good integrator¹⁸. Hence we have to prove only the other direction. We split the proof into several steps.

1. As a first step let us separate the 'big jumps' of X, that is, let us separate from X the jumps which are larger than one. By the assumptions of the theorem the trajectories of X are regular, so the 'big jumps' do not have an accumulation point. Hence the decomposition is meaningful. From this it trivially follows that the process
$$
\sum_{s\le t}\Delta X(s)\chi\bigl(|\Delta X(s)|\ge 1\bigr)
$$
has finite variation. As the continuity property of the good integrators holds for processes with finite variation,
$$
Y\triangleq X-\sum_{s\le t}\Delta X(s)\chi\bigl(|\Delta X(s)|\ge 1\bigr)
$$
is also a good integrator. If we prove that Y is a semimartingale, then we obviously prove that X is a semimartingale as well. Y does not contain 'big jumps', hence if it is a semimartingale, then it is a special semimartingale¹⁹. Therefore the decomposition of Y is unique²⁰. As the decomposition is unique, it is sufficient to prove that Y is a semimartingale on every interval [0, t].

2. As we have already seen²¹, if the probability measures P and Q are equivalent, that is, the measure-zero sets under P and Q are the same, then X is a semimartingale under P if and only if it is a semimartingale under Q. Therefore it is sufficient to prove that if X is a good integrator under P then one can find a probability measure Q which is equivalent to P and under which X is a semimartingale. Observe that a sequence of random variables is stochastically convergent to some random variable if and only if any subsequence of the original sequence has a further subsequence which is almost surely convergent to the same function. Therefore the stochastic convergence depends only on the collection of measure-zero sets, which does not change during an equivalent change of measure. From this it is obvious that the class of good integrators does not change under an equivalent change of measure.

3. Let us fix an interval [0, t]. As the trajectories of X are regular, the trajectories are bounded on any finite interval. Hence $\eta\triangleq\sup_{s\le t}|X(s)|<\infty$. Again by the regularity of the trajectories it is sufficient to calculate the supremum over the rational points s ≤ t. Therefore η is a random variable. Let $A_m\triangleq\{m\le\eta<m+1\}$ and $\zeta\triangleq\sum_m 2^{-m}\chi_{A_m}$. ζ is evidently bounded, and as

¹⁸ See: Lemma 2.12, page 118.
¹⁹ See: Example 4.47, page 258.
²⁰ See: Corollary 3.41, page 205.
²¹ See: Corollary 4.58, page 271.
η is finite, ζ is trivially positive. As
$$
E(\eta\zeta)=\sum_m E\bigl(\eta 2^{-m}\chi(m\le\eta<m+1)\bigr)\le\sum_m(m+1)2^{-m}<\infty,
$$
it is obvious that ηζ is integrable under P. Hence
$$
R(A)\triangleq\frac{1}{E(\zeta)}\int_A\zeta\,dP
$$
is a probability measure, and as ζ is positive it is equivalent to P. For every s ≤ t
$$
\int_\Omega|X(s)|\,dR\le\int_\Omega\eta\,dR=\frac{1}{E(\zeta)}\int_\Omega\eta\zeta\,dP<\infty,
$$
therefore X(s) is integrable under R for all s. To make the notation simple we assume that the X(s) are already integrable under P for all s ∈ [0, t].

4. Let us define the set
$$
B\triangleq\{(E\bullet X)(t) : |E|\le 1,\ E\in\mathcal E\}, \tag{5.16}
$$
where $\mathcal E$ is the set of predictable step processes over [0, t]. Using the continuity property of the good integrators we prove that B is stochastically bounded, that is, for every ε > 0 there is a number k such that P(|η| ≥ k) < ε for all η ∈ B. If it were not true then there would be an ε > 0, a sequence of step processes $|E_n|\le 1$ and $k_n\nearrow\infty$, such that
$$
P\left(\frac{|(E_n\bullet X)(t)|}{k_n}\ge 1\right)\ge\varepsilon.
$$
The sequence $(E_n/k_n)$ converges uniformly to zero, hence by the continuity property of the good integrators
$$
\frac{(E_n\bullet X)(t)}{k_n}=\left(\frac{E_n}{k_n}\bullet X\right)(t)\xrightarrow{P}0,
$$
which, by the indirect assumption, is not true.

5. As a last step of the proof, in the next point we shall prove that for every non-empty, stochastically bounded, convex subset B of $L^1$ there is a probability measure Q which is equivalent to P and for which
$$
\sup\left\{\int_\Omega\beta\,dQ : \beta\in B\right\}\triangleq c<\infty. \tag{5.17}
$$
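The normalising density ζ of step 3 is easy to test numerically: even when η has infinite mean, E(ηζ) stays below the deterministic bound $\sum_m(m+1)2^{-m}=4$, whatever the law of η. A minimal sketch (the Pareto-like tail for η is our own illustrative choice, not from the text):

```python
import random

random.seed(0)

# eta: a heavy-tailed positive variable with infinite mean (Pareto, alpha = 1).
samples = [1.0 / (1.0 - random.random()) for _ in range(100_000)]

def zeta(eta):
    # zeta = 2^(-m) on the event {m <= eta < m+1}.
    return 2.0 ** (-int(eta))

# Each term eta * zeta(eta) is at most (m+1) * 2^(-m), so the sample mean of
# eta * zeta must stay below sum_m (m+1) 2^(-m) = 4, although E(eta) = infinity.
mean = sum(e * zeta(e) for e in samples) / len(samples)
print(mean)  # stays well below the bound 4
```

The point of the construction is exactly this damping: the equivalent measure R built from ζ makes the originally non-integrable η integrable.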
From this the theorem follows, as for every partition $0=t_0<t_1<\ldots<t_{n+1}=t$ of [0, t], if²²
$$
\xi_i\triangleq\operatorname{sgn}\bigl(E^Q(X(t_{i+1})-X(t_i)\mid\mathcal F_{t_i})\bigr)\qquad\text{and}\qquad E\triangleq\sum_i\xi_i\chi\bigl((t_i,t_{i+1}]\bigr),
$$
then, as |E| ≤ 1, (E • X)(t) ∈ B, therefore
$$
c\ge E^Q\bigl((E\bullet X)(t)\bigr)=\sum_{i=0}^n E^Q\bigl(\xi_i[X(t_{i+1})-X(t_i)]\bigr)=\sum_{i=0}^n E^Q\Bigl(E^Q\bigl(\xi_i[X(t_{i+1})-X(t_i)]\mid\mathcal F_{t_i}\bigr)\Bigr)=
$$
$$
=\sum_{i=0}^n E^Q\Bigl(\xi_i E^Q\bigl(X(t_{i+1})-X(t_i)\mid\mathcal F_{t_i}\bigr)\Bigr)=E^Q\left(\sum_{i=0}^n\bigl|E^Q(X(t_i)-X(t_{i+1})\mid\mathcal F_{t_i})\bigr|\right).
$$
Hence X is a quasi-martingale under Q. Therefore²³ it is a semimartingale under Q.

6. Let $B\subseteq L^1(\Omega)$ be a non-empty, stochastically bounded, convex set²⁴. We prove the existence of the equivalent measure Q in (5.17) with the Hahn–Banach theorem. Let $L^\infty_+$ denote the set of non-negative functions in $L^\infty$. Let
$$
H\triangleq\left\{\zeta\in L^\infty_+ : \sup\left\{\int_\Omega\beta\zeta\,dP : \beta\in B\right\}<\infty\right\}.
$$

²² That is, for every ε > 0 there is a k(ε) such that P(|β| ≥ k(ε)) ≤ ε for all β ∈ B.
²³ See: Corollary 5.14.
It is sufficient to prove that H contains a strictly positive function $\zeta_0$, since in this case
$$
Q(A)\triangleq\frac{1}{E(\zeta_0)}\int_A\zeta_0\,dP
$$
is an equivalent probability measure for which (5.17) holds. Let $\mathcal G$ be the collection of the sets of positivity of the functions in H. The collection $\mathcal G$ is closed under countable union: if $\zeta_n\in H$ and
$$
\sup\left\{\int_\Omega\beta\zeta_n\,dP : \beta\in B\right\}\le c_n,\qquad c_n\ge 1,
$$
then
$$
\sum_n\frac{2^{-n}}{c_n\|\zeta_n\|_\infty}\,\zeta_n\in H.
$$
Using the lattice property of $\mathcal G$ in the usual way one can prove that $\mathcal G$ contains a set D which has maximal measure, that is, P(G) ≤ P(D) for all $G\in\mathcal G$. Of course to D there is a $\zeta_D\in H$. We should prove that P(D) = 1, since in this case $\zeta_D$, as an equivalence class, is strictly positive. Let us denote by C the complement of D. We shall prove that P(C) = 0. As an indirect assumption let us assume that
$$
P(C)\triangleq\varepsilon>0. \tag{5.18}
$$
As B is stochastically bounded, to our ε > 0 in (5.18) there is a k such that P(β ≥ k) ≤ ε/2 for every random variable β ∈ B. From this $\theta\triangleq 2k\chi_C\notin B$. Of course, if ϑ ≥ 0, then P(θ + ϑ ≥ k) ≥ ε, hence θ + ϑ ∉ B, that is, $\theta\notin B-L^1_+$. We can prove a bit more: θ is not even in the $L^1(\Omega)$-closure of the convex²⁵ set $B-L^1_+$, that is,
$$
\theta\notin\operatorname{cl}\bigl(B-L^1_+\bigr).
$$
If $\gamma_n\triangleq\beta_n-\vartheta_n\to\theta$ in $L^1(\Omega)$, then $\gamma_n\to\theta$ in probability, but if δ is small enough, then as $\vartheta_n\ge 0$
$$
P(|\gamma_n-\theta|>\delta)=P(|\beta_n-\vartheta_n-\theta|>\delta)\ge P(\{\beta_n<k\}\cap\{\theta\ge 2k\})=P(\{\beta_n<k\}\cap C)=P(C\setminus\{\beta_n\ge k\})\ge\frac{\varepsilon}{2},
$$

²⁵ B is convex, hence $B-L^1_+$ is also convex.
which is impossible. By the Hahn–Banach theorem²⁶ there is a ζ ≠ 0, $\zeta\in L^\infty(\Omega)$, such that
$$
\sup\left\{\int_\Omega(\beta-\vartheta)\zeta\,dP : \beta\in B,\ \vartheta\in L^1_+\right\}<\int_\Omega\theta\zeta\,dP.
$$
$$
\sup_{i,j\ge m}P\bigl(d(Z_i(c),Z_j(c))>2^{-k}\bigr)<2^{-k}.
$$
As we observed, the real valued functions $d(Z_i(c,\omega),Z_j(c,\omega))$ are measurable in (c, ω), therefore by Fubini's theorem the probability in the formula depends on c in a measurable way. Hence $n_k$ is a measurable function of c. Let us define the 'stopped variables'
$$
Y_k(c,t,\omega)\triangleq Z_{n_k(c)}(c,t,\omega).
$$
For every open set G
$$
\{Y_k\in G\}=\cup_p\{n_k=p,\ Z_p\in G\},
$$
therefore $Y_k$ is also product measurable. For all c
$$
\sum_k\sup_{i,j\ge k}P\bigl(d(Y_i(c),Y_j(c))>2^{-k}\bigr)\le\sum_k 2^{-k}<\infty,
$$
hence for every c, by the Borel–Cantelli lemma, if the indexes i, j are big enough then, except on a measure-zero set ω ∈ N(c),
$$
d(Y_i(c,\omega),Y_j(c,\omega))\le 2^{-k}.
$$
D[0, ∞) is complete, hence $(Y_i(c,\omega))$ is almost surely convergent in D[0, ∞) for all c. The function
$$
Z(c,t,\omega)\triangleq\begin{cases}\lim_i Y_i(c,t,\omega)&\text{if the limit exists,}\\ 0&\text{otherwise}\end{cases}
$$
is product measurable and Z is right-regular almost surely for all c. For an arbitrary c, $(Y_i(c)-Z(c))$ is a subsequence of $(Z_n(c)-Z(c))$, therefore it is stochastically convergent in D[0, ∞). The measure is finite, therefore for metric space valued random variables the almost sure convergence implies the stochastic convergence. Hence Z(c, ω) is the limit of the sequence $(Z_n(c,\omega))$ for almost all ω.

Returning to the proof of the proposition, let us assume that $H_n\in S$ and $0\le H_n\nearrow H$, where H is bounded. By the Dominated Convergence
Theorem $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$ for every c. Hence by the lemma H • X has a $(\mathcal C\times\mathcal B(\mathbb R_+)\times\mathcal A)$-measurable version. That is, H ∈ S. Hence the proposition is valid for bounded processes. If H is not bounded, then let $H_n\triangleq H\chi(|H|\le n)$. The processes $H_n$ are also $(\mathcal C\times\mathcal G)$-measurable, and of course they are bounded. Therefore the processes $H_n\bullet X$ have the stated version. By the Dominated Convergence Theorem $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$ for every c. By the lemma this means that H(c) • X also has a measurable version.

Theorem 5.25 (Fubini's theorem for bounded integrands) Let X be a semimartingale, and let $(C,\mathcal C,\mu)$ be an arbitrary finite measure space. Let H(c, t, ω) be a function measurable with respect to the product σ-algebra $\mathcal C\times\mathcal G$. Let us denote by (H • X)(c) the product measurable version of the parametric integral c ↦ H(c) • X. If H(c, t, ω) is bounded, then
$$
\int_C(H\bullet X)(c)\,d\mu(c)=\left(\int_C H(c)\,d\mu(c)\right)\bullet X, \tag{5.23}
$$
that is, the integral of the parametric stochastic integral on the left side is indistinguishable from the stochastic integral on the right side.

Proof. It is not a big surprise that the proof is built on the Monotone Class Theorem again.

1. By the Fundamental Theorem of Local Martingales the semimartingale X has a decomposition X(0) + V + L, where $V\in\mathcal V$ and $L\in\mathcal H^2_{loc}$. For $V\in\mathcal V$ one can prove the equality by the classical theorem of Fubini, hence one can assume that $X\in\mathcal H^2_{loc}$. One can easily localize the right side of (5.23). On the left side one can interchange the localization and the integration with respect to c, therefore one can assume that X(0) = 0 and $X\in\mathcal H^2$. Therefore³³ we can assume that $E([X](\infty))<\infty$.

2. Let us denote by S the set of bounded, $(\mathcal C\times\mathcal G)$-measurable processes for which the theorem holds. If $H\triangleq H_1(c)H_2(t,\omega)$, where $H_1$ is a $\mathcal C$-measurable step function, $H_2$ is $\mathcal G$-measurable, and $H_1$ and $H_2$ are bounded functions, then, arguing as in the previous proposition,
$$
\int_C H\bullet X\,d\mu=\int_C\bigl(H_1(c)H_2\bigr)\bullet X\,d\mu(c)=\int_C\left(\Bigl(\sum_i\alpha_i\chi_{B_i}\Bigr)H_2\right)\bullet X\,d\mu(c)=\sum_i\alpha_i\int_C\chi_{B_i}\,(H_2\bullet X)\,d\mu(c)=
$$
$$
=\left(\int_C H_1(c)\,d\mu(c)\right)(H_2\bullet X)=\left(\left(\int_C H_1\,d\mu\right)H_2\right)\bullet X=\left(\int_C H\,d\mu\right)\bullet X,
$$
so H ∈ S.

³³ See: Proposition 3.64, page 223.
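For step integrands, (5.23) is a finite identity that can be checked pathwise, one trajectory at a time. A small sketch of this (the discrete times, the two-point parameter space C, and all names are our own illustration):

```python
# Discrete-time sketch of (5.23): X is one fixed trajectory sampled at integer
# times, C = {0, 1} carries a finite measure mu, and H(c, i) is deterministic.
X = [0.0, 1.0, -1.0, 2.0, 1.5]           # X(0..4) along one path
mu = {0: 0.5, 1: 2.0}                     # a finite measure on C
H = {0: [1.0, 0.0, 2.0, 1.0],             # H(c, i) on the interval (i, i+1]
     1: [0.0, 1.0, 1.0, 3.0]}

def integral(h, X):
    # (h . X)(T) = sum_i h[i] * (X(i+1) - X(i))
    return sum(h[i] * (X[i + 1] - X[i]) for i in range(len(h)))

# Left side of (5.23): integrate the parametric stochastic integral over C.
left = sum(mu[c] * integral(H[c], X) for c in mu)

# Right side: integrate H over C first, then against X.
Hbar = [sum(mu[c] * H[c][i] for c in mu) for i in range(4)]
right = integral(Hbar, X)

print(left, right)  # prints: 2.25 2.25
```

In this finite setting the two sides agree by mere reordering of finite sums; the content of Theorem 5.25 is that the identity survives the limit passage to general bounded integrands.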
THEOREM OF FUBINI FOR STOCHASTIC INTEGRALS
3. By the Monotone Class Theorem one should prove that S is a λ-system. Let $H_n\in S$ and let $0\le H_n\nearrow H$, where H is bounded. We prove that one can take the limit in the equation
$$
\int_C(H_n\bullet X)\,d\mu=\left(\int_C H_n\,d\mu\right)\bullet X.
$$
As H is bounded and µ is finite, on the right-hand side the integrands are uniformly bounded, so one can apply the classical and the stochastic Dominated Convergence Theorems; hence on the right-hand side
$$
\left(\int_C H_n\,d\mu\right)\bullet X\xrightarrow{ucp}\left(\int_C H\,d\mu\right)\bullet X. \tag{5.24}
$$
4. Introduce the notations $Z_n\triangleq H_n\bullet X$ and $Z\triangleq H\bullet X$. One should prove that the left-hand side is also convergent, that is,
$$
\delta\triangleq\sup_t\left|\int_C Z_n(c)\,d\mu(c)-\int_C Z(c)\,d\mu(c)\right|\xrightarrow{P}0.
$$
By the inequalities of Cauchy–Schwarz and Doob
$$
E(\delta)\le E\left(\int_C\sup_t|Z_n(c)-Z(c)|\,d\mu(c)\right)\le\sqrt{\mu(C)}\,E\left(\sqrt{\int_C\sup_t|Z_n(c)-Z(c)|^2\,d\mu(c)}\right)\le
$$
$$
\le\sqrt{\mu(C)}\sqrt{E\left(\int_C\sup_t|Z_n(c)-Z(c)|^2\,d\mu(c)\right)}\le\sqrt{\mu(C)}\sqrt{4\int_C E\bigl((Z_n(c,\infty)-Z(c,\infty))^2\bigr)\,d\mu(c)}.
$$
By Itô's isometry³⁴ the last integral is
$$
\int_C E\left(\int_0^\infty(H_n-H)^2\,d[X]\right)d\mu. \tag{5.25}
$$
As µ and $E([X](\infty))$ are finite, the integrand is bounded and $H_n\to H$, by the classical Dominated Convergence Theorem (5.25) goes to zero.

³⁴ See: Proposition 2.64, page 156.
So E(δ) → 0, that is,
$$
\int_C(H_n\bullet X)\,d\mu=\int_C Z_n\,d\mu\xrightarrow{ucp}\int_C Z\,d\mu=\int_C(H\bullet X)\,d\mu. \tag{5.26}
$$
In particular,
$$
\int_C\sup_t|Z_n(c)-Z(c)|\,d\mu(c)<\infty\quad\text{a.s.}
$$
The expression
$$
\left(\int_C H_n(c)\,d\mu(c)\right)\bullet X=\int_C\bigl(H_n(c)\bullet X\bigr)\,d\mu(c)=\int_C Z_n\,d\mu
$$
is meaningful, therefore for all t and for almost all outcomes ω
$$
\int_C\bigl|(H(c)\bullet X)(t,\omega)\bigr|\,d\mu(c)=\int_C|Z(c,t,\omega)|\,d\mu(c)<\infty.
$$
Hence the left-hand side of (5.23) is meaningful for H as well. By (5.24) the right-hand side is also convergent, hence from (5.26)
$$
\left(\int_C H\,d\mu\right)\bullet X=\lim_{n\to\infty}\left(\int_C H_n\,d\mu\right)\bullet X=\lim_{n\to\infty}\int_C(H_n\bullet X)\,d\mu=\int_C(H\bullet X)\,d\mu.
$$
The just proved stochastic generalization of Fubini's theorem is sufficient for most of the applications. On the other hand one can still be interested in the unbounded case:

Theorem 5.26 (Fubini's theorem for unbounded integrands) Let X be a semimartingale and let $(C,\mathcal C,\mu)$ be a finite measure space. Let H(c, t, ω) be a $(\mathcal C\times\mathcal G)$-measurable process, and assume that the expression
$$
\|H(t,\omega)\|_2\triangleq\sqrt{\int_C H^2(c,t,\omega)\,d\mu(c)}<\infty \tag{5.27}
$$
is integrable with respect to X. Under these conditions, µ-almost surely the stochastic integral H(c) • X exists, and if (H • X)(c) denotes the measurable version of this parametric integral then
$$
\int_C(H\bullet X)(c)\,d\mu(c)=\left(\int_C H(c)\,d\mu(c)\right)\bullet X. \tag{5.28}
$$
Proof. If in the place of H one puts $H_n\triangleq H\chi(|H|\le n)$, then the equality holds by the previous theorem. As in the proof of the classical Fubini's theorem, one should take a limit on both sides of the truncated equality.

1. Let us first investigate the right-hand side of the equality. By the Cauchy–Schwarz inequality
$$
\int_C|H(c,t,\omega)|\,d\mu(c)\le\sqrt{\mu(C)}\sqrt{\int_C H^2(c,t,\omega)\,d\mu(c)}. \tag{5.29}
$$
By the assumptions µ is finite and H(c, t, ω), as a function of c, is in the space $L^2(\mu)\subseteq L^1(\mu)$, hence by the Dominated Convergence Theorem for all (t, ω)
$$
\int_C H_n(c,t,\omega)\,d\mu(c)\to\int_C H(c,t,\omega)\,d\mu(c).
$$
By the just proved inequality (5.29) the processes $\int_C H\,d\mu$ and $\int_C|H|\,d\mu$ are integrable with respect to X, hence by the Dominated Convergence Theorem for stochastic integrals
$$
\left(\int_C H_n\,d\mu\right)\bullet X\xrightarrow{ucp}\left(\int_C H\,d\mu\right)\bullet X.
$$
This means that one can take the limit on the right side of the equation.

2. Now let us investigate the left-hand side. We first prove that for almost all c the integral H(c) • X exists. Let $X\triangleq X(0)+V+L$, where $V\in\mathcal V$, $L\in\mathcal L$, be the decomposition of X for which the integral $\|H(t,\omega)\|_2\bullet X$ exists. One can assume that $V\in\mathcal V^+$. Using (5.29) and, for every trajectory, the theorem of Fubini,
$$
\int_C\int_0^t|H|\,dV\,d\mu=\int_0^t\int_C|H|\,d\mu\,dV=\int_0^t\|H\|_1\,dV\le\sqrt{\mu(C)}\int_0^t\|H\|_2\,dV<\infty.
$$
Therefore for any t, for almost every³⁵ c, the integral $\int_0^t H(c)\,dV$ is finite. Of course if the integral exists for every rational t then it exists for every t, therefore, unifying the measure-zero sets, it is easy to show that for almost all c the integral H(c) • V is meaningful. Recall that a process G is integrable with respect to the local martingale L if and only if $\sqrt{G^2\bullet[L]}\in\mathcal A^+_{loc}$. This means that $\|H\|_2\triangleq\sqrt{\int_C H^2(c)\,d\mu(c)}$ is integrable if and only if there is a localizing sequence $(\tau_n)$

³⁵ Of course with respect to µ.
for which the expected value of
$$
\sqrt{\int_0^{\tau_n}\int_C H^2(c)\,d\mu(c)\,d[L]}=\sqrt{\int_C\int_0^{\tau_n}H^2(c)\,d[L]\,d\mu(c)}
$$
is finite. By Jensen's inequality
$$
\sqrt{\int_C\int_0^{\tau_n}H^2(c)\,d[L]\,\frac{d\mu(c)}{\mu(C)}}\ge\int_C\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\,\frac{d\mu(c)}{\mu(C)}.
$$
Therefore by Fubini's theorem
$$
E\left(\int_C\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\,d\mu(c)\right)=\int_C E\left(\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}\right)d\mu(c)<\infty.
$$
Hence, except on a set $C_n$ with $\mu(C_n)=0$, the expected value of
$$
\sqrt{\int_0^{\tau_n}H^2(c)\,d[L]}
$$
is finite. Unifying the measure-zero sets $C_n$ one can easily see that $\sqrt{H^2(c)\bullet[L]}\in\mathcal A^+_{loc}$ for almost all³⁶ c, that is, for almost all c the integral H(c) • L exists.
3. If the integral H(c) • X exists, then $H_n(c)\bullet X\xrightarrow{ucp}H(c)\bullet X$. Unfortunately, as we mentioned above, from the inequality $|H_n(c)|\le|H(c)|$ the inequality $|H_n(c)\bullet X|\le|H(c)\bullet X|$ does not follow, and we do not know that H(c) • X is µ-integrable, hence one cannot use the classical Dominated Convergence Theorem for the outer integral with respect to µ. Therefore, as in the proof of the previous theorem, we prove the convergence of the left side with a direct estimation. As by the classical Fubini's theorem the theorem is obviously valid if the integrator has finite variation, one can assume that $X\in\mathcal L$.

4. Let s ≥ 0. As in the previous proof, introduce the variable
$$
\delta_n\triangleq\sup_{t\le s}\left|\int_C\bigl((H_n(c)-H(c))\bullet X\bigr)\,d\mu(c)\right|.
$$

³⁶ Of course with respect to µ.
By Davis' inequality
$$
E(\delta_n)\le E\left(\int_C\sup_{t\le s}\bigl|(H_n(c)-H(c))\bullet X\bigr|\,d\mu(c)\right)=\int_C E\left(\sup_{t\le s}\bigl|(H_n(c)-H(c))\bullet X\bigr|\right)d\mu(c)\le \tag{5.30}
$$
$$
\le K\int_C E\left(\sqrt{\bigl[(H_n(c)-H(c))\bullet X\bigr](s)}\right)d\mu=K\int_C E\left(\sqrt{(H_n(c)-H(c))^2\bullet[X](s)}\right)d\mu=
$$
$$
=\mu(C)\,K\,E\left(\int_C\sqrt{(H_n(c)-H(c))^2\bullet[X](s)}\,\frac{d\mu}{\mu(C)}\right)\le\mu(C)\,K\,E\left(\sqrt{\int_C(H_n(c)-H(c))^2\bullet[X](s)\,\frac{d\mu}{\mu(C)}}\right)=
$$
$$
=\sqrt{\mu(C)}\,K\,E\left(\sqrt{\left(\int_C(H_n(c)-H(c))^2\,d\mu\right)\bullet[X](s)}\right).
$$
$\sqrt{\int_C H^2\,d\mu}$ is integrable with respect to X, therefore
$$
\left(\int_C(H_n(c)-H(c))^2\,d\mu\right)\bullet[X]\le\left(\int_C H^2\,d\mu\right)\bullet[X]\in\mathcal A^+_{loc}.
$$
Let $(\tau_m)$ be a localizing sequence. With localization one can assume that the last expected value is finite, that is,
$$
E\left(\sqrt{\left(\int_C H^2\,d\mu\right)\bullet[X^{\tau_m}]}\right)<\infty.
$$
Applying the estimation (5.30) to $X^{\tau_m}$ and writing $\delta_n^{(m)}$ instead of $\delta_n$, by the classical Dominated Convergence Theorem $E(\delta_n^{(m)})\to 0$. Hence if m is sufficiently large then
$$
P(\delta_n>\varepsilon)\le P(\delta_n>\varepsilon,\ \tau_m>s)+P(\tau_m\le s)\le P\bigl(\delta_n^{(m)}>\varepsilon\bigr)+P(\tau_m\le s).
$$
Therefore $\delta_n\to 0$ in probability. From this point the proof of the theorem is the same as the proof of the previous one.
Corollary 5.27 (Fubini's theorem for local martingales) Let $(C,\mathcal C,\mu)$ be a finite measure space. If L is a local martingale, H(c, t, ω) is a $(\mathcal C\times\mathcal P)$-measurable function and
$$
\sqrt{\int_0^t\int_C H^2(c,s)\,d\mu(c)\,d[L](s)}\in\mathcal A^+_{loc},
$$
then
$$
\int_C\int_0^t H(c,s)\,dL(s)\,d\mu(c)=\int_0^t\int_C H(c,s)\,d\mu(c)\,dL(s). \tag{5.31}
$$
If L is a continuous local martingale, H is a $(\mathcal C\times\mathcal R)$-measurable process and
$$
P\left(\int_0^t\int_C H^2(c,s)\,d\mu(c)\,d[L](s)<\infty\right)=1,
$$
then (5.31) holds.

Corollary 5.28 (Fubini's theorem for Wiener processes) Let $(C,\mathcal C,\mu)$ be a finite measure space. If w is a Wiener process, H(c, t, ω) is an adapted, product measurable process and
$$
P\left(\int_0^t\int_C H^2(c,s)\,d\mu(c)\,ds<\infty\right)=1,
$$
then
$$
\int_C\int_0^t H(c,s)\,dw(s)\,d\mu(c)=\int_0^t\int_C H(c,s)\,d\mu(c)\,dw(s).
$$

5.5
Martingale Representation
Let $\mathcal H^p_0$ denote the space of $\mathcal H^p$ martingales which are zero at time zero. Recall that by definition the martingales M and N are orthogonal if their product MN is a local martingale. This is equivalent to the condition that the quadratic variation [M, N] is a local martingale. This implies that if M and N are orthogonal then $M^\tau$ and N are also orthogonal for every stopping time τ. The topology in the spaces $\mathcal H^p_0$ is given by the norm $\|\sup_t|M(t)|\|_p$. The basic message of the Burkholder–Davis inequality is that this norm is equivalent to the norm
$$
\|M\|_{\mathcal H^p_0}\triangleq\Bigl\|\sqrt{[M](\infty)}\Bigr\|_p. \tag{5.32}
$$
In this section we shall use this norm. Observe that if p ≥ 1 then $\mathcal H^p_0$ is a Banach space.
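The identity underlying the equivalence in (5.32) can be checked exactly for a discrete martingale with orthogonal increments, where $E(M(n)^2)=E([M](n))=E(\sum_i\Delta M_i^2)$. A sketch on the full path space of n coin tosses (the setting is our own illustration, not from the text):

```python
from itertools import product

n = 4
# All 2^n equally likely +-1 increment paths of a simple random walk M.
paths = list(product([-1, 1], repeat=n))

# E(M(n)^2) computed by enumerating every path.
E_M2 = sum(sum(p) ** 2 for p in paths) / len(paths)
# E([M](n)) = E(sum of squared increments).
E_QV = sum(sum(dx * dx for dx in p) for p in paths) / len(paths)
print(E_M2, E_QV)  # prints: 4.0 4.0
```

Both expectations equal n because the cross terms $E(\Delta M_i\Delta M_j)$, i ≠ j, vanish; the Burkholder–Davis inequality extends this second-moment identity to an equivalence of the two norms for general p.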
Definition 5.29 Let 1 ≤ p < ∞. We say that the closed, linear subspace $\mathcal X$ of $\mathcal H^p_0$ is stable if it is stable under truncation, that is, if $X\in\mathcal X$ then $X^\tau\in\mathcal X$ for every stopping time τ. If $\mathcal X$ is a subset of $\mathcal H^p_0$ then we shall denote by $\mathrm{stable}_p(\mathcal X)$ the smallest closed linear subspace of $\mathcal H^p_0$ which is closed under truncation and contains $\mathcal X$.

Obviously $\mathcal H^p_0$ is a stable subspace. The intersection of stable subspaces is also stable, hence $\mathrm{stable}_p(\mathcal X)$ is meaningful for every $\mathcal X\subseteq\mathcal H^p_0$. To make the notation as simple as possible, if the subscript p is not important we shall drop it and instead of $\mathrm{stable}_p(\mathcal X)$ we shall simply write $\mathrm{stable}(\mathcal X)$.

Lemma 5.30 Let 1 ≤ p < ∞ and let $\mathcal X\subseteq\mathcal H^p_0$. Let N be a bounded martingale. If N is orthogonal to $\mathcal X$ then N is orthogonal to $\mathrm{stable}(\mathcal X)$.

Proof. Let us denote by $\mathcal Y$ the set of $\mathcal H^p_0$-martingales which are orthogonal to N. Of course $\mathcal X\subseteq\mathcal Y$, so it is sufficient to prove that $\mathcal Y$ is a stable subspace of $\mathcal H^p_0$. As we remarked, $\mathcal Y$ is closed under stopping. Let $M_n\in\mathcal Y$ and let $M_n\to M_\infty$ in $\mathcal H^p_0$. As N is bounded, $M_nN$ is a local martingale which is in class D. Hence it is a uniformly integrable martingale. So $E((M_nN)(\tau))=0$ for every stopping time τ. Let k < ∞ be an upper bound of N. Then
$$
|E((M_\infty N)(\tau))|=|E((M_\infty N)(\tau))-E((M_nN)(\tau))|\le E\bigl(|((M_\infty-M_n)N)(\tau)|\bigr)\le
$$
$$
\le k\cdot E\bigl(|(M_\infty-M_n)(\tau)|\bigr)\le k\cdot E\left(\sqrt{[M_\infty-M_n](\infty)}\right)\le k\cdot\|M_\infty-M_n\|_{\mathcal H^p_0}\to 0.
$$
So $M_\infty N$ is also a martingale. Hence $\mathcal Y\triangleq\{X\in\mathcal H^p_0 : X\perp N\}$ is closed in $\mathcal H^p_0$.

Definition 5.31 Let 1 ≤ p < ∞. We say that the subset $\mathcal X\subseteq\mathcal H^p_0$ has the Martingale Representation Property if $\mathcal H^p_0=\mathrm{stable}(\mathcal X)$.

Recall that we have fixed a stochastic base $(\Omega,\mathcal A,P,\mathbf F)$.

Definition 5.32 Let 1 ≤ p < ∞. Let us say that the probability measure Q on $(\Omega,\mathcal A)$ is an $\mathcal H^p_0$-measure of the subset $\mathcal X\subseteq\mathcal H^p_0$ if
1. Q ≪ P,
2. Q = P on $\mathcal F_0$,
3. if $M\in\mathcal X$ then M is in $\mathcal H^p_0$ under Q as well.
$\mathcal M_p(\mathcal X)$ will denote the set of $\mathcal H^p_0$-measures of $\mathcal X$.
Lemma 5.33 $\mathcal M_p(\mathcal X)$ is always convex.

Proof. If $Q_1,Q_2\in\mathcal M_p(\mathcal X)$, 0 ≤ λ ≤ 1 and $Q_\lambda\triangleq\lambda Q_1+(1-\lambda)Q_2$, then for every $M\in\mathcal X$
$$
E^{Q_\lambda}\left(\sup_t|M(t)|^p\right)=\lambda E^{Q_1}\left(\sup_t|M(t)|^p\right)+(1-\lambda)E^{Q_2}\left(\sup_t|M(t)|^p\right)<\infty.
$$
If $F\in\mathcal F_s$ and t > s then by the martingale property under $Q_1$ and $Q_2$
$$
\int_F M(t)\,dQ_\lambda=\lambda\int_F M(t)\,dQ_1+(1-\lambda)\int_F M(t)\,dQ_2=\lambda\int_F M(s)\,dQ_1+(1-\lambda)\int_F M(s)\,dQ_2=\int_F M(s)\,dQ_\lambda.
$$
Hence M is in $\mathcal H^p_0$ under $Q_\lambda$.

Definition 5.34 If C is a convex set and x ∈ C, then we say that x is an extremal point of C if whenever u, v ∈ C and x = λu + (1 − λ)v for some 0 ≤ λ ≤ 1, then x = u or x = v.

Proposition 5.35 Let 1 ≤ p < ∞ and let $\mathcal X\subseteq\mathcal H^p_0$. If $\mathcal X$ has the Martingale Representation Property then P is an extremal point of $\mathcal M_p(\mathcal X)$.

Proof. Assume that P = λQ + (1 − λ)R, where 0 ≤ λ ≤ 1 and $Q,R\in\mathcal M_p(\mathcal X)$. As R ≥ 0, obviously Q ≪ P, so one can define the Radon–Nikodym derivative $L(\infty)\triangleq dQ/dP\in L^1(\Omega,P,\mathcal F_\infty)$. Define the martingale
$$
L(t)\triangleq E\bigl(L(\infty)\mid\mathcal F_t\bigr).
$$
From the definition of the conditional expectation
$$
\int_F L(t)\,dP=\int_F L(\infty)\,dP=Q(F),\qquad F\in\mathcal F_t,
$$
so L(t) is the Radon–Nikodym derivative of Q with respect to P on the measure space $(\Omega,\mathcal F_t)$. Let $X\in\mathcal X$. If s < t and $F\in\mathcal F_s$ then, as X is a
martingale under Q,
$$
\int_F X(t)L(t)\,dP=\int_F X(t)\,\frac{dQ}{dP}\,dP=\int_F X(t)\,dQ=\int_F X(s)\,dQ=\int_F X(s)L(s)\,dP,
$$
so XL is a martingale under P. Obviously Q ≤ P/λ, so 0 ≤ L ≤ 1/λ. Hence L is uniformly bounded. L(0) is bounded and $\mathcal F_0$-measurable, so X · L(0) is a martingale. This implies that X · (L − L(0)) is also a martingale under P, that is, X and L − L(0) are orthogonal as local martingales. That is, L − L(0) is orthogonal to $\mathcal X$. Hence by the previous lemma L − L(0) is orthogonal to $\mathrm{stable}(\mathcal X)$. As $\mathcal X$ has the Martingale Representation Property, L − L(0) is orthogonal to $\mathcal H^p_0$. As L − L(0) is bounded, $L-L(0)\in\mathcal H^p_0$. But this means³⁷ that L − L(0) = 0. By definition Q and P are equal on $\mathcal F_0$, hence L(∞) = L(0) = 1. Hence P = Q.

Now we want to prove the converse statement for p = 1. Let P be an extremal point of $\mathcal M_p(\mathcal X)$ and assume that $\mathcal X$ does not have the Martingale Representation Property, that is, $\mathrm{stable}(\mathcal X)\ne\mathcal H^p_0$. As $\mathrm{stable}(\mathcal X)$ is a closed linear space, by the Hahn–Banach theorem there is a non-zero linear functional L for which
$$
L\bigl(\mathrm{stable}(\mathcal X)\bigr)=0. \tag{5.33}
$$
Assume temporarily that L has the following representation: there is a locally bounded local martingale N such that
$$
L(M)=E\bigl([M,N](\infty)\bigr),\qquad M\in\mathcal H^p_0. \tag{5.34}
$$
$\mathrm{stable}(\mathcal X)$ is closed under truncation, hence for every stopping time τ
$$
E\bigl([M,N^\tau](\infty)\bigr)=E\bigl([M,N]^\tau(\infty)\bigr)=E\bigl([M^\tau,N](\infty)\bigr)=L(M^\tau)=0
$$
whenever $M\in\mathrm{stable}(\mathcal X)$. Hence instead of N we can use $N^\tau$. As N is locally bounded, we can assume that N is a uniformly bounded martingale. Instead of N we can also write N − N(0), so one can assume that N(0) = 0. Let |N| ≤ c. If
$$
dQ\triangleq\left(1-\frac{N(\infty)}{2c}\right)dP,\qquad dR\triangleq\left(1+\frac{N(\infty)}{2c}\right)dP,
$$
then Q and R are non-negative measures. As N is a bounded martingale,
$$
E(N(\infty))=E(N(0))=E(0)=0,
$$

³⁷ See: Proposition 4.4, page 228.
so Q and R are probability measures and obviously P = (Q + R) /2. If X ∈ X then
p
p
sup |X(s)| dQ =
sup |X(s)|
s
Ω
Ω
s
N (∞) 1− 2c
dP ≤
p
≤2
sup |X(s)| dP < ∞. Ω
s
If s < t and F ∈ Fs then
N (∞) X(t) 1 − dP = 2c F 1 = X(t)dP − X (t) N (∞) dP = 2c F F 1 = X (s) dP − X (t) N (∞) dP. 2c F F
X (t) dQ F
As F ∈ Fs σ(ω)
if ω ∈ F if ω ∈ /F
s ∞
is a stopping time. As s ≤ t τ (ω)
t if ω ∈ F ∞ if ω ∈ /F
is also a stopping time. Hence X τ , X σ ∈ stable(X ), so
X τ − X s = X t − X s χF ∈ stable(X ).
(5.35)
Obviously H₀ᵖ ⊆ H₀¹ if p ≥ 1, so

|MN| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

As N is bounded, obviously³⁸ N ∈ H₀^q. Hence by the Kunita–Watanabe inequality, using also Hölder's inequality,

|[M, N]| ≤ √([M](∞)) √([N](∞)) ∈ L¹(Ω).

³⁸ Recall the definition of the Hᵖ spaces! See: (5.32) on page 328. Implicitly we have used the Burkholder–Davis inequality.
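In discrete form the Kunita–Watanabe bound is just the Cauchy–Schwarz inequality applied to the sequences of increments; a small numerical sketch (the increment data are made up for illustration):

```python
import math

# Hypothetical increment sequences of two discrete "martingales" M and N.
dM = [0.5, -1.2, 0.3, 0.9, -0.4]
dN = [1.0, 0.2, -0.7, 0.1, 0.6]

# Total variation of the discrete bracket [M, N]: sum of |dM_k * dN_k| ...
var_MN = sum(abs(a * b) for a, b in zip(dM, dN))

# ... is dominated by sqrt([M](inf)) * sqrt([N](inf)), as in the text.
bound = math.sqrt(sum(a * a for a in dM)) * math.sqrt(sum(b * b for b in dN))

assert var_MN <= bound + 1e-12
```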
MARTINGALE REPRESENTATION
By this, MN − [M, N] is a class D local martingale, hence it is a uniformly integrable martingale³⁹. Hence

E(M(∞)N(∞)) = E(M(∞)N(∞)) − L(M) = E(M(∞)N(∞) − [M, N](∞)) = E(M(0)N(0) − [M, N](0)) = 0,

so by (5.35)

E(N(∞) χ_F X^t(∞)) = E(N(∞) χ_F X^s(∞)).

Therefore

∫_F X(t)N(∞) dP = ∫_F X(s)N(∞) dP.

Hence X is a martingale under Q. This implies that Q ∈ Mᵖ(X). In a similar way R ∈ Mᵖ(X), which is a contradiction. So one should only prove that if stable(X) ≠ H₀ᵖ then there is a locally bounded local martingale N for which (5.33) and (5.34) hold. It is easy to see that if p > 1 then the dual of H₀ᵖ is H₀^q, where of course 1/p + 1/q = 1. The H₀^q martingales need not be locally bounded⁴⁰, so the argument above is not valid if p > 1. Assume that p = 1.

Proposition 5.36 If L is a continuous linear functional over H₀¹ then (5.34) holds, that is, for some locally bounded local martingale N

L(M) = E([M, N](∞)),    M ∈ H₀¹.
Proof. Obviously H₀² ⊆ H₀¹ and ‖M‖_{H¹} ≤ ‖M‖_{H²}, so if c = ‖L‖ then

|L(M)| ≤ c ‖M‖_{H₀¹} ≤ c ‖M‖_{H₀²},

so L is a continuous linear functional over H₀².

1. H₀² is a Hilbert space, so for some N ∈ H₀²

L(M) = E(M(∞)N(∞)),    M ∈ H₀².

Let M ∈ H₀². From the Kunita–Watanabe inequality⁴¹

|[M, N]| ≤ √([M]) √([N]) ≤ √([M](∞)) √([N](∞)) ∈ L¹(Ω).

³⁹ See: Example 1.144, page 102.
⁴⁰ One can easily modify Example 1.138, on page 96, to construct a counter-example.
⁴¹ Observe that we used again that the two definitions of the H₀² spaces are equivalent.
Also, as M, N ∈ H₀²,

|(MN)(t)| ≤ sup_t |M(t)| · sup_t |N(t)| ∈ L¹(Ω).

Therefore MN − [M, N] has an integrable majorant, so it is a local martingale from class D. Therefore it is a uniformly integrable martingale. This implies that for some N ∈ H₀²

L(M) = E(M(∞)N(∞)) = E([M, N](∞)),    M ∈ H₀².    (5.36)
2. Now we prove that for almost all trajectories |∆N| ≤ 2c. Let τ = inf{t : |∆N| > 2c}. As N(0) = 0 and N is right-continuous, τ > 0. If τ(ω) < ∞ then |∆N(τ)|(ω) > 2c. Hence we should prove that P(|∆N(τ)| > 2c) = 0. Every stopping time can be covered by a countable number of totally inaccessible or predictable stopping times, hence one can assume that τ is either predictable or totally inaccessible. If P(|∆N(τ)| > 2c) > 0 then let

ξ = sgn(∆N(τ)) χ(|∆N(τ)| > 2c) / P(|∆N(τ)| > 2c).

S = ξχ([τ, ∞)) is adapted, right-continuous and it has integrable variation. Let M = S − Sᵖ. If τ is predictable then the graph [τ] is a predictable set, hence

∆(Sᵖ) = ᵖ(∆S) = ᵖ(ξχ([τ])) = (ᵖξ)χ([τ]),

where ᵖξ is the predictable projection of the constant process U(t) ≡ ξ. By the definition of the predictable projection,

ᵖξ(τ) = E(ξ | F_{τ−}).

If τ is totally inaccessible then P(τ = σ) = 0 for every predictable stopping time σ. Hence

ᵖ(∆S)(σ) = ᵖ(ξχ([τ]))(σ) = E(ξχ([τ])(σ) | F_{σ−}) = E(0 | F_{σ−}) = 0,

so ∆(Sᵖ) = ᵖ(∆S) = 0. Therefore in both cases Sᵖ has at most one jump, which can occur only at τ. This implies that M has finite variation and it has just one jump, which occurs at τ. As we have seen,

∆M(τ) = ξ − E(ξ | F_{τ−}) if τ is predictable, and ∆M(τ) = ξ if τ is totally inaccessible.
Obviously

‖M‖_{H¹} = E(√([M](∞))) = E(√((∆M)²(τ))) = E(|∆M(τ)|) ≤ E(|ξ|) + E(|E(ξ | F_{τ−})|) ≤ 2E(|ξ|) = 2.

∫₀ᵗ M₋ dM is a local martingale with localizing sequence (ρₙ). By the integration by parts formula and by Fatou's lemma,

E(M²(t)) = E(lim_{n→∞} M²(t ∧ ρₙ)) ≤ limsup_{n→∞} E(M²(t ∧ ρₙ)) = limsup_{n→∞} E([M](t ∧ ρₙ)) ≤ E([M](t)) ≤ E([M](∞)) = E((∆M(τ))²) < ∞.

Hence M ∈ H₀². If τ is totally inaccessible then

L(M) = E([M, N](∞)) = E(∆M(τ)∆N(τ)) = E(ξ∆N(τ)) = E(|∆N(τ)| χ(|∆N(τ)| > 2c)) / P(|∆N(τ)| > 2c) > 2c E(χ(|∆N(τ)| > 2c)) / P(|∆N(τ)| > 2c) = 2c ≥ c ‖M‖_{H¹},

which is impossible. If τ is predictable then

E(∆M(τ)∆N(τ)) = E(ξ∆N(τ)) − E(E(ξ | F_{τ−}) ∆N(τ)).

N is a martingale, therefore ᵖ(∆N) = 0, so

E(E(ξ | F_{τ−}) ∆N(τ)) = E(E(ξ | F_{τ−}) E(∆N(τ) | F_{τ−})) = 0,

and we can get the same contradiction as above. This implies that |∆N| ≤ 2c. Therefore N is locally bounded.

3. To finish the proof we should show that the identity in the theorem holds not only in H₀² but in H₀¹ as well. To do this we should prove that H₀² is dense in H₀¹ and that E([M, N](∞)) is a continuous linear functional on H₀¹. Because these statements have some general importance we shall present them as separate lemmas.
Lemma 5.37 H² is dense in H¹.

Proof. If M ∈ H¹ then M = M^c + M^d, where M^c is the continuous part and M^d is the purely discontinuous part of M. [M] = [M^c] + [M^d], so from (5.32) it is obvious that M^c, M^d ∈ H¹.

1. M^c is locally bounded, so there is a localizing sequence (τₙ) such that (M^c)^{τₙ} ∈ H² for all n. Observe that if (τₙ) is a localizing sequence then by the Dominated Convergence Theorem ‖M^{τₙ} − M‖_{H¹} → 0 for every M ∈ H¹.

2. For the purely discontinuous part, M^d = Σ_{k=1}^∞ L_k, where the L_k are continuously compensated single jumps of M. Recall⁴² that the series Σ_k L_k converges in H¹. Therefore it is sufficient to prove the lemma when M = S − Sᵖ is a continuously compensated single jump. Let τ be the jump-time of M, that is, let S = ∆M(τ)χ([τ, ∞)). Let

ξ_k = ∆M(τ) χ(|∆M(τ)| ≤ k).

Let S_k = ξ_k χ([τ, ∞)) and M_k = S_k − S_kᵖ. By the construction of L_k the stopping time τ is either predictable or totally inaccessible. In the same way as in the proof of the proposition just above one can easily prove that M_k has just one jump, which occurs at τ. Also, as during the previous proof, one can easily prove that M_k ∈ H². Then

‖M − M_k‖_{H¹} = ‖∆M(τ) − ∆M_k(τ)‖₁.

If τ is totally inaccessible then, as ∆M(τ) is integrable,

‖∆M(τ) − ∆M_k(τ)‖₁ = ‖∆M(τ) χ(|∆M(τ)| > k)‖₁ → 0.

If τ is predictable then we also have the component

‖E(∆M(τ) χ(|∆M(τ)| > k) | F_{τ−})‖₁.

But if k → ∞ then in L¹(Ω), by the conditional Dominated Convergence Theorem,

lim_{k→∞} E(∆M(τ) χ(|∆M(τ)| > k) | F_{τ−}) = 0,

from which the lemma is obvious.

⁴² See: Theorem 4.26, page 236 and Proposition 4.30, page 243.
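The truncation step above — E(|∆M(τ)| χ(|∆M(τ)| > k)) → 0 as k → ∞ for an integrable jump — is plain dominated convergence; a toy illustration with a made-up finite sample standing in for the jump distribution:

```python
# A hypothetical finite sample standing in for the integrable jump ΔM(τ).
jumps = [0.1, -2.5, 7.0, -0.3, 12.0, 1.5, -4.0]

def tail_mean(k):
    # empirical analogue of E(|ΔM(τ)| χ(|ΔM(τ)| > k))
    return sum(abs(x) for x in jumps if abs(x) > k) / len(jumps)

tails = [tail_mean(k) for k in (0, 5, 10, 15)]
# the truncated tails decrease to 0 as k grows
assert tails == sorted(tails, reverse=True)
assert tails[-1] == 0.0
```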
Our next goal is to prove that E([M, N](∞)) in (5.36) is a continuous linear functional over H₀¹. To do this we need two lemmas. As a first step we prove the following observation:

Lemma 5.38 If for some N ∈ H₀²

E(|[M, N](∞)|) ≤ c · ‖M‖_{H₀¹},    M ∈ H₀²,

then

sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞.
Lemma 5.39 If N ∈ H₀² and

k² = sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞,

then (E(|[M, N](∞)|))² ≤ 2k² ‖M‖²_{H₀¹} for every M ∈ H₀².

Proof. From the Kunita–Watanabe inequality

∫₀^∞ 1 dVar([M, N]) ≤ √(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · √(∫₀^∞ (√[M] + √[M]₋) d[N]).

Therefore by the Cauchy–Schwarz inequality, and since [M]₋ ≤ [M],

(E(|[M, N](∞)|))² ≤ E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · E(∫₀^∞ (√[M] + √[M]₋) d[N]) ≤ E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) · E(∫₀^∞ 2√[M] d[N]).

Let

a = t₀⁽ⁿ⁾ < t₁⁽ⁿ⁾ < … < tₙ⁽ⁿ⁾ = b
be an infinitesimal sequence of partitions of [a, b]. Let f > 0 be a right-regular function with bounded variation on [a, b]. Then

√f(b) − √f(a) = Σᵢ₌₁ⁿ (√f(tᵢ⁽ⁿ⁾) − √f(tᵢ₋₁⁽ⁿ⁾)) = Σᵢ₌₁ⁿ (f(tᵢ⁽ⁿ⁾) − f(tᵢ₋₁⁽ⁿ⁾)) / (√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)).

f generates a finite measure on [a, b]. As f is right-regular and positive,

1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾))

is bounded, and for every t ∈ (tᵢ₋₁⁽ⁿ⁾, tᵢ⁽ⁿ⁾]

1/(√f(tᵢ⁽ⁿ⁾) + √f(tᵢ₋₁⁽ⁿ⁾)) → 1/(√f(t) + √f(t−)).

So by the Dominated Convergence Theorem it is easy to see that if n → ∞ then

√f(b) − √f(a) = ∫ₐᵇ 1/(√f(t) + √f(t−)) df(t).

With the Monotone Convergence Theorem one can easily prove that if f is a right-regular, non-negative, increasing function then⁴³

√f(∞) − √f(0) = ∫₀^∞ 1/(√f(t) + √f(t−)) df(t).

Using this,

E(∫₀^∞ (√[M] + √[M]₋)⁻¹ d[M]) = E(√([M](∞))) = ‖M‖_{H¹}.

⁴³ See: Example 6.50, page 400.
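The change-of-variables identity √f(∞) − √f(0) = ∫ df/(√f + √f₋) can be checked directly for an increasing pure-jump function, where the integral is a finite sum over the jumps and the terms telescope exactly (jump sizes below are made up):

```python
import math

# a right-continuous increasing pure-jump function: cumulative sums of jumps
f0 = 1.0
jumps = [0.5, 2.0, 0.25, 3.0]   # made-up jump sizes

f = f0
integral = 0.0
for d in jumps:
    f_minus, f = f, f + d
    # df / (sqrt(f) + sqrt(f_-)) summed over the jumps equals sqrt(f) - sqrt(f_-)
    integral += d / (math.sqrt(f) + math.sqrt(f_minus))

assert abs(integral - (math.sqrt(f) - math.sqrt(f0))) < 1e-12
```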
Let us estimate the second integral. Integrating by parts,

E(∫₀^∞ √[M] d[N]) = E(√([M](∞)) [N](∞) − ∫₀^∞ [N]₋ d√[M]) = E(∫₀^∞ ([N](∞) − [N]₋) d√[M]).

It is easy to see that⁴⁴

E(∫₀^∞ [N](∞) d√[M]) = E([N](∞) √([M](∞))) = E([N](∞) Σₖ (√([M](sₖ)) − √([M](sₖ₋₁)))) = E(Σₖ E([N](∞) | F_{sₖ}) (√([M](sₖ)) − √([M](sₖ₋₁)))) = E(∫₀^∞ E([N](∞) | Fₛ) d√[M](s)).

So if

k² = sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞,

then

E(∫₀^∞ √[M] d[N]) = E(∫₀^∞ (E([N](∞) | Fₛ) − [N](s−)) d√[M](s)) = E(∫₀^∞ (E([N](∞) | Fₛ) − [N](s) + ∆[N](s)) d√[M](s)) = E(∫₀^∞ E((N(∞) − N(s))² + (∆N(s))² | Fₛ) d√[M](s)) = E(∫₀^∞ E((N(∞) − N(s−))² | Fₛ) d√[M](s)) ≤ k² · E(√([M](∞))) = k² · ‖M‖_{H¹}.

⁴⁴ First one should assume that [N](∞) is bounded and we should use that √([M](∞)) is integrable. Then with the Monotone Convergence Theorem one can drop the assumption that [N](∞) is bounded.
So

(E(|[M, N](∞)|))² ≤ 2 · k² · ‖M‖²_{H₀¹},
which proves the inequality.

Definition 5.40 N is a BMO martingale if N ∈ H² and

sup_τ ‖E((N(∞) − N(τ−))² | F_τ)‖_∞ < ∞.

Corollary 5.41 The BMO martingales are locally bounded.

Corollary 5.42 (Dual of H₀¹) L is a continuous linear functional over H₀¹ if and only if for some BMO martingale N

L(M) = E([M, N](∞)).

The dual of the Banach space H₀¹ is the space of BMO martingales.

Let us return to the Martingale Representation Problem. We proved the following statement:

Theorem 5.43 (Jacod–Yor) The set X ⊆ H₀¹ has the Martingale Representation Property if and only if the underlying probability measure P is an extremal point of M¹(X).

Proposition 5.44 Let 1 ≤ p < ∞ and let X be a closed linear subspace of H₀ᵖ. The following properties are equivalent:
1. If M ∈ X and H • M ∈ H₀ᵖ for some predictable process H then H • M ∈ X.
2. If M ∈ X and H is a bounded and predictable process then H • M ∈ X.
3. X is stable under truncation, that is, if M ∈ X and τ is an arbitrary stopping time then M^τ ∈ X.
4. If M ∈ X, s ≤ t ≤ ∞ and F ∈ Fₛ then (M^t − M^s)χ_F ∈ X.

Proof. Let H be a bounded predictable process and let |H| ≤ c. Then

[H • M](∞) = (H² • [M])(∞) ≤ c² [M](∞),
so if M ∈ H₀ᵖ then H • M ∈ H₀ᵖ, and the implication 1. ⇒ 2. is obvious. If τ is an arbitrary stopping time then

χ([0, τ]) • M = 1 • M^τ = M^τ − M(0) = M^τ,

hence 2. implies 3. If F ∈ Fₛ then

τ(ω) = s if ω ∈ F, ∞ if ω ∉ F

is a stopping time. If 3. holds then M^τ ∈ X. As s ≤ t,

σ(ω) = t if ω ∈ F, ∞ if ω ∉ F

is also a stopping time, hence M^σ ∈ X. As X is a linear space, M^σ − M^τ ∈ X. But obviously M^σ − M^τ = (M^t − M^s)χ_F, hence 3. implies 4. Now let

H = Σᵢ χ_{Fᵢ} χ((tᵢ, tᵢ₊₁]),    (5.37)

where Fᵢ ∈ F_{tᵢ}. Obviously

(H • M)(t) = Σᵢ χ_{Fᵢ} (M(t ∧ tᵢ₊₁) − M(t ∧ tᵢ)),

and by 4., H • M ∈ X. For processes of this type,

‖Hₙ • M − H • M‖_{H₀ᵖ} = ‖(Hₙ − H) • M‖_{H₀ᵖ} = ‖√([(Hₙ − H) • M](∞))‖ₚ = ‖√((Hₙ − H)² • [M](∞))‖ₚ.

M ∈ H₀ᵖ, so ‖√([M](∞))‖ₚ < ∞. Therefore if Hₙ → H is a uniformly bounded sequence of predictable processes then from the Dominated Convergence Theorem it is obvious that

‖Hₙ • M − H • M‖_{H₀ᵖ} = ‖√((Hₙ − H)² • [M](∞))‖ₚ → 0.
X is closed, so if Hₙ • M ∈ X for all n then H • M ∈ X as well. Using this property and 4. with the Monotone Class Theorem one can easily show that if H is a bounded predictable process then H • M ∈ X. If H • M ∈ H₀ᵖ for some predictable process H then

‖√((H² • [M])(∞))‖ₚ < ∞.

From this, as above, it is easy to show that in H₀ᵖ

(H χ(|H| ≤ n)) • M → H • M,

so H • M ∈ X.

Proposition 5.45 If 1 ≤ p < ∞ and M ∈ H₀ᵖ then the set C = {X ∈ H₀ᵖ : X = H • M} is closed in H₀ᵖ.

Proof. It is easy to see that the set of predictable processes H for which⁴⁵

‖H‖_{Lᵖ(M)} = ‖√((H² • [M])(∞))‖ₚ < ∞    (5.38)

is a linear space. In the usual way, as in the classical theory of Lᵖ-spaces⁴⁶, one can prove that if H₁ ∼ H₂ whenever ‖H₁ − H₂‖_{Lᵖ(M)} = 0, then the set of equivalence classes, denoted by Lᵖ(M), is a Banach space. Let Xₙ ∈ C and assume that Xₙ → X in H₀ᵖ. Let Xₙ = Hₙ • M. Then

‖Xₙ‖_{H₀ᵖ} = ‖√([Xₙ](∞))‖ₚ = ‖√((Hₙ² • [M])(∞))‖ₚ = ‖Hₙ‖_{Lᵖ(M)}.

This implies that (Hₙ) is a Cauchy sequence in Lᵖ(M), so it is convergent, hence Hₙ → H in Lᵖ(M) for some H and Hₙ • M → H • M. Therefore X = H • M, so C is closed.
Proposition 5.46 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁷ as local martingales, that is, [Mᵢ, Mⱼ] = 0 whenever i ≠ j. In this case

stable(M₁, M₂, …, Mₙ) = {Σᵢ₌₁ⁿ Hᵢ • Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ)}.

⁴⁵ See: Definition 2.57, page 151.
⁴⁶ See: [80], Theorem 3.11, page 69.
⁴⁷ See: Definition 4.1, page 227.
That is, the stable subspace generated by a finite set of strongly orthogonal H₀ᵖ-martingales is the linear subspace generated by the stochastic integrals Hᵢ • Mᵢ, Hᵢ ∈ Lᵖ(Mᵢ).

Proof. Recall that, as in the previous proposition, Lᵖ(M) is the set of equivalence classes of progressively measurable processes for which (5.38) holds. Let I denote the linear space on the right side of the equality. By Proposition 5.44, for all i,

Hᵢ • Mᵢ ∈ stable(Mᵢ) ⊆ stable(X),

hence I ⊆ stable(X). From the stopping rule of stochastic integrals, I is closed under stopping. Mᵢ(0) = 0 and Mᵢ = 1 • Mᵢ, so Mᵢ ∈ I for all i. By strong orthogonality

‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ = ‖√(Σᵢ₌₁ⁿ (Hᵢ² • [Mᵢ])(∞))‖ₚ ≤ ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ.

From Jensen's inequality it is also easy to show that

(1/√n) ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ ≤ ‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ.

This means that the norms ‖√([Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞))‖ₚ and ‖Σᵢ₌₁ⁿ √((Hᵢ² • [Mᵢ])(∞))‖ₚ are equivalent. In a similar way as in the previous proposition, one can show that I is a closed linear subspace of H₀ᵖ. Therefore stable(M₁, …, Mₙ) ⊆ I.
Example 5.47 The assumption about orthogonality is important.
Let w₁ and w₂ be independent Wiener processes. Let J(t) = t. If

M₁ = w₁,    M₂ = (1 − J) • w₁ + J • w₂,

then

[M₁, M₂] = [w₁, (1 − J) • w₁ + J • w₂] = (1 − J) • [w₁] = (1 − J) • J,

which is not a local martingale. So the conditions of the above proposition do not hold. We show that

I = {Σᵢ₌₁² Hᵢ • Mᵢ : Hᵢ ∈ Lᵖ(Mᵢ)}

is not a closed set in H₀ᵖ. Let ε > 0. Obviously

H₁⁽ᵋ⁾ = (J − 1 + ε)/(J + ε),    H₂⁽ᵋ⁾ = 1/(J + ε)

are bounded predictable processes.

X_ε = H₁⁽ᵋ⁾ • M₁ + H₂⁽ᵋ⁾ • M₂ = ((J − 1 + ε)/(J + ε)) • w₁ + ((1 − J)/(J + ε)) • w₁ + (J/(J + ε)) • w₂ = (ε/(J + ε)) • w₁ + w₂ − (ε/(J + ε)) • w₂.

As w₁ and w₂ are independent,

[(ε/(J + ε)) • w₁ − (ε/(J + ε)) • w₂](t) = 2 ∫₀ᵗ (ε/(s + ε))² ds → 0,
so X_ε → w₂ in H₀ᵖ. Assume that for some H₁ and H₂

w₂ = H₁ • M₁ + H₂ • M₂ = H₁ • w₁ + (H₂(1 − J)) • w₁ + (H₂J) • w₂.

Reordering,

(H₁ + H₂(1 − J)) • w₁ = (1 − H₂J) • w₂.

From this

[(1 − H₂J) • w₂] = [(H₁ + H₂(1 − J)) • w₁, (1 − H₂J) • w₂] = ((H₁ + H₂(1 − J))(1 − H₂J)) • [w₁, w₂] = 0,
so

(H₁ + H₂(1 − J)) • w₁ = (1 − H₂J) • w₂ = 0.

This implies that 1 − H₂J = H₁ + H₂(1 − J) = 0, that is, H₂ = 1/J and H₁ = 1 − 1/J. But as

∫₀ᵗ (1 − 1/s)² ds = +∞,

H₁ = 1 − 1/J ∉ Lᵖ(w₁).
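The H₀ᵖ-convergence X_ε → w₂ above rests on the elementary fact that ∫₀ᵗ (ε/(s + ε))² ds = ε − ε²/(t + ε) → 0 as ε → 0; a quick numerical confirmation (t = 1, midpoint quadrature):

```python
def bracket(eps, t=1.0, n=100_000):
    # midpoint rule for the integral of (eps/(s+eps))^2 over [0, t]
    h = t / n
    return sum((eps / ((i + 0.5) * h + eps)) ** 2 for i in range(n)) * h

for eps in (0.1, 0.01, 0.001):
    exact = eps - eps**2 / (1.0 + eps)   # closed form of the integral
    assert abs(bracket(eps) - exact) < 1e-6

# the integral, and hence the H^p distance, vanishes with eps
assert bracket(0.001) < bracket(0.01) < bracket(0.1)
```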
Definition 5.48 Let (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. We say that (Mᵢ)ᵢ₌₁ⁿ has the Integral Representation Property if for every M ∈ H₀ᵖ

M = Σᵢ₌₁ⁿ Hᵢ • Mᵢ,    Hᵢ ∈ Lᵖ(Mᵢ).

The main result about integral representation is an easy consequence of the Jacod–Yor theorem and the previous proposition:
Theorem 5.49 (Jacod–Yor) Let 1 ≤ p < ∞ and let X = (Mᵢ)ᵢ₌₁ⁿ be a finite subset of H₀ᵖ. Assume that if i ≠ j then the martingales Mᵢ and Mⱼ are strongly orthogonal⁴⁸ as local martingales, that is, [Mᵢ, Mⱼ] = 0 whenever i ≠ j. If these assumptions hold then X has the Integral Representation Property in H₀ᵖ if and only if P is an extremal point of Mᵖ(X).

Proof. If X has the Integral Representation Property then⁴⁹ stable(X) = H₀ᵖ, so P is an extremal point of Mᵖ(X). Assume that X does not have the Integral Representation Property. This means that stableᵖ(X) ≠ H₀ᵖ. We show that in this case stable¹(X) ≠ H₀¹ as well: if stable¹(X) = H₀¹ then for every M ∈ H₀ᵖ ⊆ H₀¹

M = Σᵢ₌₁ⁿ Hᵢ • Mᵢ,    Hᵢ ∈ L¹(Mᵢ).

⁴⁸ See: Definition 4.1, page 227.
⁴⁹ See: Proposition 5.35, page 330.
But by the strong orthogonality assumption, for every k

[M](∞) = [Σᵢ₌₁ⁿ Hᵢ • Mᵢ](∞) = Σᵢ₌₁ⁿ (Hᵢ² • [Mᵢ])(∞) ≥ (Hₖ² • [Mₖ])(∞).

√([M](∞)) ∈ Lᵖ(Ω), so √((Hᵢ² • [Mᵢ])(∞)) ∈ Lᵖ(Ω). Hence Hᵢ ∈ Lᵖ(Mᵢ) for every i, which is impossible as X does not have the Integral Representation Property in H₀ᵖ. Hence

stableᵖ(X) ⊆ stable¹(X) ≠ H₀¹.

By the Hahn–Banach theorem there is a continuous linear functional L ∈ (H₀¹)* such that L(stable¹(X)) = 0. This implies that L(stableᵖ(X)) = 0. L is of course represented by a BMO martingale, so it is locally bounded. As we have remarked, one can assume that L is bounded. As we already discussed, in this case P is not an extremal point of Mᵖ(X).

The most important example is the following:

Example 5.50 If X = (wₖ)ₖ₌₁ⁿ are independent Wiener processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.
On any finite interval⁵⁰ wₖ ∈ H₀¹. We show that M¹(X) = {P}. If Q ∈ M¹(X) then wₖ is a continuous local martingale under Q for every k. Obviously [wₖ, wⱼ](t) = δₖⱼ t. By Lévy's characterization theorem⁵¹, X = (w₁, w₂, …, wₙ) is an n-dimensional Wiener process under Q as well. This implies that

∫_Ω f(X) dP = ∫_Ω f(X) dQ

for every F_∞-measurable bounded function f. As F is the filtration generated by X, this implies that P(F) = Q(F) for every F ∈ F_∞, so P = Q.

Example 5.51 If X = (πₖ)ₖ₌₁ⁿ are independent compensated Poisson processes and the filtration F is the filtration generated by X, then X has the Integral Representation Property on any finite interval.

⁵⁰ On the finite interval [0, s], ‖w‖_{H¹} = E(√([w](s))) = √s. See: Example 1.124, page 87.
⁵¹ See: Theorem 6.13, page 368.
On any finite interval πₖ ∈ H₀¹. If two Poisson processes are independent then they do not have common jumps⁵², so [πₖ, πⱼ] = 0. So we can apply the Jacod–Yor theorem. We shall prove that again M¹(X) = {P}. If X is a compensated Poisson process, then a.s.

[X](t) − λt = X(t).    (5.39)

Of course this identity holds under any probability measure Q ≪ P. As in the previous example, one should show that if X is a local martingale then (5.39) implies that X is a compensated Poisson process with parameter λ. Let us assume that for some process X under some measure (5.39) holds. In this case obviously (∆X)² = ∆X, that is, if ∆X ≠ 0 then ∆X = 1. [X] has finite variation, hence X also has finite variation, so X ∈ V ∩ L. Hence X is purely discontinuous, that is, X is a quadratic jump process: [X] = Σ(∆X)². The size of the jumps is constant, so as [X] is finite for every trajectory there is just a finite number of jumps on every finite interval. Let N(t) denote the number of jumps in the interval [0, t]. Then

N(t) − λt = [X](t) − λt = X(t).    (5.40)

As X is a local martingale this means that the compensator of N is λt. N is a counting process, so

exp(itN(u)) − 1 = Σ_{s≤u} (exp(itN(s)) − exp(itN(s−))) = Σ_{s≤u} (exp(it(N(s−) + 1)) − exp(itN(s−))) [N(s) − N(s−)] = (exp(it) − 1) Σ_{s≤u} exp(itN(s−)) [N(s) − N(s−)] = (exp(it) − 1) ∫₀ᵘ exp(itN(s−)) dN(s).

Taking expected values and using elementary properties of the compensator, and that on every finite interval N has only a finite number of jumps,

ϕᵤ(t) − 1 = E(exp(itN(u))) − 1 = (exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) dN(s)) = (exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) dNᵖ(s)) = λ(exp(it) − 1) E(∫₀ᵘ exp(itN(s−)) ds) = λ(exp(it) − 1) E(∫₀ᵘ exp(itN(s)) ds) = λ(exp(it) − 1) ∫₀ᵘ ϕₛ(t) ds,

⁵² See: Proposition 7.13, page 471.
where ϕᵤ(t) = E(exp(itN(u))) is the Fourier transform of N(u). Differentiating both sides by u,

(d/du) ϕᵤ(t) = λ(exp(it) − 1) ϕᵤ(t).

The solution of this equation is

ϕᵤ(t) = exp(λu(exp(it) − 1)).

Hence N(u) has a Poisson distribution with parameter λu. By (5.40), X is a compensated Poisson process with parameter λ. Finally recall that Poisson processes are independent if and only if⁵³ they do not have common jumps. This means that under Q the processes πₖ remain independent Poisson processes.

Example 5.52 A continuous martingale which does not have the Integral Representation Property.
Let ((w₁, w₂), G) be a two-dimensional Wiener process. Let X = w₁ • w₂, and let F be the filtration generated by X. Evidently Fₜ ⊆ Gₜ. X is obviously a local martingale under G.

E([X](T)) = E(∫₀ᵀ w₁² d[w₂]) = ∫₀ᵀ E(w₁²(t)) dt < ∞,
so on every finite interval X is in H₀². Hence X is a G-martingale. As X is F-adapted, one can easily show that X is an F-martingale. The quadratic variation [X] is F-adapted:

[X](t) = ∫₀ᵗ w₁² d[w₂] = ∫₀ᵗ w₁²(s) ds,

therefore the derivative of [X] is w₁². This implies that w₁² is also F-adapted. As [w₁] is deterministic,

Z = ½(w₁² − [w₁]) = w₁ • w₁

is also F-adapted. Z is an F-martingale: if s < t then, using that Z = ½(w₁² − [w₁]) is a G-martingale⁵⁴,

E(Z(t) | Fₛ) = E(E(Z(t) | Gₛ) | Fₛ) = E(Z(s) | Fₛ) = Z(s).

If X had the Integral Representation Property then for some Y

Z = Y • X = Y • (w₁ • w₂) = (Y w₁) • w₂.

As w₁ and w₂ are independent, [w₁, w₂] = 0, so

0 < [Z] = [w₁ • w₁, Y • X] = [w₁ • w₁, (Y w₁) • w₂] = (Y w₁²) • [w₁, w₂] = 0,

which is impossible.

⁵³ See: Proposition 7.11, page 469 and 7.13, page 471.
⁵⁴ w₁ is in H₀².
6 ITÔ'S FORMULA

Itô's formula is the most important relation of stochastic analysis. The formula is a stochastic generalization of the Fundamental Theorem of Calculus. Recall that for an arbitrary process X, for an arbitrary differentiable function f and for an arbitrary partition (tₖ⁽ⁿ⁾) of an interval [0, t],

f(X(t)) − f(X(0)) = Σₖ (f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾))) = Σₖ f′(ξₖ⁽ⁿ⁾)(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),    (6.1)

where ξₖ⁽ⁿ⁾ ∈ (X(tₖ₋₁⁽ⁿ⁾), X(tₖ⁽ⁿ⁾)). If X is continuous then by the intermediate value theorem ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾), where τₖ⁽ⁿ⁾ ∈ (tₖ₋₁⁽ⁿ⁾, tₖ⁽ⁿ⁾). If X has finite variation, then if n → ∞ the sum on the right-hand side will be convergent and one can easily get the Fundamental Theorem of Calculus:

f(X(t)) − f(X(0)) = ∫₀ᵗ f′(X(s)) dX(s).

On the other hand, if X is a local martingale then the telescopic sum on the right-hand side of (6.1) does not necessarily converge to the stochastic integral ∫₀ᵗ f′(X(s)) dX(s), as one cannot guarantee the convergence unless τₖ⁽ⁿ⁾ = tₖ₋₁⁽ⁿ⁾. If we make a second-order approximation

f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) = f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)) + ½ f″(ξₖ⁽ⁿ⁾)(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))²,

then the sum of the first-order terms

Iₙ = Σₖ f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))
is an approximating sum of the Itô–Stieltjes integral ∫₀ᵗ f′(X(s)) dX(s). Of course the sum of the second-order terms is also convergent; the only question is, what is the limit? As

(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))² ≈ [X](tₖ⁽ⁿ⁾) − [X](tₖ₋₁⁽ⁿ⁾),

one can guess that the limit is

½ ∫₀ᵗ f″(X(s)) d[X](s).

This is true if X is continuous, as in this case again ξₖ⁽ⁿ⁾ = X(τₖ⁽ⁿ⁾) and the second-order term is 'close' to the Stieltjes-type approximating sum

½ Σₖ f″(X(τₖ⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾))².

The argument just introduced is 'nearly valid' even if X is discontinuous. In this case the first-order term is again an Itô–Stieltjes-type approximating sum, it is convergent again in the Itô–Stieltjes sense, and the limit is¹

∫₀ᵗ f′(X(s)) dX(s) = ∫₀ᵗ f′(X₋(s)) dX(s).

The main difference is that in this case one cannot apply the intermediate value theorem to the second-order term. Therefore the second-order term is not a simple Stieltjes-type approximating sum. If we take only the 'continuous' subintervals, then one gets a Stieltjes-type approximating sum and the limit is

½ ∫₀ᵗ f″(X₋(s)) d[X^c].

For the remaining terms one can only apply the approximation

½ f″(ξₖ⁽ⁿ⁾)(∆X(tₖ⁽ⁿ⁾))² = f(X(tₖ⁽ⁿ⁾)) − f(X(tₖ₋₁⁽ⁿ⁾)) − f′(X(tₖ₋₁⁽ⁿ⁾))(X(tₖ⁽ⁿ⁾) − X(tₖ₋₁⁽ⁿ⁾)),

which converges to

f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s),

¹ See: Theorem 2.21, page 125. The second integral is convergent in the general sense as well.
so in the limit the second-order term is

½ ∫₀ᵗ f″(X₋(s)) d[X^c] + Σ_{0<s≤t} (f(X(s)) − f(X(s−)) − f′(X(s−))∆X(s)).
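For a continuous path the two-term expansion above can be checked numerically: along a simulated Brownian path, f(X(t)) − f(X(0)) is matched by Σ f′ΔX + ½ Σ f″(ΔX)², and Σ(ΔX)² ≈ t. A sketch with f(x) = x² (fixed seed; for this particular f the discrete identity is exact):

```python
import random, math

random.seed(42)
n, t = 100_000, 1.0
sqdt = math.sqrt(t / n)

x = 0.0
first, second, qv = 0.0, 0.0, 0.0
for _ in range(n):
    dx = random.gauss(0.0, sqdt)
    first += 2.0 * x * dx            # f'(x) = 2x, left-endpoint sum
    second += 0.5 * 2.0 * dx * dx    # (1/2) f''(x) (dx)^2 with f'' = 2
    qv += dx * dx                    # discrete quadratic variation
    x += dx

# Itô for f(x) = x^2: x(t)^2 = 2 ∫ x dx + [x](t), and [x](t) ≈ t
assert abs(x * x - (first + second)) < 1e-9   # exact discrete identity for x^2
assert abs(qv - t) < 0.05                     # Σ(Δx)^2 ≈ t
```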
6.1 Itô's Formula for Continuous Semimartingales
Recall that for continuous semimartingales one has the following integration by parts formula²:

Proposition 6.1 If X and Y are continuous semimartingales then for every t

X(t)Y(t) − X(0)Y(0) = ∫₀ᵗ X dY + ∫₀ᵗ Y dX + [X, Y](t).    (6.2)
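Formula (6.2) has an exact discrete counterpart: over any partition, X(t)Y(t) − X(0)Y(0) = Σ XₖΔY + Σ YₖΔX + Σ ΔXΔY, with the last sum approximating [X, Y](t). A quick check on made-up increments:

```python
# arbitrary made-up increments of two discrete paths
dX = [0.3, -0.1, 0.4, -0.2, 0.05]
dY = [-0.2, 0.25, 0.1, -0.3, 0.15]

X, Y = [0.0], [0.0]
for a, b in zip(dX, dY):
    X.append(X[-1] + a)
    Y.append(Y[-1] + b)

int_XdY = sum(X[k] * dY[k] for k in range(len(dY)))   # left-endpoint ∫ X dY
int_YdX = sum(Y[k] * dX[k] for k in range(len(dX)))   # left-endpoint ∫ Y dX
bracket = sum(a * b for a, b in zip(dX, dY))          # discrete [X, Y]

lhs = X[-1] * Y[-1] - X[0] * Y[0]
assert abs(lhs - (int_XdY + int_YdX + bracket)) < 1e-12
```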
Theorem 6.2 (Itô's formula) Let U be an open subset of ℝⁿ. If the elements of the vector X = (X₁, X₂, …, Xₙ) are continuous semimartingales, X(t) ∈ U for every t and f ∈ C²(U), then

f(X(t)) − f(X(0)) = Σₖ₌₁ⁿ ∫₀ᵗ (∂f/∂xₖ)(X) dXₖ + ½ Σᵢ,ⱼ ∫₀ᵗ (∂²f/∂xᵢ∂xⱼ)(X) d[Xᵢ, Xⱼ].    (6.3)

Proof. We divide the proof into several steps.

1. As a first step we prove the theorem for polynomials. If f ≡ c, where c is a constant, then the theorem is trivial. It is sufficient to prove that if the identity is valid for a polynomial f then it is true for the polynomial g = xₗ f as well. Assume that

f(X) = f(X(0)) + Σₖ (∂f/∂xₖ)(X) • Xₖ + ½ Σᵢ,ⱼ (∂²f/∂xᵢ∂xⱼ)(X) • [Xᵢ, Xⱼ].

By (6.2)

g(X) = Xₗ f(X) = g(X(0)) + Xₗ • f(X) + f(X) • Xₗ + [Xₗ, f(X)] =

² See: Proposition 2.28, page 129.
= g(X(0)) + Xₗ • f(X(0)) + Xₗ • (Σₖ (∂f/∂xₖ)(X) • Xₖ) + ½ Xₗ • (Σᵢ,ⱼ (∂²f/∂xᵢ∂xⱼ)(X) • [Xᵢ, Xⱼ]) + f(X) • Xₗ + [Xₗ, f(X)].

Now Xₗ • f(X(0)) = 0, and by the associativity rule for stochastic integrals³

g(X) = g(X(0)) + Σₖ (Xₗ (∂f/∂xₖ)(X)) • Xₖ + ½ Σᵢ,ⱼ (Xₗ (∂²f/∂xᵢ∂xⱼ)(X)) • [Xᵢ, Xⱼ] + f(X) • Xₗ + [Xₗ, f(X)].

By the product rule of differentiation

∂g/∂xₖ = xₗ ∂f/∂xₖ if k ≠ l,    ∂g/∂xₗ = xₗ ∂f/∂xₗ + f.    (6.4)

Substituting it in the formula above,

g(X) = g(X(0)) + Σₖ (∂g/∂xₖ)(X) • Xₖ + ½ Σᵢ,ⱼ (Xₗ (∂²f/∂xᵢ∂xⱼ)(X)) • [Xᵢ, Xⱼ] + [Xₗ, f(X)].

The second partial derivatives of g are

∂²g/∂xᵢ∂xⱼ = xₗ ∂²f/∂xᵢ∂xⱼ if i, j ≠ l;
∂²g/∂xₗ∂xⱼ = xₗ ∂²f/∂xₗ∂xⱼ + ∂f/∂xⱼ if j ≠ l;
∂²g/∂xᵢ∂xₗ = xₗ ∂²f/∂xᵢ∂xₗ + ∂f/∂xᵢ if i ≠ l;
∂²g/∂xₗ² = xₗ ∂²f/∂xₗ² + 2 ∂f/∂xₗ;    (6.5)

³ See: Proposition 2.71, page 160.
that is, the second-derivative matrices of f and g differ only in column l and in row l. It is sufficient to prove that

[Xₗ, f(X)] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X) • [Xₗ, Xⱼ].

By the induction hypothesis f(X) is a semimartingale. As Xₗ is continuous, the quadratic co-variation of the bounded variation part of f(X) is zero. The quadratic co-variation of the stochastic integral part is

[Xₗ, Σₖ (∂f/∂xₖ)(X) • Xₖ] = Σⱼ₌₁ⁿ (∂f/∂xⱼ)(X) • [Xₗ, Xⱼ].

This means that the theorem is valid for polynomials.

2. Let us prove that one can localize the expression. That is, it is sufficient to prove the theorem for X^{τₙ}, where (τₙ) is some localizing sequence of X. Let τ be an arbitrary stopping time. The integrals in the second line are integrals taken by trajectory, hence obviously

(∂²f/∂xᵢ∂xⱼ)(X^τ) • [Xᵢ^τ, Xⱼ^τ] = (∂²f/∂xᵢ∂xⱼ)(X^τ) • [Xᵢ, Xⱼ]^τ = (∂²f/∂xᵢ∂xⱼ)(X) χ([0, τ]) • [Xᵢ, Xⱼ].

In a similar way, using the stopping rule for stochastic integrals,

(∂f/∂xₖ)(X^τ) • Xₖ^τ = (∂f/∂xₖ)(X^τ) χ([0, τ]) • Xₖ = (∂f/∂xₖ)(X) χ([0, τ]) • Xₖ.

Assume that the theorem is valid for the truncated processes X^{τₙ}. f ∈ C²(U), hence the trajectories of (∂f/∂xₖ)(X) and (∂²f/∂xᵢ∂xⱼ)(X) are continuous and therefore they are integrable. Evidently the integrands above are dominated by these common integrable processes. If τₙ → ∞ then χ([0, τₙ]) → 1. Applying the Dominated Convergence Theorem on both sides and using that f(X^{τₙ}) → f(X), one can easily prove the equality.

3. As X is continuous it is locally bounded. Let (τₙ) be a localizing sequence for which the images of the stopped processes X^{τₙ} are bounded. Let K ⊆ U be a compact set which contains the image of X^{τₙ}. One can prove that there is a sequence of polynomials (pₙ) such that pₙ|_K → f|_K in the topology of C²(K). By the definition of the topology of C², all the derivatives
are uniformly convergent. As the formula is valid for every polynomial, by the Dominated Convergence Theorem it is valid for the function f ∈ C²(U) as well.

Proposition 6.3 If the semimartingale Xₗ has finite variation, then it is sufficient to assume that the partial derivative ∂f/∂xₗ exists and is continuous. In this case in formula (6.3) one can drop the second-order terms with index l.

Proof. If Xₗ has finite variation then, as Xᵢ is continuous, [Xₗ, Xᵢ] = 0. If f is a polynomial, then the second-order terms with index l are zero, and in the approximation we do not need the second-order terms with index l.

Corollary 6.4 (Time-dependent Itô formula) If the elements of the vector X = (X₁, X₂, …, Xₙ) are continuous semimartingales, the image space of X is part of an open subset U ⊆ ℝⁿ and f ∈ C²(ℝ₊ × U), then⁴

f(t, X(t)) = f(0, X(0)) + ∫₀ᵗ (∂f/∂s)(s, X(s)) ds + Σᵢ₌₁ⁿ ∫₀ᵗ (∂f/∂xᵢ)(s, X(s)) dXᵢ(s) + ½ Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ∫₀ᵗ (∂²f/∂xᵢ∂xⱼ)(s, X(s)) d[Xᵢ, Xⱼ](s).

If X and Y are real-valued semimartingales then we can define the object Z = X + iY, which one can call a complex semimartingale. Let f : ℂ → ℂ be a holomorphic function. f(z) has the representation u(x, y) + iv(x, y), where u and v are differentiable functions. Recall that

∂u/∂x = ∂v/∂y and ∂u/∂y = −∂v/∂x.

If Z is a complex semimartingale then f(Z) = u(X, Y) + iv(X, Y).

⁴ It is sufficient to assume that f is continuously differentiable by the time parameter.
One can apply Itô's formula for u and for v:

u(X(t), Y(t)) = u(X(0), Y(0)) + ∫₀ᵗ (∂u/∂x)(X, Y) dX + ∫₀ᵗ (∂u/∂y)(X, Y) dY + ½ ∫₀ᵗ (∂²u/∂x²)(X, Y) d[X, X] + ½ ∫₀ᵗ (∂²u/∂y²)(X, Y) d[Y, Y] + ∫₀ᵗ (∂²u/∂x∂y)(X, Y) d[X, Y]

and

v(X(t), Y(t)) = v(X(0), Y(0)) + ∫₀ᵗ (∂v/∂x)(X, Y) dX + ∫₀ᵗ (∂v/∂y)(X, Y) dY + ½ ∫₀ᵗ (∂²v/∂x²)(X, Y) d[X, X] + ½ ∫₀ᵗ (∂²v/∂y²)(X, Y) d[Y, Y] + ∫₀ᵗ (∂²v/∂x∂y)(X, Y) d[X, Y].

The sum of the first-order terms is

∫₀ᵗ (∂u/∂x)(X, Y) dX + ∫₀ᵗ (∂u/∂y)(X, Y) dY + i ∫₀ᵗ (∂v/∂x)(X, Y) dX + i ∫₀ᵗ (∂v/∂y)(X, Y) dY.

As uₓ + ivₓ = v_y − iu_y = f′, this sum is

∫₀ᵗ f′(Z) dX + ∫₀ᵗ f′(Z) d(iY) = ∫₀ᵗ f′(Z) dZ.
Let us calculate the second-order terms:

∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, X],

∫₀ᵗ (∂²u/∂y² + i ∂²v/∂y²) d[Y, Y] = −∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[Y, Y] = ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[iY, iY],

∫₀ᵗ (∂²u/∂x∂y + i ∂²v/∂x∂y) d[X, Y] = i ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, Y] = ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[X, iY].

Also, by definition, [Z] = [X] + 2i[X, Y] − [Y]. Therefore the second-order term is

½ ∫₀ᵗ (∂²u/∂x² + i ∂²v/∂x²) d[Z] = ½ ∫₀ᵗ f″(Z) d[Z].
Corollary 6.5 (Itô's formula for holomorphic functions) If f(t, z) is continuously differentiable in t and holomorphic in z, and Z is a continuous complex semimartingale, then

f(t, Z(t)) = f(0, Z(0)) + ∫₀ᵗ (∂f/∂s)(s, Z(s)) ds + ∫₀ᵗ (∂f/∂z)(s, Z(s)) dZ(s) + ½ ∫₀ᵗ (∂²f/∂z²)(s, Z(s)) d[Z](s).

Example 6.6 If Z = w₁ + iw₂ is a planar Brownian motion and f is an entire function, then f(Z) is a complex local martingale and

f(Z(t)) = f(Z(0)) + ∫₀ᵗ f′(Z) dZ.
As [w₁, w₂] = 0 and [w₁](t) = [w₂](t) = t, obviously [Z] = [w₁] + 2i[w₁, w₂] − [w₂] = 0.
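The cancellation [Z] = [w₁] − [w₂] + 2i[w₁, w₂] = 0 can also be seen numerically: along simulated independent paths, Σ(ΔZ)² = Σ(Δw₁² − Δw₂²) + 2i ΣΔw₁Δw₂ stays near zero while Σ|ΔZ|² ≈ 2t (fixed seed, illustrative only):

```python
import random, math

random.seed(0)
n, t = 200_000, 1.0
s = math.sqrt(t / n)

sum_dz2 = 0.0 + 0.0j   # Σ (ΔZ)^2, should be ≈ 0
sum_abs2 = 0.0         # Σ |ΔZ|^2, should be ≈ 2t
for _ in range(n):
    dz = complex(random.gauss(0.0, s), random.gauss(0.0, s))
    sum_dz2 += dz * dz
    sum_abs2 += abs(dz) ** 2

assert abs(sum_dz2) < 0.05            # [Z](t) = 0 for Z = w1 + i w2
assert abs(sum_abs2 - 2.0 * t) < 0.05  # but Σ|ΔZ|^2 ≈ 2t, not 0
```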
6.2 Some Applications of the Formula

In this section we present some famous and important applications of the formula.

6.2.1 Zeros of Wiener processes
As a first application let us investigate some important properties of the multidimensional Wiener processes. By definition assume that the coordinates of a d-dimensional Wiener process w are independent one-dimensional Wiener processes. To simplify the notation we say that a stochastic process w is a d-dimensional Wiener process starting from some point x ∈ Rd if it has the rep where w is an ordinary d-dimensional Wiener process, resentation w = x + w, obviously starting from the origin. In the same way if x is an F0 -measurable random vector then one can talk about a Wiener process starting from x. Assume that w starts from some vector x. Let5 ϑ inf {w (t) : t ≥ 0} . What is the distribution of ϑ? Theorem 6.7 (Return of a Wiener process to the origin) Every d-dimensional Wiener process w starting from some vector x = 0 satisfies the following6 : 1. If d ≥ 2 then for almost every outcome ω the trajectory w(ω) is never zero, that is P (w (t) = 0, ∀t > 0) = 1. 2. If d = 2 then P (ϑ = 0) = 1, that is, w is almost surely never zero, but it hits every neighborhood of the origin almost surely. 3. If d = 2 then the trajectories of w are almost surely dense in R2 . 4. If d ≥ 3 and w (0) = x = 0 then P (ϑ ≤ r) = 5 In
this section x denotes the norm
6 See:
Corollary B.8. page 565.
r x
d−2 ,
k
x2k .
if
0 ≤ r ≤ x .
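Statement 4 can be illustrated by simulation. The sketch below is not from the book; the horizon, grid and path counts are ad hoc choices. For $d=3$, $\|x\|=1$, $r=1/2$ the theorem gives $P(\vartheta\le r)=1/2$ exactly; the estimate is biased slightly downward because the horizon is finite and the minimum is taken over a discrete grid.

```python
import numpy as np

rng = np.random.default_rng(8)
d, T, n_steps, n_paths = 3, 30.0, 30_000, 400
dt = T / n_steps
x = np.array([1.0, 0.0, 0.0])  # start at distance 1 from the origin
r = 0.5                        # exact answer: (r/1)^(d-2) = 0.5
hits = 0
for _ in range(n_paths):
    w = x + np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_steps, d)), axis=0)
    if (np.sum(w**2, axis=1) <= r**2).any():
        hits += 1
frac = hits / n_paths
print(frac)  # roughly 0.5, slightly low
```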
Proof. Assume that the twice continuously differentiable function $f$ defined on an open set $U\subseteq\mathbb R^d$ satisfies the Laplace equation
$$\sum_{k=1}^d\frac{\partial^2f}{\partial x_k^2}=0,\qquad f\in C^2(U).\tag{6.6}$$
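The radial functions used throughout this proof (and in Example 6.10 below) satisfy (6.6) away from their singularity. A quick finite-difference check, illustrative and not from the book, with an arbitrarily chosen evaluation point:

```python
import numpy as np

def laplacian(f, p, h=1e-4):
    """Central finite-difference Laplacian of f at the point p."""
    p = np.asarray(p, dtype=float)
    total = 0.0
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        total += (f(p + e) - 2 * f(p) + f(p - e)) / h**2
    return total

f3 = lambda x: 1.0 / np.linalg.norm(x)    # |x|^(2-d) for d = 3
f2 = lambda x: np.log(np.linalg.norm(x))  # log|x| for d = 2
l3 = laplacian(f3, [0.7, -0.4, 0.5])
l2 = laplacian(f2, [0.7, -0.4])
print(l3, l2)  # both approximately 0
```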
Let $\tau$ be a stopping time. If a $d$-dimensional Wiener process $w$ starting from an $x$ remains in $U$ then by Itô's formula
$$f(w^\tau)-f(w(0))=\sum_{k=1}^d\frac{\partial f}{\partial x_k}(w^\tau)\bullet w_k^\tau+\frac12\sum_{i,j}\frac{\partial^2f}{\partial x_i\partial x_j}(w^\tau)\bullet\left[w_i^\tau,w_j^\tau\right].$$
If $i\ne j$ then^7 $\left[w_i^\tau,w_j^\tau\right]=0^\tau=0$. Hence, as $[w_i^\tau](s)=s\wedge\tau$,
$$f(w^\tau(t))-f(x)=f(w^\tau(t))-f(w(0))=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(w^\tau)\,dw_k^\tau+\frac12\int_0^{t\wedge\tau}(\Delta f)(w^\tau(s))\,ds=\sum_{k=1}^d\int_0^t\frac{\partial f}{\partial x_k}(w^\tau)\,dw_k^\tau.\tag{6.7}$$
Assume that $\tau<\infty$ and that $w$ is bounded on the random interval $[0,\tau]$. In this case the integrands in (6.7) are bounded. As on any finite interval $w^\tau$ is square-integrable, the stochastic integrals are martingales^8. For every $\varepsilon>0$
$$P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=\int_{\mathbb R^d}P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\;\middle|\;w(\varepsilon)=y\right)d\rho(y),$$
where $\rho$ is the distribution of $w(\varepsilon)$. Let us calculate the conditional probability. As $w$ has stationary and independent increments
$$\begin{aligned}
P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\;\middle|\;w(\varepsilon)=y\right)&=P\left(\inf_{t\ge\varepsilon}\|w(t)-w(\varepsilon)+w(\varepsilon)\|>0\;\middle|\;w(\varepsilon)=y\right)\\
&=P\left(\inf_{t\ge\varepsilon}\|w(t)-w(\varepsilon)+y\|>0\right)\\
&=P\left(\inf_{u\ge0}\|w(u)+y\|>0\right)=P\left(\inf_{u\ge0}\|w_y(u)\|>0\right),
\end{aligned}$$
where $w_y$ is the Wiener process starting from the point $y$. By the formula already proved for $x\ne0$ in 3. and 4. above,
$$P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=\int_{\mathbb R^d}P\left(\inf_{t\ge0}\|w_y(t)\|>0\right)d\rho(y)=\int_{\mathbb R^d\setminus\{0\}}P\left(\inf_{t\ge0}\|w_y(t)\|>0\right)d\rho(y)=\int_{\mathbb R^d\setminus\{0\}}1\,d\rho(y)=1.$$
If $\varepsilon\to0$ then
$$P\left(w(t)\ne0,\ \forall t>0\right)=\lim_{\varepsilon\searrow0}P\left(\inf_{t\ge\varepsilon}\|w(t)\|>0\right)=1.$$
This means that with probability one $w$ does not return to the origin. Hence we have proved the theorem for all initial vectors $x\in\mathbb R^d$.
6. Instead of balls around the origin one can take any ball. If we take the balls with rational centers and rational radii then the two-dimensional Wiener process with probability one intersects all of them. Therefore the trajectories of the two-dimensional Wiener processes are dense in $\mathbb R^2$.
In the same way one can prove the following:

Corollary 6.8 Let $d\ge3$ and let $w$ be a $d$-dimensional Wiener process starting from some random vector $x$. If $x$ is deterministic then
$$P(\vartheta\le r)=\left(\frac r{\|x\|}\right)^{d-2},\qquad 0\le r\le\|x\|.$$
Corollary 6.9 If $d\ge3$ and $w$ is a $d$-dimensional Wiener process then $\lim_{t\to\infty}\|w(t)\|=\infty$.

Proof. Let $r>0$ be arbitrary and for any $a\ge r$ let
$$\tau_a\triangleq\inf\{t:\|w(t)\|\ge a\}.$$
As almost surely^{12} $\limsup_{t\to\infty}\|w(t)\|=\infty$, obviously $\tau_a<\infty$ almost surely. By the strong Markov property of $w$,
$$w^*(t)\triangleq\left(w(t+\tau_a)-w(\tau_a)\right)+w(\tau_a),\qquad t\ge0,$$
is a Wiener process starting from the random point $w(\tau_a)\in\{\|u\|=a\}$. Since $d\ge3$,
$$P\left(\exists t\ge\tau_a,\ \|w(t)\|\le r\right)=P\left(\exists t\ge0,\ \|w^*(t)\|\le r\right)=\left(\frac ra\right)^{d-2}.$$

12. See: Proposition B.7, page 564.
If $a\to\infty$ then this probability goes to zero. Let $a_n\nearrow\infty$. The probability that $\|w(t)\|$ returns to the ball $\{\|u\|\le r\}$ after infinitely many of the times $\tau_{a_n}$ is zero. Hence with probability one for any $\omega$ there is an $n\triangleq n(\omega)$ such that
$$w(t,\omega)\notin\{\|u\|\le r\},\qquad t\ge\tau_{a_n}(\omega).$$
That is, with probability one^{13}, if $t\to\infty$ then $\|w(t,\omega)\|\to\infty$.

Example 6.10 Hitting times of open and closed sets in higher dimensions^{14}.
1. Let $B(x_0,r)\triangleq\{x\in\mathbb R^d:\|x-x_0\|<r\}$. Let $x_0\ne0$, $x_0\in B(0,1)$, and let
$$f(x)\triangleq g\left(\|x-x_0\|\right)\triangleq\begin{cases}\log\|x-x_0\|&\text{if }d=2,\\\|x-x_0\|^{2-d}&\text{if }d\ge3.\end{cases}$$
Obviously $f$ satisfies the Laplace equation (6.6) on $\mathbb R^d\setminus\{x_0\}$. If $B(x_0,r)\subseteq B(0,1)$ and $B\triangleq B(0,1)\setminus\operatorname{cl}(B(x_0,r))$ then $f$ is bounded on $B$. Let $w$ be a $d$-dimensional Wiener process and let
$$\tau\triangleq\inf\{t:w(t)\in\partial B(0,1)\}.$$
As $\limsup_t\|w(t)\|=\infty$, obviously^{15} almost surely $\tau<\infty$. By Itô's formula $X\triangleq f(w^\tau)$ is a bounded local martingale on $B$, therefore $X$ is a uniformly integrable martingale^{16}. Hence if
$$\rho\triangleq\inf\{t:w(t)\in\partial B\},$$
then $\mathsf E(X(\rho))=\mathsf E(X(0))=f(0)$. If
$$\rho_1\triangleq\inf\{t:w(t)\in\partial B(0,1)\},\qquad\rho_2\triangleq\inf\{t:w(t)\in\partial B(x_0,r)\},$$

13. Take $r\triangleq1,2,\dots$
14. See: Corollary B.12, page 566.
15. See: Proposition B.7, page 564.
16. See: Corollary 1.145, page 103.
then, as $\rho=\rho_1\wedge\rho_2$,
$$f(0)=\mathsf E(X(\rho))=\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)+\mathsf E\left(X(\rho_2)\chi(\rho_2<\rho_1)\right).$$
Obviously $\mathsf E\left(X(\rho_2)\chi(\rho_2<\rho_1)\right)=g(r)\cdot P(\rho_2<\rho_1)$, and for some $k$, $\left|\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)\right|\le k$ for all $r>0$. This implies that for any $0<r<1$
$$P(\rho_2<\rho_1)=\frac{g(\|x_0\|)-\mathsf E\left(X(\rho_1)\chi(\rho_1\le\rho_2)\right)}{g(r)}\le\frac{|g(\|x_0\|)|+k}{|g(r)|}.$$
If $r\to0$ then the right-hand side goes to zero, so for any $\varepsilon>0$ there is an $r>0$ such that $P(\rho_2<\rho_1)<\varepsilon$.
2. Let $(q_i)$ be the non-zero rational points of $B(0,1)$ and for any $i$ let $r_i>0$ be such that
$$P\left(\rho_2^{(i)}<\rho_1\right)<2^{-(i+1)},$$
where of course
$$\rho_2^{(i)}\triangleq\inf\{t:w(t)\in\partial B(q_i,r_i)\}.$$
Let $G\triangleq\cup_iB(q_i,r_i)$. Obviously $G$ is open and
$$\tau_{\operatorname{cl}(G)}\triangleq\inf\{t:w(t)\in\operatorname{cl}(G)\}=\inf\{t:w(t)\in\operatorname{cl}(B(0,1))\}=0.$$
On the other hand obviously $\rho_1>0$, and if $\tau_G\triangleq\inf\{t:w(t)\in G\}$ then
$$P(\tau_G\ge\rho_1)=1-P(\tau_G<\rho_1)\ge1-\sum_iP\left(\rho_2^{(i)}<\rho_1\right)\ge1-\sum_i2^{-(i+1)}\ge\frac12.$$
Therefore $\tau_{\operatorname{cl}(G)}$ and $\tau_G$ are not almost surely equal.
6.2.2 Continuous Lévy processes
Let $X$ be a continuous Lévy process. Since $X$ is continuous, all the moments of $X$ are finite^{17}. Hence $X(t)$ has an expected value for every $t$. Observe that as on any finite interval the second moments are bounded, $X$ is uniformly integrable on these intervals. Therefore $\mathsf E(X(t))$ is continuous in $t$, hence $\mathsf E(X(t))=t\,\mathsf E(X(1))$. Therefore if $m$ denotes the expected value of $X(1)$ then $X(t)-t\cdot m$ is a martingale. This means that $X$ is a continuous semimartingale. To simplify the notation assume that $m=0$. By the definition of the quadratic variation $[X]$ is also a continuous Lévy process. This again implies that $Y(t)\triangleq[X](t)-\mathsf E([X](t))$ is a martingale. As $Y$ obviously has finite variation, by Fisk's theorem^{18} it is constant. So $[X](t)=\mathsf E([X](t))=a\cdot t$. By Itô's formula
$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}2\int_0^t\exp(iuX(s))\,d[X](s).$$
$\exp(iuX(t))$ is bounded and the quadratic variation of $X$ is deterministic, therefore by the characterization of $\mathcal H^2$-martingales^{19} the stochastic integral is a martingale. Taking expected value on both sides,
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\,\mathsf E\left(\int_0^t\exp(iuX(s))\,d[X](s)\right)=-\frac{u^2}2\,\mathsf E\left(\int_0^t\exp(iuX(s))\,d(as)\right)=-a\frac{u^2}2\int_0^t\mathsf E(\exp(iuX(s)))\,ds.$$
If $\varphi(u,t)\triangleq\mathsf E(\exp(iuX(t)))$ then
$$\varphi(u,t)-1=-a\frac{u^2}2\int_0^t\varphi(u,s)\,ds.$$
Differentiating with respect to $t$,
$$\frac{d\varphi(u,t)}{dt}=-a\frac{u^2}2\cdot\varphi(u,t).$$
Solving the differential equation,
$$\varphi(u,t)=\exp\left(-a\frac{u^2}2t\right)$$

17. See: Proposition 1.111, page 74.
18. See: Theorem 2.11, page 117.
19. See: Proposition 2.53, page 148.
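The closed form $\varphi(u,t)=\exp(-au^2t/2)$ can be compared with a Monte Carlo estimate of the characteristic function of $X=\sqrt a\,w$, the continuous Lévy process with $m=0$ and $[X](t)=at$. A sketch, not from the book; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
a, t, u, n = 2.0, 1.5, 0.8, 500_000
# X(t) = sqrt(a) * w(t), so [X](t) = a*t and m = 0
X_t = np.sqrt(a) * rng.normal(0.0, np.sqrt(t), n)
phi_mc = np.exp(1j * u * X_t).mean()
phi_exact = np.exp(-a * u**2 * t / 2)
err = abs(phi_mc - phi_exact)
print(err)  # small
```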
for every $u$. By the formula of the Fourier transform for the normal distribution, $X(t)\cong N\left(0,\sqrt{at}\right)$. Hence $X/\sqrt a$ is a Wiener process. In general $m$ is not zero, hence we have proved the next proposition:

Theorem 6.11 Every continuous Lévy process is a linear combination of a Wiener process and a linear trend.

One can extend the theorem to processes with independent increments:

Theorem 6.12 Every continuous process with independent increments is a Gaussian process, that is, for every $t_1,t_2,\dots,t_n$ the vector $(X(t_1),X(t_2),\dots,X(t_n))$ has Gaussian distribution.

Proof. If $X$ has independent increments then $Z(t)\triangleq X(t+s)-X(s)$ also has independent increments for every $s$. Therefore it is easy to prove that it is sufficient to show that $X(t)$ has a Gaussian distribution for every $t$. By the continuity of $X$ all the moments of $X$ are bounded on every finite interval^{20}. Therefore the expected value $\mathsf E(X(t))$ is finite for every $t$. As $X$ is bounded in $L^2(\Omega)$ on every finite interval, it is uniformly integrable on any finite interval, so $\mathsf E(X(t))$ is continuous. Hence it is easy to see that $Y(t)\triangleq X(t)-\mathsf E(X(t))$ is a continuous martingale. Therefore one may assume that $X$ is a continuous martingale. As $X$ has independent increments, $[X]$ also has independent increments, so $U(t)\triangleq[X](t)-\mathsf E([X](t))$ is again a continuous martingale. As $[X]$ is increasing, $U$ has finite variation. So by Fisk's theorem almost surely $U\equiv0$. Therefore one can assume that $[X]$ is deterministic. By Itô's formula
$$\exp(iuX(t))-1=iu\int_0^t\exp(iuX(s))\,dX(s)-\frac{u^2}2\int_0^t\exp(iuX(s))\,d[X](s).$$
$\exp(iuX)$ is bounded and on any finite interval $X\in\mathcal H^2$, therefore the stochastic integral is a martingale^{21}. Taking expected value,
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\cdot\mathsf E\left(\int_0^t\exp(iuX(s))\,d[X](s)\right).\tag{6.11}$$

20. See: Proposition 1.114, page 78.
21. See: Proposition 2.24, page 128.
The quadratic variation is deterministic so one can change the order of the integration:
$$\mathsf E(\exp(iuX(t)))-1=-\frac{u^2}2\cdot\int_0^t\mathsf E(\exp(iuX(s)))\,d[X](s).$$
If $\varphi(u,t)\triangleq\mathsf E(\exp(iuX(t)))$, then $\varphi$ satisfies the integral equation
$$\varphi(u,t)-1=-\frac{u^2}2\cdot\int_0^t\varphi(u,s)\,d[X](s).\tag{6.12}$$
If
$$\varphi(u,t)\triangleq\exp\left(-\frac{u^2}2[X](t)\right)\tag{6.13}$$
then, as $[X]$ is deterministic with finite variation, $\varphi$ satisfies^{22} (6.12). One can easily prove^{23} that (6.13) is the only solution of (6.12). Therefore $X(t)$ has a Gaussian distribution for every $t$.

6.2.3 Lévy's characterization of Wiener processes
The characterization theorem of Lévy is similar to the proposition just proved: it characterizes Wiener processes among the continuous local martingales. If $X\in\mathcal L$ and if $[X](t)=t$ then by the same argument^{24} as above one can prove that $X(t)\cong N\left(0,\sqrt t\right)$ for every $t$. As $X(t+s)-X(s)\in\mathcal L$, the increments of $X$ are also Gaussian. As $X(u)-X(v)\cong N\left(0,\sqrt{u-v}\right)$, it is easy to prove that the increments of $X$ are not correlated. As $X$ has Gaussian increments, the increments are independent. Therefore by the same argument as above one can prove that $X$ is a Wiener process with respect to its own filtration^{25}. Our goal is to prove that $X$ is a Wiener process with respect to the original filtration^{26}.

Theorem 6.13 (Lévy's characterization of Wiener processes) Let us fix a filtration $\mathcal F$. If the $n$-dimensional continuous process $X\triangleq(X_1,X_2,\dots,X_n)$ is zero at $t=0$ then the next three statements are equivalent:
1. $X$ is an $n$-dimensional Wiener process with respect to $\mathcal F$.
2. $X$ is a local martingale with respect to $\mathcal F$ and $[X_i,X_j](t)=\delta_{ij}t$.

22. See: (6.32), page 398.
23. See: (6.48), page 416.
24. Of course $X\in\mathcal H^2_{loc}$ and not $X\in\mathcal H^2$, so one can first localize $X$ and then take the limit in (6.11); otherwise the argument is nearly the same.
25. See: Definition B.1, page 559.
26. See: Definition B.4, page 561.
3. Whenever $f_k\in L^2(\mathbb R^+,\lambda)$, where $\lambda$ is Lebesgue's measure, then
$$\mathcal E(i(f\bullet X))(t)\triangleq\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k+\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$
will be a complex martingale with respect to $\mathcal F$.

In particular, if $X$ is a continuous local martingale and $Y(t)\triangleq X^2(t)-t$ is a continuous local martingale then $X$ is a Wiener process.

Proof. Let us show that each statement implies the next one.
1. The implication 1. $\Rightarrow$ 2. follows^{27} from the relation $[w](t)=t$.
2. The proof of the implication 2. $\Rightarrow$ 3. is the following: using Itô's formula, with a simple calculation one can show that $\mathcal E(if\bullet X)$ is a local martingale. As $f_k\in L^2(\mathbb R^+,\lambda)$,
$$\mathcal E(i(f\bullet X))(t)=\exp\left(i\sum_{k=1}^n\int_0^tf_k\,dX_k\right)\exp\left(\frac12\sum_{k=1}^n\int_0^tf_k^2\,d\lambda\right)$$
is uniformly bounded, hence it is a local martingale in class D. Hence $\mathcal E(if\bullet X)$ is a martingale^{28}.
3. Finally we prove the implication 3. $\Rightarrow$ 1. If $u\in\mathbb R^n$, $0\le r<\infty$ and $f\triangleq u\chi([0,r])$ then, as $X(0)=0$,
$$\mathcal E(if\bullet X)(t)=\exp\left(i\sum_{k=1}^n\int_0^tu_k\chi([0,r])\,dX_k+\frac12\|u\|^2(t\wedge r)\right)=\exp\left(i(u,X(r\wedge t))+\frac12\|u\|^2(t\wedge r)\right).$$
$\mathcal E(if\bullet X)\ne0$ is a martingale, hence if $s<t<r$ then
$$1=\mathsf E\left(\mathcal E(if\bullet X)(t)\left(\mathcal E(if\bullet X)(s)\right)^{-1}\;\middle|\;\mathcal F_s\right)=\mathsf E\left(\exp\left(i(u,X(t)-X(s))+\frac12\|u\|^2(t-s)\right)\;\middle|\;\mathcal F_s\right),$$
therefore
$$\mathsf E\left(\exp\left(i(u,X(t)-X(s))\right)\;\middle|\;\mathcal F_s\right)=\exp\left(-\frac12\|u\|^2(t-s)\right),$$

27. See: Example 2.27, page 129; Example 2.46, page 144.
28. See: Proposition 1.144, page 102.
which means that for any set $F\in\mathcal F_s$
$$\int_F\exp\left(i(u,X(t)-X(s))\right)dP=P(F)\cdot\exp\left(-\frac12\|u\|^2(t-s)\right).$$
If $F=\Omega$ then this implies that the distribution of $X_i(t)-X_i(s)$ is $N\left(0,\sqrt{t-s}\right)$. Therefore
$$\int_F\exp\left(i(u,X(t)-X(s))\right)dP=P(F)\cdot\int_\Omega\exp\left(i(u,X(t)-X(s))\right)dP.$$
Since this equality holds for every trigonometric polynomial, by the Monotone Class Theorem for every $B\in\mathcal B(\mathbb R^n)$
$$P\left(\{X(t)-X(s)\in B\}\cap F\right)=\int_F\chi_B(X(t)-X(s))\,dP=P(F)\int_\Omega\chi_B(X(t)-X(s))\,dP=P\left(\{X(t)-X(s)\in B\}\right)\cdot P(F).$$
Hence the increment $X(t)-X(s)$ is independent of the $\sigma$-algebra $\mathcal F_s$. So $X$ is a Wiener process.
Example 6.14 For every Wiener process $w$ the integral $\operatorname{sgn}(w)\bullet w$ is a Wiener process.

The process is a continuous local martingale. The quadratic variation of $\operatorname{sgn}(w)\bullet w$ is
$$\int_0^t\left(\operatorname{sgn}(w)\right)^2d[w]=\int_0^t\left(\operatorname{sgn}(w(s))\right)^2ds=t.$$
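Example 6.14 can be illustrated by discretizing the stochastic integral with left-endpoint Itô sums (an illustrative sketch, not from the book; path and step counts are arbitrary): the resulting variable should have mean $\approx0$ and variance $\approx t$.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps, t = 5_000, 1_000, 1.0
dt = t / n_steps
dw = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
w = np.cumsum(dw, axis=1)
# Ito sums: integrand sgn(w) evaluated at the LEFT endpoint of each step
sgn = np.sign(np.hstack([np.zeros((n_paths, 1)), w[:, :-1]]))
I = (sgn * dw).sum(axis=1)
print(I.mean(), I.var())  # approximately 0 and t = 1
```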
Example 6.15 The reflected Wiener process is also a Wiener process.

Let $w$ be a Wiener process and let $\tau$ be a stopping time. Define the reflected process
$$\widetilde w(t,\omega)\triangleq\begin{cases}w(t,\omega)&\text{if }t\le\tau(\omega)\\2w(\tau(\omega),\omega)-w(t,\omega)&\text{if }t>\tau(\omega)\end{cases}=\left(2w^\tau-w\right)(t,\omega).$$
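The reflection can be implemented directly on simulated paths (illustrative sketch, not from the book; the stopping time "first passage of the level $1/2$" is an arbitrary choice). In the discrete model, flipping the increments after a stopping time preserves the distribution, so $\widetilde w(T)$ should again be centered with variance $T$.

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T = 10_000, 500, 1.0
dt = T / n_steps
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
# stopping time: first index where the path reaches the level 0.5
reached = w >= 0.5
hit = np.argmax(reached, axis=1)
never = ~reached.any(axis=1)          # tau > T for these paths
w_tau = w[np.arange(n_paths), hit]
# reflected endpoint: w(T) if tau > T, else 2 w(tau) - w(T)
wt = np.where(never, w[:, -1], 2 * w_tau - w[:, -1])
print(wt.mean(), wt.var())  # approximately 0 and T = 1
```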
Obviously $\widetilde w(0)=0$ and the trajectories of $\widetilde w$ are continuous. It is also obvious that
$$\left[\widetilde w\right]=\left[2w^\tau-w\right]=4\left[w^\tau\right]-4\left[w^\tau,w\right]+[w]=4[w]^\tau-4[w]^\tau+[w]=[w].$$
As $w^\tau$ is a martingale and the sum of martingales is again a martingale, $\widetilde w$ is a continuous local martingale, so by Lévy's theorem it is a Wiener process.

Let us discuss an interesting relation between exponential martingales and the quadratic variation:

Proposition 6.16 Let $X$ and $A$ be continuous adapted processes on the half-line $t\ge0$. If $X(0)=0$ then the next statements are equivalent:
1. $A$ has finite variation and for every $\alpha\in\mathbb C$ the process $Y_\alpha\triangleq\exp\left(\alpha X-\alpha^2A/2\right)$ is a local martingale,
2. $X$ is a local martingale and $[X]=A$.
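Statement 1 can be sanity-checked for $X=w$ and $A(t)=t$: for any complex $\alpha$ the variable $Y_\alpha(t)=\exp(\alpha w(t)-\alpha^2t/2)$ has expectation $Y_\alpha(0)=1$. A Monte Carlo sketch, not from the book; the two values of $\alpha$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
t, n = 1.0, 400_000
w = rng.normal(0.0, np.sqrt(t), n)   # X = w, A(t) = [w](t) = t
means = {}
for alpha in (0.5, 1.0 + 0.5j):      # one real, one genuinely complex alpha
    means[alpha] = np.exp(alpha * w - alpha**2 * t / 2).mean()
print(means)  # each value approximately 1
```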
Proof. We prove that each statement implies the other one.
1. Assume that $Y_\alpha$ is a local martingale and let $(\sigma_n)$ be a localizing sequence of $Y_\alpha$. Let
$$\tau_n\triangleq\inf\{t:|X(t)|\ge n\}\wedge\inf\{t:|A(t)|\ge n\}\wedge\sigma_n.$$
$Y_\alpha^{\tau_n}$ is a martingale and obviously
$$\left|Y_\alpha^{\tau_n}\right|\le\exp\left(|\alpha|n+\frac12|\alpha|^2n\right),\qquad\left|\frac d{d\alpha}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|X^{\tau_n}-\alpha A^{\tau_n}\right|,\qquad\left|\frac{d^2}{d\alpha^2}Y_\alpha^{\tau_n}\right|\le\left|Y_\alpha^{\tau_n}\right|\left|\left(X^{\tau_n}-\alpha A^{\tau_n}\right)^2-A^{\tau_n}\right|.$$
It is easy to see that if $\alpha$ is in a bounded neighbourhood of the origin then the expressions on the right-hand side are bounded. Hence in the next calculation one can differentiate under the integral sign at $\alpha=0$.
If $\alpha=0$ then
$$\frac d{d\alpha}Y_\alpha^{\tau_n}=X^{\tau_n},$$
hence for any $F\in\mathcal F_s$
$$\int_F\mathsf E\left(X^{\tau_n}(t)\mid\mathcal F_s\right)dP=\int_F\mathsf E\left(\frac d{d\alpha}Y_\alpha^{\tau_n}(t)\;\middle|\;\mathcal F_s\right)dP=\int_F\frac d{d\alpha}Y_\alpha^{\tau_n}(t)\,dP=\frac d{d\alpha}\int_FY_\alpha^{\tau_n}(t)\,dP=\frac d{d\alpha}\int_FY_\alpha^{\tau_n}(s)\,dP=\int_F\frac d{d\alpha}Y_\alpha^{\tau_n}(s)\,dP=\int_FX^{\tau_n}(s)\,dP,$$
therefore almost surely
$$\mathsf E\left(X^{\tau_n}(t)\mid\mathcal F_s\right)=X^{\tau_n}(s).$$
Therefore $X^{\tau_n}$ is a martingale, hence $X$ is a local martingale. In a similar way, using that at $\alpha=0$
$$\frac{d^2Y_\alpha^{\tau_n}}{d\alpha^2}=\left(X^{\tau_n}\right)^2-A^{\tau_n},$$
one can prove that $\left(X^{\tau_n}\right)^2-A^{\tau_n}$ is a martingale. This implies^{29} that $A$ is increasing and $[X]=A$.
2. The implication 2. $\Rightarrow$ 1. is an easy consequence of Itô's formula. As the quadratic variation of a continuous semimartingale is equal to the quadratic variation of its local martingale part, if $Z\triangleq\alpha X-\alpha^2A/2$ then $Y_\alpha=\exp(Z)$ and
$$Y_\alpha-Y_\alpha(0)=Y_\alpha\bullet Z+\frac12Y_\alpha\bullet[Z]=Y_\alpha\bullet\left(\alpha X-\alpha^2\frac A2\right)+\frac12Y_\alpha\bullet[\alpha X]$$

29. See: Proposition 2.40, page 141.
$$=\alpha Y_\alpha\bullet X-\frac{\alpha^2}2Y_\alpha\bullet[X]+\frac{\alpha^2}2Y_\alpha\bullet[X]=\alpha Y_\alpha\bullet X,$$
which is, as a stochastic integral with respect to a continuous local martingale, a local martingale.

6.2.4 Integral representation theorems for Wiener processes
In this subsection we return to the Integral Representation Problem. Let $w$ be a Wiener process and let $\mathcal F$ be the filtration generated by $w$. Let $L$ be a local martingale with respect to $\mathcal F$ and assume that $L(0)=0$. Every local martingale has an $\mathcal H^1$-localization^{30}. By the integral representation property of Wiener processes^{31}, $L^{\tau_n}=H\bullet w$ on any finite interval. Hence
$$[L]^{\tau_n}=\left[L^{\tau_n}\right]=[H\bullet w]=H^2\bullet[w].$$
As $[w](t)=t$ it is obvious that $[L]$ is continuous. Therefore $L$ is continuous. So $L\in\mathcal H^2_{loc}$ and one can assume that $L^{\tau_n}\in\mathcal H^2$. This implies that $H\in L^2(w)$. By Itô's isometry^{32}, $H$ is unique in $L^2(w)$. Hence $L=H\bullet w$ for some $H\in L^2_{loc}(w)$.

Proposition 6.17 If $w$ is a Wiener process and $L$ is a local martingale with respect to the filtration generated by $w$, then $L$ is continuous and $L=L(0)+H\bullet w$ with some $H\in L^2_{loc}(w)$.

Our next statement is an easy consequence of Lévy's characterization theorem.

Proposition 6.18 (Doob) Let $M$ be a continuous local martingale on a stochastic base $(\Omega,\mathcal A,P,\mathcal F)$. If the quadratic variation of $M$ has the representation
$$[M](t,\omega)=\int_0^t\alpha^2(s,\omega)\,ds,\tag{6.14}$$

30. See: Corollary 3.59, page 221.
31. See: Example 5.50, page 347.
32. See: Proposition 2.64, page 156.
where $\alpha(t,\omega)>0$ and $\alpha$ is an adapted and product measurable process, then there is a Wiener process $w$ on $(\Omega,\mathcal A,P,\mathcal F)$ for which
$$M(t)=M(0)+\int_0^t\alpha(s)\,dw(s).$$

Proof. One can explicitly construct the Wiener process $w$:
$$w\triangleq\frac1\alpha\bullet M.\tag{6.15}$$
First we prove that the integral exists. $[M]\ll\lambda$, so if $\alpha_M$ is the Doléans measure of $M$ then $\alpha_M\ll\lambda\times P$. Therefore the stochastic integrals are defined among adapted product measurable processes^{33}. As
$$\int_0^t\frac1{\alpha^2}\,d[M]=\int_0^t\frac1{\alpha^2}\alpha^2\,ds=t<\infty,$$
$1/\alpha\in L^2_{loc}(M)$. So $1/\alpha$ is integrable with respect to $M$; that is, integral (6.15) exists. As $M$ is continuous, $w$ is a continuous local martingale. By (6.14)
$$[w](t)=\left[\frac1\alpha\bullet M\right](t)=\frac1{\alpha^2}\bullet[M](t)=t.$$
Therefore by Lévy's theorem $w$ is a Wiener process. By (6.14) $\alpha$ is integrable with respect to $w$, therefore
$$\alpha\bullet w\triangleq\alpha\bullet\left(\frac1\alpha\bullet M\right)=\alpha\frac1\alpha\bullet M=1\bullet M=M-M(0).$$
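The construction (6.15) can be mimicked on a grid (illustrative sketch, not from the book; the deterministic positive $\alpha$ below is an arbitrary choice): building $M\triangleq\alpha\bullet w_0$ and then $w\triangleq(1/\alpha)\bullet M$, the discrete quadratic variation of $w$ comes out $\approx t$, as Lévy's theorem requires.

```python
import numpy as np

rng = np.random.default_rng(4)
n_steps, T = 100_000, 1.0
dt = T / n_steps
t = np.arange(n_steps) * dt
dw0 = rng.normal(0.0, np.sqrt(dt), n_steps)
alpha = 1.0 + 0.5 * np.sin(2 * np.pi * t)  # deterministic, strictly positive
dM = alpha * dw0                           # M = alpha . w0, so [M] = alpha^2 . lambda
dw = dM / alpha                            # w = (1/alpha) . M
qv = np.cumsum(dw**2)                      # discrete quadratic variation of w
print(qv[-1])  # approximately T = 1
```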
Hence the proposition holds.

Corollary 6.19 Let $M$ be a continuous local martingale on a stochastic base $(\Omega,\mathcal A,P,\mathcal F)$. If $[M]\ll\lambda$ then there is an extension $(\widetilde\Omega,\widetilde{\mathcal A},\widetilde P,\widetilde{\mathcal F})$ of $(\Omega,\mathcal A,P,\mathcal F)$ and a Wiener process $w$ on the extended base space such that
$$M(t)=M(0)+\int_0^t\sqrt{\frac{d[M]}{d\lambda}}\,dw(s).$$

Proof. Let $w_0$ be an arbitrary Wiener process on some stochastic base $(\Omega_0,\mathcal A_0,P_0,\mathcal F_0)$. Let the new stochastic base be the product of $(\Omega,\mathcal A,P,\mathcal F)$ and

33. See: Proposition 5.20, page 314.
$(\Omega_0,\mathcal A_0,P_0,\mathcal F_0)$. Obviously $w_0$ is independent of $\mathcal A$. Let us define $\alpha$ by $[M](t)\triangleq\int_0^t\alpha^2(s)\,ds$; that is, let
$$\alpha\triangleq\sqrt{\frac{d[M]}{d\lambda}}.$$
The process
$$w(t)\triangleq\int_0^t\frac1\alpha\chi(\alpha>0)\,dM+\int_0^t\chi(\alpha=0)\,dw_0$$
is a continuous local martingale. The quadratic co-variation of independent local martingales is zero^{34}, so $[M,w_0]=0$. Therefore
$$[w](t)=\int_0^t\chi(\alpha>0)\,ds+\int_0^t\chi(\alpha=0)\,ds=t.$$
Hence by Lévy's theorem $w$ is a Wiener process.
$$\alpha\bullet w\triangleq\alpha\bullet\left(\frac1\alpha\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet w_0\right)=\chi(\alpha>0)\bullet M.$$
On the other hand $[\chi(\alpha=0)\bullet M]=\chi(\alpha=0)\bullet[M]=0$, hence $\chi(\alpha=0)\bullet M=0$. So
$$\alpha\bullet w=\chi(\alpha>0)\bullet M+\chi(\alpha=0)\bullet M=1\bullet M=M-M(0).$$
6.2.5 Bessel processes

As an application of Lévy's theorem let us investigate the Bessel processes. Let $w\triangleq(w_1,w_2,\dots,w_d)$ be a $d$-dimensional Wiener process. Define the Bessel process
$$R\triangleq\|w\|\triangleq\|w\|_2\triangleq\sqrt{\sum_{k=1}^dw_k^2}.$$
We assume that $w$ starts at $x\in\mathbb R^d$, that is, $R(0)=\|x\|$. If it is necessary we shall explicitly indicate the initial value $x$. Evidently the distribution of $R$

34. See: Example 2.46, page 144.
depends on $x$ only through the size of $r\triangleq\|x\|$: if $\|x\|=\|y\|$ then $Qx=y$ for some orthonormal transformation $Q$. It is easy to show that $Qw$ is also a Wiener process and $Qw$ starts at $y$. Obviously $R_x\triangleq\|w\|=\|Qw\|\triangleq R_y$.

Proposition 6.20 If $d\ge2$ and $r\ge0$ then, if we start $w$ from some point $x\in\mathbb R^d$ with $r=\|x\|$, the process $R\triangleq\|w\|$ satisfies the integral equation
$$R(t)=r+\int_0^t\frac{d-1}{2R(s)}\,ds+B(t),\qquad0\le t<\infty,\tag{6.16}$$
where $B$ is a Wiener process and
$$B\triangleq\sum_kB^{(k)},\qquad B^{(k)}(s)\triangleq\int_0^s\frac{w_k}R\,dw_k.\tag{6.17}$$
Put another way, $R\triangleq\|w\|$ satisfies the stochastic differential equation
$$dR=\frac{d-1}{2R}\,dt+dB.$$
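Proposition 6.20 can be illustrated numerically (a sketch, not from the book; the dimension, step sizes, and the reflecting safeguard `np.abs` are ad hoc choices): the norm of a simulated $d$-dimensional Wiener process and an Euler–Maruyama discretization of $dR=\frac{d-1}{2R}dt+dB$, started from the same $r$, should produce approximately the same distribution — here we compare the means at time $T$.

```python
import numpy as np

rng = np.random.default_rng(5)
d, r, T, n_steps, n_paths = 3, 1.0, 1.0, 500, 5_000
dt = T / n_steps

# (a) R as the norm of a d-dimensional Wiener process started from x, |x| = r
x = np.zeros(d)
x[0] = r
w = x + np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps, d)), axis=1)
R_norm = np.linalg.norm(w[:, -1, :], axis=1)

# (b) Euler-Maruyama for dR = (d-1)/(2R) dt + dB, same r; np.abs is only a
#     numerical safeguard against the rare step that would cross zero
R = np.full(n_paths, r)
for _ in range(n_steps):
    R = np.abs(R + (d - 1) / (2 * R) * dt + rng.normal(0.0, np.sqrt(dt), n_paths))
print(R_norm.mean(), R.mean())  # the two means approximately agree
```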
Proof. First observe that the expression in (6.16) is meaningful: as $d\ge2$, the $R(s)$ in the denominator is almost surely not zero for every $s\ge0$. As the integral in (6.16) is taken by trajectories, it is also meaningful. On the other hand
$$\int_0^t\left(\frac{w_k}R\right)^2d[w_k]=\int_0^t\left(\frac{w_k}R\right)^2d\lambda\le\int_0^t1\,d\lambda=t,$$
hence the stochastic integrals in (6.17) are in $L^2(w_k)$ on every finite interval. Therefore the stochastic integrals $B^{(k)}$ are also meaningful.
1. By the formula for the quadratic co-variation of stochastic integrals
$$\left[B^{(k)},B^{(l)}\right](t)=\int_0^t\frac{w_kw_l}{R^2}\,d[w_k,w_l]=\delta_{kl}\int_0^t\frac{w_k^2}{R^2}\,d\lambda,$$
therefore
$$[B](t)=\sum_k\left[B^{(k)}\right](t)=\int_0^t\sum_k\frac{w_k^2}{R^2}\,d\lambda=\int_0^t1\,d\lambda=t.$$
The sum of local martingales is again a local martingale. Therefore by the characterization theorem of Lévy, $B$ is a Wiener process.
2. The proof of (6.16) uses the integration by parts formula:
$$R^2(t)-R^2(0)=2\sum_kw_k\bullet w_k+\sum_k[w_k](t)=2\sum_k\int_0^tw_k\,dw_k+t\cdot d.\tag{6.18}$$
The multi-dimensional Wiener processes are almost surely not zero^{35}, therefore almost surely $R^2>0$. Hence one can use Itô's formula with $\sqrt x$:
$$R-r=\frac12\frac1{\sqrt{R^2}}\bullet R^2-\frac18\frac1{\left(R^2\right)^{3/2}}\bullet\left[R^2\right]=\sum_k\frac{w_k}R\bullet w_k+\frac d{2R}\bullet\lambda-\frac18\frac4{R^3}\sum_kw_k^2\bullet\lambda=\sum_k\frac{w_k}R\bullet w_k+\frac{d-1}{2R}\bullet\lambda.\tag{6.19}$$
6.3 Change of measure for continuous semimartingales
The class of semimartingales is remarkably stable under a lot of operations. For example, by Itô's formula a $C^2$ transform of a semimartingale is again a semimartingale. Later we shall show that convex transforms of semimartingales are also semimartingales. In this section we return to the discussion of the operation of equivalent changes of measure.

6.3.1 Locally absolutely continuous change of measure

If a measure $Q$ is absolutely continuous with respect to $P$ then one can define the Radon–Nikodym derivative $dQ/dP$. If a filtration $\mathcal F$ satisfies the usual conditions then the process
$$\Lambda(t)\triangleq\mathsf E\left(\frac{dQ}{dP}\;\middle|\;\mathcal F_t\right)$$
is a martingale, and as
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ}{dP}\,dP=Q(F),\qquad F\in\mathcal F_t,$$
35. Let us remark that this is a critical observation, as here we used the assumption that $d\ge2$. If $d=1$ then one cannot use Itô's formula, as in this case one can only assume that $R^2\ge0$ and the function $\sqrt x$ for $x\ge0$ is not a $C^2$ function. If we formally still apply the formula, then we get the relation $R=\operatorname{sign}(w)\bullet w$. By Example 6.14 this expression is a Wiener process. The left-hand side is non-negative, hence the two sides cannot be equal.
$\Lambda(t)$ is the Radon–Nikodym derivative of $Q$ on $(\Omega,\mathcal F_t,P)$. On the other hand, let $Q(t)$ be the restriction of $Q$ and let $P(t)$ be that of $P$ to $\mathcal F_t$. If $Q(t)$ is absolutely continuous with respect to $P(t)$ then one can define the derivative
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}.$$
If $F\in\mathcal F_s\subseteq\mathcal F_t$ then
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ(t)}{dP(t)}\,dP=Q(F)=\int_F\frac{dQ(s)}{dP(s)}\,dP=\int_F\Lambda(s)\,dP,$$
hence $\Lambda$ is a martingale. Of course $\Lambda$ is not necessarily uniformly integrable, so it can happen that there is no $\xi$ for which $\Lambda(t)=\mathsf E(\xi\mid\mathcal F_t)$. To put it another way, it can happen that $Q\ll P$ on $\mathcal F_t$ for every $t$, but $Q$ is not absolutely continuous on the $\sigma$-algebra $\mathcal F_\infty\triangleq\sigma(\cup_t\mathcal F_t)$. So the derivative $dQ/dP$ need not necessarily exist. Recall the following definition:

Definition 6.21 We say that a measure $Q$ is locally absolutely continuous with respect to a measure $P$ if $Q(t)\ll P(t)$ for every $t$, where $Q(t)$ is the restriction of $Q$ and $P(t)$ is the restriction of $P$ to $\mathcal F_t$. We shall denote this relation by $Q\overset{loc}{\ll}P$. If $Q\overset{loc}{\ll}P$ and $P\overset{loc}{\ll}Q$ then we shall say that $P$ and $Q$ are locally equivalent. We shall denote this by $P\overset{loc}{\sim}Q$.

Definition 6.22 If $Q\overset{loc}{\ll}P$ then the right-regular version of
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}$$
is called the Radon–Nikodym process of $P$ and $Q$.

6.3.2 Semimartingales and change of measure
We have already proved the following important observations^{36}:

Proposition 6.23 (Invariance of semimartingales) If $Q\overset{loc}{\ll}P$ then every semimartingale under $P$ is a semimartingale under $Q$.

Proposition 6.24 (Integration and change of measure) Let $X$ be an arbitrary semimartingale and assume that the integral $H\bullet X$ exists under the measure $P$. If $Q\overset{loc}{\ll}P$ then $H\bullet X$ exists under $Q$ as well. Under the measure $Q$ the two processes, the integral under $P$ and the integral under $Q$, are indistinguishable.

36. See: Proposition 4.55, page 266; Corollary 4.58, page 271; Proposition 4.59, page 271.
Proposition 6.25 (Transformation of local martingales) Let $Q\overset{loc}{\ll}P$ and let $\Lambda$ be the Radon–Nikodym process of $P$ and $Q$. If $L$ is a continuous local martingale under the measure $P$ then under the measure $Q$:
1. $\Lambda^{-1}$ is well defined,
2. the integral $\Lambda^{-1}\bullet[L,\Lambda]$ exists and has finite variation on compact intervals,
3. the expression
$$\widetilde L\triangleq L-\Lambda^{-1}\bullet[L,\Lambda]\tag{6.20}$$
is a local martingale.
Corollary 6.26 If $Q\overset{loc}{\sim}P$ then $\Lambda>0$ and $\Lambda^{-1}$ is a martingale under $Q$.

Proof. One only needs to prove that $\Lambda^{-1}$ is a martingale under $Q$. If $F\in\mathcal F_s$ and $t>s$ then
$$\int_F\frac1{\Lambda(t)}\,dQ=\int_F\frac1{\Lambda(t)}\Lambda(t)\,dP=P(F)=\int_F\frac{\Lambda(s)}{\Lambda(s)}\,dP=\int_F\frac1{\Lambda(s)}\,dQ.$$

Corollary 6.27 If $Q\overset{loc}{\ll}P$ and $X$ and $Y$ are semimartingales then $[X,Y]$ calculated under $Q$ is indistinguishable under $Q$ from $[X,Y]$ calculated under $P$. If $L$ is a local martingale and $N$ is a continuous semimartingale then
$$[L,N]=\left[\widetilde L,N\right],$$
where $\widetilde L$ is as in (6.20).

Proof. As
$$[X,Y]\triangleq XY-X(0)Y(0)-Y_-\bullet X-X_-\bullet Y,$$
the first statement is obvious from Proposition 6.24. $\Lambda^{-1}\bullet[L,\Lambda]\in\mathcal V$ and $N$ is continuous, so
$$\left[\widetilde L,N\right]=\left[L-\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N]-\left[\Lambda^{-1}\bullet[L,\Lambda],N\right]=[L,N].$$

Definition 6.28 $\widetilde L$ in (6.20) is called the Girsanov transform of $L$.
6.3.3 Change of measure for continuous semimartingales
If $L$ is a continuous local martingale then from Itô's formula it is trivial that the exponential martingale
$$\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right)$$
is a positive local martingale.

Proposition 6.29 (Logarithm of local martingales) If $\Lambda$ is a positive and continuous local martingale then there is a continuous local martingale
$$L\triangleq\operatorname{Log}(\Lambda)\triangleq\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda,$$
which is the only continuous local martingale for which
$$\Lambda=\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right).$$
Moreover
$$\log\Lambda=L-\frac12[L]=\operatorname{Log}(\Lambda)-\frac12\left[\operatorname{Log}(\Lambda)\right].$$
Proof. If $\Lambda=\mathcal E(L_1)=\mathcal E(L_2)$, then as $\Lambda>0$
$$1=\frac\Lambda\Lambda=\exp\left(L_1-L_2-\frac12[L_1]+\frac12[L_2]\right),$$
that is, $L_1-L_2=\frac12\left([L_1]-[L_2]\right)$. Hence the continuous local martingale $L_1-L_2$ has bounded variation and it is constant. Evidently $L_1(0)=L_2(0)$, therefore $L_1=L_2$. As $\Lambda>0$ the expression $\log\Lambda$ is meaningful. By Itô's formula
$$\log\Lambda=\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda-\frac12\frac1{\Lambda^2}\bullet[\Lambda]=L-\frac12\frac1{\Lambda^2}\bullet[\Lambda]=L-\frac12[L].$$
Therefore
$$\Lambda=\exp(\log\Lambda)=\exp\left(L-\frac12[L]\right)\triangleq\mathcal E(L).$$

Proposition 6.30 (Logarithmic transformation of local martingales) Assume that $P\overset{loc}{\sim}Q$ and let
$$\Lambda(t)\triangleq\frac{dQ(t)}{dP(t)}$$
be continuous. If $\Lambda=\mathcal E(L)$, that is, $L=\operatorname{Log}(\Lambda)$, then
$$\frac{dP}{dQ}(t)=\left(\frac{dQ}{dP}(t)\right)^{-1}=\left(\mathcal E(L)(t)\right)^{-1}=\mathcal E\left(-\widetilde L\right)(t).$$
If $M$ is a local martingale under measure $P$ then
$$\widetilde M\triangleq M-[M,L]=M-[M,\operatorname{Log}(\Lambda)]\tag{6.21}$$
is a local martingale under measure $Q$.

Proof. $\Lambda>0$ as $P\overset{loc}{\sim}Q$.
$$[M,L]\triangleq[M,\operatorname{Log}(\Lambda)]\triangleq\left[M,\log\Lambda(0)+\Lambda^{-1}\bullet\Lambda\right]=\left[M,\Lambda^{-1}\bullet\Lambda\right]=\Lambda^{-1}\bullet[M,\Lambda],$$
so
$$\widetilde M\triangleq M-\Lambda^{-1}\bullet[M,\Lambda]=M-[M,L].$$
As $\widetilde L=L-[L,L]$,
$$\mathcal E\left(-\widetilde L\right)=\exp\left(-\widetilde L-\frac12\left[-\widetilde L,-\widetilde L\right]\right)=\exp\left(-L+[L,L]-\frac12[L,L]\right)=\exp\left(-\left(L-\frac12[L,L]\right)\right)=\left(\mathcal E(L)\right)^{-1}.$$

Proposition 6.31 (Girsanov's formula) If $M$ and $L\in\mathcal L$ are continuous local martingales and the process
$$\Lambda\triangleq\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right)$$
is a martingale on the finite or infinite interval $[0,s]$, then under the measure
$$Q(A)\triangleq\int_A\Lambda(s)\,dP\tag{6.22}$$
the process
$$\widetilde M\triangleq M-[L,M]=M-\frac1\Lambda\bullet[\Lambda,M]$$
is a continuous local martingale on $[0,s]$.
Proof. $L(0)=0$, therefore $\Lambda(0)=1$. $\Lambda$ is a martingale on $[0,s]$, so
$$Q(\Omega)=\int_\Omega\Lambda(s)\,dP=1.$$
Hence $Q$ is also a probability measure.
$$\Lambda(t)=\mathsf E(\Lambda(s)\mid\mathcal F_t)\triangleq\mathsf E\left(\frac{dQ}{dP}\;\middle|\;\mathcal F_t\right),$$
that is, if $F\in\mathcal F_t$ then
$$\int_F\Lambda(t)\,dP=\int_F\frac{dQ}{dP}\,dP=Q(F),$$
so $\Lambda(t)=dQ(t)/dP(t)$ on $\mathcal F_t$. The other parts of the proposition are obvious from Proposition 6.30.

6.3.4 Girsanov's formula for Wiener processes
Let $w$ be a Wiener process under measure $P$. If $Q\overset{loc}{\ll}P$ then $w$ is a continuous semimartingale^{37} under $Q$. Let $M+V$ be its decomposition under $Q$. $M$ is a continuous local martingale and $M(0)=0$. The quadratic variation of $M$ under $Q$ is^{38}
$$[M](t)=[M+V](t)=[w](t)=t.$$
By Lévy's theorem^{39} $M$ is therefore a Wiener process under the measure $Q$. By (6.20)
$$\widetilde w\triangleq w-\Lambda^{-1}\bullet[w,\Lambda]$$
is a continuous local martingale. As $\Lambda^{-1}\bullet[w,\Lambda]$ has finite variation, by Fisk's theorem $M=\widetilde w$. If $\mathcal F$ is the augmented filtration of $w$ then by the integral representation property of the Wiener processes $\Lambda$ is continuous^{40}. If $Q\overset{loc}{\sim}P$ then $\Lambda>0$, hence for some $L$
$$\Lambda\triangleq\mathcal E(L)\triangleq\exp\left(L-\frac12[L]\right).$$

37. See: Proposition 6.23, page 378.
38. See: Example 2.26, page 129.
39. See: Theorem 6.13, page 368.
40. See: Proposition 6.17, page 373.
Therefore by Proposition 6.30
$$M=\widetilde w=w-[w,L].$$
If $\mathcal F$ is the augmented filtration of $w$ then $\mathcal F_0$ is the trivial $\sigma$-algebra, so $\Lambda(0)=1$, hence $L(0)=0$. Again by the integral representation theorem there exists an $X\in L^2_{loc}(w)$ such that
$$L=L(0)+X\bullet w=X\bullet w,\qquad X\in L^2_{loc}(w).$$
Hence
$$M=\widetilde w=w-[w,L]=w-[w,X\bullet w]=w-X\bullet[w].$$
Hence if $P\overset{loc}{\sim}Q$ then there is an $X\in L^2_{loc}(w)$ such that
$$\Lambda(t)\triangleq\exp\left(\int_0^tX(s)\,dw(s)-\frac12\int_0^tX^2(s)\,ds\right)\triangleq\exp\left(X\bullet w-\frac12X^2\bullet[w]\right)(t)\triangleq\mathcal E(X\bullet w)\tag{6.23}$$
and
$$\widetilde w(t)\triangleq w(t)-\int_0^tX(s)\,ds,\qquad X\in L^2_{loc}(w),\tag{6.24}$$
is a Wiener process under $Q$. On the other hand, let $X\in L^2_{loc}(w,[0,s])$. Assume that $\Lambda$ in (6.23) is a martingale on $[0,s]$. Define the measure $Q$ by $dQ/dP\triangleq\Lambda(s)$. Obviously the process in (6.24) is a Wiener process under $Q$.

Theorem 6.32 (Girsanov formula for Wiener processes) Let $w$ be a Wiener process under measure $P$ and let $\mathcal F$ be the augmented filtration of $w$. Girsanov's transform $\widetilde w$ of $w$ has the following properties:
1. If $Q\overset{loc}{\ll}P$ then the Girsanov transform of $w$ is a Wiener process under measure $Q$.
2. If $Q\overset{loc}{\sim}P$ then the Girsanov transform of $w$ has the representation (6.24).
3. If $X\in L^2_{loc}(w)$ and the process $\Lambda$ in line (6.23) is a martingale over the segment $[0,s]$ then the process $\widetilde w$ in (6.24) is a Wiener process over $[0,s]$ under the measure $Q$ where $dQ/dP\triangleq\Lambda(s)$.

Example 6.33 Even on finite intervals $\Lambda\triangleq\mathcal E(X\bullet w)$ is not always a martingale.
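The positive direction of Theorem 6.32 admits a simple one-dimensional check with a constant integrand $X\triangleq\mu$ (illustrative sketch, not from the book; $\mu$ and $T$ are arbitrary): $\Lambda(T)$ in (6.23) integrates to one, and reweighting by $\Lambda(T)$ centers $w(T)$ at $\mu T$, i.e. $\widetilde w=w-\mu t$ has $Q$-mean zero.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, T, n = 0.7, 1.0, 1_000_000
wT = rng.normal(0.0, np.sqrt(T), n)
# density of Q w.r.t. P on F_T for the constant integrand X = mu, as in (6.23)
Lam = np.exp(mu * wT - 0.5 * mu**2 * T)
mass = Lam.mean()                               # approx 1: Lambda is a martingale here
shift = np.average(wT - mu * T, weights=Lam)    # approx 0: Q-mean of w~(T)
print(mass, shift)
```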
Let $s=1$ and let $\tau\triangleq\inf\{t:w^2(t)=1-t\}$. If $t=0$ then almost surely $w^2(t,\omega)<1-t$, and if $t=1$ then almost surely $w^2(t,\omega)>1-t$. So by the intermediate value theorem $P(0<\tau<1)=1$. If
$$X(t)\triangleq\frac{-2w(t)\chi(\tau\ge t)}{(1-t)^2},$$
then, as $\tau<1$ and $w^2(t)\le1-t$ for $t\le\tau$,
$$\int_0^1X^2\,d[w]=4\int_0^\tau\frac{w^2(t)}{(1-t)^4}\,dt\le4\int_0^\tau\frac{1-t}{(1-t)^4}\,dt=4\int_0^\tau\frac{dt}{(1-t)^3}<\infty.$$
Hence $X\in L^2_{loc}(w,[0,1])$. By Itô's formula, if $t<1$ then
$$\frac{w^2(t)}{(1-t)^2}=\int_0^t\frac{2w^2(s)}{(1-s)^3}\,ds+\int_0^t\frac{2w(s)}{(1-s)^2}\,dw(s)+\int_0^t\frac{ds}{(1-s)^2}.$$
From this
$$\begin{aligned}
I&\triangleq\int_0^1X\,dw-\frac12\int_0^1X^2\,ds=(X\bullet w)(\tau)-\frac12\left(X^2\bullet[w]\right)(\tau)\\
&=-\frac{w^2(\tau)}{(1-\tau)^2}+\int_0^\tau\frac{2w^2(s)}{(1-s)^3}\,ds+\int_0^\tau\frac{ds}{(1-s)^2}-2\int_0^\tau\frac{w^2(s)}{(1-s)^4}\,ds\\
&=-\frac1{1-\tau}+\int_0^\tau\left(2w^2(s)\left(\frac1{(1-s)^3}-\frac1{(1-s)^4}\right)+\frac1{(1-s)^2}\right)ds\\
&\le-\frac1{1-\tau}+\int_0^\tau\frac{ds}{(1-s)^2}=-\frac1{1-\tau}+\left(\frac1{1-\tau}-1\right)=-1,
\end{aligned}$$
where we used that $w^2(\tau)=1-\tau$ and that $(1-s)^{-3}\le(1-s)^{-4}$ for $0\le s<1$. Therefore $\Lambda(1)=\exp(I)\le1/e$. Hence
$$\mathsf E(\Lambda(1))=\mathsf E(\exp(I))\le\frac1e<1=\mathsf E(\Lambda(0)),$$
so $\Lambda$ is not a martingale.

Example 6.34 If $\widetilde w(t)\triangleq w(t)-\mu\cdot t$ then there is no probability measure $Q\ll P$ on $\mathcal F_\infty$ for which $\widetilde w$ is a Wiener process under $Q$.

Let $\mu\ne0$ and let
$$A\triangleq\left\{\lim_{t\to\infty}\frac{\widetilde w(t)}t=0\right\}=\left\{\lim_{t\to\infty}\frac{w(t)}t=\mu\right\}.$$
If $\widetilde w$ is a Wiener process under $Q$ then by the law of large numbers $Q(A)=1$, while $P(A)=0$. Therefore $Q$ is not absolutely continuous with respect to $P$ on $\mathcal F_\infty$. Observe that the martingale
$$\Lambda(t)=\exp\left(\mu w(t)-\frac12\mu^2t\right)$$
is not uniformly integrable. Therefore if $s=\infty$ then $\Lambda$ is not a martingale on $[0,s]$. Let us discuss the underlying measure-theoretic problem.

Definition 6.35 Let $(\Omega,\mathcal F)$ be a filtered space. We say that the probability spaces $(\Omega,\mathcal F_t,P_t)$ are consistent if for any $s<t$ the restriction of $P_t$ to $\mathcal F_s$ is $P_s$. The filtered space $(\Omega,\mathcal F)$ is a Kolmogorov-type filtered space if, whenever $(\Omega,\mathcal F_t,P_t)$ are consistent probability spaces for $0\le t<\infty$, there is a probability measure $P$ on $\mathcal F_\infty\triangleq\sigma(\mathcal F_t:t\ge0)$ such that every $P_t$ is a restriction of $P$ to $\mathcal F_t$.

Example 6.36 The space $C([0,\infty))$ with its natural filtration is a Kolmogorov-type filtered space.
One can identify the σ-algebra Ft with the Borel sets of C ([0, t]). Let C ∪t≥0 Ft . If we have a consistent stream of probability spaces over F, then one can define a set function P (C) Pt (C) on C. C ([0, t]) is a complete, separable metric space so P is compact regular on C, hence P is σ-additive on C. By Carath´eodory’s theorem one can extend P to σ (C) = B (C [0, ∞)) = F∞ . Observe that in Example 6.34 Λ is a martingale so the measure spaces (Ω, Ft , Qt ) are consistent. If we use the canonical representation, that is Ω = C ([0, ∞)) , then there is a probability measure Q on Ω such that Q (t) is a restriction of Q for every t. Obviously w 0 is a Wiener process under Q with respect to the natural filtration F Ω . Recall that by the previous example Q cannot be absolutely continuous with respect to P. The P-measure of set A is zero so A and all of its subsets are in the augmented filtration F P . As Q (A) = 1 obviously w 0 cannot be a Wiener process under F P . If the measures P and Q are not equivalent then the augmented filtrations can be different! Hence with the change of the measure one should also change the filtration. Of course one should augment the natural filtration F Ω because F Ω does not satisfy the usual conditions. There is a simple method to solve this problem. Observe that on every FtΩ the two measures P and Q are equivalent. It is very natural to assume that we augment
386 ITÔ'S FORMULA
$\mathcal{F}_t^\Omega$ not with every measure-zero set of $\mathcal{F}_\infty^\Omega$ but only with the measure-zero sets of the $\sigma$-algebras $\mathcal{F}_t^\Omega$ for $t \ge 0$. It is not difficult to see that this filtration is right-continuous and most of the results of stochastic analysis remain valid with this augmented filtration.
There is nothing special in the problem above. Let us show a similar elementary example. Example 6.37 The filtration generated by the dyadic rational numbers.
Let $(\Omega,\mathcal{A},P)$ be the interval $[0,1]$ with Lebesgue's measure $\lambda$ as the probability $P$. We change the filtration only at the points $t = 0,1,2,\ldots$: if $n \le t < n+1$ then $\mathcal{F}_t \triangleq \mathcal{F}_n$. Obviously $\mathcal{F}$ is right-continuous. Let $\mathcal{F}_n$ be the $\sigma$-algebra generated by the finitely many intervals $[k2^{-n},(k+1)2^{-n}]$, where $k = 0,1,\ldots,2^n-1$. Observe that as the intervals are closed, $\mathcal{F}_n$ contains all the dyadic rational points $0 < k2^{-n} < 1$ as one-point sets. It is also clear that $\{0\},\{1\} \notin \mathcal{F}_t$. It is also worth noting that the sets of dyadic rational numbers $0 < k2^{-n} < 1$ form the only measure-zero subsets of $\mathcal{F}_n$. This implies that if $P_t$ is the restriction of $P$ to $\mathcal{F}_t$, then $(\Omega,\mathcal{F}_t,P_t)$ is complete. $\mathcal{F}_\infty \triangleq \sigma(\mathcal{F}_t, t \ge 0)$ is the $\sigma$-algebra generated by the intervals with dyadic rational endpoints, so $\mathcal{F}_\infty$ is the Borel $\sigma$-algebra of $[0,1]$. $\mathcal{B}([0,1])$ is not complete under Lebesgue's measure. If we complete it, the new measure space is the set of Lebesgue measurable subsets of $[0,1]$. In the completed space the number of measure-zero sets is $2^{\mathfrak{c}}$, where $\mathfrak{c}$ denotes the cardinality of the continuum. If we augment $\mathcal{F}_\infty$ only with the measure-zero sets of the $\sigma$-algebras $\mathcal{F}_t$ then $\mathcal{F}_\infty$ does not change: the cardinality of $\mathcal{B}([0,1])$ is just $\mathfrak{c}$! Let $Q$ be Dirac's measure $\delta_0$. If $t < \infty$, then the set $\{0\}$ is not in $\mathcal{F}_t$, so if $A \in \mathcal{F}_t$ and $P_t(A) = 0$, then $Q(A) = 0$, that is $Q$ is absolutely continuous with respect
to $P_t$ for every $t < \infty$, that is $Q \stackrel{\mathrm{loc}}{\ll} P$. Obviously $Q \ll P$ does not hold.

6.3.5 Kazamaki–Novikov criteria
From Itô's formula it is clear that if $L$ is a continuous local martingale then $\mathcal{E}(L)$ is also a local martingale. It is very natural to ask when $\mathcal{E}(L)$ is a true martingale on some $[0,T]$. As $\mathcal{E}(L) \ge 0$, from Fatou's lemma it is clear that it is a supermartingale, that is, if $t > s$ then
$$E\left(\mathcal{E}(L)(t) \mid \mathcal{F}_s\right) = E\left(\lim_{n\to\infty}\mathcal{E}(L)^{\tau_n}(t) \mid \mathcal{F}_s\right) \le \liminf_{n\to\infty}\mathcal{E}(L)^{\tau_n}(s) = \mathcal{E}(L)(s).$$
Hence, taking expected values on both sides,
$$E(\mathcal{E}(L)(t)) \le E(\mathcal{E}(L)(s)), \qquad t \ge s.$$
CHANGE OF MEASURE FOR CONTINUOUS SEMIMARTINGALES 387
If $L(0) = 0$ then $\mathcal{E}(L)(0) = 1$, and in this case $\mathcal{E}(L)$ is a martingale on some $[0,t]$ if and only if $E(\mathcal{E}(L)(t)) = 1$. Let us first mention a simple but very frequently used condition:

Proposition 6.38 If $X$ is constant and $w$ is a Wiener process then $\Lambda \triangleq \mathcal{E}(X \bullet w)$ is a martingale on any finite interval $[0,t]$. A bit more generally: if $X$ and $w$ are independent then $\Lambda \triangleq \mathcal{E}(X \bullet w)$ is a martingale on any finite interval $[0,t]$.

Proof. The first part of the proposition trivially follows from the formula for the expected value of the lognormal distribution. Under the second condition one can assume that
$$(\Omega,\mathcal{A},P) = (\Omega_1,\mathcal{A}_1,P_1)\times(\Omega_2,\mathcal{A}_2,P_2).$$
$X$ depends only on $\omega_2$; hence for every fixed $\omega_2$ the integrand below is a martingale on $\Omega_1$, so
$$E(\Lambda(t)) = \int_{\Omega_1\times\Omega_2}\Lambda(t)\,d(P_1\times P_2) = \int_{\Omega_2}\int_{\Omega_1}\exp\left(\int_0^t X(\omega_2)\,dw(\omega_1) - \frac12\int_0^t X^2(\omega_2)\,d\lambda\right)dP_1\,dP_2 = \int_{\Omega_2}1\,dP_2 = 1.$$
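The lognormal expectation behind the first part of the proposition is easy to check numerically. A minimal sketch (the function name, quadrature rule and tolerances are my choices, not from the text): for constant $X \equiv \sigma$, $\Lambda(t) = \exp(\sigma w(t) - \sigma^2 t/2)$ with $w(t) \sim N(0,t)$, so $E(\Lambda(t))$ is a Gaussian integral that should equal $1$.

```python
import math

def expected_exponential(sigma, t, n=20_000, cutoff=12.0):
    """Approximate E[exp(sigma*w - sigma^2*t/2)] for w ~ N(0, t) with the
    trapezoidal rule over [-cutoff*sqrt(t), cutoff*sqrt(t)]."""
    sd = math.sqrt(t)
    a, b = -cutoff * sd, cutoff * sd
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        density = math.exp(-x * x / (2 * t)) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(sigma * x - sigma ** 2 * t / 2) * density
    return total * h

print(expected_exponential(1.0, 2.0))  # close to 1 for every sigma and t
```

The integrand is just the $N(\sigma t, t)$ density in disguise, which is why the integral is $1$ independently of $\sigma$ and $t$: exactly the martingale property $E(\Lambda(t)) = 1$.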
The next condition is more general:

Proposition 6.39 (Kazamaki's criterion) If for a continuous local martingale $L \in \mathcal{L}$
$$\sup_{\tau\le T}E\left(\exp\left(\frac12 L(\tau)\right)\right) < \infty, \qquad (6.25)$$
where the supremum is taken over all stopping times $\tau$ for which $\tau \le T$, then $\mathcal{E}(L)$ is a uniformly integrable martingale on $[0,T]$. In the case $T = \infty$ it is sufficient to assume that the supremum in (6.25) is finite over just the bounded stopping times.
Proof. Observe that if $\tau$ is an arbitrary stopping time and (6.25) holds with bound $k$ over the bounded stopping times, then by Fatou's lemma
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) = E\left(\lim_{n\to\infty}\exp\left(\frac12 L(\tau\wedge n)\right)\chi(\tau<\infty)\right) \le \liminf_{n\to\infty}E\left(\exp\left(\frac12 L(\tau\wedge n)\right)\right) \le k.$$

1. Let $p > 1$ and assume that
$$\sup_{\tau\le T}E\left(\exp\left(\frac{\sqrt p}{2(\sqrt p-1)}L(\tau)\right)\right) \triangleq k < \infty, \qquad (6.26)$$
where the supremum is taken over all bounded stopping times $\tau \le T$. We show that $\mathcal{E}(L)(\tau)$ is bounded in $L^q(\Omega)$, where $1/p + 1/q = 1$. The $L^q(\Omega)$-bounded sets are uniformly integrable, hence if (6.26) holds then $\mathcal{E}(L)$ is a uniformly integrable martingale. Let
$$r \triangleq \frac{\sqrt p+1}{\sqrt p-1},$$
and let $s$ be the conjugate exponent of $r$. By simple calculation
$$s = \frac{\sqrt p+1}{2}.$$
Obviously
$$\mathcal{E}(L)^q = \exp\left(\sqrt{\frac qr}\,L - \frac q2[L]\right)\exp\left(\left(q-\sqrt{\frac qr}\right)L\right) = \mathcal{E}\left(\sqrt{rq}\,L\right)^{1/r}\exp\left(\left(q-\sqrt{\frac qr}\right)L\right).$$
By Hölder's inequality
$$E\left(\mathcal{E}(L)(\tau)^q\right) \le E\left(\mathcal{E}\left(\sqrt{rq}\,L\right)(\tau)\right)^{1/r}E\left(\exp\left(s\left(q-\sqrt{\frac qr}\right)L(\tau)\right)\right)^{1/s}.$$
$\mathcal{E}(\sqrt{rq}\,L)$ is a non-negative local martingale, so it is a supermartingale. Hence by the Optional Sampling Theorem⁴¹ the first factor of the product cannot be larger than $1$. As
$$s\left(q-\sqrt{\frac qr}\right) = \frac{\sqrt p}{2(\sqrt p-1)},$$
hence
$$E\left(\mathcal{E}(L)(\tau)^q\right) \le E\left(\exp\left(\frac{\sqrt p}{2(\sqrt p-1)}L(\tau)\right)\right)^{1/s} \le k^{1/s}.$$

2. As
$$\exp(x) \le \exp\left(x^+\right) \le \exp(x) + 1,$$
one has
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) \le E\left(\exp\left(\frac12 L^+(\tau)\right)\right) \le E\left(\exp\left(\frac12 L(\tau)\right)\right) + 1,$$
from which it is obvious that
$$\sup_{\tau\le T}E\left(\exp\left(\frac12 L^+(\tau)\right)\right) < \infty.$$
If $0 < a < 1$ then
$$\mathcal{E}(aL) = \exp\left(aL - \frac{a^2}2[L]\right) = \mathcal{E}(L)^{a^2}\exp\left(a(1-a)L\right),$$
so by Hölder's inequality with the exponents $1/a^2$ and $1/(1-a^2)$
$$1 = E\left(\mathcal{E}(aL)(T)\right) \le E\left(\mathcal{E}(L)(T)\right)^{a^2}E\left(\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2},$$
where $E(\mathcal{E}(aL)(T)) = 1$ holds because $aL$ satisfies (6.26) with $p \triangleq (1-a)^{-2}$, so by the first part $\mathcal{E}(aL)$ is a uniformly integrable martingale. As $a/(1+a) < 1/2$, the variable $\exp\left(\frac a{1+a}L(T)\right)$ is dominated by the integrable variable $\exp\left(\frac12 L^+(T)\right)$, hence by the Dominated Convergence Theorem
$$\lim_{a\nearrow1}E\left(\exp\left(\frac a{1+a}L(T)\right)\right) = E\left(\exp\left(\frac12 L(T)\right)\right) < \infty,$$
and since the exponent $1-a^2 \searrow 0$,
$$\lim_{a\nearrow1}E\left(\exp\left(\frac a{1+a}L(T)\right)\right)^{1-a^2} = 1.$$
Therefore $1 \le E(\mathcal{E}(L)(T))$, from which, by the supermartingale property of $\mathcal{E}(L)$, the proposition is obvious.

⁴¹ See: Proposition 1.88, page 54.
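The exponent algebra in step 1 of the proof is purely arithmetic and can be verified numerically. A short sketch (function name and the sample values of $p$ are my choices; the symbols $q$, $r$, $s$ are those of the proof):

```python
import math

def kazamaki_exponents(p):
    """Return (q, r, s): q conjugate to p, r = (sqrt(p)+1)/(sqrt(p)-1),
    s conjugate to r, as in step 1 of Kazamaki's proof."""
    q = p / (p - 1.0)
    r = (math.sqrt(p) + 1) / (math.sqrt(p) - 1)
    s = r / (r - 1.0)
    return q, r, s

for p in (1.5, 2.0, 4.0, 9.0, 100.0):
    q, r, s = kazamaki_exponents(p)
    # s = (sqrt(p) + 1)/2
    assert abs(s - (math.sqrt(p) + 1) / 2) < 1e-12
    # s*(q - sqrt(q/r)) = sqrt(p)/(2*(sqrt(p)-1)): the exponent in (6.26)
    lhs = s * (q - math.sqrt(q / r))
    rhs = math.sqrt(p) / (2 * (math.sqrt(p) - 1))
    assert abs(lhs - rhs) < 1e-12
print("Kazamaki exponent identities verified")
```

Note that as $p \downarrow 1$ the coefficient $\sqrt p/(2(\sqrt p-1))$ blows up, and as $p \to \infty$ it decreases to $1/2$ — which is why the criterion with constant $1/2$ in (6.25) requires the limiting argument of step 2.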
Corollary 6.40 If $L$ is a continuous local martingale and $\exp(\frac12 L)$ is a uniformly integrable submartingale then $\mathcal{E}(L)$ is a uniformly integrable martingale.

Proof. By the uniform integrability one can consider $\exp(\frac12 L)$ on the closed interval $[0,T]$. By the Optional Sampling Theorem for integrable submartingales⁴², if $\tau \le T$ then
$$\exp\left(\frac12 L(\tau)\right) \le E\left(\exp\left(\frac12 L(T)\right)\Big|\,\mathcal{F}_\tau\right),$$
from which (6.25) holds.

Corollary 6.41 If $L$ is a uniformly integrable continuous martingale and $E(\exp(\frac12 L(T))) < \infty$ then $\mathcal{E}(L)$ is a uniformly integrable martingale.

⁴² See: Proposition 1.88, page 54.
Proof. As $L$ is uniformly integrable, $L(T)$ is meaningful. A convex function of a martingale is a submartingale:
$$\exp\left(\frac12 L(t)\right) \le E\left(\exp\left(\frac12 L(T)\right)\Big|\,\mathcal{F}_t\right).$$
Taking the expected value on both sides, it is clear that $\exp(\frac12 L)$ is an integrable submartingale. By the Optional Sampling Theorem
for submartingales, $\exp(\frac12 L(\tau))$ is integrable for every $\tau$ and (6.25) holds.

Corollary 6.42 (Novikov's criterion) If $L \in \mathcal{L}$ is a continuous local martingale on some finite or infinite interval $[0,T]$ and
$$E\left(\exp\left(\frac12[L](T)\right)\right) < \infty, \qquad (6.27)$$
and $\Lambda \triangleq \mathcal{E}(L)$, then $E(\Lambda(T)) = E(\Lambda(0)) = 1$ and $\Lambda$ is a uniformly integrable martingale on $[0,T]$.

Proof. $\mathcal{E}(L)$ is a non-negative local martingale, hence it is a supermartingale. By the Optional Sampling Theorem⁴³, for any bounded stopping time $\tau$
$$E(\mathcal{E}(L)(\tau)) \le E(\mathcal{E}(L)(0)) = 1.$$
By the Cauchy–Schwarz inequality
$$E\left(\exp\left(\frac12 L(\tau)\right)\right) = E\left(\sqrt{\exp\left(L(\tau)-\frac{[L](\tau)}2\right)}\sqrt{\exp\left(\frac{[L](\tau)}2\right)}\right) \le \sqrt{E(\mathcal{E}(L)(\tau))}\sqrt{E\left(\exp\left(\frac{[L](\tau)}2\right)\right)} \le \sqrt{E\left(\exp\left(\frac12[L](T)\right)\right)} < \infty.$$
Hence Kazamaki's criterion holds.

⁴³ See: Proposition 1.88, page 54.
Corollary 6.43 If $L \triangleq X \bullet w$, $T$ is finite and for some $\delta > 0$
$$\sup_{t\le T}E\left(\exp\left(\delta X^2(t)\right)\right) < \infty \qquad (6.28)$$
then
$$\Lambda(t) \triangleq \exp\left(\int_0^t X\,dw - \frac12\int_0^t X^2\,d\lambda\right)$$
is a martingale on $[0,T]$.

Proof. Let $L \triangleq X \bullet w$. By Jensen's inequality
$$\exp\left(\frac12[L](T)\right) = \exp\left(\frac1T\int_0^T \frac{T X^2(t)}2\,dt\right) \le \frac1T\int_0^T\exp\left(\frac{T X^2(t)}2\right)dt.$$
If $T/2 \le \delta$ then we can continue the estimation:
$$E\left(\exp\left(\frac12[L](T)\right)\right) \le \frac1T\int_0^T E\left(\exp\left(\frac{T X^2(t)}2\right)\right)dt \le \sup_{t\le T}E\left(\exp\left(\delta X^2(t)\right)\right) < \infty$$
by condition (6.28), so Novikov's criterion holds and $E(\Lambda(T)) = 1$. In the general case let $(t_k)_{k=0}^n$ be a partition of $[0,T]$ and assume that the length of every interval $[t_{k-1},t_k]$ is smaller than $2\delta$. If
$$\Lambda_k \triangleq \exp\left(\int_{t_k}^{t_{k+1}}X(s)\,dw(s) - \frac12\int_{t_k}^{t_{k+1}}X^2(s)\,ds\right),$$
then $\Lambda = \prod_k\Lambda_k$, $E(\Lambda_k) = 1$ and $E(\Lambda_k \mid \mathcal{F}_{t_k}) = 1$ almost surely. Hence
$$E(\Lambda(T)) = E\left(E\left(\Lambda(T)\mid\mathcal{F}_{t_{n-1}}\right)\right) = E\left(E\left(\Lambda_{n-1}\Lambda(t_{n-1})\mid\mathcal{F}_{t_{n-1}}\right)\right) = E\left(\Lambda(t_{n-1})E\left(\Lambda_{n-1}\mid\mathcal{F}_{t_{n-1}}\right)\right) = E(\Lambda(t_{n-1})) = \cdots = E(\Lambda(t_1)) = 1.$$
Corollary 6.44 If $X$ is a Gaussian process, $T$ is finite and
$$\sup_{t\le T}D^2(X(t)) < \infty,$$
then $\Lambda = \mathcal{E}(X \bullet w)$ is a martingale on $[0,T]$. If $\mu_t$ and $\sigma_t$ denote the expected value and the standard deviation of $X(t)$ then
$$E\left(\exp\left(\delta X^2(t)\right)\right) = \frac1{\sigma_t\sqrt{2\pi}}\int_{\mathbb{R}}\exp\left(\delta x^2\right)\exp\left(-\frac12\left(\frac{x-\mu_t}{\sigma_t}\right)^2\right)dx = \frac{\exp\left(\delta\mu_t^2/\left(1-2\delta\sigma_t^2\right)\right)}{\sqrt{1-2\delta\sigma_t^2}}.$$
If $\delta < 1/\left(2\sup_{t\le T}D^2(X(t))\right)$ then $E(\exp(\delta X^2(t)))$ is bounded.

Example 6.45 Novikov's criterion is elegant, but it is not a very strong condition.
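The closed-form Gaussian moment used in Corollary 6.44 can be sanity-checked numerically. A minimal sketch (function names, the quadrature rule and the test parameters are my choices, not from the text): for $X \sim N(\mu,\sigma^2)$ with $\delta\sigma^2 < 1/2$, numerical integration should reproduce $\exp(\delta\mu^2/(1-2\delta\sigma^2))/\sqrt{1-2\delta\sigma^2}$.

```python
import math

def gauss_quad_moment(delta, mu, sigma, n=40_000, cutoff=14.0):
    """E[exp(delta*X^2)] for X ~ N(mu, sigma^2), trapezoidal rule."""
    a, b = mu - cutoff * sigma, mu + cutoff * sigma
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        density = math.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(delta * x * x) * density
    return total * h

def closed_form(delta, mu, sigma):
    # valid only while delta*sigma^2 < 1/2
    d = 1 - 2 * delta * sigma ** 2
    return math.exp(delta * mu ** 2 / d) / math.sqrt(d)

print(gauss_quad_moment(0.1, 0.5, 1.0), closed_form(0.1, 0.5, 1.0))
```

Completing the square in the exponent shows the integrand is a rescaled Gaussian with variance $\sigma^2/(1-2\delta\sigma^2)$, which is exactly where the constraint $\delta\sigma^2 < 1/2$ comes from.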
Let $\tau$ be a stopping time. If $L$ is a continuous local martingale, then $L^\tau$ is also a continuous local martingale.
$$\mathcal{E}(L^\tau) = \exp\left(L^\tau - \frac12[L^\tau]\right) = \mathcal{E}(L)^\tau,$$
so one could write any stopping time $\tau \le T$ in (6.27) instead of $T$. If for a stopping time $\tau$ [...]

Let $r > t$ be a point of continuity of $\mu$. Then
$$\limsup_{n\to\infty}\mu_n((0,t]) \le \limsup_{n\to\infty}\mu_n((0,r]) = \mu((0,r]).$$
Since the points of continuity of $\mu$ are dense in $\mathbb{R}_+$ and as $\mu$ is right-continuous,
$$\limsup_{n\to\infty}\mu_n((0,t]) \le \mu((0,t]) \qquad (6.38)$$
for every t ≥ 0. Also recall that µc denotes the continuous part of the increasing function t → µ ((0, t]). Definition 6.51 Let (∆n ) be an infinitesimal52 sequence of partitions: (n)
∆ n : 0 = t0
(n)
< t1
(n)
< . . . < tkn = ∞.
1. We say that a right-regular function f on [0, ∞) has finite quadratic variation with respect to (∆n ) if the sequence of point measures53
2
(n)
(n) (n) f ti+1 − f ti δ ti
µn
(n)
ti
∈∆n
50 One
should use the fact that X− is locally bounded. the points of continuity are dense the limit is unique. 52 That is, on any finite interval max (n) − t(n) → 0. k tk+1 k 51 As
53 Recall
that δ (a) is Dirac’s measure concentrated at point a.
(6.39)
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
405
converges in the vague topology to a locally finite measure µ where µ has the decomposition µ ((0, t]) = µc ((0, t]) +
2
(∆f (s)) .
s≤t
We shall denote µ ((0, t]) by [f ] (t) [f, f ] (t). 2. We say that right-regular functions f and g on [0, ∞) have finite quadratic co-variation with respect to (∆n ) if [f ] , [g] and [f + g] exist. In this case 1 ([f + g] − [f ] − [g]) . 2
[f, g]
3. A function g is (∆n )-integrable with respect to some function G if the limit lim
n→∞
(n)
ti
(n) (n) (n) g ti G ti+1 − G ti
≤t
is finite for every t ≥ 0. We shall denote this (∆n )-integral by
t
g (s−) dG (s) . 0
Theorem 6.52 (F¨ ollmer) Let F ∈ C 2 Rd and let (∆n ) be an infinitesimal d sequence of partitions of [0, ∞). If f (fk )k=1 are right-regular functions on R+ with finite quadratic variation and co-variation with respect to (∆n ) then for every t > 0 F (f (t)) − F (f (0)) = t ∂F = (f (s−)) , df (s) + ∂x 0 t ∂2F 1 (f (s−)) d [fi , fj ] (s) − + 2 i,j 0 ∂xi ∂xj −
+
s≤t
1 ∂2F (f (s−)) ∆fi (s) ∆fj (s) + 2 ∂xi ∂xj i,j s≤t
d ∂F F (f (s)) − F (f (s−)) − (f (s−)) ∆fi (s) ∂xi i=1
ˆ FORMULA ITO’s
406 where
t 0
∂F (n)
(n)
∂F (n) , f ti+1 − f ti (f (s−)) , df (s) lim f ti n→∞ ∂x ∂x (n) ti
≤t
where ∂F ∂x
∂F ∂F ∂F , ,..., ∂x1 ∂x2 ∂xd
denotes the gradient vector of F and all the other integrals are (∆n )-integrals. If the coordinates of the vector X (X1 , X2 , . . . , Xn ) are semimartingales, then the quadratic variations and co-variations exist and they converge uniformly on compact sets in probability. This implies that for some subsequence they converge uniformly, almost surely. Also, for semimartingales the stochastic integrals 0
t
∂F (X (s−)) dXk (s) ∂xk
exist and by the Dominated Convergence Theorem, uniformly on compact intervals in probability, 0
t
∂F ∂F (n) (n) (X (s−)) dXk (s) = (X (ti )) Xk ti+1 − X ti ∂xk ∂xk (n) ti
≤t
therefore F¨ ollmer’s theorem implies Itˆo’s formula. Proof. Fix t > 0. To simplify the notation we drop the superscript n. 1. If the first point in ∆n which is larger than t is tkn then tkn t. As f is right-continuous F (f (t)) − F (f (0)) = lim F (f (tkn )) − F (f (0)) = n→∞ = lim (F (f (ti+1 )) − F (f (ti ))) . n→∞
i
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
407
To simplify the notation further we drop all the point from ∆n which are larger than tkn . By Taylor’s formula F (f (ti+1 )) − F (f (ti )) =
d ∂F (f (ti )) (fk (ti+1 ) − fk (ti )) + ∂xk
k=1
1 ∂2F (f (ti )) (fk (ti+1 ) − fk (ti )) (fl (ti+1 ) − fl (ti )) + + 2 ∂xk ∂xl k,l
+r (f (ti ) , f (ti+1 )) where 2
|r (a, b)| ≤ ϕ (b − a) b − a . As F is twice continuously differentiable one may assume that ϕ is increasing and limc0 ϕ (c) = 0. 2. Given ε > 0 we split the set of jumps of f into two classes. C1 is a finite set and C2 is the set of jumps for which
s∈C2 ,s≤t
d
2 |∆fk (s)|
≤ ε.
k=1
As f has quadratic variation and co-variation this separation is possible. Since C1 is finite and as f is right-regular if (1) denotes the sum over the sub-intervals which contain a point from C1 then lim
n→∞
(F (f (ti+1 )) − F (f (ti ))) =
(F (f (s)) − F (f (s−))) .
(6.40)
s∈C1
(1)
Let F denote the first derivative and F the second derivative of F . Adding up the increments of other intervals
(F (f (ti+1 )) − F (f (ti ))) =
F (f (ti )) (f (ti+1 ) − fk (ti )) +
(2)
+ −
(1)
1 2
F (f (ti )) (f (ti+1 ) − f (ti )) −
1 F (f (ti )) (f (ti+1 ) − f (ti )) + F (f (ti )) (f (ti+1 ) − f (ti )) + 2 +
(2)
r (f (ti ) , f (ti+1 )) .
408
ˆ FORMULA ITO’s
As C1 is finite the expression in the third line goes to (1)
1 F (f (s−)) ∆f (s) + F (f (s−)) (∆f (s)) . 2
(6.41)
One can estimate the last expression as 2 ≤ ϕ max r (f (t ) , f (t )) f (t ) − f (t ) f (ti+1 ) − f (ti ) i i+1 i+1 i (2) (2) (2) therefore, using (6.38), lim sup r (f (ti ) , f (ti+1 )) ≤ k→∞ (2) 2 ≤ ϕ (ε+) lim sup f (ti ) − f (ti+1 ) ≤ n→∞
≤ ϕ (ε+) lim sup n→∞
d
ti ≤t
µ(k) n ((0, t]) ≤ ϕ (ε+)
k=1
d
[fk ] (t) .
k=1
If ε 0 then this expression goes to zero and the difference of (6.40) and (6.41) goes to s≤t
1 F (f (s)) − F (f (s−)) − F (f (s−)) (∆f (s)) − F (f (s−)) (∆f (s)) 2
3. Let G now be a continuous function. We show that if f is one of the functions fk or fk + fl then
2
G (f (ti )) (f (ti+1 ) − f (ti )) = t = G (f (s−)) d [f ] (s) .
lim
n→∞
0
Using the definition of measures related to the quadratic variation this means that lim
n→∞
t
G (f ) dµn = 0
t
G (f (s−)) dµ (s) , 0
(6.42)
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
where the integrals are usual Lebesgue–Stieltjes integrals. and let
h (u)
409
Let ε > 0
∆f (s) .
s∈C1 ,s≤u (C )
As C1 is a finite set it is Let µn 1 be the point measure like (6.39) based on h.
(C1 ) easy to see that the sequence of point measures µn converges to the point measure µ(C1 )
2
(∆f (s)) δ (s) .
s∈C1
As C1 is finite it is also easy to see, that
t
lim
G (f
n→∞
(s)) dµn(C1 )
t
G (f (s−)) dµ(C1 ) (s) .
(s) =
0
(6.43)
0
Let g f − h. As f = h + g obviously
2
(f (ti+1 ) − f (ti )) =
ti ≤u
+ +2
2
(h (ti+1 ) − h (ti )) +
ti ≤u 2
(g (ti+1 ) − g (ti )) +
ti ≤u
(g (ti+1 ) − g (ti )) (h (ti+1 ) − h (ti )) .
ti ≤u
C1 has only a finite number of points and if h is not continuous at some point s (C ) then g is continuous at s. Hence the third term goes to zero. Therefore µn −µn 1 converges to µ − µ(C1 ) . t t
(C1 ) (C1 ) G (f (s)) d µn − µn G (f (s−)) d µ − µ (s) − (s) ≤ 0 0 t t
(C1 ) (C1 ) G (f (s)) d µn − µn G (f (s)) d µ − µ (s) − (s) + ≤ 0
0
t
+ G (f (s)) − G (f (s−)) d µ − µ(C1 ) (s) . 0
The total size of the atoms of the measure µ − µ(C1 ) is smaller than ε2 . The function G (f ) is continuous at the point of continuity of µ − µ(C1 ) so one can
410
ˆ FORMULA ITO’s
estimate the second term by t
(C1 ) G (f (s)) − G (f (s−)) d µ − µ (s) ≤ 2ε2 sup |G (f (s))| . s≤t 0
Recall that f is bounded54 , and therefore sup |G (f (s))| < ∞. s≤t
Obviously µ − µ(C1 ) (C1 ) = 0. Hence there are finitely many open intervals which cover the points of C1 with total measure smaller than ε. Let O be the union of these intervals. As the points of continuity are dense one may assume that the points of the boundary of O are points of continuity of µ − µ(C1 ) . By the vague convergence one can assume that for some n sufficiently large (C ) (µn − µn 1 ) (O) < ε. If one deletes O from [0, t] the jumps of f are smaller than ε then on the compact set [0, t] \C1 . G is uniformly continuous on the bounded range55 of f so there is a δ such that if s1 , s2 ∈ [0, t] \O and |s1 − s2 | < δ then |G (f (s1 )) − G (f (s2 ))| < 2ε. This means that there is a step function H such that |H (s) − G (f (s))| < 2ε on [0, t] \O. On may also assume that the points of discontinuities of the step function H are points of continuity of measure µ − µ(C1 ) . t t
(C1 ) (C1 ) (s) − (s) ≤ G (f (s)) d µn − µn G (f (s)) d µ − µ lim sup n→∞ 0
0
≤ 2ε sup |G (f (s))| +
n→∞
s≤t
+2ε µn − µn(C1 ) ([0, t]) + µ − µ(C1 ) ([0, t]) + t t
(C1 ) (C1 ) H (s) d µn − µ H (s) d µ − µ (s) − (s) . + lim sup
0
0
Since the last expression, by the vague convergence goes to zero, for some k independent of ε t t
lim sup G (f (s)) d µn − µn(C1 ) (s) − G (f (s)) d µ − µ(C) (s) ≤ εk. n→∞
54 See: 55 See:
0
Proposition 1.6, page 5. Proposition 1.7, page 6.
0
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
411
As ε is arbitrary lim
n→∞
t
t
G (f (s)) d µn − µn(C1 ) (s) = G (f (s−)) d µ − µ(C1 ) (s) .
0
0
Using (6.43) one can easily show (6.42). 4. Applying this observation and the definition of the co-variation one gets the convergence of F (f (ti )) (f (ti+1 ) − f (ti )) = =
∂ 2 F (f (ti )) (fk (ti+1 ) − fk (ti )) (fl (ti+1 ) − fl (ti )) ∂xk ∂xk k,l
to the sum of integrals k,l
t
0
∂2F (f (s−)) d [fk , fl ] (s) . ∂xk ∂xl
5. As all the other terms converge, ∂F (f (ti )) (f (ti )) , f (ti+1 ) − f (ti ) ∂x i also converges and its limit, by definition, is t ∂F (f (s−)) , df (s) ∂x 0 which proves the formula. 6.4.3
Exponential semimartingales
As an application of the general Itˆ o formula let us discuss the exponential semimartingales. Let Z be an arbitrary complex semimartingale, that is let Z X + iY , where X and Y are real-valued semimartingales. Let us investigate the stochastic integral equation E = 1 + E− • Z.
(6.44)
Definition 6.53 The equation (6.44) is called the Dol´eans equation. The simplest version of the equation is when Z(s) ≡ s E (t) = 1 +
t
E (s−) ds = 1 + 0
t
E (s) ds, 0
412
ˆ FORMULA ITO’s
which characterizes the exponential function E (t) = exp (t). This explains the next definition: Definition 6.54 The solution of (6.44), denoted by E (Z), is called the exponential semimartingale of Z. Proposition 6.55 (Yor’s formula) If X and Y are arbitrary semimartingales then E (X) E (Y ) = E (X + Y + [X, Y ]) . Proof. By the formula for the quadratic variation of stochastic integrals ' & [E (X) , E (Y )] 1 + E (X)− • X, 1 + E (Y )− • Y =
= E (X)− E (Y )− • [X, Y ] . Integrating by parts E (X) E (Y ) − 1 = E (X)− • E (Y ) + E (Y )− • E (X) + [E (X) , E (Y )] =
= E (X)− E (Y )− • (Y + X + [X, Y ]) , from which, evident.
by the definition of the operator E,
Yor’s formula is
In the definition of E(Z) and during the proof of Yor’s formula we have implicitly used the following theorem: Theorem 6.56 (Solution of Dol´ eans’ equation) Let Z be an arbitrary complex semimartingale. 1. There is a process E which satisfies the integral equation (6.44). 2. If E1 and E2 are two solutions of (6.44) then E1 and E2 are indistinguishable. 3. If τ inf {t : ∆Z = −1} then E (Z) = 0 on [0, τ ), E (Z)− = 0 on [0, τ ] and E (Z) = 0 on [τ , ∞). 4. E (Z) is a semimartingale. 5. If Z has finite variation then E (Z) has finite variation. 6. If Z is a local martingale then E (Z) is a local martingale. 7. E has the following representation: 1 c (6.45) E E (Z) = exp Z − Z (0) − [Z] × 2 ! × (1 + ∆Z) exp (−∆Z) , where the product in the formula is absolutely convergent.
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
413
Proof. The proof of the theorem is a direct and simple, but lengthy calculation. We divide the proof into several steps. variation of semimartingales is finite. Hence the sum 1. The quadratic 2 |∆Z (s)| is convergent. Therefore on the interval [0, t] there are just finitely s≤t many moments when |∆Z| > 1/2. If |u| ≤ 1/2, then 2
|ln (1 + u) − u| ≤ C |u| , hence ln
!
|1 + ∆Z| |exp (−∆Z)| = (ln (|1 + ∆Z|) − |∆Z|) ≤ ≤ |ln (1 + |∆Z|) − |∆Z|| ≤ 2 ≤C |∆Z| < ∞.
Therefore the product V (t)
!
(1 + ∆Z (s)) exp (−∆Z (s))
s≤t
is absolutely convergent. Separating the real and the imaginary parts and taking logarithm, one can immediately see that V is a right-regular process with finite variation. By the definition of the product operation obviously56 V (0)
!
(1 + ∆Z (s)) = 1 + ∆Z (0) = 1.
s≤0
2. Let us denote by U the expression in the exponent of E (Z): U (t) Z − Z (0) −
1 c [Z ] . 2
With this notation E E (Z) V exp (U ) . By Itˆo’s formula for complex semimartingales, using that E (0) = 1, c and that V has finite variation, the co-variation [U, V ] = [U c , V c ] and 56 See:
(1.1) on page 4.
414
ˆ FORMULA ITO’s
c
[V ] = [V c ] are zero and hence E = 1 + E− • U + exp (U− ) • V + 1 c + E− • [U ] + 2 + (∆E − V− exp (U− ) ∆U − exp (U− ) ∆V ) . V is a pure jump process and therefore A exp (U− ) • V =
exp (U− ) ∆V.
As ∆U = ∆Z ∆E E − E− exp (U ) V − exp (U− ) V− = = exp (U− + ∆U ) V− (1 + ∆Z) exp (−∆Z) − exp (U− ) V− = = exp (U− + ∆U ) exp (−∆U ) V− (1 + ∆U ) − exp (U− ) V− = = exp (U− ) V− ∆U E− ∆U. Substituting the expressions A and ∆E A+
(∆E − E− ∆U − exp (U− ) ∆V ) = 0.
Obviously c 1 c c [U ] Z − Z (0) − [Z] = [Z c ] = [Z] , 2 c
and therefore 1 c E = 1 + E− • U + E− • [U ] = 2 1 c = 1 + E− • U + [Z] 2 1 + E− • (Z − Z (0)) = 1 + E− • Z, hence E satisfies (6.44). 3. One has to prove that the solution is unique. Let Y be an arbitrary solution of (6.44). The stochastic integrals are semimartingales so Y is a semimartingale. By Itˆo’s formula H Y · exp (−U ) is also a semimartingale. Applying the
ˆ FORMULA FOR NON-CONTINUOUS SEMIMARTINGALES ITO’S
415
multidimensional complex Itˆ o’s formula for the complex function z1 · exp (−z2 ) H = 1 − H− • U + exp (−U− ) • Y + 1 c c + H− • [U ] − exp (−U− ) • [U, Y ] + 2 + (∆H + H− ∆U − exp (−U− ) ∆Y ) . Y is a solution of the Dol´eans equation so exp (−U− ) • Y = exp (−U− ) Y− • Z H− • Z. c
c
c
[U, Y ] = [U, (Y− • Z)] = Y− • [U, Z] c 1 c c Y− • Z − [Z] , Z = Y− • [Z] . 2 c
c
exp (−U− ) • [U, Y ] = H− • [Z] . c
c
Adding up these terms and using that [U ] = [Z]
1 c c H− • Z + [U ] − [Z] 2
= H− • U,
hence H =1+
(∆H + H− ∆U − exp (−U− ) ∆Y ) .
Y is a solution of (6.44), so ∆Y = Y− ∆Z = Y− ∆U. Hence H =1+ 1+ =1+
(∆H + H− ∆U − exp (−U− ) Y− ∆U ) (∆H + H− ∆U − H− ∆U ) = ∆H.
(6.46)
416
ˆ FORMULA ITO’s
On the other hand, using (6.46) again ∆H H − H− Y exp (−U ) − H− = = exp (−U− − ∆U ) (Y− + ∆Y ) − H− = = exp (−U− − ∆U ) Y− (1 + ∆Z) − H− = = exp (−U− ) Y− exp (−∆U ) (1 + ∆Z) − H− = = H− (exp (−∆Z) (1 + ∆Z) − 1) so H = 1 + H− • R,
(6.47)
where R
(exp (−∆Z) (1 + ∆Z) − 1) .
For some constant C if |x| ≤ 1/2 |exp (−x) (1 + x) − 1| ≤ Cx2 . 2 Z is a semimartingale so (∆Z) < ∞ and therefore R is a complex process with finite variation. 4. Let us prove the following simple general observation: if v is a right-regular function with finite variation then the only right-regular function f for which
h
h≥0
f (s−) dv (s) ,
f (h) =
(6.48)
0
is f ≡ 0. Let s inf {t : f (t) = 0}. Obviously f = 0 on the interval [0, s). Hence by the integral equation (6.48)
s
s
f (t−) dv (t) =
f (s) = 0
0dv = 0. 0
If s < ∞ then, as v is right-regular, there is a t > s such that Var (v (t)) − Var (v (s)) ≤ 1/2. If t ≥ u > s then
u
s
≤ Var(v, s, u) sup |f (u)| ≤ s≤u≤t
u
f− dv ≤
f− dv =
f (u) = f (s) +
s
1 sup |f (u)| 2 s≤u≤t
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
417
and therefore sup |f (u)| ≤
s a) dX+
χ (X (s−) > a) (X (s) − a)− +
0<s≤t
+
χ (X (s−) ≤ a) (X (s) − a)+ +
0<s≤t
+ 66 See:
1 L (a, t) , 2
Proposition 5.23 page 319. Observe that the integrand is uniformly bounded. we show that for continuous local martingales the local time L (a, t, ω) has a version which is continuous in (a, t). 67 Later
426
ˆ FORMULA ITO’s
or (X (t) − a)− − (X (0) − a)− = −
t
χ (X− ≤ a) dX+
0
+
χ (X (s−) > a) (X (s) − a)− +
0<s≤t
+
χ (X (s−) ≤ a) (X (s) − a)+ +
0<s≤t
1 L (a, t) . 2
+
These formulas are called Tanaka’s formulas.
Let us apply the generalization of Itˆ o’s formula (6.54) for convex functions + − f (x) (x − a) and g (x) (x − a) :
t
f (X (t)) = f (X (0)) + 0 t
g (X (t)) = g (X (0)) +
f (X− ) dX + A(+) (t) , g (X− ) dX + A(−) (t) .
0
Subtracting the two lines above and using that f (x) = χ (x > a) ,
g (x) = −χ (x ≤ a)
one gets
t
1dX + A(+) (t) − A(−) (t) .
X (t) − X (0) = 0
This implies that A(+) (t) = A(−) (t). If B (+) (t) A(+) (t) −
(f (X (s)) − f (X (s−)) − f (X (s−)) ∆X (s))
0<s≤t
B (−) (t) A(−) (t) −
(g (X (s)) − g (X (s−)) − g (X (s−)) ∆X (s))
0<s≤t
then by the definition of the local time B (+) (t) + B (−) (t) = L (a, t). As the difference of the sums above is zero B (+) (t) = B (−) (t) , hence B (+) = B (−) = L (a) , so the formula is valid.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
427
For any process X one can introduce the occupation times measure µt (B) λ (s ≤ t : X (s) ∈ B) . Later we shall see68 that for Wiener processes the local time L (a, t) is the density function of µt . With the usual interpretation of the density functions for Wiener processes one can think about L (a, t) da as the time during the time interval [0, t] a Wiener process is infinitely closely around a. Example 6.70 The Green function and the local time of Wiener processes.
Let w(x) be a Wiener process starting from point x. Let x ∈ I (a, b) be a (x) bounded interval, and let τ I be the exit-time of w(x) from I. Let us calculate the expected value E L(x) (y, τ I ) . By the definition of local time L(x) t
(x) sign w(x) − y dw(x) + L(x) (y, t) . w (t) − y = w(x) (0) − y + 0 (x)
If we truncate w(x) by τ I then the truncated process is bounded. If we truncate (x) both sides with τ I then the truncated integrator is in H2 . By Itˆo’s isometry the integral is also in H2 . Therefore the stochastic integral is a uniformly integrable martingale. By the Optional Sampling Theorem the expected value of the stochastic integral is zero, so
(x) (x) E w(x) τ I − y = |x − y| + E L(x) y, τ I . w(x) leaves the bounded set [a, b] almost surely so
(x) (x) E w(x) τ I − y = |a − y| P w τ I =a
(x) + |b − y| P w τ I =b . With the Optional Sampling Theorem one can easily calculate the probabilities69 . Obviously
(x) (x) P w τI = a + P w τI = b = 1, 68 See: 69 See:
Corollary 6.75, page 435. Example 1.116, page 81.
428
ˆ FORMULA ITO’s
and
(x) x = E w(x) (0) = E w(x) τ I
(x) (x) = aP w τ I = a + bP w τ I =b . Solving the equations
b−x (x) , P w τI =a = b−a
x−a (x) P w τI . =b = b−a
Substituting back
x−a b−x (x) + |b − y| − |x − y| . = |a − y| E L(x) y, τ I b−a b−a With elementary calculation
(x)
E L
(x) y, τ I
2 = b−a
(x − a) (b − y) if a ≤ x ≤ y ≤ b . (y − a) (b − x) if a ≤ y ≤ x ≤ b
If we introduce the so-called Green function 1 (x − a) (b − y) if a ≤ x ≤ y ≤ b GI (x, y) (y − a) (b − x) if a ≤ y ≤ x ≤ b b−a then
(x) E L(x) y, τ I = 2GI (x, y) .
Example 6.71 If 0 < a < b then before reaching point b a Wiener process starting from x = 0 on average spends 2 (b − a) da time units in the da neighbourhood point a.
Let w be a Wiener process and let 0 < a < b. Let us denote by τ b the first passage time of point b. Using the interpretation of the local times one should calculate the expected value E (L (a, τ b )). Using the same method as in the previous example = |a| + E
|b − a| = E (|w (τ b ) − a|) = τb sign (w (s) − a) dw (s) + E (L (a, τ b )) .
0
Observe that now wτ b is not bounded, so it is not in H2 so the stochastic integral is not a uniformly integrable martingale. If c < 0 < a < b, then as in the previous
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
429
example E (L (a, τ b ∧ τ c )) = 2G(c,b) (0, a) . If c −∞, then the limit on the right-hand side is 2 (b − a). On the left-hand side τ b ∧τ c τ b and as t → L (a, t) is increasing and continuous by the Monotone Convergence Theorem E (L (a, τ b )) = 2 (b − a) . 6.5.3
Meyer–Itˆ o formula
Theorem 6.72 Let X be a semimartingale. If L (a) is the local time of X at point a then for almost all outcome ω the support of the measure generated by the increasing function t → L (a, t, ω) is in the set {s : X (s−, ω) = X (s, ω) = a} . Proof. By the definition of local times L (a, t, ω) is continuous in time parameter t. This implies that the measure of every single point, with respect to the measure generated by L (a, t, ω), is zero. For every trajectory the number of the jumps of X is maximum countable, so it is sufficient to prove that the support of the measure generated by L (a, t, ω) is a subset of {s : X (s−, ω) = a} for almost all outcome ω. As convex functions of semimartingales are semimartingales Y |X − a| is a semimartingale. Y 2 = Y 2 (0) + 2Y− • Y + [Y ] . Z X − a is also a semimartingale. Y 2 = Z 2 = Y 2 (0) + 2Z− • Z + [Z] . Obviously [Z] = [Y ], therefore Y− • Y = Z− • Z. As Y = |Z|
t
sign (Z− ) dZ + Aa (t) .
Y (t) = Y (0) + 0
By the associativity rule
t
Y− dY = 0
t
0
t
Y− dAa .
Y− sign (Z− ) dZ + 0
430
ˆ FORMULA ITO’s
By the definition of sign Y− sign (Z− ) |Z− | sign (Z− ) = Z− .
(6.58)
Therefore
t
t
Z− dZ =
Y− dY =
0
0
t
0
t
Y− dAa .
Z− dZ + 0
Hence, by the definition of L (a, t, ω)
t
Y− dAa =
0=
(6.59)
0
t
Y− dLa +
[4pt] = 0
Y (s−) (∆ |Z (s)| − sign (Z (s−)) ∆Z (t)) .
0<s≤t
Observe that by (6.58) the expression after the sum is finite and has the form |a| (|b| − |a|) − a (b − a) = |a| |b| − a2 − ab + a2 = = |ab| − ab ≥ 0. t La is increasing, therefore the integral 0 Y− dLa is non-negative. This implies that the sum and the integral in (6.59) are zero. But as the integral is zero the support of the measure generated by La is part of the set {Y (s−) = 0} {|X (s−) − a| = 0} = {X (s−) = a} .
Example 6.73 If L is the local time of a Wiener process and τ b is the first passage time of a point b and 0 ≤ a < b then L (a, τ b ) has an exponential distribution with parameter70 λ (2 (b − a))−1 .
We show that the Laplace transform of the random variable L (a, τ b ) is l (s) E (exp (−s · L (a, τ b ))) = 70 See:
Example 6.71, page 428.
1 . 1 + 2s · (b − a)
(6.60)
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
431
As the Laplace transform of an exponentially distributed random variable is 1 1 + s/λ this implies the statement. 1. The main idea of the proof is to show that X (t)
1 + + s · (w (t) − a) exp (−s · L (a, t)) 2
is a local martingale. As Xτb =
1 + + s · (wτ b − a) exp (−s · Lτ b (a)) 2
(6.61)
is bounded, X τ b is a bounded local martingale. Hence (6.61) is a uniformly integrable martingale. Therefore by the Optional Sampling Theorem as 0 ≤ a < b 1 =E 2
1 + exp (−s · L (a, 0)) = + s · (w (0) − a) 2
1 + + s · (w (τ b ) − a) exp (−s · L (a, τ b )) = 2
=E =
1 + s · (b − a) l (s) , 2
from which (6.60) is trivial. 2. Let us return to process X. Let U (t)
1 + + s · (w (t) − a) , 2
V (t) exp (−sL (a, t)) .
Integrating by parts
t
U dV +
X (t) = U (t) V (t) = X (0) + 0
t
V dU + [U, V ] . 0
U is continuous, V has finite variation so [U, V ] = 0. By the previous theorem the support of the measure generated by V is in {w = a}, so
t
U dV = 0
1 1 + + s · (a − a) (V (t) − V (0)) = (V (t) − 1) . 2 2
432
ˆ FORMULA ITO’s
By Tanaka’s formula 1 U (t) H (t) + s · L (a, t) , 2 where H isa continuous local martingale. V is continuous so it is locally bounded t so Z (t) 0 V dH is a local martingale. On the other hand, by the Fundamental Theorem of Calculus71
t
Vd 0
1 s t s·L = exp (−s · L (a, u)) L (a, du) = 2 2 0 t s exp (−s · L (a, u)) = = 2 −s 0 1 1 = − (exp (−s · L (a, u)) − 1) = − (V (t) − 1) . 2 2
Hence X (t) = X (0) + Z (t) +
1 1 (V (t) − 1) − (V (t) − 1) = 2 2
= X (0) + Z (t) , that is, X is a local martingale. Theorem 6.74 (Meyer–Itˆ o formula) Let X be a semimartingale and let f be denotes the left derivative of f and µ is the second a convex function. If f f− generalized derivative of f and L is the local time of X, then f (X (t)) − f (X (0)) = t f (X− ) dX+ = +
(6.62)
0
(f (X (s)) − f (X (s−)) − f (X (s−)) ∆X (s)) +
0<s≤t
+
1 2
L (a, t) dµ (a) . R
Proof. Recall that the second generalized derivative of |x| is 2δ 0 . So if f (x) = |x| , then by the theorem one gets just the definition of local times. 1. Let us first assume that the support of µ is compact. In this case the representation (6.53) holds. If f (x) = αx + β then the theorem is trivially true, 71 See:
(6.32), page 398. Or, if one likes, by Itˆ o’s formula.
ˆ FORMULA FOR CONVEX FUNCTIONS ITO’S
433
therefore one can assume that
\[
f(x) = \frac12\int_{\mathbb{R}} |x-a|\,d\mu(a).
\]
With the Dominated Convergence Theorem one can differentiate under the integral sign:
\[
f'_-(x) = \frac12\int_{\mathbb{R}} \operatorname{sign}(x-a)\,d\mu(a).
\]
If
\[
J(a,t) \triangleq \sum_{0<s\leq t}\left(|X(s)-a| - |X(s-)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right),
\]
then by the Monotone Convergence Theorem
\[
\frac12\int_{\mathbb{R}} J(a,t)\,d\mu(a) = \sum_{0<s\leq t}\frac12\int_{\mathbb{R}}\left(|X(s)-a| - |X(s-)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right)d\mu(a) = \sum_{0<s\leq t}\left(f(X(s)) - f(X(s-)) - f'_-(X(s-))\,\Delta X(s)\right).
\]
Similarly, if $H(a,t) \triangleq |X(t)-a| - |X(0)-a|$, then
\[
f(X(t)) - f(X(0)) = \frac12\int_{\mathbb{R}} H(a,t)\,d\mu(a).
\]
Let
\[
Z(a,t) \triangleq \int_0^t \operatorname{sign}\left(X(s-)-a\right)dX(s)
\]
and let us take a $\mathcal{B}(\mathbb{R})\times\mathcal{B}(\mathbb{R}_+)\times\mathcal{A}$-measurable version of this parametric integral⁷². By Fubini's theorem for stochastic integrals⁷³,
\[
\frac12\int_{\mathbb{R}} Z(a,t)\,d\mu(a) = \int_0^t \frac12\int_{\mathbb{R}} \operatorname{sign}\left(X(s-)-a\right)d\mu(a)\,dX(s) = \int_0^t f'_-(X(s-))\,dX(s).
\]
By the definition of local times $L = H - J - Z$, that is,
\[
\frac12 H = \frac12 J + \frac12 Z + \frac12 L.
\]
Integrating by $\mu$ and using the already proved formulas one can easily prove the theorem.
2. Let us take the general case and let
\[
f_n(x) \triangleq \begin{cases} f(-n) + f'_-(-n)(x+n) & \text{if } x \leq -n, \\ f(x) & \text{if } -n < x < n, \\ f(n) + f'_-(n)(x-n) & \text{if } x \geq n. \end{cases}
\]
$f_n$ is also convex. Let $\mu_n$ be the generalized second derivative of $f_n$. Obviously the support of $\mu_n$ is in $[-n,n]$ and the measure $\mu_n$ is finite. Hence we can use the already proved part of the theorem. Let
\[
\tau_n \triangleq \inf\left\{t : |X(t)| \geq n\right\},
\]
and let us consider the stopped processes $X^{\tau_n}$. By the already proved part of the theorem,
\[
f_n\left(X^{\tau_n}(t)\right) - f_n\left(X^{\tau_n}(0)\right) = \int_0^t f'_{n-}\left(X^{\tau_n}(s-)\right)dX^{\tau_n}(s) + \sum_{0<s\leq t}\left(\Delta f_n\left(X^{\tau_n}(s)\right) - f'_{n-}\left(X^{\tau_n}(s-)\right)\Delta X^{\tau_n}(s)\right) + \frac12\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a),
\]
where obviously $L_n(a)$ denotes the local time of $X^{\tau_n}$. Observe that $|X^{\tau_n}| \leq n$ on $[0,\tau_n)$. Therefore on $[0,\tau_n)$ one can write $f$ instead of $f_n$. The support of the measure generated by $L_n(a)$ is in the set $\{X^{\tau_n}(s-) = a\}$, that is, if $|a| \geq n$, then
⁷² See Proposition 5.23, page 319.
⁷³ See Theorem 5.25, page 322.
$L_n(a,t) = 0$ for all $t$. The measures $\mu$ and $\mu_n$ are equal on the interval $[-n,n]$, so in the integral containing the local time one can write $\mu$ instead of $\mu_n$. That is,
\[
\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a) = \int_{\mathbb{R}} L_n(a,t)\,d\mu(a).
\]
From the definition of the local time it is evident that the local time of $X^{\tau_n}$ is $L^{\tau_n}$. Hence if $t \leq \tau_n$, then
\[
\int_{\mathbb{R}} L_n(a,t)\,d\mu_n(a) = \int_{\mathbb{R}} L_n(a,t)\,d\mu(a) = \int_{\mathbb{R}} L^{\tau_n}(a,t)\,d\mu(a) = \int_{\mathbb{R}} L(a,t)\,d\mu(a).
\]
If $n \to \infty$, then $\tau_n \nearrow \infty$, and the theorem holds in the general case as well.

Corollary 6.75 (Occupation Times Formula) If $X$ is a semimartingale and $L$ is the local time of $X$, then for every bounded Borel measurable function $g : \mathbb{R} \to \mathbb{R}$ and for all $t$, for almost all outcomes,
\[
\int_{\mathbb{R}} L(a,t)\,g(a)\,da = \int_0^t g(X(s-))\,d[X^c](s). \tag{6.63}
\]
The identity is meaningful, and it is also valid, if $g$ is a non-negative Borel measurable function.

Proof. Let $f$ be convex and let $f \in C^2$. In this case one can use Itô's formula. Comparing Itô's formula with (6.62),
\[
\int_{\mathbb{R}} L(a,t,\omega)\,f''(a)\,da = \int_0^t f''(X(s-))\,d[X^c].
\]
Of course, instead of $f''$ one can write any non-negative continuous function $g$. By the Monotone Class Theorem the identity is valid for every bounded Borel measurable function. With the Monotone Convergence Theorem one can extend the identity to non-negative Borel measurable functions.

Let $X = w$ be a Wiener process. In this case $[X](s) = s$, and by (6.63), for every Borel measurable set $B$,
\[
\int_B L(a,t)\,da = \int_0^t \chi\left(w(s) \in B\right)ds = \lambda\left(s \leq t : w(s) \in B\right).
\]
The last variable gives the time $w$ spends in the set $B$. For fixed $t$ this occupation time is a measure on the time-line and $L(a,t)$ is the Radon–Nikodym derivative
of this occupation time measure. By the interpretation of density functions, $L(a,t)\,da$ is the time $w$ spends around $a$ during the time interval $[0,t]$.

Corollary 6.76 If $X$ is a semimartingale and $L$ is the local time of $X$, then
\[
[X^c](t) = \int_{\mathbb{R}} L(a,t)\,da.
\]

Corollary 6.77 (Meyer–Tanaka formula) If $X$ is a continuous semimartingale and $L$ denotes the local time of $X$, then⁷⁴
\[
|X| = |X(0)| + \operatorname{sign}(X)\bullet X + L(0).
\]
By Itô's formula and by the Itô–Meyer formula the class of semimartingales is closed under a quite broad class of transformations. That is why the next example is interesting.

Example 6.78 If $X \neq 0$ is a continuous local martingale, $X(0) = 0$ and $0 < \alpha < 1$, then $|X|^{\alpha}$ is not a semimartingale.

1. The example is a bit surprising because $|X|$ is a semimartingale, and by the Itô–Meyer formula a concave function of a semimartingale is again a semimartingale. But recall that in Theorem 6.65 the domain of definition of $F$ is the whole real line, or at least an open convex set containing the range of $X$. Now this is not true. Let us also observe that the function $|x|^{\alpha}$ is not concave on the whole line.
2. Let $L$ be the local time of $X$. Assume that $L(0) \equiv 0$. By the Meyer–Tanaka formula
\[
|X| = \operatorname{sign}(X)\bullet X + L(0) = \operatorname{sign}(X)\bullet X.
\]
On the right-hand side the integral is a local martingale, hence $|X|$ is a non-negative local martingale, so by Fatou's lemma it is a supermartingale⁷⁵. As $|X(0)| = 0$,
\[
0 = E\left(|X(0)|\right) \geq E\left(|X(t)|\right),
\]
which implies that if $L(0) \equiv 0$ then $|X| = 0$.
3. Now we prove that if $Y \triangleq |X|^{\alpha}$ is a semimartingale then $L(0) \equiv 0$. With localization one can assume that $X \in \mathcal{H}_0^2$. The support of $L(0)$ is in $\{X(s) = 0\}$,
⁷⁴ Obviously $L(0)$ denotes the process $t \mapsto L(0,t)$.
⁷⁵ See page 386.
so by the Meyer–Tanaka formula
\[
L(0,t) = \int_0^t 1\,dL = \int_0^t \chi\left(X(s) = 0\right)dL(0,s) = \int_0^t \chi\left(X(s) = 0\right)d|X(s)| - \int_0^t \chi\left(X(s) = 0\right)\operatorname{sign}\left(X(s)\right)dX(s).
\]
Let us first investigate the second integral
\[
Z(t) \triangleq \int_0^t \chi\left(X(s) = 0\right)\operatorname{sign}\left(X(s)\right)dX(s).
\]
By Itô's isometry and by (6.63),
\[
E\left(Z^2(t)\right) = E\left(\int_0^t \chi\left(X(s) = 0\right)d[X](s)\right) = E\left(\int_{\mathbb{R}} \chi_{\{0\}}(a)\,L(a,t)\,da\right) = E\left(\int_{\{0\}} L(a,t)\,da\right) = 0,
\]
hence $Z = 0$. Now let us calculate the first integral
\[
\int_0^t \chi\left(X(s) = 0\right)d|X(s)|.
\]
As $0 < \alpha < 1$, $\beta \triangleq 1/\alpha > 1$. If $\beta \geq 2$ then by Itô's formula for $C^2$ functions
\[
|X| = Y^{\beta} = \beta Y^{\beta-1}\bullet Y + \frac{\beta(\beta-1)}{2}Y^{\beta-2}\bullet[Y].
\]
Using that $\{X(s) = 0\} = \{Y(s) = 0\}$,
\[
I(t) \triangleq \int_0^t \chi\left(Y(s) = 0\right)d|X(s)| = \beta\int_0^t \chi\left(Y(s) = 0\right)Y^{\beta-1}\,dY + \frac{\beta(\beta-1)}{2}\int_0^t \chi\left(Y(s) = 0\right)Y^{\beta-2}\,d[Y].
\]
The integrand in the first integral is zero, so the integral is zero. If $\beta > 2$ then the integrand in the second integral is also zero, so the second integral is zero again. If $\beta = 2$, then using (6.63),
\[
\int_0^t \chi\left(Y(s) = 0\right)d[Y] = \int_{\mathbb{R}} L(a,t)\,\chi_{\{0\}}(a)\,da = \int_{\{0\}} L(a,t)\,da = 0.
\]
Let $2 > \beta > 1$. The function
\[
g(x) \triangleq \begin{cases} x^{\beta} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0 \end{cases}
\]
is a convex function on $\mathbb{R}$. Hence by Itô's formula for convex functions
\[
|X| = g(Y) = Y^{\beta} = g'_-(Y)\bullet Y + \frac12\int_{\mathbb{R}} H(a)\,d\mu(a),
\]
where $H$ is the local time of $Y$. In this case again
\[
\int_0^t \chi\left(X = 0\right)g'_-(Y)\,dY = \int_0^t \chi\left(X = 0\right)\beta Y^{\beta-1}\,dY = 0.
\]
Let us calculate the integral
\[
\int_0^t \chi\left(Y(s) = 0\right)d\left(\int_{\mathbb{R}} H(a,s)\,d\mu(a)\right). \tag{6.64}
\]
$\mu$ is defined by the increasing function
\[
g'_-(x) = \begin{cases} \beta x^{\beta-1} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0 \end{cases} = \int_{-\infty}^x h(t)\,dt,
\]
where
\[
h(x) \triangleq \begin{cases} \beta(\beta-1)x^{\beta-2} & \text{if } x > 0, \\ 0 & \text{if } x \leq 0. \end{cases}
\]
$H$ is the local time of $Y$, so
\[
\int_{\mathbb{R}} H(a,s)\,d\mu(a) = \int_0^{\infty} H(a,s)\,h(a)\,da = \int_0^s h(Y)\,d[Y],
\]
therefore (6.64) is
\[
\int_0^t \chi\left(Y(s) = 0\right)h(Y)\,d[Y] = 0.
\]
This means that if $Y$ is a semimartingale then $L(0) = 0$, hence $X = 0$.

6.5.4 Local times of continuous semimartingales
Observe that for every $a$ the local time $L(a,t,\omega)$ is defined only up to indistinguishability. This means that for every $a$ one can modify $L(a,t,\omega)$ on a set with probability zero. The local time is always continuous in the parameter $t$, so one can think of $L$ as a $C([0,\infty))$-valued stochastic process $(a,\omega) \mapsto L(a,\omega)$,
where $L(a,\omega)$ denotes the trajectory of $L$ in $t$. As this function-valued process is defined only almost surely, one can use any of its modifications as local time. In this subsection we prove that, under some restrictions on the semimartingale $X$, the process $L(a,t,\omega)$ has a version which is right-regular in $a$. To do this we shall use the next result:

Proposition 6.79 (Kolmogorov's criterion) Let $I$ be an interval in $\mathbb{R}$ and let $X$ be a Banach space valued stochastic process on $I$. If for some positive constants $a$, $b$ and $c$
\[
E\left(\left\|X(u) - X(v)\right\|^a\right) \leq c\,|u-v|^{1+b},
\]
then $X$ has a continuous modification.

Proposition 6.80 If $X$ is a continuous local martingale then the local time $L(a,t,\omega)$ of $X$ has a modification in $a$ which is continuous in $(a,t)$.

Proof. One can localize the proposition, as if $L$ is the local time of $X$ and $\tau$ is a stopping time then the local time of $X^{\tau}$ is $L^{\tau}$. Therefore one can assume that $X - X(0) \in \mathcal{H}_0^2$. By definition
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \int_0^t \operatorname{sign}\left(X(s)-a\right)dX(s).
\]
Let us introduce the notation⁷⁶
\[
\widetilde{M}(a,u) \triangleq \int_0^u \operatorname{sign}\left(X(s)-a\right)dX^c(s).
\]
It is sufficient to show that $\widetilde{M}$ has a continuous version. We want to apply Kolmogorov's criterion. $C([0,t])$ is a Banach space for arbitrary fixed $t$. Obviously, if a function $g : I \to C([0,t])$ is continuous then it defines a continuous function over $I \times [0,t]$. We show that for all $t$
\[
E\left(\left\|\widetilde{M}(a) - \widetilde{M}(b)\right\|^4_{C([0,t])}\right) = E\left(\sup_{s\leq t}\left|\widetilde{M}(a,s) - \widetilde{M}(b,s)\right|^4\right) \leq k\cdot|a-b|^2. \tag{6.65}
\]
⁷⁶ Of course now instead of $X^c$ one can write $X$. But later we shall re-use this part of the proof in a bit different situation.
By Burkholder's and by Jensen's inequality, using the Occupation Times Formula,
\[
E\left(\sup_{s\leq t}\left|\widetilde{M}(a,s) - \widetilde{M}(b,s)\right|^4\right) \leq c\cdot E\left(\left[\widetilde{M}(a) - \widetilde{M}(b)\right]^2(t)\right) = \tag{6.66}
\]
\[
= c\cdot E\left(\left(\int_0^t 4\chi\left(a < X(s) \leq b\right)d[X^c](s)\right)^2\right) = 16c\cdot E\left(\left(\int_a^b L(x,t)\,dx\right)^2\right) =
\]
\[
= 16c\cdot(b-a)^2\,E\left(\left(\frac{1}{b-a}\int_a^b L(x,t)\,dx\right)^2\right) \leq 16c\cdot(b-a)^2\,E\left(\frac{1}{b-a}\int_a^b L^2(x,t)\,dx\right).
\]
Changing the integrals by Fubini's theorem one can estimate the last line with the following expression:
\[
16c\cdot(b-a)^2\sup_x E\left(L^2(x,t)\right). \tag{6.67}
\]
Using the definition of local times and the elementary inequalities
\[
\left||X(t)-a| - |X(0)-a|\right| \leq |X(t)-X(0)|, \qquad (z_1-z_2)^2 \leq 2\left(z_1^2+z_2^2\right),
\]
\[
L^2(x,t) \leq 2\left(X(t)-X(0)\right)^2 + 2\left(\int_0^t \operatorname{sign}\left(X(s)-x\right)dX(s)\right)^2.
\]
One can estimate the expected value in (6.67) by
\[
2\left\|X-X(0)\right\|^2_{\mathcal{H}^2} + 2\left\|\operatorname{sign}(X-x)\bullet X\right\|^2_{\mathcal{H}^2}.
\]
By Itô's isometry
\[
\left\|\operatorname{sign}(X-x)\bullet X\right\|^2_{\mathcal{H}^2} = E\left(\int_0^{\infty} 1\,d[X]\right) = \left\|1\bullet X\right\|^2_{\mathcal{H}^2} = \left\|X-X(0)\right\|^2_{\mathcal{H}^2},
\]
so the estimate of $E\left(L^2(x,t)\right)$ is independent of $x$. So by (6.67) inequality (6.65) follows.

Definition 6.81 If $X$ is a continuous local martingale then $L(a,t,\omega)$ denotes the version which is continuous in $(a,t)$.

Corollary 6.82 If $X$ is a continuous local martingale then almost surely, for every value of the parameters $a$ and $t$,
\[
L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi\left(a-\varepsilon < X(s) < a+\varepsilon\right)d[X](s). \tag{6.68}
\]

Proof. By the occupation times formula, for any interval $I$,
\[
\frac{1}{\lambda(I)}\int_0^t \chi_I\left(X(s)\right)d[X](s) = \frac{1}{\lambda(I)}\int_{\mathbb{R}} L(a,t)\,\chi_I(a)\,da = \frac{1}{\lambda(I)}\int_I L(a,t)\,da.
\]
$L$ is continuous in $a$, hence if $a_0 \in I$ and $\lambda(I) \to 0$, then
\[
\frac{1}{\lambda(I)}\int_I L(a,t)\,da \to L(a_0,t),
\]
from which (6.68) is evident.

Corollary 6.83 If $w$ is a Wiener process then the occupation time measure $\mu_t(B) \triangleq \lambda\left(s \leq t : w(s) \in B\right)$ almost surely has a differentiable distribution function, and the derivative of this function is $L(a,t)$.

Definition 6.84 A semimartingale $X$ satisfies the so-called hypothesis A if for every $t$, almost surely,
\[
\sum_{0<s\leq t}\left|\Delta X(s)\right| < \infty.
\]
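The limit (6.68) lends itself to a numerical sanity check on a simulated Wiener path, for which $[X](s) = s$. The sketch below is not from the text: the step count `n`, the band width `eps` and the number of paths are ad hoc simulation choices, and the known value $E(L(0,1)) = E|N(0,1)| = \sqrt{2/\pi} \approx 0.798$ is used only for comparison.

```python
import numpy as np

# Monte Carlo sketch of the band-occupation limit (6.68) for a Wiener process.
# All discretization parameters (n, eps, number of paths) are ad hoc choices.

def local_time_estimate(t=1.0, a=0.0, n=100_000, eps=0.01, rng=None):
    """(1/(2*eps)) * lambda{s <= t : |w(s) - a| < eps} on a discretized path."""
    if rng is None:
        rng = np.random.default_rng(0)
    dt = t / n
    w = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))
    return np.sum(np.abs(w - a) < eps) * dt / (2.0 * eps)

rng = np.random.default_rng(42)
estimates = [local_time_estimate(rng=rng) for _ in range(150)]
mean_L = float(np.mean(estimates))
# for comparison: E(L(0,1)) = E|N(0,1)| = sqrt(2/pi) ≈ 0.798
print(mean_L)
```

The band width must dominate the step size ($\varepsilon \gg \sqrt{t/n}$), otherwise the discrete path jumps over the band and the estimate is biased downwards.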
Proposition 6.85 If semimartingale X satisfies hypothesis A then the local time L (a, t, ω) has a B (R) × P-measurable equivalent modification which is almost surely continuous in t and right-regular in a.
Proof. If $X$ satisfies hypothesis A then the process $\Delta X$ has finite variation. In this case $X - \Delta X$ is meaningful and it is a continuous semimartingale. Let $J \triangleq \Delta X$. As $Y \triangleq X - J$ is a continuous semimartingale, it has a unique decomposition $M + V$, where $M$ is a continuous local martingale and $V$ is a continuous process with finite variation. By the definition of local times
\[
|X(t)-a| = |X(0)-a| + \int_0^t \operatorname{sign}\left(X(s-)-a\right)dX(s) + \sum_{0<s\leq t}\left(\Delta|X(s)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right) + L(a,t).
\]
For every $s$, by the triangle inequality,
\[
\left|\Delta|X(s)-a|\right| \leq \left|\Delta X(s)\right|. \tag{6.69}
\]
Therefore by hypothesis A the sums
\[
\sum_{0<s\leq t}\operatorname{sign}\left(X(s-)-a\right)\Delta X(s) \qquad\text{and}\qquad \sum_{0<s\leq t}\Delta|X(s)-a|
\]
are finite. Hence one can separate the terms in
\[
\sum_{0<s\leq t}\left(\Delta|X(s)-a| - \operatorname{sign}\left(X(s-)-a\right)\Delta X(s)\right). \tag{6.70}
\]
For every semimartingale $Z$ let
\[
\widetilde{Z}(a,t) \triangleq \int_0^t \operatorname{sign}\left(X(s-)-a\right)dZ(s).
\]
Observe that the second term of the sum (6.70) is $-\widetilde{J}(a,t)$. Using the decomposition $X = M + V + J$,
\[
|X(t)-a| = |X(0)-a| + \widetilde{M}(a,t) + \widetilde{V}(a,t) + \widetilde{J}(a,t) - \widetilde{J}(a,t) + \sum_{0<s\leq t}\Delta|X(s)-a| + L(a,t),
\]
that is,
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{M}(a,t) - \widetilde{V}(a,t) - \sum_{0<s\leq t}\Delta|X(s)-a|. \tag{6.71}
\]
By (6.69) and by hypothesis A, $\Delta|X(s)-a|$ is continuous in $a$ and it is dominated by an integrable variable with respect to the counting measure. By the Dominated Convergence Theorem
\[
\lim_{u\to a}\sum_{0<s\leq t}\Delta|X(s)-u| = \sum_{0<s\leq t}\Delta|X(s)-a|,
\]
so the sum is continuous with respect to $a$. One should show that the proposition is valid for $\widetilde{M}(a,t)$ and $\widetilde{V}(a,t)$. $V$ has finite variation on any finite interval, and the bounded function $\operatorname{sign}\left(X(s-)-u\right)$ is right-regular with respect to $u$. By the Dominated Convergence Theorem $\widetilde{V}$ is right-regular with respect to $a$. Finally, let us consider $\widetilde{M}$. The continuous part of the semimartingale $X$ is $X^c = M$, so repeating the proof of the previous proposition one can easily prove that $\widetilde{M}(a,t)$ has a continuous version.

Corollary 6.86 If a semimartingale $X$ satisfies hypothesis A and if $M + V$ is the decomposition of $X - \Delta X$, then
\[
\Delta L(a,t) \triangleq L(a,t) - L(a-,t) = 2\int_0^t \chi\left(X(s-)=a\right)dV(s) = 2\int_0^t \chi\left(X(s)=a\right)dV(s).
\]

Proof. By the proof of the previous proposition only $\widetilde{V}(a,t)$ is not continuous in $a$, so
\[
\Delta L(a) = -\Delta\widetilde{V}(a) = -\int_0^t \left(\operatorname{sign}\left(X(s-)-a\right) - \operatorname{sign}\left(X(s-)-(a-)\right)\right)dV(s) = 2\int_0^t \chi\left(X(s-)=a\right)dV(s).
\]
$V$ is continuous and $X(s-) = X(s)$ outside a countable number of points $s$, so
\[
2\int_0^t \chi\left(X(s-)=a\right)dV(s) = 2\int_0^t \chi\left(X(s)=a\right)dV(s).
\]
Example 6.87 Even for continuous semimartingales the local time can be discontinuous.

1. Let $w$ be a Wiener process and let $X \triangleq |w|$. As the support of the measure generated by $L(a)$ is in the set $\{X = a\}$, if $a < 0$ then $L(a,t) = 0$. Let $a = 0$. $L$ is right-continuous in the parameter $a$, therefore, using the occupation times formula,
\[
L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^{\varepsilon} L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{\mathbb{R}} \chi\left(0 \leq a < \varepsilon\right)L(a,t)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(|w| < \varepsilon\right)d[|w|].
\]
By Tanaka's formula $|w| = \operatorname{sign}(w)\bullet w + L_w(0)$. $L_w(0)$ is continuous and increasing, so $[|w|] = [\operatorname{sign}(w)\bullet w] = [w]$. Hence, using again that $L_w$ is continuous,
\[
L(0,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(-\varepsilon < w < \varepsilon\right)d[w] = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_{-\varepsilon}^{\varepsilon} L_w(a)\,da = 2L_w(0) \neq 0.
\]
This implies that the local time $L(a,t)$ is not left-continuous in the parameter $a$.
2. On the other hand it is interesting to discuss the case $a > 0$. Again by the right-continuity,
\[
L(a,t) = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(|w| \in [a,a+\varepsilon)\right)(s)\,ds = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(w \in [a,a+\varepsilon)\right)(s)\,ds + \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\int_0^t \chi\left(-w \in [a,a+\varepsilon)\right)(s)\,ds.
\]
The first limit is $L_w(a,t)$ and the second is $L_w(-a,t)$. Hence
\[
L(a,t) = L_w(-a,t) + L_w(a,t).
\]
This expression is continuous on the set $a \geq 0$.
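The two claims of Example 6.87 are visible in a simple discretized experiment (the path, the grid and the band widths below are ad hoc choices, not from the text): the band-occupation density of $X = |w|$ is exactly $0$ for $a < 0$ but strictly positive at $a = 0$, so $a \mapsto L(a,t)$ cannot be left-continuous at $0$, while for $a > 0$ the band count for $|w|$ splits into the counts for $w$ near $a$ and near $-a$.

```python
import numpy as np

# Discretized illustration of Example 6.87 (grid sizes are ad hoc choices).
rng = np.random.default_rng(7)
n, t = 400_000, 1.0
dt = t / n
w = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))
X = np.abs(w)
eps = 0.01

def band_density(path, a):
    # right-band occupation density (1/eps) * lambda{s <= t : path(s) in [a, a+eps)}
    return np.sum((path >= a) & (path < a + eps)) * dt / eps

L_neg = band_density(X, -0.05)   # a < 0: |w| never enters the band, so exactly 0
L_zero = band_density(X, 0.0)    # a = 0: strictly positive for a typical path
a = 0.3
lhs = band_density(X, a)                               # estimate of L(a, t) for X = |w|
rhs = band_density(w, a) + band_density(w, -a - eps)   # estimate of L_w(a,t) + L_w(-a,t)
print(L_neg, L_zero, lhs, rhs)
```

Since $\{|w| \in [a, a+\varepsilon)\} = \{w \in [a, a+\varepsilon)\} \cup \{w \in (-a-\varepsilon, -a]\}$, the split `lhs ≈ rhs` holds up to band-endpoint effects.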
6.5.5 Local time of Wiener processes
In this subsection we shall investigate the local times of Wiener processes.

Definition 6.88 If $w$ is a Wiener process then $L$ denotes the local time of $w$ at the point $a = 0$. That is, $L \triangleq L_w(0)$. We shall very often refer to $L$ as the local time of $w$.

Example 6.89 Tanaka's formula for Wiener processes.

If $w$ is a Wiener process and $L \triangleq L_w(0)$ is the local time of $w$, then by Tanaka's formula
\[
|w| = \operatorname{sign}(w)\bullet w + L \triangleq \beta + L. \tag{6.72}
\]
$\operatorname{sign}(w)\bullet w$ is a continuous local martingale with quadratic variation $\left(\operatorname{sign}(w)\right)^2\bullet[w] = [w]$. By Lévy's characterization theorem⁷⁷, $\beta \triangleq \operatorname{sign}(w)\bullet w$ is also a Wiener process.

Our goal is to describe the distribution of $L$. To do this we shall need the next simple lemma:

Lemma 6.90 (Skorohod) If $y$ is a continuous function defined on $\mathbb{R}_+$ and $y(0) \geq 0$, then there are functions $z$ and $a$ on $\mathbb{R}_+$ for which:
1. $z = y + a$;
2. $z$ is non-negative;
3. $a$ is increasing, continuous, $a(0) = 0$, and the support of the measure generated by $a$ is in the set $\{z = 0\}$.
The functions $a$ and $z$ are unique, and
\[
a(t) = \sup_{s\leq t} y^-(s) \triangleq \sup_{s\leq t}\max\left(-y(s),0\right). \tag{6.73}
\]

Proof. First we show that the decomposition is unique. Let $(a_1,z_1)$ and $(a_2,z_2)$ be two decompositions satisfying the conditions of the lemma. As
\[
y = z_1 - a_1 = z_2 - a_2,
\]
⁷⁷ See Theorem 6.13, page 368.
so $z_1 - z_2 = a_1 - a_2$. As $a_1$ and $a_2$ are increasing, $z_1 - z_2$ and $a_1 - a_2$ have finite variation. Integrating by parts,
\[
0 \leq \left(z_1 - z_2\right)^2(t) = 2\int_0^t \left(z_1(s) - z_2(s)\right)d\left(z_1 - z_2\right)(s) = 2\int_0^t \left(z_1(s) - z_2(s)\right)d\left(a_1 - a_2\right)(s).
\]
By the assumption about the support of the measures generated by the functions $a_1$ and $a_2$, and as $z_1 \geq 0$ and $z_2 \geq 0$, the last integral is
\[
-2\int_0^t z_1(s)\,da_2 - 2\int_0^t z_2(s)\,da_1 \leq 0.
\]
Hence $z_1 = z_2$.
As a second step we show that $a$ in (6.73) and $z \triangleq y + a$ satisfy the conditions of the lemma. $a$ is trivially increasing. By the assumptions $y$ is continuous, hence $y^-$ is also continuous, and it is easy to show that $a$ is continuous. For every $t$,
\[
z(t) \triangleq y(t) + a(t) \geq y(t) + y^-(t) = y^+(t) \geq 0.
\]
One should prove that the support of the measure generated by $a$ is in the set $\{z = 0\}$, that is,
\[
\int_{\mathbb{R}_+} \chi\left(z > 0\right)da = \lim_{n\to\infty}\int_{\mathbb{R}_+} \chi\left(z > \frac{1}{n}\right)da = 0.
\]
This means that one should prove that for every $\varepsilon > 0$
\[
\int_{\mathbb{R}_+} \chi\left(z > \varepsilon\right)da = 0.
\]
$z$ is continuous, hence for every $\varepsilon > 0$ the set $\{z > \varepsilon\}$ is open, hence $\{z > \varepsilon\}$ is a union of countably many open intervals. Let $(u,v)$ be one of these intervals. It is sufficient to prove that $a(v) = a(u)$. If $s \in (u,v)$, then
\[
-y(s) = a(s) - z(s) \leq a(v) - \varepsilon.
\]
From this
\[
a(v) = \max\left(a(u),\ \sup_{u\leq s\leq v} y^-(s)\right) \leq \max\left(a(u),\ a(v) - \varepsilon\right).
\]
This can happen only if $a(v) \leq a(u)$, that is, $a(v) = a(u)$.
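The explicit formula (6.73) is easy to test on a discrete path. In the sketch below (the random path and the grid are ad hoc choices, not from the text) `a` is the running supremum of $y^-$, and the three properties of Skorohod's lemma hold exactly on the grid: $z = y + a \geq 0$, $a$ is increasing, and $a$ increases only at points where $z = 0$.

```python
import numpy as np

# Discrete sketch of Skorohod's lemma: y is a discretized Wiener-like path
# with y(0) near 0; the grid and step law are ad hoc simulation choices.
rng = np.random.default_rng(1)
n = 100_000
y = np.cumsum(rng.normal(0.0, n ** -0.5, size=n))

a = np.maximum.accumulate(np.maximum(-y, 0.0))  # a(t) = sup_{s<=t} max(-y(s), 0)
z = y + a                                       # z = y + a

min_z = float(z.min())                          # should be >= 0
min_da = float(np.diff(a).min())                # should be >= 0 (a increasing)
increase = np.diff(a) > 0                       # steps where a actually grows
# wherever a increases, a(t) = -y(t), hence z(t) = 0 exactly
max_z_on_increase = float(z[1:][increase].max()) if increase.any() else 0.0
print(min_z, min_da, max_z_on_increase)
```

The last check is the discrete analogue of the support condition: the measure generated by `a` only charges the zero set of `z`.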
Proposition 6.91 The distribution of $L(t,\omega) \triangleq L(0,t,\omega)$ is the same as the distribution of the maximum of a Wiener process on the interval $[0,t]$. Hence the density function of $L(t)$ is
\[
f_t(x) \triangleq \frac{2}{\sqrt{2\pi t}}\exp\left(-\frac{x^2}{2t}\right), \qquad x > 0.
\]

Proof. By Tanaka's formula $|w| = \beta + L$, where $\beta$ is a Wiener process and the two sides are equal up to indistinguishability. The support of the measure generated by $L$ is in the set $\{|w| = 0\}$. Hence by Skorohod's lemma
\[
L(t) = \sup_{s\leq t}\beta^-(s) = \sup_{s\leq t}\left(-\beta(s)\right) \triangleq S_{-\beta}(t) \quad\text{a.s.}, \tag{6.74}
\]
from which, by the symmetry of the Wiener process, the proposition is evident⁷⁸.

Proposition 6.92 The augmented filtration generated by $\beta \triangleq \operatorname{sign}(w)\bullet w$ is the same as the augmented filtration generated by $|w|$.

Proof. Let $\mathcal{F}^{\beta}$ and $\mathcal{F}^{|w|}$ be the augmented filtrations generated by $\beta$ and by $|w|$. By (6.74), $L$ is adapted with respect to $\mathcal{F}^{\beta}$. By Tanaka's formula $|w|$ is $\mathcal{F}^{\beta}$-adapted. Hence $\mathcal{F}^{|w|} \subseteq \mathcal{F}^{\beta}$. On the other hand, for Wiener processes $L(a,t)$ is almost surely continuous in $a$, so by (6.68) and by the occupation times formula
\[
L(t) = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{-\varepsilon}^{\varepsilon} L(a,t,\omega)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_{\mathbb{R}} L(a,t,\omega)\,\chi\left((-\varepsilon,\varepsilon)\right)(a)\,da = \lim_{\varepsilon\searrow 0}\frac{1}{2\varepsilon}\int_0^t \chi\left(|w(s)| < \varepsilon\right)ds.
\]
Hence $L$ is $\mathcal{F}^{|w|}$-adapted. Therefore $\beta$ is $\mathcal{F}^{|w|}$-adapted, so $\mathcal{F}^{\beta} \subseteq \mathcal{F}^{|w|}$.

Proposition 6.93 If $L(a,\infty,\omega)$ denotes the limit $\lim_{t\to\infty} L(a,t,\omega)$, then for every $a$
\[
P\left(L(a,\infty) = \infty\right) = 1.
\]
⁷⁸ See Example 1.123, page 87 and Proposition B.7, page 564.
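Proposition 6.91 can be checked by simulation: by (6.74), $L(t)$ has the law of the running maximum of a Wiener process, i.e. of $|N(0,t)|$, whose mean at $t = 1$ is $\sqrt{2/\pi} \approx 0.798$. The sketch below (sample sizes are ad hoc choices) slightly underestimates the supremum because the discrete grid misses the true maximum.

```python
import numpy as np

# Monte Carlo check of Proposition 6.91 (discretization parameters ad hoc):
# sup_{s<=1} beta(s) should have the law of |N(0,1)|, with mean sqrt(2/pi).
rng = np.random.default_rng(3)
paths, n = 1000, 5000
dt = 1.0 / n
increments = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
# running maximum of each discretized path (max with 0, since beta(0) = 0)
sups = np.maximum(np.cumsum(increments, axis=1).max(axis=1), 0.0)
mean_sup = float(sups.mean())
print(mean_sup)   # theoretical value: sqrt(2/pi) ≈ 0.798
```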
Proof. By definition
\[
|w(t)-a| = |a| + \beta(t) + L(a,t),
\]
where $\beta \triangleq \operatorname{sign}(w-a)\bullet w$. By Lévy's theorem $\beta$ is a Wiener process. Again by Skorohod's lemma
\[
L(a,t) = \sup_{s\leq t}\left(\beta(s) + |a|\right)^-.
\]
Hence, as $\liminf_{t\to\infty}\beta(t) = -\infty$ almost surely, $P\left(L(a,\infty) = \infty\right) = 1$.

Finally we show that for Wiener processes the support of the measure generated by $t \mapsto L(t,\omega)$ is not only almost surely in the set $Z(\omega) \triangleq \{t : w(t,\omega) = 0\}$, but the two sets are almost surely equal.

Proposition 6.94 For almost all outcomes $\omega$ the set $Z(\omega)$ is closed and has empty interior.

Proof. The trajectories of Wiener processes are continuous, which immediately implies that $Z(\omega)$ is closed. We show that almost surely the Lebesgue measure of $Z(\omega)$ is zero. This will imply that $Z(\omega)$ does not contain a segment with positive length. By Fubini's theorem, using that for every $t > 0$ the value of a Wiener process has a non-degenerate Gaussian distribution, so $P(w(t) = 0) = 0$ for every $t > 0$,
\[
E\left(\lambda\left(Z(\omega)\right)\right) = E\left(\int_0^{\infty}\chi\left(Z(\omega)\right)(t)\,dt\right) = \int_0^{\infty} E\left(\chi\left(Z(\omega)\right)(t)\right)dt = 0,
\]
hence $\lambda\left(Z(\omega)\right) = 0$ almost surely.

Definition 6.95 If $w$ is a Wiener process then the intervals in the open set $Z^c(\omega) = \{|w(\omega)| > 0\}$ are called the excursion intervals of $w$.
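Both statements can be seen on a simulated path (the grid below is an ad hoc choice): the occupation measure of a shrinking band around $0$ tends to $0$, consistent with $\lambda(Z(\omega)) = 0$, while the path still changes sign many times, so the grid resolves many excursion intervals.

```python
import numpy as np

# Discretized illustration of Proposition 6.94 / Definition 6.95 (ad hoc grid).
rng = np.random.default_rng(11)
n = 200_000
w = np.cumsum(rng.normal(0.0, n ** -0.5, size=n))

# Lebesgue measure of {s <= 1 : |w(s)| < delta} for shrinking delta:
occ = [float(np.mean(np.abs(w) < d)) for d in (0.1, 0.01, 0.001)]
# number of sign changes ~ number of excursion intervals resolved by the grid
sign_changes = int(np.sum(np.sign(w[1:]) != np.sign(w[:-1])))
print(occ, sign_changes)
```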
For every $t$ let
\[
\sigma_t(\omega) \triangleq \inf\left\{s > 0 : L(s) \geq t\right\}, \qquad \rho_t(\omega) \triangleq \inf\left\{s > 0 : L(s) > t\right\}.
\]
$\sigma_t$ and $\rho_t$ are obviously stopping times. $[\sigma_t,\rho_t]$ is the largest closed interval where $L$ is constantly $t$. Let
\[
O(\omega) \triangleq \bigcup_t \left(\sigma_t(\omega),\rho_t(\omega)\right).
\]
$O(\omega)$ is an open set in $\mathbb{R}$, so by the structure of the open sets of the real line $O(\omega)$ is the union of at most countably many disjoint intervals. As $L$ is increasing, it is easy to see that if $t_1 \neq t_2$ then
\[
\left(\sigma_{t_1}(\omega),\rho_{t_1}(\omega)\right) \cap \left(\sigma_{t_2}(\omega),\rho_{t_2}(\omega)\right) = \emptyset.
\]
Hence $O(\omega)$ is the union of at most countably many of the intervals $\left(\sigma_t(\omega),\rho_t(\omega)\right)$. Obviously $O(\omega)$ is the union of the at most countably many intervals where $L$ is constant.

Proposition 6.96 If $w$ is a Wiener process and $L$ is the local time of $w$ at zero, then almost surely $O(\omega)$ is the union of the excursion intervals of $w$, that is,
\[
O(\omega) \overset{\text{a.s.}}{=} \left\{|w(\omega)| > 0\right\} = Z^c(\omega).
\]

Proof. The proof uses several interesting properties of Wiener processes.
1. Observe that with probability one the maxima of a Wiener process $\beta$ on any two disjoint compact intervals are different: if $a < b < c < d < \infty$, then by the definition of the conditional expectation, using the independence of the increments,
\[
P\left(\sup_{a\leq t\leq b}\beta(t) \neq \sup_{c\leq t\leq d}\beta(t)\right) = P\left(\sup_{a\leq t\leq b}\left(\beta(t)-\beta(b)\right) + \beta(b) \neq \sup_{c\leq t\leq d}\left(\beta(t)-\beta(c)\right) + \beta(c)\right) =
\]
\[
= \int_{\mathbb{R}}\int_{\mathbb{R}} P\left(\beta(c)-\beta(b) \neq x-y\right)dF(x)\,dG(y) = \int_{\mathbb{R}}\int_{\mathbb{R}} 1\,dF(x)\,dG(y) = 1.
\]
Unifying the measure-zero sets one can prove the same result for every interval with rational endpoints.
2. This implies that with probability one every local maximum of a Wiener process has a different value.
3. By Tanaka's formula
\[
|w| = L - \beta \tag{6.75}
\]
for some Wiener process $\beta$. Recall that by Skorohod's lemma⁷⁹, $L$ is the running maximum of $\beta$. This and (6.75) imply that $L$ is constant on any interval⁸⁰ where $|w| > 0$. As with probability one the local maxima of $\beta$ are different, on the flat segments of $L$, with probability one, $w$ is not zero. Hence the excursion intervals of $w$ and the flat parts of $L$ are almost surely equal.

Proposition 6.97 Let $w$ be a Wiener process. For almost all $\omega$ the following three sets are equal⁸¹:
1. the set of zeros of $w$;
2. the complement of $O(\omega)$;
3. the support of the measure generated by the local time $L(\omega)$.

Proof. Let $S(\omega)$ denote the support of the measure generated by $L(\omega)$. By definition $S(\omega)$ is the complement of the largest open set $G(\omega)$ with $L(G(\omega)) = 0$. $L$ is constant on the components of $O$, so $L(O) = 0$, that is, $O(\omega) \subseteq G(\omega)$. Hence
\[
S(\omega) \triangleq G^c(\omega) \subseteq O^c(\omega).
\]
Let $I$ be an open interval with $I \cap O(\omega) = \emptyset$. If $s_1 < s_2$ are in $I$, then $L(s_1,\omega) = L(s_2,\omega)$ is impossible, so the measure of $I$ with respect to $L(\omega)$ is positive. Hence $O(\omega)$ is the maximal open set with zero measure, that is, $O(\omega) = G(\omega)$. Hence the equivalence of the last two sets is evident. By the previous proposition
\[
Z^c(\omega) \overset{\text{a.s.}}{=} O(\omega) = S^c(\omega),
\]
so $Z(\omega) = S(\omega)$.

6.5.6 Ray–Knight theorem

Let $b$ be an arbitrary number and let $\tau_b$ be the hitting time of $b$. On $[0,b]$ one can define the process
\[
Z(a,\omega) \triangleq L\left(b-a,\tau_b(\omega),\omega\right), \qquad a \in [0,b]. \tag{6.76}
\]
⁷⁹ See Proposition 6.91, page 447.
⁸⁰ See Proposition 6.97, page 450.
⁸¹ See Example 7.43, page 494.
If $a > 0$ then $Z(a)$ has an exponential distribution⁸² with parameter $\lambda \triangleq 1/(2a)$. In this subsection we try to find some deep reason for this surprising result. Let us first prove some lemmas.

Lemma 6.98 Let $\mathcal{Z} \triangleq (\mathcal{Z}_a)$ be the filtration generated by (6.76). If $\xi \in L^2\left(\Omega,\mathcal{Z}_a,P\right)$, then $\xi$ has the following representation:
\[
\xi = E(\xi) + \int_0^{\infty} H\cdot\chi\left(b \geq w > b-a\right)dw. \tag{6.77}
\]
In the representation $H$ is a predictable process and
\[
E\left(\int_0^{\infty} H^2\,\chi\left(b \geq w > b-a\right)d[w]\right) < \infty.
\]

Proof. Let us emphasize that the predictability of $H$ means that $H$ is predictable with respect to the filtration $\mathcal{F}$ generated by the underlying Wiener process.
1. Let $U$ be the set of random variables $\xi$ with representation (6.77). $\chi\left(b \geq w > b-a\right)$ is a left-regular process, so the processes
\[
U \triangleq H\cdot\chi\left(b \geq w > b-a\right), \qquad H \in L^2(w)
\]
form a closed subset of $L^2(w)$. From Itô's isometry it is clear that the random variables satisfying (6.77) form a closed subset of $L^2\left(\Omega,\mathcal{F}_{\infty},P\right)$. Obviously $\mathcal{Z}_a \subseteq \mathcal{F}_{\infty}$, and so the set of variables with the given property is a closed subspace of $L^2\left(\Omega,\mathcal{Z}_a,P\right)$.
2. Let
\[
\eta_g \triangleq \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right), \qquad g \in C_c^1\left([0,a]\right),
\]
where $C_c^1([0,a])$ denotes the set of continuously differentiable functions which are zero outside $[0,a]$. $Z$ is continuous, so the $\sigma$-algebra generated by the variables $\eta_g$ is equal to $\mathcal{Z}_a$. Let
\[
U(t) \triangleq \exp\left(-\int_0^t g\left(b-w(s)\right)ds\right) \triangleq \exp\left(-K(t)\right).
\]
⁸² See Example 6.73, page 430.
$g$ is bounded, so $U$ is bounded. By the Occupation Times Formula
\[
\eta_g \triangleq \exp\left(-\int_0^a g(s)\,Z(s)\,ds\right) = \exp\left(-\int_0^a g(s)\,L\left(b-s,\tau_b\right)ds\right) = \exp\left(-\int_{b-a}^b g(b-v)\,L\left(v,\tau_b\right)dv\right) =
\]
\[
= \exp\left(-\int_{\mathbb{R}} g(b-v)\,L\left(v,\tau_b\right)dv\right) = \exp\left(-\int_0^{\tau_b} g\left(b-w(v)\right)dv\right) = U\left(\tau_b\right).
\]
Let $f \in C^2$ and
\[
M \triangleq f(w)\exp(-K) \triangleq f(w)\,U.
\]
$K$ is continuously differentiable, so it has finite variation, so by Itô's formula
\[
M - M(0) = f'(w)U\bullet w - f(w)U\bullet K + \frac12 Uf''(w)\bullet[w].
\]
Let $f$ be zero on $(-\infty,b-a]$, $f(b) = 1$ and $f''(x) = 2g(b-x)f(x)$. The third integral is
\[
\frac12 Uf''(w)\bullet[w] = Ug(b-w)f(w)\bullet[w] = Uf(w)\bullet K,
\]
hence the second and the third integrals are the same. Hence
\[
M - M(0) = f'(w)U\bullet w.
\]
As $f'(x) = f'(x)\chi\left(x > b-a\right)$,
\[
\eta_g = U\left(\tau_b\right) = \frac{M\left(\tau_b\right)}{f\left(w(\tau_b)\right)} = \frac{M\left(\tau_b\right)}{f(b)} = M\left(\tau_b\right) = M(0) + \int_0^{\tau_b} U(s)f'\left(w(s)\right)dw(s) =
\]
\[
= M(0) + \int_0^{\tau_b} U(s)f'\left(w(s)\right)\chi\left(w(s) > b-a\right)dw(s) \triangleq E\left(\eta_g\right) + \int_0^{\tau_b} H\chi\left(w > b-a\right)dw.
\]
So for $\eta_g$ the representation (6.77) is valid. As $\eta_g$ generates $\mathcal{Z}_a$, and the set of variables for which (6.77) is valid is closed, the lemma holds.

Lemma 6.99 If the filtration is given by $\mathcal{Z}$, then $Z(a) - 2a$ is a continuous martingale on $[0,b]$.

Proof. Obviously $Z(a) - 2a$ is continuous in $a$. By Tanaka's formula
\[
\left(w(t) - (b-a)\right)^+ = \int_0^t \chi\left(w(s) > b-a\right)dw(s) + \frac12 L\left(b-a,t\right).
\]
If $t = \tau_b$, then
\[
Z(a) - 2a \triangleq L\left(b-a,\tau_b\right) - 2a = -2\int_0^{\tau_b}\chi\left(w(s) > b-a\right)dw(s) = -2\int_0^{\infty}\chi\left(b \geq w(s) > b-a\right)dw(s).
\]
From this $Z(a)$ is integrable and its expected value is $2a$. If $u < v$, then for every $\mathcal{Z}_u$-measurable bounded variable $\xi$, by the previous lemma and by Itô's isometry,
\[
E\left(\left(Z(v)-2v\right)\xi\right) = -2E\left(\int_0^{\infty}\chi\left(b \geq w > b-v\right)dw\int_0^{\infty} H\chi\left(b \geq w > b-u\right)dw\right) =
\]
\[
= -2E\left(\int_0^{\infty}\chi\left(b \geq w(s) > b-v\right)H\chi\left(b \geq w(s) > b-u\right)ds\right) = -2E\left(\int_0^{\infty} H\chi\left(b \geq w(s) > b-u\right)ds\right) = E\left(\left(Z(u)-2u\right)\xi\right).
\]
Hence $Z(a) - 2a$ is a martingale.

Lemma 6.100 If $X$ is a continuous local martingale and $\sigma \geq 0$ is a random variable, then the quadratic variation of the stochastic process $L_{\sigma}(a,\omega) \triangleq L\left(a,\sigma(\omega),\omega\right)$ is finite. If $u < v$, then the quadratic variation of $L_{\sigma}$ on the interval $[u,v]$ is
\[
[L_{\sigma}]_u^v \overset{\text{a.s.}}{=} 4\int_u^v L(a,\sigma)\,da.
\]

Proof. Of course, by definition, the random variable $\xi$ is the quadratic variation of $L_{\sigma}$ on the interval $[u,v]$ if for an arbitrary infinitesimal partition $\left(a_k^{(n)}\right)_{k,n}$ of $[u,v]$, as $n \to \infty$,
\[
\sum_k\left(L_{\sigma}\left(a_k^{(n)}\right) - L_{\sigma}\left(a_{k-1}^{(n)}\right)\right)^2 \overset{P}{\to} \xi.
\]
1. Let us fix $t$. Let
\[
\widetilde{X}(a) \triangleq \int_0^t \operatorname{sign}\left(X(s)-a\right)dX(s).
\]
By the definition of local times
\[
L(a,t) = |X(t)-a| - |X(0)-a| - \widetilde{X}(a,t).
\]
Let us remark that if $f$ is a continuous and $g$ is a Lipschitz continuous function, then
\[
\left|[f,g]\right| \leq \limsup_{n\to\infty}\ \max_k\left|f\left(a_k^{(n)}\right) - f\left(a_{k-1}^{(n)}\right)\right|\sum_k\left|g\left(a_k^{(n)}\right) - g\left(a_{k-1}^{(n)}\right)\right| \leq \limsup_{n\to\infty}\ \max_k\left|f\left(a_k^{(n)}\right) - f\left(a_{k-1}^{(n)}\right)\right|\sum_k K\left|a_k^{(n)} - a_{k-1}^{(n)}\right| = 0.
\]
The process
\[
F_{\sigma}(a) \triangleq |X(\sigma)-a| - |X(0)-a|
\]
is obviously Lipschitz continuous in the parameter $a$. $X$ is a continuous local martingale, so $\widetilde{X}$ is continuous⁸³ in $a$, so for every outcome
\[
\left[F_{\sigma} + \widetilde{X}_{\sigma}, F_{\sigma}\right] = 0 \qquad\text{and}\qquad \left[F_{\sigma}\right] = 0.
\]
Therefore
\[
\left[L_{\sigma}\right] = \left[F_{\sigma} + \widetilde{X}_{\sigma}\right] = \left[\widetilde{X}_{\sigma}\right].
\]
2. By Itô's formula
\[
\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)^2 = 2\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)\bullet\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right) + \left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right].
\]
⁸³ See Proposition 6.80, page 439.
By the Occupation Times Formula, for every $t$, almost surely,
\[
\left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right] = \left[\left(\operatorname{sign}\left(X - a_k^{(n)}\right) - \operatorname{sign}\left(X - a_{k-1}^{(n)}\right)\right)\bullet X\right] = \left[-2\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet X\right] =
\]
\[
= 4\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet[X] = 4\int_{a_{k-1}^{(n)}}^{a_k^{(n)}} L(a)\,da.
\]
Hence almost surely
\[
\sum_k\left[\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right](\sigma) = 4\int_u^v L(a,\sigma)\,da = 4\int_u^v L_{\sigma}(a)\,da.
\]
3. Finally we should calculate the limit of the sum of the first terms. The sum of the stochastic integrals is
\[
-2\sum_k\left(\widetilde{X}\left(a_k^{(n)}\right) - \widetilde{X}\left(a_{k-1}^{(n)}\right)\right)\chi\left(a_{k-1}^{(n)} < X \leq a_k^{(n)}\right)\bullet X.
\]
As $\widetilde{X}$ is continuous, if $n \to \infty$ the integrand goes to zero. The integrand is locally bounded, so the stochastic integral goes to zero uniformly on compact intervals in probability.

Theorem 6.101 (Ray–Knight) There is a Wiener process $\beta$ with respect to the filtration $\mathcal{Z}$ such that $Z(a) \triangleq L\left(b-a,\tau_b\right)$ satisfies the equation
\[
Z(a) - 2a = 2\int_0^a \sqrt{Z}\,d\beta, \qquad a \in [0,b]. \tag{6.78}
\]

Proof. $L(u,t)$ is positive for every $t > 0$, so $Z(a) > 0$. The quadratic variation of $Z(a) - 2a$ is $4\int_0^a Z(s)\,ds$. By Doob's representation theorem⁸⁴ there is a Wiener process $\beta$ with respect to the filtration generated by $Z$ for which (6.78) is valid. $Z(a)$ is a continuous semimartingale. By Itô's formula
\[
\exp\left(-sZ(a)\right) - 1 = \int_0^a \exp(-sZ)\,d(-sZ) + \frac12\int_0^a \exp(-sZ)\,d[-sZ].
\]
$Y(u) \triangleq Z(u) - 2u$ is a martingale and, as $Z \geq 0$, $\exp(-sZ) \leq 1$, so
\[
E\left(\int_0^a \left(\exp(-sZ)\right)^2 d[-sZ]\right) \leq E\left(\int_0^a d[-sZ]\right) = 4s^2 E\left(\int_0^a Z(u)\,du\right) = 4s^2\int_0^a E\left(Z(u)\right)du = 8s^2\int_0^a u\,du < \infty.
\]
Hence the integral
\[
\int_0^a \exp\left(-sZ(u)\right)d\left(-s\left(Z(u)-2u\right)\right)
\]
is a martingale. Let
\[
L(a,s) \triangleq E\left(\exp\left(-sZ(a)\right)\right).
\]
Taking expected values on both sides of Itô's formula and using the martingale property of the above integral,
\[
L(a,s) - 1 = E\left(\int_0^a \exp\left(-sZ(u)\right)d(-2su)\right) + \frac12 E\left(\int_0^a \exp(-sZ)\,d[-sZ]\right).
\]
Let us calculate the second integral. Using (6.78),
\[
\frac12 E\left(\int_0^a \exp(-sZ)\,d[-sZ]\right) = 2s^2 E\left(\int_0^a \exp\left(-sZ(u)\right)Z(u)\,du\right) = 2s^2\int_0^a E\left(\exp\left(-sZ(u)\right)Z(u)\right)du = -2s^2\int_0^a E\left(\frac{d}{ds}\exp\left(-sZ(u)\right)\right)du.
\]
Changing the expected value and differentiating by $a$,
\[
\frac{\partial L}{\partial a} = -2sL(a,s) - 2s^2 E\left(\frac{d}{ds}\exp\left(-sZ(a)\right)\right).
\]
For Laplace transforms one can interchange the differentiation and the integration, so
\[
\frac{\partial L}{\partial a} = -2sL(a,s) - 2s^2\frac{\partial L}{\partial s}, \qquad L(a,0) = 1.
\]
⁸⁴ See Proposition 6.18, page 373.
457
With direct calculation one can easily verify that L (a, s) =
1 1 + 2sa
satisfies the equation. The Laplace transform L (a, s) is necessarily analytic so by the theorem of Cauchy and Kovalevskaja 1/ (1 + 2sa) is the unique solution of the equation. This implies that Z (a) has an exponential distribution with parameter λ = 1/ (2a). 6.5.7
Theorem of Dvoretzky Erd˝ os and Kakutani
First let us introduce some definitions: Definition 6.102 Let f be a real valued function on an interval I ⊆ R. 1. We say that t is a point of increase of f if there is a δ > 0 such that f (s) ≤ f (t) ≤ f (u) whenever s, u ∈ I ∩ (−δ + t, t + δ) and s < t < u. 2. We say that t is a point of strict increase of f if there is a δ > 0 such that f (s) < f (t) < f (u) whenever s, u ∈ I ∩ (−δ + t, t + δ) and s < t < u. A striking feature of Wiener processes is the following observation: Theorem 6.103 (Dvoretzky–Erd˝ os–Kakutani) Almost surely the trajectories of Wiener processes do not have a point of increase. Proof. Let w be a Wiener process. 1. One should show that P ({ω : w (ω) has a point of increase}) = 0. Obviously sufficient to prove that for an arbitrary v > 0 P ({ω : w (ω) has a point of increase in [0, v]}) = 0. By Girsanov’s theorem there is a probability measure P ∼ Q on (Ω, Fv ) such that w (t) w (t) + t is a Wiener process on [0, v] under Q. Every point of increase of w is a strict point of increase of w. Therefore it is sufficient to prove that P ({ω : w (ω) has a point of strict increase in [0, v]}) = 0.
458
ˆ FORMULA ITO’s
Of course this is the same as P ({ω : w (ω) has a point of strict increase}) = 0. To prove this it is sufficient to show that P (Ωp,q ) = 0 for every rational numbers p and q where Ωp,q
ω : ∃t such that w (s, ω) < w (t, ω) < w (u, ω) , for every s, u ∈ (p, q) , s < t < u
.
Using the strong Markov property of w one can assume that p = 0. 2. Let L be the local time of w. We show that for every b almost surely Z (a) L (b − a, τ b (ω) , ω) > 0,
∀a ∈ (0, b] .
As we know85 if a > 0 then Z (a) has an exponential distribution so it is almost surely positive for every fixed a ∈ (0, b]. Z (a) is continuous so if Ωn is the set of outcomes ω for which Z (a, ω) ≥ 1/n for every rational a then Z (a, ω) ≥ 1/n for every a ∈ (0, b]. If Ω ∪n Ωn then P (Ω ) = 1 and if ω ∈ Ω then Z (a, ω) > 0 for every a ∈ (0, b]. 3. Now it is obvious that there is an Ω∗ with P (Ω∗ ) = 1 that whenever ω ∈ Ω∗ then a. L (a, t, ω) is continuous in (a, t); b. the support of L (a, ω) is {w (ω) = a} for every rational number a; c. Z (a) L (b − a, τ b (ω) , ω) > 0 whenever 0 < a ≤ b for every rational number b. 4. Let ω ∈ Ω∗ and let ω ∈ Ωp,q = Ω0,q . This means that for some t w (s, ω) < w (t, ω) < w (u, ω) ,
0 ≤ s < t < u ≤ q.
(6.79)
Let us fix a rational number w (t, ω) < b < w (q, ω). Let (bn ) be a sequence of rational numbers for which bn w (t, ω). As w (t, ω) < b and b is rational by c. L (w (t, ω) , τ b (ω) , ω) = L (b − (b − w (t, ω)) , τ b (ω) , ω) > 0. L is continuous so the measure of every single point is zero so by b. Obviously L (bn , τ bn , ω) = 0. So L (w (t, ω) , τ b (ω) , ω) = L (w (t, ω) , τ b (ω) , ω) − L (bn , τ b (ω) , ω) + + L (bn , τ b (ω) , ω) − L (bn , t, ω) + + L (bn , t, ω) − L (bn , τ bn , ω) . 85 See:
Example 6.73, page 430.
By the construction, as t is a point of increase,

b_n < w(t, ω) < w(a, ω) < b,  a ∈ (t, τ_b).

By b. the support of the measure generated by L(b_n, ω) is {w(ω) = b_n}. Hence the second line in the above estimation is zero. t is a point of increase, so by (6.79) if n → ∞ then τ_{b_n} → t. Therefore, using a.,

0 < L(w(t, ω), τ_b(ω), ω) = L(w(t, ω), τ_b(ω), ω) − L(w(t, ω), τ_b(ω), ω)
 + L(w(t, ω), t, ω) − L(w(t, ω), t, ω) = 0.

This is a contradiction, so if ω ∈ Ω* then ω ∉ Ω_{p,q}. Hence P(Ω_{p,q}) = 0.
7
PROCESSES WITH INDEPENDENT INCREMENTS

In this chapter we discuss the classical theory of processes with independent increments. In the first section we return to the theory of Lévy processes. The increments of Lévy processes are not only independent but also stationary. Lévy processes are semimartingales, but the same is not true for all processes with independent increments. In the second part of the chapter we present the generalization of the Lévy–Khintchine formula to processes with merely independent increments. The main difference between the theory of Lévy processes and the more general theory of processes with independent increments is that every Lévy process is continuous in probability, while this property does not hold for the more general class. This implies that a process with independent increments can jump at a fixed moment of time with positive probability.
7.1
Lévy processes
In this section we briefly return to the theory of Lévy processes. The theory of Lévy processes is much simpler than the more general theory of processes with independent increments. Recall that Lévy processes have stationary and independent increments. The main consequence of these assumptions is that if ϕ_t(u) denotes the Fourier transform of X(t), then for every u

ϕ_{t+s}(u) = ϕ_t(u)ϕ_s(u),  (7.1)

so for every u the function t → ϕ_t(u) satisfies Cauchy's functional equation (see line (1.40), page 62). As the Fourier transforms of distributions are always bounded, the solutions of equation (7.1) have the form

ϕ_t(u) = exp(tφ(u)),  (7.2)

for some φ. One of our main goals is to find the proper form of φ(u); this is the famous Lévy–Khintchine formula. Representation (7.2) has two very important consequences:
1. ϕ_t(u) ≠ 0 for every u and t,
2. ϕ_t(u) is continuous in t.
As ϕ_t is continuous in t, if t_n ↑ t then ϕ_{t_n}(u) → ϕ_t(u) for every u. Hence X(t_n) − X(t) → 0 weakly, that is, X(t_n) − X(t) → 0 in probability. Hence for some subsequence X(t_{n_k}) → X(t) almost surely. As the trajectories have left limits, X(t−) = X(t) almost surely. Hence if X is a Lévy process then it is continuous in probability and, as a consequence of this continuity, for every moment of time t the probability of a jump at t is zero, that is, P(∆X(t) ≠ 0) = 0 for every t. As ϕ_t(u) ≠ 0 for every u, one can define the exponential martingale

Z_t(u, ω) := exp(iuX(t, ω)) / ϕ_t(u).  (7.3)
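The functional equation (7.1) and the representation (7.2) can be checked numerically. The sketch below is only an illustration, not from the text: it takes a Brownian motion as the Lévy process (so that φ(u) = −u²/2) and estimates the Fourier transforms by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
u, t, s, n = 1.2, 0.5, 0.8, 400_000

# Brownian motion as a Levy process: X(t) ~ N(0, t), with independent,
# stationary increments; here phi(u) = -u^2/2 in representation (7.2).
x_t = rng.normal(0.0, np.sqrt(t), n)        # X(t)
inc = rng.normal(0.0, np.sqrt(s), n)        # X(t+s) - X(t), independent of X(t)

phi_t = np.exp(1j * u * x_t).mean()         # estimate of phi_t(u)
phi_s = np.exp(1j * u * inc).mean()         # estimate of phi_s(u)
phi_ts = np.exp(1j * u * (x_t + inc)).mean()

# Cauchy functional equation (7.1): phi_{t+s}(u) = phi_t(u) * phi_s(u)
assert abs(phi_ts - phi_t * phi_s) < 0.01
# Exponential form (7.2): phi_t(u) = exp(t * phi(u)) with phi(u) = -u^2/2
assert abs(phi_ts - np.exp(-(t + s) * u**2 / 2)) < 0.01
```

The factorization of the empirical characteristic function is exactly the independence and stationarity of the increments at work.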
Recall that, applying the Optional Sampling Theorem to (7.3), one can prove that every Lévy process is a strong Markov process (see Proposition 1.109, page 70).

7.1.1
Poisson processes
Let us recall that a Lévy process X is a Poisson process if its trajectories are increasing and the image of the trajectories is almost surely the set of integers {0, 1, 2, . . .}. One should emphasize that all the non-negative integers have to be in the image of the trajectories, so Poisson processes do not have jumps larger than one. To put it another way: Poisson processes are the Lévy-type counting processes.

Definition 7.1 A process is a counting process if its image space is the set of integers {0, 1, . . .}. X is a Poisson process with respect to a filtration F if it is a counting Lévy process with respect to the filtration F.

Since the values of the process are integers and the trajectories are right-regular, there is always a positive amount of time between the jumps. That is, if X(t, ω) = k then X(t + u, ω) = k whenever 0 ≤ u ≤ δ for some δ(t, ω) > 0. As the trajectories are defined for every t ≥ 0 and the values of the trajectories are finite at every t, the jumps of the process cannot accumulate. Let

τ₁(ω) := inf{t : X(t, ω) = 1} = inf{t : X(t, ω) > 0} < ∞.
τ₁ is obviously a stopping time. We show that τ₁ is exponentially distributed: if u, v ≥ 0 then

P(τ₁ > u + v) = P(X(u + v) = 0)
 = P(X(u) = 0, X(u + v) − X(u) = 0)
 = P(X(u) = 0) · P(X(u + v) − X(u) = 0)
 = P(X(u) = 0) · P(X(v) = 0),

hence if f(t) := P(τ₁ > t) then

f(u + v) = f(u) · f(v),  u, v ≥ 0.

f ≡ 0 and f ≡ 1 cannot be solutions, as X cannot be a non-trivial Lévy process: if f ≡ 1 then τ₁ = ∞, hence X ≡ 0, and the image of the trajectories would be {0} alone and not the set of integers. So for some 0 < λ < ∞

P(τ₁ > t) = P(X(t) = 0) = exp(−λt).

By the strong Markov property of Lévy processes (see Proposition 1.109, page 70) the distribution of X₁*(t) := X(τ₁ + t) − X(τ₁) is the same as the distribution of X(t), so if

τ₂(ω) := inf{t : X(t + τ₁(ω), ω) = 2} = inf{t : X₁*(t, ω) > 0} < ∞,

then τ₁ and τ₂ are independent and they have the same distribution. (Recall that τ₁ is F_{τ₁}-measurable and by the strong Markov property X₁* is independent of F_{τ₁}.)

Proposition 7.2 If λ denotes the common parameter, then for every t ≥ 0

P(Σ_{k=1}^{n+1} τ_k > t ≥ Σ_{k=1}^n τ_k) = P(X(t) = n) = (λt)^n/n! · exp(−λt).
Proof. Recall that a non-negative variable has gamma distribution Γ(a, λ) if the density function of the distribution is

f_{a,λ}(x) := λ^a x^{a−1} exp(−λx)/Γ(a),  x > 0.

First we show that if the ξ_i are independent random variables with distribution Γ(a_i, λ), then the distribution of Σ_{i=1}^n ξ_i is Γ(Σ_{i=1}^n a_i, λ). It is sufficient to
show the calculation for two variables. If the distribution of ξ₁ is Γ(a, λ) and the distribution of ξ₂ is Γ(b, λ), and if they are independent, then the density function of ξ₁ + ξ₂ is the convolution of the density functions of ξ₁ and ξ₂:

h(x) := ∫_{−∞}^∞ f_{a,λ}(x − t) f_{b,λ}(t) dt
 = ∫₀^x [λ^a (x − t)^{a−1}/Γ(a)] exp(−λ(x − t)) · [λ^b t^{b−1}/Γ(b)] exp(−λt) dt
 = [λ^{a+b} exp(−λx)/(Γ(a)Γ(b))] ∫₀^x (x − t)^{a−1} t^{b−1} dt
 = [λ^{a+b} exp(−λx)/(Γ(a)Γ(b))] ∫₀^1 (x − xz)^{a−1} (xz)^{b−1} x dz
 = [λ^{a+b} exp(−λx) x^{a+b−1}/(Γ(a)Γ(b))] ∫₀^1 (1 − z)^{a−1} z^{b−1} dz
 = λ^{a+b} exp(−λx) x^{a+b−1}/Γ(a + b).

Hence the distribution of ξ₁ + ξ₂ is Γ(a + b, λ). The density function of Γ(1, λ) is

[λ¹/Γ(1)] x^{1−1} exp(−λx) = λ exp(−λx),  x > 0,

so Γ(1, λ) is the exponential distribution with parameter λ. If σ_m := Σ_{k=1}^m τ_k, then σ_m has gamma distribution Γ(m, λ). So

P(X(t) < n + 1) = P(σ_{n+1} > t) = ∫_t^∞ [λ^{n+1} x^n/Γ(n + 1)] exp(−λx) dx
 = [−(λx)^n exp(−λx)/Γ(n + 1)]_t^∞ + ∫_t^∞ n [λ^n x^{n−1}/Γ(n + 1)] exp(−λx) dx
 = (λt)^n/n! · exp(−λt) + P(X(t) < n).

Hence

P(X(t) = n) = P(X(t) < n + 1) − P(X(t) < n) = (λt)^n/n! · exp(−λt).
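Proposition 7.2 is easy to test by simulation. The sketch below is only an illustration (the rate and time point are arbitrary choices): it builds the jump times σ_m as partial sums of independent exponentials and compares P(σ_n ≤ t < σ_{n+1}) with the Poisson probability.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
lam, t, n_jumps = 2.0, 1.0, 3      # check P(X(t) = 3) for a rate-2 process
n_paths = 300_000

# tau_k ~ Exp(lam) i.i.d.; sigma_m = tau_1 + ... + tau_m ~ Gamma(m, lam)
waits = rng.exponential(scale=1.0 / lam, size=(n_paths, n_jumps + 1))
sigma = np.cumsum(waits, axis=1)

# {X(t) = n} = {sigma_n <= t < sigma_{n+1}}
p_mc = ((sigma[:, n_jumps - 1] <= t) & (sigma[:, n_jumps] > t)).mean()
p_exact = (lam * t) ** n_jumps / math.factorial(n_jumps) * math.exp(-lam * t)
assert abs(p_mc - p_exact) < 5e-3
```

The event {σ_n ≤ t < σ_{n+1}} is exactly the event {X(t) = n} of the proposition.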
7.1.2
Compound Poisson processes generated by the jumps
Let X now be a Lévy process and let Λ be a Borel measurable set. Let

τ₁(ω) := inf{t : ∆X(t, ω) ∈ Λ}.

Since (Ω, A, P, F) satisfies the usual conditions, τ₁ is a stopping time (see Corollary 1.29, page 16, and Example 1.32, page 17). As τ₁ is measurable,

P(τ₁ > t) = P(∆X(u) ∉ Λ, ∀u ∈ [0, t])

is meaningful. Assume that the closure of Λ, denoted by cl(Λ), does not contain the point 0, that is, Λ is in the complement of a ball with some positive radius r > 0. As X is right-continuous and X(0) = 0, obviously 0 < τ₁ ≤ ∞. In a similar way as in the previous subsection, using that the jumps in Λ cannot accumulate,

P(τ₁ > t₁ + t₂) = P(∆X(u) ∉ Λ, u ∈ (0, t₁ + t₂])
 = P(∆X(u) ∉ Λ, u ∈ (0, t₁]) · P(∆X(u) ∉ Λ, u ∈ (t₁, t₁ + t₂])
 = P(∆X(u) ∉ Λ, u ∈ (0, t₁]) · P(∆X(u) ∉ Λ, u ∈ (0, t₂])
 = P(τ₁ > t₁) · P(τ₁ > t₂).

So τ₁ has an exponential distribution. Let us observe that now we cannot guarantee that λ > 0, as τ₁ ≡ ∞ is possible. Let us assume that τ₁ < ∞. Let X*(t) := X(τ₁ + t) − X(τ₁) and let

τ₂ := inf{t : ∆X*(t) ∈ Λ},

etc. If τ₁ < ∞ then τ_k < ∞ for all k. Let σ_n := Σ_{k=1}^n τ_k. As 0 ∉ cl(Λ), all the jumps in Λ are larger than some r > 0, and as X has limits from the left, the almost surely strictly increasing sequence (σ_n) almost surely cannot have a finite accumulation point. (The trajectories are only almost surely nice: for example, N(ω) ≡ 0 occurs with probability zero.) So almost surely σ_n ↗ ∞. As on every trajectory the number of jumps is at most countable, one can define the
465
process N Λ which counts the jumps of X with ∆X ∈ Λ. N Λ (t)
χΛ (∆X (s)) =
∞
χ {σ n ≤ t} .
(7.4)
n=1
0<s≤t
N Λ (t) − N Λ (s) is the number of jumps in Λ during the time interval (s, t] so it is evidently measurable with respect to the σ-algebra generated by the increments of X. Hence10 N Λ (t) − N Λ (s) is independent of the σ-algebra Fs . So N Λ has independent increments. It is also easy to prove that the distribution of N Λ (t)−N Λ (s) is the same as the distribution of N Λ (t − s). It is trivial from the definition that N Λ is a right-regular counting process. Hence N Λ is a counting L´evy process. Therefore we have proved the following: Lemma 7.3 If 0 ∈ / cl (Λ) then N Λ is a Poisson process. Definition 7.4 A stopping time σ is a jump time of a process X if ∆X (σ) = 0 almost surely. Example 7.5 The jump times of L´evy processes are totally inaccessible.
Let τ be a predictable stopping time and let P (∆X (τ ) = 0) > 0. We can assume that P (|∆X (τ )| ≥ ε) > 0 for some ε > 0. If Λ {|x| ≥ ε} and if (σ n ) are the stopping times of the Poisson process N Λ then P (σ n = τ ) > 0 for some n. But this is impossible as σ n is totally inaccessible11 for every n. Therefore if τ is predictable then P (∆X (τ ) = 0) = 0. With N Λ one can define the process J Λ (t, ω)
∆X (s, ω) χΛ (∆X (s, ω)) =
(7.5)
0<s≤t N Λ (t)
=
n=1
∆X (σ n ) =
∞
∆X (σ n ) χ {σ n ≤ t} .
n=1
Lemma 7.6 If 0 ∈ / cl (Λ) then J Λ is a compound Poisson process that is: 1. J Λ (0) = 0. 2. J Λ has countable many jumps. 3. After every jump J Λ has an exponentially distributed waiting time. After this waiting time J Λ jumps again. The time between the jumps are independent and they have the same distribution. 10 See: 11 See:
Proposition 1.97, page 61. Example 3.7, page 183.
466
PROCESSES WITH INDEPENDENT INCREMENTS
4. The sizes of the jumps are independent of the waiting times up to the jumps. 5. The sizes of the jumps have the same distribution and they are independent random variables. Proof. If η n ∆X (σ n ) then by the strong Markov property the variables (η n ) are independent and they have the same distribution. One need only prove that (σ n ) and (η n ) are independent. Let τ n σ n − σ n−1 . 1. If s > t, then (t) (t) {η 1 < a, σ 1 > s} = {σ 1 > t} ∩ η 1 < a, σ 1 > s − t , where η 1 and σ 1 are the size and the time of the first jump of X ∗ (u) = X (u + t) − X (t). As σ 1 is a stopping time {σ 1 > t} ∈ Ft . Hence by the strong Markov property {σ 1 > t} is independent of (t)
(t)
(t) (t) η 1 < a, σ 1 > s − t . Hence again by the strong Markov property
(t) (t) P (η 1 < a, σ 1 > s) = P {σ 1 > t} ∩ η 1 < a, σ 1 > s − t =
(t) (t) = P (σ 1 > t) P η 1 < a, σ 1 > s − t = = P (σ 1 > t) P (η 1 < a, σ 1 > s − t) . If s t then using that 0 ∈ / cl (Λ) and therefore P (σ 1 > 0) = 1, P (η 1 < a, σ 1 > t) = P (σ 1 > t) P (η 1 < a, σ 1 > 0) = = P (σ 1 > t) · P (η 1 < a) . Hence σ 1 τ 1 and η 1 are independent. In a similar way, using the strong Markov property again one can prove that τ n is independent of η n for every n. 3. By the strong Markov property (η n , τ n ) is independent of Fσn−1 . Hence
E exp i
= E E exp i
N
um η m + i
m=1 N m=1
um η m + i
N
vn τ n
n=1 N n=1
vn τ n
=
| FσN −1
=
´ LEVY PROCESSES
= E exp i
N −1
= E exp i
N −1
um η m + i
m=1 N −1
vn τ n
E exp (iuN η N + ivN τ N ) | FσN −1
n=1
m=1
= E exp i
um η m + i
N −1
N −1
um η m + i
m=1
=
vn τ n
n=1 N −1
467
· E (exp (iuN η N + ivN τ N )) =
vn σ n
· E (exp (iuN η N )) · E (exp (ivN τ N )) =
n=1
= ··· =
N !
E (exp (ium η m ))
m=1
N !
E (exp (ivm τ m )) .
m=1
This implies12 that the σ-algebras generated by (η m ) and (τ n ) are independent. Hence (η m ) and (σ n ) are also independent. Lemma 7.7 The Fourier transform of J Λ (s) is
(exp (iux) − 1) dF (x) E exp iu · J Λ (s) = exp λs R
where λ is the parameter of the Poisson part and F is the common distribution function of the jumps. Proof. Let G be the distribution function of N Λ (s). N Λ (s) ϕ (u) E exp iu · ∆X (σ k ) = = R
k=1
N Λ (s)
E exp iu ·
∆X (σ k ) | N Λ (s) = n dG (n) .
k=1
N Λ (s) has a Poisson distribution. As N Λ (s) and the variables (∆X (σ k )) are independent one can substitute and drop the condition N Λ (s) = k: ∞ n n (λs) exp (−λs) = ϕ (u) = E exp iu · ∆X (σ k ) n! n=0 k=1 n ∞ n (λs) exp (−λs) = = exp (iux) dF (x) n! R n=0 = exp λs (exp (iux) − 1) dF (x) . R
12 See:
Lemma 1.96, page 60.
468
PROCESSES WITH INDEPENDENT INCREMENTS
Lemma 7.8 If X is a L´evy process with respect to some filtration F and 0 ∈ / cl (Λ) then J Λ and X − J Λ are also L´evy processes with respect to F. Proof. First recall13 that if X is a L´evy process then the σ-algebra Gt generated by the increments X (u) − X (v) ,
u≥v≥t
is independent of Ft for all t. Observe that for all t increments of J Λ and X−J Λ of this type are Gt -measurable. So these processes have independent increment with respect to F. From the strong Markov property it is clear that the increments of these processes are stationary. As J Λ obviously has right-regular trajectories the processes in the lemma are L´evy processes as well. Lemma 7.9 If X is a L´evy process, Λ is a Borel measurable set and 0 ∈ / cl (Λ)
then the variables J Λ (t) and X − J Λ (t) are independent for every t ≥ 0. Proof. Let us fix a t. To prove the independence of the variables J Λ (t) and X (t) − J Λ (t) it is sufficient to prove14 that
'
& ϕ (u, v) E exp i u · J Λ (t) + v · X (t) − J Λ (t) =
= E exp iu · J Λ (t) · E exp iv · X (t) − J Λ (t) .
(7.6)
Let us emphasize that as 0 ∈ / cl (Λ) on every finite interval the number of jumps in Λ is finite so J Λ has trajectories with finite variation. That is J Λ ∈ V. Let
exp iu · J Λ (s, ω) , M (s, ω, u) E (exp (iu · J Λ (s, ω)))
& ' exp iv · X (s, ω) − J Λ (s, ω) N (s, ω, v) E (exp (iv · [X (s, ω) − J Λ (s, ω)])) be the exponential martingale of J Λ and X − J Λ . The Fourier transforms in the denominators are never zero and they are continuous, hence the expressions are meaningful and the jumps of these processes are the jumps of the numerators. Integrating by parts M (t) N (t) − M (0) N (0) =
t
M− dN + 0
+ [M, N ] (t) . 13 See: 14 See:
Proposition 1.97, page 61. Lemma 1.96, page 60.
t
N− dM + 0
´ LEVY PROCESSES
469
The Fourier transforms in the denominators are never zero and they are continuous so their absolute value have a positive minimum on the compact interval [0, t]. The numerators are bounded, so the integrators are bounded on any finite interval. Hence the stochastic integrals above are real martingales15 . So their expected value is zero. We show that [M, N ] = 0. As J Λ (t) has a compound Poisson distribution one can explicitly write down its Fourier transform: E exp iu · J (s) = exp λs (exp (iux) − 1) dF (x)
Λ
R
exp (s · φ (u)) As J Λ ∈ V obviously M ∈ V. So M is purely discontinuous. Hence16 [M, N ] =
∆M ∆N.
J Λ and X − J Λ do not have common jumps, therefore [M, N ] (t) =
∆M (s) ∆N (s) = 0.
0<s≤t
Hence E (M (t) N (t)) = E (M (0) N (0)) = 1. From which (7.6) trivially holds. If N1 and N2 are Poisson processes and N1 and N2 do not have common jumps then [N1 , N2 ] =
∆N1 ∆N2 = 0.
Using this one can prove in a similar way as above the following observation: Lemma 7.10 If N1 and N2 are Poisson processes with respect to some filtration F and N1 and N2 do not have common jumps almost surely then N1 (t) and N2 (t) are independent for every t. Proposition 7.11 If (Ni ) are finitely many Poisson processes with respect to some filtration then they do not have common jumps almost surely if and only if the variables (Ni (t)) are independent17 for every t. 15 See:
Proposition 2.24, page 128. Corollary 4.34, page 245. 17 See: Example 2.29, page 130. 16 See:
470
PROCESSES WITH INDEPENDENT INCREMENTS
Proof. If the values of Poisson processes are independent then the same is true for the compensated Poison processes. By the independence on every finite time interval the compensated Poisson processes are orthogonal in the Hilbert space H02 . Hence they are orthogonal as local martingales18 . Therefore their quadratic variation is a uniformly integrable martingale19 . This implies that the expected value of the quadratic co-variation [N1 , N2 ] =
∆N1 ∆N2
is almost surely zero. As ∆N1 ∆N2 ≥ 0 the quadratic co-variation is almost surely zero. Hence the two processes do not have common jumps almost surely. The proof of the other part of the proposition is clear from the previous lemma. Theorem 7.12 (Decomposition of L´ evy processes ) If X is a L´evy process, Λ is a Borel measurable set and 0 ∈ / cl (Λ) then J Λ and X − J Λ are independent L´evy processes. Proof. Recall that by definition two processes are independent if they are independent as sets of random variables. As we proved20 J Λ (t) and X − J Λ (t) are independent for every t. From the Markov property it is clear that if h > 0 then the increments J Λ (t + h) − J Λ (t) and
X − J Λ (t + h) − X − J Λ (t)
are also independent. Let (tk ) be a time sequence. Let (αk ) denote the corresponding increments of J Λ and let (β k ) denote the corresponding increments of X −J Λ . Let Gt be the σ-algebra generated by the increments of X after t. Observe that αk and β k are Gtk -measurable. Hence the linear combination uk αk + vk β k is also Gtk -measurable. So uk αk + vk β k is independent21 of Ftk . Using these one can easily decompose the joint Fourier transform: n n iuk αk + ivk β k = ϕ (u, v) E exp = E exp
k=1 n
k=1 18 See:
Proposition 4.15, Proposition 2.84, 20 See: Lemma 7.9, page 21 See: Proposition 1.97, 19 See:
page 230. page 170. 468. page 61.
k=1
i (uk αk + vk β k )
=
´ LEVY PROCESSES
= E E exp = E exp
n−1
n
k=1
i (uk αk + vk β k )
471
| Ftn−1
=
i (uk αk + vk β k ) E (exp (i (un αn + vn β n )))
=
k=1
= ··· =
n !
E (exp (i (uk αk + vk β k ))) =
k=1
=
n !
(E (exp (iuk αk )) · E (exp (ivk β k ))) = ϕ1 (u) · ϕ2 (v) .
k=1
This means that the sets of variables (αk ) and (β k ) are independent. Hence the σalgebras generated by the increments, that is by the processes, are independent. Therefore the processes X − J Λ and J Λ are independent. With nearly the same method one can prove the following proposition. Proposition 7.13 If (Ni ) are finitely many Poisson processes with respect to some common filtration then they do not have common jumps almost surely if and only if the processes are independent. Proof. Let F be the common filtration of N1 and N2 and let U and V be the exponential martingales of N1 and N2 . As N1 and N2 do not have a common jumps the quadratic co-variation of U and V is zero. Hence they are orthogonal. That is U V is a local martingale with respect to F. On every finite interval U, V ∈ H2 , therefore |U V (t)| ≤ sup |U (s)| sup |V (s)| ∈ L1 (Ω). s
s
Hence U V is a martingale. Therefore
E U V (tk ) | Ftk−1 = U V (tk−1 ) . If we use the notation of the proof of the previous proposition then with simple calculation one can write this as
E exp (i (uk αk + vk β k )) | Ftn−1 = E (exp (iuk αk )) · E (exp (ivk β k )) . From this the proof of the proposition is obvious. Corollary 7.14 If (Ni ) are countably many independent Poisson processes then they do not have common jumps almost surely. Proof. Let N1 and N2 be independent Poisson processes and let F (1) and F (2) be the filtration generated by the processes. Let U and V be the exponential
472
PROCESSES WITH INDEPENDENT INCREMENTS
martingales of N1 and N2 . U and V are martingales with respect to filtrations F (1) and F (2) . Let F be the filtration generated by the two processes N1 and N2 . Using the independence of N1 and N2 we show that U and V are martingales (1) (2) with respect to F as well. If F1 ∈ Fs and F2 ∈ Fs where s < t then F1 ∩F2
U (t) dP = E χF1 χF2 U (t) = E χF2 E χF1 U (t) =
= E χF2 E χF1 U (s) = E χF2 χF1 U (s) = U (s) dP. = F1 ∩F2
With the Monotone Class Theorem one can prove that the equality holds for every F ∈ σ F1 ∩ F2 : F1 ∈ Fs(1) , F2 ∈ Fs(2) = Fs , that is E (U (t) | Fs ) = U (s). Hence U is a martingale with respect to F.
Example 7.15 Poisson processes without common jumps which are not independent.
Let (σ k ) be the jump times generating some Poisson process. Obviously variables (2σ k ) also generate a Poisson process. As the probability that two independent continuous random variable is equal is zero the jump times of the two processes are almost surely never equal. But as they generate the same non-trivial σ-algebra they are obviously not independent. Proposition 7.16 If X is a L´evy process and (Λk ) are finitely many
disjoint Borel measurable sets with 0 ∈ / cl (Λk) for all k, then processes N Λk are independent. The same is true for J Λk . Proof. It is sufficient to show the second part of the proposition. If X J ∪i=1 Λk n then J ∪i=2 Λk = X − J Λ1 and J Λ1 are independent. From this the proposition is obvious. n
7.1.3
Spectral measure of L´ evy processes
First let us prove a very simple identity.
´ LEVY PROCESSES
473
Definition 7.17 Let (X, A) and (Y, B) be measurable spaces. A function µ : X × B → [0, ∞] is a random measure if: 1. for every B ∈ B the function x → µ (x, B) is A-measurable, 2. for every x ∈ X the set function B → µ (x, B) is a measure on (Y, B). Proposition 7.18 Let (X, A) and (Y, B) be measurable spaces and let µ : X × B → [0, ∞] be a random measure. If ρ is a measure on (X, A) and ν (B)
µ (x, B) dρ (x) , X
then ν is a measure on (Y, B). If f is a measurable function on (Y, B) then
f (y) µ (x, dy) dρ (x) ,
f (y) dν (y) = Y
X
Y
whenever the integral on the left-hand side
f dν is meaningful.
Y
Proof. ν is non-negative and if (Bn ) are disjoint sets then by the Monotone Convergence Theorem ν (∪n Bn )
µ (x, ∪n Bn ) dρ (x) =
X
=
n
X
µ (x, Bn ) dρ (x)
X
µ (x, Bn ) dρ (x) =
n
ν (Bn ) ,
n
so ν is really a measure. If f = χB , B ∈ B, then
f (y) dν (y) = ν (B)
Y
=
µ (x, B) dρ (x) = X
χB (y) µ (x, dy) dρ (x) = X
Y
X
Y
=
f (y) µ (x, dy) dρ (x) .
In the usual way, using the linearity of the integration and the Monotone Convergence Theorem the formula can be extended to non-negative measurable functions. If f is non-negative and Y f dν is finite then almost surely w.r.t. ρ
PROCESSES WITH INDEPENDENT INCREMENTS
474
the inner integral is also finite. Let f = f + − f − and assume that the integral of f − w.r.t. ν is finite. In this case, as we remarked, the integral Y f − (y) µ (x, dy) is finite for almost all x and the integral
f (y) µ (x, dy) −
f (y) µ (x, dy) = Y
+
Y
f − (y) µ (x, dy)
Y
is almost surely meaningful. The integral of the second part with respect to ρ is finite, hence
f dν Y
f dν − +
Y
f − dν =
Y
f + (y) µ (x, dy) dρ (x) −
= X
Y
X
f (y) µ (x, dy) −
X
Y
−
f (y) µ (x, dy) dρ (x)
+
=
f − (y) µ (x, dy) dρ (x) =
Y
Y
f (y) µ (x, dy) dρ (x) . X
Y
Let us fix a moment t. For an arbitrary ω define the counting measure supported by the jumps of s → X (s, ω) in [0, t]. Denote this random measure by µX (t, ω, Λ) = µX t (ω, Λ). That is µX t (ω, Λ)
χΛ (∆X (s, ω)) = N Λ (t, ω) .
(7.7)
0<s≤t
In general the process X is fixed so in order to simplify the notation as much as possible we shall drop the superscript X and instead of µX we shall simply write µ. If 0 ∈ / cl (Λ) then by (7.7) µt (ω, Λ) is measurable in ω. Obviously if Λ ⊆ R \ {0} then c
µ (t, ω, Λ) = lim µ (t, ω, Λ ∩ [−1/n, 1/n] ) , n→∞
so µt (ω, Λ) is also measurable in ω for any Borel measurable subset Λ of R \ {0}. This implies that µt (ω, Λ) is a random measure over R \ {0}. Hence Λ → ν t (Λ) E (µt (Λ))
µt (ω, Λ) dP (ω) ,
Λ ∈ B (R \ {0})
Ω
is a measure on (R \ {0} , B (R \ {0})). If 0 ∈ / cl (Λ) then ν t (Λ) is the expected value of a Poisson process at a fixed time, therefore ν t (Λ) < ∞. Therefore ν t is σ-finite for every t.
´ LEVY PROCESSES
475
Definition 7.19 The measures ν t (Λ) E (µt (Λ)) ,
Λ ∈ B (R \ {0})
are called the spectral measures of X. To simplify the notation let ν ν 1 . Lemma 7.20 ν t (Λ) = t · ν 1 (Λ) t · ν (Λ). Proof. If 0 ∈ / cl (Λ) then N Λ is a Poisson process. In this case
ν t (Λ) E N Λ (t) = t · E N Λ (1) tν (Λ) . In the general case by the Monotone Convergence Theorem
c ν t (Λ) = E lim µt (Λ ∩ [−1/n, 1/n] ) = n→∞
c
= lim E (µt (Λ ∩ [−1/n, 1/n] )) = n→∞
c
= lim t · ν (Λ ∩ [−1/n, 1/n] ) = t · ν (Λ) . n→∞
Proposition 7.21 (L1 -identity) If X is a L´evy process then for every Borel measurable function f : R \ {0} → R E f dµt = E f (∆X (s)) χ (∆X (s) = 0) R\{0}
0<s≤t
= R\{0}
whenever the integral
R\{0}
f dν t = t
f dν,
(7.8)
R\{0}
f dν is meaningful.
Proof. As µt (ω, Λ) is a counting measure for ever Borel measurable function f f (x) µt (ω, dx) = f (∆X (s, ω)) χ (∆X (s) = 0) . R\{0}
0<s≤t
The other parts of (7.8) are direct consequences of the previous proposition. Corollary 7.22 Let X be a L´evy process. If 0 ∈ / cl (Λ) and Λ xdν (x) is finite then
J Λ (t) − E J Λ (t) = J Λ (t) − t xdν (x) (7.9) Λ
is a martingale. In particular if Λ is bounded and 0 ∈ / cl (Λ) then (7.9) is a martingale.
476
PROCESSES WITH INDEPENDENT INCREMENTS
Proof. As Λ xdν (x) R\{0} xχΛ (x) dν (x) is finite by the L1 -identity with f (x) xχΛ (x)
E J Λ (t) E
∆X (s) χΛ (∆X (s)) =
0≤s≤t
=t R\{0}
xχΛ (x) dν (x) = t
xdν (x) . Λ
/ cl (Λ) the jumps X is a L´evy process so J Λ has independent increments. As 0 ∈ Λ has right-regular trajectories. This implies that in Λ cannot accumulate. So J J Λ (t) − E J Λ (t) is a martingale. Let P denote the σ-algebra of the predictable sets. By the martingale property of the compensated jumps it is clear that if 0 ∈ / cl (Λ), F ∈ Fs and s < t then
µ (t, ω, Λ) − t · ν (Λ) dP (ω) = F
µ (s, ω, Λ) − s · ν (Λ) dP (ω) . F
This means that as ν (Λ) < ∞
µ (t, ω, Λ) − µ (s, ω, Λ) dP (ω) = F
(t − s) · ν (Λ) dP (ω) , F
that is if H (u, ω, e) χΛ (e) χF (ω) χ(s,t] (u) then
∞
Hµ (du, ω, de)
E 0
R\{0}
∞
=E
Hdν (e) du .
0
R\{0}
The meaning of the left-hand side is the following. For every ω let µ (ω, D) denote22 the counting measure of the jumps of X, that is if D ∈ B (R+ ) × B (R \ {0}) then let µ (ω, D) be the number of jumps in D. First we integrate by this measure and then, if it is meaningful, we take the expected value. If the time interval is finite and we restrict µ to a set with ν (Λ) < ∞ then the set of bounded processes for which the formula is valid is a linear space. From this in the usual way, using the Monotone Class Theorem and the Monotone 22 See:
Definition 7.44, page 496.
´ LEVY PROCESSES
477
Convergence Theorem, one can prove the following: Proposition 7.23 (General L1 -identity) If H ≥ 0 is measurable with respect to P × B (R \ {0}) then
∞
E
H (u, ω, e) µ (du, ω, de) 0
∞
=E
R\{0}
H (u, ω, e) dν (e) du . R\{0}
0
Example 7.24 The L´evy–Khintchine formula for compound Poisson processes.
Let X be a L´evy process and let 0 ∈ / cl (Λ). Let J Λ be the compound Poisson process of the jumps of X. The Fourier transform of J Λ (s) is23
exp λs R
(exp (iux) − 1) dF (x)
,
where F is the common distribution function of the jumps, and λ is the parameter of the underlying Poisson process. What is the relation between F and ν? If B ∈ B (R\ {0}) and τ is the time of the first jump in Λ then by the general L1 -identity using that χ ([0, τ ]) is predictable F (B) = P (∆X (τ ) ∈ B ∩ Λ) = E (χB∩Λ (∆X (τ ))) = ∞ =E χ ([0, τ ]) µ (du, B ∩ Λ) = 0
∞
=E
χB∩Λ (e) χ ([0, τ ]) µ (du, de)
R\{0}
0 ∞
χB∩Λ (e) χ ([0, τ ]) dν (e) du
=E 0
R\{0}
= ν (B ∩ Λ) E
∞
χ ([0, τ ]) du
=
0
= ν (B ∩ Λ) E (τ ) =
ν (B ∩ Λ) . λ
That is the Fourier transform of J Λ (s) is (exp (iux) − 1) dν (x) . exp s Λ 23 See:
Lemma 7.7, page 467.
=
=
478
PROCESSES WITH INDEPENDENT INCREMENTS
Definition 7.25 Let (E, E, ν) be a measure space and let (Ω, A, P) be a probability space. We say that the random measure µ : Ω × E → [0, ∞] is a random Poisson measure with control measure ν if: 1. whenever the sets (Λk ) are disjoint the variables µ (ω, Λk ) are independent and 2. whenever ν (Λ) < ∞ the variable µ (ω, Λ) has a Poisson distribution with parameter ν (Λ). Proposition 7.26 Let X be a L´evy process. For every t the counting measure µt (ω, Λ) is a random Poisson measure. The control measure of µt is the spectral measure ν t . 'c & Proof. For every Λ ⊆ R \ {0} let Λn Λ ∩ − n1 , n1 . Obviously 0 ∈ / cl (Λn ) and µ (t, ω, Λ) = lim µ (t, ω, Λn ) . n→∞
As 0 ∈ / cl (Λn ) the variable ω → µ (t, ω, Λn ) = N Λn (t, ω) has a Poisson distribution. The Fourier transform of this variable is exp (tν (Λn ) (exp (iu) − 1)) . The convergence for every ω implies the weak convergence, so if ν (Λ) < ∞, then as ν (Λ) = limn→∞ ν (Λn ) the Fourier transform of ω → µ (t, ω, Λ) is exp (tν (Λ) (exp (iu) − 1)) . Hence it has a Poisson distribution. If the sets Λk =
∪n Λ(k) n
∪n
c 1 1 Λk ∩ − , n n
(k)
are disjoint then the sets Λn are also disjoint for every n. Hence the variables
µ t, ω, Λ(k) n are independent. The limit of independent variables is independent, so if the sets (Λk ) are disjoint, then the variables µ (t, ω, Λk ) are independent.
´ LEVY PROCESSES
479
Definition 7.27 Let H be a Hilbert space and let (C, C, ν) be a measure space and let S ⊆ C denote the subsets of C with finite measure. π : S → H is a vector measure with control measure ν if for every S ∈ S: 1. π (S) ∈ H is defined, 2
2. π (S)H = ν (S), 3. if S1 and S2 are disjoint sets in S then the vectors π (S1 ) and π (S2 ) are orthogonal. We say that a function f : C → R is integrable to π if there is a
with respect sequence of finite valued step functions (sn ) = c χ k nk Cnk with: 1. sn → f in L2 (ν) and 2. In k cnk π (Cnk ) is a Cauchy sequence in H. If I limn→∞ In , then we shall call this limit I the integral of f with respect to π. We shall denote this integral by C f (x) dπ (x) or simply C f dπ. Proposition 7.28 If f ∈ L2 (C, C, ν) and π is a vector measure with control measure (C, C, ν) then f is integrable with respect to π and f dπ C
Proof. Let s and 2.
k ck
- = f 2
f 2 dν.
(7.10)
C
H
· χCk where Ck are disjoint and in S. By conditions 3.
2 2 s dπ = c · π (C ) k k = C H k H 2 2 = ck · π (Ck )H = k
=
k
(7.11)
c2k
2
· ν (Ck ) = C
s2 dν = s2 .
there is a sequence sn k cnk χCnk with As the step functions are dense in L2 sn → f in L2 (ν). From (7.11) In k cnk π (Cnk ) is a Cauchy sequence in H. From this the proposition is obvious. Corollary 7.29 If f ∈ L2 (C, C, ν) and π is a vector measure with control measure ν then the value of the vector integral C f dπ is independent of the approximating sequence (sn ). Proposition 7.30 If X is a L´evy process and H L2 (Ω) then for every t ≥ 0 π t (Λ) N Λ (t) − ν t (Λ) = N Λ (t) − t · ν (Λ)
(7.12)
480
PROCESSES WITH INDEPENDENT INCREMENTS
is a a Hilbert space valued vector measure over (R \ {0} , B (R \ {0}) , ν t ). The same is true if H H02 on the time interval [0, t] and (π (Λ)) (s) N Λ (s) − s · ν (Λ) ,
s ≤ t < ∞.
Proof. As we have already proved, if Λ ⊆ R \ {0} and ν t (Λ) < ∞ then the Fourier transform of N Λ (t) is exp (ν t (Λ) (exp (iu) − 1)) . Hence if ν t (Λ) < ∞, then N Λ (t) has a Poisson distribution with parameter 2 ν t (Λ). This implies that the expected value of (7.12) is zero and π t H = ν t (Λ). Λ1 As we have also proved that if Λ1 ∩ Λ2 = ∅ then N (t) and N Λ2 (t) are independent24 . So (π t (Λ1 ) , π t (Λ2 )) π t (Λ1 ) π t (Λ2 ) dP = 0. Ω
7.1.4
Decomposition of L´ evy processes
Now we are ready to prove that L´evy processes are semimartingales. Proposition 7.31 If X is a L´evy process then: 1. X is a semimartingale, 2. X has a decomposition X =V +M where: 3. V and M are independent L´evy processes, 4. M is a martingale with bounded jumps and on every finite interval M ∈ H02 , 5. V ∈ V, that is on every finite interval the trajectories of V have finite variation. Proof. If Λ {|x| ≥ 1} then the jumps of Y X − J Λ are bounded. Y is a L´evy process with bounded jumps25 . This implies that Y (t) has an expected value26 for every t. Therefore M (t) Y (t) − E (Y (t)) =
= X (t) − J Λ (t) − t · E X (1) − J Λ (1) X (t) − J Λ (t) − t · γ
24 See:
Proposition 7.16. page 472. Lemma 7.8, page 468. 26 See: Proposition 1.111, page 74. 25 See:
´ LEVY PROCESSES
481
is a L´evy process with zero expected value. Hence M is a martingale. The martingale M has finite moments, so on any finite interval M is in H02 . Therefore M satisfies 4. Obviously V (t) J Λ (t) + E (Y (t)) J Λ (t) + γ · t satisfies 5. As X − J Λ and J Λ are independent27 the proposition holds.
Corollary 7.32 The spectral measure ν has the following properties x2 dν (x) < ∞.
ν (|x| ≥ 1) < ∞, 0