Risk-Neutral Valuation: Pricing and Hedging of Financial Derivatives, Second Edition


Author: Nicholas H. Bingham | Rüdiger Kiesel


Springer Finance

Editorial Board

M. Avellaneda

G. Barone-Adesi

M. Broadie

M.H.A. Davis

E. Derman

C. Klüppelberg

E. Kopp

W. Schachermayer

Springer

London Berlin Heidelberg New York Hong Kong Milan Paris Tokyo

Springer Finance

Springer Finance is a programme of books aimed at students, academics, and practitioners working on increasingly technical approaches to the analysis of financial markets. It aims to cover a variety of topics, not only mathematical finance but foreign exchanges, term structure, risk management, portfolio theory, equity derivatives, and financial economics.

M. Ammann, Credit Risk Valuation: Methods, Models, and Applications (2001)

E. Barucci, Financial Markets Theory: Equilibrium, Efficiency and Information (2003)

N.H. Bingham and R. Kiesel, Risk-Neutral Valuation: Pricing and Hedging of Financial Derivatives, 2nd Edition (2004)

T.R. Bielecki and M. Rutkowski, Credit Risk: Modeling, Valuation and Hedging (2001)

D. Brigo and F. Mercurio, Interest Rate Models: Theory and Practice (2001)

R. Buff, Uncertain Volatility Models - Theory and Application (2002)

R.-A. Dana and M. Jeanblanc, Financial Markets in Continuous Time (2003)

G. Deboeck and T. Kohonen (Editors), Visual Explorations in Finance with Self-Organizing Maps (1998)

R.J. Elliott and P.E. Kopp, Mathematics of Financial Markets (1999)

H. Geman, D. Madan, S.R. Pliska and T. Vorst (Editors), Mathematical Finance - Bachelier Congress 2000 (2001)

M. Gundlach and F. Lehrbass (Editors), CreditRisk+ in the Banking Industry (2004)

Y.-K. Kwok, Mathematical Models of Financial Derivatives (1998)

M. Külpmann, Irrational Exuberance Reconsidered: The Cross Section of Stock Returns, 2nd Edition (2004)

A. Pelsser, Efficient Methods for Valuing Interest Rate Derivatives (2000)

J.-L. Prigent, Weak Convergence of Financial Markets (2003)

B. Schmid, Credit Risk Pricing Models: Theory and Practice, 2nd Edition (2004)

S.E. Shreve, Stochastic Calculus for Finance I: The Binomial Asset Pricing Model (2004)

S.E. Shreve, Stochastic Calculus for Finance II: Continuous-Time Models (2004)

M. Yor, Exponential Functionals of Brownian Motion and Related Processes (2001)

R. Zagst, Interest-Rate Management (2002)

Y.-L. Zhu and I-L. Chern, Derivative Securities and Difference Methods (2004)

A. Ziegler, Incomplete Information and Heterogeneous Beliefs in Continuous-Time Finance (2003)

A. Ziegler, A Game Theory Analysis of Options: Corporate Finance and Financial Intermediation in Continuous Time, 2nd Edition (2004)

N.H. Bingham and R. Kiesel

Risk-Neutral Valuation

Pricing and Hedging of Financial Derivatives

Second Edition

Springer

Nicholas H. Bingham, ScD

Department of Probability and Statistics, University of Sheffield, Sheffield S3 7RH, UK

Department of Mathematical Sciences, Brunel University, Uxbridge, Middlesex UB8 3PH, UK

Rüdiger Kiesel, PhD

Department of Financial Mathematics, University of Ulm, 89069 Ulm, Germany

Department of Statistics, London School of Economics, London WC2A 2AE, UK

British Library Cataloguing in Publication Data

Bingham, N. H.
Risk-neutral valuation: pricing and hedging of financial derivatives. - 2nd ed.
1. Investments - Mathematical models 2. Finance - Mathematical models
I. Title II. Kiesel, Rüdiger, 1962-
332'.015118
ISBN 1852334584

Library of Congress Cataloging-in-Publication Data

Bingham, N. H.
Risk-neutral valuation: pricing and hedging of financial derivatives / N.H. Bingham and R. Kiesel. - 2nd ed.
p. cm. - (Springer finance)
Includes bibliographical references and index.
ISBN 1-85233-458-4 (alk. paper)
1. Investments-Mathematical models. 2. Finance-Mathematical models. I. Kiesel, Rüdiger, 1962- II. Title. III. Series.
HG4515.2.B56 2004
332.64'57-dc22 2003067310

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

ISBN 1-85233-458-4 Springer-Verlag London Berlin Heidelberg
ISSN 1616-0533
Springer-Verlag is part of Springer Science+Business Media
springeronline.com

© Springer-Verlag London Limited 2004
Printed in the United States of America

The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Typesetting: Camera-ready by authors
12/3830-543210 Printed on acid-free paper SPIN 10798207

To my mother, Blanche Louise Bingham (née Corbitt, 1912-) and to the memory of my father, Robert Llewelyn Bingham (1904-1972).

Nick

Für Corina, Una und Roman.

Rüdiger

Preface to the Second Edition

Books are written for use, and the best compliment that the community in the field could have paid to the first edition of 1998 was to buy out the print run, and that of the corrected printing, as happened. Meanwhile, the fast-developing field of mathematical finance had moved on, as had our thinking, and it seemed better to recognize this and undertake a thorough-going rewrite for the second edition than to tinker with the existing text.

The second edition is substantially longer than the first; the principal changes are as follows. There is a new chapter (the last, Chapter 9) on credit risk - a field that seemed too important to exclude. We have included continuous-time processes more general than the Gaussian processes of the Black-Scholes theory, in order in particular to model driving noise with jumps. Thus we include material on infinite divisibility and Lévy processes, with hyperbolic models as a principal special case, in recognition of the growing importance of 'Lévy finance'. Chapter 5 is accordingly extended, and new material on Lévy-based models is included in Chapter 7 on incomplete markets. Also on incomplete markets, we include more material on criteria for selecting one equivalent martingale measure from many, and on utility-based approaches. However, arbitrage-based arguments and risk-neutral valuation remain the basic theme.

It is a pleasure to record our gratitude to the many people who have had a hand in this enterprise. We thank our students and colleagues over the years since the first edition was written at Birkbeck - both Nick (Brunel and Sheffield) and Rüdiger (London School of Economics and Ulm). Special thanks go to Holger Hafting, Torsten Kleinow and Matthias Scherer. In particular, we thank Stefan Kassberger for help with Sections 8.5 and 8.6, and with parts of Chapter 9. We are grateful to a number of sharp-eyed colleagues in the field who have pointed out errors (of commission or omission) in the first edition, and made suggestions. And we thank our editors at Springer-Verlag for bearing with us gracefully while the various changes (in the subject, our jobs and lives etc.) resulted in the second edition being repeatedly delayed.

Again: last, and most, we thank our families for their love, support and forbearance throughout.

August 2003
Nick, Brunel and Sheffield
Rüdiger, Ulm and LSE

Preface to the First Edition

The prehistory of both the theory and the practicalities of mathematical finance can be traced back quite some time. However, the history proper of mathematical finance - at least, the core of it, the subject-matter of this book - dates essentially from 1973. This year is noted for two developments, one practical, one theoretical. On the practical side, the world's first options exchange opened in Chicago. On the theoretical side, Black and Scholes published their famous paper (Black and Scholes 1973) on option pricing, giving in particular explicit formulae, hedging strategies for replicating contingent claims and the Black-Scholes partial differential equation.

Both Black's article (Black 1989) and the recent obituary of Fischer Black (1938-1995) (Chichilnisky 1996) contain accounts of the difficulties Black and Scholes encountered in trying to get their work published. After several rejections by leading journals, the paper finally appeared in 1973 in the Journal of Political Economy. It was alternatively derived and extended later that year by Merton. Thus, like so many classics, the Black-Scholes and Merton papers were ahead of their time in the economics and financial communities. Their ideas became better assimilated with time, and the Arbitrage Pricing Technique of S.A. Ross was developed by 1976-1978; see (Ross 1976), (Ross 1978). In 1979, the Cox-Ross-Rubinstein treatment by binomial trees (Cox, Ross, and Rubinstein 1979) appeared, allowing an elementary approach showing clearly the basic no-arbitrage argument, which is the basis of the majority of contingent claim pricing models in use. The papers (Harrison and Kreps 1979), (Harrison and Pliska 1981) made the link with the relevant mathematics - martingale theory - explicit. Since then, mathematical finance has developed rapidly - in parallel with the explosive growth in volumes of derivatives traded.

Today, the theory is mature, is unchallengeably important, and has been simplified to the extent that, far from being controversial or arcane as in 1973, it is easy enough to be taught to students - of economics and finance, financial engineering, mathematics and statistics - as part of the canon of modern applied mathematics. Its importance was recognized by the award of the Nobel Prize for Economics in 1997 to the two survivors among the three founding pioneers, Myron Scholes and Robert Merton (Nobel prize laudatio 1997).



The core of the subject-matter of mathematical finance concerns questions of pricing - of financial derivatives such as options - and hedging - covering oneself against all eventualities. Pervading all questions of pricing is the concept of arbitrage. Mispricing will be spotted by arbitrageurs, and exploited to extract riskless profit from your mistake, in potentially unlimited quantities. Thus to misprice is to expose oneself to being used as a money-pump by the market. The Black-Scholes theory is the main theoretical tool for pricing of options, and for associated questions of trading strategies for hedging.

Now that the theory is well-established, the profit margins on the standard - 'vanilla' - options are so slender that practitioners constantly seek to develop new - nonstandard or 'exotic' - options which might be traded more profitably. And of course, these have to be priced - or one will be used as a money-pump by arbitrageurs ... The upshot of all this is that, although standard options are well established nowadays, and are accessible and well understood, practitioners constantly seek new financial products, of ever greater complexity. Faced with this open-ended escalation of the theoretical problems of mathematical finance, there is no substitute for understanding what is going on.

The gist of this can be put into one sentence: one should discount everything, and take expected values under an equivalent martingale measure. Now discounting has been with us for a long time - as long as inflation and other concomitants of capitalism - and makes few mathematical demands beyond compound interest and exponential growth. By contrast, equivalent martingale measures - the terminology is from (Harrison and Pliska 1981), where the concept was first made explicit - make highly non-trivial mathematical demands on the reader, and in consequence present the expositor with a quandary. One can presuppose a mathematical background advanced enough to include measure theory and enough measure-theoretic probability to include martingales - say, to the level covered by the excellent text (Williams 1991). But this is to restrict the subject to a comparative elite, and so fails to address the needs of most practitioners, let alone intending ones. At the other extreme, one can eschew the language of mathematics for that of economics and finance, and hope that by dint of repetition the recipe that eventually emerges will appear natural and well-motivated. Granted a leisurely enough approach, such a strategy is quite viable.

However, we prefer to bring the key concepts out into the light of day rather than leave them implicit or unstated. Consequently, we find ourselves committed to using the relevant mathematical language - of measure theory and martingales - explicitly. Now what makes measure theory hard (final year material for good mathematics undergraduates, or postgraduates) is its proofs and its constructions. As these are only a secondary concern here - our primary concern being the relevant concepts, language and viewpoint - we simply take what we need for granted, giving chapter and verse to standard texts, and use it. Always take a pragmatic view in applied mathematics: the proof of the pudding is in the eating.
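As a rough illustration of the recipe stated above (discount everything, and take expected values under an equivalent martingale measure), consider a one-period market with one stock that moves up or down and a bank account paying a fixed rate. The sketch below is only an assumption-laden toy: the function name `risk_neutral_price`, its parameters and the numbers used are hypothetical choices for illustration, not taken from the text.

```python
def risk_neutral_price(s0, up, down, rate, payoff):
    """Price a claim in a one-period, two-state market (illustrative sketch).

    s0     -- today's stock price
    up     -- gross return of the stock in the 'up' state
    down   -- gross return of the stock in the 'down' state
    rate   -- one-period riskless interest rate
    payoff -- function mapping the terminal stock price to the claim's payoff
    """
    growth = 1.0 + rate
    # No-arbitrage in this toy market requires down < 1 + rate < up.
    if not down < growth < up:
        raise ValueError("no-arbitrage requires down < 1 + rate < up")
    # Risk-neutral probability of the 'up' state: the unique weight that
    # makes the discounted stock price a martingale.
    q = (growth - down) / (up - down)
    # Expected payoff under the risk-neutral measure, then discounted.
    expected_payoff = q * payoff(s0 * up) + (1.0 - q) * payoff(s0 * down)
    return expected_payoff / growth


# Hypothetical numbers: a call struck at 100 on a stock worth 100 today.
call = lambda s: max(s - 100.0, 0.0)
price = risk_neutral_price(100.0, 1.2, 0.8, 0.05, call)
print(price)
```

Pricing the stock itself with the same function (payoff equal to the terminal price) returns today's price, which is exactly the martingale property that defines the measure in this toy setting.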



The phrase 'equivalent martingale measure' is hardly the language of choice for practitioners, who think in terms of the risk-adjusted or - as we shall call it - risk-neutral measure: the key concept of the subject is risk-neutrality. Since this concept runs through the book like a golden thread (roter Faden, to use the German), we emphasize it by using it in our title.

One of the distinctive features of mathematical finance is that it is, by its very nature, interdisciplinary. At least at this comparatively early stage of the subject's development, everyone involved in it - practitioners, students, teachers, researchers - comes to it with his/her own individual profile of experience, knowledge and motivation. For ourselves, we both have a mathematics and statistics background (though the second author is an ex-practitioner), and teach the subject to a mixed audience with a high proportion of practitioners. It is our hope that the balance we strike here between the mathematical and economic/financial sides of the subject will make the book a useful addition to the burgeoning literature in the field. Broadly speaking, most books are principally aimed at those with a background on one or the other side. Those aiming at a more mathematically advanced audience, such as the excellent recent texts (Lamberton and Lapeyre 1996) and (Musiela and Rutkowski 1997), typically assume more mathematics than we do - specifically, a prior knowledge of measure theory. Those aiming at a more economic/financial audience, such as the equally excellent books (Cox and Rubinstein 1985) and (Hull 1999), typically prefer to 'teach by doing' and leave the mathematical nub latent rather than explicit. We have aimed for a middle way between these two.

We begin with the background on financial derivatives or contingent claims in Chapter 1, and with the mathematical background in Chapter 2, leading into Chapter 3 on stochastic processes in discrete time. We apply the theory developed here to mathematical finance in discrete time in Chapter 4. The corresponding treatment in continuous time follows in Chapters 5 and 6. The remaining chapters treat incomplete markets and interest rate models.

We are grateful to many people for advice and comments. We thank first our students at Birkbeck College for their patience and interest in the courses from which this book developed, especially Jim Aspinwall and Mark Deacon for many helpful conversations. We are grateful to Tomas Björk and Martin Schweizer for their careful and scholarly suggestions. We thank Jon McLoone from Wolfram UK for the possibility of using Mathematica for numerical experiments. Thanks to Alex Schone who always patiently and helpfully explained the mysteries of LaTeX to the second author over the years - ohne Dich, Alex, würde ich immer noch im Handbuch nachschlagen! It is a pleasure to thank Dr Susan Hezlet and the staff of Springer-Verlag UK for their support and help throughout this project. And, last and most, we thank our families for their love, support and forbearance while this book was being written.

N.H. Bingham
Rüdiger Kiesel
London, March 1998

Contents

Preface to the Second Edition Preface to the First Edition 1.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Derivative Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 . 1 Financial Markets and Instruments . . . . . . . . . . . . . . . . . . . . . . . 2 1 . 1 . 1 Derivative Instruments . . . . . . . . . . . . . . . . . . . .. . . . . . . . 2 1 . 1.2 Underlying Securities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1 . 3 Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1 . 1 .4 Types of Traders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.1 . 5 Modeling Assumptions . . . . . . . . . . . .. . . . . . . . . . . . . . . . 6 1 . 2 Arbitrage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1 . 3 Arbitrage Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 . 3 . 1 Fundamental Determinants of Option Values . . . . . . . . . 1 1 1 . 3 . 2 Arbitrage Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 3 1 . 4 Single-period Market Models . . . . . .. . . . .. . . . . . . . . . . . . . . . . . 1 5 1 .4. 1 A Fundamental Example . . . . . . . . . . . . . . . . . . . . . . . . . . 1 5 1 .4 . 2 A Single-period Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 1 .4 . 3 A Few Financial-economic Considerations . . . . . . . . . . . 2 5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 .

2.

Probability Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . 1 Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . 3 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Equivalent Measures and Radon-Nikodym Derivatives . . . . . . . 2 . 5 Conditional Expectation . . . . . . . . .. . . . ... . . . . . . . . . . . . . . .. . 2 . 6 Modes of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . 7 Convolution and Characteristic Functions . . . . . . . . . . . . . . . . . 2 .8 The Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.9 Asset Return Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 . 1 0 Infinite Divisibility and the Levy-Khintchine Formula . . . . . .. 2 . 1 1 Elliptically Contoured Distributions . . . . . . . . . . . . . . . . . . . . . . . 2 . 12 Hyberbolic Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

.

.

.

.

29 30 34 37 42 44 51 53 57 61 63 65 67

xiv

Contents

Exercises 3.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Stochastic Processes in Discrete Time . ..... .. . . 3.1 Information and Filtrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Discrete-parameter Stochastic Processes . . . . . . . . . . . . . . . . . . 3.3 Definition and Basic Properties of Martingales . . . . . 3.4 Martingale Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 Stopping Times and Optional Stopping . . . . . . . . . . . . . . . . . . . 3.6 The Snell Envelope and Optimal Stopping . . . . . . . . . . . . . . . . 3.7 Spaces of Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.8 Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Exercises . . . . . . ... . . . . .

75 75 77 78 80 82 88 94 96 98

.... . . .. . 4 . 1 The Model . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Existence of Equivalent Martingale Measures . . . . . . . . . . . . . . 4.2.1 The No-arbitrage Condition . . . . 4.2.2 Risk-Neutral Pricing . . .. .. . . . ..... .. 4.3 Complete Markets: Uniqueness of EMMs . . .... ..... 4.4 The Fundamental Theorem of Asset Pricing: Risk-Neutral Valuation . . . . . . . .. 4.5 The Cox-Ross-Rubinstein Model . . . . 4 .5. 1 Model Structure . . . . . . . . . . . . . .. . 4.5.2 Risk-neutral Pricing . . . . . . . . . . . 4 .5.3 Hedging . . . . . . . ... . . . . 4.6 Binomial Approximations . . . .. . . . 4.6.1 Model Structure . . .. . . ... . ... 4.6.2 The Black-Scholes Option Pricing Formula . . . . . . 4 .6.3 Further Limiting Models . . . . . . . . . 4.7 American Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.7.1 Theory . . . . .. . . . 4 .7.2 American Options in the CRR Model . . . . . . . . . 4.8 Further Contingent Claim Valuation in Discrete Time . . 4.8.1 Barrier Options . . . . . . . . . 4.8.2 Lookback Options . . .. 4 .8.3 A Three-period Example . .. . . . .. .... 4 .9 Multifactor Models . . . . . . . . . . . . . . 4.9.1 Extended Binomial Model . . . . . ..... . 4 .9.2 Multinomial Models . . . . . Exercises . . . ... . .. .. ..

101 101 105 105 1 12 116

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

4.

71

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Mathematical Finance in Discrete Time .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

1 18 121 122 124 126 130 130 131 136 138 138 141 14 3 14 3 144 145 14 7 147 14 8 150

Contents

5.

xv

Stochastic Processes in Continuous Time . . . . . . . . . . 153 5.1 Filtrations; Finite-dimensional Distributions . . . . . . . . . . 153 5.2 Classes of Processes . . . . . . . . . . . . . . . . . . . . 155 5.2. 1 Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 5.2.2 Gaussian Processes . . . . . . . . . . . . . . . . . . . 158 5.2.3 Markov Processes . . . . . . . . . . . . . . . . . . . 158 5.2.4 Diffusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 5.3 Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . 160 5.3.1 Definition and Existence . . . . . . . . . . . . . . 160 5.3.2 Quadratic Variation of Brownian Motion . . . . . . . . . 167 5 .3.3 Properties of Brownian Motion . . . . . . . . . . . . . 171 5.3.4 Brownian Motion in Stochastic Modeling . . . . . . . . 173 5.4 Point Processes . . . . . . . . . . . . . . . . . . . . . . . 175 5.4. 1 Exponential Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 175 5.4.2 The Poisson Process . . . . . . . . . . . . . . . . . . 176 5.4.3 Compound Poisson Processes . . . . . . . . . . . . . . . . . . . . . . 176 5.4.4 Renewal Processes . . . . . . . . . . . . . . . . . . 177 5.5 Levy Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 5.5.1 Distributions . . . . . . . . . . . . . . . . . . . . . . . 179 5.5.2 Levy Processes . . . . . . . . . . . . . . . . . . . . . 181 5.5.3 Levy Processes and the Levy-Khintchine Formula . . . 183 5.6 Stochastic Integrals; Ito Calculus . . . . . . . . . . . . 187 5.6.1 Stochastic Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 5.6.2 Ito's Lemma . . . . . . . . . . . . . . . . . . . . . . . . . 193 5.6.3 Geometric Brownian Motion . . . . . . . . . . . 196 5.7 Stochastic Calculus for Black-Scholes Models . . . . . . . . . 198 5.8 Stochastic Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . 
202 5.9 Likelihood Estimation for Diffusions . . . . . . . . . . . . . . . 206 5.10 Martingales, Local Martingales and Semi-martingales . . . . 209 5.10. 1 Definitions . . . . . . . . . . . . . . . . . . . . . . . 209 5. 10.2 Semi-martingale Calculus . . . . . . . . . . . . . 211 5.10.3 Stochastic Exponentials . . . . . . . . . . . . . . . . 215 5.10 .4 Semi-martingale Characteristics . . . . . . . . . . . 217 5. 1 1 Weak Convergence of Stochastic Processes . . . . . . . . 219 5.1 1 . 1 The Spaces Cd and Dd . . . ... .. . .. .. 219 5.11.2 Definition and Motivation . . . . . . . . . . . . . . 220 5.11.3 Basic Theorems of Weak Convergence . . . . . . . . . . . . . . . 222 5 . 1 1 .4 Weak Convergence Results for Stochastic Integrals . 223 Exercises 225 .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

. .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

. .

.

6.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Mathematical Finance in Continuous Time . . . . . . . . . . . ...... .. 6. 1 Continuous-time Financial Market Models 6.1.1 The Financial Market Model . . . . . . . . . . . . . . . . . . . . . . . 6.1.2 Equivalent Martingale Measures . . . . . . . . . . . . . . 6.1.3 Risk-neutral Pricing . . . . . .. . . ...... . .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

229 229 229 232 235

xvi

Contents

Changes of Numeraire . . . . . . . . . . . . . . . . . . . . . The Generalized Black-Scholes Model . . . . . . ... 6 . 2 . 1 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2.2 Pricing and Hedging Contingent Claims . . . . . . . . . 6.2.3 The Greeks . . . . . . . . . . . . . . . .. .. ... ... 6.2.4 Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Further Contingent Claim Valuation . . . . . . . . . . . . . . . . 6.3. 1 American Options . . . . . . . . . . . . . . . . . . . . . . . . . 6. 3 . 2 Asian Options . ... . .. . .. . .. . . 6.3.3 Barrier Options . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . .... 6. 3 . 4 Lookback Options . . . . 6.3. 5 Binary Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 Discrete- versus Continuous-time Market Models . . . . . . . . 6.4. 1 Discrete- to Continuous-time Convergence Reconsidered . . . . . . . . . . . . . . . . . . . . . . . . . 6.4.2 Finite Market Approximations . . . .. . 6.4.3 Examples of Finite Market Approximations . . . . . . 6.4.4 Contiguity .. .. ... . . . . . . . 6. 5 Further Applications of the Risk-neutral Valuation Principle . . . . . . . . . . . . . . . . . . . . . . . . . 6. 5 . 1 Futures Markets . . . . . . . . . . . . . . . . . . . . . . . 6. 5 . 2 Currency Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Exercises 6.2

6. 1 .4

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

7. Incomplete Markets
   7.1 Pricing in Incomplete Markets
       7.1.1 A General Option-Pricing Formula
       7.1.2 The Esscher Measure
   7.2 Hedging in Incomplete Markets
       7.2.1 Quadratic Principles
       7.2.2 The Financial Market Model
       7.2.3 Equivalent Martingale Measures
       7.2.4 Hedging Contingent Claims
       7.2.5 Mean-variance Hedging and the Minimal ELMM
       7.2.6 Explicit Example
       7.2.7 Quadratic Principles in Insurance
   7.3 Stochastic Volatility Models
   7.4 Models Driven by Lévy Processes
       7.4.1 Introduction
       7.4.2 General Lévy-process Based Financial Market Model
       7.4.3 Existence of Equivalent Martingale Measures
       7.4.4 Hyperbolic Models: The Hyperbolic Lévy Process

8. Interest Rate Theory
   8.1 The Bond Market
       8.1.1 The Term Structure of Interest Rates
       8.1.2 Mathematical Modelling
       8.1.3 Bond Pricing, ...
   8.2 Short-rate Models
       8.2.1 The Term-structure Equation
       8.2.2 Martingale Modelling
       8.2.3 Extensions: Multi-Factor Models
   8.3 Heath-Jarrow-Morton Methodology
       8.3.1 The Heath-Jarrow-Morton Model Class
       8.3.2 Forward Risk-neutral Martingale Measures
       8.3.3 Completeness
   8.4 Pricing and Hedging Contingent Claims
       8.4.1 Short-rate Models
       8.4.2 Gaussian HJM Framework
       8.4.3 Swaps
       8.4.4 Caps
   8.5 Market Models of LIBOR- and Swap-rates
       8.5.1 Description of the Economy
       8.5.2 LIBOR Dynamics Under the Forward LIBOR Measure
       8.5.3 The Spot LIBOR Measure
       8.5.4 Valuation of Caplets and Floorlets in the LMM
       8.5.5 The Swap Market Model
       8.5.6 The Relation Between LIBOR- and Swap-market Models
   8.6 Potential Models and the Flesaker-Hughston Framework
       8.6.1 Pricing Kernels and Potentials
       8.6.2 The Flesaker-Hughston Framework
   Exercises

9. Credit Risk
   9.1 Aspects of Credit Risk
       9.1.1 The Market
       9.1.2 What Is Credit Risk?
       9.1.3 Portfolio Risk Models
   9.2 Basic Credit Risk Modeling
   9.3 Structural Models
       9.3.1 Merton's Model
       9.3.2 A Jump-diffusion Model
       9.3.3 Structural Model with Premature Default
       9.3.4 Structural Model with Stochastic Interest Rates
       9.3.5 Optimal Capital Structure - Leland's Approach
   9.4 Reduced Form Models
   9.5 Credit Derivatives
   9.6 Portfolio Credit Risk Models
   9.7 Collateralized Debt Obligations (CDOs)
       9.7.1 Introduction
       9.7.2 Review of Modelling Methods

A. Hilbert Space
B. Projections and Conditional Expectations
C. The Separating Hyperplane Theorem

Bibliography

Index

1. Derivative Background

The main focus of this book is the pricing of financial assets. Price formation in financial markets may be explained in an absolute manner in terms of fundamentals, as, e.g., in the so-called rational expectation model, or, more modestly, in a relative manner, explaining the prices of some assets in terms of other given and observable asset prices.

The second approach, which we adopt, is based on the concept of arbitrage. This remarkably simple concept is independent of the beliefs and tastes (preferences) of the actors in the financial market. The basic assumption simply states that all participants in the market prefer more to less, and that any increase in consumption opportunities must somehow be paid for. Underlying all arguments is the question: Is it possible for an investor to restructure his current portfolio (the assets currently owned) in such a way that he has to pay less today for his restructured portfolio and still has the same (or a higher) return at a future date? If such an opportunity exists, the arbitrageur can consume the difference today and has gained a free lunch.

Following our relative pricing approach, we think of financial assets as specific mixtures of some fundamental building blocks. A key observation will be that the economics involved in the relative pricing lead to linearity of the price formation. Consequently, if we are able to extract the prices of these fundamental building blocks from the prices of the financial assets traded in the market, we can create and price new assets simply by choosing new mixtures of the building blocks. It is this special feature of financial asset pricing that allows the use of modern martingale-based probability theory (and made the subject so special to us).

We will review in this chapter the relevant background for the pricing theory in financial markets. We start by describing financial markets, the actors in them and the financial assets traded there. After clarifying the fundamental economic building blocks we come to the key concept of arbitrage. We introduce the general technique of arbitrage pricing and finally specify our first model of a financial market.


1.1 Financial Markets and Instruments

This book is on the risk-neutral (probabilistic) pricing of derivative securities. In practitioner's terms a 'derivative security' is a security whose value depends on the values of other more basic underlying securities; cf. Hull (1999), p. 1. We adopt a more precise academic definition, in the spirit of Ingersoll (1986):

Definition 1.1.1. A derivative security, or contingent claim, is a financial contract whose value at expiration date $T$ (more briefly, expiry) is determined exactly by the price (or prices within a prespecified time-interval) of the underlying financial assets (or instruments) at time $T$ (within the time interval $[0, T]$).

We refer to the underlying assets below simply as 'the underlying'. This section provides the institutional background on derivative securities, the main groups of underlying assets, the markets where derivative securities are traded and the financial agents involved in these activities. As our focus is on (probabilistic) models and not institutional considerations, we refer the reader to excellent sources describing the institutions, such as Davis (1994), Edwards and Ma (1992) and Kolb (1991).

1.1.1 Derivative Instruments

Derivative securities can be grouped under three general headings: Options, Forwards and Futures, and Swaps. Throughout this text we will mainly deal with options, although our pricing techniques may be readily applied to forwards, futures and swaps as well.

Options. An option is a financial instrument giving one the right but not the obligation to make a specified transaction at (or by) a specified date at a specified price. Call options give one the right to buy. Put options give one the right to sell. European options give one the right to buy/sell on the specified date, the expiry date, on which the option expires or matures. American options give one the right to buy/sell at any time prior to or at expiry.

Over-the-counter (OTC) options have long been negotiated by a broker between a buyer and a seller. In 1973 (the year of the Black-Scholes formula, perhaps the central result of the subject), the Chicago Board Options Exchange (CBOE) began trading in options on some stocks. Since then, the growth of options has been explosive. Options are now traded on all the major world exchanges, in enormous volumes. Risk magazine (12/97) estimated $35 trillion as the gross figure for worldwide derivatives markets in 1996. By contrast, the Financial Times of 7 October 2002 (Special Report on Derivatives) gives the interest rate and currency derivatives volume as $83 trillion, an indication of the rate of growth in recent years!

The simplest call and put options are now so standard they are called vanilla options. Many kinds of options now exist, including so-called exotic options. Types include Asian options, which depend on the average price over a period; lookback options, which depend on the maximum or minimum price over a period; and barrier options, which depend on some price level being attained or not.

Terminology. The asset to which the option refers is called the underlying asset or the underlying. The price at which the parties agree to buy/sell the underlying, on/by the expiry date (if exercised), is called the exercise price or strike price. We shall usually use $K$ for the strike price, time $t = 0$ for the initial time (when the contract between the buyer and the seller of the option is struck) and time $t = T$ for the expiry or final time.

Consider, say, a European call option, with strike price $K$; write $S(t)$ for the value (or price) of the underlying at time $t$. If $S(t) > K$, the option is in the money; if $S(t) = K$, the option is said to be at the money; and if $S(t) < K$, the option is out of the money. The payoff from the option is

\[ S(T) - K \ \text{ if } S(T) > K, \quad \text{and } 0 \text{ otherwise}, \]

more briefly written as $[S(T) - K]^+$. Taking into account the initial payment of an investor, one obtains the profit diagram below.

Fig. 1.1. Profit diagram for a European call (vertical axis: profit)
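The terminology above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names, the strike of 100 and the premium of 5 are hypothetical choices, and the profit is computed as payoff minus initial payment, ignoring any interest on the premium (an assumption made here for simplicity).

```python
def call_payoff(s_T, strike):
    """Payoff [S(T) - K]^+ of a European call at expiry."""
    return max(s_T - strike, 0.0)


def call_profit(s_T, strike, premium):
    """Payoff minus the initial payment (interest on the premium is ignored here)."""
    return call_payoff(s_T, strike) - premium


def moneyness(s_t, strike):
    """Classify a call as in, at, or out of the money at the current price S(t)."""
    if s_t > strike:
        return "in the money"
    if s_t == strike:
        return "at the money"
    return "out of the money"


if __name__ == "__main__":
    K, premium = 100.0, 5.0  # hypothetical strike and initial payment
    for s in (80.0, 100.0, 105.0, 120.0):
        print(s, moneyness(s, K), call_payoff(s, K), call_profit(s, K, premium))
```

Plotting `call_profit` against the terminal price gives the familiar shape of the profit diagram: a flat loss equal to the premium up to the strike, then a line of slope one.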

Forwards. A forward contract is an agreement to buy or sell an asset $S$ at a certain future date $T$ for a certain price $K$. The agent who agrees to buy the underlying asset is said to have a long position; the other agent assumes a short position. The settlement date is called the delivery date, and the specified price is referred to as the delivery price. The forward price $f(t, T)$ is the delivery price that would make the contract have zero value at time $t$. At the time the contract is set up, $t = 0$, the forward price therefore equals the delivery price, hence $f(0, T) = K$. The forward prices $f(t, T)$ need not (and will not) necessarily be equal to the delivery price $K$ during the lifetime of the contract. The payoff from a long position in a forward contract on one unit of an asset with price $S(T)$ at the maturity of the contract is

\[ S(T) - K. \]
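As a minimal sketch of this payoff formula, the following Python evaluates the long forward payoff next to the payoff of a call with the same strike. The function names and the numbers (delivery price 100, a few terminal prices) are hypothetical and serve only as an illustration.

```python
def long_forward_payoff(s_T, delivery_price):
    """Payoff S(T) - K of a long forward position; it can be negative."""
    return s_T - delivery_price


def call_payoff(s_T, strike):
    """Payoff [S(T) - K]^+ of a European call; it is never negative."""
    return max(s_T - strike, 0.0)


if __name__ == "__main__":
    K = 100.0  # hypothetical delivery price / strike
    for s in (80.0, 100.0, 120.0):
        print(s, long_forward_payoff(s, K), call_payoff(s, K))
```

The two payoffs agree whenever the terminal price is at or above $K$ and differ only below it, where the forward payoff becomes negative.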

Compared with a call option with the same maturity and strike price $K$, we see that the investor now faces a downside risk, too. He has the obligation to buy the asset for price $K$.

Swaps. A swap is an agreement whereby two parties undertake to exchange, at known dates in the future, various financial assets (or cash flows) according to a prearranged formula that depends on the value of one or more underlying assets. Examples are currency swaps (exchange of currencies) and interest-rate swaps (exchange of fixed for floating set of interest payments).

1.1.2 Underlying Securities

Stocks. The basis of modern economic life - or of the capitalist system - is the limited liability company (UK: & Co. Ltd, now plc - public limited company), the corporation (US: Inc.), 'die Aktiengesellschaft' (Germany: AG). Such companies are owned by their shareholders; the shares provide partial ownership of the company, pro rata with investment, and have value, reflecting both the value of the company's (real) assets and the earning power of the company's dividends. With publicly quoted companies, shares are quoted and traded on the Stock Exchange. Stock is the generic term for assets held in the form of shares.

Interest Rates. The value of some financial assets depends solely on the level of interest rates (or yields), e.g. Treasury (T-) notes, T-bills, T-bonds, municipal and corporate bonds. These are fixed-income securities by which national, state and local governments and large companies partially finance their economic activity. Fixed-income securities require the payment of interest in the form of a fixed amount of money at predetermined points in time, as well as repayment of the principal at maturity of the security. Interest rates themselves are notional assets, which cannot be delivered. Hedging exposure to interest rates is more complicated than hedging exposure to the price movements of a certain stock. A whole term structure is necessary for a full description of the level of interest rates, and for hedging purposes one must clarify the nature of the exposure carefully. We will discuss the subject of modeling the term structure of interest rates in Chapter 8.

Currencies. A currency is the denomination of the national units of payment (money) and as such is a financial asset. The end of fixed exchange rates and the adoption of floating exchange rates resulted in a sharp increase in exchange rate volatility. International trade, and economic activity involving it, such as most manufacturing industry, involves dealing with more than one currency. A company may wish to hedge adverse movements of foreign currencies and in doing so use derivative instruments. See, for example, the account of the hedging problems British Steel faced as a result of the sharp increase in the pound sterling in 1996/97, Rennocks (1997).

Indexes. An index tracks the value of a (hypothetical) basket of stocks (FT-SE 100, S&P-500, DAX), bonds (REX), and so on. Again, these are not assets themselves. Derivative instruments on indexes may be used for hedging if no derivative instruments on the particular asset in question (a stock, a bond, a commodity) are available and if the correlation in movement between the index and the asset is significant. Furthermore, institutional funds (such as pension funds, mutual funds etc.), which manage large diversified stock portfolios, try to mimic particular stock indexes and use derivatives on stock indexes as a portfolio management tool. On the other hand, a speculator may wish to bet on a certain overall development in a market without exposing him/herself to a particular asset.

A new kind of index was recently generated with the Index of Catastrophe Losses (CAT-Index) by the Chicago Board of Trade (CBOT). The growing number of huge natural disasters (such as hurricane Andrew 1992, the Kobe earthquake 1995 etc.) has led the insurance industry to try to find new ways of increasing its capacity to carry risks. The CBOT tried to capitalize on this problem by launching a market in insurance derivatives. Investors have been offered options on the CAT-Index, thereby taking in effect the position of traditional reinsurance.

Derivatives are themselves assets - they are traded, have value etc. - and so can be used as underlying assets for new contingent claims: options on futures, options on baskets of options etc. These developments give rise to so-called exotic options, demanding a sophisticated mathematical machinery to handle them.

1.1.3 Markets

Financial derivatives are basically traded in two ways: on organized exchanges and over-the-counter (OTC). Organized exchanges are subject to regulatory rules, require a certain degree of standardization of the traded instruments (strike price, maturity dates, size of contract etc.) and have a physical location at which trade takes place. Examples are the Chicago Board Options Exchange (CBOE), which coincidentally opened in April 1973, the same year in which the seminal contributions on option pricing by Black and Scholes (1973) and Merton (1973) were published, the London International Financial Futures Exchange (LIFFE) and the Deutsche Terminbörse (DTB).

OTC trading takes place via computers and phones between various commercial and investment banks (leading players include institutions such as Bankers Trust, Goldman Sachs - where Fischer Black worked -, Citibank, Chase Manhattan and Deutsche Bank).

Due to the growing sophistication of investors boosting demand for increasingly complicated, made-to-measure products, the OTC market volume is currently growing at a much faster pace than trade on most exchanges.

1.1.4 Types of Traders

We can classify the traders of derivative securities in three different classes.

Hedgers. Successful companies concentrate on economic activities in which they do best. They use the market to insure themselves against adverse movements of prices, currencies, interest rates etc. Hedging is an attempt to reduce exposure to risk a company already faces. Shorter Oxford English Dictionary (OED): Hedge: 'trans. To cover oneself against loss on (a bet etc.) by betting, etc., on the other side. Also fig. 1672.'

Speculators. Speculators want to take a position in the market - they take the opposite position to hedgers. Indeed, speculation is needed to make hedging possible, in that a hedger, wishing to lay off risk, cannot do so unless someone is willing to take it on.

In speculation, available funds are invested opportunistically in the hope of making a profit: the underlying itself is irrelevant to the investor (speculator), who is only interested in the potential for possible profit that trade involving it may present. Hedging, by contrast, is typically engaged in by companies who have to deal habitually in intrinsically risky assets such as foreign exchange next year, commodities next year, etc. They may prefer to forgo the chance to make exceptional windfall profits when future uncertainty works to their advantage by protecting themselves against exceptional loss. This would serve to protect their economic base (trade in commodities, or manufacture of products using these as raw materials), and also enable them to focus their effort in their chosen area of trade or manufacture. For speculators, on the other hand, it is the market (forex, commodities or whatever) itself that is their main forum of economic activity.

Arbitrageurs. Arbitrageurs try to lock in riskless profit by simultaneously entering into transactions in two or more markets.
The very existence of arbitrageurs means that there can only be very small arbitrage opportunities in the prices quoted in most financial markets. The underlying concept of this book is the absence of arbitrage opportunities (cf. §1.2).

1.1.5 Modeling Assumptions

Contingent Claim Pricing. The fundamental problem in the mathematics of financial derivatives is that of pricing. The modern theory began in 1973 with the seminal Black-Scholes theory of option pricing, Black and Scholes (1973), and Merton's extensions of this theory, Merton (1973).

To expose the relevant features, we start by discussing contingent claim pricing in the simplest (idealized) case and impose the following set of assumptions on the financial markets (we will relax these assumptions subsequently).

No market frictions     No transaction costs, no bid/ask spread, no taxes, no margin requirements, no restrictions on short sales
No default risk         Implying same interest for borrowing and lending
Competitive markets     Market participants act as price takers
Rational agents         Market participants prefer more to less
No arbitrage

Table 1.1. General assumptions

All real markets involve frictions; this assumption is made purely for simplicity. We develop the theory of an ideal - frictionless - market so as to focus on the irreducible essentials of the theory and as a first-order approximation to reality. Understanding frictionless markets is also a necessary step towards understanding markets with frictions.

The risk of failure of a company - bankruptcy - is inescapably present in its economic activity: death is part of life, for companies as for individuals. Those risks also appear at the national level: quite apart from war, or economic collapse resulting from war, recent decades have seen default of interest payments of international debt, or the threat of it. We ignore default risk for simplicity while developing understanding of the principal aspects (for recent overviews on the subject we refer the reader to Jameson (1995), Madan (1998)).

We assume financial agents to be price takers, not price makers. This implies that even large amounts of trading in a security by one agent do not influence the security's price. Hence agents can buy or sell as much of any security as they wish without changing the security's price.

To assume that market participants prefer more to less is a very weak assumption on the preferences of market participants. Apart from this we will develop a preference-free theory.

The relaxation of these assumptions is subject to ongoing research and we will include comments on this in the text.

We want to mention the special character of the no-arbitrage assumption. If we developed a theoretical price of a financial derivative under our assumptions and this price did not coincide with the price observed, we would take this as an arbitrage opportunity in our model and go on to explore the consequences. This might lead to a relaxation of one of the other assumptions and a restart of the procedure, again with no-arbitrage assumed. The no-arbitrage assumption thus has a special status that the others do not. It is the basis for the arbitrage pricing technique that we shall develop, and we discuss it in more detail below.

1.2 Arbitrage

We now turn in detail to the concept of arbitrage, which lies at the centre of the relative pricing theory. This approach works under very weak assumptions. We do not have to impose any assumptions on the tastes (preferences) and beliefs of market participants. The economic agents may be heterogeneous with respect to their preferences for consumption over time and with respect to their expectations about future states of the world. All we assume is that they prefer more to less, or more precisely, an increase in consumption without any costs will always be accepted.

The principle of arbitrage in its broadest sense is given by the following quotation from OED: '3 [Comm.]. The traffic in Bills of Exchange drawn on sundry places, and bought or sold in sight of the daily quotations of rates in the several markets. Also, the similar traffic in Stocks. 1881.'

Used in this broad sense, the term covers financial activity of many kinds, including trade in options, futures and foreign exchange. However, the term arbitrage is nowadays also used in a narrower and more technical sense. Financial markets involve both riskless (bank account) and risky (stocks etc.) assets. To the investor, the only point of exposing oneself to risk is the opportunity, or possibility, of realizing a greater profit than the riskless procedure of putting all one's money in the bank (the mathematics of which - compound interest - does not require a textbook treatment at this level). Generally speaking, the greater the risk, the greater the return required to make investment an attractive enough prospect to attract funds. Thus, for instance, a clearing bank lends to companies at higher rates than it pays to its account holders. The companies' trading activities involve risk; the bank tries to spread the risk over a range of different loans, and makes its money on the difference between high/risky and low/riskless interest rates.
The essence of the technical sense of arbitrage is that it should not be possible to guarantee a profit without exposure to risk. Were it possible to do so, arbitrageurs (we use the French spelling, as is customary) would do so, in unlimited quantity, using the market as a 'money-pump' to extract arbitrarily large quantities of riskless profit. This would, for instance, make it impossible for the market to be in equilibrium. We shall restrict ourselves to markets in equilibrium for simplicity - so we must restrict ourselves to markets without arbitrage opportunities.

The above makes it clear that a market with arbitrage opportunities would be a disorderly market - too disorderly to model. The remarkable thing is the converse. It turns out that the minimal requirement of absence of arbitrage opportunities is enough to allow one to build a model of a financial market that - while admittedly idealized - is realistic enough both to provide real insight and to handle the mathematics necessary to price standard contingent claims. We shall see that arbitrage arguments suffice to determine prices - the arbitrage pricing technique. For an accessible treatment rather different to ours, see e.g. Allingham (1991).

To explain the fundamental arguments of the arbitrage pricing technique we use the following:

Example. Consider an investor who acts in a market in which only three financial assets are traded: (riskless) bonds $B$ (bank account), stocks $S$ and European call options $C$ with strike $K = 1$ on the stock. The investor may invest today, time $t = 0$, in all three assets, leave his investment until time $t = T$ and get his returns back then (we assume the option expires at $t = T$ also). We assume the current £ prices of the financial assets are given by

\[ B(0) = 1, \quad S(0) = 1, \quad C(0) = 0.2, \]

and that at $t = T$ there can be only two states of the world: an up-state with £ prices

\[ B(T, u) = 1.25, \quad S(T, u) = 1.75, \quad \text{and therefore } C(T, u) = 0.75, \]

and a down-state with £ prices

\[ B(T, d) = 1.25, \quad S(T, d) = 0.75, \quad \text{and therefore } C(T, d) = 0. \]

Now our investor has a starting capital of £25, and divides it as in Table 1.2 below (we call such a division a portfolio). Depending on the state of the world at time $t = T$, this portfolio will give the £ return shown in Table 1.3. Can the investor do better? Let us consider the restructured portfolio of Table 1.4. This portfolio requires only an investment of £24.6. We compute its return in the different possible future states (Table 1.5). We see that this portfolio generates the same time $t = T$ return while costing only £24.6 now, a saving of £0.4 against the first portfolio. So the investor should use the second portfolio and have a free lunch today!

Financial asset    Number of    Total amount in £
Bond               10           10
Stock              10           10
Call               25           5

Table 1.2. Original portfolio

State of the world    Bond     Stock    Call     Total
Up                    12.5     17.5     18.75    48.75
Down                  12.5     7.5      0        20

Table 1.3. Return of original portfolio

Financial asset    Number of    Total amount in £
Bond               11.8         11.8
Stock              7            7
Call               29           5.8

Table 1.4. Restructured portfolio

State of the world    Bond     Stock    Call     Total
Up                    14.75    12.25    21.75    48.75
Down                  14.75    5.25     0        20

Table 1.5. Return of the restructured portfolio
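The figures in Tables 1.2 to 1.5 can be checked mechanically. The short Python sketch below recomputes the cost today and the return in each state for both portfolios from the prices of the example; the dictionary layout and function names are our own illustrative choices, not notation from the text.

```python
# Prices of the example: today (t = 0) and in the two states at t = T.
PRICES_0 = {"bond": 1.0, "stock": 1.0, "call": 0.2}
PRICES_T = {
    "up": {"bond": 1.25, "stock": 1.75, "call": 0.75},
    "down": {"bond": 1.25, "stock": 0.75, "call": 0.0},
}


def cost(portfolio):
    """Amount in pounds needed today to buy the portfolio."""
    return sum(units * PRICES_0[asset] for asset, units in portfolio.items())


def payoff(portfolio, state):
    """Value in pounds of the portfolio at t = T in the given state."""
    return sum(units * PRICES_T[state][asset] for asset, units in portfolio.items())


original = {"bond": 10, "stock": 10, "call": 25}        # Table 1.2
restructured = {"bond": 11.8, "stock": 7, "call": 29}   # Table 1.4

if __name__ == "__main__":
    for name, pf in (("original", original), ("restructured", restructured)):
        returns = {state: round(payoff(pf, state), 2) for state in PRICES_T}
        print(name, "cost:", round(cost(pf), 2), "returns:", returns)
```

Both portfolios return 48.75 in the up-state and 20 in the down-state, while their costs today differ by 0.4; rounding is used only to suppress floating-point noise.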

In the above example the investor was able to restructure his portfolio, reducing the current (time $t = 0$) expenses without changing the return at the future date $t = T$ in both possible states of the world. So there is an arbitrage possibility in the above market situation, and the prices quoted are not arbitrage (or market) prices. If we regard (as we shall do) the prices of the bond and the stock (our underlying) as given, the option must be mispriced. In this book we will develop models of financial markets (with different degrees of sophistication) which will allow us to find methods to avoid (or to spot) such pricing errors.

For the time being, let us have a closer look at the differences between portfolio 1, consisting of 10 bonds, 10 stocks and 25 call options, in short (10, 10, 25), and portfolio 2, of the form (11.8, 7, 29). The difference (from the point of view of portfolio 1, say) is the following portfolio, D: (-1.8, 3, -4). Sell short three stocks (see below), buy four options and put £1.8 in your bank account. The left-over is exactly the £0.4 of the example. But what is the effect of doing that? Let us consider the consequences in the possible states of the world. From Table 1.6 below, we see in both cases that the effects of the different positions of the portfolio offset each other. But clearly the portfolio generates an income at $t = 0$ and is therefore itself an arbitrage opportunity.

If we only look at the position in bonds and stocks, we can say that this position covers us against possible price movements of the option, i.e. having £1.8 in your bank account and being three stocks short has the same time $t = T$ effects as having four call options outstanding against us. We say that the bond/stock position is a hedge against the position in options.

World is in state up                  World is in state down
Exercise option           3           Option is worthless       0
Buy 3 stocks at 1.75     -5.25        Buy 3 stocks at 0.75     -2.25
Sell bond                 2.25        Sell bond                 2.25
Balance                   0           Balance                   0

Table 1.6. Difference portfolio
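The hedge described above can also be reached from the other direction: look for the bond/stock position that reproduces the call payoff in both states, and read off what it costs today. The Python sketch below does this for the numbers of the example. It is a hedged illustration rather than a method taken from the text at this point; the function names are hypothetical, and the two-equation solve relies on the assumption that only the two states of the example can occur.

```python
B0, S0 = 1.0, 1.0                      # bond and stock prices today
B_T = 1.25                             # bond price at T (same in both states)
S_T = {"up": 1.75, "down": 0.75}       # stock price at T
C_T = {"up": 0.75, "down": 0.0}        # call payoff at T (strike K = 1)


def replicate_call():
    """Solve a * B_T + b * S_T[state] = C_T[state] for both states.

    Returns (a, b): units of bond and stock reproducing one call.
    """
    b = (C_T["up"] - C_T["down"]) / (S_T["up"] - S_T["down"])
    a = (C_T["up"] - b * S_T["up"]) / B_T
    return a, b


def difference_portfolio_payoff(state):
    """Time-T value of: 1.8 bonds held, 3 stocks short, 4 calls held."""
    return 1.8 * B_T - 3 * S_T[state] + 4 * C_T[state]


if __name__ == "__main__":
    a, b = replicate_call()
    print("bonds per call:", a, "stocks per call:", b)
    print("cost of replicating one call today:", a * B0 + b * S0)
    for state in ("up", "down"):
        print(state, "balance:", round(difference_portfolio_payoff(state), 10))
```

Under these assumptions one call is reproduced by -0.45 bonds and 0.75 stocks, so four calls correspond to -1.8 bonds and 3 stocks, which is exactly the bond/stock position of the difference portfolio with the signs reversed. The replicating position costs 0.30 today, against the quoted call price of 0.2, which is one way to see that the option is mispriced.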

Let us emphasize that the above arguments were independent of the preferences and plans of the investor. They were also independent of the interpretation of $t = T$: it could be a fixed time, maybe a year from now, but it could also refer to the happening of a certain event, e.g. a stock hitting a certain level, exchange rates at a certain level etc.

1.3 Arbitrage Relationships

We will in this section use arbitrage-based arguments ( arbitrage pricing tech nique) to develop general bounds on the value of options. Such bounds, de duced from the underlying assumption that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial market models. In our analysis here we use stocks as the underlying. 1 . 3 . 1 Fundamental Determinants of Option Values

We consider the determinants of the option value in Table 1.7 below. Since we restrict ourselves to stocks not paying dividends, we don't have to consider cash dividends as another natural determinant.

Current stock price    S(t)
Strike price           K
Stock volatility       $\sigma$
Time to expiry         T - t
Interest rates         r

Table 1.7. Determinants affecting option value

We now examine the effects of the single determinants on the option prices (all other factors remaining unchanged). We saw that at expiry the only variables that mattered were the stock price S(T) and strike price K: remember the payoffs $C = (S(T) - K)^+$ and $P = (S(T) - K)^-$ $(:= \max\{K - S(T), 0\})$. Looking at the payoffs, we see that an increase in the stock price will increase (decrease) the value of a call (put) option (recall all other factors remain unchanged). The opposite happens if the strike price is increased: the price of a call (put) option will go down (up).

When we buy an option, we bet on a favourable outcome. The actual outcome is uncertain; its uncertainty is represented by a probability density; favourable outcomes are governed by the tails of the density (right or left tail for a call or a put option). An increase in volatility flattens out the density and thickens the tails, so increases the value of both call and put options. Of course, this argument again relies on the fact that we don't suffer from the more severe unfavourable outcomes (which also become more likely as volatility increases): we have the right, but not the obligation, to exercise the option.

A heuristic statement of the effects of time to expiry or interest rates is not so easy to make. In the simplest of models (no dividends, interest rates remain fixed during the period under consideration), one might argue that the longer the time to expiry the more can happen to the price of a stock. So a longer period increases the possibility of movements of the stock price and hence the value of a call (put) should be higher the more time remains before expiry. But only the owner of an American-type option can react immediately to favourable price movements, whereas the owner of a European option has to wait until expiry, and only the stock price then is relevant. Observe the contrast with volatility: an increase in volatility increases the likelihood of favourable outcomes at expiry, whereas the stock price movements before expiry may cancel themselves out. A longer time until expiry might also increase the possibility of adverse effects from which the stock price has to recover before expiry.
We see that by using purely heuristic arguments we are not able to make precise statements. One can, however, show by explicit arbitrage arguments that an increase in time to expiry leads to an increase in the value of call options as well as put options. (We should point out that in case of a dividend-paying stock the statement is not true in general for European-type options.)

To qualify the effects of the interest rate we have to consider two aspects. An increase in the interest rate tends to increase the expected growth rate in an economy, and hence the stock price tends to increase. On the other hand, the present value of any future cash flows decreases. These two effects both decrease the value of a put option, while the first effect increases the value of a call option. However, it can be shown that the first effect always dominates the second effect, so the value of a call option will increase with increasing interest rates.

The above heuristic statements, in particular the last, will be verified again in appropriate models of financial markets, see §4.5 and §6.2. We summarize in Table 1.8 the effect of an increase of one of the parameters on the value of options on stocks not paying dividends while keeping all others fixed:

Parameter (increase)    Call        Put
Stock price             Positive    Negative
Strike price            Negative    Positive
Volatility              Positive    Positive
Interest rates          Positive    Negative
Time to expiry          Positive    Positive

Table 1.8. Effects of parameters

We would like to emphasize again that these results all assume that all other variables remain fixed, which of course is not true in practice. For example stock prices tend to fall (rise) when interest rates rise (fall), and the observable effect on option prices may well be different from the effects deduced under our assumptions. Cox and Rubinstein (1985), pp. 37-39, discuss other possible determining factors of option value, such as expected rate of growth of the stock price, additional properties of stock price movements, investors' attitudes toward risk, characteristics of other assets and institutional environment (tax rules, margin requirements, transaction costs, market structure). They show that in many important circumstances the influence of these variables is marginal or even vanishing.

1.3.2 Arbitrage Bounds

We now use the principle of no-arbitrage to obtain bounds for option prices. Such bounds, deduced from the underlying assumption that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial market models. We focus on European options (puts and calls) with identical underlying (say a stock S), strike K and expiry date T. Furthermore we assume the existence of a risk-free bank account (bond) with constant interest rate r (continuously compounded) during the time interval [0, T]. We start with a fundamental relationship:

Proposition 1.3.1. We have the following put-call parity between the prices of the underlying asset S and European call and put options on stocks that pay no dividends:

$$S + P - C = K e^{-r(T-t)}. \tag{1.1}$$

Proof. Consider a portfolio consisting of one stock, one put and a short position in one call (the holder of the portfolio has written the call); write V(t) for the value of this portfolio. Then

$$V(t) = S(t) + P(t) - C(t) \quad \text{for all } t \in [0, T].$$

At expiry we have

$$V(T) = S(T) + (S(T) - K)^- - (S(T) - K)^+ = S(T) + K - S(T) = K.$$

This portfolio thus guarantees a payoff K at time T. Using the principle of no-arbitrage, the value of the portfolio must at any time t correspond to the value of a sure payoff K at T, that is $V(t) = K e^{-r(T-t)}$. □

Having established (1.1), we concentrate on European calls in the following.
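The key step of the parity proof, that the portfolio (one stock, one put, one written call) pays exactly K whatever the terminal stock price, can be checked numerically. This is a minimal sketch; the function names and the sample values are illustrative assumptions, not part of the text.

```python
import math


def call_payoff(s, k):
    """Payoff (S(T) - K)^+ of a European call."""
    return max(s - k, 0.0)


def put_payoff(s, k):
    """Payoff (S(T) - K)^- = max(K - S(T), 0) of a European put."""
    return max(k - s, 0.0)


def parity_portfolio_at_expiry(s, k):
    """Value at T of: long one stock, long one put, short one call."""
    return s + put_payoff(s, k) - call_payoff(s, k)


def parity_portfolio_value(k, r, tau):
    """No-arbitrage value of a sure payoff K due in tau = T - t."""
    return k * math.exp(-r * tau)


K = 15.0  # illustrative strike
terminal_values = [parity_portfolio_at_expiry(s, K) for s in (0.0, 7.5, 15.0, 20.0, 100.0)]
print(terminal_values)
```

Every entry of `terminal_values` equals K, which is why the portfolio must be worth the discounted strike before expiry.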

Proposition 1.3.2. The following bounds hold for European call options:

$$\max\{S(t) - e^{-r(T-t)} K,\, 0\} = \left(S(t) - e^{-r(T-t)} K\right)^+ \le C(t) \le S(t).$$

Proof. That $C \ge 0$ is obvious, otherwise 'buying' the call would give a riskless profit now and no obligation later. Similarly the upper bound $C \le S$ must hold, since violation would mean that the right to buy the stock has a higher value than owning the stock. This must be false, since a stock offers additional benefits. Now from put-call parity (1.1) and the fact that $P \ge 0$ (use the same argument as above), we have

$$S(t) - K e^{-r(T-t)} = C(t) - P(t) \le C(t),$$

which proves the last assertion. □
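As a small illustration of how these bounds can be used as a plausibility test for quoted prices, the sketch below evaluates both bounds and flags a quote that falls outside them. The helper names and the numbers (S = 100, K = 90, r = 5%, one year to expiry) are hypothetical.

```python
import math


def european_call_bounds(s, k, r, tau):
    """Lower and upper no-arbitrage bounds for a European call.

    s: current stock price, k: strike, r: continuously compounded rate,
    tau: time to expiry T - t. The stock pays no dividends.
    """
    lower = max(s - k * math.exp(-r * tau), 0.0)
    upper = s
    return lower, upper


def violates_bounds(c, s, k, r, tau):
    """True if the quoted call price c lies outside the bounds."""
    lower, upper = european_call_bounds(s, k, r, tau)
    return c < lower or c > upper


lower, upper = european_call_bounds(100.0, 90.0, 0.05, 1.0)
print(lower, upper)
```

A quote of 10 for this call would violate the lower bound, whereas a quote of 20 is consistent with it.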

It is immediately clear that an American call option can never be worth less than the corresponding European call option, for the American option has the added feature of being able to be exercised at any time until the maturity date. Hence (with the obvious notation): $C_A(t) \ge C_E(t)$. The striking result we are going to show is Theorem 8.2 in Merton (1990):

Proposition 1.3.3. For a stock not paying dividends we have

$$C_A(t) = C_E(t). \tag{1.2}$$

Proof. Exercising the American call at time t < T generates the cash-flow S(t) - K. From Proposition 1.3.2 we know that the value of the call must be greater than or equal to $S(t) - K e^{-r(T-t)}$, which (for r > 0) is greater than S(t) - K. Hence selling the call would have realized a higher cash-flow and the early exercise of the call was suboptimal. □

Remark 1.3.1. Qualitatively, there are two reasons why an American call should not be exercised early:
(i) Insurance. An investor who holds a call option instead of the underlying stock is 'insured' against a fall in stock price below K, and if he exercises early, he loses this insurance.
(ii) Interest on the strike price. When the holder exercises the option, he buys the stock and pays the strike price, K. Early exercise at t < T deprives the holder of the interest on K between times t and T: the later he pays out K, the better.

We remark that an American put offers additional value compared to a European put.

1.4 Single-period Market Models

1.4.1 A Fundamental Example

We consider a one-period model, i.e. we allow trading only at t = 0 and t = T = 1 (say). Our aim is to value at t = 0 a European derivative on a stock S with maturity T.

First Idea. Model $S_T$ as a random variable on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The derivative is given by $H = f(S_T)$, i.e. it is a random variable (for a suitable function $f(\cdot)$). We could then price the derivative using some discount factor $\beta$ by using the expected value of the discounted future payoff:

$$H_0 = \mathbb{E}(\beta H). \tag{1.3}$$

Problem. How should we pick the probability measure $\mathbb{P}$? According to their preferences, investors will have different opinions about the distribution of the price $S_T$.

Black-Scholes-Merton (Ross) Approach. Use the no-arbitrage principle and construct a hedging portfolio using only known (and already priced) securities to duplicate the payoff H. We assume
1. Investors are nonsatiable, i.e. they always prefer more to less.
2. Markets do not allow arbitrage, i.e. the possibility of risk-free profits.

From the no-arbitrage principle we see: If it is possible to duplicate the payoff H of a derivative using a portfolio V of underlying (basic) securities, i.e. $H(\omega) = V(\omega)$ for all $\omega$, the price of the portfolio at t = 0 must equal the price of the derivative at t = 0.

Let us assume there are two tradeable assets:
• a risk-free bond (bank account) with B(0) = 1 and B(T) = 1, that is the interest rate r = 0 and the discount factor $\beta(t) = 1$. (In this context we use $\beta(t) = 1/B(t)$ as the discount factor.)
• a risky stock S with S(0) = 10 and two possible values at t = T:

$$S_T = \begin{cases} 20 & \text{with probability } p, \\ 7.5 & \text{with probability } 1 - p. \end{cases}$$

We call this setting a (B, S)-market. The problem is to price a European call at t = 0 with strike K = 15 and maturity T, i.e. the random payoff $H = (S(T) - K)^+$. We can evaluate the call in every possible state at t = T and see H = 5 (if S(T) = 20) with probability p and H = 0 (if S(T) = 7.5) with probability 1 - p. This is illustrated in Figure 1.2.

today:                      S_0 = 10,    B_0 = 1,   H_0 = ?
one period (up-state):      S_1 = 20,    B_1 = 1,   H_1 = max{20 - 15, 0} = 5
one period (down-state):    S_1 = 7.5,   B_1 = 1,   H_1 = max{7.5 - 15, 0} = 0

Fig. 1.2. One-period example

The key idea now is to try to find a portfolio combining bond and stock, which synthesizes the cash flow of the option. If such a portfolio exists, holding this portfolio today would be equivalent to holding the option - they would produce the same cash flow in the future. So the price of the option should be the same as the price of constructing the portfolio, otherwise investors could just restructure their holdings in the assets and obtain a risk-free profit today.

We briefly present the construction of the portfolio $\theta = (\theta_0, \theta_1)$, which in the current setting is just a simple exercise in linear algebra. If we buy $\theta_1$ stocks and invest £$\theta_0$ in the bank account, then today's value of the portfolio is

$$V(0) = \theta_0 + \theta_1 \cdot S(0).$$

In state 1 the stock price is £20 and the value of the option £5, so

$$\theta_0 + \theta_1 \cdot 20 = 5.$$

In state 2 the stock price is £7.5 and the value of the option £0, so

$$\theta_0 + \theta_1 \cdot 7.5 = 0.$$

We solve this and get $\theta_0 = -3$ and $\theta_1 = 0.4$. So the value of our portfolio at time 0 in £ is

$$V(0) = -3 B(0) + 0.4 S(0) = -3 + 0.4 \times 10 = 1.$$
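The two replication equations above can be solved in a few lines. The sketch below uses exact rational arithmetic; the function name `replicate_two_state` is a hypothetical helper, not notation from the text.

```python
from fractions import Fraction as F


def replicate_two_state(b_T, s_up, s_down, h_up, h_down):
    """Solve theta0 * b_T + theta1 * s = h in the up and the down state.

    Returns (theta0, theta1): units of bond and stock in the replicating portfolio.
    """
    theta1 = (h_up - h_down) / (s_up - s_down)
    theta0 = (h_up - theta1 * s_up) / b_T
    return theta0, theta1


# Data of the example: B(0) = B(T) = 1, S(0) = 10, S(T) in {20, 7.5}, strike 15.
B0, S0 = F(1), F(10)
theta0, theta1 = replicate_two_state(F(1), F(20), F(15, 2), F(5), F(0))
V0 = theta0 * B0 + theta1 * S0  # cost of setting up the portfolio today
print(theta0, theta1, V0)
```

Note that the probability p of the up-state never enters the computation.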

V(0) is called the no-arbitrage price. Every other price allows a riskless profit, since if the option is too cheap, buy it and finance yourself by selling short the above portfolio (i.e. sell the portfolio without possessing it and promise to deliver it at time T = 1 - this is risk-free because you own the option). If on the other hand the option is too dear, write it (i.e. sell it in the market) and cover yourself by setting up the above portfolio.

We see that the no-arbitrage price is independent of the individual preferences of the investor (given by certain probability assumptions about the future, i.e. a probability measure $\mathbb{P}$). But one can identify a special, so-called risk-neutral, probability measure $\mathbb{P}^*$, such that

$$H_0 = \mathbb{E}^*(\beta H) = p^* \cdot \beta (S_1 - K) + (1 - p^*) \cdot 0 = 1.$$

In the above example, we get from $1 = p^* \cdot 5 + (1 - p^*) \cdot 0$ that $p^* = 0.2$. This probability measure $\mathbb{P}^*$ is equivalent to $\mathbb{P}$, and the discounted stock price process, i.e. $\beta_t S_t$, t = 0, 1, follows a $\mathbb{P}^*$-martingale. In the above example, this corresponds to $S(0) = p^* S(T)^{up} + (1 - p^*) S(T)^{down}$, that is $S(0) = \mathbb{E}^*(\beta S(T))$.

We will show that the above generalizes. Indeed, we will find that the no-arbitrage condition is equivalent to the existence of an equivalent martingale measure (first fundamental theorem of asset pricing) and that the property that we can price assets using the expectation operator is equivalent to the uniqueness of the equivalent martingale measure.
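The risk-neutral probability can also be obtained directly from the martingale condition on the discounted stock price, and then used to price the call as a discounted expectation. A minimal sketch (the function name is an assumption made for illustration):

```python
from fractions import Fraction as F


def risk_neutral_prob(s0, s_up, s_down, beta=F(1)):
    """p* solving s0 = beta * (p* * s_up + (1 - p*) * s_down)."""
    return (s0 / beta - s_down) / (s_up - s_down)


p_star = risk_neutral_prob(F(10), F(20), F(15, 2))

# Price of the call with strike 15 as the P*-expectation of the discounted payoff.
beta = F(1)
H0 = beta * (p_star * 5 + (1 - p_star) * 0)
print(p_star, H0)
```

This route starts from the stock (the martingale property) rather than from the option price, and arrives at the same $p^*$ and the same price as the replication argument.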

Let us consider the construction of hedging strategies from a different perspective. Consider a one-period (B, S)-market setting with discount factor $\beta = 1$. Assume we want to replicate a derivative H (that is a random variable on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$). For each hedging strategy $\theta = (\theta_0, \theta_1)$ we have an initial value of the portfolio $V(0) = \theta_0 + \theta_1 S(0)$ and a time t = T value of $V(T) = \theta_0 + \theta_1 S(T)$. We can write $V(T) = V(0) + (V(T) - V(0))$ with $G(T) = V(T) - V(0) = \theta_1 (S(T) - S(0))$ the gains from trading. So the costs C(0) of setting up this portfolio at time t = 0 are given by C(0) = V(0), while maintaining (or achieving) a perfect hedge at t = T requires an additional capital of C(T) = H - V(T). Thus we have two possibilities for finding 'optimal' hedging strategies:
• Mean-variance hedging. Find $\theta_0$ (or alternatively V(0)) and $\theta_1$ such that

$$\mathbb{E}\left((H - V(T))^2\right) = \mathbb{E}\left(\left(H - (V(0) + \theta_1 (S(T) - S(0)))\right)^2\right) \to \min.$$

• Risk-minimal hedging. Minimize the cost from trading, i.e. an appropriate functional involving the costs C(t), t = 0, T.

In our example, mean-variance hedging corresponds to the standard linear regression problem, and so

$$\theta_1 = \frac{\mathrm{Cov}(H, S(T) - S(0))}{\mathrm{Var}(S(T) - S(0))} \quad \text{and} \quad V_0 = \mathbb{E}(H) - \theta_1 \mathbb{E}(S(T) - S(0)).$$
18


We can also calculate the optimal value of the risk functional Rmin =

Var ( H ) - (}�Var(8(T) - 8(0) ) = Var ( H ) ( I - p2 ) ,

where p is the correlation coefficient of H and 8(T) . Therefore we can't expect a perfect hedge in general. If however Ip l = 1, i.e. H is a linear function of 8(T) , a perfect hedge is possible. We call a market complete if a perfect hedge is possible for all contingent claims. 1 .4.2 A Single-period Model

We proceed to formalize and extend the above example and present in detail a simple model of a financial market. Despite its simplicity, it already has all the key features needed in the sequel (and the reader should not hesitate to come back here from more advanced chapters to see the bare concepts again). We introduce in passing a little of the terminology and notation of Chapter 4; see also Harrison and Kreps (1979). We use some elementary vocabulary from probability theory, which is explained in detail in Chapter 2.

We consider a single period model, i.e. we have two time-indexes, say t = 0, which is the current time (date), and t = T, which is the terminal date for all economic activities considered. The financial market contains d + 1 traded financial assets, whose prices at time t = 0 are denoted by the vector $S(0) \in \mathbb{R}^{d+1}$,

$$S(0) = (S_0(0), S_1(0), \ldots, S_d(0))'$$

(where ' denotes the transpose of a vector or matrix). At time T, the owner of financial asset number i receives a random payment depending on the state of the world. We model this randomness by introducing a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega| = N$ of points (each corresponding to a certain state of the world) $\omega_1, \ldots, \omega_j, \ldots, \omega_N$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$, which means that every state of the world is possible. $\mathcal{F}$ is the set of subsets of $\Omega$ (events that can happen in the world) on which $\mathbb{P}(\cdot)$ is defined (we can quantify how probable these events are), here $\mathcal{F} = \mathcal{P}(\Omega)$, the set of all subsets of $\Omega$. (In more complicated models it is not possible to define a probability measure on all subsets of the state space $\Omega$, see §2.1.) We can now write the random payment arising from financial asset i as

$$S_i(T) = (S_i(T, \omega_1), \ldots, S_i(T, \omega_j), \ldots, S_i(T, \omega_N))'.$$

At time t = 0, the agents can buy and sell financial assets. The portfolio position of an individual agent is given by a trading strategy [...]

have no common points. A statement like that naturally points to the use of a separation theorem for convex subsets, the separating hyperplane theorem (see e.g. Rockafellar (1970) for an account of such results, or Appendix A). Using such a theorem, we come to the following characterization of no arbitrage.

Theorem 1.4.1. There is no arbitrage if and only if there exists a vector $\psi \in \mathbb{R}^N$, $\psi_i > 0$ for all $1 \le i \le N$, such that

$$S \psi = S(0). \tag{1.4}$$

Proof. The implication '$\Leftarrow$' follows straightforwardly: assume that $S(T, \omega)' \varphi \ge 0$, $\omega \in \Omega$, for a vector $\varphi \in \mathbb{R}^{d+1}$. Then

$$S(0)' \varphi = (S \psi)' \varphi = \psi' S' \varphi \ge 0,$$

since $\psi_i > 0$ for all $1 \le i \le N$. So no arbitrage opportunities exist.

To show the implication '$\Rightarrow$' we use a variant of the separating hyperplane theorem. Absence of arbitrage means that $\Gamma$ and $\mathbb{R}^{N+1}_+$ have no common points. This means that $K \subset \mathbb{R}^{N+1}_+$ defined by

$$K = \left\{ z \in \mathbb{R}^{N+1}_+ : \sum_{i=0}^{N} z_i = 1 \right\}$$

and $\Gamma$ do not meet. But K is a compact and convex set, and by the separating hyperplane theorem (Appendix C), there is a vector $\lambda \in \mathbb{R}^{N+1}$ such that for all $z \in K$

$$\lambda' z > 0,$$

but for all $(x, y)' \in \Gamma$

$$\lambda_0 x + \lambda_1 y_1 + \ldots + \lambda_N y_N = 0.$$

Now choosing $z_i = 1$ successively we see that $\lambda_i > 0$, $i = 0, \ldots, N$, and hence by normalizing we get $\psi = \lambda / \lambda_0$ with $\psi_0 = 1$. Now set $x = -S(0)' \varphi$ and $y = S' \varphi$ and the claim follows. □

The vector $\psi$ is called a state-price vector. We can think of $\psi_j$ as the marginal cost of obtaining an additional unit of account in state $\omega_j$. We can now reformulate the above statement to:

There is no arbitrage if and only if there exists a state-price vector.

Using a further normalization, we can clarify the link to our probabilistic setting. Given a state-price vector $\psi = (\psi_1, \ldots, \psi_N)$, we set $\psi_0 = \psi_1 + \ldots + \psi_N$ and for any state $\omega_j$ write $q_j = \psi_j / \psi_0$. We can now view $(q_1, \ldots, q_N)$ as probabilities and define a new probability measure on $\Omega$ by $Q(\{\omega_j\}) = q_j$, $j = 1, \ldots, N$. Using this probability measure, we see that for each asset i we have the relation

$$\frac{S_i(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_i(T, \omega_j) = \mathbb{E}_Q(S_i(T)).$$

Hence the normalized price of the financial security i is just its expected payoff under some specially chosen 'risk-neutral' probabilities. Observe that we didn't make any use of the specific probability measure $\mathbb{P}$ in our given probability space.

So far we have not specified anything about the denomination of prices. From a technical point of view, we could choose any asset i as long as its price vector $(S_i(0), S_i(T, \omega_1), \ldots, S_i(T, \omega_N))'$ only contains positive entries, and express all other prices in units of this asset. We say that we use this asset as numeraire. Let us emphasize again that arbitrage opportunities do not depend on the chosen numeraire. It turns out that appropriate choice of the numeraire facilitates the probability-theoretic analysis in complex settings, and we will discuss the choice of the numeraire in detail later on.

For simplicity, let us assume that asset 0 is a riskless bond paying one unit in all states $\omega \in \Omega$ at time T. This means that $S_0(T, \omega) = 1$ in all states of the world $\omega \in \Omega$. By the above analysis we must have

$$\frac{S_0(0)}{\psi_0} = \mathbb{E}_Q(S_0(T)) = 1,$$

and $\psi_0$ is the discount on riskless borrowing. Introducing an interest rate r, we must have $S_0(0) = \psi_0 = (1 + r)^{-T}$. We can now express the price of asset i at time t = 0 as

$$S_i(0) = \sum_{j=1}^{N} q_j \frac{S_i(T, \omega_j)}{(1 + r)^T} = \mathbb{E}_Q\left(\frac{S_i(T)}{(1 + r)^T}\right).$$

We rewrite this as

$$\frac{S_i(0)}{(1 + r)^0} = \mathbb{E}_Q\left(\frac{S_i(T)}{(1 + r)^T}\right).$$

In the language of probability theory, we just have shown that the processes $S_i(t)/(1 + r)^t$, t = 0, T, are Q-martingales. (Martingales are the probabilist's way of describing fair games: see Chapter 3.) It is important to notice that under the given probability measure $\mathbb{P}$ (which reflects an individual agent's belief or the market's belief) the processes $S_i(t)/(1 + r)^t$, t = 0, T, generally do not form $\mathbb{P}$-martingales. We use this to shed light on the relationship of the probability measures $\mathbb{P}$ and Q. Since $Q(\{\omega\}) > 0$ for all $\omega \in \Omega$, the probability measures $\mathbb{P}$ and Q are equivalent, and (see Chapters 2 and 3) because of the argument above we call Q an equivalent martingale measure. So we arrived at yet another characterization of arbitrage:

There is no arbitrage if and only if there exists an equivalent martingale measure.
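The chain from state prices to an equivalent martingale measure can be traced numerically on the two-state (B, S)-market of §1.4.1. The sketch below assumes the payoff matrix S has one row per asset and one column per state, as the equation $S\psi = S(0)$ requires; `solve_2x2` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction as F


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


# Assumed layout: S[i][j] = S_i(T, omega_j).
S = [[F(1), F(1)],          # bond pays 1 in both states
     [F(20), F(15, 2)]]     # stock pays 20 (up) or 7.5 (down)
S_0 = [F(1), F(10)]         # time-0 prices of bond and stock

psi = solve_2x2(S, S_0)                  # state prices: S psi = S(0)
no_arbitrage = all(p > 0 for p in psi)   # strictly positive state prices
psi_0 = sum(psi)                         # discount on riskless borrowing
q = [p / psi_0 for p in psi]             # probabilities q_j = psi_j / psi_0

# Normalized time-0 price equals the Q-expected payoff for every asset.
expected_payoffs = [sum(qj * sij for qj, sij in zip(q, row)) for row in S]
print(psi, q, expected_payoffs)
```

Here $\psi_0 = 1$ because the interest rate is zero, so the state prices are already the risk-neutral probabilities.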

We also see that risk-neutral pricing corresponds to using the expectation operator with respect to an equivalent martingale measure. This concept lies at the heart of stochastic (mathematical) finance and will be the golden thread (or roter Faden) throughout this book.

We now know how the given prices of our (d + 1) financial assets should be related in order to exclude arbitrage opportunities, but how should we price a newly introduced financial instrument? We can represent this financial instrument by its random payments $\delta(T) = (\delta(T, \omega_1), \ldots, \delta(T, \omega_j), \ldots, \delta(T, \omega_N))'$ (observe that $\delta(T)$ is a vector in $\mathbb{R}^N$) at time t = T and ask for its price $\delta(0)$ at time t = 0. The natural idea is to use an equivalent probability measure Q and set $\delta(0) = \mathbb{E}_Q(\delta(T)/(1 + r)^T)$ (recall that all time t = 0 and time t = T prices are related in this way). Unfortunately, as we don't have a unique martingale measure in general, we cannot guarantee the uniqueness of the t = 0 price. Put another way, we know every equivalent martingale measure leads to a reasonable relative price for our newly created financial instrument, but which measure should one choose?

The easiest way out would be if there were only one equivalent martingale measure at our disposal - and surprisingly enough, the classical economic pricing theory puts us exactly in this situation! Given a set of financial assets on a market, the underlying question is whether we are able to price any new financial asset which might be introduced in the market, or equivalently whether we can replicate the cash-flow of the new asset by means of a portfolio of our original assets. If this is the case and we can replicate every new asset, the market is called complete.

In our financial market situation the question can be restated mathematically in terms of Euclidean geometry: do the vectors $S_i(T)$ span the whole $\mathbb{R}^N$? This leads to:

Theorem 1.4.2. Suppose there are no arbitrage opportunities. Then the model is complete if and only if the matrix equation

$$S' \varphi = \delta$$

has a solution $\varphi \in \mathbb{R}^{d+1}$ for any vector $\delta \in \mathbb{R}^N$.

Linear algebra immediately tells us that the above theorem means that the number of independent vectors in S' must equal the number of states in $\Omega$. In an informal way, we can say that if the financial market model contains 2 (N) states of the world at time T, it allows for 1 (N - 1) sources of randomness (if there is only one state we know the outcome). Likewise we can view the numeraire asset as risk-free and all other assets as risky. We can now restate the above characterization of completeness in an informal (but intuitive) way as:

A financial market model is complete if it contains at least as many independent risky assets as sources of randomness.

The question of completeness can be expressed equivalently in probabilistic language (to be introduced in Chapter 3) as a question of representability of the relevant random variables or whether the $\sigma$-algebra they generate is the full $\sigma$-algebra.
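The linear-algebra criterion can be tested by computing the rank of the payoff matrix and comparing it with the number of states. The following is a sketch under the assumption that payoffs are stored with one row per asset and one column per state; `rank` and `is_complete` are hypothetical helpers, and the three-state markets are made-up examples.

```python
from fractions import Fraction as F


def rank(matrix):
    """Rank of a matrix via Gauss-Jordan elimination in exact arithmetic."""
    rows = [[F(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def is_complete(payoffs):
    """payoffs[i][j] = S_i(T, omega_j); complete iff the payoffs span R^N."""
    n_states = len(payoffs[0])
    return rank(payoffs) == n_states


two_state = [[1, 1], [20, F(15, 2)]]             # bond and stock, two states
three_state = [[1, 1, 1], [20, 10, 5]]           # bond and stock, three states
with_call = three_state + [[10, 0, 0]]           # add a call with strike 10
print(is_complete(two_state), is_complete(three_state), is_complete(with_call))
```

With three states, bond and stock alone cannot span all payoffs; adding a third independent asset (here a call) restores completeness.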


If a financial market model is complete, traditional economic theory shows that there exists a unique system of prices. If there exists only one system of prices, and every equivalent martingale measure gives rise to a price system, we can only have a unique equivalent martingale measure. (We will come back to this important question in Chapters 4 and 6.)

The (arbitrage-free) market is complete if and only if there exists a unique equivalent martingale measure.

Example. We give a more formal example of a binary single-period model. We have d + 1 = 2 assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. Keeping the interest rate r = 0 we obtain the following vectors (and matrices):

$$S(0) = \begin{bmatrix} S_0(0) \\ S_1(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 150 \end{bmatrix}, \quad S_0(T) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad S_1(T) = \begin{bmatrix} 180 \\ 90 \end{bmatrix}, \quad S = \begin{bmatrix} 1 & 1 \\ 180 & 90 \end{bmatrix}.$$

We try to solve (1.4) for state prices, i.e. we try to find a vector $\psi = (\psi_1, \psi_2)'$, $\psi_i > 0$, i = 1, 2, such that

$$\begin{bmatrix} 1 \\ 150 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 180 & 90 \end{bmatrix} \begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix}.$$

Now this has a solution

$$\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 2/3 \\ 1/3 \end{bmatrix},$$

hence showing that there are no arbitrage opportunities in our market model. Furthermore, since $\psi_1 + \psi_2 = 1$ we see that we already have computed risk-neutral probabilities, and so we have found an equivalent martingale measure Q with

$$Q(\omega_1) = \frac{2}{3}, \quad Q(\omega_2) = \frac{1}{3}.$$

We now want to find out if the market is complete (or equivalently if there is a unique equivalent martingale measure). For that we introduce a new financial asset $\delta$ with random payments $\delta(T) = (\delta(T, \omega_1), \delta(T, \omega_2))'$. For the market to be complete, each such $\delta(T)$ must be in the linear span of $S_0(T)$ and $S_1(T)$. Now since $S_0(T)$ and $S_1(T)$ are linearly independent, their linear span is the whole $\mathbb{R}^2$ ($= \mathbb{R}^{|\Omega|}$) and $\delta(T)$ is indeed in the linear span. Hence we can find a replicating portfolio by solving

$$\begin{bmatrix} \delta(T, \omega_1) \\ \delta(T, \omega_2) \end{bmatrix} = \begin{bmatrix} 1 & 180 \\ 1 & 90 \end{bmatrix} \begin{bmatrix} \varphi_0 \\ \varphi_1 \end{bmatrix}.$$

Let us consider the example of a European option above. There, $\delta(T, \omega_1) = 30$, $\delta(T, \omega_2) = 0$ and the above becomes

$$\begin{bmatrix} 30 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 180 \\ 1 & 90 \end{bmatrix} \begin{bmatrix} \varphi_0 \\ \varphi_1 \end{bmatrix}$$


with solution $\varphi_0 = -30$ and $\varphi_1 = \frac{1}{3}$, telling us to borrow 30 units and buy $\frac{1}{3}$ stocks, which is exactly what we did to set up our portfolio above. Of course, an alternative way of showing market completeness is to recognize that (1.4) above admits only one solution for risk-neutral probabilities, showing the uniqueness of the martingale measure.

Example. Change of Numeraire. We choose a situation similar to the above example, i.e. we have d + 1 = 2 assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. But now we assume two risky assets (and no bond) as the financial instruments at our disposal. The price vectors are given by

$$S(0) = \begin{bmatrix} S_0(0) \\ S_1(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad S_0(T) = \begin{bmatrix} 3/4 \\ 5/4 \end{bmatrix}, \quad S_1(T) = \begin{bmatrix} 1/2 \\ 2 \end{bmatrix}, \quad S = \begin{bmatrix} 3/4 & 5/4 \\ 1/2 & 2 \end{bmatrix}.$$

We solve (1.4) and get state prices

$$\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 6/7 \\ 2/7 \end{bmatrix},$$

showing that there are no arbitrage opportunities in our market model. So we find an equivalent martingale measure Q with

$$Q(\omega_1) = \frac{3}{4}, \quad Q(\omega_2) = \frac{1}{4}.$$

Since we don't have a risk-free asset in our model, this normalization (this numeraire) is very artificial, and we shall use one of the modelled assets as a numeraire, say $S_0$. Under this normalization, we have the following asset prices (in terms of $S_0(t, \omega)$!):

$$\tilde{S}(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \tilde{S}_0(T) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \tilde{S}_1(T) = \begin{bmatrix} 2/3 \\ 8/5 \end{bmatrix}.$$

Since the model is arbitrage-free (recall that a change of numeraire doesn't affect the no-arbitrage property), we are able to find risk-neutral probabilities $\tilde{q}_1 = \frac{9}{14}$ and $\tilde{q}_2 = \frac{5}{14}$.

We now compute the prices for a call option to exchange $S_0$ for $S_1$. Define $Z(T) = \max\{S_1(T) - S_0(T), 0\}$ and the cash flow is given by

$$\begin{bmatrix} Z(T, \omega_1) \\ Z(T, \omega_2) \end{bmatrix} = \begin{bmatrix} 0 \\ 3/4 \end{bmatrix}.$$

There are no difficulties in finding the hedge portfolio $\varphi_0 = -\frac{3}{7}$ and $\varphi_1 = \frac{9}{14}$ and pricing the option as $Z_0 = \frac{3}{14}$.

We want to point out the following observation. Using $S_0$ as numeraire, we naturally write

$$\frac{Z(T, \omega)}{S_0(T, \omega)} = \max\left\{ \frac{S_1(T, \omega)}{S_0(T, \omega)} - 1,\, 0 \right\}$$

and see that this seemingly complicated option is (by use of the appropriate numeraire) equivalent to

$$\tilde{Z}(T, \omega) = \max\{\tilde{S}_1(T, \omega) - 1,\, 0\},$$

a European call on the asset $\tilde{S}_1$.
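The numbers of the change-of-numeraire example can be reproduced along three routes: state prices, expectation under the measure associated with the numeraire $S_0$, and replication. The sketch below is illustrative only; `solve_2x2` and all variable names are assumptions made for this check.

```python
from fractions import Fraction as F


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


S0_T = [F(3, 4), F(5, 4)]   # payoffs of asset 0 in states omega_1, omega_2
S1_T = [F(1, 2), F(2)]      # payoffs of asset 1
S_now = [F(1), F(1)]        # time-0 prices of both assets

# Exchange option: Z(T) = max(S_1(T) - S_0(T), 0).
Z_T = [max(s1 - s0, F(0)) for s0, s1 in zip(S0_T, S1_T)]

# Route 1: state prices solving S psi = S(0).
psi = solve_2x2([S0_T, S1_T], S_now)
price_state = sum(p * z for p, z in zip(psi, Z_T))

# Route 2: S_0 as numeraire; q~ makes S_1 / S_0 a martingale.
S1_tilde = [s1 / s0 for s0, s1 in zip(S0_T, S1_T)]
q_tilde = solve_2x2([[F(1), F(1)], S1_tilde], [F(1), S_now[1] / S_now[0]])
price_numeraire = S_now[0] * sum(q * z / s0 for q, z, s0 in zip(q_tilde, Z_T, S0_T))

# Route 3: replicating portfolio solving S' phi = Z(T).
phi = solve_2x2([[S0_T[0], S1_T[0]], [S0_T[1], S1_T[1]]], Z_T)
price_hedge = phi[0] * S_now[0] + phi[1] * S_now[1]

print(psi, q_tilde, phi, price_state, price_numeraire, price_hedge)
```

All three routes give the same price, which reflects the fact that a change of numeraire changes the measure but not the arbitrage-free price.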

1.4.3 A Few Financial-economic Considerations

The underlying principle for modelling economic behaviour of investors (or economic agents in general) is the maximization of expected utility, that is one assumes that agents have a utility function $U(\cdot)$ and base economic decisions on expected utility considerations. For instance, assuming a one-period model, an economic agent might have a utility function over current (t = 0) and future (t = T) values of consumption

$$U(c_0, c_T) = u(c_0) + \mathbb{E}(\beta u(c_T)), \tag{1.5}$$

where $c_t$ is consumption at time t and $u(\cdot)$ is a standard utility function expressing
• nonsatiation - investors prefer more to less; u is increasing;
• risk aversion - investors reject an actuarially fair gamble; u is concave;
• (maybe) decreasing absolute risk aversion and constant relative risk aversion.

Typical examples are power utility $u(x) = (x^\gamma - 1)/\gamma$, log utility $u(x) = \log(x)$ or quadratic utility $u(x) = x^2 + dx$ (for which only the first two properties are true for certain arguments).

Assume such an investor is offered at t = 0 at a price p a random payoff X at t = T. How much will he buy? Denote by $\xi$ the amount of the asset he chooses to buy and by $e_\tau$, $\tau = 0, T$, his original consumption. Thus, his problem is

$$\max_{\xi} \left[ u(c_0) + \mathbb{E}[\beta u(c_T)] \right]$$

such that

$$c_0 = e_0 - p \xi \quad \text{and} \quad c_T = e_T + X \xi.$$

Substituting the constraints into the objective function and differentiating with respect to $\xi$, we get the first-order condition

$$p\, u'(c_0) = \mathbb{E}[\beta u'(c_T) X] \quad \Longleftrightarrow \quad p = \mathbb{E}\left[ \beta \frac{u'(c_T)}{u'(c_0)} X \right].$$

The investor buys or sells more of the asset until this first-order condition is satisfied. If we use the stochastic discount factor

$$m = \beta \frac{u'(c_T)}{u'(c_0)},$$

we obtain the central equation

$$p = \mathbb{E}(mX). \tag{1.6}$$

We can use (under regularity conditions) the random variable m to perform a change of measure, i.e. define a probability measure $\mathbb{P}^*$ using

$$\mathbb{P}^*(A) = \mathbb{E}^*(\mathbf{1}_A) = \mathbb{E}(m \mathbf{1}_A).$$

We write (1.6) under measure $\mathbb{P}^*$ and get

$$p = \mathbb{E}^*(X).$$

Returning to the initial pricing equation, we see that under $\mathbb{P}^*$ the investor has the utility function u(x) = x. Such an investor is called risk-neutral, and consequently one often calls the corresponding measure a risk-neutral measure. An excellent discussion of these issues (and further much deeper results) is given in Cochrane (2001).
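To make the first-order condition and the central equation (1.6) concrete, the sketch below solves the investor's problem numerically for log utility in a two-state world. Every number (discount factor, probabilities, payoff, price, endowments) and every name is a hypothetical choice made for illustration, and bisection is simply one convenient way to find the root.

```python
BETA = 0.95            # assumed subjective discount factor
PROBS = [0.5, 0.5]     # assumed real-world probabilities of the two states
X = [1.5, 0.8]         # assumed payoff of the offered asset in each state
PRICE = 1.0            # assumed price p of the asset at t = 0
E0, ET = 1.0, 1.0      # assumed original consumption e_0 and e_T


def marginal_utility(c):
    """u'(c) for log utility u(c) = log(c)."""
    return 1.0 / c


def foc_gap(xi):
    """E[beta u'(c_T) X] - p u'(c_0) at holding xi; zero at the optimum."""
    c0 = E0 - PRICE * xi
    rhs = sum(p * BETA * marginal_utility(ET + x * xi) * x for p, x in zip(PROBS, X))
    return rhs - PRICE * marginal_utility(c0)


def solve_demand(lo=0.0, hi=0.999, iterations=200):
    """Bisection for the optimal xi; assumes foc_gap(lo) > 0 > foc_gap(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xi = solve_demand()
c0 = E0 - PRICE * xi
# Stochastic discount factor m = beta u'(c_T) / u'(c_0), state by state.
m = [BETA * marginal_utility(ET + x * xi) / marginal_utility(c0) for x in X]
price_from_sdf = sum(p * mj * x for p, mj, x in zip(PROBS, m, X))
print(xi, m, price_from_sdf)
```

At the optimal holding the stochastic discount factor reprices the asset, i.e. `price_from_sdf` agrees with `PRICE` up to rounding.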

Exercises

1.1 Draw payoff diagrams of the following portfolios:
1. A vertical spread: one option is bought and another sold, both on the same underlying stock and with same expiration date, but with different strikes.
2. A horizontal spread: one option is bought and another sold, both on the same underlying stock and with same strike, but with different expiration dates.
3. A straddle: a put and a call on the same underlying stock, with same strike and same expiration date.

1.2 A bear spread is created by buying a call with strike price $K_1$ and selling a call with strike price $K_2$. Both calls are on the same underlying stock and have the same expiry date T, but $K_2$ is greater than $K_1$ (i.e. $K_2 > K_1$).
1. Draw the payoff diagram of a bear spread.
2. What are the market expectations of a (rational) investor setting up a bear spread?
3. How does a change in the stock's volatility affect the value of the position?
4. Construct a bear spread using put options. Compare the initial costs with the initial costs of a bear spread using calls.

1.3 Show the following arbitrage bounds for call options.
1. Consider call options, which are identical (same underlying, same expiry date) except for the strike price. The following relations hold:
(a) $C(K_1) \ge C(K_2)$, if $K_2 \ge K_1$;
(b) $e^{-rT}(K_2 - K_1) \ge C(K_1) - C(K_2)$, if $K_2 \ge K_1$;
(c) $\lambda C(K_1) + (1 - \lambda) C(K_2) \ge C(\lambda K_1 + (1 - \lambda) K_2)$, if $K_2 \ge K_1$ and $0 \le \lambda \le 1$.
2. Consider call options, which are identical (same underlying, same strike price) except for the date of expiry, then $C(T_1) \ge C(T_2)$, if $T_1 \ge T_2$.

1.4 Consider the example at the end of §1.4:
1. Solve (1.4) for state prices.
2. Compute risk-neutral probabilities with numeraire $S_0$.
3. Show that the model leads to a complete market.
4. Price a call on $S_0$ and a call on $S_1$ by computing the appropriate expectations and by constructing a hedge portfolio.

1.5 Consider a one-period financial market model consisting of a bank account B and a stock S modelled on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with $\Omega = \{\omega_1, \omega_2\}$, $\mathcal{F} = \mathcal{P}(\Omega)$ and $\mathbb{P}$ a probability measure on $(\Omega, \mathcal{F})$ such that $\mathbb{P}(\{\omega_1\}) > 0$, $\mathbb{P}(\{\omega_2\}) > 0$. Suppose that the current asset prices (time t = 0) are B(0) = 1 and S(0) = 5 and that the terminal prices (time t = 1) are $B(1, \omega_1) = B(1, \omega_2) = 1 + r$ with r = 1/9, and $S(1, \omega_1) = 20/3$ and $S(1, \omega_2) = 40/9$.
1. Show that the model is free of arbitrage by computing
- a state-price vector;
- an equivalent martingale measure (using B as numeraire).
2. Is the financial market model complete?
3. Consider a contingent claim X with $X(\omega_1) = 7$ and $X(\omega_2) = 2$. Find the time t = 0 value of this claim by
- using the risk-neutral valuation formula;
- constructing a replicating portfolio.

2. Probability Background

No one can predict the future! All that can be done by way of prediction is to use what information is available as well as possible. Our task is to make the best quantitative statements we can about uncertainty - which in the financial context is usually uncertainty about the future. The basic tool to quantify uncertainty is a probability density or distribution. We will assume that most readers will be familiar with such things from an elementary course in probability and statistics; for a clear introduction see, e.g. Grimmett and Welsh (1986), or the first few chapters of Grimmett and Stirzaker (2001); Ross (1997), Resnick (2001), Durrett (1999), Rosenthal (2000) are also useful.

We shall use the language of probability, or randomness, freely to describe situations involving uncertainty. Even in the simplest situations this requires comment: the outcome of a coin toss, for instance, is deterministic given full information about the initial conditions. It is our inability in practice to specify these accurately enough to use Newtonian dynamics to predict the outcome that legitimizes thinking of the outcome as random - and makes coin-tossing available as a useful symmetry-breaking mechanism, e.g. to start a football match.

We note also the important area of Bayesian statistics, in which the emphasis is not on randomness as such, but on uncertainty, and how to quantify it by using probability densities or distributions. This viewpoint has much to recommend it; for an excellent recent textbook treatment, see Robert (1997).

With this by way of preamble, or apology, we now feel free to make explicit use of the language of randomness and probability and the results, viewpoints and insights of probability theory.

The mathematical treatment of probability developed through the study of gambling games (and financial speculation is basically just a sophisticated form of gambling!). Gambling games go back to antiquity, but the first well documented study of the mathematics of gambling dates from the correspondence of 1654 between Pascal and Fermat. Probability and statistics grew together during the next two and a half centuries; by 1900, a great deal of value was known about both, but neither had achieved a rigorous modern form, or even formulation. Indeed, the famous list of Hilbert problems - posed to the International Congress of Mathematicians in 1900 by the great German mathematician David Hilbert (1862-1943; see Appendix A) - contains (as part of Problem 6) putting probability theory onto a rigorous mathematical footing. The machinery needed to do this is measure theory, originated by Lebesgue (see §2.1, below); the successful harnessing of measure theory to provide a rigorous treatment of probability theory is due to the great Russian mathematician and probabilist Andrei Nikolaevich Kolmogorov (1903-1987), in his classic book Kolmogorov (1933).

We begin with a brief summary in Chapter 2 of what we shall need of probability theory in a static setting. For financial purposes, we need to go further and handle the dynamic setting of information unfolding with time. The framework needed to describe this is that of stochastic processes (or random processes); we turn to these in discrete time in Chapter 3, and in continuous time in Chapter 5.

Unless the reader is already familiar with measure theory, we recommend that he read Chapter 2 taking omitted proofs for granted: our strategy is to summarize what we need, and then use it. For the reader wishing to fill in the background here, or revise it, we recommend particularly Rudin (1976); many other good analysis texts are available. An excellent introductory measure-theoretic text is Williams (1991).

2.1 Measure

The language of modeling financial markets involves that of probability, which in turn involves that of measure theory. This originated with Henri Lebesgue (1875-1941), in his thesis, 'Intégrale, longueur, aire', Lebesgue (1902). We begin with defining a measure on $\mathbb{R}$ generalizing the intuitive notion of length.

The length $\mu(I)$ of an interval $I = (a, b)$, $[a, b]$, $[a, b)$ or $(a, b]$ should be $b - a$: $\mu(I) = b - a$. The length of the disjoint union $I = \bigcup_{r=1}^{n} I_r$ of intervals $I_r$ should be the sum of their lengths:

$$\mu\Big(\bigcup_{r=1}^{n} I_r\Big) = \sum_{r=1}^{n} \mu(I_r) \qquad \text{(finite additivity)}.$$

Consider now an infinite sequence $I_1, I_2, \dots$ (ad infinitum) of disjoint intervals. Letting $n$ tend to $\infty$ suggests that length should again be additive over disjoint intervals:

$$\mu\Big(\bigcup_{r=1}^{\infty} I_r\Big) = \sum_{r=1}^{\infty} \mu(I_r) \qquad \text{(countable additivity)}.$$

For $I$ an interval, $A$ a subset of length $\mu(A)$, the length of the complement $I \setminus A := I \cap A^c$ of $A$ in $I$ should be

$$\mu(I \setminus A) = \mu(I) - \mu(A) \qquad \text{(complementation)}.$$


If $A \subseteq B$ and $B$ has length $\mu(B) = 0$, then $A$ should have length 0 also:

$$A \subseteq B \ \text{and} \ \mu(B) = 0 \ \Rightarrow \ \mu(A) = 0 \qquad \text{(completeness)}.$$

The term 'countable' here requires comment. We must distinguish first between finite and infinite sets; then countable sets (like $\mathbb{N} = \{1, 2, 3, \dots\}$) are the 'smallest', or 'simplest', infinite sets, as distinct from uncountable sets such as $\mathbb{R} = (-\infty, \infty)$.

Let $\mathcal{F}$ be the smallest class of sets $A \subset \mathbb{R}$ containing the intervals, closed under countable disjoint unions and complements, and complete (containing all subsets of sets of length 0 as sets of length 0). The above suggests - what Lebesgue showed - that length can be sensibly defined on the sets $\mathcal{F}$ on the line, but on no others. There are others - but they are hard to construct (in technical language: the axiom of choice, or some variant of it such as Zorn's lemma, is needed to demonstrate the existence of non-measurable sets - but all such proofs are highly non-constructive). So: some but not all subsets of the line have a length. These are called the Lebesgue-measurable sets, and form the class $\mathcal{F}$ described above; length, defined on $\mathcal{F}$, is called Lebesgue measure $\mu$ (on the real line, $\mathbb{R}$).

Turning now to the general case, we make the above rigorous. Let $\Omega$ be a set.

Definition 2.1.1. A collection $\mathcal{A}_0$ of subsets of $\Omega$ is called an algebra on $\Omega$ if:
(i) $\Omega \in \mathcal{A}_0$,
(ii) $A \in \mathcal{A}_0 \Rightarrow A^c = \Omega \setminus A \in \mathcal{A}_0$,
(iii) $A, B \in \mathcal{A}_0 \Rightarrow A \cup B \in \mathcal{A}_0$.

Using this definition and induction, we can show that an algebra on $\Omega$ is a family of subsets of $\Omega$ closed under finitely many set operations.

Definition 2.1.2. An algebra $\mathcal{A}$ of subsets of $\Omega$ is called a $\sigma$-algebra on $\Omega$ if for any sequence $A_n \in \mathcal{A}$ $(n \in \mathbb{N})$, we have

$$\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}.$$

Such a pair $(\Omega, \mathcal{A})$ is called a measurable space.

Thus a $\sigma$-algebra on $\Omega$ is a family of subsets of $\Omega$ closed under any countable collection of set operations. The main examples of $\sigma$-algebras are $\sigma$-algebras generated by a class $\mathcal{C}$ of subsets of $\Omega$, i.e. $\sigma(\mathcal{C})$ is the smallest $\sigma$-algebra on $\Omega$ containing $\mathcal{C}$. The Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R})$ is the $\sigma$-algebra of subsets of $\mathbb{R}$ generated by the open intervals (equivalently, by half-lines such as $(-\infty, x]$ as $x$ varies in $\mathbb{R}$). As our aim is to define measures on collections of sets, we now turn to set functions.
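On a finite set the generated $\sigma$-algebra $\sigma(\mathcal{C})$ can be computed by brute force, which may help to make the definition concrete. The following is a minimal sketch only: the function name `generated_sigma_algebra` and the example sets are illustrative assumptions, and on a finite $\Omega$ closure under complements and pairwise unions already gives closure under countable unions.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generators):
    """Smallest family of subsets of a finite omega that contains the
    generators and is closed under complements and (finite) unions."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for a in current:
            comp = omega - a
            if comp not in family:
                family.add(comp)
                changed = True
        # Close under pairwise unions.
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family

# Hypothetical example: the class C = {{1}, {1, 2}} on Omega = {1, 2, 3, 4}.
omega = {1, 2, 3, 4}
sigma = generated_sigma_algebra(omega, [{1}, {1, 2}])
```

In this example the atoms are $\{1\}$, $\{2\}$ and $\{3, 4\}$, so the generated family consists of all unions of these atoms; the set $\{3\}$ cannot be separated from $\{4\}$ and is therefore not in it.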



Definition 2.1.3. Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mu_0$ a non-negative set function $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ such that $\mu_0(\emptyset) = 0$. $\mu_0$ is called:
(i) additive, if $A, B \in \mathcal{A}_0$, $A \cap B = \emptyset \Rightarrow \mu_0(A \cup B) = \mu_0(A) + \mu_0(B)$,
(ii) countably additive, if whenever $(A_n)_{n \in \mathbb{N}}$ is a sequence of disjoint sets in $\mathcal{A}_0$ with $\bigcup A_n \in \mathcal{A}_0$ then

$$\mu_0\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu_0(A_n).$$

Definition 2.1.4. Let $(\Omega, \mathcal{A})$ be a measurable space. A countably additive map $\mu : \mathcal{A} \to [0, \infty]$ is called a measure on $(\Omega, \mathcal{A})$. The triple $(\Omega, \mathcal{A}, \mu)$ is called a measure space.

Recall that our motivating example was to define a measure on $\mathbb{R}$ consistent with our geometrical knowledge of length of an interval. That means we have a suitable definition of measure on a family of subsets of $\mathbb{R}$ and want to extend it to the generated $\sigma$-algebra. The measure-theoretic tool to do so is the Carathéodory extension theorem, for which the following lemma is an inevitable prerequisite.

Lemma 2.1.1. Let $\Omega$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $\Omega$, that is, a family of subsets of $\Omega$ closed under finite intersections: $I_1, I_2 \in \mathcal{I} \Rightarrow I_1 \cap I_2 \in \mathcal{I}$. Let $\mathcal{A} = \sigma(\mathcal{I})$ and suppose that $\mu_1$ and $\mu_2$ are finite measures on $(\Omega, \mathcal{A})$ (i.e. $\mu_1(\Omega) = \mu_2(\Omega) < \infty$) and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then $\mu_1 = \mu_2$ on $\mathcal{A}$.

Theorem 2.1.1 (Carathéodory Extension Theorem). Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mathcal{A} = \sigma(\mathcal{A}_0)$. If $\mu_0$ is a countably additive set function on $\mathcal{A}_0$, then there exists a measure $\mu$ on $(\Omega, \mathcal{A})$ such that $\mu = \mu_0$ on $\mathcal{A}_0$. If $\mu_0$ is finite, then the extension is unique.

For proofs of the above and further discussion, we refer the reader to Chapter 1 and Appendix 1 of Williams (1991) and the appendix in Durrett (1996a).

Returning to the motivating example $\Omega = \mathbb{R}$, we say that $A \subset \mathbb{R}$ belongs to the collection of sets $\mathcal{A}_0$ if $A$ can be written

as

where r E lN, -00 � a l is an algebra and a(Ao)

' . . X n E ( - 00 00 ] ,

,

Now using the usual measure-theoretic steps (going from simple to integrable functions) it is easy to show:

Theorem 2.3.1 (Multiplication Theorem). If $X_1, \dots, X_n$ are independent and $\mathbb{E}|X_i|$

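2.5 Conditional Expectation

From this definition, we get the multiplication rule

$$\mathbb{P}(A \cap B) = \mathbb{P}(A \mid B)\,\mathbb{P}(B).$$

Using the partition equation $\mathbb{P}(B) = \sum_n \mathbb{P}(B \mid A_n)\,\mathbb{P}(A_n)$ with $(A_n)$ a finite or countable partition of $\Omega$, we get the Bayes rule

$$\mathbb{P}(A_i \mid B) = \frac{\mathbb{P}(A_i)\,\mathbb{P}(B \mid A_i)}{\sum_j \mathbb{P}(A_j)\,\mathbb{P}(B \mid A_j)}.$$

We can always write $\mathbb{P}(A) = \mathbb{E}(1_A)$ with $1_A(\omega) = 1$ if $\omega \in A$ and $1_A(\omega) = 0$ otherwise. Then the above can be written

$$\mathbb{E}(1_A \mid B) = \frac{\mathbb{E}(1_A 1_B)}{\mathbb{P}(B)}. \qquad (2.2)$$

This suggests defining, for suitable random variables $X$, the $\mathbb{P}$-average of $X$ over $B$ as

$$\mathbb{E}(X \mid B) = \frac{\mathbb{E}(X 1_B)}{\mathbb{P}(B)}. \qquad (2.3)$$

Consider now discrete random variables $X$ and $Y$. Assume $X$ takes values $x_1, \dots, x_m$ with probabilities $f_1(x_i) > 0$, $Y$ takes values $y_1, \dots, y_n$ with probabilities $f_2(y_j) > 0$, while the vector $(X, Y)$ takes values $(x_i, y_j)$ with probabilities $f(x_i, y_j) > 0$. Then the marginal distributions are

$$f_1(x_i) = \sum_{j=1}^{n} f(x_i, y_j) \quad \text{and} \quad f_2(y_j) = \sum_{i=1}^{m} f(x_i, y_j).$$

We can use the standard definition above for the events $\{Y = y_j\}$ and $\{X = x_i\}$ to get

$$\mathbb{P}(Y = y_j \mid X = x_i) = \frac{\mathbb{P}(X = x_i, Y = y_j)}{\mathbb{P}(X = x_i)} = \frac{f(x_i, y_j)}{f_1(x_i)}.$$

Thus conditional on $X = x_i$ (given the information $X = x_i$), $Y$ takes on the values $y_1, \dots, y_n$ with (conditional) probabilities

$$f_{Y|X}(y_j \mid x_i) = \frac{f(x_i, y_j)}{f_1(x_i)}.$$

So we can compute its expectation as usual:

$$\mathbb{E}(Y \mid X = x_i) = \sum_j y_j\, f_{Y|X}(y_j \mid x_i) = \frac{\sum_j y_j\, f(x_i, y_j)}{f_1(x_i)}.$$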

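Now define the random variable $Z = \mathbb{E}(Y \mid X)$, the conditional expectation of $Y$ given $X$, as follows: if $X(\omega) = x_i$, then $Z(\omega) = \mathbb{E}(Y \mid X = x_i) = z_i$ (say). Observe that in this case $Z$ is given by a 'nice' function of $X$. However, a more abstract property also holds true. Since $Z$ is constant on the sets $\{X = x_i\}$ it is $\sigma(X)$-measurable (these sets generate the $\sigma$-algebra). Furthermore

$$\int_{\{X = x_i\}} Z\, d\mathbb{P} = z_i\, \mathbb{P}(X = x_i) = \sum_j y_j\, f_{Y|X}(y_j \mid x_i)\, \mathbb{P}(X = x_i) = \sum_j y_j\, \mathbb{P}(Y = y_j;\, X = x_i) = \int_{\{X = x_i\}} Y\, d\mathbb{P}.$$

Since the $\{X = x_i\}$ generate $\sigma(X)$, this implies

$$\int_G Z\, d\mathbb{P} = \int_G Y\, d\mathbb{P} \qquad \forall\, G \in \sigma(X).$$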
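The discrete construction can be traced on a small example. The sketch below assumes a hypothetical joint probability function `joint` for $(X, Y)$ (the numbers are chosen only for illustration) and uses exact rational arithmetic; it computes $z_i = \mathbb{E}(Y \mid X = x_i)$ and leaves the integrals of $Z$ and $Y$ over each set $\{X = x_i\}$ available for comparison.

```python
from fractions import Fraction as Fr

# Hypothetical joint pmf f(x, y) of (X, Y); values are illustrative only.
joint = {(0, 1): Fr(1, 8), (0, 2): Fr(3, 8), (1, 1): Fr(1, 4), (1, 2): Fr(1, 4)}

def marginal_x(joint):
    """Marginal f_1(x) = sum over y of f(x, y)."""
    f1 = {}
    for (x, _), p in joint.items():
        f1[x] = f1.get(x, 0) + p
    return f1

def cond_exp_y_given_x(joint):
    """z[x] = E(Y | X = x) = sum over y of y * f(x, y) / f_1(x)."""
    f1 = marginal_x(joint)
    z = {}
    for (x, y), p in joint.items():
        z[x] = z.get(x, 0) + y * p / f1[x]
    return z

f1 = marginal_x(joint)
z = cond_exp_y_given_x(joint)

# Integrals of Z and of Y over each generating set {X = x}.
int_z = {x: z[x] * f1[x] for x in f1}
int_y = {x: sum(y * p for (a, y), p in joint.items() if a == x) for x in f1}
```

Here `int_z` and `int_y` agree on every set $\{X = x_i\}$, which is the defining property used in the general case.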

Density Case. If the random vector $(X, Y)$ has density $f(x, y)$, then $X$ has (marginal) density $f_1(x) := \int_{-\infty}^{\infty} f(x, y)\,dy$, $Y$ has (marginal) density $f_2(y) := \int_{-\infty}^{\infty} f(x, y)\,dx$. The conditional density of $Y$ given $X = x$ is:

$$f_{Y|X}(y \mid x) := \frac{f(x, y)}{f_1(x)}.$$

Its expectation is

$$\mathbb{E}(Y \mid X = x) = \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\,dy = \frac{\int_{-\infty}^{\infty} y\, f(x, y)\,dy}{f_1(x)}.$$

So we define

$$c(x) = \begin{cases} \mathbb{E}(Y \mid X = x) & \text{if } f_1(x) > 0, \\ 0 & \text{if } f_1(x) = 0, \end{cases}$$

and call $c(X)$ the conditional expectation of $Y$ given $X$, denoted by $\mathbb{E}(Y \mid X)$. Observe that on sets with probability zero (i.e. $\{\omega : X(\omega) = x;\ f_1(x) = 0\}$) the choice of $c(x)$ is arbitrary, hence $\mathbb{E}(Y \mid X)$ is only defined up to a set of probability zero; we speak of different versions in such cases. With this definition we again find

$$\int_G c(X)\, d\mathbb{P} = \int_G Y\, d\mathbb{P} \qquad \forall\, G \in \sigma(X).$$

Indeed, for sets $G$ with $G = \{\omega : X(\omega) \in B\}$ with $B$ a Borel set, we find by Fubini's theorem

$$\int_G c(X)\, d\mathbb{P} = \int_{-\infty}^{\infty} 1_B(x)\, c(x)\, f_1(x)\,dx = \int_{-\infty}^{\infty} 1_B(x)\, f_1(x) \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\,dy\,dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} 1_B(x)\, y\, f(x, y)\,dy\,dx = \int_G Y\, d\mathbb{P}.$$
Now these sets $G$ generate $\sigma(X)$ and by a standard technique (the $\pi$-systems lemma, see Williams (2001), §2.3) the claim is true for all $G \in \sigma(X)$.

Example. Bivariate Normal Distribution, $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$.

$$\mathbb{E}(Y \mid X = x) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}\,(x - \mu_1),$$

the familiar regression line of statistics (linear model) - see Exercise 2.6.

General Case. Here, we follow Kolmogorov's construction using the Radon-Nikodym theorem. Suppose that $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$, $\mathcal{G} \subset \mathcal{F}$. If $Y$ is a non-negative random variable with $\mathbb{E}Y < \infty$, then

$$Q(G) := \int_G Y\, d\mathbb{P} \qquad (G \in \mathcal{G})$$

is non-negative, $\sigma$-additive - because

$$\int_G Y\, d\mathbb{P} = \sum_n \int_{G_n} Y\, d\mathbb{P}$$

if $G = \bigcup_n G_n$, $G_n$ disjoint - and defined on the $\sigma$-algebra $\mathcal{G}$, so it is a measure on $\mathcal{G}$. If $\mathbb{P}(G) = 0$, then $Q(G) = 0$ also (the integral of anything over a null set is zero), so $Q \ll \mathbb{P}$. By the Radon-Nikodym theorem, there exists a Radon-Nikodym derivative of $Q$ with respect to $\mathbb{P}$ on $\mathcal{G}$, which is $\mathcal{G}$-measurable. Following Kolmogorov, we call this Radon-Nikodym derivative the conditional expectation of $Y$ given (or conditional on) $\mathcal{G}$, $\mathbb{E}(Y \mid \mathcal{G})$, whose existence we now have established. For $Y$ that changes sign, split into $Y = Y^+ - Y^-$, and define $\mathbb{E}(Y \mid \mathcal{G}) := \mathbb{E}(Y^+ \mid \mathcal{G}) - \mathbb{E}(Y^- \mid \mathcal{G})$. We summarize:

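Definition 2.5.1. Let $Y$ be a random variable with $\mathbb{E}(|Y|) < \infty$ and $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We call a random variable $Z$ a version of the conditional expectation $\mathbb{E}(Y \mid \mathcal{G})$ of $Y$ given $\mathcal{G}$, and write $Z = \mathbb{E}(Y \mid \mathcal{G})$, a.s., if
(i) $Z$ is $\mathcal{G}$-measurable;
(ii) $\mathbb{E}(|Z|) < \infty$;
(iii) for every set $G$ in $\mathcal{G}$, we have

$$\int_G Y\, d\mathbb{P} = \int_G Z\, d\mathbb{P} \qquad \forall\, G \in \mathcal{G}. \qquad (2.4)$$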

Notation. Suppose $\mathcal{G} = \sigma(X_1, \dots, X_n)$. Then

$$\mathbb{E}(Y \mid \mathcal{G}) = \mathbb{E}(Y \mid \sigma(X_1, \dots, X_n)) =: \mathbb{E}(Y \mid X_1, \dots, X_n),$$

and one can compare the general case with the motivating examples above.

To see the intuition behind conditional expectation, consider the following situation. Assume an experiment has been performed, i.e. $\omega \in \Omega$ has been realized. However, the only information we have is the set of values $X(\omega)$ for every $\mathcal{G}$-measurable random variable $X$. Then $Z(\omega) = \mathbb{E}(Y \mid \mathcal{G})(\omega)$ is the expected value of $Y(\omega)$ given this information.

We used the traditional approach to define conditional expectation via the Radon-Nikodym theorem. Alternatively, one can use Hilbert space projection theory (Neveu (1975) and Jacod and Protter (2000) follow this route). Indeed, for $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ one can show that the conditional expectation $Z = \mathbb{E}(Y \mid \mathcal{G})$ is the least-squares-best $\mathcal{G}$-measurable predictor of $Y$: amongst all $\mathcal{G}$-measurable random variables it minimizes the quadratic distance, i.e.

$$\mathbb{E}[(Y - \mathbb{E}(Y \mid \mathcal{G}))^2] = \min\{\mathbb{E}[(Y - X)^2] : X \ \mathcal{G}\text{-measurable}\}.$$
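The least-squares characterization can be checked numerically in the simplest setting $\mathcal{G} = \sigma(X)$ with $X$ discrete, where the $\mathcal{G}$-measurable random variables are exactly the functions $h(X)$. The sketch below is illustrative only: the joint probability function `joint` and the perturbation grid are assumptions made for the example, not part of the text.

```python
from fractions import Fraction as Fr

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint = {(0, 1): Fr(1, 8), (0, 2): Fr(3, 8), (1, 1): Fr(1, 4), (1, 2): Fr(1, 4)}

def mse(h):
    """E[(Y - h(X))^2] for a predictor given as a dict x -> h(x)."""
    return sum(p * (y - h[x]) ** 2 for (x, y), p in joint.items())

# Conditional expectation z[x] = E(Y | X = x), computed from the joint pmf.
f1 = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
z = {x: sum(y * p for (a, y), p in joint.items() if a == x) / f1[x] for x in (0, 1)}

best = mse(z)

# Compare against other functions of X obtained by perturbing z.
shifts = (Fr(-1, 2), 0, Fr(1, 2))
others = [mse({0: z[0] + d0, 1: z[1] + d1}) for d0 in shifts for d1 in shifts]
```

Every entry of `others` is at least `best`, with equality only for the unperturbed predictor, in line with the projection interpretation.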



Note. 1. To check that something is a conditional expectation: we have to check that it integrates the right way over the right sets (i.e., as in (2.4)).
2. From (2.4): if two things integrate the same way over all sets $B \in \mathcal{G}$, they have the same conditional expectation given $\mathcal{G}$.
3. For notational convenience, we shall pass between $\mathbb{E}(Y \mid \mathcal{G})$ and $\mathbb{E}_{\mathcal{G}} Y$ at will.
4. The conditional expectation thus defined coincides with any we may have already encountered - in regression or multivariate analysis, for example. However, this may not be immediately obvious. The conditional expectation defined above - via $\sigma$-algebras and the Radon-Nikodym theorem - is rightly called by Williams 'the central definition of modern probability' (see Williams (1991), p.84). It may take a little getting used to. As with all important but non-obvious definitions, it proves its worth in action: see §2.6 below for properties of conditional expectations, and Chapter 3 for its use in studying stochastic processes, particularly martingales (which are defined in terms of conditional expectations).

We now discuss the fundamental properties of conditional expectation. From the definition, linearity of conditional expectation follows from the linearity of the integral. Further properties are given by

Proposition 2.5.1.
1. If $\mathcal{G} = \{\emptyset, \Omega\}$, $\mathbb{E}(Y \mid \{\emptyset, \Omega\}) = \mathbb{E}Y$.
2. If $\mathcal{G} = \mathcal{F}$, $\mathbb{E}(Y \mid \mathcal{F}) = Y$ $\mathbb{P}$-a.s.
3. If $Y$ is $\mathcal{G}$-measurable, $\mathbb{E}(Y \mid \mathcal{G}) = Y$ $\mathbb{P}$-a.s.
4. Positivity. If $X \ge 0$, then $\mathbb{E}(X \mid \mathcal{G}) \ge 0$ $\mathbb{P}$-a.s.
5. Taking out what is known. If $Y$ is $\mathcal{G}$-measurable and bounded, $\mathbb{E}(YZ \mid \mathcal{G}) = Y\,\mathbb{E}(Z \mid \mathcal{G})$ $\mathbb{P}$-a.s.
6. Tower property. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G}) \mid \mathcal{G}_0] = \mathbb{E}[Y \mid \mathcal{G}_0]$ a.s.
7. Conditional mean formula. $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G})] = \mathbb{E}Y$ $\mathbb{P}$-a.s.
8. Role of independence. If $Y$ is independent of $\mathcal{G}$, $\mathbb{E}(Y \mid \mathcal{G}) = \mathbb{E}Y$ a.s.
9. Conditional Jensen formula. If $c : \mathbb{R} \to \mathbb{R}$ is convex, and $\mathbb{E}|c(X)| < \infty$, then $\mathbb{E}(c(X) \mid \mathcal{G}) \ge c(\mathbb{E}(X \mid \mathcal{G}))$.

Proof. 1. Here $\mathcal{G} = \{\emptyset, \Omega\}$ is the smallest possible $\sigma$-algebra (any $\sigma$-algebra of subsets of $\Omega$ contains $\emptyset$ and $\Omega$), and represents 'knowing nothing'. We have to check (2.4) for $G = \emptyset$ and $G = \Omega$. For $G = \emptyset$ both sides are zero; for $G = \Omega$ both sides are $\mathbb{E}Y$.

2. Here $\mathcal{G} = \mathcal{F}$ is the largest possible $\sigma$-algebra, and represents 'knowing everything'. We have to check (2.4) for all sets $G \in \mathcal{F}$. The only integrand that integrates like $Y$ over all sets is $Y$ itself, or a function agreeing with $Y$ except on a set of measure zero.

Note. When we condition on $\mathcal{F}$ ('knowing everything'), we know $Y$ (because we know everything). There is thus no uncertainty left in $Y$ to average out, so taking the conditional expectation (averaging out remaining randomness) has no effect, and leaves $Y$ unaltered.

3. Recall that $Y$ is always $\mathcal{F}$-measurable (this is the definition of $Y$ being a random variable). For $\mathcal{G} \subset \mathcal{F}$, $Y$ may not be $\mathcal{G}$-measurable, but if it is, the proof above applies with $\mathcal{G}$ in place of $\mathcal{F}$.

Note. To say that $Y$ is $\mathcal{G}$-measurable is to say that $Y$ is known given $\mathcal{G}$ - that is, when we are conditioning on $\mathcal{G}$. Then $Y$ is no longer random (being known when $\mathcal{G}$ is given), and so counts as a constant when the conditioning is performed.

4. Let $Z$ be a version of $\mathbb{E}(X \mid \mathcal{G})$. If $\mathbb{P}(Z < 0) > 0$, then for some $n$, the set $G := \{Z < -n^{-1}\} \in \mathcal{G}$ and $\mathbb{P}(\{Z < -n^{-1}\}) > 0$. Thus

$$0 \le \mathbb{E}(X 1_G) = \mathbb{E}(Z 1_G) < -n^{-1}\, \mathbb{P}(G) < 0,$$

which contradicts the positivity of $X$.

5. First, consider the case when $Y$ is discrete. Then $Y$ can be written as

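$$Y = \sum_{n=1}^{N} b_n 1_{B_n},$$

for constants $b_n$ and events $B_n \in \mathcal{G}$. Then for any $B \in \mathcal{G}$, $B \cap B_n \in \mathcal{G}$ also (as $\mathcal{G}$ is a $\sigma$-algebra), and using linearity and (2.4):

$$\int_B Y\, \mathbb{E}(Z \mid \mathcal{G})\, d\mathbb{P} = \int_B \Big(\sum_{n=1}^{N} b_n 1_{B_n}\Big) \mathbb{E}(Z \mid \mathcal{G})\, d\mathbb{P} = \sum_{n=1}^{N} b_n \int_{B \cap B_n} \mathbb{E}(Z \mid \mathcal{G})\, d\mathbb{P}$$

$$= \sum_{n=1}^{N} b_n \int_{B \cap B_n} Z\, d\mathbb{P} = \int_B \sum_{n=1}^{N} b_n 1_{B_n} Z\, d\mathbb{P} = \int_B YZ\, d\mathbb{P}.$$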

Since this holds for all $B \in \mathcal{G}$, the result holds by (2.4). For the general case, we approximate to a general random variable $Y$ by a sequence of discrete random variables $Y_n$, for each of which the result holds as just proved. We omit details of the proof here, which involves the standard approximation steps based on the monotone convergence theorem from measure theory (see e.g. Williams (1991), p.90, proof of (j)). We are thus left to show that $\mathbb{E}(|ZY|) < \infty$, which follows from the assumption that $Y$ is bounded and $Z \in L^1$.

6. $\mathbb{E}_{\mathcal{G}_0} \mathbb{E}_{\mathcal{G}} Y$ is $\mathcal{G}_0$-measurable, and for $C \in \mathcal{G}_0 \subset \mathcal{G}$, using the definition of $\mathbb{E}_{\mathcal{G}_0}$, $\mathbb{E}_{\mathcal{G}}$:

$$\int_C \mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y]\, d\mathbb{P} = \int_C \mathbb{E}_{\mathcal{G}} Y\, d\mathbb{P} = \int_C Y\, d\mathbb{P}.$$


So $\mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y]$ satisfies the defining relation for $\mathbb{E}_{\mathcal{G}_0} Y$. Being also $\mathcal{G}_0$-measurable, it is $\mathbb{E}_{\mathcal{G}_0} Y$ (a.s.).

We also have:
6'. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G}_0) \mid \mathcal{G}] = \mathbb{E}[Y \mid \mathcal{G}_0]$ a.s.

Proof. $\mathbb{E}[Y \mid \mathcal{G}_0]$ is $\mathcal{G}_0$-measurable, so $\mathcal{G}$-measurable as $\mathcal{G}_0 \subset \mathcal{G}$, so $\mathbb{E}[\,\cdot \mid \mathcal{G}]$ has no effect on it, by 3.

Note. 6, 6' are the two forms of the iterated conditional expectations property. When conditioning on two $\sigma$-algebras, one larger (finer), one smaller (coarser), the coarser rubs out the effect of the finer, either way round. This may be thought of as the coarse-averaging property: we shall use this term interchangeably with the iterated conditional expectations property; Williams (1991) uses the term tower property.

7. Take $\mathcal{G}_0 = \{\emptyset, \Omega\}$ in 6 and use 1.

8. If $Y$ is independent of $\mathcal{G}$, $Y$ is independent of $1_B$ for every $B \in \mathcal{G}$. So by (2.4) and linearity,

$$\int_B \mathbb{E}(Y \mid \mathcal{G})\, d\mathbb{P} = \int_B Y\, d\mathbb{P} = \int_\Omega 1_B Y\, d\mathbb{P} = \mathbb{E}(1_B Y) = \mathbb{E}(1_B)\,\mathbb{E}(Y) = \int_B \mathbb{E}Y\, d\mathbb{P},$$

using the multiplication theorem for independent random variables. Since this holds for all $B \in \mathcal{G}$, the result follows by (2.4).

9. Recall (see e.g. Williams (1991), §6.6a, §9.7h, §9.8h), that for every convex function there exists a countable sequence $((a_n, b_n))$ of points in $\mathbb{R}^2$ such that

$$c(x) = \sup_n (a_n x + b_n), \qquad x \in \mathbb{R}.$$

For each fixed $n$ we use 4 to see from $c(X) \ge a_n X + b_n$ that

$$\mathbb{E}[c(X) \mid \mathcal{G}] \ge a_n\, \mathbb{E}(X \mid \mathcal{G}) + b_n.$$

So,

$$\mathbb{E}[c(X) \mid \mathcal{G}] \ge \sup_n \big(a_n\, \mathbb{E}(X \mid \mathcal{G}) + b_n\big) = c(\mathbb{E}(X \mid \mathcal{G})). \qquad \square$$

Remark 2.5.1. If in 6, 6' we take $\mathcal{G} = \mathcal{G}_0$, we obtain:

$$\mathbb{E}[\mathbb{E}(X \mid \mathcal{G}) \mid \mathcal{G}] = \mathbb{E}(X \mid \mathcal{G}).$$

Thus the map $X \to \mathbb{E}(X \mid \mathcal{G})$ is idempotent: applying it twice is the same as applying it once. Hence we may identify the conditional expectation operator as a projection. This point of view, which is powerful and useful, is developed in Appendix B.
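Properties 6, 6' and 7, and the idempotence just noted, can be checked by hand on a finite sample space, where a $\sigma$-algebra corresponds to a partition and conditional expectation is the probability-weighted average over each block. The following is a small sketch under that assumption; the probabilities, the values of $Y$ and the two partitions are hypothetical choices for illustration.

```python
from fractions import Fraction as Fr

def cond_exp(y, prob, partition):
    """E(Y | sigma(partition)) on a finite sample space, as a dict omega -> value.

    On each block the result is the probability-weighted average of Y.
    """
    z = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        avg = sum(y[w] * prob[w] for w in block) / mass
        for w in block:
            z[w] = avg
    return z

def mean(y, prob):
    return sum(y[w] * prob[w] for w in prob)

# Hypothetical example data.
prob = {1: Fr(1, 10), 2: Fr(2, 10), 3: Fr(3, 10), 4: Fr(4, 10)}
y = {1: 5, 2: -1, 3: 2, 4: 0}
fine = [{1}, {2}, {3, 4}]     # generates the finer sigma-algebra G
coarse = [{1, 2}, {3, 4}]     # generates G0, a sub-sigma-algebra of G

tower_lhs = cond_exp(cond_exp(y, prob, fine), prob, coarse)    # E[E(Y|G)|G0]
tower_rhs = cond_exp(y, prob, coarse)                          # E[Y|G0]
reverse = cond_exp(tower_rhs, prob, fine)                      # E[E(Y|G0)|G]
twice = cond_exp(cond_exp(y, prob, fine), prob, fine)          # E[E(Y|G)|G]
```

In each case the coarser partition determines the result, which is the 'coarse-averaging' reading of the tower property.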

2.6 Modes of Convergence

So far, we have dealt with one probability measure - or its expectation operator - at a time. We shall, however, have many occasions to consider whole sequences of them, converging (in a suitable sense) to some limiting probability measure. Such situations arise, for example, whenever we approximate a financial model in continuous time (such as the continuous-time Black-Scholes model of §6.2) by a sequence of models in discrete time (such as the discrete-time Black-Scholes model of §4.6).

In the stochastic-process setting - such as the passage from discrete to continuous Black-Scholes models mentioned above - we need concepts beyond those we have to hand, which we develop later. We confine ourselves here to setting out what we need to discuss convergence of random variables, in the various senses that are useful.

The first idea that occurs to one is to use the ordinary convergence concept in this new setting, of random variables: then if $X_n$, $X$ are random variables,

$$X_n \to X \qquad (n \to \infty)$$

would be taken literally - as if the $X_n$, $X$ were non-random. For instance, if $X_n$ is the observed frequency of heads in a long series of $n$ independent tosses of a fair coin, $X = 1/2$ the expected frequency, then the above in this case would be the man-in-the-street's idea of the 'law of averages'. It turns out that the above statement is false in this case, taken literally: some qualification is needed. However, the qualification needed is absolutely the minimal one imaginable: one merely needs to exclude a set of probability zero - that is, to assert convergence on a set of probability one ('almost surely'), rather than everywhere.

Definition 2.6.1. If $X_n$, $X$ are random variables, we say $X_n$ converges to $X$ almost surely -

$$X_n \to X \quad (n \to \infty) \quad \text{a.s.}$$

- if $X_n \to X$ with probability one - that is, if

$$\mathbb{P}(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$

The loose idea of the 'law of averages' has as its precise form a statement on convergence almost surely. This is Kolmogorov's strong law of large numbers, see e.g. Williams (1991), §12.10, which is quite difficult to prove. Weaker convergence concepts are also useful: they may hold under weaker conditions, or they may be easier to prove.

Definition 2.6.2. If $X_n$, $X$ are random variables, we say that $X_n$ converges to $X$ in probability -

$$X_n \to X \quad (n \to \infty) \quad \text{in probability}$$

- if, for all $\varepsilon > 0$,

$$\mathbb{P}(\{\omega : |X_n(\omega) - X(\omega)| > \varepsilon\}) \to 0 \qquad (n \to \infty).$$

It turns out that convergence almost surely implies convergence in probability, but not in general conversely. Thus almost-sure convergence is a stronger convergence concept than convergence in probability. This comparison is reflected in the form the 'law of averages' takes for convergence in probability: this is called the weak law of large numbers, which as its name implies is a weaker form of the strong law of large numbers. It is correspondingly much easier to prove: indeed, we shall prove it in §2.8 below.

Recall the $L^p$-spaces of $p$th-power integrable functions (§2.2). We similarly define the $L^p$-spaces of $p$th-power integrable random variables: if $p \ge 1$ and $X$ is a random variable with

$$\|X\|_p := (\mathbb{E}|X|^p)^{1/p} < \infty,$$

we say that $X \in L^p$ (or $L^p(\Omega, \mathcal{F}, \mathbb{P})$ to be precise). For $X_n, X \in L^p$, there is a natural convergence concept: we say that $X_n$ converges to $X$ in $L^p$, or in $p$th mean,

$$X_n \to X \quad \text{in } L^p,$$

if $\|X_n - X\|_p \to 0$ $(n \to \infty)$, that is, if

$$\mathbb{E}(|X_n - X|^p) \to 0 \qquad (n \to \infty).$$

The cases $p = 1, 2$ are particularly important: if $X_n \to X$ in $L^1$, we say that $X_n \to X$ in mean; if $X_n \to X$ in $L^2$ we say that $X_n \to X$ in mean square. Convergence in $p$th mean is not directly comparable with convergence almost surely (of course, we have to restrict to random variables in $L^p$ for the comparison even to be meaningful): neither implies the other. Both, however, imply convergence in probability.

All the modes of convergence discussed so far involve the values of random variables. Often, however, it is only the distributions of random variables that matter. In such cases, the natural mode of convergence is the following:

=

Xn converge to X in dis tribution if the distribution functions of Xn converge to that of X at all points of continuity of the latter:

Definition 2.6.3. We say that random variables

Xn -+ X for all points

in distribution, if x

JP ( {Xn � x }) -+ JP({X � x }) ( n -+ 00 )

at which the right-hand side is continuous.
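A simple deterministic example shows why only continuity points are used. Take the constants $X_n = 1/n$ and $X = 0$ (an illustrative choice, not taken from the text): any reasonable notion of convergence should give $X_n \to X$, yet the distribution functions fail to converge at the jump point $x = 0$ of the limit. The helper names below are hypothetical.

```python
def point_mass_cdf(c):
    """Distribution function x -> P(X <= x) of the constant random variable X = c."""
    return lambda x: 1.0 if x >= c else 0.0

F = point_mass_cdf(0.0)            # distribution function of the limit X = 0

def F_n(n):
    return point_mass_cdf(1.0 / n)  # distribution function of X_n = 1/n

ns = (1, 10, 100, 1000)
at_jump = [F_n(n)(0.0) for n in ns]               # x = 0 is a jump point of F
at_continuity_point = [F_n(n)(0.05) for n in ns]  # x = 0.05 is a continuity point
```

At $x = 0$ the values stay at 0 while $F(0) = 1$; at the continuity point $x = 0.05$ they reach $F(0.05) = 1$ as soon as $1/n \le 0.05$.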

The restriction to continuity points $x$ of the limit seems awkward at first, but it is both natural and necessary. It is also quite weak: note that the function $x \mapsto \mathbb{P}(\{X \le x\})$, being monotone in $x$, is continuous except for at most countably many jumps. The set of continuity points is thus uncountable: 'most' points are continuity points.

Convergence in distribution is (by far) the weakest of the modes of convergence introduced so far: convergence in probability implies convergence in distribution, but not conversely. There is, however, a partial converse (which we shall need in §2.8): if the limit $X$ is constant (non-random), convergence in probability and in distribution are equivalent.

Weak Convergence. If $\mathbb{P}_n$, $\mathbb{P}$ are probability measures, we say that

$$\mathbb{P}_n \to \mathbb{P} \quad (n \to \infty) \quad \text{weakly}$$

if

$$\int f\, d\mathbb{P}_n \to \int f\, d\mathbb{P} \qquad (n \to \infty) \qquad (2.5)$$

for all bounded continuous functions $f$. This definition is given a full-length book treatment in Billingsley (1968), and we refer to this for background and details. For ordinary (real-valued) random variables, weak convergence of their probability measures is the same as convergence in distribution of their distribution functions. However, the weak-convergence definition above applies equally, not just to this one-dimensional case, or to the finite-dimensional (vector-valued) setting, but also to infinite-dimensional settings such as arise in convergence of stochastic processes. We shall need such a framework in the passage from discrete- to continuous-time Black-Scholes models.

2.7 Convolution and Characteristic Functions

The most basic operation on numbers is addition; the most basic operation on random variables is addition of independent random variables. If $X$, $Y$ are independent, with distribution functions $F$, $G$, and

$$Z := X + Y,$$

let $Z$ have distribution function $H$. Then since $X + Y = Y + X$ (addition is commutative), $H$ depends on $F$ and $G$ symmetrically. We call $H$ the convolution (German: Faltung) of $F$ and $G$, written

$$H = F * G.$$

Suppose first that $X$, $Y$ have densities $f$, $g$. Then

$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \iint_{\{(x, y)\,:\, x + y \le z\}} f(x)\, g(y)\,dx\,dy,$$

since by independence of $X$ and $Y$ the joint density of $X$ and $Y$ is the product $f(x)g(y)$ of their separate (marginal) densities, and to find probabilities in the density case we integrate the joint density over the relevant region. Thus

$$H(z) = \int_{-\infty}^{\infty} f(x) \Big\{ \int_{-\infty}^{z - x} g(y)\,dy \Big\}\,dx = \int_{-\infty}^{\infty} f(x)\, G(z - x)\,dx.$$

If

$$h(z) := \int_{-\infty}^{\infty} f(x)\, g(z - x)\,dx,$$

(and of course symmetrically with $f$ and $g$ interchanged), then integrating we recover the equation above, after interchanging the order of integration. This is legitimate, as the integrals are non-negative, by Fubini's theorem, which we quote from measure theory, see e.g. Williams (1991), §8.2. This shows that if $X$, $Y$ are independent with densities $f$, $g$, and $Z = X + Y$, then $Z$ has density $h$, where

$$h(x) = \int_{-\infty}^{\infty} f(x - y)\, g(y)\,dy.$$

We write

$$h = f * g,$$

and call the density $h$ the convolution of the densities $f$ and $g$.

If $X$, $Y$ do not have densities, the argument above may still be taken as far as

$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \int_{-\infty}^{\infty} F(z - y)\,dG(y)$$

(and, again, symmetrically with $F$ and $G$ interchanged), where the integral on the right is the Lebesgue-Stieltjes integral of §2.2. We again write

$$H = F * G,$$

and call the distribution function $H$ the convolution of the distribution functions $F$ and $G$.

In sum: addition of independent random variables corresponds to convolution of distribution functions or densities.

Now we frequently need to add (or average) lots of independent random variables: for example, when forming sample means in statistics - when the bigger the sample size is, the better. But convolution involves integration, so adding $n$ independent random variables involves $n - 1$ integrations, and this is awkward to do for large $n$. One thus seeks a way to transform distributions so as to make the awkward operation of convolution as easy to handle as the operation of addition of independent random variables that gives rise to it.
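For integer-valued random variables the convolution integral becomes a sum, $h(z) = \sum_x f(x)\, g(z - x)$, which is easy to evaluate directly. The sketch below is only an illustration of that discrete analogue, with the sum of two fair dice as a hypothetical example; `convolve_pmf` is a name introduced here.

```python
from fractions import Fraction as Fr

def convolve_pmf(f, g):
    """Probability function of X + Y for independent X ~ f, Y ~ g.

    f and g map values to probabilities; each pair (x, y) contributes
    f(x) * g(y) to the value x + y, the discrete form of h = f * g.
    """
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            h[x + y] = h.get(x + y, 0) + px * py
    return h

die = {k: Fr(1, 6) for k in range(1, 7)}
two_dice = convolve_pmf(die, die)
three_dice = convolve_pmf(two_dice, die)   # one further convolution per summand
```

Each additional summand costs one more convolution, which is the bookkeeping burden that motivates passing to a transform.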

Definition 2.7.1. If $X$ is a random variable with distribution function $F$, its characteristic function $\phi$ (or $\phi_X$ if we need to emphasize $X$) is

$$\phi(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x), \qquad (t \in \mathbb{R}).$$

Note. Here $i := \sqrt{-1}$. All other numbers - $t$, $x$ etc. - are real; all expressions involving $i$ such as $e^{itx}$, $\phi(t) = \mathbb{E}(e^{itX})$ are complex numbers.

The characteristic function takes convolution into multiplication: if $X$, $Y$ are independent,

$$\phi_{X+Y}(t) = \phi_X(t)\,\phi_Y(t).$$

For, as $X$, $Y$ are independent, so are $e^{itX}$ and $e^{itY}$ for any $t$, so by the multiplication theorem (Theorem 2.3.1),

$$\mathbb{E}(e^{it(X+Y)}) = \mathbb{E}(e^{itX} \cdot e^{itY}) = \mathbb{E}(e^{itX}) \cdot \mathbb{E}(e^{itY}),$$

as required.

We list some properties of characteristic functions that we shall need.

1. $\phi(0) = 1$. For, $\phi(0) = \mathbb{E}(e^{i \cdot 0 \cdot X}) = \mathbb{E}(e^0) = \mathbb{E}(1) = 1$.

2. $|\phi(t)| \le 1$ for all $t \in \mathbb{R}$.

Proof.

$$|\phi(t)| = \Big| \int_{-\infty}^{\infty} e^{itx}\,dF(x) \Big| \le \int_{-\infty}^{\infty} |e^{itx}|\,dF(x) = \int_{-\infty}^{\infty} 1\,dF(x) = 1.$$
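For discrete distributions the characteristic function is a finite sum, $\phi(t) = \sum_x p(x)\, e^{itx}$, so the multiplication rule and properties 1 and 2 can be checked numerically. The following sketch assumes two hypothetical independent variables, a fair die and a fair coin; the helper names are introduced only for this illustration.

```python
import cmath

def char_fn(pmf, t):
    """phi(t) = E(exp(i t X)) for a discrete distribution given as value -> probability."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def convolve_pmf(f, g):
    """Probability function of X + Y for independent X ~ f and Y ~ g."""
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            h[x + y] = h.get(x + y, 0.0) + px * py
    return h

die = {k: 1.0 / 6.0 for k in range(1, 7)}
coin = {0: 0.5, 1: 0.5}
total = convolve_pmf(die, coin)      # distribution of the sum

t = 0.7
lhs = char_fn(total, t)                       # phi_{X+Y}(t)
rhs = char_fn(die, t) * char_fn(coin, t)      # phi_X(t) * phi_Y(t)
```

Up to floating-point rounding, `lhs` and `rhs` coincide, `char_fn(die, 0)` equals 1, and the modulus of `char_fn(die, t)` does not exceed 1.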

Thus, in particular the characteristic function always exists (the integral defining it is always absolutely convergent). This is a crucial advantage, far outweighing the disadvantage of having to work with complex rather than real numbers (the nuisance value of which is in fact slight).

3. $\phi$ is continuous (indeed, $\phi$ is uniformly continuous).

Proof.

$$|\phi(t + u) - \phi(t)| = \Big| \int_{-\infty}^{\infty} \{e^{i(t+u)x} - e^{itx}\}\,dF(x) \Big| = \Big| \int_{-\infty}^{\infty} e^{itx}(e^{iux} - 1)\,dF(x) \Big| \le \int_{-\infty}^{\infty} |e^{iux} - 1|\,dF(x),$$

$\phi(t) = \phi(\rho t)\,\phi_\rho(t)$ for some characteristic function $\phi_\rho$ (whence the name 'self-decomposable'). Self-decomposable laws have many nice properties; we quote two, for later use.
(i) They are absolutely continuous (possess densities), and are unimodal ('one-peaked').
(ii) In one dimension, they are the laws with Lévy measure $\mu$ of the form

$$\mu(dx) = \frac{k(x)}{|x|}\,dx$$

with $k$ increasing on $(-\infty, 0)$ and decreasing on $(0, \infty)$. In particular, they are easy to recognize from the Lévy-Khintchine formula, and easy to simulate from. For these properties, and further background, we refer to Sato (1999), §5.3.

2.11 Elliptically Contoured Distributions

Recall (§2.7) that the normal/Gaussian density and its characteristic func tion 1> are given by

f

f (x)

=

1y 27ra exp { 21 (X -a2/-l) 2 } and ¢(t) ICC

- -

=

{ 2I a2 t2 } ,

exp i/-lt

-

_

and (§2.3) that such laws are useful in modeling asset return distributions. We may hold, not just one asset, but a whole of assets, r of them say. To describe the return distribution of such a portfolio, we need to work in r dimensions, with a density ( ) ( = xr)) and characteristic function 1>(t) (t = The univariate normal law above generalizes

portfolio

(tl, . . . , tr )). f

x

x

(Xl'. ' . '


66

to the multivariate normal ('multinormal') , the basis of multivariate analysis in statistics:

$$f(x) = (2\pi)^{-\frac{1}{2}r}\, |\Sigma|^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right\}, \qquad \phi(t) = \exp\left\{ i t^T \mu - \frac{1}{2} t^T \Sigma t \right\}$$
(Edgeworth's Theorem: F. Y. Edgeworth (1845-1926) in 1892). Here $\mu = (\mu_1, \ldots, \mu_r)^T$ is the mean vector, $\Sigma = (\sigma_{ij})$ the covariance matrix ($r \times r$, positive definite, symmetric). Thus $\mu$, $\Sigma$ completely specify the multinormal in $r$ dimensions, $N_r(\mu, \Sigma)$ say, and are interpretable via the Markowitz mean-variance theory.

One way to keep most of the desirable features of the multinormal - interpretable parameters $\mu$, $\Sigma$, elliptical contours, linear regression etc. - is to replace $f$ by a more general form, i.e. assume that the density $f$ is a function of the quadratic form $Q := (x-\mu)^T \Sigma^{-1} (x-\mu)$:
$$f(x) = |\Sigma|^{-\frac{1}{2}}\, g(Q) = |\Sigma|^{-\frac{1}{2}}\, g\big( (x-\mu)^T \Sigma^{-1} (x-\mu) \big). \qquad \text{(EC)}$$
Here $f$ is called elliptically contoured: we write $f \sim EC_r(\mu, \Sigma; g)$, and call $g : \mathbb{R}_+ \to \mathbb{R}_+$ the density generator of $f$, or 'shape'. Then $\theta := (\mu, \Sigma)$ is the parameter, or parametric part, of the model, $g$ the non-parametric part. The characteristic function (CF) $\psi$ of $f$ is of the form
$$\psi(t) = e^{i t^T \mu}\, \phi(t^T \Sigma t) \qquad \text{(EC')}$$

for some scalar function $\phi$ called the characteristic generator of $\psi$, or $f$ (Fang, Kotz, and Ng (1990), Ch. 2, Cambanis, Huang, and Simons (1981)). It is convenient to write here $f \sim EC_r(\mu, \Sigma; \phi)$ also.

Examples. 1. The normal/Gaussian case. The density generator of the multivariate normal distribution is given by
$$g(u) = (2\pi)^{-\frac{1}{2}r} \exp\left\{ -\tfrac{1}{2} u \right\}.$$
The characteristic function is
$$\psi(\theta) = \exp\left\{ i\theta^T \mu - \tfrac{1}{2} \theta^T \Sigma \theta \right\}, \quad\text{so}\quad \phi(u) = \exp\left\{ -\tfrac{1}{2} u \right\}.$$
The multivariate normal is a member (the $N = s = 1$, $t = \frac{1}{2}$ case) of the class of symmetric Kotz-type distributions, which are characterized by exponentially decaying density generators of form
$$g(u) = C_r\, u^{N-1} \exp\{ -t u^s \}, \qquad s, t > 0, \quad 2N + r > 2, \quad C_r \text{ a constant}.$$
For further details see Fang, Kotz, and Ng (1990), §3.2.

2. The multivariate t-distribution. For the multivariate t-distribution with $m$ degrees of freedom the density generator exhibits power decay. It is a member (the $N = \frac{1}{2}(r + m)$, $m$ an integer case) of the class of symmetric multivariate Pearson type VII distributions with density generators
$$g(u) = (\pi m)^{-\frac{1}{2}r}\, \frac{\Gamma(N)}{\Gamma(N - r/2)} \left( 1 + \frac{u}{m} \right)^{-N}, \qquad N > r/2, \quad m > 0.$$
Again we refer to Fang, Kotz, and Ng (1990), §3.3 for further discussion.

Elliptically contoured distributions are well adapted to modelling any desired rate of tail decay. Indeed, the more slowly $g(u)$ decays as $u$ increases, the more slowly $f(x)$ decays as $x$ moves away from $\mu$. The whole range of rate of tail decay is possible. For example, the $r$-variate Student t-distribution with $m$ degrees of freedom,
$$g(u) = \text{const} \left( 1 + \frac{u}{m-2} \right)^{-\frac{1}{2}(r+m)},$$
gives Pareto or power-law decay, while
$$g(u) = \text{const} \cdot \exp\left\{ -\tfrac{1}{2} u \right\}$$
gives the multinormal case, with log-quadratic decay.

So far as modeling skewness or asymmetry is concerned: the function $f$ in (EC) is certainly not symmetrical in its components $x_i$ (unless $\Sigma$ is the identity matrix). Nevertheless, the functional form (EC), which restricts $f$ to have elliptical contours (paths in $x$-space of constant $f$-value) does impose a partial symmetry restriction, and we say that $f$ is elliptically symmetric.

Some elliptically contoured distributions are infinitely divisible; in principle, these can be identified via the function $\psi$ from (EC') and the Lévy-Khintchine formula. Restricting further to self-decomposability, we obtain the class
$$SDEC := SD \cap EC.$$
Recall that laws in SD are unimodal (§2.10), which corresponds via (EC) to $g$ being decreasing. We shall meet examples in §2.12 below.


2.12 Hyperbolic Distributions

Our concern here is the hyperbolic family, a four-parameter family with two type and two shape parameters. Recall that, for normal (Gaussian) distributions, the log-density is quadratic - that is, parabolic - and the tails are very thin. The hyperbolic family is specified by taking the log-density instead to be hyperbolic, and this leads to thicker tails as desired (but not as thick as for the stable family).

Before turning to the specifics of notation, parametrization etc., we comment briefly on the origin and scope of the hyperbolic distributions. Both the definition and the bulk of applications stem from Barndorff-Nielsen and co-workers. Thus Barndorff-Nielsen (1977) contains the definition and an application to the distribution function of particle size in a medium such as sand (see also Barndorff-Nielsen, Blæsild, Jensen, and Sørensen (1985)). Later, in Barndorff-Nielsen, Blæsild, Jensen, and Sørensen (1985), hyperbolic distribution functions are used to model turbulence. Now the phenomenon of atmospheric turbulence may be regarded as a mechanism whereby energy, when present in localized excess on one volume scale in air, cascades downwards to smaller and smaller scales (note the analogy to the decay of larger particles into smaller and smaller ones in the sand studies). Barndorff-Nielsen had the acute insight that this 'energy cascade effect' might be paralleled in the 'information cascade effect', whereby price-sensitive information originates in, say, a global newsflash, and trickles down through national and local level to smaller and smaller units of the economic and social environment. This insight is acknowledged by Eberlein and Keller (1995) (see also Eberlein, Keller, and Prause (1998) and Eberlein and Raible (1998)), who introduced hyperbolic distribution functions into finance and gave detailed empirical studies of their use to model financial data, particularly daily stock returns. Further and related studies are Bibby and Sørensen (1997), Chan (1999), Eberlein and Jacod (1997), Küchler (1999), Rydberg (1999) and Rydberg (1997).

We need some background on Bessel functions, see Watson (1944).
Recall the Bessel functions $J_\nu$ of the first kind, Watson (1944), §3.11, $Y_\nu$ of the second kind, Watson (1944), §3.53, and $K_\nu$, Watson (1944), §3.7, there called a Bessel function with imaginary argument or Macdonald function, nowadays usually called a Bessel function of the third kind. From the integral representation
$$K_\nu(x) = \frac{1}{2} \int_0^\infty u^{\nu-1} \exp\left\{ -\tfrac{1}{2} x (u + 1/u) \right\} du \qquad (x > 0) \qquad (2.7)$$
(Watson (1944), §6.23) one sees that
$$f(x) = \frac{(\psi/\chi)^{\frac{1}{2}\lambda}}{2 K_\lambda(\sqrt{\chi\psi})}\, x^{\lambda-1} \exp\left\{ -\tfrac{1}{2} (\psi x + \chi/x) \right\} \qquad (x > 0) \qquad (2.8)$$
is a probability density function. The corresponding law is called the generalized inverse Gaussian $GIG_{\lambda,\psi,\chi}$; the inverse Gaussian is the case $\lambda = 1$: $IG_{\chi,\psi} = GIG_{1,\psi,\chi}$. These laws were introduced by Good (1953); for a monograph treatment of their statistical properties, see Jørgensen (1982), and for their role in models of financial markets, Shiryaev (1999), III, 1.d.


Now consider a Gaussian (normal) law $N(\mu + \beta\sigma^2, \sigma^2)$ where the parameter $\sigma^2$ is random and is sampled from $GIG_{1,\psi,\chi}$. The resulting law is a mean-variance mixture of normal laws, the mixing law being generalized inverse Gaussian. It is written $\mathbb{E}_{\sigma^2} N(\mu + \beta\sigma^2, \sigma^2)$; it has a density of the form
$$\frac{\sqrt{\alpha^2 - \beta^2}}{2\alpha\delta K_1\big( \delta\sqrt{\alpha^2 - \beta^2} \big)} \exp\left\{ -\alpha\sqrt{\delta^2 + (x-\mu)^2} + \beta(x-\mu) \right\} \qquad (2.9)$$
(Barndorff-Nielsen (1977)), where $\alpha^2 = \psi + \beta^2$ and $\delta^2 = \chi$. Just as the Gaussian law has log-density a quadratic - or parabolic - function, so this law has log-density a hyperbolic function. It is accordingly called a hyperbolic distribution.

Various parametrizations are possible. Here $\mu$ is a location and $\delta$ a scale parameter, while $\alpha > 0$ and $\beta$ ($0 \le |\beta| < \alpha$) are shape parameters. One may pass from $(\alpha, \beta)$ to $(\phi, \gamma)$ via $\alpha = (\phi + \gamma)/2$, $\beta = (\phi - \gamma)/2$, so $\phi\gamma = \alpha^2 - \beta^2$, and then to $(\xi, \chi)$ via
$$\xi = \left( 1 + \delta\sqrt{\phi\gamma} \right)^{-\frac{1}{2}}, \qquad \chi = \frac{\xi\beta}{\alpha} = \xi\, \frac{\phi - \gamma}{\phi + \gamma}.$$

This parameterization (in which $\xi$ and $\chi$ correspond to the classical shape parameters of skewness and kurtosis) has the advantage of being affine invariant (invariant under changes of location and scale). The range of $(\xi, \chi)$ is the interior of a triangle
$$\nabla = \{ (\xi, \chi) : 0 \le |\chi| < \xi < 1 \},$$
called the shape triangle (see Figure 1). It suffices for our purpose to restrict to the centred ($\mu = 0$) symmetric ($\beta = 0$, or $\chi = 0$) case, giving the two parameter family of densities (writing $\zeta = \xi^{-2} - 1$)
$$hyp_{\zeta,\delta}(x) = \frac{1}{2\delta K_1(\zeta)} \exp\left\{ -\zeta\sqrt{1 + \left( \frac{x}{\delta} \right)^2} \right\} \qquad (\zeta, \delta > 0). \qquad (2.10)$$
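As a numerical sanity check on (2.7) and (2.10), the following sketch may help. It is not taken from the text: the function names, step sizes and truncation limits are illustrative assumptions. It evaluates $K_\nu$ by the trapezoidal rule after the substitution $u = e^s$ in (2.7), which gives $K_\nu(x) = \int_0^\infty e^{-x\cosh s}\cosh(\nu s)\,ds$, and then checks that the density (2.10) integrates to approximately 1.

```python
import math


def bessel_k(nu, x, step=1e-3, s_max=20.0):
    """Approximate K_nu(x), x > 0, from (2.7) with u = exp(s):
    K_nu(x) = integral over s >= 0 of exp(-x*cosh(s)) * cosh(nu*s) ds.
    Trapezoidal rule; step and s_max are illustrative choices."""
    n = int(s_max / step)
    total = 0.5 * math.exp(-x)  # integrand at s = 0 with trapezoid weight 1/2
    for j in range(1, n + 1):
        s = j * step
        exponent = -x * math.cosh(s)
        if exponent < -700.0:  # integrand is numerically zero from here on
            break
        total += math.exp(exponent) * math.cosh(nu * s)
    return total * step


def hyp_density(x, zeta, delta, k1):
    """Centred symmetric hyperbolic density (2.10); k1 must equal K_1(zeta)."""
    return math.exp(-zeta * math.sqrt(1.0 + (x / delta) ** 2)) / (2.0 * delta * k1)


def total_mass(zeta, delta, step=0.01, half_width=60.0):
    """Trapezoidal approximation of the integral of (2.10) over the real line."""
    k1 = bessel_k(1.0, zeta)
    n = int(half_width * delta / step)
    total = 0.5 * hyp_density(0.0, zeta, delta, k1)
    for j in range(1, n + 1):
        total += hyp_density(j * step, zeta, delta, k1)
    return 2.0 * step * total  # the density is symmetric about 0


if __name__ == "__main__":
    print(total_mass(1.0, 1.0))
    print(total_mass(2.0, 0.5))
```

The closed form $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$ gives an independent check on `bessel_k`. In practice one would normally use a library implementation of $K_\nu$ rather than this hand-rolled quadrature.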

Infinite Divisibility. Recall (Feller (1971), XIII, 7, Theorem 1) that a function $\omega$ is the Laplace transform of an infinitely divisible probability law on $\mathbb{R}_+$ iff $\omega = e^{-\psi}$, where $\psi(0) = 0$ and $\psi$ has a completely monotone derivative (that is, the derivatives of $\psi'$ alternate in sign). Grosswald (1976) showed that if
$$Q_\nu(x) := \frac{K_{\nu-1}(\sqrt{x})}{\sqrt{x}\, K_\nu(\sqrt{x})} \qquad (\nu \ge 0,\ x > 0),$$
then $Q_\nu$ is completely monotone. Hence Barndorff-Nielsen and Halgreen (1977) showed that the generalized inverse Gaussian laws GIG are infinitely divisible. Now the GIG are the mixing laws giving rise to the hyperbolic laws as normal mean-variance mixtures. This transfers infinite divisibility (see e.g. Kelker (1971), Keilson and Steutel (1974), §§1,2), so the hyperbolic laws are infinitely divisible.

Characteristic Functions. The mixture representation transfers to characteristic functions on taking the Fourier transform. It gives the characteristic function of $hyp_{\zeta,\delta}$ as

If 0,

Xn

/1-

(J'2/n,

/1-

In(x) Xn,

/1-;

2.

Xn

/1-,

bivariate normal density $f(x, y)$ with parameters $\mu_1, \mu_2$, $\sigma_1, \sigma_2$, $-1 < \rho < 1$:
$$f(x, y) := c \exp\left\{ -\tfrac{1}{2} Q(x, y) \right\}, \qquad c = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}},$$
$$Q(x, y) := \frac{1}{1 - \rho^2} \left[ \left( \frac{x - \mu_1}{\sigma_1} \right)^2 - 2\rho \left( \frac{x - \mu_1}{\sigma_1} \right) \left( \frac{y - \mu_2}{\sigma_2} \right) + \left( \frac{y - \mu_2}{\sigma_2} \right)^2 \right].$$
1. Show that this is indeed a density (integrates to 1). (Complete the square - cf. solving quadratic equations!)
2. Show that if $(X, Y)$ has density $f(x, y)$, $X$ is $N(\mu_1, \sigma_1^2)$ (and so $Y$ is $N(\mu_2, \sigma_2^2)$ by symmetry).
3. Show that the conditional density of $Y$ given $X = x$ is normal, i.e.
$$Y \mid X = x \ \sim\ N\left( \mu_2 + \rho\frac{\sigma_2}{\sigma_1}(x - \mu_1),\ \sigma_2^2(1 - \rho^2) \right).$$
4. Interpret the conditional means in terms of the population regression line of §2.5, and its sample analogue given by the familiar method of least squares.
5. Interpret the conditional variance as decomposing the variance (= variability) of $Y$ into two parts, the part $\sigma_2^2\rho^2$ accounted for by knowledge of $X$ and the remaining part $\sigma_2^2(1 - \rho^2)$.
6. Interpret $\rho^2$ as the fraction of the variability of $Y$ accounted for by knowledge of $X$ (or vice versa, by symmetry).
7. Show that $(X, Y)$ has (joint) characteristic function
$$\phi(t_1, t_2) := \mathbb{E}(\exp\{ i t_1 X + i t_2 Y \}) = \exp\left\{ i(\mu_1 t_1 + \mu_2 t_2) - \tfrac{1}{2} \left( \sigma_1^2 t_1^2 + 2\rho\sigma_1\sigma_2 t_1 t_2 + \sigma_2^2 t_2^2 \right) \right\}.$$

8. Show that the fifth parameter $\rho$ is the correlation coefficient between $X$ and $Y$.
9. Deduce that $X$, $Y$ are independent if and only if they are uncorrelated (recall that independent random variables are uncorrelated when they are square-integrable - have variances, so a covariance and a correlation - but the converse is false in general).

Note. The bivariate normal distribution models the common situation of two random variables, each of which is partially but not completely informative about the other. It is basic to statistics - in particular, to regression, and so is its extension to higher dimensions, the multivariate normal distribution. In finance, it is basic to correlation-based options - options such as quantos, involving the currencies of two different but interlinked economies.
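For the hint 'complete the square' in Exercises 1 to 3 above, one possible route is sketched below. This is an illustration rather than the text's own solution; the shorthand $u$, $v$ for the standardized variables is introduced here only for convenience.

```latex
% Standardize: u = (x - \mu_1)/\sigma_1, \quad v = (y - \mu_2)/\sigma_2.
\begin{align*}
Q(x,y) &= \frac{u^2 - 2\rho u v + v^2}{1-\rho^2}
        = u^2 + \frac{(v - \rho u)^2}{1-\rho^2}, \\[4pt]
f(x,y) &= \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma_1}
          \exp\Big\{-\tfrac12 \Big(\frac{x-\mu_1}{\sigma_1}\Big)^2\Big\}}_{N(\mu_1,\,\sigma_1^2)\text{ density in } x}
          \cdot
          \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}
          \exp\Big\{-\frac{\big(y - \mu_2 - \rho\frac{\sigma_2}{\sigma_1}(x-\mu_1)\big)^2}
                          {2\sigma_2^2(1-\rho^2)}\Big\}}_{N\left(\mu_2 + \rho\frac{\sigma_2}{\sigma_1}(x-\mu_1),\ \sigma_2^2(1-\rho^2)\right)\text{ density in } y}.
\end{align*}
```

Integrating out $y$ first (the second factor integrates to 1 for each fixed $x$) and then $x$ gives total mass 1, identifies the marginal law of $X$, and exhibits the second factor as the conditional density of $Y$ given $X = x$.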

3. Stochastic Processes in Discrete Time

3.1 Information and Filtrations

Access to full, accurate, up-to-date information is clearly essential to anyone actively engaged in financial activity or trading. Indeed, information is arguably the most important determinant of success in financial life. Partly for simplicity, partly to reflect the legislation and regulations against insider trading, we shall confine ourselves to the situation where agents take decisions on the basis of information in the public domain, and available to all. We shall further assume that information once known remains known - is not forgotten - and can be accessed in real time.

In reality, of course, matters are more complicated. Information overload is as much of a danger as information scarcity. The ability to retain information, organize it, and access it quickly is one of the main factors that will discriminate between the abilities of different economic agents to react to changing market conditions. However, we restrict ourselves here to the simplest possible situation and do not differentiate between agents on the basis of their information-processing abilities. Thus as time passes, new information becomes available to all agents, who continually update their information. What we need is a mathematical language to model this information flow, unfolding with time. This is provided by the idea of a filtration; we outline below the elements of this theory that we shall need.

The Kolmogorov triples $(\Omega, \mathcal{F}, \mathbb{P})$, and the Kolmogorov conditional expectations $\mathbb{E}(X|\mathcal{B})$, give us all the machinery we need to handle static situations involving randomness. To handle dynamic situations, involving randomness which unfolds with time, we need further structure.

We may take the initial, or starting, time as $t = 0$. Time may evolve discretely, or continuously. We postpone the continuous case to Chapter 5; in the discrete case, we may suppose time evolves in integer steps, $t = 0, 1, 2, \ldots$ (say, stock-market quotations daily, or tick data by the second). There may be a final time $T$, or time horizon, or we may have an infinite time horizon (in the context of option pricing, the time horizon $T$ is the expiry time).

We wish to model a situation involving randomness unfolding with time. As above, we suppose, for simplicity, that information is never lost (or forgotten): thus, as time increases we learn more. We recall from Chapter 2 that $\sigma$-algebras represent information or knowledge. We thus need a sequence of $\sigma$-algebras $\mathbb{F} = \{ \mathcal{F}_n : n = 0, 1, 2, \ldots \}$, which are increasing:
$$\mathcal{F}_n \subset \mathcal{F}_{n+1} \qquad (n = 0, 1, 2, \ldots),$$
with $\mathcal{F}_n$ representing the information, or knowledge, available to us at time $n$. We shall always suppose all $\sigma$-algebras to be complete (this can be avoided, and is not always appropriate, but it simplifies matters and suffices for our purposes). Thus, $\mathcal{F}_0$ represents the initial information (if there is none, $\mathcal{F}_0 = \{ \emptyset, \Omega \}$, the trivial $\sigma$-algebra). On the other hand,

$$\mathcal{F}_\infty := \sigma\Big( \bigcup_n \mathcal{F}_n \Big)$$
represents all we ever will know (the 'Doomsday $\sigma$-algebra'). Often, $\mathcal{F}_\infty$ will be $\mathcal{F}$ (the $\sigma$-algebra from Chapter 2, representing 'knowing everything'). But this will not always be so; see e.g. Williams (1991), §15.8 for an interesting example.

Such a family $\mathbb{F} := \{ \mathcal{F}_n : n = 0, 1, 2, \ldots \}$ is called a filtration; a probability space endowed with such a filtration, $\{ \Omega, \mathbb{F}, \mathcal{F}, \mathbb{P} \}$ is called a stochastic basis or filtered probability space. These definitions are due to P. A. Meyer of Strasbourg; Meyer and the Strasbourg (and more generally, French) school of probabilists have been responsible for the 'general theory of (stochastic) processes', and for much of the progress in stochastic integration since the 1960s; see e.g. Dellacherie and Meyer (1978), Dellacherie and Meyer (1982), Meyer (1966), Meyer (1976).

For the special case of a finite state space $\Omega = \{ \omega_1, \ldots, \omega_n \}$ and a given $\sigma$-algebra $\mathcal{F}$ on $\Omega$ (which in this case is just an algebra), we can always find a unique finite partition $\mathcal{P} = \{ A_1, \ldots, A_l \}$ of $\Omega$, i.e. the sets $A_i$ are disjoint and $\bigcup_{i=1}^{l} A_i = \Omega$, corresponding to $\mathcal{F}$. A filtration $\mathbb{F}$ therefore corresponds to a sequence of finer and finer partitions $\mathcal{P}_n$. At time $t = 0$ the agents only know that some event $\omega \in \Omega$ will happen, at time $T < \infty$ they know which specific event $\omega^*$ has happened. During the flow of time the agents learn the specific structure of the ($\sigma$-)algebras $\mathcal{F}_n$, which means they learn the corresponding partitions $\mathcal{P}_n$. Having the information in $\mathcal{F}_n$ revealed is equivalent to knowing in which $A_i^{(n)} \in \mathcal{P}_n$ the event $\omega^*$ is. Since the partitions become finer the information on $\omega^*$ becomes more detailed with each step.

Unfortunately this nice interpretation breaks down as soon as $\Omega$ becomes infinite. It turns out that the concept of filtrations rather than that of partitions is relevant for the more general situations of infinite $\Omega$, infinite $T$ and continuous-time processes.
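The correspondence between a filtration and a sequence of finer and finer partitions can be made concrete on a small finite sample space. The sketch below is an illustration under assumed names (`partition_at`, `refines`) and an assumed model, three coin tosses with $\Omega = \{H, T\}^3$; it is not from the text. The partition at time $n$ groups together the outcomes that agree on the first $n$ tosses.

```python
from itertools import product


def partition_at(n, horizon=3):
    """Partition of Omega = {H,T}^horizon generated by the first n tosses.

    Two outcomes lie in the same block exactly when their first n tosses agree,
    i.e. when they cannot be told apart with the information available at time n.
    """
    omega = ["".join(w) for w in product("HT", repeat=horizon)]
    blocks = {}
    for w in omega:
        blocks.setdefault(w[:n], set()).add(w)
    return [frozenset(b) for b in blocks.values()]


def refines(finer, coarser):
    """True if every block of `finer` is contained in some block of `coarser`."""
    return all(any(a <= b for b in coarser) for a in finer)


partitions = [partition_at(n) for n in range(4)]
sizes = [len(p) for p in partitions]  # number of blocks at times 0, 1, 2, 3

if __name__ == "__main__":
    for n, p in enumerate(partitions):
        print(n, sorted(sorted(block) for block in p))
```

At time 0 there is a single block (the trivial $\sigma$-algebra); at time 3 every outcome is its own block, so $\omega^*$ is known exactly.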

3.2 Discrete-parameter Stochastic Processes

The word 'stochastic' (derived from the Greek) is roughly synonymous with 'random'. It is perhaps unfortunate that usage favours 'stochastic process' rather than the simpler 'random process', but as it does, we shall follow it.

We need a framework which can handle dynamic situations, in which time evolves, and in which new information unfolds with time. In particular, we need to be able to speak in terms of 'the information available at time $n$', or, 'what we know at time $n$'. Further, we need to be able to increase $n$ - thereby increasing the information available as new information (typically, new price information) comes in, and talk about the information flow over time. One has a clear mental picture of what is meant by this - there is no conceptual difficulty. However, what is needed is a precise mathematical construct, which can be conveniently manipulated - perhaps in quite complicated ways - and yet which bears the above heuristic meaning. Now 'information' is not only an ordinary word, but even a technical term in mathematics - many books have been written on the subject of information theory. However, information theory in this sense is not what we need: for us, the emphasis is on the flow of information, and how to model and describe it.

With this by way of motivation, we proceed to give some of the necessary definitions. A stochastic process $X = \{ X_n : n \in I \}$ is a family of random variables, defined on some common probability space, indexed by an index-set $I$. Usually (always in this book), $I$ represents time (sometimes $I$ represents space, and one calls $X$ a spatial process). Here, $I = \{ 0, 1, 2, \ldots, T \}$ (finite horizon) or $I = \{ 0, 1, 2, \ldots \}$ (infinite horizon). The (stochastic) process $X = (X_n)_{n=0}^{\infty}$ is said to be adapted to the filtration $\mathbb{F} = (\mathcal{F}_n)_{n=0}^{\infty}$ if
$$X_n \text{ is } \mathcal{F}_n\text{-measurable for all } n.$$
So if $X$ is adapted, we will know the value of $X_n$ at time $n$. If
$$\mathcal{F}_n = \sigma(X_0, X_1, \ldots, X_n)$$
we call $(\mathcal{F}_n)$ the natural filtration of $X$. Thus a process is always adapted to its natural filtration. A typical situation is that
$$\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$$
is the natural filtration of some process $W = (W_n)$. Then $X$ is adapted to $\mathbb{F} = (\mathcal{F}_n)$, i.e. each $X_n$ is $\mathcal{F}_n$- (or $\sigma(W_0, \ldots, W_n)$-) measurable, iff
$$X_n = f_n(W_0, W_1, \ldots, W_n)$$
for some measurable function $f_n$ (non-random) of $n + 1$ variables.

Notation. For a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$, $X(\omega)$ is the value $X$ takes on $\omega$ ($\omega$ represents the randomness). For a stochastic process $X = (X_n)$, it is convenient (e.g., if using suffixes, $n_i$ say) to use $X_n$, $X(n)$ interchangeably, and we shall feel free to do this. With $\omega$ displayed, these become $X_n(\omega)$, $X(n, \omega)$ etc.

The concept of a stochastic process is very general - and so very flexible - but it is too general for useful progress to be made without specifying further structure or further restrictions. There are two main types of stochastic process which are both general enough to be sufficiently flexible to model many commonly encountered situations, and sufficiently specific and structured to have a rich and powerful theory. These two types are Markov processes and martingales. A Markov process models a situation in which where one is, is all one needs to know when wishing to predict the future - how one got there provides no further information. Such a 'lack of memory' property, though an idealization of reality, is very useful for modeling purposes. We shall encounter Markov processes more in continuous time (see Chapter 5) than in discrete time, where usage dictates that they are called Markov chains, see §3.8. For an excellent and accessible recent treatment of Markov chains, see e.g. Norris (1997). Martingales, on the other hand (see §3.3 below) model fair gambling games - situations where there may be lots of randomness (or unpredictability), but no tendency to drift one way or another: rather, there is a tendency towards stability, in that the chance influences tend to cancel each other out on average.
3.3 Definition and Basic Properties of Martingales

Excellent accounts of discrete-parameter martingales are Neveu (1975), Jacod and Protter (2000), Williams (1991) and Williams (2001) to which we refer the reader for detailed discussions. We will summarize what we need to use martingales for modeling in finance.

Definition 3.3.1. A process $X = (X_n)$ is called a martingale relative to $(\{ \mathcal{F}_n \}, \mathbb{P})$ if
(i) $X$ is adapted (to $\{ \mathcal{F}_n \}$);
(ii) $\mathbb{E}|X_n| < \infty$ for all $n$;
(iii) $\mathbb{E}[X_n | \mathcal{F}_{n-1}] = X_{n-1}$ $\mathbb{P}$-a.s. ($n \ge 1$).
$X$ is a supermartingale if in place of (iii)
$$\mathbb{E}[X_n | \mathcal{F}_{n-1}] \le X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1);$$
$X$ is a submartingale if in place of (iii)
$$\mathbb{E}[X_n | \mathcal{F}_{n-1}] \ge X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1).$$

Martingales have a useful interpretation in terms of dynamic games: a martingale is 'constant on average', and models a fair game; a supermartingale is 'decreasing on average', and models an unfavourable game; a submartingale is 'increasing on average', and models a favourable game.

Note. 1. Martingales have many connections with harmonic functions in probabilistic potential theory. The terminology in the inequalities above comes from this: supermartingales correspond to superharmonic functions, submartingales to subharmonic functions.
2. $X$ is a submartingale (supermartingale) if and only if $-X$ is a supermartingale (submartingale); $X$ is a martingale if and only if it is both a submartingale and a supermartingale.
3. $(X_n)$ is a martingale if and only if $(X_n - X_0)$ is a martingale. So we may without loss of generality take $X_0 = 0$ when convenient.
4. If $X$ is a martingale, then for $m < n$ using the iterated conditional expectation and the martingale property repeatedly (all equalities are in the a.s.-sense)
$$\mathbb{E}[X_n | \mathcal{F}_m] = \mathbb{E}[\mathbb{E}(X_n | \mathcal{F}_{n-1}) | \mathcal{F}_m] = \mathbb{E}[X_{n-1} | \mathcal{F}_m] = \ldots = \mathbb{E}[X_m | \mathcal{F}_m] = X_m,$$

and similarly for submartingales, supermartingales.

From the Oxford English Dictionary: martingale (etymology unknown) 1. 1589. An article of harness, to control a horse's head. 2. Naut. A rope for guying down the jib-boom to the dolphin-striker. 3. A system of gambling which consists in doubling the stake when losing in order to recoup oneself (1815). Thackeray: 'You have not played as yet? Do not do so; above all avoid a martingale if you do.'

Gambling games have been studied since time immemorial - indeed, the Pascal-Fermat correspondence of 1654 which started the subject was on a problem (de Méré's problem) related to gambling. The doubling strategy above has been known at least since 1815. The term 'martingale' in our sense is due to J. Ville (1939). Martingales were studied by Paul Lévy (1886-1971) from 1934 on (see obituary Loève (1973)) and by J.L. Doob (1910-) from 1940 on. The first systematic exposition was Doob (1953). This classic book, though hard going, is still a valuable source of information.

Examples. 1. Mean zero random walk: $S_n = \sum X_i$, with $X_i$ independent with $\mathbb{E}(X_i) = 0$ is a martingale (submartingale: positive mean; supermartingale: negative mean).
2. Stock prices: $S_n = S_0 \zeta_1 \cdots \zeta_n$ with $\zeta_i$ independent positive r.vs with existing first moment.
3. Accumulating data about a random variable (Williams (1991), pp. 96, 166-167). If $\xi \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, $M_n := \mathbb{E}(\xi | \mathcal{F}_n)$ (so $M_n$ represents our best estimate of $\xi$ based on knowledge at time $n$), then using iterated conditional expectations
$$\mathbb{E}[M_n | \mathcal{F}_{n-1}] = \mathbb{E}[\mathbb{E}(\xi | \mathcal{F}_n) | \mathcal{F}_{n-1}] = \mathbb{E}[\xi | \mathcal{F}_{n-1}] = M_{n-1},$$
so $(M_n)$ is a martingale. One has the convergence
$$M_n \to M_\infty := \mathbb{E}[\xi | \mathcal{F}_\infty] \quad \text{a.s. and in } L^1.$$
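The defining property (iii) can be checked exactly on a finite coin-tossing model. The sketch below is an illustration only: the helper names, the $\pm 1$ step model and the up/down factors $3/2$, $1/2$ are assumptions chosen for the example. A process is represented as a function of the tuple of steps seen so far, and $\mathbb{E}[X_n | \mathcal{F}_{n-1}]$ on each history is the probability-weighted average over the next step.

```python
from fractions import Fraction
from itertools import product


def cond_exp_next(process, history, p=Fraction(1, 2)):
    """E[X_n | F_{n-1}] on the atom `history` (tuple of +1/-1 steps, length n-1).

    Each step is +1 with probability p and -1 with probability 1 - p,
    independently of the past.
    """
    return p * process(history + (1,)) + (1 - p) * process(history + (-1,))


def classify(process, horizon, p=Fraction(1, 2)):
    """Classify `process` up to time `horizon` by the sign of
    E[X_n | F_{n-1}] - X_{n-1} over every history."""
    diffs = [cond_exp_next(process, h, p) - process(h)
             for n in range(horizon)
             for h in product((1, -1), repeat=n)]
    if all(d == 0 for d in diffs):
        return "martingale"
    if all(d >= 0 for d in diffs):
        return "submartingale"
    if all(d <= 0 for d in diffs):
        return "supermartingale"
    return "none"


def walk(steps):
    """Random walk S_n = X_1 + ... + X_n (Example 1)."""
    return sum(steps)


def squared_walk(steps):
    """S_n^2, the square of the random walk."""
    return sum(steps) ** 2


def price(steps, up=Fraction(3, 2), down=Fraction(1, 2)):
    """Product of positive factors with S_0 = 1 (Example 2); up/down are illustrative."""
    value = Fraction(1)
    for s in steps:
        value *= up if s == 1 else down
    return value


if __name__ == "__main__":
    print(classify(walk, 4))                    # fair coin
    print(classify(walk, 4, Fraction(1, 3)))    # negative drift
    print(classify(squared_walk, 4))
    print(classify(price, 4))                   # factors have mean 1 here
```

With a fair coin the chosen factors have mean $\frac{1}{2} \cdot \frac{3}{2} + \frac{1}{2} \cdot \frac{1}{2} = 1$, which is what makes the price process in this illustration a martingale.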

3.4 Martingale Transforms

Now think of a gambling game, or series of speculative investments, in discrete time. There is no play at time 0; there are plays at times $n = 1, 2, \ldots$, and
$$\Delta X_n := X_n - X_{n-1}$$
represents our net winnings per unit stake at play $n$. Thus if $X_n$ is a martingale, the game is 'fair on average'.

Call a process $C = (C_n)_{n=1}^{\infty}$ predictable if $C_n$ is $\mathcal{F}_{n-1}$-measurable for all $n \ge 1$. Think of $C_n$ as your stake on play $n$ ($C_0$ is not defined, as there is no play at time 0). Predictability says that you have to decide how much to stake on play $n$ based on the history before time $n$ (i.e., up to and including play $n - 1$). Your winnings on game $n$ are $C_n \Delta X_n = C_n (X_n - X_{n-1})$. Your total (net) winnings up to time $n$ are
$$Y_n = \sum_{k=1}^{n} C_k \Delta X_k = \sum_{k=1}^{n} C_k (X_k - X_{k-1}).$$
We write $Y = C \bullet X$, $Y_n = (C \bullet X)_n$, $\Delta Y_n = C_n \Delta X_n$ ($(C \bullet X)_0 = 0$ as $\sum_{k=1}^{0}$ is empty), and call $C \bullet X$ the martingale transform of $X$ by $C$.

Theorem 3.4.1. (i) If $C$ is a bounded non-negative predictable process and $X$ is a supermartingale, $C \bullet X$ is a supermartingale null at zero.
(ii) If $C$ is bounded and predictable and $X$ is a martingale, $C \bullet X$ is a martingale null at zero.

Proof. $Y = C \bullet X$ is integrable, since $C$ is bounded and $X$ integrable. Now
$$\mathbb{E}[Y_n - Y_{n-1} | \mathcal{F}_{n-1}] = \mathbb{E}[C_n (X_n - X_{n-1}) | \mathcal{F}_{n-1}] = C_n \mathbb{E}[(X_n - X_{n-1}) | \mathcal{F}_{n-1}]$$
(as $C_n$ is bounded, so integrable, and $\mathcal{F}_{n-1}$-measurable, so can be taken out)
$\le 0$ in case (i), as $C \ge 0$ and $X$ is a supermartingale,
$= 0$ in case (ii), as $X$ is a martingale. $\Box$
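The classical doubling strategy is a convenient test case for Theorem 3.4.1 (ii) over a finite horizon. The sketch below is illustrative and uses assumed names (`doubling_stake`, `transform`, `expected_winnings`); $X$ is taken to be the symmetric $\pm 1$ random walk. The stake on each play depends only on earlier outcomes, so $C$ is predictable, and it is bounded on any fixed finite horizon. Enumerating all paths shows that the expected total winnings are exactly zero.

```python
from fractions import Fraction
from itertools import product


def doubling_stake(history):
    """Stake C_n chosen from the outcomes before play n (hence predictable).

    Start with stake 1, double after every loss, stop betting after the first win.
    """
    if any(step == 1 for step in history):
        return 0
    return 2 ** len(history)


def transform(stake, steps):
    """(C . X)_n = sum over k of C_k * (X_k - X_{k-1}) along one path.

    `steps` holds the increments X_k - X_{k-1} (each +1 or -1); the stake for
    play k+1 is computed from steps[:k] only.
    """
    return sum(stake(steps[:k]) * steps[k] for k in range(len(steps)))


def expected_winnings(stake, horizon):
    """E[(C . X)_horizon] for a fair coin, by exact enumeration of all paths."""
    paths = list(product((1, -1), repeat=horizon))
    return sum(Fraction(transform(stake, path)) for path in paths) / len(paths)


if __name__ == "__main__":
    print(transform(doubling_stake, (-1, -1, 1)))   # lose 1, lose 2, win 4
    print(expected_winnings(doubling_stake, 5))
```

The small gain of 1 on every path that contains a win is balanced exactly by the large loss $2^N - 1$ on the single path of $N$ straight losses.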

Interpretation. You can't beat the system! In the martingale case, predictability of $C$ means we can't foresee the future (which is realistic and fair). So we expect to gain nothing - as we should.

Note. 1. Martingale transforms were introduced and studied by Burkholder (1966). For a textbook account, see e.g. Neveu (1975), VIII.4.
2. Martingale transforms are the discrete analogues of stochastic integrals. They dominate the mathematical theory of finance in discrete time, just as stochastic integrals dominate the theory in continuous time.
3. We will deal with stochastic integrals in Chapter 5, where we cover Itô calculus, a probabilistic elaboration of ordinary calculus, the old-fashioned term for which is infinitesimal calculus. Martingale transforms belong to the probabilistic elaboration of the discrete analogue of this, the calculus of finite differences. The passage from discrete to continuous is written formally as
$$\Delta Y_n = C_n \Delta X_n \ \longrightarrow\ dY = C\,dX,$$
where the $dY$, $dX$ on the right are stochastic differentials.

Lemma 3.4.1 (Martingale Transform Lemma). An adapted sequence of real integrable random variables $(X_n)$ is a martingale iff for any bounded predictable sequence $(C_n)$,
$$\mathbb{E}\left( \sum_{k=1}^{n} C_k \Delta X_k \right) = 0 \qquad (n = 1, 2, \ldots).$$
Proof. If $(X_n)$ is a martingale, $Y$ defined by $Y_0 = 0$,
$$Y_n = \sum_{k=1}^{n} C_k \Delta X_k \qquad (n \ge 1)$$
is the martingale transform $C \bullet X$, so is a martingale. Now $\mathbb{E}(Y_1) = \mathbb{E}(C_1 \mathbb{E}(X_1 - X_0 | \mathcal{F}_0)) = 0$ and we see by induction that
$$\mathbb{E}(Y_{n+1}) = \mathbb{E}(C_{n+1}(X_{n+1} - X_n)) + \mathbb{E}(Y_n) = 0.$$
Conversely, if the condition of the proposition holds, choose $j$, and for any $\mathcal{F}_j$-measurable set $A$ write $C_n = 0$ for $n \ne j + 1$, $C_{j+1} = 1_A$. Then $(C_n)$ is predictable, so the condition of the proposition, $\mathbb{E}(\sum_{k=1}^{n} C_k \Delta X_k) = 0$, becomes
$$\mathbb{E}[1_A (X_{j+1} - X_j)] = 0.$$
Since this holds for every set $A \in \mathcal{F}_j$, the definition of conditional expectation gives
$$\mathbb{E}(X_{j+1} | \mathcal{F}_j) = X_j.$$
Since this holds for every $j$, $(X_n)$ is a martingale. $\Box$

Remark 3.4.1. The proof above is a good example of the value of Kolmogorov's definition of conditional expectation - which reveals itself, not in immediate transparency, but in its ease of handling in proofs. We shall see in Chapter 4 the financial significance of martingale transforms $H \bullet M$.

3.5 Stopping Times and Optional Stopping

A random variable $\tau$ taking values in $\{ 0, 1, 2, \ldots; +\infty \}$ is called a stopping time (or optional time) if
$$\{ \tau \le n \} = \{ \omega : \tau(\omega) \le n \} \in \mathcal{F}_n \qquad \forall\, n \le \infty.$$
From $\{ \tau = n \} = \{ \tau \le n \} \setminus \{ \tau \le n - 1 \}$ and $\{ \tau \le n \} = \bigcup_{k \le n} \{ \tau = k \}$, we see the equivalent characterization
$$\{ \tau = n \} \in \mathcal{F}_n \qquad \forall\, n \le \infty.$$
Call a stopping time $\tau$ bounded if there is a constant $K$ such that $\mathbb{P}(\tau \le K) = 1$. (Since $\tau(\omega) \le K$ for some constant $K$ and all $\omega \in \Omega \setminus N$ with $\mathbb{P}(N) = 0$ all identities hold true except on a null set, i.e. almost surely.)

Example. Suppose $(X_n)$ is an adapted process and we are interested in the time of first entry of $X$ into a Borel set $B$ (typically one might have $B = [c, \infty)$):
$$\tau := \inf\{ n \ge 0 : X_n \in B \}.$$
Now $\{ \tau \le n \} = \bigcup_{k \le n} \{ X_k \in B \} \in \mathcal{F}_n$ and $\tau = \infty$ if $X$ never enters $B$. Thus $\tau$ is a stopping time.

Intuitively, think of $\tau$ as a time at which you decide to quit a gambling game: whether or not you quit at time $n$ depends only on the history up to and including time $n$ - NOT the future. Thus stopping times model gambling and other situations where there is no foreknowledge, or prescience of the future; in particular, in the financial context, where there is no insider trading. Furthermore since a gambler cannot cheat the system the expectation of his hypothetical fortune (playing with unit stake) should equal his initial fortune.

Theorem 3.5.1 (Doob's Stopping-time Principle (STP)). Let $\tau$ be a bounded stopping time and $X = (X_n)$ a martingale. Then $X_\tau$ is integrable, and
$$\mathbb{E}(X_\tau) = \mathbb{E}(X_0).$$
Proof. Assume $\tau(\omega) \le K$ for all $\omega$, where we can take $K$ to be an integer and write
$$X_{\tau(\omega)}(\omega) = \sum_{k=0}^{\infty} X_k(\omega) 1_{\{ \tau(\omega) = k \}} = \sum_{k=0}^{K} X_k(\omega) 1_{\{ \tau(\omega) = k \}}.$$
Thus using successively the linearity of the expectation operator, the martingale property of $X$, the $\mathcal{F}_k$-measurability of $\{ \tau = k \}$ and finally the definition of conditional expectation, we get
$$\mathbb{E}(X_\tau) = \mathbb{E}\left[ \sum_{k=0}^{K} X_k 1_{\{ \tau = k \}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ X_k 1_{\{ \tau = k \}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ \mathbb{E}(X_K | \mathcal{F}_k) 1_{\{ \tau = k \}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ X_K 1_{\{ \tau = k \}} \right] = \mathbb{E}\left[ X_K \sum_{k=0}^{K} 1_{\{ \tau = k \}} \right] = \mathbb{E}(X_K) = \mathbb{E}(X_0). \qquad \Box$$
The stopping time principle holds also true if $X = (X_n)$ is a supermartingale; then the conclusion is
$$\mathbb{E} X_\tau \le \mathbb{E} X_0.$$
Also, alternative conditions such as
• $X = (X_n)$ is bounded ($|X_n(\omega)| \le L$ for some $L$ and all $n$, $\omega$);
• $\mathbb{E}\tau < \infty$ and $(X_n - X_{n-1})$ is bounded;
suffice for the proof of the stopping time principle.

The stopping time principle is important in many areas, such as sequential analysis in statistics. We turn in the next section to related ideas specific to the gambling/financial context.
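The stopping-time principle can be verified by brute force on a small example. The following sketch is illustrative only, with assumed names (`first_entry`, `stopped_expectation`): $X$ is the symmetric random walk started at 0, $\tau$ is the first entry time into $[c, \infty)$, and the bounded stopping time $\tau \wedge N$ is used so that Theorem 3.5.1 applies.

```python
from fractions import Fraction
from itertools import product


def first_entry(values, level):
    """tau = inf{n >= 0 : X_n >= level}; returns None if the path never enters."""
    for n, x in enumerate(values):
        if x >= level:
            return n
    return None


def stopped_expectation(horizon, level):
    """E[X_{tau ^ horizon}] for the symmetric +1/-1 random walk started at 0,
    computed exactly by enumerating all 2**horizon equally likely paths."""
    total = Fraction(0)
    for steps in product((1, -1), repeat=horizon):
        values = [0]
        for s in steps:
            values.append(values[-1] + s)
        tau = first_entry(values, level)
        bounded_tau = horizon if tau is None else min(tau, horizon)
        total += values[bounded_tau]
    return total / 2 ** horizon


if __name__ == "__main__":
    print(stopped_expectation(6, 1))
    print(stopped_expectation(5, 2))
```

Both values equal $\mathbb{E}(X_0) = 0$. Stopping at the unbounded time $\tau$ itself (with $c = 1$) would instead give $X_\tau = 1$ whenever $\tau$ is finite, which is why some boundedness or integrability condition is needed.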

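We now wish to create the concept of the $\sigma$-algebra of events observable up to a stopping time $\tau$, in analogy to the $\sigma$-algebra $\mathcal{F}_n$ which represents the events observable up to time $n$.

Definition 3.5.1. Let $\tau$ be a stopping time. The stopping time $\sigma$-algebra $\mathcal{F}_\tau$ is defined to be
$$\mathcal{F}_\tau = \{ A \in \mathcal{F} : A \cap \{ \tau \le n \} \in \mathcal{F}_n, \text{ for all } n \}.$$
Proposition 3.5.1. For $\tau$ a stopping time, $\mathcal{F}_\tau$ is a $\sigma$-algebra.

Proof. We simply have to check the defining properties. Clearly $\Omega$, $\emptyset$ are in $\mathcal{F}_\tau$. Also for $A \in \mathcal{F}_\tau$ we find
$$A^c \cap \{ \tau \le n \} = \{ \tau \le n \} \setminus (A \cap \{ \tau \le n \}) \in \mathcal{F}_n,$$
thus $A^c \in \mathcal{F}_\tau$. Finally, for a family $A_i \in \mathcal{F}_\tau$, $i = 1, 2, \ldots$ we have
$$\Big( \bigcup_{i=1}^{\infty} A_i \Big) \cap \{ \tau \le n \} = \bigcup_{i=1}^{\infty} \big( A_i \cap \{ \tau \le n \} \big) \in \mathcal{F}_n. \qquad \Box$$
Proposition 3.5.2. Let $\sigma$, $\tau$ be stopping times with $\sigma \le \tau$. Then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.

Proof. Since $\sigma \le \tau$ we have $\{ \tau \le n \} \subseteq \{ \sigma \le n \}$. So for $A \in \mathcal{F}_\sigma$ we get
$$A \cap \{ \tau \le n \} = (A \cap \{ \sigma \le n \}) \cap \{ \tau \le n \} \in \mathcal{F}_n,$$
since $(\cdot) \in \mathcal{F}_n$ as $A \in \mathcal{F}_\sigma$. So $A \in \mathcal{F}_\tau$. $\Box$

Proposition 3.5.3. For any adapted sequence of random variables $X = (X_n)$ and a.s. finite stopping time $\tau$, define
$$X_\tau = \sum_{n=0}^{\infty} X_n 1_{\{ \tau = n \}}.$$
Then $X_\tau$ is $\mathcal{F}_\tau$-measurable.

Proof. Let $B$ be a Borel set. We need to show $\{ X_\tau \in B \} \in \mathcal{F}_\tau$. Now using the fact that on the set $\{ \tau = k \}$ we have $X_\tau = X_k$, we find
$$\{ X_\tau \in B \} \cap \{ \tau \le n \} = \bigcup_{k=0}^{n} \{ X_\tau \in B \} \cap \{ \tau = k \} = \bigcup_{k=0}^{n} \{ X_k \in B \} \cap \{ \tau = k \}.$$
Now sets $\{ X_k \in B \} \cap \{ \tau = k \} \in \mathcal{F}_k \subseteq \mathcal{F}_n$, and the result follows. $\Box$

We are now in position to obtain an important extension of the Stopping Time Principle, Theorem 3.5.1.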

85


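Theorem 3.5.2 (Doob's Optional-Sampling Theorem, OST). Let $X = (X_n)$ be a martingale and let $\sigma, \tau$ be bounded stopping times with $\sigma \le \tau$. Then
$$\mathbb{E}[X_\tau \mid \mathcal{F}_\sigma] = X_\sigma,$$
and thus $\mathbb{E}(X_\tau) = \mathbb{E}(X_\sigma)$.

Proof. First observe that $X_\tau$ and $X_\sigma$ are integrable (use the sum representation and the fact that $\tau$ is bounded by an integer $K$) and $X_\sigma$ is $\mathcal{F}_\sigma$-measurable by Proposition 3.5.3. So it only remains to prove that
$$\mathbb{E}(1_A X_\tau) = \mathbb{E}(1_A X_\sigma) \quad \text{for all } A \in \mathcal{F}_\sigma. \qquad (3.1)$$
For any such fixed $A \in \mathcal{F}_\sigma$, define $\rho$ by
$$\rho(\omega) = \sigma(\omega) 1_A(\omega) + \tau(\omega) 1_{A^c}(\omega).$$
Since
$$\{\rho \le n\} = (A \cap \{\sigma \le n\}) \cup (A^c \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
$\rho$ is a stopping time, and from $\rho \le \tau$ we see that $\rho$ is bounded. So the STP (Theorem 3.5.1) implies $\mathbb{E}(X_\rho) = \mathbb{E}(X_0) = \mathbb{E}(X_\tau)$. But
$$\mathbb{E}(X_\rho) = \mathbb{E}(X_\sigma 1_A + X_\tau 1_{A^c}), \qquad \mathbb{E}(X_\tau) = \mathbb{E}(X_\tau 1_A + X_\tau 1_{A^c}).$$
So subtracting yields (3.1). □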
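The optional-sampling identity can be checked by brute force on a small example. The following is a minimal sketch, not part of the original text: it assumes a simple symmetric random walk over four steps and two hypothetical bounded stopping times `sigma` and `tau` with `sigma <= tau`, chosen purely for illustration, and enumerates all paths with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

N = 4  # time horizon of the illustrative random walk


def walk(steps):
    """Partial sums S_0, ..., S_N of a sequence of +/-1 steps."""
    s = [0]
    for e in steps:
        s.append(s[-1] + e)
    return s


def sigma(path):
    """First time the walk hits +1, capped at time 2 (a bounded stopping time)."""
    for n in range(N + 1):
        if path[n] == 1 or n == 2:
            return n


def tau(path):
    """First time |S_n| >= 2, capped at time N; note tau >= 2 >= sigma."""
    for n in range(N + 1):
        if abs(path[n]) >= 2 or n == N:
            return n


def expectation(f):
    """Exact expectation of f(path) under the uniform measure on all 2^N paths."""
    weight = Fraction(1, 2 ** N)
    return sum(weight * f(walk(steps)) for steps in product((-1, 1), repeat=N))


e_tau = expectation(lambda p: p[tau(p)])
e_sigma = expectation(lambda p: p[sigma(p)])
# Restricting to the event A = {sigma = 1}, which lies in F_sigma, tests (3.1).
e_tau_on_a = expectation(lambda p: p[tau(p)] * (sigma(p) == 1))
e_sigma_on_a = expectation(lambda p: p[sigma(p)] * (sigma(p) == 1))
```

Both decision rules look only at the path up to the current time, which is what makes them stopping times; the two pairs of expectations agree, as (3.1) requires.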

We can establish a further characterization of the martingale property.

Proposition 3.5.4. Let $X = (X_n)$ be an adapted sequence of random variables with $\mathbb{E}(|X_n|) < \infty$ for all $n$ and $\mathbb{E}(X_\tau) = 0$ for all bounded stopping times $\tau$. Then $X$ is a martingale.

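But $X \le Z$, so $X_\sigma \le Z_\sigma$, while by the above $X_\sigma$ and $Z_\sigma$ have the same expectation. So they must be a.s. equal: $X_\sigma = Z_\sigma$ a.s., showing (i). To see (ii), observe that for any $n \le N$
$$\mathbb{E}(Z_0) \ge \mathbb{E}(Z_{\sigma \wedge n}) \ge \mathbb{E}(Z_\sigma),$$
where the second inequality follows from Doob's OST (Theorem 3.5.2) with the bounded stopping times $(\sigma \wedge n) \le \sigma$ and the supermartingale $Z$. Using that $Z$ is a supermartingale again, we also find
$$Z_{\sigma \wedge n} \ge \mathbb{E}(Z_\sigma \mid \mathcal{F}_n). \qquad (3.6)$$
As above, this inequality between random variables with equal expectations forces a.s. equality: $Z_{\sigma \wedge n} = \mathbb{E}(Z_\sigma \mid \mathcal{F}_n)$ a.s. Apply $\mathbb{E}(\cdot \mid \mathcal{F}_{n-1})$:
$$\mathbb{E}(Z_{\sigma \wedge n} \mid \mathcal{F}_{n-1}) = \mathbb{E}\big(\mathbb{E}(Z_\sigma \mid \mathcal{F}_n) \mid \mathcal{F}_{n-1}\big) = \mathbb{E}(Z_\sigma \mid \mathcal{F}_{n-1}) = Z_{\sigma \wedge (n-1)},$$
by above with $n - 1$ for $n$. This says
$$\mathbb{E}(Z^\sigma_n \mid \mathcal{F}_{n-1}) = Z^\sigma_{n-1},$$
so $Z^\sigma$ is a martingale. □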

From Proposition 3.6.1 and its definition (first time when $Z$ and $X$ are equal) it follows that $\sigma^*$ is the smallest optimal stopping time. To find the largest optimal stopping time we try to find the time when $Z$ 'ceases to be a martingale'. In order to do so we need a structural result of genuine interest and importance.

3.6 The Snell Envelope and Optimal Stopping


Theorem 3.6.4 (Doob Decomposition). Let $X = (X_n)$ be an adapted process with each $X_n \in L^1$. Then $X$ has an (essentially unique) Doob decomposition
$$X = X_0 + M + A: \qquad X_n = X_0 + M_n + A_n \quad \forall n \qquad (3.7)$$
with $M$ a martingale null at zero, $A$ a predictable process null at zero. If also $X$ is a submartingale ('increasing on average'), $A$ is increasing: $A_n \le A_{n+1}$ for all $n$, a.s.

Proof. If $X$ has a Doob decomposition (3.7),
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = \mathbb{E}[M_n - M_{n-1} \mid \mathcal{F}_{n-1}] + \mathbb{E}[A_n - A_{n-1} \mid \mathcal{F}_{n-1}].$$
The first term on the right is zero, as $M$ is a martingale. The second is $A_n - A_{n-1}$, since $A_n$ (and $A_{n-1}$) is $\mathcal{F}_{n-1}$-measurable by predictability. So
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = A_n - A_{n-1}, \qquad (3.8)$$
and summation gives
$$A_n = \sum_{k=1}^{n} \mathbb{E}[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] \quad \text{a.s.}$$
So set $A_0 = 0$ and use this formula to define $(A_n)$, clearly predictable. We then use (3.7) to define $(M_n)$, then a martingale, giving the Doob decomposition (3.7). To see uniqueness, assume two decompositions, i.e. $X_n = X_0 + M_n + A_n = X_0 + \tilde{M}_n + \tilde{A}_n$; then $M_n - \tilde{M}_n = \tilde{A}_n - A_n$. Thus the martingale $M - \tilde{M}$ is predictable and so must be constant a.s. If $X$ is a submartingale, the LHS of (3.8) is $\ge 0$, so the RHS of (3.8) is $\ge 0$, i.e. $(A_n)$ is increasing. □
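The constructive step of the proof (sum the conditional increments to get $A$, then read off $M$ from (3.7)) can be sketched on a concrete example. The illustration below is an assumption of this sketch rather than the text's: it takes $X_n = S_n^2$ for a simple symmetric random walk $S$, for which the predictable part is $A_n = n$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def x_value(prefix):
    """X_n = S_n^2 evaluated on the first n steps of a +/-1 random walk."""
    return sum(prefix) ** 2


def conditional_increment(prefix):
    """E[X_k - X_{k-1} | F_{k-1}] on the event that the first k-1 steps equal `prefix`.

    The next step is +1 or -1 with probability 1/2 each.
    """
    return sum(Fraction(1, 2) * (x_value(prefix + (e,)) - x_value(prefix))
               for e in (-1, 1))


def doob_decomposition(path):
    """Return the lists (M_n) and (A_n), n = 0..len(path), along one path."""
    a = [Fraction(0)]
    m = [Fraction(0)]
    for n in range(1, len(path) + 1):
        # A_n depends only on the first n-1 steps: this is predictability.
        a.append(a[-1] + conditional_increment(path[:n - 1]))
        # M_n is whatever remains in (3.7): M_n = X_n - X_0 - A_n.
        m.append(x_value(path[:n]) - x_value(()) - a[-1])
    return m, a


m_path, a_path = doob_decomposition((1, 1, -1))
```

Since $X$ is a submartingale here, the computed $A$ is increasing, in line with the last statement of the theorem.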

Although the Doob decomposition is a simple result in discrete time, the analogue in continuous time - the Doob-Meyer decomposition - is deep. This illustrates the contrasts that may arise between the theories of stochastic processes in discrete and continuous time. Equipped with the Doob decomposition we return to the above setting and can write
$$Z = Z_0 + L + B$$
with $L$ a martingale and $B$ predictable and decreasing. Then $M := Z_0 + L$ is a martingale and $A := (-B)$ is increasing and we have $Z = M - A$.

Definition 3.6.2. Define a random variable $\nu : \Omega \to \mathbb{N}_0$ by setting
$$\nu(\omega) = \begin{cases} N & \text{if } A_N(\omega) = 0, \\ \min\{n \ge 0 : A_{n+1} > 0\} & \text{if } A_N(\omega) > 0. \end{cases}$$


Observe that $\nu$ (bounded by $N$) is a stopping time, since
$$\{\nu = n\} = \bigcap_{k \le n} \{A_k = 0\} \cap \{A_{n+1} > 0\} \in \mathcal{F}_n$$
as $A$ is predictable.

Proposition 3.6.3. $\nu$ is optimal for $(X_t)$, and it is the largest optimal stopping time for $(X_t)$.

Proof. We use Proposition 3.6.2. Since for $k \le \nu(\omega)$, $Z_k(\omega) = M_k(\omega) - A_k(\omega) = M_k(\omega)$, $Z^\nu$ is a martingale and thus we have (ii) of Proposition 3.6.2. To see (i) we write
$$Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} Z_k + 1_{\{\nu = N\}} Z_N = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} + 1_{\{\nu = N\}} X_N.$$
Now $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) = \mathbb{E}(M_{k+1} - A_{k+1} \mid \mathcal{F}_k) = M_k - A_{k+1}$. On $\{\nu = k\}$ we have $A_k = 0$ and $A_{k+1} > 0$, so $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) < Z_k$. Hence $Z_k = \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} = X_k$ on the set $\{\nu = k\}$. So
$$Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} X_k + 1_{\{\nu = N\}} X_N = X_\nu,$$
which is (i) of Proposition 3.6.2. Now take a stopping time $\tau$ (with values in $\{0, \dots, N\}$) with $\tau \ge \nu$ and $\mathbb{P}(\tau > \nu) > 0$. From the definition of $\nu$ and the fact that $A$ is increasing, $A_\tau > 0$ with positive probability. So $\mathbb{E}(A_\tau) > 0$, and
$$\mathbb{E}(Z_\tau) = \mathbb{E}(M_\tau) - \mathbb{E}(A_\tau) = \mathbb{E}(Z_0) - \mathbb{E}(A_\tau)$$

(iii). We use the following:
(a) $\mathbb{E}[|X_n| 1_{\{|X_n| > K\}}] \le \mathbb{E}[|X_\infty| 1_{\{|X_n| > K\}}]$, using the conditional Jensen inequality.
(b) $K \, \mathbb{P}(|X_n| > K) \le \mathbb{E}(|X_n|) \le \mathbb{E}(|X_\infty|)$, by truncation and Jensen's inequality.
(c) If $X \in L^1$, then for any $\varepsilon > 0$, there exists some $\delta > 0$ such that
$$\mathbb{P}(F) < \delta \implies \mathbb{E}(|X| 1_F) < \varepsilon.$$
Now given any $\varepsilon > 0$, choose $\delta$ such that (c) holds for $X_\infty$ and choose $K_\varepsilon$ in (b) such that $\mathbb{P}(|X_n| > K_\varepsilon) < \delta$. Then using (a)
$$\mathbb{E}[|X_n| 1_{\{|X_n| > K_\varepsilon\}}] \le \mathbb{E}[|X_\infty| 1_{\{|X_n| > K_\varepsilon\}}] \le \varepsilon$$
for all $n$, so $(X_n)$ is (UI).
(iii) $\Rightarrow$ (i). From $X$ (UI) we know that $X$ is bounded in $L^1$, hence $X_n \to X_\infty$ a.s. for some $X_\infty \in L^1$. Since almost-sure convergence of a (UI) process implies $L^1$ convergence, (i) follows. □

3.8 Markov Chains

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $\{X_n, n = 0, 1, \dots\}$ be a sequence of random variables (rvs) with a discrete state space $I$. We interpret $X_n$ as the state of some dynamic system at time $n$.

Definition 3.8.1. The stochastic process $\{X_n, n = 0, 1, \dots\}$ with state space $I$ is called a discrete-time Markov chain if, for each $n = 0, 1, \dots$,
$$\mathbb{P}(X_{n+1} = i_{n+1} \mid X_0 = i_0, \dots, X_n = i_n) = \mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n)$$
for all possible values of $i_0, \dots, i_{n+1} \in I$.

In the following we consider only Markov chains with time-homogeneous transition probabilities; that is we assume
$$\mathbb{P}(X_{n+1} = j \mid X_n = i) = p_{ij}, \quad i, j \in I,$$
independently of the time parameter $n$. The probabilities $p_{ij}$ are called one-step transition probabilities and satisfy
$$p_{ij} \ge 0, \quad i, j \in I, \qquad \text{and} \qquad \sum_{j \in I} p_{ij} = 1, \quad i \in I.$$
We call $P = (p_{ij})$ the transition matrix. The $n$-step transition probabilities are defined by
$$p^{(n)}_{ij} = \mathbb{P}(X_n = j \mid X_0 = i), \quad i, j \in I,$$
for any $n = 1, 2, \dots$ (with $p^{(0)}_{ij} = \delta_{ij}$).

Examples.
1. Gambler's ruin. Consider a gambling game in which on any turn you win 1 pound with probability $p = 0.4$ or lose 1 pound with probability $1 - p = 0.6$. Suppose further that you adopt the rule that you quit playing if your fortune reaches $N$ pounds. For instance, for $N = 5$ the transition matrix is
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0.6 & 0 & 0.4 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0.4 & 0 & 0 \\ 0 & 0 & 0.6 & 0 & 0.4 & 0 \\ 0 & 0 & 0 & 0.6 & 0 & 0.4 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
2. Bond ratings. Ratings of countries/firms can be thought of as following a Markov chain. Rating agencies typically assign classes like AAA, AA, A, BBB, BB, B, C, D. See Chapter 9 for details.
3. Branching process. Consider a population in which each individual in the $n$th generation gives birth, producing $k$ children with probability $p_k$. The number of individuals in generation $n$, $X_n$, can be any nonnegative integer. If we let $Y_1, Y_2, \dots$ be independent random variables with
$$\mathbb{P}(Y_m = k) = p_k,$$
then we can write the transition probabilities as
$$p(i, j) = \mathbb{P}(Y_1 + \dots + Y_i = j).$$
One can easily see that the $m$-step transition probability $\mathbb{P}(X_{n+m} = j \mid X_n = i)$ is the $(i, j)$ entry of the $m$th power of the transition matrix $P$. We also have the following important relation.


Proposition 3.8.1 (Chapman-Kolmogorov Equations). For all $n, m = 0, 1, \dots$,
$$p^{(n+m)}_{ij} = \sum_{k \in I} p^{(n)}_{ik} p^{(m)}_{kj}, \quad i, j \in I.$$
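In matrix form the Chapman-Kolmogorov equations say $P^{n+m} = P^n P^m$. The following sketch builds the gambler's-ruin matrix of Example 1 and checks this numerically; the helper names (`gamblers_ruin_matrix`, `matmul`, `matpow`) are hypothetical and plain Python lists are used to keep the example self-contained.

```python
def gamblers_ruin_matrix(n_target, p):
    """Transition matrix on states 0..n_target with absorbing states 0 and n_target."""
    size = n_target + 1
    matrix = [[0.0] * size for _ in range(size)]
    matrix[0][0] = 1.0
    matrix[n_target][n_target] = 1.0
    for i in range(1, n_target):
        matrix[i][i + 1] = p        # win one pound
        matrix[i][i - 1] = 1.0 - p  # lose one pound
    return matrix


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(matrix, n):
    """n-step transition matrix P^n (P^0 is the identity, i.e. delta_ij)."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, matrix)
    return result


P = gamblers_ruin_matrix(5, 0.4)
lhs = matpow(P, 5)                        # p_ij^(2+3)
rhs = matmul(matpow(P, 2), matpow(P, 3))  # sum_k p_ik^(2) p_kj^(3)
```

Comparing `lhs` and `rhs` entry by entry (up to floating-point rounding) confirms the identity for $n = 2$, $m = 3$.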

Let $T_y = \min\{n \ge 1 : X_n = y\}$ be the time of first return to $y$, and let
$$g_y = \mathbb{P}(T_y < \infty)$$
be the probability that $X_n$ returns to $y$ when it starts at $y$. Then $T_y$ is a stopping time. The key property of Markov processes is

Theorem 3.8.1 (Strong Markov Property). Suppose $T$ is a stopping time of the Markov chain $\{X_n\}_{n \ge 0}$ with transition matrix $P$. Then
(i) The process after $T$ is a Markov chain with transition matrix $P$.
(ii) The process after $T$ and the process before $T$ are independent.

More details and further discussion can be found in e.g. Norris (1997).

Exercises

3.1

Show that in general,

Also, show that if (Xn) is an L2 martingale difference sequence ( that is, Xn = Zn - Zn- l with (Zn) an L2 martingale ) ,

In particular, this holds if the $X_n$ are independent.

3.2 1. Let $X, Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$. Show that the mean-square error $\mathbb{E}[(Y - (aX + b))^2]$ is minimized for $a^* = \mathrm{Cov}(X, Y)/\mathrm{Var}(X)$ and $b^* = \mathbb{E}(Y) - a^* \mathbb{E}(X)$.
2. Now let $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ and $\mathcal{G}$ a $\sigma$-algebra with $\mathcal{G} \subseteq \mathcal{F}$. Show that
$$\min_{a, b;\; X \in L^2(\Omega, \mathcal{G}, \mathbb{P})} \mathbb{E}[(Y - (aX + b))^2] = \mathbb{E}[(Y - \mathbb{E}(Y \mid \mathcal{G}))^2].$$

3.3 A number $d$ of balls are distributed between two urns, I and II. At each time $n = 0, 1, 2, \dots$ a ball is chosen - each with equal probability $1/d$ - and transferred to the other urn.
1. Show that the number of balls in urn I forms a Markov chain (see e.g. Feller (1968) or Norris (1997) for background), with transition probabilities
$$p_{i,i+1} = (d - i)/d, \quad p_{i,i-1} = i/d, \quad p_{ij} = 0 \text{ otherwise} \quad (i = 0, 1, \dots, d).$$
2. Show that the stationary distribution is $(\pi_i)$, where
$$\pi_i = \binom{d}{i} 2^{-d}$$
- that is, if the process is started in this distribution, it stays in it. (This is the Ehrenfest urn, treated in detail in Cox and Miller (1972), 129-132, Feller (1968), 377-378, etc. It exhibits a strong 'central push' towards the central states, and is a discrete-time analogue of the Ornstein-Uhlenbeck velocity process of §5.7.)

3.4 Consider a gambler who bets a unit stake on a succession of independent plays, each of which he wins with probability $p$, loses with probability $q := 1 - p$, with the strategy of quitting when first ahead. Write $S_n$ for his net gain after $n$ plays,
$$f_n := \mathbb{P}(S_1 \le 0, \dots, S_{n-1} \le 0, S_n = 1)$$
for the probability that he quits at time $n$,
$$F(s) := \sum_{n=1}^{\infty} f_n s^n$$
for the generating function of the sequence $(f_n)$. Show that:
1. $F(s) = (1 - \sqrt{1 - 4pqs^2})/(2qs)$;
2. $f_{2n-1} = \frac{(-1)^{n-1}}{2q} \binom{1/2}{n} (4pq)^n$, $f_{2n} = 0$;
3. He eventually wins with probability 1 if $p \ge q$, $p/q$ if $p < q$;
4. For $p \ge q$ (when he is certain to win eventually), the expected duration of play is $1/(p - q)$ if $p > q$, $+\infty$ if $p = q = \frac{1}{2}$.
Thus if the game is fair, the expected waiting time to quitting when first ahead is infinite. (For a detailed account, see e.g. Feller (1968), XI.3, Grimmett and Stirzaker (2001), §5.3.)

3.5 In the fair game case $p = q = \frac{1}{2}$ of the above:
1. For each real $\theta$, show that $M_n := (\cosh \theta)^{-n} e^{\theta S_n}$ is a martingale;
2. If $T := \inf\{n : S_n = 1\}$ is the duration of play, $\mathbb{P}(T < \infty) = 1$;
3. $\mathbb{E}(s^T) = \sum_n s^n \mathbb{P}(T = n) = (1 - \sqrt{1 - s^2})/s$, $\mathbb{P}(T = 2n - 1) = (-1)^{n-1} \binom{1/2}{n}$;
4. $\mathbb{E}(T) = \infty$. (Differentiate $\mathbb{E}(s^T)$ in 3 and put $s = 1$.)
(This illustrates the power of martingale methods in such problems; for a detailed treatment, see Williams (1991), §10.12.)

4. Mathematical Finance in Discrete Time

4.1 The Model

We will study so-called finite markets - i.e. discrete-time models of financial markets in which all relevant quantities take a finite number of values. Following the approach of Harrison and Pliska (1981) and Taqqu and Willinger (1987), it suffices, to illustrate the ideas, to work with a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega|$ of points $\omega$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$. We specify a time horizon $T$, which is the terminal date for all economic activities considered. (For a simple option-pricing model the time horizon typically corresponds to the expiry date of the option.) As before, we use a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t=0}^{T}$ consisting of $\sigma$-algebras $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \dots \subset \mathcal{F}_T$: we take $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial $\sigma$-field, and $\mathcal{F}_T = \mathcal{F} = \mathcal{P}(\Omega)$ (here $\mathcal{P}(\Omega)$ is the power-set of $\Omega$, the class of all $2^{|\Omega|}$ subsets of $\Omega$: we need every possible subset, as they all - apart from the empty set - carry positive probability).

The financial market contains $d + 1$ financial assets. The usual interpretation is to assume one risk-free asset (bond, bank account) labeled 0, and $d$ risky assets (stocks, say) labeled 1 to $d$. While the reader may keep this interpretation as a mental picture, we prefer not to use it directly. The prices of the assets at time $t$ are random variables, $S_0(t, \omega), S_1(t, \omega), \dots, S_d(t, \omega)$ say, non-negative and $\mathcal{F}_t$-measurable (i.e. adapted: at time $t$, we know the prices $S_i(t)$). We write $S(t) = (S_0(t), S_1(t), \dots, S_d(t))'$ for the vector of prices at time $t$. Hereafter we refer to the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the set of trading dates, the price process $S$ and the information structure $\mathbb{F}$, which is typically generated by the price process $S$, together as a securities market model. It will be essential to assume that the price process of at least one asset follows a strictly positive process.

Definition 4.1.1. A numeraire is a price process $(X(t))_{t=0}^{T}$ (a sequence of random variables), which is strictly positive for all $t \in \{0, 1, \dots, T\}$.

For the standard approach the risk-free bank account process is used as numeraire. In some applications, however, it is more convenient to use a security other than the bank account and we therefore just use $S_0$ without further specification as a numeraire. We furthermore take $S_0(0) = 1$ (that is, we reckon in units of the initial value of our numeraire), and define $\beta(t) := 1/S_0(t)$ as a discount factor.
0, and then discretizing time, into $\tau_0, \tau_1, \dots$, where
$$\tau_0 := 0, \qquad \tau_{n+1} := \inf\{t > \tau_n : |X(t) - X(\tau_n)| > \delta X\}, \quad n \ge 0,$$
and deal with the resulting random walk $(\xi_n)$. This approximation scheme is accurate, reasonably fast, and very flexible: it is capable of handling a wide variety of problems, with moving as well as fixed barriers. For the theory, and detailed comparison with other available methods, see Rogers and Stapleton (1998); another approach is due to AitSahlia and Lai (1998b). Techniques useful here include continuity corrections for approximations to normality, Edgeworth expansions, and Richardson extrapolation.

4.8.2 Lookback Options

Lookback - or hindsight - options, which we discuss in more detail in §6.3.4 in continuous time, are options that convey the right to 'buy at the low, sell at the high' - in other words, to eliminate the regret that an investor operating in real time on current, partial knowledge would feel looking back in time with complete knowledge. Again, most of the theory is for continuous time (see e.g. Zhang (1997), Chapter 12), but a discrete-time framework may be preferred - or needed, if the only prices available are those sampled at certain discrete time-points. Care is obviously needed here, as discretization of time will miss the extremes of the peaks and troughs giving the highs and lows in continuous time. Discrete lookback options have been studied from several viewpoints; see e.g. Heynen and Kat (1995), Kat (1995) and Levy and Mantion (1997). An interesting approach using duality theory for random walks has been given by AitSahlia and Lai (1998a).

4.8.3 A Three-period Example

Assume we have two basic securities: a risk-free bond and a risky stock. The one-year risk-free interest rate (continuously compounded) is $r = 0.06$ and the volatility of the stock is 20%. We price calls and puts in a three-period Cox-Ross-Rubinstein model. The up and down movements of the stock price are given by
$$1 + u = e^{\sigma\sqrt{\Delta}} = 1.1224 \quad \text{and} \quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}} = 0.8910,$$
with $\sigma = 0.2$ and $\Delta = 1/3$. We obtain risk-neutral probabilities by (4.10); with the one-period growth factor $e^{r\Delta}$ and the movement factors $1 + u$, $1 + d$ this reads
$$p^* = \frac{e^{r\Delta} - (1 + d)}{(1 + u) - (1 + d)} = 0.5584.$$
We assume that the price of the stock at time $t = 0$ is $S(0) = 100$. To price a European call option with maturity one year ($N = 3$) and strike $K = 100$ we can either use the valuation formula (4.13) or work our way backwards through the tree. Prices of the stock and the call are given in Figure 4.2 below. One can implement the simple evaluation formulae for the CRR- and the BS-models and compare the values. Figure 4.3 is for $S = 100$, $K = 90$, $r = 0.06$, $\sigma = 0.2$, $T = 1$.

To price a European put, with price process denoted by $p(t)$, and an American put, $P(t)$, (maturity $N = 3$, strike 100), we can for the European put either use the put-call parity (1.1), the risk-neutral pricing formula, or work backwards through the tree. For the prices of the American put we use the technique outlined in §4.8.1. Prices of the two puts are given in Figure 4.4. We indicate the early exercise times of the American put in bold type. Recall that the discrete-time rule is to exercise if the intrinsic value $K - S(t)$ is larger than the value of the corresponding European put.
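The backward pass through the tree can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reproduction of the figures: the function names are hypothetical, the parameters are those of the example ($S(0) = 100$, $K = 100$, $r = 0.06$, $\sigma = 0.2$, $T = 1$, $N = 3$), and the American value is computed by the standard backward induction that takes, at each node, the larger of the intrinsic value and the discounted risk-neutral expectation of the next-period values.

```python
import math


def crr_parameters(r, sigma, maturity, n_steps):
    """Return (up factor 1+u, down factor 1+d, risk-neutral probability p*, one-step discount)."""
    dt = maturity / n_steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p_star = (math.exp(r * dt) - down) / (up - down)
    return up, down, p_star, math.exp(-r * dt)


def crr_price(s0, strike, r, sigma, maturity, n_steps, payoff, american=False):
    """Price a claim with payoff(stock, strike) by backward induction on the CRR tree."""
    up, down, p_star, discount = crr_parameters(r, sigma, maturity, n_steps)
    # Terminal values, indexed by the number j of up-moves.
    values = [payoff(s0 * up ** j * down ** (n_steps - j), strike)
              for j in range(n_steps + 1)]
    for n in range(n_steps - 1, -1, -1):
        # Discounted risk-neutral expectation of next-period values.
        values = [discount * (p_star * values[j + 1] + (1.0 - p_star) * values[j])
                  for j in range(n + 1)]
        if american:
            # Early exercise: compare with the intrinsic value at each node.
            values = [max(v, payoff(s0 * up ** j * down ** (n - j), strike))
                      for j, v in enumerate(values)]
    return values[0]


def call_payoff(stock, strike):
    return max(stock - strike, 0.0)


def put_payoff(stock, strike):
    return max(strike - stock, 0.0)


args = (100.0, 100.0, 0.06, 0.2, 1.0, 3)
european_call = crr_price(*args, call_payoff)
european_put = crr_price(*args, put_payoff)
american_put = crr_price(*args, put_payoff, american=True)
```

A useful sanity check is put-call parity for the European prices, $c - p = S(0) - K e^{-rT}$, which holds exactly in the tree; the American put can never be worth less than its European counterpart.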


t) (the 'future') and
$$\mathbb{P}(A \mid X(t), B) = \mathbb{P}(A \mid X(t)).$$

That is, if you know where you are (at time $t$), how you got there doesn't matter so far as predicting the future is concerned - equivalently, past and future are conditionally independent given the present. The same definition applies to Markov processes in discrete time. $X$ is said to be strong Markov if the above holds with the fixed time $t$ replaced by a stopping time $\tau$ (a random variable). This is a real restriction of the Markov property in the continuous-time case (though not in discrete time). Perhaps the simplest example of a Markov process that is not strong Markov is given by
$$X(t) := 0 \quad (t \le \tau), \qquad X(t) := t - \tau \quad (t \ge \tau),$$
where $\tau$ is an exponentially distributed random variable. Then $X$ is Markov (from the lack of memory property of the exponential distribution), but not strong Markov (the Markov property fails at the stopping time $\tau$). One must expect the strong Markov property to fail in cases, as here, when 'all the action is at random times'. Another example of a Markov but not strong Markov process is a left-continuous Poisson process - obtained by taking a Poisson process (see below) and modifying its paths to be left-continuous rather than right-continuous. For background and further properties, see e.g. Cox and Miller (1972), Chapter 5, Ethier and Kurtz (1986), Chapter 4, and Karatzas and Shreve (1991), §2.6.

5.2.4 Diffusions

A diffusion is a path-continuous strong Markov process such that for each time $t$ and state $x$ the following limits exist:
$$\mu(t, x) := \lim_{h \downarrow 0} \frac{1}{h} \mathbb{E}[(X(t + h) - X(t)) \mid X(t) = x],$$
$$\sigma^2(t, x) := \lim_{h \downarrow 0} \frac{1}{h} \mathbb{E}[(X(t + h) - X(t))^2 \mid X(t) = x].$$
Then $\mu(t, x)$ is called the drift, $\sigma^2(t, x)$ the diffusion coefficient. The term 'diffusion' derives from physical situations involving Brownian motion (§5.3 below). The mathematics of heat diffusing through a conducting medium (which goes back to Fourier in the early 19th century) is intimately linked with Brownian motion (the mathematics of which is 20th century). The theory of diffusions can be split according to dimension. For one-dimensional diffusions, there are a number of ways of treating the theory; see for instance the classic treatments of Breiman (1992) and Doob (1953). For higher-dimensional diffusions, there is basically one way: via the stochastic differential equation methodology (or its reformulation in terms of a martingale problem). This shows the best way to treat the one-dimensional case: the best method is the one that generalises. It also shows that Markov processes and martingales, as well as being the two general classes of stochastic process with which one can get anywhere mathematically, are also intimately linked technically. We will encounter diffusions largely as solutions of stochastic differential equations in §5.6; for further background see Grimmett and Stirzaker (2001), Chapter 13, Revuz and Yor (1991), Chapter 7, and Stroock and Varadhan (1979).


5. Stochastic Processes in Continuous Time

5.3 Brownian Motion

Brownian motion originates in work of the botanist Robert Brown in 1828. It was introduced into finance by Louis Bachelier in 1900, and developed in physics by Albert Einstein in 1905; see §5.3.4 for background and reference. The fact that Brownian motion exists is quite deep, and was first proved by Norbert Wiener (1894-1964) in 1923. In honour of this, Brownian motion is also known as the Wiener process, and the probability measure generating it - the measure $\mathbb{P}^*$ on $C[0, 1]$ (one can extend to $C[0, \infty)$) defined by
$$\mathbb{P}^*(A) = \mathbb{P}(W_\cdot \in A) = \mathbb{P}(\{t \to W_t(\omega)\} \in A)$$
for all Borel sets $A \subseteq C[0, 1]$ - is called Wiener measure.

5.3.1 Definition and Existence

Definition 5.3.1. A stochastic process $X = (X(t))_{t \ge 0}$ is a standard (one-dimensional) Brownian motion, BM or BM($\mathbb{R}$), on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, if
(i) $X(0) = 0$ a.s.,
(ii) $X$ has independent increments: $X(t + u) - X(t)$ is independent of $\sigma(X(s) : s \le t)$ for $u \ge 0$,
(iii) $X$ has stationary increments: the law of $X(t + u) - X(t)$ depends only on $u$,
(iv) $X$ has Gaussian increments: $X(t + u) - X(t)$ is normally distributed with mean 0 and variance $u$, $X(t + u) - X(t) \sim N(0, u)$,
(v) $X$ has continuous paths: $X(t)$ is a continuous function of $t$, i.e. $t \to X(t, \omega)$ is continuous in $t$ for all $\omega \in \Omega$.

The path continuity in (v) can be relaxed by assuming it only a.s.; we can then get continuity by excluding a suitable null-set from our probability space. We shall henceforth denote standard Brownian motion BM($\mathbb{R}$) by $W = (W(t))$ ($W$ for Wiener), though $B = (B(t))$ ($B$ for Brown) is also common. Standard Brownian motion BM($\mathbb{R}^d$) in $d$ dimensions is defined by $W(t) := (W_1(t), \dots, W_d(t))$, where $W_1, \dots, W_d$ are independent standard Brownian motions in one dimension (independent copies of BM($\mathbb{R}$)). We turn next to Wiener's theorem, on existence of Brownian motion. The proof, in which we follow Steele (2001), Ch. 3, is a streamlined version of the classical one due to Lévy in his book of 1948 and Ciesielski in 1961 (see below for references).

Theorem 5.3.1 (Wiener). Brownian motion exists.

Covariance. Before addressing existence, we first find the covariance function. For $s \le t$, $W_t = W_s + (W_t - W_s)$, so as $\mathbb{E}(W_t) = 0$,
$$\mathrm{Cov}(W_s, W_t) = \mathbb{E}(W_s W_t) = \mathbb{E}(W_s^2) + \mathbb{E}[W_s (W_t - W_s)].$$
The last term is $\mathbb{E}(W_s)\mathbb{E}(W_t - W_s)$ by independent increments, and this is zero, so
$$\mathrm{Cov}(W_s, W_t) = \mathbb{E}(W_s^2) = s \quad (s \le t): \qquad \mathrm{Cov}(W_s, W_t) = \min(s, t).$$

A Gaussian process (one whose finite-dimensional distributions are Gaussian) is specified by its mean function and its covariance function, so among centered (zero-mean) Gaussian processes, the covariance function $\min(s, t)$ serves as the signature of Brownian motion.

Finite-dimensional Distributions. For $0 \le t_1 < \dots < t_n$, the joint law of $X(t_1), X(t_2), \dots, X(t_n)$ can be obtained from that of $X(t_1), X(t_2) - X(t_1), \dots, X(t_n) - X(t_{n-1})$. These are jointly Gaussian, hence so are $X(t_1), \dots, X(t_n)$: the finite-dimensional distributions are multivariate normal. Recall (§5.2.2) that the multivariate normal law in $n$ dimensions, $N_n(\mu, \Sigma)$, is specified by the mean vector $\mu$ and the covariance matrix $\Sigma$ (non-negative definite). So to check the finite-dimensional distributions of BM - stationary independent increments with $W_t \sim N(0, t)$ - it suffices to show that they are multivariate normal with mean zero and covariance $\mathrm{Cov}(W_s, W_t) = \min(s, t)$ as above.

Construction of BM. It suffices to construct BM for $t \in [0, 1]$. This gives $t \in [0, n]$ by dilation, and $t \in [0, \infty)$ by letting $n \to \infty$. First, take $L^2[0, 1]$, and any complete orthonormal system (cons) $(\phi_n)$ on it. Now $L^2$ is a Hilbert space, under the inner product
$$\langle f, g \rangle = \int_0^1 f(x) g(x) \, dx \quad \Big(\text{or } \int fg\Big),$$
so norm $\|f\| := (\int f^2)^{1/2}$. By Parseval's identity,
$$\int_0^1 fg = \sum_{n=0}^{\infty} \langle f, \phi_n \rangle \langle g, \phi_n \rangle$$
(where convergence of the series on the right is in $L^2$, or in mean square: $\|f - \sum_{k=0}^{n} \langle f, \phi_k \rangle \phi_k\| \to 0$ as $n \to \infty$). Now take, for $s, t \in [0, 1]$,
$$f(x) = 1_{[0, s]}(x), \qquad g(x) = 1_{[0, t]}(x).$$

Parseval's identity becomes
$$\min(s, t) = \sum_{n=0}^{\infty} \int_0^s \phi_n(x) \, dx \int_0^t \phi_n(x) \, dx.$$
Now take $(Z_n)$ independent and identically distributed $N(0, 1)$, and write
$$W_t = \sum_{n=0}^{\infty} Z_n \int_0^t \phi_n(x) \, dx.$$
This is a sum of independent random variables. Kolmogorov's theorem on random series ('three-series theorem', Shiryaev (1996), IV, §2, Theorem 3) says that it converges a.s. if the sum of the variances converges. This is $\sum_{n=0}^{\infty} (\int_0^t \phi_n(x) \, dx)^2 = t$, by above. So the series above converges a.s., and by excluding the exceptional null set from our probability space (as we may), everywhere.

The Haar System. Define
$$H(t) = \begin{cases} 1 & \text{on } [0, \tfrac{1}{2}), \\ -1 & \text{on } [\tfrac{1}{2}, 1], \\ 0 & \text{else}. \end{cases}$$
Write $H_0(t) \equiv 1$, and for $n \ge 1$, express $n$ in dyadic form as $n = 2^j + k$ for a unique $j = 0, 1, \dots$ and $k = 0, 1, \dots, 2^j - 1$. Using this notation for $n, j, k$ throughout, write
$$H_n(t) := 2^{j/2} H(2^j t - k)$$
(so $H_n$ has support $[k/2^j, (k + 1)/2^j]$). So if $m, n$ ($m \ne n$) have the same $j$, $H_m H_n \equiv 0$, while if $m, n$ have different $j$s (say $j_1$ and $j_2$), one can check that $H_m H_n$ is $2^{(j_1 + j_2)/2}$ on half its support, $-2^{(j_1 + j_2)/2}$ on the other half, so $\int H_m H_n = 0$. Also $H_n^2$ is $2^j$ on $[k/2^j, (k + 1)/2^j]$, so $\int H_n^2 = 1$. Combining:
$$\int H_m H_n = \delta_{mn},$$
and $(H_n)$ form an orthonormal system, called the Haar system. For completeness: the indicator of any dyadic interval $[k/2^j, (k + 1)/2^j]$ is in the linear span of the $H_n$s (difference two consecutive $H_n$s and scale). Linear combinations of such indicators are dense in $L^2[0, 1]$. Combining: the Haar system $(H_n)$ is a complete orthonormal system in $L^2[0, 1]$.
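The orthonormality relations are easy to confirm numerically. The sketch below is an illustration only (the function names are hypothetical): it evaluates the inner products by a midpoint rule on a dyadic grid. Because every $H_n$ with $n < 64$ is constant on each cell of width $2^{-6}$, the rule is exact for those indices up to floating-point rounding.

```python
def haar_mother(t):
    """Mother function H: +1 on [0, 1/2), -1 on [1/2, 1], 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t <= 1.0:
        return -1.0
    return 0.0


def haar(n, t):
    """H_0 = 1 and H_n(t) = 2^(j/2) H(2^j t - k) for n = 2^j + k, 0 <= k < 2^j."""
    if n == 0:
        return 1.0
    j = n.bit_length() - 1   # largest j with 2^j <= n
    k = n - (1 << j)
    return 2.0 ** (j / 2) * haar_mother(2 ** j * t - k)


def inner_product(m, n, cells=64):
    """Midpoint-rule value of the integral of H_m * H_n over [0, 1]."""
    width = 1.0 / cells
    return width * sum(haar(m, (i + 0.5) * width) * haar(n, (i + 0.5) * width)
                       for i in range(cells))


gram = [[inner_product(m, n) for n in range(8)] for m in range(8)]
```

The matrix `gram` is the $8 \times 8$ Gram matrix of $H_0, \dots, H_7$ and agrees with the identity matrix to rounding error.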

The Schauder System. We obtain the Schauder system by integrating the Haar system. Consider the triangular function (or 'tent function')
$$\Delta(t) = \begin{cases} 2t & \text{on } [0, \tfrac{1}{2}), \\ 2(1 - t) & \text{on } [\tfrac{1}{2}, 1], \\ 0 & \text{else}. \end{cases}$$
Write $\Delta_0(t) := t$, $\Delta_1(t) := \Delta(t)$, and define the $n$th Schauder function $\Delta_n$ by
$$\Delta_n(t) := \Delta(2^j t - k) \quad (n = 2^j + k \ge 1).$$
Note that $\Delta_n$ has support $[k/2^j, (k + 1)/2^j]$ (so is 'localized' on this dyadic interval, which is small for $n, j$ large). We see that
$$\int_0^t H(u) \, du = \tfrac{1}{2} \Delta(t),$$
and similarly
$$\int_0^t H_n(u) \, du = \lambda_n \Delta_n(t),$$
where $\lambda_0 = 1$ and for $n \ge 1$,
$$\lambda_n = \tfrac{1}{2} \, 2^{-j/2} \quad (n = 2^j + k \ge 1).$$
The Schauder system $(\Delta_n)$ is again a cons on $L^2[0, 1]$.

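Theorem 5.3.2. For $(Z_n)_0^\infty$ independent $N(0, 1)$ random variables, $\lambda_n$, $\Delta_n$ as above,
$$W_t := \sum_{n=0}^{\infty} \lambda_n Z_n \Delta_n(t)$$
converges uniformly on $[0, 1]$, a.s. The process $W = (W_t : t \in [0, 1])$ is Brownian motion.

Lemma 5.3.1. For $Z_n$ independent $N(0, 1)$,
$$|Z_n| \le C \sqrt{\log n} \quad \forall n \ge 2,$$
for some random variable $C < \infty$ a.s.

Proof. For $x > 1$,
$$\mathbb{P}(|Z_n| > x) = \frac{2}{\sqrt{2\pi}} \int_x^\infty e^{-y^2/2} \, dy \le \sqrt{2/\pi} \int_x^\infty y e^{-y^2/2} \, dy = \sqrt{2/\pi} \exp\{-\tfrac{1}{2} x^2\}.$$
So for any $a > 1$,
$$\mathbb{P}(|Z_n| > \sqrt{2a \log n}) \le \sqrt{2/\pi} \exp\{-a \log n\} = \sqrt{2/\pi} \, n^{-a}.$$
Since $\sum n^{-a} < \infty$ for $a > 1$, the Borel-Cantelli lemma gives
$$\mathbb{P}(|Z_n| > \sqrt{2a \log n} \text{ for infinitely many } n) = 0.$$
So
$$C := \sup_{n \ge 2} \frac{|Z_n|}{\sqrt{\log n}} < \infty \quad \text{a.s.}$$
□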
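The Gaussian tail estimate used in the proof can be checked against the exact tail probability, which is available through the complementary error function: $\mathbb{P}(|Z| > x) = \operatorname{erfc}(x/\sqrt{2})$ for $Z \sim N(0, 1)$. The sketch below is illustrative only; the function names are hypothetical.

```python
import math


def two_sided_tail(x):
    """Exact P(|Z| > x) for a standard normal Z, via the complementary error function."""
    return math.erfc(x / math.sqrt(2.0))


def tail_bound(x):
    """The bound sqrt(2/pi) * exp(-x^2 / 2) from the proof of Lemma 5.3.1."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def borel_cantelli_term(n, a):
    """The bound evaluated at x = sqrt(2 a log n); it equals sqrt(2/pi) * n^(-a)."""
    return tail_bound(math.sqrt(2.0 * a * math.log(n)))


grid = [1.0 + 0.1 * i for i in range(60)]
bound_holds = all(two_sided_tail(x) <= tail_bound(x) for x in grid)
```

The terms `borel_cantelli_term(n, a)` are summable in $n$ for any $a > 1$, which is the hypothesis the Borel-Cantelli lemma needs.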

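Proof of Theorem 5.3.2.

1. Convergence. Choose $J$ and $M \ge 2^J$; then
$$\sum_{n=M}^{\infty} \lambda_n |Z_n| \Delta_n(t) \le C \sum_{n=M}^{\infty} \lambda_n \sqrt{\log n} \, \Delta_n(t).$$
The right is majorized by
$$C \sum_{j=J}^{\infty} \sum_{k=0}^{2^j - 1} \tfrac{1}{2} \, 2^{-j/2} \sqrt{j + 1} \, \Delta_{2^j + k}(t)$$
(perhaps including some extra terms at the beginning, using $n = 2^j + k < 2^{j+1}$, $\log n \le (j + 1) \log 2$, and $\Delta_n(\cdot) \ge 0$, so the series is absolutely convergent). In the inner sum, only one term is non-zero ($t$ can belong to only one dyadic interval $[k/2^j, (k + 1)/2^j)$), and each $\Delta_n(t) \in [0, 1]$. So
$$\text{LHS} \le C \sum_{j=J}^{\infty} \tfrac{1}{2} \, 2^{-j/2} \sqrt{j + 1} \quad \forall t \in [0, 1],$$
and this tends to 0 as $J \to \infty$, so as $M \to \infty$. So the series $\sum \lambda_n Z_n \Delta_n(t)$ is absolutely and uniformly convergent, a.s. Since continuity is preserved under uniform convergence and each $\Delta_n(t)$ (so each partial sum) is continuous, $W_t$ is continuous in $t$.

2. Covariance. By absolute convergence and Fubini's theorem,
$$\mathbb{E}(W_t) = \mathbb{E}\Big(\sum_{n=0}^{\infty} \lambda_n Z_n \Delta_n(t)\Big) = \sum_{n=0}^{\infty} \lambda_n \Delta_n(t) \mathbb{E}(Z_n) = 0.$$
So the covariance is
$$\mathbb{E}(W_s W_t) = \mathbb{E}\Big[\sum_m Z_m \int_0^s \phi_m \cdot \sum_n Z_n \int_0^t \phi_n\Big] = \sum_{m, n} \mathbb{E}[Z_m Z_n] \int_0^s \phi_m \int_0^t \phi_n = \sum_n \int_0^s \phi_n \int_0^t \phi_n = \min(s, t),$$
by the Parseval calculation above.

3. Joint Distributions. Take $t_1, \dots, t_m \in [0, 1]$; we have to show that $(W(t_1), \dots, W(t_m))$ is multivariate normal, with mean vector 0 and covariance matrix $(\min(t_i, t_j))$. The multivariate characteristic function is
$$\mathbb{E} \exp\Big\{i \sum_{j=1}^{m} u_j W(t_j)\Big\} = \mathbb{E} \exp\Big\{i \sum_{j=1}^{m} u_j \sum_{n=0}^{\infty} \lambda_n Z_n \Delta_n(t_j)\Big\} = \mathbb{E} \prod_{n=0}^{\infty} \exp\Big\{i \lambda_n Z_n \sum_{j=1}^{m} u_j \Delta_n(t_j)\Big\},$$
which by independence of the $Z_n$ is
$$\prod_{n=0}^{\infty} \mathbb{E} \exp\Big\{i \lambda_n Z_n \sum_{j=1}^{m} u_j \Delta_n(t_j)\Big\}.$$
Since each $Z_n$ is $N(0, 1)$, the right-hand side is
$$\prod_{n=0}^{\infty} \exp\Big\{-\tfrac{1}{2} \lambda_n^2 \Big(\sum_{j=1}^{m} u_j \Delta_n(t_j)\Big)^2\Big\} = \exp\Big\{-\tfrac{1}{2} \sum_{n=0}^{\infty} \lambda_n^2 \Big(\sum_{j=1}^{m} u_j \Delta_n(t_j)\Big)^2\Big\}.$$
The sum in the exponent on the right is
$$\sum_{n=0}^{\infty} \lambda_n^2 \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \Delta_n(t_j) \Delta_n(t_k) = \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \sum_{n=0}^{\infty} \int_0^{t_j} H_n(u) \, du \int_0^{t_k} H_n(u) \, du,$$
giving
$$\sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \min(t_j, t_k),$$
by the Parseval calculation, as $(H_n)$ are a cons. Combining,
$$\mathbb{E} \exp\Big\{i \sum_{j=1}^{m} u_j W(t_j)\Big\} = \exp\Big\{-\tfrac{1}{2} \sum_{j=1}^{m} \sum_{k=1}^{m} u_j u_k \min(t_j, t_k)\Big\}.$$
This says that $(W(t_1), \dots, W(t_m))$ is multinormal with mean 0 and covariance function $\min(t_j, t_k)$ as required. This completes the construction of BM. □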
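The series construction lends itself to a short numerical sketch. The code below is an illustration under stated assumptions, not the book's implementation: the function names are hypothetical, the series is truncated after the dyadic levels $j < $ `levels`, and Python's `random.Random.gauss` supplies the $N(0, 1)$ variables. At grid points that are multiples of $2^{-\text{levels}}$ all omitted Schauder functions vanish, so the truncated covariance kernel $\sum_n \lambda_n^2 \Delta_n(s) \Delta_n(t)$ already equals $\min(s, t)$ there.

```python
import random


def tent(t):
    """Tent function: 2t on [0, 1/2), 2(1 - t) on [1/2, 1], 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 2.0 * t
    if 0.5 <= t <= 1.0:
        return 2.0 * (1.0 - t)
    return 0.0


def schauder(n, t):
    """Delta_0(t) = t and Delta_n(t) = tent(2^j t - k) for n = 2^j + k."""
    if n == 0:
        return t
    j = n.bit_length() - 1
    k = n - (1 << j)
    return tent(2 ** j * t - k)


def lam(n):
    """lambda_0 = 1 and lambda_n = (1/2) 2^(-j/2) for n = 2^j + k >= 1."""
    if n == 0:
        return 1.0
    j = n.bit_length() - 1
    return 0.5 * 2.0 ** (-j / 2)


def covariance_kernel(s, t, levels):
    """Covariance of the truncated series, using the terms n < 2^levels."""
    return sum(lam(n) ** 2 * schauder(n, s) * schauder(n, t)
               for n in range(2 ** levels))


def sample_path(levels, grid, rng):
    """One realisation of the truncated series evaluated at the points of `grid`."""
    z = [rng.gauss(0.0, 1.0) for _ in range(2 ** levels)]
    return [sum(lam(n) * z[n] * schauder(n, t) for n in range(2 ** levels))
            for t in grid]


grid = [i / 8 for i in range(9)]
path = sample_path(3, grid, random.Random(0))
```

Between grid points the truncated path is the linear interpolation of its grid values, which is the 'broken-line' picture of the construction.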

Wavelets. The Haar system $(H_n)$, and the Schauder system $(\Delta_n)$ obtained by integration from it, are examples of wavelet systems. The original function, $H$ or $\Delta$, is a mother wavelet, and the 'daughter wavelets' are obtained from it by dilation and translation. The expansion of the theorem is the wavelet expansion of BM with respect to the Schauder system $(\Delta_n)$. For any $f \in C[0, 1]$, we can form its wavelet expansion
$$f(t) = \sum_{n=0}^{\infty} c_n \Delta_n(t),$$
with wavelet coefficients $c_n$. Here $c_n$ are given by
$$c_n = f\Big(\frac{k + \frac{1}{2}}{2^j}\Big) - \frac{1}{2}\Big[f\Big(\frac{k}{2^j}\Big) + f\Big(\frac{k + 1}{2^j}\Big)\Big].$$
This is the form that gives the $\Delta_n(\cdot)$ term its correct triangular influence, localized on the dyadic interval $[k/2^j, (k + 1)/2^j]$. Thus for $f$ = BM, $c_n = \lambda_n Z_n$, with $\lambda_n$, $Z_n$ as above. The wavelet construction of BM above is, in modern language, the classical 'broken-line' construction of BM due to Lévy in his book of 1948 - the Lévy representation of BM using the Schauder system, and extended to general cons by Ciesielski in 1961; see McKean (1969), §1.2 for a textbook account. The earliest expansion of BM - the 'Fourier-Wiener expansion' - used the trigonometric cons (Paley and Zygmund 1930-32, Paley, Wiener and Zygmund 1932); see Kahane (1985), Preface and §16.3.

Note. We shall see that Brownian motion is a fractal, and wavelets are a useful tool for the analysis of fractals more generally. For background, see e.g. Holschneider (1995), §4.4.

For further background, see any measure-theoretic text on stochastic processes. A treatment starting directly from our main reference of measure-theoretic results (Williams (1991)) is Rogers and Williams (1994), Chapter 1. The classic is Doob (1953), VIII.2. Excellent modern texts include Karatzas and Shreve (1991) and Revuz and Yor (1991) (see particularly Karatzas and Shreve (1991), §2.2-4 for construction).

From the mathematical point of view, Brownian motion owes much of its importance to belonging to all the important classes of stochastic processes: it is (strong) Markov, a (continuous) martingale, Gaussian, a diffusion, a Lévy process etc. From an applied point of view, as its diverse origins - Brown's work in botany, Bachelier's in economics, Einstein's in statistical mechanics etc. - suggest, Brownian motion has a universal character, and is ubiquitous both in theory and in applied modeling. The universal nature of Brownian motion as a stochastic process is simply the dynamic counterpart - where we work with evolution in time - of the universal nature of its static counterpart, the normal (or Gaussian) distribution - in probability, statistics, science, economics etc. Both arise from the same source, the central limit theorem. This says that when we average large numbers of independent and comparable objects, we obtain the normal distribution (see §2.8) in a static context, or Brownian motion in a dynamic context (see §5.11 for the machinery - weak convergence - needed to handle such limiting results for stochastic processes; cf. §2.6 for its static counterpart). What the central limit theorem really says is that, when what we observe is the result of a very large number of individually very small influences, the normal distribution or Brownian motion will inevitably and automatically emerge. This explains the central role of the normal distribution in statistics - basically, this is why statistics works. It also explains the central role of Brownian motion as the basic model of random fluctuations, or random noise as one often says. As the word noise suggests, this usage comes from electrical engineering and the early days of radio (see e.g. Wax (1954)).

When we come to studying the dynamics of stochastic processes by means of stochastic differential equations (§5.8 below), we will usually find a 'driving noise' term. The most basic driving noise process is Brownian motion; its role is to represent the 'random buffeting' of the object under study by a myriad of influences which we have no hope of studying in detail - and indeed, no need to. By using the central limit theorem, we make the very complexity of the situation work on our side: Brownian motion is a comparatively simple and tractable process to work with - vastly simpler than the underlying random buffeting whose effect it approximates and represents.

The precise circumstances in which one obtains the normal or Gaussian distribution, or Brownian motion, have been much studied (this was the predominant theme in Lévy's life's work, for instance). One needs means and variances to exist (which is why the mean $\mu$ and the variance $\sigma^2$ are needed to parametrize the normal or Gaussian family). One also needs either independence, or something not too far removed from it, such as suitable martingale dependence (for martingale central limit theory, see the excellent book Hall and Heyde (1980)) or Markov dependence (see Ethier and Kurtz (1986)).

5.3.2 Quadratic Variation of Brownian Motion

Recall that a $N(\mu, \sigma^2)$ distributed random variable $\xi$ has moment-generating function

$$M(t) := \mathbb{E}(\exp\{t\xi\}) = \exp\{\mu t + \tfrac{1}{2}\sigma^2 t^2\}.$$

We take $\mu = 0$ below; we can recover the general case by adding $\mu$ back on. So, for $\xi$ $N(0, \sigma^2)$ distributed,

$$M(t) = \exp\{\tfrac{1}{2}\sigma^2 t^2\} = 1 + \tfrac{1}{2}\sigma^2 t^2 + \tfrac{1}{2!}\left(\tfrac{1}{2}\sigma^2 t^2\right)^2 + O(t^6) = 1 + \tfrac{1}{2!}\sigma^2 t^2 + \tfrac{3}{4!}\sigma^4 t^4 + O(t^6).$$

As the Taylor coefficients of the moment-generating function are the moments (hence the name moment-generating function!), $\mathbb{E}(\xi^2) = \mathrm{Var}(\xi) = \sigma^2$, $\mathbb{E}(\xi^4) = 3\sigma^4$, so $\mathrm{Var}(\xi^2) = \mathbb{E}(\xi^4) - [\mathbb{E}(\xi^2)]^2 = 2\sigma^4$. For $W$ Brownian motion on $\mathbb{R}$, this gives

$$\mathbb{E}(W(t)) = 0, \qquad \mathrm{Var}(W(t)) = \mathbb{E}(W(t)^2) = t, \qquad \mathrm{Var}(W(t)^2) = 2t^2.$$

In particular, for $t > 0$ small, this shows that the variance of $W(t)^2$ is negligible compared with its expected value. Thus, the randomness in $W(t)^2$ is negligible compared to its mean for $t$ small. This suggests that if we take a fine enough partition $P$ of $[0, t]$ - a finite set of points $0 = t_0 < t_1 < \ldots < t_n = t$ with grid mesh $\|P\| := \max |t_i - t_{i-1}|$ small enough - then writing $\Delta W(t_i) := W(t_i) - W(t_{i-1})$ and $\Delta t_i := t_i - t_{i-1}$,

$$\sum_{i=1}^n (\Delta W(t_i))^2$$

will closely resemble

$$\sum_{i=1}^n \mathbb{E}((\Delta W(t_i))^2) = \sum_{i=1}^n \Delta t_i = \sum_{i=1}^n (t_i - t_{i-1}) = t.$$

This is in fact true:

$$\sum_{i=1}^n (\Delta W(t_i))^2 \to \sum_{i=1}^n \Delta t_i = t \quad \text{in probability} \quad (\max |t_i - t_{i-1}| \to 0).$$

This limit is called the quadratic variation of $W$ over $[0, t]$.

Start with the formal definitions. A partition $\pi_n$ of $[0, t]$ is a finite set of points $t_{ni}$ such that $0 = t_{n0} < t_{n1} < \ldots < t_{n,k(n)} = t$; the mesh of the partition is $|\pi_n| := \max_i (t_{ni} - t_{n,(i-1)})$, the maximal subinterval length. We consider nested sequences $(\pi_n)$ of partitions (each refines its predecessors by adding further partition points), with $|\pi_n| \to 0$. Call (writing $t_i$ for $t_{ni}$ for ...

For $c > 0$, write

$$W_c(t) := c^{-1} W(c^2 t), \qquad t \ge 0,$$

with $W$ BM. Then $W_c$ is Gaussian, with mean 0, variance $c^{-2} \times c^2 t = t$ and covariance

$$\mathrm{Cov}(W_c(s), W_c(t)) = c^{-2} \min(c^2 s, c^2 t) = \min(s, t) = \mathrm{Cov}(W(s), W(t)).$$

Also $W_c$ has continuous paths, as $W$ does. So $W_c$ has all the properties of Brownian motion. So, $W_c$ is Brownian motion. It is said to be derived from $W$ by Brownian scaling with scale-factor $c > 0$. Since

$$(W(ut) : t \ge 0) = (\sqrt{u}\, W(t) : t \ge 0) \quad \text{in law}, \quad \forall u > 0,$$

$W$ is called self-similar with index 1/2 (Bingham, Goldie, and Teugels (1987), §8.5). Brownian motion is thus a fractal. A piece of Brownian path, looked at under a microscope, still looks Brownian, however much we 'zoom in and magnify'. Of course, the contrast with a function $f$ with some smoothness is stark: a differentiable function begins to look straight under repeated zooming and magnification, because it has a tangent.


Time-Inversion. Write

$$X_t := t W(1/t).$$

Then $X$ has mean 0 and covariance

$$\mathrm{Cov}(X_s, X_t) = st\, \mathrm{Cov}(W(1/s), W(1/t)) = st \min(1/s, 1/t) = \min(t, s) = \min(s, t).$$

Since $X$ has continuous paths also, as above, $X$ is Brownian motion. We say that $X$ is obtained from $W$ by time-inversion. This property is useful in transforming properties of BM 'in the large' ($t \to \infty$) to properties 'in the small', or local properties ($t \to 0$). For example, one can translate the law of the iterated logarithm (LIL) from global to local form.

Using time-inversion, we see that, as the zero-set of Brownian motion

$$Z := \{t \ge 0 : W_t = 0\}$$

is unbounded (contains infinitely many points increasing to infinity), it must also contain infinitely many points decreasing to zero. That is, any zero of Brownian motion (e.g., time $t = 0$, as we are choosing to start our BM at the origin) produces an 'echo' - an infinite sequence of zeros at positive times decreasing to zero. How can we hope to graph such a function? (We can't!) How on earth does it manage to escape from zero, when hitting zero at one time, $u$ say, forces zero to be hit infinitely many times in any time-interval $[u, u + t]$ ($t > 0$)? The answer to these questions involves excursion theory, one of Itô's great contributions to probability theory (1970). When BM is at zero, it is as likely to leave to the right as to the left, by symmetry - but it will leave, immediately, with probability one. These 'excursions away from zero' - above and below - happen according to a Poisson random measure governing the excursions - the excursion measure - on path-space. As there are infinitely many excursions in finite time-intervals, the excursion measure has infinite mass - it is $\sigma$-finite but not finite. For details of the form of the Brownian excursion, background, proofs etc., we refer to Rogers and Williams (2000), or Bertoin (1996), IV.

Note however that, far from being pathological as one might at first imagine, the behaviour described above is what one expects of a normal, well-behaved process: the technical term is '$\{0\}$ is regular for 0' (Bertoin (1996), IV), and 'regular' is used to describe good, not bad, behaviour.

Since Brownian motion has continuous paths, its zero-set $Z$ is closed. Since each zero is, by above, a limit-point of zeros, $Z$ is a perfect set. The zero-set is also uncountable ('big', in one sense), but Lebesgue-null - has Lebesgue measure zero ('small', in another sense). The machinery for measuring the size of small sets such as $Z$ is that of Hausdorff measures. The Hausdorff measure properties of $Z$ have been studied in great detail. The zero-set $Z$ has a fractal structure, which it inherits from that of $W$ under Brownian scaling. The natural machinery for studying the fine detail of the structure of fractals is, as above, that of Hausdorff measures.

Parameters of Brownian Motion - Estimation and Hypothesis Testing. If we form $\mu t + \sigma W_t$ - or replace $N(0, t)$ by $N(\mu t, \sigma^2 t)$ in the definition of Brownian increments - we obtain a Lévy process that has continuous paths and Gaussian increments, called Brownian motion with drift $\mu$ and diffusion coefficient $\sigma$, $BM(\mu, \sigma)$, rather than standard Brownian motion $BM = BM(0, 1)$ as above. By above, the quadratic variation of a segment of $BM(\mu, \sigma)$ path on the time-interval $[0, t]$ is $\sigma^2 t$, a.s. So, if we can observe a Brownian path completely over any time-interval however short, then in principle we can determine the diffusion coefficient $\sigma$ with probability one. In particular, we can distinguish between two different $\sigma$s - $\sigma_1$ and $\sigma_2$, say - with certainty. In technical language: the Wiener measures $\mathbb{P}_1$ and $\mathbb{P}_2$ representing these two Brownian motions with different $\sigma$s on function space are mutually singular. By contrast, if the two $\sigma$s are the same, the two measures are mutually absolutely continuous. We can then test a hypothesis $H_0 : \mu = \mu_0$ against an alternative hypothesis $H_1 : \mu = \mu_1$ by means of the appropriate likelihood ratio (LR). To find the form of the LR, we shall use Girsanov's theorem, which we discuss in §5.7.

In practice, of course, we cannot observe a Brownian path exactly over a time-interval: there would be an infinite amount of information, and our ability to sample is finite. So one must use an appropriate discretization - and then we lose the ability to pick up the diffusion coefficient with certainty.

Problems of this kind are not only of theoretical interest, but also important in practice. In mathematical finance, when the driving noise is modeled by Brownian motion, the diffusion coefficient is called the volatility, the parameter that describes how sensitive a stock-price is to price-sensitive information (or economic uncertainty, or driving noise). Volatility enters explicitly into the most famous formula of mathematical finance, the Black-Scholes formula. Volatility estimation is of major importance. So too is volatility modeling: alas, in real financial data the assumption of constant volatility is usually untenable for detailed modeling, and one resorts instead to more complicated models, say involving stochastic volatility, §7.3. For recent work here, see Barndorff-Nielsen and Shephard (2001).
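The role of discretization can be illustrated numerically. The following is a minimal sketch, not taken from the text: it simulates the increments of $BM(\mu, \sigma)$ on an equally spaced grid and uses the sum of squared increments (the realized quadratic variation) to estimate $\sigma$. The function names, the parameter values and the choice of an equally spaced grid are assumptions made purely for illustration.

```python
import math
import random


def simulate_bm_increments(mu, sigma, t, n, rng):
    """Increments of BM(mu, sigma) over n equal steps of [0, t] (illustrative helper)."""
    dt = t / n
    return [mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]


def realized_quadratic_variation(increments):
    """Sum of squared increments over the partition."""
    return sum(dx * dx for dx in increments)


rng = random.Random(0)
mu, sigma, t = 0.5, 0.2, 1.0   # hypothetical drift, diffusion coefficient, horizon
for n in (100, 10_000, 100_000):
    qv = realized_quadratic_variation(simulate_bm_increments(mu, sigma, t, n, rng))
    sigma_hat = math.sqrt(qv / t)   # estimate of sigma from sigma^2 * t
    print(f"n={n:>7}  realized QV={qv:.5f}  sigma_hat={sigma_hat:.4f}  (true sigma={sigma})")
```

With a coarse grid the estimate of $\sigma$ is visibly noisy; refining the grid brings the realized quadratic variation close to $\sigma^2 t$, while the drift $\mu$ contributes only a term of order $1/n$.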

5.3.4 Brownian Motion in Stochastic Modeling

To begin at the beginning: Brownian motion is named after Robert Brown (1773-1858), the Scottish botanist who in 1828 observed the irregular and haphazard - apparently random - motion of pollen particles suspended in water. Similar phenomena are observed in gases - witness the familiar sight of dust particles dancing in sunbeams. During the 19th century, it became suspected that the explanation was that the particles were being bombarded by the molecules in the surrounding medium - water or air. Note that this picture requires three different scales: microscopic (water or air molecules), mesoscopic (pollen or dust particles) and macroscopic (you, the observer). These ideas entered the kinetic theory of gases, and statistical mechanics, through the pioneering work of Maxwell, Gibbs and Boltzmann. However, some scientists still doubted the existence of atoms and molecules (not then observable directly). Enter the birth of the quantum age in 1900 with the quantum hypothesis of Max Planck (1858-1947).

Louis Bachelier (1870-1946) introduced Brownian motion into the field of economics and finance in his thesis Théorie de la spéculation of 1900. His work lay dormant until much later; we will pick up its influence on Itô, Samuelson, Merton and others below.

Albert Einstein (1879-1955), in his work of 1905, attacked the problem of demonstrating the existence of molecules, and for good measure estimating Avogadro's number (c. $6.02 \times 10^{23}$) experimentally. Einstein realized that what was informative was the mean square displacement of the Brownian particle - its diffusion coefficient, in our terms. This is proportional to time, and the constant $D$ of proportionality,

$$\mathrm{Var}\, W_t = Dt,$$

is informative about Avogadro's number (which, roughly, gives the scale factor in going from the microscopic to the macroscopic scale). This Einstein relation is the prototype of a class of results now known in statistical mechanics as fluctuation-dissipation theorems.

All this was done without any proper mathematical underpinning. This was provided by Wiener in 1923, as mentioned earlier.

Quantum mechanics emerged in 1925-28 with the work of Heisenberg, Schrödinger and Dirac, and with the 'Copenhagen interpretation' of Bohr, Born and others, it became clear that the quantum picture is both inescapable at the subatomic level and intrinsically probabilistic. The work of Richard P. Feynman (1918-1988) in the late 1940s on quantum electrodynamics (QED), and his approach to quantum mechanics via 'path integrals', introduced Wiener measure squarely into quantum theory. Feynman's work on quantum mechanics was made mathematically rigorous by Mark Kac (1914-1984) (QED is still problematic!); the Feynman-Kac formula (giving a stochastic representation for the solutions of certain PDEs) stems from this.

Subsequent developments involve Itô calculus, and we shall consider them in §5.6 below. Suffice it to say here that Itô's work of 1944 picked up where Bachelier left off, and created the machinery needed to use Brownian motion to model stock prices successfully (note: stock prices are nonnegative - positive, until the firm goes bankrupt - while Brownian motion changes sign, indeed has lots of sign changes, as we saw above when discussing its zero-set $Z$). The economist Paul Samuelson in 1965 advocated the Itô model - geometric Brownian motion - for financial modelling. Then in 1973 Black and Scholes gave their famous formula, and the same year Merton derived it by Itô calculus. Today Itô calculus is a fundamental tool in stochastic modeling generally, and the modelling of financial markets in particular.

In sum: wherever we look - statistical mechanics, quantum theory, economics, finance - we see a random world, in which much that we observe is driven by random noise, or random fluctuations. Brownian motion gives us an invaluable model for describing these, in a wide variety of settings. This is statistically natural. The ubiquitous nature of Brownian motion is the dynamic counterpart of the ubiquitous nature of the normal distribution. This rests ultimately on the Central Limit Theorem (CLT) - known to physicists as the Law of Errors - and is, fundamentally, why statistics works.

5.4 Point Processes

Suppose that one is studying earthquakes, or volcanic eruptions. The events of interest are sudden isolated shocks, which occur at random instants, the history of which unfolds with time. Such situations occur in financial settings also: at the macro-economic level, the events might be stock-market crashes, devaluations etc. At the micro-economic level, they might be individual transactions. In other settings, the events might be the occurrence of telephone calls, insurance claims, accidents or admissions to hospital etc.

The mathematical framework needed to handle such situations is that of point processes. A point process is a stochastic process whose realizations are, not paths as above, but counting measures: random measures $\mu$ whose value on each interval $I$ (or Borel set, more generally) is a non-negative integer $\mu(I)$. Often, each point may come labeled with some quantity (the size of the transaction, or of the earthquake on the Richter scale, for instance), giving what is called a marked point process. We turn below to the simplest and most fundamental point process, the Poisson process, and the simplest way to build it.

Stochastic processes with stationary independent increments are called Lévy processes (after the great French probabilist Paul Lévy (1886-1971)); see §5.5 below, and for a modern textbook reference, see Bertoin (1996). The two most basic prototypes of Lévy processes are Poisson processes and Brownian motion (§5.3).

We include below a number of results without proof. For proofs and background, we refer to any good book on stochastic processes, e.g. Durrett (1999).

5.4.1 Exponential Distribution

A random variable $T$ is said to have an exponential distribution with rate $\lambda$, or $T = \text{exponential}(\lambda)$, if

$$\mathbb{P}(T \le t) = 1 - e^{-\lambda t} \quad \text{for all } t \ge 0.$$

Recall $\mathbb{E}(T) = 1/\lambda$ and $\mathrm{Var}(T) = 1/\lambda^2$. Further important properties are

Proposition 5.4.1. (i) Exponentially distributed random variables possess the 'lack of memory' property:

$$\mathbb{P}(T > s + t \mid T > t) = \mathbb{P}(T > s).$$

(ii) Let $T_1, T_2, \ldots, T_n$ be independent exponentially distributed random variables with parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$ resp. Then $\min\{T_1, T_2, \ldots, T_n\}$ is exponentially distributed with rate $\lambda_1 + \lambda_2 + \ldots + \lambda_n$.

(iii) Let $T_1, T_2, \ldots, T_n$ be independent exponentially distributed random variables with parameter $\lambda$. Then $G_n = T_1 + T_2 + \ldots + T_n$ has a Gamma$(n, \lambda)$ distribution. That is, its density is

$$\mathbb{P}(G_n = t) = \lambda e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n-1)!} \quad \text{for } t \ge 0.$$

5.4.2 The Poisson Process

Definition 5.4.1. Let $t_1, t_2, \ldots, t_n, \ldots$ be independent exponential($\lambda$) random variables. Let $T_n = t_1 + \ldots + t_n$ for $n \ge 1$, $T_0 = 0$, and define

$$N(s) = \max\{n : T_n \le s\}.$$

Interpretation: Think of $t_i$ as the time between arrivals of events, then $T_n$ is the arrival time of the $n$th event and $N(s)$ the number of arrivals by time $s$.

Lemma 5.4.1. $N(s)$ has a Poisson distribution with mean $\lambda s$.
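The construction in Definition 5.4.1 translates directly into a simulation. The sketch below is illustrative only: it builds $N(s)$ from exponential($\lambda$) gaps and compares the sample mean and variance of the counts with $\lambda s$, as Lemma 5.4.1 suggests. The function name and the parameter values are assumptions, not part of the text.

```python
import random


def poisson_count(lam, s, rng):
    """N(s) = max{n : T_n <= s}, where T_n is a sum of n exponential(lam) gaps."""
    n = 0
    arrival = rng.expovariate(lam)      # T_1
    while arrival <= s:
        n += 1
        arrival += rng.expovariate(lam)  # T_{n+1} = T_n + next gap
    return n


rng = random.Random(42)
lam, s, trials = 2.0, 3.0, 20_000       # hypothetical rate, horizon, sample size
counts = [poisson_count(lam, s, rng) for _ in range(trials)]
mean = sum(counts) / trials
var = sum((k - mean) ** 2 for k in counts) / (trials - 1)
print(f"sample mean={mean:.3f}, sample variance={var:.3f}, lam*s={lam * s}")
```

For a Poisson law the mean and the variance coincide, so both sample statistics should be close to $\lambda s$.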

The Poisson process can also be characterised via

Theorem 5.4.1. If $\{N(s), s \ge 0\}$ is a Poisson process, then
(i) $N(0) = 0$,
(ii) $N(t + s) - N(s) = \text{Poisson}(\lambda t)$, and
(iii) $N(t)$ has independent increments.
Conversely, if (i), (ii) and (iii) hold, then $\{N(s), s \ge 0\}$ is a Poisson process.

The above characterization can be used to extend the definition of the Poisson process to include time-dependent intensities.

Definition 5.4.2. We say that $\{N(s), s \ge 0\}$ is a Poisson process with rate $\lambda(r)$ if
(i) $N(0) = 0$,
(ii) $N(t + s) - N(s)$ is Poisson with mean $\int_s^{t+s} \lambda(r)\,dr$, and
(iii) $N(t)$ has independent increments.

5.4.3 Compound Poisson Processes

We now associate i.i.d. random variables $Y_i$ with each arrival and consider

$$S(t) = Y_1 + \ldots + Y_{N(t)}, \qquad S(t) = 0 \text{ if } N(t) = 0.$$

Theorem 5.4.2. Let $(Y_i)$ be i.i.d. and $N$ be an independent nonnegative integer random variable, and $S$ as above.
(i) If $\mathbb{E}(N) < \infty$, then $\mathbb{E}(S) = \mathbb{E}(N)\mathbb{E}(Y_1)$.
(ii) If $\mathbb{E}(N^2) < \infty$, then $\mathrm{Var}(S) = \mathbb{E}(N)\mathrm{Var}(Y_1) + \mathrm{Var}(N)(\mathbb{E}(Y_1))^2$.
(iii) If $N = N(t)$ is Poisson($\lambda t$), then $\mathrm{Var}(S) = t\lambda\,\mathbb{E}(Y_1^2)$.
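The moment formulas of Theorem 5.4.2 can be checked by simulation. The following sketch is an illustration under assumed parameters (exponential jump sizes with mean 2, rate $\lambda = 3$, $t = 1$); the helper names are hypothetical. For these choices $\mathbb{E}(Y_1) = 2$ and $\mathbb{E}(Y_1^2) = 8$, so parts (i) and (iii) give $\mathbb{E}(S) = 6$ and $\mathrm{Var}(S) = 24$.

```python
import random


def poisson_count(lam, t, rng):
    """Number of arrivals by time t, built from exponential(lam) gaps."""
    n, arrival = 0, rng.expovariate(lam)
    while arrival <= t:
        n += 1
        arrival += rng.expovariate(lam)
    return n


def compound_poisson(lam, t, draw_jump, rng):
    """S(t) = Y_1 + ... + Y_{N(t)}, with S(t) = 0 when N(t) = 0."""
    return sum(draw_jump(rng) for _ in range(poisson_count(lam, t, rng)))


rng = random.Random(2024)
lam, t, mean_jump, trials = 3.0, 1.0, 2.0, 20_000   # assumed values
draw = lambda r: r.expovariate(1.0 / mean_jump)     # E(Y) = 2, E(Y^2) = 8
samples = [compound_poisson(lam, t, draw, rng) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / (trials - 1)
print(f"sample mean={mean:.3f}  vs  lam*t*E(Y)   = {lam * t * mean_jump}")
print(f"sample var ={var:.3f}  vs  lam*t*E(Y^2) = {lam * t * 2 * mean_jump ** 2}")
```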

A typical application in the insurance context is a Poisson model of claim arrival with random claim sizes. Again we are interested in bounds on the ruin probability. Let $(X(t))$ model the capital of an insurance company. The initial capital is $X_0 = u > 0$, insurance payments arrive continuously at a constant rate $c > 0$ and claims are received at random times $t_1, t_2, \ldots$, where the amounts paid out at these times are described by nonnegative random variables $Y_1, Y_2, \ldots$. Assuming (i) arrivals according to a Poisson process, (ii) i.i.d. claim sizes $Y_i$ with $F(x) = \mathbb{P}(Y_1 \le x)$, $F(0) = 0$, $\mu = \int_0^\infty x\,dF(x) < \infty$, and (iii) independence of arrival and claim size process, we use as a model

$$X(t) = u + ct - S(t). \qquad (5.1)$$
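The model (5.1) lends itself to a simple Monte Carlo experiment. The sketch below is an illustration only, under assumptions that are not from the text: exponential claim sizes with mean 1, arrival rate $\lambda = 1$, premium rate $c$ set 20% above the mean claim outflow, and a finite horizon. It estimates the probability that the capital falls to zero or below before the horizon; since the capital grows between claims, this can only happen at a claim time. All function names are hypothetical.

```python
import random


def ruined_by(horizon, u, c, lam, draw_claim, rng):
    """True if X(t) = u + c*t - S(t) is <= 0 at some claim time <= horizon."""
    t, paid = 0.0, 0.0
    while True:
        t += rng.expovariate(lam)       # exponential(lam) gap to the next claim
        if t > horizon:
            return False                # survived the whole horizon
        paid += draw_claim(rng)         # S(t) jumps by the claim size
        if u + c * t - paid <= 0:       # capital can only drop at claim times
            return True


def ruin_probability(horizon, u, c, lam, draw_claim, trials, rng):
    """Fraction of simulated paths that are ruined before the horizon."""
    hits = sum(ruined_by(horizon, u, c, lam, draw_claim, rng) for _ in range(trials))
    return hits / trials


rng = random.Random(7)
lam, mean_claim = 1.0, 1.0              # assumed arrival rate and mean claim size
c = 1.2 * lam * mean_claim              # assumed premium rate, above the mean outflow
exp_claim = lambda r: r.expovariate(1.0 / mean_claim)
for u in (0.5, 5.0, 20.0):
    p = ruin_probability(50.0, u, c, lam, exp_claim, 2_000, rng)
    print(f"u={u:>5}: estimated P(ruin by t=50) = {p:.3f}")
```

The estimates should decrease as the initial capital $u$ grows, which is the qualitative behaviour that bounds on the ruin probability make precise.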

Using the above we see that a natural requirement is $c > \lambda\mu$. We are again interested in the probability of ruin. Write

$$\tau = \inf\{t \ge 0 : X(t) \le 0\},$$

then the probability of ruin is $\mathbb{P}(\tau < \infty)$ and the probability of ruin before time $t$ is $\mathbb{P}(\tau \le t)$.

Theorem 5.4.3. Let $R$ be the (unique) root of the equation

$$\frac{\lambda}{c} \int_0^\infty e^{rx} (1 - F(x))\,dx = 1.$$

Then for any $t \ge 0$ ... and thus $\mathbb{P}(\tau$ ...

... $\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t)$ $(s, t > 0)$, or

$$\mathbb{P}(X > s + t) = \mathbb{P}(X > s)\mathbb{P}(X > t).$$

Writing $\bar{F}(x) := 1 - F(x)$ $(x \ge 0)$ for the tail of $F$, this says that

$$\bar{F}(s + t) = \bar{F}(s)\bar{F}(t) \quad (s, t \ge 0).$$

Obvious solutions are

$$\bar{F}(t) = e^{-\lambda t}, \qquad F(t) = 1 - e^{-\lambda t}$$

for some $\lambda > 0$ - the exponential law $E(\lambda)$. Now

$$f(s + t) = f(s)f(t) \quad (s, t \ge 0)$$

is a 'functional equation' - the Cauchy functional equation - and it turns out that these are the only solutions, subject to minimal regularity (such as one-sided boundedness, as here - even on an interval of arbitrarily small length!). For details, see e.g. Bingham, Goldie, and Teugels (1987), §1.1.1.

So the exponential laws $E(\lambda)$ are characterized by the lack-of-memory property. Also, the lack-of-memory property corresponds in the renewal context to the Markov property. The renewal process generated by $E(\lambda)$ is called the Poisson (point) process with rate $\lambda$, $Ppp(\lambda)$. So: among renewal processes, the only Markov processes are the Poisson processes. When we meet Lévy processes we shall find also: among renewal processes, the only Lévy processes are the Poisson processes.

It is the lack of memory property of the exponential distribution that (since the inter-arrival times of the Poisson process are exponentially distributed) makes the Poisson process the basic model for events occurring 'out of the blue'. For basic background, see e.g. Grimmett and Stirzaker (2001), §6.8. Excellent textbook treatments are Embrechts, Klüppelberg, and Mikosch (1997) (motivated by insurance and finance applications) and Daley and Vere-Jones (1988) (motivated by the geophysical applications mentioned above).

5.5 Lévy Processes

5.5.1 Distributions

The Lévy-Khintchine Formula. The form of the general infinitely-divisible distribution was studied in the 1930s by several people (including Kolmogorov and de Finetti). The final result, due to Lévy and Khintchine, is expressed in CF language - indeed, cannot be expressed otherwise. To describe the CF of the general i.d. law, we need three components:
(i) a real $a$ (called the drift, or deterministic drift),
(ii) a non-negative $\sigma$ (called the diffusion coefficient, or normal component, or Gaussian component),
(iii) a (positive) measure $\mu$ on $\mathbb{R}$ (or $\mathbb{R} \setminus \{0\}$) for which

$$\int_{-\infty}^{\infty} \min(1, |x|^2)\,\mu(dx) < \infty,$$

that is,

$$\int_{|x| < 1} |x|^2\,\mu(dx) < \infty, \qquad \int_{|x| \ge 1} \mu(dx) < \infty,$$

called the Lévy measure.

The result is (recall §2.10)

Theorem 5.5.1 (Lévy-Khintchine Formula). A function $\phi$ is the characteristic function of an infinitely divisible distribution iff it has the form

$$\phi(u) = \exp\{-\Psi(u)\} \quad (u \in \mathbb{R}),$$

where

$$\Psi(u) = iau + \tfrac{1}{2}\sigma^2 u^2 + \int \left(1 - e^{iux} + iux 1_{(-1,1)}(x)\right) \mu(dx) \qquad (5.2)$$

for some real $a$, $\sigma \ge 0$ and Lévy measure $\mu$.

Examples.
1. Normal $N(m, \sigma^2)$ (writing $m$ for the mean, since $\mu$ here denotes the Lévy measure). Here $a = -m$, $\mu = 0$.
2. Compound Poisson $CP(\lambda, F)$. Here $\sigma = 0$, $\mu$ has finite total mass (far from true in general!), $\lambda$ say, and $\mu = \lambda F$. Then $\int_{-1}^{1} |x|\,\mu(dx) < \infty$, and $a = -\int_{-1}^{1} x\,\mu(dx)$.
3. Cauchy. See below (under 'Stability').

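The Central Limit Problem. Recall the classical Central Limit Theorem (CLT). If $X_1, X_2, \ldots$ are iid with mean $\mu$ and variance $\sigma^2$, $S_n := \sum_{k=1}^n X_k$, then $(S_n - n\mu)/(\sigma\sqrt{n})$ is asymptotically standard normal:

$$\mathbb{P}\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) \to \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}y^2}\,dy \quad (n \to \infty) \quad \forall x \in \mathbb{R}.$$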


Self-decomposability. Recall that if, in the central limit problem of §2.10, we restrict from (two-suffix) triangular arrays $(X_{nk})$ to (one-suffix) sequences $(X_n)$, we come to a subclass of the infinite-divisible laws $I$, called the class of self-decomposable laws $SD$:

$$SD \subset I.$$

Stability. Suppose we now restrict to identical distribution as well as independence in $SD$ above. That is, we seek the class of limit laws of random walks $S_n = \sum_{k=1}^n X_k$ with $(X_n)$ iid - after an affine transformation (centering and scaling) - that is, for all limit laws of $(S_n - a_n)/b_n$. It turns out that the class of limit laws so obtained is the same as the class of laws for which $S_n$ has the same type as $X_1$ - i.e. the same law to within an affine transformation, or a change of location and scale. Thus the type is 'stable' (invariant, unchanged) under addition of independent copies, whence such laws are called stable. They form the class $S$:

$$S \subset SD \subset I.$$

It turns out that this class of stable laws can be described explicitly by parameters - four in all, of which two (location and scale, specifying the law within the type) are of minor importance, leaving two essential parameters, called the index $\alpha$ ($\alpha \in (0, 2]$) and the skewness parameter $\beta$ ($\beta \in [-1, 1]$). To within type, the Lévy exponent is

$$\Psi(u) = |u|^{\alpha} \left(1 - i\beta\,\mathrm{sgn}(u) \tan \tfrac{1}{2}\pi\alpha\right)$$

for $\alpha \ne 1$ ($0 < \alpha < 1$ or $1 < \alpha \le 2$) and

$$\Psi(u) = |u| \left(1 + i\beta\,\mathrm{sgn}(u) \log |u|\right)$$

if $\alpha = 1$. The Lévy measure is absolutely continuous, with density of the form

$$\mu(dx) = \begin{cases} c_+\,dx / x^{1+\alpha} & x > 0, \\ c_-\,dx / |x|^{1+\alpha} & x < 0, \end{cases}$$

with $c_+, c_- \ge 0$ and $\beta = (c_+ - c_-)/(c_+ + c_-)$. For proof, see Gnedenko and Kolmogorov (1954), Feller (1968), XVIII.6, or Breiman (1992), §§9.8-11.

The case $\alpha = 2$ (for which $\beta$ drops out) gives the normal/Gaussian case, already familiar. The case $\alpha = 1$ and $\beta = 0$ gives the (symmetric) Cauchy law above. The case $\alpha = 1$, $\beta \ne 0$ gives the asymmetric Cauchy case, which is awkward, and we shall not pursue it.

From the form of the Lévy exponents of the remaining stable CFs (where the argument $u$ appears only in $|u|^{\alpha}$ and $\mathrm{sgn}(u)$), we see that, if $S_n = X_1 + \ldots + X_n$ with $X_i$ independent copies, ...

... $0)$ has Laplace transform $\exp\{-a\sqrt{2s}\}$ $(s \ge 0)$; see Rogers and Williams (1994) §I.9 for proof. This is the density of the first-passage time of Brownian motion over a level $a > 0$.

The other remarkable case is that of $\alpha = 3/2$, $\beta = 0$, studied by the Danish astronomer J. Holtsmark in 1919 in connection with the gravitational field of stars - this before Lévy's work on stability. The power 3/2 comes from 3 dimensions and the inverse square law of gravity.

5.5.2 Lévy Processes

Suppose we have a process $X = (X_t : t \ge 0)$ that has stationary independent increments. Such a process is called a Lévy process, in honour of their creator, the great French probabilist Paul Lévy (1886-1971). Then for each $n = 1, 2, \ldots$,

$$X_t = X_{t/n} + (X_{2t/n} - X_{t/n}) + \ldots + (X_t - X_{(n-1)t/n})$$

displays $X_t$ as the sum of $n$ independent (by independent increments), identically distributed (by stationary increments) random variables. Consequently, $X_t$ is infinitely divisible, so its CF is given by the Lévy-Khintchine formula (5.2). The prime example is: the Wiener process, or Brownian motion, is a Lévy process.

Poisson Processes. The increment $N_{t+u} - N_u$ $(t, u \ge 0)$ of a Poisson process is the number of failures in $(u, t + u]$ (in the language of renewal theory). By the lack-of-memory property of the exponential, this is independent of the failures in $[0, u]$, so the increments of $N$ are independent. It is also identically distributed to the number of failures in $[0, t]$, so the increments of $N$ are stationary. That is, $N$ has stationary independent increments, so is a Lévy process: Poisson processes are Lévy processes.

We need an important property: two Poisson processes (on the same filtration) are independent iff they never jump together (a.s.). For proof, see e.g. Revuz and Yor (1991), XII.1.


The Poisson count in an interval of length $t$ is Poisson $P(\lambda t)$ (where the rate $\lambda$ is the parameter in the exponential $E(\lambda)$ of the renewal-theory viewpoint), and the Poisson counts of disjoint intervals are independent. This extends from intervals to Borel sets:
(i) For a Borel set $B$, the Poisson count in $B$ is Poisson $P(\lambda |B|)$, where $|\cdot|$ denotes Lebesgue measure;
(ii) Poisson counts over disjoint Borel sets are independent.

Poisson (Random) Measures. If $\nu$ is a finite measure, call a random measure $\phi$ Poisson with intensity (or characteristic) measure $\nu$ if for each Borel set $B$, $\phi(B)$ has a Poisson distribution with parameter $\nu(B)$, and for disjoint $B_1, \ldots, B_n$, $\phi(B_1), \ldots, \phi(B_n)$ are independent. One can extend to $\sigma$-finite measures $\nu$: if $(E_n)$ are disjoint with union $\mathbb{R}$ and each $\nu(E_n) < \infty$, construct $\phi_n$ from $\nu$ restricted to $E_n$ and write $\phi$ for $\sum \phi_n$.

Poisson Point Processes. With $\nu$ as above a ($\sigma$-finite) measure on $\mathbb{R}$, consider the product measure $\mu = \nu \times dt$ on $\mathbb{R} \times [0, \infty)$, and a Poisson measure $\phi$ on it with intensity $\mu$. Then $\phi$ has the form

$$\phi = \sum_{t \ge 0} \delta_{(e(t), t)},$$

where the sum is countable (for background and details, see Bertoin (1996), §0.5, whose treatment we follow here). Thus $\phi$ is the sum of Dirac measures over 'Poisson points' $e(t)$ occurring at Poisson times $t$. Call $e = (e(t) : t \ge 0)$ a Poisson point process with characteristic measure $\nu$,

$$e = Ppp(\nu).$$

For each Borel set $B$,

$$N(t, B) := \phi(B \times [0, t]) = \mathrm{card}\{s \le t : e(s) \in B\}$$

is the counting process of $B$ - it counts the Poisson points in $B$ - and is a Poisson process with rate (parameter) $\nu(B)$. All this reverses: starting with an $e = (e(t) : t \ge 0)$ whose counting processes over Borel sets $B$ are Poisson $P(\nu(B))$, then - as no point can contribute to more than one count over disjoint sets - disjoint counting processes never jump together, so are independent by above, and $\phi := \sum_{t \ge 0} \delta_{(e(t), t)}$ is a Poisson measure with intensity $\mu = \nu \times dt$.

Note. The link between point processes and martingales goes back to S. Watanabe in 1964. The approach via Poisson point processes is due to K. Itô in 1970 (Proc. 6th Berkeley Symp.); see below, and - in the context of excursion theory - Rogers and Williams (2000), VI §8. For a monograph treatment of Poisson processes, see Kingman (1993).

5.5.3 Lévy Processes and the Lévy-Khintchine Formula

We can now sketch the close link between the general Lévy process on the one hand and the general infinitely-divisible law given by the Lévy-Khintchine formula (L-K) on the other. We follow Bertoin (1996), §1.1.

First, if $X = (X_t)$ is Lévy, the law of each $X_1$ is infinitely divisible, so given by

$$\mathbb{E} \exp\{iuX_1\} = \exp\{-\Psi(u)\} \quad (u \in \mathbb{R})$$

with $\Psi$ a Lévy exponent as in (5.2). Similarly,

$$\mathbb{E} \exp\{iuX_t\} = \exp\{-t\Psi(u)\} \quad (u \in \mathbb{R}),$$

for rational $t$ at first and general $t$ by approximation and càdlàg paths. Then $\Psi$ is called the Lévy exponent, or characteristic exponent, of the Lévy process $X$.

Conversely, given a Lévy exponent $\Psi(u)$ as in (5.2), construct a Brownian motion $B$ as in §5.3, and an independent Poisson point process $\Delta = (\Delta_t : t \ge 0)$ with characteristic measure $\mu$, the Lévy measure in (5.2). Then $X^{(1)}(t) := -at + \sigma B_t$ has CF

$$\mathbb{E} \exp\{iuX^{(1)}(t)\} = \exp\{-t\Psi^{(1)}(u)\} = \exp\left\{-t\left(iau + \tfrac{1}{2}\sigma^2 u^2\right)\right\},$$

giving the non-integral terms in (5.2). For the 'large' jumps of $\Delta$, write

$$\Delta_t^{(2)} := \begin{cases} \Delta_t & \text{if } |\Delta_t| \ge 1, \\ 0 & \text{else.} \end{cases}$$

Then $\Delta^{(2)}$ is a Poisson point process with characteristic measure $\mu^{(2)}(dx) := 1(|x| \ge 1)\mu(dx)$. Since $\int \min(1, |x|^2)\,\mu(dx) < \infty$, $\mu^{(2)}$ has finite mass, so $\Delta^{(2)}$, a $Ppp(\mu^{(2)})$, is discrete and its counting process

$$X_t^{(2)} := \sum_{s \le t} \Delta_s^{(2)} \quad (t \ge 0)$$

is compound Poisson, with Lévy exponent

$$\Psi^{(2)}(u) = \int (1 - e^{iux})\,\mu^{(2)}(dx) = \int_{|x| \ge 1} (1 - e^{iux})\,\mu(dx).$$

There remain the 'small jumps',

$$\Delta_t^{(3)} := \begin{cases} \Delta_t & \text{if } |\Delta_t| < 1, \\ 0 & \text{else.} \end{cases}$$

... $\epsilon > 0$, the 'compensated sum of jumps' $X_t^{(\epsilon,3)} := \sum_{s \le t} 1(\epsilon < |\Delta_s|$ ...

... 0, due to Itô in 1944. This corrects Bachelier's earlier attempt of 1900 (he did not have the factor $S(t)$ on the right - missing the interpretation in terms of returns, and leading to negative stock prices!). Incidentally, Bachelier's work served as Itô's motivation in introducing Itô calculus. The mathematical importance of Itô's work was recognised early, and led on to the work of Doob (1953), Meyer (1976) and many others (see the memorial volume Ikeda, Watanabe, M., and Kunita (1996) in honour of Itô's eightieth birthday in 1995). The economic importance of geometric Brownian motion was recognized by Paul A. Samuelson in his work from 1965 on (Samuelson (1965)), for which Samuelson received the Nobel Prize in Economics in 1970, and by Robert Merton (see Merton (1990) for a full bibliography), in work for which he was similarly honoured in 1997.

5.6 Stochastic Integrals; Ito Calculus


The differential equation above has the unique solution

$$S(t) = S(0) \exp\left\{\left(\mu - \tfrac{1}{2}\sigma^2\right)t + \sigma W(t)\right\}.$$

For, writing

$$f(t, x) := \exp\left\{\left(\mu - \tfrac{1}{2}\sigma^2\right)t + \sigma x\right\},$$

we have

$$f_t = \left(\mu - \tfrac{1}{2}\sigma^2\right) f, \qquad f_x = \sigma f, \qquad f_{xx} = \sigma^2 f,$$

and with $x = W(t)$, one has $dx = dW(t)$, $(dx)^2 = dt$. Thus Itô's lemma gives

$$df(t, W(t)) = f_t\,dt + f_x\,dW(t) + \tfrac{1}{2} f_{xx} (dW(t))^2 = f\left(\left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW(t) + \tfrac{1}{2}\sigma^2\,dt\right) = f(\mu\,dt + \sigma\,dW(t)),$$

so $S(t) = S(0) f(t, W(t))$ is a solution of the stochastic differential equation, and it satisfies the initial condition, since $f(0, W(0)) = 1$ as $W(0) = 0$, giving existence.

For uniqueness, we need the stochastic (or Doléans, or Doléans-Dade) exponential (see §5.10 below), giving

$$Y = \mathcal{E}(X) = \exp\left\{X - \tfrac{1}{2}\langle X \rangle\right\}$$

(with $X$ a continuous semi-martingale) as the unique solution to the stochastic differential equation

$$dY(t) = Y(t-)\,dX(t), \qquad Y(0) = 1$$

(for the general definition and properties see e.g. Jacod and Shiryaev (1987), I.4, Protter (2004), II.8, Revuz and Yor (1991), IV.3, VIII.1, Rogers and Williams (2000), IV.19). (Incidentally, this is one of the few cases where a stochastic differential equation can be solved explicitly. Usually we must be content with an existence and uniqueness statement, and a numerical algorithm for calculating the solution; see §5.7 below.) Thus $S(t)$ above is the stochastic exponential of $\mu t + \sigma W(t)$, Brownian motion with mean (or drift) $\mu$ and variance (or volatility) $\sigma^2$. In particular,

$$\log S(t) = \log S(0) + \left(\mu - \tfrac{1}{2}\sigma^2\right)t + \sigma W(t)$$

has a normal distribution. Thus $S(t)$ itself has a lognormal distribution. This geometric Brownian motion model, and the log-normal distribution that it entails, are the basis for the Black-Scholes model for stock-price dynamics in continuous time, which we study in detail in §6.2.
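The remark on numerical algorithms can be made concrete. The sketch below, which is an illustration and not a method taken from the text, compares the explicit solution with an Euler-Maruyama recursion for $dS = S(\mu\,dt + \sigma\,dW)$ driven by the same simulated Brownian increments. The function names, the equally spaced grid and the parameter values are assumptions.

```python
import math
import random


def brownian_increments(t, n, rng):
    """n independent N(0, t/n) increments of W over [0, t]."""
    dt = t / n
    return [math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]


def gbm_exact(s0, mu, sigma, t, dw):
    """Closed form S(t) = S(0) exp{(mu - sigma^2/2) t + sigma W(t)}, W(t) = sum(dw)."""
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * sum(dw))


def gbm_euler(s0, mu, sigma, t, dw):
    """Euler-Maruyama recursion for dS = S (mu dt + sigma dW) on the same increments."""
    dt = t / len(dw)
    s = s0
    for dwi in dw:
        s += s * (mu * dt + sigma * dwi)
    return s


rng = random.Random(1)
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0   # hypothetical parameters
for n in (10, 100, 10_000):
    dw = brownian_increments(t, n, rng)
    exact, euler = gbm_exact(s0, mu, sigma, t, dw), gbm_euler(s0, mu, sigma, t, dw)
    print(f"n={n:>6}  exact={exact:.4f}  euler={euler:.4f}  abs diff={abs(exact - euler):.4f}")
```

On a fine grid the two values should agree closely along each path, which is consistent with the closed form solving the equation; the explicit formula also stays positive, whereas a coarse Euler step is not guaranteed to.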



5.7 Stochastic Calculus for Black-Scholes Models

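In this section we collect the main tools for the analysis of financial markets with uncertainty modelled by Brownian motions.

Consider first independent $N(0,1)$ random variables $Z_1, \ldots, Z_n$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Given a vector $\gamma = (\gamma_1, \ldots, \gamma_n)$, consider a new probability measure $\tilde{\mathbb{P}}$ on $(\Omega, \mathcal{F})$ defined by

$$\tilde{\mathbb{P}}(d\omega) = \exp\Big\{\sum_{i=1}^n \gamma_i Z_i(\omega) - \tfrac{1}{2}\sum_{i=1}^n \gamma_i^2\Big\}\,\mathbb{P}(d\omega).$$

As $\exp\{\cdot\} > 0$ and integrates to 1, as $\int \exp\{\gamma_i Z_i\}\,d\mathbb{P} = \exp\{\tfrac{1}{2}\gamma_i^2\}$, this is a probability measure. It is also equivalent to $\mathbb{P}$ (has the same null sets), again as the exponential term is positive. Also

$$\tilde{\mathbb{P}}(Z_i \in dz_i,\ i = 1, \ldots, n) = \exp\Big\{\sum_{i=1}^n \gamma_i z_i - \tfrac{1}{2}\sum_{i=1}^n \gamma_i^2\Big\}\,\mathbb{P}(Z_i \in dz_i,\ i = 1, \ldots, n) = (2\pi)^{-n/2}\exp\Big\{-\tfrac{1}{2}\sum_{i=1}^n (z_i - \gamma_i)^2\Big\}\,dz_1 \cdots dz_n.$$

This says that if the $Z_i$ are independent $N(0,1)$ under $\mathbb{P}$, they are independent $N(\gamma_i, 1)$ under $\tilde{\mathbb{P}}$. Thus the effect of the change of measure $\mathbb{P} \to \tilde{\mathbb{P}}$, from the original measure $\mathbb{P}$ to the equivalent measure $\tilde{\mathbb{P}}$, is to change the mean, from $0 = (0, \ldots, 0)$ to $\gamma = (\gamma_1, \ldots, \gamma_n)$.

This result extends to infinitely many dimensions - i.e., from random vectors to stochastic processes, indeed with random rather than deterministic means. Let $W = (W_1, \ldots, W_d)$ be a $d$-dimensional Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with the filtration $\mathbb{F}$ satisfying the usual conditions. Let $(\gamma(t) : 0 \le t \le T)$ be a measurable, adapted $d$-dimensional process with $\int_0^T \gamma_i(t)^2\,dt < \infty$ a.s., $i = 1, \ldots, d$, and define the process $(L(t) : 0 \le t \le T)$ by

$$L(t) = \exp\Big\{-\int_0^t \gamma(s)'\,dW(s) - \tfrac{1}{2}\int_0^t \|\gamma(s)\|^2\,ds\Big\}. \qquad (5.5)$$

Then $L$ is continuous, and, being the stochastic exponential of $-\int_0^t \gamma(s)'\,dW(s)$, is a local martingale. Given sufficient integrability on the process $\gamma$, $L$ will in fact be a (continuous) martingale. For this, Novikov's condition suffices:

$$\mathbb{E}\Big(\exp\Big\{\tfrac{1}{2}\int_0^T \|\gamma(s)\|^2\,ds\Big\}\Big) < \infty.$$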


We are now in the position to state a version of Girsanov's theorem, which will be one of our main tools in studying continuous-time financial market models.

Theorem 5.7.1 (Girsanov). Let $\gamma$ be as above and satisfy Novikov's condition; let $L$ be the corresponding continuous martingale. Define the processes $\tilde W_i$, $i = 1, \ldots, d$ by

$$\tilde W_i(t) := W_i(t) + \int_0^t \gamma_i(s)\,ds, \qquad (0 \le t \le T),\ i = 1, \ldots, d.$$

Then under the equivalent probability measure $\tilde{\mathbb{P}}$ (defined on $(\Omega, \mathcal{F}_T)$) with Radon-Nikodym derivative

$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} = L(T),$$

the process $\tilde W = (\tilde W_1, \ldots, \tilde W_d)$ is $d$-dimensional Brownian motion.
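The finite-dimensional change of measure that motivates the theorem can be illustrated numerically: reweighting samples drawn under $\mathbb{P}$ by the exponential density should shift the means from $0$ to $\gamma_i$ and leave the variances at 1. The following is a minimal sketch, assuming NumPy; the vector `gamma` and the sample size are illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 500_000
gamma = np.array([0.5, -0.3, 1.0])   # illustrative shift vector (assumption)

# Z_1, ..., Z_n independent N(0, 1) under P
z = rng.standard_normal((n_samples, gamma.size))

# Density of the new measure with respect to P:
# exp{ sum_i gamma_i Z_i - (1/2) sum_i gamma_i^2 }
density = np.exp(z @ gamma - 0.5 * gamma @ gamma)

# Expectations under the new measure are density-weighted expectations under P
total_mass = density.mean()                                     # close to 1
mean_new = (density[:, None] * z).mean(axis=0)                  # close to gamma
var_new = (density[:, None] * (z - gamma) ** 2).mean(axis=0)    # close to 1

print("total mass:", total_mass)
print("means under new measure:", mean_new)
print("variances under new measure:", var_new)
```

The same reweighting idea, applied path by path with $L(T)$ in place of the finite-dimensional density, is what the theorem formalises for Brownian motion.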

In particular, for $\gamma(t)$ constant ($= \gamma$), change of measure by introducing the Radon-Nikodym derivative $\exp\{-\gamma W(t) - \tfrac{1}{2}\gamma^2 t\}$ corresponds to a change of drift from $c$ to $c - \gamma$. If $\mathbb{F} = (\mathcal{F}_t)$ is the Brownian filtration (basically $\mathcal{F}_t = \sigma(W(s), 0 \le s \le t)$ slightly enlarged to satisfy the usual conditions) any pair of equivalent probability measures $\mathbb{Q} \sim \mathbb{P}$ on $\mathcal{F} = \mathcal{F}_T$ is a Girsanov pair, i.e.

$$\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t)$$

with $L$ defined as above. Girsanov's theorem (or the Cameron-Martin-Girsanov theorem) is formulated in varying degrees of generality, discussed and proved, e.g. in Karatzas and Shreve (1991), §3.5, Protter (2004), III.6, Revuz and Yor (1991), VIII, Dothan (1990), §5.4 (discrete time), §11.6 (continuous time).

Our main application of the Girsanov theorem will be the change of measure in the Black-Scholes model of a financial market (§6.2) to obtain the risk-neutral martingale measure, which will, as in the discrete-time case, guarantee an arbitrage-free market model and may be used for pricing contingent claims (see §6.1). To discuss questions of attainability and market completeness we will need:

Theorem 5.7.2 (Representation Theorem). Let $M = (M(t))_{t \ge 0}$ be a RCLL local martingale with respect to the Brownian filtration $(\mathcal{F}_t)$. Then

$$M(t) = M(0) + \int_0^t H(s)\,dW(s), \qquad t \ge 0$$

with $H = (H(t))_{t \ge 0}$ a progressively measurable process such that $\int_0^t H(s)^2\,ds < \infty$, $t \ge 0$, with probability one. That is, all Brownian local martingales may be represented as stochastic integrals with respect to Brownian motion (and as such are continuous).

The following corollary has important economic consequences.

Corollary 5.7.1. Let $G$ be an $\mathcal{F}_T$-measurable random variable, $0 < T < \infty$, with $\mathbb{E}(|G|) < \infty$; then there exists a process $H$ as in Theorem 5.7.2 such that

$$G = \mathbb{E}G + \int_0^T H(s)\,dW(s).$$

0, and boundedness of b on compact sets, one can construct a unique solution x by the Picard iteration

$$x^{(0)}(t) := x_0, \qquad x^{(n+1)}(t) := x_0 + \int_0^t b(s, x^{(n)}(s))\,ds.$$
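The deterministic Picard iteration can be carried out numerically on a time grid. The following is a minimal sketch, assuming NumPy and using the test equation $x'(t) = x(t)$, $x(0) = 1$ (an illustrative choice, with exact solution $e^t$); the helper `picard_iterates` and the trapezoidal quadrature are assumptions made for the illustration.

```python
import numpy as np

def picard_iterates(b, x0, t_grid, n_iter):
    """Return the Picard iterates x^(0), ..., x^(n_iter) evaluated on t_grid."""
    x = np.full_like(t_grid, x0, dtype=float)      # x^(0)(t) := x0
    iterates = [x]
    dt = np.diff(t_grid)
    for _ in range(n_iter):
        integrand = b(t_grid, x)
        # cumulative trapezoidal rule for int_0^t b(s, x^(n)(s)) ds
        increments = 0.5 * dt * (integrand[1:] + integrand[:-1])
        integral = np.concatenate(([0.0], np.cumsum(increments)))
        x = x0 + integral                          # x^(n+1)(t)
        iterates.append(x)
    return iterates

t_grid = np.linspace(0.0, 1.0, 2001)
iterates = picard_iterates(lambda t, x: x, x0=1.0, t_grid=t_grid, n_iter=15)

# Uniform distance of each iterate from the exact solution exp(t)
errors = [np.max(np.abs(x - np.exp(t_grid))) for x in iterates]
for n in (0, 1, 2, 5, 15):
    print(n, errors[n])
```

For this equation the $n$-th iterate is, up to quadrature error, the Taylor polynomial of degree $n$ of $e^t$, so the errors fall off factorially until the grid error dominates.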

See e.g. Hale (1969), Theorem 1.5.3, or any textbook on analysis or differential equations. (The result may also be obtained as an application of Banach's contraction-mapping principle in functional analysis.)

Naturally, stochastic calculus and stochastic differential equations contain all the complications of their non-stochastic counterparts, and more besides. Thus by analogy with PDEs alone, we must expect study of SDEs to be complicated by the presence of more than one concept of a solution. The first solution concept that comes to mind is that obtained by sticking to the non-stochastic theory, and working pathwise: take each sample path of a stochastic process as a function, and work with that. This gives the concept of a strong solution of a stochastic differential equation. Here we are given the probabilistic set-up - the filtered probability space in which our SDE arises - and work within it. The most basic results, like their non-stochastic counterparts, assume regularity of coefficients (e.g., Lipschitz conditions), and construct a unique solution by a stochastic version of Picard iteration. The following such result is proved in Karatzas and Shreve (1991), §5.2.

Consider the stochastic differential equation

$$dX(t) = b(t, X(t))\,dt + \sigma(t, X(t))\,dW(t), \qquad X(0) = \xi,$$

where $b(t,x)$ is a $d$-vector of drifts, $\sigma(t,x)$ is a $d \times r$ dispersion matrix, $W(t)$ is an $r$-dimensional Brownian motion, $\xi$ is a square-integrable random $d$-vector independent of $W$, and we work on a filtered probability space satisfying the usual conditions on which $W$ and $\xi$ are both defined. Suppose that the coefficients $b$, $\sigma$ satisfy the following global Lipschitz and growth conditions:

$$\|b(t,x) - b(t,y)\| + \|\sigma(t,x) - \sigma(t,y)\| \le K\|x - y\|,$$

$$\|b(t,x)\|^2 + \|\sigma(t,x)\|^2 \le K^2(1 + \|x\|^2),$$

for all $t \ge 0$, $x, y \in \mathbb{R}^d$, for some constant $K > 0$.

Theorem 5.8.1. Under the above Lipschitz and growth conditions,

(i) the Picard iteration $X^{(0)}(t) := \xi$,

$$X^{(n+1)}(t) := \xi + \int_0^t b(s, X^{(n)}(s))\,ds + \int_0^t \sigma(s, X^{(n)}(s))\,dW(s)$$

0. Now note that the integral is real. Differentiate under the integral sign, and use $\int_0^\infty x^{-1}\sin x\,dx = \pi/2$.)


6. Mathematical Finance in Continuous Time

This chapter discusses the general principles of continuous-time financial market models. In the first section we use a rather general model, which will serve also as a reference in the later chapters. A thorough discussion of the benchmark multi-dimensional Black-Scholes model is the topic of the second section. We discuss the valuation of several standard and exotic contingent claims in the continuous-time Black-Scholes model in the third section. After examining the relation between continuous-time and discrete-time models we close with a discussion of futures and currency markets.

6.1 Continuous-time Financial Market Models

6.1.1 The Financial Market Model

We start with a general model of a frictionless (compare Chapter 1) security market where investors are allowed to trade continuously up to some fixed finite planning horizon $T$. Uncertainty in the financial market is modelled by a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$

0, where $\lambda$ denotes Lebesgue measure, such that for all $t \in [0,T]$ we have $\sigma(t)'\varphi(t) = 0$ ('no exposure to risk'), $\varphi(t)'(b(t) - r(t)\mathbf{1}_d) \ne 0$ ('non-zero rate'), on $A$. For an arbitrary constant $c > 0$, construct a new trading strategy

$$\psi(t) = \begin{cases} c\,\mathrm{sgn}\big(\varphi(t)'(b(t) - r(t)\mathbf{1}_d)\big)\,\varphi(t) & \text{on } A, \\ 0 & \text{otherwise.} \end{cases}$$

For the discounted value process of the trading strategy $\psi$ we have the following dynamics:

$$d\tilde V_\psi(t) = \sum_{i=1}^d \psi_i(t)\,d\tilde S_i(t) = \sum_{i=1}^d \psi_i(t)\tilde S_i(t)\Big\{(b_i(t) - r(t))\,dt + \sum_{j}\sigma_{ij}(t)\,dW_j(t)\Big\} = \sum_{i=1}^d \tilde S_i(t)\,c\,|\varphi_i(t)(b_i(t) - r(t))|\,dt.$$

So $\tilde V_\psi(t) \ge 0$, $0 \le t \le T$, hence $\psi$ is tame. In particular, $\tilde V_\psi(T) > 0$ on a set $B \in \mathcal{F}_T$ with $\mathbb{P}(B) > 0$. This means $\psi$ gives rise to an arbitrage opportunity.


To rule out such an arbitrage opportunity we must have that every vector in the kernel of $\sigma'(t,\omega)$ must be orthogonal to $b(t,\omega) - r(t,\omega)\mathbf{1}_d$ for a.e. $(t,\omega)$. Thus $b(t,\omega) - r(t,\omega)\mathbf{1}_d$ should belong to $(\mathrm{kernel}(\sigma'(t,\omega)))^\perp = \mathrm{range}(\sigma(t,\omega))$, which is precisely the above condition. It then can be shown that $\gamma(\cdot)$ can be selected in Condition (6.5) to be progressively measurable.

To prove (ii), recall that by Theorem 6.1.1 the existence of an equivalent martingale measure rules out arbitrage. The above Conditions (6.6) and (6.7) ensure that the exponential process

$$L(t) = \exp\Big\{-\int_0^t \gamma(u)'\,dW(u) - \tfrac{1}{2}\int_0^t \|\gamma(u)\|^2\,du\Big\}, \qquad 0 \le t \le T \qquad (6.8)$$

is a martingale, and thus

$$\mathbb{P}^*(A) := \mathbb{E}(L(T)\mathbf{1}_A), \qquad A \in \mathcal{F}_T \qquad (6.9)$$

defines an equivalent probability measure $\mathbb{P}^*$ with Radon-Nikodym derivative

$$\frac{d\mathbb{P}^*}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t), \qquad 0 \le t \le T.$$

(Occasionally, we shall call $L$ the Girsanov density.) $\mathbb{P}^*$ is the risk-neutral equivalent martingale measure. Under $\mathbb{P}^*$,

$$\tilde W(t) := W(t) + \int_0^t \gamma(s)\,ds, \qquad 0 \le t \le T$$

is Brownian motion. The stock-price process dynamics under $\mathbb{P}^*$ are

$$dS_i(t) = S_i(t)\Big(r(t)\,dt + \sum_{j=1}^n \sigma_{ij}(t)\,d\tilde W_j(t)\Big), \qquad i = 1, \ldots, d,$$

or equivalently for the discounted price processes,

$$d\tilde S_i(t) = \tilde S_i(t)\Big(\sum_{j=1}^n \sigma_{ij}(t)\,d\tilde W_j(t)\Big), \qquad i = 1, \ldots, d.$$

So $\tilde S$ is a local $\mathbb{P}^*$-martingale and therefore $\mathbb{P}^*$ is an equivalent martingale measure. $\Box$

Remark 6.2.2. Observe that $\tilde S_i$ are the stochastic exponentials of the processes $Z_i = \sum_{j=1}^n \int \sigma_{ij}\,d\tilde W_j$ (compare §5.6.1). So under the Novikov condition they are martingales.

As an illustration of the effect of different choices of $\gamma$ we consider a simple model with two securities $S_1$, $S_2$. Let the price-process dynamics be given by

$$dS_1(t) = S_1(t)\big(b_1(t)\,dt + \sigma_1(t)\,dW(t)\big),$$
$$dS_2(t) = S_2(t)\big(b_2(t)\,dt + \sigma_2(t)\,dW(t)\big).$$

Assume there is a process $\gamma(\cdot)$ (which we might use to define a change of measure) such that

$$\gamma(t) = \frac{b_1(t) - r(t)}{\sigma_1(t)} = \frac{b_2(t) - r(t)}{\sigma_2(t)}.$$

Observe that in the numerators we have the excess rate of return of the risky assets over the risk-free rate and in the denominators the volatility of the assets. So these quotients can be interpreted as the risk premium per unit of volatility. As in the theorem above this ratio is often called the market price of risk (compare §8.2). We can rewrite the above as

$$dS_1(t) = S_1(t)\big((r(t) + \gamma(t)\sigma_1(t))\,dt + \sigma_1(t)\,dW(t)\big),$$
$$dS_2(t) = S_2(t)\big((r(t) + \gamma(t)\sigma_2(t))\,dt + \sigma_2(t)\,dW(t)\big).$$

If, for example, $\gamma \equiv 0$, then

$$dS_1(t) = S_1(t)\big(r(t)\,dt + \sigma_1(t)\,dW(t)\big),$$
$$dS_2(t) = S_2(t)\big(r(t)\,dt + \sigma_2(t)\,dW(t)\big),$$

and we see that $\tilde S_i = S_i/B$, $i = 1, 2$ are (local) martingales under $\mathbb{P}$, so we are already in our usual risk-neutral setting. If we set $\gamma = \sigma_2$, we get (doing some stochastic calculus)

$$d\Big[\frac{S_1(t)}{S_2(t)}\Big] = (\sigma_1(t) - \sigma_2(t))\,\frac{S_1(t)}{S_2(t)}\,dW(t).$$

So $S_1/S_2$ is a (local) martingale in this setting. Since the attitude towards risk is described by $\sigma_2$ ($= \gamma$) (the 'risk' in holding the asset $S_2$), $\mathbb{P}$ is called a risk-neutral measure with respect to $S_2$. (Observe that in this example we calculated the drift coefficients $b_i$ from the volatilities $\sigma_i$, the risk-free rate $r$ and the market price of risk $\gamma$ in order to use the original probability measure $\mathbb{P}$ as a martingale measure. Our usual approach is to change the underlying measure using $\gamma$ determined by $r$, $b_i$, $\sigma_i$.)

We now turn to the question of market completeness. Again we start by looking at the classical Black-Scholes example.
Example. Black-Scholes model (Completeness). We already know that we have a unique martingale measure $\mathbb{P}^*$ (recall $\gamma = (b - r)/\sigma$ in Girsanov's transformation). Given a contingent claim $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, then $X \in L^1(\Omega, \mathcal{F}, \mathbb{P}^*)$ also, and we can define the $\mathbb{P}^*$-martingale

$$M(t) := \mathbb{E}_{\mathbb{P}^*}\big(e^{-rT}X \mid \mathcal{F}_t\big).$$

Using the martingale representation theorem for one-dimensional Brownian motion (compare Corollary 5.7.1), we know that under $\mathbb{P}^*$

$$M(t) = M(0) + \int_0^t h(u)\,d\tilde W(u).$$

Now the $\mathbb{P}^*$-dynamics of $\tilde S$ are $d\tilde S(t) = \sigma \tilde S(t)\,d\tilde W(t)$, so in fact

$$M(t) = M(0) + \int_0^t \varphi_1(u)\,d\tilde S(u),$$

with $\varphi_1(t) = \dfrac{h(t)}{\sigma \tilde S(t)}$. Using Remark 6.1.2 and defining

$$\varphi_0(t) = M(t) - \varphi_1(t)\tilde S(t) = M(t) - \frac{h(t)}{\sigma},$$

we obtain a self-financing replicating trading strategy (in terms of the discounted values $(1, \tilde S)$). We have thus shown that $X$ is attainable, and since $X$ was arbitrary the model is complete (in the restricted sense).

Turning to the general case, we restrict consideration to contingent claims $X$ with $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. From Theorem 6.1.5 we know that uniqueness of the martingale measure implies that a financial model is complete. In $\mathcal{M}_{BS}$ all equivalent martingale measures are given by means of Girsanov's transformation, and hence are characterised by their corresponding Radon-Nikodym derivatives with respect to $\mathbb{P}$. However, these Radon-Nikodym derivatives are characterised by functions $\gamma$ satisfying (use Theorem 6.2.1)

$$b(t) - r(t)\mathbf{1}_d = \sigma(t)\gamma(t), \qquad 0 \le t \le T.$$

Keeping this in mind, a characterization of completeness in $\mathcal{M}_{BS}$ will involve conditions on the coefficients of the model. More precisely:

Theorem 6.2.2. The following are equivalent:

(i) there exists a unique equivalent martingale measure for the discounted stock price process $\tilde S$;

(ii) the generalised Black-Scholes model $\mathcal{M}_{BS}$ is complete (in the restricted sense that every contingent claim $X$ with $X/B(T) \in L^1(\Omega, \mathcal{F}, \mathbb{P}^*)$ is attainable);

(iii) equality $n = d$ holds and the volatility matrix $\sigma(t,\omega)$ is $(\lambda \otimes \mathbb{P})$-a.e. non-singular.

Remark 6.2.3. (i) If $n \le d$ and the matrix $\sigma$ has full rank, we can reduce the number of stocks by duplicating some of them as $(t,\omega)$-dependent linear combinations of others.

(ii) We discuss completeness by assuming that trading is restricted to primary securities. Completeness of financial market models with traded put and call options is discussed, e.g. in Bajeux-Besnainou and Rochet (1996) and Madan and Milne (1993). In particular the question of static hedging is treated in Carr, Ellis, and Gupta (1998) (compare also Exercise 4.6).

Proof of Theorem 6.2.2. Again we only sketch the proof. (i) $\Leftrightarrow$ (iii) follows from the special structure of the Girsanov density used to perform a change of measure in the Brownian setting (compare §5.8).

To show (iii) $\Rightarrow$ (ii) (which is equivalent to showing (i) $\Rightarrow$ (ii), a special case of Theorem 6.1.5) we repeat the argument given in the classical Black-Scholes model. Let $\mathbb{P}^*$ be the martingale measure defined by the unique Girsanov transformation with parameter $\gamma(t) = \sigma(t)^{-1}(b(t) - r(t)\mathbf{1}_d)$. Using the martingale representation property for the multidimensional Brownian motion (compare Corollary 5.7.1), we have for any contingent claim $X$

$$M(t) := \mathbb{E}_{\mathbb{P}^*}\Big(\frac{X}{B(T)}\,\Big|\,\mathcal{F}_t\Big) = M(0) + \int_0^t h'(u)\,d\tilde W(u) = M(0) + \sum_{i=1}^d \int_0^t h_i(u)\,d\tilde W_i(u), \qquad 0 \le t \le T,$$

with $\tilde W(t) = W(t) + \int_0^t \gamma(u)\,du$. On the other hand, the $\mathbb{P}^*$-dynamics of $\tilde S$ are

$$d\tilde S_i(t) = \tilde S_i(t)\sum_{j=1}^d \sigma_{ij}(t)\,d\tilde W_j(t).$$

We thus have

$$M(t) = M(0) + \sum_{i=1}^d \int_0^t \varphi_i(u)\,d\tilde S_i(u),$$

with

$$\varphi_i(t) = \frac{\big((\sigma'(t))^{-1}h(t)\big)_i}{\tilde S_i(t)}, \qquad i = 1, \ldots, d.$$

Using Remark 6.1.2 and defining

$$\varphi_0(t) = M(t) - \varphi(t)'\tilde S(t),$$

$\varphi$ is admissible. Then $X$ is attainable using $\varphi$ as replicating trading strategy (the replicating trading strategy for the undiscounted assets is of course $B(t)\cdot\varphi(t)$), and since $X$ was arbitrary the model is complete (in the restricted sense).

To show (ii) implies (i) we use an argument like that in the first part of the proof of Theorem 4.3.1 to show that in a complete market the mapping $H : \mathcal{P} \to \mathbb{R}$ given by

$$H(\mathbb{Q}) := \mathbb{E}_{\mathbb{Q}}(Y/B(T))$$

is constant for any (integrable) contingent claim $Y$ (see Jacka (1992)). If there are two equivalent martingale measures $\mathbb{Q}_1$ and $\mathbb{Q}_2$ with $\mathbb{Q}_1 \ne \mathbb{Q}_2$, then we must have an event $A \in \mathcal{F}$ with $\mathbb{Q}_1(A) \ne \mathbb{Q}_2(A)$. So we can define a contingent claim $Y = \mathbf{1}_A$ with $\mathbb{E}_{\mathbb{Q}_1}(\mathbf{1}_A/B(T)) \ne \mathbb{E}_{\mathbb{Q}_2}(\mathbf{1}_A/B(T))$, which is a contradiction to the above statement. So we must have a unique martingale measure. $\Box$

6.2.2 Pricing and Hedging Contingent Claims

We assume now that $d = n$, that the coefficients of the model satisfy the integrability conditions, that $\sigma$ is non-singular and that $\gamma(\cdot)$ in (6.5) exists. Then the financial market model admits a unique equivalent martingale measure $\mathbb{P}^*$ with Radon-Nikodym derivative given by the Girsanov transformation

$$L(t) = \exp\Big\{-\int_0^t \gamma(u)'\,dW(u) - \tfrac{1}{2}\int_0^t \|\gamma(u)\|^2\,du\Big\}, \qquad 0 \le t \le T$$

and $\gamma(t) = \sigma^{-1}(t)(b(t) - r(t)\mathbf{1}_d)$. By Theorems 6.2.1 and 6.2.2, the model is free of arbitrage and complete. We call such a model standard. Recall that a contingent claim $X$ is an $\mathcal{F}_T$-measurable random variable such that $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. Using this we get (we write $\mathbb{E}^*$ for $\mathbb{E}_{\mathbb{P}^*}$ in this section):

Theorem 6.2.3 (Risk-neutral Valuation Formula). Let $\mathcal{M}_{BS}$ be a standard multi-dimensional Black-Scholes model and $X$ a contingent claim. The arbitrage price process of $X$ is given by the risk-neutral valuation formula

$$\Pi_X(t) = B(t)\,\mathbb{E}^*\Big[\frac{X}{B(T)}\,\Big|\,\mathcal{F}_t\Big].$$

Proof. Since the model is complete, any such contingent claim is attainable, and the result follows from Theorem 6.1.4. $\Box$

Of course, for practitioners it is of equal importance to find the replicating portfolio (which exists by completeness). The martingale representation theorem guarantees the existence, but the explicit construction is rather involved.

We now look at the special case of the classical Black-Scholes model. By the risk-neutral valuation principle the price of a contingent claim $X$ is given by

$$\Pi_X(t) = e^{-r(T-t)}\,\mathbb{E}^*[X \mid \mathcal{F}_t],$$

with $\mathbb{E}^*$ given via the Girsanov density

$$L(t) = \exp\Big\{-\frac{b - r}{\sigma}W(t) - \frac{1}{2}\Big(\frac{b - r}{\sigma}\Big)^2 t\Big\}.$$
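At $t = 0$ the risk-neutral valuation principle in the classical model reduces to a discounted expectation under $\mathbb{P}^*$, which can be estimated by Monte Carlo. The following is a minimal sketch, assuming NumPy and using the fact that under $\mathbb{P}^*$ the stock has drift $r$ in place of $b$; the function name `mc_price` and all parameter values are illustrative assumptions.

```python
import numpy as np

def mc_price(payoff, s0, r, sigma, maturity, n_paths=400_000, seed=2):
    """Estimate e^{-rT} E*[payoff(S(T))] and its standard error."""
    rng = np.random.default_rng(seed)
    w_tilde = np.sqrt(maturity) * rng.standard_normal(n_paths)  # W~(T) under P*
    # Under P*: dS = S (r dt + sigma dW~), so S(T) is lognormal with drift r
    s_T = s0 * np.exp((r - 0.5 * sigma**2) * maturity + sigma * w_tilde)
    discounted = np.exp(-r * maturity) * payoff(s_T)
    return discounted.mean(), discounted.std(ddof=1) / np.sqrt(n_paths)

strike = 100.0  # illustrative strike (assumption)
call_price, call_se = mc_price(lambda s: np.maximum(s - strike, 0.0),
                               s0=100.0, r=0.05, sigma=0.2, maturity=1.0)
# Sanity check: the claim X = S(T) must cost S(0), since the
# discounted stock price is a P*-martingale.
stock_price, stock_se = mc_price(lambda s: s,
                                 s0=100.0, r=0.05, sigma=0.2, maturity=1.0)

print("call price:", call_price, "+/-", call_se)
print("price of S(T):", stock_price, "+/-", stock_se)
```

Note that the real-world drift $b$ does not enter the simulation at all; it has been absorbed by the change of measure.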

Furthermore, if $X$ is of form $X = \Phi(S(T))$ with a sufficiently integrable function $\Phi$, then the price process is also given by $\Pi_X(t) = F(t, S(t))$, where $F$ solves the Black-Scholes partial differential equation

$$F_t(t,s) + rsF_s(t,s) + \tfrac{1}{2}\sigma^2 s^2 F_{ss}(t,s) - rF(t,s) = 0, \qquad (6.11)$$

$$F(T,s) = \Phi(s). \qquad (6.12)$$

To obtain this partial differential equation we used the Feynman-Kac representation (see §5.7) of the $\mathbb{P}^*$-martingale

$$M(t) = \exp\{-rT\}\,\mathbb{E}^*[\Phi(S(T)) \mid \mathcal{F}_t]; \qquad (6.13)$$

indeed, with $G(t, S(t)) = M(t)$ we know from Theorem 5.7.3 that $G$ satisfies the partial differential equation (recall that $S$ satisfies the $\mathbb{P}^*$-SDE $dS = S(r\,dt + \sigma\,d\tilde W)$)

$$G_t(t,s) + rsG_s(t,s) + \tfrac{1}{2}\sigma^2 s^2 G_{ss}(t,s) = 0$$

with 'initial' (more accurately, final or terminal) condition $G(T,s) = e^{-rT}\Phi(s)$. By the risk-neutral valuation principle $\Pi_X(t) = e^{rt}M(t)$, and so $F(t,s) = e^{rt}G(t,s)$, and computing the partial derivatives of $F$ we obtain the representation.

Specialising further and considering a European call with strike $K$ and maturity $T$ on the stock $S$ (so $\Phi(S(T)) = (S(T) - K)^+$), we can evaluate the above expected value (which is easier than solving the Black-Scholes partial differential equation) and obtain:

Proposition 6.2.1 (Black-Scholes Formula). The Black-Scholes price process of a European call is given by

$$C(t) = S(t)N(d_1(S(t), T - t)) - Ke^{-r(T-t)}N(d_2(S(t), T - t)). \qquad (6.14)$$

The functions $d_1(s,t)$ and $d_2(s,t)$ are given by

$$d_1(s,t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}},$$

$$d_2(s,t) = d_1(s,t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}.$$
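Formula (6.14) translates directly into code. The following is a minimal sketch using only the Python standard library; the function names and the numerical inputs are illustrative assumptions, and the second argument of `d1`, `d2` and `call_price` is the time to maturity $T - t$, as in the text.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def d1(s, t, strike, r, sigma):
    return (log(s / strike) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))

def d2(s, t, strike, r, sigma):
    return d1(s, t, strike, r, sigma) - sigma * sqrt(t)

def call_price(s, t, strike, r, sigma):
    """Black-Scholes call price for stock price s and time to maturity t > 0."""
    return (s * norm_cdf(d1(s, t, strike, r, sigma))
            - strike * exp(-r * t) * norm_cdf(d2(s, t, strike, r, sigma)))

# Illustrative inputs (assumptions): at-the-money call, one year to maturity
price = call_price(s=100.0, t=1.0, strike=100.0, r=0.05, sigma=0.2)
print("call price:", price)

# The price always dominates the no-arbitrage lower bound s - K e^{-rt}
print("lower bound:", 100.0 - 100.0 * exp(-0.05))
```

The formula requires $t > 0$; at maturity the price is simply the payoff $(s - K)^+$, which the sketch does not handle.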

Observe that we have already deduced this formula as a limit of a discrete-time setting in §4.6.

To obtain a replicating portfolio we use Itô's lemma to find the dynamics of the $\mathbb{P}^*$-martingale $M(t) = G(t, S(t))$:

$$dM(t) = \sigma S(t)G_s(t, S(t))\,d\tilde W(t).$$

Using this representation, we get in terms of the notation in the Black-Scholes model

$$h(t) = \sigma S(t)G_s(t, S(t)),$$

which gives for the stock component of the replicating portfolio using the discounted assets

$$\varphi_1(t) = G_s(t, S(t))B(t),$$

and using the self-financing condition the cash component is

$$\varphi_0(t) = G(t, S(t)) - G_s(t, S(t))S(t).$$

To transfer this portfolio to undiscounted values we multiply it by the discount factor, i.e. $F(t, S(t)) = B(t)G(t, S(t))$, and get:

Proposition 6.2.2. The replicating strategy in the classical Black-Scholes model is given by

$$\varphi_0 = \frac{F(t, S(t)) - F_s(t, S(t))S(t)}{B(t)}, \qquad \varphi_1 = F_s(t, S(t)).$$
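The replicating strategy can be tested by simulation: hold $\varphi_1 = F_s$ shares, keep the rest of the wealth in the bank account, and rebalance on a discrete grid. The following is a minimal sketch for a European call, assuming NumPy; discrete rebalancing is only an approximation to the continuous-time strategy, and the function names and parameter values (including the real-world drift `b`) are illustrative assumptions.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(x):
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))

def call_price_and_delta(s, tau, strike, r, sigma):
    """Black-Scholes call price F and its derivative F_s for time to maturity tau."""
    d1 = (np.log(s / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    n_d1 = norm_cdf(d1)
    return s * n_d1 - strike * np.exp(-r * tau) * norm_cdf(d2), n_d1

def hedging_errors(s0=100.0, strike=100.0, b=0.10, r=0.05, sigma=0.2,
                   maturity=1.0, n_steps=250, n_paths=2000, seed=3):
    """Terminal wealth of the discretely rebalanced hedge minus the call payoff."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    s = np.full(n_paths, s0)
    # Start with the Black-Scholes price as initial capital
    wealth, delta = call_price_and_delta(s, maturity, strike, r, sigma)
    for k in range(1, n_steps + 1):
        cash = wealth - delta * s                      # amount in the bank account
        dw = np.sqrt(dt) * rng.standard_normal(n_paths)
        s = s * np.exp((b - 0.5 * sigma**2) * dt + sigma * dw)  # real-world dynamics
        wealth = delta * s + cash * np.exp(r * dt)     # self-financing update
        if k < n_steps:
            _, delta = call_price_and_delta(s, maturity - k * dt, strike, r, sigma)
    return wealth - np.maximum(s - strike, 0.0)

errors = hedging_errors()
print("mean hedging error:", errors.mean())
print("std of hedging error:", errors.std())
```

The paths are simulated under the original measure with drift $b$, yet the hedge started from the risk-neutral price still reproduces the payoff up to a discretisation error that shrinks as the number of rebalancing dates grows.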

In their original paper, Black and Scholes (1973) used an arbitrage pricing approach (rather than our risk-neutral valuation approach) to deduce the price of a European call as the solution of a partial differential equation (we call this the PDE approach). The idea is as follows: start by assuming that the option price $C(t)$ is given by $C(t) = f(t, S(t))$ for some sufficiently smooth function $f : \mathbb{R}_+ \times [0,T] \to \mathbb{R}$. By Itô's formula (Theorem 5.6.2) we find for the dynamics of the option price process (observe that we work under $\mathbb{P}$ so $dS = S(b\,dt + \sigma\,dW)$)

$$dC = \Big\{f_t(t,S) + f_s(t,S)Sb + \tfrac{1}{2}f_{ss}(t,S)S^2\sigma^2\Big\}\,dt + f_s S\sigma\,dW. \qquad (6.15)$$

Consider a portfolio $\psi$ consisting of a short position in $\psi_1(t) = f_s(t, S(t))$ stocks and a long position in $\psi_2(t) = 1$ call and assume the portfolio is self-financing. Then its value process is

$$V_\psi(t) = -\psi_1(t)S(t) + C(t),$$

and by the self-financing condition we have (to ease the notation we omit the arguments)

$$dV_\psi = -\psi_1\,dS + dC = -f_s(Sb\,dt + S\sigma\,dW) + \big(f_t + f_sSb + \tfrac{1}{2}f_{ss}S^2\sigma^2\big)\,dt + f_sS\sigma\,dW = \big(f_t + \tfrac{1}{2}f_{ss}S^2\sigma^2\big)\,dt.$$

So the dynamics of the value process of the portfolio do not have any exposure to the driving Brownian motion, and its appreciation rate in an arbitrage-free world must therefore equal the risk-free rate (for a mathematically precise formulation of this equality we refer the reader to Musiela and Rutkowski (1997), §5.2), i.e.

$$dV_\psi(t) = rV_\psi(t)\,dt = (-rf_sS + rC)\,dt.$$

Comparing the coefficients and using $C(t) = f(t, S(t))$, we must have

$$-rSf_s + rf = f_t + \tfrac{1}{2}\sigma^2S^2f_{ss}.$$

This leads to the Black-Scholes partial differential equation for $f$ (6.11), i.e.

$$f_t + rsf_s + \tfrac{1}{2}\sigma^2s^2f_{ss} - rf = 0.$$

Since $C(T) = (S(T) - K)^+$ we need to impose the terminal condition $f(s,T) = (s - K)^+$ for all $s \in \mathbb{R}_+$.

Note. One point in the justification of the above argument is missing: we have to show that the trading strategy short $\psi_1$ stocks and long one call is self-financing. In fact, this is not true, since $\psi_1 = \psi_1(t, S(t))$ is dependent on the stock price process. Formally, for the self-financing condition to be true we must have

$$dV_\psi(t) = -d(\psi_1(t)S(t)) + dC(t) = -\psi_1(t)\,dS(t) + dC(t).$$

Now $\psi_1(t) = \psi_1(t, S(t))$ depends on the stock price and so we have

$$d(\psi_1(t, S(t))S(t)) = \psi_1(t)\,dS(t) + S(t)\,d\psi_1(t, S(t)) + d\langle\psi_1, S\rangle(t).$$

We see that the portfolio $\psi$ is self-financing, if

$$S(t)\,d\psi_1(t, S(t)) + d\langle\psi_1, S\rangle(t) = 0.$$

It is an exercise in Itô calculus to show that this is not the case (see Exercise 6.1).



6.2.3 The Greeks

We will now analyse the impact of the underlying parameters in the standard Black-Scholes model on the prices of call and put options. The Black-Scholes option values depend on the (current) stock price, the volatility, the time to maturity, the interest rate and the strike price. The sensitivities of the option price with respect to the first four parameters are called the Greeks and are widely used for hedging purposes. We can determine the impact of these parameters by taking partial derivatives. Recall the Black-Scholes formula for a European call (6.14):

$$C(0) = C(S, T, K, r, \sigma) = SN(d_1(S,T)) - Ke^{-rT}N(d_2(S,T)),$$

with the functions $d_1(s,t)$ and $d_2(s,t)$ given by

$$d_1(s,t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}, \qquad d_2(s,t) = d_1(s,t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}.$$

One obtains

$$\Delta := \frac{\partial C}{\partial S} = N(d_1) > 0,$$

$$\mathcal{V} := \frac{\partial C}{\partial \sigma} = S\sqrt{T}\,n(d_1) > 0,$$

$$\Theta := \frac{\partial C}{\partial T} = \frac{S\sigma}{2\sqrt{T}}\,n(d_1) + Kre^{-rT}N(d_2) > 0,$$

$$\rho := \frac{\partial C}{\partial r} = TKe^{-rT}N(d_2) > 0,$$

$$\Gamma := \frac{\partial^2 C}{\partial S^2} = \frac{n(d_1)}{S\sigma\sqrt{T}} > 0.$$

(As usual $N$ is the cumulative normal distribution function and $n$ is its density.) From the definitions it is clear that $\Delta$ - delta - measures the change in the value of the option compared with the change in the value of the underlying asset, $\mathcal{V}$ - vega - measures the change of the option compared with the change in the volatility of the underlying, and similar statements hold for $\Theta$ - theta - and $\rho$ - rho (observe that these derivatives are in line with our arbitrage-based considerations in §1.3). Furthermore, $\Delta$ gives the number of shares in the replication portfolio for a call option (see Proposition 6.2.2), so $\Gamma$ measures the sensitivity of our portfolio to the change in the stock price.
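The closed-form Greeks can be cross-checked against finite differences of the call price. The following is a minimal sketch using only the Python standard library; theta is taken with respect to the time to maturity $T$, as in the text, and the function names, bump sizes and parameter values are illustrative assumptions.

```python
from math import erf, exp, log, pi, sqrt

def N(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def n(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def call(S, T, K, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def greeks(S, T, K, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return {
        "delta": N(d1),
        "vega": S * sqrt(T) * n(d1),
        "theta": S * sigma * n(d1) / (2.0 * sqrt(T)) + K * r * exp(-r * T) * N(d2),
        "rho": T * K * exp(-r * T) * N(d2),
        "gamma": n(d1) / (S * sigma * sqrt(T)),
    }

S, T, K, r, sigma = 100.0, 1.0, 100.0, 0.05, 0.2   # illustrative values
g = greeks(S, T, K, r, sigma)

# Central finite differences of the price in each argument
h = 1e-4
fd = {
    "delta": (call(S + h, T, K, r, sigma) - call(S - h, T, K, r, sigma)) / (2 * h),
    "vega": (call(S, T, K, r, sigma + h) - call(S, T, K, r, sigma - h)) / (2 * h),
    "theta": (call(S, T + h, K, r, sigma) - call(S, T - h, K, r, sigma)) / (2 * h),
    "rho": (call(S, T, K, r + h, sigma) - call(S, T, K, r - h, sigma)) / (2 * h),
    "gamma": (call(S + 1.0, T, K, r, sigma) - 2 * call(S, T, K, r, sigma)
              + call(S - 1.0, T, K, r, sigma)) / 1.0**2,
}
for name in g:
    print(name, g[name], fd[name])
```

A larger bump is used for gamma because a second difference with a very small step is dominated by rounding error.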


The Black-Scholes partial differential equation (6.11) can be used to obtain the relation between the Greeks, i.e. (observe that $\Theta$ is the derivative of $C$, the price of a European call, with respect to the time to expiry $T - t$, while in the Black-Scholes PDE the partial derivative with respect to the current time $t$ appears)

$$rC = \tfrac{1}{2}s^2\sigma^2\Gamma + rs\Delta - \Theta.$$

Let us now compute the dynamics of the call option's price $C(t)$ under the risk-neutral martingale measure $\mathbb{P}^*$. Using formula (6.15) we find

$$dC(t) = rC(t)\,dt + \sigma N(d_1(S(t), T - t))S(t)\,d\tilde W(t).$$

Defining the elasticity coefficient of the option's price as

$$\eta^C(t) = \frac{\Delta(S(t), T - t)S(t)}{C(t)} = \frac{N(d_1(S(t), T - t))S(t)}{C(t)}$$

we can rewrite the dynamics as

$$dC(t) = rC(t)\,dt + \sigma\eta^C(t)C(t)\,d\tilde W(t).$$

So, as expected in the risk-neutral world, the appreciation rate of the call option equals the risk-free rate $r$. The volatility coefficient is $\sigma\eta^C$, and hence stochastic. It is precisely this feature that causes difficulties when assessing the impact of options in a portfolio.

6.2.4 Volatility

One of the main issues raised by the Black-Scholes formula is the question of modelling the volatility $\sigma$ (for a discussion of further sensitivities in using the Black-Scholes model see Dana and Jeanblanc (2002)). Before we can implement the Black-Scholes formula to price options, we have to estimate $\sigma$.

Because the formula is explicit, we can, as noted in §6.2.3, determine the vega, i.e. the partial derivative $\mathcal{V} = \partial C/\partial\sigma$, finding

$$\mathcal{V} = S\sqrt{T}\,n(d_1).$$

The important thing to note here is that vega is always positive. This mathematical consequence of the explicit Black-Scholes formula is reassuring, as it is evident on financial grounds: increasing volatility does not make us more likely to 'win' (possess an option in the money at expiry), but does mean that when we win, we 'win bigger', hence that our option - the right to make this one-way bet - is worth more.

Next, since vega is positive, $C$ is a continuous - indeed, differentiable - strictly increasing function of $\sigma$. Turning this round, $\sigma$ is a continuous (differentiable) strictly increasing function of $C$; indeed,

$$\mathcal{V} = \frac{\partial C}{\partial\sigma}, \qquad \text{so} \qquad \frac{1}{\mathcal{V}} = \frac{\partial\sigma}{\partial C}.$$

Informally, one can get the graph of $\sigma$ against $C$ from that of $C$ against $\sigma$ - thought of as drawn on tracing paper - by turning the paper over and rotating through a right angle. Formally, one needs explicit proof of the relevant result on inverse functions, for which see a text on analysis (e.g. Burkill (1962), §3.9). Thus the value $\sigma = \sigma(C)$ corresponding to the actual value $C = C(\sigma)$ at which call options are observed to be traded in the market can be read off. The value of $\sigma$ obtained in this way is called the implied volatility.

First, note that the accuracy with which the (implied) volatility $\sigma$ can be inferred from the observed price $C$ depends on the vega $\mathcal{V}$, or rather its reciprocal $1/\mathcal{V}$, by the formulae

$$\delta C \sim \mathcal{V}\,\delta\sigma, \qquad \delta\sigma \sim \mathcal{V}^{-1}\,\delta C;$$

if vega is small, i.e. $\mathcal{V}^{-1}$ is large, small errors $\delta C$ in call-option prices are magnified into large errors $\delta\sigma$ in the implied volatility, and one loses accuracy. But when vega is large, one gains accuracy: since implied volatility depends on sensitivity analysis, one expects it to work best when prices are sensitive to $\sigma$!

Next, note that determination of implied volatility in this way is well adapted to the familiar and classical Newton-Raphson iteration procedure of numerical analysis. If the transcendental equation $C = C(\sigma)$ is to be solved for $\sigma$, and $\sigma_n$ is the current approximation, then the next is

$$\sigma_{n+1} := \sigma_n - \frac{C(\sigma_n) - C}{C'(\sigma_n)} = \sigma_n - \frac{C(\sigma_n) - C}{\mathcal{V}(\sigma_n)}.$$

For background, see a text on numerical analysis, e.g. Jacques and Judd (1987).

But the most important thing to note about implied volatility is that here one is at the mercy of the model. One of the great merits of the Black-Scholes analysis is that it uses a simple, parametric model and gives explicit formulae in its conclusions.
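The Newton-Raphson iteration for implied volatility described above can be sketched as follows, using only the Python standard library. The starting value, tolerance, iteration cap and the guards against a vanishing vega or a non-positive iterate are illustrative assumptions, not part of the text.

```python
from math import erf, exp, log, pi, sqrt

def _d1(s, t, strike, r, sigma):
    return (log(s / strike) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))

def call_price(s, t, strike, r, sigma):
    d1 = _d1(s, t, strike, r, sigma)
    d2 = d1 - sigma * sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return s * cdf(d1) - strike * exp(-r * t) * cdf(d2)

def vega(s, t, strike, r, sigma):
    d1 = _d1(s, t, strike, r, sigma)
    return s * sqrt(t) * exp(-0.5 * d1 * d1) / sqrt(2.0 * pi)

def implied_volatility(price, s, t, strike, r, sigma0=0.3, tol=1e-10, max_iter=50):
    """Solve call_price(sigma) = price for sigma by Newton-Raphson."""
    sigma = sigma0
    for _ in range(max_iter):
        diff = call_price(s, t, strike, r, sigma) - price
        if abs(diff) < tol:
            return sigma
        v = vega(s, t, strike, r, sigma)
        if v < 1e-12:
            raise ValueError("vega too small: implied volatility is ill-determined")
        sigma -= diff / v                  # sigma_{n+1} = sigma_n - (C(sigma_n) - C) / vega
        if sigma <= 0.0:
            raise ValueError("iterate left the admissible region sigma > 0")
    raise RuntimeError("Newton-Raphson did not converge")

# Round trip: price a call at a known volatility, then recover that volatility
observed = call_price(100.0, 1.0, 100.0, 0.05, 0.2)
print("implied volatility:", implied_volatility(observed, 100.0, 1.0, 100.0, 0.05))
```

The small-vega guard reflects the accuracy remark above: where vega is tiny, the Newton step divides by a near-zero number and the implied volatility is poorly determined by the price.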
All parametric models - in statistics, finance or any other field - are wrong in the sense that they represent at best idealised or simpli fied descriptions of reality, and yield correspondingly idealised or simplified conclusions. One cannot expect the geometric Brownian motion model un derlying the Black-Scholes analysis to be any more than a rough description of reality. Thus, one cannot expect the above derivation of implied volatil ity, which depends explicitly on the details of this model, to give anything =

/'oJ

/'oJ

=

6. 2 The Generalized Black-Scholes Model

257

more than a rough description of actual volatility of actual prices (see Shaw (1998), Chapter 1, for the pitfalls arising when using implied volatility in the context of exotic options).

Since volatility is so important - just as the Black-Scholes analysis is so important - it has been studied in great detail, and much empirical knowledge about implied volatility has been built up in a range of empirical studies. Now volatility is a parameter of the model, and so should be constant - it represents a property of the particular underlying stock in question. However, volatility is observed to vary, most notably with the underlying stock price $S$ - or equivalently, with the strike price $K$. In particular, for values of $S$ much higher than the strike price $K$ - that is, for options deeply in the money - the volatility is higher. The resulting upward curve in the graph of $\sigma$ against $K$ is called the volatility smile. Since volatility is observed in practice to vary with the price of the underlying, which is random - a stochastic process - an obvious way to refine the model is to accept that the volatility is itself random and treat it accordingly. Such stochastic volatility models (on which we touched in §5.10) are undoubtedly important, but as they are much more complicated than their counterparts with deterministic volatility, we postpone a treatment of them in the context of incomplete market models to §7.2. For references to the recent literature, see e.g. Campbell, Lo, and MacKinlay (1997), §§9.3.6, 12.2.1, 12.2.2, Frey (1997) and Hobson (1998).

Returning to the Black-Scholes model with its geometric Brownian motion dynamics, the other obvious way to estimate volatility is to use, not the option price observed now, but the underlying asset price observed over time. The resulting volatility estimate is called the historic volatility. To estimate it, given data available in practice - finitely many values of the stock price at finitely many time points $t_1 < \dots < t_n$ - one may follow the usual statistical procedure for parameter estimation, the method of maximum likelihood (see §5.9 and e.g. Rao (1973)). One can avoid explicit use of a parametric model as simple as geometric Brownian motion: motivated by the important problem of volatility estimation, there has been an upsurge of recent interest in estimation of diffusion coefficients for diffusion processes. For details, background and references, see e.g. Dohnal (1987), Florens-Zmirou (1989) and Genon-Catalot and Jacod (1994).

We now have two types of volatility estimate: implied and historic. Were the model exact, these two would agree (approximately, to allow for sampling error), as they would simply be different ways of measuring the same thing. However, empirical studies reveal more disagreement between implied and historic volatility estimates than can be accounted for merely by sampling error - which of course reflects, in another guise, the deficiencies of the underlying model - geometric Brownian motion - in the Black-Scholes analysis.

One obvious way to seek to avoid the limitations of this - or any other - parametric model is to avoid choice of a parametric model altogether and to resort instead to the methods of non-parametric statistics. Non-parametric volatility estimates have been studied recently, but to discuss them in detail here would take us too far afield. We refer for details to Campbell, Lo, and MacKinlay (1997), §12.3.4, Ghysels et al. (1998), Fouque, Papanicolaou, and Sircar (2000).
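As a small illustration of the historic volatility estimate discussed above, the following sketch computes the maximum-likelihood estimate of $\sigma$ from equally spaced prices, assuming geometric Brownian motion (so that log-returns are i.i.d. normal). The function name `historic_volatility`, the price series and the daily spacing `dt = 1/252` are our own illustrative assumptions, not data from the text.

```python
import math

def historic_volatility(prices, dt):
    """Maximum-likelihood estimate of sigma from equally spaced prices.

    Assumes geometric Brownian motion, so the log-returns over a step of
    length dt are i.i.d. normal with variance sigma**2 * dt.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(returns)
    mean = sum(returns) / n
    # The MLE of a normal variance divides by n (not n - 1).
    var = sum((x - mean) ** 2 for x in returns) / n
    return math.sqrt(var / dt)

# Hypothetical daily prices whose log-returns alternate between +1% and -1%.
prices = [100.0 * math.exp(0.01 * (k % 2)) for k in range(5)]
sigma_hat = historic_volatility(prices, dt=1.0 / 252.0)
print(round(sigma_hat, 4))
```

Dividing by $n$ rather than $n - 1$ is what makes this the maximum-likelihood estimate; for realistic sample sizes the difference is negligible next to the sampling error mentioned above.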

6.3 Further Contingent Claim Valuation

We generally consider a Black-Scholes setting.

6.3.1 American Options

American Calls. As in discrete time (§1.3.3), these are equivalent to European calls - there is no advantage in exercise before expiry. One can also handle extensions, including (i) the perpetual call ($T = \infty$), which goes back to McKean in 1965; (ii) options on stocks paying dividends. For a full account, we refer to Karatzas and Shreve (1998), §2.6.

American Puts. We saw in §4.8, in discrete time, the link between American options and the Snell envelope. We also saw, in the binomial model, how to price American options by backward induction (dynamic programming), and how the stopping and continuation regions $\mathcal{S}$ and $\mathcal{C}$ - the parts of the tree where it is optimal to exercise now and where it is optimal to hold for exercise later - emerge as a corollary of this procedure. We now consider American options (puts, as above) in more detail, following the excellent survey by Myneni (1992) and the textbook account Karatzas and Shreve (1998), to which we refer for proofs, background and further references. We content ourselves with formulating and discussing the main results. We stress that, as American options are much harder to handle than European ones, the available results are much less explicit than in the European case. In particular, explicit pricing formulae - and explicit descriptions of the regions $\mathcal{S}$, $\mathcal{C}$ or the optimal stopping boundary $S^*$ separating the two - are not known, and are probably inaccessible in general.

Early-exercise Decomposition. We use our standard approach, risk-neutral valuation. Writing $\mathcal{T}_{t,T}$ for the set of stopping times $\tau$ taking values in $[t, T]$, the optimal stopping problem is to maximise the risk-neutral expectation of our discounted payoff $(K - S(\tau))^+$ over all stopping times $\tau \in \mathcal{T}_{t,T}$. Write

$$ J(t) := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{t,T}} \mathbb{E}\left[ e^{-r\tau} (K - S(\tau))^+ \mid \mathcal{F}_t \right] $$

for this optimal expected payoff over all stopping times (optimal expected gain on stopping without foreknowledge of the future). Then, as in discrete time, $J(t)$ is the Snell envelope (least supermartingale majorant of the discounted payoff process). When the current stock price at time $t$ is $x$, write

$$ P(x, t) := \sup\left\{ \mathbb{E}_x\left[ e^{-r(\tau - t)} (K - S(\tau))^+ \right] : \tau \in \mathcal{T}_{t,T} \right\} $$

for the value function of the option (so $P(x, t) =: P(t) = e^{rt} J(t)$, using that $S(t) = x$ contains as much relevant information as $\mathcal{F}_t$). Then $P(x, t) \ge (K - x)^+$, with equality if and only if stopping now is optimal:

$$ \mathcal{S} = \{ (x, t) \in \mathbb{R}_+ \times [0, T] : P(x, t) = (K - x)^+ \}, $$

$$ \mathcal{C} = \{ (x, t) \in \mathbb{R}_+ \times [0, T] : P(x, t) > (K - x)^+ \}. $$

Then $P$ is continuous (Myneni (1992), Prop. 3.1), whence the stopping region $\mathcal{S}$ is closed and its complement, the continuation region $\mathcal{C}$, is open. Also $\mathcal{S}(t) := \{x : (x, t) \in \mathcal{S}\}$ and $\mathcal{C}(t) := \{x : (x, t) \in \mathcal{C}\}$ are intervals, the graph of $S^*(t) := \sup\{x : x \in \mathcal{S}(t)\}$ is contained in $\mathcal{S}$, and for each $t$, $S^*(t)$ gives the price level at or below which exercising now is optimal (as the option is a put - giving us the right to sell at price $K$ - the optimal exercise region will clearly be of this form).

The Snell envelope, being a supermartingale, may be (a.s.) decomposed uniquely into a martingale part and a potential part (the Riesz decomposition; see below), as follows:

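$$ J(t) = \mathbb{E}\left[ e^{-rT} (K - S(T))^+ \mid \mathcal{F}_t \right] + \mathbb{E}\left[ \int_t^T e^{-ru}\, rK\, \mathbf{1}_{\{S(u) < S^*(u)\}}\, du \,\Big|\, \mathcal{F}_t \right] $$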

(Myneni (1992), Theorem 3.2; the result is due to El Karoui and Karatzas). Because $P(t) = e^{rt} J(t)$, there is an analogous decomposition of the value function, $P(x, t) = p(x, t) + e(x, t)$, where $p(x, t) = \mathbb{E}_x[e^{-r(T - t)} (K - S(T))^+]$ is the price of the corresponding European option, which can be thought of as the intrinsic value of the American option - what it is worth if the possibility of early exercise is removed. Accordingly, the second term

$$ e(x, t) = \mathbb{E}_x\left[ \int_t^T e^{-r(u - t)}\, rK\, \mathbf{1}_{\{S(u) < S^*(u)\}}\, du \right] $$

is the early-exercise premium - the extra value conferred by the right to exercise early. It is proportional to the strike price $K$, as it should be; it also shows very clearly the role of the indicator process $\mathbf{1}_{\{S(t) < S^*(t)\}}$, which tells us whether or not exercise now is optimal. Because of this, the result is often called the early-exercise decomposition. The proof involves the Itô-Tanaka extension of Itô's formula (and so local time), dual predictable projections (compensators), and the Doob-Meyer decomposition in continuous time. For background on the Riesz decomposition - the probabilistic counterpart of the classical decomposition of F. Riesz of 1930 of a superharmonic function into a harmonic function and a potential - see Doob (1984) and Neveu (1975).

Extensions. The theory may be extended in various ways, including perpetual options, options on dividend-paying stocks, restrictions on borrowing (of cash), restrictions on short selling (of stocks), higher interest rates for borrowing than for lending etc. For a detailed account, see Karatzas and Kou (1998).

Convergence of American Options. Given a sequence of discrete-time financial models approximating a continuous-time model, the weak convergence theory of §6.4 below ensures that the price of, say, a European option in the discrete setting converges to that of its continuous-time counterpart. This argument does not, however, apply to the more complicated setting of American options, as here the early exercise possibility introduces a control to which standard weak-convergence theory does not apply. Nevertheless, under mild conditions the prices of American options in the discrete-time setting do indeed converge to their continuous-time counterparts. For details, see Amin and Khanna (1994). By appropriate choice of topology, weak convergence theory can be extended to American options, giving convergence results for price processes, their Snell envelopes and hedging strategies. For formulation and proof of results of this type, see Lamberton and Pagès (1990).
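To make the backward-induction pricing of the American put and the early-exercise premium concrete, here is a minimal sketch on a Cox-Ross-Rubinstein binomial tree, compared with the Black-Scholes price of the European put. The function names, the choice of the CRR parametrisation and the parameter values ($S_0 = K = 100$, $r = 0.05$, $\sigma = 0.2$, $T = 1$, 200 steps) are illustrative assumptions on our part, not taken from the references above.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def european_put_bs(s0, k, r, sigma, t):
    """Black-Scholes price of a European put."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s0 * norm_cdf(-d1)

def american_put_crr(s0, k, r, sigma, t, n):
    """American put by backward induction on a CRR binomial tree."""
    dt = t / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up-probability
    # Payoffs at expiry; node j has seen j up-moves and n - j down-moves.
    values = [max(k - s0 * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]
    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = k - s0 * u ** j * d ** (step - j)
            # Snell-envelope recursion: the larger of stopping and continuing.
            values[j] = max(cont, exercise)
    return values[0]

american = american_put_crr(100.0, 100.0, 0.05, 0.2, 1.0, 200)
european = european_put_bs(100.0, 100.0, 0.05, 0.2, 1.0)
premium = american - european  # numerical early-exercise premium
print(round(american, 3), round(european, 3), round(premium, 3))
```

The nodes at which `exercise` exceeds `cont` form the discrete analogue of the stopping region, and the difference `premium` is a numerical counterpart of the early-exercise premium $e(x, t)$.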

6.3.2 Asian Options

Here the payoff is a function of the average price of the underlying between contract time and expiry time. Asian options are widely used in practice - for instance, for oil and foreign currencies. The averaging complicates the mathematics, but, e.g., protects the holder against speculative attempts to manipulate the asset price near expiry. For details and references, see e.g. Geman and Yor (1993), and Rogers and Shi (1995).


The Rogers-Shi Lower Bound. Pricing options typically involves calculating an expectation of the form $\mathbb{E}(X^+)$, where $X$ is a random variable such as $S - K$ (European call), $K - S$ (European put) or $Y - K$ with $Y := \int_0^T S\, d\mu$ (Asian option, as above). Rogers and Shi (1995), §3, point out a general method yielding a lower bound, which may be surprisingly accurate. Splitting $X$ into its positive and negative parts $X^+$ and $X^-$ ($X = X^+ - X^-$, $|X| = X^+ + X^-$), we have $X^+ \ge X$, whence

$$ \mathbb{E}(X^+ \mid Z) \ge \mathbb{E}(X \mid Z), $$

and also $\mathbb{E}(X^+ \mid Z) \ge 0$, so

$$ \mathbb{E}(X^+ \mid Z) \ge \left( \mathbb{E}(X \mid Z) \right)^+. $$

Thus using iterated conditional expectations and the above,

$$ \mathbb{E}(X^+) = \mathbb{E}\left[ \mathbb{E}(X^+ \mid Z) \right] \ge \mathbb{E}\left[ \left( \mathbb{E}(X \mid Z) \right)^+ \right]. $$

The accuracy of this lower bound may be estimated from the quite general bounds

$$ 0 \le \mathbb{E}\left[ \mathbb{E}(X^+ \mid Z) - \left( \mathbb{E}(X \mid Z) \right)^+ \right] \le \tfrac{1}{2}\, \mathbb{E}\left[ \sqrt{\operatorname{var}(X \mid Z)} \right]. $$

The success of the method hinges on finding a suitable choice of conditioning variable $Z$ for which $\mathbb{E}(X \mid Z)$, and hence the expectation of its positive part, is easier to handle than $\mathbb{E}(X^+)$ but the two are still close. In the case of a fixed-strike Asian option considered above, a suitable choice is given by

$$ Z := \int_0^T W(u)\, du $$

(a zero-mean Gaussian variable). For the details (which are quite complicated), and numerical illustration, we refer to Rogers and Shi (1995).

The Geman-Yor Method. We now outline the approach to Asian options of Geman and Yor (Geman and Yor (1993); Yor (1992a), Yor (1992b), Chapter 6) in terms of Bessel processes and Laplace transforms. For $\delta \ge 0$ and $W$ a standard BM($\mathbb{R}$), consider the SDE

$$ d\rho(t) = \delta\, dt + 2\sqrt{\rho(t)}\, dW(t), \qquad \rho(0) = a \ge 0. $$

This SDE has a solution $\rho = (\rho(t))_{t \ge 0}$, or $\rho^\nu$, called a Bessel-squared process with index $\nu := \frac{1}{2}\delta - 1$, $\mathrm{BESQ}^\nu$. As for the name: the square root $R^\nu$ of a Bessel-squared process $\rho^\nu$ is a Bessel process $\mathrm{BES}^\nu$, a diffusion whose generator involves the Bessel operator giving rise to the classical Bessel functions (see Watson (1944)). The semigroup of BESQ and the transition probability of BES involve the Bessel function $I_\nu$ of imaginary argument. The link with mathematical finance arises because the exponential of a drifting Brownian motion (the price process in the lognormal or geometric Brownian motion model of Black-Scholes theory) is a time-changed Bessel process. Specifically (Geman and Yor (1993), Prop. 2.3):

$$ \exp(W(t) + \nu t) = R^\nu\left( \int_0^t \exp\{2(W(s) + \nu s)\}\, ds \right). $$

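For the fixed-strike Asian option, the payoff is

$$ (A(T) - K)^+, \qquad A(x) := \frac{1}{x - t_0} \int_{t_0}^{x} S(u)\, du, $$

and risk-neutral valuation gives the price at time $t$ as (writing $\mathbb{E}$ for the risk-neutral expectation)

$$ C_{t,T}(K) = e^{-r(T - t)}\, \mathbb{E}\left[ (A(T) - K)^+ \mid \mathcal{F}_t \right]. $$

Define

$$ C^{(\nu)}(x, q) := \mathbb{E}\left[ \left( \int_0^x \exp\{2(W(u) + \nu u)\}\, du - q \right)^+ \right]. $$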

Use of the Markov (indeed, independent increments) property of Brownian motion at time $t$ and Brownian scaling leads to

$$ C_{t,T}(K) = \frac{e^{-r(T - t)}}{T - t_0}\, \frac{4 S(t)}{\sigma^2}\, C^{(\nu)}(h, q), $$

where

$$ \nu := \frac{2r}{\sigma^2} - 1, \qquad h := \frac{\sigma^2}{4}(T - t), \qquad q := \frac{\sigma^2}{4 S(t)} \left( K(T - t_0) - \int_{t_0}^{t} S(u)\, du \right). $$

Thus $q$ is a random variable whose value is known at time $t$. Negative values of $q$ reflect high stock prices over the time already expired. Geman and Yor (1993), (3.1), obtain the following pricing formula (which applies only when $q \le 0$):

$$ C_{t,T}(K) = S(t)\, \frac{1 - e^{-r(T - t)}}{r(T - t_0)} - e^{-r(T - t)} \left( K - \frac{1}{T - t_0} \int_{t_0}^{t} S(u)\, du \right). $$
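The case $q \le 0$ can be checked by elementary means: the option is then certain to finish in the money, so its price is the discounted risk-neutral expectation of $A(T) - K$, and $\mathbb{E}[\int_t^T S(u)\,du \mid \mathcal{F}_t] = S(t)(e^{r(T-t)} - 1)/r$. The sketch below evaluates the closed form for $q \le 0$ and this direct computation and compares them; the function names and all numerical inputs are hypothetical, and $r \ne 0$ is assumed.

```python
import math

def asian_call_q_nonpositive(s_t, k, r, t0, t, big_t, integral_to_date):
    """Fixed-strike Asian call price at time t in the case q <= 0.

    integral_to_date is the integral of S(u) over [t0, t]; r must be non-zero.
    """
    if k * (big_t - t0) - integral_to_date > 0.0:
        raise ValueError("closed form only applies when q <= 0")
    disc = math.exp(-r * (big_t - t))
    return (s_t * (1.0 - disc) / (r * (big_t - t0))
            - disc * (k - integral_to_date / (big_t - t0)))

def discounted_expected_payoff(s_t, k, r, t0, t, big_t, integral_to_date):
    """Direct check: discounted risk-neutral expectation of A(T) - K."""
    tau = big_t - t
    # Risk-neutral expectation of the integral of S over [t, T], given S(t).
    expected_future_integral = s_t * (math.exp(r * tau) - 1.0) / r
    expected_average = (integral_to_date + expected_future_integral) / (big_t - t0)
    return math.exp(-r * tau) * (expected_average - k)

# Hypothetical inputs: halfway through a one-year contract, average to date 120.
args = dict(s_t=100.0, k=50.0, r=0.05, t0=0.0, t=0.5, big_t=1.0,
            integral_to_date=60.0)
closed_form = asian_call_q_nonpositive(**args)
direct = discounted_expected_payoff(**args)
print(round(closed_form, 4), round(direct, 4))
```

Note that neither function takes the volatility as an argument, in line with the remark that the formula involves $\sigma$ only implicitly.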

This explicit pricing formula - the Geman-Yor formula - somewhat resembles the classic Black-Scholes formula in structure. It does not involve the volatility $\sigma$ explicitly (only implicitly, via the stock price $S(t)$ and its integral to date, $\int_{t_0}^{t} S(u)\, du$).

From the Geman-Yor formula, one may make comparisons between Asian option prices and those of their European counterparts. Geman and Yor showed that:
(i) for $\nu + 1 \ge 0$ (that is, for $r \ge 0$) the Asian option price is less than that of its European counterpart, for any strike price $K$;
(ii) for $\nu + 1 < 0$ (that is, for $r < 0$) the reverse is true, at least for $K$ close enough to zero.
Note that for options on domestic stock the risk-neutral drift $r$ is the domestic spot rate, which will be positive, so case (i) applies. For options on foreign stock, or currency options, the risk-neutral drift is the difference of the foreign and domestic spot rates, and can have either sign. In particular, the widespread belief that Asian options are cheaper than European ones - which partly accounts for the popularity of Asian options - is only partly true.

As above, our ability to price Asian options is tantamount to our knowledge of the functions $C^{(\nu)}(h, q)$. Geman and Yor show that the Laplace transform (in $h$) of this function can be identified in terms of confluent hypergeometric functions (for background on these, see Slater (1960)). They find that, writing $\mu := \sqrt{2\lambda + \nu^2}$,

$$ \int_0^\infty e^{-\lambda h}\, C^{(\nu)}(h, q)\, dh = \frac{\int_0^{1/(2q)} e^{-x}\, x^{\frac{1}{2}(\mu - \nu) - 2}\, (1 - 2qx)^{\frac{1}{2}(\mu + \nu) + 1}\, dx}{\lambda(\lambda - 2 - 2\nu)\, \Gamma\left( \frac{1}{2}(\mu - \nu) - 1 \right)}. $$

Numerical inversion of the Laplace transform may now be applied to the right-hand side to obtain numerical values of $C^{(\nu)}(h, q)$. The fast Fourier transform (FFT) is applied to this problem in Eydeland and Geman (1995).

6.3.3 Barrier Options

The question of whether or not a particular stock will attain a particular level within a specified period has long been an important one for risk managers. From at least 1967 - predating both CBOE and Black-Scholes in 1973 - practitioners have sought to reduce their exposure to specific risks of this kind by buying options designed with such barrier-crossing events in mind. As usual, the motivation is that buying specific options - that is, taking out specific insurance - is a cheaper way of covering oneself against a specific danger than buying a more general one.

One-barrier options specify a stock-price level, $H$ say, such that the option pays ('knocks in') or not ('knocks out') according to whether or not level $H$ is attained, from below ('up') or above ('down'). There are thus four possibilities: 'up and in', 'up and out', 'down and in' and 'down and out'. Since barrier options are path-dependent (they involve the behaviour of the path, rather than just the current price or price at expiry), they may be classified as exotic; alternatively, the four basic one-barrier types above may be regarded as 'vanilla barrier' options, with their more complicated variants, described below, as 'exotic barrier' options. Note that holding both a knock-in option and the corresponding knock-out is equivalent to holding the corresponding vanilla option with the barrier removed. The sum of the prices of the knock-in and the knock-out is thus the price of the vanilla - again showing the attraction of barrier options: they are cheaper than their vanilla counterparts.

A barrier option is often designed to pay a rebate - a sum specified in advance - to compensate the holder if the option is rendered otherwise worthless by hitting/not hitting the barrier. We restrict attention to zero rebate here for simplicity.

Consider, to be specific, a down-and-out call option with strike $K$ and barrier $H$ (the other possibilities may be handled similarly). The payoff is (unless otherwise stated min and max are over $[0, T]$)

$$ (S(T) - K)^+\, \mathbf{1}_{\{\min S(\cdot) \ge H\}} = (S(T) - K)\, \mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}}, $$

so by risk-neutral pricing the value of the option is

$$ DOC_{K,H} := \mathbb{E}\left[ e^{-rT} (S(T) - K)\, \mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}} \right], $$

where $S$ is geometric Brownian motion, $S(t) = p_0 \exp\{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)\}$. Write $c := (\mu - \frac{1}{2}\sigma^2)/\sigma$; then $\min S(\cdot) \ge H$ iff $\min(ct + W(t)) \ge \sigma^{-1} \log(H/p_0)$. Writing $X$ for $X(t) := ct + W(t)$ - drifting Brownian motion with drift $c$ - and $m$, $M$ for its minimum and maximum processes,

$$ m(t) := \min\{X(s) : s \in [0, t]\}, \qquad M(t) := \max\{X(s) : s \in [0, t]\}, $$

the payoff function involves the bivariate process $(X, m)$, and the option price involves the joint law of this process.

Consider first the case $c = 0$: we require the joint law of standard Brownian motion and its maximum or minimum, $(W, M)$ or $(W, m)$. Taking $(W, M)$ for definiteness, we start the Brownian motion $W$ at the origin at time zero, choose a level $b > 0$, and run the process until the first-passage time (see Exercise 5.2)

$$ \tau(b) := \inf\{t \ge 0 : W(t) \ge b\} $$

at which the level $b$ is first attained. This is a stopping time, and we may use the strong Markov property for $W$ at time $\tau(b)$. The process now begins afresh at level $b$, and by symmetry the probabilistic properties of its further evolution are invariant under reflection in the level $b$ (thought of as a mirror). This reflection principle leads to the joint density of $(W(t), M(t))$ as

$$ \mathbb{P}_0(W(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}} \exp\left\{ -\frac{(2y - x)^2}{2t} \right\} dx\, dy \qquad (y \ge 0,\ x \le y), $$

a formula due to Lévy. (Lévy also obtained the identity in law of the bivariate processes $(M(t) - W(t), M(t))$ and $(|W(t)|, L(t))$, where $L$ is the local time process of $W$ at zero: see e.g. Revuz and Yor (1991), VI.2.) The idea behind the reflection principle goes back to work of Désiré André in 1887, and indeed further, to the method of images of Lord Kelvin (1824-1907), then Sir William Thomson, of 1848 on electrostatics. For background on this, see any good book on electromagnetism, e.g. Jeans (1925), Chapter VIII.

Lévy's formula for the joint density of $(W(t), M(t))$ may be extended to the case of general drift $c$ by the usual method for changing drift, Girsanov's theorem. The general result is

$$ \mathbb{P}_0(X(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}} \exp\left\{ -\frac{(2y - x)^2}{2t} + cx - \frac{1}{2} c^2 t \right\} dx\, dy \qquad (y \ge 0,\ x \le y). $$

See e.g. Rogers and Williams (1994), I, (13.10), or Harrison (1985), §1.8. As an alternative to the probabilistic approach above, a second approach to this formula makes explicit use of Kelvin's language - mirrors, sources, sinks; see e.g. Cox and Miller (1972), §5.7.

Given such an explicit formula for the joint density of $(X(t), M(t))$ - or equivalently, $(X(t), m(t))$ - we can calculate the option price by integration. The factor $S(T) - K$, or $S - K$, gives rise to two terms, in $S$ and $K$, while the integrals, involving relatives of the normal density function $n$, may be obtained explicitly in terms of the normal distribution function $N$ - both features familiar from the Black-Scholes formula. Indeed, this resemblance makes it convenient to decompose the price $DOC_{K,H}$ of the down-and-out call into the (Black-Scholes) price of the corresponding vanilla call, $C_K$ say, and the knockout discount, $KOD_{K,H}$ say, by which the knockout barrier at $H$ lowers the price:

$$ DOC_{K,H} = C_K - KOD_{K,H}. $$

The details of the integration, which are tedious, are omitted; the result is, writing $\lambda := r - \frac{1}{2}\sigma^2$,

$$ KOD_{K,H} = p_0 \left( \frac{H}{p_0} \right)^{2 + 2\lambda/\sigma^2} N(c_1) - K e^{-rT} \left( \frac{H}{p_0} \right)^{2\lambda/\sigma^2} N(c_2), $$

where $p_0$ is the initial stock price as usual and $c_1$, $c_2$ are functions of the price $p = p_0$ and time $t = T$ given by

$$ c_{1,2}(p, t) = \frac{\log(H^2/(pK)) + (r \pm \frac{1}{2}\sigma^2)\, t}{\sigma \sqrt{t}} $$

(the notation is that of the excellent text Musiela and Rutkowski (1997), §9.6, to which we refer for further detail). The other cases of vanilla barrier options, and their sensitivity analysis, are given in detail in Zhang (1997), Chapter 10.

Geman and Yor (1996) adapt the Laplace-transform method they developed for the Asian options of §6.3.2 to two-barrier options. They show (their Appendix 2) how, and why, the distribution of $(X(t), m(t), M(t))$ becomes simpler if the fixed time $t$ is replaced by a random time $T_\theta$, independent of $X$ and exponentially distributed with parameter $\frac{1}{2}\theta^2$ (such exponentially distributed times lead directly to Laplace transforms). We refer to Geman and Yor (1996) for details of their proof, which involves Girsanov's theorem, Brownian scaling, Wiener-Hopf factorization and excursion theory. The Geman-Yor method gives numerical results for pricing after numerical inversion of the Laplace transform. It also gives numerical results for hedging to the same accuracy - which is a rare occurrence for path-dependent options. In particular, this approach completely avoids the dangers inherent in Monte-Carlo simulation applied to barrier-crossing problems. Before simulating, we must discretize to pass from a diffusion in continuous time to a random walk in discrete time, which reduces calculating probabilities to counting paths. However, the path counts inevitably depend on the specifics of the discretization used (irrelevant to the original problem), to a degree that leads to unavoidable loss of accuracy.

Many other, exotic, types of barrier options have been developed.
These include:
(i) moving-boundary (or 'floating-barrier') options, where the barrier at constant level $H$ is replaced by a function $H(t)$ of time;
(ii) Asian barrier options, where the boundary crossing is by an average of the price process rather than the price process itself;
(iii) barrier options where the barrier operates only for part of the time interval $[0, T]$: forward start, for an interval of the form $[u, T]$; early ending, for one of the form $[0, v]$; window, for one of the form $[u, v] \subset [0, T]$;
(iv) outside barrier options, where the barrier and the payoff function relate to different underlying assets (for instance, a stock and a currency, etc.).
For details of these exotic barrier options, see e.g. Zhang (1997), Chapter 11.

6.3.4 Lookback Options

In everything we have encountered so far, uncertainty has unfolded with time, and our task has been to make optimal use of the information available to date. For options, at expiry T the investor is in possession of the history of the price evolution over the time interval [0, T] of the option's life, and it may well be - will be - that with hindsight he could have done better than he actually did without hindsight. It is only natural to look back with regret at what inability to foresee the future has cost one. If only one could buy at the low, and sell at the high . . .



Goldman, Sosin, and Gatto (1979) made two key realizations:
(i) this natural wish on the part of investors would provide a potential market for options providing them with the right to do just that, looking back in time;
(ii) the relevant mathematics - risk-neutral valuation, and the distributional results on drifting Brownian motion and its maximum and minimum, used above in our treatment of barrier options - existed to price such options.
It thus became as practical a matter to develop, and price, such lookback options as other option types we have already encountered, and this was accordingly done. Of course, the term 'option' is applied here by extension rather than strictly: the options described above will always be exercised! They will accordingly be more expensive than options encountered before.

We write $S$ for the price process, $M_t^S$, $m_t^S$ for its maximum and minimum over $[0, t]$, $M^S_{[u,v]}$, $m^S_{[u,v]}$ for its maximum and minimum over $[u, v]$, and similarly with $S$ replaced by other processes $X$. The two basic types of standard, vanilla lookback options are the lookback call, with payoff

$$ LC(T) := (S(T) - m_T^S)^+ = S(T) - m_T^S, $$

giving one the right at time $T$ to buy at the low over $[0, T]$, and the lookback put, with payoff

$$ LP(T) := (M_T^S - S(T))^+ = M_T^S - S(T), $$

giving one the right at time $T$ to sell at the high over $[0, T]$. Risk-neutral pricing gives the arbitrage price at time $t \in [0, T)$ of the lookback call as

$$ LC(t) = e^{-r(T - t)}\, \mathbb{E}\left[ (S(T) - m_T^S) \mid \mathcal{F}_t \right] = e^{-r(T - t)}\, \mathbb{E}(S(T) \mid \mathcal{F}_t) - e^{-r(T - t)}\, \mathbb{E}(m_T^S \mid \mathcal{F}_t) = I_1 - I_2, $$

say. Now

$$ I_1 = S(t), $$

as the discounted price process is a $\mathbb{P}^*$-martingale. Also

$$ S(t) = p_0 \exp\left\{ (r - \tfrac{1}{2}\sigma^2)\, t + \sigma W(t) \right\}, $$

with $W$ a standard $\mathbb{P}^*$-Brownian motion, so for $t \le u \le T$,

$$ S(u) = S(t) \exp\{X(u) - X(t)\}, $$


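where

$$ X(t) = ct + \sigma W(t), \qquad c := r - \tfrac{1}{2}\sigma^2, $$

is a drifting $\mathbb{P}^*$-Brownian motion. Now

$$ m^S_{[t,T]} = \min_{[t,T]} S(\cdot) = S(t) \exp\{ m^X_{[t,T]} \}, $$

where $m^X_{[t,T]}$ here denotes the minimum over $u \in [t, T]$ of the increment $X(u) - X(t)$, and

$$ m_T^S = \min\{ m_t^S,\ m^S_{[t,T]} \}. $$

By stationary independent increments of $X$, $m^X_{[t,T]}$ is independent of $\mathcal{F}_t$ and equal in law to $m^X_\tau$, where $\tau := T - t$. To evaluate $I_2$, it thus suffices to find

$$ f(s, m) := \mathbb{E}\left[ \min\{ m,\ s \exp\{ m^X_\tau \} \} \right], $$

as then

$$ I_2 = e^{-r(T - t)} f(S(t), m_t^S), $$

giving $I_2$ in terms of data known at time $t$. Now

$$ f(s, m) - m = \mathbb{E}\left[ \min\{ m,\ s \exp\{ m^X_\tau \} \} - m \right] = \mathbb{E}\left[ (s \exp\{ m^X_\tau \} - m)\, \mathbf{1}_{\{ m^X_\tau \le \log(m/s) \}} \right]. $$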

Since we already have an explicit expression for the density of the minimum of a drifting Brownian motion, we can calculate the expectation on the right by multiplying the function on the right by the density and integrating. As before, the calculations are simplified by reducing to the driftless case $c = 0$ by Girsanov's theorem. We omit the details of the calculation, which are quite tedious; the result is, writing $s$ for $S(t)$, $m$ for $m_t^S$, $\tau$ for $T - t$ as before, and $r_{1,2} := r \pm \frac{1}{2}\sigma^2$,

$$ LC(t) = s\, N\left( \frac{\log(s/m) + r_1 \tau}{\sigma\sqrt{\tau}} \right) - m e^{-r\tau} N\left( \frac{\log(s/m) + r_2 \tau}{\sigma\sqrt{\tau}} \right) - \frac{s\sigma^2}{2r} N\left( \frac{\log(m/s) - r_1 \tau}{\sigma\sqrt{\tau}} \right) + e^{-r\tau} \frac{s\sigma^2}{2r} \left( \frac{m}{s} \right)^{2r/\sigma^2} N\left( \frac{\log(m/s) + r_2 \tau}{\sigma\sqrt{\tau}} \right), $$

with a similar formula for the put, $LP(t)$. For details, see Musiela and Rutkowski (1997), Proposition 9.7.1.

The options above are floating strike lookback options, in that the role previously played by the strike $K$ is now played by a random variable. But one can also have fixed strike lookback options, with payoffs

$$ \max(M_T^S - K, 0), \qquad \max(K - m_T^S, 0) $$

for calls and puts respectively. One can have partial lookback options, with payoffs

$$ \max\{S(T) - \lambda m_T^S, 0\} \ \text{for calls}, \qquad \max\{\lambda M_T^S - S(T), 0\} \ \text{for puts}, $$

where $\lambda \in (0, 1]$ is the degree of partiality: one only partially eliminates regret - and pays less for the option accordingly. And one can have American lookback options, which may be exercised early. For details and background on such exotic lookback options, see Zhang (1997), Chapter 12.

The theory above supposes that prices are monitored continuously. In practice, prices are only monitored discretely - and as we saw in our treatment of barrier options, discretization can produce appreciable errors, particularly if Monte-Carlo simulation is resorted to. For a contemporary account of discrete lookback (and barrier) options, see Levy and Mantion (1997).

6.3.5 Binary Options

A binary (or digital) option is a contract whose payoff depends in a discontinuous way on the terminal price of the underlying asset. The most popular variants are:

Cash-or-nothing options. Here the payoffs at expiry of the European call resp. put are given by

$$ BCC(T) = C\, \mathbf{1}_{\{S(T) > K\}} \quad \text{resp.} \quad BCP(T) = C\, \mathbf{1}_{\{S(T) < K\}}. $$

[...] 1 (see §2.6 and Williams (1991), Chapter 13).

Example. As an example one can consider pricing a European put option in a continuous-time model. The theorem tells us that we can price the put by pricing the corresponding put in the appropriate discrete-time setting and then computing the limit of the discrete-time model prices (a European call is then priced by put-call parity).

The next theorem explores the consequences for trading strategies for contingent claims.

Theorem 6.4.2. Let $M^{(n)}(S^{(n)}, L^{(n)})$ be a finite market approximation of a continuous-time market model $M(S, L)$, and assume that $\{S^{(n)}\}$ is a good sequence of semi-martingales. Assume that the sequence of discrete-time trading strategies $\varphi^{(n)}$ converges weakly to a continuous-time trading strategy $\varphi$. Then the (discrete-time) gains $G_{\varphi^{(n)}}$ and value $V_{\varphi^{(n)}}$ processes converge weakly to their continuous-time counterparts $G_\varphi$ and $V_\varphi$.

The proof is a direct consequence of the definition of the processes and the goodness property of semi-martingales.

Of course the real work in applying Theorem 6.4.2 is to show the goodness of the underlying approximating process $\{S^{(n)}\}$ and the weak convergence of the trading strategies. Having established these facts, one can apply the theorem to justify 'discrete-time hedging of continuous-time claims'. Suppose we construct a continuous-time hedging strategy $\varphi$ to hedge a contingent claim $X$; then the theorem tells us that under appropriate conditions on the approximating discrete-time models, we can hedge (in the limit) the contingent claim with the induced discrete-time strategies. Given that in practice trading is discrete, this is very reassuring for risk managers trying to cover their positions with trading strategies deduced from continuous-time models.
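The idea of 'discrete-time hedging of continuous-time claims' can be illustrated by simulation. The sketch below is not the construction of Theorem 6.4.2 itself: it simply rebalances the Black-Scholes delta of a European call at $n$ equally spaced dates along simulated geometric Brownian motion paths and records the terminal hedging error, which should shrink as $n$ grows. The function names, the parameter values, the number of paths and the seed are all illustrative assumptions.

```python
import math
import random

def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price_and_delta(s, k, r, sigma, tau):
    """Black-Scholes call price and delta with time tau > 0 to expiry."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    price = s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)
    return price, norm_cdf(d1)

def hedging_errors(n_steps, n_paths, rng, s0=100.0, k=100.0, r=0.05,
                   sigma=0.2, t=1.0):
    """Terminal value of a self-financing delta hedge minus the call payoff."""
    dt = t / n_steps
    errors = []
    for _ in range(n_paths):
        s = s0
        value, delta = call_price_and_delta(s, k, r, sigma, t)
        for step in range(1, n_steps + 1):
            cash = value - delta * s  # amount held in the bond
            z = rng.gauss(0.0, 1.0)
            s *= math.exp((r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
            value = delta * s + cash * math.exp(r * dt)
            if step < n_steps:  # rebalance at every date before expiry
                delta = call_price_and_delta(s, k, r, sigma, t - step * dt)[1]
        errors.append(value - max(s - k, 0.0))
    return errors

def rms(xs):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

rng = random.Random(0)  # fixed seed so that the run is reproducible
rms_coarse = rms(hedging_errors(10, 300, rng))
rms_fine = rms(hedging_errors(200, 300, rng))
print(rms_coarse, rms_fine)
```

In such experiments the root-mean-square error typically decays roughly like $n^{-1/2}$, so rebalancing twenty times as often reduces it by a factor of about four to five.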



6.4.3 Examples of Finite Market Approximations

We consider a Black-Scholes type continuous-time financial market model with a diffusion setting within the framework of §6.2. Let us consider a securities market consisting of $d$ risky stocks and one locally risk-free bond. The $d$-dimensional vector of stock prices, $S$, and the bond price, $B$, are described by the stochastic differential equations

$$ dB(t) = B(t)\, r(S(t))\, dt, \qquad B(0) = 1, $$

$$ dS(t) = b(S(t))\, dt + \sigma(S(t))\, dW(t), \qquad S_i(0) = p_i \in (0, \infty), \qquad (6.16) $$

where $W = (W_1, \dots, W_d)'$ is a $d$-dimensional standard Brownian motion defined on a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We assume that the appreciation rate vector $b : \mathbb{R}^d \to \mathbb{R}^d$, the volatility matrix $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$ and the interest-rate process $r : \mathbb{R}^d \to \mathbb{R}$ are continuous. Furthermore, $r$ is assumed to be bounded and $\sigma$ to be non-singular. Define the covariance matrix $a(x) = \sigma(x)\sigma'(x) = (a_{ij}(x))_{i,j = 1, \dots, d}$. This is necessarily non-negative definite; we assume that it is positive definite: then for some number $\delta > 0$,

(6.17)

In addition, we assume that $b$ and $\sigma$ satisfy the uniform Lipschitz condition, that is, there exists a constant $C > 0$ such that for all $x, y \in \mathbb{R}^d$,

$$ |b(x) - b(y)| + \|\sigma(x) - \sigma(y)\| \le C |x - y|. \qquad (6.18) $$

Given this setting, it follows by the results of §6.2 that the market $M = M(B, S)$ described by the above bond process $B$ and the $d$-dimensional vector $S$, with the bond price process used as a numéraire, is arbitrage-free and complete (in the restricted sense). In particular, there exists a unique martingale measure given by its Radon-Nikodym derivative

$$ L(t) = \exp\left\{ -\int_0^t \gamma(S(u))'\, dW(u) - \frac{1}{2} \int_0^t \|\gamma(S(u))\|^2\, du \right\}, \qquad 0 \le t \le T, $$

with

$$ \gamma(x) = \sigma(x)^{-1} \left( b(x) - r(x) \mathbf{1}_d \right). $$

(Recall $\mathbf{1}_d = (1, \dots, 1)'$.) Observe that $L$ satisfies the stochastic differential equation

$$ dL(t) = -\gamma(S(t))'\, L(t)\, dW(t), \qquad L(0) = 1. $$

($L$ is the stochastic exponential as defined in §5.10, compare §6.2.2.) Our aim now is to construct a sequence of discrete-time financial markets $M^{(n)}$, sharing the properties of no-arbitrage and completeness, which approximates the above continuous-time financial market.

6.4 Discrete- versus Continuous-time Market Models


We do this by utilising a finite Markov-chain approximation scheme (for an excellent overview of such methods, see Kushner and Dupuis (1992)), and start by approximating the stock price processes.

Let $h > 0$ be a scalar approximation parameter. We wish to find a sequence of Markov chains on finite state spaces that converges in distribution (as $h \to 0$) to the process defined in (6.16) over the time interval $[0, T]$. The basis of the approximation is a discrete-time parameter, finite state space Markov chain $\{\xi_n^h, n < \infty\}$ whose 'local properties' are consistent with those given in (6.16). The continuous-time parameter approximating process will be a piecewise constant interpolation of this chain, with appropriately chosen interpolation intervals.

To make the above precise, for each $h > 0$ let $\{\xi_n^h, n < \infty\}$ be a discrete parameter Markov chain on a discrete state space $S_h \subset \mathbb{R}^d$. Suppose we have an interpolation interval $\Delta t^h(x) > 0$, and define $\Delta t_n^h = \Delta t^h(\xi_n^h)$. Let $\sup_x \Delta t^h(x) \to 0$ as $h \to 0$, but $\inf_x \Delta t^h(x) > 0$ for each $h > 0$. Define the difference $\Delta \xi_n^h = \xi_{n+1}^h - \xi_n^h$ and let $\mathbb{E}^h_{x,n}$ resp. $\mathrm{Cov}^h_{x,n}$ denote the conditional expectation resp. covariance given $\{\xi_i^h, i \le n, \xi_n^h = x\}$. Suppose that the chain obeys the following 'local consistency' conditions:

$$ \sup_n \|\xi_{n+1}^h - \xi_n^h\| \to 0 \quad \text{as } h \to 0, $$

$$ \mathbb{E}^h_{x,n}(\Delta \xi_n^h) = b^h(x)\, \Delta t^h(x) = b(x)\, \Delta t^h(x) + o(\Delta t^h(x)), $$

$$ \mathrm{Cov}^h_{x,n}(\Delta \xi_n^h - \mathbb{E}^h_{x,n} \Delta \xi_n^h) = a^h(x)\, \Delta t^h(x) = a(x)\, \Delta t^h(x) + o(\Delta t^h(x)). $$

Note that the chain has the 'local properties' of the diffusion process (6.16), in the following sense:

$$ \mathbb{E}[S(t + \Delta t) - S(t) \mid S(t)] \approx b(S(t))\, \Delta t, \qquad \mathrm{Cov}[S(t + \Delta t) - S(t) \mid S(t)] \approx a(S(t))\, \Delta t. $$
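The local consistency conditions can be verified by hand in the simplest case. The sketch below takes $d = 1$ and a binomial chain whose one-step increment from state $x$ is $b(x)h + \sigma(x)\sqrt{h}\,\varepsilon$ with $\varepsilon = \pm 1$ equally likely, so that $\Delta t^h(x) = h$, and computes the exact conditional mean and variance of the increment. The coefficient functions `drift` and `vol` and all numerical values are hypothetical choices made purely for illustration.

```python
import math

def increment_moments(x, h, drift, vol):
    """Exact conditional mean and variance of one step of the chain from x.

    The increment is drift(x) * h + vol(x) * sqrt(h) * eps, where eps takes
    the values -1 and +1 with probability 1/2 each.
    """
    steps = [drift(x) * h + vol(x) * math.sqrt(h) * eps for eps in (-1.0, 1.0)]
    mean = sum(steps) / 2.0
    var = sum((s - mean) ** 2 for s in steps) / 2.0
    return mean, var

def drift(x):
    """Hypothetical drift coefficient b(x)."""
    return 0.05 * x

def vol(x):
    """Hypothetical volatility coefficient sigma(x)."""
    return 0.2 * x

x, h = 100.0, 1.0 / 250.0
mean, var = increment_moments(x, h, drift, vol)
print(mean, drift(x) * h)      # conditional mean versus b(x) * h
print(var, vol(x) ** 2 * h)    # conditional variance versus a(x) * h
```

Here the conditions hold exactly rather than up to $o(\Delta t^h(x))$, because the two-point variable $\varepsilon$ has mean zero and unit variance.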

We outline the construction of such a Markov chain (the reader should consult He (1990) for details). Assume $T = 1$ and set $h = 1/n$, hence our grid is $I_n = \{0 = t_0^{(n)} < t_1^{(n)} < \dots < t_n^{(n)} = T\}$ with $t_k^{(n)} = k/n$, $k = 0, \dots, n$. Now construct a sequence of triangular arrays of independent, identically distributed, $d$-dimensional random vectors $(\varepsilon_k^{(n)})_{k \le n}$, with components $\varepsilon_{k,i}^{(n)}$ uncorrelated, but possibly dependent. Each component takes exactly $d + 1$ different values and

$$ \mathbb{E}(\varepsilon_{k,i}^{(n)}) = 0, \qquad \operatorname{Var}(\varepsilon_{k,i}^{(n)}) = 1. $$

We now define the $d$-variate, $(d + 1)$-nomial approximation process for the diffusion, $S^{(n)}$, as the solution of the stochastic difference equation (we omit the index $n$ for the grid)

$$ S^{(n)}(t_{k+1}) = S^{(n)}(t_k) + \frac{b(S^{(n)}(t_k))}{n} + \sigma(S^{(n)}(t_k))\, \frac{\varepsilon_k^{(n)}}{\sqrt{n}}, \qquad (6.19) $$


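and $S^{(n)}(0) = S(0)$. From (6.19), we see that the random vector $\varepsilon_k^{(n)}$ is used to approximate the random increment of the Brownian motion from time $t_k$ to time $t_{k+1}$, and that $S^{(n)}$ is a Markov chain. We check the consistency requirements: for the drift condition we use that the $\varepsilon_k^{(n)}$ have expectation vector zero:

$$ \mathbb{E}\left[ S^{(n)}(t_{k+1}) - S^{(n)}(t_k) \mid S^{(n)}(t_k) = x \right] = \frac{b(x)}{n}; $$

for the covariance it follows that

$$ \mathrm{Cov}\left[ S^{(n)}(t_{k+1}) - S^{(n)}(t_k) \mid S^{(n)}(t_k) = x \right] = \frac{\sigma(x)\sigma(x)'}{n} = \frac{a(x)}{n}. $$

Hence the constructed Markov chain has the local consistency property. Define $S^{(n)}(t) = S^{(n)}(t_k)$ for $t_k \le t < t_{k+1}$; then the sample paths of $S^{(n)}$ are piecewise constant and only have jumps at the $t_k$ (of course $S^{(n)}(T) = S^{(n)}(t_n)$). In the same way, replacing differential equations by difference equations, we model bond price processes $B^{(n)}$ as

$$ B^{(n)}(t_{k+1}) = B^{(n)}(t_k) \left( 1 + \frac{r(S^{(n)}(t_k))}{n} \right), \qquad B^{(n)}(0) = 1, $$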

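and the Radon-Nikodym processes $L^{(n)}$ as

$$ L^{(n)}(t_{k+1}) = L^{(n)}(t_k) \left( 1 - \gamma(S^{(n)}(t_k))'\, \frac{\varepsilon_k^{(n)}}{\sqrt{n}} \right), \qquad L^{(n)}(0) = 1. $$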

We refer the reader to He (1990) for the actual construction of the processes $L^{(n)}$, starting with the state-price vectors (compare §1.4). In our current setting, we know from §4.7.2 that the discrete-time markets are free of arbitrage and complete, so $L^{(n)}(T)$ defines the unique martingale measure. We have the following convergence theorem.

Theorem 6.4.3. The sequence of discrete-time financial markets $\mathcal{M}^{(n)} = ((B^{(n)}, S^{(n)}), L^{(n)})$ is a structure-preserving finite market approximation with respect to dynamic completeness of $\mathcal{M} = ((B, S), L)$.

Proof. We already know that completeness is shared by each of the $\mathcal{M}^{(n)}$'s and $\mathcal{M}$. So it only remains to show that $X^{(n)} = ((B^{(n)}, S^{(n)}), L^{(n)})$ converges weakly to $X = ((B, S), L)$. This is done by applying the martingale central limit theorem (Ethier and Kurtz (1986), Chapter 7, and again see He (1990) for the computational details). □

We now show weak convergence of contingent prices. Here a contingent claim is defined to be a random variable of the form $Y = g(S)$ for some measurable, bounded function $g : D^d[0, 1] \to \mathbb{R}_+$. Denote by $\Pi_Y$, $\Pi_Y^{(n)}$ the time 0 values of the contingent claim in the continuous-time market and the $n$th approximating market.

Proposition 6.4.1. We have the following convergence:

$$\lim_{n \to \infty} \Pi_Y^{(n)} = \Pi_Y.$$

Proof. Since the discount factors $1/B$ resp. $1/B^{(n)}$ are bounded we only have to show uniform integrability of $L^{(n)}(T)$ to be able to apply Theorem 6.4.1. Using Remark 6.4.1, it is enough to show uniform boundedness of $L^{(n)}(T)$ in $L^2$. The continuity of $b$, $\sigma$ and $r$ together with the condition on the covariance matrix (6.17) ensure that $\gamma$ is continuous. Furthermore, mimicking the uniform boundedness proof of the successive approximations from any existence proof for strong solutions of stochastic differential equations (see §5.8 or Kloeden and Platen (1992), proof of Theorem 4.5.3), we can show that the sequences $s^{(n)}$ are uniformly bounded in $L^2$, i.e. for $n = 1, 2, \dots$

$$\sup_{0 \le k \le n} \mathbb{E}\left(|s_i^{(n)}(t_k)|^2\right) \le C < \infty, \quad i = 1, \dots, d$$

(recall the initial conditions are $S_i(0) = p_i \in [0, \infty)$, $i = 1, \dots, d$, and that the components of the $\epsilon_k^{(n)}$ have second moment equal to 1). So $\gamma(s^{(n)}(t_k))$ is uniformly bounded for $n = 1, 2, \dots$:

$$\sup_{0 \le k \le n} \mathbb{E}\left(|\gamma_i(s^{(n)}(t_k))|^2\right) \le C_\gamma < \infty, \quad i = 1, \dots, d \qquad (6.20)$$

for some constant $C_\gamma$. Now using successively the Cauchy-Schwarz inequality, independence of the $\epsilon_k^{(n)}$ and $S^{(n)}(t_k)$, zero correlation between the components of the $\epsilon_k^{(n)}$, and condition (6.20):

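where $\mu_Q$ is a constant drift (which may have either sign), $\sigma_Q$ is a positive vector of volatilities, and $W(t)$ is a standard $d$-dimensional Brownian motion. Solving, one obtains

$$Q(t) = Q(0)\exp\left\{\left(\mu_Q - \tfrac{1}{2}\|\sigma_Q\|^2\right)t + \sigma_Q \cdot W(t)\right\},$$

where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$. The value of our foreign savings account $B_f(t)$ is $B_f(t)Q(t)$ in domestic currency, and $B_f(t)Q(t)/B_d(t)$ when discounted by the domestic interest rate. We write this process as $Q^*$: its dynamics are given by

$$dQ^*(t) = Q^*(t)\left((\mu_Q + r_f - r_d)\,dt + \sigma_Q \cdot dW(t)\right),$$

with solution

$$Q^*(t) = Q^*(0)\exp\left\{\left(\mu_Q + r_f - r_d - \tfrac{1}{2}\|\sigma_Q\|^2\right)t + \sigma_Q \cdot W(t)\right\}.$$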

To avoid arbitrage - between the domestic and foreign bond markets (the only markets presently in play) - we need to pass to an equivalent martingale measure eliminating the drift term in the dynamics above. We have a $d$-dimensional noise process, and cannot expect uniqueness of equivalent martingale measures unless there are $d$ independent traded assets available. When this is the case, there exists a unique equivalent martingale measure $\mathbb{P}^*$, called the domestic martingale measure, giving the dynamics as

$$dQ(t) = Q(t)\left((r_d - r_f)\,dt + \sigma_Q \cdot dW(t)\right), \quad Q(0) > 0,$$

with $W$ a $\mathbb{P}^*$-Brownian motion. This, as its name implies, is the risk-neutral probability measure for an investor reckoning everything in terms of the domestic currency. Risk-neutral valuation gives the price process of a contingent claim $X$ as

$$\pi_X(t) = e^{-r_d(T-t)}\,\mathbb{E}^*(X \mid \mathcal{F}_t).$$

Consider now an agent involved in international trade, wishing to limit his exposure to adverse movements in the exchange-rate process $Q$. He will seek to purchase an option protecting him against this, in the same way that an agent dealing in a stock $S$ will purchase an option on the stock. Of course, an exchange rate is not a tangible asset in the sense that a stock is; nevertheless, it is possible, and helpful, to treat options on currency in a way closely analogous to how we treat options on stock. For this purpose, consider the forward price at time $t$ of one unit of the foreign currency, to be delivered at the settlement date $T$, in terms of the domestic currency. It is natural to call this the forward exchange rate, $F_Q(t, T)$ say. In terms of the bond-price processes, one has

$$F_Q(t, T) = e^{(r_d - r_f)(T - t)}\,Q(t),$$

a relationship known as interest-rate parity. For, absence of arbitrage requires that the forward exchange premium must be the difference $r_d - r_f$ between the two interest rates. Currency options may now be constructed, and priced, analogously to options on stock. For example, a standard currency European call option may be constructed, with payoff

$$G_Q(T) := (Q(T) - K)^+,$$

where $Q(T)$ is the spot exchange rate at the option's delivery date and $K$ is the strike price. Finding the arbitrage price of this option is formally the same as finding the price of a futures option, with the forward price of the stock replaced by the forward exchange rate. We already have the solution to this problem, in Black's futures options formula. Adapted to the present context, this yields the following currency options formula, due to Garman and Kohlhagen (1983) and to Biger and Hull (1983) independently.

Proposition 6.5.2. The arbitrage price of the currency European call option above is

$$e^{-r_d(T-t)}\left[F(t)\,N(d_1(F(t), T-t)) - K\,N(d_2(F(t), T-t))\right],$$

where $F(t) = F_Q(t, T)$ is the forward exchange rate, and

$$d_{1,2}(F, t) := \frac{\log(F/K) \pm \frac{1}{2}\sigma_Q^2 t}{\sigma_Q\sqrt{t}}.$$

Exercises

6.1 In the setting of the classical deduction of the Black-Scholes differential equation (compare §6.2.2), show that the trading strategy 'short $f_s$ stocks, long a call' is not self-financing. Show furthermore that the additional cost associated with this trading strategy up to time $T$ can be represented by a random variable with zero mean under the risk-neutral martingale measure in the standard Black-Scholes model.

6.2 1. Use the put-call parity (compare §1.3) to compute the pricing formula for the European put option in the classical Black-Scholes setting:

$$P(0) = P(S, T, K, r, \sigma) = K e^{-rT} N(-d_2(S, T)) - S N(-d_1(S, T)).$$

2. Compute the 'Greeks' for the European put option (with the help of a computer programme such as Mathematica, if available).

6.3 Use the parameters of the example in §4.8.4 to compute Black-Scholes prices of the European put and call options. Construct discrete (using the tree) and continuous (using the Black-Scholes ...

... 0). An investor with such a utility function $U$ and initial endowment $x$, trading only in the underlying assets $S_0, \dots, S_d$, forms a dynamic portfolio $\varphi$, whose value at time $t$ is $V_{\varphi,x}(t)$ (we need to keep track of the initial endowment). His objective is to maximise the expected utility, under the original probability measure, of his final wealth at time $T$, given that he is allowed to choose his trading strategy $\varphi$ from a suitable subset $\Phi_a$ of the set of self-financing trading strategies. We write

$$\mathcal{U}(x) = \sup_{\varphi \in \Phi_a} \mathbb{E}\left[U(V_{\varphi,x}(T))\right]$$

for the maximal utility. Now suppose that a contingent claim $X$ (a sufficiently integrable random variable) is made available for trading with current purchase price $p$. We ask the question whether the maximal utility above can be increased by doing so. To find a 'fair' price $p$ for a contingent claim we follow a 'marginal rate of substitution argument' quite commonly used in pricing: $p$ is a fair price for the contingent claim if diverting a little of his funds into it at time zero has a neutral effect on the investor's achievable utility. More precisely, if we set

$$W(\delta, x, p) = \sup_{\varphi \in \Phi_a} \mathbb{E}\left[U\left(V_{\varphi,x-\delta}(T) + \frac{\delta}{p}X\right)\right],$$

then we can state:

Definition 7.1.2. Suppose that for each fixed $(x, p)$, $W(\delta, x, p)$ is differentiable as a function of $\delta$ at $\delta = 0$, and that there is a unique solution $\hat{p}(x)$ of the equation

$$\frac{\partial W}{\partial \delta}(0, x, p) = 0.$$

Then $\hat{p}(x)$ is the fair option price at time $t = 0$.

Before we can go on, we have to check whether this pricing approach is consistent with the arbitrage price for attainable contingent claims (otherwise the new approach would introduce arbitrage opportunities in the market and we would have to reject it). To this end let $X$ be an attainable contingent claim with arbitrage price $p_0$ (found as the unique initial endowment of any self-financing replicating trading strategy). Suppose that $X$ is offered for $p$ and that the investor buys $\delta/p$ contingent claims with the amount of cash $\delta$, and so has a remaining endowment of $x - \delta$, which can be used to set up a dynamic trading strategy $\varphi \in \Phi_a$. Since $U$ is increasing it is optimal for the investor to sell the replicating portfolio short (so no exposure results from the contingent claim at time $T$) and invest $x + \delta\left(\frac{p_0}{p} - 1\right)$ optimally. With $\mathcal{U}$ denoting the maximal utility function, this results in attaining an expected utility of $\mathcal{U}\left(x + \delta\left(\frac{p_0}{p} - 1\right)\right)$. The marginal rate of substitution is therefore

$$\frac{\partial}{\partial \delta}\,\mathcal{U}\left[x + \delta\left(\frac{p_0}{p} - 1\right)\right]\Big|_{\delta=0} = \left(\frac{p_0}{p} - 1\right)\mathcal{U}'(x),$$

which is only zero for $p = p_0$.

In general it is very hard to check that the function $\mathcal{U}$ is differentiable (to justify our rather informal deduction above). Fortunately only a 'weak' form of differentiability, quite similar to a 'viscosity solution'-like approach to optimal control problems, suffices (see Fleming and Soner (1993) and Karatzas (1996) for details on such weak approaches). For the purpose of our current outline we will take differentiability of $\mathcal{U}$ for granted, and assume additionally the existence of a trading strategy $\varphi^* \in \Phi_a$ such that $\mathcal{U}(x) = \mathbb{E}\left[U(V_{\varphi^*,x}(T))\right]$. Under these assumptions we obtain a general pricing formula based on Definition 7.1.2.

Theorem 7.1.1 (Davis). Suppose that $\mathcal{U}$ is differentiable at each $x \in \mathbb{R}_+$ and that $\mathcal{U}'(x) > 0$. Then the fair price $\hat{p}(x)$ of Definition 7.1.2 is given by

$$\hat{p} = \frac{\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right]}{\mathcal{U}'(x)}. \qquad (7.1)$$

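Proof. Assume that $\varphi^* \in \Phi_a$ is such that

$$W(\delta, x, p) = \mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]$$

for $\delta = 0$; then one can show (see Davis (1997), Lemma 2) that

$$\frac{\partial W}{\partial \delta}(0, x, p) = \frac{\partial}{\partial \delta}\,\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]\Big|_{\delta=0}.$$

So we have to differentiate $\mathbb{E}(U(\cdot))$ with respect to $\delta$. Using a Taylor expansion of $U$ around $V_{\varphi^*,x-\delta}(T)$ we have

$$\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right] = \mathbb{E}\left[U(V_{\varphi^*,x-\delta}(T))\right] + \frac{\delta}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x-\delta}(T))\,X\right] + o(\delta).$$

Neglecting the $o(\delta)$-term and differentiating with respect to $\delta$, we obtain

$$\frac{\partial}{\partial \delta}\,\mathbb{E}\left[U\left(V_{\varphi^*,x-\delta}(T) + \frac{\delta}{p}X\right)\right]\Big|_{\delta=0} = \frac{\partial}{\partial \delta}\,\mathbb{E}\left[U(V_{\varphi^*,x-\delta}(T))\right]\Big|_{\delta=0} + \frac{1}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right].$$

Since $\mathcal{U}(x) = \mathbb{E}\left[U(V_{\varphi^*,x}(T))\right]$, where $\mathcal{U}$ is the maximal utility, the first-order condition for optimality is

$$-\mathcal{U}'(x) + \frac{1}{p}\,\mathbb{E}\left[U'(V_{\varphi^*,x}(T))\,X\right] = 0,$$

and from this (7.1) follows. □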

We ask for the reader's patience until §7.3, where we will give an application of this pricing approach.
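Until then, formula (7.1) can at least be tried out on a toy example. The sketch below assumes a one-period market with zero interest rate, a single stock whose return takes three hypothetical values, logarithmic utility $U(w) = \log w$, and strategies that invest a fixed fraction `pi` of wealth in the stock; none of these choices come from the text. For logarithmic utility the maximal utility is $\log x$ plus a constant, so its derivative is $1/x$.

```python
# Hypothetical one-period trinomial model (incomplete: three states, two assets).
returns = [-0.2, 0.0, 0.3]   # possible stock returns R
probs = [0.3, 0.4, 0.3]      # their probabilities under the original measure


def expect(values):
    """Expectation of a state-dependent quantity."""
    return sum(p * v for p, v in zip(probs, values))


def optimal_fraction(lo=0.0, hi=4.9, tol=1e-12):
    """Bisection on the first-order condition E[R / (1 + pi * R)] = 0.

    The bracket [lo, hi] is chosen for the returns above: the condition is
    positive at lo, negative at hi, and decreasing in pi.
    """
    def foc(pi):
        return expect([r / (1.0 + pi * r) for r in returns])

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def davis_price(payoff, x=1.0):
    """Evaluate (7.1) for log utility: E[U'(V*) X] divided by 1/x."""
    pi = optimal_fraction()
    wealth = [x * (1.0 + pi * r) for r in returns]   # optimal terminal wealth
    marginal = [1.0 / w for w in wealth]             # U'(w) = 1/w
    return expect([m * h for m, h in zip(marginal, payoff)]) * x


if __name__ == "__main__":
    s0 = 100.0
    stock = [s0 * (1.0 + r) for r in returns]
    call = [max(s - 105.0, 0.0) for s in stock]
    print(davis_price(stock), davis_price([1.0] * 3), davis_price(call))
```

In this setup the formula returns the current prices of the traded assets (the stock and a unit of cash), which mirrors the consistency check carried out above for attainable claims; the call price it produces depends on the assumed utility.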

7.1.2 The Esscher Measure

Let $S(t)$ denote the price at time $t$ of a non-dividend-paying stock. Assume that there is a stochastic process $\{X(t)\}$ with stationary and independent increments (i.e. a Lévy process, §5.5) such that

$$S(t) = S(0)\,e^{X(t)}, \quad t \ge 0.$$

For $\psi$ a measurable function and $Y$ a random variable, write

$$\mathbb{E}[\psi(Y); h] = \frac{\mathbb{E}\left[\psi(Y)e^{hY}\right]}{\mathbb{E}\left[e^{hY}\right]}.$$

Assume that the moment-generating function $M(h, t) = \mathbb{E}\left[e^{hX(t)}\right]$ of $X(t)$ exists; then

$$M(h, t) = [M(h, 1)]^t.$$

The process $\left\{e^{hX(t)}M(h, 1)^{-t}\right\}_{t \ge 0}$ is a positive martingale and can be used to define a change of probability measure, i.e. it can be used to define the Radon-Nikodym derivative $d\mathbb{Q}/d\mathbb{P}$ of a new probability measure $\mathbb{Q}$ with respect to the original probability measure $\mathbb{P}$; $\mathbb{Q}$ is called the Esscher measure of parameter $h$. Gerber and Shiu (1995) introduced the risk-neutral Esscher measure: the Esscher measure of parameter $h = h^*$ such that the process $\left\{e^{-rt}S(t)\right\}_{t \ge 0}$ is a martingale. The condition

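$$\mathbb{E}\left[e^{-rt}S(t); h^*\right] = S(0)$$

yields

$$e^{rt} = \mathbb{E}\left[e^{X(t)}; h^*\right] = \frac{\mathbb{E}\left[e^{(1+h^*)X(t)}\right]}{M(h^*, t)} = \left[\frac{M(1 + h^*, 1)}{M(h^*, 1)}\right]^t$$

or

$$e^r = \frac{M(1 + h^*, 1)}{M(h^*, 1)}.$$

This equation then uniquely determines the parameter $h^*$. Because for $t \ge 0$

$$e^{hX(t)}M(h, 1)^{-t} = \frac{e^{hX(t)}}{\mathbb{E}\left[e^{hX(t)}\right]} = \frac{S(t)^h}{\mathbb{E}\left[S(t)^h\right]},$$

we have the following:

Lemma 7.1.1 (Factorization Formula). For $g$ a measurable function and $h$, $k$ and $t$ real numbers, with $t \ge 0$,

$$\mathbb{E}\left[S(t)^k g(S(t)); h\right] = \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right].$$

Proof.

$$\begin{aligned}
\mathbb{E}\left[S(t)^k g(S(t)); h\right] &= \mathbb{E}\left[S(t)^k g(S(t))\,e^{hX(t)}M(h, 1)^{-t}\right] \\
&= \frac{\mathbb{E}\left[S(t)^{k+h} g(S(t))\right]}{\mathbb{E}\left[S(t)^h\right]} \\
&= \frac{\mathbb{E}\left[S(t)^{k+h}\right]}{\mathbb{E}\left[S(t)^h\right]} \cdot \frac{\mathbb{E}\left[S(t)^{k+h} g(S(t))\right]}{\mathbb{E}\left[S(t)^{k+h}\right]} \\
&= \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right].
\end{aligned}$$

□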
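The factorization formula is easy to confirm numerically for a fixed time $t$. The sketch below assumes, purely for illustration, that $X(t)$ takes three values with given probabilities; the helper `esscher_expect` and all numbers are hypothetical.

```python
import math

# Hypothetical distribution of X(t) at one fixed time t.
x_values = [-0.1, 0.0, 0.2]
x_probs = [0.25, 0.5, 0.25]
s0 = 100.0


def esscher_expect(f, h):
    """E[f(S); h] = E[f(S) e^{hX}] / E[e^{hX}] with S = s0 * e^X."""
    num = sum(p * f(s0 * math.exp(x)) * math.exp(h * x)
              for x, p in zip(x_values, x_probs))
    den = sum(p * math.exp(h * x) for x, p in zip(x_values, x_probs))
    return num / den


def factorization_sides(k, h, g):
    """Left- and right-hand side of the factorization formula."""
    lhs = esscher_expect(lambda s: s ** k * g(s), h)
    rhs = esscher_expect(lambda s: s ** k, h) * esscher_expect(g, k + h)
    return lhs, rhs


if __name__ == "__main__":
    indicator = lambda s: 1.0 if s > 100.0 else 0.0   # g(x) = 1_{x > K}, K = 100
    print(factorization_sides(1.0, 0.7, indicator))
```

The two returned numbers should agree up to rounding for any admissible `k`, `h` and `g`.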

We apply the above technique to the valuation of a European call with maturity $T$ and strike $K$ on the underlying with price dynamics $S(t)$. By the risk-neutral valuation principle, we have to calculate

$$\begin{aligned}
\mathbb{E}\left[e^{-rT}(S(T) - K)^+; h^*\right] &= \mathbb{E}\left[e^{-rT}(S(T) - K)\,1_{\{S(T) > K\}}; h^*\right] \\
&= e^{-rT}\left\{\mathbb{E}\left[S(T)\,1_{\{S(T) > K\}}; h^*\right] - K\,\mathbb{E}\left[1_{\{S(T) > K\}}; h^*\right]\right\}.
\end{aligned}$$

To evaluate the first term, we apply the factorization formula with $k = 1$, $h = h^*$ and $g(x) = 1_{\{x > K\}}$ and get

$$\begin{aligned}
\mathbb{E}\left[S(T)\,1_{\{S(T) > K\}}; h^*\right] &= \mathbb{E}\left[S(T); h^*\right]\,\mathbb{E}\left[1_{\{S(T) > K\}}; h^* + 1\right] \\
&= \mathbb{E}\left[e^{-rT}S(T); h^*\right] e^{rT}\,\mathbb{P}\left[S(T) > K; h^* + 1\right] \\
&= S(0)\,e^{rT}\,\mathbb{P}\left[S(T) > K; h^* + 1\right],
\end{aligned}$$

where we used the martingale property of $e^{-rt}S(t)$ under the risk-neutral Esscher measure for the last step. Now the pricing formula for the European call becomes

$$S(0)\,\mathbb{P}\left[S(T) > K; h^* + 1\right] - e^{-rT}K\,\mathbb{P}\left[S(T) > K; h^*\right].$$
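For a concrete case, suppose $X(t) = \mu t + \sigma W(t)$ is a Brownian motion with drift (an assumption made for this sketch). Then $M(h, 1) = \exp(\mu h + \sigma^2 h^2/2)$, the equation for $h^*$ can be solved explicitly, and under the Esscher measure of parameter $h$ the variable $X(T)$ is normal with mean $(\mu + h\sigma^2)T$ and variance $\sigma^2 T$. The code below evaluates the pricing formula above on that basis and compares it with the Black-Scholes formula; the function names are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def h_star_brownian(mu, sigma, r):
    """Solve e^r = M(1 + h, 1) / M(h, 1) for M(h, 1) = exp(mu*h + sigma^2*h^2/2)."""
    return (r - mu) / sigma ** 2 - 0.5


def tail_prob(s0, k, mu, sigma, t, h):
    """P[S(T) > K; h] when X(T) ~ N((mu + h*sigma^2)*T, sigma^2*T) under parameter h."""
    mean = (mu + h * sigma ** 2) * t
    return norm_cdf((math.log(s0 / k) + mean) / (sigma * math.sqrt(t)))


def esscher_call(s0, k, mu, sigma, r, t):
    """Call price from the Esscher pricing formula with Brownian X."""
    h = h_star_brownian(mu, sigma, r)
    return (s0 * tail_prob(s0, k, mu, sigma, t, h + 1.0)
            - math.exp(-r * t) * k * tail_prob(s0, k, mu, sigma, t, h))


def black_scholes_call(s0, k, sigma, r, t):
    """Classical Black-Scholes call price, for comparison."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - math.exp(-r * t) * k * norm_cdf(d2)


if __name__ == "__main__":
    # Hypothetical parameters; the drift mu should not affect the price.
    print(esscher_call(100.0, 105.0, mu=0.08, sigma=0.2, r=0.03, t=1.0))
    print(black_scholes_call(100.0, 105.0, sigma=0.2, r=0.03, t=1.0))
```

The drift `mu` enters only through $h^*$ and cancels in the tail probabilities, which is why the two prices coincide whatever value is assumed for it.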

For $X(t)$ a Brownian motion, the above formula recovers the Black-Scholes pricing formula.

It is possible to generalise the above approach to the multi-asset case. Let $X(t) = (X_1(t), \dots, X_d(t))'$. Then

$$M(z, t) = \mathbb{E}\left[e^{z'X(t)}\right], \quad z \in \mathbb{R}^d,$$

is the moment-generating function of $X(t)$. Assuming that $\{X(t)\}_{t \ge 0}$ is a Lévy process (has stationary independent increments: §5.5), we have

$$M(z, t) = [M(z, 1)]^t, \quad t \ge 0.$$

For $h = (h_1, \dots, h_d) \in \mathbb{R}^d$ for which $M(h, 1)$ exists the positive martingale

$$\left\{e^{h'X(t)}M(h, 1)^{-t}\right\}_{t \ge 0}$$

can be used to define a new measure, the Esscher measure of parameter vector $h$. Again $h^*$ is defined as the parameter $h = h^*$ such that $e^{-rt}S_j(t)$, $j = 1, \dots, d$, are martingales. We can rewrite these conditions as

$$e^r = \frac{M(1_j + h^*, 1)}{M(h^*, 1)}, \quad j = 1, \dots, d,$$

where $1_j = (0, \dots, 0, 1, 0, \dots, 0)$ with the $j$th coordinate being 1. As in the one-dimensional case we have a factorization formula. For $k = (k_1, \dots, k_d)'$ let $S(t)^k = S_1(t)^{k_1} \cdots S_d(t)^{k_d}$, then

$$\mathbb{E}\left[S(t)^k g(S(t)); h\right] = \mathbb{E}\left[S(t)^k; h\right]\,\mathbb{E}\left[g(S(t)); k + h\right].$$

Remark 7.1.1. (i) The choice of the Esscher measure may be justified by a utility-maximising argument for a representative agent along lines similar to those leading to the pricing formula in §7.1.2. However, the class of possible trading strategies for the agent is very restrictive in this case (see Gerber and Shiu (1995) for details).
(ii) The Esscher measure offers a very attractive way to find at least one equivalent martingale measure in many incomplete market situations (see for instance its use in the hyperbolic Lévy model, Eberlein and Keller (1995)).

7.2 Hedging in Incomplete Markets

In Chapter 6 we used general martingale representation theorems to prove existence of hedging strategies and to construct them. The existence of non-attainable contingent claims implies in that context that there exist (sufficiently) integrable, $\mathcal{F}_T$-measurable random variables $X$ for which we cannot find an integral representation (with respect to an equivalent martingale measure) in terms of the (discounted) price processes $S = (S_0, \dots, S_d)'$. Therefore such claims carry an intrinsic risk, and our aim can only be to reduce the remaining risk to this minimal component. We will now describe several criteria to quantify the remaining risk and the related construction of 'optimal' hedging strategies. To avoid technicalities, we assume that price processes are already discounted (i.e. use $S_0 \equiv 1$ as numeraire), and that the prices of the risky assets are given by continuous, square-integrable semi-martingales, i.e. $S = P + M + A$ with $P = (p_1, \dots, p_d)'$, $p_i \in [0, \infty)$, $M = (M_1, \dots, M_d)'$ a square-integrable martingale under $\mathbb{P}$ and $A = \int \alpha\, d\langle M \rangle$ with $\alpha = (\alpha_1, \dots, \alpha_d)'$ a predictable process.

One of the most prominent problems in the contemporary mathematical finance literature is the pricing and hedging of contingent claims (random cash flows at a prespecified time point) in an incomplete market. In case of a complete market, the pricing and hedging of the contingent claim can be done via the unique martingale measure and martingale representation results. To price a contingent claim, we calculate an expected value with respect to the martingale measure; to hedge a contingent claim perfectly, we can obtain a hedging strategy by using the integrand of an appropriate stochastic integral, as in Chapter 6, or Harrison and Pliska (1981).

In the incomplete setting however, there are infinitely many martingale measures, each leading to a different pricing formula, and so the question of finding a 'price' of a contingent claim and a suitable hedging strategy is more involved.

One stream of thought originating in the work of Föllmer and Sondermann (1986) and subsequently considerably improved by, to name a few, Föllmer and Schweizer (1991), Schweizer (1991), Schweizer (1994), Pham, Rheinländer, and Schweizer (1998), Schweizer (2001b) (an excellent review), is to use quadratic hedging approaches: local risk-minimization and mean-variance hedging (see §7.2.2 below). Typically, one tries to find a hedging strategy that minimises a suitably defined (quadratic) risk function. In case an optimal hedging strategy exists it could be used to define a dynamic value process for the contingent claim; however, there is typically a positive probability for the wealth process of an investor following this strategy to be negative at the end, and thus this approach seems not to be suitable for pricing purposes (see Korn (1997b) for a detailed discussion).

Alternative approaches relying on utility-based arguments for pricing contingent claims in incomplete markets are Aurell and Simdyankin (1998), Davis (1994), Karatzas and Kou (1996), Rouge and El Karoui (2000). (For a detailed study of the use of utility functions and optimal investments, see Kramkov and Schachermayer (1999).) In all of these papers, links to hedging problems and thus martingale measures can be established.

7.2.1 Quadratic Principles

Quadratic principles have a distinguished history in finance and insurance, arising from their even more distinguished history in statistics and mathematics. The fountainhead in statistics is the method of least squares, introduced by Gauss and Legendre in the early 19th century, and its development into the Linear Model of statistics (see e.g. Plackett (1960)).

The fountainhead in finance is the work of Markowitz, from 1952 on. Markowitz emphasised that one should think in terms of risk, as well as (expected) returns. Since expectations about returns can be summarized by means (mean vectors, in the multi-dimensional case), risks by variances (covariance matrices), this gives one the mean-variance framework, still one of the cornerstones of contemporary investment theory. Mean-variance portfolio theory fits naturally with any developments that depend on distributional assumptions (on asset returns) only through the first two moments, such as normality assumptions or investors maximising quadratic utility. In an equilibrium setting, mean-variance theory leads to capital asset pricing models and the concept of diversification: one should hold a basket of assets to mimic the (efficient) market portfolio in order to be exposed only to systematic risk and diversify away specific risk exposure (which according to this theory is not rewarded anyway). For further aspects of mean-variance theory, see e.g. Bodie, Kane, and Marcus (1999), Elton and Gruber (1995), Ingersoll (1986).

A third aspect is the use of quadratic loss functions. In many areas - decision-theoretic statistics, optimization theory etc. - one introduces a loss function to quantify choice between alternatives, and is guided by the principle of minimising expected loss. For background, see e.g. Robert (1997), Chapter 2 (esp. §2.5.1) (Bayesian and decision-theoretic statistics), Whittle (1996) (LQG: linear dynamics, quadratic cost, Gaussian noise).

It is as natural to think of maximising expected utility as of minimising expected loss. To quantify this, one needs to select a utility function, $u$, say. The classic treatment here is von Neumann and Morgenstern (1953); for a more recent treatment see Elton and Gruber (1995), Ingersoll (1986) (finance and investment), or Robert (1997), Chapter 2 (Bayesian statistics). Here quadratic loss

$$\min_{\theta \in \Theta} \mathbb{E}\left[(W(\theta) - L)^2\right]$$

- where $W(\cdot)$ is the wealth function and $L$ the target for final wealth, $\Theta$ an appropriate space of strategies - corresponds in the utility formulation

$$\max_{\theta \in \Theta} \mathbb{E}\left[u(W(\theta))\right]$$

to a quadratic utility function of the form $u(w) = w - cw^2$. (Note while the former seems very natural, the latter seems less so, as one needs to restrict $w$ in order to avoid negative wealth, and it implies unrealistic attitudes of investors towards absolute risk aversion, again see Elton and Gruber (1995) or Ingersoll (1986).)

Relevant here is the contrast between economic or financial theories that do, or do not, depend on the attitude to risk of individual agents. On the one hand, the assumption of non-satiation (preferring more to less) plus the condition of absence of arbitrage lead in complete markets (informally: there are at least as many traded assets as sources of risk) via the arbitrage pricing technique to pricing and hedging results including the classic Black-Scholes-Merton theory. On the other hand, incomplete market situations (more sources of risk than traded assets) are by definition characterised by the nonexistence of a perfect hedge for some securities, and thus require further assumptions on, e.g., the agent's risk preferences as expressed in his utility function or loss function to solve the hedging and pricing problems. Premium principles, well-established in insurance, might also be used in this framework; see Schweizer (2001a) for application of quadratic valuation principles, such as the variance and the standard-deviation principle.

From the mathematical point of view, one uses martingale theory, as martingales model random phenomena without systematic drift - fair games in gambling, absence of arbitrage in financial market models etc. (see Musiela and Rutkowski (1997)). A mean-variance framework as above necessitates second moments, so one uses the $L^2$-theory of martingales, in particular the Kunita-Watanabe inequalities and decomposition (Rogers and Williams (2000), IV.4, Dellacherie and Meyer (1978), VII.2). They allow one to use Hilbert space methods, and the geometric language of projections and orthogonality. This is very natural in situations where all the essential features of that situation are modelled by the mean and (co-)variances.

The classical situation of this type is, of course, the Gaussian (or multivariate normal) case, where means and variances determine - in the presence of Gaussianity - the structure. The downside is that one cannot then model other features - asymmetry and heavy tails, for instance - not present in the Gaussian case. One way to proceed is to retain means and variances - for convenience, their economic interpretation in the Markowitz case etc. - but replace the Gaussian assumption by something more general. One is led to a semi-parametric model, with means and variances forming the parametric component and a non-parametric component describing other aspects. See for instance Bingham and Kiesel (2002) and the references quoted there for background.

7.2.2 The Financial Market Model

We use the setting outlined in e.g. Heath, Platen, and Schweizer (2001), Pham, Rheinländer, and Schweizer (1998) and Schweizer (2001b), to which we also refer for further details.

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ satisfying the usual conditions of right-continuity and completeness, where $T \in (0, \infty]$ is a fixed time-horizon. All processes considered will be indexed by $t \in [0, T]$. We consider a market with $d + 1$ assets with price processes $(S_t) = (S_t^0, S_t^1, \dots, S_t^d)$ available for trading. One of the assets, $S^0$, is used as a numeraire, that is $S^0$ is assumed to be strictly positive and all other assets are discounted with $S^0$. (In order to reflect the time value of money, it is natural when following asset values over time to work in discounted terms. Indeed, we have a choice as to which numeraire - the basic asset to take as our unit of accounting - to use, and to discount this to make its value constant.) We denote the discounted assets by $(1, X)$, with $X_t = (X_t^1, \dots, X_t^d) = (S_t^1/S_t^0, \dots, S_t^d/S_t^0)$, and consider only the discounted values in the sequel.

We assume that $X$ satisfies the structure condition (SC); this means $X$ admits the decomposition

$$X = X_0 + M + A, \qquad \text{(SC)}$$

where $M \in \mathcal{M}^2_{0,loc}(\mathbb{P})$ is an $\mathbb{R}^d$-valued locally square-integrable local $\mathbb{P}$-martingale null at 0, and $A$ is an $\mathbb{R}^d$-valued adapted continuous process of finite variation null at 0. Furthermore, we denote by $\langle M \rangle = (\langle M \rangle^{ij})_{i,j=1,\dots,d} = (\langle M^i, M^j \rangle)_{i,j=1,\dots,d}$ the matrix-valued covariance process of $M$. We assume that $A$ is absolutely continuous with respect to $\langle M \rangle$, in the sense that

$$A_t^i = \left(\int_0^t d\langle M \rangle_s\,\lambda_s\right)^i = \sum_{j=1}^d \int_0^t \lambda_s^j\, d\langle M^i, M^j \rangle_s, \quad 0 \le t \le T, \; i = 1, 2, \dots, d,$$

with a predictable $\mathbb{R}^d$-valued process $\lambda$ such that the variance process of the stochastic integral $\int \lambda\, dM$, namely

$$K_t = \int_0^t \lambda_s'\, d\langle M \rangle_s\, \lambda_s = \sum_{i,j=1}^d \int_0^t \lambda_s^i \lambda_s^j\, d\langle M^i, M^j \rangle_s,$$

is $\mathbb{P}$-almost surely finite for $t \in [0, T]$. We fix an RCLL version of $K$ and call this the mean-variance tradeoff (MVT) process.

As shown in Delbaen and Schachermayer (1995a) and Schweizer (1995), the structure condition (SC) is equivalent to a rather weak form of the no-arbitrage condition (no free lunch with bounded risk). Heuristically speaking, we can say that the MVT process $K$ 'measures' the extent to which $X$ deviates from being a martingale (consult Schweizer (1994), Schweizer (1995) for a precise statement and an explanation of the terminology 'mean-variance tradeoff'). If $X$ is continuous (as it will be throughout) and admits an equivalent local martingale measure (see §7.2.3 below), the structure condition is automatically satisfied.

By construction, the $X^i$ are discounted price processes that together with the constant price process 1 form a financial market model. We now introduce as a further traded asset a European contingent claim, i.e. a random payoff at time $T$, in our market. Formally, a contingent claim $H$ is an $L^2(\mathcal{F}_T, \mathbb{P})$ random variable. A standard example would be a European call option on $X^i$ with strike $K$ and payoff $H = \max\{X_T^i - K, 0\}$. If $X$ is continuous and admits an equivalent local martingale measure (see below), the structure condition is automatically satisfied.

7.2.3 Equivalent Martingale Measures

The martingale property is evidently suitable for financial modelling because it captures unpredictability - absence of systematic movement or drift. The basic insight - due to Harrison and Kreps (1979) and Harrison and Pliska (1981) in the financial setting of Chapters 4 and 6 - is that, although discounted prices are not martingales in general, they become martingales under a suitable change of measure. One passes to an equivalent measure (same null sets: same things possible, same things impossible), under which discounted processes become martingales. Such a measure is called an equivalent martingale measure (EMM); localization may be needed - working with local martingales rather than martingales - leading to equivalent local martingale measures (ELMM). Mathematically, this change of measure technique is Girsanov's theorem, §5.7. Although this idea dates in the financial context from around 1980, it may be traced back much further in the actuarial and insurance settings. The intuition is that it amounts to shifting probability weight between outcomes - giving more weight to unfavourable ones - to change the risk-averse into a risk-neutral environment.

Returning to the mathematical formulation, we denote by $\mathcal{P}$ the set of equivalent local martingale measures (ELMM). We assume that $X$ admits an equivalent local martingale measure $\mathbb{Q} \in \mathcal{P}$, and hence that our market model is free of arbitrage opportunities. For our subsequent analysis, two equivalent martingale measures will prove to be important. We denote by

$$\mathcal{P}^2 = \left\{\mathbb{Q} \in \mathcal{P} \;\Big|\; \frac{d\mathbb{Q}}{d\mathbb{P}} \in L^2(\mathbb{P})\right\}$$

the set of all ELMMs with square-integrable density. Now define the strictly positive continuous local $\mathbb{P}$-martingale

$$\hat{Z}_t = \mathcal{E}\left(-\int \lambda\, dM\right)_t = \exp\left(-\int_0^t \lambda_s'\, dM_s - \tfrac{1}{2}K_t\right)$$

(this is the stochastic exponential of Theorem 5.10.4; see Musiela and Rutkowski (1997), §10.1.4 and recall the definition of $K$). If $\hat{Z}$ is a square-integrable $\mathbb{P}$-martingale, then

$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} := \hat{Z}_T \in L^2(\mathbb{P})$$

defines an equivalent probability measure $\hat{\mathbb{P}} \approx \mathbb{P}$ under which $X$ is a local martingale, i.e. $\hat{\mathbb{P}} \in \mathcal{P}^2$. $\hat{\mathbb{P}}$ is called the minimal equivalent local martingale measure for $X$ (see Föllmer and Schweizer (1991), Schweizer (1995)). As our second important ELMM we need the variance-optimal ELMM $\tilde{\mathbb{P}}$. The formal definition is

Definition 7.2.1. The variance-optimal ELMM $\tilde{\mathbb{P}}$ is the unique element in $\mathcal{P}^2$ that minimizes

$$\mathbb{E}\left[\left(\frac{d\mathbb{Q}}{d\mathbb{P}}\right)^2\right] = 1 + \mathrm{Var}_{\mathbb{P}}\left[\frac{d\mathbb{Q}}{d\mathbb{P}}\right]$$

over all $\mathbb{Q} \in \mathcal{P}^2$.

The existence of $\tilde{\mathbb{P}}$ for continuous $X$ was shown by Delbaen and Schachermayer (1996).

As mentioned above, the first formal definition of the minimal martingale measure was given in Föllmer and Schweizer (1991), where the terminology was motivated by the fact that a change of measure to the minimal martingale measure disturbs the overall martingale and orthogonality structure as little as possible. Subsequently the minimal martingale measure was found to be very useful in a number of hedging and pricing applications (see Schweizer (2001b) for discussion and references). Furthermore Schweizer (1995) provides several characterizations of the minimal martingale measure in terms of minimizing certain functionals over suitable classes of (signed) equivalent local martingale measures (see also Schweizer (1999)).

7.2.4 Hedging Contingent Claims

We now turn to the problem of hedging, that is, covering oneself against future losses associated with possession, or sale, of a contingent claim. The problem of hedging requires the introduction of trading strategies, i.e. selection of a covering portfolio in the underlying assets, and loss functions, i.e. how and when to value the success of a strategy. Informally, we can think of this problem as follows. By using our initial capital and following a certain trading strategy, we can generate a value process, which we use to cover our exposure. That is, at time $T$ we have a certain value $V_T$ from trading; we need to cover $H$ and evaluate the difference $L_T = H - V_T$. Rewriting this, we get a decomposition

$$H = V_T + L_T. \qquad (7.2)$$

As we shall see, decompositions like this under different measures play a decisive role in the solution of various hedging problems.
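A one-period, finite-state analogue may help fix ideas about decomposition (7.2). The sketch below is an illustration under stated assumptions, not the continuous-time construction: the price increment `dx` is taken to have mean zero under the probabilities used (so the discounted price is a martingale in this toy model), the hedge consists of an initial capital `v0` plus a constant position `theta`, and all numbers are hypothetical. Projecting the claim on the constant and on the increment gives a residual that has mean zero and is uncorrelated with the gains from trading.

```python
# Hypothetical one-period trinomial model; three states and one risky asset,
# so the market is incomplete and a residual risk remains.
probs = [0.25, 0.5, 0.25]
dx = [-10.0, 0.0, 10.0]          # increment X_T - X_0, mean zero by assumption
claim = [0.0, 0.0, 10.0]         # payoff H of a call struck at X_0


def expect(values):
    """Expectation of a state-dependent quantity."""
    return sum(p * v for p, v in zip(probs, values))


def quadratic_hedge(h, increments):
    """Least-squares decomposition H = v0 + theta * dX + L in one period."""
    v0 = expect(h)
    cov = expect([(hi - v0) * d for hi, d in zip(h, increments)])
    var = expect([d * d for d in increments])
    theta = cov / var
    residual = [hi - v0 - theta * d for hi, d in zip(h, increments)]
    return v0, theta, residual


if __name__ == "__main__":
    v0, theta, residual = quadratic_hedge(claim, dx)
    print(v0, theta, residual)
    # Both quantities below should vanish up to rounding.
    print(expect(residual), expect([l * d for l, d in zip(residual, dx)]))
```

Here `v0 + theta * dx` plays the role of $V_T$ and `residual` that of $L_T$; the residual is nonzero because the two traded assets cannot span three states.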



The corresponding mathematical tool in our setting is the basic decomposition for $L^2$-martingales, the (Galtchouk-)Kunita-Watanabe decomposition (see Rogers and Williams (2000), IV.4). $L^2$-martingales $M$, $N$ null at 0 are strongly orthogonal if $\mathbb E(M_T N_T) = 0$ for all stopping times $T$ (equivalently, if $MN$ is a uniformly integrable martingale). Such an $M$ possesses an orthogonal decomposition into its continuous part and its purely discontinuous part. In the financial context the continuous part is the stochastic integral, seen as a gains from trading process, and the purely discontinuous part orthogonal to it is the unhedgeable risk $L$.

Turning to the problem of exact definitions of trading strategies, we note that $\mathcal P \neq \emptyset$ implies that $X$ is a semimartingale under $\mathbb P$. Thus we can introduce stochastic integrals with respect to $X$ and associate the integrands with trading strategies.

Denote by $L(X)$ the linear space of all $\mathbb R^d$-valued predictable $X$-integrable processes $\vartheta$.

Definition 7.2.2. A self-financing trading strategy is any pair $(V_0, \vartheta)$ such that $V_0$ is an $\mathcal F_0$-measurable random variable and $\vartheta \in L(X)$.

We can think of $\vartheta_t^i$ as the number of shares of asset $i$ held at time $t$ and of $V_0$ as the initial capital of an investor. We associate with the strategy a value process $V_t(V_0, \vartheta)$ given by

$$V_t(V_0, \vartheta) := V_0 + \int_0^t \vartheta_u\, dX_u. \tag{7.3}$$

Here, we call $G_t(\vartheta) = \int_0^t \vartheta_u\, dX_u$ the gains from trading process. The process is self-financing since, as soon as the initial capital is fixed, no further in- or outflow of funds is needed.

We now turn to the problem of identifying the risk associated with the hedging problem, focusing first on the problem of covering our exposure to $H$ at time $T$. Starting with an initial capital $V_0$ and following a trading strategy $\vartheta$, we obtain a final wealth $V_T(V_0, \vartheta)$. Our assessment of the quality of our trading strategy for the problem at hand will thus rely on the difference between these quantities. Several approaches are possible. We could use a quadratic loss function; this leads to the notion of mean-variance hedging, and is described in further detail below. Other possibilities are Value-at-Risk (VaR) principles to cover against extreme losses, as in Föllmer and Leukert (1999) and Föllmer and Leukert (2000), or value-preserving strategies as in Korn (1997c). In order to be able to consider quadratic loss functions, we restrict the class of trading strategies using

Definition 7.2.3. Let $\Theta$ be the space of all $\vartheta \in L(X)$ for which the gains from trading process satisfies $G_T(\vartheta) \in L^2(\mathbb P)$ and $G_t(\vartheta)$ is a $Q$-martingale for each $Q \in \mathcal P^2$. A pair $(V_0, \vartheta)$ such that $V_0 \in \mathbb R$ and $\vartheta \in \Theta$ is called a mean-variance optimal strategy for $H$ if it solves the optimization problem

$$\text{minimize } \mathbb E\left[\left(H - V_0 - G_T(\vartheta)\right)^2\right] \tag{7.4}$$

over $\vartheta \in \Theta$.
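To make the quadratic criterion concrete, the following is a minimal sketch of a one-period, finite-scenario analogue of the mean-variance problem. It is an illustration under stated assumptions, not the continuous-time solution discussed in the text: the function name `mean_variance_hedge`, the three scenarios and their probabilities are all hypothetical. In one period, minimizing the expected squared hedging error over an initial capital and a single position reduces to a least-squares regression of the claim on the price increment.

```python
def mean_variance_hedge(probs, dX, H):
    """One-period analogue: minimise E[(H - V0 - theta * dX)^2] over (V0, theta).

    probs : scenario probabilities (summing to one)
    dX    : price increment X_T - X_0 in each scenario
    H     : payoff of the claim in each scenario
    """
    def mean(values):
        return sum(p * v for p, v in zip(probs, values))

    m_x, m_h = mean(dX), mean(H)
    cov = mean([(x - m_x) * (h - m_h) for x, h in zip(dX, H)])
    var = mean([(x - m_x) ** 2 for x in dX])
    theta = cov / var                 # least-squares slope = optimal position
    v0 = m_h - theta * m_x            # least-squares intercept = optimal capital
    residual = [h - v0 - theta * x for x, h in zip(dX, H)]
    return v0, theta, mean([r * r for r in residual])


# Hypothetical example: call struck at the initial price, three scenarios.
probs = [0.25, 0.5, 0.25]
dX = [-10.0, 0.0, 10.0]
H = [0.0, 0.0, 10.0]
v0, theta, mse = mean_variance_hedge(probs, dX, H)
```

The residual mean squared error returned by the sketch plays the role of the unhedgeable part $L_T$ in the decomposition (7.2): it is what remains after the best linear combination of capital and gains from trading has been removed.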

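The solution of the problem can be given in terms of a decomposition (7.2) under the variance-optimal measure $\tilde{\mathbb P}$. From Gourieroux, Laurent, and Pham (1998) we know

$$\tilde Z_t := \tilde{\mathbb E}\left[\frac{d\tilde{\mathbb P}}{d\mathbb P}\,\Big|\,\mathcal F_t\right] = \tilde Z_0 + \int_0^t \tilde\zeta_u\, dX_u$$

for some $\tilde\zeta \in \Theta$. We have (see Schweizer (2001b)):

Theorem 7.2.1. Let $H \in L^2(\mathbb P)$ be a contingent claim and write the Galtchouk-Kunita-Watanabe decomposition of $H$ under $\tilde{\mathbb P}$ with respect to $X$ as

$$H = \tilde{\mathbb E}(H) + \int_0^T \xi_s^{H,\tilde{\mathbb P}}\, dX_s + L_T^{H,\tilde{\mathbb P}} = V_T^{H,\tilde{\mathbb P}}, \tag{7.5}$$

with

$$V_t^{H,\tilde{\mathbb P}} := \tilde{\mathbb E}[H \mid \mathcal F_t] = \tilde{\mathbb E}(H) + \int_0^t \xi_s^{H,\tilde{\mathbb P}}\, dX_s + L_t^{H,\tilde{\mathbb P}}.$$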

Then the mean-variance optimal strategy for H is given by Vo* iJ * t

iE(H)

- cH,jp (v.H,jp

=

0, solve the optimization problem (7. 18) c

=

c

(G
2 and P solved by a Hilbert-space projection argument: cp * E P is optimal if and only if IE [(X - c - G

To evaluate the first term, we apply the factorization formula with $k = 1$ and $h = \theta^*$ and get

$$\mathbb E\left[S(T)\mathbf 1_{\{S(T)>K\}};\theta^*\right] = \mathbb E\left[S(T);\theta^*\right]\mathbb P\left[S(T)>K;\theta^*+1\right] = \mathbb E\left[e^{-rT}S(T);\theta^*\right]e^{rT}\,\mathbb P\left[S(T)>K;\theta^*+1\right] = S(0)e^{rT}\,\mathbb P\left[S(T)>K;\theta^*+1\right],$$

where we used the martingale property of $e^{-rt}S(t)$ under the risk-neutral Esscher measure for the last step. Now the pricing formula for the European call becomes

$$S(0)\,\mathbb P\left[S(T)>K;\theta^*+1\right] - e^{-rT}K\,\mathbb P\left[S(T)>K;\theta^*\right]. \tag{7.48}$$

We can now use formula (7.48) to compute the value of a European call with strike $K$ and maturity $T$. Denote the density of $\mathcal L(Z^{\zeta,\delta}(t))$ by $f_t^{\zeta,\delta}$ (compare (2.8) for the exact form). Then

$$\mathbb E\left[e^{-rT}(S_T-K)^+;\theta^*\right] = S(0)\int_c^\infty f_T^{\zeta,\delta}(x;\theta^*+1)\,dx - e^{-rT}K\int_c^\infty f_T^{\zeta,\delta}(x;\theta^*)\,dx, \tag{7.49}$$



where $c = \log(K/S(0))$.

Eberlein and Keller (1995) and Eberlein, Keller, and Prause (1998) compare option prices obtained from the Black-Scholes model and prices found using the hyperbolic model with market prices. They find that the hyperbolic model provides very accurate prices and a reduction of the smile effect observed in the Black-Scholes model.

8. Interest Rate Theory

In this chapter, we apply the techniques developed in the previous chapters to the fast-growing fixed-income securities market. We mainly focus on the continuous-time model (since the available tools from stochastic calculus allow an elegant presentation) and comment on the discrete-time analogue (Jarrow (1996) gives a splendid account of discrete-time models). As we want to develop a relative pricing theory, based on the no-arbitrage assumption, we will assume prices of some underlying objects as given. In the present context we take zero-coupon bonds as the building blocks of our theory. In doing so we face the additional modelling restriction that the value of a zero-coupon bond at time of maturity is predetermined (= 1). Furthermore, since the entirety of fixed-income securities gives rise to the term structure of interest rates (sometimes called the yield curve), which describes the relationship between the yield-to-maturity and the maturity of a given fixed-income security, we face the further task of calibrating our model to a whole continuum of initial values (and not just to a vector of prices).

A first attempt at explaining the behaviour of the yield curve is in terms of a continuum of spot rates of maturities between $\underline T$ and $\bar T$, where $\underline T$ is the shortest (instantaneous) lending/borrowing period, and $\bar T$ the longest maturity of interest. We model these rates as correlated stochastic variables with the degree of correlation decreasing in terms of the difference in maturity. Discretizing the maturity spectrum, we are tempted to start with a generalized Black-Scholes model (as in §6.2)

$$dr_i(t) = a_i(t)\,dt + \sum_{j=1}^d b_{ij}(t)\,dW_j(t), \qquad i = 1, \dots, d,$$

with $W = (W_1, \dots, W_d)$ a standard $d$-dimensional Brownian motion. The degree of (instantaneous) correlation of the different rates can then be described in terms of their covariation

$$d\langle r_i, r_j\rangle(t) = \sum_{k=1}^d b_{ik}(t)b_{jk}(t)\,dt = \rho_{ij}(t)\,dt.$$
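As a small numerical illustration of the covariation formula above, the sketch below computes the coefficients $\rho_{ij} = \sum_k b_{ik}b_{jk}$ from a loading matrix and then normalizes them to correlations. The loading matrix `b` is a made-up two-rate example, and the normalization step is an addition for illustration that the text does not spell out.

```python
import math


def instantaneous_covariance(b):
    """Entries sum_k b[i][k] * b[j][k], i.e. the dt-coefficient of d<r_i, r_j>."""
    d = len(b)
    return [
        [sum(bik * bjk for bik, bjk in zip(b[i], b[j])) for j in range(d)]
        for i in range(d)
    ]


def correlation(cov):
    """Normalise a covariance matrix to a correlation matrix."""
    d = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(d)]
        for i in range(d)
    ]


# Hypothetical loadings: two rates driven mostly by the first Brownian factor.
b = [[0.010, 0.000],
     [0.009, 0.002]]
cov = instantaneous_covariance(b)
corr = correlation(cov)
```

With these loadings the off-diagonal correlation comes out close to one, because both rates load mainly on the same factor.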

Empirical investigations into the degree of correlation show that the correlations implied by the $\rho_{ij}$ are usually very large (close to 1). In fact, using principal component analysis (Mardia, Kent, and Bibby (1979), Chapter 8, Krzanowski (1988), §2.2) one can show that the first principal component often explains 80-90% of the total variance, and that the first three components taken together describe up to 90-95% of the total variance. Therefore, especially in early approaches, it has been tempting to choose one specific rate, usually the short rate $r(t)$, as a proxy for the single variable that principal component analysis indicates can best describe the movements of the yield curve. The standard model is then of the form

$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t). \tag{8.1}$$
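A short-rate model of the form (8.1) can be explored numerically with an Euler-Maruyama scheme. The sketch below is only an illustration: the function name `simulate_short_rate` and the mean-reverting coefficients (`kappa`, `theta`, `sigma`) are hypothetical choices, not a specification taken from the text.

```python
import math
import random


def simulate_short_rate(a, b, r0, horizon, n_steps, rng):
    """Euler-Maruyama path for dr = a(t, r) dt + b(t, r) dW on [0, horizon]."""
    dt = horizon / n_steps
    r, t, path = r0, 0.0, [r0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))      # Brownian increment over dt
        r += a(t, r) * dt + b(t, r) * dW        # coefficients use the old state
        t += dt
        path.append(r)
    return path


# Hypothetical coefficients: mean reversion to theta with constant volatility.
kappa, theta, sigma = 0.5, 0.04, 0.01


def drift(t, r):
    return kappa * (theta - r)


def vol(t, r):
    return sigma


path = simulate_short_rate(drift, vol, 0.02, 5.0, 500, random.Random(0))
```

Setting the volatility to zero turns the scheme into the explicit Euler method for the drift ODE, which gives a simple sanity check against the closed-form solution of that ODE.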

After developing the general features of a model using zero-coupon bonds as building blocks, based on the general theory in §6.1 and §6.2, we will discuss various such models of form (8.1), taking the short rate as state variable, and explore their implications and shortcomings in §8.2. By contrast, the modern, evolutionary approach pioneered by Heath, Jarrow, and Morton (1992) takes a continuum of instantaneous forward rates as the building blocks to describe the dynamics of the whole term structure. We will discuss this approach in detail in §8.3. Finally, we apply the different models to contingent claim pricing in §8.4. Modern approaches to the pricing of interest-rate derivatives are discussed in §8.5 and §8.6.

The structure of our approach below follows the treatment of Björk (1997). For additional textbook accounts, see Part II of Musiela and Rutkowski (1997) and Brigo and Mercurio (2001).

8.1 The Bond Market

8.1.1 The Term Structure of Interest Rates

We start with a heuristic discussion, which we will formalize in the following section. The main traded objects we consider are zero-coupon bonds. A zero-coupon bond is a bond that has no coupon payments. The price of a zero-coupon bond at time $t$ that pays, say, a sure £1 at time $T \ge t$ is denoted $p(t, T)$. All zero-coupon bonds are assumed to be default-free and have strictly positive prices. Various different interest rates are defined in connection with zero-coupon bonds, but we will only consider continuously compounded interest rates (which facilitates theoretical considerations).

Using the arbitrage pricing technique, we easily obtain pricing formulas for coupon bonds. Coupon bonds are bonds with regular interest payments, called coupons, plus a principal repayment at maturity. Let $c_j$ be the payments at times $t_j$, $j = 1, \dots, n$, and $F$ be the face value paid at time $t_n$. Then the price of the coupon bond $B_c$ must satisfy

$$B_c = \sum_{j=1}^n c_j\, p(0, t_j) + F\, p(0, t_n). \tag{8.2}$$
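Formula (8.2) translates directly into code. The sketch below prices a coupon bond from a zero-coupon discount function; the function name `coupon_bond_price`, the flat 5% discount curve and the cash flows are illustrative assumptions.

```python
import math


def coupon_bond_price(coupons, times, face, discount):
    """Price as in (8.2): sum_j c_j * p(0, t_j) + F * p(0, t_n).

    discount(t) is assumed to return the zero-coupon bond price p(0, t).
    """
    coupon_leg = sum(c * discount(t) for c, t in zip(coupons, times))
    return coupon_leg + face * discount(times[-1])


def flat_curve(t):
    """Hypothetical discount curve with a flat continuously compounded 5% rate."""
    return math.exp(-0.05 * t)


price = coupon_bond_price([4.0, 4.0, 4.0], [1.0, 2.0, 3.0], 100.0, flat_curve)
```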


Hence, we see that a coupon bond is equivalent to a portfolio of zero-coupon bonds.

The yield-to-maturity is defined as an interest rate per annum that equates the present value of future cash flows to the current market value. Using continuous compounding, the yield-to-maturity $y_c$ is defined by the relation

$$B_c = \sum_{j=1}^n c_j \exp\{-y_c t_j\} + F \exp\{-y_c t_n\}.$$

If for instance the $t_j$, $j = 1, \dots, n$, are expressed in years, then $y_c$ is an annual continuously compounded yield-to-maturity (with the continuously compounded annual interest rate $r(T)$ defined by the relation $p(0, T) = \exp\{-r(T)(T/365)\}$).

The term structure of interest rates is defined as the relationship between the yield-to-maturity on a zero-coupon bond and the bond's maturity. Normally, this yields an upward sloping curve (Fig. 8.1), but flat and downward sloping curves have also been observed.

Fig. 8.1. Yield curve (yield plotted against maturity)
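The defining relation for the yield-to-maturity has no closed-form solution in general, but since the present value is strictly decreasing in the yield for positive cash flows, a bisection search recovers it. The following is a minimal sketch; the function name, the search bracket `[lo, hi]` and the example bond are assumptions made for illustration.

```python
import math


def yield_to_maturity(price, coupons, times, face, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve price = sum_j c_j exp(-y t_j) + F exp(-y t_n) for y by bisection."""

    def present_value(y):
        coupon_leg = sum(c * math.exp(-y * t) for c, t in zip(coupons, times))
        return coupon_leg + face * math.exp(-y * times[-1])

    # present_value is strictly decreasing in y when all cash flows are positive
    if not (present_value(hi) <= price <= present_value(lo)):
        raise ValueError("yield is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if present_value(mid) > price:
            lo = mid          # present value too high: the yield must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical bond priced off a flat 5% curve; the recovered yield is then 5%.
coupons, times, face = [4.0, 4.0, 4.0], [1.0, 2.0, 3.0], 100.0
price = sum(c * math.exp(-0.05 * t) for c, t in zip(coupons, times)) \
    + face * math.exp(-0.05 * times[-1])
y = yield_to_maturity(price, coupons, times, face)
```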

In constructing the term structure of interest rates, we face the additional problem that in most economies no zero-coupon bonds with maturity greater than one year are traded (in the USA, Treasury bills are only traded with maturity up to one year). We can, however, use prices of coupon bonds and invert formula (8.2) for zero-coupon prices. In practice, additional complications arise, since the maturities of coupon bonds are not equally spaced and trading in bonds with some maturities may be too thin to give reliable prices. We refer the reader to Edwards and Ma (1992) and Jarrow and Turnbull (2000) for further discussion of these issues.
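In the idealized case where the complications just mentioned are absent, inverting (8.2) is a triangular solve, often called bootstrapping. The sketch below assumes a hypothetical set of bonds in which bond $k$ pays a coupon at each of the dates $1, \dots, k$ and its face value at date $k$; the names and numbers are illustrative only.

```python
def bootstrap_zero_prices(bond_prices, coupons, face=1.0):
    """Recover p(0, 1), ..., p(0, n) from coupon bond prices via (8.2).

    Bond k (k = 1..n) is assumed to pay coupons[k-1] at dates 1..k and the
    face value at date k, so that
        price_k = c_k * (p_1 + ... + p_{k-1}) + (c_k + face) * p_k.
    """
    zeros = []
    for price, c in zip(bond_prices, coupons):
        known_part = c * sum(zeros)              # coupons at already solved dates
        zeros.append((price - known_part) / (c + face))
    return zeros


# Round trip on made-up data: build prices from known zeros, then recover them.
true_zeros = [0.97, 0.93, 0.88]
coupons = [0.03, 0.04, 0.05]
prices = [c * sum(true_zeros[: k + 1]) + true_zeros[k] for k, c in enumerate(coupons)]
recovered = bootstrap_zero_prices(prices, coupons)
```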

8.1.2 Mathematical Modelling

We use a standard setting as described in §6.1 and §6.2 (we follow Björk (1997) and El Karoui, Myneni, and Viswanathan (1992a)). Let $(\Omega, \mathcal F, \mathbb P, \mathbb F)$ be a filtered probability space with a filtration $\mathbb F = (\mathcal F_t)_{t \le T^*}$ satisfying the usual conditions (used to model the flow of information) and fix a terminal time horizon $T^*$. We assume that all processes are defined on this probability space. The basic building blocks for our relative pricing approach, zero-coupon bonds, are defined as follows.

Definition 8.1.1. A zero-coupon bond with maturity date $T$, also called a $T$-bond, is a contract that guarantees the holder a cash payment of one unit on the date $T$. The price at time $t$ of a bond with maturity date $T$ is denoted by $p(t, T)$.

Obviously we have $p(t, t) = 1$ for all $t$. We shall assume that the price process $p(t, T)$, $t \in [0, T]$, is adapted and strictly positive and that for every fixed $t$, $p(t, T)$ is continuously differentiable in the $T$ variable.

Based on arbitrage considerations (recall our basic aim is to construct a market model that is free of arbitrage), we now define several risk-free interest rates. Given three dates $t < T_1 < T_2$ the basic question is: what is the risk-free rate of return, determined at the contract time $t$, over the interval $[T_1, T_2]$ of an investment of 1 at time $T_1$? To answer this question we consider the arbitrage Table 8.1 below (compare §1.3 for the use of arbitrage tables).

Time             t                                     T_1          T_2
Action           Sell one T_1-bond,                    Pay out 1    Receive p(t,T_1)/p(t,T_2)
                 buy p(t,T_1)/p(t,T_2) T_2-bonds
Net investment   0                                     -1           +p(t,T_1)/p(t,T_2)

Table 8.1. Arbitrage table for forward rates

To exclude arbitrage opportunities, the equivalent constant rate of interest $R$ over this period (we pay out 1 at time $T_1$ and receive $e^{R(T_2 - T_1)}$ at $T_2$) has thus to be given by

$$e^{R(T_2 - T_1)} = \frac{p(t, T_1)}{p(t, T_2)}.$$

We formalize this in:

Definition 8.1.2. (i) The forward rate for the period $[T_1, T_2]$ as seen at time $t$ is defined as

$$R(t; T_1, T_2) = -\frac{\log p(t, T_2) - \log p(t, T_1)}{T_2 - T_1}.$$

(ii) The spot rate $R(T_1, T_2)$ for the period $[T_1, T_2]$ is defined as

$$R(T_1, T_2) = R(T_1; T_1, T_2).$$

(iii) The instantaneous forward rate with maturity $T$, at time $t$, is defined by

$$f(t, T) = -\frac{\partial \log p(t, T)}{\partial T}.$$

(iv) The instantaneous short rate at time $t$ is defined by

$$r(t) = f(t, t).$$
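The four rates of Definition 8.1.2 can all be read off a bond-price function. The sketch below does this with finite differences for the two instantaneous rates; the step size `h` and the quadratic-exponent curve `p` are hypothetical choices used only to produce checkable numbers.

```python
import math


def forward_rate(p, t, T1, T2):
    """R(t; T1, T2) = -(log p(t, T2) - log p(t, T1)) / (T2 - T1)."""
    return -(math.log(p(t, T2)) - math.log(p(t, T1))) / (T2 - T1)


def spot_rate(p, T1, T2):
    """Spot rate: the forward rate contracted at t = T1."""
    return forward_rate(p, T1, T1, T2)


def instantaneous_forward(p, t, T, h=1e-5):
    """f(t, T) = -d/dT log p(t, T), central difference (needs T - h >= t)."""
    return -(math.log(p(t, T + h)) - math.log(p(t, T - h))) / (2.0 * h)


def short_rate(p, t, h=1e-5):
    """r(t) = f(t, t); one-sided difference because p(t, T) needs T >= t."""
    return -(math.log(p(t, t + h)) - math.log(p(t, t))) / h


def p(t, T):
    """Hypothetical curve with f(t, T) = 0.03 + 0.01 * (T - t)."""
    tau = T - t
    return math.exp(-(0.03 * tau + 0.005 * tau * tau))


R_fwd = forward_rate(p, 0.0, 1.0, 2.0)       # average of f(0, u) over [1, 2]
R_spot = spot_rate(p, 1.0, 2.0)
f_01 = instantaneous_forward(p, 0.0, 1.0)
r_0 = short_rate(p, 0.0)
```

For this curve the forward rate over $[1, 2]$ seen from time 0 is the average of the instantaneous forward rates over that interval, which is consistent with the integral representation of bond prices in terms of $f(t, \cdot)$.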

Definition 8.1.3. The money account process is defined by

$$B(t) = \exp\left\{\int_0^t r(s)\,ds\right\}.$$

The interpretation of the money market account is a strategy of instantaneously reinvesting at the current short rate. An immediate consequence of the definitions is

Lemma 8.1.1. For $t \le s \le T$ we have

$$p(t, T) = p(t, s)\exp\left\{-\int_s^T f(t, u)\,du\right\},$$

and in particular

$$p(t, T) = \exp\left\{-\int_t^T f(t, s)\,ds\right\}.$$

In what follows, we model the above processes in a generalized Black-Scholes framework. That is, we assume that $W = (W_1, \dots, W_d)$ is a standard $d$-dimensional Brownian motion and the filtration $\mathbb F$ is the augmentation of the filtration generated by $W(t)$. The dynamics of the various processes are given as follows:

Short-rate Dynamics:

$$dr(t) = a(t)\,dt + b(t)\,dW(t), \tag{8.3}$$

Bond-price Dynamics:

$$dp(t, T) = p(t, T)\{m(t, T)\,dt + v(t, T)\,dW(t)\}, \tag{8.4}$$

Forward-rate Dynamics:

$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t). \tag{8.5}$$

We assume that in the above formulas, the coefficients meet standard conditions required to guarantee the existence of the various processes, that is, existence of solutions of the various stochastic differential equations; see §5.8. Furthermore, we assume that the processes are smooth enough to allow differentiation and certain operations involving changing of order of integration and differentiation. Since the actual conditions are rather technical, we refer the reader to Björk (1997), Heath, Jarrow, and Morton (1992) and Protter (2004) (the latter reference for the stochastic Fubini theorem) for these conditions.

Following Björk (1997) for formulation and proof, we now give a small toolbox for the relationships between the processes specified above.

Proposition 8.1.1. (i) If $p(t, T)$ satisfies (8.4), then for the forward-rate dynamics we have

$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t),$$

where $\alpha$ and $\sigma$ are given by

$$\alpha(t, T) = v_T(t, T)\,v(t, T) - m_T(t, T), \qquad \sigma(t, T) = -v_T(t, T).$$

(ii) If $f(t, T)$ satisfies (8.5), then the short rate satisfies

$$dr(t) = a(t)\,dt + b(t)\,dW(t),$$

where $a$ and $b$ are given by

$$a(t) = f_T(t, t) + \alpha(t, t), \qquad b(t) = \sigma(t, t). \tag{8.6}$$

(iii) If $f(t, T)$ satisfies (8.5), then $p(t, T)$ satisfies

$$dp(t, T) = p(t, T)\left\{\left(r(t) + A(t, T) + \tfrac12 \|S(t, T)\|^2\right)dt + S(t, T)\,dW(t)\right\},$$

where

$$A(t, T) = -\int_t^T \alpha(t, s)\,ds, \qquad S(t, T) = -\int_t^T \sigma(t, s)\,ds. \tag{8.7}$$
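As a quick worked instance of parts (ii) and (iii), consider the simplest possible specification of (8.5): one driving Brownian motion and constant coefficients. The constants $\alpha$ and $\sigma$ below are assumptions chosen for illustration, not a model proposed in the text.

```latex
% Assumption: d = 1, \alpha(t,T) \equiv \alpha and \sigma(t,T) \equiv \sigma constant.
\begin{align*}
A(t,T) &= -\int_t^T \alpha\,ds = -\alpha\,(T-t), &
S(t,T) &= -\int_t^T \sigma\,ds = -\sigma\,(T-t),\\
\frac{dp(t,T)}{p(t,T)} &= \Big(r(t) - \alpha\,(T-t) + \tfrac12\sigma^2 (T-t)^2\Big)\,dt
  - \sigma\,(T-t)\,dW(t), \\
dr(t) &= \big(f_T(t,t) + \alpha\big)\,dt + \sigma\,dW(t).
\end{align*}
```

The bond volatility $-\sigma(T-t)$ vanishes as $t \to T$, in line with the boundary condition $p(T, T) = 1$.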

Proof. To prove (i) we only have to apply Itô's formula to the defining equation for the forward rates and we leave this as Exercise 8.7 below.

To prove (ii) we start by integrating the forward-rate dynamics. This leads to

$$f(t, t) = r(t) = f(0, t) + \int_0^t \alpha(s, t)\,ds + \int_0^t \sigma(s, t)\,dW(s). \tag{8.8}$$

Writing also $\alpha$ and $\sigma$ in integrated form

$$\alpha(s, t) = \alpha(s, s) + \int_s^t \alpha_T(s, u)\,du, \qquad \sigma(s, t) = \sigma(s, s) + \int_s^t \sigma_T(s, u)\,du,$$

and inserting this into (8.8), we find

$$r(t) = f(0, t) + \int_0^t \alpha(s, s)\,ds + \int_0^t\!\!\int_s^t \alpha_T(s, u)\,du\,ds + \int_0^t \sigma(s, s)\,dW(s) + \int_0^t\!\!\int_s^t \sigma_T(s, u)\,du\,dW(s).$$

After changing the order of integration we can identify terms to establish (ii).

For (iii) we use a technique from Heath, Jarrow, and Morton (1992). By the definition of the forward rates we may write the bond-price process as

$$p(t, T) = \exp\{Z(t, T)\},$$

where $Z$ is given by

$$Z(t, T) = -\int_t^T f(t, s)\,ds. \tag{8.9}$$

Again we write (8.5) in integrated form:

$$f(t, s) = f(0, s) + \int_0^t \alpha(u, s)\,du + \int_0^t \sigma(u, s)\,dW(u).$$

We insert the integrated form in (8.9) to get

$$Z(t, T) = -\int_t^T f(0, s)\,ds - \int_0^t\!\!\int_t^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_t^T \sigma(u, s)\,ds\,dW(u).$$

Now, splitting the integrals and changing the order of integration gives us

$$\begin{aligned}
Z(t, T) &= -\int_0^T f(0, s)\,ds - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u) \\
&\quad + \int_0^t f(0, s)\,ds + \int_0^t\!\!\int_u^t \alpha(u, s)\,ds\,du + \int_0^t\!\!\int_u^t \sigma(u, s)\,ds\,dW(u) \\
&= Z(0, T) - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u) \\
&\quad + \int_0^t f(0, s)\,ds + \int_0^t\!\!\int_0^s \alpha(u, s)\,du\,ds + \int_0^t\!\!\int_0^s \sigma(u, s)\,dW(u)\,ds.
\end{aligned}$$

The integrand of the last line is just the integrated form of the forward-rate dynamics (8.5) over the interval $[0, s]$. Since $r(s) = f(s, s)$, this last line equals $\int_0^t r(s)\,ds$. So we obtain

$$Z(t, T) = Z(0, T) + \int_0^t r(s)\,ds - \int_0^t\!\!\int_u^T \alpha(u, s)\,ds\,du - \int_0^t\!\!\int_u^T \sigma(u, s)\,ds\,dW(u).$$

Using $A$ and $S$ from (8.7), the stochastic differential of $Z$ is given by

$$dZ(t, T) = (r(t) + A(t, T))\,dt + S(t, T)\,dW(t).$$

Now we can apply Itô's lemma to the process $p(t, T) = \exp\{Z(t, T)\}$ to complete the proof. $\square$

8.1.3 Bond Pricing, Martingale Measures and Trading Strategies

We will now examine the mathematical structure of our bond-market model in more detail. As usual, our first task is to find a convenient characterization of the no-arbitrage assumption. By Theorem 6.1.1, absence of arbitrage is guaranteed by the existence of an equivalent martingale measure $Q$. Recall that by definition an equivalent martingale measure has to satisfy $Q \sim \mathbb P$ and the discounted price processes (with respect to a suitable numeraire) of the basic securities have to be local $Q$-martingales. For the bond market this implies that the discounted prices of all zero-coupon bonds with maturities $0 \le T \le T^*$ have to be local martingales. More precisely, taking the risk-free bank account $B(t)$ as numeraire we have

Definition 8.1.4. A measure $Q \sim \mathbb P$ defined on $(\Omega, \mathcal F, \mathbb P)$ is an equivalent martingale measure for the bond market if, for every fixed $0 \le T \le T^*$, the process

$$\frac{p(t, T)}{B(t)}, \qquad 0 \le t \le T,$$

is a local $Q$-martingale.

Assume now that there exists at least one equivalent martingale measure, say $Q$. Defining contingent claims as $\mathcal F_T$-measurable random variables $X$ such that $X/B(T) \in L^1(\mathcal F_T, Q)$ with some $0 \le T \le T^*$ (notation: $T$-contingent claims), we can use the risk-neutral valuation principle (6.1) to obtain:

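Proposition 8.1.2. Consider a $T$-contingent claim $X$. Then the price process $\Pi_X(t)$, $0 \le t \le T$, of the contingent claim is given by

$$\Pi_X(t) = B(t)\,\mathbb E_Q\left[\frac{X}{B(T)}\,\Big|\,\mathcal F_t\right] = \mathbb E_Q\left[X e^{-\int_t^T r(s)\,ds}\,\Big|\,\mathcal F_t\right].$$

In particular, the price process of a zero-coupon bond with maturity $T$ is given by

$$p(t, T) = \mathbb E_Q\left[e^{-\int_t^T r(s)\,ds}\,\Big|\,\mathcal F_t\right].$$

Proof. We just have to apply Theorem 6.1.4.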
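The zero-coupon bond formula of Proposition 8.1.2 lends itself to a Monte Carlo check. The sketch below assumes, purely for illustration, the driftless Gaussian short-rate dynamics $dr(t) = \sigma\,dW(t)$ under $Q$; this toy model is not taken from the text, but it has the closed-form bond price $p(0, T) = \exp\{-r_0 T + \sigma^2 T^3/6\}$, which makes the estimate easy to verify. The function name and parameter values are hypothetical.

```python
import math
import random


def mc_zero_bond(r0, sigma, T, n_steps, n_paths, rng):
    """Estimate p(0, T) = E_Q[exp(-int_0^T r(s) ds)] for dr = sigma dW under Q."""
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            integral += r * dt                      # left-point rule for int r ds
            r += sigma * rng.gauss(0.0, sqrt_dt)    # Euler step of the short rate
        total += math.exp(-integral)                # discount factor on this path
    return total / n_paths


estimate = mc_zero_bond(0.03, 0.01, 1.0, 50, 5000, random.Random(42))
closed_form = math.exp(-0.03 * 1.0 + 0.01 ** 2 * 1.0 ** 3 / 6.0)
```

The two numbers agree up to Monte Carlo and discretization error, which shrink as `n_paths` and `n_steps` grow.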

We thus see that the relevant dynamics of the price processes are those given under a martingale measure $Q$. The implication for model building is that it is natural to model all objects directly under a martingale measure $Q$. This approach is called martingale modelling. The price one has to pay in using this approach lies in the statistical problems associated with parameter estimation.

Turning now to market completeness, we define trading strategies, as with our models in Chapters 4 and 6, as dynamic investment portfolios involving some or possibly all of the (basic) traded securities (in the present case zero-coupon bonds). In contrast to our previous market models, we now have a continuum of bonds with different maturities available for trade. So a portfolio could include an infinite number of financial securities (consider e.g. the rolling-over strategy to define a risk-free savings process involving always only bonds of the shortest possible maturity). We shall, however, in accordance with practical restrictions only consider portfolios involving an arbitrary, but finite, number of financial securities (see Björk, Di Masi, Kabanov, and Runggaldier (1997) and Björk, Kabanov, and Runggaldier (1997) for infinite portfolios). For any collection of maturities $0 < T_1 < T_2 < \dots < T_k = T^*$ we then consider bond-trading strategies
K} and $Q^T$ resp. $Q^*$ the $T$- resp. $T^*$-forward risk-neutral measure. Now

$$Z(t, T) = \frac{p(t, T^*)}{p(t, T)}$$

has $Q$-dynamics (omitting the arguments and writing $S^*$ for $S(t, T^*)$)

$$dZ = Z\{S(S - S^*)\,dt - (S - S^*)\,dW(t)\},$$

so a deterministic variance coefficient. Now

$$Q^T(p(T, T^*) \ge K) = Q^T\left(\frac{p(T, T^*)}{p(T, T)} \ge K\right) = Q^T(Z(T, T) \ge K).$$

Since $Z(t, T)$ is a $Q^T$-martingale with $Q^T$-dynamics

$$dZ(t, T) = -Z(t, T)\,(S(t, T) - S(t, T^*))\,dW^T(t),$$

we find that under $Q^T$ (use the stochastic exponential, compare §5.7 and §5.8; again $S = S(t, T)$, $S^* = S(t, T^*)$)

$$Z(T, T) = \frac{p(0, T^*)}{p(0, T)}\exp\left\{-\int_0^T (S - S^*)\,dW^T(t) - \frac12\int_0^T (S - S^*)^2\,dt\right\}$$

(with $W^T$ a $Q^T$-Brownian motion). The stochastic integral in the exponential is Gaussian (compare §5.2) with zero mean and variance

$$\Sigma^2(T) = \int_0^T (S(t, T) - S(t, T^*))^2\,dt.$$

So

$$Q^T(p(T, T^*) \ge K) = N(d_2)$$

with

$$d_2 = \frac{\log\left(\dfrac{p(0, T^*)}{K\,p(0, T)}\right) - \frac12\Sigma^2(T)}{\sqrt{\Sigma^2(T)}}.$$

Similarly, for the first term

$$Z^*(t, T) = \frac{p(t, T)}{p(t, T^*)}$$

has $Q$-dynamics (compare (8.22))

$$dZ^* = Z^*\{S^*(S^* - S)\,dt + (S - S^*)\,dW(t)\},$$

and also a deterministic variance coefficient. Now

$$Q^*(p(T, T^*) \ge K) = Q^*\left(\frac{p(T, T)}{p(T, T^*)} \le \frac1K\right) = Q^*\left(Z^*(T, T) \le \frac1K\right).$$

Under $Q^*$, $Z^*(t, T)$ is a martingale with

$$dZ^*(t, T) = Z^*(t, T)\,(S(t, T) - S(t, T^*))\,dW^*(t),$$

so

$$Z^*(T, T) = \frac{p(0, T)}{p(0, T^*)}\exp\left\{\int_0^T (S - S^*)\,dW^*(t) - \frac12\int_0^T (S - S^*)^2\,dt\right\}.$$

Again we have a Gaussian variable with the (same) variance $\Sigma^2(T)$ in the exponential. Using this fact it follows (after some computations) that

$$Q^*(p(T, T^*) \ge K) = N(d_1),$$

with $d_1 = d_2 + \sqrt{\Sigma^2(T)}$. So we obtain:

Proposition 8.4.1. The price of the call option defined in (8.23) is given by

$$C(0) = p(0, T^*)\,N(d_1) - K\,p(0, T)\,N(d_2), \tag{8.24}$$

with parameters given as above.

Remark 8.4.1. (i) Observe that the above reasoning generalizes to assets $S(t)$ in place of $p(t, T^*)$ as long as $Z(t, T) = S(t)/p(t, T)$ has a deterministic volatility coefficient.
(ii) Since we have a closed formula for the option's price above, we can compute sensitivities with respect to the underlying in much the same way as in the standard Black-Scholes model. In particular, we can construct $\Delta$-neutral hedging portfolios.

8.4.3 Swaps

This section is devoted to the pricing of swaps. We consider the case of a forward swap settled in arrears. Such a contingent claim is characterized by:

• a fixed time $t$, the contract time,
• dates $T_0 < T_1 < \dots < T_n$, equally distanced,
• $R$, a prespecified fixed rate of interest,
• $K$, a nominal amount.

Ti +1 - Ti

=

0 holds. Clearly, the payer-swaption value in $T_\alpha$ is

$$(S_{\alpha,\beta}(T_\alpha) - K)^+ \sum_{i=\alpha+1}^{\beta} \tau_i\, p(T_\alpha, T_i),$$

and the receiver-swaption value is

$$(K - S_{\alpha,\beta}(T_\alpha))^+ \sum_{i=\alpha+1}^{\beta} \tau_i\, p(T_\alpha, T_i).$$

Now we turn to the problem of modelling the dynamics of the swap rate, and the related question of deriving a pricing formula for swaptions. First, we observe that

$$A_{\alpha,\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i\, p(t, T_i)$$

is the $t$-price of a portfolio of bonds (i.e. a traded asset). Consequently, $A_{\alpha,\beta}(t)$, which is known as the accrual factor or present value of a basis point, can be used as numeraire. Now note that

$$S_{\alpha,\beta}(t) = \frac{p(t, T_\alpha) - p(t, T_\beta)}{A_{\alpha,\beta}(t)},$$

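where the numerator can be regarded as the price of a traded asset as well. We conclude that, in order for our model to be arbitrage-free, the swap rate $S_{\alpha,\beta}(\cdot)$ has to be a martingale under the numeraire pair $(Q_{\alpha,\beta}, A_{\alpha,\beta}(\cdot))$. $Q_{\alpha,\beta}$ is the so-called forward swap measure. To proceed, we assume that $S_{\alpha,\beta}(\cdot)$ follows a lognormal martingale:

$$dS_{\alpha,\beta}(t) = \sigma(t)\,S_{\alpha,\beta}(t)\,dW_{\alpha,\beta}(t),$$

where $\sigma$ is a deterministic function and $W_{\alpha,\beta}(\cdot)$ is a standard $Q_{\alpha,\beta}$-Brownian motion. The fact that the forward swap rate $S_{\alpha,\beta}(t)$ is lognormally distributed under $Q_{\alpha,\beta}$ motivates the name lognormal forward swap model. This leads us to the following

Theorem 8.5.4. The price of a payer swaption as specified above in a lognormal forward swap model is consistent with Black's formula for swaptions and is thus given by

$$PS(0, \{T_\alpha, \dots, T_\beta\}, K) = A_{\alpha,\beta}(0)\left(S_{\alpha,\beta}(0)\,N(d_1) - K\,N(d_2)\right)$$

with

$$d_1 = \frac{\log\left(S_{\alpha,\beta}(0)/K\right) + \frac12\Sigma^2(T_\alpha)}{\Sigma(T_\alpha)}$$

and $d_2 = d_1 - \Sigma(T_\alpha)$ with

$$\Sigma^2(T_\alpha) = \int_0^{T_\alpha} \sigma(s)^2\,ds.$$

The price of a receiver swaption is given by

$$A_{\alpha,\beta}(0)\left(K\,N(-d_2) - S_{\alpha,\beta}(0)\,N(-d_1)\right).$$

Proof. For the payer swaption, we have

$$\begin{aligned}
PS(0, \{T_\alpha, \dots, T_\beta\}, K) &= A_{\alpha,\beta}(0)\,\mathbb E^{Q_{\alpha,\beta}}\left(\frac{(S_{\alpha,\beta}(T_\alpha) - K)^+ A_{\alpha,\beta}(T_\alpha)}{A_{\alpha,\beta}(T_\alpha)}\right) \\
&= A_{\alpha,\beta}(0)\,\mathbb E^{Q_{\alpha,\beta}}\left((S_{\alpha,\beta}(T_\alpha) - K)^+\right) \\
&= A_{\alpha,\beta}(0)\left(S_{\alpha,\beta}(0)\,N(d_1) - K\,N(d_2)\right),
\end{aligned}$$

and it is evident that our judicious choice of the numeraire simplifies the calculations considerably. The same argument goes through for the receiver swaption. $\square$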
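Black's formula for swaptions is straightforward to evaluate once the accrual factor, the forward swap rate and the total variance $\Sigma^2(T_\alpha)$ are known. The sketch below is a minimal implementation under that assumption; the function names and the numerical inputs are hypothetical, and the receiver branch uses the put-type counterpart of the payer formula.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_swaption(annuity, swap_rate, strike, total_variance, payer=True):
    """Black swaption price A(0) * (S(0) N(d1) - K N(d2)) and its receiver analogue.

    total_variance is Sigma^2(T_alpha), the integral of sigma(s)^2 over [0, T_alpha].
    """
    s = math.sqrt(total_variance)
    d1 = (math.log(swap_rate / strike) + 0.5 * total_variance) / s
    d2 = d1 - s
    if payer:
        return annuity * (swap_rate * norm_cdf(d1) - strike * norm_cdf(d2))
    return annuity * (strike * norm_cdf(-d2) - swap_rate * norm_cdf(-d1))


# Hypothetical inputs: accrual factor 4.5, forward swap rate 3%, Sigma^2 = 0.04.
payer_atm = black_swaption(4.5, 0.03, 0.03, 0.04)
payer = black_swaption(4.5, 0.03, 0.025, 0.04, payer=True)
receiver = black_swaption(4.5, 0.03, 0.025, 0.04, payer=False)
```

A useful consistency check is payer-receiver parity: the difference of the two prices equals the accrual factor times the difference between the forward swap rate and the strike.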


So far, we have only been concerned with the evolution of one swap rate. While this is sufficient for the purpose of swaption pricing, the pricing of more complicated derivatives whose value depends on more than one swap rate necessitates the modelling of the simultaneous evolution of a set of swap rates under one probability measure. This leads us to the class of Swap Market Models (SMMs), which are arbitrage-free models of the joint evolution of a set of swap rates under a common probability measure. The modeller has some freedom of choice when it comes to determining the set of swap rates that is to be modelled. Examples are:

1. The set of swap rates $\{S_{\alpha,\alpha+1}(t), S_{\alpha+1,\alpha+2}(t), \dots, S_{\beta-1,\beta}(t)\}$. As this is exactly the set of LIBOR rates $\{L(t, T_\alpha), L(t, T_{\alpha+1}), \dots, L(t, T_{\beta-1})\}$, this choice leads us back to the LMM framework.
2. The set of swap rates $\{S_{\alpha,\alpha+1}(t), S_{\alpha,\alpha+2}(t), \dots, S_{\alpha,\beta}(t)\}$.
3. The set of swap rates $\{S_{\alpha,\beta}(t), S_{\alpha+1,\beta}(t), \dots, S_{\beta-1,\beta}(t)\}$.

All of the above choices have in common that they fully describe the LIBOR structure from $T_\alpha$ to $T_\beta$, so that all products that can be priced in an LMM can also be priced in an SMM (and vice versa, as the LIBORs also determine the swap rates). Which of the above sets is chosen of course depends on the concrete application or the interest-rate derivative to be priced. The ideas and tools underlying the construction of an SMM are very similar to those of an LMM, and so are the results (e.g. measure relationships). The interested reader might want to consult Hunt and Kennedy (2000), Pelsser (2000) and Rutkowski (1999) for further details.

8.5.6 The Relation Between LIBOR- and Swap-market Models

As already remarked, all products that can be priced in an LMM framework can in principle also be priced in an SMM framework and vice versa. This is due to the fact that swap rates can be expressed as weighted sums of LIBOR rates, and LIBOR rates can be expressed as functions of swap rates. However, this does not mean that these model classes are equivalent. As we have seen, one of the main advantages of the lognormal LMM is that it prices caplets with Black's caplet formula, which is the market standard. Accordingly, the lognormal SMM prices swaptions with Black's swaption formula, which is also standard among practitioners. At this stage, a natural question is whether lognormal LMMs yield swaption prices that agree with Black prices, and whether lognormal SMMs give Black-consistent caplet prices. In general, the answer is negative. The reason for the incompatibility of lognormal LMMs and lognormal SMMs lies in the fact that if LIBOR rates are lognormal under their respective forward measures, swap rates (that are determined by these LIBOR rates) cannot be lognormal under their respective forward-swap measures and vice versa. Even though this might seem disappointing at first, as it implies that neither the lognormal LMM nor the lognormal SMM can price both caplets and swaptions according to the market standard, it does not constitute a major drawback or practical limitation. The incompatibility is mostly of a theoretical nature, as it can be shown (for example by simulation studies) that swap rates in the lognormal LMM are "almost" lognormal, and therefore swaption prices in the lognormal LMM (which usually have to be obtained by Monte Carlo simulations, as no closed formulas exist) are very similar to those one would obtain by the Black formula. For more information on the incompatibility problem, the interested reader is referred to Rebonato (1999) and Brigo and Mercurio (2001).

8.6 Potential Models and the Flesaker-Hughston Framework

The aim of this section is to present the basic ideas of the potential approach to term structure modelling as set out in Rogers (1997), and to develop connections with the positive interest rate models formulated by Flesaker and Hughston (1997) and Flesaker and Hughston (1996b). Our discussion builds on Rogers (1997), Jin and Glasserman (2001) and Hunt and Kennedy (2000).

8.6.1 Pricing Kernels and Potentials

Our starting point is a filtered probability space $(\Omega, \mathcal F, (\mathcal F_t), \mathbb P)$. We assume that $(\mathcal F_t)$ is the $\mathbb P$-augmentation of the natural filtration generated by a continuous and positive process $(a_t)$. Define

$$A_t = \int_0^t a_s\,ds.$$

As $A = (A_t)$ is an increasing process, $A_\infty = \lim_{t\to\infty} A_t$ exists. We assume that $\mathbb E(A_\infty) < \infty$ holds. Define the pricing kernel $(Z_t)$ as

$$Z_t = \mathbb E(A_\infty \mid \mathcal F_t) - A_t, \qquad t \in [0, \infty].$$

Without loss of generality, we may assume that $Z_0 = \mathbb E(A_\infty) = 1$. Because $\mathbb E(Z_t \mid \mathcal F_s) \le Z_s$ for $0 \le s \le t \le \infty$, $(Z_t)$ is a supermartingale. Furthermore, $\lim_{t\to\infty} \mathbb E(Z_t) = 0$ for obvious reasons. Stochastic processes characterized by these two properties are called potentials.

We are now in the position to define zero-coupon bond prices by

$$p(t, T) = \frac{\mathbb E(Z_T \mid \mathcal F_t)}{Z_t}. \tag{8.31}$$

Observe that bond prices tend to zero with increasing time to maturity as

$$\lim_{T\to\infty} p(t, T) = \lim_{T\to\infty} \frac{1}{Z_t}\,\mathbb E(A_\infty - A_T \mid \mathcal F_t) = 0.$$

For the bond prices in $t = 0$, we get

$$p(0, T) = \mathbb E(Z_T) = 1 - \mathbb E(A_T) = 1 - \mathbb E\left(\int_0^T a_s\,ds\right) = 1 - \int_0^T \mathbb E(a_s)\,ds,$$

where we have used Fubini's theorem. Differentiating both sides with respect to $T$ leads to

$$\frac{\partial}{\partial T}\,p(0, T) = -\mathbb E(a_T).$$
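The relation $\partial p(0, T)/\partial T = -\mathbb E(a_T)$ can be checked numerically in both directions: differentiate an observed curve to obtain the required mean of $a_T$, or integrate a given mean function to rebuild the curve. The sketch below does both for a flat initial curve; the curve, the step size and the grid size are illustrative assumptions.

```python
import math


def implied_mean_a(p0, T, h=1e-5):
    """E(a_T) = -d/dT p(0, T), approximated by a central difference."""
    return -(p0(T + h) - p0(T - h)) / (2.0 * h)


def rebuilt_price(mean_a, T, n=2000):
    """p(0, T) = 1 - int_0^T E(a_s) ds, approximated by the trapezoidal rule."""
    dt = T / n
    interior = sum(mean_a(k * dt) for k in range(1, n))
    integral = dt * (0.5 * (mean_a(0.0) + mean_a(T)) + interior)
    return 1.0 - integral


def flat_curve(T):
    """Hypothetical observed curve p(0, T) = exp(-0.04 T)."""
    return math.exp(-0.04 * T)


def flat_mean_a(s):
    """Mean of a_s matching the flat curve: E(a_s) = 0.04 exp(-0.04 s)."""
    return 0.04 * math.exp(-0.04 * s)


mean_a_5 = implied_mean_a(flat_curve, 5.0)
price_5 = rebuilt_price(flat_mean_a, 5.0)
```

For the flat curve the implied mean integrates to one over $[0, \infty)$, in agreement with the normalization $\mathbb E(A_\infty) = 1$.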

The above relation can be used to calibrate the model to an observed bond-price structure.

As is well known from the treatment of the HJM framework, the specification of bond-price dynamics implicitly gives rise to the dynamics of forward rates and vice versa. Using the well-known relation between bond prices and instantaneous forward rates, we can derive the forward rates in the above model:

$$f(t, T) = -\frac{\partial}{\partial T}\log p(t, T) = \frac{\mathbb E(a_T \mid \mathcal F_t)}{\mathbb E(A_\infty - A_T \mid \mathcal F_t)}.$$

Here we have used that

$$\mathbb E(A_T \mid \mathcal F_t) = \mathbb E\left(\int_0^T a_s\,ds\,\Big|\,\mathcal F_t\right) = \int_0^T \mathbb E(a_s \mid \mathcal F_t)\,ds$$

by Fubini's theorem for conditional expectations; differentiating both sides then gives

$$\mathbb E(a_T \mid \mathcal F_t) = \frac{\partial}{\partial T}\,\mathbb E(A_T \mid \mathcal F_t).$$

Obviously, the forward rates $f(t, T)$ are strictly positive, so that the bond prices $p(t, T) = \exp\left(-\int_t^T f(t, s)\,ds\right)$ are strictly decreasing in $T$. Furthermore, using $r(t) = f(t, t)$, the short rate is simply $r(t) = a(t)/Z_t$.

Before proceeding, let us briefly touch on the question of absence of arbitrage in our framework. From equation (8.31), it is immediately clear that all normalized bond price processes $(p(t, T)Z_t)$ are $\mathbb P$-martingales. However, this does not imply that the model is arbitrage-free, as the process $(1/Z_t)$ does not necessarily represent the price process of a traded asset. Nevertheless, Rutkowski (1999) is able to prove that in our framework, there exists an implied money market account $B = (B_t)$, $B$ being an increasing process of finite variation, and a corresponding risk-neutral measure $\mathbb P^*$, such that all bond price processes discounted by $B$ are $\mathbb P^*$-martingales, which proves that the model excludes arbitrage opportunities.

8.6.2 The Flesaker-Hughston Framework

This section presents the basic ideas of the positive interest rate framework introduced by Flesaker and Hughston (1997) and Flesaker and Hughston (1996b), and shows how their approach relates to the potential approach described above. Flesaker and Hughston introduce a general class of term structure models by postulating that bond prices are of the form

$$p(t,T) = \frac{\int_T^\infty \phi(s) M_{ts}\,ds}{\int_t^\infty \phi(s) M_{ts}\,ds}, \qquad 0 \le t \le T < \infty, \qquad (8.32)$$

where $\phi$ is a deterministic and positive function, and $(M_{tT})_{t \ge 0}$ is a family of positive martingales indexed by $T$ with respect to a probability measure $\mathbb{P}$ and a filtration $(\mathcal{F}_t)$. Both integrals in (8.32) are assumed to be finite almost surely. From the definition, it is evident that bond prices are strictly decreasing in $T$, i.e. interest rates are always positive. The potential approach and the Flesaker-Hughston approach are closely related, as the following theorem shows.

Theorem 8.6.1. (i) Given a Flesaker-Hughston model, define the pricing kernel $(Z_t)$ as

$$Z_t = \int_t^\infty \phi(s) M_{ts}\,ds.$$

Then $(Z_t)$ is a potential and the zero coupon bond prices defined by

$$p(t,T) = \frac{\mathbb{E}(Z_T \mid \mathcal{F}_t)}{Z_t}$$

are of the form of Equation (8.32).

(ii) Conversely, given a $(\mathcal{G}_t)$-adapted, positive and continuous process $(a_t)$ and a positive and increasing process $(A_t)$ with $A_t = \int_0^t a_s\,ds$, define a potential $(Z_t)$ by setting $Z_t = \mathbb{E}(A_\infty \mid \mathcal{G}_t) - A_t$. Then there exist a deterministic and positive function $\phi$, and a family of positive $(\mathbb{P}, (\mathcal{G}_t))$-martingales $(M_{tT})_{t \ge 0}$ indexed by $T$ with $M_{0T} = 1$, such that $Z_t$ can be represented as

$$Z_t = \int_t^\infty \phi(s) M_{ts}\,ds.$$

Consequently, bond prices are of the form of Equation (8.32).

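Proof. (i) With Fubini's theorem for conditional expectations, we get

$$\mathbb{E}(Z_T \mid \mathcal{F}_t) = \mathbb{E}\left(\int_T^\infty \phi(s) M_{Ts}\,ds \,\Big|\, \mathcal{F}_t\right) = \int_T^\infty \phi(s)\,\mathbb{E}(M_{Ts} \mid \mathcal{F}_t)\,ds = \int_T^\infty \phi(s) M_{ts}\,ds$$

and therefore

$$p(t,T) = \frac{\mathbb{E}(Z_T \mid \mathcal{F}_t)}{Z_t} = \frac{\int_T^\infty \phi(s) M_{ts}\,ds}{\int_t^\infty \phi(s) M_{ts}\,ds}.$$

The potential property of $(Z_t)$ is apparent.

(ii) Observe

$$\mathbb{E}(Z_T \mid \mathcal{G}_t) = \mathbb{E}(A_\infty \mid \mathcal{G}_t) - \mathbb{E}(A_T \mid \mathcal{G}_t) = \mathbb{E}\left(\int_T^\infty a_s\,ds \,\Big|\, \mathcal{G}_t\right) = \int_T^\infty \mathbb{E}(a_s \mid \mathcal{G}_t)\,ds$$

and set

$$M_{tT} = \frac{\mathbb{E}(a_T \mid \mathcal{G}_t)}{\mathbb{E}(a_T)} \quad \text{and} \quad \phi(t) = \mathbb{E}(a_t).$$

Then $M_{0T} = 1$ and $(M_{tT})_{t \ge 0}$ is a family of strictly positive $(\mathbb{P}, (\mathcal{G}_t))$-martingales indexed by $T$, while $\phi$ is a deterministic and positive function. This leads to

$$Z_t = \mathbb{E}(Z_t \mid \mathcal{G}_t) = \int_t^\infty \phi(s) M_{ts}\,ds,$$

which completes the proof. □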



Remark 8.6.1. Observe that (i) above, together with the comment on the no-arbitrage property of potential models in the preceding section, shows that Flesaker-Hughston models are arbitrage-free.

Probably the most popular subclass of term structure models in the Flesaker-Hughston framework is the class of rational models. The building blocks in this case are a positive $(\mathbb{P}, (\mathcal{F}_t))$-martingale $(N_t)_{t \ge 0}$ with $N_0 = 1$ and $(\mathcal{F}_t)$ the $\mathbb{P}$-augmentation of the natural filtration generated by $(N_t)$, two positive and deterministic functions $f, g : \mathbb{R}_+ \to \mathbb{R}_+$ with $f(t) + g(t) = 1$ $\forall t \ge 0$, and a positive and deterministic function $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ defined by $\phi(t) = -\frac{\partial}{\partial t}\,p(0,t)$, where we assume that $p(0,t)$ is strictly decreasing in $t$. We now introduce a family $(M_{tT})_{t \ge 0}$ of strictly positive martingales by setting

$$M_{tT} = f(T) + g(T) N_t.$$

Then Formula (8.32) immediately yields the bond prices

$$p(t,T) = \frac{\int_T^\infty \phi(s) M_{ts}\,ds}{\int_t^\infty \phi(s) M_{ts}\,ds} = \frac{\int_T^\infty \phi(s)\,(f(s) + g(s) N_t)\,ds}{\int_t^\infty \phi(s)\,(f(s) + g(s) N_t)\,ds} = \frac{F(T) + G(T) N_t}{F(t) + G(t) N_t}$$

with positive and decreasing functions $F$ and $G$ defined by

$$F(T) = \int_T^\infty \phi(s) f(s)\,ds \quad \text{and} \quad G(T) = \int_T^\infty \phi(s) g(s)\,ds.$$
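The rational bond price formula above can be illustrated numerically. The following is a minimal sketch only: the flat initial curve $p(0,T) = e^{-R_0 T}$, the weight $g(s) = \tfrac{1}{2}e^{-s}$ (so $f = 1 - g$) and all function names are assumptions chosen so that $F$ and $G$ have closed forms; they are not taken from Flesaker and Hughston.

```python
import math

R0 = 0.05  # hypothetical flat initial yield, so p(0, T) = exp(-R0 * T)


def p0(T):
    """Assumed initial discount curve p(0, T)."""
    return math.exp(-R0 * T)


def G(T):
    """G(T) = int_T^inf phi(s) g(s) ds with phi(s) = R0*exp(-R0*s)
    and the assumed weight g(s) = 0.5*exp(-s)."""
    return 0.5 * R0 / (R0 + 1.0) * math.exp(-(R0 + 1.0) * T)


def F(T):
    """F(T) = int_T^inf phi(s) f(s) ds with f = 1 - g, hence F + G = p(0, .)."""
    return p0(T) - G(T)


def bond_price(t, T, N_t):
    """Rational-model price p(t, T) = (F(T) + G(T) N_t) / (F(t) + G(t) N_t)."""
    return (F(T) + G(T) * N_t) / (F(t) + G(t) * N_t)


if __name__ == "__main__":
    for N_t in (0.5, 1.0, 2.0):
        print(N_t, [round(bond_price(1.0, T, N_t), 6) for T in (1.0, 2.0, 5.0)])
```

For any positive value of the martingale $N_t$ the prices start at one for $T = t$ and decrease in $T$, and at $t = 0$ (where $N_0 = 1$) the sketch reproduces the assumed initial curve.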

A nice feature of this class of models is that the consistency with the initial term structure is already guaranteed by its very construction, because $p(0,T) = (F(T) + G(T))/(F(0) + G(0))$, and one easily checks that $F(T) + G(T) = p(0,T)$ and $F(0) + G(0) = 1$. It can also be shown, see e.g. Flesaker and Hughston (1997), that both bond prices and interest rates are bounded above and below:

G (T) G(t)

T) . Now Y (t) Ys ,r (t) ;�::�� . Check that Y has deterministic volatility, and find the parameters for the pricing formula.

8.4 $\Delta$-hedge a zero-coupon bond $p(t,T)$ using two other zero-coupon bonds that mature on dates $T_1$ and $T_2$. Use the self-financing condition and compute the sensitivity of the bonds to changes $dx$ (consider only first-order terms) in order to set up equations to find the portfolio weights. Is your portfolio insensitive to large changes in $x(t)$? Here, $(dx)^2$ must be considered. Include a third zero-coupon bond with maturity $T_3$ in your portfolio to obtain a Gamma-neutral position.

8.5 In the Gaussian HJM framework, assume that the volatility is specified by

$$\sigma(t,T) = \sigma \times \exp\{-\lambda(T-t)\},$$

where $\sigma > 0$ and $\lambda \ge 0$. This specification leads to a deterministic, time-stationary volatility, dependent on $T - t$ and not on $T$, $t$ separately, which increases as time to maturity decreases. The HJM Condition (8.21) implies

$$\alpha(t,T) = \frac{\sigma^2}{\lambda}\exp\{-\lambda(T-t)\}\left(1 - \exp\{-\lambda(T-t)\}\right).$$

1. Verify that bond prices are given by

$$p(t,T) = \frac{p(0,T)}{p(0,t)}\exp\{S(t,T)\,x(t) - a(t,T)\},$$

where

$$x(t) = r(t) - f(0,t),$$

$$S(t,T) = -\frac{1}{\lambda}\left\{1 - \exp\{-\lambda(T-t)\}\right\},$$

$$a(t,T) = (\sigma^2/4\lambda)\,S(t,T)^2\,(1 - \exp\{-2\lambda t\}).$$

2. Since $x(t)$ is not dependent on the maturity $T$, it can be used as a single factor in a factor model (replacing e.g. $r(t)$). Use $x(t)$ to construct $\Delta$- and $\Gamma$-neutral portfolios of $T$-bonds.

3. Find the specific form of European call option prices in this setting (use (8.24)) and compute sensitivities with respect to $x(t)$.

8.6 Verify the caplet Formula (8.25) in the setting of §8.4.4. (Repeat the deduction of (8.24) in §8.4.2.)

8.7 Prove Proposition 8.1.1 part (i).

9. Credit Risk

Approaches to modelling financial assets subject to credit risk can roughly be divided into two types of models: reduced-form and structural models. While reduced-form models typically use a point process to model the default event (exogenously), structural models try to describe the default triggering event within the framework of all traded assets.

The structural approach goes back to Merton (1974), where the dynamics of the value of the assets of a firm are described by a standard geometric Brownian motion and the default event is triggered by this value process crossing a default boundary given by the value of a single bond issued. Although this model gave valuable insight into the default process, several shortcomings have subsequently been pointed out: the liabilities of the firm are supposed to consist only of a single class of debt, the debt has a zero coupon, bankruptcy is triggered only at maturity of the debt, bankruptcy is costless and interest rates are assumed to be constant over time. Thus, the assumptions of the Merton model are highly stylized versions of reality and are not able to account for the magnitude of yield spreads. This motivated several generalizations of Merton's model: Black and Cox (1976) incorporate classes of senior and junior debt, safety covenants, dividends, and restrictions on cash distributions to shareholders; Geske (1977) considers coupon bonds by using a compound options approach and provides a formula for subordinate debt within this compound option framework. Leland (1994) extends the model further to incorporate bankruptcy costs and taxes, which makes it possible to work with optimal capital structure. Zhou (2001) and Madan (2000) use Lévy processes to model the value of the firm process.

While in most papers using option pricing frameworks bankruptcy is triggered at the moment when the value of the firm reaches the value of the debt, they model default as the time when the value of the firm reaches some constant threshold value $K$ that serves as a distress boundary, i.e. the default time $\tau$ can then be expressed formally as $\tau = \inf\{t \ge 0 : V(t) \le K\}$, the first passage time for $V(t)$ (the value of the firm's assets at time $t$) to cross the lower bound $K$. If the value of the assets breaches this level, default is triggered, some form of restructuring occurs and the remaining assets of the firm are allocated among the firm's claimants. Implicit in this formulation is the assumption that once this level is reached, default occurs on all outstanding liabilities at the same


time. Thus, contrary to Merton's model, default can occur prior to maturity. Nielsen, Saa-Requejo, and Santa-Clara (1993), Briys and de Varenne (1997) and Hsu, Saa-Requejo, and Santa-Clara (1997) allow for stochastic default boundaries and deviation from the absolute priority rule. Because of the complexities of all these extensions, often a closed-form solution can no longer be obtained and numerical procedures must be used.

Literature related to credit risk in book form includes Schönbucher (2003), Bielecki and Rutkowski (2002) and Duffie and Singleton (2003). Madan (2000), Rogers (1999) and Lando (1997) are overview papers.

9.1 Aspects of Credit Risk

9.1.1 The Market

According to the International Swaps and Derivatives Association, the credit derivatives market grew 37%, with total notional outstandings reaching $2.15 trillion during the first half of 2002. Notional outstanding volume in interest rate and currency derivatives increased 20%, to $99.83 trillion, in the first half, while equity derivatives outstanding volumes rose 6% to $2.45 trillion. This growth demonstrates the importance of credit derivatives as a mechanism for mitigation and dispersion of credit risk.

9.1.2 What Is Credit Risk?

We can distinguish between individual risk elements:

1. Default Probability. The probability that the obligor or counterparty will default on its contractual obligations to repay its debt.
2. Recovery Rates. The extent to which the face value of an obligation can be recovered once the obligor has defaulted.
3. Credit Migration. The extent to which the credit quality of the obligor or counterparty improves or deteriorates,

and portfolio risk elements:

1. Default and Credit Quality Correlation. The degree to which the default or credit quality of one obligor is related to the default or credit quality of another.
2. Risk Contribution and Credit Concentration. The extent to which an individual instrument or the presence of an obligor in the portfolio contributes to the totality of risk in the overall portfolio.

Since credit risk focuses on default probabilities, recovery in default, identity of the counterparty - all factors that are not directly relevant to market risk - the modelling of credit risk requires the development of new techniques. In particular, since the underlying risk variable in credit risk - occurrence or otherwise of default - is not normally distributed, we find credit portfolio distributions which are asymmetric with fat tails (limited upside potential with remote possibilities of severe losses). In addition, when implementing models, we face the problem of sparse data. Data on credit events are much more limited than information on market risk. Credit events are infrequent and many credit instruments are not marked-to-market on a daily basis, so parameter estimation is difficult. Finally, market risk tends to focus on a relatively short time horizon, whereas credit risk analysis is concerned with a much longer horizon.

9.1.3 Portfolio Risk Models

Banks implement credit risk models, which may be Ratings Based (CreditMetrics). So we need:

1. the definition of the possible states for each obligor's credit quality, and a description of how likely obligors are to be in any of these states at the horizon date - Ratings and Transition Matrix,
2. the revaluation of exposures in all possible credit states - using term structure of bond spreads and risk-free interest rates,
3. the interaction and correlation between credit migrations of different obligors - use of an unseen driver of credit migrations.

Ratings are in principle supplied by commercial firms, so-called rating agencies. Rating agencies evaluate the creditworthiness of corporate, municipal, and sovereign issuers of debt securities. In the USA, where capital markets are the primary source of debt capital, rating agencies have assumed enormous importance. The importance of these agencies has been increased through the internal rating approach in the new Basel capital accord (Basel II), which requires established rating agencies as benchmark.

Of course, this increased importance triggers the question of how reliable rating agencies are. Several empirical studies (some by the agencies themselves) find:

• Moody's ratings have a high predictive power for defaults; spreads are generally higher for lower rated bonds,
• bond and equity values move in the expected direction when issuers' ratings change,
• ratings do help predict financial distress and bond spreads.

However, recent studies show that equity-based default probabilities (see below) change well before ratings when firms fall into financial distress (rating stickiness) and that bond prices based on average spreads show inconsistencies with actual prices.

Equity-Based Models (Moody's KMV, CreditGrades) rely on the argument that a firm defaults when its asset value drops to the value of its contractual obligations (or a critical threshold - the default point). One can use an option pricing framework to derive the value of debt and equity.

Each model contains parameters that affect the risk measures produced, but which, because of a lack of suitable data, must be set on a judgmental basis. Empirical studies such as Gordy (2000) and Koyluoglu (1998) show that parameterization of various models can be harmonized, but use only default-driven versions.

9.2 Basic Credit Risk Modeling

A complete mathematical framework is developed in Bielecki and Rutkowski (2002). For our purpose, a simplification of their framework suffices. To introduce our basic model, we specify a time horizon $T^* > 0$ and assume an underlying stochastic basis $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T^*}$ a filtration that supports the following objects:

• the firm's value process $V$, thought of as the total value of the firm's assets,
• the barrier process (signalling process) $v$, which will serve to specify default time,
• promised contingent claim $X$, representing the firm's liabilities to be redeemed at time $T \le T^*$ (other notation $D$, $L$),
• $\tau$, a default time, which
  - in the structural approach is defined as $\tau := \inf\{t > 0 : V_t < v_t\}$, so $\tau$ is an $\mathbb{F}$ stopping time;
  - in the intensity-based approach is not a stopping time for the market filtration,
• recovery claim $\tilde{X}$, represents recovery payoff received at $T$, if default occurs prior to or at the claim's maturity date $T$ (recovery at maturity),
• recovery process $Z$, specifies recovery payoff received at time of default, if default happens prior to or at $T$ (recovery at default).

Technical Assumptions. $V$, $Z$, $A$ are progressively measurable with respect to $\mathbb{F}$, $X$ and $\tilde{X}$ are $\mathcal{F}_T$-measurable. All processes are assumed to satisfy further suitable conditions.

Suppose there exists an equivalent martingale measure (EMM) $\mathbb{P}^*$ (implying that the financial market model is arbitrage-free). So discounted price processes of tradeable securities, which pay no coupons or dividends, follow $\mathbb{F}$-martingales under $\mathbb{P}^*$. Let $r$ be the short-term interest rate process and use as the discount factor the savings (bank) account, which we assume to exist:

$$B_t = \exp\left(\int_0^t r_u\,du\right).$$


Let $H = (H_t) = (\mathbf{1}_{\{\tau \le t\}})$ be the indicator process of the default event, $D = (D_t)$ the cash flows received by the owner of the defaultable claim and $X^d(T) = X\mathbf{1}_{\{\tau > T\}} + \tilde{X}\mathbf{1}_{\{\tau \le T\}}$. Then the process $D$ of a defaultable claim, which settles at time $T$, equals

$$D_t = X^d(T)\,\mathbf{1}_{\{t \ge T\}} + \int_{(0,t]} Z_u\,dH_u,$$

where the first term takes care of the payoff at $T$ (if any) and the second term captures the payments in case of premature default. Observe that $D$ is of finite variation over $[0,T]$. Now let $X^d(t,T)$ be the price process of a defaultable claim. Thus, $X^d(t,T)$ represents the current value at time $t$ of all future cash flows associated with a given defaultable claim. We have

Definition 9.2.1 (Risk-neutral Valuation Formula). The price process of a defaultable claim which settles at $T$ is given as

$$X^d(t,T) = B_t\,\mathbb{E}^*\left(\int_{(t,T]} B_u^{-1}\,dD_u \,\Big|\, \mathcal{F}_t\right) \qquad \forall t \in [0,T].$$

Use of the formula depends on the attainability of a defaultable claim, which is not obvious. Usually one argues that pricing the defaultable claim according to the above formula does not introduce an arbitrage opportunity into a previously arbitrage-free market.

Example. In case of recovery at maturity, we have $Z = 0$. So

$$D_t = X^d(T)\,\mathbf{1}_{\{t \ge T\}}.$$

Then the valuation formula is

$$X^d(t,T) = B_t\,\mathbb{E}^*\left(\left(X\mathbf{1}_{\{\tau > T\}} + \tilde{X}\mathbf{1}_{\{\tau \le T\}}\right) B_T^{-1} \,\Big|\, \mathcal{F}_t\right),$$

and the discounted price process follows an $\mathbb{F}$-martingale under $\mathbb{P}^*$ (given some integrability conditions).

9.3 Structural Models

9.3.1 Merton's Model

The basic foundations of structural models were laid in the seminal paper Merton (1974). Here it is assumed that a firm is financed by equity and a single zero-coupon bond with notional amount (face value) $F$ and maturity $T$. The firm's value is given by

$$dV(t) = (r - \delta)V(t)\,dt + \sigma V(t)\,dW(t)$$

under an equivalent martingale measure $\mathbb{P}^*$, with $r$, $\sigma$ constant, $W$ a Brownian motion and constant payout (dividend) rate $\delta$, which may be negative (i.e. pay-in). Default is only possible at maturity. There are two possibilities; $V_T$

or

:::::

0 and the ( l +Ei) are log-normally distributed with parameters 2 lE* log(l + E ) i - 82 , War* log(l + ) 8 2 , lE*E k ei' - l under JP * . We find for the put price =

E

=

=

=

with parameters A'

=

5.(1 + k) ,

Theorem 9.3. 1 . The price of a credit risky bond is given by

p(O, T) = p(O, T)F

( ( X T) exp{ -A' T} �1 t}. T


= JE * ( L e -r(T - t) l {f�T,vT � L } 1 Ft ) + JE* ( ,61 VT e- r(T - t) l {T�T,vT < L} 1 Ft ) + JE* ( K,62 e - I'(T - f) e -r(T - t ) 1 { t ( - IOg(Vt/ii(t)) + v(s - t) ) , ii (t )

2

a

a�


-

with a V/U 2 (r "I - u 2 /2)/u 2 . By Equation (9.3) , we find that for every t < s :::; T and x � v ( s ) we have on {T > t} (9.5) =

=

{p

=

_

K,

log(x / v (s) ) v (s - t) ) ( log( Vt /V (t)) - u� log(x/ v (s) ) + 1/ (s - t) ) . ( VVt(t) ) 2a {p ( - log( Vt /V (t)) -u� +

-

Proposition 9.3.2. Set $\tilde{\nu} = \nu - \gamma$ and $\tilde{a} = \tilde{\nu}\sigma^{-2}$. Assume that $\tilde{\nu}^2 + 2\sigma^2(r - \gamma) > 0$. Then the price process of a defaultable bond on $\{\tau > t\}$ equals

$$\begin{aligned}
p^d(t,T) ={}& L\,p(t,T)\left(\Phi(h_1(V_t, T-t)) - R_t^{2\tilde{a}}\,\Phi(h_2(V_t, T-t))\right) \\
&+ \beta_1 V_t e^{-\kappa(T-t)}\left(\Phi(h_3(V_t, T-t)) - \Phi(h_4(V_t, T-t))\right) \\
&+ \beta_1 V_t e^{-\kappa(T-t)} R_t^{2\tilde{a}+2}\left(\Phi(h_5(V_t, T-t)) - \Phi(h_6(V_t, T-t))\right) \\
&+ \beta_2 V_t\left(R_t^{\theta+\zeta}\,\Phi(h_7(V_t, T-t)) + R_t^{\theta-\zeta}\,\Phi(h_8(V_t, T-t))\right),
\end{aligned}$$

where $R_t = v(t)/V_t$, $\theta = \tilde{a} + 1$, $\zeta = \sigma^{-2}\sqrt{\tilde{\nu}^2 + 2\sigma^2(r - \gamma)}$ and

$$h_1(V_t, T-t) = \frac{\log(V_t/L) + \nu(T-t)}{\sigma\sqrt{T-t}},$$

$$h_2(V_t, T-t) = \frac{\log v^2(t) - \log(L V_t) + \nu(T-t)}{\sigma\sqrt{T-t}},$$

$$h_3(V_t, T-t) = \frac{\log(L/V_t) - (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}},$$

$$h_4(V_t, T-t) = \frac{\log(K/V_t) - (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}},$$

$$h_5(V_t, T-t) = \frac{\log v^2(t) - \log(L V_t) + (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}},$$

$$h_6(V_t, T-t) = \frac{\log v^2(t) - \log(K V_t) + (\nu + \sigma^2)(T-t)}{\sigma\sqrt{T-t}},$$

$$h_7(V_t, T-t) = \frac{\log(v(t)/V_t) + \zeta\sigma^2(T-t)}{\sigma\sqrt{T-t}},$$

$$h_8(V_t, T-t) = \frac{\log(v(t)/V_t) - \zeta\sigma^2(T-t)}{\sigma\sqrt{T-t}}.$$

Proof (Outline) .


We need to evaluate

D1 ( t, T)

=

Le - r (T - t ) JP* ( r 2 T, VT 2 L I Ft ) ,

We consider only t = ° for convenience (for the general case apply the strong Markov property). Now the formula for D1 (O , T) follows from Equation (9.5) . To compute D2 (O, T) we observe again using (9.5) L

=

J XdJP*(VT X, r 2 T)
t} the value of senior debt is

L

L,

Lj

$$p^d_s(t,T) = p^d(t,T; L_s, v)$$

and at time of default it is $\min\{v(\tau), L_s\,p(\tau,T)\}$ for $\tau < T$. Now on $\{\tau > t\}$ the value of junior debt is

$$p_j(t,T) = p^d(t,T) - p^d_s(t,T) = p^d(t,T; L, v) - p^d(t,T; L_s, v)$$

and at $\tau < T$ it equals $\min\{v(\tau) - L_s\,p(\tau,T),\ L_j\,p(\tau,T)\}$. If $v(t) = K p(t,T)$ for some constant $K \le L$, we get

$$p_j(t,T) = \begin{cases} L_j\,p(t,T) & \text{if } K = L, \\ p^d(t,T) - L_s\,p(t,T) & \text{if } L_s \le K < L, \\ p^d(t,T) - p^d_s(t,T) & \text{if } K < L_s. \end{cases}$$

9.3.4 Structural Model with Stochastic Interest Rates

We now introduce stochastic interest rates into the modelling framework, as e.g. in Black and Cox (1976), Longstaff and Schwartz (1995) and Briys and de Varenne (1997). We assume the same dynamics of the firm value process as before, i.e.

$$dV_t = V_t\left((r_t - \kappa_t)\,dt + \sigma(t)\,dW_t\right), \qquad (9.6)$$

where $r_t$, $\kappa_t$ and $\sigma(t)$ are suitable processes: $r_t$ models the stochastic short rate, $\kappa_t$ is the dividend rate and $\sigma(t)$ specifies the volatility. The dynamics of default-free bond prices are given by

$$dp(t,T) = p(t,T)\left(r_t\,dt + b(t,T)\,dW_t\right). \qquad (9.7)$$

We consider the forward value of the firm $F_V(t,T) = V_t/p(t,T)$ under the $T$-forward measure $\mathbb{P}^T$. Now

$$dF_V(t,T) = F_V(t,T)\left(-\kappa_t\,dt + (\sigma(t) - b(t,T))\,dW_t^T\right), \qquad (9.8)$$

where $W^T$ is a $\mathbb{P}^T$ Brownian motion. Now $F_V^\kappa(t,T) = F_V(t,T)\,e^{-\int_t^T \kappa(u)\,du}$ has the dynamics

$$dF_V^\kappa(t,T) = F_V^\kappa(t,T)\,(\sigma(t) - b(t,T))\,dW_t^T. \qquad (9.9)$$

If $(\sigma(t) - b(t,T))$ is deterministic, $\log F_V^\kappa$ is a Gaussian process. In case of a boundary function of the form $K p(t,T)\,e^{-\int_t^T \kappa(u)\,du}$ one can exploit this property to again obtain an analytic expression for the forward value of a defaultable bond. The main steps are an application of the change-of-numeraire technique to transfer the valuation problem to the forward values (as outlined above), and, in case $(\sigma(t) - b(t,T))$ is constant, calculation of first-passage times similar to the ones performed in the last section. If $(\sigma(t) - b(t,T))$ is not constant, one has to perform a deterministic time change to be able to use the technique above. For background on time-changes of Brownian motion, we refer to e.g. Revuz and Yor (1991), V.1.

Note. In Equation (9.8), the key component is the 'volatility' coefficient, $(\sigma(t) - b(t,T))$, which we need to be deterministic in order to find explicit formulae. This is a strong restriction, but does include models with both volatility and interest rates being stochastic.

9.3.5 Optimal Capital Structure - Leland's Approach

The basic idea is that equity holders (= owners of the firm) can choose the bankruptcy policy in such a way that the value of the equities will be maximized (or the value of debt will be minimized). We assume the standard model with $\kappa = 0$ and $r > 0$ constant. Also, the outstanding debt is a consol bond, i.e. a bond with infinite maturity, which pays continuously at a constant rate $c$. Its price $D_c(t)$ at any date $t \in \mathbb{R}_+$ equals

D, (t)

,�

(1

T� E'

ce-> ( · - 'l l{,,,) dB

(

:F')

+ )�moo JE * K,82 el'(r- t ) e-r(r- t ) 1 { t t) > O.

The associated jump process $H = (H_t) = (\mathbf{1}_{\{\tau \le t\}})$ is called the default process. Define $\mathbb{H}$ as the filtration generated by $H$, i.e. $\mathcal{H}_t = \sigma(H_u : u \le t) = \sigma(\{\tau \le u\} : u \le t)$. Consider the enlarged filtration $\mathbb{G} = \mathbb{F} \vee \mathbb{H}$ with $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}_t = \sigma(\mathcal{F}_t, \mathcal{H}_t)$. $\tau$ is not necessarily a stopping time w.r.t. $\mathbb{F}$, but $\tau$ is a stopping time w.r.t. $\mathbb{G}$. We want to value defaultable claims within the framework of a financial market model for which we assume that $\mathbb{F}$ contains the market information (and $\mathbb{H}$ contains the information on the default time). In this market, we assume the existence of a savings account

$$B_t = \exp\left(\int_0^t r_u\,du\right),$$

where $r$ is the short-term interest rate process. We assume the existence of an equivalent martingale measure $\mathbb{P}^*$ (risk-neutral measure) such that the discounted price process of any tradeable security, which pays no dividends or coupons, follows a $\mathbb{G}$-martingale under $\mathbb{P}^*$. Recall that the cash-flow process $D_t$ (payments from time $t$ on) of a defaultable claim equals

$$D_t = X^d(T)\,\mathbf{1}_{\{t \ge T\}} + \int_{(0,t]} Z_u\,dH_u,$$

where the pay-out at maturity $T$ is $X^d(T) = X\mathbf{1}_{\{\tau > T\}} + \tilde{X}\mathbf{1}_{\{\tau \le T\}}$ and the pay-out in case of premature default is modelled by $Z$. Let $X^d(t,T)$ be the price process of the defaultable claim. Using the risk-neutral valuation formula we obtain

$$X^d(t,T) = B_t\,\mathbb{E}^*\left(\int_{(t,T]} B_u^{-1}\,dD_u \,\Big|\, \mathcal{G}_t\right).$$

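' t} B n {T > t}. The usual measure-theoretic approximation argument then implies that for a $\mathcal{G}_t$-measurable random variable $Y$ there exists an $\mathcal{F}_t$-measurable random variable $\tilde{Y}$ such that $Y = \tilde{Y}$ on $\{\tau > t\}$. Furthermore, we find that for any $\mathcal{G}$-measurable random variable $Y$ and any $t \in \mathbb{R}_+$ we have

$$\mathbb{E}^*\left(\mathbf{1}_{\{\tau > t\}} Y \mid \mathcal{G}_t\right) = \mathbf{1}_{\{\tau > t\}}\,\frac{\mathbb{E}^*\left(\mathbf{1}_{\{\tau > t\}} Y \mid \mathcal{F}_t\right)}{\mathbb{P}^*(\tau > t \mid \mathcal{F}_t)}. \qquad (9.16)$$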

One then only needs to use Equation (9.16) together with the tower property of conditional expectation to prove the claims. □

The final step is to establish a representation for the pre-default value of a defaultable claim in terms of the hazard process $\Gamma$ (or its intensity $\gamma$ in case it exists) of the default time. In doing so we will be able to find convenient formulations for the valuation equation (9.11) in important applications.

9.4 Reduced Form Models


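Theorem 9.4.1. The value process $X^d(t,T)$ of a defaultable claim has the following representation for $t \in [0,T]$.

1.
$$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,B_t\,\mathbb{E}^*\left(\int_{(t,T]} B_u^{-1} e^{\Gamma_t - \Gamma_u} Z_u\,d\Gamma_u + B_T^{-1} X e^{\Gamma_t - \Gamma_T} \,\Big|\, \mathcal{F}_t\right).$$

2. In case $\Gamma$ admits an intensity $\gamma$,
$$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(\int_{(t,T]} e^{-\int_t^u (r_s + \gamma_s)\,ds}\,\gamma_u Z_u\,du \,\Big|\, \mathcal{F}_t\right) + \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(e^{-\int_t^T (r_s + \gamma_s)\,ds}\,X \,\Big|\, \mathcal{F}_t\right).$$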

For proof and further discussion, we refer the reader to Bielecki and Rutkowski (2002), Chapter 8.

Valuation of General Defaultable Claims. We can now use Theorem 9.4.1 to value various defaultable claims. We will always assume that $\Gamma$ admits an intensity $\gamma$.

Fractional Recovery of Par (Face) Value. Let $V$ represent the claim's constant par value and $\delta$ the claim's recovery rate. Thus the pre-default value of the claim has no influence on the recovery in case of default. We set $Z_t = \delta \cdot V$, $0 \le t \le T$, and Theorem 9.4.1 yields

$$X^\delta(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(\int_{(t,T]} \delta V e^{-\int_t^u (r_s + \gamma_s)\,ds}\,\gamma_u\,du \,\Big|\, \mathcal{F}_t\right) + \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(e^{-\int_t^T (r_s + \gamma_s)\,ds}\,V \,\Big|\, \mathcal{F}_t\right).$$
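To make the recovery-of-par formula concrete, the following minimal sketch evaluates it on $\{\tau > t\}$ under the simplifying assumption (ours, not part of the general result) that the short rate $r$ and the intensity $\gamma$ are constant with $r + \gamma > 0$. In that case the conditional expectations reduce to elementary integrals. The function names are hypothetical, and the midpoint rule serves only as a cross-check of the closed form.

```python
import math


def recovery_of_par_value(par, delta, r, gamma, time_to_maturity):
    """Pre-default value under fractional recovery of par,
    assuming constant short rate r and constant intensity gamma (r + gamma > 0)."""
    a = r + gamma
    # Recovery leg: delta * par * int_0^ttm exp(-a*u) * gamma du
    recovery_leg = delta * par * gamma / a * (1.0 - math.exp(-a * time_to_maturity))
    # Survival leg: par paid at maturity if no default occurred
    survival_leg = par * math.exp(-a * time_to_maturity)
    return recovery_leg + survival_leg


def recovery_of_par_value_quadrature(par, delta, r, gamma, time_to_maturity, steps=20000):
    """Same quantity, with the recovery integral computed by the midpoint rule."""
    h = time_to_maturity / steps
    a = r + gamma
    integral = sum(math.exp(-a * (k + 0.5) * h) for k in range(steps)) * h
    return delta * par * gamma * integral + par * math.exp(-a * time_to_maturity)


if __name__ == "__main__":
    print(recovery_of_par_value(100.0, 0.4, 0.03, 0.02, 5.0))
    print(recovery_of_par_value_quadrature(100.0, 0.4, 0.03, 0.02, 5.0))
```

With $\gamma = 0$ the value collapses to the default-free price $V e^{-r(T-t)}$, and with full recovery ($\delta = 1$) and $r = 0$ it equals the par value, which gives two quick sanity checks.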

Fractional Recovery of No-default Value. Here, it is assumed that in case of default a fixed fraction of an equivalent non-defaultable security is received. In case of a defaultable bond this scheme is known as fractional recovery of treasury value. By the risk-neutral valuation formula, the time $t$ value of the non-defaultable equivalent security $X^e$ is given by the discounted expectation (under $\mathbb{P}^*$) of the promised payoff $X$, so

$$X^e(t,T) = \mathbb{E}^*\left(e^{-\int_t^T r_s\,ds}\,X \,\Big|\, \mathcal{G}_t\right).$$

Now assume that $Z_t = \delta X^e(t,T)$ with $\delta$ the recovery rate. The valuation equation (9.11), Theorem 9.4.1 and an application of Fubini's theorem yield

$$X^{e,\delta}(t,T) = (1 - \delta)\,\mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(e^{-\int_t^T (r_s + \gamma_s)\,ds}\,X \,\Big|\, \mathcal{F}_t\right) + \delta\,\mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(e^{-\int_t^T r_s\,ds}\,X \,\Big|\, \mathcal{G}_t\right).$$
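The decomposition above says that, before default, the claim is worth a $(1-\delta)$ share of a zero-recovery claim plus a $\delta$ share of the default-free claim. A minimal sketch, again assuming constant $r$ and $\gamma$ and a deterministic payoff $X$ (assumptions made here purely for illustration; the function names are hypothetical):

```python
import math


def recovery_of_treasury_value(payoff, delta, r, gamma, time_to_maturity):
    """Pre-default value under fractional recovery of no-default value,
    assuming constant short rate r and constant intensity gamma."""
    zero_recovery = payoff * math.exp(-(r + gamma) * time_to_maturity)
    default_free = payoff * math.exp(-r * time_to_maturity)
    return (1.0 - delta) * zero_recovery + delta * default_free


def credit_spread(delta, gamma, time_to_maturity):
    """Continuously compounded yield spread over the default-free claim
    implied by the value above (r cancels when it is constant)."""
    ratio = (1.0 - delta) * math.exp(-gamma * time_to_maturity) + delta
    return -math.log(ratio) / time_to_maturity


if __name__ == "__main__":
    print(recovery_of_treasury_value(1.0, 0.4, 0.03, 0.02, 5.0))
    print(credit_spread(0.4, 0.02, 5.0))
```

In this constant-parameter setting the implied spread lies between zero (full recovery) and $\gamma$ (zero recovery).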

For corporate bonds, one can find a convenient pricing formula. Let $L = 1 - \delta$ be the stochastic loss rate and $L_t = \mathbb{E}^*(L \mid \mathcal{F}_t)$ the risk-neutral mean fractional loss of market value if default occurs at time $t$. Thus $L_t$ captures all information about recovery that is important for bond pricing. The pricing formula

$$p^d(t,T) = \mathbb{E}^*\left(e^{-\int_t^T (r_u + s_u)\,du} \,\Big|\, \mathcal{F}_t\right) \quad \text{on } \{\tau > t\}, \qquad (9.17)$$

where $s_t = \gamma_t L_t$ is now the risk-neutral conditional expected rate of loss of market value, and the necessary technical conditions under which it holds, are given in Duffie and Singleton (1999). Building on its specific form, one can build tractable models for defaultable bond pricing parallel to the default-free models. In particular, HJM-type forward spread models can be constructed. One observes the initial default-free forward rates $f(0,T)$ and the initial forward spread rates $s(0,T)$. After specification of the volatilities $\sigma_f(t,u)$ for forward rates and $\sigma_s(t,u)$ for spreads and fractional loss at default $L$, defaultable bond prices are given as

$$p^d(t,T) = \exp\left(-\int_t^T \left(f(t,u) + s(t,u)\right)du\right).$$

In addition to the standard HJM-drift condition, the spread dynamics

$$ds(t,T) = \mu_s(t,T)\,dt + \sigma_s(t,T)\,dW(t)$$

have to satisfy a drift condition (under recovery of market value)

$$\mu_s(t,T) = \sigma_s(t,T)\int_t^T \sigma_f(t,v)\,dv + \sigma_f(t,T)\int_t^T \sigma_s(t,v)\,dv,$$

see Schönbucher (1998) and Schönbucher (2003) for additional details.

Reduced-form Models with State Variables. In many applications, it is useful to think of underlying state variables that drive the economy (economic cycle) and subsequently have an influence on the default intensity. To model such state variables, we assume that $Y$ is a $d$-dimensional stochastic process defined on $(\Omega, \mathcal{G}, \mathbb{F}, \mathbb{P}^*)$ and follows an $\mathbb{F}$-Markov process under $\mathbb{P}^*$. One can then model $\tau$ as the first jump time of a Cox process, which has an intensity of the form $\lambda_t = \lambda(Y_t)$ for some function $\lambda : \mathbb{R}^d \to \mathbb{R}_+$, as e.g. in Lando (1998). Under the further assumptions that the promised payoff $X$ at $T$ of the defaultable claim is $\mathcal{F}_T$-measurable, that the recovery process $Z$ is $\mathbb{F}$-predictable and finally, that the short-rate process satisfies $r_t = r(Y_t)$ for some function $r : \mathbb{R}^d \to \mathbb{R}$, we can apply Theorem 9.4.1 to obtain

Proposition 9.4.1. The price process of a defaultable claim with the above specifications is given by

$$X^d(t,T) = \mathbf{1}_{\{\tau > t\}}\,\mathbb{E}^*\left(\int_t^T e^{-\int_t^u (r(Y_s) + \lambda(Y_s))\,ds}\,\lambda(Y_u) Z_u\,du + e^{-\int_t^T (r(Y_s) + \lambda(Y_s))\,ds}\,X \,\Big|\, \mathcal{F}_t\right).$$
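The Cox-process construction of the default time can be sketched as follows. We use the standard device of comparing the cumulated intensity $\int_0^t \lambda(Y_s)\,ds$ with an independent unit exponential random variable; the time grid, the intensity path passed in as a list and all function names are assumptions made for illustration only.

```python
import math
import random


def cox_default_time(intensity_path, dt, unit_exponential):
    """First grid time at which the cumulated intensity reaches the threshold.

    intensity_path holds one realisation of lambda(Y_t) on a grid of width dt.
    Returns None if the threshold is not reached on the grid (no default).
    """
    cumulated = 0.0
    for k, lam in enumerate(intensity_path):
        cumulated += lam * dt
        if cumulated >= unit_exponential:
            return (k + 1) * dt
    return None


def survival_probability(intensity_path, dt, n_paths=5000, seed=1):
    """Monte Carlo estimate of P(tau > T) given the intensity path."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(n_paths):
        threshold = rng.expovariate(1.0)  # independent unit exponential
        if cox_default_time(intensity_path, dt, threshold) is None:
            survived += 1
    return survived / n_paths


if __name__ == "__main__":
    path = [0.1] * 100  # constant intensity 0.1 over a horizon of 5 years
    print(survival_probability(path, 0.05), math.exp(-0.5))
```

Conditional on the intensity path, the estimated survival probability should be close to $\exp(-\int_0^T \lambda(Y_s)\,ds)$, which is the quantity that enters the discount factor in Proposition 9.4.1.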

Motivated by the pricing formula (9.17), Duffie and Singleton (1997), Duffie and Singleton (1999) and Duffee (1999) used an econometric model for the term structure of credit spreads. They modelled the short-rate process and the short-spread process using underlying square-root process state variables $X_1, X_2, X_3$. They assumed that the state variables satisfy

$$dX_1(t) = \left[\kappa_{11}(\theta_1 - X_1(t)) + \kappa_{12}(\theta_2 - X_2(t))\right]dt + \sqrt{X_1(t)}\,dW_1(t),$$

$$dX_2(t) = \kappa_{22}(\theta_2 - X_2(t))\,dt + \sigma_{22}\sqrt{X_2(t)}\,dW_2(t),$$

$$dX_3(t) = \kappa_{33}(\theta_3 - X_3(t))\,dt + \sigma_{32}\sqrt{X_2(t)}\,dW_2(t) + \sigma_{33}\sqrt{X_3(t)}\,dW_3(t).$$
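To give a feeling for such state variables, here is a minimal simulation sketch for a single square-root diffusion of the type of $X_2$ above. The Euler scheme with truncation at zero is our choice of discretisation, not part of the cited models, and the parameter values and function name are hypothetical.

```python
import math
import random


def simulate_square_root(x0, kappa, theta, sigma, horizon, steps, rng):
    """Euler path of dX = kappa*(theta - X) dt + sigma*sqrt(X) dW,
    truncated at zero after every step so the square root stays defined."""
    dt = horizon / steps
    x = x0
    path = [x]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + kappa * (theta - x) * dt + sigma * math.sqrt(x) * dw
        x = max(x, 0.0)
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(7)
    path = simulate_square_root(0.04, 1.0, 0.04, 0.1, 1.0, 100, rng)
    print(min(path), max(path), path[-1])
```

The path stays non-negative and is pulled back towards the level $\theta$ at speed $\kappa$, which is why positive affine functions of such factors are convenient building blocks for positive short spreads.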

Conditions on the coefficients are needed to ensure that $s(t) > 0$ (positive affine function of correlated square-root diffusions) and that the correlation between $s$ and $r$ is negative.

9.4.2 Rating-based Models

Usually there is a deterioration of credit quality until risky debt goes into default mode - this is called credit migration. Credit quality corresponds to the probability that a firm will be able to meet its contractual obligations. Ordering firms according to their default probabilities leads to rating systems (discrete). These rating systems allow empirical calculation of transition matrices. The table below shows Standard and Poor's cumulative Default Rates (Percent) ordered according to the rating classes of S&P.

Rating \ Year      1      2      3      4      5      6      7      8      9     10
AAA             0.00   0.00   0.05   0.11   0.17   0.31   0.47   0.76   0.87   1.00
AA              0.00   0.02   0.07   0.15   0.27   0.43   0.62   0.77   0.85   0.96
A               0.04   0.12   0.21   0.36   0.56   0.76   1.01   1.34   1.69   2.06
BBB             0.24   0.54   0.85   1.52   2.19   2.91   3.52   4.09   4.55   5.03
BB              1.01   3.40   6.32   9.38  12.38  15.72  17.77  20.03  22.05  23.69
B               5.45  12.36  19.03  24.28  28.38  31.66  34.73  37.58  40.02  42.24
CCC            23.69  33.52  41.13  47.43  54.25  56.37  57.94  58.40  59.52  60.91

Table 9.1. Standard and Poor's cumulative Default Rates (Percent)

One can now assume that the credit quality of a firm is a continuous time Markov chain $M$ on a finite state space $E = \{1, \dots, K, K+1\}$ (the rating classes) with transition probability matrix $P(t)$. We assume that the state $K+1$ corresponds to bankruptcy and that a bankrupt firm remains in that state. As usual, we call $p_{i,j}(t)$ the probability that at time $t$ the process, which started in state $i$, is in state $j$. Thus $p_{K+1,i}(t) = 0$, $i = 1, \dots, K$, and $p_{K+1,K+1}(t) = 1$. The semigroup of $M$ is

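\[ P(t) = \exp(tA) = \sum_{k=0}^{\infty} \frac{(tA)^k}{k!}, \tag{9.18} \]

where $A$ is the generator matrix

\[ A = \begin{pmatrix} -\lambda_{1,1} & \lambda_{1,2} & \cdots & \lambda_{1,K} & \lambda_{1,K+1} \\ \lambda_{2,1} & -\lambda_{2,2} & \cdots & \lambda_{2,K} & \lambda_{2,K+1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \lambda_{K,1} & \lambda_{K,2} & \cdots & -\lambda_{K,K} & \lambda_{K,K+1} \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix}, \]

where by definition

\[ A_{i,j} = \lim_{t \to 0} \frac{P_{i,j}(t) - P_{i,j}(0)}{t}. \]

We have $\lambda_{i,j} \ge 0$, $i \ne j$, and $\sum_j A_{i,j} = 0$ for all $i$ (so that $\lambda_{i,i} = \sum_{j \ne i} \lambda_{i,j}$). Now assume that a firm has rating $k(t)$ at time $t$. Let $\lambda_{i,j}(X_t) > 0$ be the state-dependent, risk-neutral transition intensity from rating $i$ to $j$ (where $X_t$ is a suitable process for the state variable). With $r(t) = r(X_t)$, defaultable bond prices are, as usual,

\[ P_d(X_t, k_t, t, T) = \mathbb{E}\left( e^{-\int_t^T r(u)\,du}\, 1_{\{\tau > T\}} \,\Big|\, \mathcal{G}_t \right) + \mathbb{E}\left( e^{-\int_t^{\tau} r(u)\,du}\, \delta\, 1_{\{\tau \le T\}} \,\Big|\, \mathcal{G}_t \right), \]

where $\delta$ is the recovery rate. Now assume fractional recovery of market value, so $\delta = (1 - L) P_d(\tau-, T)$ with a constant $L$. Then

\[ P_d(X_t, k_t, t, T) = \mathbb{E}\left( e^{-\int_t^T (r(u) + h(u) L)\,du} \,\Big|\, \mathcal{G}_t \right), \]

where $h(t) = \lambda_{k_t, K+1}(X_t)$ is the rate of transition from the current rating into default. For further details on rating-based models, we refer the reader to Jarrow, Lando, and Turnbull (1997), Lando (1998), Lando (2000) and Bielecki and Rutkowski (2002), Chapters 11 and 12.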
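The semigroup relation $P(t) = \exp(tA)$ can be explored numerically. The following is a minimal sketch using only the Python standard library; the three-state generator (two ratings plus an absorbing default state) and its entries are hypothetical illustration values, and the truncated power series is a simple stand-in for a production-quality matrix exponential.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(generator, t, terms=40):
    """Approximate P(t) = exp(t*A) by truncating the power series."""
    n = len(generator)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term (tA)^k / k!
    scaled = [[t * a for a in row] for row in generator]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Hypothetical generator: rows sum to zero, last row is the default state.
generator = [
    [-0.11, 0.10, 0.01],
    [0.05, -0.15, 0.10],
    [0.00, 0.00, 0.00],
]
p_one = transition_matrix(generator, 1.0)
p_two = transition_matrix(generator, 2.0)
```

Each row of `p_one` is a probability distribution over ratings after one year, the last row stays at the default state, and `p_two` agrees with the product of `p_one` with itself, as the semigroup property requires.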

9.5 Credit Derivatives

We only discuss two specific examples.

Credit Default Swaps, CDS. A credit default swap is an exchange of a periodic payment against a one-off contingent payment if some credit event occurs on a reference asset. The basic cash flow is shown in Table 9.2.

                      contingent payment
   Protection Buyer  <------------------  Protection Seller
                     ------------------>
                         periodic fee

Table 9.2. Cash flow of a credit default swap

The ingredients of the basic structure are specification of
1. maturity $T$: usually from one to ten years,
2. underlying: corporate or sovereign,
3. credit event: default, bankruptcy, downgrade.

Let $c(T)$ be the fixed coupon that the protection buyer pays. The payment continues until either default or maturity. In case of default, assume that the payment from the protection seller to the protection buyer is equal to the difference between the notional amount of the bond and the recovery value $\delta$. The fixed side of the payment is set so that the contract value is zero at initiation. Thus, since the cash flow at coupon date $i$ for the protection buyer is $c(T) 1_{\{\tau > i\}}$ and the payment for the protection seller at the time of default $\tau$ is $(1 - \delta) 1_{\{\tau \le T\}}$, we obtain

\[ c(T) = \frac{(1 - \delta)\, \mathbb{E}^*\left( e^{-r\tau} 1_{\{\tau \le T\}} \right)}{\sum_{i=1}^{T} e^{-ri}\, \mathbb{P}^*(\tau > i)}, \]

where we assume constant interest rates. Since both $\mathbb{E}^*(e^{-r\tau} 1_{\{\tau \le T\}})$ and $\mathbb{P}^*(\tau > i)$ are readily available in intensity-based models (or can be inferred from market data assuming such a model), these models are typically used to price CDS. Extensions of CDS include:
1. contingent credit swaps, which require an additional trigger, i.e. a credit event with respect to another entity or movement in equity prices or interest rates,
2. total (rate of) return swaps, which transfer an asset's total economic performance including - but not restricted to - its credit-related performance.

First-to-default Swap (FtD) and Basket Default Swap. Now several assets are bundled together, and a credit swap is created on the whole basket. The default event is defined in terms of default on any of the assets in the basket, e.g. the first default of any asset in case of a first-to-default swap or the $i$th-to-default, or any other similar contract structure. The additional modelling component to be considered now is the dependence of defaults (e.g. clustering of defaults). For the FtD, recall that for intensity-based models the default time $\tau = \min\{\tau_1, \ldots, \tau_n\}$ has intensity $\lambda = \lambda_1 + \ldots + \lambda_n$, where $\lambda_i$ is the intensity of $\tau_i$, the default time of asset $i$. Thus, with an affine model for $\lambda_i$, tractable models for FtD can be obtained. Further details on credit derivatives can be found in Bielecki and Rutkowski (2002) and Schönbucher (2003).
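To make the CDS pricing rule concrete: setting the value of the fee leg, $c(T) \sum_i e^{-ri}\, \mathbb{P}^*(\tau > i)$, equal to that of the protection leg, $(1-\delta)\, \mathbb{E}^*(e^{-r\tau} 1_{\{\tau \le T\}})$, determines the coupon. The sketch below assumes, purely for illustration, a constant risk-neutral default intensity (so $\tau$ is exponential), annual coupon dates $i = 1, \ldots, T$, and independent default times with a common recovery rate in the first-to-default case; the function names and numerical inputs are hypothetical.

```python
import math


def cds_coupon(hazard, rate, recovery, maturity):
    """Par coupon c(T) for a constant hazard rate and annual coupon dates."""
    # Protection leg: (1 - delta) * E*[exp(-r*tau) 1{tau <= T}], tau ~ Exp(hazard).
    protection = (1.0 - recovery) * hazard / (hazard + rate) * (
        1.0 - math.exp(-(hazard + rate) * maturity))
    # Fee leg per unit coupon: sum over i of exp(-r*i) * P*(tau > i).
    annuity = sum(math.exp(-(rate + hazard) * i)
                  for i in range(1, maturity + 1))
    return protection / annuity


def first_to_default_coupon(hazards, rate, recovery, maturity):
    """FtD coupon: the first default time has intensity sum(hazards)."""
    return cds_coupon(sum(hazards), rate, recovery, maturity)


single = cds_coupon(hazard=0.02, rate=0.05, recovery=0.4, maturity=5)
basket = first_to_default_coupon([0.02, 0.03, 0.01], 0.05, 0.4, 5)
```

For small intensities the single-name coupon is close to $(1-\delta)\lambda$, and the first-to-default coupon is larger because the intensities of the basket add up.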

9.6 Portfolio Credit Risk Models

We will only consider a rating-based credit portfolio model, such as CreditMetrics. A formal description of such a model consists of $n + 1$ rating categories (of which the first one corresponds to the default state), a transition matrix of probabilities of rating changes within the time horizon of interest, and some re-evaluation procedure for the exposures within each rating class. To introduce dependencies of the individual exposures, a latent factor driving the transitions is assumed.

Model Description. Assume the portfolio consists of $m$ bonds (obligors) that we consider at discrete time periods $t = 0, 1, \ldots, T$ corresponding to the coupon payments. Let $R_i(t)$ be the state indicator (i.e. rating class) of obligor $i$ at time $t$. We assume $n + 1$ rating classes, i.e. the state space is $\{0, 1, \ldots, n\}$ with class 0 corresponding to default. Let $S = (S_1, \ldots, S_m)'$ be an $m$-dimensional random vector. Furthermore, let

\[ -\infty = c_{-1} < c_0 < c_1 < \ldots < c_n = \infty \]

be a sequence of cut-off levels. We assume

\[ R_i(t) = j \iff S_i(t) \in (c_{j-1}, c_j], \quad j \in \{0, \ldots, n\}. \]

We call $(S_i(t), (c_j)_{j \in \{-1, 0, \ldots, n\}})_{i \in \{1, \ldots, m\}}$ a dynamic latent variable model for the state vector $R = (R_1, \ldots, R_m)'$.

Dynamic Modelling. In general Merton-type (or asset-based) models the value of $S_i(t)$ may be interpreted as the value of the assets of the firm. Indeed, a variety of distributions is possible for $S_i$ corresponding to various Lévy-type specifications for stock price modelling (e.g. the hyperbolic model, Eberlein (2001), the Variance-Gamma model and relatives, Carr, Geman, Madan, and Yor (2002), Carr, Chang, and Madan (1998)). In particular, jump-diffusion models are easily incorporated; see Hamilton, James, and Webber (2001) for such a model. The relation to the standard approach is seen by recalling the standard Black-Scholes model defined via the SDE

\[ dA_t = A_t (\mu\,dt + \sigma\,dW_t) \]

with constant coefficients and a standard Brownian motion $W$. The solution of the SDE is

\[ A_t = A_0 \exp\left\{ \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right\}, \]

hence motivating the use of a normal return distribution. One can now consider a general exponential Lévy process model for asset values with a Lévy process $L$. Now the solution of the SDE can be written as

\[ A_t = A_0 \exp\{ \tilde{L}_t \} \]

with $\tilde{L}_t$ a related Lévy process (which can be computed explicitly). Hence we may assume that the return distribution $S_t = \log(A_t)$ is generated by a Lévy process. See §5.5, §7.4, and Bingham and Kiesel (2002) for an overview and further discussion of Lévy-type models.

Factor Modelling. In a typical credit portfolio model, dependencies of individual obligors are modelled via dependencies of the underlying latent variables $S$. In light of the typical portfolio analysis, the vector $S$ is embedded in a factor model, which allows for easy analysis of correlation, the typical measure of dependence. One assumes that the underlying variables $S_i$ are driven by a vector of common factors. Typically, this vector is assumed to be normally distributed (see e.g. JP Morgan (1997)). Let $Z \sim N(0, \Sigma)$ be

a $p$-dimensional normal vector and $\epsilon = (\epsilon_1, \ldots, \epsilon_m)'$ independent normally distributed random variables, independent also of $Z$. Define

\[ Y_i = \sum_{j=1}^{p} a_{ij} Z_j + \sigma_i \epsilon_i, \quad i = 1, \ldots, m. \]

Setting $S_i = Y_i$ generates a Gaussian factor model. However, such a setting corresponds to the standard Brownian motion return structure. To capture Lévy-type equity models, we use a normal mean-variance mixture model (compare Bingham and Kiesel (2002)). To define such a model, let $W$ be a further positive random variable, independent of $\epsilon$ and $Z$, and define

\[ S_i = a_i + b_i W^2 + W Y_i, \quad i = 1, \ldots, m, \]

with $a_i, b_i$ constants. Then $S$ has a $(p + 1)$-dimensional conditional independence structure. Now $S$ inherits (in principle) the correlation matrix of $Y$. The individual returns $S_i$ are heavy-tailed and the vector $S$ exhibits tail dependence. Such a model could alternatively be generated by using heavy-tailed factors and heavy-tailed idiosyncratic risk. An advantage of such a model is that it allows analytic approximations, in the sense that the loss distribution can be approximated as a function of the factor risk only.

Approximation of Loss Distribution. Assume that $n = 1$, i.e. we only have two rating classes, one of which corresponds to default. We calculate the distribution of the number of defaults for one period. Assume the collateral consists of $m$ bonds, all from the same rating class, implying a common default boundary $c$. Assume one normally distributed common factor (as in Gordy (2000)), i.e.

\[ Y_i = \rho Z + \sqrt{1 - \rho^2}\, \epsilon_i \]

and

\[ S_i = a + b W^2 + W Y_i = a + b W^2 + \rho W Z + \sqrt{1 - \rho^2}\, W \epsilon_i. \]

So the conditional distribution of $S_i$ given $(Z, W)$ is normal. Also by conditional independence, we get for the number $M$ of defaults in one period (let $W$ have density $g$)

\[ \pi_l = \mathbb{P}(M = l) = \int_0^{\infty} \int_{-\infty}^{\infty} \binom{m}{l}\, \Phi\left( \frac{c - a - b w^2 - \rho w z}{\sqrt{1 - \rho^2}\, w} \right)^{l} \left( 1 - \Phi\left( \frac{c - a - b w^2 - \rho w z}{\sqrt{1 - \rho^2}\, w} \right) \right)^{m - l} \phi(z)\, g(w)\, dz\, dw. \]
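The mixture integral for $\pi_l$ can be evaluated by simple quadrature. The sketch below is illustrative only: it assumes a discrete mixing variable $W$ (values `w_values` with probabilities `w_probs`, standing in for the density $g$), a standard normal common factor $Z$, and a trapezoidal rule on a truncated $z$-grid; all parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def default_count_distribution(m, c, a, b, rho, w_values, w_probs,
                               grid=2001, z_max=8.0):
    """Return [P(M = 0), ..., P(M = m)] for a discrete mixing variable W."""
    step = 2.0 * z_max / (grid - 1)
    probs = [0.0] * (m + 1)
    for w, g in zip(w_values, w_probs):
        for i in range(grid):
            z = -z_max + i * step
            # Trapezoidal weights: half weight at the two end points.
            weight = step * (0.5 if i in (0, grid - 1) else 1.0)
            # Conditional default probability given (Z, W) = (z, w).
            p = norm_cdf((c - a - b * w * w - rho * w * z)
                         / (math.sqrt(1.0 - rho * rho) * w))
            for l in range(m + 1):
                probs[l] += (g * weight * norm_pdf(z) * math.comb(m, l)
                             * p ** l * (1.0 - p) ** (m - l))
    return probs


w_values, w_probs = [1.0, 2.0], [0.8, 0.2]
pi = default_count_distribution(m=20, c=-2.0, a=0.0, b=0.0, rho=0.3,
                                w_values=w_values, w_probs=w_probs)
mean_defaults = sum(l * p for l, p in enumerate(pi))
```

A useful check is that the probabilities sum to one and that the mean number of defaults equals $m$ times the unconditional default probability, which here is the mixture of $\Phi((c - a - b w^2)/w)$ over the values of $W$.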

Using the formula for $\pi_l$, we can obtain the loss distribution analytically; see also Crouhy, Galai, and Mark (2001), Ong (1999) and the article Frey and McNeil (2003) and references.

Multi-period losses: here we simply assume independence of the subsequent periods. Hence the only change in the above calculation is an adjustment of the number of bonds in the collateral. The distribution of losses in the second period is given as

\[ \mathbb{P}(M_2 = l) = \sum_{j=0}^{m} \mathbb{P}(M_2 = l \mid M_1 = j)\, \mathbb{P}(M_1 = j) = \sum_{j=0}^{m} \mathbb{P}(M(m - j) = l \mid M(m) = j)\, \pi_j(m), \]

with the notation that $M(m)$ is the number of defaults given that $m$ bonds are in the collateral, and $\pi_j(m) = \mathbb{P}(M(m) = j)$. An application of a strong law of large numbers (Hall and Heyde (1980), Theorem 2.19) yields a convenient approximation:

\[ \lim_{m \to \infty} \frac{1}{m} M(m) = \Phi\left( \frac{c - a - b W^2 - \rho W Z}{\sqrt{1 - \rho^2}\, W} \right) \quad \text{a.s.} \tag{9.19} \]

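Approximation (9.19) allows us to compute the one-period number of losses given the joint distribution of $(Z, W)$. See Lucas, Klaassen, Spreij, and Straetmans (1999) for such an analytic approach.

Copula Modelling. Multivariate normal models do not show tail dependence, i.e. for $(X_1, X_2)'$ bivariate normal with correlation $\rho \in (-1, 1)$, we have $\lambda = 0$, where

\[ \lambda = \lim_{q \to 1^-} \mathbb{P}\left( X_2 > F_2^{-1}(q) \mid X_1 > F_1^{-1}(q) \right) \]

is the coefficient of (upper) tail dependence and $F_1, F_2$ are the marginal distribution functions.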

The degree of tail dependence depends on the copula of $(X_1, X_2)'$, to which we now turn. A copula $C$ is a multivariate distribution with standard uniform marginal distributions. That is, $C$ is a mapping $[0, 1]^d \to [0, 1]$ with
• $C(u_1, \ldots, u_d)$ is increasing in each component $u_i$,
• $C(1, \ldots, 1, u_i, 1, \ldots, 1) = u_i$ for all $i \in \{1, \ldots, d\}$, $u_i \in [0, 1]$,
• for all $(a_1, \ldots, a_d), (b_1, \ldots, b_d) \in [0, 1]^d$ with $a_i \le b_i$ we have

\[ \sum_{i_1 = 1}^{2} \cdots \sum_{i_d = 1}^{2} (-1)^{i_1 + \ldots + i_d}\, C(u_{1, i_1}, \ldots, u_{d, i_d}) \ge 0, \]

where $u_{j,1} = a_j$ and $u_{j,2} = b_j$ for all $j \in \{1, \ldots, d\}$.

If $X = (X_1, \ldots, X_d)'$ has joint distribution $F$ with continuous marginals $F_1, \ldots, F_d$, then the distribution function of the transformed vector

\[ (F_1(X_1), \ldots, F_d(X_d))' \]

is a copula $C$ and

\[ F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d)). \]

Thus copulas can be used to link marginal distributions (as an alternative to using the joint distribution). We can now consider a general latent-variable model. That is, we model each of the underlying variables independently and subsequently use a copula to model the dependence structure. If we assume that the copula function is symmetric (in all variables), the random vector $S$ will be exchangeable:

\[ (S_1, \ldots, S_m) \overset{d}{=} (S_{p(1)}, \ldots, S_{p(m)}) \]

for every permutation $p$ of $\{1, \ldots, m\}$. In this case, all possible $k$-dimensional marginal distributions are identical, and in particular, we have for the default probabilities

\[ \pi_k = \mathbb{P}(R_{i_1} = 0, \ldots, R_{i_k} = 0), \quad \forall \{i_1, \ldots, i_k\} \subset \{1, \ldots, m\}, \quad 1 \le k \le m. \]

In this case, the distribution of the number of defaults can be computed via an application of the inclusion-exclusion principle:

\[ \mathbb{P}(M = k) = \binom{m}{k}\, \mathbb{P}(R_1 = 0, \ldots, R_k = 0, R_{k+1} \ne 0, \ldots, R_m \ne 0). \tag{9.20} \]

If we fix the individual default probabilities of a group of $k$ obligors to be $\pi$, we get

\[ \pi_k = C_{1, \ldots, k}(\pi, \ldots, \pi), \tag{9.21} \]

where $C_{1, \ldots, k}$ is the $k$-dimensional margin of $C$. Using copulae with positive tail dependence for the factors (e.g. the $t$ distribution) leads to heavier tails of the loss distribution, i.e. increased VaRs (compare results from the study of Frey and McNeil (2003)).
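For an exchangeable model, the inclusion-exclusion step can be written out explicitly: the probability that a given set of $k$ obligors defaults and the remaining $m - k$ survive is $\sum_{i=0}^{m-k} (-1)^i \binom{m-k}{i} \pi_{k+i}$, with the convention $\pi_0 = 1$. The following sketch implements this expansion, which is the standard one but is spelled out here rather than quoted from the text; the function name and the two test cases (independent and comonotone defaults) are illustrative assumptions.

```python
import math


def default_count_pmf(pi):
    """P(M = k) for k = 0..m from joint default probabilities.

    pi[k] is the probability that any fixed set of k obligors all default,
    with pi[0] = 1 (exchangeable portfolio of m = len(pi) - 1 obligors).
    """
    m = len(pi) - 1
    pmf = []
    for k in range(m + 1):
        # Inclusion-exclusion over the survivors among the other m - k names.
        inner = sum((-1) ** i * math.comb(m - k, i) * pi[k + i]
                    for i in range(m - k + 1))
        pmf.append(math.comb(m, k) * inner)
    return pmf


m, p = 10, 0.1
independent = default_count_pmf([p ** k for k in range(m + 1)])
comonotone = default_count_pmf([1.0] + [p] * m)
```

With independent defaults ($\pi_k = \pi^k$) the result is the binomial distribution; with perfectly dependent defaults ($\pi_k = \pi$ for all $k \ge 1$) all mass sits on zero defaults and on $m$ defaults.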

9.7 Collateralized Debt Obligations (CDOs)

9.7.1 Introduction

Collateralized Debt Obligations (CDOs) are an important example of asset-backed securities (ABS) and as such are backed by a pool of assets. Basic information on ABS and CDOs is given in Bowler and Tierny (1999), Lucas (2001) and Hayre (2001) or in textbook form Bluhm, Overbeck, and Wagner (2003). In the case of CDOs we distinguish two basic types, based on the type of debt used as the collateral: Collateralized Loan Obligations (CLOs), backed by a pool of loans, and Collateralized Bond Obligations (CBOs), backed by a pool of bonds. The typical structure is shown below:

   Collateral  ---->  SPV  ---->  Notes

So we start with a pool of credit-risky assets. This pool is then transferred to a special purpose vehicle (SPV), which is a company set up only for the purpose of the transaction. The SPV then issues securities or structured notes backed by the cash flow of the asset pool. Thus interest and principal of the notes are paid from interest and principal proceeds from the pool. The notes are divided into several classes according to their credit quality: senior notes and mezzanine, which usually carry ratings from triple-A to single-B. There is frequently an unrated equity class. The holders of the notes are paid interest and principal in order of seniority. One can distinguish the following types of CDOs.

Arbitrage CDOs. In an arbitrage CDO, an issuer seeks to capture an arbitrage between the pricing/yield of high-yield sub-investment-grade securities that are acquired in the capital markets and the yield on investment-grade bond assets that are sold to investors. This allows investors who could not otherwise invest in sub-investment-grade assets to participate in this market. There are two basic types of arbitrage CDOs, namely cash-flow CDOs and market-value CDOs. While for the former, credit events short of default are not relevant for the performance, the collateral pool of the latter is marked to market regularly, and the asset manager is required to trade actively.

Conventional CDOs. A balance sheet CLO is typically created by a bank or financial institution wishing to securitize illiquid loan assets that they have originated. The loan assets may be fairly heterogeneous, although most balance sheet CLOs have been done using investment-grade loans.

9.7.2 Review of Modelling Methods

To construct a model to evaluate CDOs, the following quantities have to be modelled:
• default probabilities for the assets in the pool,
• default dependence of the assets,
• loss in event of default,
• timing of default.

We will now review several modelling approaches that have been proposed in the literature and are being used in practical applications.

Moody's Binomial Expansion Technique. A simple approximation is given by Moody's Binomial Expansion Technique (BET), which is based on a diversity score: represent the loss distribution of $N$ bonds from the same industry by that of $M \le N$ independent identical bonds, i.e. construct a comparison portfolio. The parameter $M$ is known as the diversity score (DS). Table 9.3 reports the suggested diversity score for a standard application.

   Number of firms   Diversity score
   1                 1.0
   2                 1.5
   3                 2.0
   4                 2.3
   5                 2.6
   6                 3.0
   7                 3.2
   8                 3.5
   9                 3.7
   10                4.0
   > 10              case by case

Table 9.3. Moody's Diversity Score

Given the diversity score, Moody's technique is as follows. To calculate default probabilities under a diversity score of $M$, we can now use the binomial distribution. Thus $Z$, the number of defaulting bonds, has distribution

\[ q_n = \mathbb{P}(Z = n) = \binom{M}{n} p^n (1 - p)^{M - n}, \]

with $p$ the default probability according to the rating of the tranche. Furthermore, the expected loss is computed as

\[ \sum_{n=0}^{M} q_n L_n, \]

where $L_n$ is the loss incurred when $n$ bonds default in the portfolio. The problematic aspect of this approach is that there is no probabilistic basis for the diversity score. The idea is simply to match the mean and the standard deviation of the return distribution associated with the collateral pool.

Infectious Defaults. This is a first application in this field of the important general phenomenon of market contagion. The idea has been put forward by Davis and Lo (2001a) and Davis and Lo (2001b), to which we refer for details of the calculations involved. The basic idea is that a bond can either default directly or may be infected to default by the default of a different bond (infectious default).

So assume that we consider $n$ bonds, and use the indicator $Z_i = 1$ if bond $i$ defaults, $Z_i = 0$ otherwise. Then $N = Z_1 + \ldots + Z_n$ is the number of defaulted bonds. For $i = 1, \ldots, n$ and $j = 1, \ldots, n$ with $i \ne j$ let $X_i, Y_{ij}$ be independent Bernoulli random variables with $\mathbb{P}(X_i = 1) = p$ and $\mathbb{P}(Y_{ij} = 1) = q$. Then

\[ Z_i = X_i + (1 - X_i) \left( 1 - \prod_{j \ne i} (1 - X_j Y_{ji}) \right), \]

where the second term models infection. $\mathbb{P}(N = k)$ can be computed in closed form, also

\[ \mathbb{E}(N) = n \left( 1 - (1 - p)(1 - pq)^{n-1} \right) \]

and

\[ \mathrm{Var}(N) = \mathbb{E}(N) + n(n - 1) \beta_n^{pq} - (\mathbb{E}(N))^2, \]

where $\beta_n^{pq} = \mathbb{E}(Z_i Z_j)$ for $i \ne j$. To see the effect of $q$, we keep the expected number of defaults constant while increasing $q$. We consider a total of $n = 50$ bonds. The results are:

   q      implied p   std. dev.
   0      0.5          3.54
   0.05   0.194        6.05
   0.1    0.116        7.70
   0.2    0.064       10.32

Table 9.4. Effect of infection parameter
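The entries of Table 9.4 can be reproduced approximately from the moment formulas. The sketch below is an illustration under stated assumptions: the expression used for $\beta_n^{pq} = \mathbb{E}(Z_i Z_j)$ is obtained here by conditioning on $(X_i, X_j)$ and is not quoted from the text (Davis and Lo give the closed form), the implied $p$ is found by bisection on $\mathbb{E}(N)$, and all function names are hypothetical.

```python
import math


def expected_defaults(n, p, q):
    """E(N) = n * (1 - (1 - p) * (1 - p*q)^(n - 1))."""
    return n * (1.0 - (1.0 - p) * (1.0 - p * q) ** (n - 1))


def joint_default_prob(n, p, q):
    """beta = E(Z_i Z_j), i != j, by conditioning on (X_i, X_j)."""
    both_direct = p * p
    # One bond defaults directly, the other must be infected by someone.
    one_direct = 2.0 * p * (1.0 - p) * (
        1.0 - (1.0 - q) * (1.0 - p * q) ** (n - 2))
    # Neither defaults directly: both must be infected by the other n - 2.
    none_direct = (1.0 - p) ** 2 * (
        1.0 - 2.0 * (1.0 - p * q) ** (n - 2)
        + (1.0 - 2.0 * p * q + p * q * q) ** (n - 2))
    return both_direct + one_direct + none_direct


def std_defaults(n, p, q):
    """Standard deviation of N from Var(N) = E(N) + n(n-1)*beta - E(N)^2."""
    mean = expected_defaults(n, p, q)
    var = mean + n * (n - 1) * joint_default_prob(n, p, q) - mean ** 2
    return math.sqrt(var)


def implied_p(n, q, target_mean):
    """Solve E(N) = target_mean for p by bisection (E(N) increases in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_defaults(n, mid, q) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p_star = implied_p(50, 0.05, 25.0)
sd_star = std_defaults(50, p_star, 0.05)
```

With $q = 0$ the model reduces to independent defaults, so the standard deviation is the binomial value $\sqrt{n p (1 - p)}$; for $q = 0.05$ the sketch gives an implied $p$ of about 0.194 and a standard deviation of about 6.05, in line with the table.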

$q$ can be used to model the volatility of the default (loss) distribution. To model the timing of defaults, the following extension of the model is used. Assume $n$ bonds with exponentially distributed default time; then $p = 1 - e^{-\lambda t}$ is the probability of default within $[0, t]$. Let $N_t$ be the number of defaults in $[0, t]$; then the default rate is proportional to the number of bonds alive:

\[ \mathbb{P}(\text{default in } [t, t + dt] \mid N_t) = \mathbb{E}(dN_t \mid N_t) = \lambda (n - N_t)\,dt. \]

For $m(t) = \mathbb{E}(N_t)$, we have

\[ m(t) = n \lambda t - \int_0^t \lambda\, m(s)\,ds \]

(as $M_t = N_t - \int_0^t \lambda (n - N_s)\,ds$ is a martingale). Thus

\[ m(t) = n \left( 1 - e^{-\lambda t} \right). \]

Interaction is modelled by increasing the total hazard rate after a default by a factor $a$ for an exponentially distributed time interval. It is possible to compute the distribution of the number of defaults; see Davis and Lo (2001b) for details.
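For the model without interaction, the integral equation for $m(t)$ is equivalent to the linear ODE $m'(t) = \lambda (n - m(t))$ with $m(0) = 0$. The following sketch, with hypothetical parameter values, integrates this ODE by a plain Euler scheme and compares the result with the closed form $n(1 - e^{-\lambda t})$.

```python
import math


def mean_defaults_euler(n, lam, t, steps=100000):
    """Euler scheme for m'(s) = lam * (n - m(s)), m(0) = 0."""
    dt = t / steps
    m = 0.0
    for _ in range(steps):
        m += lam * (n - m) * dt
    return m


def mean_defaults_closed_form(n, lam, t):
    """Closed-form solution m(t) = n * (1 - exp(-lam * t))."""
    return n * (1.0 - math.exp(-lam * t))


numeric = mean_defaults_euler(n=50, lam=0.1, t=5.0)
exact = mean_defaults_closed_form(n=50, lam=0.1, t=5.0)
```

The two values agree up to the discretisation error of the Euler scheme, which shrinks linearly in the step size.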

Intensity-based Approach. Duffie and Gârleanu (2001) present an intensity-based model. They model the default intensities of bonds as

\[ \lambda_i(t) = X_i(t) + X_c(t), \quad i = 1, \ldots, n, \]

with $X_1, \ldots, X_n, X_c$ independent affine processes (say of CIR-type). The common intensity factor $X_c$ allows one to model dependency. Efficient simulation methods are available for obtaining the distribution of the number of defaults.

Ratings-based Modelling. One could also use a diffusion-driven CreditMetrics extension. Here every bond is modelled using the structural approach, i.e. we model the underlying value of the firm by a diffusion process. Default occurs if the value-of-the-firm process falls below a threshold. Dependencies within the pool can be captured by the cross-variation of the underlying Brownian motions. See also Hamilton, James, and Webber (2001) for an extension of this approach.

A. Hilbert Space

Recall our use of $n$-dimensional Euclidean space $\mathbb{R}^n$, the set of $n$-vectors or $n$-tuples $x = (x_1, \ldots, x_n)$ with each $x_i \in \mathbb{R}$. Here one has the ordinary Euclidean length - the norm

\[ \|x\| := \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2} \]

- and the inner product (or dot product) of two vectors $x$ and $y$:

\[ (x, y), \text{ or } x \cdot y, \quad := \sum_{i=1}^{n} x_i y_i. \]

This setting is adequate for handling finite-dimensional situations, but not for infinite-dimensional ones. The simplest infinite-dimensional situation containing the above as a special case is the space $\ell^2$ of square-summable sequences $x = (x_1, x_2, \ldots)$ with

\[ \|x\| := \left( \sum_{n=1}^{\infty} x_n^2 \right)^{1/2} < \infty. \]

Because of the Cauchy-Schwarz inequality

\[ \left| \sum_{i=1}^{n} x_i y_i \right| \le \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2} \left( \sum_{i=1}^{n} y_i^2 \right)^{1/2}, \]

if $x, y \in \ell^2$, then

\[ (x, y) := \sum_{n=1}^{\infty} x_n y_n \]

is defined, and

\[ |(x, y)| \le \|x\| \|y\| \]

(the series $\sum_{n=1}^{\infty} x_n y_n$ is absolutely convergent, that is, $\sum_{n=1}^{\infty} |x_n| |y_n| < \infty$ - just replace $x_n, y_n$ by $|x_n|, |y_n|$ in the above). One may choose to work instead with complex sequences: all the above goes through, with the changes

\[ \|x\| := \left( \sum_{n=1}^{\infty} |x_n|^2 \right)^{1/2}, \qquad (x, y) := \sum_{n=1}^{\infty} x_n \bar{y}_n. \]

A Hilbert space is a (possibly - indeed, usually) infinite-dimensional vector space endowed with such an inner product, and also complete (in the sense of metric spaces; all Cauchy sequences converge - see Burkill and Burkill (1970)). For background, we must refer to the excellent textbook treatments of e.g. Young (1988) and Bollobás (1990). Note that we already have a good supply of Hilbert spaces: finite-dimensional ones such as $\mathbb{R}^n$ and infinite-dimensional ones such as $\ell^2$ and $L^2$.

Hilbert spaces closely resemble ordinary Euclidean spaces in many respects. In particular, one can take orthogonal complements. If $M$ is a vector subspace of a Hilbert space $H$ which is closed (contains all its limit points), one can form its orthogonal complement

\[ M^{\perp} := \{ y \in H : (x, y) = 0 \ \forall x \in M \}; \]

then $M^{\perp}$ is also a closed vector subspace of $H$, and any $z \in H$ is uniquely expressible in the form

\[ z = x + y \quad \text{with} \quad x \in M, \ y \in M^{\perp} \]

(and so $(x, y) = 0$). One then says that $H$ is the direct sum of $M$ and $M^{\perp}$, written $H = M \oplus M^{\perp}$. If $z = x + y$,

\[ \|z\|^2 = (z, z) = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y) = \|x\|^2 + 2(x, y) + \|y\|^2. \]

In particular, if $(x, y) = 0$ (i.e. $x, y$ are orthogonal),

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2. \]

This is the (in general) infinite-dimensional version of Pythagoras' theorem. Hilbert spaces are the easiest of infinite-dimensional spaces to work with because they so closely resemble finite-dimensional, or Euclidean, ones. In particular, one can think geometrically in a Hilbert space, using diagrams as one would for (say) $\mathbb{R}^2$ or $\mathbb{R}^3$; see Appendix C.

Functional analysis is the study of infinite-dimensional spaces, and so Hilbert-space theory forms an important part of it. Excellent textbook treatments are available; we particularly recommend Young (1988) for Hilbert space alone, Bollobás (1990) for the more general setting of functional analysis.

B. Projections and Conditional Expectations

Given a Hilbert space (or more generally, an inner product space) $V$, suppose $V$ is the direct sum of a closed subspace $M$ and its orthogonal complement $M^{\perp}$:

\[ V = M \oplus M^{\perp}. \]

In the direct-sum decomposition

\[ z = x + y \]

of a vector $z \in V$ into a sum of $x \in M$ and $y \in M^{\perp}$, consider the map $P : z \to x$. This is called the orthogonal projection of $V$ onto $M$. It is linear, and idempotent: since the direct-sum decomposition of $x = Pz$ is $x = x + 0$, $P(Pz) = Px = x = Pz$, or

\[ P^2 = P. \]

By Pythagoras' theorem,

\[ \|Pz\|^2 = \|x\|^2 \le \|x\|^2 + \|y\|^2 = \|z\|^2. \]

In particular, $\|Pz\| \le \|z\|$. That is, application of $P$ does not increase the norm of a vector: one says that $P$ has norm $\|P\| \le 1$. Conversely, we quote that on an inner product space, these properties (linear and idempotent with $\|P\| \le 1$) characterise orthogonal projections. The range $R(P)$ of $P$, that is, the set of $x$ of the form $x = Pz$ for some $z$, is, as above, the set of vectors invariant under $P$ (that is, the set of $z$ with $Pz = z$); $P$ is called the projection onto its range $M = R(P)$. The orthogonal complement $M^{\perp}$ of the range - the set of $z$ with zero $x$-component - is the set of $z$ annihilated by $P$, the kernel (or nullspace) $N(P)$ of $P$. Thus the direct-sum decomposition for the orthogonal projection $P$ onto $M$ is

\[ V = M \oplus M^{\perp} = R(P) \oplus N(P). \]

The situation is symmetric between $M$ and $M^{\perp}$: write $Q := I - P$, with $I$ the identity mapping. Then $Q$ is linear, and as $P^2 = P$,

\[ Q^2 = (I - P)^2 = I - 2P + P^2 = I - 2P + P = I - P = Q: \]

$Q^2 = Q$, and conversely, if $Q^2 = Q$, then $P^2 = P$. Thus also

\[ V = M \oplus M^{\perp} = N(Q) \oplus R(Q), \qquad Q := I - P. \]

In particular, when $V$ is finite-dimensional, the content of the above reduces to linear algebra (there is no need to assume $M$ closed, as closure is automatic, so analysis is not needed). For a textbook treatment in this context, see Halmos (1958). The above use of orthogonal projection, Pythagoras' theorem etc. underlies the theory of the familiar linear model of statistics: normally distributed errors, least squares, regression, analysis of variance etc. For such a geometric treatment of the linear model, see e.g. Rawlings (1988), Chapter 6.

Conditional Expectations and Projections

We confine ourselves here to the $L^2$ theory. Take the vector space $L^2 = L^2(\Omega, \mathcal{F}, \mathbb{P})$ of square-integrable random variables $X$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. This is a Hilbert space under the norm

\[ \|X\| := \left( \mathbb{E}(X^2) \right)^{1/2} \]

and inner product

\[ (X, Y) := \mathbb{E}(XY). \]

The space $L^2$ is complete, by the Riesz-Fischer theorem (see e.g. Bollobás (1990)).

If $M$ is a vector subspace of $L^2$ which is closed (equivalently: complete), given any $X \in L^2$ one can 'drop a perpendicular' from $X$ to $M$, obtaining $Y \in M$ with

\[ \|X - Y\| = \inf \{ \|X - W\| : W \in M \} \]

and $(X - Y, Z) = 0$ for all $Z \in M$ (see e.g. Williams (1991), §6.11). Then the map $X \mapsto Y$ is the orthogonal projection onto $M$: it is linear, idempotent and of norm at most 1. Suppose now that $\mathcal{G}$ is a sub-$\sigma$-field of $\mathcal{F}$ and $M$ is the $L^2$-space of $\mathcal{G}$-measurable functions. Then

\[ X \mapsto \mathbb{E}(X \mid \mathcal{G}) \]

is the orthogonal projection of $X$ onto $M = L^2(\Omega, \mathcal{G}, \mathbb{P})$. It gives the best predictor (in the least-squares sense) of $X$ given $\mathcal{G}$ (that is, it minimises the mean-square error of all predictors of $X$ given the information represented by $\mathcal{G}$): see e.g. Williams (1991), §9.4. The idempotence of this conditional expectation operator follows from the iterated conditional expectation operation of §2.5. This idempotence is also suggested by the above interpretation: forming our best estimate given available information should give the same result done once as done twice.
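On a finite probability space the projection picture can be checked directly. In the sketch below, which uses hypothetical data, the sub-$\sigma$-field is generated by a partition of the sample space, so the conditional expectation is the probability-weighted average of $X$ over each block; the code then verifies idempotence, orthogonality of the residual to the indicator of each block, and Pythagoras' theorem.

```python
def cond_exp(x, probs, partition):
    """E(X | G) for the sigma-field G generated by a partition of the outcomes."""
    result = [0.0] * len(x)
    for block in partition:
        mass = sum(probs[w] for w in block)
        avg = sum(x[w] * probs[w] for w in block) / mass
        for w in block:
            result[w] = avg
    return result


def inner(x, y, probs):
    """Inner product (X, Y) = E(XY) on a finite probability space."""
    return sum(a * b * p for a, b, p in zip(x, y, probs))


probs = [0.1, 0.2, 0.3, 0.4]        # hypothetical probabilities of 4 outcomes
x = [1.0, 4.0, 2.0, 8.0]            # hypothetical random variable X
partition = [[0, 1], [2, 3]]        # blocks generating the sub-sigma-field

y = cond_exp(x, probs, partition)   # projection of X
residual = [a - b for a, b in zip(x, y)]
```

Here `y` is constant on each block, applying `cond_exp` to `y` again returns `y`, and the squared norm of `x` splits into the squared norms of `y` and of the residual.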

This picture of conditional expectation as projection is powerful - partly because, in the $L^2$-setting, it allows us to think and argue geometrically. The excellent text Neveu (1975) on martingales is based on this viewpoint.

C. The Separating Hyperplane Theorem

In a vector space $V$, if $x$ and $y$ are vectors, the set of linear combinations $\alpha x + \beta y$, with scalars $\alpha, \beta \ge 0$ with sum $\alpha + \beta = 1$, represents geometrically the line segment joining $x$ to $y$. Each such linear combination $\lambda x + (1 - \lambda) y$, with $0 \le \lambda \le 1$, is called a convex combination of $x$ and $y$. A set $C$ in $V$ is called convex if, for all pairs $x$ and $y$ of points in $C$, all convex combinations of $x$ and $y$ are also in $C$. If $V$ has dimension $n$ and $U$ is a subspace of dimension $n - 1$, $U$ is said to have codimension 1. If $U$ is a subspace,

\[ x + U := \{ x + u : u \in U \} \]

is called the translate of $U$ by the vector $x$. A hyperplane in $V$ is a translate of a subspace of codimension 1. Such a hyperplane is always representable in the form

\[ H = [f, \alpha] := \{ x : f(x) = \alpha \}, \]

for some scalar $\alpha$ and linear functional $f$: that is, a map $f : V \to \mathbb{R}$ with

\[ f(x + y) = f(x) + f(y) \quad (x, y \in V), \qquad f(\lambda x) = \lambda f(x) \quad (x \in V, \ \lambda \in \mathbb{R}). \]

Such an $f$ is of the form

\[ f(x) = f_1 x_1 + \ldots + f_n x_n; \]

then $f = (f_1, \ldots, f_n)$ defines a vector $f$ in $V$, and the hyperplane $H = [f, \alpha]$ consists of those vectors $x$ in $V$ whose projections onto $f$ have magnitude $\alpha$. The hyperplane $[f, \alpha]$ bounds the set $A \subset V$ if

\[ f(x) \ge \alpha \ \forall x \in A \quad \text{or} \quad f(x) \le \alpha \ \forall x \in A; \]

the hyperplane $[f, \alpha]$ separates the sets $A, B \subset V$ if

\[ f(x) \ge \alpha \ \forall x \in A \quad \text{and} \quad f(x) \le \alpha \ \forall x \in B, \]

or the same inequalities with $A$, $B$ (or $\ge$, $\le$) interchanged. The following result is crucial for many purposes, both in mathematics and in economics and finance.

Theorem C.0.1 (Separating Hyperplane Theorem). If $A$, $B$ are two non-empty disjoint convex sets in a vector space $V$, they can be separated by a hyperplane.

For proof and background, see e.g. Valentine (1964), Part II and Bott (1942). The restriction to finite dimension is not in fact necessary: the result is true as stated even if $V$ has infinite dimension (for proof, see e.g. Bollobás (1990), Chapter 3). In this form, the result is closely linked to the Hahn-Banach theorem, the cornerstone of functional analysis. Again, Bollobás (1990) is a fine introduction.

Remark. When using a book on functional analysis, it is usually a good idea to look out for the results whose proof depends on the Hahn-Banach theorem: these are generally the key results, and the hard ones. The same is true in mathematical economics or finance of the separating hyperplane theorem.

Bibliography

Aït-Sahalia, Y., 1996, Nonparametric pricing of interest rate derivative securities, Econometrica 64, 527-600.
AitSahlia, F., and T.-L. Lai, 1998a, Random walk duality and the valuation of discrete lookback options, Working paper, Department of Statistics, Stanford University.
AitSahlia, F., and T.-L. Lai, 1998b, Valuation of discrete barrier and hindsight options, To appear in Journal of Financial Engineering.
Allingham, M., 1991, Arbitrage. Elements of financial economics. (MacMillan, New York).
Amin, K., and A. Khanna, 1994, Convergence of American option values from discrete- to continuous-time financial models, Mathematical Finance 4, 289-304.
Applebaum, D.B., 2004, Lévy processes and stochastic calculus. (Cambridge University Press, Cambridge).
Aurell, E., and S.I. Simdyankin, 1998, Pricing risky options simply, International Journal of Theoretical and Applied Finance 1, 1-23.
Back, K., and S.R. Pliska, 1991, On the fundamental theorem of asset pricing with an infinite state space, Journal of Mathematical Economics 20, 1-18.
Bagnold, R.A., 1941, The physics of blown sand and desert dunes. (Matthew, London).
Bagnold, R.A., and O.E. Barndorff-Nielsen, 1979, The pattern of natural size distributions, Sedimentology 27, 199-207.
Bajeux-Besnainou, I., and R. Portrait, 1997, The numeraire portfolio: A new approach to continuous time finance, The European Journal of Finance.
Bajeux-Besnainou, I., and J-C. Rochet, 1996, Dynamic spanning: Are options an appropriate instrument?, Mathematical Finance 6, 1-16.
Barndorff-Nielsen, O.E., 1977, Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. London A 353, 401-419.
Barndorff-Nielsen, O.E., 1998, Processes of normal inverse Gaussian type, Finance and Stochastics 2, 41-68.
Barndorff-Nielsen, O.E., P. Blæsild, J.L. Jensen, and M. Sørensen, 1985, The fascination of sand, in A.C. Atkinson, and S.E. Fienberg, eds.: A celebration of statistics (Springer, New York).
Barndorff-Nielsen, O.E., and O. Halgreen, 1977, Infinite divisibility of the hyperbolic and generalized inverse Gaussian distributions, Z. Wahrschein. 38, 309-312.
Barndorff-Nielsen, O.E., T. Mikosch, and S.I. Resnick, 2000, Lévy processes: theory and applications. (Birkhäuser Verlag, Basel).
Barndorff-Nielsen, O.E., and N. Shephard, 2001, Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics, J. R. Statist. Soc. B 63, 167-241.


Barndorff-Nielsen, O.E., and N. Shephard, 2002, Econometric analysis of realized volatility and its use in estimating stochastic volatility models, J. R. Statist. Soc. B 64, 253-280.
Barndorff-Nielsen, O.E., and M. Sørensen, 1994, A review of some aspects of asymptotic likelihood theory for stochastic processes, International Statistical Review 62, 133-165.
Bass, R.F., 1995, Probabilistic techniques in analysis. (Springer, Berlin Heidelberg New York).
Beran, J., 1994, Statistics for long-memory processes. (Chapman & Hall, London).
Bertoin, J., 1996, Lévy processes vol. 121 of Cambridge tracts in mathematics. (Cambridge University Press, Cambridge).
Bibby, B.M., and M. Sørensen, 1997, A hyperbolic diffusion model for stock prices, Finance and Stochastics 1, 25-41.
Bielecki, T.R., and M. Rutkowski, 2002, Credit risk: modeling, valuation and hedging. (Springer, New York).
Biger, N., and J. Hull, 1983, The valuation of currency options, Finan. Management 12, 24-28.
Billingsley, P., 1968, Convergence of probability measures. (Wiley, New York).
Billingsley, P., 1986, Probability Theory. (Wiley, New York).
Bingham, N.H., and R. Kiesel, 2001, Hyperbolic and semiparametric models in finance, in P. Sollich, A.C.C. Coolen, L.P. Hughston, and R.F. Streater, eds.: Disordered and complex systems (Amer. Inst. of Physics).
Bingham, N.H., and R. Kiesel, 2002, Semi-parametric modelling in finance: theoretical foundation, Quantitative Finance 2 pp. 241-250.
Bingham, N.H., C.M. Goldie, and J.L. Teugels, 1987, Regular Variation. (Cambridge University Press, Cambridge).
Björk, T., 1995, Arbitrage theory in continuous time, Notes from Ascona meeting.
Björk, T., 1997, Interest rate theory, in Financial Mathematics, ed. by W.J. Runggaldier Lecture Notes in Mathematics pp. 53-122. Springer, Berlin New York London.
Björk, T., 1999, Arbitrage theory in continuous time. (Oxford University Press, Oxford).
Björk, T., G. Di Masi, Y. Kabanov, and W. Runggaldier, 1997, Towards a general theory of bond markets, Finance and Stochastics 1, 141-174.
Björk, T., Y. Kabanov, and W. Runggaldier, 1997, Bond market structure in the presence of marked point processes, Mathematical Finance 7, 211-239.
Black, F., 1976, The pricing of commodity contracts, J. Financial Economics 31, 167-179.
Black, F., 1989, How we came up with the option formula, J. Portfolio Management 15, 4-8.
Black, F., and J.C. Cox, 1976, Valuing corporate securities: Some effects of bond indenture provisions, Journal of Finance 31, 351-367.
Black, F., E. Derman, and W. Toy, 1990, A one-factor model of interest rates and its application to treasury bond options, Finan. Analysts J. pp. 33-39.
Black, F., and M. Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political Economy 72, 637-659.
Bluhm, C., L. Overbeck, and C. Wagner, 2003, An introduction to credit risk modelling. (Chapman & Hall, London).
Bodie, Z., A. Kane, and A.J. Marcus, 1999, Investments. (McGraw-Hill) 4th edn.
Bollobás, B., 1990, Linear analysis. An introductory course. (Cambridge University Press, Cambridge).
Bott, T., 1942, Convex sets, American Math. Monthly 49, 527-535.


Bowler, T., and J.F. Tierny, 1999, Credit derivatives and structured credit, Working paper, Deutsche Bank.
Boyle, P.P., J. Evnine, and S. Gibbs, 1989, Numerical evaluation of multivariate contingent claims, Review of Financial Studies 2, 241-250.
Brace, A., D. Gatarek, and M. Musiela, 1997, The market model of interest rate dynamics, Mathematical Finance 7, 127-154.
Brace, A., M. Musiela, and W. Schlögl, 1998, A simulation algorithm based on measure relationships in the lognormal market models, Working paper, University of New South Wales.
Brandt, W., and P. Santa-Clara, 2002, Simulated likelihood estimation of diffusions with an application to exchange rate dynamics in incomplete markets, Journal of Financial Economics 63, 161-210.
Breiman, L., 1992, Probability. (Siam, Philadelphia) 2nd edn. First edition Addison-Wesley, Reading, Mass. 1968.
Brigo, D., and F. Mercurio, 2001, Interest rate models - theory and practice. (Springer).
Briys, E., and F. de Varenne, 1997, Valuing risky fixed rate debt: An extension, Journal of Financial and Quantitative Analysis 32, 239-248.
Broadie, M., and J. Detemple, 1997, Recent advances in numerical methods for pricing derivative securities, in L.C.G. Rogers, and D. Talay, eds.: Numerical Methods in Finance (Cambridge University Press, Cambridge).
Brown, R.H., and S.M. Schaefer, 1995, Interest rate volatility and the shape of the term structure, in S.D. Howison, F.P. Kelly, and P. Wilmott, eds.: Mathematical models in finance (Chapman & Hall, London).
Bühlmann, H., F. Delbaen, P. Embrechts, and A. Shiryaev, 1996, No arbitrage, change of measure and conditional Esscher transforms, CWI Quarterly 9, 291-317.
Bühlmann, H., F. Delbaen, P. Embrechts, and A. Shiryaev, 1998, On Esscher transforms in discrete financial models, Preprint, ETH Zürich.
Burkholder, D.L., 1966, Martingale transforms, Ann. Math. Statist. 37, 1494-1504.
Burkill, J.C., 1962, A first course in mathematical analysis. (Cambridge University Press, Cambridge).
Burkill, J.C., and H. Burkill, 1970, A second course in mathematical analysis. (Cambridge University Press, Cambridge).
Cambanis, S., S. Huang, and G.S. Simons, 1981, On the theory of elliptically contoured distributions, J. Multivariate Analysis 11, 368-385.
Campbell, J.Y., A.W. Lo, and A.C. MacKinlay, 1997, The econometrics of financial markets. (Princeton University Press, Princeton).
Carr, P., E. Chang, and D.B. Madan, 1998, The Variance-Gamma process and option pricing, European Finance Review 2, 79-105.
Carr, P., K. Ellis, and V. Gupta, 1998, Static hedging of exotic options, Journal of Finance pp. 1165-1190.
Carr, P., H. Geman, D.B. Madan, and M. Yor, 2002, The fine structure of asset returns: An empirical investigation, Journal of Business 75, 305-332.
Chan, K.C., et al., 1992, An empirical comparison of alternative models of the short-term interest rates, Journal of Finance 47, 1209-1228.
Chan, K.C., G.A. Karolyi, F.A. Longstaff, and A.B. Sanders, 1992, An empirical comparison of alternative models of the short-term interest rate, Journal of Finance 47, 1209-1227.
Chan, T., 1999, Pricing contingent claims on stocks driven by Lévy processes, Annals Applied Probab. 9, 504-528.
Chapman, D.A., and N.D. Pearson, 2000, Is the short rate drift actually nonlinear? Journal of Finance 55, 355-388.


Chen, L., 1996, Interest rate dynamics, derivatives pricing, and risk management vol. 435 of Lecture notes in economics and mathematical systems. (Springer, Berlin Heidelberg New York).
Chichilnisky, G., 1996, Fisher Black: Obituary, Notices of the AMS pp. 319-322.
Chow, Y.S., H. Robbins, and D. Siegmund, 1991, The theory of optimal stopping. (Dover, New York) 2nd edn. 1st ed., Great expectations: The theory of optimal stopping, 1971.
Cochrane, J.H., 2001, Asset pricing. (Princeton University Press, Princeton).
Cox, D.R., and H.D. Miller, 1972, The theory of stochastic processes. (Chapman and Hall, London and New York) First published 1965 by Methuen & Co Ltd.
Cox, J.C., and S.A. Ross, 1976, The valuation of options for alternative stochastic processes, Journal of Financial Economics 3, 145-166.
Cox, J.C., S.A. Ross, and M. Rubinstein, 1979, Option pricing: a simplified approach, J. Financial Economics 7, 229-263.
Cox, J.C., and M. Rubinstein, 1985, Options markets. (Prentice-Hall, Englewood Cliffs, NJ).
Croughy, M., D. Galai, and R. Mark, 2001, Risk management. (McGraw Hill, New York).
Cutland, N.J., E. Kopp, and W. Willinger, 1991, A nonstandard approach to option pricing, Mathematical Finance 1, 1-38.
Cutland, N.J., E. Kopp, and W. Willinger, 1993a, From discrete to continuous financial models: New convergence results for option pricing, Mathematical Finance 3, 101-123.
Cutland, N.J., E. Kopp, and W. Willinger, 1993b, A nonstandard treatment of options driven by Poisson processes, Stochastics and Stochastics Reports 42, 115-133.
Dalang, R.C., A. Morton, and W. Willinger, 1990, Equivalent martingale measures and no-arbitrage in stochastic securities market models, Stochastics and Stochastic Reports 29, 185-201.
Daley, D., and D. Vere-Jones, 1988, An introduction to the theory of point processes. (Springer, New York).
Dana, R.-A., and M. Jeanblanc, 2002, Financial markets in continuous time. (Springer, Berlin Heidelberg New York).
Davis, M.H.A., 1994, A general option pricing formula, Preprint, Imperial College.
Davis, M.H.A., 1997, Option pricing in incomplete markets, in M.A.H. Dempster, and S.R. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Davis, M., and V. Lo, 2001a, Infectious Default, Quantitative Finance 1, 382-386.
Davis, M., and V. Lo, 2001b, Modelling default correlation in bond portfolios, in Carol Alexander, eds.: Mastering risk volume 2: Applications (Financial Times Prentice-Hall, Englewood Cliffs, NJ).
Delbaen, F., et al., 1997, Weighted norm inequalities and hedging in incomplete markets, Finance and Stochastics 1, 181-227.
Delbaen, F., and W. Schachermayer, 1994, A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463-520.
Delbaen, F., and W. Schachermayer, 1995a, The existence of absolutely continuous local martingale measures, Ann. Appl. Prob. 5, 926-945.
Delbaen, F., and W. Schachermayer, 1995b, The no-arbitrage property under a change of numeraire, Stochastics and Stochastics Reports 53, 213-226.
Delbaen, F., and W. Schachermayer, 1996, The variance-optimal martingale measure for continuous processes, Bernoulli 2, 81-106.
Delbaen, F., and W. Schachermayer, 1998, The fundamental theorem of asset pricing for unbounded stochastic processes, Math. Annal 312, 215-250.


Dellacherie, C., and P.-A. Meyer, 1978, Probabilities and potential vol. A. (Hermann, Paris).
Dellacherie, C., and P.-A. Meyer, 1982, Probabilities and potential vol. B. (North Holland, Amsterdam New York).
Dixit, A.K., and R.S. Pindyck, 1994, Investment under uncertainty. (Princeton University Press, Princeton).
Dohnal, G., 1987, On estimating the diffusion coefficient, J. Appl. Probab. 24, 105-114.
Doob, J.L., 1984, Classical potential theory and its probabilistic counterpart vol. 262 of Grundl. math. Wissenschaft. (Springer, Berlin Heidelberg New York).
Doob, J.L., 1953, Stochastic processes. (Wiley, New York).
Dothan, M.U., 1990, Prices in financial markets. (Oxford University Press, Oxford).
Downing, C., 1999, Nonparametric estimation of multifactor continuous time interest rate models, Working Paper, Federal Reserve Board.
Duan, J.-C., 1995, The GARCH option pricing model, Mathematical Finance 5, 13-32.
Dubins, L.E., and L.J. Savage, 1976, Inequalities for stochastic processes. How to gamble if you must. (Dover, New York) 2nd edn. 1st ed. How to gamble if you must. Inequalities for stochastic processes, McGraw-Hill, 1965.
Dudley, R.M., 1989, Real analysis and probability. (Wadsworth, Pacific Grove).
Duffee, Gregory R., 1999, Estimating the price of default risk, Review of Financial Studies 12, 197-226.
Duffie, D., 1989, Futures markets. (Prentice-Hall, Englewood Cliffs, NJ).
Duffie, D., 1992, Dynamic asset pricing theory. (Princeton University Press, Princeton).
Duffie, D., 1996, State-space models of the term structure of interest rates, in L. Hughston, eds.: Vasicek and beyond (Risk Publications, London).
Duffie, D., and N. Garleanu, 2001, Risk and Valuation of Collateralized Debt Obligations, Financial Analysts Journal 57, 41-59.
Duffie, D., and C.-F. Huang, 1985, Implementing Arrow-Debreu equilibria by continuous trading of a few long-lived securities, Econometrica 53, 1337-1356.
Duffie, D., and R. Kan, 1995, Multi-factor term structure models, in S.D. Howison, F.P. Kelly, and P. Wilmott, eds.: Mathematical models in finance (Chapman & Hall, London).
Duffie, D., and P. Protter, 1992, From discrete- to continuous-time finance: Weak convergence of the financial gain process, Mathematical Finance 2, 1-15.
Duffie, D., and H.R. Richardson, 1991, Mean-variance hedging in continuous time, Ann. Appl. Probab. 1, 1-15.
Duffie, D., and K.J. Singleton, 1997, An econometric model of the term structure of interest rate swap yields, Journal of Finance 52, 1287-1321.
Duffie, D., and K.J. Singleton, 1999, Modeling term structures of defaultable bonds, Review of Financial Studies 12, 687-720.
Duffie, D., and K.J. Singleton, 2003, Credit Risk. (Princeton University Press, Princeton).
Duffie, D., and C. Skiadas, 1994, Continuous-time security pricing: A utility gradient approach, J. Mathematical Economics 23, 107-131.
Durrett, R., 1996a, Probability: Theory and examples. (Duxbury Press at Wadsworth Publishing Company) 2nd edn.
Durrett, R., 1996b, Stochastic Calculus: A practical introduction. (CRC Press).
Durrett, R., 1999, Essentials of stochastic processes. (Springer, New York).


Dybvig, P.H., and S.A. Ross, 1987, Arbitrage, in M. Milgate, J. Eatwell, and P. Newman, eds.: The New Palgrave: Dictionary of Economics (Macmillan, London).
Eberlein, E., 2001, Applications of generalized hyperbolic Lévy motions to finance, in O.E. Barndorff-Nielsen, T. Mikosch, and S. Resnick, eds.: Lévy processes: Theory and Applications (Birkhäuser Verlag, Boston).
Eberlein, E., and J. Jacod, 1997, On the range of option prices, Finance and Stochastics 1, 131-140.
Eberlein, E., and U. Keller, 1995, Hyperbolic distributions in finance, Bernoulli 1, 281-299.
Eberlein, E., U. Keller, and K. Prause, 1998, New insights into smile, mispricing and Value-at-Risk: The hyperbolic model, J. Business 71, 371-406.
Eberlein, E., and S. Raible, 1998, Term structure models driven by general Lévy processes, Mathematical Finance 9, 31-53.
Edwards, F.R., and C.W. Ma, 1992, Futures and options. (McGraw-Hill, New York).
El Karoui, N., R. Myneni, and R. Viswanathan, 1992a, Arbitrage pricing and hedging of interest rate claims with state variables: I. Theory, Université de Paris VI and Stanford University.
El Karoui, N., R. Myneni, and R. Viswanathan, 1992b, Arbitrage pricing and hedging of interest rate claims with state variables: II. Applications, Université de Paris VI and Stanford University.
El Karoui, N., and M.C. Quenez, 1995, Dynamic programming and pricing of contingent claims in an incomplete market, SIAM J. Control Optim. 33, 29-66.
El Karoui, N., and M.C. Quenez, 1997, Nonlinear pricing theory and backward stochastic differential equations, in Financial Mathematics, ed. by W.J. Runggaldier no. 1656 in Lecture Notes in Mathematics pp. 191-246. Springer, Berlin New York London Lectures given at the 3rd Session of the Centro Internazionale Matematico Estivo (C.I.M.E.) held in Bressanone, Italy, July 8-13, 1996.
Elton, E.J., and M.J. Gruber, 1995, Modern portfolio theory and investment analysis. (Wiley, New York) 5th edn.
Embrechts, P., 2000, Actuarial versus financial pricing of insurance, Risk Finance 1, 17-26.
Embrechts, P., C. Klüppelberg, and T. Mikosch, 1997, Modelling extremal events. (Springer, New York Berlin Heidelberg).
Esscher, F., 1932, On the probability function in the collective theory of risk, Skandinavisk Aktuarietidskrift 15, 175-195.
Ethier, S.N., and T.G. Kurtz, 1986, Markov processes. (John Wiley & Sons, New York).
Eydeland, A., and H. Geman, 1995, Domino effect: Inverting the Laplace transform, in Over the rainbow (Risk Publications, London).
Fang, K.-T., S. Kotz, and K.-W. Ng, 1990, Symmetric multivariate and related distributions. (Chapman & Hall, London).
Feller, W., 1968, An introduction to probability theory and its applications, Volume 1. (Wiley, New York) 3rd edn.
Feller, W., 1971, An introduction to probability theory and its applications, Volume 2. (Wiley & Sons, Chichester) 2nd edn.
Fleming, W.H., and H.M. Soner, 1993, Controlled Markov processes and viscosity solutions. (Springer, New York Berlin Heidelberg).
Flesaker, B., and L. Hughston, 1996a, Positive interest, Risk magazine 9.
Flesaker, B., and L. Hughston, 1996b, Positive interest: foreign exchange, in L. Hughston, eds.: Vasicek and beyond (Risk Publications, London).


Flesaker, B., and L. Hughston, 1997, Positive interest, in M.A. Dempster, and S. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Florens-Zmirou, D., 1989, Approximate discrete-time schemes for statistics of diffusion processes, Statistics 20, 547-557.
Föllmer, H., 1991, Probabilistic aspects of options, Discussion Paper B-202, Universität Bonn.
Föllmer, H., and P. Leukert, 1999, Quantile hedging, Finance and Stochastics 3, 251-274.
Föllmer, H., and P. Leukert, 2000, Efficient hedging: Cost versus shortfall risk, Finance and Stochastics 4, 117-146.
Föllmer, H., and M. Schweizer, 1991, Hedging of contingent claims under incomplete information, in M.H.A. Davis, and R.J. Elliott, eds.: Applied stochastic analysis (Gordon and Breach, London New York).
Föllmer, H., and D. Sondermann, 1986, Hedging of non-redundant contingent claims, in W. Hildenbrand, and A. Mas-Colell, eds.: Contribution to mathematical economics (North Holland, Amsterdam).
Fouque, J.-P., C. Papanicolaou, and K.R. Sircar, 2000, Derivatives in financial markets with stochastic volatility. (Cambridge University Press, Cambridge).
Frey, R., 1997, Derivative asset analysis in models with level-dependent and stochastic volatility, CWI Quarterly pp. 1-34.
Frey, R., and A. McNeil, 2003, Dependent Defaults in Models of Portfolio Credit Risk, Journal of Risk.
Garman, M., and S. Kohlhagen, 1983, Foreign currency option values, J. International Money Finance 2, 231-237.
Geman, H., N. El Karoui, and J.-C. Rochet, 1995, Changes of numeraire, changes of probability measure and option pricing, J. Appl. Prob. 32, 443-458.
Geman, H., and M. Yor, 1993, Bessel processes, Asian options and perpetuities, Mathematical Finance 3, 349-375.
Geman, H., and M. Yor, 1996, Pricing and hedging double barrier options: A probabilistic approach, Mathematical Finance 6, 365-378.
Genon-Catalot, V., and J. Jacod, 1994, Estimation of the diffusion coefficient for diffusion processes: random sampling, Scand. J. Statist. 21, 193-221.
Gerber, H.U., and E.S. Shiu, 1995, Actuarial approach to option pricing, Preprint, Institut de Sciences Actuarielles, Université de Lausanne.
Gerber, U., 1973, Martingales in risk theory, Mitteilungen der Vereinigung Schweizerischer Versicherungsmathematiker 73, 205-216.
Geske, R., 1977, The valuation of corporate liabilities as compound options, Journal of Financial and Quantitative Analysis pp. 541-552.
Ghysels, E., et al., 1998, Non-parametric methods and option pricing, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Gnedenko, B.V., and A.N. Kolmogorov, 1954, Limit theorems for sums of independent random variables. (Addison-Wesley).
Goldman, B., H. Sosin, and M. Gatto, 1979, Path dependent options: Buy at a low, sell at a high, J. Finance 34, 1111-1128.
Goll, T., and J. Kallsen, 2003, A complete explicit solution to the log-optimal portfolio problem, The Annals of Applied Probability pp. 774-799.
Good, I.J., 1953, The population frequency of species and the estimation of population parameters, Biometrika 1953, 237-240.
Goovaerts, M., E. de Vylder, and J.M. Haezendonck, 1994, Insurance premiums. (North-Holland, Amsterdam).
Goovaerts, M., R. Kaas, A.E. van Heerwaarden, and T. Bauwelinckx, 1990, Effective actuarial methods. (North-Holland, Amsterdam).


Gordy, M., 2000, A comparative anatomy of credit risk models, Journal of Banking and Finance 24, 119-149.
Gourieroux, C., 1997, ARCH models and financial applications. (Springer, New York Berlin Heidelberg).
Gourieroux, C., and A. Monfort, 1996, Simulation-based econometric methods. (Oxford University Press, Oxford).
Gourieroux, C., J.P. Laurent, and H. Pham, 1998, Mean-variance hedging and numeraire, Mathematical Finance 8, 179-200.
Grimmett, G.R., and D.J.A. Welsh, 1986, Probability: An introduction. (Oxford University Press, Oxford).
Grimmett, G.R., and D. Stirzaker, 2001, Probability and random processes. (Oxford University Press, Oxford) 3rd edn. 1st ed. 1982, 2nd ed. 1992.
Grosswald, E., 1976, The Student t-distribution function of any degree of freedom is infinitely divisible, Z. Wahrschein. 36, 103-109.
Hale, J., 1969, Ordinary differential equations. (J. Wiley and Sons/Interscience, New York).
Hall, P., and C.C. Heyde, 1980, Martingale limit theory and its applications. (Academic Press, New York).
Halmos, P.R., 1958, Finite-dimensional vector spaces. (Van Nostrand).
Hamilton, D., J. James, and N. Webber, 2001, Copula methods and the analysis of credit risk, Preprint, University of Warwick.
Hamilton, J., 1994, Time series analysis. (Princeton University Press, Princeton).
Harrison, J.M., 1985, Brownian motion and stochastic flow systems. (John Wiley and Sons, New York).
Harrison, J.M., and D.M. Kreps, 1979, Martingales and arbitrage in multiperiod securities markets, J. Econ. Th. 20, 381-408.
Harrison, J.M., and S.R. Pliska, 1981, Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215-260.
Hayre, L., 2001, SalomonSmithBarney Guide to mortgage-backed and asset-backed securities. (Wiley, New York).
He, H., 1990, Convergence from discrete- to continuous contingent claim prices, Rev. Fin. Studies 3, 523-546.
He, H., 1991, Optimal consumption-portfolio policies: A convergence from discrete to continuous time models, J. Econ. Theory 55, 340-363.
Heath, D., R. Jarrow, and A. Morton, 1992, Bond pricing and the term structure of interest rates: a new methodology for contingent claim valuation, Econometrica 60, 77-105.
Heath, D., E. Platen, and M. Schweizer, 2001, A comparison of two quadratic approaches to hedging in incomplete markets, Math. Finance 11, 385-413.
Heston, S.L., 1993, A closed-form solution for options with stochastic volatilities with applications to bond and currency options, Review of Financial Studies 6, 327-343.
Heynen, R.C., and H.M. Kat, 1995, Lookback options with discrete and partial monitoring of the underlying price, Applied Mathematical Finance 2, 273-284.
Hobson, D.G., 1998, Stochastic volatility, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Hobson, D., and L.C.G. Rogers, 1998, Complete models with stochastic volatility, Mathematical Finance 8, 27-41.
Holschneider, M., 1995, Wavelets: An analytical tool. (Oxford University Press, Oxford).
Hsu, J., J. Saá-Requejo, and P. Santa-Clara, 1997, Bond pricing with default risk, Preprint, Andersen School of Management.


Hubalek, F., and W. Schachermayer, 1998, When does convergence of asset prices imply convergence of option prices?, Math. Finance 8, 385-403.
Hull, J., 1999, Options, futures, and other derivative securities. (Prentice-Hall, Englewood Cliffs, NJ) 4th edn. 3rd ed. 1997, 2nd ed. 1993, 1st ed. 1989.
Hull, J., and A. White, 1987, The pricing of options on assets with stochastic volatilities, Journal of Finance XLII, 281-300.
Hull, J., and A. White, 2000, Forward rate volatilities, swap rate volatilities, and the implementation of the LIBOR market model, Journal of Fixed Income 10, 46-62.
Hunt, P.J., and J.E. Kennedy, 2000, Financial derivatives in theory and practice. (Wiley, New York).
Ikeda, N., S. Watanabe, M. Fukushima, and H. Kunita (eds.), 1996, Itô stochastic calculus and probability theory. (Springer, Tokyo Berlin New York) Festschrift for Kiyosi Itô's eightieth birthday, 1995.
Ince, E.L., 1944, Ordinary differential equations. (Dover, New York).
Ingersoll, J.E., 1986, Theory of financial decision making. (Rowman & Littlefield, Totowa, NJ).
Jacka, S.D., 1992, A martingale representation result and an application to incomplete financial markets, Math. Finance 2, 239-250.
Jacod, J., and P. Protter, 2000, Probability essentials. (Springer, New York Berlin London).
Jacod, J., and A.N. Shiryaev, 1987, Limit theorems for stochastic processes vol. 288 of Grundlehren der mathematischen Wissenschaften. (Springer, New York Berlin London).
Jacod, J., and A.N. Shiryaev, 1998, Local martingales and the fundamental asset pricing theorems in the discrete-time case, Finance and Stochastics 2, 259-273.
Jacques, I., and C. Judd, 1987, Numerical Analysis. (Chapman & Hall, London).
James, J., and N. Webber, 2000, Interest rate modelling. (John Wiley, New York).
Jameson, R. (ed.), 1995, Derivative credit risk. (Risk Publications, New York London).
Jamshidian, F., 1997, LIBOR and swap market models and measures, Finance and Stochastics 1, 261-291.
Jarrow, R.A., 1996, Modelling fixed income securities and interest rate options. (McGraw-Hill, New York).
Jarrow, R., D. Lando, and S. Turnbull, 1997, A Markov model for the term structure of credit spreads, Review of Financial Studies 10, 481-523.
Jarrow, R.A., and S.M. Turnbull, 2000, Derivative Securities. (South-Western College Publishing, Cincinnati) 2nd edn. 1st ed. 1996.
Jeans, Sir James, 1925, The mathematical theory of electricity and magnetism. (Cambridge University Press, Cambridge) 5th edn.
Jin, Y., and P. Glasserman, 2001, Equilibrium positive interest rates: a unified view, Review of Financial Studies 14, 187-214.
Jones, E.P., S.P. Mason, and E. Rosenfeld, 1984, Contingent claim analysis of corporate capital structures: An empirical investigation, Journal of Finance 39, 611-625.
Jørgensen, B., 1982, Statistical properties of the generalized inverse Gaussian distribution function vol. 9 of Lecture Notes in Statistics. (Springer, Berlin).
JP Morgan, 1997, Creditmetrics - Technical document. (JP Morgan New York).
Kabanov, Y.V., 2001, Arbitrage theory, in E. Jouini, J. Cvitanić, and M. Musiela, eds.: Option pricing, interest rates and risk management (Cambridge University Press, Cambridge).
Kabanov, Y., and D. Kramkov, 1994, Large financial markets: Asymptotic arbitrage and continuity, Theo. Prob. Appl. 38, 222-228.


Kabanov, Y., and D. Kramkov, 1998, Asymptotic arbitrage in large financial markets, Finance & Stochastics 2, 143-172.
Kabanov, Y.M., and O.D. Kramkov, 1995, No-arbitrage and equivalent martingale measures: An elementary proof of the Harrison-Pliska theorem, Theory of Probability and Applications pp. 523-527.
Kahane, J.P., 1985, Some random series of functions. (Cambridge University Press, Cambridge) 1st ed. 1968.
Karatzas, I., 1996, Lectures on the Mathematics of Finance vol. 8 of CRM Monograph Series. (American Mathematical Society Providence, Rhode Island, USA).
Karatzas, I., and G. Kou, 1996, On the pricing of contingent claims under constraints, Annals Appl. Probab. 6, 321-369.
Karatzas, I., and G. Kou, 1998, Hedging American contingent claims with constrained portfolios, Finance and Stochastics 3, 215-258.
Karatzas, I., and S. Shreve, 1991, Brownian Motion and Stochastic Calculus. (Springer-Verlag, Berlin Heidelberg New York) 2nd edn. 1st edition 1988.
Karatzas, I., and S. Shreve, 1998, Methods of mathematical finance. (Springer, New York).
Kat, H.M., 1995, Pricing lookback options using binomial trees: An evaluation, J. Financial Engineering 4, 375-397.
Keilson, J., and F.W. Steutel, 1974, Mixtures of distributions, moment inequalities and measures of exponentiality and normality, Annals of Probability 2, 112-130.
Kelker, D., 1971, Infinite divisibility and variance mixtures of the normal distribution, Annals of Mathematical Statistics 42, 802-808.
Kiesel, R., 2001, Nonparametric statistical methods and the pricing of derivative securities, Journal of Applied Mathematics and Decision Sciences 5, 1-28.
Kim, J., K. Ramaswamy, and S. Sundaresan, 1993, The valuation of corporate fixed income securities, Financial Management pp. 117-131.
Kingman, J.F.C., 1993, Poisson processes. (Oxford University Press, Oxford).
Klein, I., 2000, A fundamental theorem of asset pricing for large financial markets, Math. Finance 10, 443-458.
Kloeden, P.E., and E. Platen, 1992, Numerical solutions of stochastic differential equations vol. 23 of Applications of Mathematics, Stochastic Modelling and Applied Probability. (Springer, Berlin Heidelberg New York).
Kolb, R.W., 1991, Understanding Futures Markets. (Kolb Publishing, Miami) 3rd edn.
Kolmogorov, A.N., 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung. (Springer) English translation: Foundations of probability theory, Chelsea, New York, (1965).
Korn, R., 1997a, Optimal portfolios. (World Scientific, Singapore).
Korn, R., 1997b, Some applications of L²-hedging with a nonnegative wealth process, Applied Mathematical Finance 4, 64-79.
Korn, R., 1997c, Value preserving portfolio strategies in continuous-time models, Mathematical Methods of Operational Research 45, 1-43.
Koyluoglu, H.U., and A. Hickmann, 1998, A generalized framework for credit portfolio models, Working Paper, Oliver, Wyman & Company.
Kramkov, D., and W. Schachermayer, 1999, The asymptotic elasticity of utility functions and optimal investment in incomplete markets, Ann. Appl. Prob. 9, 904-950.
Kreps, D.M., 1981, Arbitrage and equilibrium in economies with infinitely many commodities, Journal of Mathematical Economics 8, 15-35.
Kreps, D.M., 1982, Multiperiod securities and the efficient allocation of risk: A comment on the Black-Scholes option pricing model, in J. McCall, eds.: The economics of information and uncertainty (University of Chicago Press, Chicago).

Krzanowski, W.J., 1988, Principles of multivariate analysis vol. 3 of Oxford Statistical Science Series. (Oxford University Press, Oxford).
Küchler, U. et al., 1999, Stock returns and hyperbolic distributions, Mathematical and Computer Modelling 29, 1-15.
Kurtz, T., and P. Protter, 1991, Weak limit theorems for stochastic integrals and stochastic differential equations, Annals of Probability 19, 1035-1070.
Kushner, H.J., and P.G. Dupuis, 1992, Numerical methods for stochastic control problems in continuous time vol. 24 of Applications of Mathematics, Stochastic Modelling and Applied Probability. (Springer, New York Berlin Heidelberg).
Lamberton, D., and B. Lapeyre, 1993, Hedging index options with few assets, Mathematical Finance 3, 25-42.
Lamberton, D., and B. Lapeyre, 1996, Introduction to stochastic calculus applied to finance. (Chapman & Hall, London).
Lamberton, D., and G. Pagès, 1990, Sur l'approximation des réduites, Ann. Inst. H. Poincaré Prob. Stat. 26, 331-355.
Lando, D., 1995, On jump-diffusion option pricing from the viewpoint of semimartingale characteristics, Surveys in Applied and Industrial Mathematics 2, 605-625.
Lando, D., 1997, Modelling bonds and derivatives with default risk, in M.A.H. Dempster, and S.R. Pliska, eds.: Mathematics of derivative securities (Cambridge University Press, Cambridge).
Lando, D., 1998, On Cox processes and credit risky securities, Review of Derivatives Research 2, 99-120.
Lando, D., 2000, Some elements of rating-based credit risk modeling, in N. Jegadeesh, and B. Tuckman, eds.: Advanced Tools for the Fixed Income Professional (Wiley, New York).
Lebesgue, H., 1902, Intégrale, longueur, aire, Annali di Mat. 7, 231-259.
Leisen, D.P.J., 1996, Pricing the American put option: A detailed convergence analysis for binomial models, Working paper, University of Bonn Discussion Paper B-366.
Leland, H.E., 1994, Corporate debt value, bond covenants, and optimal capital structure, The Journal of Finance 49, 1213-1252.
Levy, E., and F. Mantion, 1997, Approximate valuation of discrete lookback and barrier options, Net Exposure 2, 13p http://www.netexposure.co.uk.
Liesenfeld, R., and J. Breitung, 1999, Simulation based method of moments, in L. Matyas, eds.: Generalized method of moments estimation (Cambridge University Press, Cambridge).
Loève, M., 1973, Paul Lévy (1886-1971), obituary, Annals of Probability 1, 1-18.
Longstaff, F.A., and E. Schwartz, 1995, A simple approach to valuing risky fixed and floating rate debt, The Journal of Finance 50, 789-819.
Lucas, A., P. Klaassen, P. Spreij, and S. Straetmans, 1999, An analytic approach to credit risk of large corporate bond and loan portfolios, Journal of Banking & Finance 25, 1635-1664.
Lucas, D., 2001, CDO Handbook, Working paper, JP Morgan.
Madan, D., 1998, Default risk, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold, London).
Madan, D., 2000, Pricing the risks of default: A survey, Preprint, University of Maryland.
Madan, D., and F. Milne, 1993, Contingent claims valued and hedged by pricing and investing in a basis, Mathematical Finance 3, 223-245.
Magill, M., and M. Quinzii, 1996, Theory of incomplete markets vol. 1. (MIT Press, Cambridge, Massachusetts; London, England).


Maitra, A.D., and W.D. Sudderth, 1996, Discrete gambling and stochastic games. (Springer, New York).
Mardia, K.V., J.T. Kent, and J.M. Bibby, 1979, Multivariate Analysis. (Academic Press, London New York).
Matyas, L., 1999, Generalized method of moments estimation. (Cambridge University Press, Cambridge).
McKean, H.P., 1969, Stochastic Integrals. (Academic Press, New York).
Merton, R.C., 1973, Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141-183.
Merton, R.C., 1974, On the pricing of corporate debt: The risk structure of interest rates, J. Finance 29, 449-470. Reprinted as Chapter 12 in Continuous-time finance, Blackwell, Oxford, 1990.
Merton, R.C., 1990, Continuous-time finance. (Blackwell, Oxford).
Meyer, P.-A., 1966, Probability and potential. (Blaisdell, Waltham, MA).
Meyer, P.-A., 1976, Un cours sur les intégrales stochastiques, in Séminaire de Probabilités X no. 511 in Lecture Notes in Mathematics pp. 245-400. Springer, Berlin Heidelberg New York.
Miltersen, K., K. Sandmann, and D. Sondermann, 1997, Closed form solutions for term-structure derivatives with log-normal interest rates, Journal of Finance 52, 409-430.
Møller, T., 1998, Risk-minimizing hedging strategies for unit-linked life insurance contracts, ASTIN Bulletin 28, 17-47.
Møller, T., 2001a, Hedging equity-linked life insurance contracts, North American Actuarial Journal 5, 79-95.
Møller, T., 2001b, On transformations of actuarial valuation principles, Insurance: Mathematics & Economics 28, 281-303.
Møller, T., 2001c, Risk-minimizing hedging strategies for insurance payment processes, Finance and Stochastics 5, 419-446.
Monat, P., and C. Stricker, 1995, Föllmer-Schweizer decomposition and mean-variance hedging for general claims, Annals of Probability 23, 605-628.
Musiela, M., and M. Rutkowski, 1997, Martingale methods in financial modelling vol. 36 of Applications of Mathematics: Stochastic Modelling and Applied Probability. (Springer, New York).
Myneni, R., 1992, The pricing of the American option, Ann. Appl. Probab. 2, 1-23.
Nelson, D.B., and K. Ramaswamy, 1990, Simple binomial processes as diffusion approximations in financial models, Review of Financial Studies 3, 393-430.
Neveu, J., 1975, Discrete-parameter martingales. (North-Holland, Amsterdam).
Nielsen, L., J. Saa-Requejo, and P. Santa-Clara, 1993, Default risk and interest rate risk: The term structure of default spreads, Working Paper, INSEAD.
Nobel prize laudatio, 1997, Nobel prize in Economic Sciences, http://www.nobel.se/announcement-97.

Norris, J.R., 1997, Markov chains. (Cambridge University Press, Cambridge).
Nualart, D., 1995, The Malliavin calculus and related topics. (Springer, New York Berlin London).
Øksendal, B., 1998, Stochastic differential equations: An introduction with applications. (Springer, Berlin Heidelberg New York) 5th edn.
Ong, M.K., 1999, Internal Credit Risk Models. Capital Allocation and Performance Measurement. (Risk Books, London).
Pelsser, A., 2000, Efficient models for valuing interest rate derivatives. (Springer, New York Berlin London).
Pham, H., T. Rheinländer, and M. Schweizer, 1998, Mean-variance hedging for continuous processes: New proofs and examples, Finance and Stochastics 2, 173-198.


Pham, H., and N. Touzi, 1996, Equilibrium state prices in a stochastic volatility model, Mathematical Finance 6, 215-236.
Plackett, R.L., 1960, Principles of regression analysis. (Oxford University Press, Oxford).
Platen, E., and M. Schweizer, 1994, On smile and skewness, Statistics Research Report No. SRR 027-94, School of Mathematical Sciences, The Australian National University.
Protter, P., 2004, Stochastic integration and differential equations. (Springer, New York) 2nd ed. 1st edition, 1992.
Rachev, S.T., and L. Rüschendorf, 1994, Models for option prices, Th. Prob. Appl. 39, 120-152.
Rao, C.R., 1973, Linear inference and its applications. (Wiley) 2nd ed., 1st ed. 1965.
Rawlings, J.O., 1988, Applied regression analysis. A research tool. (Wadsworth & Brooks/Cole, Pacific Grove, CA).
Rebonato, R., 1999, On the pricing implications of the joint lognormal assumption of the swaption and cap market, Journal of Computational Finance 2, 57-76.
Rebonato, R., 2002, Modern pricing of interest-rate derivatives. (Princeton University Press, Princeton).
Renault, E., and N. Touzi, 1996, Option hedging and implied volatilities in a stochastic volatility model, Mathematical Finance 6, 272-302.
Rennocks, John, 1997, Hedging can only defer currency volatility impact for British Steel, Financial Times 08, Letter to the editor.
Resnick, S., 2001, A probability path. (Birkhäuser, Basel) 2nd printing.
Revuz, D., and M. Yor, 1991, Continuous martingales and Brownian motion. (Springer, New York).
Rheinländer, T., and M. Schweizer, 1997, On L²-projections in a space of stochastic integrals, Annals of Probability 25, 1810-1831.
Robert, C.P., 1997, The Bayesian choice: A decision-theoretic approach. (Springer, New York).
Rockafellar, R.T., 1970, Convex Analysis. (Princeton University Press, Princeton NJ).
Rogers, L.C.G., 1994, Equivalent martingale measures and no-arbitrage, Stochastics and Stochastic Reports 51, 41-49.
Rogers, L.C.G., 1995, Which model of the term structure of interest rates should one use? in M.H.A. Davis, et al., eds.: Mathematical finance (Springer, Berlin Heidelberg New York).
Rogers, L.C.G., 1997, The potential approach to the term structure of interest rates and foreign exchange rates, Mathematical Finance 7, 157-176.
Rogers, L.C.G., 1998, Utility based justification of the Esscher measure, Private communication.
Rogers, L.C.G., 1999, Modelling credit risk, Preprint, University of Bath.
Rogers, L.C.G., and Z. Shi, 1995, The value of an Asian option, J. Applied Probability 32, 1077-1088.
Rogers, L.C.G., and E.J. Stapleton, 1998, Fast accurate binomial pricing, Finance and Stochastics 2, 3-17.
Rogers, L.C.G., and D. Talay (eds.), 1997, Numerical methods in finance. (Cambridge University Press, Cambridge).
Rogers, L.C.G., and D. Williams, 1994, Diffusions, Markov processes and martingales, Volume 1: Foundation. (Wiley, New York) 2nd ed. 1st ed. D. Williams, 1970.
Rogers, L.C.G., and D. Williams, 2000, Diffusions, Markov processes and martingales, Volume 2: Itô calculus. (Cambridge University Press, Cambridge) 2nd ed.


Rosenthal, J.S., 2000, A first look at rigorous probability theory. (World Scientific, Singapore).
Ross, S., 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341-361.
Ross, S., 1978, A simple approach to the valuation of risky streams, Journal of Business 51, 453-475.
Ross, S.M., 1997, Probability models. (Academic Press, London New York) 6th edn.
Rossi, P.E., 1995, Modelling stock market volatility. (Academic Press, London New York).
Rouge, R., and N. El Karoui, 2000, Pricing via utility maximization and entropy, Mathematical Finance 10, 259-276.
Roussas, G., 1972, Contiguity of probability measures: Some applications in Statistics. (Cambridge University Press, Cambridge).
Rudin, W., 1976, Principles of mathematical Analysis. (McGraw-Hill, New York) 1st ed. 1953, 2nd ed. 1964.
Rutkowski, M., 1999, Models of forward LIBOR and swap rates, Applied Mathematical Finance 6, 1-32.
Rydberg, T.H., 1997, The normal inverse Gaussian Lévy process: Simulation and approximation, Research Report, Department of Theoretical Statistics, Institute of Mathematics, University of Århus.
Rydberg, T.H., 1999, Generalized hyperbolic diffusions with applications towards finance, Mathematical Finance 9, 183-201.
Samuelson, P.A., 1965, Rational theory of warrant pricing, Industrial Management Review 6, 13-39.
Sato, K.-I., 1999, Lévy processes and infinite divisibility vol. 68 of Cambridge studies in advanced mathematics. (Cambridge University Press, Cambridge).
Schachermayer, W., 2003, Introduction to the mathematics of financial markets, in S. Albeverio, W. Schachermayer, and M. Talagrand, eds.: Lectures on probability theory and statistics (Springer, New York Berlin London).
Schachermayer, W., 1992, A Hilbert space proof of the fundamental theorem of asset pricing in finite discrete time, Insurance: Mathematics and Economics 11, 249-257.
Schachermayer, W., 1994, Martingale measures for discrete-time processes with infinite horizon, Mathematical Finance 4, 25-55.
Schäl, M., 1994, On quadratic cost criteria for option hedging, Mathematics of Operations Research 19, 121-131.
Schönbucher, P.J., 1998, Term structure modelling of defaultable bonds, Rev. Derivative Research 2, 161-192.
Schönbucher, P.J., 2003, Credit derivative pricing models. (Wiley Finance, Chichester).
Schweizer, M., 1988, Hedging of options in a general semimartingale model, Ph.D. thesis ETH Zürich.
Schweizer, M., 1991, Option hedging for semi-martingales, Stoch. Processes Appl. 37, 339-363.
Schweizer, M., 1992, Mean-variance hedging for general claims, Annals of Applied Probability 2, 171-179.
Schweizer, M., 1994, Approximating random variables by stochastic integrals, Annals of Probability 22, 1536-1575.
Schweizer, M., 1995, On the minimal martingale measure and the Föllmer-Schweizer decomposition, Stochastic Analysis and its Applications 13, 573-599.
Schweizer, M., 1999, A minimality property of the minimal martingale measure, Statistics and Probability Letters 42, 27-31.


Schweizer, M., 2001a, From actuarial to financial valuation principles, Insurance: Mathematics & Economics 28, 31-47.
Schweizer, M., 2001b, A guided tour through quadratic hedging approaches, in E. Jouini, J. Cvitanic, and M. Musiela, eds.: Advances in Mathematical Finance (Cambridge University Press, Cambridge).
Shaw, W.T., 1998, Modelling financial derivatives with Mathematica. (Cambridge University Press, Cambridge).
Shephard, N., 1996, Statistical aspects of ARCH and stochastic volatility, in D.R. Cox, D.V. Hinkley, and O.E. Barndorff-Nielsen, eds.: Time Series Models - in econometrics, finance and other fields (Chapman & Hall, London).
Shiryaev, A., 1996, Probability. (Springer, New York Berlin London).
Shiryaev, A.N., 1999, Essentials of stochastic finance vol. 3 of Advanced Series of Statistical Science & Applied Probability. (World Scientific, Singapore).
Shiryaev, A.N., et al., 1995, Towards the theory of pricing of options of both European and American types. I: Discrete time, Theory of Probability and Applications 39, 14-60.
Slater, L.J., 1960, Confluent hypergeometric functions. (Cambridge University Press, Cambridge).
Snell, J.L., 1952, Applications of martingale systems theorems, Trans. Amer. Math. Soc. 73, 293-312.
Sørensen, M., 2001, Simplified estimating functions for diffusion models with a high-dimensional parameter, Scand. J. Statist. pp. 99-112.
Stanton, R., 1997, A nonparametric model of the term structure dynamics and the market price of interest rate risk, Journal of Finance 52, 1973-2002.
Steele, J.M., 2001, Stochastic calculus and financial applications. (Springer, New York Berlin Heidelberg).
Stein, E.M., and J.C. Stein, 1991, Stock price distributions with stochastic volatilities: An analytic approach, Review of Financial Studies 4, 727-752.
Stroock, D.W., and S.R.S. Varadhan, 1979, Multidimensional diffusion processes. (Springer, New York).
Taqqu, M.S., and W. Willinger, 1987, The analysis of finite security markets using martingales, Adv. Appl. Prob. 19, 1-25.
Valentine, F.A., 1964, Convex Sets. (McGraw-Hill, New York).
von Neumann, J., and O. Morgenstern, 1953, Theory of games and economic behaviour. (Princeton University Press, Princeton) 3rd edn.
Watson, G.N., 1944, A treatise on the theory of Bessel functions. (Cambridge University Press, Cambridge) 2nd edn. 1st ed. 1922.
Wax, N. (ed.), 1954, Selected papers on noise and stochastic processes. (Dover, New York).
Whittle, P., 1996, Optimal control: Basics and beyond. (Wiley, New York).
Widder, 1941, The Laplace transform. (Princeton University Press, Princeton).
Williams, D., 1991, Probability with martingales. (Cambridge University Press, Cambridge).
Williams, D., 2001, Weighing the odds. (Cambridge University Press, Cambridge).
Willinger, W., and M.S. Taqqu, 1991, Towards a convergence theory for continuous stochastic securities market models, Mathematical Finance 1, 55-99.
Yor, M., 1978, Sous-espaces denses dans L¹ et H¹ et représentation des martingales, in Séminaire de Probabilités, XII no. 649 in Lecture Notes in Mathematics pp. 265-309. Springer.
Yor, M., 1992a, On some exponential functionals of Brownian motion, Adv. Appl. Probab. 24, 509-531.
Yor, M., 1992b, Some aspects of Brownian motion. Part 1: Some special functionals. (Birkhäuser Verlag, Basel).


Young, N.J., 1988, Hilbert Space. (Cambridge University Press, Cambridge).
Zhang, P.G., 1997, Exotic Options. (World Scientific, Singapore).
Zhou, C., 2001, The Term Structure of Credit Spreads with Jump Risk, Journal of Banking and Finance 25, 2015-2040.

Index

δ-admissible, 234
σ-algebra, 31
- stopping time, 84
affine term structure, 340
algebra, 31
almost everywhere, 33
almost surely, 33
American put, 141, 258
arbitrage, 1, 15
- free, 106
- opportunity, 19, 106, 232
- price, 115
- pricing technique, 8, 9, 328
- strategy, 106
arbitrageur, 6
ARCH (autoregressive conditional heteroscedasticity), 317
Asian option
- Geman-Yor method, 261
- Rogers-Shi method, 261
barrier option
- Asian, 266
- down-and-out call, 264
- forward start, 266
- knockout discount, 265
- moving boundary, 266
- outside, 266
basket default swap, 400
Bayes formula, 225, 239
binomial model, 121
Black formula
- caplets, 363
- swaption, 366
Black's futures option formula, 283
Black-Derman-Toy model, 341
Black-Scholes
- complete, 248
- European call price, 133
- formula, 44, 251, 294
- hedging, 115, 252, 279
- martingale measure, 243
- model, 196, 243, 270
- partial differential equation, 253, 255
- risk-neutral valuation, 250
- stochastic calculus, 198
- volatility, 314
Borel σ-algebra, 31
Brownian motion, 160
- geometric, 197
- martingale characterization, 171, 215
- quadratic variation, 169
call
- European
- - convergence of CRR price, 133
- - Cox-Ross-Rubinstein price, 126
cap, 354
- Black's model, 355
caplet, 354, 362
central limit theorem
- functional form, 223
Collateralized Debt Obligations, 404
complete market, 22
completeness theorem, 116, 238
conditional expectation
- iteration, 50
conditional Jensen formula, 48
conditional mean formula, 48
conditional probability, 44
confluent hypergeometric function, 263
contingent claim, 2, 105, 116, 230, 248, 277
- attainable, 236
convergence
- almost surely, 51
- in pth mean, 52
- in distribution, 52
- in probability, 52
- mean square, 52
- weak, 53, 221
convolution, 53
copula, 403


cost process, 311
coupon, 328
coupon bonds, 328
Cox-Ross-Rubinstein model, 121
credit default swap, 399
credit migration, 376
currency, 5
currency option, 286
default correlation, 376
default probability, 376
derivative
- Radon-Nikodym, 43
derivative securities, 2
diffusion, 159, 243
- constant elasticity of variance, 137
distribution
- t_n, 67
- Bernoulli, 40
- binomial, 41
- bivariate normal, 46
- elliptically contoured, 66
- generalized inverse Gaussian, 68
- hyperbolic, 42, 68
- multinormal, 66
- normal, 41, 56
- Poisson, 42, 57
distribution function, 38
Doob Decomposition, 93
Doob-Meyer-decomposition, 170
dynamic completeness, 272
dynamic portfolio, 102
early-exercise
- decomposition, 258
- premium, 260
elasticity coefficient, 255
equivalent martingale measure, 233
Esscher measure, 292
expectation, 39
expectation hypothesis, 346, 348
expiry, 3
Föllmer-Schweizer decomposition, 308
factor modelling, 401
factorization formula, 293
Feynman-Kac formula, 201, 251, 339
filtration, 75, 76, 153
- Brownian, 199, 243
financial market model, 101, 229
finite market approximation, 271
finite-dimensional distributions, 153
first-passage time, 225, 264
first-to-default swap, 400

Flesaker-Hughston-model, 370
formula
- currency option, 286
- Geman-Yor, 263
- Lévy-Khintchine, 64, 179
- risk-neutral pricing, 119, 120
- Stirling's, 60
forward, 2
- contract, 3
- price, 3
forward LIBOR measure, 357
forward rate, 330
- instantaneous, 331
free lunch, 19
function
- characteristic, 55
- finite variation, 37
- indicator, 34
- Lebesgue-integrable, 35
- measurable, 34
- simple, 34
futures, 2, 282
gains process, 102, 230
GARCH (generalised autoregressive conditional heteroscedasticity), 317
Gaussian process, 158
Girsanov pair, 199
Greeks, 254
- delta, 254
- gamma, 254
- rho, 254
- theta, 254
- vega, 254, 255
Heath-Jarrow-Morton
- drift condition, 345
- model, 343
hedge
- perfect, 119
hedgers, 6
hedging
- mean-variance, 18, 307
- risk-minimizing, 18, 311
hedging strategy
- CRR model, 127, 128
Hilbert space, 308, 410
hyperplane, 415
implied volatility, 314
independence, 40
index, 5
infinite divisible, 69

inner product, 409
integral
- Lebesgue-Stieltjes, 36
- Lebesgue, 34
- Riemann, 36
interest rate, 4
intrinsic value, 259
invariance principle, 223
Itô
- calculus, 187
- lemma, 195
Itô formula
- basic, 194
- for Itô process, 195
- for semi-martingales, 212
- multidimensional, 195
Itô process, 193
Lévy process, 178, 294
Laplace transform, 263, 266
law of large numbers
- weak, 58
Lebesgue measure, 31
LIBOR dynamics, 358, 361
LIBOR rate, 357
Lipschitz condition, 203, 274
local martingale, 209
localization, 192
lookback option
- call, 267
- partial, 269
- put, 267
Lévy exponent, 64
market
- complete, 116, 236
- incomplete, 289, 295, 315
market price of risk, 245, 247, 338
Markov chain, 78, 96
Markov process, 78, 158
- strong, 158
martingale, 78, 155
- local, 192
- representation, 118
- square-integrable, 94
- transform, 80
martingale measure, 21, 108
- forward risk-neutral, 241, 346
- minimal, 316
- risk-neutral, 120, 246
martingale modelling, 335
martingale representation, 308
martingale transform lemma, 81


maximal utility, 290
maximum likelihood, 257
mean reversion, 342
measurable space, 31
measure, 32
- absolutely continuous, 43
- equivalent, 43
measure space, 32
Merton model, 379, 380
method of images, 265
Monte-Carlo, 266
multinomial models, 148
Newton-Raphson iteration, 256
no free lunch with vanishing risk, 235
norm, 409
Novikov condition, 199
numeraire, 21, 101, 230, 239
numeraire invariance theorem, 231
optimal capital structure, 389
optimal stopping problem, 91
option, 2
- American, 2, 138, 258
- Asian, 3, 260
- barrier, 3, 263
- - discrete, 143
- binary, 269
- call, 2
- European, 2
- exotic, 5
- fair price (Davis), 290
- futures call, 283
- lookback, 3, 266
- - discrete, 145
- put, 2
Ornstein-Uhlenbeck process, 205
orthogonal complement, 410
orthogonal projection, 411
partition, 37
point process, 175
Poisson process, 175
portfolio, 9
portfolio credit risk, 400
- asset-based, 401
- loss distribution, 402
predictable, 80, 192, 209
previsible, 209
price
- arbitrage, 119
pricing kernel, 368
probability
- measure, 33
- space, 33
probability space, 38
- filtered, 76
process
- Bessel, 261
- Bessel-squared, 262
- maximum of BM, 264
- minimum of BM, 264
projection, 308
put-call parity, 143
quadratic covariation, 157
quadratic variation, 168, 211
random variable, 38
- expectation, 39
- variance, 39
reduced-form model, 391
- valuation, 395
reflection principle, 264
representation property, 238
Riccati equation, 340
Riesz decomposition, 259
risk
- intrinsic, 295
- remaining, 295, 311
risk management, 273, 279
risk premium, 338
risk-neutral valuation, 115, 236, 250, 273, 335
sample space, 37
self-decomposability, 65
semi-martingale, 186, 209
- characteristics, 218
- good sequence of, 223
separating hyperplane Theorem, 19
Snell envelope, 89, 259
speculator, 6
spot LIBOR measure, 361
spot rate, 331
spreads, 382
state-price vector, 20, 276
stochastic basis, 76, 153
stochastic differential equations
- strong solution, 204
- weak solution, 204
stochastic exponential, 197, 198, 216
stochastic integral
- quadratic variation of, 192
stochastic process, 77, 153
- adapted, 77, 153
- cadlag, 154
- Poisson, 42
- progressively measurable, 243
- RCLL, 154
stochastic volatility, 219
stock, 4
stock price
- jump, 136
stopping time, 82, 155
strategy
- replicating, 112
strike price, 3
structural model
- Black-Cox, 384
- Leland, 389
- Merton, 380
- stochastic interest rate, 388
structure-preserving property, 270
submartingale, 79, 155
supermartingale, 79, 155
swap, 2, 4, 353, 363
swaption, 365
tail dependence, 403
term structure equation, 337, 339
theorem
- central limit, 59
- Doob's Martingale convergence, 95
- Feynman-Kac, 202
- fundamental theorem of asset pricing, 119, 235
- Girsanov, 199
- local limit, 60
- monotone convergence, 35
- Optional Sampling, 85
- Poisson limit, 60, 61
- portmanteau, 222
- representation of Brownian martingales, 200
- Stopping time principle, 83
trading strategy, 18, 102
- admissible, 235
- mean-self-financing, 312
- replicating, 236
- self-financing, 230
- tame, 233
uniform integrability, 156
usual conditions (for a stochastic basis), 153
utility function, 290
value process, 102, 230
Vasicek model, 340
volatility, 196
- historic, 257
- implied, 256
- non-parametric estimation, 258
- stochastic, 257
volatility matrix, 243
volatility smile, 314
weak law of large numbers, 52
wealth process, 102
yield-to-maturity, 329
zero-coupon bond, 329, 330
