Springer Series in Statistics
Stefano M. Iacus
Simulation and Inference for Stochastic Differential Equations With R Examples
Stefano M. Iacus
Dept. of Economics, Business and Statistics
University of Milan
Via Conservatorio 7
20122 Milano, Italy
[email protected]

ISBN: 978-0-387-75838-1
e-ISBN: 978-0-387-75839-8
DOI: 10.1007/978-0-387-75839-8

© 2008 Springer Science+Business Media, LLC
To Ilia, Lucy, and Ludo
Preface
Stochastic differential equations model stochastic evolution as time evolves. These models have a variety of applications in many disciplines and emerge naturally in the study of many phenomena. Examples of such applications are physics (see, e.g., [176] for a review), astronomy [202], mechanics [147], economics [26], mathematical finance [115], geology [69], genetic analysis (see, e.g., [110], [132], and [155]), ecology [111], cognitive psychology (see, e.g., [102] and [221]), neurology [109], biology [194], biomedical sciences [20], epidemiology [17], political analysis and social processes [55], and many other fields of science and engineering. Although stochastic differential equations are quite popular models in the above-mentioned disciplines, a lot of nontrivial mathematics lies behind them, whose details are often unknown to practitioners or experts in other fields. In order to make this book useful to a wider audience, we decided to keep its mathematical level sufficiently low and to rely often on heuristic arguments that stress the underlying ideas of the concepts introduced rather than insist on technical details. Mathematically oriented readers may find this approach inconvenient, but detailed references are always given in the text.

As the title of the book suggests, its aim is twofold. The first is to recall the theory and implement methods for the simulation of paths of stochastic processes {Xt, t ≥ 0} that solve stochastic differential equations (SDEs). In this respect, the title of the book is too ambitious, in the sense that only SDEs with Gaussian noise are considered (i.e., processes for which the writing dXt = S(Xt)dt + σ(Xt)dWt has a meaning in the Itô sense). This part of the book contains a review of well-established results and their implementations in the R language, but also some fairly recent results on simulation.

The second part of the book is dedicated to a review of some methods of estimation for these classes of stochastic processes. While there is a well-established theory of estimation for continuous-time observations from these processes [149], the literature on discrete-time observations is dispersed (though vast) across several journals. Of course, real data (e.g., from finance [47],
[88]) always lead to dealing with discrete-time observations {Xti, i = 1, ..., n}, and many of the results from the continuous-time case do not hold or cannot be applied (for example, the likelihood of the observations is almost always unavailable in explicit form). It should be noted that only the observations are discrete, whilst the underlying model is continuous; hence most of the standard theory on discrete-time Markov processes does not apply either. Different schemes of observation can be considered depending on the nature of the data, and the estimation problem is not necessarily the same for the different schemes. One case, considered "natural," is the fixed-∆ scheme, in which the time step ∆n between two subsequent observations Xti and Xti+∆n is fixed, i.e., ∆n = ∆ (or is bounded away from zero) and independent of n. In this case, the process is observed on the time interval [0, T = n∆] and the asymptotics are considered as n → ∞ (large-sample asymptotics). The underlying model might be ergodic or stationary and possibly homogeneous. For such a scheme, the time step ∆ may have some influence on the estimators because, for example, the transition density of the process is usually not known in explicit form and has to be approximated via simulations. This is the most difficult case to handle. Another scheme is the "high-frequency" scheme, in which the observational step size ∆n decreases with n, and two cases are possible: either the time interval is fixed, say [0, T = n∆n], or n∆n increases as well. In the first case, neither homogeneity nor ergodicity is needed, but consistent estimators are not always available. On the contrary, in the "rapidly increasing experimental design," when ∆n → 0 and n∆n → ∞ but n∆n² → 0, consistent estimators can be obtained along with some distributional results. Other interesting schemes — partially observed processes, data missing at random [75], thresholded processes (see, e.g., [116], [118]), observations with errors (quantized or interval data; see, e.g., [66], [67], [97]), or large-sample and "small diffusion" asymptotics — have also recently appeared in the literature (see, e.g., [222], [217]). This book essentially covers parametric estimation under the large-sample asymptotics scheme (n∆n → ∞), with either fixed ∆n = ∆ or ∆n → 0 with n∆n^k → 0 for some k ≥ 2. The final chapter contains a miscellaneous selection of results, including nonparametric estimation, model selection, and change-point problems. This book is intended for practitioners and is not a theoretical book, so this second part just briefly recalls the main results and the ideas behind the methods and implements several of them in the R language. A selection of the results has necessarily been made. This part of the book also shows the difference between the theory of estimation for discrete-time observations and the actual performance of such estimators once implemented. Further, the effect of approximation schemes on estimators is investigated throughout the text. Theoretical results are recalled as "Facts" and regularity conditions as "Assumptions," numbered by chapter in the text.
So what is this book about?

This book is about ready-to-use, efficient R code for simulation schemes of stochastic differential equations and for some related estimation methods based on discretely sampled observations from such models. We hope that the code presented here and the updated survey of the subject may be of help to practitioners, postgraduate and PhD students, and researchers in the field who want to implement new methods and ideas using R as a statistical environment.

What this book is not about

This book is not intended to be a theoretical book or an exhaustive collection of all the statistical methods available for discretely observed diffusion processes. It might rather be thought of as a companion to some advanced theoretical publication (already available or forthcoming) on the subject. Although this book is not a textbook, some previous drafts of it have been used with success in mathematical finance classes for the numerical simulation and empirical analysis of financial time series.

What comes with the book

All the algorithms presented in the book are written in pure R code but, because of the speed needed in real-world applications, we have rewritten some of the R code in the C language and assembled everything in a package called sde, freely available on CRAN, the Comprehensive R Archive Network (see the installation snippet at the end of this preface). R and C functions have the same end-user interface; hence all the code of the examples in the book will run smoothly regardless of the underlying coding language. A minimal knowledge of the R environment at the introductory level is assumed, although a brief review of the main R concepts, limited to what is relevant to this text, is given at the end of the book. Some crucial aspects of implementation are discussed in the main body of the book to make them more effective.

What is missing?

This book essentially covers one-dimensional diffusion processes driven by the Wiener process. Today's literature is vast and wider than this choice; in particular, it also treats multidimensional diffusion processes and stochastic differential equations driven by Lévy processes. To keep the book self-contained and at an introductory level, and to preserve some homogeneity within the text, we decided to restrict the field. This also allows simple and easy-to-understand R code to be written for each of the techniques presented.
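As mentioned above, the sde package is distributed on CRAN. Obtaining it is a standard one-time installation (these two lines are not one of the book's numbered listings):

# install the sde package once from CRAN, then load it in each R session
install.packages("sde")
library(sde)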
X
Preface
Acknowledgments

I am grateful to Alessandro De Gregorio and Ilia Negri for their careful reading of the manuscript, as well as to the anonymous referees for their constructive comments and remarks on several drafts of the book. All remaining errors are of course my responsibility. I also wish to thank John Kimmel for his great support and patience with me during the last three years.
Milan, November 2007
Stefano M. Iacus
Contents
Preface
Notation

1 Stochastic Processes and Stochastic Differential Equations
  1.1 Elements of probability and random variables
    1.1.1 Mean, variance, and moments
    1.1.2 Change of measure and Radon-Nikodým derivative
  1.2 Random number generation
  1.3 The Monte Carlo method
  1.4 Variance reduction techniques
    1.4.1 Preferential sampling
    1.4.2 Control variables
    1.4.3 Antithetic sampling
  1.5 Generalities of stochastic processes
    1.5.1 Filtrations
    1.5.2 Simple and quadratic variation of a process
    1.5.3 Moments, covariance, and increments of stochastic processes
    1.5.4 Conditional expectation
    1.5.5 Martingales
  1.6 Brownian motion
    1.6.1 Brownian motion as the limit of a random walk
    1.6.2 Brownian motion as L2[0, T] expansion
    1.6.3 Brownian motion paths are nowhere differentiable
  1.7 Geometric Brownian motion
  1.8 Brownian bridge
  1.9 Stochastic integrals and stochastic differential equations
    1.9.1 Properties of the stochastic integral and Itô processes
  1.10 Diffusion processes
    1.10.1 Ergodicity
    1.10.2 Markovianity
    1.10.3 Quadratic variation
    1.10.4 Infinitesimal generator of a diffusion process
    1.10.5 How to obtain a martingale from a diffusion process
  1.11 Itô formula
    1.11.1 Orders of differentials in the Itô formula
    1.11.2 Linear stochastic differential equations
    1.11.3 Derivation of the SDE for the geometric Brownian motion
    1.11.4 The Lamperti transform
  1.12 Girsanov's theorem and likelihood ratio for diffusion processes
  1.13 Some parametric families of stochastic processes
    1.13.1 Ornstein-Uhlenbeck or Vasicek process
    1.13.2 The Black-Scholes-Merton or geometric Brownian motion model
    1.13.3 The Cox-Ingersoll-Ross model
    1.13.4 The CKLS family of models
    1.13.5 The modified CIR and hyperbolic processes
    1.13.6 The hyperbolic processes
    1.13.7 The nonlinear mean reversion Aït-Sahalia model
    1.13.8 Double-well potential
    1.13.9 The Jacobi diffusion process
    1.13.10 Ahn and Gao model or inverse of Feller's square root model
    1.13.11 Radial Ornstein-Uhlenbeck process
    1.13.12 Pearson diffusions
    1.13.13 Another classification of linear stochastic systems
    1.13.14 One epidemic model
    1.13.15 The stochastic cusp catastrophe model
    1.13.16 Exponential families of diffusions
    1.13.17 Generalized inverse Gaussian diffusions

2 Numerical Methods for SDE
  2.1 Euler approximation
    2.1.1 A note on code vectorization
  2.2 Milstein scheme
  2.3 Relationship between Milstein and Euler schemes
    2.3.1 Transform of the geometric Brownian motion
    2.3.2 Transform of the Cox-Ingersoll-Ross process
  2.4 Implementation of Euler and Milstein schemes: the sde.sim function
    2.4.1 Example of use
  2.5 The constant elasticity of variance process and strange paths
  2.6 Predictor-corrector method
  2.7 Strong convergence for Euler and Milstein schemes
  2.8 KPS method of strong order γ = 1.5
  2.9 Second Milstein scheme
  2.10 Drawing from the transition density
    2.10.1 The Ornstein-Uhlenbeck or Vasicek process
    2.10.2 The Black and Scholes process
    2.10.3 The CIR process
    2.10.4 Drawing from one model of the previous classes
  2.11 Local linearization method
    2.11.1 The Ozaki method
    2.11.2 The Shoji-Ozaki method
  2.12 Exact sampling
  2.13 Simulation of diffusion bridges
    2.13.1 The algorithm
  2.14 Numerical considerations about the Euler scheme
  2.15 Variance reduction techniques
    2.15.1 Control variables
  2.16 Summary of the function sde.sim
  2.17 Tips and tricks on simulation

3 Parametric Estimation
  3.1 Exact likelihood inference
    3.1.1 The Ornstein-Uhlenbeck or Vasicek model
    3.1.2 The Black and Scholes or geometric Brownian motion model
    3.1.3 The Cox-Ingersoll-Ross model
  3.2 Pseudo-likelihood methods
    3.2.1 Euler method
    3.2.2 Elerian method
    3.2.3 Local linearization methods
    3.2.4 Comparison of pseudo-likelihoods
  3.3 Approximated likelihood methods
    3.3.1 Kessler method
    3.3.2 Simulated likelihood method
    3.3.3 Hermite polynomials expansion of the likelihood
  3.4 Bayesian estimation
  3.5 Estimating functions
    3.5.1 Simple estimating functions
    3.5.2 Algorithm 1 for simple estimating functions
    3.5.3 Algorithm 2 for simple estimating functions
    3.5.4 Martingale estimating functions
    3.5.5 Polynomial martingale estimating functions
    3.5.6 Estimating functions based on eigenfunctions
    3.5.7 Estimating functions based on transform functions
  3.6 Discretization of continuous-time estimators
  3.7 Generalized method of moments
    3.7.1 The GMM algorithm
    3.7.2 GMM, stochastic differential equations, and Euler method
  3.8 What about multidimensional diffusion processes?

4 Miscellaneous Topics
  4.1 Model identification via Akaike's information criterion
  4.2 Nonparametric estimation
    4.2.1 Stationary density estimation
    4.2.2 Local-time and stationary density estimators
    4.2.3 Estimation of diffusion and drift coefficients
  4.3 Change-point estimation
    4.3.1 Estimation of the change point with unknown drift
    4.3.2 A famous example

Appendix A: A brief excursus into R
  A.1 Typing into the R console
  A.2 Assignments
  A.3 R vectors and linear algebra
  A.4 Subsetting
  A.5 Different types of objects
  A.6 Expressions and functions
  A.7 Loops and vectorization
  A.8 Environments
  A.9 Time series objects
  A.10 R Scripts
  A.11 Miscellanea

Appendix B: The sde Package
  BM, cpoint, DBridge, dcElerian, dcEuler, dcKessler, dcOzaki, dcShoji, dcSim, DWJ, EULERloglik, gmm, HPloglik, ksmooth, linear.mart.ef, rcBS, rcCIR, rcOU, rsCIR, rsOU, sde.sim, sdeAIC, SIMloglik, simple.ef, simple.ef2

References
Index
Notation
Var : the variance operator
E : the expected value operator
N(µ, σ²) : the Gaussian law with mean µ and variance σ²
χ²_d : chi-squared distribution with d degrees of freedom
t_d : Student t distribution with d degrees of freedom
I_ν(x) : modified Bessel function of the first kind of order ν
R : the real line
N : the set of natural numbers 1, 2, ...
→d : convergence in distribution
→p : convergence in probability
→a.s. : almost sure convergence
B(R) : Borel σ-algebra on R
χ_A, 1_A : indicator function of the set A
x ∧ y : min(x, y)
x ∨ y : max(x, y)
(f(x))⁺ : max(f(x), 0)
Φ(z) : cumulative distribution function of the standard Gaussian law
[x] : integer part of x
⟨X, X⟩_t, [X, X]_t : quadratic variation process associated with X_t
V_t(X) : simple variation of the process X
∝ : proportional to
f_vi(v1, v2, ..., vn) : ∂f(v1, v2, ..., vn)/∂vi
f_vi,vj(v1, v2, ..., vn) : ∂²f(v1, v2, ..., vn)/∂vi∂vj, etc.
∂θ f(v1, v2, ..., vn; θ) : ∂f(v1, v2, ..., vn; θ)/∂θ
∂θ^k f(v1, v2, ..., vn; θ) : ∂^k f(v1, v2, ..., vn; θ)/∂θ^k
Πn(A) : partition of the interval A = [a, b] into n subintervals [a = x0, x1), [x1, x2), ..., [xn−1, xn = b]
||Πn|| : max_j |x_{j+1} − x_j|
C_0²(R) : space of functions with compact support and continuous derivatives up to order 2
L²([0, T]) : space of functions from [0, T] to R endowed with the L² norm
||f||₂ : the L² norm of f
Wt : Brownian motion or Wiener process
i.i.d. : independent and identically distributed
AIC : Akaike information criterion
CIR : Cox-Ingersoll-Ross
CRAN : the Comprehensive R Archive Network
CKLS : Chan-Karolyi-Longstaff-Sanders
EA : exact algorithm
GMM : generalized method of moments
MCMC : Markov chain Monte Carlo
MISE : mean integrated square error
1 Stochastic Processes and Stochastic Differential Equations
This chapter reviews basic material on stochastic processes, statistics, and stochastic calculus, mostly borrowed from [170], [130], and [193]. It also covers the simulation of commonly used stochastic processes, such as random walks and Brownian motion, and recalls some Monte Carlo concepts. Even though the reader is assumed to be familiar with these basic notions, we present them here in order to introduce the notation used throughout the text. We limit our attention mainly to one-dimensional, real random variables and stochastic processes, and we restrict our attention to parametric models with multidimensional parameters.
1.1 Elements of probability and random variables

A probability space is a triple (Ω, A, P), where Ω is the sample space of possible outcomes of a random experiment and A is a σ-algebra, i.e., a collection of sets such that

i) the empty set ∅ is in A;
ii) if A ∈ A, then the complementary set Ā ∈ A;
iii) if A1, A2, ... ∈ A, then ∪_{i=1}^∞ Ai ∈ A.

P is a probability measure on (Ω, A). In practice, A forms the collection of events to which a probability can be assigned. Given a probability space (Ω, A, P), a random variable X is defined as a measurable function from Ω to R, X : Ω → R. Here the term measurable intuitively means that it is always possible to calculate probabilities related to the random variable X. More precisely, denote by B(R) the Borel σ-algebra on R (i.e., the σ-algebra generated by the open sets of R) and let X⁻¹ denote the inverse image under X. Then X is measurable if
∀A ∈ B(R), ∃B ∈ A : X⁻¹(A) = B;

i.e., it is always possible to measure the set of values assumed by X using the probability measure P on the original space Ω:

P(X ∈ A) = P({ω ∈ Ω : X(ω) ∈ A}) = P({ω ∈ Ω : ω ∈ X⁻¹(A)}) = P(B)

for A ∈ B(R) and B ∈ A.

Distribution and density function

The function F(x) = P(X ≤ x) = P(X(ω) ∈ (−∞, x]) is called the cumulative distribution function; it is a nondecreasing, right-continuous function such that lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1. If F is absolutely continuous, its derivative f(x) is called the density function; it is a Lebesgue-integrable, nonnegative function whose integral over the real line equals one. Loosely speaking, if F(x) is the probability that the random variable X takes values less than or equal to x, the quantity f(x)dx can be thought of as the probability that the random variable takes values in the infinitesimal interval [x, x + dx). If the random variable takes only a countable set of values, then it is said to be discrete, and its density at a point x is defined as P(X = x). In the continuous case, P(X = x) = 0 always.

Independence

Two random variables X and Y are independent if P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for any two Borel sets A and B of R.

1.1.1 Mean, variance, and moments

The mean (or expected value) of a continuous random variable X with distribution function F is defined as

EX = ∫_Ω X(ω) dP(ω) = ∫_R x dF(x),

provided that the integral is finite. If X has a density, then EX = ∫_R x f(x) dx, and the integral is the standard Riemann integral; otherwise, integrals in dP or dF should be understood in the abstract sense. If Ω is countable, the expected value is defined as

EX = Σ_{ω∈Ω} X(ω) P(ω)

or, equivalently, when X is a discrete random variable, the expected value reduces to EX = Σ_{x∈I} x P(X = x), where I is the set of possible values of X.
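These relations between F, f, and EX can be checked numerically in R (a small sketch, not one of the book's numbered listings), here for the N(0, 1) law:

# the cdf F, the density f, and EX for the N(0,1) law, checked numerically
pnorm(1.96)                                            # F(1.96), about 0.975
integrate(dnorm, -Inf, 1.96)$value                     # the same probability via the density
integrate(function(x) x * dnorm(x), -Inf, Inf)$value   # EX = 0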
The variance is defined as

Var X = E(X − EX)² = ∫_Ω (X(ω) − EX)² dP(ω),
and the kth moment is defined as

EX^k = ∫_Ω X^k(ω) dP(ω).

In general, for any measurable function g(·), Eg(X) is defined as

Eg(X) = ∫_Ω g(X(ω)) dP(ω),

provided that the integral is finite.
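As a quick numerical illustration of this definition (a sketch, not one of the book's numbered listings), Eg(X) for an absolutely continuous law can be computed by numerical integration of g(x)f(x) and compared with its Monte Carlo counterpart:

# E g(X) for g(x) = x^2 and X ~ N(0,1): numerical integration vs. Monte Carlo
g <- function(x) x^2
integrate(function(x) g(x) * dnorm(x), -Inf, Inf)$value  # equals E X^2 = 1
set.seed(123)
mean(g(rnorm(100000)))  # Monte Carlo approximation, close to 1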
Types of convergence

Let {Fn}n∈N be a sequence of distribution functions of the sequence of random variables {Xn}n∈N. Assume that

lim_{n→∞} Fn(x) = F(x)

for all x ∈ R such that F(·) is continuous at x, where F is the distribution function of some random variable X. Then the sequence Xn is said to converge in distribution to the random variable X, denoted by Xn →d X. This only means that the distributions Fn of the random variables converge to another distribution F; nothing is said about the random variables themselves, so this convergence concerns only the probabilistic behavior of the random variables on intervals (−∞, x], x ∈ R.

A sequence of random variables Xn is said to converge in probability to a random variable X if, for any ε > 0,

lim_{n→∞} P(|Xn − X| ≥ ε) = 0.

This is denoted by Xn →p X, and it is a pointwise convergence of the probabilities. Convergence in probability implies convergence in distribution. Sometimes we use the notation

p-lim_{n→∞} |Xn − X| = 0

for convergence in probability. A stronger type of convergence requires the limit to hold with probability one, P(lim_{n→∞} Xn = X) = 1, or, more precisely,

P({ω ∈ Ω : lim_{n→∞} Xn(ω) = X(ω)}) = 1.

When this happens, Xn is said to converge to X almost surely, denoted by Xn →a.s. X. Almost sure convergence implies convergence in probability.
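A small empirical illustration of convergence in probability (not one of the book's numbered listings): the sample mean of n i.i.d. N(0, 1) variables converges in probability to 0, so the frequency of replications with |X̄n| ≥ ε must vanish as n grows.

# P(|sample mean of n N(0,1) draws| >= eps), estimated over 1000 replications
set.seed(123)
eps <- 0.1
for (n in c(10, 100, 1000))
  print(mean(replicate(1000, abs(mean(rnorm(n))) >= eps)))  # decreases toward 0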
A sequence of random variables Xn is said to converge in the rth mean to a random variable X if

lim_{n→∞} E|Xn − X|^r = 0,   r ≥ 1.
Convergence in the rth mean implies convergence in probability thanks to Chebyshev's inequality, and if Xn converges to X in the rth mean, then it also converges in the sth mean for all r > s ≥ 1. Mean-square convergence is a particular case of interest and corresponds to r = 2; this type of convergence will be used in Section 1.9 to define the Itô integral.

1.1.2 Change of measure and Radon-Nikodým derivative

In some situations, for example in mathematical finance, it is necessary to reassign the probabilities of the events in Ω, switching from a measure P to another one, P̃. This is done with the help of a random variable, say Z, which reweights the elements of Ω. This change of measure should be done set-by-set instead of ω-by-ω (see, e.g., [209]), as

P̃(A) = ∫_A Z(ω) dP(ω),   (1.1)

where Z is assumed to be almost surely nonnegative and such that EZ = 1. The new P̃ is then a true probability measure and, for any nonnegative random variable X, the equality

ẼX = E(XZ)

holds, where ẼX = ∫_Ω X(ω) dP̃(ω). Two measures P and P̃ are said to be equivalent if they assign probability 0 to the same sets. The change of measure from P to P̃ above trivially guarantees that the two measures are equivalent when Z is strictly positive. Another way to read the change of measure in (1.1) is to say that Z is the Radon-Nikodým derivative of P̃ with respect to P. Indeed, a formal differentiation of (1.1) allows us to write

Z = dP̃/dP.   (1.2)

Fact 1.1 (Theorem 1.6.7 of [209]) Let P and P̃ be two equivalent measures on (Ω, A). Then there exists a random variable Z, almost surely positive, such that EZ = 1 and

P̃(A) = ∫_A Z(ω) dP(ω)

for every A ∈ A.

The Radon-Nikodým derivative is an essential tool in statistics because Z plays the role of the likelihood ratio in the inference for diffusion processes.
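A minimal numerical sketch of (1.1)-(1.2) (not one of the book's numbered listings): if P is the N(0, 1) law and P̃ the N(µ, 1) law, the Radon-Nikodým derivative Z(x) is simply the ratio of the two densities, and expectations under P̃ can be computed from draws under P.

# change of measure from N(0,1) to N(mu,1) via the density ratio Z = dP~/dP
set.seed(123)
mu <- 1
x <- rnorm(100000)                   # draws under P = N(0,1)
Z <- dnorm(x, mean = mu) / dnorm(x)  # Radon-Nikodym derivative evaluated at x
mean(Z)      # close to EZ = 1
mean(x * Z)  # close to the mean mu of X under P~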
1.2 Random number generation

Every book on simulation draws the reader's attention to the quality of random number generators, and this is of course a central point in simulation studies. R developers and R users are in fact quite careful about the implementation and use of random number generators. We will not go into details here, but we briefly describe the options available in R and what is used in the examples of this book. The random number generator can be specified in R via the RNGkind function. The default generator of uniform pseudo-random numbers is the Mersenne-Twister, and it is the one used throughout the book. Other methods available as of this writing are Wichmann-Hill, Marsaglia-Multicarry, Super-Duper, and two versions of the Knuth-TAOCP random number generator; the user can also implement and provide his own method. For normal random number generation, the available methods are Kinderman-Ramage, Buggy Kinderman-Ramage, Ahrens-Dieter, Box-Muller, and the default Inversion method, as explained in [229]. In this case as well, the user can provide her own algorithm. For variates other than normal, R implements quite advanced pseudo-random number generators; the reader should look at the manual page of the corresponding r* function (e.g., rgamma, rt, rbeta, etc.). For reproducibility of all the numerical results in the book, we fix the initialization seed before any listing of R code: we use set.seed(123) everywhere, and the reader should do the same to obtain identical results.
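As a short illustration (not one of the book's numbered listings), the following lines query the generators in use and show how the seed controls reproducibility:

RNGkind()  # reports the uniform and normal generators currently in use
RNGkind("Mersenne-Twister", normal.kind = "Inversion")  # the defaults used in this book
set.seed(123)
a <- rnorm(5)
set.seed(123)
b <- rnorm(5)  # identical to a: same seed, same stream
c <- rnorm(5)  # genuinely new draws: the stream continues when the seed is not reset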
1.3 The Monte Carlo method

Suppose we are given a random variable X and are interested in the evaluation of Eg(X), where g(·) is some known function. If we are able to draw n pseudo-random numbers x1, ..., xn from the distribution of X, then we can think of approximating Eg(X) by the sample mean of the g(xi),

Eg(X) ≈ (1/n) Σ_{i=1}^n g(xi) = ḡn.   (1.3)

The expression (1.3) is not just symbolic but holds true in the sense of the law of large numbers whenever E|g(X)| < ∞. Moreover, the central limit theorem guarantees that

ḡn →d N( Eg(X), Var(g(X))/n ),

where N(m, s²) denotes the distribution of a Gaussian random variable with expected value m and variance s². In the end, the number we estimate with
simulations will have a deviation from the true expected value Eg(X) of order 1/√n. Given that P(|Z| < 1.96) ≈ 0.95 for Z ∼ N(0, 1), one can construct an interval for the estimate ḡn of the form

( Eg(X) − 1.96 σ/√n , Eg(X) + 1.96 σ/√n ),

with σ = √(Var g(X)), which is interpreted as meaning that the Monte Carlo estimate of Eg(X) is included in this interval 95% of the time. The confidence interval depends on Var g(X), and usually this quantity has to be estimated from the sample as well. Indeed, one can estimate it by the sample variance of the Monte Carlo replications,

σ̂² = (1/(n − 1)) Σ_{i=1}^n (g(xi) − ḡn)²,

and use the following 95%-level Monte Carlo confidence interval¹ for Eg(X):

( ḡn − 1.96 σ̂/√n , ḡn + 1.96 σ̂/√n ).

The quantity σ̂/√n is called the standard error. The standard error is itself a random quantity and thus subject to variability; hence one should interpret it as a "qualitative" measure of accuracy. One more remark is that the rate of convergence √n is not particularly fast, but at least it is independent of the smoothness of g(·). Moreover, if we need to increase the quality of our approximation, we just need to draw additional samples² instead of rerunning the whole simulation.

About the length of Monte Carlo intervals

In some cases, Monte Carlo intervals are not very informative because the variance of Y = g(X) is too large. The next example, taken from [156], is one such case. Let Y = g(X) = e^(βX) with X ∼ N(0, 1), and assume we are interested in Eg(X) with β = 5. The analytical value can be calculated as e^(β²/2) = 268337.3, and the true standard deviation is σ = √(e^(2β²) − e^(β²)) = 72004899337, quite a big number with respect to the mean of Y. Suppose we want to estimate EY via the Monte Carlo method using 1000000 replications and construct 95% confidence intervals using both the true standard deviation σ and the estimated standard error. The following R code does the job.

¹ Again, this means that the interval covers the true value 95% of the time.
² A warning note: of course, one should take care of the seed of the random number generator to avoid duplicated samples. If we have already run n replications and want to add n′ new samples, we cannot simply rerun the algorithm for a length of n′ with the same original seed, because in that case we would just replicate the first n′ samples among the n original ones, inducing bias without increasing accuracy.
> # ex1.01.R
> set.seed(123)
> n <- 1000000
> beta <- 5
> x <- rnorm(n)
> y <- exp(beta * x)
> # true value of E(Y)
> exp(beta^2 / 2)
[1] 268337.3
> # MC estimation of E(Y)
> mc.mean <- mean(y)
> mc.mean
[1] 199659.2
> mc.sd <- sd(y)
> true.sd <- sqrt(exp(2 * beta^2) - exp(beta^2))
> # MC conf. interval based on true sigma
> mc.mean - true.sd * 1.96 / sqrt(n)
[1] -140929943
> mc.mean + true.sd * 1.96 / sqrt(n)
[1] 141329262
>
> # MC conf. interval based on estimated sigma
> mc.mean - mc.sd * 1.96 / sqrt(n)
[1] 94515.51
> mc.mean + mc.sd * 1.96 / sqrt(n)
[1] 304802.9
>
> plot(1:n, cumsum(y) / (1:n), type = "l", axes = FALSE, xlab = "n",
+   ylab = expression(hat(g)[n]), ylim = c(0, 350000))
> axis(1, seq(0, n, length = 5))
> axis(2, seq(0, 350000, length = 6))
> abline(h = 268337.3)  # true value
> abline(h = mc.mean - mc.sd * 1.96 / sqrt(n), lty = 3)  # MC conf. interval
> abline(h = mc.mean + mc.sd * 1.96 / sqrt(n), lty = 3)
> abline(h = mc.mean, lty = 2)  # MC estimate
> box()
Running this code in R, we obtain the two intervals

(−140929943, 141329262)   using σ

and

(94515.51, 304802.9)   using σ̂,

with an estimated value ḡn = 199659.2 of Eg(X). As one can see, the confidence interval based on σ contains the true value of Eg(X) but is too large and hence meaningless. The confidence interval based on σ̂ is smaller but still large. The first effect is due to the big variance of g(X), while the second is due to the fact that the sample variance underestimates the true one (σ̂ = 53644741). The reason is that, in this particular case, asymptotic normality is not yet reached after n = 1000000 replications (the reader is invited to check this with plot(density(y))), and thus σ̂ is not necessarily an unbiased estimator of the true σ. Looking at Figure 1.1, one can expect that the Monte Carlo confidence interval for smaller values of n (the reader can try n = 100000) does not even contain the true value.
Fig. 1.1. The very slow convergence of the Monte Carlo estimate associated with a target true value with too high variability (see Section 1.3). The solid line is the target value, the dotted lines are the upper and lower limits of the Monte Carlo 95% confidence interval, and the dashed line is the estimated value ḡn.
1.4 Variance reduction techniques

The example in the last section shows that, in order to reduce the variability of Monte Carlo estimates, and hence the number of replications needed in simulations, some workaround to reduce variance is required. There are several methods of variance reduction for Monte Carlo estimators. We review here just the ones that can be applied in our context; interesting reviews of methods for other classes of problems and processes can be found, for example, in [156] and [125]. Here we just show the basic ideas, while applications to stochastic differential equations are postponed to Section 2.15. We do not include the treatment of sequences with low discrepancy³ because this is beyond the scope of this book.
³ Discrepancy is a measure of goodness of fit for uniform random variates in high dimensions. Low-discrepancy sequences are such that numerical integration on this grid of points allows for a direct variance reduction. The reader can refer to the review paper [153].
1.4.1 Preferential sampling

The idea of this method is to express Eg(X) in a different form in order to reduce its variance. Let f(·) be the density of X; thus

Eg(X) = ∫_R g(x) f(x) dx.

Introduce now another strictly positive density h(·). Then

Eg(X) = ∫_R (g(x) f(x) / h(x)) h(x) dx

and

Eg(X) = E( g(Y) f(Y) / h(Y) ) = E g̃(Y),

with Y a random variable with density h(·), where we denote g̃(·) = g(·)f(·)/h(·). If we are able to determine an h(·) such that Var g̃(Y) < Var g(X), then we have reached our goal. So let us calculate Var g̃(Y):

Var g̃(Y) = E g̃(Y)² − (E g̃(Y))² = ∫_R (g²(x) f²(x) / h(x)) dx − (Eg(X))².

If g(·) is strictly positive, by choosing h(x) = g(x)f(x)/Eg(X) we obtain Var g̃(Y) = 0, which is nice only in theory because, of course, we do not know Eg(X). But the expression of h(x) suggests a way to obtain a useful approximation: just take h̃(x) = |g(x)f(x)| (or something close to it) and then normalize it by the value of its integral,

h(x) = h̃(x) / ∫_R h̃(x) dx.

Of course, this is simple to say and hard to do in specific problems, as the integration should be done analytically, not by using the Monte Carlo technique again. Moreover, the choice of h(·) changes from case to case. We show an example, again taken from [156], which is quite interesting and is a standard application of the method in finance. Suppose we want to calculate Eg(X) with g(x) = max(0, K − e^(βx)) = (K − e^(βx))⁺, K and β constants, and X ∼ N(0, 1). This is the price of a put option in the Black and Scholes framework [36, 162], and the explicit solution, which is known, reads

E(K − e^(βX))⁺ = K Φ(log(K)/β) − e^(β²/2) Φ(log(K)/β − β),

where Φ is the cumulative distribution function of the standard Gaussian law, i.e., Φ(x) = P(Z ≤ x) with Z ∼ N(0, 1). The true value in the case K = β = 1 is Eg(X) = 0.2384217. Let's see what happens in Monte Carlo simulations.
> # ex1.02.R
> set.seed(123)
> n <- 10000
> beta <- 1
> K <- 1
> x <- rnorm(n)
> y <- sapply(x, function(x) max(0, K - exp(beta * x)))
> # the true value
> K * pnorm(log(K) / beta) - exp(beta^2 / 2) * pnorm(log(K) / beta - beta)
[1] 0.2384217
>
> t.test(y[1:100])  # first 100 simulations

	One Sample t-test

data:  y[1:100]
t = 7.701, df = 99, p-value = 1.043e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 0.1526982 0.2586975
sample estimates:
mean of x
0.2056978

> t.test(y[1:1000])  # first 1000 simulations

	One Sample t-test

data:  y[1:1000]
t = 24.8772, df = 999, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 0.2131347 0.2496388
sample estimates:
mean of x
0.2313868

> t.test(y)  # all simulation results

	One Sample t-test

data:  y
t = 80.3557, df = 9999, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 0.2326121 0.2442446
sample estimates:
mean of x
0.2384284
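A preferential-sampling version of this estimator can be sketched along the lines above (this is not one of the book's numbered listings). Following Section 1.4.1, we need a density h(·) roughly proportional to |g(x)f(x)|; a Gaussian density shifted toward the region where the payoff (K − e^(βx))⁺ is positive is one simple, assumed choice of proposal.

# sketch of preferential sampling for E(K - exp(beta*X))^+ with X ~ N(0,1);
# the proposal h = density of N(-beta, 1) is an assumed, illustrative choice
set.seed(123)
n <- 10000
beta <- 1
K <- 1
yh <- rnorm(n, mean = -beta)              # draws Y from h
w <- dnorm(yh) / dnorm(yh, mean = -beta)  # weights f(Y)/h(Y)
g <- pmax(0, K - exp(beta * yh))          # payoff evaluated at the draws
mean(g * w)                               # preferential-sampling estimate of Eg(X)
sd(g * w) / sqrt(n)                       # standard error, to compare with plain MC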
Results of simulations are reported in Table 1.1 (a). Note that on the algorithm we have used the function sapply instead of the (apparently) more natural (but wrong) line of code y # ex1 .09. R > set . seed (123) > phi n Z Delta W for ( i in Delta ) + Wh inc . ratio plot ( Delta , inc . ratio , type = " l " , log = " y " , xlab = expression ( Delta * t ) , + ylab = expression ( abs ( W (0.5+ Delta * t ) - W (0.5)) / Delta * t )) > max ( inc . ratio , na . rm = T ) [1] 701496.4
A nice property of Brownian motion, however, is that its quadratic variation is finite. In particular, we have that [W, W]_t = t for all t ≥ 0.
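This property can be checked numerically (a small sketch, not one of the book's numbered listings): the sum of squared increments of a simulated Brownian path over [0, t] stabilizes around t as the grid is refined.

# quadratic variation of a simulated Brownian path on [0, 1]
set.seed(123)
N <- 100000
Delta <- 1 / N
W <- c(0, cumsum(sqrt(Delta) * rnorm(N)))  # Brownian path on a fine grid
sum(diff(W)^2)                             # close to t = 1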
Fig. 1.5. Graphical evidence that the trajectory of the Brownian motion is nondifferentiable. The plot shows the behavior of |W(0.5 + ∆t) − W(0.5)|/∆t, which diverges as ∆t → 0.

1.7 Geometric Brownian motion

A process used quite often in finance to model the dynamics of some asset is the so-called geometric Brownian motion. This process has the property of
having independent multiplicative increments and is defined as a function of the standard Brownian motion by

S(t) = x exp{ (r − σ²/2) t + σ W(t) },   t > 0,   (1.5)

with S(0) = x, where x ∈ R is the initial value and σ > 0 (interpreted as the volatility) and r (interpreted as the interest rate) are two constants. It was introduced in finance in [172]. One implementation of the simulation of a path of the geometric Brownian motion can be based on the previous algorithms. Assuming that W contains a trajectory of W and t is the vector containing all the time points, a path can be drawn (see Figure 1.6) as follows.
> # ex1.10.R
> set.seed(123)
> r <- 1          # interest rate (only the name survived; value assumed)
> sigma <- 0.5    # volatility (value assumed)
> x <- 10         # initial value S(0) (value assumed)
> N <- 100        # number of time steps (value assumed)
> T <- 1
> Delta <- T / N
> t <- seq(0, T, length = N + 1)
> W <- c(0, cumsum(sqrt(Delta) * rnorm(N)))        # a Brownian trajectory
> S <- x * exp((r - sigma^2 / 2) * t + sigma * W)  # formula (1.5)
> plot(t, S, type = "l", main = "geometric Brownian motion")
> # (f, a list of R functions specifying the estimating function,
> # is assumed to be already defined at this point)
> simple.ef(X, f, lower = 0, upper = Inf)
Running optimizer...
$estimate
   theta
1.122539
> K.est(X)
[1] 1.122539
>
> # least squares estimator revisited
> simple.ef(X, f, lower = 0, upper = Inf)
Running optimizer...
$estimate
   theta
1.100543
> LS.est(X)
[1] 1.100543
>
> # MLE estimator revisited; f is based on
> # the score function of the process
> simple.ef(X, f, lower = 0, upper = Inf)
Running optimizer...
$estimate
   theta
1.119500
> MLE.est(X)
[1] 1.119500
3.5.3 Algorithm 2 for simple estimating functions

This second algorithm solves the problem for estimating functions of type I in the particular case of equation (3.37). The user is allowed to specify the full model in terms of the drift and diffusion parameters and is asked to specify the function h(·); if needed, the algorithm computes the first and second derivatives of h(·) appearing in equation (3.37). It is also possible to specify lower and upper bounds for the parameter space. The only requirement is that the user specify the state variable as 'x' in the coefficients. As before, the function h(·) is expected to be a list of R expressions, one for each parameter to be estimated, and we use a constrained optimization approach. Most of the following code is dedicated to extracting the parameter names from the coefficients of the stochastic differential equation and to preparing the functions (drift, diffusion, h(·) and its derivatives) to be passed to the
optimizer. The variable 'x' will be interpreted as the state variable x in b(x, θ), σ(x, θ), and h(x). If the derivatives of the function h(·) are missing, symbolic differentiation is attempted, as in the sde.sim function. The option hessian=TRUE in the function deriv is used to obtain the second-order derivatives.

simple.ef2
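A hypothetical usage sketch of simple.ef2 follows (it is not one of the book's numbered listings; the argument names are assumed from the simple.ef2 interface of the sde package listed in Appendix B, and the model, sample size, and h(·) below are purely illustrative):

# illustrative use of simple.ef2 on a simulated Ornstein-Uhlenbeck path
# (the model, sample size, and h() are assumptions made for this sketch)
library(sde)
set.seed(123)
X <- sde.sim(X0 = rnorm(1), drift = expression(-1 * x),
             sigma = expression(1), N = 1000, delta = 0.1)
h <- list(expression(x^2))  # one h() per parameter, as R expressions in 'x'
simple.ef2(X, drift = expression(-theta * x), sigma = expression(1),
           h = h, lower = 0, upper = Inf)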