Taylor Approximations for Stochastic Partial Differential Equations
Arnulf Jentzen Princeton University Princeton, New Jersey
Peter E. Kloeden Goethe University Frankfurt am Main, Germany
Taylor Approximations for Stochastic Partial Differential Equations
Society for Industrial and Applied Mathematics Philadelphia
Copyright © 2011 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended. Maple is a trademark of Waterloo Maple, Inc. Mathematica is a registered trademark of Wolfram Research, Inc. MATLAB is a registered trademark of The MathWorks, Inc. For MATLAB product information, please contact The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098 USA, 508-647-7000, Fax: 508-647-7001, [email protected], www.mathworks.com.

Figures 3.1 and 6.1 used with permission from Springer. Figure 4.1 used with permission from Cambridge University Press. Figures 8.1–8.10 used with permission from the American Institute of Mathematical Sciences.

Library of Congress Cataloging-in-Publication Data
Jentzen, Arnulf.
Taylor approximations for stochastic partial differential equations / Arnulf Jentzen, Peter E. Kloeden.
p. cm. -- (CBMS-NSF regional conference series in applied mathematics ; 83)
Includes bibliographical references and index.
ISBN 978-1-611972-00-9
1. Stochastic partial differential equations. 2. Approximation theory. I. Kloeden, Peter E. II. Title.
QA274.25.J46 2011
515'.353--dc23
2011029546

SIAM is a registered trademark.
Contents

Preface

List of Figures

1  Introduction
   1.1  Taylor expansions for ODEs
        1.1.1  Taylor schemes for ODEs
        1.1.2  Integral representation of Taylor expansions

I  Random and Stochastic Ordinary Differential Equations

2  Taylor Expansions and Numerical Schemes for RODEs
   2.1  RODEs
        2.1.1  Equivalence of RODEs and SODEs
        2.1.2  Simple numerical schemes for RODEs
   2.2  Taylor-like expansions for RODEs
        2.2.1  Multi-index notation
        2.2.2  Taylor expansions of the vector field
   2.3  RODE–Taylor schemes
        2.3.1  Discretization error
        2.3.2  Examples of RODE–Taylor schemes

3  SODEs
   3.1  Itô SODEs
        3.1.1  Existence and uniqueness of strong solutions of SODEs
        3.1.2  Simple numerical schemes for SODEs
   3.2  Itô–Taylor expansions
        3.2.1  Iterated application of the Itô formula
        3.2.2  General stochastic Taylor expansions
   3.3  Itô–Taylor numerical schemes for SODEs
   3.4  Pathwise convergence
        3.4.1  Numerical schemes for RODEs applied to SODEs
   3.5  Restrictiveness of the standard assumptions
        3.5.1  Counterexamples for the Euler–Maruyama scheme

4  Numerical Methods for SODEs with Nonstandard Assumptions
   4.1  SODEs without uniformly bounded coefficients
   4.2  SODEs on restricted regions
        4.2.1  Examples
   4.3  Another type of weak convergence

II  Stochastic Partial Differential Equations

5  Stochastic Partial Differential Equations
   5.1  Random and stochastic PDEs
        5.1.1  Mild solutions of SPDEs
   5.2  Functional analytical preliminaries
        5.2.1  Hilbert–Schmidt and trace-class operators
        5.2.2  Hilbert space valued random variables
        5.2.3  Hilbert space valued stochastic processes
        5.2.4  Infinite dimensional Wiener processes
   5.3  Setting and assumptions
   5.4  Existence, uniqueness, and regularity of solutions of SPDEs
   5.5  Examples
        5.5.1  Finite dimensional SODEs
        5.5.2  Second order SPDEs
        5.5.3  Fourth order SPDEs
        5.5.4  SPDEs with time-dependent coefficients

6  Numerical Methods for SPDEs
   6.1  An early result
   6.2  Other results
   6.3  The exponential Euler scheme
        6.3.1  Convergence
        6.3.2  Numerical results
        6.3.3  Restrictiveness of the assumptions

7  Taylor Approximations for SPDEs with Additive Noise
   7.1  Assumptions
        7.1.1  Properties of the solutions
   7.2  Autonomization
   7.3  Examples
        7.3.1  Semigroup generated by the Laplacian
        7.3.2  The drift as a Nemytskii operator
        7.3.3  Stochastic process as stochastic convolution
        7.3.4  Concrete examples of SPDEs with additive noise
   7.4  Taylor expansions
        7.4.1  Integral operators
   7.5  Abstract examples of Taylor expansions
   7.6  Examples of Taylor approximations
        7.6.1  Space–time white noise
        7.6.2  Taylor approximations for a nonlinear SPDE
        7.6.3  Smoother noise
   7.7  Numerical schemes from Taylor expansions
        7.7.1  The exponential Euler scheme
        7.7.2  A Runge–Kutta scheme for SPDEs

8  Taylor Approximations for SPDEs with Multiplicative Noise
   8.1  Heuristic derivation of Taylor expansions
   8.2  Setting and assumptions
        8.2.1  Stochastic heat equation
   8.3  Taylor expansions for SPDEs
        8.3.1  Integral operators
        8.3.2  Derivation of simple Taylor expansions
        8.3.3  Higher order Taylor expansions
   8.4  Stochastic trees and woods
        8.4.1  Stochastic trees and woods
        8.4.2  Construction of stochastic trees and woods
        8.4.3  Subtrees
        8.4.4  Order of stochastic trees and woods
        8.4.5  Stochastic woods and Taylor expansions
   8.5  Examples of Taylor approximations
        8.5.1  Abstract examples of Taylor approximations
        8.5.2  Application to the stochastic heat equation
        8.5.3  Finite dimensional SODEs
   8.6  Numerical schemes for SPDEs
        8.6.1  The exponential Euler scheme
        8.6.2  An infinite dimensional analogue of Milstein's scheme
        8.6.3  Linear-implicit Euler and Crank–Nicolson schemes
        8.6.4  Global and local convergence orders
        8.6.5  Numerical simulations
   8.7  Proofs
        8.7.1  Proof of Lemma 8.3
        8.7.2  Proof of Lemma 8.5
        8.7.3  Some additional lemmas
        8.7.4  Proof of Theorem 8.4

A  Regularity Estimates for SPDEs
   A.1  Some useful inequalities
        A.1.1  Minkowski's integral inequality
        A.1.2  Burkholder–Davis–Gundy-type inequalities
   A.2  Semigroup term
   A.3  Drift term
   A.4  Diffusion term
   A.5  Existence and uniqueness
   A.6  Regularity

Bibliography

Index
Preface

The numerical approximation of stochastic partial differential equations (SPDEs), specifically, stochastic evolution equations of the parabolic or hyperbolic type, encounters all of the difficulties that arise in the numerical solution of both deterministic PDEs and finite dimensional stochastic ordinary differential equations (SODEs) as well as many more due to the infinite dimensional nature of the driving noise processes. The state of development of numerical schemes for SPDEs compares with that for SODEs in the early 1970s. Most of the numerical schemes that have been proposed to date have a low order of convergence, especially in terms of an overall computational effort, and only recently has it been shown how to construct higher order schemes.

The breakthrough for SODEs started with the Milstein scheme and continued with the systematic derivation of stochastic Taylor expansions and the numerical schemes based on them. These stochastic Taylor schemes are based on an iterated application of the Itô formula. The crucial point is that the multiple stochastic integrals which they contain provide more information about the noise processes within discretization subintervals, and this allows an approximation of higher order to be obtained. This theory is presented in detail in the monographs Kloeden & Platen [82] and Milstein [95].

There is, however, no such Itô formula for the solutions of stochastic PDEs in Hilbert spaces or Banach spaces (see Chapter 7 for more details). Nevertheless, it has recently been shown that Taylor expansions for the solutions of such equations can be constructed by taking advantage of the mild form representation of the solutions. Moreover, such expansions are robust with respect to the noise in the additive noise case, i.e., hold for other types of stochastic processes with Hölder continuous paths such as fractional Brownian motion.

This book is based on recent work of the coauthors. Its style, contents, and structure follow the series of lectures given by the second author, Peter Kloeden, in August 2010 at the Illinois Institute of Technology in Chicago. The main difference from the lectures is the existence and uniqueness theorem in Chapter 5. Most of that chapter and the entire appendix were written by the first coauthor, Arnulf Jentzen.
The book also includes new developments on numerical methods for random ordinary differential equations and SODEs, since these are relevant for solving spatially discretized SPDEs as well as in their own right. The focus is on pathwise and strong convergences. In financial mathematics, weak convergence is of primary interest, but strong convergence is nevertheless important, too, as an essential component of the multilevel Monte Carlo method introduced recently in Giles [35] (see also Heinrich [54]).

Much of this book was written during an extended stay by the second author at the Isaac Newton Institute for Mathematical Sciences at the University of Cambridge during a special half year on SPDEs in the first half of 2010. The financial support and congenial working atmosphere are gratefully acknowledged. We also thank SIAM and the National Science Foundation for sponsoring and funding these CBMS lectures, as well as Jinqiao Duan, Igor Cialenco, and Fred J. Hickernell of the Illinois Institute of Technology for hosting the lectures and for their local organization. In particular, we thank Michael Tretyakov for carefully reading parts of the manuscript and Sebastian Becker for his assistance with the computations, figures, and LaTeX.

Arnulf Jentzen, Princeton
Peter Kloeden, Frankfurt am Main
List of Figures

3.1  Stochastic Taylor trees for the stochastic Euler scheme (left) and the Milstein scheme (right). The multi-indices are formed by concatenating indices along a branch from the right back toward the root ∅ of the tree. Dashed line segments correspond to remainder terms.
4.1  Empirical distributions of K_{0.001}^{(0.5)} and K_{0.001}^{(1.0)} (sample size: N = 10^4).
6.1  Mean-square error vs. computational effort as log-log plot with the function f(u) = (1/2) u.
7.1  Pathwise approximation error of the Taylor approximations (7.54)–(7.59) vs. Δt = t − 1/2 for different t ∈ [1/2, 1] and one random realization of (7.18).
7.2  Mean square approximation error of the Taylor approximations (7.54)–(7.59) vs. Δt = t − 1/2 for different t ∈ [1/2, 1] and (7.18).
8.1  Two examples of stochastic trees.
8.2  The stochastic wood w_0 in SW.
8.3  The stochastic wood w_1 in SW.
8.4  The stochastic wood w_2 in SW.
8.5  The stochastic wood w_3 in SW.
8.6  The stochastic wood w_4 in SW.
8.7  The stochastic wood w_5 in SW.
8.8  Subtrees of the right tree in Figure 8.1.
8.9  Root-mean-square approximation error of (8.42), (8.43), and (8.46) vs. time steps Δt = t − t_0 = t − 1/2 of size Δt ∈ {2^{-14}, 2^{-15}, ..., 2^{-18}}.
8.10 Root-mean-square discretization error (8.57) of the exponential Euler approximation Y_k^{128,M,128}, k ∈ {0, 1, ..., M}, given by (8.54), and of the linear-implicit Euler approximation Y_k^{128,M,128}, k ∈ {0, 1, ..., M}, given by (8.55), vs. M ∈ {2, 4, 8, 16} and M ∈ {4, 16, 64, 256}, respectively.
Chapter 1
Introduction
Taylor expansions are a very basic tool in numerical analysis and other areas of mathematics which require approximations. In particular, they enable the derivation of one-step numerical schemes for ordinary differential equations (ODEs) of arbitrarily high order, although in practice such Taylor schemes are rarely implemented but are used instead as a theoretical comparison for determining the convergence orders of other schemes that have been derived by more heuristic methods. A similar situation holds for random ordinary differential equations (RODEs), which are pathwise ODEs whose vector field is continuous but not differentiable in time due to the nature of the driving noise processes.
1.1 Taylor expansions for ODEs

The Taylor expansion of a (p + 1)-times continuously differentiable function x : R → R is given by
\[
x(t) = x(t_0) + x'(t_0)\, h + \cdots + \frac{1}{p!}\, x^{(p)}(t_0)\, h^p + \frac{1}{(p+1)!}\, x^{(p+1)}(\theta)\, h^{p+1}
\tag{1.1}
\]
with h = t − t_0 and the remainder term evaluated at some intermediate value θ ∈ [t_0, t], which is usually unknown. Now let x(t) = x(t; t_0, x_0) be the solution of a scalar ODE
\[
\frac{dx}{dt} = f(t, x)
\tag{1.2}
\]
with the initial value x(t_0) = x_0, and define the differential operator L by
\[
Lg(t, x) := \frac{\partial g}{\partial t}(t, x) + f(t, x)\, \frac{\partial g}{\partial x}(t, x),
\]
i.e., Lg(t, x(t)) is the total derivative of g(t, x(t)) with respect to a solution x(t) of the ODE (1.2), since
\[
\frac{d}{dt}\, g(t, x(t)) = \frac{\partial g}{\partial t}(t, x(t)) + \frac{\partial g}{\partial x}(t, x(t))\, x'(t) = Lg(t, x(t))
\]
by the chain rule. In particular, for any such solution x'(t) = f(t, x(t)),
\[
x''(t) = \frac{d}{dt}\, x'(t) = \frac{d}{dt}\, f(t, x(t)) = Lf(t, x(t)),
\qquad
x'''(t) = \frac{d}{dt}\, x''(t) = \frac{d}{dt}\, Lf(t, x(t)) = L^2 f(t, x(t)),
\]
and, in general, x^{(j)}(t) = L^{j-1} f(t, x(t)), j = 1, 2, ..., provided f is smooth enough. For notational convenience, define L^0 f(t, x) ≡ f(t, x).

If f is p times continuously differentiable, then the solution x(t) of the ODE (1.2) is (p + 1)-times continuously differentiable and has a Taylor expansion (1.1), which can be rewritten as
\[
x(t) = x(t_0) + \sum_{j=1}^{p} \frac{1}{j!}\, L^{j-1} f(t_0, x(t_0))\, (t - t_0)^j
+ \frac{1}{(p+1)!}\, L^{p} f(\theta, x(\theta))\, (t - t_0)^{p+1}.
\tag{1.3}
\]
On a subinterval [t_n, t_{n+1}] with h = t_{n+1} − t_n > 0, the Taylor expansion is
\[
x(t_{n+1}) = x(t_n) + \sum_{j=1}^{p} \frac{1}{j!}\, L^{j-1} f(t_n, x(t_n))\, h^j
+ \frac{1}{(p+1)!}\, L^{p} f(\theta_n, x(\theta_n))\, h^{p+1}
\tag{1.4}
\]
for some θ_n ∈ [t_n, t_{n+1}], which is usually unknown. Nevertheless, the error term can be estimated and is of order O(h^{p+1}), since
\[
\frac{h^{p+1}}{(p+1)!}\, \bigl| L^{p} f(\theta_n, x(\theta_n; t_n, x_n)) \bigr|
\le \frac{h^{p+1}}{(p+1)!}\, \max_{\substack{t_0 \le t \le T \\ x \in D}} \bigl| L^{p} f(t, x) \bigr|
\le C_{p,T,D}\, h^{p+1},
\]
where D is some sufficiently large compact subset of R containing the solution over a bounded time interval [t_0, T], which contains the subintervals under consideration.
The maximum can be used here since L^p f is continuous on [t_0, T] × D. Note that L^p f contains the partial derivatives of f up to order p.
1.1.1 Taylor schemes for ODEs

The Taylor scheme of order p for the ODE (1.2),
\[
x_{n+1} = x_n + \sum_{j=1}^{p} \frac{h^j}{j!}\, L^{j-1} f(t_n, x_n),
\tag{1.5}
\]
is obtained by discarding the remainder term in the Taylor expansion (1.4) and replacing x(t_n) by x_n. The Taylor scheme (1.5) is an example of a one-step explicit scheme of the general form
\[
x_{n+1} = x_n + h\, F(h, t_n, x_n)
\tag{1.6}
\]
with an increment function F defined by
\[
F(h, t, x) := \sum_{j=1}^{p} \frac{1}{j!}\, L^{j-1} f(t, x)\, h^{j-1}.
\]
Such a scheme is said to have order p if its global discretization error
\[
G_n(h) := \bigl| x(t_n; t_0, x_0) - x_n \bigr|, \qquad n = 0, 1, \ldots, N_h := \frac{T - t_0}{h},
\]
converges with order p, i.e., if
\[
\max_{0 \le n \le N_h} G_n(h) \le C_{p,T,D}\, h^{p}.
\]
A basic result in numerical analysis says that a one-step scheme converges with order p if its local discretization error converges with order p + 1. The local discretization error is defined by
\[
L_{n+1}(h) := \bigl| x(t_{n+1}) - x(t_n) - h\, F(h, t_n, x(t_n)) \bigr|,
\]
i.e., the error of one iteration of the scheme on each subinterval, starting at the exact value of the solution x(t_n) at time t_n. Thus, the Taylor scheme of order p is indeed a pth order scheme.

The simplest Taylor scheme is the Euler scheme,
\[
x_{n+1} = x_n + h\, f(t_n, x_n),
\tag{1.7}
\]
with p = 1.
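To make the schemes above concrete, the following short Python sketch (not from the book) compares the Euler scheme (1.7), the Heun scheme (1.8) introduced below, and the Taylor scheme (1.5) with p = 2 on a test equation. The choice f(t, x) = −x + sin t, the initial value, and the step sizes are illustrative assumptions only.

```python
import numpy as np

# Illustrative sketch (not from the book): the Euler scheme (1.7), the Heun
# scheme (1.8) below, and the Taylor scheme (1.5) of order p = 2 for a scalar
# ODE dx/dt = f(t, x).  The test equation f(t, x) = -x + sin(t) and all
# numerical parameters are assumptions chosen purely for demonstration.

def f(t, x):
    return -x + np.sin(t)

def Lf(t, x):
    # Lf = f_t + f * f_x; here f_t = cos(t) and f_x = -1
    return np.cos(t) - f(t, x)

euler   = lambda t, x, h: x + h * f(t, x)
heun    = lambda t, x, h: x + 0.5 * h * (f(t, x) + f(t + h, x + h * f(t, x)))
taylor2 = lambda t, x, h: x + h * f(t, x) + 0.5 * h**2 * Lf(t, x)

def integrate(step, x0, T, h):
    t, x = 0.0, x0
    for _ in range(round(T / h)):
        x = step(t, x, h)
        t += h
    return x

x0, T = 1.0, 2.0
# closed-form solution of x' = -x + sin(t), x(0) = x0
exact = (x0 + 0.5) * np.exp(-T) + 0.5 * (np.sin(T) - np.cos(T))
for h in (0.1, 0.05, 0.025):
    errs = {name: abs(integrate(s, x0, T, h) - exact)
            for name, s in (("Euler", euler), ("Heun", heun), ("Taylor-2", taylor2))}
    print(h, {k: f"{v:.2e}" for k, v in errs.items()})
# Halving h roughly halves the Euler error (order 1) and quarters the
# Heun and Taylor-2 errors (order 2).
```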
The higher coefficients L^{j-1} f(t, x) of a Taylor scheme of order p > 1 are, however, very complicated. For example,
\[
L^2 f = L[Lf] = \frac{\partial}{\partial t}[Lf] + f\,\frac{\partial}{\partial x}[Lf]
= \frac{\partial}{\partial t}\left(\frac{\partial f}{\partial t} + f\,\frac{\partial f}{\partial x}\right)
+ f\,\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial t} + f\,\frac{\partial f}{\partial x}\right)
\]
\[
= \frac{\partial^2 f}{\partial t^2}
+ \frac{\partial f}{\partial t}\,\frac{\partial f}{\partial x}
+ f\,\frac{\partial^2 f}{\partial t\,\partial x}
+ f\,\frac{\partial^2 f}{\partial x\,\partial t}
+ f\left(\frac{\partial f}{\partial x}\right)^2
+ f^2\,\frac{\partial^2 f}{\partial x^2}.
\]
Taylor schemes are thus rarely used in practice,¹ but they are very useful for theoretical purposes, e.g., for determining by comparison the local discretization order of other numerical schemes derived by heuristic means, such as the Heun scheme,
\[
x_{n+1} = x_n + \frac{h}{2}\,\bigl[ f(t_n, x_n) + f(t_{n+1}, x_n + h\, f(t_n, x_n)) \bigr],
\tag{1.8}
\]
which is a Runge–Kutta scheme of order 2.
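Expansions such as the one for L²f above are error prone to compute by hand. The following small sketch (not part of the book) checks it symbolically with sympy for a generic smooth f; the two mixed-derivative terms are combined into one.

```python
import sympy as sp

# Small check (not from the book): compute L^2 f symbolically for a generic
# smooth f(t, x) and compare with the expanded expression given above.
t, x = sp.symbols("t x")
f = sp.Function("f")(t, x)

L = lambda g: sp.diff(g, t) + f * sp.diff(g, x)      # Lg = g_t + f * g_x
L2f = sp.expand(L(L(f)))

expected = sp.expand(
    sp.diff(f, t, 2) + sp.diff(f, t) * sp.diff(f, x)
    + 2 * f * sp.diff(f, t, x) + f * sp.diff(f, x) ** 2
    + f ** 2 * sp.diff(f, x, 2)
)
print(sp.simplify(L2f - expected) == 0)   # True
```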
1.1.2 Integral representation of Taylor expansions
Taylor expansions of a solution x(t) = x(t; t_0, x_0) of an ODE (1.2) also have an integral derivation and representation. These are based on the integral equation representation of the initial value problem of the ODE,
\[
x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\, ds,
\tag{1.9}
\]
and, by the Fundamental Theorem of Calculus, the integral form of the total derivative,
\[
g(t, x(t)) = g(t_0, x_0) + \int_{t_0}^{t} Lg(s, x(s))\, ds.
\tag{1.10}
\]
Note that (1.10) reduces to the integral equation (1.9) with g(t, x) = x, since Lg = f in this case.

Applying (1.10) to g = f over the interval [t_0, s] to the integrand of the integral equation (1.9) gives
\[
x(t) = x_0 + \int_{t_0}^{t} \left( f(t_0, x_0) + \int_{t_0}^{s} Lf(\tau, x(\tau))\, d\tau \right) ds
= x_0 + f(t_0, x_0) \int_{t_0}^{t} ds + \int_{t_0}^{t} \int_{t_0}^{s} Lf(\tau, x(\tau))\, d\tau\, ds,
\]
¹ Symbolic manipulators now greatly facilitate their use. Indeed, Coomes, Koçak & Palmer [20] applied a Taylor scheme of order 31 to the three-dimensional Lorenz equations.
which is the first order Taylor expansion. Then applying (1.10) to g = Lf over the interval [t_0, τ] to the integrand Lf in the double integral remainder term leads to
\[
x(t) = x_0 + f(t_0, x_0) \int_{t_0}^{t} ds + Lf(t_0, x_0) \int_{t_0}^{t} \int_{t_0}^{s} d\tau\, ds
+ \int_{t_0}^{t} \int_{t_0}^{s} \int_{t_0}^{\tau} L^2 f(\rho, x(\rho))\, d\rho\, d\tau\, ds.
\]
In this way one obtains the Taylor expansion in integral form:
\[
x(t) = x(t_0) + \sum_{j=1}^{p} L^{j-1} f(t_0, x_0) \int_{t_0}^{t} \int_{t_0}^{s_1} \cdots \int_{t_0}^{s_{j-1}} ds_j \cdots ds_1
+ \int_{t_0}^{t} \int_{t_0}^{s_1} \cdots \int_{t_0}^{s_p} L^{p} f(s_{p+1}, x(s_{p+1}))\, ds_{p+1} \cdots ds_1.
\tag{1.11}
\]
(For j = 1, there is just a single integral over t_0 ≤ s_1 ≤ t.) This is equivalent to the differential form of the Taylor expansion (1.3) by the Intermediate Value Theorem for Integrals and the fact that
\[
\int_{t_0}^{t} \int_{t_0}^{s_1} \cdots \int_{t_0}^{s_{j-1}} ds_j \cdots ds_1 = \frac{1}{j!}\, (t - t_0)^j, \qquad j = 1, 2, \ldots.
\]
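The factorial identity for the iterated integrals can also be checked symbolically; the snippet below is a small illustration (not part of the book) using sympy.

```python
import sympy as sp

# Small illustration (not from the book): check symbolically that the j-fold
# iterated integral of 1 over t0 <= s_j <= ... <= s_1 <= t equals (t - t0)^j / j!.
t, t0 = sp.symbols("t t0")

def iterated_integral(j):
    # integration variables s_1, ..., s_j with upper limits t, s_1, ..., s_{j-1}
    s = [sp.Symbol(f"s{i}") for i in range(1, j + 1)]
    upper = [t] + s[:-1]
    expr = sp.Integer(1)
    for i in range(j - 1, -1, -1):          # innermost integral (ds_j) first
        expr = sp.integrate(expr, (s[i], t0, upper[i]))
    return sp.expand(expr)

for j in range(1, 5):
    assert sp.expand(iterated_integral(j) - (t - t0) ** j / sp.factorial(j)) == 0
print("identity verified for j = 1, ..., 4")
```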
Missing page
Missing page
Chapter 2
Taylor Expansions and Numerical Schemes for RODEs
Random ordinary differential equations (RODEs) are pathwise ODEs that contain a stochastic process in their vector field functions. They have been used for many years in a wide range of applications but have been very much overshadowed by stochastic ordinary differential equations (SODEs). They are intrinsically nonautonomous ODEs due to the noise. Typically, however, the driving stochastic process has at most Hölder continuous sample paths, so the sample paths of the solutions are certainly continuously differentiable, but the derivatives of the sample paths are at most Hölder continuous in time. Thus, after insertion of the driving stochastic process, the resulting vector field is at most Hölder continuous in time, no matter how smooth the vector field is in its original variables. Consequently, although classical numerical schemes for ODEs can be used pathwise for RODEs, they rarely attain their traditional order and new forms of higher order schemes are required.
2.1 RODEs

Let (Ω, A, P) be a probability space and let ζ : [0, T] × Ω → R^m be an R^m-valued stochastic process with continuous sample paths. In addition, let f : R^m × R^d → R^d be a continuous function. A RODE in R^d,
\[
\frac{dx}{dt} = f(\zeta_t(\omega), x), \qquad x \in \mathbb{R}^d,
\tag{2.1}
\]
is a nonautonomous ODE
\[
\frac{dx}{dt} = F_{\omega}(t, x) := f(\zeta_t(\omega), x)
\tag{2.2}
\]
for almost every realization ω ∈ Ω.
A simple example of a scalar RODE is
\[
\frac{dx}{dt} = -x + \sin(W_t(\omega)),
\tag{2.3}
\]
where W_t is a scalar Wiener process. Here f(z, x) = −x + sin z and d = m = 1. RODEs with other kinds of noise, such as fractional Brownian motion, have been used, e.g., in Garrido-Atienza, Kloeden & Neuenkirch [34].

For convenience, it will be assumed that the RODE (2.2) holds for all ω ∈ Ω, by restricting Ω to a subset of full probability if necessary, and that f is infinitely often continuously differentiable in its variables, although k-times continuously differentiable with k sufficiently large would suffice. In particular, f is then locally Lipschitz in x, so the initial value problem
\[
\frac{d}{dt}\, x_t(\omega) = f(\zeta_t(\omega), x_t(\omega)), \qquad x_0(\omega) = X_0(\omega),
\tag{2.4}
\]
where the initial value X_0 is an R^d-valued random variable, has a unique pathwise solution x_t(ω) for every ω ∈ Ω, which will be assumed to exist on the finite time interval [0, T] under consideration. Sufficient conditions that guarantee the existence and uniqueness of such solutions can be found in Arnold [2] and Bunke [12]. They are similar to those for ODEs.

The solution of the RODE (2.4) is a stochastic process (x_t)_{t∈[0,T]}. Its sample paths t → x_t(ω) are continuously differentiable but need not be further differentiable, since the vector field F_ω(t, x) of the nonautonomous ODE (2.2) is usually only continuous, but not differentiable, in t, no matter how smooth the function f is in its variables.
2.1.1 Equivalence of RODEs and SODEs

RODEs occur in many applications; see, for example, [2, 12, 113, 114] and the papers cited therein. Moreover, RODEs with Wiener processes can be rewritten as SODEs, so results for one can be applied to the other. For example, the scalar RODE (2.3) can be rewritten as the two-dimensional SODE
\[
d\begin{pmatrix} X_t \\ Y_t \end{pmatrix}
= \begin{pmatrix} -X_t + \sin(Y_t) \\ 0 \end{pmatrix} dt
+ \begin{pmatrix} 0 \\ 1 \end{pmatrix} dW_t.
\]
On the other hand, any finite dimensional SODE can be transformed to a RODE. In the case of commutative noise this is the famous Doss–Sussmann result [31, 116], which was generalized to all SODEs in recent years by Imkeller & Schmalfuss [60]. It is easily illustrated for a scalar SODE with additive noise: the equation
\[
dX_t = f(X_t)\, dt + dW_t
\]
is equivalent to the RODE
\[
\frac{dz}{dt} = f(z + O_t) + O_t,
\tag{2.5}
\]
where z(t) := X_t − O_t and O_t is the stochastic stationary Ornstein–Uhlenbeck process satisfying the linear SODE
\[
dO_t = -O_t\, dt + dW_t.
\tag{2.6}
\]
To see this, subtract integral versions of both SODEs and substitute to obtain
\[
z(t) = z(0) + \int_0^t \bigl[ f(z(s) + O_s) + O_s \bigr]\, ds.
\]
It then follows by continuity and the Fundamental Theorem of Calculus that z is pathwise differentiable. In particular, deterministic calculus can be used pathwise for RODEs. This greatly facilitates the investigation of dynamical behavior and other qualitative properties of RODEs.

For example, suppose that f in the RODE (2.5) satisfies a one-sided dissipative Lipschitz condition,
\[
\langle x - y,\, f(x) - f(y) \rangle \le -L\, |x - y|^2
\]
for all x, y ∈ R^d and some L > 0. Then, for any two solutions z_1(t) and z_2(t) of the RODE (2.5),
\[
\frac{d}{dt}\, |z_1(t) - z_2(t)|^2
= 2\, \Bigl\langle z_1(t) - z_2(t),\ \frac{dz_1}{dt} - \frac{dz_2}{dt} \Bigr\rangle
= 2\, \bigl\langle z_1(t) - z_2(t),\ f(z_1(t) + O_t) - f(z_2(t) + O_t) \bigr\rangle
\]
\[
= 2\, \bigl\langle (z_1(t) + O_t) - (z_2(t) + O_t),\ f(z_1(t) + O_t) - f(z_2(t) + O_t) \bigr\rangle
\le -2L\, |z_1(t) - z_2(t)|^2,
\]
from which it follows that
\[
|z_1(t) - z_2(t)|^2 \le e^{-2Lt}\, |z_1(0) - z_2(0)|^2 \to 0 \quad \text{as } t \to \infty.
\]
From the theory of random dynamical systems [2] there thus exists a pathwise asymptotically stable stochastic stationary solution. Transforming back to the SODE, one concludes that the SODE also has a pathwise asymptotically stable stochastic stationary solution.
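A minimal numerical sketch of this transformation is given below (not from the book): the Ornstein–Uhlenbeck path (2.6) is simulated first, and the RODE (2.5) is then integrated pathwise with a deterministic Euler step before transforming back. The drift f(x) = −x³, the step sizes, and the random seed are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not from the book) of the pathwise transformation described
# above: the SODE dX_t = f(X_t) dt + dW_t is solved by (i) simulating the
# Ornstein-Uhlenbeck process (2.6) and (ii) integrating the RODE (2.5) for
# z(t) = X_t - O_t pathwise with a deterministic Euler step.  The drift
# f(x) = -x**3 is an arbitrary choice for illustration only.

rng = np.random.default_rng(0)

def f(x):
    return -x**3

T, n = 1.0, 2**12
h = T / n

# one Wiener path and the corresponding OU path dO = -O dt + dW (Euler-Maruyama)
dW = rng.normal(0.0, np.sqrt(h), n)
O = np.zeros(n + 1)
for k in range(n):
    O[k + 1] = O[k] - O[k] * h + dW[k]

# pathwise deterministic Euler step for dz/dt = f(z + O_t) + O_t, z(0) = X_0
X0 = 1.0
z = np.zeros(n + 1)
z[0] = X0 - O[0]
for k in range(n):
    z[k + 1] = z[k] + h * (f(z[k] + O[k]) + O[k])

X = z + O          # transform back: X_t = z(t) + O_t
print("X(T) ≈", X[-1])
```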
2.1.2 Simple numerical schemes for RODEs
The rules of deterministic calculus apply pathwise to RODEs, but the vector field function F_ω(t, x) in (2.2) is not smooth in t. It is at most Hölder continuous in time, like the driving stochastic process ζ_t, and thus lacks the smoothness needed to justify the Taylor expansions and the error analysis of traditional numerical methods for ODEs in Chapter 1. Such methods can be used but will attain at best a low convergence order, so new higher order numerical schemes must be derived for RODEs.

For example, let ζ_t be pathwise Hölder continuous of order 1/2. Then the Euler scheme with step size Δ_n,
\[
Y_{n+1} = (1 - \Delta_n)\, Y_n + \zeta_{t_n}\, \Delta_n,
\]
for the RODE
\[
\frac{dx}{dt} = -x + \zeta_t
\]
attains the pathwise order 1/2. One can do better, however, by using the pathwise averaged Euler scheme
\[
Y_{n+1} = (1 - \Delta_n)\, Y_n + \int_{t_n}^{t_{n+1}} \zeta_t\, dt,
\]
which was proposed by Grüne & Kloeden [40] (see also Talay [119, 120]). It attains the pathwise order 1 provided the integral is approximated with Riemann sums
\[
\int_{t_n}^{t_{n+1}} \zeta_t\, dt \approx \sum_{j=1}^{J_n} \zeta_{t_n + j\delta}\, \delta
\]
with the step size δ satisfying δ^{1/2} ≈ Δ_n and δ · J_n = Δ_n. In fact, this was done more generally in [40] for RODEs with an affine structure, i.e., of the form
\[
\frac{dx}{dt} = g(x) + G(x)\, \zeta_t,
\tag{2.7}
\]
where g : R^d → R^d and G : R^d → R^{d×m}. The explicit averaged Euler scheme then reads
\[
Y_{n+1} = Y_n + \bigl[ g(Y_n) + G(Y_n)\, I_n \bigr]\, \Delta_n,
\tag{2.8}
\]
where
\[
I_n := \frac{1}{\Delta_n} \int_{t_n}^{t_{n+1}} \zeta_s\, ds.
\tag{2.9}
\]
For the general RODE (2.1) this suggests that one should pathwise average the vector field, i.e., use
\[
\frac{1}{\Delta_n} \int_{t_n}^{t_{n+1}} f(\zeta_s, Y_n)\, ds,
\]
which is computationally expensive even for low-dimensional systems. An alternative is to use the averaged noise within the vector field, which leads to the explicit averaged Euler scheme
\[
Y_{n+1} = Y_n + f(I_n, Y_n)\, \Delta_n.
\tag{2.10}
\]
A systematic derivation of higher order numerical schemes for RODEs based on this idea using Taylor-like expansions from [61] and [79] will be outlined below. The next two subsections come from [79]. See also Carbonell et al. [15] for the local linearization method for RODEs.
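As an illustration of the averaged schemes just described, the following Python sketch (not from the book) applies the explicit averaged Euler scheme (2.10) to the RODE (2.3). The simulated Wiener path, the macro step Δ, and the substep δ ≈ Δ² are assumptions chosen for demonstration.

```python
import numpy as np

# Sketch (assumptions: the test RODE dx/dt = f(W_t, x) with f(z, x) = -x + sin(z)
# from (2.3), and a simulated Wiener path) of the explicit averaged Euler scheme
# (2.10): the noise is averaged over each subinterval with a Riemann sum on a
# finer grid of mesh delta ~ Delta**2, as suggested by the text.

rng = np.random.default_rng(1)

def f(z, x):
    return -x + np.sin(z)

T = 1.0
N = 2**6                 # number of macro steps
Delta = T / N            # macro step size Delta_n
J = 2**6                 # substeps per macro step, so delta = Delta / J = Delta**2
delta = Delta / J

# Wiener path on the fine grid (N * J increments)
dW = rng.normal(0.0, np.sqrt(delta), N * J)
W = np.concatenate(([0.0], np.cumsum(dW)))

y = 1.0                  # initial value
for n in range(N):
    W_sub = W[n * J : (n + 1) * J]            # fine-grid values on [t_n, t_{n+1})
    I_n = W_sub.mean()                        # I_n ~ (1/Delta) * int_{t_n}^{t_{n+1}} W_s ds
    y = y + f(I_n, y) * Delta                 # averaged Euler step (2.10)

print("averaged Euler approximation at T:", y)
```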
2.2 Taylor-like expansions for RODEs

For simplicity, our attention will be restricted here to the one-dimensional case d = m = 1; readers are referred to [66] for vector valued RODEs with multidimensional noise processes. To emphasize the role of the sample paths of the driving stochastic process ζ_t, a canonical sample space Ω = C([0, ∞), R) of continuous functions ω : [0, ∞) → R will be used, so ζ_t(ω) = ω(t) for t ∈ [0, ∞) and, henceforth, the RODE (2.1) will be written
\[
\frac{dx}{dt} = f(\omega(t), x).
\]
Since f is assumed to be infinitely often continuously differentiable in its variables, the initial value problem
\[
\frac{dx}{dt} = f(\omega(t), x), \qquad x(t_0) = x_0,
\tag{2.11}
\]
has a unique solution, which will be assumed to exist on a finite time interval [t_0, T] under consideration. Hence, by the continuity of the solution x(t) = x(t; t_0, x_0, ω) on [t_0, T], there exists an R = R(ω, T) > 0 such that |x(t)| ≤ R for all t ∈ [t_0, T].

In addition, it will be assumed that P-almost all sample paths of ζ_t are locally Hölder continuous with the same Hölder exponent, i.e., there is a γ ∈ (0, 1] such that for P-almost all ω ∈ Ω and each T > 0 there exists a C_{ω,T} > 0 such that
\[
|\omega(t) - \omega(s)| \le C_{\omega,T} \cdot |t - s|^{\gamma} \quad \text{for all } |t|, |s| \le T.
\tag{2.12}
\]
Let θ be the supremum of all γ with this property. Two cases will be distinguished later: Case A, in which (2.12) also holds for θ itself, and Case B, when it does not. The Wiener process with θ = 1/2 is an example of Case B.

For such sample paths ω define
\[
\|\omega\|_{\infty} := \sup_{t \in [t_0, T]} |\omega(t)|, \qquad
\|\omega\|_{\gamma} := \|\omega\|_{\infty} + \sup_{\substack{s \ne t \in [t_0, T] \\ |t - s| \le 1}} \frac{|\omega(t) - \omega(s)|}{|t - s|^{\gamma}},
\]
so
\[
|\omega(t) - \omega(s)| \le \|\omega\|_{\gamma} \cdot |t - s|^{\gamma} \quad \text{for all } s, t \in [t_0, T],\ |s - t| \le 1.
\]
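The role of the Hölder exponent θ can be illustrated numerically. The sketch below (not from the book) estimates the empirical Hölder quotients of a sampled Wiener path on increasingly fine grids; the grid sizes, the lag range, and the random seed are illustrative assumptions. For γ noticeably below 1/2 the quotient levels off as the grid is refined, while for γ = 1/2 it keeps creeping upward, consistent with θ = 1/2 being Case B.

```python
import numpy as np

# Illustration (not from the book): empirical Hölder quotients
# sup_{s != t} |W_t - W_s| / |t - s|**gamma of a sampled Wiener path on [0, 1],
# restricted to small lags where the quotient is largest.

rng = np.random.default_rng(2)

def holder_quotient(W, t, gamma, max_lag=64):
    q = 0.0
    for lag in range(1, max_lag + 1):
        dt = t[lag] - t[0]
        q = max(q, np.max(np.abs(W[lag:] - W[:-lag])) / dt**gamma)
    return q

for n in (2**10, 2**14, 2**18):
    t = np.linspace(0.0, 1.0, n + 1)
    W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
    print(n, {g: round(holder_quotient(W, t, g), 2) for g in (0.3, 0.45, 0.5)})
```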
2.2.1 Multi-index notation
Let N_0 denote the nonnegative integers. For a multi-index α = (α_1, α_2) ∈ N_0^2 define
\[
|\alpha| := \alpha_1 + \alpha_2, \qquad \alpha! := \alpha_1!\, \alpha_2!.
\]
In addition, for a given γ ∈ (0, 1], define the weighted magnitude of a multi-index α by
\[
|\alpha|_{\gamma} := \gamma\, \alpha_1 + \alpha_2,
\]
and for each K ∈ R^+ with K ≥ |α|_γ let |α|_γ^K := K − |α|_γ. For f ∈ C^∞(R × R, R), denote
\[
\partial^{\alpha} f := (\partial_1)^{\alpha_1} (\partial_2)^{\alpha_2} f
\]
with ∂^{(0,0)} f = f and (0, 0)! = 1. Finally, for brevity, write f_α := ∂^α f.

Let ω and R > 0 be fixed, the latter being chosen, as above, as an upper bound on the solution of the initial value problem (2.11) corresponding to the sample path ω on a fixed interval [t_0, T]. Define
\[
\|f\|_k := \max_{|\alpha| \le k}\ \sup_{\substack{|y| \le \|\omega\|_{\infty} \\ |z| \le R}} |f_{\alpha}(y, z)|
\]
for all k ∈ {0, 1, 2, ...} and, for brevity, write ‖f‖ := ‖f‖_0. Note that the solution of the initial value problem (2.11) is Lipschitz continuous with
\[
|x(t) - x(s)| \le \|f\|\, |t - s| \quad \text{for all } t, s \in [t_0, T].
\]
2.2.2 Taylor expansions of the vector field

The solution x(t) of the initial value problem (2.11) is only once differentiable, so the usual Taylor expansion cannot be continued beyond the linear term. Nevertheless, the special structure of a RODE and the smoothness of f in both of its variables enable one to derive implicit Taylor-like expansions of arbitrary order for the solution.

Fix k ∈ Z^+ and ω ∈ Ω and write
\[
\Delta\omega_s := \omega(s) - \hat{\omega}, \qquad \Delta x_s := x(s) - \hat{x}, \qquad \text{where} \quad \hat{\omega} := \omega(\hat{t}), \quad \hat{x} := x(\hat{t}),
\]
for an arbitrary t̂ ∈ [t_0, T). Then the usual Taylor expansion for f in both variables gives
\[
f(\omega(s), x(s)) = \sum_{|\alpha| \le k} \frac{1}{\alpha!}\, \partial^{\alpha} f(\hat{\omega}, \hat{x})\, (\Delta\omega_s)^{\alpha_1} (\Delta x_s)^{\alpha_2} + R_{k+1}(s)
\tag{2.13}
\]
with remainder term
\[
R_{k+1}(s) = \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \partial^{\alpha} f\bigl(\hat{\omega} + \xi_s\, \Delta\omega_s,\ \hat{x} + \xi_s\, \Delta x_s\bigr)\, (\Delta\omega_s)^{\alpha_1} (\Delta x_s)^{\alpha_2}
\tag{2.14}
\]
for some ξ_s ∈ [0, 1]. Substituting this into the integral equation representation of the solution of (2.11),
\[
x(t) = \hat{x} + \int_{\hat{t}}^{t} f(\omega(s), x(s))\, ds,
\tag{2.15}
\]
gives
\[
\Delta x_t = \underbrace{\sum_{|\alpha| \le k} \frac{1}{\alpha!}\, \partial^{\alpha} f(\hat{\omega}, \hat{x}) \int_{\hat{t}}^{t} (\Delta\omega_s)^{\alpha_1} (\Delta x_s)^{\alpha_2}\, ds}_{\text{Taylor-like approximation}}
\ +\ \underbrace{\int_{\hat{t}}^{t} R_{k+1}(s)\, ds}_{\text{remainder}},
\]
or, more compactly, as
\[
\Delta x_t = \sum_{|\alpha| \le k} T_{\alpha}(t; \hat{t}) + \int_{\hat{t}}^{t} R_{k+1}(s)\, ds,
\tag{2.16}
\]
where
\[
T_{\alpha}(t; \hat{t}) := \frac{1}{\alpha!}\, f_{\alpha}(\hat{\omega}, \hat{x}) \int_{\hat{t}}^{t} (\Delta\omega_s)^{\alpha_1} (\Delta x_s)^{\alpha_2}\, ds.
\]
The expression (2.16) is implicit in Δx_s and so is not a standard Taylor expansion. Nevertheless, it can be used as the basis for constructing higher order numerical schemes for the RODE (2.1).
2.3 RODE–Taylor schemes

The RODE–Taylor schemes are a family of explicit one-step schemes for RODEs (2.1) on subintervals [t_n, t_{n+1}] of [t_0, T] with step size h_n = t_{n+1} − t_n, which are derived from the Taylor-like expansion (2.16).

The simplest case is for k = 0. Then (2.16) reduces to
\[
x(t) = \hat{x} + \frac{1}{(0,0)!}\, \partial^{(0,0)} f(\hat{\omega}, \hat{x}) \int_{\hat{t}}^{t} (\Delta\omega_s)^0 (\Delta x_s)^0\, ds + \int_{\hat{t}}^{t} R_1(s)\, ds
= \hat{x} + f(\hat{\omega}, \hat{x})\, \Delta t + \int_{\hat{t}}^{t} R_1(s)\, ds,
\]
which leads to the well-known Euler scheme
\[
y_{n+1} = y_n + h_n\, f(\omega(t_n), y_n).
\]
In order to derive higher order schemes, the Δx_s terms inside the integrals must also be approximated. This can be done with a numerical scheme of a lower order than that of the scheme to be derived. Higher order schemes can thus be built up recursively for sets of multi-indices of the form
\[
\mathcal{A}_K := \bigl\{ \alpha = (\alpha_1, \alpha_2) \in \mathbb{N}_0^2 : |\alpha|_{\theta} = \theta\alpha_1 + \alpha_2 < K \bigr\},
\]
where K ∈ R^+ and θ ∈ (0, 1] is specified by the noise process in the RODE, i.e., the supremum of the Hölder coefficients of its sample paths.

For K ∈ [0, ∞) consider the first step
\[
y_1^{K,h}(\hat{t}, \hat{y}) = \hat{y} + \Delta y_h^{(K)}(\hat{t}, \hat{y})
\tag{2.17}
\]
of a numerical approximation at the time instant t̂ + h for a step size h ∈ (0, 1] and initial value (t̂, ŷ) ∈ [t_0, T] × R, where the increments Δy_h^{(K)} are defined recursively. Specifically, let Δy_h^{(0)} := 0 and define
\[
\Delta y_h^{(K)}(\hat{t}, \hat{y}) := \sum_{|\alpha|_{\theta} < K} N_{\alpha}^{(K)}(\hat{t} + h, \hat{t}, \hat{y}),
\tag{2.18}
\]
where
\[
N_{\alpha}^{(K)}(\hat{t} + h, \hat{t}, \hat{y}) := \frac{1}{\alpha!}\, f_{\alpha}(\hat{\omega}, \hat{y}) \int_{\hat{t}}^{\hat{t}+h} (\Delta\omega_s)^{\alpha_1} \bigl(\Delta y_s^{(|\alpha|_{\theta}^{K})}(\hat{t}, \hat{y})\bigr)^{\alpha_2}\, ds.
\]
The corresponding scheme on the grid is obtained by iterating the first step (2.17), i.e., y_{n+1} := y_n + Δy_{h_n}^{(K)}(t_n, y_n) with step sizes h_n > 0 for n = 0, 1, ..., N_T − 1. This will often be called the K-RODE–Taylor scheme.
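The multi-index sets A_K are easy to enumerate programmatically. The following sketch (not from the book) does so for θ = 1/2 and reproduces the sets that appear in the Brownian motion examples of Section 2.3.2 below.

```python
from itertools import product

# Sketch (not from the book): enumerate the multi-index sets
# A_K = { (a1, a2) in N_0^2 : theta*a1 + a2 < K } used to build K-RODE-Taylor
# schemes.  With theta = 1/2 this reproduces the sets appearing in the
# Brownian motion examples of Section 2.3.2.

def A(K, theta):
    bound = int(K / theta) + 1            # a1, a2 cannot exceed K/theta
    return sorted((a1, a2) for a1, a2 in product(range(bound + 1), repeat=2)
                  if theta * a1 + a2 < K)

theta = 0.5
for K in (0.5, 1.0, 1.5, 2.0):
    print(K, A(K, theta))
# 0.5 -> [(0, 0)]
# 1.0 -> [(0, 0), (1, 0)]
# 1.5 -> [(0, 0), (0, 1), (1, 0), (2, 0)]
# 2.0 -> [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (3, 0)]
```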
2.3.1 Discretization error
The increment function of a K-RODE–Taylor scheme is
\[
F^{(K)}(h, \hat{t}, \hat{y}) := \frac{1}{h} \sum_{|\alpha|_{\theta} < K} \frac{1}{\alpha!}\, f_{\alpha}(\hat{\omega}, \hat{y}) \int_{\hat{t}}^{\hat{t}+h} (\Delta\omega_s)^{\alpha_1} \bigl(\Delta y_s^{(|\alpha|_{\theta}^{K})}(\hat{t}, \hat{y})\bigr)^{\alpha_2}\, ds,
\]
so the convergence theory for one-step explicit schemes applies pathwise (see, e.g., Deuflhard & Bornemann [30]). Moreover, the classical theorem for ODEs on the loss of a power from the local to the global discretization error also holds for the RODE–Taylor schemes. Consequently, it suffices to estimate the local discretization error, which is defined here as
\[
L_h^{(K)}(\hat{t}, \hat{y}) := \bigl| x(\hat{t} + h; \hat{t}, \hat{y}) - y_1^{K,h}(\hat{t}, \hat{y}) \bigr|,
\]
where x(t̂ + h; t̂, ŷ) is the value of the solution of the RODE at time t̂ + h with initial value (t̂, ŷ) and y_1^{K,h}(t̂, ŷ) is the first step of the numerical scheme with step size h for the same initial value.

Define R̃_0 := 0 and, for K > 0, define R̃_K as the supremum over step sizes h ∈ (0, 1] of the scaled lower order increments h^{-1}|Δy_h^{(L)}(t̂, ŷ)|, L < K; the resulting constant R_K is such that both |Δx_s| and |Δy_s^{(|α|_θ^K)}(t̂, ŷ)| are bounded by R_K Δs (see (2.26) below).

Theorem 2.1. For the K-RODE–Taylor scheme and h ∈ (0, 1], in Case A the local discretization error satisfies
\[
L_h^{(K)}(\hat{t}, \hat{y}) \le C_K\, h^{K+1},
\]
while in Case B
\[
L_h^{(K)}(\hat{t}, \hat{y}) \le C_K^{\varepsilon}\, h^{K+1-\varepsilon}
\]
for every ε > 0 arbitrarily small, where
\[
C_K^{\varepsilon} := e^{\|\omega\|_{\gamma_{\varepsilon}} + 2R_K}\, \|f\|_{k+1}, \qquad \gamma_{\varepsilon} := \theta - \frac{\varepsilon}{(k+1)^2}.
\]
Case A will be proved in detail, with some remarks provided on the essential differences which arise in Case B.

Proof. The terms T_α in the Taylor-like expansion (2.16) of the RODE solution are easily estimated to give
\[
\bigl| T_{\alpha}(\hat{t} + \Delta t; \hat{t}) \bigr| \le \|f\|_{|\alpha|} \cdot \frac{1}{\alpha!}\, \|\omega\|_{\gamma}^{\alpha_1}\, \|f\|^{\alpha_2}\, (\Delta t)^{|\alpha|_{\gamma} + 1},
\tag{2.22}
\]
where Δt := t − t̂ ∈ (0, 1] and t ∈ (t̂, T]. Then the estimate
\[
\Bigl| \int_{\hat{t}}^{t} R_{k+1}(s)\, ds \Bigr| \le \|f\|_{k+1} \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \|\omega\|_{\gamma}^{\alpha_1}\, \|f\|^{\alpha_2}\, (\Delta t)^{|\alpha|_{\gamma} + 1}
\tag{2.23}
\]
of the remainder term follows from
\[
|R_{k+1}(s)| = \Bigl| \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \partial^{\alpha} f(\hat{\omega} + \xi_s\, \Delta\omega_s,\ \hat{x} + \xi_s\, \Delta x_s)\, (\Delta\omega_s)^{\alpha_1} (\Delta x_s)^{\alpha_2} \Bigr|
\]
\[
\le \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \bigl| \partial^{\alpha} f(\hat{\omega} + \xi_s\, \Delta\omega_s,\ \hat{x} + \xi_s\, \Delta x_s) \bigr|\, |\Delta\omega_s|^{\alpha_1} |\Delta x_s|^{\alpha_2}
\]
\[
\le \|f\|_{k+1} \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \bigl( \|\omega\|_{\gamma}\, (\Delta s)^{\gamma} \bigr)^{\alpha_1} \bigl( \|f\|\, \Delta s \bigr)^{\alpha_2}
\le \|f\|_{k+1} \sum_{|\alpha| = k+1} \frac{1}{\alpha!}\, \|\omega\|_{\gamma}^{\alpha_1}\, \|f\|^{\alpha_2}\, (\Delta s)^{|\alpha|_{\gamma}}.
\]
Case A is proved by mathematical induction on K ≥ 0. Here estimates such as (2.22) and (2.23) hold for all γ ≤ θ, including θ itself, which will be used below. For K = 0 the assertion is trivial, because
\[
L_h^{(0)}(\hat{t}, \hat{x}) := \bigl| x(\hat{t} + h; \hat{t}, \hat{x}) - y_1^{0,h}(\hat{t}, \hat{x}) \bigr| = \bigl| x(\hat{t} + h) - x(\hat{t}) \bigr| \le \|f\|\, h.
\]
19
For K ∈ (0, θ ), the index k = kK = 0 and the multi-index set AK = {(0, 0)}. Hence (K) ˆ ˆ := x(tˆ + h; tˆ, x) ˆ − y1K,h (tˆ, x) Lh (tˆ, x) := xt − y1K,h (tˆ, x) ˆ tˆ+h (K) = Tα (tˆ + h; tˆ) + R1 (s) ds − Nα (tˆ + h; tˆ, x) ˆ tˆ |α|≤0 AK tˆ+h (K) R1 (s) ds − N(0,0) (tˆ + h; tˆ, x) ˆ = T(0,0) (tˆ + h; tˆ) + tˆ
tˆ+h θ+1 R1 (s) ds ≤ f 1 (ωθ + f ) h . =
tˆ ≤ CK
≤ hK+1
This completes the first step of the induction argument. Suppose now that the assertion is true for K < L for a fixed L = l θ, where l ∈ N. The aim is to show that the assertion is then true for any K ∈ [L, L + θ). Then k = kK = l and (K) ˆ ˆ := x(tˆ + h; tˆ, x) ˆ − y1K,h (tˆ, x) Lh (tˆ, x) tˆ+h (K) ≤ Rk+1 (s) ds − Nα (tˆ + h; tˆ, x) ˆ Tα (tˆ + h; tˆ) + tˆ |α|≤k α∈AK ˆ Tα (tˆ + h; tˆ) + ≤ ˆ Tα (t + h; tˆ) − Nα(K) (tˆ + h; tˆ, x) |α|≤k,α∈AcK
1. error
α2 >0,α∈AK
tˆ+h + Rk+1 (s) ds . ˆ t
2. error
3. error
Estimates of the first and third errors were given, respectively, in (2.22) and (2.23). The second error can be estimated through the following lemma. Lemma 2.2. For α ∈ AK with α2 > 0 α2 (K) ˆ ωαθ 1 (RK )α2 −1 hK+1 (2.24) ˆ − Tα (tˆ + h; tˆ) ≤ f k CK−1 Nα (t + h; tˆ, x) α! with the convention that Cr := 1 for r < 0.
20
Chapter 2. Taylor Expansions and Numerical Schemes for RODEs (|α|K θ )
Proof. For brevity, omit the variables tˆ and xˆ in ys
and write
ˆ − Tα (tˆ + h; tˆ) . E2 := Nα(K) (tˆ + h; tˆ, x) Then α2 tˆ+h 1 (|α|K α1 α2 θ ) ys ds ˆ x) ˆ (ωs ) − (xs ) E2 ≤ fα (ω, α! tˆ α2 tˆ+h 1 (|α|K ) − (xs )α2 ds . ωαθ 1 (s)θα1 y s θ f |α| ≤ α! tˆ By the inequality |y p − x p | ≤ p · |y − x| · (max(|x|, |y|) )p−1 ,
x, y ∈ R,
p ∈ N,
the difference in the integrand can be estimated by α2 K (|α|Kθ ) α2 −1 (|α|Kθ ) α2 y (|α|θ ) | |x , y ≤ α y max − (x ) − x s 2 s s s s s (2.25) where, by the definition of RK , (|α|Kθ ) (2.26) max |xs | , ys ≤ RK s. Hence α2 ωαθ 1 E2 ≤ f k α!
tˆ
tˆ+h
(|α|K )
Ls θ (tˆ, x) ˆ (s)θα1 (RK s)α2 −1 ds
α2 ≤ f k ωαθ 1 (RK )α2 −1 α!
tˆ+h
tˆ
(|α|K )
Ls θ (tˆ, x) ˆ (s)|α|θ −1 ds.
(|α|K )
ˆ ≤ C|α|K (s)|α|θ By the induction assumption, Ls θ (tˆ, x)
K +1
θ
, so
tˆ+h α2 K α1 α2 −1 E2 ≤ f k C|α|K · (s)|α|θ +|α|θ ds ωθ (RK ) θ α! tˆ α2 α1 α2 −1 ωθ (RK ) · hK+1 , ≤ f k CK−1 · α! where Cr := 1 for r < 0. This completes the proof of Lemma 2.2.
2.3. RODE–Taylor schemes
21
Returning to the proof of Theorem 2.1, the three error estimates (2.22), (2.23), and (2.24) combine to give (K)
Lh (tˆ, x) ˆ ≤ f k+1 CK−1 hK+1
|α|≤k+1
1 ωαθ 1 (RK )α2 . α!
Finally, using the double exponential expansion ex+y =
α2 1 x α1 y α2 −1 x α1 y α2 = α! α! 2
α∈ N0
one obtains
α2 >0
(K) Lh (tˆ, x) ˆ ≤ f k+1 eωθ +RK CK−1 hK+1 ≤ CK hK+1 .
This completes the proof of Case A of Theorem 2.1. The main difference in the proof of Case B of Theorem 2.1 is in the estimate of the second error there given in Lemma 2.2. The required estimate is given by the following lemma, where the convention that Cr := 1 for r < 0 is also used. Lemma 2.3. For α ∈ AK with α2 > 0, h ∈ (0, 1], and ε > 0 arbitrarily small, α2 (K) ˆ ε∗ ωαγε1 (RK )α2 −1 hK+1−ε , ˆ − Tα (tˆ + h; tˆ) ≤ f k CK−1 Nα (t + h; tˆ, x) α! (2.27) where k ε ε∗ := ε , γε := θ − . k+1 (k + 1)2 Proof. As in Lemma 2.2, write E2 for the expression to be estimated. Now (ωs )α1 ≤ ωαγε1 (s)α1 γε ≤ ωαγε1 (s)
α1 θ−α1 ε
1 (k+1)2
1
≤ ωαγε1 (s)θα1 −ε k+1 ,
since s ∈ (0, 1] and α1 θ − ε
1 1 ≤ α1 θ − α1 ε k+1 (k + 1)2
for α1 ≤ K/θ ≤ kK + 1 = k + 1. Thus α2 tˆ+h 1 (|α|K α1 α2 θ ) E2 ≤ fα (ω, ys ds ˆ x) ˆ (ωs ) − (xs ) α! tˆ α2 tˆ+h 1 1 (|α|K ) f |α| ωαγε1 (s)θα1 −ε k+1 y s θ − (xs )α2 ≤ α! tˆ
ds,
22
Chapter 2. Taylor Expansions and Numerical Schemes for RODEs
so, using the inequalities (2.25) and (2.26) to estimate the difference in the integrand, tˆ+h 1 α2 (|α|K ) E2 ≤ f k ωαγε1 Ls θ (tˆ, x) ˆ (s)(θα1 −ε k+1 ) (RK s)α2 −1 ds α! ˆt tˆ+h 1 α2 (|α|K ) ≤ f k Ls θ (tˆ, x) ˆ (s)|α|θ −1−ε k+1 ds, ωαγε1 (RK )α2 −1 α! tˆ (|α|K )
K +1−ε ∗
∗
|α|θ ε ˆ ≤ C|α| where Ls θ (tˆ, x) K (s)
by the induction assumption. Hence
θ
α2 ε∗ E2 ≤ f k C|α| ωαγε1 (RK )α2 −1 K · α! θ
tˆ+h
tˆ
(s)|α|θ
K +|α| −ε θ
ds
∗
ε ≤ CK−1 ∗
ε ≤ f k CK−1
α2 ωαγε1 (RK )α2 −1 hK+1−ε , α!
since |α|K θ + |α|θ − ε = K − ε. This completes the proof of Lemma 2.3.
2.3.2
Examples of RODE–Taylor schemes
Some explicit examples of RODE–Taylor schemes will be presented here for scalar RODEs, all but one with scalar noise processes. Two representative noise processes will be considered: Brownian motion (Wiener process) and fractional Brownian motion with Hurst coefficient H = 34 . MATLAB software for RODE–Taylor schemes can be found in [6]. RODEs with Brownian motion The following examples have the Brownian motion or Wiener process as the driving process, which falls into Case B with θ = 12 . The AK -RODE–Taylor schemes, which have pathwise global convergence order K − ε, are given for K = 0, 0.5, 1.0, 1.5, and 2.0. Since θ = 12 AK := α : |α| 1 < K = { α : α1 + 2α2 ≤ 2K − 1 } 2
for these K. Example 2.4. The 0-RODE–Taylor scheme corresponding to A0 = ∅ is yn ≡ y0 , which is an inconsistent scheme. Example 2.5. The 0.5-RODE–Taylor scheme corresponding to A0.5 = { (0, 0) } is the classical Euler scheme yn+1 = yn + hf (ω(tn ), yn ), which has order 0.5 − ε.
(2.28)
2.3. RODE–Taylor schemes
23
Example 2.6. The 1.0-RODE–Taylor scheme corresponding to A1.0 = {(0, 0), (1, 0)} is the “improved” Euler scheme, tn+1 yn+1 = yn + hf (ω(tn ), yn ) + f(1,0) (ω(tn ), yn ) ωs ds. tn
Its order 1 − ε is comparable to that of the Euler scheme for smooth ODEs. In the following schemes the coefficient functions on the right side are evaluated at (ω(tn ), yn ). Example 2.7. The 1.5-RODE–Taylor scheme corresponding to A1.5 = {(0, 0), (1, 0), (2, 0), (0, 1)} is tn+1 f(2,0) tn+1 h2 (ωs )2 ds + f(0,1) f . ωs ds + yn+1 = yn + hf + f(1,0) 2 2 tn tn tn+1 (0.5) Here |(0, 1)|1.5 (ys ) ds 1 = 1.5−1 = 0.5 and the last term is obtained from f(0,1) tn 2
(0.5)
with ys = (s − tn )f coming from the Euler scheme (2.28). Example 2.8. In the next case A2.0 = {(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)} and (0.5) (1.0) |(1, 1)|21 = 0.5, |(0, 1)|21 = 1, so the terms ys , ys corresponding to the 0.52
2
and 1.0-RODE–Taylor schemes are required in the right-hand side of the new scheme. The resulting 2.0-RODE–Taylor scheme is then tn+1 f(2,0) tn+1 yn+1 = yn + hf + f(1,0) ωs ds + (ωs )2 ds 2 tn tn f(3,0) tn+1 h2 + (ωs )3 ds + f(0,1) f 6 2 tn tn+1 s tn+1 + f(0,1) f(1,0) ωv dv ds + f(1,1) f ωs s ds. tn
tn
tn
Example 2.9. Finally, consider a two-dimensional Brownian motion ω(t) = (ω1 (t), ω2 (t)). The 1.5-RODE–Taylor scheme here corresponds to the indicial set A1.5 = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0), (0, 2, 0), (0, 0, 1)} and is given by tn+1 tn+1 1 yn+1 = yn + hf + f(1,0,0) ωs ds + f(0,1,0) ωs2 ds + f(1,1,0)
tn
tn+1 tn
f(0,2,0) + 2
tn+1 tn
ωs1 ωs2 ds +
f(2,0,0) 2
(ωs2 )2 ds + f(0,0,1) f
tn
tn+1 tn
h2 . 2
(ωs1 )2 ds
24
Chapter 2. Taylor Expansions and Numerical Schemes for RODEs
RODEs with fractional Brownian motion Fractional Brownian motion with Hurst coefficient H = 34 also falls into Case B with θ = 34 . The RODE–Taylor schemes generally contain fewer terms than the schemes of the same order with Brownian motion or attain a higher order when they contain the same terms. Example 2.10. For AK = { (0, 0) } with K ∈ (0, 34 ] the RODE–Taylor scheme is the classical Euler scheme in Example 2, but now the order is 34 − ε. Example 2.11. AK = {(0, 0), (1, 0)} for K ∈ ( 34 , 1] : the RODE–Taylor scheme is the same as that in Example 3 and also has order 1 − ε. Example 2.12. For AK = {(0, 0), (1, 0), (0, 1)} with K ∈ (1, 32 ] the RODE–Taylor scheme, tn+1 h2 yn+1 = yn + hf + f(1,0) ωs + f(0,1) f , 2 tn which has order 1.5 − ε, omits one of the terms in the RODE–Taylor scheme of the same order for Brownian motion given in Example 2.7. Example 2.13. For AK = {(0, 0), (1, 0), (0, 1), (2, 0)} with K ∈ ( 32 , 74 ] the Taylor scheme is the same as that in the Brownian motion case in Example 2.7 but now has order 74 − ε instead of order order 32 − ε. Remark 1. The K-RODE-Taylor schemes are not necessarily optimal in the sense of involving the minimum number of terms for the given order (see also [66]).
Chapter 3
SODEs
Deterministic calculus is much more robust to approximation than stochastic calculus because the integrand function in a Riemann sum approximating a Riemann integral can be evaluated at an arbitrary point of the discretization subinterval, whereas for an Itô stochastic integral the integrand function must always be evaluated at the left-hand endpoint. Consequently, considerable care is needed in deriving numerical schemes for stochastic ordinary differential equations (SODEs) to ensure that they are consistent with Itô stochastic calculus. In particular, stochastic Taylor schemes are the essential starting point for the derivation of consistent higher order numerics schemes for SODEs. Other types of schemes for SODEs, such as derivativefree schemes, can then be obtained by modifying the corresponding stochastic Taylor schemes. The theory of stochastic Taylor schemes is briefly outlined here in order to highlight the methods and assumptions used and their shortcomings. Most of this chapter comes from [78] and [58, 71, 81].
3.1
Itô SODEs
Consider a scalar Itô SODE dXt = a(t, Xt ) dt + b(t, Xt ) dWt ,
(3.1)
where (Wt )t∈R+ is a standard Wiener process, i.e., with W0 = 0, w.p.1, and increments Wt − Ws ∼ N (0; t − s) for t ≥ s ≥ 0, which are independent on nonoverlapping subintervals. The SODE (3.1) is, in fact, only a symbolic representation for the stochastic integral equation t t Xt = Xt0 + a(s, Xs ) ds + b(s, Xs ) dWs , (3.2) t0
t0
25
26
Chapter 3. SODEs
where the first integral is pathwise a deterministic Riemann integral and the second an Itô stochastic integral, which looks as if it could be defined pathwise as a deterministic Riemann–Stieltjes integral, but this is not possible because the sample paths of the Wiener process, though continuous, are not differentiable or even of bounded variation on any finite subinterval. The Itô stochastic integral of a nonanticipative mean-square integrable integrand g is defined in terms of the mean-square limit, namely,
T
g(s) dWs := ms − lim
N T −1
→0
t0
g(tn , ω) Wtn+1 (ω) − Wtn (ω) ,
n=0
taken over partitions of [t0 , T ] of maximum step size := maxn n , where n = tn+1 −tn and tNT = T . Nonanticipativeness here means, in particular, that the random variables g(tn ) and Wtn+1 − Wtn are independent, from which follow two simple but very useful properties of Itô integrals: E
T
g(s) dWs = 0,
t0
3.1.1
E
2
T
g(s) dWs t0
=
T
Eg(s)2 ds.
t0
Existence and uniqueness of strong solutions of SODEs
The following theorem is a standard existence and uniqueness theorem for SODEs. The vector valued case is analogous. Proofs can be found, for example, in G¯ı hman & Skorohod [36], Kloeden & Platen [82], and Mao [91]. Theorem 3.1. Suppose that a, b : [t0 , T ] × R → R are continuous in (t, x) and satisfy the global Lipschitz condition |a(t, x) − a(t, y)| + |b(t, x) − b(t, y)| ≤ L|x − y| uniformly in t ∈ [t0 , T ] and that the random variable X0 is nonanticipative with respect to the Wiener process Wt with E(X02 ) < ∞. Then the Itô SODE dXt = a(t, Xt ) dt + b(t, Xt ) dWt has a unique strong solution on [t0 , T ] with initial value Xt0 = X0 . Alternatively, one could assume just a local Lipschitz condition. Then the additional linear growth condition |a(t, x)| + |b(t, x)| ≤ K(1 + |x|)
3.1. Itô SODEs
27
or the Hasminskii condition xa(t, x) + |b(t, x)|2 ≤ K(1 + |x|2 ) ensures the existence of the solution on the entire time interval [0, T ], i.e., prevents explosions of solutions in finite time. Cherny & Engelbert [17] give a systematic investigation of existence theorems (both with and without uniqueness) for SODEs under different assumptions on the coefficients. For SODEs with a discontinuous but monotone increasing drift, such as a Heaviside function, see Halidias & Kloeden [47].
3.1.2
Simple numerical schemes for SODEs
The stochastic counterpart of the Euler scheme, usually called the Euler–Maruyama scheme, for the Itô SODE (3.1) is given by Yn+1 = Yn + a(tn , Yn ) n + b(tn , Yn ) Wn with time and noise increments tn+1 ds, n = tn+1 − tn = tn
(3.3)
Wn = Wtn+1 − Wtn =
tn+1
dWs . tn
It seems to be consistent (and, indeed, is) with the Itô stochastic calculus because the noise term in (3.3) approximates the Itô stochastic integral in (3.2) over a discretization subinterval [tn , tn+1 ] by evaluating its integrand at the left-hand endpoint of this interval: tn+1 tn+1 tn+1 b(s, Xs ) dWs ≈ b(tn , Xtn ) dWs = b(tn , Xtn ) dWs . tn
tn
tn
Convergence for numerical schemes can be defined in a number of different useful ways. It is usual to distinguish between strong and weak convergence, depending on whether the realizations or only their probability distributions are required to be close, respectively. Consider a fixed interval [t0 , T ] and let be the maximum step size of any partition of [t0 , T ]. To emphasize the dependence on the step size, the () numerical iterates will be written Yn . Then a numerical scheme is said to converge with strong order γ if, for each sufficiently small , () E XT − YNT ≤ KT γ (3.4) and with weak order β if () E (g(XT )) − E g(YNT ) ≤ Kg,T β
(3.5)
28
Chapter 3. SODEs
for each polynomial g. These are global discretization errors, and the largest possible values of γ and β give the corresponding strong and weak orders, respectively, of the scheme. The stochastic Euler scheme (3.3) has strong order γ = 12 and weak order β = 1, which are quite low, particularly in view of the fact that a large number of realizations need to be generated for most practical applications. To obtain a higher order, one should avoid heuristic adaptations of well-known deterministic numerical schemes because they are usually inconsistent with Itô calculus or, when they are consistent, then they do not improve the order of convergence. Example 3.2. The deterministic Heun scheme, a second order Runge–Kutta scheme, adapted to the Itô SODE (3.1) has the form Yn+1 = Yn +
1 [a(tn , Yn ) + a(tn+1 , Yn + a(tn , Yn )n + b(tn , Yn )Wn )] n 2
1 + [b(tn , Yn ) + b(tn+1 , Yn + a(tn , Yn )n + b(tn , Yn )Wn )] Wn . 2 In particular, for the Itô SODE dXt = Xt dWt it simplifies to 1 Yn+1 = Yn + Yn (2 + Wn ) Wn , 2 so Yn+1 − Yn = Yn (1 + 12 Wn ) Wn . The conditional expectation Yn+1 − Yn x 1 1 x 1 2 E 0 + n = x Yn = x = E Wn + (Wn ) = 2 n 2 2 n n
should approximate the drift term a(t, x) ≡ 0 of the SODE. The adapted Heun scheme is thus not consistent with Itô calculus and does not converge in either the weak or strong sense. A higher order of convergence cannot be obtained with a deterministic numerical scheme adapted to SODEs, when it happens to be consistent, because such a scheme involves only the simple increments of time n and noise Wn , the latter being a poor approximation for the highly irregular Wiener process within the discretization subinterval [tn , tn+1 ]. To obtain a higher order of convergence one needs to provide more information about the Wiener process within the discretization subinterval. Such information is provided by multiple integrals of the Wiener process, which arise in stochastic Taylor expansions of the solution of an SODE. Consistent numerical schemes of arbitrarily desired higher order can be derived by truncating appropriate stochastic Taylor expansions.
3.2 Itô–Taylor expansions
Itô–Taylor expansions or stochastic Taylor expansions of solutions of Itô SODEs are derived through an iterated application of the stochastic chain rule, the Itô formula. The nondifferentiability of the solutions in time is circumvented by using the integral form of the Itô formula. The Itô formula for a scalar valued function f(t, X_t) of the solution X_t of the scalar Itô SODE (3.1) is
    f(t, X_t) = f(t_0, X_{t_0}) + \int_{t_0}^{t} L^0 f(s, X_s)\,ds + \int_{t_0}^{t} L^1 f(s, X_s)\,dW_s,        (3.6)
where the operators L^0 and L^1 are defined by
    L^0 = \frac{\partial}{\partial t} + a\,\frac{\partial}{\partial x} + \frac{1}{2}\,b^2\,\frac{\partial^2}{\partial x^2}, \qquad L^1 = b\,\frac{\partial}{\partial x}.        (3.7)
This differs from the deterministic chain rule by the additional third term in the L^0 operator, which is due to the fact that E\,|\Delta W|^2 = \Delta t. When f(t, x) \equiv x, the Itô formula (3.6) is just the SODE (3.1) in its integral form (3.2), i.e.,
    X_t = X_{t_0} + \int_{t_0}^{t} a(s, X_s)\,ds + \int_{t_0}^{t} b(s, X_s)\,dW_s.        (3.8)

3.2.1 Iterated application of the Itô formula
Applying the Itô formula to the integrand functions f(t, x) = a(t, x) and f(t, x) = b(t, x) in (3.8) gives
    X_t = X_{t_0} + \int_{t_0}^{t} \Big[ a(t_0, X_{t_0}) + \int_{t_0}^{s} L^0 a(u, X_u)\,du + \int_{t_0}^{s} L^1 a(u, X_u)\,dW_u \Big]\,ds
                  + \int_{t_0}^{t} \Big[ b(t_0, X_{t_0}) + \int_{t_0}^{s} L^0 b(u, X_u)\,du + \int_{t_0}^{s} L^1 b(u, X_u)\,dW_u \Big]\,dW_s
        = X_{t_0} + a(t_0, X_{t_0}) \int_{t_0}^{t} ds + b(t_0, X_{t_0}) \int_{t_0}^{t} dW_s + R_1(t, t_0)
with the remainder
    R_1(t, t_0) = \int_{t_0}^{t}\!\int_{t_0}^{s} L^0 a(u, X_u)\,du\,ds + \int_{t_0}^{t}\!\int_{t_0}^{s} L^1 a(u, X_u)\,dW_u\,ds
                + \int_{t_0}^{t}\!\int_{t_0}^{s} L^0 b(u, X_u)\,du\,dW_s + \int_{t_0}^{t}\!\int_{t_0}^{s} L^1 b(u, X_u)\,dW_u\,dW_s.        (3.9)
Discarding the remainder gives the simplest nontrivial stochastic Taylor approximation
    X_t \approx X_{t_0} + a(t_0, X_{t_0}) \int_{t_0}^{t} ds + b(t_0, X_{t_0}) \int_{t_0}^{t} dW_s,        (3.10)
which has strong order \gamma = 0.5 and weak order \beta = 1.
Higher order stochastic Taylor expansions are obtained by successively applying the Itô formula to the integrand functions in the remainder, there being an increasing number of different alternatives. For example, applying the Itô formula to the integrand L^1 b in the fourth double integral of the remainder R_1(t, t_0) gives the stochastic Taylor expansion
    X_t = X_{t_0} + a(t_0, X_{t_0}) \int_{t_0}^{t} ds + b(t_0, X_{t_0}) \int_{t_0}^{t} dW_s + L^1 b(t_0, X_{t_0}) \int_{t_0}^{t}\!\int_{t_0}^{s} dW_u\,dW_s + R_2(t, t_0)        (3.11)
with the remainder
    R_2(t, t_0) = \int_{t_0}^{t}\!\int_{t_0}^{s} L^0 a(u, X_u)\,du\,ds + \int_{t_0}^{t}\!\int_{t_0}^{s} L^1 a(u, X_u)\,dW_u\,ds
                + \int_{t_0}^{t}\!\int_{t_0}^{s} L^0 b(u, X_u)\,du\,dW_s
                + \int_{t_0}^{t}\!\int_{t_0}^{s}\!\int_{t_0}^{u} L^0 L^1 b(v, X_v)\,dv\,dW_u\,dW_s
                + \int_{t_0}^{t}\!\int_{t_0}^{s}\!\int_{t_0}^{u} L^1 L^1 b(v, X_v)\,dW_v\,dW_u\,dW_s.
Discarding the remainder gives the stochastic Taylor approximation
    X_t \approx X_{t_0} + a(t_0, X_{t_0}) \int_{t_0}^{t} ds + b(t_0, X_{t_0}) \int_{t_0}^{t} dW_s + b(t_0, X_{t_0})\,\frac{\partial b}{\partial x}(t_0, X_{t_0}) \int_{t_0}^{t}\!\int_{t_0}^{s} dW_u\,dW_s,        (3.12)
since L^1 b = b\,\frac{\partial b}{\partial x}, which has strong order \gamma = 1 (and also weak order \beta = 1).
To obtain even higher order expansions one continues the above procedure of expanding integrand functions in appropriate remainder terms with the help of the Itô formula. There are thus many different possible stochastic Taylor expansions. Remark 2. The two examples above already indicate the general pattern of the stochastic Taylor schemes: (i) They achieve their higher order through the inclusion of multiple stochastic integral terms.
(ii) An expansion may have different strong and weak orders of convergence.
(iii) The possible orders for strong schemes increase by a fraction 1/2, taking values 1/2, 1, 3/2, 2, ..., whereas the possible orders for weak schemes are whole numbers 1, 2, 3, ... .
3.2.2 General stochastic Taylor expansions
Multi-indices provide a succinct means to describe the terms that should be included or expanded to obtain a stochastic Taylor expansion of a particular order as well as for representing the iterated differential operators and stochastic integrals that appear in the terms of such expansions.
Consider now a scalar SODE
    dX_t = a(t, X_t)\,dt + \sum_{j=1}^{m} b^j(t, X_t)\,dW_t^j        (3.13)
with m independent scalar Wiener processes W_t^1, \ldots, W_t^m and differential operators L^0 now equal to that in (3.7) with b^2 replaced by \sum_{j=1}^{m} (b^j)^2 and
    L^j = b^j\,\frac{\partial}{\partial x}, \qquad j = 1, \ldots, m.
Writing b^0(t, x) for a(t, x) and dW_t^0 for dt, the Itô formula (3.6) for a solution of the SODE (3.13) takes the compact form
    f(t, X_t) = f(t_0, X_{t_0}) + \sum_{j=0}^{m} \int_{t_0}^{t} L^j f(s, X_s)\,dW_s^j
and reduces to the SODE (3.13) when f(t, x) \equiv x, i.e., the identity function id(x) \equiv x, since L^j\,id \equiv b^j for j = 0, 1, \ldots, m.
In general, a multi-index \alpha of length l(\alpha) = l is an l-dimensional row vector \alpha = (j_1, j_2, \ldots, j_l) with components j_i \in \{0, 1, \ldots, m\} for i \in \{1, 2, \ldots, l\}. Let \mathcal{M}_m be the set of all multi-indices of length greater than or equal to zero, where a multi-index \emptyset of length zero is introduced for convenience. Given a multi-index \alpha \in \mathcal{M}_m with l(\alpha) \ge 1, write -\alpha and \alpha- for the multi-index in \mathcal{M}_m obtained by deleting the first and the last component, respectively, of \alpha. For such a multi-index \alpha = (j_1, j_2, \ldots, j_l) with l \ge 1, the multiple Itô integral I_\alpha[g(\cdot)]_{t_0,t} of a nonanticipative mean-square integrable function g is defined recursively by
    I_\alpha[g(\cdot)]_{t_0,t} := \int_{t_0}^{t} I_{\alpha-}[g(\cdot)]_{t_0,s}\,dW_s^{j_l}, \qquad I_\emptyset[g(\cdot)]_{t_0,t} := g(t).
Similarly, the Itô coefficient function f_\alpha for a deterministic function f is defined recursively by f_\alpha := L^{j_1} f_{-\alpha}, f_\emptyset = f.
The multiple stochastic integrals appearing in a stochastic Taylor expansion with constant integrands cannot be chosen completely arbitrarily. Rather, the set of corresponding multi-indices must form a hierarchical set, i.e., a nonempty subset \mathcal{A} of \mathcal{M}_m with
    \sup_{\alpha \in \mathcal{A}} l(\alpha) < \infty \qquad and \qquad -\alpha \in \mathcal{A} \ \ for each \ \ \alpha \in \mathcal{A} \setminus \{\emptyset\}.
The multi-indices of the remainder terms in a stochastic Taylor expansion for a given hierarchical set \mathcal{A} belong to the corresponding remainder set \mathcal{B}(\mathcal{A}) of \mathcal{A} defined by
    \mathcal{B}(\mathcal{A}) = \{\alpha \in \mathcal{M}_m \setminus \mathcal{A} : -\alpha \in \mathcal{A}\},
i.e., consisting of all of the "next following" multi-indices with respect to the given hierarchical set. Then the Itô–Taylor expansion corresponding to the hierarchical set \mathcal{A} and remainder set \mathcal{B}(\mathcal{A}) is
    f(t, X_t) = \sum_{\alpha \in \mathcal{A}} I_\alpha\big[f_\alpha(t_0, X_{t_0})\big]_{t_0,t} + \sum_{\alpha \in \mathcal{B}(\mathcal{A})} I_\alpha\big[f_\alpha(\cdot, X_\cdot)\big]_{t_0,t},
i.e., with constant integrands (hence constant coefficients) in the first sum and time-dependent integrands in the remainder sum.
Remark 3. A similar setup holds for vector valued SODEs with the differential operators appropriately modified. See Chapter 5 of Kloeden & Platen [82] for details.
3.3 Itô–Taylor numerical schemes for SODEs
Itô–Taylor numerical schemes are obtained by applying an Itô–Taylor expansion to the identity function f = id on a subinterval [t_n, t_{n+1}] at a starting point (t_n, Y_n) and discarding the remainder term. This will be illustrated for the case m = 1 of a single Wiener process.
The simplest nontrivial Itô–Taylor expansion (3.9) gives the Euler–Maruyama scheme (3.3), which is the simplest nontrivial stochastic Taylor scheme and has strong order \gamma = 0.5 and weak order \beta = 1. The Itô–Taylor expansion (3.11) gives the Milstein scheme
    Y_{n+1} = Y_n + a(t_n, Y_n)\,\Delta_n + b(t_n, Y_n)\,\Delta W_n + b(t_n, Y_n)\,\frac{\partial b}{\partial x}(t_n, Y_n) \int_{t_n}^{t_{n+1}}\!\int_{t_n}^{s} dW_u\,dW_s.        (3.14)
Here the coefficient functions are f_{(0)} = a, f_{(1)} = b, f_{(1,1)} = b\,\frac{\partial b}{\partial x} and the iterated integrals are
    I_{(0)}[1]_{t_n,t_{n+1}} = \int_{t_n}^{t_{n+1}} dW_s^0 = \Delta_n, \qquad I_{(1)}[1]_{t_n,t_{n+1}} = \int_{t_n}^{t_{n+1}} dW_s^1 = \Delta W_n,
and
    I_{(1,1)}[1]_{t_n,t_{n+1}} = \int_{t_n}^{t_{n+1}}\!\int_{t_n}^{s} dW_\tau^1\,dW_s^1 = \frac{1}{2}\big((\Delta W_n)^2 - \Delta_n\big).        (3.15)
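Combining (3.14) with the identity (3.15), the Milstein scheme is a one-term correction of the Euler–Maruyama step. A minimal sketch for autonomous coefficients follows; the helper names and the illustrative test equation at the bottom are not from the text.

```python
import numpy as np

def milstein(a, b, db_dx, x0, T, N, rng):
    """Milstein scheme (3.14), using (3.15) for the double integral I_(1,1)."""
    dt = T / N
    Y = np.empty(N + 1)
    Y[0] = x0
    dW = rng.normal(0.0, np.sqrt(dt), size=N)
    for n in range(N):
        I11 = 0.5 * (dW[n]**2 - dt)                      # identity (3.15)
        Y[n + 1] = (Y[n] + a(Y[n]) * dt + b(Y[n]) * dW[n]
                    + b(Y[n]) * db_dx(Y[n]) * I11)
    return Y

# Illustrative test equation: dX = X dW, i.e. a = 0, b(x) = x, b'(x) = 1.
rng = np.random.default_rng(2)
Y = milstein(lambda x: 0.0, lambda x: x, lambda x: 1.0, x0=1.0, T=1.0, N=1000, rng=rng)
```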
The Milstein scheme converges with strong order \gamma = 1 and weak order \beta = 1. It has a higher order of convergence in the strong sense than the Euler–Maruyama scheme, but gives no improvement in the weak sense.
In general, applying this idea to the Itô–Taylor expansion corresponding to the hierarchical set \mathcal{A} gives the \mathcal{A}-stochastic Taylor scheme
    Y_{n+1}^{\mathcal{A}} = \sum_{\alpha \in \mathcal{A}} I_\alpha\big[id_\alpha(t_n, Y_n^{\mathcal{A}})\big]_{t_n,t_{n+1}} = Y_n^{\mathcal{A}} + \sum_{\alpha \in \mathcal{A}\setminus\{\emptyset\}} id_\alpha(t_n, Y_n^{\mathcal{A}})\,I_\alpha[1]_{t_n,t_{n+1}}.        (3.16)
The strong order \gamma stochastic Taylor scheme, which converges with strong order \gamma, corresponds to the hierarchical set
    \Lambda_\gamma = \big\{\alpha \in \mathcal{M}_m : l(\alpha) + n(\alpha) \le 2\gamma \ \ or \ \ l(\alpha) = n(\alpha) = \gamma + \tfrac{1}{2}\big\},
where n(\alpha) denotes the number of components of a multi-index \alpha which are equal to 0, while the weak order \beta stochastic Taylor scheme, which converges with weak order \beta, corresponds to the hierarchical set
    \Gamma_\beta = \{\alpha \in \mathcal{M}_m : l(\alpha) \le \beta\}.
For example, for m = 1 the hierarchical sets \Lambda_{1/2} = \Gamma_1 = \{\emptyset, (0), (1)\} give the Euler–Maruyama scheme, which is both strongly and weakly convergent, while the strongly convergent Milstein scheme corresponds to the hierarchical set \Lambda_1 = \{\emptyset, (0), (1), (1,1)\}. Note that the Milstein scheme does not correspond to a stochastic Taylor scheme for a hierarchical set \Gamma_\beta. See Figure 3.1 for a graphical representation of the corresponding stochastic Taylor trees.
Remark 4. Convergence theorems for stochastic Taylor schemes assume that the coefficients of the SODE are sufficiently often differentiable, so that all terms in these schemes make sense. Moreover, they assume that the coefficient functions of the Taylor schemes are globally Lipschitz continuous (see Theorem 10.6.3 in [82]
Figure 3.1. Stochastic Taylor trees for the stochastic Euler scheme (left) and the Milstein scheme (right). The multi-indices are formed by concatenating indices along a branch from the right back toward the root ∅ of the tree. Dashed line segments correspond to remainder terms [78].
in the case of strong convergence and Theorem 14.5.1 in [82] in the case of weak convergence for the precise description of the used assumptions). These will be called the standard assumptions. They are obviously not satisfied by some SODEs that arise in many interesting applications, as will be seen later in this chapter. Remark 5. There are usually no simple expressions like (3.15) for multiple stochastic integrals involving different, independent Wiener processes. How to approximate such integrals is a major issue in stochastic numerics. The difficulty and cost of approximating stochastic integrals of higher multiplicity restricts the practical usefulness of higher order strong schemes. However, the work of Wiktorsson [124] on the simulation of the second iterated integral is very promising. Remark 6. The Itô–Taylor schemes are the “basic” schemes for the weak and strong approximation of stochastic differential equations. Based on these schemes, numerous other numerical schemes such as derivative-free Runge–Kutta-like schemes and multistep schemes have been constructed in recent years. See, e.g., Kloeden & Platen [82], Milstein [95], Milstein & Tretyakov [96], Rößler [107], and the references therein. See also Burrage, Burrage & Tian [13] for the construction of Runge– Kutta-like schemes for SODEs in terms of the Butcher B-series used for Runge–Kutta schemes for ODEs. Debrabant & Kværnø [29] provide a comparison of the B-series and the Itô–Taylor expansions used above.
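The hierarchical sets \Lambda_\gamma and \Gamma_\beta that define the stochastic Taylor schemes above are finite and can be enumerated programmatically. The following minimal sketch does this for m driving Wiener processes; the function names and the representation of multi-indices as tuples are ad hoc choices, not notation from the text.

```python
from itertools import product

def strong_set(gamma, m):
    """Multi-indices of the strong order gamma stochastic Taylor scheme."""
    max_len = int(2 * gamma) + 1
    out = [()]                                   # the empty multi-index
    for l in range(1, max_len + 1):
        for alpha in product(range(m + 1), repeat=l):
            n0 = alpha.count(0)                  # n(alpha): number of zero components
            if l + n0 <= 2 * gamma or (l == n0 == gamma + 0.5):
                out.append(alpha)
    return out

def weak_set(beta, m):
    """Multi-indices of the weak order beta stochastic Taylor scheme: l(alpha) <= beta."""
    return [()] + [alpha for l in range(1, int(beta) + 1)
                   for alpha in product(range(m + 1), repeat=l)]

print(strong_set(0.5, 1))   # [(), (0,), (1,)]          -> Euler-Maruyama
print(strong_set(1.0, 1))   # [(), (0,), (1,), (1, 1)]  -> Milstein
```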
3.4 Pathwise convergence
Pathwise convergence was already considered for RODEs in Chapter 2. It is also interesting for SODEs because numerical calculations of the random variables Y_n in the numerical schemes above are carried out path by path. Itô stochastic calculus is, however, an L^2 or a mean-square calculus and not a pathwise calculus. Nevertheless, some results for the pathwise approximation of SODEs are known. For example, in 1983 Talay [118] showed that the Milstein scheme for an SODE with a scalar Brownian motion has the pathwise error estimate
    \sup_{n=0,\ldots,N_T} \big|X_{t_n}(\omega) - Y_n^{(Mil)}(\omega)\big| \le K_{\epsilon,T}^{(Mil)}(\omega)\,\Delta^{\frac{1}{2}-\epsilon}
for all \epsilon > 0 and almost all \omega \in \Omega, i.e., he showed that the Milstein scheme converges pathwise with order 1/2 - \epsilon. Later Gyöngy [41] and Fleury [33] showed that the Euler–Maruyama scheme has the same pathwise convergence order.
Note that the error constants here depend on \omega, so they are in fact random variables. The nature of their statistical properties is an interesting question, about which little is known theoretically so far and which requires further investigation. Plots of some empirical distributions of the random error constants are given in Figure 4.1 of Chapter 4. See also Kloeden & Neuenkirch [81] and Jentzen, Kloeden & Neuenkirch [70].
Given that the sample paths of a Wiener process are Hölder continuous with exponent 1/2 - \epsilon, one may ask: Is the convergence order 1/2 - \epsilon "sharp" for pathwise approximation? The answer is no! Kloeden & Neuenkirch [81] showed that an arbitrarily high order of pathwise convergence is possible.
Theorem 3.3. Under the standard assumptions, the Itô–Taylor scheme of strong order \gamma > 0 converges pathwise with order \gamma - \epsilon for all \epsilon > 0, i.e.,
    \sup_{n=0,\ldots,N_T} \big|X_{t_n}(\omega) - Y_n^{(\gamma)}(\omega)\big| \le K_{\epsilon,T}^{(\gamma)}(\omega) \cdot \Delta^{\gamma-\epsilon}
for almost all \omega \in \Omega.
Thus, for example, the Milstein scheme has pathwise order 1 - \epsilon rather than the lower order 1/2 - \epsilon obtained in [118] (which was a consequence of the proof used there). The proof of Theorem 3.3, which will not be given here, is based on the Burkholder–Davis–Gundy inequality
    E\Big[\sup_{s\in[0,t]}\Big|\int_0^s X_\tau\,dW_\tau\Big|^p\Big] \le C_p \cdot E\Big[\Big(\int_0^t X_\tau^2\,d\tau\Big)^{p/2}\Big]
and a Borel–Cantelli argument in the following lemma (see Lemma 2.1 in [81]).
Lemma 3.4. Let \gamma > 0 and c_p \ge 0 for p \ge 1. If \{Z_n\}_{n\in\mathbb{N}} is a sequence of random variables with
    \big(E|Z_n|^p\big)^{1/p} \le c_p \cdot n^{-\gamma}
for all p \ge 1 and n \in \mathbb{N}, then for each \epsilon > 0 there exists a nonnegative random variable K_\epsilon such that
    |Z_n(\omega)| \le K_\epsilon(\omega) \cdot n^{-\gamma+\epsilon} \quad a.s.
for all n \in \mathbb{N}.
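Theorem 3.3 can be observed numerically for an SODE with a known explicit solution. The sketch below applies the Milstein scheme to dX_t = X_t dW_t, whose exact solution is X_t = X_0 exp(W_t - t/2), along one fixed Brownian path and prints the pathwise error for several step sizes; consistent with a pathwise order close to 1, the error shrinks roughly in proportion to the step size. All parameter choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N_fine = 1.0, 2**14
dW_fine = rng.normal(0.0, np.sqrt(T / N_fine), N_fine)   # one fixed Brownian path
W = np.concatenate(([0.0], np.cumsum(dW_fine)))

for N in [2**8, 2**10, 2**12]:
    m = N_fine // N
    dt = T / N
    dW = W[::m][1:] - W[::m][:-1]                # increments on the coarse grid
    Y = np.empty(N + 1)
    Y[0] = 1.0
    for n in range(N):                           # Milstein step for dX = X dW
        Y[n + 1] = Y[n] + Y[n] * dW[n] + 0.5 * Y[n] * (dW[n]**2 - dt)
    t = np.linspace(0.0, T, N + 1)
    exact = np.exp(W[::m] - 0.5 * t)             # exact solution on the same path
    print(N, np.max(np.abs(exact - Y)))          # pathwise error sup_n |X_{t_n} - Y_n|
```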
3.4.1 Numerical schemes for RODEs applied to SODEs
The Doss–Sussmann theory allows an SODE to be transformed to an RODE, at least for SODEs with additive or linear noise for which explicit transformations are known. For example, the SODE with additive noise,
    dX_t = -X_t^3\,dt + dW_t,        (3.17)
can be transformed to the RODE
    \frac{dz}{dt} = -(z + O_t)^3 + O_t,        (3.18)
where z(t, \omega) := X_t(\omega) - O_t(\omega) and O_t is the Ornstein–Uhlenbeck stochastic stationary process satisfying the linear SDE
    dO_t = -O_t\,dt + dW_t.        (3.19)
For linear noise it is better to start with a Stratonovich SODE (scalar here for simplicity)
    dX_t = f(X_t)\,dt + \sigma X_t \circ dW_t,        (3.20)
for which the corresponding Itô SODE is
    dX_t = \Big[f(X_t) - \frac{1}{2}\sigma^2 X_t\Big]dt + \sigma X_t\,dW_t.        (3.21)
The transformation z(t, \omega) := e^{-O_t(\omega)} X_t(\omega) leads to the RODE
    \frac{dz}{dt} = e^{-O_t} f\big(e^{O_t} z\big) + O_t z.        (3.22)
The RODEs (3.18) and (3.22) can be solved numerically using one of the pathwise convergent numerical schemes for RODEs in Chapter 2. The Ornstein–Uhlenbeck stochastic stationary process can be simulated directly since its statistical properties are well known. Note that the cubic drift coefficient of the SODE (3.17) does not satisfy the standard assumptions for the convergence of Itô–Taylor schemes.
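A pathwise simulation based on (3.17)–(3.19) first generates a path of the Ornstein–Uhlenbeck process O_t (here by its exact Gaussian one-step recursion) and then integrates the RODE (3.18) along that path with a deterministic scheme. The explicit Euler step below is only a placeholder for the RODE schemes of Chapter 2, and all numerical parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
T, N = 1.0, 10**4
dt = T / N
O = np.empty(N + 1)
O[0] = rng.normal(0.0, np.sqrt(0.5))          # stationary initial value of the OU process
# Exact recursion for dO = -O dt + dW over one step of length dt.
for n in range(N):
    O[n + 1] = O[n] * np.exp(-dt) + rng.normal(0.0, np.sqrt((1 - np.exp(-2 * dt)) / 2))

# Explicit Euler for the RODE dz/dt = -(z + O_t)^3 + O_t, then X_t = z(t) + O_t.
z = np.empty(N + 1)
z[0] = 0.0 - O[0]                             # corresponds to X_0 = 0
for n in range(N):
    z[n + 1] = z[n] + dt * (-(z[n] + O[n])**3 + O[n])
X = z + O                                     # pathwise approximation of (3.17)
```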
3.5 Restrictiveness of the standard assumptions
Proofs in the literature of the above convergence orders, e.g., in the monographs Kloeden & Platen [82] and Milstein [95], assume that the coefficient functions f_\alpha in the Itô–Taylor schemes are uniformly bounded on \mathbb{R}^d, i.e., the partial derivatives of appropriately high order of the SODE coefficient functions a, b^1, \ldots, b^m are uniformly bounded on \mathbb{R}^d. This assumption is not satisfied for many SODEs in important applications such as the stochastic Ginzburg–Landau equation (the constants here are all positive),
    dX_t = \Big[\Big(\nu + \frac{1}{2}\sigma^2\Big) X_t - \lambda X_t^3\Big]dt + \sigma X_t\,dW_t,        (3.23)
or even the simpler SODE (3.17) with additive noise and a cubic nonlinearity. Matters are even worse for the Fisher–Wright equation
    dX_t = \big[\kappa_1 (1 - X_t) - \kappa_2 X_t\big]\,dt + \sqrt{X_t (1 - X_t)}\,dW_t,        (3.24)
the Feller diffusion with logistic growth SODE
    dX_t = \lambda X_t (K - X_t)\,dt + \sigma \sqrt{X_t}\,dW_t,        (3.25)
and the Cox–Ingersoll–Ross equation
    dV_t = \kappa (\lambda - V_t)\,dt + \theta \sqrt{V_t}\,dW_t,        (3.26)
since the square-root function is not differentiable at zero and requires the expression under it to remain nonnegative for the SODE to make sense.
3.5.1 Counterexamples for the Euler–Maruyama scheme
The scalar SODE (3.17) with cubic drift and additive noise has a globally pathwise asymptotically stable stochastic stationary solution, since the drift satisfies a one-sided dissipative Lipschitz condition. Its solution on the time interval [0, 1] for the initial value X_0 = 0 satisfies the stochastic integral equation
    X_t = -\int_0^t X_s^3\,ds + W_t        (3.27)
for every t \in [0, 1] and has finite first moment E|X_1| < \infty (see, e.g., Theorem 2.4.1 in [91]). This SODE was first investigated numerically by Mattingly, Stuart & Higham [93] and Petersen [102]. The corresponding Euler–Maruyama scheme with constant step size \Delta = \frac{1}{N} is given by
    Y_{k+1}^{(N)} = Y_k^{(N)} - \Delta\big(Y_k^{(N)}\big)^3 + \Delta W_k(\omega)        (3.28)
with Y_0^{(N)} = 0 and
    \Delta W_k = W_{(k+1)\Delta} - W_{k\Delta}.
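The scheme (3.28) is trivial to implement; a minimal sketch follows (the seed and the number of steps are illustrative). As Theorem 3.5 below shows, the absolute moments of Y_N^{(N)} nevertheless diverge as N grows, driven by rare sample paths on which the iterates explode, so Monte Carlo averages of this scheme are unreliable even though most individual paths look harmless.

```python
import numpy as np

def em_cubic(N, rng):
    """Euler-Maruyama scheme (3.28) for dX = -X^3 dt + dW on [0, 1] with X_0 = 0."""
    dt = 1.0 / N
    Y = 0.0
    for _ in range(N):
        Y = Y - dt * Y**3 + rng.normal(0.0, np.sqrt(dt))
    return Y

rng = np.random.default_rng(5)
print(em_cubic(1000, rng))
```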
The next theorem, due to Hutzenthaler, Jentzen & Kloeden [58], shows that the Euler–Maruyama scheme does not converge strongly for the SODE (3.27). In addition, the remark following the proof shows that it also does not converge weakly.
Theorem 3.5. The solution X_t of (3.27) and its Euler–Maruyama approximation Y_k^{(N)} satisfy
    \lim_{N\to\infty} E\,\big|X_1 - Y_N^{(N)}\big| = \infty.        (3.29)
Proof. Let N ∈ N be arbitrary, define rN := max{3N , 2}, and consider the event sup |Wk (ω)| ≤ 1, |W0 (ω)| ≥ rN . N := ω ∈ : k=1,...,N−1
Then it follows by induction that k−1 (N) Yk (ω) ≥ rN2
for all ω ∈ N
for every k = 1, 2, . . . , N. Clearly, for k = 1, (N) Y1 (ω) = |W0 (ω)| ≥ rN
(3.30)
(3.31)
for each ω ∈ N from the definition of the Euler–Maruyama scheme (3.28) and the definition of the set N . Therefore, assume that (3.30) holds for some k ∈ {1, 2, . . . , N − 1}. Then k−1 (N) (3.32) Yk (ω) ≥ rN2 ≥ rN ≥ 1 for every ω ∈ N . Now, by the definition of the set N , |Wk (ω)| ≤ 1 for every ω ∈ N , so by (3.28) it follows that 3 (N) (N) (N) Yk+1 (ω) = Yk (ω) − Yk (ω) + Wk (ω) (N) 3 (N) ≥ Yk (ω) − Yk (ω) − |Wk (ω)| (N) (N) 3 ≥ Yk (ω) − Yk (ω) − 1
(3.33)
for every ω ∈ N . Hence the inequality (3.32) implies that (N) (N) 3 (N) 2 Yk+1 (ω) ≥ Yk (ω) − 2 Yk (ω) (N) 2 (N) ≥ Yk (ω) Yk (ω) − 2 (N) 2 (N) 2 ≥ Yk (ω) (rN − 2) ≥ Yk (ω) for every ω ∈ N , since rN − 2 ≥ 1 by definition. Thus, the induction hypothesis, i.e., the assumption that (3.30) holds for k, implies that k−1 2 k (N) = rN2 Yk+1 (ω) ≥ rN2 for all ω ∈ N , which shows that (3.30) holds for k + 1. Hence, inequality (3.30) holds for every k = 1, 2, . . . , N . In particular, N−1 (N) YN (ω) ≥ rN2 for every ω ∈ N and N ∈ N and it follows that (N) (N) E YN = YN (ω) P(dω) ≥
N
(N −1) (N) YN (ω) P(dω) ≥ P (N ) S · 22
for every N ∈ N. Now P (N ) = P
sup
sup |Wt | ≤
≥P
0≤t≤1
sup |Wt | ≤
=P ≥P
|Wk | ≤ 1 · P (|W0 | ≥ rN )
k=1,...,N−1
0≤t≤1
sup |Wt | ≤ 0≤t≤1
1 ≥ ·P 4
1 · P (|W0 | ≥ rN ) 2 √ √ 1 ·P N W1/N ≥ N rN 2 √ 1 1√ 2 · N rN e−( NrN ) 2 4
sup |Wt | ≤ 0≤t≤1
1 2 · e−NrN 2
for every N ∈ N. The next step needs the following lemma, which was proved in [58].
Lemma 3.6. Let (, F , P) be a probability space and let Z : → R be an F /B(R)measurable mapping which is standard normally distributed. Then 1 1 2 2 P |Z| ∈ [x, 2x] S ≥ x e−2x (3.34) P |Z| ≥ x ≥ x e−x , 4 2 for every x ∈ [0, ∞). It follows from Lemma 3.6 that 1 2 N−1 (N) 1 E YN ≥ · P sup |Wt | ≤ · e−NrN · 22 4 2 0≤t≤1 for every N ∈ N. Hence (N) 1 lim E YN ≥ · P N→∞ 4 =
1 ·P 4
sup |Wt | ≤
0≤t≤1
sup |Wt | ≤ 0≤t≤1
1 2 N−1 · lim e−NrN · 22 2 N→∞ 1 3 N−1 · lim e−9N · 22 = ∞. 2 N→∞
Finally, since E |X1 | is finite, it follows that (N) (N) lim E X1 − YN ≥ lim E YN − E |X1 | = ∞, N→∞
N→∞
which is the assertion of the theorem. Remark 7. Theorem 3.5 implies, by Jensen’s inequality, that (N) p lim E X1 − YN = ∞ N→∞
(3.35)
for every p ∈ [1, ∞). Moreover, since E |X1 |p < ∞ for every p ∈ [1, ∞) (see, e.g., Theorem 2.4.1 in [91]), it follows that 1 p 1 (N) p p p (N) E YN = E YN − X1 + X1 p 1 1 p (N) ≥ E YN − X1 − E |X1 |p p → ∞ as N → ∞ by (3.35). Hence
(N) p lim E |X1 |p − E YN = ∞
N→∞
for every p ∈ [1, ∞).
(3.36)
The above result is a special case of a more general result for scalar SODEs,
    dX_t = a(X_t)\,dt + b(X_t)\,dW_t, \qquad t \in [0, 1],        (3.37)
for which at least one coefficient function grows superlinearly, assuming that a solution exists, i.e., a predictable stochastic process X_t satisfying
    P\Big(\int_0^1 |a(X_s)| + |b(X_s)|^2\,ds < \infty\Big) = 1        (3.38)
and
    P\Big(X_t = \xi + \int_0^t a(X_s)\,ds + \int_0^t b(X_s)\,dW_s\Big) = 1        (3.39)
for every t \in [0, 1], where the initial value is X_0 = \xi.
Theorem 3.7. Suppose that there exist constants C \ge 1 and \beta > \alpha > 1 such that
    \max\big(|a(x)|, |b(x)|\big) \ge \frac{|x|^\beta}{C}, \qquad \min\big(|a(x)|, |b(x)|\big) \le C\,|x|^\alpha        (3.40)
for all |x| \ge C. If the solution X_t of the scalar SODE
    dX_t = a(X_t)\,dt + b(X_t)\,dW_t, \qquad X_0 = \xi,        (3.41)
for t \in [0, 1] satisfies E|X_1|^p < \infty for some p \in [1, \infty) and if the initial condition satisfies P(b(\xi) \ne 0) > 0, then
    \lim_{N\to\infty} E\,\big|X_1 - Y_N^{(N)}\big|^p = \infty \qquad and \qquad \lim_{N\to\infty} \Big|E\,|X_1|^p - E\,\big|Y_N^{(N)}\big|^p\Big| = \infty,        (3.42)
where Y_N^{(N)} denotes the Nth iterate of the corresponding Euler–Maruyama scheme with the constant time step \Delta = \frac{1}{N}. The assumption that the diffusion function does not vanish at the starting point ensures the presence of noise in the first time step.
The proof of Theorem 3.7 can be found in Hutzenthaler, Jentzen & Kloeden [58]. The estimates in the proof require \beta > 1 and so do not apply to a superlinear drift function such as a(x) = x \log x. However, the theorem does apply to the stochastic Ginzburg–Landau equation (3.23) and the Feller diffusion with logistic growth SODE (3.25), among others.
Remark 8. The implicit Euler–Maruyama scheme applied to the SODE (3.27) converges in the strong sense; see, e.g., Szpruch & Mao [117].
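A minimal sketch of the drift-implicit (backward) Euler–Maruyama scheme of Remark 8, applied to (3.27): each step solves the scalar cubic equation y + \Delta y^3 = Y_n + \Delta W_n for its unique real root, here by a few Newton iterations. The implementation details (Newton solve, parameters) are illustrative choices and are not taken from [117].

```python
import numpy as np

def implicit_em_cubic(N, rng, newton_steps=20):
    """Drift-implicit Euler-Maruyama for dX = -X^3 dt + dW on [0, 1] with X_0 = 0."""
    dt = 1.0 / N
    Y = 0.0
    for _ in range(N):
        rhs = Y + rng.normal(0.0, np.sqrt(dt))
        y = rhs                                     # initial guess for the implicit step
        for _ in range(newton_steps):               # solve y + dt*y**3 = rhs (monotone in y)
            y -= (y + dt * y**3 - rhs) / (1.0 + 3.0 * dt * y**2)
        Y = y
    return Y

rng = np.random.default_rng(6)
print(implicit_em_cubic(1000, rng))
```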
Chapter 4
Numerical Methods for SODEs with Nonstandard Assumptions
There are various ways to overcome the problems caused by nonstandard assumptions on the coefficients of an SODE. One way is to restrict attention to SODEs with special dynamical properties such as exponential ergodicity, e.g., by assuming that the coefficients satisfy certain dissipativity and nondegeneracy conditions; see Higham, Mao & Stuart [57], Mattingly, Stuart & Higham [93], and Milstein & Tretjakov [97]. This yields the appropriate order estimates without bounded derivatives of coefficients. However, several types of SODEs and, in particular SODEs with squareroot coefficients remain a problem. Many numerical schemes do not preserve the positivity of the solution of the SODE and hence may crash when implemented, which has led to various ad hoc modifications to prevent this happening. Sections 4.1 and 4.2 come from [70] and [71], and most of Section 4.3 comes from [59].
4.1 SODEs without uniformly bounded coefficients
Similar to Gyöngy [41], a localization argument was used by Jentzen, Kloeden & Neuenkirch [70] to show that Theorem 3.3 remains true for an SODE
    dX_t = a(X_t)\,dt + \sum_{j=1}^{m} b^j(X_t)\,dW_t^j
when the coefficients satisfy
    a, b^1, \ldots, b^m \in C^{2\gamma+1}(\mathbb{R}^d; \mathbb{R}^d)
but do not necessarily have uniformly bounded derivatives, provided that a solution of the SODE exists on the time interval under consideration. This result is a special case of Theorem 4.1 below. The convergence obtained is pathwise. It applies, for example, to the stochastic Ginzburg–Landau equation (3.23) to give pathwise convergence of the Taylor schemes. Note that strong convergence of Itô–Taylor schemes does not in general hold under these assumptions according to [58] (see also Theorem 3.7 above).
Empirical distributions for the random error constants of the Euler–Maruyama and Milstein schemes applied to the SODE
    dX_t = -(1 + X_t)(1 - X_t^2)\,dt + (1 - X_t^2)\,dW_t, \qquad X(0) = 0,
on the time interval [0, T] for N = 10^4 sample paths and time step \Delta = 0.0001 are plotted in Figure 4.1.
Figure 4.1. Empirical distributions of K_{0.001}^{(1.0)} and K_{0.001}^{(0.5)} (sample size: N = 10^4) [71].
4.2 SODEs on restricted regions
The Fisher–Wright SODE (3.24), the Feller diffusion with logistic growth SODE (3.25), and the Cox–Ingersoll–Ross SODE (3.26) have square-root coefficients, which requires the solutions to remain in the region where the expression under the square root is nonnegative. However, numerical iterations may leave this restricted region, in which case the algorithm will terminate. One way to avoid this problem is to use an appropriately modified Itô–Taylor scheme proposed in Jentzen, Kloeden & Neuenkirch [70].
Consider an SODE
    dX_t = a(X_t)\,dt + \sum_{j=1}^{m} b^j(X_t)\,dW_t^j,        (4.1)
where X_t takes values in a domain D \subset \mathbb{R}^d for t \in [0, T]. Suppose that the coefficients a, b^1, \ldots, b^m are r-times continuously differentiable on D and that the SODE (4.1) has a unique strong solution. Define
    E := \{x \in \mathbb{R}^d : x \notin D\}.
Then choose auxiliary functions f, g^1, \ldots, g^m \in C^s(E; \mathbb{R}^d) for s \in \mathbb{N} and define
    \tilde{a}(x) = a(x) \cdot I_D(x) + f(x) \cdot I_E(x), \qquad x \in \mathbb{R}^d,        (4.2)
    \tilde{b}^j(x) = b^j(x) \cdot I_D(x) + g^j(x) \cdot I_E(x), \qquad x \in \mathbb{R}^d,        (4.3)
for j = 1, \ldots, m. In addition, for x \in \partial D define
    \tilde{a}(x) = \lim_{y \to x;\, y \in D} \tilde{a}(y), \qquad \tilde{b}^j(x) = \lim_{y \to x;\, y \in D} \tilde{b}^j(y)        (4.4)
for j = 1, \ldots, m, if these limits exist. Otherwise, define \tilde{a}(x) = 0 and \tilde{b}^j(x) = 0 for x \in \partial D, respectively. Finally, define the "modified" derivative of a function h : \mathbb{R}^d \to \mathbb{R}^d by
    \partial_{x^l} h(x) = \frac{\partial}{\partial x^l}\,h(x), \qquad x \in D \cup E,        (4.5)
and for x \in \partial D define
    \partial_{x^l} h(x) = \lim_{y \to x;\, y \in D} \partial_{x^l} h(y)        (4.6)
for l = 1, \ldots, d, if this limit exists; otherwise set \partial_{x^l} h(x) = 0 for x \in \partial D.
A modified Itô–Taylor scheme is the corresponding Itô–Taylor scheme for the SODE with modified coefficients
    dX_t = \tilde{a}(X_t)\,dt + \sum_{j=1}^{m} \tilde{b}^j(X_t)\,dW_t^j,        (4.7)
using differential operators \tilde{L}^0, \tilde{L}^1, \ldots, \tilde{L}^m with the above modified coefficients and derivatives. Note that this method is well-defined as long as the coefficients of the equation are (2\gamma + 1)-times differentiable on D and the auxiliary functions are (2\gamma - 1)-times differentiable on E.
The purpose of the auxiliary functions is twofold: to obtain a well-defined approximation scheme and to "reflect" the numerical scheme back to D if it leaves D. In particular, the auxiliary functions can always be chosen to be affine or even constant. It was shown in Jentzen, Kloeden & Neuenkirch [70] that Theorem 3.3 in Chapter 3 adapts to modified Itô–Taylor schemes for SODEs on domains in \mathbb{R}^d. Readers are referred to [70] for the proof.
" bm ∈ C 2γ +1 (D; Rd ) C 2γ −1 (E; Rd ). Theorem 4.1. Assume that ! a, ! b1 , . . . , ! Then for every > 0 and γ = 12 , 1, 32 , . . . there exists a nonnegative random (f ,g)
variable Kγ ,
such that (mod,γ ) sup Xtn (ω) − Yn (ω) ≤ Kγ(f,,g) (ω) · γ −
n=0,...,NT
(mod,γ )
for almost all ω ∈ , where = T /NT and the Yn Itô–Taylor scheme applied to the SODE (4.7).
correspond to the modified
The convergence rate does not depend on the choice of the auxiliary functions, but the random constant in the error bound clearly does. The next examples of Theorem 4.1 come from [70].
4.2.1 Examples
Consider the Cox–Ingersoll–Ross SODE
    dX_t = \kappa (\lambda - X_t)\,dt + \theta \sqrt{X_t}\,dW_t        (4.8)
with \kappa\lambda \ge \theta^2/2 and X_0 = x_0 > 0. Here D = (0, \infty) and the coefficients
    a(x) = \kappa (\lambda - x), \qquad b(x) = \theta \sqrt{x}, \qquad x \in D,
satisfy a, b \in C^\infty(D; \mathbb{R}^1). As auxiliary functions on E = (-\infty, 0) choose, e.g., f(x) = g(x) = 0, or
    f(x) = \kappa (\lambda - x), \quad g(x) = 0, \qquad x \in E.
The first choice of auxiliary functions "kills" the numerical approximation as soon as it takes a negative value. However, the second is more appropriate, since if the scheme takes a negative value, the auxiliary functions force the numerical scheme to be positive again after the next steps, which recovers better the positivity of the exact solution.
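A minimal sketch of the resulting modified Euler–Maruyama scheme (the simplest modified Itô–Taylor scheme) for the Cox–Ingersoll–Ross SODE (4.8), using the second choice of auxiliary functions f(x) = \kappa(\lambda - x) and g(x) = 0 on E (the boundary value x = 0 is handled by the same formulas, as in (4.4)). The parameter values are illustrative.

```python
import numpy as np

def modified_em_cir(kappa, lam, theta, x0, T, N, rng):
    """Modified Euler-Maruyama for the CIR SODE (4.8): outside D = (0, inf) the
    coefficients are replaced by the auxiliary functions f(x) = kappa*(lam - x), g(x) = 0."""
    dt = T / N
    Y = np.empty(N + 1)
    Y[0] = x0
    for n in range(N):
        dW = rng.normal(0.0, np.sqrt(dt))
        if Y[n] > 0.0:                        # x in D: original coefficients
            drift, diff = kappa * (lam - Y[n]), theta * np.sqrt(Y[n])
        else:                                 # x in E or on the boundary: auxiliary functions
            drift, diff = kappa * (lam - Y[n]), 0.0
        Y[n + 1] = Y[n] + drift * dt + diff * dW
    return Y

rng = np.random.default_rng(7)
Y = modified_em_cir(kappa=1.0, lam=1.0, theta=1.0, x0=1.0, T=1.0, N=10**4, rng=rng)
```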
The coefficients of the d-dimensional stochastic Volterra–Lotka SODE
    dX_t = \mathrm{diag}(X_t^1, \ldots, X_t^d)\,\big[(a + BX_t)\,dt + CX_t\,dW_t\big]
(4.9)
are infinitely differentiable on Rd . Here superscripts index the components of the state vector x = (x 1 , . . . , x d ). Hence there is no need for a modification of the coefficients, and one can use the standard Itô–Taylor schemes. On the other hand, it was shown by Mao, Marion & Renshaw [92] that the solution never leaves the set D = (0, ∞)d if the initial value x0 ∈ (0, ∞)d and if the entries of the diffusion matrix C are positive
with strictly positive diagonal components. It is, thus, also useful to apply modified Itô–Taylor methods for this SODE with D = (0, \infty)^d and, e.g., f^k(x) = -2\,|a^k|\,x^k\,I_{\{x^k < 0\}}(x).

With these modifications, this convergence takes the form
    \lim_{N\to\infty} \Big| E\,\big(g(X_T)\big) - \frac{1}{N^2} \sum_{k=1}^{N^2} g\big(Y_N^{(N,k)}(\omega)\big) \Big| = 0, \quad a.s.,        (4.14)
where Y_N^{(N,k)}(\omega) is the \omega-realization of the Nth iterate of the Euler–Maruyama scheme applied to the SODE with the \omega-realization of the kth Wiener process W_t^{(k)}(\omega) and constant time step size \Delta := T/N. The following theorem and its explanations and outline below come from Hutzenthaler & Jentzen [59].
Theorem 4.2. Suppose that a, b, g : \mathbb{R} \to \mathbb{R} are four times continuously differentiable functions with derivatives satisfying
    \big|a^{(n)}(x)\big| + \big|b^{(n)}(x)\big| + \big|g^{(n)}(x)\big| \le L\,\big(1 + |x|^\delta\big) \quad for all x \in \mathbb{R}        (4.15)
for n = 0, 1, \ldots, 4, where L \in (0, \infty) and \delta \in (1, \infty) are fixed constants. Moreover, suppose that the drift coefficient a satisfies the global one-sided Lipschitz condition
    (x - y) \cdot \big(a(x) - a(y)\big) \le L\,(x - y)^2 \quad for all x, y \in \mathbb{R}        (4.16)
and that the diffusion coefficient satisfies the global Lipschitz condition
    |b(x) - b(y)| \le L\,|x - y| \quad for all x, y \in \mathbb{R}.        (4.17)
Then there are F -measurable mappings Cε : → [0, ∞) for each ε ∈ (0, 1) and ˜ ∈ F with P[] ˜ = 1 such that an event 2 N 1 1 (N,k) E g(XT ) − g(YN (ω)) ≤ Cε (ω) · 1−ε (4.18) 2 N N m=1
˜ N ∈ N, and ε ∈ (0, 1), where Xt is the solution of the SODE (4.13) and for every ω ∈ , (N,k) YN the Nth iteration of the Euler–Maruyama scheme applied to the SODE (4.13) (k) with the Wiener process Wt for k = 1, . . . , N 2 . The assumptions of Theorem 4.2 ensure the existence of a solution Xt with continuous sample paths of the SODE (4.13) which satisfies * ) sup |Xt |p < ∞
E
(4.19)
0≤t≤T
$ # for all p ∈ [1, ∞). Hence, the expression E g(XT ) in (4.18) in Theorem 4.2 is well-defined. The proof of Theorem 4.2 can be found in [59]. The main step is to show the uniform boundedness of the restricted moments of the Euler–Maruyama approximations (N) sup E YN 1N < ∞, (4.20) N∈N
where {N }N∈N is a sequence of events whose probabilities converge to 1 sufficiently fast. Why this holds is easier to see in the special case of the SODE dXt = −Xt3 dt + Xt dWt
(4.21)
with the initial value X0 = 1, i.e., with coefficients a(x) = −x 3 and b(x) = x. The corresponding Euler–Maruyama approximation is given by (N)
(N)
Yk+1 = Yn
(N)
for k = 0, 1, . . . , N − 1 with Y0
(N) 3 (N) (N) − Yn + Yk Wk
= 1 and =
T N.
(4.22)
Fix N ∈ N and assume that the Euler–Maruyama iterates Yk (ω) for a given ω do not change sign up to and including the nth one for some n ∈ {1, . . . , N }, which (N) will be held fixed for now. Without loss of generality suppose that Yk (ω) ≥ 0 for x k = 0, 1, . . . , n. Then, by the inequality 1 + x ≤ e for all x ∈ R, it follows that (N) 3 (N) (N) (N) (N) Yk (ω) = Yk−1 (ω) − Yk−1 (ω) + Yk−1 Wk (ω) (N) (N) (N) (N) ≤ Yk−1 (ω) 1 + Wk (ω) ≤ Yk−1 (ω) exp Wk−1 (ω) . (4.23) Iterating this inequality then gives
(N) (N) (N) (N) Yk (ω) ≤ Yk−2 (ω) exp Wk−2 (ω) exp Wk−1 (ω) (N)
≤ Y0 (ω) exp
k−1
(N)
Wj
(ω)
j =0
(N) = exp Wk (ω) =: Dk (ω)
(4.24)
for k = 0, 1, . . . , n. (N) The Euler–Maruyama iterates Yk (ω) for k ∈ {0, 1, . . . , n} are thus bounded (N) above by those of the dominating process Dk , which has uniformly bounded absolute moments. Hence the absolute moments of the Euler–Maruyama approximation can become unbounded only if the iterates of the Euler–Maruyama (N) approximation change sign. Now, if Yk (ω) is very large for some k, then the next (N) iterate Yk+1 (ω) has the opposite sign and is very much larger in absolute value because of the negative cubic drift. The absolute values of the Euler–Maruyama approximation increases more and more through such a sequence of changes in sign. This can be avoided by restricting attention to ω in an event N for which the drift alone cannot change the sign of the Euler–Maruyama approximation. Such a sign change can come only from the diffusion term, but its coefficient grows at most linearly and such changes of sign can be controlled. Between consecutive sign changes, the Euler–Maruyama iterates are again bounded by a dominating process as above. The reader is referred to [59] for more details. Remark 9. A similar idea was used by Katsoulakis, Kossioris & Lakkis [75]. In particular, [75] motivates that one could define quasi-strong convergence with the expectations in the strong convergence criterion (3.4) restricted to events N with P(N ) → 1 sufficiently fast as N → ∞, i.e., (N) (4.25) E 1N XT − YN → 0 as N → ∞. If one would only assume that P(N ) → 1 as N → ∞ without any restriction on the speed of convergence, then quasi-strong convergence in the sense of (4.25) would be equivalent to convergence in probability.
Missing page
Missing page
Chapter 5
Stochastic Partial Differential Equations
The stochastic partial differential equations (SPDEs) considered here are stochastic evolution equations of the parabolic type. The theory of such SPDEs is complicated by different types of solution concepts and function spaces depending on the spatial regularity of the driving noise process. There is an extensive literature containing existence and uniqueness results as well as many other results. For details the reader is referred to the monographs Chow [19], Da Prato & Zabczyk [24, 25, 26], Grecksch & Tudor [39], Hairer [45], Prévot & Röckner [104], Rozovskii [108], and Walsh [123]. The main result of this chapter, Theorem 5.1 in Section 5.4, establishes existence, uniqueness, and (maximal) regularity of solutions of SPDEs with globally Lipschitz continuous coefficients. It is strongly related to the results of Kruse & Larsson [87] and van Neerven, Veraar & Weis [121]. For completeness its proof is given in the appendix.
5.1 Random and stochastic PDEs
As with RODEs and SODEs, one can distinguish between random and stochastic PDEs. For simplicity of exposition, attention will be restricted here to parabolic reaction–diffusion-type equations on a bounded spatial domain D in \mathbb{R}^d with a smooth boundary \partial D and a Dirichlet boundary condition. An example of a random PDE (RPDE) is
    \frac{\partial u}{\partial t} = \Delta u + f(\zeta_t, u), \qquad u\big|_{\partial D} = 0,        (5.1)
for t \ge 0, where \zeta_t, t \ge 0, is a stochastic process (possibly infinite dimensional). This is interpreted and analyzed pathwise as a deterministic PDE.
An example of an Itô SPDE is
    dX_t = [\Delta X_t + f(X_t)]\,dt + b(X_t)\,dW_t, \qquad X_t\big|_{\partial D} = 0,        (5.2)
for t \ge 0, where W_t, t \ge 0, is an infinite dimensional Wiener process of the form
    W_t(x, \omega) = \sum_{j=1}^{\infty} c_j\,W_t^j(\omega)\,\phi_j(x), \qquad t \ge 0,\ x \in D,        (5.3)
with independent scalar Wiener processes W_t^j, j \in \mathbb{N}. Here the family \phi_j, j \in \mathbb{N}, is an orthonormal basis in, e.g., L^2(D, \mathbb{R}). As for SODEs, the theory of SPDEs is a mean-square theory and requires an infinite dimensional version of Itô stochastic calculus.
The Doss–Sussmann theory is not as well developed for SPDEs as for SODEs, but in simple cases an SPDE can be transformed to an RPDE. For example, the SPDE (5.2) with additive noise
    dX_t = [\Delta X_t + f(X_t)]\,dt + dW_t, \qquad X_t\big|_{\partial D} = 0,
is equivalent to the RPDE
    \frac{\partial u}{\partial t} = \Delta u + f(u + O_t) + O_t, \qquad u\big|_{\partial D} = 0,
with u(t) = X_t - O_t, t \ge 0, where O_t, t \ge 0, is the (infinite dimensional) Ornstein–Uhlenbeck stochastic stationary solution of the linear SPDE
    dO_t = [\Delta O_t - O_t]\,dt + dW_t, \qquad O_t\big|_{\partial D} = 0.        (5.4)
As a specific example, the RPDE
    \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} - u - (u + O_t)^3, \qquad u\big|_{\partial D} = 0,        (5.5)
on the domain D = (0, 1) is equivalent to the SPDE with additive noise
    dX_t = \Big[\frac{\partial^2}{\partial x^2} X_t - X_t - X_t^3\Big]dt + dW_t, \qquad X_t\big|_{\partial D} = 0,        (5.6)
on the domain D = (0, 1).
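A hedged numerical illustration of the SPDE (5.6): truncating to finitely many sine modes and using a linear-implicit Euler time step gives a simple spectral Galerkin sketch. The scheme, the truncation level, and all parameters below are illustrative choices for experimentation; they are not the Taylor methods developed later in the text.

```python
import numpy as np

rng = np.random.default_rng(9)
K, M, N, T = 64, 128, 2000, 1.0                        # modes, grid points, time steps, horizon
dt = T / N
x = np.linspace(0.0, 1.0, M + 2)[1:-1]                 # interior grid points
k = np.arange(1, K + 1)
e = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, x))      # e_k(x) = sqrt(2) sin(k pi x)
lam = (np.pi * k)**2 + 1.0                             # eigenvalues of -(d^2/dx^2) + 1

X = np.zeros(K)                                        # sine coefficients of X_0 = 0
for n in range(N):
    u = X @ e                                          # X_n evaluated on the grid
    f = (-u**3) @ e.T / (M + 1)                        # sine coefficients of -X_n^3
    dW = rng.normal(0.0, np.sqrt(dt), K)               # one Wiener increment per sine mode
    X = (X + dt * f + dW) / (1.0 + dt * lam)           # linear-implicit Euler step

X_T = X @ e                                            # approximate values of X_T(x)
```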
5.1.1 Mild solutions of SPDEs
Let A be an (in general) unbounded linear operator (e.g., the Laplace operator with Dirichlet boundary conditions) and (W_t)_{t\ge 0} an infinite dimensional Wiener process. An SPDE of the form
    dX_t = [AX_t + F(X_t)]\,dt + B(X_t)\,dW_t        (5.7)
on a Hilbert space H is not as easily interpreted as a stochastic integral equation as in the finite dimensional case for SODEs. In fact, there are several different interpretations in the literature as to how this should be done. The mild form of an SPDE will be used here since it is better suited for the derivation of Taylor expansions and numerical schemes. The mild form of the SPDE (5.7) is a stochastic integral equation
    X_t = e^{At} X_0 + \int_0^t e^{A(t-s)} F(X_s)\,ds + \int_0^t e^{A(t-s)} B(X_s)\,dW_s \quad a.s.        (5.8)
in H. Here e^{At} : H \to H, t \ge 0, is a semigroup of solution operators of the deterministic ODE (really a PDE)
    \frac{dX}{dt} = AX        (5.9)
on H, i.e., e^{At} = S_t for t \ge 0, where S_t(x_0) = X(t) is the solution of (5.9) with the initial value X(0) = x_0. In the finite dimensional case, H = \mathbb{R}^d, the SPDE is an SODE, A is a d \times d matrix, and e^{At} is a matrix exponential. What exactly (5.8) and the expressions in it are will be explained below.
5.2 Functional analytical preliminaries
The precise mathematical setting for the types of SPDEs considered here is formulated in Section 5.3. This requires some terminology and basic concepts from infinite dimensional stochastic analysis, which will be reviewed briefly here. The reader is referred to Prévot & Röckner [104] for further details.
5.2.1 Hilbert–Schmidt and trace-class operators
Let (H , ·, ·H , ·H ) and (U , ·, ·U , ·U ) be two separable real Hilbert spaces and denote by (L(U , H ), ·L(U ,H ) ) the real Banach space of all bounded linear operators from U to H . A bounded linear operator S ∈ L(U , H ) is called a Hilbert–Schmidt operator if there exists a set I and an orthonormal basis (ui )i∈I ⊂ U of U such that 2 i∈I Sui H < ∞. Let H S(U , H ) ⊂ L(U , H ) denote the set of all Hilbert–Schmidt operators from U to H . The set H S(U , H ) equipped with the norm 1/2 2 SH S(U ,H ) := Sui H (5.10) i∈I
for all Hilbert–Schmidt operators S ∈ H S(U , H ) and all orthonormal bases (ui )i∈I ⊂ U of U and the corresponding scalar product is a separable real Hilbert space. It can be shown that the norm ·H S(U ,H ) is indeed well-defined through (5.10) (see Appendix B in Prévot & Röckner [104]). An operator S ∈ L(H ) = L(H , H ) is called a trace-class operator if there exists a set I, a family of nonnegative real numbers (λi )i∈I ⊂ [0, ∞), and an orthonormal
basis (g_i)_{i\in\mathcal{I}} \subset H of H with
    \sum_{i\in\mathcal{I}} \lambda_i < \infty \qquad and \qquad Sv = \sum_{i\in\mathcal{I}} \lambda_i\,\langle g_i, v\rangle_H\,g_i        (5.11)
for all v \in H. Clearly, a trace-class operator is also a Hilbert–Schmidt operator.
5.2.2 Hilbert space valued random variables
Let (, F , P) be a probability space and let (H , · H , ·, ·H ) be a separable real Hilbert space. Denote by B(E) = σ (E ) the Borel sigma-algebra of a topological space (E, E ). An F /B(H )-measurable mapping Y : → H is called an (H -valued) random variable. A random variable Y : → H is said to be normal distributed if the real valued random variable v, Y H : → R is normal distributed for every v ∈ H . This chapter concentrates on SPDEs on real Hilbert spaces. For SPDEs on real Banach spaces, the reader is referred to van Neerven, Veraar & Weis [121, 122], and to Chapter 7.
5.2.3 Hilbert space valued stochastic processes
Let a, b ∈ [0, ∞) with a < b and let (, F , P) be a probability space with a normal filtration (Ft )t∈[a,b] . In addition, let (H , · H , ·, ·H ) be a separable real Hilbert space. A mapping Y : [a, b] × → H is called an (H -valued) stochastic process if Yt : → H given by Yt (ω) = Y (t, ω) for all ω ∈ is F /B(H )-measurable for every t ∈ [a, b]. A (B([a, b]) ⊗ F )/B(H )-measurable mapping Y : [0, T ] × → H is called a product measurable stochastic process. Moreover, a stochastic process Y : [a, b] × → H is said to be adapted (with respect to the filtration (Ft )t∈[a,b] ) if Yt : → H is Ft /B(H )-measurable for every t ∈ [a, b]. Let P[a,b] denote the predictable sigma-algebra on [a, b] × (with respect to the filtration (Ft )t∈[a,b] ) which is generated by {a} × A for A ∈ Fa and (s, t] × A for A ∈ Fs , a ≤ s < t ≤ b, i.e., P[a,b] = σ {a} × A : A ∈ Fa ∪ (s, t] × A : s, t ∈ [a, b], s ≤ t, A ∈ Fs . (5.12) A P[a,b] /B(H )-measurable mapping Y : [a, b] × → H is called a predictable stochastic process.
5.2.4 Infinite dimensional Wiener processes
Let T ∈ (0, ∞), let (, F , P) be a probability space with a normal filtration (Ft )t∈[0,T ] , and let (U , ·U , ·, ·U ) be a separable real Hilbert space. In addition,
let Q : U → U be a trace-class operator, so there exists a set I, a summable family of nonnegative real numbers (λi )i∈I ⊂ [0, ∞), and an orthonormal basis (ui )i∈I ⊂ U of U such that Qui = λi ui for all i ∈ I. In the next step let W i : [0, T ] × → R, i ∈ I0 := {j ∈ I : λj = 0}, be a family of independent scalar Wiener processes with respect to the filtration (Ft )t∈[0,T ] . Then there exists an adapted stochastic process W : [0, T ] × → U with W0 = 0 and with P-a.s. continuous sample paths which satisfies Wt = λi Wti ui (5.13) i∈I0
P-a.s. for all t ∈ [0, T ] (see Proposition 2.1.10 in Prévot & Röckner [104]). The stochastic process W : [0, T ] × → U is called a standard Q-Wiener process with respect to the filtration (Ft )t∈[0,T ] (see Definition 2.1.12 in [104]) and Q : U → U is called its covariance operator. The stochastic process W : [0, T ] × → U has the properties # $ E Wt = 0, E v, Wt U w, Wt U = v, tQwU (5.14) for all v, w ∈ U and all t ∈ [0, T ]. Further details on standard Q-Wiener processes can be found in Section 2.1 in [104]. In many situations one is interested in an infinite dimensional Wiener process with a covariance operator that is a nonnegative symmetric bounded linear operator but not a trace-class operator (such as the identity operator on an infinite dimensional Hilbert space). This can be achieved by the concept of a cylindrical Wiener process. ˜ : U → U be a nonnegative symmetric bounded linear To illustrate this concept, let Q operator (but not necessarily a trace-class operator), let U1 , ·U1 , ·, ·U1 be a separable real Hilbert space with U ⊂ U1 continuously, and let Q1 : U1 → U1 be a trace 1/2 ˜ 1/2 (U ) and with Q−1/2 (u)U1 = Q ˜ −1/2 (u)U class operator with Q1 (U1 ) = Q 1 1/2 −1/2 1/2 −1/2 ˜ (U ) (here Q ˜ for all u ∈ Q1 (U1 ) = Q and Q denote the pseudoinverses 1 1/2 1/2 of Q1 and Q , respectively; see Appendix C in [104]). A standard Q1 -Wiener process W˜ : [0, T ] × → U1 with respect to the filtration (Ft )t∈[0,T ] is then called ˜ a cylindrical Q-Wiener process with respect to the filtration (Ft )t∈[0,T ] (see Propo˜ sition 2.5.2 in [104]). Thus the cylindrical Q-Wiener process (W˜ )t∈[0,T ] in general does not take values in U but in the larger space U1 with a weaker topology. The ˜ : U → U is called the covariance nonnegative symmetric bounded linear operator Q ˜ ˜ operator of the cylindrical Q-Wiener process (Wt )t∈[0,T ] . More results on cylindrical ˜ Q-Wiener processes can be found in Section 2.5 in [104].
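A truncation of the series (5.13) gives a direct way to simulate a standard Q-Wiener process. In the sketch below, U = L^2((0,1), \mathbb{R}), the orthonormal basis u_j(x) = \sqrt{2}\,\sin(j\pi x), the summable eigenvalues \lambda_j = j^{-2}, the truncation level, and the grids are all illustrative choices (any trace-class covariance operator could be substituted); nothing here is prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(8)
J, K, M, T = 100, 2**8, 200, 1.0             # truncation level, time steps, space points, horizon
dt = T / K
x = np.linspace(0.0, 1.0, M)
lam = 1.0 / np.arange(1, J + 1)**2           # summable eigenvalues -> trace-class Q
u = np.sqrt(2.0) * np.sin(np.outer(np.arange(1, J + 1), np.pi * x))   # basis values, shape (J, M)

# Independent scalar Wiener paths W^j on the time grid, shape (K+1, J).
dW = rng.normal(0.0, np.sqrt(dt), size=(K, J))
Wj = np.vstack([np.zeros(J), np.cumsum(dW, axis=0)])

# Truncated series (5.13): W_t(x) ~ sum_j sqrt(lam_j) * W^j_t * u_j(x), shape (K+1, M).
W = Wj @ (np.sqrt(lam)[:, None] * u)
```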
5.3 Setting and assumptions
The following setting will be used throughout this chapter. Let T ∈ (0, ∞) and let (, F , P) be a probability space with a normal filtration (Ft )t∈[0,T ] . Let (H , ·, ·H , ·H ) and (U , ·, ·U , ·U ) be two separable real
Hilbert spaces. Moreover, let Q : U → U be a nonnegative symmetric bounded linear operator and let (Wt )t∈[0,T ] be a cylindrical Q-Wiener process with respect to the filtration (Ft )t∈[0,T ] . Assumption 1 (linear operator A). Let I be a finite or countable set and let (λi )i∈I ⊂ R be a family of real numbers with inf i∈I λi > −∞. In addition, let (ei )i∈I ⊂ H be an orthonormal basis of H and let A : D(A) ⊂ H → H be a linear operator with Av = −λi ei , vH ei (5.15) i∈I
for all v ∈ D(A) and with D(A) = w ∈ H :
2 2 i∈I |λi | |ei , wH |
− inf i∈I λi . For r ∈ R let Hr := D (η − A)r with norm ·Hr := +(η − A)r (·)+H and corresponding scalar product ·, ·Hr denote the real Hilbert spaces of domains of fractional powers of the linear operator η − A : D(A) ⊂ H → H . (See Chapter 8 for more details on this family of Hilbert spaces.) Assumption 2 (drift term F ). Let α, δ ∈ R be real numbers with δ − α < 1 and let F : Hδ → Hα be a globally Lipschitz continuous mapping. 1
Let (U0 , ·, ·U0 , ·U0 ) denote the separable real Hilbert space U0 := Q 2 (U ) 1
1
with v, wU0 = Q− 2 v, Q− 2 wU for all v, w ∈ U0 (see, for example, Section 2.3.2 in [104]). For an arbitrary bounded linear operator S ∈ L(U ), let S −1 : im(S) ⊂ U → U denote the pseudoinverse of S (see Appendix C in [104]). Assumption 3 (diffusion term B). Let β ∈ R be a real number with δ − β < let B : Hδ → H S(U0 , Hβ ) be a globally Lipschitz continuous mapping.
1 2
and
Assumption 4 (initial value ξ ). Let γ ∈ [δ, min(α + 1, β + 12 )] and p ∈ [2, ∞) be real numbers and let ξ : → Hγ be an F0 /B Hγ -measurable mapping with # $ p E ξ Hγ < ∞.
5.4
Existence, uniqueness, and regularity of solutions of SPDEs
The literature contains many existence and uniqueness theorems for mild solutions of SPDEs. Theorem 5.1 below provides an existence, uniqueness, and regularity result for solutions of SPDEs with globally Lipschitz continuous coefficients in the setting of Section 5.3.
5.4. Existence, uniqueness, and regularity of solutions of SPDEs
59
Some additional notation is needed to formulate this result. For real numbers T ∈ (0, ∞), r ∈ (0, 1) and a normed real vector space (V , · V ), define f C r ([0,T ],V ) := sup f (t)V + t∈[0,T ]
f (t2 ) − f (t1 )V ∈ [0, ∞], |t2 − t1 |r t1 ,t2 ∈[0,T ] sup
(5.16)
t1 =t2
f C([0,T ],V ) := f C 0 ([0,T ],V ) := sup f (t)V ∈ [0, ∞]
(5.17)
t∈[0,T ]
for all mappings f : [0, T ] → V from [0, T ] to V , let C 0 ([0, T ], V ) := C([0, T ], V ) denote the set of all continuous functions from [0, T ] to V , and let (5.18) C r ([0, T ], V ) := f : [0, T ] → V : f C r ([0,T ],V ) < ∞ denote the space of all r-Hölder continuous functions from [0, T ] to V . Theorem 5.1 (existence, uniqueness, and regularity of solutions of SPDEs). Assume that the setting in Section 5.3 is fulfilled. Then there exists a unique (up-tomodifications) stochastic process X : [0, T ] × → Hγ satisfying # predictable p $ supt∈[0,T ] E Xt Hγ < ∞ and Xt = eAt ξ +
t
eA(t−s) F (Xs ) ds +
t
eA(t−s) B(Xs ) dWs
(5.19)
0
0
P-a.s. for all t ∈ [0, T ]. In addition, 1
X ∈ ∩r∈(−∞,γ ] C min(γ −r, 2 ) ([0, T ], Lp (; Hr )). A predictable stochastic process X : [0, T ] × → Hγ satisfying (5.19) with # p $ supt∈[0,T ] E Xt Hγ < ∞ is called a mild solution of the SPDE dXt = AXt + F (Xt ) dt + B(Xt ) dWt ,
X0 = ξ ,
t ∈ [0, T ].
(5.20)
Theorem 5.1 thus proves existence, uniqueness (up to modifications), and regularity of mild solutions of the SPDE (5.20). The proof of Theorem 5.1 is given in the appendix. Similar existence, uniqueness, and regularity results for mild solutions of SPDEs can be found in Kruse & Larsson [87] and van Neerven, Veraar & Weis [121]. In addition to mild solutions of SPDEs (see, e.g., Da Prato & Zabczyk [24, 25]; see also Walsh [123] for another type of mild solution for SPDEs), weak solutions of SPDEs are also frequently studied in the literature (see, e.g., Prévot & Röckner [104] and Rozovskii [108]).
60
5.5
Chapter 5. Stochastic Partial Differential Equations
Examples
The setting in Section 5.3 covers several types of examples, including finite dimensional SODEs and SPDEs with second and fourth order spatial differential operators. SPDEs with time-dependent coefficients are also included.
5.5.1
Finite dimensional SODEs
The abstract setting for SPDEs of evolutionary type in Section 5.3 also covers finite dimensional SODEs. The consequences of Theorem 5.1 are illustrated here for this finite dimensional context. Let T ∈ (0, ∞) and let (, F , P) be a probability space with a normal filtration (Ft )t∈[0,T ] . For d, m ∈ N let H = Rd with v, wH = v, wRd , vH = vRd for all v, w ∈ Rd and let U = Rm with v, wU = v, wRm , vU = vRm for all v, w ∈ Rm . In addition, let Q = I : Rm → Rm be the identity operator on the Rm and let W : [0, T ] × → Rm be an m-dimensional standard (Ft )t∈[0,T ] -Wiener process. Finally, let I = {1, 2, . . . , d}, λ1 = λ2 = · · · = λd = 0 and let e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , ed = (0, . . . , 0, 1) ∈ H . The linear operator A : D(A) ⊂ H → H in Assumption 1 thus satisfies Av = 0 for all v ∈ D(A) = H = Rd . In the next step, let F : Rd → Rd and B : Rd → H S(Rm , Rd ) be two globally Lipschitz continuous functions. In addition, for some real number p ∈ [2, ∞), let p ξ : → Rd be an F0 /B(Rd )-measurable mapping with E[ξ Rd ] < ∞. The setting in Section 5.3 is thus satisfied with η ∈ (0, ∞) arbitrary and α = β = δ = γ = 0. Theorem 5.1 therefore implies the existence of a unique (up-to-modifications) 1 predictable stochastic process X : [0, T ] × → Rd ∈ C 2 ([0, T ], Lp (; Rd )) satisfying the SODE t t Xt = X0 + F (Xs ) ds + B(Xs ) dWs (5.21) 0
0
P-a.s. for all t ∈ [0, T ]. For further existence and uniqueness results for finite dimensional SODEs see Klenke [76] and Theorem 2.9 in Chapter 5 in Karatzas & Shreve [74].
5.5.2
Second order SPDEs
Some SPDEs involving spatial derivatives of second order are formulated here as examples of the setting in Section 5.3. In order to make the presentations as explicit as possible, SPDEs on the simple domain D = (0, 1)d for d ∈ N are described here, although SPDEs on much more complicated domains in Rd also fit into this setting. Let T ∈ (0, ∞) and let (, F , P) be a probability space with a normal filtration (Ft )t∈[0,T ] . Denote by H = U = L2 (D, R) the real Hilbert space of equivalence
5.5. Examples
61
classes of Lebesgue square integrable functions from D ⊂ Rd to R equipped with the norm and the scalar product 1
vH = vU =
D
|v(x)|2 dx
2
,
v, wH = v, wU =
D
v(x) · w(x) dx
(5.22) for all v, w ∈ H = U . As a prominent example of the linear operator in Assumption 1 let ϑ ∈ (0, ∞) be a real number and for I = Nd let the eigenfunctions (ei )∈I ⊂ H and the eigenvalues (λi )∈I ⊂ R of the linear operator −A : D(A) ⊂ H → H be given by d (5.23) ei (x) = 2 2 sin(i1 πx1 ) · . . . · sin(id π xd ), λi = ϑπ 2 |i1 |2 + · · · + |id |2 for all x = (x1 , . . . , xd ) ∈ D = (0, 1)d and all i = (i1 , . . . , id ) ∈ I = Nd . The linear operator A : D(A) ⊂ H → H in Assumption 1 thus reduces to the Laplacian equipped with Dirichlet boundary conditions times the constant ϑ > 0. For r ∈ R denote by Hr := D((−A)r ) the real Hilbert spaces of domains of fractional powers of the linear operator −A : D(A) ⊂ H → H with scalar product ·, ·Hr and norm ·Hr := (−A)r (·)H . In the next step, let c ∈ [0, ∞) be a real number and let f , b : D × R → R be two Borel measurable functions with |f (x, 0)|2 dx < ∞, |b(x, 0)|2 dx < ∞, D
D
and |f (x, y1 ) − f (x, y2 )| + |b(x, y1 ) − b(x, y2 )| ≤ c |y1 − y2 |
(5.24)
for all x ∈ D and all y1 , y2 ∈ R. Condition (5.24) ensures that f and b are uniformly globally Lipschitz continuous in their last variable. Finally, set α = δ = 0 and observe that the (in general nonlinear) mapping F : H → H given by F (v) (x) = f (x, v(x)) (5.25) for all x ∈ D and all v ∈ H satisfies Assumption 2. The operator F is known as a Nemytskii operator in the literature (see, e.g., Runst & Sickel [109]). Assumption 3 will be verified separately for the two cases of space–time white noise and trace–class noise. Space–time white noise Let d = 1, so D = (0, 1), and let Q = I be the identity operator on H = U = L2 ((0, 1), R). Let β ∈ (− 12 , − 14 ) be arbitrary and consider the (in general nonlinear)
62
Chapter 5. Stochastic Partial Differential Equations
mapping B : H → H S(H , Hβ ) given by 1 b(x, v(x)) · u(x) · w(x) dx B(v)u (w) =
(5.26)
0
for all u, v ∈ H and all w ∈ H−β . Then B is well-defined, since it follows from the embedding H−β ⊂ C([0, 1], R) continuously and by the Cauchy–Schwarz inequality that 1 |b(x, v(x)) · u(x) · w(x)| dx 0 1
≤ 0
≤
1
|b(x, v(x)) · u(x)| dx wC([0,1],R) (5.27)
1 2
1
|b(x, v(x))|2 dx
0
1 2
|u(x)|2 dx
0
wC([0,1],R)
= b(·, v(·))H uH wC([0,1],R) < ∞ for all u, v ∈ H and all w ∈ H−β ⊂ C([0, 1], R). Inequality (5.27) implies, in particular, that 1 |b(x, v(x)) · u(x) · w(x)| dx ≤ b(·, v(·))H uC([0,1],R) wH < ∞ (5.28) 0
for all v, w ∈ H and all u ∈ H−β ⊂ C([0, 1], R) and this shows B(v)(H−β ) ⊂ H for all v ∈ H . In addition, observe that w1 , B(v)w2 H = B(v)w1 , w2 H
(5.29)
for all v ∈ H and all w1 , w2 ∈ H−β . Equation (5.29) then implies that + +2 B(v) − B(w)2H S(H ,Hβ ) = +(−A)β (B(v) − B(w))+H S(H ) , - ei , (−A)β (B(v) − B(w)) ej 2 = H i,j ∈I
, - ej , (B(v) − B(w)) (−A)β ei 2
=
H
i,j ∈I
=
(B(v) − B(w)) ei 2H |λi |2β
i∈I
and B(v) − B(w)H S(H ,Hβ ) ≤
√
2
i∈I
1 2
|λi |
1 |b(x, v(x)) − b(x, w(x))| dx
2β
2
D
√ + + ≤ c 2 +(−A)β +H S(H ) v − wH < ∞
2
5.5. Examples
63
+ + for all v, w ∈ H due to inequality (5.24). Note that +(−A)β +H S(H ) < ∞ since β < − 14 . The mapping B : H → H S(H , Hβ ) in (5.26) thus fulfills Assumption 3. Finally, assume that γ = β + 12 in Assumption 4. Then Theorem 5.1 implies that the SPDE 2 ∂ dXt (x) = ϑ X (x) + f (x, X (x)) dt + b(x, Xt (x)) dWt (x) (5.30) t t ∂x 2 with X0 (x) = ξ (x) and Xt (0) = Xt (1) = 0 for x ∈ (0, 1) and t ∈ [0, T ] has a unique (up-to-modifications) mild solution X : [0, T ] × → Hβ+ 1 . Here (Wt )t∈[0,T ] is a 2 cylindrical I -Wiener process on H . For example, the SPDE with additive noise 2 ∂ X (x) + f (x, X (x)) dt + dWt (x) dXt (x) = ϑ t t ∂x 2
(5.31)
with X0 (x) = ξ (x) and Xt (0) = Xt (1) = 0 for x ∈ (0, 1), t ∈ [0, T ], where b(x, y) = 1 for all x ∈ (0, 1), y ∈ R, and the stochastic heat equation with linear multiplicative noise 2 ∂ dXt (x) = ϑ X (x) dt + Xt (x) dWt (x) (5.32) t ∂x 2 with X0 (x) = ξ (x) and Xt (0) = Xt (1) = 0 for x ∈ (0, 1), t ∈ [0, T ], where f (x, y) = 0, b(x, y) = y for all x ∈ (0, 1), y ∈ R, both have unique (up-to-modifications) mild solutions. Remark 10. It was shown by Walsh [123] that mild solutions do not exist for space– time white noise in spatial domains of dimension higher than one. Trace–class noise In this subsection there is no restriction on the dimension d ∈ N of the spatial domain. Let J be a countable set and let gj : D¯ = [0, 1]d → R, j ∈ J, be a family of 2 continuous functions + + which form an orthonormal basis in H = L (D, R) and which + + satisfy supj ∈J gj C(D, ¯ R) < ∞. In addition, let rj j ∈J ⊂ [0, ∞) be a family of nonnegative real numbers with j ∈J rj < ∞ and let Q : H → H be a bounded linear operator given by Qu =
j ∈J
, rj gj , u H gj
for all u ∈ H . Note that Q : H → H is a trace–class operator.
(5.33)
64
1 2
Chapter 5. Stochastic Partial Differential Equations Next let (U0 , ·U0 , ·, ·U0 ) denote the separable real Hilbert space U0 := 1
Q (U ) with uU0 = Q− 2 uH for all v ∈ U0 and consider the mapping B : H → H S(U0 , H ) given by (B(v)u) (x) = b(x, v(x)) · u(x) (5.34) 1 ¯ R) confor all x ∈ D, v ∈ H , and u ∈ U0 . The embedding U0 = Q 2 (U ) ⊂ C(D, tinuously then demonstrates that B in (5.34) is indeed well-defined. Moreover, the inequality 1 + +2 2 1 + + B(v) − B(w) = (B(v) − B(w)) Q 2 gi
H S(U0 ,H )
H
i∈I
+ 1 +2 +Q 2 gi + ¯ ≤ C(D,R) i∈I
1 2
1 |b(x, v(x)) − b(x, w(x))| dx 2
D
+ 1+ ≤ c +Q 2 +H S(H ) sup gj C(D, ¯ R) v − wH < ∞ j ∈J
(5.35) holds for all v, w ∈ H , which shows that B satisfies Assumption 3. Finally, assume γ = 12 in Assumption 4. Then Theorem 5.1 implies that the SPDE ) * ∂2 ∂2 + · · · + 2 Xt (x) + f (x, Xt (x)) dt + b(x, Xt (x)) dWt (x) dXt (x) = ϑ ∂x12 ∂xd (5.36) and with X0 (x) = ξ (x) and Xt |∂D ≡ 0 for x = (x1 , . . . , xd ) ∈ D = t ∈ [0, T ] has a unique (up to modifications) mild solution X : [0, T ] × → H 1 , 2 where W : [0, T ] × → H is a standard Q-Wiener process here. For example, the SPDE with additive noise ) * ∂2 ∂2 + · · · + 2 Xt (x) + f (x, Xt (x)) dt + dWt (x) (5.37) dXt (x) = ϑ ∂x12 ∂xd (0, 1)d
with X0 (x) = ξ (x) and Xt |∂D ≡ 0 for x = (x1 , . . . , xd ) ∈ D = (0, 1)d and t ∈ [0, T ], where b(x, y) = 1 for all x ∈ (0, 1)d , y ∈ R, and the stochastic heat equation with linear multiplicative noise ) * ∂2 ∂2 + · · · + 2 Xt (x) dt + Xt (x) dWt (x) (5.38) dXt (x) = ϑ ∂x12 ∂xd with X0 (x) = ξ (x) and Xt |∂D ≡ 0 for x = (x1 , . . . , xd ) ∈ D = (0, 1)d and t ∈ [0, T ], where f (x, y) = 0, b(x, y) = y for all x ∈ (0, 1)d , y ∈ R, both have unique (up-tomodifications) mild solutions.
2
5.5. Examples
5.5.3
65
Fourth order SPDEs
Some SPDEs involving spatial derivatives of up to fourth order are formulated here as an example of the setting in Section 5.3. For simplicity only SPDEs on the one-dimensional domain (0, 1) are considered. To be more precise, let H = U = L²((0, 1), R) be the real Hilbert space of equivalence classes of Lebesgue square integrable functions from (0, 1) to R equipped with the norm and the scalar product
\[
\|v\|_H = \|v\|_U = \Big(\int_0^1 |v(x)|^2\,dx\Big)^{1/2}, \qquad
\langle v, w\rangle_H = \langle v, w\rangle_U = \int_0^1 v(x)\cdot w(x)\,dx \tag{5.39}
\]
for all v, w ∈ H = U. In addition, let Q = I : H → H be the identity operator on H. For the linear operator A : D(A) ⊂ H → H in Assumption 1, let I = N and let (e_i)_{i∈I} ⊂ H and (λ_i)_{i∈I} ⊂ R be given by
\[
e_i(x) = \sqrt{2}\,\sin(i\pi x), \qquad \lambda_i = \pi^4 i^4 \tag{5.40}
\]
for all x ∈ (0, 1) and all i ∈ N. The linear operator A : D(A) ⊂ H → H in Assumption 1 thus fulfills (Av)(x) = −(∂⁴/∂x⁴)v(x) for all v ∈ D(A) with
\[
D(A) = \big\{ f \in H^4((0,1),\mathbb{R}) : f(0) = f(1) = f''(0) = f''(1) = 0 \big\}. \tag{5.41}
\]
Here H⁴((0, 1), R) = W^{4,2}((0, 1), R) is the Sobolev space of all four times weakly differentiable functions from (0, 1) to R with Lebesgue square integrable derivatives (see, e.g., Section 6.4 in Renardy & Rogers [106]). For r ∈ R denote by H_r := D((−A)^r) the real Hilbert spaces of domains of fractional powers of the linear operator −A : D(A) ⊂ H → H with scalar product ⟨·,·⟩_{H_r} and norm ‖·‖_{H_r} := ‖(−A)^r(·)‖_H. Next let c ∈ [0, ∞) be a real number and let f, b : (0, 1) × R² → R be two Borel measurable functions with
\[
\int_D |f(x,0,0)|^2\,dx < \infty, \qquad \int_D |b(x,0,0)|^2\,dx < \infty
\]
and
\[
|f(x,y_1,z_1) - f(x,y_2,z_2)| + |b(x,y_1,z_1) - b(x,y_2,z_2)| \le c\,\big(|y_1-y_2| + |z_1-z_2|\big) \tag{5.42}
\]
for all x ∈ (0, 1) and all y_1, z_1, y_2, z_2 ∈ R. Then let α = −1/2, let δ = 1/4, and consider the (in general nonlinear) operator F : H_{1/4} → H_{−1/2} given by
\[
\big(F(v)\big)(w) = \int_0^1 f\big(x, v'(x), v(x)\big)\cdot w''(x)\,dx \tag{5.43}
\]
for all v ∈ H_{1/4} and all w ∈ H_{1/2}. Observe that
\[
\big(F(v)\big)(w) = \Big\langle f\big(\cdot, v'(\cdot), v(\cdot)\big),\, -(-A)^{1/2} w \Big\rangle_H
= \Big( -(-A)^{1/2} f\big(\cdot, v'(\cdot), v(\cdot)\big) \Big)(w)
\]
for all v ∈ H_{1/4} and all w ∈ H_{1/2}. It thus follows that F(v) = −(−A)^{1/2} f(·, v'(·), v(·)) ∈ H_{−1/2} for all v ∈ H_{1/4}. This and inequality (5.42) then imply that
\[
\begin{aligned}
\|F(v) - F(w)\|_{H_{-1/2}} &= \Big(\int_0^1 \big|f\big(x, v'(x), v(x)\big) - f\big(x, w'(x), w(x)\big)\big|^2\,dx\Big)^{1/2} \\
&\le c\,\big(\|v' - w'\|_H + \|v - w\|_H\big)
= c\,\big(\|v - w\|_{H_{1/4}} + \big\|(-A)^{-1/4}(v - w)\big\|_{H_{1/4}}\big) \\
&\le c\,\big(1 + \|(-A)^{-1/4}\|_{L(H)}\big)\,\|v - w\|_{H_{1/4}} < \infty
\end{aligned}
\]
for all v, w ∈ H_{1/4}. This demonstrates that F given by (5.43) indeed satisfies Assumption 2 with α = −1/2 and δ = 1/4. In the next step, let β ∈ (−1/4, −1/8) be arbitrary and consider the (in general nonlinear) mapping B : H_δ → HS(H, H_β) given by
\[
\big(B(v)u\big)(w) = \int_D b\big(x, v'(x), v(x)\big)\cdot u(x)\cdot w(x)\,dx \tag{5.44}
\]
for all u ∈ H, v ∈ H_δ and all w ∈ H_{−β}. Then B is well-defined since, by the continuous embedding H_{−β} ⊂ C([0, 1], R) and the Cauchy–Schwarz inequality,
\[
\begin{aligned}
\int_0^1 \big|b\big(x, v'(x), v(x)\big)\cdot u(x)\cdot w(x)\big|\,dx
&\le \Big(\int_0^1 \big|b\big(x, v'(x), v(x)\big)\cdot u(x)\big|\,dx\Big)\,\|w\|_{C([0,1],\mathbb{R})} \\
&\le \Big(\int_0^1 \big|b\big(x, v'(x), v(x)\big)\big|^2\,dx\Big)^{1/2}\Big(\int_0^1 |u(x)|^2\,dx\Big)^{1/2}\,\|w\|_{C([0,1],\mathbb{R})} \\
&= \big\|b\big(\cdot, v'(\cdot), v(\cdot)\big)\big\|_H\,\|u\|_H\,\|w\|_{C([0,1],\mathbb{R})} < \infty
\end{aligned} \tag{5.45}
\]
for all u ∈ H, v ∈ H_δ and all w ∈ H_{−β} ⊂ C([0, 1], R). As in the space–time white noise case in Section 5.5.2, observe that
\[
\|B(v) - B(w)\|_{HS(H,H_\beta)}^2
= \sum_{i,j\in I} \big\langle e_i,\, (-A)^{\beta}\big(B(v) - B(w)\big)e_j \big\rangle_H^2
= \sum_{i,j\in I} \big\langle e_j,\, \big(B(v) - B(w)\big)(-A)^{\beta} e_i \big\rangle_H^2
= \sum_{i\in I} \big\|\big(B(v) - B(w)\big)e_i\big\|_H^2\, |\lambda_i|^{2\beta}
\]
for all v, w ∈ H_{1/4}. Therefore
\[
\begin{aligned}
\|B(v) - B(w)\|_{HS(H,H_\beta)}
&\le \sqrt{2}\,\Big(\sum_{i\in I} |\lambda_i|^{2\beta}\Big)^{1/2}\Big(\int_D \big|b\big(x, v'(x), v(x)\big) - b\big(x, w'(x), w(x)\big)\big|^2\,dx\Big)^{1/2} \\
&\le \sqrt{2}\,c\,\|(-A)^{\beta}\|_{HS(H)}\big(1 + \|(-A)^{-1/4}\|_{L(H)}\big)\,\|v - w\|_{H_{1/4}} < \infty
\end{aligned}
\]
for all v, w ∈ H_{1/4} due to inequality (5.42). Note that ‖(−A)^β‖_{HS(H)} < ∞ since β < −1/8. The mapping B : H_δ → HS(H, H_β) in (5.44) thus fulfills Assumption 3. Finally, assume γ = β + 1/2 in Assumption 4. Then Theorem 5.1 implies that the Cahn–Hilliard–Cook-type SPDE
\[
dX_t(x) = \Big[ -\frac{\partial^4}{\partial x^4} X_t(x) + \frac{\partial^2}{\partial x^2}\, f\Big(x, \tfrac{\partial}{\partial x}X_t(x), X_t(x)\Big) \Big]\,dt + b\Big(x, \tfrac{\partial}{\partial x}X_t(x), X_t(x)\Big)\,dW_t(x) \tag{5.46}
\]
with X_0(x) = ξ(x) and X_t(0) = X_t(1) = X_t''(0) = X_t''(1) = 0 for x ∈ (0, 1) and t ∈ [0, T] has a unique (up-to-modifications) mild solution X : [0, T] × Ω → H_{β+1/2}. Here (W_t)_{t∈[0,T]} is a cylindrical I-Wiener process on H. For more results on this type of fourth order SPDE see Blömker [8].
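To make the fourth order setting more concrete, the following Python sketch simulates the special case of (5.46) with f ≡ 0 and b ≡ 1 (a linear fourth order equation with additive noise) by truncating the spectral expansion X_t = Σ_i ⟨e_i, X_t⟩e_i; each Fourier mode is then a scalar Ornstein–Uhlenbeck process with rate λ_i = π⁴i⁴ and can be advanced exactly over a time step. The truncation level, step size, zero initial value, and unit noise coefficients are illustrative assumptions only.

```python
import numpy as np

def simulate_linear_fourth_order(N=64, M=200, T=1.0, seed=0):
    """Spectral simulation of dX_t = -(d^4/dx^4) X_t dt + dW_t on (0,1), X_0 = 0,
    using the eigenpairs e_i(x) = sqrt(2) sin(i*pi*x), lambda_i = pi^4 i^4.
    Each mode is an exact Ornstein-Uhlenbeck update over one time step."""
    rng = np.random.default_rng(seed)
    lam = np.pi ** 4 * np.arange(1, N + 1) ** 4
    dt = T / M
    decay = np.exp(-lam * dt)
    # Exact standard deviation of the stochastic convolution over one step.
    noise_std = np.sqrt((1.0 - np.exp(-2.0 * lam * dt)) / (2.0 * lam))
    y = np.zeros(N)                      # Fourier coefficients <e_i, X_t>
    for _ in range(M):
        y = decay * y + noise_std * rng.standard_normal(N)
    return y

def evaluate(y, x):
    """Evaluate the truncated expansion sum_i y_i * sqrt(2) sin(i*pi*x)."""
    i = np.arange(1, len(y) + 1)
    return np.sqrt(2.0) * np.sin(np.pi * np.outer(x, i)) @ y

coeffs = simulate_linear_fourth_order()
print(evaluate(coeffs, np.linspace(0, 1, 5)))
```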
5.5.4
SPDEs with time-dependent coefficients
The setting in Section 5.3 also applies to SPDEs with time-dependent coefficients. Let T ∈ (0, ∞), let (Ω, F, P) be a probability space with a normal filtration (F_t)_{t∈[0,T]}, and let (H̃, ⟨·,·⟩_{H̃}, ‖·‖_{H̃}) and (U, ⟨·,·⟩_U, ‖·‖_U) be two separable real Hilbert spaces. Then let Q : U → U be a nonnegative symmetric bounded linear operator and let (W_t)_{t∈[0,T]} be a cylindrical Q-Wiener process with respect to (F_t)_{t∈[0,T]}. In addition, let Ĩ be a finite or countable set, let (λ_i)_{i∈Ĩ} ⊂ R be a family of real numbers with inf_{i∈Ĩ} λ_i > −∞, and let (ẽ_i)_{i∈Ĩ} ⊂ H̃ be an orthonormal basis of H̃. Then let Ã : D(Ã) ⊂ H̃ → H̃ be a linear operator with
\[
\tilde A v = \sum_{i\in \tilde I} -\lambda_i\,\langle \tilde e_i, v\rangle_{\tilde H}\, \tilde e_i \tag{5.47}
\]
for all v ∈ D(Ã) with D(Ã) = { v ∈ H̃ : Σ_{i∈Ĩ} |λ_i|² |⟨ẽ_i, v⟩_{H̃}|² < ∞ }. Moreover, let η̃ ∈ [0, ∞) be a real number with η̃ > −inf_{i∈Ĩ} λ_i and for r ∈ R denote by H̃_r := D((η̃ − Ã)^r) the real Hilbert spaces of domains of fractional powers
of the linear operator η̃ − Ã : D(Ã) ⊂ H̃ → H̃ with norm ‖·‖_{H̃_r} := ‖(η̃ − Ã)^r(·)‖_{H̃} and corresponding scalar product ⟨·,·⟩_{H̃_r}. In the next step let α, δ ∈ R be real numbers with δ − α < 1 and let F̃ : R × H̃_δ → H̃_α be globally Lipschitz continuous. In addition, denote by (U_0, ⟨·,·⟩_{U_0}, ‖·‖_{U_0}) the separable real Hilbert space U_0 := Q^{1/2}(U) with ⟨v, w⟩_{U_0} = ⟨Q^{−1/2}v, Q^{−1/2}w⟩_U for all v, w ∈ U_0. Let β ∈ R be a real number with δ − β < 1/2 and let B̃ : R × H̃_δ → HS(U_0, H̃_β) be globally Lipschitz continuous. Finally, let γ ∈ [δ, min(α + 1, β + 1/2)] and p ∈ [2, ∞) be numbers and let ξ̃ : Ω → H̃_γ be an F_0/B(H̃_γ)-measurable mapping with E[‖ξ̃‖^p_{H̃_γ}] < ∞.
Now consider the separable real Hilbert space (H := R × H̃, ⟨·,·⟩_H, ‖·‖_H) with ‖(t, v)‖_H = (|t|² + ‖v‖²_{H̃})^{1/2} for all t ∈ R and all v ∈ H̃. Moreover, define I := {Ĩ} ∪ Ĩ and λ_{Ĩ} := 0, and consider the family (e_i)_{i∈I} ⊂ H given by e_{Ĩ} = (1, 0) and e_i = (0, ẽ_i) for all i ∈ Ĩ. Note that the family (e_i)_{i∈I} ⊂ H is an orthonormal basis of H. The linear operator in Assumption 1 thus fulfills A(t, v) = (0, Ãv) for all t ∈ R and all v ∈ H̃. Furthermore, let η ∈ [η̃, ∞) ∩ (0, ∞) and for r ∈ R denote by H_r := D((η − A)^r) the real Hilbert spaces of domains of fractional powers of the linear operator η − A : D(A) ⊂ H → H with scalar product ⟨·,·⟩_{H_r} and norm ‖·‖_{H_r} := ‖(η − A)^r(·)‖_H. Next note that F : H_δ → H_α given by F(t, v) = (1, F̃(t, v)) for all t ∈ R, v ∈ H̃_δ satisfies Assumption 2, while B : H_δ → HS(U_0, H_β) given by B(t, v)u = (0, B̃(t, v)u) for all t ∈ R, v ∈ H̃_δ, u ∈ U_0 satisfies Assumption 3. Finally, let ξ : Ω → H_γ be given by ξ(ω) = (0, ξ̃(ω)) for all ω ∈ Ω. Clearly, ξ satisfies Assumption 4. The setting in Section 5.3 is thus fulfilled. Theorem 5.1 therefore implies the existence of a unique (up-to-modifications) predictable stochastic process X : [0, T] × Ω → H_γ in ∩_{r∈(−∞,γ]} C^{min(γ−r,1/2)}([0, T], L^p(Ω; H_r)), which satisfies (5.19). Define X̃ : [0, T] × Ω → H̃_γ by X̃_t(ω) = Σ_{i∈Ĩ} ⟨e_i, X_t(ω)⟩_H ẽ_i for all t ∈ [0, T] and all ω ∈ Ω. Note that X̃ : [0, T] × Ω → H̃_γ is a predictable stochastic process in ∩_{r∈(−∞,γ]} C^{min(γ−r,1/2)}([0, T], L^p(Ω; H̃_r)) satisfying
\[
\tilde X_t = e^{\tilde A t}\tilde\xi + \int_0^t e^{\tilde A(t-s)}\,\tilde F(s, \tilde X_s)\,ds + \int_0^t e^{\tilde A(t-s)}\,\tilde B(s, \tilde X_s)\,dW_s \tag{5.48}
\]
P-a.s. for all t ∈ [0, T].
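As a side remark, the augmentation used above, prepending the deterministic component t with derivative 1, is the same device often used in code to feed a non-autonomous drift to an autonomous solver. The following Python sketch is a toy for a scalar SODE, not the Hilbert-space setting of this section; the coefficients f, b and all numerical parameters are assumptions chosen only to illustrate the construction.

```python
import numpy as np

def euler_maruyama_autonomous(drift, diffusion, y0, T, M, rng):
    """Euler-Maruyama for an autonomous SODE dY = drift(Y) dt + diffusion(Y) dW."""
    dt = T / M
    y = np.array(y0, dtype=float)
    for _ in range(M):
        dW = np.sqrt(dt) * rng.standard_normal()
        y = y + drift(y) * dt + diffusion(y) * dW
    return y

# Non-autonomous coefficients f(t, x), b(t, x) for dX = f(t, X) dt + b(t, X) dW.
f = lambda t, x: np.sin(t) - x
b = lambda t, x: 0.1

# Augment the state as Y = (t, X): dY = (1, f(t, X)) dt + (0, b(t, X)) dW.
drift = lambda y: np.array([1.0, f(y[0], y[1])])
diffusion = lambda y: np.array([0.0, b(y[0], y[1])])

rng = np.random.default_rng(1)
print(euler_maruyama_autonomous(drift, diffusion, y0=(0.0, 0.0), T=2.0, M=400, rng=rng))
```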
Chapter 6
Numerical Methods for SPDEs
The numerical approximation of SPDEs, specifically stochastic evolution equations of the parabolic or hyperbolic type, encounters all the difficulties that arise in the numerical solution of both deterministic PDEs and finite dimensional SODEs as well as many more due to the infinite dimensional nature of the driving noise processes. The state of development of numerical schemes for SPDEs compares with that for SODEs in the early 1970s. There is now a large number of papers on the numerical approximation of SPDEs. An extensive list can be found in the review paper [68]. However, most of the numerical schemes that have been proposed to date have a low order of convergence, especially in terms of an overall computational effort, and only recently has it been shown how to construct higher order schemes. The breakthrough for SODEs started with the Milstein scheme and continued with the systematic derivation of stochastic Taylor expansions and the numerical schemes based on them. These stochastic Taylor schemes are based on an iterated application of the Itô formula. The crucial point is that the multiple stochastic integrals which they contain provide more information about the noise processes within discretization subintervals, and this allows an approximation of higher order to be obtained. This theory was briefly recalled in Chapter 3. There is, however, no general Itô formula for the solutions of stochastic PDEs in Hilbert spaces or Banach spaces (see Chapter 7 for more details). Nevertheless, it has recently been shown that Taylor expansions for the solutions of such equations can be constructed by taking advantage of the mild-form representation of the solutions. These new results will be presented in the subsequent chapters. This chapter is an extract from [68].
6.1 An early result

To fix ideas consider a parabolic SPDE with a Dirichlet boundary condition on a bounded domain D in R^d of the form
\[
dX_t = [AX_t + f(X_t)]\,dt + g(X_t)\,dW_t \tag{6.1}
\]
and suppose for now that the coefficients satisfy the assumptions of the previous chapter. In addition, assume that the eigenvalues λ_j and the corresponding eigenfunctions φ_j ∈ H_0^{1,2}(D) of the operator −A, i.e., with
\[
-A\varphi_j = \lambda_j \varphi_j, \qquad j = 1, 2, \ldots,
\]
form an orthonormal basis in L²(D) with λ_j → ∞ as j → ∞. Assume also that W_t is a standard scalar Wiener process. Projecting the SPDE (6.1) onto the N-dimensional subspace H_N of L²(D) spanned by {φ_1, ..., φ_N} gives an N-dimensional Itô–Galerkin SODE in R^N of the form
\[
dX_t^{(N)} = \big[A_N X_t^{(N)} + f_N(X_t^{(N)})\big]\,dt + g_N(X_t^{(N)})\,dW_t, \tag{6.2}
\]
where X^{(N)} is written synonymously for Σ_{j=1}^N X^{N,j} φ_j ∈ H_N or (X^{N,1}, ..., X^{N,N}) ∈ R^N according to the context. Moreover, f_N = P_N f|_{H_N} and g_N = P_N g|_{H_N}, where f and g are now interpreted as mappings of L²(D) or H_0^{1,2}(D) into itself, and P_N is the projection of L²(D) or H_0^{1,2}(D) onto H_N, while A_N = P_N A|_{H_N} is the diagonal matrix diag(−λ_1, ..., −λ_N). Grecksch & Kloeden [38] showed in 1996 that the combined truncation and global discretization error for a strong order γ stochastic Taylor scheme applied to (6.2) with constant time step Δ has the form
\[
\max_{k=0,1,\ldots,N_T} \mathbb{E}\big\| X_{k\Delta} - Y_k^{(N,\Delta)} \big\|_{L^2(D)} \le K_T \Big( \lambda_{N+1}^{-1/2} + \lambda_N^{\gamma+\frac{1}{2}}\,\Delta^{\gamma} \Big), \tag{6.3}
\]
where ⌊x⌋ denotes the integer part of the real number x and the constant K_T depends on the initial value and bounds on the coefficient functions f and g (and their derivatives) of the SPDE (6.1) as well as on the length of the time interval [0, T] under consideration. Here and below Y_k is written Y_k^{(N,Δ)} to emphasize the dependence on the time step size Δ. Since λ_j → ∞ as j → ∞, a very small time step is needed in high dimensions for convergence, i.e., the Itô–Galerkin SODE (6.2) is stiff and explicit schemes such as strong stochastic Taylor schemes are not really appropriate. Obviously, an implicit scheme should be used here, but the special structure of the SODE (6.2) suggests a simpler linear-implicit scheme, since it is the matrix A_N in the linear part of the
drift coefficient that causes the troublesome growth with respect to the eigenvalues, so only this part of the drift coefficient needs to be made implicit. For example, the linear-implicit Euler scheme for the SODE (6.2) is
\[
Y_{k+1}^{(N)} = Y_k^{(N)} + \big[ A_N Y_{k+1}^{(N)} + f_N(Y_k^{(N)}) \big]\,\Delta + g_N(Y_k^{(N)})\,\Delta W_k, \tag{6.4}
\]
which is easily solved for Y_{k+1}^{(N)} because its matrix I_N − ΔA_N is diagonal. Kloeden & Shott [84] showed that for a linear-implicit strong order γ stochastic Taylor scheme the combined error has the form
\[
\max_{k=0,1,\ldots,N_T} \mathbb{E}\big\| X_{k\Delta} - Y_k^{(N,\Delta)} \big\|_{L^2(D)} \le K_T \big( \lambda_{N+1}^{-1/2} + \Delta^{\gamma} \big). \tag{6.5}
\]
The time step can thus be chosen independently of the dimension N of the Itô– Galerkin SODE (6.2). The above results are of limited use because Wt is only one-dimensional and the proofs of the convergence of Taylor schemes for SODEs in the monographs [82, 95] assume that partial derivatives of the coefficient functions of the Galerkin SODE are uniformly bounded on RN .
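For concreteness, the following Python sketch implements the linear-implicit Euler scheme (6.4) for a spectral Galerkin approximation of (6.1) on (0, 1). It assumes the Dirichlet Laplacian (λ_j = π²j²), a single scalar Wiener process as in this section, and a simple collocation-type projection of the Nemytskii nonlinearities; the particular f, g, and initial value are illustrative only and not taken from the text.

```python
import numpy as np

def linear_implicit_euler(f, g, xi, N=32, M=1000, T=1.0, seed=0):
    """Linear-implicit Euler scheme (6.4) for the Ito-Galerkin SODE obtained by
    projecting dX = [AX + f(X)] dt + g(X) dW (A = Dirichlet Laplacian on (0,1),
    scalar Wiener process W) onto the first N sine modes."""
    rng = np.random.default_rng(seed)
    dt = T / M
    lam = np.pi ** 2 * np.arange(1, N + 1) ** 2          # eigenvalues of -A
    x = np.arange(1, N + 1) / (N + 1)                    # interior grid points
    E = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, N + 1), x))  # e_j(x_m)
    w = 1.0 / (N + 1)                                    # quadrature weight
    y = w * E @ xi(x)                                    # spectral coefficients of X_0
    solve = 1.0 / (1.0 + dt * lam)                       # (I_N - dt*A_N)^{-1}, diagonal
    for _ in range(M):
        u = E.T @ y                                      # function values on the grid
        fy = w * E @ f(u)                                # f_N(Y_k): projected Nemytskii f
        gy = w * E @ g(u)                                # g_N(Y_k)
        dW = np.sqrt(dt) * rng.standard_normal()         # scalar Wiener increment
        y = solve * (y + dt * fy + gy * dW)
    return x, E.T @ y                                    # grid and approximate X_T

x, XT = linear_implicit_euler(f=lambda u: u - u ** 3, g=lambda u: 0.5 * u,
                              xi=lambda x: np.sin(np.pi * x))
```

Only the diagonal linear part is treated implicitly, so each step costs a componentwise division rather than a linear solve.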
6.2
Other results
Much of the literature is concerned with a semilinear stochastic heat equation with additive space–time white noise Ẇ_t on the one-dimensional domain D = (0, 1) over the time interval [0, T], i.e.,
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + f(u) + \dot W_t \tag{6.6}
\]
with the Dirichlet boundary condition. These papers also use an overall convergence rate which combines the two components of the error bound due, respectively, to the temporal and spatial discretization. The overall convergence rate is usually expressed in terms of the computational cost of the scheme. For one-dimensional domains this is defined by K = N · M, where N arithmetical operations, random number, and function evaluations per time step are needed to calculate the next iterate Y_k^{(N,M)} of the scheme (this is related to the dimension in a Galerkin approximation), for M iterations with a constant time step Δ = T/M. (Here Y_k is now written Y_k^{(N,M)} to emphasize the dependence on the number of time steps M.) If the scheme has error bound
\[
\sup_{k=0,\ldots,M} \Big( \mathbb{E}\big\| X_{t_k} - Y_k^{(N,M)} \big\|_{L^2(D)}^2 \Big)^{1/2} \le K_T \Big( \frac{1}{N^{\alpha}} + \frac{1}{M^{\beta}} \Big) \tag{6.7}
\]
for α, β > 0, then the optimal overall rate with respect to the computational cost is αβ/(α+β), i.e.,
\[
\max_{k=0,\ldots,M} \Big( \mathbb{E}\big\| X_{t_k} - Y_k^{(N,M)} \big\|_{L^2(D)}^2 \Big)^{1/2} \le K_T \cdot K^{-\frac{\alpha\beta}{\alpha+\beta}}
\]
for N = K^{β/(α+β)} and M = K^{α/(α+β)}. For example, if α = 1/2 and β = 1, then the overall rate is 1/3.
The following papers are a representative selection of many others in the literature dealing with the SPDE (6.6). Gyöngy & Nualart [44] introduced an implicit numerical scheme for this SPDE in 1995 and showed that it converges strongly to the exact solution without giving a rate. In 1998 and 1999 Gyöngy [42] also applied finite differences to an SPDE driven by space–time white noise and then used several temporal implicit and explicit schemes, in particular, the linear-implicit Euler scheme. He showed that these schemes converge with order 1/2 in space and with order 1/4 in time (assuming a smooth initial value) and, hence, he obtained an overall convergence rate of 1/6 with respect to the computational cost in space and time. In 1999 Shardlow [111] applied finite differences to the SPDE (6.6) to obtain a spatial discretization, which he then discretized in time with a θ-method. This had an overall convergence rate of 1/6 with respect to the computational cost. In a seminal paper published in 2001, Davie & Gaines [27] showed that any numerical scheme applied to the SPDE (6.6) with f = 0 which uses only values of the noise W_t cannot converge faster than the rate 1/6 with respect to the computational cost. Müller-Gronbach & Ritter [99] showed in 2007 that this is also a lower bound for the convergence rate. They even showed that one cannot improve this rate of convergence by choosing nonuniform time steps; see [100, 101]. Higher rates were obtained for smoother types of noise. For example, in 2003 Hausenblas [51] applied the linear-implicit and explicit Euler schemes and the Crank–Nicolson scheme to an SPDE of the form (6.6). For trace–class noise, she obtained the order 1/4 with respect to the computational cost, but in the general case of space–time white noise the convergence rate was no better than the Davie–Gaines barrier rate 1/6. Similarly, in 2004 Lord & Rougemont [90] discretized in time the Galerkin SODE obtained from the SPDE (6.6) with the numerical scheme
\[
Y_{k+1}^{(N,M)} = e^{A_N h}\big( Y_k^{(N,M)} + h\, f_N(Y_k^{(N,M)}) + \Delta W_k^N \big),
\]
(6.8)
which they showed to be useful when the noise is very smooth in space, in particular with Gevrey regularity. However, in the general case of space–time white noise the scheme (6.8) converges at the Davie–Gaines barrier rate 1/6.
6.3 The exponential Euler scheme

Davie & Gaines [27, page 129] remarked that it may be possible to improve the convergence rate by using suitable linear functionals of the noise. This suggestion was used by Jentzen & Kloeden [67], who considered a parabolic SPDE with additive noise
\[
dX_t = [AX_t + F(X_t)]\,dt + dW_t, \qquad X_0 = x_0, \tag{6.9}
\]
in a Hilbert space (H, |·|) with inner product ⟨·,·⟩, where A is an (in general) unbounded operator (for example, A = Δ), F : H → H is a nonlinear continuous function, and W_t is a cylindrical Wiener process. They interpreted this in the mild sense,
\[
X_t = e^{At}x_0 + \int_0^t e^{A(t-s)}F(X_s)\,ds + \int_0^t e^{A(t-s)}\,dW_s, \tag{6.10}
\]
and used the fact that the solution of the N-dimensional Itô–Galerkin SODE in the space H_N := P_N H (or, equivalently, in R^N)
\[
dX_t^N = \big[A_N X_t^N + F_N(X_t^N)\big]\,dt + dW_t^N \tag{6.11}
\]
has an analogous "mild" representation
\[
X_t^N = e^{A_N t}u_0^N + \int_0^t e^{A_N(t-s)}F_N(X_s^N)\,ds + \int_0^t e^{A_N(t-s)}\,dW_s^N. \tag{6.12}
\]
This motivated what they called the exponential Euler scheme
\[
Y_{k+1}^{(N,M)} = e^{A_N \Delta}\, Y_k^{(N,M)} + A_N^{-1}\big( e^{A_N \Delta} - I \big)\, f_N(Y_k^{(N,M)}) + \int_{t_k}^{t_{k+1}} e^{A_N(t_{k+1}-s)}\,dW_s^N \tag{6.13}
\]
with time step Δ = T/M for some M ∈ N and discretization times t_k = kΔ for k = 0, 1, ..., M.

The exponential Euler scheme is, in fact, easier to simulate than may seem at first sight. Denoting the components of Y_k^{(N,M)} and F_N by
\[
Y_{k,i}^{(N,M)} = \big\langle e_i, Y_k^{(N,M)} \big\rangle, \qquad F_N^i = \langle e_i, F_N \rangle, \qquad i = 1, \ldots, N,
\]
the numerical scheme (6.13) can be rewritten as
\[
\begin{aligned}
Y_{k+1,1}^{(N,M)} &= e^{-\lambda_1 \Delta}\, Y_{k,1}^{(N,M)} + \frac{(1 - e^{-\lambda_1 \Delta})}{\lambda_1}\, F_N^1(Y_k^{(N,M)}) + \sqrt{\frac{q_1}{2\lambda_1}\,\big(1 - e^{-2\lambda_1 \Delta}\big)}\; R_k^1, \\
&\;\;\vdots \\
Y_{k+1,N}^{(N,M)} &= e^{-\lambda_N \Delta}\, Y_{k,N}^{(N,M)} + \frac{(1 - e^{-\lambda_N \Delta})}{\lambda_N}\, F_N^N(Y_k^{(N,M)}) + \sqrt{\frac{q_N}{2\lambda_N}\,\big(1 - e^{-2\lambda_N \Delta}\big)}\; R_k^N,
\end{aligned}
\]
where the R_k^i for i = 1, ..., N and k = 0, 1, ..., M − 1 are independent, standard normally distributed random variables.
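The componentwise form above translates directly into a few lines of code. The following Python sketch assumes the Dirichlet Laplacian on (0, 1) (λ_i = π²i², e_i = √2 sin(iπx)), space–time white noise (q_i = 1), and a collocation-type projection of the Nemytskii nonlinearity; these choices and the sample problem f(u) = u/2 are assumptions made here for illustration rather than a verbatim implementation from the text.

```python
import numpy as np

def exponential_euler(f, xi, N=64, M=100, T=1.0, q=None, seed=0):
    """Exponential Euler scheme (6.13), componentwise form, for
    dX = [AX + F(X)] dt + dW on (0,1) with A the Dirichlet Laplacian and
    F the Nemytskii operator of f (approximated by sine-grid collocation)."""
    rng = np.random.default_rng(seed)
    dt = T / M
    i = np.arange(1, N + 1)
    lam = np.pi ** 2 * i ** 2
    q = np.ones(N) if q is None else q                 # q_i = 1: space-time white noise
    x = i / (N + 1)
    E = np.sqrt(2.0) * np.sin(np.pi * np.outer(i, x))  # e_i(x_m)
    w = 1.0 / (N + 1)
    y = w * E @ xi(x)                                  # components <e_i, X_0>
    decay = np.exp(-lam * dt)
    drift_w = (1.0 - decay) / lam                      # (1 - e^{-lambda_i dt}) / lambda_i
    noise_std = np.sqrt(q * (1.0 - np.exp(-2.0 * lam * dt)) / (2.0 * lam))
    for _ in range(M):
        Fy = w * E @ f(E.T @ y)                        # components F_N^i(Y_k)
        y = decay * y + drift_w * Fy + noise_std * rng.standard_normal(N)
    return x, E.T @ y

x, XT = exponential_euler(f=lambda u: 0.5 * u, xi=lambda x: np.sin(np.pi * x))
```

Note that the stochastic convolution over each subinterval is sampled exactly, which is precisely the extra information about the noise that the scheme exploits.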
6.3.1
Convergence
The convergence of the exponential Euler scheme will be proved under the following assumptions, which the reader should compare with Assumptions 1–4 of Theorem 5.1 in Chapter 5.

Assumption 5 (linear operator A). There exist sequences of real eigenvalues 0 < λ_1 ≤ λ_2 ≤ ··· and orthonormal eigenfunctions (e_n)_{n≥1} of −A such that the linear operator A : D(A) ⊂ H → H is given by
\[
Av = \sum_{n=1}^{\infty} -\lambda_n\,\langle e_n, v\rangle\, e_n
\]
for all v ∈ D(A) with D(A) = { v ∈ H : Σ_{n=1}^∞ |λ_n|² |⟨e_n, v⟩|² < ∞ }.

Assumption 6 (nonlinearity F). The nonlinearity F : H → H is two times continuously Fréchet differentiable and its derivatives satisfy
\[
\big\|F'(x) - F'(y)\big\|_{L(H)} \le L\,|x - y|_H, \qquad \big|(-A)^{-r} F'(x)\,(-A)^{r} v\big|_H \le L\,|v|_H
\]
for all x, y ∈ H, v ∈ D((−A)^r), and r = 0, 1/2, 1, and
\[
\big| A^{-1} F''(x)(v, w) \big|_H \le L\, \big|(-A)^{-1/2} v\big|_H\, \big|(-A)^{-1/2} w\big|_H
\]
for all v, w, x ∈ H, where L > 0 is a positive constant.

Assumption 7 (cylindrical Q-Wiener process W_t). There exist a sequence (q_n)_{n≥1} of positive real numbers and a real number γ ∈ (0, 1) such that
\[
\sum_{n=1}^{\infty} \lambda_n^{2\gamma - 1}\, q_n < \infty
\]
and pairwise independent scalar F_t-adapted Wiener processes (W_t^n)_{t≥0} for n ≥ 1. The cylindrical Q-Wiener process W_t is given formally by
\[
W_t = \sum_{n=1}^{\infty} \sqrt{q_n}\, W_t^n\, e_n. \tag{6.14}
\]

Assumption 8 (initial value). The random variable x_0 : Ω → D((−A)^γ) satisfies E[ |(−A)^γ x_0|_H^4 ] < ∞, where γ > 0 is given in Assumption 7.
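The summability condition in Assumption 7 is easy to probe numerically. The following Python sketch assumes the one-dimensional Dirichlet Laplacian (λ_n = π²n²) and space–time white noise (q_n = 1), choices consistent with the examples of this chapter but fixed here only for illustration; for this case the partial sums plateau exactly when γ < 1/4.

```python
import numpy as np

def assumption7_partial_sums(gamma, n_max=10**6):
    """Partial sums of sum_n lambda_n^(2*gamma-1) * q_n for lambda_n = pi^2 n^2
    and q_n = 1 (space-time white noise), reported at three truncation levels."""
    n = np.arange(1, n_max + 1, dtype=float)
    terms = (np.pi ** 2 * n ** 2) ** (2.0 * gamma - 1.0)   # behaves like n^(4*gamma-2)
    partial = np.cumsum(terms)
    return partial[[10**2 - 1, 10**4 - 1, 10**6 - 1]]

print(assumption7_partial_sums(gamma=0.25))  # exponent -1: keeps growing (diverges)
print(assumption7_partial_sums(gamma=0.20))  # exponent -1.2: plateaus (converges)
```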
Under Assumptions 5–8 the SPDE (6.9) has a unique mild solution X_t on the time interval [0, T], where X_t is the predictable stochastic process in D((−A)^γ) given by (6.10). See, e.g., Theorem 5.1 above as well as Da Prato & Zabczyk [24] or Prévot & Röckner [104]. Since Assumption 6 also applies to F_N, the Itô–Galerkin SODE (6.11) has a unique solution on [0, T], which is given implicitly by (6.12). The above series (6.14) for the cylindrical Wiener process may not converge in H but in another space into which H can be continuously embedded. Thus, the formalism here includes space–time white noise (in one-dimensional domains only) as well as trace–class noise. Next the convergence theorem of [67] is presented.

Theorem 6.1. Suppose that Assumptions 5–8 are satisfied. Then there is a constant C_T > 0 such that
\[
\sup_{k=0,\ldots,M} \Big( \mathbb{E}\big| X_{t_k} - Y_k^{(N,M)} \big|_H^2 \Big)^{1/2} \le C_T \Big( \lambda_N^{-\gamma} + \frac{\log(M)}{M} \Big) \tag{6.15}
\]
holds for all N, M ∈ N, where X_t is the solution of SPDE (6.9), Y_k^{(N,M)} is the numerical solution given by (6.13), t_k = kT/M for k = 0, 1, ..., M, and γ > 0 is the constant given in Assumption 8.

In fact, the exponential Euler scheme (6.13) converges in time with a strong order 1 − ε for an arbitrarily small ε > 0 since log(M) can be estimated by M^ε, so log(M)/M ≈ 1/M^{1−ε}. Importantly, the error coefficient C_T does not depend on the dimension N of the Itô–Galerkin SODE. An essential point is that the integral ∫_{t_k}^{t_{k+1}} e^{A_N(t_{k+1}−s)} dW_s^N includes more information about the noise on the discretization interval. Such additional information was the key to the higher order of stochastic Taylor schemes for SODEs but is included there in terms of simple multiple stochastic integrals rather than integrals weighted by an exponential integrand.
6.3.2
Numerical results
To illustrate Theorem 6.1 consider the semilinear stochastic heat equations (6.6) on the one-dimensional domain (0, 1) with f(u) = ½u and f(u) = −u³, i.e.,
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{1}{2}u + \dot W_t, \qquad
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} - u^3 + \dot W_t, \tag{6.16}
\]
with the Dirichlet boundary condition and the initial value
\[
u_0(x) = \sum_{n=1}^{\infty} n^{-0.6}\,\sqrt{2}\,\sin(n\pi x) \qquad \text{for } x \in (0, 1).
\]
The first example is based on the function f(u) = ½u, which satisfies the assumptions used in Theorem 6.1, while the second, with the function f(u) = −u³, does not. For the SPDE (6.16) the linear-implicit Euler scheme is
\[
Y_{k+1}^{(N,M)} = (I - \Delta A_N)^{-1}\big( Y_k^{(N,M)} + \Delta\cdot f(Y_k^{(N,M)}) + \Delta W_k^N \big),
\]
the Lord–Rougemont scheme is
\[
Y_{k+1}^{(N,M)} = e^{A_N \Delta}\big( Y_k^{(N,M)} + \Delta\cdot f(Y_k^{(N,M)}) + \Delta W_k^N \big),
\]
and the exponential Euler scheme is
\[
Y_{k+1}^{(N,M)} = e^{A_N \Delta}\, Y_k^{(N,M)} + A_N^{-1}\big( e^{A_N \Delta} - I \big)\, f(Y_k^{(N,M)}) + \int_{t_k}^{t_{k+1}} e^{A_N(t_{k+1}-s)}\,dW_s^N
\]
for k = 0, 1, ..., M − 1 and N, M ≥ 1. From Figure 6.1 it is clear that the linear-implicit Euler and Lord–Rougemont schemes converge with the rate 1/6, while the exponential Euler scheme converges with the rate 1/3.
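Figure 6.1 reports mean-square error against computational effort on a log–log scale. Given such data, the empirical rate is minus the slope of a least-squares line through the points, as in the short Python sketch below; the data here are synthetic values decaying like cost^(−1/3), used only to illustrate the fit.

```python
import numpy as np

def empirical_rate(cost, error):
    """Least-squares slope of log(error) against log(cost); the convergence rate
    with respect to the computational effort is minus this slope."""
    slope, _ = np.polyfit(np.log(cost), np.log(error), 1)
    return -slope

cost = np.array([1e2, 1e3, 1e4, 1e5])
error = 0.8 * cost ** (-1.0 / 3.0) * np.exp(
    0.05 * np.random.default_rng(2).standard_normal(4))
print(empirical_rate(cost, error))   # approximately 1/3
```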
6.3.3
Restrictiveness of the assumptions
Assumptions 5–8 in Theorem 6.1 are quite restrictive and Theorem 6.1 has several serious shortcomings. First, the eigenvalues and eigenfunctions of the operator A are rarely known except in very simple domains. Finite element methods are a possible way around this difficulty. Some papers applying finite element methods to SPDEs are Allen, Novosel & Zhang [1], Katsoulakis, Kossioris & Lakkis [75], Kovács, Larsson & Lindgren [85], Kovács, Larsson & Saedpanah [86], Walsh [123], and Yan [125]. More seriously, Assumption 6 on the nonlinearity F , which is understood as the Nemytskii operator of some function f : R → R, is very restrictive and excludes Nemytskii operators for functions like f (u) =
u/(1 + u²) and f(u) = u − u³, u ∈ R.
These are important in applications involving stochastic reaction–diffusion equations. One reason is the Fréchet differentiability of the function F when considered as a
mapping on the Hilbert space H, and also the boundedness of the derivatives of F as expressed in Assumption 6. This restriction is weakened, with the order 1/2 in time, in Jentzen [63], where nonlinearities such as f(u) = u/(1 + u²) are also considered, by the observation that the solution process in fact takes values not only in H, but also in a smaller subspace V, e.g., the space of continuous functions, on which F is indeed Fréchet differentiable. The other problem is the global Lipschitz estimate on F. This difficulty also arises for the finite dimensional Itô SODEs, for which it can be overcome by using pathwise convergence rather than strong convergence as discussed in Chapter 3. In particular, the results there for SODEs have been generalized to SPDEs in Jentzen [64], where polynomial nonlinearities of the form f(u) = u − u³ are also considered.

Figure 6.1. Mean-square error vs. computational effort as a log–log plot with the function f(u) = ½u [68]; shown are the linear-implicit Euler, Lord–Rougemont, and exponential Euler schemes together with order lines 1/6 and 1/3.
Chapter 7
Taylor Approximations for SPDEs with Additive Noise
Taylor expansions of solutions of SPDEs in Banach spaces are the basis for deriving higher order numerical schemes for SPDEs, just as for SODEs. However, there is a major difficulty for SPDEs. Although an SPDE
\[
dX_t = [AX_t + F(X_t)]\,dt + B(X_t)\,dW_t \tag{7.1}
\]
is driven by a Wiener process, which is a martingale, the solution process is usually not a semimartingale. See [37] for a clear discussion of this problem. In particular, a general Itô formula does not exist for the solutions of the SPDE (7.1). (In the special case of the squared norm as a test function, a standard Itô formula can be found in [88, 104] and in the references therein. In addition, in more general situations appropriately modified Itô-type formulas can be found in [37, 23] and in the references therein.) Hence stochastic Taylor expansions for the solutions of the SPDE (7.1) cannot be derived as in Chapter 3 for the solutions of finite dimensional SODEs. Apart from the exponential Euler scheme (6.13), only temporal approximations of low order for the solutions of SPDEs (7.1) were known until recently. For example, the stochastic convolution of the operator A = Δ (the Laplacian with Dirichlet boundary conditions) and B(v) ≡ I in one spatial dimension has sample paths which are only (1/4 − ε)-Hölder continuous, and previously considered approximations did not attain a higher order than this in time. The reason is that the infinite dimensional noise process, in general, has only minimal spatial regularity, and the convolution of the semigroup and the noise is only as smooth in time as in space. This comparable regularity in time and space is a fundamental property of the dynamics of semigroups; see, e.g., [110]. To overcome these problems, robust Taylor expansions are needed for SPDEs driven by infinite dimensional Wiener processes. In this chapter it will be explained
how this can be done for SPDEs with additive noise of the form
\[
dX_t = [AX_t + f(X_t)]\,dt + B\,dW_t, \tag{7.2}
\]
which will be interpreted in mild form. The Taylor expansions will then be derived in essentially the same way as for RODEs in Chapter 2. The multiplicative noise case will be considered in Chapter 8. The additive noise case is, of course, a special case of the multiplicative noise case, but its special structure allows much stronger results to be obtained. In particular, pathwise as well as strong convergence results can be obtained and the driving noise process need not be a Wiener process; for example, it could be a fractional Brownian motion. This chapter is based on Chapter 2 in [62].
7.1 Assumptions

Throughout this chapter the following assumptions are used. Let T ∈ (0, ∞) be a real number, let (Ω, F, P) be a probability space, and let (V, ‖·‖_V) be a separable real Banach space.

Assumption 9 (semigroup S). Let S : [0, ∞) → L(V) be a mapping satisfying S_0 = I, S_{t_1}S_{t_2} = S_{t_1+t_2},
\[
\sup_{t\in[0,T]} \|S_t\|_{L(V)} < \infty, \qquad
\sup_{\substack{s,t\in[0,T]\\ s<t}} \frac{\|S_t - S_s\|_{L(V)}\cdot s}{t - s} < \infty.
\]
where the constant C > 0 depends only on the S-tree t, on p ≥ 1, on R in (8.33), and on the sequence (R_n)_{n∈N} in (8.34).

Proof. Let t = (t′, t″) ∈ ST′ and suppose that p ∈ [2, ∞) is arbitrary. (There is no loss of generality due to Jensen's inequality.) The assertion will be shown by induction on the number of nodes l(t) ∈ N. Throughout the proof, C > 0 is a constant, which changes from line to line but depends on t, p, R, and (R_n)_{n∈N} only. First, consider the case l(t) = 1. Then
\[
\varphi(\mathbf{t})(t) =
\begin{cases}
I_0(t) = \big(e^{A(t-t_0)} - I\big)\,X_{t_0}, & \mathbf{t}''(1) = 0,\\[2pt]
I_1^0(t) = \int_{t_0}^{t} e^{A(t-s)}\,F(X_{t_0})\,ds, & \mathbf{t}''(1) = 1,\\[2pt]
I_{1^*}^0(t) = \int_{t_0}^{t} e^{A(t-s)}\,F(X_s)\,ds, & \mathbf{t}''(1) = 1^*,\\[2pt]
I_2^0(t) = \int_{t_0}^{t} e^{A(t-s)}\,B(X_{t_0})\,dW_s, & \mathbf{t}''(1) = 2,\\[2pt]
I_{2^*}^0(t) = \int_{t_0}^{t} e^{A(t-s)}\,B(X_s)\,dW_s, & \mathbf{t}''(1) = 2^*,
\end{cases}
\]
P-a.s.,
for every t ∈ [t0 , T ] and γ , t (1) = 0, ord(t) = 1, t (1) = 1 or 1∗ , θ, t (1) = 2 or 2∗ . In the case t (1) = 0 it follows that + + + + φ(t)(t)Lp (;H ) = + eAt − I Xt0 +
Lp (;H )
+ + + + ≤ +(κ − A)−γ eAt − I +
L(H )
+ + · +(κ − A)γ Xt0 +Lp (;H ) ≤ C (t)γ
for every t ∈ [t0 , T ]. Moreover, by Lemma 18 in [64], + t + + + A(t−s) + φ(t)(t)Lp (;H ) = + e F (Xt0 ) ds + + t0
Lp (;H )
t+ + + A(t−s) + ≤ F (Xt0 )+ +e
Lp (;H )
t0
≤e
ds
t
κT
+ + F (0)H + R1 +Xt0 +Lp (;H ) ds
t0
≤ eκT F (0)H + R1 Rp (t) ≤ C (t) for every t ∈ [t0 , T ] in the case t (1) = 1 and + t + + + A(t−s) + φ(t)(t)Lp (;H ) ≤ + e F (Xs ) ds + + t0
≤
t+ + + A(t−s) + F (Xs )+ +e
Lp (;H )
t0
≤ eκT
Lp (;H )
t
t0
ds
F (0)H + R1 Xs Lp (;H ) ds
≤ eκT F (0)H + R1 Rp (t) ≤ C (t)
for every t ∈ [t0 , T ] in the case t (1) = 1∗ . Finally, Corollary A.2 yields + + t + + A(t−s) + φ(t)(t)Lp (;H ) = + e B(Xt0 ) dWs + + t0
Lp (;H )
12
t + +2 + + A(t−s) B(Xt0 )+ p ≤p +e
≤ L0 p
ds
L (;H S(U ,H ))
t0
t
2
+ + 1 + +Xt +
Lp (;H )
0
t0
≤ 1 + Rp L 0 p
t
(t − s)
2θ−1
(t − s)
2θ−1
12 ds
12 ds
t0
t
= 1 + Rp L 0 p
s
2θ−1
12 ds
0
=
(1 + Rp )L0 p (t)θ ≤ C (t)θ √ 2θ
for every t ∈ [t0 , T ] in the case t (1) = 2 and + + t + + A(t−s) + φ(t)(t)Lp (;H ) = + e B(Xs ) dWs + + t0
≤p
Lp (;H )
12
t + +2 + A(t−s) + B(Xs )+ p +e
L (;H S(U ,H ))
t0
≤ L0 p
t
t0
2
1 + Xs Lp (;H ) (t − s)
≤ 1 + Rp L 0 p
t
(t − s)
2θ−1
2θ−1
12 ds
t0
= (1 + Rp )L0 p
t
s
2θ−1
12 ds
0
=
ds
(1 + Rp )L0 p (t)θ ≤ C (t)θ √ 2θ
12 ds
for every t ∈ [t0 , T ] in the case t (1) = 2∗ . This shows the assertion in the case l(t) = 1. Now suppose that l(t) ∈ {2, 3, . . .}. Then let t1 , . . . , tn ∈ ST, n ∈ N, be the subtrees of t. Hence, φ(t)(t) = Itn (1) [φ(t1 ), . . . , φ(tn )](t),
P-a.s.,
for every t ∈ [t0 , T ] by the definition of φ. In the case t (1) = 1∗ , φ(t)(t) =
t
1
eA(t−s) 0
t0
F (n) (Xt0 + rXs )((t1 )(s), . . . , (tn )(s))
(1 − r)n−1 dr ds, (n − 1)!
×
P-a.s.,
for every t ∈ [t0 , T ] and therefore (t)(t)H ≤ e
κT
t
(t1 )(s)H · · · (tn )(s)H ds,
Rn t0
P-a.s.,
for every t ∈ [t0 , T ]. Hence, Hölder’s inequality implies (t)(t)Lp (;H ) ≤ e
κT
≤e
κT
+ t+ + + + + Rn + (t1 )(s)H · · · (tn )(s)H + t0
t
Rn t0
ds Lp (;R)
(t1 )(s)Lpn (;H ) · · · (tn )(s)Lpn (;H ) ds
≤ eκT Rn C (t)1+ordt(t1 )+···+ordt(tn ) = C (t)ordt(t) for every t ∈ [t0 , T ], since l(t1 ), . . . , l(tn ) ≤ l(t) − 1 and the induction hypothesis can be applied to subtrees. A similar calculation shows the result when t (1) = 1. In the case t (1) = 2∗ , φm (t)(t) =
t t0
1
eA(t−s) 0
×
B (n) (Xt0 + rXs ) ((t1 )(s), . . . , (tn )(s))
(1 − r)n−1 dr dWs , (n − 1)!
P-a.s.,
for every t ∈ [t0 , T ]. Thus φm (t)(t)Lp (;H ) ≤ p
t t0
+ 1 + + eA(t−s) B (n) (Xt0 + rXs ) + 0
+2 1 2 (1 − r)n−1 + + ds dr + × ((t1 )(s), . . . , (tn )(s)) (n − 1)! Lp (;H S(U ,H )) for every t ∈ [t0 , T ] by Corollary A.2. This yields t 1 + + A(t−s) (n) +e (t)(t)Lp (;H ) ≤ p B (Xt0 + rXs ) + 0
t0
+ + × ((t1 )(s), . . . , (tn )(s)) + +
2 dr
1 2
ds
Lp (;H S(U ,H ))
and hence (t)(t)Lp (;H ) ≤ Ln p
t
(t − s)2θ−1
1 + + + + + 1 + +Xt + rXs + 0 + H 0
t0
+ + × (t1 )(s)H . . . (tn )(s)H + +
2 dr
1 2
ds
Lp (;R)
for every t ∈ [t0 , T ]. Since 1 + + + + + 1 + +Xt + rXs + (t1 )(s) . . . (tn )(s) + p H H L (;R) dr 0 H 0
≤
1 0
+ + 1 + +Xt0 + rXs +Lp(n+1) (;H ) (t1 )(s)Lp(n+1) (;H )
· · · (tn )(s)Lp(n+1) (;H ) dr ≤ C(s − t0 )
ordt(t1 )+···+ordt(tn )
1 0
+ + 1 + +rXs + (1 − r)Xt0 +Lp(n+1) (;H ) dr
≤ C 1 + Rp(n+1) (s − t0 )ordt(t1 )+···+ordt(tn ) ≤ C (t)ordt(t1 )+···+ordt(tn )
for every s ∈ [t0 , t] by the induction hypothesis, it follows that (t)(t)Lp (;H ) ≤ C
t
(t − s)(2θ−1) ds
12
(t)ordt(t1 )+···+ordt(tn )
t0
≤ C (t)θ+ordt(t1 )+···+ordt(tn ) = C (t)ordt(t) for every t ∈ [t0 , T ]. A similar calculation shows the result when t (1) = 2.
8.7.4
Proof of Theorem 8.4
The equality
P Xt = Xt0 + (w)(t) = 1
in Theorem 8.4 can also be written as (w) = X
(8.62)
in the space . If w = w0 , then (8.62) follows from (8.30). Therefore, assume that w = w0 . Then by the definition of the set SW’, there are natural numbers n ∈ N, i1 , . . . , in ∈ N, and j1 , . . . , jn ∈ N with (ik , jk ) ∈ acn E(ik−1 ,jk−1 ) . . . E(i1 ,j1 ) w0 for every k ∈ {1, . . . , n} such that w = E(in ,jn ) . . . E(i1 ,j1 ) w0 holds. Hence, (w) = (E(in ,jn ) . . . E(i1 ,j1 ) w0 ) = (E(in−1 ,jn−1 ) . . . E(i1 ,j1 ) w0 ) = · · · = (w0 ) by Lemma 8.6. Finally, (w) = (w0 ) = X by (8.30). This shows (8.62). Now let p ∈ [1, ∞) be arbitrary. It remains to show the approximation estimate + + +Xt − Xt − (w)(t)+ p ≤ C (t)ord(w) 0 L (;H ) for every t ∈ [t0 , T ], which can also be written as (w)(t) − (w)(t)Lp (;H ) ≤ C (t)ord(w) by (8.62). Here and below C ∈ [0, ∞) is a constant which changes from line to line but depends on w, on p ≥ 1, on R given in (8.33), and on the sequence (Rn )n∈N given in (8.34) only. Let t1 , . . . , tn ∈ ST’ be such that w = (t1 , . . . , tn ). Then Lemma 8.7 give φ(ti )(t)Lp (;H ) ≤ C (t)ordt(ti ) for every t ∈ [t0 , T ] and i ∈ {1, 2, . . . , n}.
(8.63)
In the next step, the definitions of and give (w) = φ(t1 ) + · · · + φ(tn ),
(w) = ψ(t1 ) + · · · + ψ(tn ).
Hence, (w) − (w) = (φ(t1 ) − ψ(t1 )) + · · · + (φ(tn ) − ψ(tn )) =
φ(ti )
i=1,...,n ti is active
by the definition of ψ. Finally, by (8.63), it follows that (w)(t) − (w)(t)Lp (;H ) ≤ φ(ti )(t)Lp (;H ) i=1,...,n ti is active
≤
C (t)ordt(ti )
i=1,...,n ti is active
=
C (t)ordt(ti )−ord(w) (t)ord(w)
i=1,...,n ti is active
≤
C (T + 1)ordt(ti )−ord(w) (t)ord(w)
i=1,...,n ti is active
≤ C (t)ord(w) for every t ∈ [t0 , T ]. This completes the proof of Theorem 8.4.
Appendix A
Regularity Estimates for SPDEs
The proof of Theorem 5.1 is presented here. It makes frequent use of some important inequalities, which will first be considered. Then the semigroup, drift, and diffusion terms of the SPDE (5.20) will be systematically estimated for use in the proof. The estimates here are very similar to those in the literature (see particularly Da Prato & Zabczyk [24], Kruse & Larsson [87], and van Neerven, Veraar & Weis [121, 122]), but for completeness they are proved in detail below. The following notation is used in sections A.3–A.5. For two normed real vector spaces (V_1, ‖·‖_{V_1}), (V_2, ‖·‖_{V_2}) and a mapping f : V_1 → V_2 from V_1 to V_2, define
\[
\|f\|_{\mathrm{Lip}(V_1,V_2)} := \|f(0)\|_{V_2} + \sup_{\substack{v,w\in V_1\\ v\neq w}} \frac{\|f(v)-f(w)\|_{V_2}}{\|v-w\|_{V_1}} \in [0,\infty]. \tag{A.1}
\]
In addition, for real numbers T ∈ (0, ∞), r ∈ (0, 1), a probability space (Ω, F, P), a separable real Hilbert space (V, ‖·‖_V, ⟨·,·⟩_V), a random variable Z : Ω → V, and a stochastic process Y : [0, T] × Ω → V the notations
\[
\|Z\|_{L^p(\Omega;V)} := \big( \mathbb{E}\big[ \|Z\|_V^p \big] \big)^{1/p} \in [0,\infty] \tag{A.2}
\]
and
\[
\|Y\|_{C^r([0,T],L^p(\Omega;V))}
:= \sup_{t\in[0,T]} \big( \mathbb{E}\big[ \|Y_t\|_V^p \big] \big)^{1/p}
+ \sup_{\substack{t_1,t_2\in[0,T]\\ t_1\neq t_2}} \frac{\big( \mathbb{E}\big[ \|Y_{t_2}-Y_{t_1}\|_V^p \big] \big)^{1/p}}{|t_2-t_1|^{r}} \in [0,\infty] \tag{A.3}
\]
are used in what follows.
Appendix A. Regularity Estimates for SPDEs
A.1
Some useful inequalities
A.1.1
Minkowski’s integral inequality
The following inequality is known as Minkowski’s integral inequality (see, for instance, Appendix A.1 in Stein [115] and Theorem 202 in Hardy, Littlewood & Pólya [49]; see also Lemma 18 in [62] for the next proof). Proposition 8. Let (1 , G1 , µ1 ) and (2 , G2 , µ2 ) be two finite measure spaces and let f : 1 × 2 → [0, ∞] be G1 ⊗ G2 /B([0, ∞])-measurable. Then
p f (x, y) µ2 (dy)
p1
1
p
|f (x, y)|p µ1 (dx)
≤
µ1 (dx)
2
1
µ2 (dy)
1
2
(A.4)
for all p ∈ [1, ∞).
Proof. In the case p = 1 equality holds and (A.4) follows by Fubini’s theorem. Therefore, let p > 1 and let q ∈ (1, ∞) such that p1 + q1 = 1. First, consider the case |f (x, y)| ≤ C < ∞
for all x ∈ 1 , y ∈ 2 ,
where f is bounded by a constant C > 0. Then p f (x, y) µ2 (dy) µ1 (dx) 1
2
(p−1)
=
f (x, y) µ2 (dy)
1
f (x, u) µ2 (du) µ1 (dx)
2
2 (p−1)
=
f (x, y) µ2 (dy) 2 1
)
f (x, u) µ1 (dx) µ2 (du)
2
(p−1)q
≤
f (x, y) µ2 (dy) 2
1
2
=
p f (x, y) µ2 (dy)
1
1 f (x, u)p µ1 (dx)
p
µ2 (du)
1
* 1 q µ1 (dx) ·
(1− p1 )
1 f (x, u)p µ1 (dx)
µ1 (dx)
2
2
p
µ2 (du)
1
due to Fubini’s theorem and Hölder’s inequality. Since
p f (x, y) µ2 (dy)
1
(1− p1 ) µ1 (dx)