Indifference Pricing
PRINCETON SERIES IN FINANCIAL ENGINEERING Edited by René Carmona and Erhan Çinlar
Indifference Pricing: Theory and Applications edited by René Carmona
PRINCETON UNIVERSITY PRESS PRINCETON AND OXFORD
Copyright © 2009 by Princeton University Press Published by Princeton University Press 41 William Street, Princeton, New Jersey 08540 In the United Kingdom: Princeton University Press 6 Oxford Street, Woodstock, Oxfordshire OX20 1TW All Rights Reserved Library of Congress Cataloging-in-Publication Data Indifference pricing : theory and applications / edited by René Carmona. p. cm. — (Princeton series in financial engineering) Includes bibliographical references and index. ISBN 978-0-691-13883-1 (hbk. : alk. paper) 1. Nonlinear pricing—Mathematical models. 2. Prices—Mathematical models. I. Carmona, R. (René) HF5416.5.I53 2009 658.8'16—dc22 2008036265 British Library Cataloging-in-Publication Data is available Printed on acid-free paper. ∞ press.princeton.edu Printed in the United States of America 10 9 8 7 6 5 4 3 2 1
Contents

Preface

PART 1. FOUNDATIONS

Chapter 1. The Single Period Binomial Model
Marek Musiela and Thaleia Zariphopoulou
1.1 Introduction
1.2 The Incomplete Model

Chapter 2. Utility Indifference Pricing: An Overview
Vicky Henderson and David Hobson
2.1 Introduction
2.2 Utility Functions
2.3 Utility Indifference Prices: Definitions
2.4 Discrete Time Approach to Utility Indifference Pricing
2.5 Utility Indifference Pricing in Continuous Time
2.6 Applications, Extensions, and a Literature Review
2.7 Related Approaches
2.8 Conclusion

PART 2. DIFFUSION MODELS

Chapter 3. Pricing, Hedging, and Designing Derivatives with Risk Measures
Pauline Barrieu and Nicole El Karoui
3.1 Indifference Pricing, Capital Requirement, and Convex Risk Measures
3.2 Dilatation of Convex Risk Measures, Subdifferential and Conservative Price
3.3 Inf-Convolution
3.4 Optimal Derivative Design
3.5 Recalls on Backward Stochastic Differential Equations
3.6 Axiomatic Approach and g-Conditional Risk Measures
3.7 Dual Representation of g-Conditional Risk Measures
3.8 Inf-Convolution of g-Conditional Risk Measures
3.9 Appendix: Some Results in Convex Analysis

Chapter 4. From Markovian to Partially Observable Models
René Carmona
4.1 A First Diffusion Model
4.2 Static Hedging with Liquid Options
4.3 Non-Markovian Models with Full Observation
4.4 Optimal Hedging in Partially Observed Markets
4.5 The Conditionally Gaussian Case

PART 3. APPLICATIONS

Chapter 5. Portfolio Optimization
Aytac Ilhan, Mattias Jonsson, and Ronnie Sircar
5.1 Introduction
5.2 Indifference Pricing and the Dual Formulation
5.3 Utility Indifference Pricing
5.4 Stochastic Volatility Models

Chapter 6. Indifference Pricing of Defaultable Claims
Tomasz R. Bielecki and Monique Jeanblanc
6.1 Preliminaries
6.2 Indifference Prices Relative to the Reference Filtration
6.3 Optimization Problems and BSDEs
6.4 Quadratic Hedging

Chapter 7. Applications to Weather Derivatives and Energy Contracts
René Carmona
7.1 Application I: Temperature Options
7.2 Application II: Rainfall Options
7.3 Application III: Commodity Derivatives

PART 4. COMPLEMENTS

Chapter 8. BSDEs and Applications
Nicole El Karoui, Said Hamadène, and Anis Matoussi
8.1 General Results on Backward Stochastic Differential Equations
8.2 Applications to Optimization Problems
8.3 Markovian BSDEs
8.4 BSDEs with Quadratic Growth with Respect to Z
8.5 Reflected Backward Stochastic Differential Equations

Chapter 9. Duality Methods
Robert J. Elliott and John van der Hoek
9.1 Introduction
9.2 Model
9.3 Utility Functions
9.4 Pricing Claims
9.5 The Dual Cost Function G(y) and V 0(y)
9.6 The Minimum of V
9.7 The Calculation of V0(x)
9.8 The Indifference Asking Price for Claims
9.9 The Indifference Bid Price
9.10 Examples
9.11 Properties of ν
9.12 Numerical Methods
9.13 Approximate Formulas
9.14 An Alternative Representation for VG(x)

Bibliography
List of Contributors
Notation Index
Author Index
Subject Index
Preface

This book is the first volume in a series of treatises in financial engineering to be published by Princeton University Press. It would not have been possible without the encouragement of E. Çinlar and the generous financial support of W. Neidig. It is an introduction to the subtle intricacies of utility-based pricing. It consists of the joint efforts of authors who have led the research in this field. Each chapter can be read independently of the others, but the book is better than the sum of its parts: we have taken great care to unify the styles of individual contributors and to structure the contents of the individual chapters. Thus the book brings together, in a timely fashion, a comprehensive view of exciting new developments in utility indifference pricing.

Classical work in the 1980s and 1990s concentrated on the development of formulas for pricing and hedging of ever more exotic derivatives. Much of the work assumed complete markets, and the common belief was that it was always possible to choose a pricing kernel, or to choose a specific equivalent martingale measure to compute prices. But markets are generally incomplete, and it may be impossible to hedge against all sources of randomness. Typical examples of such sources of randomness include stochastic volatilities, random jumps, and nontradable assets.

The idea of this book came just as financial mathematicians and engineers started to develop pricing and hedging procedures under more realistic assumptions. As suggested by the quote “Nothing can have value without being an object of utility” by Karl Marx, we set out to enlist contributors who have been part of the recent wave of interest in utility-based pricing to write a book introducing the subtle intricacies of this nonlinear pricing scheme.

Utility indifference pricing was first introduced in the early nineties by Hodges and Neuberger, but it had to wait almost an entire decade to catch on with financial mathematicians. It was rediscovered by Davis [60,62] and used by Henderson [119] and Musiela and Zariphopoulou [195,196,199], who considered the case of a derivative written on a nontraded asset whose price dynamics are given by a geometric Brownian motion. Most of these contributions are restricted to the case of the exponential utility function. Indeed, in the case of the power utility, one is only able to obtain bounds on the prices. Simultaneously, developments in functional analysis and convex duality associated with the names of Delbaen, El Karoui, Frittelli, and many others brought theoretical tools to the forefront.

It was time to bring all these developments together in textbook form. We started our crusade three and a half years ago. We were heartened by the positive responses of the research leaders in this field to our invitation to contribute to the book.
The book consists of nine chapters followed by an extensive bibliography and indexes. The chapters are divided into four parts.

The first part of the book sets the stage for the developments of the theory. Chapter 1, written by M. Musiela and T. Zariphopoulou, uses the simplest possible models of discrete time and finite state spaces to present the foundations of indifference pricing, highlight the financial underpinnings of the scheme, and expose the fundamental principles without the technicalities of continuous time and continuous spaces. Chapter 2 gives an introductory overview of the subject. V. Henderson and D. Hobson review the existing literature and give a clear mathematical presentation of the theory of utility functions as it is used throughout the book. Their compendium provides a fair and lucid introduction to utility indifference pricing.

Part 2 concentrates on indifference pricing for diffusion models. Like Chapter 1, Chapter 3 presents original mathematical results. Using the optimal design of derivatives as motivation, P. Barrieu and N. El Karoui extend the indifference pricing paradigm beyond the realm of utility functions. They work in the framework of risk measures, and for this reason they need to develop the theory of inf-convolution of risk measures, and to present elements of a theory for dynamic risk measures. They rely on sophisticated results from the theory of Backward Stochastic Differential Equations (BSDEs), which are presented in Chapter 8. Chapter 4 starts with a generalization of the original geometric Brownian motion incomplete models in which indifference pricing was first studied in the mid-nineties. Steps are taken to solve the expected utility maximization problems for some special non-Markovian models with a view toward the analysis of indifference pricing for partially observed markets by way of filtering techniques. The chapter concludes with the treatment of the conditionally Gaussian case.

The three chapters of Part 3 are concerned with applications. In Chapter 5, A. Ilhan, M. Jonsson, and R. Sircar study the problem of portfolio optimization using derivatives as a proxy for trading volatility. Since stochastic volatility models are not amenable to closed form solutions, they develop asymptotic approximations to assess numerically the performance of the static hedging strategies. In Chapter 6, T. Bielecki and M. Jeanblanc develop the mathematical foundations necessary for the pricing of defaultable securities, and they discuss quadratic hedging of these securities. They rely heavily on the theory of BSDEs presented later in the book. Finally, Chapter 7 discusses applications to weather derivatives (temperature and rainfall options) and to commodity derivatives.

The book concludes with a fourth part with complements to the fundamental results presented in the main body of the book. Chapter 8, written by N. El Karoui, S. Hamadène, and A. Matoussi, provides a well-balanced review of the basics of BSDEs and their applications. This crash course is of great value, not only to those trying to follow the developments presented in Chapters 3 and 6, but also to those eager to learn the modern mathematics of the optimization problems of stochastic control, stochastic games, and stochastic equilibrium models in continuous time finance. Chapter 9, written by R. Elliott and J. van der Hoek, completes the analysis of the discrete time models started in Chapter 1 with a discussion of the duality approach
to the utility maximization problems embedded in indifference pricing in the setting of discrete models.

The real price of everything, what everything really costs to the man who wants to acquire it, is the toil and trouble of acquiring it.
Adam Smith

René Carmona
Princeton, N.J.
PART 1
Foundations
Chapter One
The Single Period Binomial Model
Marek Musiela¹ and Thaleia Zariphopoulou²

¹ Early versions of this work were presented at the 2nd World Congress of the Bachelier Finance Society (2002), the AMS-IMS-SIAM Joint Summer Research Conference in Mathematical Finance (2003), Workshop on Financial Mathematics, Carnegie-Mellon (2005), Workshop on Stochastic Modelling, CMR, University of Montreal (2005), SIAM Conference on Control and Optimization, New Orleans (2005), Workshop on PDEs and Mathematical Finance, KTH, Stockholm (2005), Meeting on Stochastic Processes and Their Applications, Ascona (2005), and in seminars at Princeton University (2003), Columbia University (2003), MIT (2004), Imperial College (2004), University of Southern California (2004), Cornell University (2004), and Brown University (2005). The authors would like to thank the participants of these conferences and seminars for fruitful suggestions and useful comments. The authors would also like to thank René Carmona for giving them the opportunity to present their work in this volume. They would also like to thank Michael Anthropelos for thoroughly reading the manuscript and correcting typos and errors. Last but not least, the authors would like to express their gratitude to Professor Dieter Sondermann for his invaluable advice to start this study with the simplest possible model setup and to focus on bringing out intuition and structure, abstracting from technical complexities.

² The second author acknowledges support from the National Science Foundation (NSF grants: DMS-0102909, DMS-VIGRE-0091946 and DMS-FRG-0456118).

1.1 INTRODUCTION

Derivatives pricing and investment management seem to have little in common. Even at the organizational level, they belong to two quite separate parts of financial markets. The so-called sell side, represented mainly by the investment banks, among other things offers derivatives products to its customers. Some of them are wealth managers, belonging to the so-called buy side of financial markets.

So far, the only universally accepted method of derivative pricing is based upon the idea of risk replication. Models have been developed which allow for perfect replication of option payoffs via implementation of a replicating and self-financing strategy. We call such models complete. The option price is calculated as the cost of this replication. Adjustments to the price are later made to cover for risks due to the unrealistic representation of reality.

A more accurate description of the market is given by the so-called incomplete models, in which not all risk in a derivative product can be eliminated by dynamic hedging. However, this potential model advantage is hampered by another difficulty. Namely, the concept of price for a derivative contract is not uniquely defined. Many approaches have been proposed and extensively studied; however, until now no clear consensus has emerged.

On the other side of the spectrum of financial markets there are wealth managers. They have developed their own methodology for implementation of their investment decisions. They may use derivative products to improve their performance; however, their focus is on investment strategy with a view to optimizing returns rather than on risk replication. Therefore, it should not come as a surprise that the models they use are very different from the models used in derivatives pricing.

The main aim of this chapter is to work toward convergence of the methodologies used in these apparently quite distant areas. The idea is to associate the concept of price for a derivative contract with a criterion that is rather natural to a wealth manager, namely, maximization of expected utility of wealth. We choose to work with exponential utility and a very simple model structure, namely, the classical single period binomial model. We do so in order to eliminate all technical difficulties, explain the fundamental ideas and compare them with the classical arbitrage-free theory, and concentrate exclusively on the most important links between the two areas.

The chapter is organized as follows. In the next section we introduce and analyze in detail the single period binomial model. In particular, we derive an intuitively appealing formula for the indifference price of a general claim. Then, we study the various properties of the indifference price and exhibit the connection with convex risk measures. The link with the classical methodology of pricing by replication is analyzed next. It turns out that the analogue of the so-called delta retains its natural interpretation as the sensitivity of the price with respect to the movement of the instrument used for hedging.
Moreover, it also appears that the components of risk that are left unhedged, in our incomplete model setup, have zero value from the perspective of valuation by indifference.

Another important observation about the nature of pricing by indifference is exposed in the subsection dedicated to relative pricing. Namely, this type of pricing scheme is relative to the agent’s portfolio, in contrast to the arbitrage-free pricing scheme, which is relative to the market portfolio. When interpreted this way, the indifference valuation can be viewed as linear, while, of course, when seen as a functional over a set of random variables it is not.

Going deeper into the comparisons with pricing by arbitrage, we then investigate the issue of unit choice and the necessary consistency with the static no-arbitrage constraint. We show, in particular, that in order to eliminate static arbitrage one needs to relate the risk aversion parameter to the unit of wealth. To our knowledge, this is the first time that modeling issues pertinent to consistency across units have been identified and addressed.

To accommodate more general situations, we allow for the risk aversion to be modified according to our local-in-time views about the anticipated performance of the traded securities. Specifically, we study the case when the risk aversion parameter depends on the future value of the traded stock. It turns out that the pricing rule retains its intuitive form. Moreover, the associated value function exhibits an interesting relationship between the risk tolerance (the reciprocal of risk aversion) at the end and at the beginning of a time period. Namely, the end-of-period risk tolerance can be viewed as an option payoff, and the consistent risk tolerance for the beginning of the period is its arbitrage-free price.

Motivated by the general need for absence of static arbitrage, and by the above observation in particular, we conclude by introducing the notion of utility normalization and the concepts of backward and forward utilities.
1.2 THE INCOMPLETE MODEL

We introduce a simple one-period binomial model with one riskless and two risky assets, of which only one is traded. By construction, the model is incomplete and our aim is to develop a coherent approach for investment management and derive from it a pricing methodology for derivative contracts. Optimal investment management is based on maximization of expected utility of wealth. There are a number of constraints we want to impose on our investment decision process and on the derivatives valuation method. To mention just two, we want our investment decisions not to depend on the units in which the wealth is expressed. This is mainly because we also need to ensure that our pricing method is consistent with the absence of arbitrage and that it is also numeraire independent. We also want our pricing concept to have a clear intuitive meaning, so an effort is made to interpret the results and, whenever possible, to draw analogies with the classical arbitrage-free theory of complete markets.

1.2.1 Indifference Price Representation

Consider a single period model in a market environment with one riskless and two risky assets. The riskless asset is assumed to offer zero interest rate. Only one of the risky assets can be traded, taken to be a stock. The current values of the traded and nontraded risky assets are denoted, respectively, by $S_0$ and $Y_0$. At the end of the period $T$, the value of the traded asset is $S_T$ with $S_T = S_0 \xi$, where the random variable $\xi$ takes the values $\xi^d, \xi^u$ with $0 < \xi^d < 1 < \xi^u$. Similarly, the value of the nontraded asset $Y_T$ satisfies $Y_T = Y_0 \eta$, where $\eta$ takes the values $\eta^d, \eta^u$ with $\eta^d < \eta^u$ ($Y_0, Y_T \neq 0$).

We introduce randomness into our single-period model by means of the probability space $(\Omega, \mathcal{F}_T, \mathbb{P})$, where $\Omega = \{\omega_1, \omega_2, \omega_3, \omega_4\}$ and $\mathbb{P}$ is a probability measure on the $\sigma$-algebra $\mathcal{F}_T = 2^{\Omega}$ of all subsets of $\Omega$. For each $i = 1, \ldots, 4$, we assume that $p_i = \mathbb{P}\{\omega_i\} > 0$ and we model the upwards and the downwards movement of the two risky assets $S_T$ and $Y_T$ by setting their values as follows:
\[
\begin{aligned}
S_T(\omega_1) &= S_0 \xi^u, & Y_T(\omega_1) &= Y_0 \eta^u, \\
S_T(\omega_2) &= S_0 \xi^u, & Y_T(\omega_2) &= Y_0 \eta^d, \\
S_T(\omega_3) &= S_0 \xi^d, & Y_T(\omega_3) &= Y_0 \eta^u, \\
S_T(\omega_4) &= S_0 \xi^d, & Y_T(\omega_4) &= Y_0 \eta^d.
\end{aligned}
\]
The measure $\mathbb{P}$ represents the so-called historical measure. Observe that the $\sigma$-algebra $\mathcal{F}_T$ coincides with the $\sigma$-algebra $\mathcal{F}_T^{(S,Y)}$ generated by the random variables $S_T$ and $Y_T$. In what follows we will also need the $\sigma$-algebra $\mathcal{F}_T^{S}$ generated exclusively by the random variable $S_T$.

Consider a portfolio consisting of $\alpha$ shares of the traded asset and the amount $\beta$ invested in the riskless one. Its current value $X_0 = x$ is equal to $\beta + \alpha S_0 = x$, while its wealth $X_T$, at the end of the period $[0, T]$, is given by
\[
X_T = \beta + \alpha S_T = x + \alpha (S_T - S_0). \tag{1.1}
\]
Now introduce a claim, settling at time $T$ and yielding payoff $C_T$. In pricing $C_T$, we need to specify our risk preferences. We choose to work with the exponential utility
\[
U(x) = -e^{-\gamma x}, \quad x \in \mathbb{R} \ \text{ and } \ \gamma > 0. \tag{1.2}
\]
Optimality of investments, which will ultimately yield the indifference price of the claim, is examined via the value function
\[
V^{C_T}(x) = \sup_{\alpha} E_{\mathbb{P}}\left(-e^{-\gamma (X_T - C_T)}\right)
= e^{-\gamma x} \sup_{\alpha} E_{\mathbb{P}}\left(-e^{-\gamma \alpha (S_T - S_0) + \gamma C_T}\right). \tag{1.3}
\]
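To make the value function (1.3) concrete, the following is a minimal numerical sketch, not part of the original exposition. All parameter values, the payoff vector `claim`, the search bracket for $\alpha$, and the function names are assumptions chosen for illustration; the supremum is approximated by a ternary search, which is adequate because the objective is strictly concave in $\alpha$.

```python
import math

# Hypothetical parameters, chosen only for illustration.
S0, XI_U, XI_D = 100.0, 1.2, 0.9      # traded asset: S_T = S0 * xi, xi in {XI_D, XI_U}
P = [0.3, 0.2, 0.1, 0.4]              # historical probabilities p_1, ..., p_4
GAMMA = 0.5                           # risk aversion coefficient

# S_T - S0 on omega_1, ..., omega_4 (up, up, down, down).
GAINS = [S0 * (XI_U - 1)] * 2 + [S0 * (XI_D - 1)] * 2


def expected_utility(alpha, x, claim):
    """E_P[-exp(-gamma (X_T - C_T))] with X_T = x + alpha (S_T - S0), as in (1.1)."""
    return sum(-p * math.exp(-GAMMA * (x + alpha * g - c))
               for p, g, c in zip(P, GAINS, claim))


def value_function(x, claim, lo=-5.0, hi=5.0, iters=200):
    """Approximate the sup over alpha in (1.3) by ternary search.

    The bracket [lo, hi] is an assumption; it contains the maximizer
    for the illustrative parameters above.
    """
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if expected_utility(m1, x, claim) < expected_utility(m2, x, claim):
            lo = m1
        else:
            hi = m2
    return expected_utility((lo + hi) / 2, x, claim)


claim = [1.0, 0.0, 2.0, 0.5]          # hypothetical payoffs C_T(omega_i)
v0 = value_function(0.0, claim)
v1 = value_function(1.0, claim)
```

Under these assumptions, `v1` agrees with `math.exp(-GAMMA) * v0` up to numerical error, which is the factorization of $e^{-\gamma x}$ displayed in (1.3).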
Below, we recall the definition of indifference prices.

Definition 1.1 The indifference price of the claim $C_T = c(S_T, Y_T)$ is defined as the amount $\nu(C_T)$ for which the two value functions $V^{C_T}$ and $V^0$, defined in (1.3) and corresponding, respectively, to the claims $C_T$ and $0$, coincide. Namely, $\nu(C_T)$ is the amount which satisfies
\[
V^0(x) = V^{C_T}(x + \nu(C_T)) \tag{1.4}
\]
for all initial wealth levels $x \in \mathbb{R}$.

Looking at the classical arbitrage-free pricing theory, we recall that derivative valuation has two fundamental components which do not depend on specific model assumptions. Namely, the price is obtained as a linear functional of the (discounted) payoff representable via the (unique) risk neutral equivalent martingale measure. Our goal is to understand how these two components, namely, the linear valuation operator and the risk neutral pricing measure, change when markets become incomplete. In the context of pricing by indifference, we will look for a valuation functional $\mathcal{E}$ and a naturally related pricing measure $\mathbb{Q}$ under which the price is given as
\[
\nu(C_T) = \mathcal{E}_{\mathbb{Q}}(C_T). \tag{1.5}
\]
Before we determine the fundamental features that $\mathcal{E}$ and $\mathbb{Q}$ should have, let us look at some representative cases.

Examples:

i) First, we consider a claim of the form $C_T = c(S_T)$. Intuitively, the indifference price should coincide with the arbitrage-free price, for there is no risk that cannot be hedged. Indeed, one can construct a nested complete one-period binomial model and show that
\[
\nu(c(S_T)) = E_{\mathbb{Q}^*}(c(S_T)), \tag{1.6}
\]
with $\mathbb{Q}^*$ being the relevant risk neutral measure. The indifference price mechanism reduces to the arbitrage-free one and any effect of preferences dissipates.

ii) Next, we look at a claim of the form $C_T = c(Y_T)$ and assume for simplicity that the random variables $S_T$ and $Y_T$ are independent under the measure $\mathbb{P}$. In this case, intuitively, the presence of the traded asset should not affect the price. Indeed, working directly with the value function (1.3) and Definition 1.1, it is straightforward to deduce that
\[
\nu(c(Y_T)) = \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma c(Y_T)}\right). \tag{1.7}
\]
The indifference price coincides with the classical actuarial valuation principle, the so-called certainty equivalent value, which is nonlinear in the payoff and uses as pricing measure the historical one.

iii) Finally, we examine a claim of the form $C_T = c_1(S_T) + c_2(Y_T)$. One could be, wrongly, tempted to price $C_T$ by first pricing $c_1(S_T)$ by arbitrage, next pricing $c_2(Y_T)$ by certainty equivalent, and adding the results. Intuitively, this should work when $S_T$ and $Y_T$ are independent. However, this cannot possibly work under strong dependence between the two variables, for example, when $Y_T$ is a function of $S_T$. In general,
\[
\nu(c_1(S_T) + c_2(Y_T)) \neq E_{\mathbb{Q}^*}(c_1(S_T)) + \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma c_2(Y_T)}\right).
\]
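Examples i) and ii) can be checked numerically straight from Definition 1.1. The sketch below is an illustration only: the parameter values, payoffs, search bracket, and function names are assumptions. It uses the fact that the factor $e^{-\gamma x}$ in (1.3) cancels in (1.4), so that $\nu(C_T) = \frac{1}{\gamma}\log\big(V^{C_T}(0)/V^{0}(0)\big)$.

```python
import math

S0, XI_U, XI_D, GAMMA = 100.0, 1.2, 0.9, 0.5   # hypothetical values
GAINS = [S0 * (XI_U - 1)] * 2 + [S0 * (XI_D - 1)] * 2


def sup_expected_utility(p, claim, lo=-5.0, hi=5.0, iters=200):
    """V^{C_T}(0): sup over alpha of E_P[-exp(-gamma (alpha (S_T - S0) - C_T))]."""
    def f(alpha):
        return sum(-pi * math.exp(-GAMMA * (alpha * g - c))
                   for pi, g, c in zip(p, GAINS, claim))
    for _ in range(iters):                      # ternary search, concave objective
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        lo, hi = (m1, hi) if f(m1) < f(m2) else (lo, m2)
    return f((lo + hi) / 2)


def indifference_price(p, claim):
    """Solve (1.4); both value functions are negative, so the ratio is positive."""
    ratio = sup_expected_utility(p, claim) / sup_expected_utility(p, [0.0] * 4)
    return math.log(ratio) / GAMMA


q_star = (1 - XI_D) / (XI_U - XI_D)             # risk neutral probability of an up move

# Example i): claim depends on S_T only (3 if up, 1 if down).
price_i = indifference_price([0.3, 0.2, 0.1, 0.4], [3.0, 3.0, 1.0, 1.0])
arbitrage_free = q_star * 3.0 + (1 - q_star) * 1.0

# Example ii): S_T and Y_T independent under P, claim depends on Y_T only.
a, b = 0.5, 0.6                                 # P(S up), P(Y up)
p_indep = [a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)]
price_ii = indifference_price(p_indep, [2.0, 0.0, 2.0, 0.0])
certainty_equiv = math.log(b * math.exp(GAMMA * 2.0) + (1 - b)) / GAMMA
```

With these inputs `price_i` matches `arbitrage_free` and `price_ii` matches `certainty_equiv` up to the accuracy of the search, in line with (1.6) and (1.7).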
The above illustrative examples indicate certain fundamental characteristics $\mathcal{E}$ and $\mathbb{Q}$ should have. First of all, we observe that a nonlinear valuation functional must be sought. Clearly, any effort to represent indifference prices as expected payoffs under an appropriately chosen universal measure should be abandoned. Indeed, no linear pricing mechanism can be compatible with the concept of indifference based valuation as defined in (1.4). Note that this fundamental observation comes in contrast to the central direction of existing approaches in incomplete models that yield prices as expected payoffs under an optimally chosen measure.

We also see that risk preferences may affect the valuation device given their inherent role in price specification. However, intuitively speaking, we would prefer to specify the pricing measure independently of the risk preferences. Finally, the pricing measure and the valuation device should ideally be the same for all claims to be priced. The next proposition yields the indifference price in the desired form (1.5).

Proposition 1.1 Let $\mathbb{Q}$ be a measure under which the traded asset is a martingale and, at the same time, the conditional distribution of the nontraded asset, given the traded one, is preserved with respect to the historical measure $\mathbb{P}$, i.e.,
\[
\mathbb{Q}(Y_T \mid S_T) = \mathbb{P}(Y_T \mid S_T). \tag{1.8}
\]
Let $C_T = c(S_T, Y_T)$ be the claim to be priced under exponential preferences with risk aversion coefficient $\gamma$. Then, the indifference price of $C_T$ is given by
\[
\nu(C_T) = \mathcal{E}_{\mathbb{Q}}(C_T) = E_{\mathbb{Q}}\left(\frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T} \mid S_T\right)\right). \tag{1.9}
\]
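As an illustration of how formula (1.9) is evaluated in the four-state model, here is a short sketch under stated assumptions: the function names and numerical inputs are hypothetical, and the measure $\mathbb{Q}$ is built directly from the two requirements of the proposition (martingale property of $S$, and condition (1.8)).

```python
import math


def pricing_measure(p, xi_u, xi_d):
    """Elementary probabilities of Q: S is a martingale and Q(Y_T | S_T) = P(Y_T | S_T)."""
    q = (1 - xi_d) / (xi_u - xi_d)            # Q(S_T = S0 * xi_u), so that E_Q(S_T) = S0
    up, down = p[0] + p[1], p[2] + p[3]       # P(S_T up), P(S_T down)
    return [q * p[0] / up, q * p[1] / up,
            (1 - q) * p[2] / down, (1 - q) * p[3] / down]


def indifference_price(p, claim, gamma, xi_u, xi_d):
    """Two-step valuation following (1.9)."""
    qm = pricing_measure(p, xi_u, xi_d)
    q_up, q_down = qm[0] + qm[1], qm[2] + qm[3]
    # Step 1 (nonlinear): certainty equivalent of C_T conditional on S_T, under Q.
    ce_up = math.log((qm[0] * math.exp(gamma * claim[0])
                      + qm[1] * math.exp(gamma * claim[1])) / q_up) / gamma
    ce_down = math.log((qm[2] * math.exp(gamma * claim[2])
                        + qm[3] * math.exp(gamma * claim[3])) / q_down) / gamma
    # Step 2 (linear): Q-expectation of the resulting S_T-measurable payoff.
    return q_up * ce_up + q_down * ce_down


# Hypothetical inputs.
p = [0.3, 0.2, 0.1, 0.4]
price = indifference_price(p, [1.0, 0.0, 2.0, 0.5], gamma=0.5, xi_u=1.2, xi_d=0.9)
hedgeable = indifference_price(p, [3.0, 3.0, 1.0, 1.0], gamma=0.5, xi_u=1.2, xi_d=0.9)
```

For a payoff that depends on $S_T$ alone, such as the second call, the inner step leaves the payoff unchanged and the result is the arbitrage-free price of Example i).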
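Proof. We prove the above result by constructing the indifference price via its definition (1.4). We start with the specification of the value functions $V^0$ and $V^{C_T}$. We represent the payoff $C_T$ as a random variable defined on $\Omega$ with values $C_T(\omega_i) = c_i \in \mathbb{R}$, for $i = 1, \ldots, 4$. Elementary arguments lead to
\[
V^{C_T}(x) = e^{-\gamma x} \sup_{\alpha} \Big( -e^{-\gamma \alpha S_0 (\xi^u - 1)} \left(e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2\right) - e^{-\gamma \alpha S_0 (\xi^d - 1)} \left(e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4\right) \Big).
\]
Maximizing over $\alpha$ leads to the optimal number of shares $\alpha^{C_T,*}$, given by
\[
\alpha^{C_T,*} = \frac{1}{\gamma S_0 (\xi^u - \xi^d)} \log \frac{(\xi^u - 1)(p_1 + p_2)}{(1 - \xi^d)(p_3 + p_4)}
+ \frac{1}{\gamma S_0 (\xi^u - \xi^d)} \log \frac{\left(e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2\right)(p_3 + p_4)}{\left(e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4\right)(p_1 + p_2)}. \tag{1.10}
\]
Further straightforward, albeit tedious, calculations yield
\[
V^{C_T}(x) = -e^{-\gamma x} \frac{1}{q^q (1 - q)^{1-q}} \left(e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2\right)^q \left(e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4\right)^{1-q}, \tag{1.11}
\]
where
\[
q = \frac{1 - \xi^d}{\xi^u - \xi^d}. \tag{1.12}
\]
For $C_T = 0$, the value function takes the form
\[
V^0(x) = -e^{-\gamma x} \left(\frac{p_1 + p_2}{q}\right)^q \left(\frac{p_3 + p_4}{1 - q}\right)^{1-q}. \tag{1.13}
\]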
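The "straightforward, albeit tedious" step from (1.10) to (1.11) can be sanity-checked numerically. The following sketch uses hypothetical parameter values and names; it evaluates the objective inside the supremum at the candidate $\alpha^{C_T,*}$ of (1.10) and compares the result with the closed form (1.11) at $x = 0$.

```python
import math

# Hypothetical parameters, chosen only for illustration.
S0, XI_U, XI_D, GAMMA = 100.0, 1.2, 0.9, 0.5
P = [0.3, 0.2, 0.1, 0.4]
C = [1.0, 0.0, 2.0, 0.5]

A = math.exp(GAMMA * C[0]) * P[0] + math.exp(GAMMA * C[1]) * P[1]   # up-state term
B = math.exp(GAMMA * C[2]) * P[2] + math.exp(GAMMA * C[3]) * P[3]   # down-state term
p_up, p_down = P[0] + P[1], P[2] + P[3]
q = (1 - XI_D) / (XI_U - XI_D)                                      # as in (1.12)


def objective(alpha):
    """The expression inside the sup in the first display of the proof (x = 0)."""
    return (-math.exp(-GAMMA * alpha * S0 * (XI_U - 1)) * A
            - math.exp(-GAMMA * alpha * S0 * (XI_D - 1)) * B)


# Optimal number of shares, formula (1.10).
scale = 1.0 / (GAMMA * S0 * (XI_U - XI_D))
alpha_star = (scale * math.log((XI_U - 1) * p_up / ((1 - XI_D) * p_down))
              + scale * math.log(A * p_down / (B * p_up)))

# Closed form (1.11) at x = 0.
closed_form = -(A ** q) * (B ** (1 - q)) / (q ** q * (1 - q) ** (1 - q))
```

Under these assumptions `objective(alpha_star)` and `closed_form` agree to machine precision, and moving `alpha_star` slightly in either direction lowers the objective.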
From the definition of the indifference price (1.4) and the representations (1.11), (1.13) of the relevant value functions, it follows that
\[
\nu(C_T) = q \, \frac{1}{\gamma} \log \frac{e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2}{p_1 + p_2} + (1 - q) \, \frac{1}{\gamma} \log \frac{e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4}{p_3 + p_4}. \tag{1.14}
\]
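Next, we show that the above price admits the probabilistic representation (1.9). We first consider the terms involving the historical probabilities in (1.14) and we note that they can be actually written in terms of the conditional historical expectations, namely,
\[
\frac{e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2}{p_1 + p_2} = E_{\mathbb{P}}\left(e^{\gamma C_T} \mid A\right)
\quad \text{and} \quad
\frac{e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4}{p_3 + p_4} = E_{\mathbb{P}}\left(e^{\gamma C_T} \mid A^c\right),
\]
where $A = \{\omega_1, \omega_2\} = \{\omega : S_T(\omega) = S_0 \xi^u\}$. It is important to observe that conditioning is taken with respect to the terminal values of the traded asset.

We continue with the specification of the pricing measure defined in (1.12). For this, we denote (with a slight abuse of notation) by $q_1, q_2, q_3, q_4$ the elementary probabilities of the sought measure $\mathbb{Q}$. Straightforward calculations yield that
\[
q_1 + q_2 = q, \tag{1.15}
\]
with $q$ as in (1.12). In order to compute the quantity $q_1$, we look at the conditional historical probability of $\{Y_T = Y_0 \eta^u\}$, given $\{S_T = S_0 \xi^u\}$, and we impose (1.8), yielding
\[
\frac{p_1}{p_1 + p_2} = \frac{q_1}{q}.
\]
The probabilities $q_2$, $q_3$ and $q_4$, computed in a similar manner, are written below in a concise form:
\[
q_i = q \, \frac{p_i}{p_1 + p_2}, \quad i = 1, 2, \qquad \text{and} \qquad q_i = (1 - q) \, \frac{p_i}{p_3 + p_4}, \quad i = 3, 4. \tag{1.16}
\]
It follows easily that the nonlinear terms in (1.14) can be compiled as
\[
\begin{aligned}
&\frac{1}{\gamma} \log \frac{e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2}{p_1 + p_2} \, I_A + \frac{1}{\gamma} \log \frac{e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4}{p_3 + p_4} \, I_{A^c} \\
&\qquad = \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma C_T} \mid A\right) I_A + \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma C_T} \mid A^c\right) I_{A^c} \\
&\qquad = \frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T} \mid S_T\right).
\end{aligned}
\]
Therefore, taking the expectation with respect to $\mathbb{Q}$ yields
\[
\begin{aligned}
E_{\mathbb{Q}}\left(\frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T} \mid S_T\right)\right)
&= E_{\mathbb{Q}}\left(\frac{1}{\gamma} \log \frac{e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2}{p_1 + p_2} \, I_A + \frac{1}{\gamma} \log \frac{e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4}{p_3 + p_4} \, I_{A^c}\right) \\
&= q \, \frac{1}{\gamma} \log \frac{e^{\gamma c_1} p_1 + e^{\gamma c_2} p_2}{p_1 + p_2} + (1 - q) \, \frac{1}{\gamma} \log \frac{e^{\gamma c_3} p_3 + e^{\gamma c_4} p_4}{p_3 + p_4} = \nu(C_T),
\end{aligned}
\]
where we used (1.14) to conclude. □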
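The two defining properties of the pricing measure used in the proof, namely the martingale property of the traded asset and the preservation (1.8) of the conditional law of $Y_T$ given $S_T$, can be verified directly from (1.12) and (1.16). The numbers below are hypothetical and serve only as an illustration.

```python
# Hypothetical parameters, chosen only for illustration.
S0, XI_U, XI_D = 100.0, 1.2, 0.9
P = [0.3, 0.2, 0.1, 0.4]                       # historical probabilities p_1..p_4

q = (1 - XI_D) / (XI_U - XI_D)                 # formula (1.12)
p_up, p_down = P[0] + P[1], P[2] + P[3]

# Elementary probabilities of Q, formula (1.16).
Q = [q * P[0] / p_up, q * P[1] / p_up,
     (1 - q) * P[2] / p_down, (1 - q) * P[3] / p_down]

S_T = [S0 * XI_U, S0 * XI_U, S0 * XI_D, S0 * XI_D]

# Martingale property: E_Q(S_T) - S0 should vanish (zero interest rate).
martingale_gap = sum(qi * s for qi, s in zip(Q, S_T)) - S0

# Conditional law of Y_T given S_T under P and under Q, condition (1.8).
cond_p = (P[0] / p_up, P[2] / p_down)          # P(Y up | S up), P(Y up | S down)
cond_q = (Q[0] / (Q[0] + Q[1]), Q[2] / (Q[2] + Q[3]))
```

Only the marginal law of $S_T$ changes in passing from $\mathbb{P}$ to $\mathbb{Q}$: here $\mathbb{P}(S_T = S_0\xi^u) = 0.5$ is replaced by $q = 1/3$, while `cond_p` and `cond_q` coincide.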
We next discuss the key ingredients and highlight the intuitively natural features of the probabilistic pricing formula (1.9).

Interpretation of the Indifference Price: Valuation is done via a two-step nonlinear procedure and under a single pricing measure.

i) Valuation procedure: In the first step, risk preferences are injected into the valuation process. The original derivative payoff is distorted to the preference adjusted payoff, to be called the conditional certainty equivalent,
\[
\tilde{C}_T = \frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T} \mid S_T\right). \tag{1.17}
\]
This new payoff has actuarial-type characteristics and reflects the weight that risk aversion carries in the utility-based methodology. However, the certainty equivalent rule is not applied in a naive way. Indeed, we do not consider any classical actuarial type functional, since
\[
\tilde{C}_T \neq \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma C_T}\right)
\quad \text{and} \quad
\tilde{C}_T \neq \frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T}\right).
\]
In the second step, the pricing procedure picks up arbitrage-free pricing characteristics. It prices the preference adjusted payoff $\tilde{C}_T$, dependent only on $S_T$, through an arbitrage-free method. The same pricing measure is used in both steps. The price is then given by
\[
\nu(C_T) = \mathcal{E}_{\mathbb{Q}}(C_T) = E_{\mathbb{Q}}(\tilde{C}_T).
\]
It is important to observe that the two steps are not interchangeable and are of entirely different natures. The first step prices in a nonlinear way, as opposed to the second step, which uses linear, arbitrage-free, valuation principles. In a sense, this is entirely justifiable: the unhedgeable risks are identified, isolated, and priced in the first step, and, thus, the remaining risks become hedgeable. One should then use a nonlinear valuation device for the unhedgeable risks and linear pricing for the hedgeable ones. A natural consequence of this is that risk preferences enter exclusively in the conditional certainty equivalent term, the only term related to unhedgeable risks. Both steps are generic and valid for any payoff.

ii) Pricing measure: One pricing measure is used throughout. Its essential role is not to alter the conditional distribution of risks, given the ones we can trade, from their respective historical values. Naturally, there is no dependence on the payoff. The most interesting part, however, is its independence of risk preferences. This universality is expected and quite pleasing. It follows from the way we identified the pricing measure, via (1.8), a selection criterion that is obviously independent of any risk attitude. Finally, the distorted payoff $\tilde{C}_T$ can be computed under both the historical and the pricing measure; indeed, we have
\[
\tilde{C}_T = \frac{1}{\gamma} \log E_{\mathbb{Q}}\left(e^{\gamma C_T} \mid S_T\right) = \frac{1}{\gamma} \log E_{\mathbb{P}}\left(e^{\gamma C_T} \mid S_T\right).
\]

The remainder of this section is dedicated to a comparison of our representation of the indifference prices and of the associated value functions with the well-known representations obtained by Rouge and El Karoui in [237] and Delbaen et al. in [81] (see also Kabanov and Stricker [146]). The technical arguments are not difficult and therefore the discussion is provided in a casual fashion. The conclusions are presented in Proposition 1.3.

In the aforementioned works, it has been established that the indifference price solves a stochastic optimization problem. The objective therein is to maximize, over all martingale measures, the expected payoff of the claim, reduced by a (relative) entropic penalty term (see (1.20) below). This representation is a direct result of the choice of exponential preferences and of the duality approach used on the primary expected utility problem. Details on the duality approach will be presented in Chapter 9 by Elliott and Van der Hoek.
A martingale measure that naturally arises in this analysis is the so-called minimal relative entropy measure, denoted, for the moment, by $\tilde{Q}$. It is defined as the minimizer of the relative entropy, namely,

$$H(\tilde{Q}|P) = \min_{Q\in\mathcal{Q}_e} H(Q|P), \tag{1.18}$$

where

$$H(Q|P) = E_P\left(\frac{dQ}{dP}\ln\frac{dQ}{dP}\right). \tag{1.19}$$
For an extensive study of this measure, we refer to the work of Frittelli [96, 97]. Under general model assumptions, the following result was established by Rouge and El Karoui in [237] and Delbaen et al. in [81].

Proposition 1.2 The indifference price $\nu(C_T)$ is given by

$$\nu(C_T) = \sup_{Q\in\mathcal{Q}_e}\left(E_Q(C_T) - \frac{1}{\gamma}\left(H(Q|P) - H(\tilde{Q}|P)\right)\right), \tag{1.20}$$
where $\mathcal{Q}_e$ is the set of martingale measures equivalent to $P$.

The above formula has several attractive features. It is valid for general models and arbitrary payoffs. The entropic penalty directly quantifies the effects of incompleteness on the prices. The formula also exposes the limiting behavior of the price as the investor becomes risk neutral, namely, as $\gamma$ converges to zero. Finally, it highlights, in an intuitively pleasing way, the monotonicity of the price in terms of risk preferences and its convergence to the arbitrage-free price as the market becomes complete.

This representation has, however, some shortcomings. It provides the price via a new optimization problem, a fact that does not allow for a universal analogue to its arbitrage-free counterpart. It also yields a pricing measure that has the undesirable feature of depending on the specific payoff. Moreover, the price formula (1.20) considerably obstructs the analysis and study of certain important aspects of indifference valuation, as, for example, its numeraire independence and its generalization when risk preferences become stochastic. It also provides limited intuition for the construction of the relevant risk monitoring strategies and the associated payoff decomposition formulas.

We start our comparative analysis by exploring the relation between the pricing measure $Q$ used in (1.9) and the minimal entropy measure $\tilde{Q}$ appearing in (1.20). We can readily see that the two measures coincide. In fact, consider the relative entropy (1.19) and look at its minimizers. For the simple model at hand, if $\hat{Q}$ is an arbitrary martingale measure defined by the elementary probabilities $\hat{q}_i$, $i = 1, \dots, 4$, then

$$H(\hat{Q}|P) = \sum_{i=1}^{4} \hat{q}_i \log\frac{\hat{q}_i}{p_i}.$$
Simple calculations yield that the minimizing elementary probabilities, say $\tilde{q}_i$, $i = 1, \dots, 4$, are given by

$$\tilde{q}_i = q\,\frac{p_i}{p_1 + p_2},\ i = 1, 2 \quad\text{and}\quad \tilde{q}_i = (1-q)\,\frac{p_i}{p_3 + p_4},\ i = 3, 4,$$

and, thus, they are equal to the $q_i$, $i = 1, \dots, 4$, of $Q$ (see (1.16)). Therefore,

$$H(\tilde{Q}|P) = H(Q|P) = \sum_{i=1}^{4} q_i \log\frac{q_i}{p_i}. \tag{1.21}$$
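The coincidence of the two measures can be checked numerically. The following Python sketch compares the relative entropy of the closed-form probabilities $\tilde{q}_i$ with that of randomly drawn martingale measures. The values of `p` and `q` are illustrative assumptions, not taken from the text; the sketch relies only on the fact that, in this four-state model, a measure is a martingale measure exactly when the two "up" states carry total mass $q$.

```python
import math
import random

def relative_entropy(qs, ps):
    """H(Q|P) = sum_i q_i * log(q_i / p_i), assuming every q_i > 0."""
    return sum(q * math.log(q / p) for q, p in zip(qs, ps))

# Illustrative inputs (assumptions, not values from the text).
p = [0.3, 0.2, 0.1, 0.4]   # historical probabilities p_1, ..., p_4
q = 0.45                   # risk-neutral mass of the "up" states omega_1, omega_2

# Closed-form minimizer stated in the text.
q_tilde = [
    q * p[0] / (p[0] + p[1]),
    q * p[1] / (p[0] + p[1]),
    (1 - q) * p[2] / (p[2] + p[3]),
    (1 - q) * p[3] / (p[2] + p[3]),
]
h_min = relative_entropy(q_tilde, p)

# Random equivalent martingale measures: split the masses q and 1 - q arbitrarily.
rng = random.Random(0)
smallest_gap = float("inf")
for _ in range(10_000):
    u, v = rng.uniform(0.001, 0.999), rng.uniform(0.001, 0.999)
    q_hat = [u * q, (1 - u) * q, v * (1 - q), (1 - v) * (1 - q)]
    smallest_gap = min(smallest_gap, relative_entropy(q_hat, p) - h_min)

print(f"H(Q~|P) = {h_min:.6f}, smallest entropy gap = {smallest_gap:.2e}")
```

No sampled measure should have entropy below `h_min`, up to floating-point rounding. A random search of this kind is only a plausibility check, not a proof of minimality.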
We observe that in the static incomplete model studied herein, there is an additional martingale measure, denoted by $\hat{Q}$, which coincides with $Q$. Specifically, $\hat{Q}$ is the minimal martingale measure, defined as the minimizer of

$$H(\hat{Q}|P) = \min_{Q\in\mathcal{Q}_e} E_P\left(-\ln\frac{dQ}{dP}\right).$$

This measure (see, among others, [249]) appears frequently in risk minimization in incomplete markets.

It is important to recall that, in general, the measures $\tilde{Q}$ and $\hat{Q}$ do not coincide. Herein, even though $\tilde{Q} \equiv \hat{Q}$, we formulate all relevant arguments in terms of the minimal relative entropy measure to preserve the connection with the existing work on indifference valuation under exponential criteria.

We also observe that the minimal relative entropy measure is not a maximizer in the pricing formula (1.20). Indeed, if this were the case, the indifference price would have been the expected value of the payoff under $\tilde{Q}$, an obviously incorrect conclusion. This can happen only if the market is complete, in which case the minimal relative entropy measure coincides with the unique risk-neutral one and the indifference price reduces to the arbitrage-free price.

We can use the above observations to deduce alternative formulas for the involved value functions (1.3). These representations, first produced by Delbaen et al. in [81], are interesting in their own right. As we will see in subsequent sections, they offer valuable insights for the specification of the dynamic risk preferences and are instrumental in the construction of indifference prices in more complex model environments. To this end, we first observe that

$$-\log\left[\left(\frac{p_1 + p_2}{q}\right)^{q}\left(\frac{p_3 + p_4}{1-q}\right)^{1-q}\right] = \sum_{i=1}^{4} q_i \log\frac{q_i}{p_i},$$

which, in view of (1.21), implies that the left-hand side represents the minimal relative entropy. Combining this with (1.13) yields the representation (1.22). This structural result is intuitively pleasing. It reflects how risk preferences are dynamically adjusted via the optimal investments. In fact, the value function $V^0$ is directly obtained from the terminal utility $U$ by a mere translation of the wealth argument. In a sense, the entropy $H(Q|P)$ represents the wealth value adjustment due to the magnitude of the opportunities offered in the market.
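The observation that the minimal relative entropy measure is not a maximizer in (1.20) can be illustrated with a small Python sketch. All numerical inputs are illustrative assumptions. The candidate maximizer used here, with $q_i$ proportional to $p_i e^{\gamma c_i}$ within each branch of $S_T$, is the standard exponentially tilted measure from the Gibbs variational principle; it is an assumption of this sketch and is not stated in the text.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
gamma = 0.7                   # risk aversion
p = [0.3, 0.2, 0.1, 0.4]      # historical probabilities
q = 0.45                      # risk-neutral mass of the "up" states
c = [2.0, 0.5, 1.0, 3.0]      # payoffs c_1, ..., c_4 (not a function of S_T alone)

def entropy(qs):
    """Relative entropy H(Q|P) as in (1.19) for the four-state model."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(qs, p))

def tilt(weights, mass):
    """Rescale positive weights so that they sum to the given mass."""
    total = sum(weights)
    return [mass * w / total for w in weights]

def penalized_value(qs, h_min):
    """Objective of (1.20): E_Q(C_T) - (H(Q|P) - H(Q~|P)) / gamma."""
    expected = sum(qi * ci for qi, ci in zip(qs, c))
    return expected - (entropy(qs) - h_min) / gamma

# Minimal relative entropy measure and its entropy.
q_tilde = tilt(p[:2], q) + tilt(p[2:], 1 - q)
h_min = entropy(q_tilde)

# Exponentially tilted candidate maximizer (assumption of this sketch).
w = [pi * math.exp(gamma * ci) for pi, ci in zip(p, c)]
q_star = tilt(w[:2], q) + tilt(w[2:], 1 - q)

# Closed-form indifference price: q- and (1 - q)-weighted conditional certainty equivalents.
price = (q * math.log((w[0] + w[1]) / (p[0] + p[1]))
         + (1 - q) * math.log((w[2] + w[3]) / (p[2] + p[3]))) / gamma

value_at_q_star = penalized_value(q_star, h_min)    # attains the price
value_at_q_tilde = penalized_value(q_tilde, h_min)  # equals E_Q(C_T), strictly smaller

print(price, value_at_q_star, value_at_q_tilde)
```

Under these assumptions the penalized objective evaluated at the tilted measure reproduces the closed-form price, while its value at $\tilde{Q}$ is just the linear expectation of the payoff.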
A similar representation can be derived for the value function $V^{C_T}$; see (1.23) below. It follows directly from the definitions of the indifference price and the value functions; see, respectively, (1.3) and (1.4). Formula (1.23) shows that $V^{C_T}$ can be obtained from the terminal utility through two wealth adjustments, one that is related to the indifference price and the other, already appearing in the absence of the claim, reflecting the magnitude of investment opportunities and market incompleteness. We summarize the above results below.

Proposition 1.3 Let $\nu(C_T)$ be the indifference price of the claim, $Q$ the pricing measure introduced in (1.8), and $H(Q|P)$ its associated relative entropy (cf. (1.21)).

i) The minimal relative entropy measure $\tilde{Q}$ satisfies

$$\tilde{Q} \equiv Q.$$

ii) The value functions $V^0$ and $V^{C_T}$ are represented, respectively, by

$$V^0(x) = -e^{-\gamma x - H(Q|P)} = U\left(x + \frac{1}{\gamma}H(Q|P)\right) \tag{1.22}$$

and

$$V^{C_T}(x) = -e^{-\gamma x - H(Q|P) + \gamma\nu(C_T)} = U\left(x + \frac{1}{\gamma}H(Q|P) - \nu(C_T)\right), \tag{1.23}$$

with $U$ as in (1.2).

iii) The indifference price satisfies

$$\nu(C_T) = \sup_{Q\in\mathcal{Q}_e}\left(E_Q(C_T) - \frac{1}{\gamma}\left(H(Q|P) - H(\tilde{Q}|P)\right)\right) = \mathcal{E}_Q(C_T),$$

where the nonlinear pricing functional $\mathcal{E}_Q$ is given by

$$\mathcal{E}_Q(C_T) = E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma C_T}|S_T)\right).$$

1.2.2 Properties of the Indifference Prices

The previous analysis produced the nonlinear price representation

$$\nu(C_T) = \mathcal{E}_Q(C_T) = E_Q(\tilde{C}_T),$$

where the preference adjusted payoff $\tilde{C}_T$ is the conditional certainty equivalent (cf. (1.17)),

$$\tilde{C}_T = \frac{1}{\gamma}\log E_Q(e^{\gamma C_T}|S_T),$$

and the pricing measure $Q$ is given by (1.8). This pricing formula yields a direct constitutive analogue to the linear pricing rule of the complete models. Our next task is to explore the structural properties of the indifference prices, their behavior with respect to various inputs, as well as their differences from and similarities to the arbitrage-free prices.
Throughout we occasionally adopt the notation $\nu(C_T; \gamma)$. This is done for clarity and it is omitted whenever there is no ambiguity. Moreover, to ease the analysis and the presentation, it is also assumed that $C_T \ge 0$ $P$-a.s. This assumption is easily removed at the expense of more tedious arguments.

i) Behavior with Respect to the Risk Aversion Coefficient

While risk preferences do not affect the arbitrage-free prices, owing to perfect risk replication, they represent an indispensable element of indifference prices. Indeed, the risk aversion coefficient $\gamma$ appears in the construction of $\tilde{C}_T$. It is through this conditional preference adjusted payoff that the indifference valuation mechanism extracts and valuates the underlying unhedgeable risks.

Proposition 1.4 The function

$$\gamma \mapsto \nu(C_T; \gamma) = E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma C_T}|S_T)\right)$$

from $\mathbb{R}^+$ into $\mathbb{R}$ is increasing and continuous. Moreover, if for all claims $C_T$ we have

$$\nu(C_T; \gamma) = \nu(C_T; 1), \tag{1.24}$$

then $\gamma = 1$.

Proof. Continuity follows directly from the formula and the properties of conditional expectation. To establish monotonicity, let us assume that $0 < \gamma_1 < \gamma_2$. Then, Hölder's inequality yields

$$E_Q(e^{\gamma_1 C_T}|S_T) \le \left(E_Q(e^{\gamma_2 C_T}|S_T)\right)^{\gamma_1/\gamma_2}$$

and, in turn,

$$\frac{1}{\gamma_1}\log E_Q(e^{\gamma_1 C_T}|S_T) \le \frac{1}{\gamma_2}\log E_Q(e^{\gamma_2 C_T}|S_T).$$

Taking expectation with respect to the pricing measure $Q$, we deduce the first statement. To establish the second assertion, we assume, without loss of generality, that $p_2 \ne 0$. Then, we consider claims of the form $C_T(\omega_1) = c_1$, $C_T(\omega_i) = 0$, $i = 2, 3, 4$, and we observe that (1.24) leads to

$$\frac{1}{\gamma}\log\frac{e^{\gamma c_1}p_1 + p_2}{p_1 + p_2} = \log\frac{e^{c_1}p_1 + p_2}{p_1 + p_2}$$

for all $c_1$. To conclude, it suffices to differentiate both sides with respect to $c_1$ and rearrange terms. □

Proposition 1.5 The following limiting relations hold:

$$\lim_{\gamma\to 0^+}\nu(C_T; \gamma) = E_Q(C_T), \tag{1.25}$$

$$\lim_{\gamma\to\infty}\nu(C_T; \gamma) = E_Q\left(\|C_T\|_{L^\infty_{Q\{\cdot|S_T\}}}\right). \tag{1.26}$$
Proof. To show (1.25), we pass to the limit, as $\gamma \to 0$, in the price formula (1.14),

$$\nu(C_T; \gamma) = q\,\frac{1}{\gamma}\log\frac{e^{\gamma c_1}p_1 + e^{\gamma c_2}p_2}{p_1 + p_2} + (1-q)\,\frac{1}{\gamma}\log\frac{e^{\gamma c_3}p_3 + e^{\gamma c_4}p_4}{p_3 + p_4}, \tag{1.27}$$

to obtain

$$\lim_{\gamma\to 0}\nu(C_T; \gamma) = q\left(\frac{p_1 c_1}{p_1 + p_2} + \frac{p_2 c_2}{p_1 + p_2}\right) + (1-q)\left(\frac{p_3 c_3}{p_3 + p_4} + \frac{p_4 c_4}{p_3 + p_4}\right).$$

On the other hand, by the properties of the pricing measure, we have

$$q\,\frac{p_i}{p_1 + p_2} = q_i,\ i = 1, 2 \quad\text{and}\quad (1-q)\,\frac{p_i}{p_3 + p_4} = q_i,\ i = 3, 4,$$

and, in turn,

$$\lim_{\gamma\to 0}\nu(C_T; \gamma) = \sum_{i=1}^{4} q_i c_i.$$

To establish (1.26), we pass to the limit as $\gamma \to \infty$ in (1.27). We readily get that

$$\lim_{\gamma\to\infty}\nu(C_T; \gamma) = q\max(c_1, c_2) + (1-q)\max(c_3, c_4),$$

and the statement follows. □
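The monotonicity in $\gamma$ and the two limits (1.25) and (1.26) can be observed numerically from the price formula (1.27). The Python sketch below evaluates the formula over a range of risk aversion levels; the inputs `p`, `q`, and `c` are illustrative assumptions, and the max-shift inside the logarithm is only a numerical device to avoid overflow for large $\gamma$.

```python
import math

def cond_certainty_equiv(c_a, c_b, p_a, p_b, gamma):
    """(1/gamma) * log E(e^{gamma C} | S_T) on one branch, computed stably."""
    m = max(c_a, c_b)
    s = (p_a * math.exp(gamma * (c_a - m))
         + p_b * math.exp(gamma * (c_b - m))) / (p_a + p_b)
    return m + math.log(s) / gamma

def indifference_price(c, p, q, gamma):
    """Price formula (1.27) for payoffs c = (c1, ..., c4)."""
    up = cond_certainty_equiv(c[0], c[1], p[0], p[1], gamma)
    down = cond_certainty_equiv(c[2], c[3], p[2], p[3], gamma)
    return q * up + (1 - q) * down

# Illustrative inputs (assumptions, not values from the text).
p = [0.3, 0.2, 0.1, 0.4]
q = 0.45
c = [2.0, 0.5, 1.0, 3.0]

gammas = [10.0 ** k for k in range(-4, 4)]   # 1e-4, ..., 1e3
prices = [indifference_price(c, p, q, g) for g in gammas]

# Limit (1.25): linear expectation under the pricing measure.
risk_neutral = (q * (p[0] * c[0] + p[1] * c[1]) / (p[0] + p[1])
                + (1 - q) * (p[2] * c[2] + p[3] * c[3]) / (p[2] + p[3]))
# Limit (1.26): expected conditional worst case.
worst_case = q * max(c[0], c[1]) + (1 - q) * max(c[2], c[3])

for g, v in zip(gammas, prices):
    print(f"gamma = {g:8.4f}  price = {v:.6f}")
print(f"limits: {risk_neutral:.6f} (gamma -> 0), {worst_case:.6f} (gamma -> infinity)")
```

The printed prices should increase with $\gamma$ and stay between the two limits.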
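Proposition 1.6 The indifference price satisfies

$$\lim_{\gamma\to 0}\frac{\partial\nu(C_T; \gamma)}{\partial\gamma} = \frac{1}{2}E_Q\left(\operatorname{Var}_Q(C_T|S_T)\right), \tag{1.28}$$

and thus,

$$\nu(C_T; \gamma) = E_Q(C_T) + \frac{1}{2}\gamma E_Q\left(\operatorname{Var}_Q(C_T|S_T)\right) + o(\gamma). \tag{1.29}$$

Proof. We only show (1.28), since (1.29) is an easy consequence. We first differentiate $\nu(C_T; \gamma)$ with respect to $\gamma$, obtaining

$$\frac{\partial\nu(C_T; \gamma)}{\partial\gamma} = E_Q\left(-\frac{1}{\gamma^2}\log E_Q(e^{\gamma C_T}|S_T) + \frac{1}{\gamma}\frac{E_Q(C_T e^{\gamma C_T}|S_T)}{E_Q(e^{\gamma C_T}|S_T)}\right) = \frac{1}{\gamma}\left(E_Q\left(\frac{E_Q(C_T e^{\gamma C_T}|S_T)}{E_Q(e^{\gamma C_T}|S_T)}\right) - \nu(C_T; \gamma)\right).$$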
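Therefore,

$$\lim_{\gamma\to 0}\frac{\partial\nu(C_T; \gamma)}{\partial\gamma} = \lim_{\gamma\to 0}\frac{1}{\gamma}\left(E_Q\left(\frac{E_Q(C_T e^{\gamma C_T}|S_T)}{E_Q(e^{\gamma C_T}|S_T)}\right) - \nu(C_T; \gamma)\right)$$

$$= \lim_{\gamma\to 0}E_Q\left(\frac{E_Q(C_T^2 e^{\gamma C_T}|S_T)\,E_Q(e^{\gamma C_T}|S_T) - \left(E_Q(C_T e^{\gamma C_T}|S_T)\right)^2}{\left(E_Q(e^{\gamma C_T}|S_T)\right)^2}\right) - \lim_{\gamma\to 0}\frac{\partial\nu(C_T; \gamma)}{\partial\gamma},$$

and, thus,

$$\lim_{\gamma\to 0}\frac{\partial\nu(C_T; \gamma)}{\partial\gamma} = \frac{1}{2}E_Q\left(E_Q(C_T^2|S_T) - \left(E_Q(C_T|S_T)\right)^2\right).$$

□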
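The first-order approximation (1.29) can be compared with the exact price for small $\gamma$. In the Python sketch below, the inputs are illustrative assumptions; the conditional means and variances are computed with the weights $p_i/(p_1+p_2)$ and $p_i/(p_3+p_4)$, which is legitimate because the pricing measure preserves the conditional distribution given $S_T$.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
p = [0.3, 0.2, 0.1, 0.4]
q = 0.45
c = [2.0, 0.5, 1.0, 3.0]

def branch_stats(c_a, c_b, p_a, p_b):
    """Conditional mean and variance of the payoff on one branch of S_T."""
    w_a, w_b = p_a / (p_a + p_b), p_b / (p_a + p_b)
    mean = w_a * c_a + w_b * c_b
    var = w_a * c_a ** 2 + w_b * c_b ** 2 - mean ** 2
    return mean, var

def price(gamma):
    """Exact price from formula (1.27)."""
    up = math.log((p[0] * math.exp(gamma * c[0]) + p[1] * math.exp(gamma * c[1]))
                  / (p[0] + p[1]))
    down = math.log((p[2] * math.exp(gamma * c[2]) + p[3] * math.exp(gamma * c[3]))
                    / (p[2] + p[3]))
    return (q * up + (1 - q) * down) / gamma

mean_u, var_u = branch_stats(c[0], c[1], p[0], p[1])
mean_d, var_d = branch_stats(c[2], c[3], p[2], p[3])
linear_price = q * mean_u + (1 - q) * mean_d          # E_Q(C_T)
expected_cond_var = q * var_u + (1 - q) * var_d       # E_Q(Var_Q(C_T | S_T))

def first_order(gamma):
    """Right-hand side of (1.29) without the o(gamma) term."""
    return linear_price + 0.5 * gamma * expected_cond_var

gammas = [1e-1, 1e-2, 1e-3]
errors = [abs(price(g) - first_order(g)) for g in gammas]
for g, e in zip(gammas, errors):
    print(f"gamma = {g:.0e}  error = {e:.3e}  error / gamma = {e / g:.3e}")
```

The ratio error/gamma should shrink as $\gamma$ decreases, which is what the $o(\gamma)$ remainder in (1.29) asserts.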
Proposition 1.7 The indifference price is consistent with the no-arbitrage principle, namely, for all $\gamma > 0$,

$$\inf_{Q\in\mathcal{Q}_e}E_Q(C_T) \le \nu(C_T; \gamma) \le \sup_{Q\in\mathcal{Q}_e}E_Q(C_T), \tag{1.30}$$

where $\mathcal{Q}_e$ is the set of martingale measures that are equivalent to $P$.

Proof. We assume, without loss of generality, that $c_1 < c_2$ and that $c_3 < c_4$. The monotonicity of the price with respect to risk aversion implies

$$\lim_{\gamma\to 0}\nu(C_T; \gamma) \le \nu(C_T; \gamma) \le \lim_{\gamma\to\infty}\nu(C_T; \gamma)$$

and, in turn, that

$$E_Q(C_T) \le \nu(C_T; \gamma) \le E_Q\left(\|C_T\|_{L^\infty_{Q\{\cdot|S_T\}}}\right).$$

Taking the infimum over all martingale measures yields

$$\inf_{Q\in\mathcal{Q}_e}E_Q(C_T) \le E_Q(C_T) \le \nu(C_T; \gamma),$$

and the left-hand side of (1.30) follows. We next observe that

$$E_Q\left(\|C_T\|_{L^\infty_{Q\{\cdot|S_T\}}}\right) = E_{\bar{Q}}(C_T),$$

where the martingale measure $\bar{Q}$ has elementary probabilities

$$\bar{Q}\{\omega_1\} = 0,\quad \bar{Q}\{\omega_2\} = q,\quad \bar{Q}\{\omega_3\} = 0,\quad \bar{Q}\{\omega_4\} = 1 - q.$$

Observing that

$$\nu(C_T; \gamma) \le E_{\bar{Q}}(C_T) \le \sup_{Q\in\mathcal{Q}_e}E_Q(C_T),$$

we conclude. □
ii) Behavior with Respect to Payoffs

We first explore the monotonicity, convexity, and scaling behavior of the indifference prices. We note that in the next two propositions, all inequalities among payoffs and their prices hold both under the historical and the pricing measures $P$ and $Q$. Since these two measures are equivalent, we skip any measure-specific notation for ease of presentation.

Proposition 1.8 The indifference price is a nondecreasing and convex function of the claim's payoff, namely, if $C_T^1 \le C_T^2$ then

$$\nu(C_T^1) \le \nu(C_T^2), \tag{1.31}$$

and, for $\alpha \in (0, 1)$,

$$\nu(\alpha C_T^1 + (1-\alpha)C_T^2) \le \alpha\nu(C_T^1) + (1-\alpha)\nu(C_T^2). \tag{1.32}$$

Proof. Inequality (1.31) follows directly from formula (1.9). To establish (1.32), we apply Hölder's inequality to obtain

$$E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma(\alpha C_T^1 + (1-\alpha)C_T^2)}|S_T)\right) \le E_Q\left(\frac{1}{\gamma}\log\left(\left(E_Q(e^{\gamma C_T^1}|S_T)\right)^{\alpha}\left(E_Q(e^{\gamma C_T^2}|S_T)\right)^{1-\alpha}\right)\right)$$

$$= \alpha E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma C_T^1}|S_T)\right) + (1-\alpha)E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma C_T^2}|S_T)\right),$$

and the result follows. □

Proposition 1.9 The indifference price satisfies

$$\nu(\alpha C_T) \le \alpha\nu(C_T) \quad\text{for } \alpha \in (0, 1) \tag{1.33}$$

and

$$\nu(\alpha C_T) \ge \alpha\nu(C_T) \quad\text{for } \alpha \ge 1. \tag{1.34}$$

Proof. To show (1.33) we observe

$$\nu(\alpha C_T) = E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma\alpha C_T}|S_T)\right) = \alpha E_Q\left(\frac{1}{\bar{\gamma}}\log E_Q(e^{\bar{\gamma}C_T}|S_T)\right),$$

where, for $\alpha \in (0, 1)$, $\bar{\gamma} = \alpha\gamma < \gamma$. Using the monotonicity of the price with respect to risk aversion, we conclude. Inequality (1.34) follows by the same argument. □

The following result highlights an important property of the indifference price operator. We see that any hedgeable risk is scaled out from the nonlinear part of the
pricing rule, and it is priced directly by arbitrage. Hedgeable risks do not differ from their conditional certainty equivalent payoffs. In this sense, we say that the pricing operator has the property of additive³ invariance with respect to hedgeable risks. Note that this property is stronger, and more intuitive, than requiring mere translation invariance with respect to constant risks.

Proposition 1.10 The indifference pricing operator is additively invariant with respect to hedgeable risks, namely, if $C_T = C_T^1 + C_T^2$, with $C_T^1 = C^1(S_T)$ and $C_T^2 = C^2(S_T, Y_T)$, then

$$\nu(C_T) = \mathcal{E}_Q\left(C^1(S_T) + C^2(S_T, Y_T)\right) = E_Q(C^1(S_T)) + \nu(C^2(S_T, Y_T)). \tag{1.35}$$

Proof. The price formula (1.9), together with the measurability properties of $C^1(S_T)$, yields

$$\nu(C_T) = E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma(C^1(S_T) + C^2(S_T, Y_T))}|S_T)\right) = E_Q(C^1(S_T)) + E_Q\left(\frac{1}{\gamma}\log E_Q(e^{\gamma C^2(S_T, Y_T)}|S_T)\right) = E_Q(C_T^1) + \nu(C_T^2).$$

□
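Additive invariance with respect to hedgeable risks can be illustrated in the four-state model. In the Python sketch below, the stock values, the strike, and the payoffs are illustrative assumptions: the hedgeable part is a call on $S_T$, and the second component depends on the state within each branch of $S_T$.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
gamma = 0.7
p = [0.3, 0.2, 0.1, 0.4]
S0, Su, Sd = 103.5, 120.0, 90.0
S_T = [Su, Su, Sd, Sd]            # stock value in states omega_1, ..., omega_4
q = (S0 - Sd) / (Su - Sd)         # risk-neutral mass of the "up" states

def price(c):
    """Indifference price: q-weighted conditional certainty equivalents."""
    up = math.log((p[0] * math.exp(gamma * c[0]) + p[1] * math.exp(gamma * c[1]))
                  / (p[0] + p[1]))
    down = math.log((p[2] * math.exp(gamma * c[2]) + p[3] * math.exp(gamma * c[3]))
                    / (p[2] + p[3]))
    return (q * up + (1 - q) * down) / gamma

hedgeable = [max(s - 100.0, 0.0) for s in S_T]   # C^1(S_T): call struck at 100
unhedgeable = [2.0, 0.5, 1.0, 3.0]               # C^2(S_T, Y_T)
total = [a + b for a, b in zip(hedgeable, unhedgeable)]

# Elementary probabilities of the pricing measure Q.
q_i = [q * p[0] / (p[0] + p[1]), q * p[1] / (p[0] + p[1]),
       (1 - q) * p[2] / (p[2] + p[3]), (1 - q) * p[3] / (p[2] + p[3])]
arbitrage_free_part = sum(qi * ci for qi, ci in zip(q_i, hedgeable))

lhs = price(total)                                # nu(C^1 + C^2)
rhs = arbitrage_free_part + price(unhedgeable)    # E_Q(C^1) + nu(C^2)
print(lhs, rhs)
```

Under these assumptions the two sides agree up to rounding, so the call component contributes only its arbitrage-free value.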
The above property yields the following conclusions for two extreme cases.

Special Cases:

i) Let $C_T = C^1(S_T) + C^2(S_T, Y_T)$ with $Y_T$ depending functionally on $S_T$. The payoff $C_T^2$ is then $\mathcal{F}_T^S$-measurable and, therefore,

$$C^2(S_T, Y_T) = \tilde{C}^2(S_T, Y_T).$$

Combining the above with (1.35) implies

$$\nu(C_T) = E_Q(C^1(S_T)) + \nu(C^2(S_T, Y_T)) = E_Q(C^1(S_T)) + E_Q(C^2(S_T, Y_T)).$$

The indifference price simplifies to the arbitrage-free one, and the pricing measure coincides with the unique risk-neutral measure.

ii) Let $C_T = C^1(S_T) + C^2(Y_T)$ with $Y_T$ and $S_T$ independent under $P$. Then,

$$\tilde{C}^2(Y_T) = \frac{1}{\gamma}\log E_Q(e^{\gamma C^2(Y_T)}|S_T) = \frac{1}{\gamma}\log E_P(e^{\gamma C^2(Y_T)}).$$

The indifference price of $C_T$ consists of the arbitrage-free price of the first claim plus the traditional actuarial certainty equivalent price of the second,

$$\nu(C_T) = E_Q(C^1(S_T)) + \frac{1}{\gamma}\log E_P(e^{\gamma C^2(Y_T)}).$$

³ The authors would like to thank Patrick Cheridito for suggesting this terminology.
We finish this section by presenting the link between the indifference pricing functional $\nu$ and the so-called convex risk measures (see, among others, [89], [90], and [208]).

Definition 1.2 The mapping $\rho : \mathcal{F}_T \to \mathbb{R}$ is called a convex risk measure if it satisfies the following conditions for all $C_T^1, C_T^2 \in \mathcal{F}_T$.

• Convexity: $\rho(\alpha C_T^1 + (1-\alpha)C_T^2) \le \alpha\rho(C_T^1) + (1-\alpha)\rho(C_T^2)$, $0 \le \alpha \le 1$.
• Monotonicity: If $C_T^1 \le C_T^2$, then $\rho(C_T^1) \ge \rho(C_T^2)$.
• Translation invariance: If $m \in \mathbb{R}$, then $\rho(C_T^1 + m) = \rho(C_T^1) - m$.

For any $C_T \in \mathcal{F}_T$ define

$$\rho(C_T) = \nu(-C_T) = E_Q\left(\frac{1}{\gamma}\log E_Q(e^{-\gamma C_T}|S_T)\right). \tag{1.36}$$

Proposition 1.11 The mapping $\rho$ given in (1.36) defines a convex risk measure.

Proof. All conditions follow trivially from the properties of the indifference price discussed earlier. □

Note that the number $\nu(C_T)$ represents the indifference value of the payoff $C_T$, while the number $\rho(C_T) = \nu(-C_T)$ is usually interpreted as a capital requirement imposed by a supervising body for accepting the position $C_T$. It is interesting to observe that the concept of indifference value, deduced from the desire to behave optimally as an investor, is in the above sense consistent with a method that may be used to determine the capital amount required for a position to be acceptable to a supervising body.

1.2.3 Risk Monitoring Strategies

We now turn our attention to the important issue of managing the risk generated by the derivative contract. In complete markets, the payoff is reproduced by the associated self-financing and replicating portfolio. Consequently, any risk associated with the claim is eliminated. For the model at hand, any $\mathcal{F}_T^S$-measurable claim $C_T$ is replicable and the familiar representation formula

$$C_T = \nu(C_T) + \frac{\partial\nu(C_T)}{\partial S_0}(S_T - S_0) \tag{1.37}$$

holds, with

$$\nu(C_T) = E_Q(C_T) \quad\text{and}\quad \frac{\partial\nu(C_T)}{\partial S_0} = \frac{\partial E_Q(C_T)}{\partial S_0}. \tag{1.38}$$

The indifference price coincides with the arbitrage-free price, and its spatial derivative yields the so-called delta. When the market is incomplete, however, perfect replication is not viable and a payoff representation similar to the above cannot be obtained. However, one may still seek a constitutive analogue to (1.37).
We recall that the indifference price was produced via comparison of the optimal investment decisions with and without the claim under consideration. We should therefore base our study on the analysis of the relevant optimal portfolios, and the relation between the indifference price and the optimal wealth levels they generate. We start with an auxiliary structural result for the optimal policies of the underlying maximal expected utility problems (1.3).

Proposition 1.12 Let $\nu(C_T)$ be the indifference price of the claim $C_T$ and $H(Q|P)$ as in (1.21). The optimal number of shares $\alpha^{C_T,*}$ in the optimal investment problem (1.3) is given by

$$\alpha^{C_T,*} = \alpha^{0,*} + \frac{\partial\nu(C_T)}{\partial S_0}, \tag{1.39}$$

where

$$\alpha^{0,*} = -\frac{1}{\gamma}\frac{\partial H(Q|P)}{\partial S_0} \tag{1.40}$$

represents the number of shares held optimally in the absence of the claim. Both optimal controls $\alpha^{C_T,*}$ and $\alpha^{0,*}$ are wealth independent.

Proof. We first recall that $\alpha^{C_T,*}$ was provided in (1.10), rewritten below for convenience:

$$\alpha^{C_T,*} = \frac{1}{\gamma S_0(\xi^u - \xi^d)}\log\frac{(\xi^u - 1)(p_1 + p_2)}{(1 - \xi^d)(p_3 + p_4)} + \frac{1}{\gamma S_0(\xi^u - \xi^d)}\log\frac{(e^{\gamma c_1}p_1 + e^{\gamma c_2}p_2)(p_3 + p_4)}{(e^{\gamma c_3}p_3 + e^{\gamma c_4}p_4)(p_1 + p_2)}.$$

When the claim is not taken into account, one can easily deduce, by setting $c_i = 0$, $i = 1, \dots, 4$, above, that the corresponding optimal policy $\alpha^{0,*}$ equals

$$\alpha^{0,*} = \frac{1}{\gamma S_0(\xi^u - \xi^d)}\log\frac{(\xi^u - 1)(p_1 + p_2)}{(1 - \xi^d)(p_3 + p_4)} = \frac{1}{\gamma(S^u - S^d)}\log\frac{(1-q)(p_1 + p_2)}{q(p_3 + p_4)}. \tag{1.41}$$

Using that

$$\frac{\partial q}{\partial S_0} = \frac{\partial}{\partial S_0}\left(\frac{S_0 - S^d}{S^u - S^d}\right) = \frac{1}{S^u - S^d}$$

and differentiating the entropy expression (1.21) gives

$$\frac{\partial H(Q|P)}{\partial S_0} = -\log\frac{(1-q)(p_1 + p_2)}{q(p_3 + p_4)}\,\frac{\partial q}{\partial S_0}.$$

Differentiating, in turn, the price formula (1.9) gives

$$\frac{\partial\nu(C_T)}{\partial S_0} = \frac{1}{\gamma S_0(\xi^u - \xi^d)}\log\frac{(e^{\gamma c_1}p_1 + e^{\gamma c_2}p_2)(p_3 + p_4)}{(e^{\gamma c_3}p_3 + e^{\gamma c_4}p_4)(p_1 + p_2)},$$

which, combined with the expressions for $\alpha^{C_T,*}$ and $\alpha^{0,*}$, yields (1.39). □
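Relations (1.39) and (1.40) can be verified numerically without using the closed-form policies. The Python sketch below, with illustrative (assumed) inputs, solves the first-order condition of the expected utility problem by bisection and compares the result with finite-difference derivatives of the price and of the entropy in $S_0$. As in the proof, the terminal values $S^u$ and $S^d$ and the payoffs are held fixed when $S_0$ is perturbed; the bracketing interval and step size are arbitrary choices of this sketch.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
gamma = 0.5
p = [0.3, 0.2, 0.1, 0.4]
c = [2.0, 0.5, 1.0, 3.0]
S0, Su, Sd = 100.0, 120.0, 90.0

def price(payoff, s0):
    """Indifference price as a function of S_0 (S^u and S^d held fixed)."""
    q = (s0 - Sd) / (Su - Sd)
    up = math.log((p[0] * math.exp(gamma * payoff[0]) + p[1] * math.exp(gamma * payoff[1]))
                  / (p[0] + p[1]))
    down = math.log((p[2] * math.exp(gamma * payoff[2]) + p[3] * math.exp(gamma * payoff[3]))
                    / (p[2] + p[3]))
    return (q * up + (1 - q) * down) / gamma

def entropy(s0):
    """Relative entropy H(Q|P) of (1.21) as a function of S_0."""
    q = (s0 - Sd) / (Su - Sd)
    return (q * math.log(q / (p[0] + p[1]))
            + (1 - q) * math.log((1 - q) / (p[2] + p[3])))

def optimal_shares(payoff):
    """Maximize E_P[-exp(-gamma (x + alpha (S_T - S_0) - C_T))] over alpha.

    The derivative in alpha, up to a positive factor, is decreasing,
    so its root is located by bisection; x drops out of the condition.
    """
    s_T = [Su, Su, Sd, Sd]

    def slope(alpha):
        return sum(pi * (s - S0) * math.exp(-gamma * (alpha * (s - S0) - ci))
                   for pi, s, ci in zip(p, s_T, payoff))

    lo, hi = -5.0, 5.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = 1e-3
delta = (price(c, S0 + h) - price(c, S0 - h)) / (2 * h)    # d nu / d S_0
entropy_slope = (entropy(S0 + h) - entropy(S0 - h)) / (2 * h)

alpha_0 = optimal_shares([0.0, 0.0, 0.0, 0.0])
alpha_c = optimal_shares(c)
print(alpha_c, alpha_0 + delta, alpha_0, -entropy_slope / gamma)
```

The first two printed numbers should agree, as should the last two.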
Next, we consider the optimal wealth variables $X^{C_T,*}$ and $X^{0,*}$ representing, respectively, the agent's optimal wealth with and without the claim. In the first case, the agent starts with initial wealth $x + \nu(C_T)$ and buys $\alpha^{C_T,*}$ shares of stock. If the claim is not taken into account, the investor starts with $x$ and follows the strategy $\alpha^{0,*}$. In other words,

$$X_T^{C_T,*} = x + \nu(C_T) + \alpha^{C_T,*}(S_T - S_0) \tag{1.42}$$

and

$$X_T^{0,*} = x + \alpha^{0,*}(S_T - S_0). \tag{1.43}$$

We now introduce two important quantities that will help us produce a meaningful decomposition of the claim's payoff. They are the residual optimal wealth and the residual risk, denoted respectively by $L$ and $R$. These notions were first introduced by the authors in Musiela and Zariphopoulou [195, 196]; see also [198].

Definition 1.3 The residual optimal wealth process is defined as

$$L_t = X_t^{C_T,*} - X_t^{0,*} \quad\text{for } t = 0, T. \tag{1.44}$$

In a complete model environment, the residual optimal wealth coincides with the value of the perfectly replicating portfolio. It is therefore a martingale under the unique risk-neutral measure, and it generates the claim's payoff at expiration. When the market is incomplete, however, several interesting observations can be made. The residual terminal optimal wealth $L_T$ reproduces the claim only partially. In addition, it is an $\mathcal{F}_T^S$-measurable variable and retains its martingale property under all martingale measures. Its most important property, however, is that it coincides with the conditional certainty equivalent. This fact will play an instrumental role in two directions, namely, in the identification of the replicable part of the claim and in the specification of the risk monitoring policy.

Proposition 1.13 The residual optimal wealth process satisfies

$$L_0 = \nu(C_T) \tag{1.45}$$

and

$$L_T = \nu(C_T) + \frac{\partial\nu(C_T)}{\partial S_0}(S_T - S_0). \tag{1.46}$$

Moreover, $L_T$ coincides with the conditional certainty equivalent,

$$L_T = \tilde{C}_T. \tag{1.47}$$

Finally, the process $L_t$ is a martingale under all equivalent martingale measures,

$$E_Q(L_T) = L_0 = \nu(C_T) \quad\text{for } Q \in \mathcal{Q}_e. \tag{1.48}$$

Proof. Assertions (1.45) and (1.46) follow easily from Definition 1.3, the optimal wealth representations (1.42) and (1.43), and the relation (1.39) between the optimal policies.
To show (1.47), we first recall that

$$\nu(C_T) = E_Q(\tilde{C}_T), \tag{1.49}$$

which, in view of (1.46), yields

$$L_T = E_Q(\tilde{C}_T) + \frac{\partial E_Q(\tilde{C}_T)}{\partial S_0}(S_T - S_0).$$

The claim $\tilde{C}_T$, however, is $\mathcal{F}_T^S$-measurable and, thus, replicable. Its arbitrage-free decomposition is

$$\tilde{C}_T = E_Q(\tilde{C}_T) + \frac{\partial E_Q(\tilde{C}_T)}{\partial S_0}(S_T - S_0),$$

and the identity (1.47) follows. The martingale property (1.48) is an easy consequence of (1.46). □

Definition 1.4 The indifference price process $\nu_t(C_T)$, $t = 0, T$, is defined as

$$\nu_0(C_T) = \nu(C_T) \quad\text{and}\quad \nu_T(C_T) = C_T. \tag{1.50}$$

Definition 1.5 The residual risk process $R_t$, $t = 0, T$, is defined as the difference between the indifference price and the residual optimal wealth, namely,

$$R_t = \nu_t(C_T) - L_t, \quad\text{i.e.,}\quad R_0 = \nu(C_T) - L_0 \quad\text{and}\quad R_T = C_T - L_T. \tag{1.51}$$
If perfect replication is viable, the residual risk is zero throughout and its notion degenerates. In general, it represents the component of the claim that is not replicable, given that risks which can be hedged have been already extracted optimally according to our utility criteria. As such, the residual risk should not generate any additional conditional certainty equivalent part nor should it, in consequence, acquire any additional indifference value.

Proposition 1.14 The residual risk process has the following properties:

i) It satisfies

$$R_0 = 0 \tag{1.52}$$

and

$$R_T = C_T - \tilde{C}_T. \tag{1.53}$$

ii) Its conditional certainty equivalent is zero,

$$\tilde{R}_T = 0. \tag{1.54}$$

iii) Its indifference price is zero,

$$\nu(R_T) = 0. \tag{1.55}$$
iv) It is a supermartingale under the pricing measure $Q$,

$$E_Q(R_T) \le R_0 = 0. \tag{1.56}$$

v) Its expected, under the historical measure $P$, certainty equivalent is zero,

$$\frac{1}{\gamma}\log E_P(e^{\gamma R_T}) = 0. \tag{1.57}$$

Proof. Part (i) follows readily from the definition of the residual risk and the properties of $L_t$, $t = 0, T$. To show (ii), we apply directly the definition of the conditional certainty equivalent. This, together with the measurability of $\tilde{C}_T$, yields

$$\tilde{R}_T = \frac{1}{\gamma}\log E_Q(e^{\gamma(C_T - \tilde{C}_T)}|S_T) = \frac{1}{\gamma}\log E_Q(e^{\gamma C_T}|S_T) - \tilde{C}_T = \tilde{C}_T - \tilde{C}_T = 0.$$

Parts (iii) and (iv) are immediate consequences of (1.49), (1.53), and (1.54). To establish (1.57), we recall that

$$\tilde{R}_T = \frac{1}{\gamma}\log E_Q(e^{\gamma R_T}|S_T) = \frac{1}{\gamma}\log E_P(e^{\gamma R_T}|S_T),$$

where we used (1.8). Using (1.54) and taking the expectation under $P$ yields the result. □

Being a supermartingale, the residual risk can be decomposed according to the Doob-Meyer decomposition. The related components can be easily retrieved and are presented below.

Proposition 1.15 The supermartingale $R_t$, for $t = 0, T$, admits the Doob-Meyer decomposition

$$R_t = R_t^m + R_t^d,$$

where

$$R_0^m = 0 \quad\text{and}\quad R_T^m = R_T - E_Q(R_T), \tag{1.58}$$

and

$$R_0^d = 0 \quad\text{and}\quad R_T^d = E_Q(R_T). \tag{1.59}$$

The component $R_t^m$ is an $\mathcal{F}_T^{(S,Y)}$-martingale under $Q$, while $R_t^d$ is decreasing and adapted to the trivial filtration $\mathcal{F}_0^{(S,Y)}$.

We are now ready to provide the payoff decomposition result. This result is central in the study of risks associated with the indifference valuation method since it provides in a direct manner the constitutive analogue of the arbitrage-free payoff decomposition (1.37).
Theorem 1.6 Let $\tilde{C}_T$ and $R_T$ be, respectively, the conditional certainty equivalent and the residual risk associated with the claim $C_T$. Let also $R_t^m$ and $R_t^d$ be the elements of the Doob-Meyer decomposition (1.58) and (1.59). Define the process $M_t^{\tilde{C}}$, for $t = 0, T$, by

$$M_0^{\tilde{C}} = \nu(C_T) \quad\text{and}\quad M_T^{\tilde{C}} = \nu(C_T) + \frac{\partial\nu(C_T)}{\partial S_0}(S_T - S_0). \tag{1.60}$$

i) The claim $C_T$ admits the unique, under $Q$, payoff decomposition

$$C_T = \tilde{C}_T + R_T = \nu(C_T) + \frac{\partial\nu(C_T)}{\partial S_0}(S_T - S_0) + R_T = M_T^{\tilde{C}} + R_T^m + R_T^d. \tag{1.61}$$

ii) The indifference price process $\nu_t$, defined in (1.50), is an $\mathcal{F}_T^{(S,Y)}$-supermartingale under $Q$. It admits the unique decomposition

$$\nu_t(C_T) = M_t + R_t^d, \tag{1.62}$$

where

$$M_t = M_t^{\tilde{C}} + R_t^m.$$

The components $M_t$ and $R_t^d$ represent, respectively, the associated martingale and the non-increasing parts of the price process $\nu_t$.

From the application point of view, one may think of $R_T$ and its moments as natural variables for the quantification of errors associated with the risk monitoring policy. As the proposition below shows, the expected error takes a rather intuitive form. It is proportional to the risk aversion and to the expected conditional variance of the nontraded risks. Naturally, both the expectation and the conditional variance need to be considered under the pricing measure $Q$.

Proposition 1.16 The expected residual risk satisfies

$$E_Q(R_T) = -\frac{1}{2}\gamma E_Q\left(\operatorname{Var}_Q(C_T|S_T)\right) + o(\gamma)$$

and

$$E_Q(R_T) = -\frac{1}{2}\gamma E_Q\left(\operatorname{Var}_Q(R_T|S_T)\right) + o(\gamma).$$

Proof. The proof follows from (1.53), yielding $E_Q(R_T) = E_Q(C_T) - E_Q(\tilde{C}_T)$, and the approximation formula (1.29). The second equality is obvious. □
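The payoff decomposition (1.61) and the properties of the residual risk listed in Proposition 1.14 can be traced state by state. The following Python sketch uses illustrative (assumed) inputs and computes the delta as $(\tilde{C}_T^u - \tilde{C}_T^d)/(S^u - S^d)$, which is the derivative of the price in $S_0$ when $S^u$, $S^d$, and the payoffs are held fixed.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
gamma = 0.5
p = [0.3, 0.2, 0.1, 0.4]
c = [2.0, 0.5, 1.0, 3.0]
S0, Su, Sd = 100.0, 120.0, 90.0
S_T = [Su, Su, Sd, Sd]
q = (S0 - Sd) / (Su - Sd)
q_i = [q * p[0] / (p[0] + p[1]), q * p[1] / (p[0] + p[1]),
       (1 - q) * p[2] / (p[2] + p[3]), (1 - q) * p[3] / (p[2] + p[3])]

def cond_ce(payoff):
    """Conditional certainty equivalent, returned as one value per state."""
    up = math.log((p[0] * math.exp(gamma * payoff[0]) + p[1] * math.exp(gamma * payoff[1]))
                  / (p[0] + p[1])) / gamma
    down = math.log((p[2] * math.exp(gamma * payoff[2]) + p[3] * math.exp(gamma * payoff[3]))
                    / (p[2] + p[3])) / gamma
    return [up, up, down, down]

c_tilde = cond_ce(c)
price = sum(qi * ct for qi, ct in zip(q_i, c_tilde))       # nu(C_T) = E_Q(C~_T)
delta = (c_tilde[0] - c_tilde[2]) / (Su - Sd)              # d nu / d S_0

hedge_value = [price + delta * (s - S0) for s in S_T]      # M_T^{C~} in (1.60)
residual = [ci - hv for ci, hv in zip(c, hedge_value)]     # R_T = C_T - L_T

expected_residual = sum(qi * ri for qi, ri in zip(q_i, residual))   # R_T^d
martingale_part = [ri - expected_residual for ri in residual]       # R_T^m
ce_under_p = math.log(sum(pi * math.exp(gamma * ri)
                          for pi, ri in zip(p, residual))) / gamma

print("hedge value:", hedge_value)
print("residual risk:", residual)
print("E_Q(R_T):", expected_residual, " certainty equivalent under P:", ce_under_p)
```

Under these assumptions the hedge value reproduces the conditional certainty equivalent, the expected residual under $Q$ is negative, and the certainty equivalent of the residual under $P$ vanishes up to rounding.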
1.2.4 Relative Indifference Prices

From the previous analysis, we can deduce that the indifference price is not a linear function of the claim's payoff, namely, for $\alpha \ne 0, 1$,

$$\nu(\alpha C_T) \ne \alpha\nu(C_T). \tag{1.63}$$

Indeed, as was established in Proposition 1.9, if $\alpha > 1$ (resp. $\alpha < 1$), the indifference price is a superhomogeneous (resp. subhomogeneous) function of $C_T$. Following simple arguments, we easily conclude that if two payoffs, say $C_T^1$ and $C_T^2$, are considered, the indifference price functional is nonadditive, namely,

$$\nu(C_T^1 + C_T^2) \ne \nu(C_T^1) + \nu(C_T^2). \tag{1.64}$$

Extending these arguments to the case of multiple payoffs, we obtain that for $N$ payoffs, with $N > 1$,

$$\nu\left(\sum_{i=1}^{N} C_T^i\right) \ne \sum_{i=1}^{N}\nu(C_T^i). \tag{1.65}$$

The nonadditive behavior of indifference prices is a direct consequence of the nonlinear character of the indifference valuation mechanism. Naturally, this nonlinear characteristic is inherited by the associated risk monitoring strategies. The nonadditivity property is perhaps the one that most differentiates the indifference prices and the relevant risk monitoring strategies from their complete market counterparts. This might then look like a serious deficiency of the indifference valuation approach from both the theoretical and the practical point of view. However, it should be noted that the aggregate valuation of the above claims was considered as if the individual risks were priced in isolation. In practice, risks and projects need to be valuated and hedged relative to already undertaken risks. In complete markets, perfect risk elimination makes this relative risk positioning redundant. But, when risks cannot be eliminated, one should develop a methodology that would quantify and price the incoming incremental risks, while taking into account the existing unhedgeable risk exposure. These considerations lead us to the relative indifference valuation concept.

The notion of relative indifference price was first introduced by the authors in Musiela and Zariphopoulou [196] and was recently further developed in Stoikov's stochastic volatility model [256].

Definition 1.7 Let $C_T^1 = C^1(S_T, Y_T)$ and $C_T^2 = C^2(S_T, Y_T)$ be two claims that have indifference prices $\nu(C_T^1)$ and $\nu(C_T^2)$. Let $V^{C_T^1}$, $V^{C_T^2}$ and $V^{C_T^1 + C_T^2}$ be the value functions (1.3) corresponding to claims $C_T^1$, $C_T^2$ and $C_T^1 + C_T^2$. The relative indifference prices $\nu(C_T^2/C_T^1)$ and $\nu(C_T^1/C_T^2)$ are defined, respectively, as the amounts satisfying

$$V^{C_T^1}(x) = V^{C_T^1 + C_T^2}\left(x + \nu(C_T^2/C_T^1)\right) \tag{1.66}$$
and

$$V^{C_T^2}(x) = V^{C_T^1 + C_T^2}\left(x + \nu(C_T^1/C_T^2)\right) \tag{1.67}$$

for all wealth levels $x \in \mathbb{R}$.

As the following result shows, when a new claim is priced relative to an already incorporated risk exposure, the associated indifference prices become linear.

Proposition 1.17 Assume that the claims $C_T^1 = C^1(S_T, Y_T)$ and $C_T^2 = C^2(S_T, Y_T)$ have indifference prices $\nu(C_T^1)$ and $\nu(C_T^2)$ and relative indifference prices $\nu(C_T^1/C_T^2)$ and $\nu(C_T^2/C_T^1)$. Then, the indifference price of the claim with payoff $C_T = C_T^1 + C_T^2$ satisfies

$$\nu(C_T) = \nu(C_T^1) + \nu(C_T^2/C_T^1) \tag{1.68}$$

and

$$\nu(C_T) = \nu(C_T^2) + \nu(C_T^1/C_T^2).$$

Proof. We only show the first statement since the second follows by analogous arguments. For this, we recall the representation formula (1.23), which yields, respectively,

$$V^{C^1}(x) = -e^{-\gamma x - H(Q|P) + \gamma\nu(C_T^1)} \quad\text{and}\quad V^{C^1 + C^2}(x) = -e^{-\gamma x - H(Q|P) + \gamma\nu(C_T^1 + C_T^2)}.$$

Moreover, the same formula together with the definition of the relative indifference price $\nu(C_T^2/C_T^1)$ implies that

$$V^{C^1}(x) = -e^{-\gamma x - H(Q|P) + \gamma\nu(C_T^1)} = -e^{-\gamma(x + \nu(C_T^2/C_T^1)) - H(Q|P) + \gamma\nu(C_T^1 + C_T^2)} = V^{C^1 + C^2}\left(x + \nu(C_T^2/C_T^1)\right)$$

for all wealth levels. Equating the exponents yields (1.68). □
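The interplay between nonadditivity (1.64) and the linear relation (1.68) can be seen by computing all prices directly from value functions. In the Python sketch below, the inputs are illustrative assumptions; the optimal number of shares is obtained from the first-order condition of the exponential utility problem, and the helper `price_between` (a hypothetical name) exploits the fact that the wealth level $x$ factors out of exponential value functions. The two payoffs are chosen so that their sum is constant on each branch of $S_T$, which makes the combined claim replicable.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
gamma = 0.5
p = [0.3, 0.2, 0.1, 0.4]
S0, Su, Sd = 100.0, 120.0, 90.0
S_T = [Su, Su, Sd, Sd]
q = (S0 - Sd) / (Su - Sd)

def value_at_zero(c):
    """V^C(0) = sup_alpha E_P[-exp(-gamma (alpha (S_T - S_0) - C_T))]."""
    n_up = p[0] * math.exp(gamma * c[0]) + p[1] * math.exp(gamma * c[1])
    n_dn = p[2] * math.exp(gamma * c[2]) + p[3] * math.exp(gamma * c[3])
    # alpha solving the first-order condition of the concave problem
    alpha = math.log((Su - S0) * n_up / ((S0 - Sd) * n_dn)) / (gamma * (Su - Sd))
    return -sum(pi * math.exp(-gamma * (alpha * (s - S0) - ci))
                for pi, s, ci in zip(p, S_T, c))

def price_between(base, extended):
    """Amount nu with V^base(x) = V^extended(x + nu) for every x."""
    return math.log(value_at_zero(extended) / value_at_zero(base)) / gamma

zero = [0.0, 0.0, 0.0, 0.0]
c1 = [2.0, 0.5, 1.0, 3.0]
c2 = [0.0, 1.5, 2.5, 0.5]
both = [a + b for a, b in zip(c1, c2)]   # (2, 2, 3.5, 3.5): a function of S_T only

nu_1 = price_between(zero, c1)
nu_2 = price_between(zero, c2)
nu_12 = price_between(zero, both)
nu_2_given_1 = price_between(c1, both)   # relative price nu(C^2 / C^1)
nu_1_given_2 = price_between(c2, both)   # relative price nu(C^1 / C^2)
arbitrage_free_12 = q * 2.0 + (1 - q) * 3.5

print("nu(C1) + nu(C2)    =", nu_1 + nu_2)
print("nu(C1 + C2)        =", nu_12)
print("nu(C1) + nu(C2/C1) =", nu_1 + nu_2_given_1)
```

With these inputs the stand-alone prices add up to more than the price of the combined claim, because each claim offsets the unhedgeable risk of the other, while the relative prices restore additivity.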
The following results are immediate consequences of the above.

Corollary 1.8 The indifference prices $\nu(C_T^1)$ and $\nu(C_T^2)$, and their relative counterparts $\nu(C_T^1/C_T^2)$ and $\nu(C_T^2/C_T^1)$, satisfy

$$\nu(C_T^1) - \nu(C_T^2) = \nu(C_T^1/C_T^2) - \nu(C_T^2/C_T^1).$$

Corollary 1.9 The indifference price of the claim $C_T = C_T^1 + C_T^2$ is given by

$$\nu(C_T^1 + C_T^2) = \frac{1}{2}\left(\nu(C_T^1) + \nu(C_T^2)\right) + \frac{1}{2}\left(\nu(C_T^1/C_T^2) + \nu(C_T^2/C_T^1)\right).$$
Moreover,

$$\nu(C_T^1 + C_T^2) - \left(\nu(C_T^1) + \nu(C_T^2)\right) = \frac{1}{2}\left(\nu(C_T^1/C_T^2) + \nu(C_T^2/C_T^1)\right) - \frac{1}{2}\left(\nu(C_T^1) + \nu(C_T^2)\right).$$

The latter formula yields the error emerging from the nonadditive character of the indifference price. This error may vanish in certain cases, as the examples below demonstrate. These examples were discussed in detail in Section 1.2.2 in the context of the additive invariance property of indifference prices. They refer to the special cases of a complete and a fully incomplete market setting.

Special Cases Revisited:

i) Let $C_T = C_T^1 + C_T^2$ with $C_T^1 = C^1(S_T, Y_T)$ and $C_T^2 = C^2(S_T)$. Property (1.35) implies that the price is additive. In fact,

$$\nu(C_T) = \nu(C_T^1) + \nu(C_T^2) \quad\text{with}\quad \nu(C_T^2) = E_Q(C^2(S_T)).$$

Proposition 1.17 then yields that

$$\nu(C_T^2/C_T^1) = \nu(C_T^2).$$

If, additionally, $Y_T$ depends functionally on $S_T$, then we easily deduce that

$$\nu(C_T) = E_Q(C_T^1) + E_Q(C_T^2)$$

and, in turn, that

$$\nu(C_T^1/C_T^2) = \nu(C_T^1) \quad\text{and}\quad \nu(C_T^2/C_T^1) = \nu(C_T^2).$$

ii) Let $C_T = C^1(S_T) + C^2(Y_T)$ with $Y_T$ and $S_T$ independent under $P$. Then, it was shown that the price behaves additively, namely,

$$\nu(C_T) = \nu(C_T^1) + \nu(C_T^2)$$

with

$$\nu(C_T^1) = E_Q(C_T^1) \quad\text{and}\quad \nu(C_T^2) = \frac{1}{\gamma}\log E_P(e^{\gamma C^2(Y_T)}).$$

Proposition 1.17, in turn, implies

$$\nu(C_T^1/C_T^2) = \nu(C_T^1) \quad\text{and}\quad \nu(C_T^2/C_T^1) = \nu(C_T^2).$$

The above examples demonstrate that the relative indifference prices reduce to the classical ones if the relevant risks are either fully replicable or independent from the traded ones.

1.2.5 Wealths, Preferences, and Numeraires

The results of the previous sections were derived under the assumptions of zero interest rate and constant risk aversion. In this case, the wealths at the beginning and the end of a time period are expressed in a comparable unit (spot or forward),
and, thus, the possible dependence of the underlying optimization problems on the unit choice is not apparent. Below we analyze this question by looking first at the relationship between the spot and forward units. Then, we consider a state-dependent risk aversion coefficient in order to cover other cases of numeraires. In particular, we consider the stock itself as a numeraire and show that the indifference prices can be made numeraire independent and consistent with the static no arbitrage constraint if the appropriate dependence across units is built into the risk preference structure. i) Indifference Prices in Spot and Forward Units Consider the one-period model, introduced in Section 1.2.1, of a market with a riskless bond and two risky assets, of which only one is traded. The dynamics of the risky assets remain unchanged, but we now allow for a nonzero riskless rate. The price of the riskless asset, therefore, satisfies B0 = 1 and BT = 1 + r with ξ d ≤ 1 + r ≤ ξ u. Because of the nonzero riskless rate, the price formula (1.9) cannot be directly applied. In order to produce meaningful prices, one needs to be consistent with the units in which the quantities that are used in price specification are expressed. For the case at hand, we will consider the valuation problem in spot and in forward units and will force the price to become independent of the unit choice. We start with the formulation of the indifference price problem in spot units. Consider a portfolio consisting of α shares of stock and the amount β invested in the riskless asset. Its current value is given by β + αS0 = x, where x represents the agent’s initial wealth, X0 = x. Expressed in spot units, that is discounted to time 0, its spot terminal wealth XTs satisfies ST s XT = x + α (1.69) − S0 . 1+r The investor’s utility is taken to be exponential with constant absolute risk aversion coefficient γ s . 
It is important to note that for the utility to be well defined, this coefficient needs to be expressed in the reciprocal of the spot unit. Optimality of investments will be carried out through the relevant value function, in spot units, given by
\[ V^{s,C_T}(x) = \sup_{\alpha} E_P\left(-e^{-\gamma^s\left(X_T^s - \frac{C_T}{1+r}\right)}\right). \tag{1.70} \]
Note that the option payoff $C_T$ is also discounted from time $T$ to time 0. The following definition is a natural extension of Definition 1.1.

Definition 1.10 The indifference price, in spot units, of the claim $C_T$ is defined as the amount $\nu^s(C_T)$ for which the two spot value functions $V^{s,C_T}$ and $V^{s,0}$, defined in (1.70) and corresponding to claims $C_T$ and 0, coincide. Namely, it is the amount $\nu^s(C_T)$ satisfying
\[ V^{s,0}(x) = V^{s,C_T}(x + \nu^s(C_T)) \tag{1.71} \]
for any initial wealth level $x \in \mathbb{R}$.
Proposition 1.18 Let $Q^s$ be the measure such that
\[ E_{Q^s}\left(\frac{S_T}{1+r}\right) = S_0 \quad\text{and}\quad Q^s(Y_T \mid S_T) = P(Y_T \mid S_T). \tag{1.72} \]
Moreover, let $C_T = c(S_T, Y_T)$ be the claim to be priced, in spot units, with spot risk aversion coefficient $\gamma^s$. Then, the spot indifference price is given by
\[ \nu^s(C_T) = \mathcal{E}_{Q^s}\left(\frac{C_T}{1+r}\right) = E_{Q^s}\left(\frac{1}{\gamma^s}\log E_{Q^s}\left(e^{\gamma^s\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right). \tag{1.73} \]

Proof. Working along similar arguments to the ones used in the proof of Proposition 1.1, we first establish that the spot value functions, $V^{s,0}$ and $V^{s,C_T}$, are given by
\[ V^{s,0}(x) = -e^{-\gamma^s x}\,\frac{1}{(q^s)^{q^s}(1-q^s)^{1-q^s}}\,(p_1+p_2)^{q^s}(p_3+p_4)^{1-q^s} \]
and
\[ V^{s,C_T}(x) = -e^{-\gamma^s x}\,\frac{1}{(q^s)^{q^s}(1-q^s)^{1-q^s}}\left(e^{\gamma^s\frac{c_1}{1+r}}p_1 + e^{\gamma^s\frac{c_2}{1+r}}p_2\right)^{q^s}\left(e^{\gamma^s\frac{c_3}{1+r}}p_3 + e^{\gamma^s\frac{c_4}{1+r}}p_4\right)^{1-q^s}, \]
where
\[ q^s = \frac{(1+r)-\xi^d}{\xi^u-\xi^d}. \tag{1.74} \]
Applying Definition 1.10 gives
\[ \nu^s(C_T) = q^s\,\frac{1}{\gamma^s}\log\frac{e^{\gamma^s\frac{c_1}{1+r}}p_1 + e^{\gamma^s\frac{c_2}{1+r}}p_2}{p_1+p_2} + (1-q^s)\,\frac{1}{\gamma^s}\log\frac{e^{\gamma^s\frac{c_3}{1+r}}p_3 + e^{\gamma^s\frac{c_4}{1+r}}p_4}{p_3+p_4}. \tag{1.75} \]
Straightforward calculations yield that the spot pricing measure $Q^s$ has elementary probabilities, denoted by $q_i^s$, $i = 1, \dots, 4$,
\[ q_i^s = q^s\,\frac{p_i}{p_1+p_2},\ i = 1, 2 \quad\text{and}\quad q_i^s = (1-q^s)\,\frac{p_i}{p_3+p_4},\ i = 3, 4. \tag{1.76} \]
We introduce the conditional certainty equivalent in spot units
\[ \tilde{C}_T^s = \frac{1}{\gamma^s}\log E_{Q^s}\left(e^{\gamma^s\frac{C_T}{1+r}} \,\Big|\, S_T\right). \]
Equation (1.75) then yields $\nu^s(C_T) = E_{Q^s}(\tilde{C}_T^s)$, and (1.73) follows. □
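As a sanity check on Proposition 1.18, the closed-form price (1.75) can be compared with the price obtained directly from Definition 1.10. The sketch below is only illustrative: the function names and all parameter values are hypothetical, the supremum in (1.70) is approximated by a ternary search over a bounded interval, and the indifference equation (1.71) is solved using the fact that, for exponential utility, shifting wealth by $\nu$ multiplies the value function by $e^{-\gamma^s \nu}$.

```python
import math


def maximize_concave(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def spot_price_closed_form(c, p, gamma_s, xi_u, xi_d, r):
    """Spot indifference price (1.75); states 1, 2 are 'up', states 3, 4 are 'down'."""
    q = ((1.0 + r) - xi_d) / (xi_u - xi_d)  # q^s in (1.74)
    k = gamma_s / (1.0 + r)
    up = math.log((math.exp(k * c[0]) * p[0] + math.exp(k * c[1]) * p[1]) / (p[0] + p[1]))
    down = math.log((math.exp(k * c[2]) * p[2] + math.exp(k * c[3]) * p[3]) / (p[2] + p[3]))
    return (q * up + (1.0 - q) * down) / gamma_s


def spot_value_function(x, c, p, gamma_s, s0, xi_u, xi_d, r):
    """Numerical version of (1.70): maximize expected utility over alpha."""
    s_T = (s0 * xi_u, s0 * xi_u, s0 * xi_d, s0 * xi_d)

    def expected_utility(alpha):
        return sum(
            -p_i * math.exp(-gamma_s * (x + alpha * (s / (1.0 + r) - s0) - c_i / (1.0 + r)))
            for p_i, s, c_i in zip(p, s_T, c)
        )

    return expected_utility(maximize_concave(expected_utility))


def spot_price_from_definition(x, c, p, gamma_s, s0, xi_u, xi_d, r):
    """Solve (1.71) using V(x + nu) = exp(-gamma_s * nu) * V(x)."""
    v_claim = spot_value_function(x, c, p, gamma_s, s0, xi_u, xi_d, r)
    v_zero = spot_value_function(x, (0.0,) * 4, p, gamma_s, s0, xi_u, xi_d, r)
    return math.log(v_claim / v_zero) / gamma_s


# Hypothetical inputs, chosen only for illustration.
params = dict(c=(10.0, 5.0, 2.0, 0.0), p=(0.3, 0.2, 0.25, 0.25),
              gamma_s=0.1, xi_u=1.2, xi_d=0.9, r=0.05)
closed = spot_price_closed_form(**params)
numeric = spot_price_from_definition(x=0.0, s0=100.0, **params)
print(closed, numeric)
```

The two numbers should agree up to the accuracy of the numerical search, and the result does not depend on the initial wealth passed to `spot_price_from_definition`.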
We next analyze the indifference valuation of $C_T$ assuming that all relevant prices, risk preferences, and value functions are expressed in the reciprocal of the forward unit. To this end, we consider the forward terminal wealth
\[ X_T^f = X_T^s(1+r) = x(1+r) + \alpha(S_T - S_0(1+r)) = f + \alpha(F_T - F_0), \tag{1.77} \]
where $f = x(1+r)$ is the forward value of the current wealth. Moreover, $F_0 = S_0(1+r)$ and $F_T = S_T$ define the forward stock price process. Implicitly, we assume existence of the forward market for the risky traded asset $S$, and hence of the quoted prices $F_0$ and $F_T$, for it can be replicated by trading in this market. The corresponding forward value function is
\[ V^{f,C_T}(f) = \sup_{\alpha} E_P\left(-e^{-\gamma^f\left(X_T^f - C_T\right)}\right). \tag{1.78} \]
The risk aversion coefficient $\gamma^f$ is naturally expressed in forward units.

Definition 1.11 The indifference price, in forward units, of the claim $C_T$ is defined as the amount $\nu^f(C_T)$ for which the two forward value functions $V^{f,C_T}$ and $V^{f,0}$, defined in (1.78) and corresponding to claims $C_T$ and 0, coincide. Namely, it is the amount $\nu^f(C_T)$ satisfying
\[ V^{f,0}(f) = V^{f,C_T}(f + \nu^f(C_T)) \tag{1.79} \]
for any initial wealth level $f \in \mathbb{R}$.

Proposition 1.19 Let $Q^f$ be a measure under which
\[ E_{Q^f}(F_T) = F_0 \quad\text{and}\quad Q^f(Y_T \mid F_T) = P(Y_T \mid F_T). \tag{1.80} \]
Then
\[ Q^f = Q^s. \tag{1.81} \]
Let $C_T$ be the claim to be priced under exponential preferences with forward risk aversion coefficient $\gamma^f$. Then, the indifference price in forward units of $C_T$ is given by
\[ \nu^f(C_T) = \mathcal{E}_{Q^f}(C_T) = E_{Q^f}\left(\frac{1}{\gamma^f}\log E_{Q^f}\left(e^{\gamma^f C_T} \,\Big|\, F_T\right)\right). \tag{1.82} \]
Proof. Given the deterministic interest rate assumption, the fact that the measures $Q^f$ and $Q^s$ coincide is obvious. We next observe that the forward value function $V^{f,C_T}$ can be written as
\begin{align*}
V^{f,C_T}(f) &= \sup_{\alpha} E_P\left(-e^{-\gamma^f\left(X_T^f - C_T\right)}\right) \\
&= \sup_{\alpha} E_P\left(-e^{-\gamma^f\left(x(1+r) + \alpha(S_T - S_0(1+r)) - C_T\right)}\right) \\
&= \sup_{\alpha} E_P\left(-e^{-\tilde{\gamma}\left(x + \alpha\left(\frac{S_T}{1+r} - S_0\right) - \frac{C_T}{1+r}\right)}\right),
\end{align*}
with $\tilde{\gamma} = \gamma^f(1+r)$. Therefore, $V^{f,C_T}$, and in turn $V^{f,0}$, can be directly retrieved from their spot counterparts. The rest of the proof follows easily and is therefore omitted. □
For the rest of the analysis, we denote by $Q$ the common spot and forward pricing measure. We are now ready to investigate when the spot and forward indifference prices are consistent with the static no-arbitrage condition and independent of the units, spot or forward, chosen in the supporting investment optimization problem. The result below gives the necessary and sufficient conditions on the spot and forward risk aversion coefficients.

Proposition 1.20 The indifference prices, expressed in spot and forward units, are consistent with the no-arbitrage condition, that is,
\[ \nu^f(C_T) = (1+r)\,\nu^s(C_T), \tag{1.83} \]
if and only if the spot and forward risk aversion coefficients satisfy
\[ \gamma^f = \frac{\gamma^s}{1+r}. \tag{1.84} \]

Proof. We first show that if (1.84) holds, then (1.83) follows. Recalling (1.73) and (1.82), we deduce that, if (1.84) holds, then $\nu^f(C_T)$ can be written as
\[ \nu^f(C_T) = (1+r)\,E_Q\left(\frac{1}{\gamma^s}\log E_Q\left(e^{\gamma^s\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right), \]
and one direction of the statement follows. We remind the reader that $Q = Q^s = Q^f$. We next show that for (1.83) to hold for all $C_T$ we must have (1.84). Indeed, if the consistency relationship (1.83) holds, then, for all claims $C_T$,
\[ \frac{1}{1+r}\,E_Q\left(\frac{1}{\gamma^f}\log E_Q\left(e^{\gamma^f C_T} \mid S_T\right)\right) = E_Q\left(\frac{1}{\gamma^s}\log E_Q\left(e^{\gamma^s\frac{C_T}{1+r}} \mid S_T\right)\right), \]
and, in turn,
\[ E_Q\left(\frac{1}{\gamma^f}\log E_Q\left(e^{\gamma^f C_T} \mid S_T\right)\right) = E_Q\left(\frac{1+r}{\gamma^s}\log E_Q\left(e^{\frac{\gamma^s}{1+r} C_T} \mid S_T\right)\right). \]
The statement then follows from straightforward rescaling arguments and (1.24). □

We continue with a representation result for the spot and forward value functions. We recall that the spot and forward pricing measures reduce to the same measure $Q$, which, therefore, has elementary probabilities $q_i$, $i = 1, \dots, 4$, given in (1.76).
Working along similar arguments to the ones used in Section 1.2.1 we can easily establish the following result.

Proposition 1.21 Let $H(Q \mid P)$ be as in (1.21). Then, the value functions $V^{s,C_T}$ and $V^{f,C_T}$ are given by
\[ V^{s,C_T}(x) = -e^{-\gamma^s\left(x - \nu^s(C_T)\right) - H(Q \mid P)} = U^s\left(x - \nu^s(C_T) + \frac{1}{\gamma^s}H(Q \mid P)\right), \]
\[ V^{f,C_T}(f) = -e^{-\gamma^f\left(f - \nu^f(C_T)\right) - H(Q \mid P)} = U^f\left(f - \nu^f(C_T) + \frac{1}{\gamma^f}H(Q \mid P)\right), \]
where $U^s(x) = -e^{-\gamma^s x}$ and $U^f(f) = -e^{-\gamma^f f}$ represent the spot and forward utility functions.
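The spot representation in Proposition 1.21 can be illustrated numerically as follows. This is a sketch under stated assumptions: the relative entropy is computed as $\sum_i q_i \log(q_i/p_i)$ with the elementary probabilities (1.76), which is assumed to agree with the definition referred to as (1.21); the supremum in (1.70) is approximated by a ternary search; all names and numbers are hypothetical.

```python
import math


def maximize_concave(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def pricing_measure(p, q):
    """Elementary probabilities q_i of Q, as in (1.76)."""
    return (q * p[0] / (p[0] + p[1]), q * p[1] / (p[0] + p[1]),
            (1.0 - q) * p[2] / (p[2] + p[3]), (1.0 - q) * p[3] / (p[2] + p[3]))


def relative_entropy(q_probs, p):
    """H(Q|P) = sum_i q_i log(q_i / p_i) (assumed form of (1.21))."""
    return sum(q_i * math.log(q_i / p_i) for q_i, p_i in zip(q_probs, p))


def spot_price(c, p, gamma_s, q, r):
    """Spot indifference price (1.75)."""
    k = gamma_s / (1.0 + r)
    up = math.log((math.exp(k * c[0]) * p[0] + math.exp(k * c[1]) * p[1]) / (p[0] + p[1]))
    down = math.log((math.exp(k * c[2]) * p[2] + math.exp(k * c[3]) * p[3]) / (p[2] + p[3]))
    return (q * up + (1.0 - q) * down) / gamma_s


def spot_value_numeric(x, c, p, gamma_s, s0, xi_u, xi_d, r):
    """Numerical version of the spot value function (1.70)."""
    s_T = (s0 * xi_u, s0 * xi_u, s0 * xi_d, s0 * xi_d)

    def expected_utility(alpha):
        return sum(
            -p_i * math.exp(-gamma_s * (x + alpha * (s / (1.0 + r) - s0) - c_i / (1.0 + r)))
            for p_i, s, c_i in zip(p, s_T, c)
        )

    return expected_utility(maximize_concave(expected_utility))


# Hypothetical inputs, chosen only for illustration.
x, s0, xi_u, xi_d, r, gamma_s = 2.0, 100.0, 1.2, 0.9, 0.05, 0.1
c, p = (10.0, 5.0, 2.0, 0.0), (0.35, 0.25, 0.2, 0.2)
q = ((1.0 + r) - xi_d) / (xi_u - xi_d)

entropy = relative_entropy(pricing_measure(p, q), p)
v_formula = -math.exp(-gamma_s * (x - spot_price(c, p, gamma_s, q, r)) - entropy)
v_numeric = spot_value_numeric(x, c, p, gamma_s, s0, xi_u, xi_d, r)
print(v_formula, v_numeric)
```

Because $Q$ preserves the conditional law of $Y_T$ given $S_T$, the entropy only involves the probabilities of the up and down moves of the traded asset.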
Recall that the argument $x$ in $U^s(x)$ and in $V^{s,C_T}(x)$ is expressed in spot units, while the argument $f$ in $U^f(f)$ and $V^{f,C_T}(f)$ is expressed in forward units. Therefore, the utility and the value functions represent the same utility and value, independently of the units in which the relevant optimization problems are solved, if and only if (1.84) holds. More generally, the indifference-based valuation as well as the associated optimal investment problems can be formulated and solved in a numeraire-independent fashion provided the appropriate relations are built into the preference structure. In fact, these problems can be analyzed without making any reference to a unit by optimizing over unitless quantities like $\gamma^s X_T^s$ or $\gamma^f X_T^f$.

ii) Indifference Prices and State-Dependent Preferences

Before we proceed with the specification of the price and the conditions for numeraire independence, we extend our previous setup to the case of a random risk aversion coefficient. Specifically, we assume that it is a function of the states of the traded asset. We may then conveniently represent the risk aversion at time $T$ as the $\mathcal{F}_T^S$-measurable random variable $\gamma_T = \gamma(S_T)$ taking the values
\[ \gamma^u = \gamma(S_0\xi^u) \quad\text{and}\quad \gamma^d = \gamma(S_0\xi^d), \]
when the events $\{\omega : S_T(\omega) = S_0\xi^u\} = \{\omega_1, \omega_2\}$ and $\{\omega : S_T(\omega) = S_0\xi^d\} = \{\omega_3, \omega_4\}$ occur. Clearly, the risk aversion $\gamma_T$ is expressed in the unit that is the reciprocal of the wealth $X_T$ unit. Alternatively, one may think of the risk tolerance
\[ \delta_T = \frac{1}{\gamma_T}, \tag{1.85} \]
which is obviously expressed in the units of wealth at time T . Here we assume the same model as in the previous section and choose to work with spot units, so XT = XTs as in (1.69). We have mentioned already that representations of the indifference prices under minimal model assumptions have been derived via duality arguments. These results can be extended even to the cases when the risk aversion coefficient is random. The
related pricing formulas, however, take a form that reveals limited insights about the numeraire and units effects. For this, we seek an alternative price representation, provided below, which, to the best of our knowledge, is new. We also adopt the notation $\nu(C_T; \gamma_T)$, $V^{C_T}(x; \gamma_T)$ and $V^0(x; \gamma_T)$ for the price and the relevant value functions, so that the random nature of risk preferences is conveniently highlighted.

Proposition 1.22 Assume that the risk aversion coefficient $\gamma_T$ is of the form $\gamma_T = \gamma(S_T)$. Let $Q$ be the measure defined in (1.72) and let $C_T = c(Y_T, S_T)$ be a claim to be priced under exponential utility with risk aversion coefficient $\gamma_T$. Then, the indifference price of $C_T$ is given by
\[ \nu(C_T; \gamma_T) = E_Q\left(\frac{1}{\gamma_T}\log E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right). \]

Proof. In order to construct the indifference price, we need to compute the value functions $V^{C_T}(x; \gamma_T)$ and $V^0(x; \gamma_T)$. We recall that
\[ V^{C_T}(x; \gamma_T) = \sup_{\alpha} E_P\left(-e^{-\gamma_T\left(x + \alpha\left(\frac{S_T}{1+r} - S_0\right) - \frac{C_T}{1+r}\right)}\right) \tag{1.86} \]
and we introduce the notation
\[ \beta^u = \gamma^u\left(\frac{\xi^u}{1+r} - 1\right) \quad\text{and}\quad \beta^d = \gamma^d\left(1 - \frac{\xi^d}{1+r}\right). \tag{1.87} \]
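Further calculations yield
\[ V^{C_T}(x; \gamma_T) = \sup_{\alpha} \phi(\alpha), \]
where
\[ \phi(\alpha) = -e^{-\alpha S_0\beta^u}\left(e^{-\gamma^u\left(x - \frac{c_1}{1+r}\right)}p_1 + e^{-\gamma^u\left(x - \frac{c_2}{1+r}\right)}p_2\right) - e^{\alpha S_0\beta^d}\left(e^{-\gamma^d\left(x - \frac{c_3}{1+r}\right)}p_3 + e^{-\gamma^d\left(x - \frac{c_4}{1+r}\right)}p_4\right), \]
with $\beta^u$ and $\beta^d$ given in (1.87). Differentiating with respect to $\alpha$ yields that the maximum occurs at
\[ \alpha^{C_T,*} = \frac{1}{S_0(\beta^u + \beta^d)}\log\frac{\beta^u e^{-\gamma^u x}\left(e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2\right)}{\beta^d e^{-\gamma^d x}\left(e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4\right)}. \tag{1.88} \]
Calculating the terms $e^{-\alpha^{C_T,*}S_0\beta^u}$ and $e^{\alpha^{C_T,*}S_0\beta^d}$ gives
\[ e^{-\alpha^{C_T,*}S_0\beta^u} = \left(\frac{\beta^u}{\beta^d}\right)^{-\frac{\beta^u}{\beta^u+\beta^d}} e^{\frac{\beta^u(\gamma^u-\gamma^d)}{\beta^u+\beta^d}x} \left(\frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}\right)^{-\frac{\beta^u}{\beta^u+\beta^d}} \]
and
\[ e^{\alpha^{C_T,*}S_0\beta^d} = \left(\frac{\beta^u}{\beta^d}\right)^{\frac{\beta^d}{\beta^u+\beta^d}} e^{-\frac{\beta^d(\gamma^u-\gamma^d)}{\beta^u+\beta^d}x} \left(\frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}\right)^{\frac{\beta^d}{\beta^u+\beta^d}}. \]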
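It then follows, after tedious but routine calculations, that
\begin{align*}
V^{C_T}(x; \gamma_T) &= \phi(\alpha^{C_T,*}) \\
&= -\left(\left(\frac{\beta^u}{\beta^d}\right)^{-\frac{\beta^u}{\beta^u+\beta^d}} + \left(\frac{\beta^u}{\beta^d}\right)^{\frac{\beta^d}{\beta^u+\beta^d}}\right) e^{-\frac{\beta^u\gamma^d+\beta^d\gamma^u}{\beta^u+\beta^d}x} \\
&\quad\times \left(e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2\right)^{\frac{\beta^d}{\beta^u+\beta^d}} \left(e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4\right)^{\frac{\beta^u}{\beta^u+\beta^d}}. \tag{1.89}
\end{align*}
Substituting $C_T = 0$ in turn implies
\[ V^0(x; \gamma_T) = -\left(\left(\frac{\beta^u}{\beta^d}\right)^{-\frac{\beta^u}{\beta^u+\beta^d}} + \left(\frac{\beta^u}{\beta^d}\right)^{\frac{\beta^d}{\beta^u+\beta^d}}\right) e^{-\frac{\beta^u\gamma^d+\beta^d\gamma^u}{\beta^u+\beta^d}x}\,(p_1+p_2)^{\frac{\beta^d}{\beta^u+\beta^d}}(p_3+p_4)^{\frac{\beta^u}{\beta^u+\beta^d}}. \tag{1.90} \]
Using (1.89), (1.90), and (1.4) (cf. Definition 1.1), we get
\[ \nu(C_T; \gamma_T) = \left(\frac{\beta^u\gamma^d+\beta^d\gamma^u}{\beta^u+\beta^d}\right)^{-1} \left(\frac{\beta^d}{\beta^u+\beta^d}\log\frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{p_1+p_2} + \frac{\beta^u}{\beta^u+\beta^d}\log\frac{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}{p_3+p_4}\right). \tag{1.91} \]
We next observe that
\[ \frac{\gamma^u\beta^d}{\beta^u\gamma^d+\beta^d\gamma^u} = \frac{(1+r)-\xi^d}{\xi^u-\xi^d} = \frac{S_0(1+r)-S^d}{S^u-S^d} = q \]
and
\[ \frac{\gamma^d\beta^u}{\beta^u\gamma^d+\beta^d\gamma^u} = \frac{\xi^u-(1+r)}{\xi^u-\xi^d} = \frac{S^u-S_0(1+r)}{S^u-S^d} = 1-q, \tag{1.92} \]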
where $S^u = S_0\xi^u$ and $S^d = S_0\xi^d$. The above equalities combined with (1.91) yield
\[ \nu(C_T; \gamma_T) = q\,\frac{1}{\gamma^u}\log\frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{p_1+p_2} + (1-q)\,\frac{1}{\gamma^d}\log\frac{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}{p_3+p_4}. \tag{1.93} \]
Following the arguments developed in the proof of Proposition 1.2 we see that
\[ \frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{p_1+p_2} = E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T = S^u\right) \tag{1.94} \]
and
\[ \frac{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}{p_3+p_4} = E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T = S^d\right). \tag{1.95} \]
Finally, using the equalities (1.94) and (1.95) and the expression in (1.93), we easily conclude. □

Proposition 1.23 Assume that the risk aversion coefficient $\gamma_T$ is of the form $\gamma_T = \gamma(S_T)$ and let $\delta_T$ be the risk tolerance coefficient introduced in (1.85). Let $Q$ be the measure defined in (1.72) and let $C_T = c(Y_T, S_T)$ be a claim to be priced under exponential utility with risk aversion coefficient $\gamma_T$. Then, the value functions $V^0(x; \gamma_T)$ and $V^{C_T}(x; \gamma_T)$, defined in (1.86), admit the following representations
\[ V^0(x; \gamma_T) = -\exp\left(-\frac{x}{E_Q(\delta_T)} - H(Q^* \mid P)\right) \tag{1.96} \]
\[ V^{C_T}(x; \gamma_T) = -\exp\left(-\frac{x - \nu(C_T; \gamma_T)}{E_Q(\delta_T)} - H(Q^* \mid P)\right), \tag{1.97} \]
where
\[ \nu(C_T; \gamma_T) = E_Q\left(\frac{1}{\gamma_T}\log E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \tag{1.98} \]
and
\[ \frac{dQ^*}{dQ}(\omega) = \frac{\delta_T(\omega)}{E_Q(\delta_T)}. \tag{1.99} \]
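The representations (1.96) and (1.97) can be compared with a direct numerical solution of (1.86). The sketch below is illustrative only: names and parameter values are hypothetical, the supremum is approximated by a ternary search over a bounded interval, and $H(Q^* \mid P)$ is computed from the up-move probability $q^* = (\gamma^u)^{-1}q / E_Q(\delta_T)$ implied by (1.99), using the fact that $Q^*$ preserves the conditional law of $Y_T$ given $S_T$.

```python
import math


def maximize_concave(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def price(c, p, gamma_u, gamma_d, q, r):
    """Indifference price with state-dependent risk aversion, (1.93)/(1.98)."""
    up = math.log((math.exp(gamma_u * c[0] / (1.0 + r)) * p[0]
                   + math.exp(gamma_u * c[1] / (1.0 + r)) * p[1]) / (p[0] + p[1]))
    down = math.log((math.exp(gamma_d * c[2] / (1.0 + r)) * p[2]
                     + math.exp(gamma_d * c[3] / (1.0 + r)) * p[3]) / (p[2] + p[3]))
    return q * up / gamma_u + (1.0 - q) * down / gamma_d


def value_numeric(x, c, p, gamma_u, gamma_d, s0, xi_u, xi_d, r):
    """Numerical version of (1.86)."""
    states = list(zip(p, (s0 * xi_u, s0 * xi_u, s0 * xi_d, s0 * xi_d),
                      (gamma_u, gamma_u, gamma_d, gamma_d), c))

    def expected_utility(alpha):
        return sum(-p_i * math.exp(-g * (x + alpha * (s / (1.0 + r) - s0) - c_i / (1.0 + r)))
                   for p_i, s, g, c_i in states)

    return expected_utility(maximize_concave(expected_utility))


def value_representation(x, c, p, gamma_u, gamma_d, xi_u, xi_d, r):
    """Right-hand side of (1.97); with c = 0 it reduces to (1.96)."""
    q = ((1.0 + r) - xi_d) / (xi_u - xi_d)
    mean_tolerance = q / gamma_u + (1.0 - q) / gamma_d      # E_Q(delta_T)
    q_star = (q / gamma_u) / mean_tolerance                 # up probability under Q*
    entropy = (q_star * math.log(q_star / (p[0] + p[1]))
               + (1.0 - q_star) * math.log((1.0 - q_star) / (p[2] + p[3])))
    nu = price(c, p, gamma_u, gamma_d, q, r)
    return -math.exp(-(x - nu) / mean_tolerance - entropy)


# Hypothetical inputs, chosen only for illustration.
model = dict(p=(0.35, 0.25, 0.2, 0.2), gamma_u=0.12, gamma_d=0.08,
             xi_u=1.2, xi_d=0.9, r=0.05)
claim, no_claim, x0 = (10.0, 5.0, 2.0, 0.0), (0.0, 0.0, 0.0, 0.0), 1.0
for payoff in (claim, no_claim):
    print(value_numeric(x0, payoff, s0=100.0, **model),
          value_representation(x0, payoff, **model))
```

Each printed pair should agree up to the accuracy of the search, both with and without the claim.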
Proof. To show (1.96) and (1.97), we work with formulas (1.89) and (1.90), interpreting appropriately the involved quantities. We first observe that
\[ \frac{\beta^u\gamma^d+\beta^d\gamma^u}{\beta^u+\beta^d} = \left(\frac{\beta^u+\beta^d}{\beta^u\gamma^d+\beta^d\gamma^u}\right)^{-1} = \left(\frac{1}{\gamma^d}(1-q) + \frac{1}{\gamma^u}q\right)^{-1} = \frac{1}{E_Q(\delta_T)} \]
and that
\[ \frac{\beta^u}{\beta^u+\beta^d} = \frac{(\gamma^d)^{-1}(1-q)}{E_Q(\delta_T)}, \qquad \frac{\beta^u}{\beta^d} = \frac{\gamma^u(1-q)}{\gamma^d q}, \qquad \frac{\beta^d}{\beta^u+\beta^d} = \frac{(\gamma^u)^{-1}q}{E_Q(\delta_T)}. \]
Combining the above, we obtain
\[ V^0(x; \gamma_T) = -e^{-\frac{x}{E_Q(\delta_T)}}\left(\frac{p_1+p_2}{q^*}\right)^{q^*}\left(\frac{p_3+p_4}{1-q^*}\right)^{1-q^*} = -e^{-\frac{x}{E_Q(\delta_T)} - H(Q^* \mid P)}. \]
Herein, the measure $Q^*$ is defined by
\[ q_i^* = q^*\,\frac{p_i}{p_1+p_2},\ i = 1, 2 \quad\text{and}\quad q_i^* = (1-q^*)\,\frac{p_i}{p_3+p_4},\ i = 3, 4, \]
with
\[ q^* = \frac{(\gamma^u)^{-1}q}{E_Q(\delta_T)}, \tag{1.100} \]
where $q$ is given in (1.92). Note that the emerging measure $Q^*$ satisfies
\[ E_{Q^*}\left(\gamma_T(S_T - S_0(1+r))\right) = 0 \tag{1.101} \]
and gives the same conditional distribution of $Y_T$ given $S_T$ as the measure $P$. Alternatively, one can define $Q^*$ by its Radon-Nikodym density with respect to the pricing measure $Q$. Namely,
\[ \frac{dQ^*}{dQ}(\omega_i) = \frac{q^*}{q} = \frac{(\gamma^u)^{-1}}{E_Q(\delta_T)}, \quad i = 1, 2 \]
and
\[ \frac{dQ^*}{dQ}(\omega_i) = \frac{1-q^*}{1-q} = \frac{(\gamma^d)^{-1}}{E_Q(\delta_T)}, \quad i = 3, 4. \]
To complete the proof, it remains to show (1.97). For this, it suffices to observe that the terms appearing in (1.89), containing the payoff, can be written as
\[ \left(\frac{e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2}{p_1+p_2}\right)^{\frac{\beta^d}{\beta^u+\beta^d}} \left(\frac{e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4}{p_3+p_4}\right)^{\frac{\beta^u}{\beta^u+\beta^d}} = \exp\left(\frac{\nu(C_T; \gamma_T)}{E_Q(\delta_T)}\right), \]
where we used (1.98). Formula (1.97) then follows from the latter equality and assertions (1.89) and (1.96). □

We note that even though both measures, $Q$ and $Q^*$, have minimal, relative to $P$, entropy, they have different martingale properties. Namely, under $Q$, $E_Q(S_T - S_0(1+r)) = 0$,
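while, under $Q^*$, $E_{Q^*}\left(\gamma_T(S_T - S_0(1+r))\right) = 0$. We finish this section with an important decomposition result for the writer's optimal policy.

Proposition 1.24 The writer's optimal investment policy $\alpha^{C_T,*}$ (cf. (1.88)) admits the following decomposition:
\[ \alpha^{C_T,*} = \alpha^{0,*} + \alpha^{1,*} + \alpha^{2,*}, \tag{1.102} \]
where
\[ \alpha^{0,*} = -\frac{\partial H(Q^* \mid P)}{\partial S_0}\,E_Q(\delta_T), \qquad \alpha^{1,*} = \frac{\partial \log E_Q(\delta_T)}{\partial S_0}\,x \]
and
\[ \alpha^{2,*} = E_Q(\delta_T)\,\frac{\partial}{\partial S_0}\left(\frac{\nu(C_T; \gamma_T)}{E_Q(\delta_T)}\right). \]

Proof. We first recall that
\[ \alpha^{C_T,*} = \frac{1}{S_0(\beta^u+\beta^d)}\log\frac{\beta^u e^{-\gamma^u x}\left(e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2\right)}{\beta^d e^{-\gamma^d x}\left(e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4\right)}. \]
We then write $\alpha^{C_T,*}$ as $\alpha^{C_T,*} = \alpha^{0,*} + \alpha^{1,*} + \alpha^{2,*}$, where
\[ \alpha^{0,*} = \frac{1}{S_0(\beta^u+\beta^d)}\log\frac{\beta^u(p_1+p_2)}{\beta^d(p_3+p_4)}, \qquad \alpha^{1,*} = \frac{1}{S_0(\beta^u+\beta^d)}\log\frac{e^{-\gamma^u x}}{e^{-\gamma^d x}}, \]
and
\[ \alpha^{2,*} = \frac{1}{S_0(\beta^u+\beta^d)}\log\frac{\left(e^{\gamma^u\frac{c_1}{1+r}}p_1 + e^{\gamma^u\frac{c_2}{1+r}}p_2\right)(p_3+p_4)}{\left(e^{\gamma^d\frac{c_3}{1+r}}p_3 + e^{\gamma^d\frac{c_4}{1+r}}p_4\right)(p_1+p_2)}. \]
We will next represent the various quantities in terms of the measures $Q$ and $Q^*$. Recall that
\[ q = \frac{S_0(1+r)-S^d}{S^u-S^d}, \qquad \frac{\partial q}{\partial S_0} = \frac{1+r}{S^u-S^d}, \]
and
\[ \frac{\beta^d}{\beta^u+\beta^d} = \frac{(\gamma^u)^{-1}q}{E_Q(\delta_T)} = q^*, \qquad \frac{\beta^u}{\beta^u+\beta^d} = \frac{(\gamma^d)^{-1}(1-q)}{E_Q(\delta_T)} = 1-q^*. \]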
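It follows, trivially, that
\[ \log\frac{\beta^u(p_1+p_2)}{\beta^d(p_3+p_4)} = \log\frac{(1-q^*)(p_1+p_2)}{q^*(p_3+p_4)}. \]
Moreover, the relative entropy $H(Q^* \mid P)$ is given by
\[ H(Q^* \mid P) = q^*\log\frac{q^*}{p_1+p_2} + (1-q^*)\log\frac{1-q^*}{p_3+p_4}, \]
and, hence,
\[ \frac{\partial H(Q^* \mid P)}{\partial S_0} = -\log\frac{(1-q^*)(p_1+p_2)}{q^*(p_3+p_4)}\,\frac{\partial q^*}{\partial S_0}. \]
The sensitivity of $q^*$ to $S_0$ can be easily calculated. Indeed, we get
\[ \frac{\partial q^*}{\partial S_0} = \frac{\partial q}{\partial S_0}\,\frac{(\gamma^u)^{-1}(\gamma^d)^{-1}}{(E_Q(\delta_T))^2}. \]
Also, because the coefficient $1/(S_0(\beta^u+\beta^d))$ can be written as
\[ \frac{1}{S_0(\beta^u+\beta^d)} = \frac{\partial q}{\partial S_0}\,\frac{(\gamma^u)^{-1}(\gamma^d)^{-1}}{E_Q(\delta_T)}, \]
we get
\[ \frac{1}{S_0(\beta^u+\beta^d)} = \frac{\partial q^*}{\partial S_0}\,E_Q(\delta_T). \]
The above formulas imply that
\[ \alpha^{0,*} = -\frac{\partial H(Q^* \mid P)}{\partial S_0}\,E_Q(\delta_T). \]
Moving to the second term, $\alpha^{1,*}$, we notice that
\[ \alpha^{1,*} = -\frac{1}{S_0(\beta^u+\beta^d)}\,(\gamma^u-\gamma^d)\,x. \]
Moreover, because
\begin{align*}
\frac{\partial \log E_Q(\delta_T)}{\partial S_0} &= \frac{\partial \log E_Q(\delta_T)}{\partial q}\,\frac{\partial q}{\partial S_0} \\
&= \frac{1}{E_Q(\delta_T)}\left((\gamma^u)^{-1} - (\gamma^d)^{-1}\right)\frac{\partial q}{\partial S_0} \\
&= -\frac{1}{S_0(\beta^u+\beta^d)}\,(\gamma^u-\gamma^d),
\end{align*}
we easily obtain that
\[ \alpha^{1,*} = \frac{\partial \log E_Q(\delta_T)}{\partial S_0}\,x. \]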
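Obviously, the last term, $\alpha^{2,*}$, has to do with the indifference price sensitivity. Indeed, after rather tedious calculations, we obtain
\begin{align*}
\alpha^{2,*} &= E_Q(\delta_T)\,\frac{\partial q^*}{\partial S_0}\,\frac{\partial}{\partial q^*}E_{Q^*}\left(\log E_{Q^*}\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \\
&= E_Q(\delta_T)\,\frac{\partial}{\partial S_0}E_{Q^*}\left(\log E_{Q^*}\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \\
&= E_Q(\delta_T)\,\frac{\partial}{\partial S_0}\left(\frac{\nu(C_T; \gamma_T)}{E_Q(\delta_T)}\right),
\end{align*}
and the proof is complete. □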
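The decomposition (1.102) can be checked numerically by differentiating in $S_0$ with finite differences. The sketch below rests on assumptions that should be read as such: following the expression $\partial q/\partial S_0 = (1+r)/(S^u - S^d)$ used in the proof, the terminal stock values $S^u$, $S^d$, the risk aversions $\gamma^u$, $\gamma^d$, and the payoffs $c_i$ are held fixed while $S_0$ varies. All names and numbers are hypothetical.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
MODEL = dict(s_u=120.0, s_d=90.0, r=0.05, gamma_u=0.12, gamma_d=0.08,
             p=(0.35, 0.25, 0.2, 0.2), c=(10.0, 5.0, 2.0, 0.0))


def q_of(s0, m):
    """Risk-neutral up probability q as a function of S_0, cf. (1.92)."""
    return (s0 * (1.0 + m["r"]) - m["s_d"]) / (m["s_u"] - m["s_d"])


def mean_tolerance(s0, m):
    """E_Q(delta_T)."""
    q = q_of(s0, m)
    return q / m["gamma_u"] + (1.0 - q) / m["gamma_d"]


def entropy_star(s0, m):
    """H(Q*|P), with q* taken from (1.100)."""
    q_star = q_of(s0, m) / m["gamma_u"] / mean_tolerance(s0, m)
    p = m["p"]
    return (q_star * math.log(q_star / (p[0] + p[1]))
            + (1.0 - q_star) * math.log((1.0 - q_star) / (p[2] + p[3])))


def conditional_logs(m):
    """log E_Q(exp(gamma_T C_T / (1+r)) | S_T) in the up and down states."""
    p, c, r, gu, gd = m["p"], m["c"], m["r"], m["gamma_u"], m["gamma_d"]
    up = math.log((math.exp(gu * c[0] / (1 + r)) * p[0]
                   + math.exp(gu * c[1] / (1 + r)) * p[1]) / (p[0] + p[1]))
    down = math.log((math.exp(gd * c[2] / (1 + r)) * p[2]
                     + math.exp(gd * c[3] / (1 + r)) * p[3]) / (p[2] + p[3]))
    return up, down


def scaled_price(s0, m):
    """nu(C_T; gamma_T) / E_Q(delta_T), with nu taken from (1.93)."""
    q = q_of(s0, m)
    up, down = conditional_logs(m)
    nu = q * up / m["gamma_u"] + (1.0 - q) * down / m["gamma_d"]
    return nu / mean_tolerance(s0, m)


def optimal_alpha(s0, x, m):
    """Closed-form optimal policy (1.88)."""
    b_u = m["gamma_u"] * (m["s_u"] / (1.0 + m["r"]) - s0)   # S_0 * beta^u
    b_d = m["gamma_d"] * (s0 - m["s_d"] / (1.0 + m["r"]))   # S_0 * beta^d
    up, down = conditional_logs(m)
    p = m["p"]
    log_ratio = (math.log(b_u / b_d) - (m["gamma_u"] - m["gamma_d"]) * x
                 + up + math.log(p[0] + p[1]) - down - math.log(p[2] + p[3]))
    return log_ratio / (b_u + b_d)


def derivative(f, s0, h=1e-3):
    """Central finite difference in S_0."""
    return (f(s0 + h) - f(s0 - h)) / (2.0 * h)


def decomposition(s0, x, m):
    """The three terms of (1.102), computed by numerical differentiation."""
    e_delta = mean_tolerance(s0, m)
    alpha_0 = -derivative(lambda s: entropy_star(s, m), s0) * e_delta
    alpha_1 = derivative(lambda s: math.log(mean_tolerance(s, m)), s0) * x
    alpha_2 = e_delta * derivative(lambda s: scaled_price(s, m), s0)
    return alpha_0, alpha_1, alpha_2


parts = decomposition(100.0, 5.0, MODEL)
print(parts, sum(parts), optimal_alpha(100.0, 5.0, MODEL))
```

Only the middle term depends on the initial wealth, which is the feature discussed in the text that follows the proposition.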
The above propositions demonstrate the effects of the random nature of risk aversion on the form of the value function, the indifference price, and the optimal policy. The appearance of the expected value of the risk tolerance, the reciprocal of risk aversion, seems to indicate that this quantity, rather than risk aversion, is more natural from the structural and interpretation points of view. Moreover, two interesting new features appear. First of all, the optimal policy $\alpha^{C_T,*}$ is no longer independent of the initial wealth. However, the hedging demand $\alpha^{2,*}$, coming from the presence of a derivative contract, remains independent of the initial wealth. So does the shares amount $\alpha^{0,*}$, which is aiming to benefit from the opportunities created by the differences in the probabilities allocated to the outcomes by the historical measure $P$ and by $Q^*$. The number of shares $\alpha^{1,*}$ depends linearly on the initial wealth $x$, with the slope $\frac{\partial \log E_Q(\delta_T)}{\partial S_0}$ representing the relative sensitivity of the current risk tolerance to the changes in the stock price. The second interesting and new feature is the sensitivity $\frac{\partial}{\partial S_0}\left(\frac{\nu(C_T; \gamma_T)}{E_Q(\delta_T)}\right)$ of the option price expressed in a unitless fashion, i.e., relative to the current risk tolerance.

iii) Indifference Prices and General Numeraires

Recall that the wealth $X_T$ at time $T$, given in (1.69), is expressed in the spot units. Observe that if the stock price is taken as the numeraire, the wealth will be expressed in the number of shares of stock and not in a dollar amount. Specifically, the terminal wealth $X_T^S$ is given by
\[ X_T^S = \frac{x}{S_T} + \alpha\left(\frac{1}{1+r} - \frac{S_0}{S_T}\right) \]
and the current wealth, equal to the number of shares at time 0, by
\[ X_0^S = \frac{x}{S_0} = x^S. \]
Note that $X_T$ is discounted to time 0, and hence $X_T^S$ is the time 0 equivalent of the number of shares held in the portfolio at time $T$. The related value function is given by
\[ V^{S,C_T}(x^S) = \sup_{\alpha} E_P\left(-e^{-\gamma^S(S_T)\left(X_T^S - \frac{C_T}{S_T(1+r)}\right)}\right), \tag{1.103} \]
where $\gamma^S(S_T)$ represents the risk aversion associated with this unit.
The following definition is a direct extension of Definition 1.1. Notice that in the current unit framework, the indifference price is expressed in number of shares and not as a dollar amount.

Definition 1.12 The indifference price of the claim $C_T$ is defined as the number of shares $\nu^S(C_T)$ for which the two value functions $V^{S,C_T}$ and $V^{S,0}$, defined in (1.103) and corresponding to claims $C_T$ and 0, coincide. Namely, it is the number of stock shares $\nu^S(C_T)$ satisfying
\[ V^{S,0}(x^S) = V^{S,C_T}(x^S + \nu^S(C_T)) \tag{1.104} \]
for any initial number of shares $x^S \in \mathbb{R}$.
Proposition 1.25 Let $Q^S$ be a measure under which the riskless bond discounted by the traded asset, $B_t/S_t$, $t = 0, T$, is a martingale and, at the same time, the conditional distribution of the nontraded asset, given the traded one, is preserved with respect to the historical measure $P$, i.e.,
\[ Q^S(Y_T \mid S_T) = P(Y_T \mid S_T). \tag{1.105} \]
Let $C_T = c(S_T, Y_T)$ be the claim to be priced under exponential preferences with state-dependent risk aversion coefficient $\gamma^S(S_T)$. Then, the indifference price of $C_T$, quoted in the number of shares of stock, is
\[ \nu^S(C_T) = E_{Q^S}\left(\frac{1}{\gamma^S(S_T)}\log E_{Q^S}\left(e^{\gamma^S(S_T)\frac{C_T}{S_T(1+r)}} \,\Big|\, S_T\right)\right). \tag{1.106} \]

Proof. We start with the specification of the measure $Q^S$. We recall that, given the choice of numeraire, the martingale in consideration is $B_t^S = B_t/S_t$, where $B_t$ and $S_t$ stand, respectively, for the original bond and stock process. We denote by $q_i^S$, $i = 1, \dots, 4$, the elementary probabilities of $Q^S$. Simple calculations yield that
\[ q_i^S = q^S\,\frac{p_i}{p_1+p_2},\ i = 1, 2 \quad\text{and}\quad q_i^S = (1-q^S)\,\frac{p_i}{p_3+p_4},\ i = 3, 4, \]
where
\[ q^S = \left(\frac{1}{\xi^d} - \frac{1}{1+r}\right)\frac{\xi^u\xi^d}{\xi^u-\xi^d}. \]
Alternatively, the measure $Q^S$ can be defined by its Radon-Nikodym density with respect to $Q$, namely,
\[ \frac{dQ^S}{dQ} = \frac{S_T}{(1+r)S_0}. \]
Indeed, we have
\[ E_{Q^S}\left(\frac{1+r}{S_T}\right) = E_Q\left(\frac{S_T}{(1+r)S_0}\,\frac{1+r}{S_T}\right) = \frac{1}{S_0}. \]
We next observe that the value function $V^{S,C_T}$ (cf. (1.103)) can be written as
\[ V^{S,C_T}(x^S) = \sup_{\alpha} E_P\left(-e^{-\lambda_T\left(x + \alpha\left(\frac{S_T}{1+r} - S_0\right) - \frac{C_T}{1+r}\right)}\right), \]
where
\[ \lambda_T = \frac{\gamma^S(S_T)}{S_T}. \tag{1.107} \]
Working as in the proof of Proposition 1.23, we get
\[ V^{S,C_T}(x^S) = V^{C_T}(x; \lambda_T) = -\exp\left(-\frac{x - \nu(C_T; \lambda_T)}{E_Q(\lambda_T^{-1})} - H(\tilde{Q} \mid P)\right), \]
where
\[ \frac{d\tilde{Q}}{dQ} = \frac{\lambda_T^{-1}}{E_Q(\lambda_T^{-1})}. \]
The wealth argument $x$ in $V^{C_T}(x; \lambda_T)$ as well as the price
\[ \nu(C_T; \lambda_T) = E_Q\left(\frac{1}{\lambda_T}\log E_Q\left(e^{\lambda_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \]
are expressed in the spot units and, hence, for all claims $C_T$ we have $\nu(C_T; \gamma_T) = \nu(C_T; \lambda_T)$. Considering, as in Proposition 1.4, claims of the form $C_T^1(\omega_1) = c_1$, $C_T^1(\omega_i) = 0$, $i = 2, 3, 4$ and $C_T^2(\omega_3) = c_3$, $C_T^2(\omega_i) = 0$, $i = 1, 2, 4$, we get $\gamma_T = \lambda_T$. Consequently, the risk aversion $\gamma^S(S_T)$ associated with the numeraire $S$ must satisfy $\gamma^S(S_T) = \gamma_T S_T$. We recall, however, that for any payoff, say $G$, dependent only on $S_T$,
\[ E_Q\left(\frac{G(S_T)}{1+r}\right) = S_0\,E_{Q^S}\left(\frac{G(S_T)}{S_T}\right). \]
Therefore,
\begin{align*}
\nu(C_T; \gamma_T) &= E_Q\left(\frac{1}{\gamma_T}\log E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \\
&= (1+r)S_0\,E_{Q^S}\left(\frac{1}{\gamma_T S_T}\log E_Q\left(e^{\gamma_T\frac{C_T}{1+r}} \,\Big|\, S_T\right)\right) \\
&= (1+r)S_0\,E_{Q^S}\left(\frac{1}{\gamma^S(S_T)}\log E_{Q^S}\left(e^{\gamma^S(S_T)\frac{C_T}{S_T(1+r)}} \,\Big|\, S_T\right)\right).
\end{align*}
The statement then follows because the quantity (cf. (1.98))
\[ \frac{\nu(C_T; \gamma_T)}{(1+r)S_0} = E_{Q^S}\left(\frac{1}{\gamma^S(S_T)}\log E_{Q^S}\left(e^{\gamma^S(S_T)\frac{C_T}{S_T(1+r)}} \,\Big|\, S_T\right)\right) \]
is the indifference price quoted in the equivalent number of shares. □
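The change of measure used in the proof of Proposition 1.25 can be illustrated with a few lines of arithmetic. The sketch below, with hypothetical names and numbers, checks two identities from the proof: that the up probability $q^S$ agrees with the density $dQ^S/dQ = S_T/((1+r)S_0)$ applied to $q$, and that $\nu(C_T; \gamma_T)$ equals $(1+r)S_0$ times the right-hand side of (1.106) when $\gamma^S(S_T) = \gamma_T S_T$.

```python
import math


def conditional_log(gamma, payoffs, probs, scale):
    """log E(exp(gamma * payoff * scale) | S_T) for one branch of the tree."""
    total = probs[0] + probs[1]
    return math.log((math.exp(gamma * payoffs[0] * scale) * probs[0]
                     + math.exp(gamma * payoffs[1] * scale) * probs[1]) / total)


def spot_price(c, p, gamma_u, gamma_d, q, r):
    """nu(C_T; gamma_T) from (1.98), in spot units."""
    up = conditional_log(gamma_u, c[:2], p[:2], 1.0 / (1.0 + r))
    down = conditional_log(gamma_d, c[2:], p[2:], 1.0 / (1.0 + r))
    return q * up / gamma_u + (1.0 - q) * down / gamma_d


def share_quoted_price(c, p, gamma_u, gamma_d, q_s, s_u, s_d, r):
    """Right-hand side of (1.106) with gamma^S(S_T) = gamma_T * S_T."""
    g_su, g_sd = gamma_u * s_u, gamma_d * s_d
    up = conditional_log(g_su, c[:2], p[:2], 1.0 / (s_u * (1.0 + r)))
    down = conditional_log(g_sd, c[2:], p[2:], 1.0 / (s_d * (1.0 + r)))
    return q_s * up / g_su + (1.0 - q_s) * down / g_sd


# Hypothetical inputs, chosen only for illustration.
s0, xi_u, xi_d, r = 100.0, 1.2, 0.9, 0.05
gamma_u, gamma_d = 0.12, 0.08
c, p = (10.0, 5.0, 2.0, 0.0), (0.35, 0.25, 0.2, 0.2)

q = ((1.0 + r) - xi_d) / (xi_u - xi_d)
q_s = (1.0 / xi_d - 1.0 / (1.0 + r)) * xi_u * xi_d / (xi_u - xi_d)

nu = spot_price(c, p, gamma_u, gamma_d, q, r)
nu_shares = share_quoted_price(c, p, gamma_u, gamma_d, q_s, s0 * xi_u, s0 * xi_d, r)
print(q_s, q * xi_u / (1.0 + r), nu, (1.0 + r) * s0 * nu_shares)
```

The first two printed numbers should coincide, as should the last two.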
1.2.6 Functions of Value and Utility

The analysis on numeraires in the previous section exposed the important fact that the arguments of the functions of value and utility can be arbitrary as long as we are careful with the units in which the relevant quantities are expressed. Specifically, we saw that in order to refer to the same value and utility, independently of the arguments, and, also, in order to eliminate static arbitrage opportunities from our model, one needs to ensure that the risk aversion multiplied by wealth represents the same quantity independently of the wealth units. This produced necessary and sufficient conditions on risk preferences in spot and forward units, and for the case of general numeraires. In other words, natural properties of indifference prices imposed a certain structure on preferences and, in turn, on the involved utility and value functions. As one moves to a multiperiod setup and to more complex dynamic payoffs, additional price considerations are expected to impose further structural properties on the supporting utilities and value functions. As a consequence, the concept of utility, as a static expression of risk attitude, will need to be put in the correct dynamic context in order to ultimately generate consistent prices across times, units, etc. Even though the single-period framework prevents us from exposing such issues, we nevertheless provide below some motivational observations.

For convenience, we fix the benchmark risk aversion parameter $\gamma_T = \gamma(S_T)$ as representing our aversion to risk associated with wealth expressed in the spot units, i.e., discounted to the current time. Whenever convenient, we also use the corresponding risk tolerance $\delta_T = \gamma_T^{-1}$ (cf. (1.85)). From the previous analysis, the utility and value function in the single-period Merton problem ([187]) can be written as
\[ U(x; \delta_T) = -e^{-\frac{x}{\delta_T}} \quad\text{and}\quad V^0(x; \delta_T) = -e^{-\frac{x}{E_Q(\delta_T)} - H(Q^* \mid P)}, \tag{1.108} \]
where
\[ \frac{dQ^*}{dQ} = \frac{\delta_T}{E_Q(\delta_T)}, \]
and $Q$ as in (1.8). We recall that indifference prices are built via the values of the relevant investment opportunities and that these values are generated by the preference structure and the assumptions on the market. In order, therefore, to build consistent pricing systems, one needs to correctly specify the investor's preferences across times and to understand the interplay between utilities and value functions. To gain some intuition, let us simply look at how zero wealth is being valued at the beginning and at the end of the trading horizon. Setting $x = 0$ in (1.108) yields
\[ V^0(0; \delta_T) = -e^{-H(Q^* \mid P)}. \]
This is the value of zero wealth at the beginning of the period, as measured by the value function. Hence, it depends on the entropy term $H(Q^* \mid P)$ and thus on the model. On the other hand, the utility of zero wealth at the end of the time period $T$, as measured by the utility function, equals $U_T(0; \delta_T) = -1$. We then say that the utility value of zero wealth is $-1$.$^4$ Alternatively, the investor may also want to associate $-1$ to the value of zero wealth at the beginning of the period, thus making it independent of the model. This can be easily achieved by normalizing the utility function (1.108) accordingly. Indeed, let
\[ \bar{U}(x; \delta_T) = -e^{-\frac{x}{\delta_T} + H(Q^* \mid P)} \tag{1.109} \]
represent the utility of wealth $x$ at time $T$. Obviously, at the beginning of the period, the associated value function $\bar{V}^0(x; \delta_T)$ satisfies
\[ \bar{V}^0(x; \delta_T) = e^{H(Q^* \mid P)}\,V^0(x; \delta_T) = -e^{-\frac{x}{E_Q(\delta_T)}}. \tag{1.110} \]
The corresponding values of zero wealth then become
\[ \bar{U}(0; \delta_T) = -e^{H(Q^* \mid P)} \]
and
\[ \bar{V}^0(0; \delta_T) = e^{H(Q^* \mid P)}\,V^0(0; \delta_T) = -1. \]
The above considerations might look pedantic in the single-period setting. However, in the multiperiod case, one needs to reconcile single-period and multiperiod concepts of the value and utility functions. Indeed, when dealing concurrently with investment problems over multiple investment horizons, one needs to identify the utility function at a given horizon with the value function obtained by solving the optimal investment problem over the next time period. Together, the functions of utility and value lead to the natural concept of a dynamic utility or of a term structure of utilities. Such a utility is a function of wealth, the investment horizon, and the market model. In the above simple case, the dynamic utility for the beginning and the end of period, as given by the functions
\[ U^F(x, 0) = -e^{-\frac{x}{E_Q(\delta_T)}}, \qquad U^F(x, T) = -e^{-\frac{x}{\delta_T} + H(Q^* \mid P)}, \tag{1.111} \]
is normalized at the beginning of the period. We call it the forward utility. On the other hand, the dynamic utility, given by
\[ U^B(x, 0, T) = -e^{-\frac{x}{E_Q(\delta_T)} - H(Q^* \mid P)}, \qquad U^B(x, T, T) = -e^{-\frac{x}{\delta_T}}, \tag{1.112} \]
is normalized at the end of the time period. We call it the backward utility.

$^4$ An investor may prefer to associate 0, rather than $-1$, utility value with zero wealth. This requirement is easily met by adding the normalizing constant 1 to $U_T(x; \delta_T)$. We choose not to do it because this does not change anything in our analysis and lengthens many expressions.
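The normalization of the forward utility (1.111) can be illustrated numerically. In the sketch below, which uses hypothetical names and numbers, terminal wealth is $x + \alpha(S_T/(1+r) - S_0)$ as in (1.69), the expected terminal forward utility is maximized over $\alpha$ by a ternary search, and the result is compared with $U^F(x, 0)$, in line with (1.110). Since no claim is present, only the up and down probabilities of the traded asset enter.

```python
import math


def maximize_concave(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


# Hypothetical inputs, chosen only for illustration.
s0, xi_u, xi_d, r = 100.0, 1.2, 0.9, 0.05
p_up, p_down = 0.6, 0.4                     # P(S_T = S^u), P(S_T = S^d)
delta_u, delta_d = 1.0 / 0.12, 1.0 / 0.08   # risk tolerance in each state

q = ((1.0 + r) - xi_d) / (xi_u - xi_d)
mean_tolerance = q * delta_u + (1.0 - q) * delta_d    # E_Q(delta_T)
q_star = q * delta_u / mean_tolerance                 # up probability under Q*
entropy = (q_star * math.log(q_star / p_up)
           + (1.0 - q_star) * math.log((1.0 - q_star) / p_down))


def forward_utility_initial(x):
    """U^F(x, 0) in (1.111)."""
    return -math.exp(-x / mean_tolerance)


def forward_utility_terminal(wealth, delta):
    """U^F(x, T) in (1.111), evaluated in a state with risk tolerance delta."""
    return -math.exp(-wealth / delta + entropy)


def optimized_terminal_utility(x):
    """sup over alpha of E_P[U^F(X_T, T)] with X_T as in (1.69)."""
    def expected_utility(alpha):
        w_up = x + alpha * (s0 * xi_u / (1.0 + r) - s0)
        w_down = x + alpha * (s0 * xi_d / (1.0 + r) - s0)
        return (p_up * forward_utility_terminal(w_up, delta_u)
                + p_down * forward_utility_terminal(w_down, delta_d))

    return expected_utility(maximize_concave(expected_utility))


for wealth in (0.0, 3.0):
    print(wealth, optimized_terminal_utility(wealth), forward_utility_initial(wealth))
```

At zero wealth both numbers should be close to $-1$, reflecting that the forward utility is normalized at the beginning of the period.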
Chapter Two

Utility Indifference Pricing: An Overview

Vicky Henderson and David Hobson
2.1 INTRODUCTION The idea of gamblers ranking risky lotteries by their expected utilities dates back to Bernoulli [22]. An individual’s certainty equivalent amount is the certain amount of money that makes them indifferent between the return from the gamble and this amount, as described in Chapter 6 of Mas-Colell et al. [184]. Certainty equivalent amounts and the principle of equi-marginal utility (see Jevons [141]) have been used by economists for many years. More recently, these concepts have been adapted for derivative security pricing. Consider an investor receiving a particular derivative or contingent claim offering payoff CT at future time T > 0. When there is a financial market, and the market is complete, the price the investor would pay can be found uniquely. Option pricing in complete markets uses the idea of replication whereby a portfolio in stocks and bonds re-creates the terminal payoff of the option, thus removing all risk and uncertainty. The unique price of the option is given by the law of one price—the initial wealth necessary to fund the replicating portfolio. However, in reality, most situations are incomplete, and complete models are only an approximation to this. Market frictions—for example, transactions costs, nontraded assets, and portfolio constraints—make perfect replication impossible. In such situations, many different option prices are consistent with no-arbitrage, each corresponding to a different martingale measure. There is no longer a unique price. However, in this situation, even with an incomplete financial market, the investor can maximize expected utility of wealth and may be able to reduce the risk due to the uncertain payoff through dynamic trading. She would be willing to pay a certain amount today for the right to receive the claim such that she is no worse off in expected utility terms than she would have been without the claim. 
Hodges and Neuberger [131] were the first to adapt the static certainty equivalence concept to a dynamic setting like the one described. Similar ideas are also important in actuarial mathematics. There is a valuation method called the “premium principle of equivalent utility” which has desirable properties if utility is of the exponential form; see Gerber [105] for details. The method described above, termed utility indifference pricing, and the subject of this book, is one approach that can be taken in an incomplete market.
Other potential approaches that will be discussed in this overview include the selection of one particular measure according to a minimal distance criterion (for example, the minimal martingale measure or the minimal entropy measure), superreplication, and convex risk measures. The advantages of utility indifference pricing include its economic justification and incorporation of risk aversion. It leads to a price that is nonlinear in the number of units of claim, which is in contrast to prices in complete markets and some of the alternatives mentioned above. The indifference price reduces to the complete market price, which is a necessary feature of any good pricing mechanism. Indifference prices can also incorporate wealth dependence. This may be desirable as the price an investor is willing to pay could well depend on the current position of his derivative book. Although we concentrate on pricing issues here, utility indifference also gives an explicit identification of the hedge position. This is found naturally as part of the optimization problem. Limitations of the indifference pricing methodology include the fact that explicit calculations may be done in only a few concrete models, mainly for exponential utility. Exponential utility has the feature that the wealth or initial endowment of the investor factors out of the problem, which makes the mathematics tractable but is also a strong assumption. Different investors with varying initial wealths are unlikely to assign the same value to a claim. Practically, users may not be satisfied with the concept of utility functions and may be unable to specify the required risk aversion coefficient. In this overview, we will introduce utility indifference pricing and give a survey of the literature, covering both theory and applications. Inevitably, such a survey reflects the authors' experience and interests.
2.2 UTILITY FUNCTIONS

We begin by introducing utility functions that are central to indifference pricing. Define a utility function $U(x)$ as a twice continuously differentiable function with the property that $U(x)$ is strictly increasing and strictly concave. Utility functions are increasing to reflect that investors prefer more wealth to less, and concave because investors are risk averse; see Copeland and Weston [53] and Mas-Colell et al. [184]. A utility function can be defined either over the positive real line or over the whole real line. A popular example of the former is the power utility
\[ U(x) = \frac{x^{1-R}}{1-R}; \quad R \ne 1,\ R > 0, \]
while the exponential utility
\[ U(x) = -\frac{1}{\gamma}e^{-\gamma x}; \quad \gamma > 0 \]
is an example of the latter. We will describe both of these in more detail below. The main distinction concerns whether wealth is restricted to be positive or allowed to become negative.
2.2.1 HARA Utilities

A useful quantity for the discussion of utilities is the coefficient of absolute risk aversion (due to Arrow [4] and Pratt [224]), given by

$$R_a(x) = -\frac{U''(x)}{U'(x)}.$$

A utility function is of the HARA class if $R_a(x)$ satisfies

$$R_a(x) = \frac{1}{A + Bx}, \qquad x \in I_D, \tag{2.1}$$
where $I_D$ is the interval on which $U$ is defined (or equivalently the interval on which the denominator is positive) and $B$ is a nonnegative constant. The constant $A$ is such that $A > 0$ if $B = 0$, whereas $A$ can take any value if $B$ is positive. If $B > 0$ then $U(x) = -\infty$ for $x < -A/B$ and $I_D = (-A/B, \infty)$. Conversely, if $B = 0$, then $I_D = \mathbb{R}$, and $U$ is finite valued for all wealths. These definitions ensure $U$ is concave and increasing. Our notation can easily be made consistent with that of Merton [188]. If $B > 0$ and $B \neq 1$, then integration leads to

$$U(x) = \frac{C}{B-1}(A + Bx)^{1-\frac{1}{B}} + D; \qquad C > 0,\ D \in \mathbb{R},\ x > -A/B,$$

where $C$ and $D$ are constants of integration. This is called the extended power utility function; see Huang and Litzenberger [133]. If $A = 0$, this becomes the well-known narrow power utility function:

$$U(x) = \frac{C B^{-1/B}}{B-1}\, B x^{1-\frac{1}{B}} + D; \qquad C > 0,\ D \in \mathbb{R},\ x > 0.$$

It is more usually written with $R = 1/B$ and $D = 0$, $C = B^{1/B}$, giving

$$U(x) = \frac{x^{1-R}}{1-R}, \qquad R \neq 1.$$
The narrow power utility has constant relative risk aversion of $R$, where relative risk aversion $R_r(x)$ is defined to be $R_r(x) = xR_a(x)$. Returning to further utility functions in the HARA class, for $B = 1$ we have

$$U(x) = C \ln(A + x) + E; \qquad C > 0,\ E \in \mathbb{R},\ x > -A,$$

the logarithmic utility function. Taking $A = 0$, $E = 0$, $C = 1$ gives the standard or narrow form. Finally, for $B = 0$,

$$U(x) = -\frac{F}{A} e^{-x/A} + G; \qquad F > 0,\ A > 0,\ G \in \mathbb{R},\ x \in \mathbb{R},$$

the exponential utility function. It is usual to take $G = 0$, $A = 1/\gamma$, $F = 1/\gamma^2$. For this utility function, $R_a(x) = \gamma$, a constant.
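The HARA condition (2.1) can be checked numerically for the three narrow forms above. The following is a minimal sketch, not part of the original text: the parameter values, the function names, and the finite-difference step are illustrative assumptions.

```python
import math

def abs_risk_aversion(U, x, h=1e-4):
    """Estimate R_a(x) = -U''(x) / U'(x) by central finite differences."""
    first = (U(x + h) - U(x - h)) / (2.0 * h)
    second = (U(x + h) - 2.0 * U(x) + U(x - h)) / (h * h)
    return -second / first

R = 2.0      # relative risk aversion of the power utility (illustrative value)
GAMMA = 0.5  # absolute risk aversion of the exponential utility (illustrative value)

def power_utility(x):
    return x ** (1.0 - R) / (1.0 - R)

def log_utility(x):
    return math.log(x)

def exp_utility(x):
    return -math.exp(-GAMMA * x) / GAMMA

# (name, utility, A, B) such that R_a(x) = 1 / (A + B x), as in (2.1)
HARA_CASES = [
    ("power", power_utility, 0.0, 1.0 / R),
    ("logarithmic", log_utility, 0.0, 1.0),
    ("exponential", exp_utility, 1.0 / GAMMA, 0.0),
]

for name, U, A, B in HARA_CASES:
    for x in (0.5, 1.0, 3.0):
        numeric = abs_risk_aversion(U, x)
        exact = 1.0 / (A + B * x)
        print(f"{name:12s} x={x:3.1f}  numeric={numeric:.6f}  1/(A+Bx)={exact:.6f}")
```

In each case the finite-difference estimate should agree with $1/(A + Bx)$ up to discretization error: $R/x$ for the power utility, $1/x$ for the logarithmic utility, and the constant $\gamma$ for the exponential utility.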
2.2.2 Non-HARA Utilities

We now briefly discuss some utility functions that do not fit into the HARA class. An important example is the quadratic utility that has received much attention historically in the literature. Taking $B = -1$, $A > 0$ in (2.1) gives

$$U(x) = x - \frac{1}{2A} x^2, \qquad x \in \mathbb{R}.$$

This decreases over part of the range, violating the assumption that investors desire more wealth and so have an increasing utility function, but has excellent tractability properties.

When used to price options, some of the common utility functions discussed in the previous sections have shortcomings. For instance, we shall show that it is not possible to price a short call option with exponential or power-type utilities. The power utility requires wealth be non-negative and the exponential, although allowing for negative wealth, gives problems at least in the case of lognormal models. To circumvent these problems, it is worth mentioning another moderately tractable family of utilities. These are functions of the form

$$U(x) = \frac{1}{\kappa}\left(1 + \kappa x - \sqrt{1 + \kappa^2 x^2}\right); \qquad \kappa > 0. \tag{2.2}$$

This class penalizes negative wealth less severely.

We will also briefly introduce some special utilities related to other methodologies for pricing in incomplete markets. First, consider the concept of superreplication. A claim is superreplicated if the hedging portfolio is guaranteed to produce at least the payoff of the claim. The superreplication price is the smallest initial fortune with which it is possible to superreplicate the payoff of the claim with probability one. It is the supremum of the possible prices consistent with no arbitrage, and is therefore often unrealistically high. The superreplication price corresponds to a utility function of the form

$$U(x) = \begin{cases} -\infty, & x < 0, \\ 0, & x \geq 0, \end{cases}$$

although this does not satisfy our formal definition. A second approach is that of shortfall hedging, which minimizes expected loss; see Föllmer and Leukert [88]. This criterion corresponds to a utility function of the form $U(x) = -x^-$.

2.2.3 Utilities and the Legendre-Fenchel Transform

Denote by $I$ the inverse of the strictly decreasing mapping $U'$ from $(-\infty, \infty)$ onto itself. If $B > 0$, we have

$$I(y) = \frac{1}{B}\left[\left(\frac{y}{C}\right)^{-B} - A\right], \tag{2.3}$$
whereas if $B = 0$, $I(y) = -A \ln(y/F)$. The convex conjugate function $\tilde{U}$ of the utility function is the Legendre-Fenchel transform of the convex function $-U$, that is,

$$\tilde{U}(y) := \sup_{x > 0}\{-xy + U(x)\}, \qquad y > 0. \tag{2.4}$$

For $B$ positive with $B \neq 1$,

$$\tilde{U}(y) = \frac{C^{-B}}{B(B-1)}\, y^{1-B} + \frac{A}{B}\, y + D,$$

whereas for $B = 1$, $\tilde{U}(y) = -C \ln y + Ay + (E - C - C \ln C)$, and for $B = 0$, $\tilde{U}(y) = Ay \ln y - (A \ln F + 1/A)y + G$. Note also that unifying these gives another characterization of HARA utilities, those with

$$\tilde{U}''(y) = \frac{H}{y^{1+B}},$$

for a positive constant $H$. Of course, both $I$ and $\tilde{U}$ in (2.3) and (2.4) can be defined for any utility, not just the HARA class. Much of the general duality theory can be extended to arbitrary utility functions, at least so long as the utility satisfies the reasonable asymptotic elasticity property of Kramkov and Schachermayer [162].
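As an illustration of the transform, the conjugate of the exponential utility $U(x) = -\frac{1}{\gamma}e^{-\gamma x}$ can be computed in two ways: by substituting $I(y)$ into $U(x) - xy$, and by a brute-force supremum over a grid. The sketch below is not from the original text. It assumes an illustrative value of $\gamma$, takes the supremum over the whole domain of $U$ (the real line for this utility), and uses hypothetical function names and grid bounds.

```python
import math

GAMMA = 2.0  # illustrative risk aversion coefficient

def utility(x):
    """Exponential utility U(x) = -(1/gamma) exp(-gamma x)."""
    return -math.exp(-GAMMA * x) / GAMMA

def inverse_marginal(y):
    """I(y): the solution x of U'(x) = exp(-gamma x) = y."""
    return -math.log(y) / GAMMA

def conjugate_closed_form(y):
    """U~(y) = U(I(y)) - y I(y), which simplifies to (y / gamma) (ln y - 1)."""
    x_star = inverse_marginal(y)
    return utility(x_star) - x_star * y

def conjugate_brute_force(y, lo=-10.0, hi=10.0, n=20001):
    """Supremum of U(x) - x y over a uniform grid of x values (assumed bounds)."""
    step = (hi - lo) / (n - 1)
    return max(utility(lo + i * step) - (lo + i * step) * y for i in range(n))

for y in (0.1, 1.0, 5.0):
    print(y, conjugate_closed_form(y), conjugate_brute_force(y))
```

The grid value can only underestimate the supremum, so it should sit just below the closed form. For this utility the second derivative of the conjugate is $1/(\gamma y)$, which has the form $H/y^{1+B}$ with $B = 0$.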
2.3 UTILITY INDIFFERENCE PRICES: DEFINITIONS

The utility indifference buy (or bid) price $p^b$ is the price at which the investor is indifferent (in the sense that his expected utility under optimal trading is unchanged) between paying nothing and not having the claim $C_T$ and paying $p^b$ now to receive the claim $C_T$ at time $T$. Consider the problem with $k > 0$ units of the claim. Assume the investor initially has wealth $x$ and zero endowment of the claim. The definitions extend to cover nonzero endowments also. Define

$$V(x, k) = \sup_{X_T \in \mathcal{A}(x)} E\,U(X_T + kC_T), \tag{2.5}$$

where the supremum is taken over all wealths $X_T$ that can be generated from initial fortune $x$. The utility indifference buy price $p^b(k)$ is the solution to

$$V(x - p^b(k), k) = V(x, 0). \tag{2.6}$$

That is, the investor is willing to pay at most the amount $p^b(k)$ today for $k$ units of the claim $C_T$ at time $T$. Similarly, the utility indifference sell (or ask) price $p^s(k)$ is
the smallest amount the investor is willing to accept in order to sell $k$ units of $C_T$. That is, $p^s(k)$ solves

$$V(x + p^s(k), -k) = V(x, 0). \tag{2.7}$$
Formally, the definitions of these two quantities can be related by $p^b(k) = -p^s(-k)$, and with this in mind, we let $p(k)$ denote the solution to (2.6) for all $k \in \mathbb{R}$. Utility indifference prices are also known as reservation prices; for example, see Munk [192]. Also used in the finance literature is the terminology private valuation, which emphasizes that the proposed price is for an individual with particular risk preferences; see Teplá [259] and Detemple and Sundaresan [72]. Note, however, that utility indifference prices are calculated in a partial equilibrium in which the prices of traded assets are specified exogenously, rather than in a full equilibrium setting. Also note that in the standard formulation, agents choose a price as a function of quantity; the same ideas can be used to translate the problem so that the agent chooses a quantity given a price. Now this quantity can either be chosen to leave the agent indifferent between entering the transaction or not, or, more reasonably, the quantity can be chosen to maximize expected utility.

Utility indifference prices have a number of appealing properties. We will comment further on these properties later in specific model frameworks.

(i) Nonlinear Pricing. First, in contrast to the Black and Scholes price (and many alternative pricing methodologies in incomplete markets), utility indifference prices are nonlinear in the number of options, $k$. The investor is not willing to pay twice as much for twice as many options, but requires a reduction in this price to take on the additional risk. Second, and alternatively, a seller requires more than twice the price for taking on twice the risk. This property can be seen from the value function (2.5) since $U$ is a concave function. Put differently, the amount an agent is prepared to pay for a claim $C_T$ depends on his prior exposure to nonreplicable risk. We will assume throughout that this prior exposure is zero, and some of our conclusions (such as concavity below) depend crucially on this assumption.

(ii) Recovery of Complete Market Price. If the market is complete or if the claim $C_T$ is replicable, the utility indifference price $p(k)$ is equivalent to the complete market price for $k$ units. To show this, let $R_T$ denote the time $T$ value of one unit of currency invested at time 0. If $X_T \in \mathcal{A}(x)$, then we can write $X_T = xR_T + \tilde{X}_T$ for some $\tilde{X}_T \in \mathcal{A}(0)$, where $\mathcal{A}(0)$ is the set of claims which can be replicated with zero initial wealth. (We assume that $\tilde{X}_T \in \mathcal{A}(0)$ if and only if $\tilde{X}_T + xR_T \in \mathcal{A}(x)$.) Since $C_T$ is replicable, from an initial fortune $p^{BS}$, write $C_T = p^{BS}R_T + \tilde{X}_T^C$ where $\tilde{X}_T^C \in \mathcal{A}(0)$. The superscript BS is intended to denote the Black-Scholes or complete market price. Then for $X_T \in \mathcal{A}(x)$,

$$X_T + kC_T = (x + kp^{BS})R_T + \tilde{X}_T + k\tilde{X}_T^C = (x + kp^{BS})R_T + \tilde{X}_T',$$

where $\tilde{X}_T' \in \mathcal{A}(0)$. Thus $X_T + kC_T \in \mathcal{A}(x + kp^{BS})$ and so

$$V(x, k) = \sup_{X_T \in \mathcal{A}(x)} E\,U(X_T + kC_T) = \sup_{X_T \in \mathcal{A}(x + kp^{BS})} E\,U(X_T) = V(x + kp^{BS}, 0)$$
and thus $p(k) = kp^{BS}$. That is, the indifference price for $k$ units is simply $k$ times the complete market price $p^{BS}$.

(iii) Monotonicity. Let $p^i$ be the utility indifference price for one unit of payoff $C_T^i$ and let $C_T^1 \leq C_T^2$. Then $p^1 \leq p^2$. This follows directly from (2.5) and the price definitions. Using properties (ii) and (iii), we have $p^{sub}(k) \leq p(k) \leq p^{sup}(k)$, where $p^{sup}$, $p^{sub}$ are the super and subreplicating prices for $k$ units of claim, respectively.

(iv) Concavity. Let $p_\lambda$ be the utility indifference price for the claim $\lambda C_T^1 + (1-\lambda)C_T^2$ where $\lambda \in [0, 1]$. Then $p_\lambda \geq \lambda p^1 + (1-\lambda)p^2$. This can be shown as follows. Let $X_T^i$ be the optimal target wealth for an individual with initial wealth $x - p^i$ due to receive the claim $C_T^i$. Then, by definition

$$V(x, 0) = V(x - p^i, 1, C_T^i) = \sup_{X_T \in \mathcal{A}(x - p^i)} E\,U(X_T + C_T^i) = E\,U(X_T^i + C_T^i),$$

where the dependence on the claim $C_T^i$ has been made explicit in the notation. Define $\bar{X}_T = \lambda X_T^1 + (1-\lambda)X_T^2$. Then $\bar{X}_T \in \bar{\mathcal{A}} = \mathcal{A}(x - \lambda p^1 - (1-\lambda)p^2)$. Then

$$\begin{aligned}
V(x - \lambda p^1 - (1-\lambda)p^2, 1, \lambda C_T^1 + (1-\lambda)C_T^2) &= \sup_{X_T \in \bar{\mathcal{A}}} E\,U(X_T + \lambda C_T^1 + (1-\lambda)C_T^2) \\
&\geq E\,U(\bar{X}_T + \lambda C_T^1 + (1-\lambda)C_T^2) \\
&= E\,U(\lambda(X_T^1 + C_T^1) + (1-\lambda)(X_T^2 + C_T^2)) \\
&\geq \lambda E\,U(X_T^1 + C_T^1) + (1-\lambda)E\,U(X_T^2 + C_T^2) \\
&= V(x, 0) \\
&= V(x - p_\lambda, 1, \lambda C_T^1 + (1-\lambda)C_T^2).
\end{aligned}$$

Since $V$ is increasing in its first argument, it follows that $p_\lambda \geq \lambda p^1 + (1-\lambda)p^2$. Note that if we consider sell prices rather than buy prices, then $p_\lambda$ is convex rather than concave.

As can be seen from (2.6), to compute the utility indifference price of a claim, two stochastic control problems must be solved. The first is the optimal investment problem when the investor has a zero position in the claim. These optimal
investment problems date back to Merton [186, 187]. Merton used dynamic programming to solve for an investor’s optimal portfolio in a complete market where asset prices follow Markovian diffusions. This approach leads to Hamilton-Jacobi-Bellman (HJB) equations and a PDE for the value function representing the investor’s maximum utility. Merton was able to solve such PDEs analytically in a number of now well-known special cases. The second is the optimal investment problem when the investor has bought or sold the claim. This optimization involves the option payoff, and problems are usually formulated as one of stochastic optimal control and again solved in the Markovian case using HJB equations. We will see an example of this in Section 2.5.4. An alternative solution approach is to convert this primal problem into the dual problem, which involves minimizing over martingale measures; see Section 2.5.5. Under this approach, the problems are no longer restricted to be Markovian in nature.

As well as the utility indifference price of the option, the hedging strategy of the investor is crucial. Since the market is incomplete, the hedge will not be perfect. The investor’s hedge arises from the optimization problem (2.5), where the optimization takes place over admissible strategies. The hedge typically involves a Merton term that would be the appropriate hedge for the no-option problem and an additional term that accounts for the option position.

The remaining concept to introduce is that of the marginal price. The marginal price is the utility indifference price for an infinitesimal quantity. As we will see later, marginal prices are linear pricing rules and amount to choosing a particular martingale measure. Marginal prices are commonly used in economics, and have been proposed in an option pricing context in various forms by Davis [60, 61], Karatzas and Kou [150], and Kallsen [149].

2.4 DISCRETE TIME APPROACH TO UTILITY INDIFFERENCE PRICING

The problem of pricing European options on nontraded assets in a binomial model was first tackled in Smith and Nau [254] in the context of real options and by Detemple and Sundaresan [72] as part of a study of the effect of portfolio constraints; see Section 2.6.2. Both papers consider options on a nontraded asset where a second, correlated asset is available for trading. Smith and Nau [254] treat European options where the investor has exponential utility. Detemple and Sundaresan [72] represent price movements in a trinomial model, which they solve numerically for the utility indifference price in the case of power utility. They also consider American-style options but in a simpler framework with no traded correlated asset. In this case, the investor cannot short-sell the underlying asset, and indifference prices are found numerically. More recently, Musiela and Zariphopoulou [199] (and Chapter 1 of this book) revisit the problem and place it in a formal mathematical setting. They derive the indifference value of a European option under exponential utility. Other utilities and the alternative dual approach are used in Chapter 9.
We can illustrate briefly the main ideas in a simple one-period binomial model where current time is denoted 0 and the terminal date is time 1. This exposition follows that of Musiela and Zariphopoulou [199]. The market consists of a riskless asset, a traded asset with price $P_0$ today and a nontraded asset with price $Y_0$ today. Assume the riskless asset pays no interest for simplicity. The traded price $P_0$ may move up to $P_0\psi^u$ or down to $P_0\psi^d$, where the random variable $\psi$ takes either value $\psi^u$, $\psi^d$ and $0 < \psi^d < 1 < \psi^u$. The nontraded price satisfies $Y_1 = Y_0\phi$, where $\phi = \phi^d, \phi^u$ and $\phi^d < \phi^u$. There are four states of nature corresponding to outcomes of the pair of random variables: $(\psi^u, \phi^u)$, $(\psi^u, \phi^d)$, $(\psi^d, \phi^u)$, $(\psi^d, \phi^d)$. Wealth $X_1$ at time 1 is given by $X_1 = \beta + \alpha P_1 = x + \alpha(P_1 - P_0)$, where $\alpha$ is the number of shares of stock held, $\beta$ is the money in the riskless asset, and $x$ is initial wealth. The investor is pricing $k$ units of a claim with payoff $C_1$ and has exponential utility. The value function in (2.5) becomes

$$V(x, k) = \sup_{\alpha} E\left[-\frac{1}{\gamma} e^{-\gamma(X_1 + kC_1)}\right],$$

and the utility indifference buy price defined in (2.6) solves $V(x, 0) = V(x - p^b(k), k)$. The price $p^b(k)$ is given by

$$p^b(k) = E^{Q^0}\left[\frac{1}{\gamma} \log E^{Q^0}\left(e^{\gamma k C_T} \,\middle|\, P_1\right)\right], \tag{2.8}$$

where $Q^0$ is the measure under which the traded asset $P$ is a martingale, and the conditional distribution of the nontraded asset given the traded one is preserved with respect to the real world measure $\mathbb{P}$. This is the minimal martingale measure of Föllmer and Schweizer [91]. In fact, in this simple setting, all minimal distance measures are identical, so $Q^0$ is also the minimal entropy measure of Frittelli [97]; see the discussion in Section 2.7.2. The price in (2.8) can be shown to satisfy properties (i)–(iv) in Section 2.3; see Chapter 1 of this book.

The above formulation shows that the utility indifference price (in a one-period model with exponential utility) can be written as a new nonlinear, risk-adjusted payoff, and then expectations are taken with respect to $Q^0$ of this new payoff. This is in contrast to the usual linear pricing structures found in complete markets and in other approaches to incomplete markets pricing. A similar representation appears in Smith and Nau [254].
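The one-period construction can be reproduced numerically by brute force: minimize the expected exponential over the stock holding $\alpha$, then solve $V(x - p, k) = V(x, 0)$ using the fact that wealth factors out of exponential utility. The sketch below is illustrative only. All parameter values and state probabilities are made up, the function names are hypothetical, and the claim is taken to be a call on the nontraded asset. The nested conditional-expectation form is written with the signs that solve (2.6) for these inputs, namely $p(k) = -\frac{1}{\gamma}E^{Q^0}[\log E^{Q^0}(e^{-\gamma k C_1} \mid P_1)]$; this sign convention is an assumption of the sketch, and under it the ask price $-p(-k)$ has the same nested form with positive signs.

```python
import math

GAMMA = 0.1                          # risk aversion (illustrative)
P0, PSI_U, PSI_D = 100.0, 1.1, 0.9   # traded asset
Y0, PHI_U, PHI_D = 100.0, 1.2, 0.8   # nontraded asset
STRIKE = 100.0

# Four states (psi, phi, real-world probability); the probabilities are made up.
STATES = [(PSI_U, PHI_U, 0.35), (PSI_U, PHI_D, 0.15),
          (PSI_D, PHI_U, 0.15), (PSI_D, PHI_D, 0.35)]
PROBS = [s[2] for s in STATES]
DP = [P0 * (s[0] - 1.0) for s in STATES]               # P1 - P0 in each state
CALL = [max(Y0 * s[1] - STRIKE, 0.0) for s in STATES]  # call on the nontraded asset
STOCK = [P0 * s[0] for s in STATES]                    # replicable payoff P1

def min_exp_moment(payoff):
    """min over alpha of E[exp(-gamma (alpha (P1 - P0) + payoff))]."""
    def moment(alpha):
        return sum(p * math.exp(-GAMMA * (alpha * d + c))
                   for p, d, c in zip(PROBS, DP, payoff))
    def slope(alpha):  # derivative of the convex function `moment`
        return sum(-GAMMA * d * p * math.exp(-GAMMA * (alpha * d + c))
                   for p, d, c in zip(PROBS, DP, payoff))
    lo, hi = -1.0, 1.0
    while slope(lo) > 0.0:
        lo *= 2.0
    while slope(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):  # bisection on the increasing slope
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return moment(0.5 * (lo + hi))

def price(k, payoff):
    """Solve V(x - p, k) = V(x, 0); x cancels because V(x, k) = -(1/gamma) e^{-gamma x} m_k."""
    m0 = min_exp_moment([0.0] * len(payoff))
    mk = min_exp_moment([k * c for c in payoff])
    return math.log(m0 / mk) / GAMMA

def nested_form(k, payoff):
    """-(1/gamma) E^{Q0}[log E(exp(-gamma k C) | P1)], sign convention assumed."""
    q_up = (1.0 - PSI_D) / (PSI_U - PSI_D)  # makes P a martingale at zero interest
    total = 0.0
    for psi, q in ((PSI_U, q_up), (PSI_D, 1.0 - q_up)):
        idx = [i for i, s in enumerate(STATES) if s[0] == psi]
        p_psi = sum(PROBS[i] for i in idx)
        inner = sum(PROBS[i] / p_psi * math.exp(-GAMMA * k * payoff[i]) for i in idx)
        total += q * (-math.log(inner) / GAMMA)
    return total

bid_1, bid_2 = price(1.0, CALL), price(2.0, CALL)
ask_1 = -price(-1.0, CALL)
print(bid_1, bid_2, ask_1, nested_form(1.0, CALL), price(1.0, STOCK))
```

With these inputs one would expect the brute-force bid to match the nested form, the bid for two units to be less than twice the bid for one unit, the ask to exceed the bid, and the price of the replicable payoff $P_1$ to equal $P_0$ per unit, in line with properties (i) to (iv) of Section 2.3.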
2.5 UTILITY INDIFFERENCE PRICING IN CONTINUOUS TIME

Consider a model on a stochastic basis $(\Omega, (\mathcal{F}_t)_{0 \leq t \leq T}, \mathbb{P})$. For simplicity we assume that the model supports a single traded asset with price process $P_t$ and a second auxiliary process $Y_t$, which may correspond to a related but nontraded stock or a diffusion process that drives the dynamics of $P_t$. For example $Y_t$ may represent the volatility of $P$. Suppose that the dynamics of $P_t$ and $Y_t$ are governed by the
stochastic differential equations (SDEs)

$$\frac{dP_t}{P_t} = \sigma_t(dB_t + \lambda_t\,dt) + r_t\,dt, \tag{2.9}$$

$$dY_t = a_t\,dW_t + b_t\,dt. \tag{2.10}$$
Here B and W are Brownian motions with correlation ρt , which together generate the filtration Ft and σ, λ, r, a, b, and ρ are adapted to Ft . The problem is to price a (typically nonnegative) contingent claim CT ∈ mFT , where T is the horizon time. We will concentrate on this bivariate model that is rich enough to contain interesting examples and to illustrate the central concepts of the theory, but simple enough for explicit solutions to sometimes exist. It is possible to extend the analysis to higher dimensions, but as we shall see it is already difficult to find solutions to the utility indifference pricing problem even in the two-dimensional special case. Throughout this overview we will assume that even though the asset Yt is not directly traded, its value is an observable quantity. When Yt is a hidden Markov process and its value needs to be estimated from the information contained in the filtration generated by the traded asset Pt , the issues become much more delicate; see Chapter 4. It is convenient to write the Brownian motion W as a composition of two independent Brownian motions Bt and Bt⊥ so that dWt = ρt dBt + ρt⊥ dBt⊥ , where ρt⊥ is the positive solution to ρt2 + (ρt⊥ )2 = 1. Note also that we have chosen to parameterize the traded asset P via its volatility σt and Sharpe ratio λt rather than volatility and drift. Of course, there is a simple relationship between the two parameterizations whereby the drift is given by rt + λt σt , but as we shall see the Sharpe ratio plays a fundamental role in the characterization of the solution to the utility indifference pricing problem. Moreover, it is the Sharpe ratio that determines whether an investment is a good deal. There are two canonical situations that fit into our general framework: Example 2.5.1 Nontraded assets problem Let Y represent the value of a security that is not traded, or on which trading is difficult or impossible for an agent because of liquidity or legal restrictions. 
Let P represent the value of a related asset such as the market index. The problem is to calculate a utility indifference price for a claim CT = C(YT ) on the nontraded asset. Davis [61] calls this example a model with basis risk. Examples might include a real option, or an executive stock option where the executive is forbidden from trading on the underlying stock; see Section 2.6. We shall identify a special case of this problem as the constant parameter case. By this we mean that σ, λ, r, and ρ are constants and Yt is an autonomous diffusion. The analysis of the problem does not depend on the precise specification of the dynamics of this process, but if Yt is to represent a stock price process it is most natural to take at = ηYt and bt = Yt (r + ηξ ), where ξ is the Sharpe ratio of the nontraded asset. This specification is common in the finance literature. Duffie and Richardson [75] considered the problem of determining the optimal hedge in this model under the assumption of a quadratic utility. The problem of finding a utility indifference price under exponential utility was studied by Tepla [259], who considered the case where Yt is Brownian motion. In the specific case where the claim is units of the nontraded
asset CT = YT , she derived an explicit formula for the utility indifference price. The exponential Brownian case was solved explicitly by Henderson and Hobson [125] and Henderson [119]; see also Section 2.5.3. They gave a general representation of the price of a claim that is a function of the nontraded asset CT = C(YT ); see (2.16) below. Subject to a transformation of variables this formula includes the Tepla result as a special case. Finally, Musiela and Zariphopoulou [198] observed that the same analysis carries over to arbitrary diffusion processes Yt . As we shall see, exponential utility and the nontraded assets model is one of the few examples for which an explicit form for the utility indifference price is known. Example 2.5.2 Stochastic volatility models The second important situation is when Y governs the volatility of the asset, so that σt = σ (Yt , t). In this setting the fundamental problem is to price a derivative, such as a call option, on the traded asset P , but it is also possible to consider options on volatility itself. We shall be interested in the situation where the Sharpe ratio depends on Yt and Yt is an autonomous diffusion (popular models include the Ornstein-Uhlenbeck process of Stein and Stein [255], and the square-root or Bessel process proposed by Hull and White [135] and investigated by Heston [126]). In this case, some progress can be made toward characterizing the solution, but unlike in the nontraded assets model there is no explicit representation of the utility indifference option price, even for common classes of utility functions. Note that if σ (Yt ) = Yt , then given observations on the asset price process it is possible to determine the quadratic variation and hence Yt2 . If Yt is modeled as a nonnegative process (such as in the Bessel process model), this means that Yt is adapted to the filtration generated by the price process FtP . 
With sufficient additional regularity conditions the filtration Ft can be identified with the filtration generated by the price process FtP . In this overview we will limit the analysis to the case where this identification is valid, not least because in the general case additional complications over filtering arise; see Rheinländer [110].
2.5.1 Martingale Measures and State-Price Densities

One common approach to option pricing in incomplete markets in the mathematical finance literature is to fix a measure $Q$ under which the discounted traded assets are martingales and to calculate option prices via expectation under this measure. This is related to the notion of a state-price density from economics. The advantage of using a state-price density $\zeta_T$ is that prices can be calculated as expectations under the physical measure: $p = E[\zeta_T C_T]$. In an incomplete market there is more than one martingale measure, or equivalently there are infinitely many state-price densities. In the model given by (2.9) and (2.10), it is straightforward to characterize the equivalent martingale measures; see Frey [94]. They are given by

$$\left.\frac{dQ}{d\mathbb{P}}\right|_{\mathcal{F}_T} = Z_T,$$
where $Z$ is a uniformly integrable martingale of the form

$$Z_s = \exp\left\{-\int_0^s \lambda_t\,dB_t - \frac{1}{2}\int_0^s \lambda_t^2\,dt - \int_0^s \chi_t\,dB_t^\perp - \frac{1}{2}\int_0^s \chi_t^2\,dt\right\}. \tag{2.11}$$

Here $\lambda_t$ is the Sharpe ratio of the traded asset, but $\chi_t$ is undetermined, save for the fact that for $Q$ to be a true probability measure it is necessary to have $E[Z_T] = 1$. An example of a candidate martingale $Z_t$ is given by the choice $\chi_t \equiv 0$ which leads to the minimal martingale measure of Föllmer and Schweizer [91]. The state-price densities $\zeta_T$ take the form

$$\zeta_s = \exp\left(-\int_0^s r_t\,dt\right) Z_s,$$

where $Z_s$ is given by (2.11), and have the property that $\zeta_t P_t$ is a $\mathbb{P}$-local martingale. If the interest rate is deterministic then the state-price density and the density of the martingale measure differ only by a positive constant. Otherwise they differ by a stochastic discount factor.

The martingale measures $Q^\chi$, associated martingales $Z^\chi$ and state-price densities $\zeta_T^\chi$ can all be parameterized by the process $\chi_t$ that governs the change of drift on the nontraded Brownian motion $B^\perp$, and we shall use the superscript $\chi$ to denote this dependence.

2.5.2 Numéraires

In a complete market there is just one martingale measure or state-price density. All options can be replicated, and the unique fair price for the option is given by the replication price. As we saw in Section 2.3 the replication price is also the utility indifference bid and ask price. The price calculated in this way does not depend on the choice of numéraire; see Geman et al. [103]. As a result we are free to choose any numéraire that is convenient for the calculations: for example, in an exchange or Margrabe [183] option, the analysis is greatly simplified if one of the assets in the exchange is used as numéraire.

Although in a complete market the fair price does not depend on the choice of numéraire, the definition of martingale measure does depend on the choice of numéraire; see Branger [33]. We have the relationship

$$\zeta_T = \frac{1}{N_T}\left.\frac{dQ^N}{d\mathbb{P}}\right|_{\mathcal{F}_T},$$

where $N_t$ is the numéraire, and $Q^N$ is a martingale measure for this numéraire. The formulas of the previous section are quoted with respect to the bank account numéraire. Again, note that the state-price density has the advantage of being numéraire independent.

For utility indifference pricing the situation is somewhat different. In an incomplete market there is risk, and an agent needs to specify the units in which these risks are to be measured, as well as the concave utility function. If the numéraire is
to be changed, then the utility needs to be modified, sometimes in a nontrivial and unnatural way, in order that the analysis remains consistent. We shall fix cash at time $T$ as the units in which utility is measured.

2.5.3 The Primal Approach

Recall that the utility indifference price of the claim $C_T$ is given as the solution to

$$V(x - p(k), k) = V(x, 0), \tag{2.12}$$

where $V(x, k) = \sup_\theta E[U(X_T^{x,\theta} + kC_T)]$. Here the notation $X_T^{x,\theta}$ denotes the terminal fortune of an investor with initial wealth $x$ who follows a trading strategy that consists of holding $\theta_t$ units of the traded asset. At this stage we are not very explicit about the set of attainable terminal wealths, except to say that $X_T^{x,\theta}$ is the terminal value of the wealth process satisfying $X_0^{x,\theta} = x$, the self-financing condition

$$dX_t^{x,\theta} = \theta_t\,dP_t + r_t(X_t^{x,\theta} - \theta_t P_t)\,dt, \tag{2.13}$$

and sufficient regularity conditions to exclude doubling strategies.

In order to calculate the utility indifference price it is necessary to solve a pair of optimization problems. As in the binomial setting there are two approaches to each problem, via primal and dual methods. We begin with a discussion of the primal approach for which it is necessary to assume that we are in a Markovian setting, and to consider the dynamic version of the optimization problem at an intermediate time $t$. Define

$$V(x, 0) = V(x, p, y, t) = \sup_\theta E_t\left[U(X_T^{x,\theta}) \,\middle|\, X_t = x,\ P_t = p,\ Y_t = y\right].$$

Using the observation that $V(x, p, y, t)$ is a martingale under the optimal strategy $\theta$, and a super-martingale otherwise, we have that $V$ solves an equation of the form

$$\sup_\theta L^\theta V = 0, \qquad V(x, p, y, T) = U(x),$$

where

$$\begin{aligned}
L^\theta f = {} & \frac{1}{2}\theta_t^2\sigma_t^2 p^2 f_{xx} + \frac{1}{2}\sigma_t^2 p^2 f_{pp} + \frac{1}{2}a_t^2 f_{yy} + \theta_t\sigma_t^2 p^2 f_{xp} + \theta_t\sigma_t a_t\rho_t p f_{xy} \\
& + \sigma_t a_t\rho_t p f_{py} + (\theta_t\sigma_t\lambda_t p + r_t x) f_x + (\sigma_t\lambda_t p + r_t p) f_p + b_t f_y + \dot{f}.
\end{aligned}$$

Here a subscript $t$ refers to an adapted process, whereas other subscripts refer to partial derivatives. Given that $L^\theta$ is quadratic in $\theta$, the optimization over $\theta$ is trivial and the problem can be reduced to solving a nonlinear Hamilton-Jacobi-Bellman equation in four variables.

If $P_t$ is an exponential Brownian motion with constant parameters (or more precisely if none of the parameters $\sigma$, $\lambda$, $r$, $a$, $b$, or $\rho$ depends on the price level), then
the traded asset value scales out of the problem and the dimension can be reduced by one. Further, in the nontraded asset model where $\sigma$, $\lambda$, $r$ and $\rho$ are independent of $Y_t$, the solution of the Merton problem is independent of $Y_t$, and again the number of state-variables can be reduced by one. Finally, for HARA utility functions it is possible to conjecture the dependence of the value function on wealth and again to reduce the number of dimensions. For example, for exponential utility wealth factors out of the problem, and it is possible to consider $V \equiv -(1/\gamma)e^{-\gamma x}V(p, y, t)$, where $V(p, y, T) = 1$.

2.5.4 The Nontraded Assets Model

Suppose we are in the nontraded asset problem with constant interest rates such that $P_t$ follows a constant parameter Black-Scholes model. Suppose that $Y_t$ is also a representation of a share price so that it is again natural to think of $Y_t$ as following an exponential Brownian motion:

$$\frac{dY_t}{Y_t} = \eta\,dW_t + (r + \eta\xi)\,dt.$$

It follows that with exponential utility

$$V(x, p, y, t) = -\frac{1}{\gamma}\exp\left(-\gamma x e^{r(T-t)} - \frac{\lambda^2(T-t)}{2}\right), \tag{2.14}$$

whereas for power utility

$$V(x, p, y, t) = \frac{x^{1-R}}{1-R}\, e^{(1-R)\lambda^2(T-t)/(2R)}\, e^{r(T-t)(1-R)}; \tag{2.15}$$

see Merton [186, 187].

Now consider the problem of evaluating the left-hand side of (2.12) under the assumption that $C_T = C(Y_T)$. At this stage we only sketch the details of the argument because we are going to give a fuller discussion via the dual approach in later sections. The only change from the analysis of the previous section is that the boundary condition becomes $V(x, p, y, T) = U(x + kC(y))$. Again, the above simplifications can be used to reduce the dimension of the problem (see Henderson and Hobson [125]), except that in this case it is not possible to remove the dependence on $y$.

Suppose that the agent has exponential utility. Then the nonlinear HJB equation can be linearized using the Hopf-Cole transformation (this idea was introduced to mathematical finance by Zariphopoulou [270] with the terminology distortion). It is now possible to write down the solution to this equation and the value function to the problem with the option (at $t = 0$) is given by (Henderson and Hobson [125]),

$$V(x, k) = -\frac{1}{\gamma}\exp\left(-\gamma x e^{rT} - \frac{\lambda^2 T}{2}\right)\left(E^{Q^0}\left[e^{-k\gamma(1-\rho^2)C(Y_T)}\right]\right)^{1/(1-\rho^2)}.$$

Here $Q^0$ is the minimal martingale measure: the measure under which the discounted traded asset is a martingale, but the law of the orthogonal martingale measure is unchanged. In the nontraded assets model, $Q^0$ is also the minimal
distance measure for any choice of distance metric, including the minimal entropy measure; see Section 2.7.2. It follows that the price can be expressed as p(k) = −
e−rT 0 2 ln EQ [e−kγ (1−ρ )C(YT ) ]. γ (1 − ρ 2 )
(2.16)
Observe that this price is independent of the initial wealth of the agent, and that it is a nonlinear concave function of k. When k > 0, and C is nonnegative, the bid price is well defined, but for k < 0 it may be that the price is infinite. (This is true if the claim is units of the asset Y , or calls on YT .) Thus, one of the disadvantages of exponential utility is that the ask price for many important examples of contingent claims is infinite. This is one of the motivations for considering the utility U defined in (2.2). Suppose now that the agent has power utility. In this case there is no known solution to the the HJB equation, although it can be solved numerically. Instead Henderson and Hobson [128] and Henderson [119] consider expansions in the number of claims k. Kramkov and Sirbu [163] have recently extended the analysis to more general models of volatilities. The former paper considers claims that are units of the nontraded asset, whereas the latter considers more general European claims. Henderson [119] finds that the utility indifference price is given by T Y 2 (C Y )2 k2 R 2 2 Q0 −rT Q0 p(k) = ke E [C(YT )] − e−rt t 0,∗ t dt + o(k 2 ), η (1 − ρ )E 2 x (Xt /x) 0 (2.17) 0 0,∗ −r(T −t) Q Y Et [C(YT )], Ct = ∂Ct /∂Y and Xt is the wealth process where Ct = e consistent with the optimal solution of the Merton problem in the absence of the claim. Note that by scaling Xt0,∗ /x is independent of x, so that the integral in the above expression is independent of initial wealth. The idea of the proofs in both [128] and [119] is that the value function in the presence of the claim can be approximated from below by considering a cleverly chosen, but suboptimal, wealth process Xt , and from above by considering a wellchosen state-price density ζT (see (2.20) below). Given upper and lower bounds on the value function, it is possible to deduce bounds on the option price, which agree to o(k 2 ). 
Given this expansion for the utility-indifference price of the claim it is possible to investigate the comparative statics of the price with respect to parameters such as initial wealth. Consider (2.17). As wealth increases, the second term becomes less important and the bid price rises. The first term in the expansion is independent of the relative risk aversion coefficient $R$ and of the initial wealth $x$ of the agent, and is the discounted expected payoff under the Föllmer-Schweizer minimal martingale measure. The second term is negative, which is consistent with a price function that is concave in $k$. Further, the first term is linear in the claim (in the sense that $E^{Q^0}[C_1(Y_T) + C_2(Y_T)] = E^{Q^0}[C_1(Y_T)] + E^{Q^0}[C_2(Y_T)]$), but the second term is nonlinear; recall the first property in Section 2.3. Again, if the claim is unbounded the ask price can be infinite, so that the above expansion is only valid in general for positive claims and positive $k$.
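As a rough numerical illustration of the exponential-utility price (2.16), the following Python sketch estimates $p(k)$ from simulated payoffs and compares the price per unit with the linear term $e^{-rT}E^{Q^0}[C(Y_T)]$. The lognormal law assumed for $Y_T$ under $Q^0$, every parameter value, and the function names are assumptions made for this illustration only; none of them comes from the text.

```python
import math
import random


def indifference_price(payoffs, k, gamma, rho, r, T):
    """Sample estimate of (2.16), given payoff samples C(Y_T) drawn under Q^0."""
    a = k * gamma * (1.0 - rho ** 2)
    # Shifted log-mean-exp to avoid overflow or underflow in exp().
    shift = max(-a * c for c in payoffs)
    mean = sum(math.exp(-a * c - shift) for c in payoffs) / len(payoffs)
    log_mean = shift + math.log(mean)
    return -math.exp(-r * T) / (gamma * (1.0 - rho ** 2)) * log_mean


def sample_call_payoffs(n, y0=100.0, strike=100.0, vol=0.3, drift=0.02, T=1.0, seed=0):
    """Hypothetical model: Y_T lognormal under Q^0; payoff is a call on Y_T."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        y_T = y0 * math.exp((drift - 0.5 * vol ** 2) * T + vol * math.sqrt(T) * z)
        out.append(max(y_T - strike, 0.0))
    return out


payoffs = sample_call_payoffs(20000)
gamma, rho, r, T = 0.05, 0.6, 0.02, 1.0
# Linear (small-k) price per unit: e^{-rT} E^{Q^0}[C(Y_T)].
linear = math.exp(-r * T) * sum(payoffs) / len(payoffs)
for k in (0.01, 1.0, 5.0):
    p = indifference_price(payoffs, k, gamma, rho, r, T)
    print(f"k={k:5.2f}  p(k)/k={p / k:9.4f}  linear={linear:9.4f}")
```

Because $p$ is concave with $p(0) = 0$, the per-unit bid price $p(k)/k$ printed by the loop should decrease in $k$ and approach the linear price as $k \to 0$.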
2.5.5 The Dual Approach
The primal approach involves finding
\[ \sup_{\theta} E\left[U\left(X_T^{x,\theta} + kC_T\right)\right]. \tag{2.18} \]
The dual approach involves solving (2.18) by translating the problem into one of minimization over state-price densities or martingale measures; see Karatzas et al. [151] and Cvitanic et al. [59]. In a complete market it is possible to write the set of attainable terminal wealths generated from an initial fortune $x$ and a self-financing strategy as the set of random variables which satisfy $E[\zeta_T X_T] \le x$. In an incomplete market this condition becomes $E[\zeta_T X_T] \le x$ for all state-price densities $\zeta_T$. This allows us to take a Lagrangian approach to solving (2.18). For all state-price densities $\zeta_T$, terminal wealths $X_T$ satisfying the budget constraint, and nonnegative Lagrange multipliers $\mu$,
\[ \begin{aligned} E[U(X_T + kC_T) - \mu(\zeta_T X_T - x)] &= E[U(X_T + kC_T) - \mu\zeta_T(X_T + kC_T) + \mu(x + \zeta_T kC_T)] \\ &\le \mu x + \mu k E[\zeta_T C_T] + E[\tilde{U}(\mu\zeta_T)], \end{aligned} \tag{2.19} \]
and, because of the budget constraint and $\mu \ge 0$, the left-hand side is at least $E[U(X_T + kC_T)]$,
where $\tilde{U}$ is the Legendre-Fenchel transform of $-U$ introduced earlier in (2.4). Optimizing over wealths on the one hand, and Lagrange multipliers and state-price densities on the other, we have
\[ \sup_{X_T} E[U(X_T + kC_T)] \le \inf_{\mu}\inf_{\zeta_T}\left\{\mu x + \mu k E[\zeta_T C_T] + E[\tilde{U}(\mu\zeta_T)]\right\}, \tag{2.20} \]
and provided certain regularity conditions are met (see, for example, Owen [207]), there is equality in this expression. The dual problem is to find the infimum on the right-hand side of (2.20). By considering the derivation of the dual problem, it is clear that if we can find suitable random variables $X_T^{k,*}$ and $\zeta_T^{k,*}$, and a constant $\mu^{k,*}$, such that $U'(X_T^{k,*} + kC_T) = \mu^{k,*}\zeta_T^{k,*}$, then there should be equality in (2.19), and hence $X_T^{k,*}$ is the optimal primal variable, and $\mu^{k,*}$ and $\zeta_T^{k,*}$ are the optimal dual variables.
Consider first the case where $k = 0$. In solving the Merton problem it is possible to ignore the presence of the process $Y_t$ and reduce the optimization problem to a complete market problem involving $P_t$ alone. In a complete market there is a unique state-price density, and finding the infimum over $\zeta_T$ is trivial. In this case the primal problem of maximizing over random variables $X_T$ is reduced to the problem of minimizing over a real-valued quantity. This illustrates the power of the dual method: for many problems the dual problem is a considerable simplification.
Let $\mu^{0,*}$ and $\zeta_T^{0,*}$ be the solutions to the dual problem when there is zero endowment of the contingent claim (and the agent has initial wealth $x$). We suppose that such solutions exist. Then
\[ \begin{aligned} V\left(x - kE[\zeta_T^{0,*}C_T],\, k\right) &= \inf_{\mu}\inf_{\zeta_T}\left\{\mu\left(x - kE[\zeta_T^{0,*}C_T]\right) + \mu k E[\zeta_T C_T] + E[\tilde{U}(\mu\zeta_T)]\right\} \\ &\le \mu^{0,*}x + E[\tilde{U}(\mu^{0,*}\zeta_T^{0,*})] = V(x, 0) = V(x - p(k), k). \end{aligned} \]
It follows that $p(k) \le kE[\zeta_T^{0,*}C_T]$, and we have a simple upper bound on the utility indifference bid price of a claim in terms of an expectation related to the solution of a Merton problem. We return to this idea later in Section 2.7.3.
Now consider the right-hand side of (2.20). In the case of deterministic interest rates and exponential utility, the minimization over $\zeta_T$, or equivalently $Z_T$, reduces to
\[ \inf_{Z_T}\left\{E[Z_T \ln Z_T] + \gamma k E[Z_T C_T]\right\}. \]
Furthermore, given the simple dependence of exponential utility on initial wealth, it is possible (see Frittelli [97], and also Rouge and El Karoui [237], Delbaen et al. [81], and Becherer [16]) to deduce an expression for the form of the utility-indifference price,
\[ p(k) = \frac{e^{-rT}}{\gamma}\left[\inf_{Z_T}\left\{E[Z_T \ln Z_T] + k\gamma E[Z_T C_T]\right\} - \inf_{Z_T}\left\{E[Z_T \ln Z_T]\right\}\right]. \tag{2.21} \]
Note that the second minimization in (2.21) involves finding the minimal entropy measure.
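To make the entropy minimization in (2.21) concrete, the sketch below treats a deliberately simplified setting: a finite state space with no traded risky asset, so that every density $Z$ with $Z \ge 0$ and $E[Z] = 1$ is admissible. This simplification, the three-state example, and the function names are assumptions for illustration, not part of the text. In that setting the infimum of $E[Z \ln Z] + aE[ZC]$ is attained at $Z^* \propto e^{-aC}$ and equals $-\ln E[e^{-aC}]$, which the code checks against randomly drawn densities.

```python
import math
import random


def objective(z, probs, claim, a):
    """E[Z ln Z] + a E[Z C] on a finite state space, with 0 ln 0 taken as 0."""
    entropy = sum(p * zi * math.log(zi) for p, zi in zip(probs, z) if zi > 0.0)
    linear = a * sum(p * zi * c for p, zi, c in zip(probs, z, claim))
    return entropy + linear


def gibbs_density(probs, claim, a):
    """Candidate minimizer Z* proportional to exp(-a C), scaled so that E[Z*] = 1.

    Returns the density and the closed-form minimum -ln E[exp(-a C)].
    """
    norm = sum(p * math.exp(-a * c) for p, c in zip(probs, claim))
    return [math.exp(-a * c) / norm for c in claim], -math.log(norm)


probs = [0.2, 0.5, 0.3]      # hypothetical physical probabilities
claim = [0.0, 1.0, 4.0]      # hypothetical payoff C_T in each state
a = 0.7                      # plays the role of k * gamma

z_star, closed_form = gibbs_density(probs, claim, a)
at_optimum = objective(z_star, probs, claim, a)

# Compare with randomly generated admissible densities (E[Z] = 1, Z >= 0).
rng = random.Random(1)
others = []
for _ in range(1000):
    w = [rng.random() for _ in probs]
    total = sum(p * wi for p, wi in zip(probs, w))
    z = [wi / total for wi in w]
    others.append(objective(z, probs, claim, a))

print(at_optimum, closed_form, min(others))
```

Under the same simplifying assumption the second infimum in (2.21) is zero (attained at $Z \equiv 1$), so the price reduces to $-(e^{-rT}/\gamma)\ln E[e^{-k\gamma C_T}]$.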
2.5.6 Solving the Merton Problem via Duality
The goal in this section is to solve the optimization problem for a class of problems of the forms given in Examples 2.5.1 and 2.5.2 under the assumption of a utility function of HARA type. As we observed in the previous section, the key is to find a solution to $X_T^* = I(\mu^*\zeta_T^*)$. Suppose that $U$ satisfies a power law, $U(x) = x^{1-R}/(1-R)$. Then $I(y) = y^{-1/R}$ is again of power form. Using the substitution $\pi_t = \theta_t P_t/X_t$, the gains from trade from a self-financing strategy can be written
\[ \frac{dX_t}{X_t} = \pi_t\left(\frac{dP_t}{P_t} - r_t\,dt\right) + r_t\,dt, \tag{2.22} \]
where $\pi_t$ is the proportion of wealth invested in the risky traded asset. This has solution
\[ X_T = x\exp\left\{\int_0^T \pi_t\sigma_t(dB_t + \lambda_t\,dt) + \int_0^T\left(r_t - \frac12\pi_t^2\sigma_t^2\right)dt\right\}. \]
Hence, if $X_T^* = I(\mu^*\zeta_T^*)$, where $\pi^*$ denotes the optimal strategy and $\chi^*$ the market price of risk for the Brownian motion $B^\perp$, then we must have
\[ \begin{aligned} \ln x &+ \int_0^T \pi_t^*\sigma_t(dB_t + \lambda_t\,dt) + \int_0^T\left(r_t - \frac12(\pi_t^*)^2\sigma_t^2\right)dt \\ &= -\frac1R\ln\mu^* + \frac1R\int_0^T r_t\,dt + \frac1R\int_0^T\lambda_t\,dB_t + \frac{1}{2R}\int_0^T\lambda_t^2\,dt + \frac1R\int_0^T\chi_t^*\,dB_t^\perp + \frac{1}{2R}\int_0^T(\chi_t^*)^2\,dt. \end{aligned} \]
After some algebra and the substitution $\phi_t^* = \lambda_t - R\pi_t^*\sigma_t$, this equation can be reduced to
\[ \frac12\left(1 - \frac1R\right)\int_0^T\lambda_t^2\,dt - (1-R)\int_0^T r_t\,dt = c + M_T + \frac{1}{2R}[M]_T + M_T^\perp + \frac12[M^\perp]_T, \tag{2.23} \]
where $c$ is a constant depending on $\mu^*$, $x$ and $R$,
\[ M_t = \int_0^t\phi_u^*\left(dB_u + \lambda_u\left(1 - \frac1R\right)du\right), \qquad M_t^\perp = \int_0^t\chi_u^*\,dB_u^\perp, \]
and where $[\cdot]$ denotes quadratic variation. If interest rates are deterministic, then this simplifies to a representation
\[ \frac12\left(1 - \frac1R\right)\int_0^T\lambda_t^2\,dt = \alpha + M_T + \frac{1}{2R}[M]_T + M_T^\perp + \frac12[M^\perp]_T, \tag{2.24} \]
where $\alpha$ is the constant $c + (1-R)\int_0^T r_t\,dt$. Note that in the case $R = 1$ (logarithmic utility), there is a trivial solution for which both sides of (2.23) are zero. In this case we can find a solution to $X_T = I(\mu\zeta_T)$ for which the state-price density is the minimal state-price density in the sense of Föllmer and Schweizer [91]. Note also that (2.23) is an identification of random variables and not processes, and is a representation of an $\mathcal{F}_T$-measurable random variable in terms of a pair of Brownian martingales and their quadratic variations.
Let us now consider the case of exponential utility, under the assumption of constant interest rates. Formally, taking $R = \infty$ in Equation (2.24) gives
\[ \frac12\int_0^T\lambda_t^2\,dt = \alpha + \int_0^T\phi_t^*(dB_t + \lambda_t\,dt) + \int_0^T\chi_u^*\,dB_u^\perp + \frac12\int_0^T(\chi_u^*)^2\,du. \tag{2.25} \]
This equation can be derived directly from the relationship $X_T = I(\mu\zeta_T)$, using (2.13) rather than (2.22), but we want to give a direct argument which shows that (2.25) leads to the solution of the Merton problem. For some state-price density $\zeta_T^\chi = e^{-rT}Z_T^\chi$ we have
\[ E[\tilde{U}(\mu\zeta_T^\chi)] = E\left[\frac{\mu\zeta_T^\chi}{\gamma}\left(\ln\zeta_T^\chi - (1 - \ln\mu)\right)\right] = \frac{\mu e^{-rT}}{\gamma}\left(E^{Q^\chi}[\ln Z_T^\chi] - (1 - \ln\mu + rT)\right). \]
Here $Q^\chi$ is the measure given by $(dQ^\chi/dP)|_{\mathcal{F}_T} = Z_T^\chi$. Further,
\[ \begin{aligned} \ln Z_T^\chi &= -\int_0^T\lambda_t\,dB_t^{Q^\chi} + \frac12\int_0^T\lambda_t^2\,dt - \int_0^T\chi_t\,dB_t^{\perp,Q^\chi} + \frac12\int_0^T\chi_t^2\,dt \\ &= \alpha + \int_0^T(\phi_t^* - \lambda_t)\,dB_t^{Q^\chi} + \int_0^T(\chi_t^* - \chi_t)\,dB_t^{\perp,Q^\chi} + \frac12\int_0^T(\chi_t^* - \chi_t)^2\,dt, \end{aligned} \tag{2.26} \]
where $B^{Q^\chi}$ and $B^{\perp,Q^\chi}$ are $Q^\chi$-Brownian motions. Taking expectations under $Q^\chi$, and assuming that the local martingales in (2.26) are true martingales, we have
\[ E^{Q^\chi}[\ln Z_T^\chi] = \alpha + \frac12 E^{Q^\chi}\left[\int_0^T(\chi_t^* - \chi_t)^2\,dt\right], \tag{2.27} \]
which is minimized by the choice $\chi_t = \chi_t^*$. Hence, subject to certain technical conditions, if we can solve (2.25) then we have found the measure that minimizes $E[\tilde{U}(\mu\zeta_T^\chi)]$, and, moreover,
\[ \inf_{\zeta_T}\left\{\mu x + E[\tilde{U}(\mu\zeta_T)]\right\} = \mu x + \frac{\mu e^{-rT}}{\gamma}\left(\alpha - (1 - \ln\mu + rT)\right). \]
A further minimization over $\mu$ gives
\[ V(x, 0) = \inf_{\mu}\inf_{\zeta_T}\left\{\mu x + E[\tilde{U}(\mu\zeta_T)]\right\} = -\frac1\gamma\exp\left(-\gamma x e^{rT} - \alpha\right). \tag{2.28} \]
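The final minimization over $\mu$ can be checked numerically. The sketch below minimizes $\mu x + (\mu e^{-rT}/\gamma)(\alpha - (1 - \ln\mu + rT))$ with a simple golden-section search and compares the result with the closed form (2.28). The parameter values are illustrative assumptions (with $\alpha = \lambda^2 T/2$, as in the constant-parameter case), and the helper names are hypothetical.

```python
import math


def dual_objective(mu, x, alpha, gamma, r, T):
    """mu*x + (mu e^{-rT}/gamma) * (alpha - (1 - ln mu + rT))."""
    return mu * x + mu * math.exp(-r * T) / gamma * (alpha - (1.0 - math.log(mu) + r * T))


def value_function(x, alpha, gamma, r, T):
    """Closed form (2.28): -(1/gamma) exp(-gamma x e^{rT} - alpha)."""
    return -math.exp(-gamma * x * math.exp(r * T) - alpha) / gamma


def minimize_golden(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d      # minimizer lies in [lo, d]
        else:
            lo = c      # minimizer lies in [c, hi]
    return 0.5 * (lo + hi)


# Illustrative parameters (assumptions, not from the text).
x, gamma, r, T, lam = 1.0, 2.0, 0.03, 1.0, 0.4
alpha = 0.5 * lam ** 2 * T

mu_hat = minimize_golden(lambda m: dual_objective(m, x, alpha, gamma, r, T), 1e-9, 5.0)
# First-order condition: ln(mu*) = rT - alpha - gamma * x * e^{rT}.
mu_star = math.exp(r * T - alpha - gamma * x * math.exp(r * T))

print(mu_hat, mu_star)
print(dual_objective(mu_hat, x, alpha, gamma, r, T), value_function(x, alpha, gamma, r, T))
```

The objective is convex in $\mu$ (it is linear plus a multiple of $\mu\ln\mu$), which is why a unimodal search is adequate here.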
Hence, finding the solution to the dual Merton problem under exponential utility reduces to solving (2.25). This direct argument can be adapted to cover power utilities. In this case we find
\[ V(x, 0) = \inf_{\mu}\inf_{\zeta_T}\left\{\mu x + E[\tilde{U}(\mu\zeta_T)]\right\} = \frac{x^{1-R}}{1-R}\,e^{(1-R)rT - \alpha}. \tag{2.29} \]
2.5.7 Explicit Solutions of the Merton Problem via Duality under Constant Interest Rates
It remains to solve (2.24) or (2.25). Consider first the nontraded assets model in which $r$ and $\lambda$ are constants. Then (2.24) has a trivial solution
\[ \alpha = \frac12\left(1 - \frac1R\right)\lambda^2 T. \]
(The solution for exponential utility follows on taking $R = \infty$.) If we substitute this value into (2.28) and (2.29), then we recover the value functions (2.14) and (2.15) for $t = 0$.
Now consider the stochastic volatility model, under an assumption of a constant correlation between the Brownian motions $B$ and $W$. Suppose that $Y_t$ is an autonomous diffusion and that the Sharpe ratio is a function of this process: $\lambda_t = \lambda(Y_t)$. Then with the reparameterization $\phi_t^* = \rho\psi_t$, $\chi_t^* = \rho^\perp\psi_t$ and $\Theta = (\rho^\perp)^2 + (\rho^2/R) = 1 + \rho^2(R^{-1} - 1)$, (2.24) becomes
\[ \frac12\left(1 - \frac1R\right)\int_0^T\lambda(Y_t)^2\,dt = \alpha + \int_0^T\psi_t\,dW_t + \left(1 - \frac1R\right)\int_0^T\rho\lambda(Y_t)\psi_t\,dt + \frac{\Theta}{2}\int_0^T\psi_t^2\,dt. \tag{2.30} \]
Let $\hat{P}$ be the measure under which $\hat{W}$ given by $d\hat{W}_t = dW_t + (1 - R^{-1})\rho\lambda(Y_t)\,dt$ is a Brownian motion. Then (see Kobylanski [159] or Hobson [129]), multiplying by $-\Theta$, exponentiating, and taking expectations under $\hat{P}$, we have
\[ \alpha = -\frac{1}{\Theta}\ln E^{\hat{P}}\left[\exp\left(-\frac{\Theta}{2}\left(1 - \frac1R\right)\int_0^T\lambda(Y_t)^2\,dt\right)\right]. \tag{2.31} \]
Thus we have found a solution to the Merton problem. Provided that the expectation in (2.31) can be calculated, it is possible to represent the solution to the Merton problem in a simple form. Diffusions for which (2.31) can be solved include the case where $Y_t$ is a Bessel process and $\lambda(Y_t)$ is affine; see Grasselli and Hurd [111]. In the above analysis we concentrated on finding the value of $\alpha$ that is required to solve for the value function. The full solution of (2.30) includes an expression for $\psi_t$, which in turn gives expressions for $\phi_t^*$ and hence the optimal trading strategy. Given $\psi_t$ it is also possible to calculate the market price of risk $\chi^*$. In the stochastic volatility case it turns out that the market price of risk is a time-inhomogeneous function of $Y_t$; see Hobson [129].
2.5.8 Solving the Dual Problem with the Claim
Now we return to the problem with the claim. In order to characterize the solution we need to solve $X_T^* + kC_T = I(\mu^*\zeta_T^*)$. Suppose interest rates are deterministic and the utility function is exponential. Suppose we have a solution to
\[ \frac12\int_0^T\lambda_t^2\,dt + \gamma kC_T = \alpha + \int_0^T\phi_t^*(dB_t + \lambda_t\,dt) + \int_0^T\chi_u^*\,dB_u^\perp + \frac12\int_0^T(\chi_u^*)^2\,du. \tag{2.32} \]
Then, by a direct repeat of the argument leading to (2.27), we have
\[ E^{Q^\chi}[\gamma kC_T] + E^{Q^\chi}[\ln Z_T^\chi] = \alpha + \frac12 E^{Q^\chi}\left[\int_0^T(\chi_t^* - \chi_t)^2\,dt\right], \tag{2.33} \]
which is minimized by the choice $\chi_t = \chi_t^*$, and then
\[ \mu x + \mu kE[\zeta_T^\chi C_T] + E[\tilde{U}(\mu\zeta_T^\chi)] \ge \mu x + \frac{\mu e^{-rT}}{\gamma}\left(\alpha - (1 - \ln\mu + rT)\right), \]
with equality for $\chi = \chi^*$. It follows that
\[ V(x, k) = -\frac1\gamma\exp\left(-\gamma x e^{rT} - \alpha\right). \tag{2.34} \]
It remains to solve (2.32). Consider the nontraded assets example with constant parameters. There are two cases which can be solved.
Case 1. Suppose that the claim $C_T$ can be replicated for an initial price $p^{BS}$, so that
\[ C_T = p^{BS}e^{rT} + \int_0^T\theta_t(dP_t/P_t - r\,dt) = p^{BS}e^{rT} + \int_0^T\theta_t\sigma(dB_t + \lambda\,dt). \]
Then there is a trivial solution of (2.32) for which $\chi^* \equiv 0$, $\phi_t^* = \sigma k\gamma\theta_t$ and $\alpha = \gamma ke^{rT}p^{BS} + \lambda^2T/2$. Note that $p^{BS} = e^{-rT}E^{Q}[C_T]$, where $Q$ is any martingale measure.
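As a short consistency check (a worked step that is not in the original text), one can substitute the value of $\alpha$ from Case 1 into (2.34) and apply the definition $V(x - p(k), k) = V(x, 0)$, using $\alpha = \lambda^2T/2$ when $k = 0$. The replicable claim is then priced at $k$ times its replication cost, as one would expect:

```latex
\begin{aligned}
V(x - p(k), k) &= -\frac{1}{\gamma}\exp\!\left(-\gamma\,(x - p(k))\,e^{rT} - \gamma k e^{rT} p^{BS} - \tfrac12\lambda^2 T\right), \\
V(x, 0) &= -\frac{1}{\gamma}\exp\!\left(-\gamma x e^{rT} - \tfrac12\lambda^2 T\right), \\
V(x - p(k), k) = V(x, 0)
 &\iff \gamma\,p(k)\,e^{rT} - \gamma k e^{rT} p^{BS} = 0
 \iff p(k) = k\,p^{BS}.
\end{aligned}
```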
Case 2. Suppose that $C_T = C(Y_T)$. We look for a solution under $Q^0$, the martingale measure under which $B^0$, given by $B_t^0 = B_t + \lambda t$, and $B^\perp$ are Brownian motions. Set $\beta = \alpha - \lambda^2T/2$ and $W_t^0 = \rho B_t^0 + \rho^\perp B_t^\perp$. Then (2.32) becomes
\[ \gamma kC(Y_T) = \beta + \int_0^T\eta_t\,dB_t^0 + \int_0^T\chi_u\,dB_u^\perp + \frac12\int_0^T\chi_u^2\,du. \tag{2.35} \]
We look for a solution of the form $\eta_t = \rho\psi_t$, $\chi_t = \rho^\perp\psi_t$, whence
\[ \gamma kC(Y_T) = \beta + \int_0^T\psi_t\,dW_t^0 + \frac{(1-\rho^2)}{2}\int_0^T\psi_u^2\,du. \]
Using the same transformations as for the Merton problem, we deduce
\[ \beta = -\frac{1}{(1-\rho^2)}\ln E^{Q^0}\left[e^{-(1-\rho^2)\gamma kC_T}\right], \]
from which it is possible to deduce an expression for $\alpha$ and hence $V(x, k)$.
It is possible to combine these two cases to solve the utility indifference pricing problem for a payoff that is a linear combination of a claim on $P_T$ and a claim on $Y_T$. Suppose $C_T = C^1(P_T) + C^2(Y_T)$. Then
\[ V(x, k) = -\frac1\gamma\,e^{-\gamma xe^{rT} - \gamma kE^{Q^0}[C^1(P_T)] - \lambda^2T/2}\left(E^{Q^0}\left[e^{-(1-\rho^2)\gamma kC^2(Y_T)}\right]\right)^{1/(1-\rho^2)}. \]
Finally we use the definition $V(x - p(k), k) = V(x, 0)$ to obtain the following extension of the utility indifference price given in (2.16):
\[ p(k) = ke^{-rT}E^{Q^0}[C^1(P_T)] - \frac{e^{-rT}}{\gamma(1-\rho^2)}\ln E^{Q^0}\left[e^{-k\gamma(1-\rho^2)C^2(Y_T)}\right]. \tag{2.36} \]
Here we have written the first expectation as being under the minimal martingale measure, but of course this component of the price is identical under all choices of martingale measure.
It is an open problem to price a general claim of the form $C_T = C(P_T, Y_T)$, or even to extend the study of claims on $Y_T$ alone to the case of nonconstant correlation. The fundamental problem is to solve (2.32) or one of its equivalent variations. Mania et al. [176] translate (2.32) into a backward SDE, and Tehranchi [258] has an interesting approach via the Hölder inequality. It is possible to develop some intuition for the form of the general solution by considering the discrete time case (see Smith and Nau [254] and Musiela and Zariphopoulou [199]), but it is not clear whether this can lead to a closed form expression for the price such as in (2.36).
Thus, even in the nontraded assets model with exponential utility, it is difficult to obtain an explicit expression for the utility-indifference price of a contingent claim. In other contexts the problem is even more difficult, although the general dual expression (2.21) remains valid. In the stochastic volatility context Sircar and Zariphopoulou [252] give a general characterization of the price of an option on $P_T$ and some expansions.
In the same class of models, but in the slightly artificial case of an option on the volatility process, Tehranchi [258] and Grasselli and Hurd [111] show how to calculate an exact option price.
2.6 APPLICATIONS, EXTENSIONS, AND A LITERATURE REVIEW
Utility-indifference pricing can be a useful concept of value in any incomplete situation. It has already been developed in the context of transactions costs (see below) and for nontraded assets. Møller [189] uses utility indifference in an insurance context. A number of applications are treated later in this book. In particular, Chapter 6 deals with indifference pricing of defaultable claims, which is a very new application of these techniques. Chapters 4 and 7 treat weather and energy applications of indifference pricing; see also Davis [62], who applied the concept of marginal price to weather. We discuss some of these areas in this section and give references where further details can be found.
2.6.1 Transactions Costs
Option pricing with transactions costs represents the first area in which utility indifference pricing was used. Hodges and Neuberger [131] consider an investor endowed with stock, bond, and an option and derive the investor's optimal investment in stock and bond such that his expected utility is maximized in the presence of transactions costs. Under exponential utility, they solve numerically for the optimal hedge in a binomial setting. The idea was also used in papers of Davis and Norman [63], Davis et al. [64], Davis and Zariphopoulou [65], and Constantinides and Zariphopoulou [51], again in the context of transactions costs. The latter derive a closed-form upper bound to the utility indifference price of a call option. Monoyios [190], [191] considers marginal pricing of options under transactions costs and computes the option prices numerically.
2.6.2 Portfolio Constraints
Munk [192] uses utility indifference pricing in the context of portfolio constraints. His constraints are on the total portfolio amounts, that is, the cash amount in risky stock and the riskless bond.
The investor invests in stock and bond, and consumes at rate $c$ in order to maximize the expected remaining lifetime utility from consumption
\[ \sup E\left[\int_t^\infty e^{-\beta(s-t)}U(c_s)\,ds\right], \]
where utility is power. The investor can also buy or sell units of a European-style claim. In the special case where the constraint is a nonnegative wealth constraint, Merton [186] solves the no-option optimal control problem. The option problem is solved numerically by Munk. A second case where the investor faces borrowing constraints is studied numerically. It is also possible to recast the nontraded assets model as a portfolio constraint problem, where the constraint is that the investor cannot trade on a particular asset. There are a number of papers written in this vein. First, Kahl et al. [148] consider a manager with nontraded stock who can invest in a correlated market asset and bank account, and consume. The manager maximizes expected power utility from
consumption and terminal wealth. Kahl et al. [148] use utility indifference to value the restricted stock to the manager and solve via numerical methods. Second, the paper of Teplá [259] considers an investor with exponential utility who holds some quantity of a nontraded asset in her portfolio. The nontraded payoff is assumed normally distributed, while the traded assets are lognormal. Finally, Detemple and Sundaresan [72] treat short sales constraints for American claims.
2.6.3 Unhedgeable Income Streams
Rather than valuing options or nontraded stock, utility indifference can also be used to value streams of income or payments over time where these streams cannot be hedged. In particular, there is a large finance literature on the effect of labor income on asset allocation decisions; see the text of Campbell and Viceira [40]. Labor income is nontraded since individuals cannot trade claims to their future wages. Henderson [122] studies the effect of stochastic income on the optimal investment decision of an investor who has exponential utility. The income state variable is correlated with a risky traded asset and the investor maximizes expected utility of terminal wealth, where wealth is generated by trading in the market and by receiving the stochastic income. Explicit solutions are found and the effects of income on the optimal portfolio are studied. Munk [193] is a recent paper in a long line looking at an optimal investment and consumption problem with stochastic income and power utility over an infinite horizon. He carries out numerical computations to obtain the value of the nontraded income and optimal portfolio.
2.6.4 Indifference Pricing for American Options
Davis and Zariphopoulou [65] introduce the notion of American-style utility indifference prices in the context of transactions costs.
Treating the problem of options on nontraded assets, Oberman and Zariphopoulou [205] examine the resulting American utility indifference prices with exponential utility. In the nontraded asset framework discussed in Section 2.5.4, the value function is shown to satisfy a combination of the classical HJB equation and an obstacle problem because discretionary stopping is allowed. The buyer's indifference price solves a quasi-linear variational inequality, and in the case where the stock dynamics are lognormal (and the option is only on the nontraded asset), the price can be written as the solution to an optimal stopping problem. The optimal stopping problem involves a nonlinear criterion which is the same as that appearing in the European case earlier. Assuming interest rates are zero, the buyer's price for the American option with payoff $C(Y_\tau)$ can be expressed as
\[ p^{Am}(k) = \sup_{\tau}\left\{-\frac{1}{\gamma(1-\rho^2)}\ln E^{Q^0}\left[e^{-\gamma(1-\rho^2)kC(Y_\tau)}\right]\right\}. \]
3.1.4 Risk Measures and Hedging
In this subsection, we come back to the possible interpretation of the risk measure $\rho(X)$ in terms of capital requirement. This leads also to a natural relationship
between risk measure and hedging. We then extend it to a wider perspective of superhedging.
Risk Measure and Capital Requirement
Looking back at Equation (3.2), the risk measure $\rho(X)$ gives an assessment of the minimal capital that must be added to the position to make it acceptable, in the sense that the new position ($X$ plus the added capital) no longer carries positive risk. More formally, it is natural to introduce the acceptance set $\mathcal{A}_\rho$ related to $\rho$, defined as the set of all acceptable positions in the sense that they do not require any additional capital:
\[ \mathcal{A}_\rho = \{\Psi \in \mathcal{X},\ \rho(\Psi) \le 0\}. \tag{3.9} \]
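Given that the epigraph of the convex risk measure $\rho$ is $\mathrm{epi}(\rho) = \{(\Psi, m) \in \mathcal{X}\times\mathbb{R} \mid \rho(\Psi) \le m\} = \{(\Psi, m) \in \mathcal{X}\times\mathbb{R} \mid \rho(\Psi + m) \le 0\}$, the characterization of $\rho$ in terms of $\mathcal{A}_\rho$ is easily obtained:
\[ \rho(X) = \inf\{m \in \mathbb{R};\ m + X \in \mathcal{A}_\rho\}. \]
This last formulation makes very clear the link between risk measure and capital requirement. From the definition of both the convex risk measure $\rho$ and the acceptance set $\mathcal{A}_\rho$, and the dual representation of the risk measure $\rho$, it is possible to obtain another characterization of the associated penalty function $\alpha$ as
\[ \alpha(Q) = \sup_{\Psi\in\mathcal{A}_\rho}E_Q[-\Psi] \ \text{ if } Q \in \mathcal{M}_{1,f}, \qquad \alpha(Q) = +\infty \ \text{ if not.} \tag{3.10} \]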
$\alpha(Q)$ is the support function of $-\mathcal{A}_\rho$, denoted by $\Sigma^{\mathcal{A}_\rho}(Q)$. When $\mathcal{A}_\rho$ is a cone, i.e., $\rho$ is a coherent (positive homogeneous) risk measure, then $\alpha(Q)$ only takes the values $0$ and $+\infty$. By definition, the set $\mathcal{A}_\rho$ is "too large" in the following sense: even if we can write $m + X \in \mathcal{A}_\rho$ as $m + X = \xi \in \mathcal{A}_\rho$, we cannot have an explicit formulation for $\xi$ and in particular cannot compare $m + X$ with $0$. Therefore, it seems natural to consider a (convex) class of variables $\mathcal{H}$ such that $m + X \ge H \in \mathcal{H}$. $\mathcal{H}$ appears as a natural (convex) set from which a risk measure can be generated.
Risk Measures Generated by a Convex Set
Risk Measures Generated by a Convex Set in $\mathcal{X}$
In this section, we study the generation of a convex risk measure from a general convex set.
Definition 3.4 Given a nonempty convex subset $\mathcal{H}$ of $\mathcal{X}$ such that $\inf\{m \in \mathbb{R} \mid \exists\xi\in\mathcal{H},\ m \ge \xi\} > -\infty$, the functional $\nu^{\mathcal{H}}$ on $\mathcal{X}$,
\[ \nu^{\mathcal{H}}(\Psi) = \inf\{m \in \mathbb{R};\ \exists\xi\in\mathcal{H},\ m + \Psi \ge \xi\}, \tag{3.11} \]
is a convex risk measure. Its minimal penalty function $\alpha^{\mathcal{H}}$ is given by
\[ \alpha^{\mathcal{H}}(Q) = \sup_{H\in\mathcal{H}}E_Q[-H]. \]
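The main properties of this risk measure are listed or proved below:
1. The acceptance set of $\nu^{\mathcal{H}}$ contains the convex subsets $\mathcal{H}$ and $\mathcal{A}_{\mathcal{H}} = \{\Psi\in\mathcal{X},\ \exists\xi\in\mathcal{H},\ \Psi \ge \xi\}$. Moreover, $\mathcal{A}_{\nu^{\mathcal{H}}} = \mathcal{A}_{\mathcal{H}}$ if the last subset is closed in the following sense: for $\xi\in\mathcal{A}_{\mathcal{H}}$ and $\Psi\in\mathcal{X}$, the set $\{\lambda\in[0,1] \mid \lambda\xi + (1-\lambda)\Psi \in \mathcal{A}_{\mathcal{H}}\}$ is closed in $[0,1]$ (see Proposition 4.6 in [90]).
2. The penalty function $\alpha^{\mathcal{H}}$ associated with $\nu^{\mathcal{H}}$ is the support function of $-\mathcal{A}_{\nu^{\mathcal{H}}}$, defined by $\alpha^{\mathcal{H}}(Q) = \Sigma^{\mathcal{A}_{\nu^{\mathcal{H}}}}(Q) = \sup_{X\in\mathcal{A}_{\nu^{\mathcal{H}}}}E_Q[-X]$. Let us show that $\alpha^{\mathcal{H}}$ is nothing else but $\Sigma^{\mathcal{H}}$. For any $X\in\mathcal{A}_{\nu^{\mathcal{H}}}$ and any $\varepsilon > 0$ there exists $\xi\in\mathcal{H}$ such that $-X \le -\xi + \varepsilon$. Taking the "expectation" with respect to the additive measure $Q\in\mathcal{M}_{1,f}$, it follows that $E_Q[-X] \le E_Q[-\xi] + \varepsilon \le \Sigma^{\mathcal{H}}(Q) + \varepsilon$, where $\Sigma^{\mathcal{H}}(Q) = \sup_{H\in\mathcal{H}}E_Q[-H]$. Taking the supremum with respect to $X\in\mathcal{A}_{\nu^{\mathcal{H}}}$ on the left-hand side, we deduce that $\Sigma^{\mathcal{A}_{\nu^{\mathcal{H}}}} \le \Sigma^{\mathcal{H}}$; the desired result follows from the observation that $\mathcal{H}$ is included in $\mathcal{A}_{\nu^{\mathcal{H}}}$.
3. When $\mathcal{H}$ is a cone, the corresponding risk measure is coherent (homogeneous). The penalty function $\alpha^{\mathcal{H}}$ is the indicator function (in the sense of convex analysis) of the orthogonal cone $\mathcal{M}_{\mathcal{H}}$: $l^{\mathcal{M}_{\mathcal{H}}}(Q) = 0$ if $Q\in\mathcal{M}_{\mathcal{H}}$, $+\infty$ otherwise, where $\mathcal{M}_{\mathcal{H}} = \{Q\in\mathcal{M}_{1,f};\ \forall\xi\in\mathcal{H},\ E_Q[-\xi] \le 0\}$. The dual formulation of $\nu^{\mathcal{H}}$ is simply given for $\Psi\in\mathcal{X}$ by
\[ \nu^{\mathcal{H}}(\Psi) = \sup_{Q\in\mathcal{M}_{\mathcal{H}}}E_Q[-\Psi]. \]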
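It is natural to associate the convex indicator $l^{\mathcal{H}}$ on $\mathcal{X}$ with the set $\mathcal{H}$: $l^{\mathcal{H}}(X) = 0$ if $X\in\mathcal{H}$, $+\infty$ otherwise. This convex functional is not translation invariant, and therefore it is not a convex risk measure. Nevertheless, $l^{\mathcal{H}}$ and $\nu^{\mathcal{H}}$ are closely related as follows.
Corollary 3.5 Let $l^{\mathcal{H}}$ be the convex indicator on $\mathcal{X}$ of the convex set $\mathcal{H}$. The risk measure $\nu^{\mathcal{H}}$, defined in Equation (3.11), is the largest convex risk measure dominated by $l^{\mathcal{H}}$ and it can be expressed as
\[ \nu^{\mathcal{H}}(\Psi) = \inf_{\xi\in\mathcal{X}}\{\rho_{worst}(\Psi - \xi) + l^{\mathcal{H}}(\xi)\}, \]
where $\rho_{worst}(\Psi) = \sup_{\omega\in\Omega}\{-\Psi(\omega)\}$ is the worst-case risk measure.
Proof. Let $L = \{m\in\mathbb{R},\ \exists\xi\in\mathcal{H},\ m \ge \xi\}$. This set is a half-line with lower bound $\inf_{\xi\in\mathcal{H}}\sup_\omega\xi(\omega)$. Moreover, for any $m_0\notin L$, $m_0 \le \inf_{\xi\in\mathcal{H}}\sup_\omega\xi(\omega)$. Therefore,
\[ \nu^{\mathcal{H}}(0) = \inf_{\xi\in\mathcal{H}}\sup_\omega\xi(\omega) = \inf_{\xi\in\mathcal{H}}\rho_{worst}(-\xi). \]
The same arguments hold for $\nu^{\mathcal{H}}(\Psi)$. $\Box$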
Therefore, $\nu^{\mathcal{H}}$ may be interpreted as the worst-case risk measure $\rho_{worst}$ reduced by the use of (hedging) variables in $\mathcal{H}$. This point of view will be generalized in Corollary 3.9 in terms of the inf-convolution $\nu^{\mathcal{H}} = \rho_{worst}\,\square\,l^{\mathcal{H}}$.
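The interpretation of $\nu^{\mathcal{H}}$ as a worst-case risk measure reduced by hedging can be illustrated on a finite state space. In the sketch below, positions are lists of payoffs, one per state, and the hedging set is taken to be the segment $\mathcal{H} = \{t\,h : 0 \le t \le 1\}$ for a single variable $h$. The state space, the choice of $\mathcal{H}$, the grid search, and the function names are all assumptions made for this illustration.

```python
def rho_worst(psi):
    """Worst-case risk measure: the supremum over states of -psi(omega)."""
    return max(-v for v in psi)


def nu_segment(psi, h, steps=1000):
    """nu^H(psi) for the hypothetical set H = {t*h : 0 <= t <= 1}.

    For each xi = t*h on a grid, the smallest m with m + psi >= xi in every
    state is max(xi - psi); the risk measure is the infimum over the grid.
    """
    best = float("inf")
    for i in range(steps + 1):
        t = i / steps
        m = max(t * hv - pv for hv, pv in zip(h, psi))
        best = min(best, m)
    return best


psi = [-1.0, 2.0, 0.0]   # position: payoff in each of three states
h = [-1.0, 1.0, 0.0]     # generator of the hedging set

unhedged = rho_worst(psi)
hedged = nu_segment(psi, h)
print(unhedged, hedged)

# Adding cash c to the position lowers the required capital by c.
shifted = nu_segment([v + 0.5 for v in psi], h)
print(shifted, hedged - 0.5)
```

Since $0 \in \mathcal{H}$ in this example, the hedged value can never exceed the worst-case value, which is what the first printed line shows.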
Risk Measures Generated by a Convex Set in $L^\infty(P)$
Assume now $\mathcal{H}$ to be a convex subset of $L^\infty(P)$. The functional $\nu^{\mathcal{H}}$ on $L^\infty(P)$ is still defined by the same formula (3.11), in which the inequality has to be understood in $L^\infty(P)$, i.e., $P$-a.s., with a penalty function only defined on $\mathcal{M}_{1,ac}(P)$ and given by $\alpha^{\mathcal{H}}(Q) = \sup_{H\in\mathcal{H}}E_Q[-H]$. The problem is then to give condition(s) on the set $\mathcal{H}$ to ensure that the dual representation holds on $\mathcal{M}_{1,ac}(P)$ and not only on $\mathcal{M}_{1,f}(P)$. By Theorem 3.3, this problem is equivalent to the continuity from above of the risk measure $\nu^{\mathcal{H}}$, or equivalently to the weak*-closedness of its acceptance set $\mathcal{A}_{\mathcal{H}}$. Properties of this kind are difficult to check, and in the following we will simply give some examples where this property holds.
3.1.5 Static Hedging and Calibration
In this subsection, we consider some examples motivated by financial risk hedging problems.
Hedging with a Family of Cash Flows
We start with a very simple model where it is only possible to hedge statically over a given period using a finite family of bounded cash flows $\{C_1, C_2, \ldots, C_d\}$, the (forward) prices of which are known at time 0 and denoted by $\{\pi_1, \pi_2, \ldots, \pi_d\}$. All cash flows are assumed to be nonnegative and nonredundant. Constants may be included and then considered as assets. We assume that the different prices are coherent in the sense that
\[ \exists Q_0 \sim P \ \text{ s.t. } \ \forall i,\ E_{Q_0}[C_i] = \pi_i. \]
Such an assumption implies in particular that any inequality on the cash flows is preserved on the prices. The quantities of interest are often the gain values of the basic strategies, $G_i = C_i - \pi_i$. We can naturally introduce the nonempty set $\mathcal{Q}_e$ of equivalent "martingale measures" as
\[ \mathcal{Q}_e = \{Q \mid Q \sim P \ \text{ s.t. } \ \forall i,\ E_Q[G_i] = 0\}. \]
The different instruments we consider are very liquid; by selling or buying some quantities $\theta_i$ of such instruments, we define the family of gains associated with trading strategies $\theta$:
\[ \Gamma = \left\{G(\theta) = \sum_{i=1}^d\theta_iG_i,\ \theta\in\mathbb{R}^d, \ \text{ with initial value } \sum_{i=1}^d\theta_i\pi_i\right\}. \]
This framework is very similar to Chapter 1 in Föllmer and Schied [90], where it is shown that the assumption of coherent prices is equivalent to the absence of arbitrage opportunity in the market, defined as
\[ \text{(AAO)} \qquad G(\theta) \ge 0 \ \ P\text{-a.s.} \ \Rightarrow \ G(\theta) = 0 \ \ P\text{-a.s.} \]
These strategies can be used to hedge a risky position $Y$. In the classical financial literature, a superhedging strategy is a pair $(m, \theta)$ such that $m + G(\theta) \ge Y$ a.s. This leads to the notion of superhedging (superseller) price $\pi_\uparrow^{sell}(Y) = \inf\{m \mid \exists G(\theta) \text{ s.t. } m + G(\theta) \ge Y\}$. In terms of risk measure, we are concerned with the static superhedging price of $-Y$. So, by setting $\mathcal{H} = -\Gamma$, we define the risk measure $\nu^{\mathcal{H}}$ as
\[ \nu^{\mathcal{H}}(X) = \pi_\uparrow^{sell}(-X) = \inf\{m\in\mathbb{R},\ \exists\theta\in\mathbb{R}^d : m + X + G(\theta) \ge 0\}. \]
Let us observe that the no-arbitrage assumption implies that $E_{Q_0}[G(\theta)] = 0$. Hence, $\nu^{\mathcal{H}}(X) \ge E_{Q_0}[-X] > -\infty$. Moreover, the dual representation of the risk measure $\nu^{\mathcal{H}}$ in terms of probability measures is closely related to the absence of arbitrage opportunity, as underlined in the following proposition (Chapter 4 in [90]):
Proposition 3.2 i) If the market is arbitrage free, i.e., (AAO) holds true, the convex risk measure $\nu^{\mathcal{H}}$ can be represented in terms of the set of equivalent "martingale" measures $\mathcal{Q}_e$ as
\[ \nu^{\mathcal{H}}(\Psi) = \sup_{Q\in\mathcal{Q}_e}E_Q[-\Psi], \quad \text{where } \mathcal{Q}_e = \{Q \sim P,\ E_Q[G_i] = 0,\ \forall i = 1, \ldots, d\}. \tag{3.12} \]
By Theorem 3.2, this $L^\infty(P)$-risk measure is continuous from above.
ii) Moreover, the market is arbitrage-free if $\nu^{\mathcal{H}}$ is sensitive in the sense that $\nu^{\mathcal{H}}(\Psi) > \nu^{\mathcal{H}}(0)$ for all $\Psi$ such that $P(\Psi < 0) > 0$ and $P(\Psi \le 0) = 1$.
Calibration Point of View and Bid-Ask Constraint
This point of view is often used on financial markets when cash flows depend on some basic assets $(S_1, S_2, \ldots, S_n)$, whose characteristics will be given in the next paragraph. We can consider, for instance, $(C_i)$ as payoffs of derivative instruments, sufficiently liquid to be used as calibration tools and static hedging strategies. So far, all agents having access to the market agree on the derivative prices, and do not have any restriction on the quantity they can buy or sell. We now take into account some restrictions on the trading. We first introduce a bid-ask spread on the (forward) price of the different cash flows. We denote by $\pi_i^{ask}(C_i)$ the market buying price and by $\pi_i^{bid}(C_i)$ the market selling price. The price coherence is now written as
\[ \exists Q_0 \sim P,\ \forall i,\ \pi_i^{ask}(C_i) \le E_{Q_0}[C_i] \le \pi_i^{bid}(C_i). \]
To define the gains family, we need to make a distinction between cash flows when buying and cash flows when selling. To do that, we double the number of basic gains by associating, with any given cash flow $C_i$, both gains $G_i^{bid} = C_i - \pi_i^{bid}(C_i)$ and $G_i^{ask} = \pi_i^{ask}(C_i) - C_i$. Henceforth, we no longer distinguish the two in the notation, and we still denote any gain by $G_i$. The price coherence is then expressed as
\[ \exists Q_0 \sim P,\ \forall i = 1, \ldots, 2d,\ E_{Q_0}[G_i] \le 0. \]
The set of such probability measures, called super-martingale measures, is denoted by $\mathcal{Q}_{se}$. Note that the coherence of the prices implies that the set $\mathcal{Q}_{se}$ is nonempty.
Using this convention, a strategy is defined by a $2d$-dimensional vector $\theta$, the components of which are all nonnegative. More generally, we can introduce further trading restrictions on the size of the transaction by constraining $\theta$ to belong to a convex set $K \subseteq \mathbb{R}^{2d}_+$ such that $0\in K$. Note that we can also take into account some limits on the resources of the investor, in such a way that the initial price $\langle\theta, \pi\rangle$ has an upper bound. In any case, we still denote the set of admissible strategies by $K$ and the family of associated gains by $\Gamma = \{G(\theta) = \sum_{i=1}^{2d}\theta_iG_i,\ \theta\in K\}$. In this constrained framework, the relationship between price coherence and (AAO) on $\Gamma$ has been studied in detail in Bion-Nadal [27] and also in Chapter 1 of [90]. More precisely, as above, the price coherence implies that the risk measure $\nu^{\mathcal{H}}$ related to $\mathcal{H} = -\Gamma$ is not identically $-\infty$. A natural question is how to extend the duality relationship (3.12) using the subset of super-martingale measures. Using Paragraph 3.1.4, this question is equivalent to showing that the minimal penalty function is infinite outside the set of absolutely continuous probability measures and that $\nu^{\mathcal{H}}$ is continuous from above. When studying the risk measure $\nu^{\mathcal{H}}$ (Definition 3.4 and its properties), we have proved that
\[ \forall Q\in\mathcal{M}_{1,ac}(P), \quad \alpha^{\mathcal{H}}(Q) = \sup_{\xi\in\mathcal{H}}E_Q[-\xi] = \sup_{\theta\in K}E_Q[G(\theta)]. \]
In particular, since $0\in K$, if $Q\in\mathcal{Q}_{se}$, then $\alpha^{\mathcal{H}}(Q) = 0$. Moreover, if $\Gamma$ is a cone, then $\alpha^{\mathcal{H}}$ is the indicator function of $\mathcal{Q}_{se}$.
It remains to study the continuity from above of ν H and especially to relate it with the absence of arbitrage opportunity in the market. We summarize below the results Föllmer and Schied obtained in Theorem 4.95 and Corollary 9.30 [90].
Proposition 3.3 Let the set $K$ be a closed subset of $\mathbb{R}^d$. Then, the market is arbitrage free if and only if the risk measure $\nu^{\mathcal{H}}$ is sensitive. In this case, $\nu^{\mathcal{H}}$ is continuous from above and admits the dual representation:
\[ \nu^{\mathcal{H}}(\Psi) = \sup_{Q\in\mathcal{M}_{1,ac}}\{E_Q[-\Psi] - \alpha^{\mathcal{H}}(Q)\}. \]
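The frictionless static duality (3.12) can be verified by hand in a small model. The sketch below assumes a finite state space with a single hedging instrument and unconstrained $\theta\in\mathbb{R}$; it computes the superhedging side $\inf\{m : m + X + \theta G \ge 0\}$ directly and compares it with the largest value of $E_Q[-X]$ over the extreme points of the set of measures with $E_Q[G] = 0$. The numbers and function names are illustrative assumptions. The supremum over equivalent measures is replaced here by a maximum over the closure of that set, which gives the same value because the objective is linear in $Q$.

```python
def superhedge_primal(x, gains):
    """inf{ m : m + x + theta*g >= 0 in every state }, for one hedging gain g.

    theta -> max_i(-x_i - theta*g_i) is convex and piecewise linear, so when
    it is bounded below its minimum sits where two of its lines cross.
    """
    def worst(theta):
        return max(-xi - theta * gi for xi, gi in zip(x, gains))

    n = len(x)
    candidates = [0.0]
    for i in range(n):
        for j in range(i + 1, n):
            if gains[i] != gains[j]:
                # theta at which the lines for states i and j intersect
                candidates.append((x[j] - x[i]) / (gains[i] - gains[j]))
    return min(worst(t) for t in candidates)


def superhedge_dual(x, gains):
    """max of E_Q[-x] over extreme points of {Q >= 0, sum Q = 1, E_Q[g] = 0}."""
    n = len(x)
    # Point masses on states where the gain is zero.
    values = [-x[i] for i in range(n) if gains[i] == 0.0]
    # Two-point measures mixing a losing state i with a winning state j.
    for i in range(n):
        for j in range(n):
            if gains[i] < 0.0 < gains[j]:
                qi = gains[j] / (gains[j] - gains[i])   # weight on state i
                values.append(-(qi * x[i] + (1.0 - qi) * x[j]))
    return max(values)


gains = [-0.5, 0.0, 1.0]     # G = C - pi in three states (hypothetical)
position = [1.0, -2.0, 0.5]  # X in each state (hypothetical)

primal = superhedge_primal(position, gains)
dual = superhedge_dual(position, gains)
print(primal, dual)
```

The gain vector takes both signs, which is the finite-state form of (AAO) for a single instrument; without it the primal objective would be unbounded below and there would be no martingale measure to enumerate.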
Dynamic Hedging
A natural extension of the previous framework is the multiperiod setting or, more generally, the continuous-time setting. We briefly present some results in the latter case. Note that we will come back to these questions in the second part of this chapter, under a slightly different form, assuming that basic asset prices are Itô processes. We now consider a time horizon $T$, a filtration $(\mathcal{F}_t;\ t\in[0,T])$ on the probability space $(\Omega, \mathcal{F}, P)$, and a financial market with $n$ basic assets, whose (nonnegative) vector price process $S$ follows a special locally bounded semi-martingale under $P$. To avoid arbitrage, we assume that
(AAO) There exists a probability measure $Q_0 \sim P$ such that $S$ is a $Q_0$-local-martingale.
Let $\mathcal{Q}_{ac}$ be the family of absolutely continuous martingale measures: $\mathcal{Q}_{ac} = \{Q \mid Q \ll P,\ S \text{ is a } Q\text{-local-martingale}\}$. (AAO) ensures that the set $\mathcal{Q}_{ac}$ is nonempty. Then, as in Delbaen [66], $\mathcal{Q}_{ac}$ is a closed convex subset of $L^1(P)$. Let us now introduce dynamic strategies as predictable processes $\theta$ and their gain processes $G_t(\theta) = \int_0^t \langle \theta_u, dS_u \rangle = (\theta . S)_t$. We only consider bounded gain processes and define $\mathcal{S}_T = \{G_T(\theta) = (\theta . S)_T \mid \theta . S \text{ is bounded}\}$. Delbaen and Schachermayer have established in [67], as in the static case, the following duality relationship:
$$\sup\{E_Q[-X] \mid Q \in \mathcal{Q}_{ac}\} = \inf\{m \mid \exists\, G_T(\theta) \in \mathcal{S}_T \text{ s.t. } m + X + G_T(\theta) \ge 0\}.$$
Putting $\mathcal{H} = -\mathcal{S}_T$, this equality shows that $\nu^{\mathcal{H}}$ is a coherent convex risk measure continuous from above.

Constrained Portfolios

When constraints are introduced on the strategies, everything becomes more complex. We therefore refer to the course given by Schied [246] for more details. We assume that hedging positions live in the following convex set: $\mathcal{S}_T = \{G_T(\theta) = (\theta . S)_T \mid \theta . S \text{ is bounded from below},\ \theta \in K\}$. The set of constraints is closed in the following sense: the set $\{\int \theta\, dS \mid \theta \in K\}$ is closed in the semi-martingale or Émery topology. The optional decomposition theorem of Föllmer and Kramkov [87] implies the following dual representation for the risk measure $\nu^{\mathcal{H}}$:
$$\nu^{\mathcal{H}}(\Psi) = \sup_{Q \in \mathcal{M}_{1,ac}} \{E_Q[-\Psi] - E_Q[A_T^Q]\},$$
where $A^Q$ is the optional process defined by $A_0^Q = 0$ and $dA_t^Q = \operatorname{ess\,sup}_{\theta \in K} E_Q[\theta_t\, dS_t \mid \mathcal{F}_t]$. The penalty function $\alpha^{\mathcal{H}}$ of the risk measure $\nu^{\mathcal{H}}$ can be described as $E_Q[A_T^Q]$ provided that $Q$ satisfies the three following conditions:
• $Q$ is equivalent to $P$;
• Every process $\theta . S$ with $\theta \in K$ is a special semi-martingale under $Q$;
• $Q$ admits the upper variation process $A^Q$ for the set $\{\theta . S \mid \theta \in K\}$.
We can set $\alpha^{\mathcal{H}}(Q) = +\infty$ when one of these conditions does not hold.

Remark. Note that there is a fundamental difference between static hedging with a family of cash flows and dynamic hedging. In the first case, the initial wealth is market data: it corresponds to the (forward) price of the considered cash flows. The underlying logic is based upon calibration, as the probability measures we consider have to be consistent with the observed market prices of the hedging instruments. In the dynamic framework, the initial wealth is given. The agent invests it in a self-financing admissible portfolio, which may be rebalanced in continuous time.
The problem of dynamic hedging with calibration constraints is a classical problem for practitioners. It will be addressed in detail after the introduction of the inf-convolution operator. Some authors have looked at this question (see for instance Bion-Nadal [26] or Cont [52]).
3.2 DILATATION OF CONVEX RISK MEASURES, SUBDIFFERENTIAL AND CONSERVATIVE PRICE

3.2.1 Dilatation: γ-Tolerant Risk Measures

For noncoherent convex risk measures, the impact of the size of the position is not linear. It seems, therefore, natural to consider the relationship between "risk tolerance" and the perception of the size of the position. To do so, we start from a given root convex risk measure $\rho$. The risk tolerance coefficient is introduced as a parameter describing how agents penalize compared with this root risk measure. More precisely, denoting by $\gamma$ the risk tolerance, we define $\rho_\gamma$ as
$$\rho_\gamma(\Psi) = \gamma\, \rho\Big(\frac{1}{\gamma}\Psi\Big). \qquad (3.13)$$
$\rho_\gamma$ satisfies a tolerance property or a dilatation property with respect to the size of the position; it is therefore called the γ-tolerant risk measure associated with $\rho$ (also called the dilated risk measure associated with $\rho$, as in Barrieu and El Karoui [15]). A typical example is the entropic risk measure, where $e_\gamma$ is simply the γ-dilated of $e_1$. These dilated risk measures satisfy the following nice property.

Proposition 3.4 Let $(\rho_\gamma, \gamma > 0)$ be the family of γ-tolerant risk measures generated from $\rho$. Then,
i) The map $\gamma \to (\rho_\gamma - \gamma\rho(0))$ is nonincreasing.
ii) For any $\gamma, \gamma' > 0$, $(\rho_\gamma)_{\gamma'} = \rho_{\gamma\gamma'}$.
iii) The perspective functional defined on $]0, \infty[ \times \mathcal{X}$ by
$$p_\rho(\gamma, X) = \gamma\, \rho\Big(\frac{X}{\gamma}\Big) = \rho_\gamma(X)$$
is a homogeneous convex functional, cash-invariant with respect to $X$ (i.e., a coherent risk measure in $X$).

Proof. (i) We can take $\rho(0) = 0$ without loss of generality. By applying the convexity inequality to $X/\gamma$ and $0$ with the coefficients $\frac{\gamma}{\gamma+h}$ and $\frac{h}{\gamma+h}$ ($h > 0$), we have, since $\rho(0) = 0$,
$$\rho\Big(\frac{X}{\gamma+h}\Big) \le \frac{\gamma}{\gamma+h}\, \rho\Big(\frac{X}{\gamma}\Big) + \frac{h}{\gamma+h}\, \rho(0) \le \frac{\gamma}{\gamma+h}\, \rho\Big(\frac{X}{\gamma}\Big).$$
Multiplying by $\gamma + h$ gives $\rho_{\gamma+h}(X) \le \rho_\gamma(X)$.
(ii) The equation is an immediate consequence of the definition and characterization of tolerant risk measures.
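(iii) The perspective functional is clearly homogeneous. To show the convexity, let $\beta_1 \in [0, 1]$ and $\beta_2 = 1 - \beta_1$ be two real coefficients, and $(\gamma_1, X_1)$, $(\gamma_2, X_2)$ two points in the domain of $p_\rho$. Then, by the convexity of $\rho$,
$$\begin{aligned}
p_\rho(\beta_1(\gamma_1, X_1) + \beta_2(\gamma_2, X_2))
&= (\beta_1\gamma_1 + \beta_2\gamma_2)\, \rho\Big(\frac{\beta_1 X_1 + \beta_2 X_2}{\beta_1\gamma_1 + \beta_2\gamma_2}\Big) \\
&\le (\beta_1\gamma_1 + \beta_2\gamma_2) \Big[ \frac{\beta_1\gamma_1}{\beta_1\gamma_1 + \beta_2\gamma_2}\, \rho\Big(\frac{X_1}{\gamma_1}\Big) + \frac{\beta_2\gamma_2}{\beta_1\gamma_1 + \beta_2\gamma_2}\, \rho\Big(\frac{X_2}{\gamma_2}\Big) \Big] \\
&\le \beta_1 \rho_{\gamma_1}(X_1) + \beta_2 \rho_{\gamma_2}(X_2).
\end{aligned}$$
The other properties are obvious. □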
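As a numerical illustration of Proposition 3.4, the following sketch takes the entropic risk measure as root risk measure on a three-point sample space and checks properties (i) and (ii). The sample space, the probabilities `p`, the position `x`, and the helper names `entropic` and `dilate` are illustrative assumptions, not part of the text.

```python
import numpy as np

def entropic(x, p, gamma=1.0):
    """Entropic risk measure e_gamma(X) = gamma * log E_P[exp(-X / gamma)]."""
    z = -np.asarray(x, dtype=float) / gamma
    m = z.max()  # shift for a numerically stable log-sum-exp
    return float(gamma * (m + np.log(np.sum(p * np.exp(z - m)))))

def dilate(rho, gamma):
    """gamma-tolerant version of rho: X -> gamma * rho(X / gamma), as in (3.13)."""
    return lambda x: gamma * rho(np.asarray(x, dtype=float) / gamma)

# Hypothetical three-state example (assumption for illustration only).
p = np.array([0.2, 0.5, 0.3])
x = np.array([-1.0, 0.5, 2.0])
root = lambda v: entropic(v, p, 1.0)  # root risk measure e_1, with e_1(0) = 0

# (i) gamma -> rho_gamma(X) is nonincreasing (here rho(0) = 0).
gammas = [0.25, 0.5, 1.0, 2.0, 8.0]
values = [dilate(root, g)(x) for g in gammas]
is_nonincreasing = all(a >= b for a, b in zip(values, values[1:]))

# (ii) dilating twice multiplies the tolerances: (rho_2)_3 = rho_6.
twice = dilate(dilate(root, 2.0), 3.0)(x)
once = dilate(root, 6.0)(x)

print(is_nonincreasing, abs(twice - once) < 1e-12)
```

The same `dilate` wrapper applies to any callable root risk measure, so it can be reused to experiment with other choices of ρ.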
We are therefore naturally led to study the asymptotic behavior of the perspective risk measure when the risk tolerance tends either to +∞ or to 0.
3.2.2 Marginal Risk Measures and Subdifferential

Marginal Risk Measure

Let us first observe that $\rho$ is a coherent risk measure if and only if $\rho_\gamma \equiv \rho$. We then consider the behavior of the family of γ-tolerant risk measures when the tolerance becomes infinite.

Proposition 3.5 Suppose that $\rho(0) = 0$, or equivalently $\alpha(Q) \ge 0$ for all $Q \in \mathcal{M}_{1,f}$.
i) The marginal risk measure $\rho_\infty$, defined as the nonincreasing limit of $\rho_\gamma$ when $\gamma$ tends to infinity, is a coherent risk measure with penalty function $\alpha_\infty = \lim_{\gamma \to +\infty} (\gamma\alpha)$, that is,
$$\alpha_\infty(Q) := \sup_{\Psi} \{E_Q[-\Psi] - \rho_\infty(\Psi)\} = \begin{cases} 0 & \text{if } \alpha(Q) = 0, \\ +\infty & \text{if not,} \end{cases}
\qquad
\rho_\infty(\Psi) = \sup_{Q \in \mathcal{M}_{1,f}} \{E_Q[-\Psi] \mid \alpha(Q) = 0\}.$$
ii) Assume now that $\rho$ is an $L^\infty(P)$-risk measure such that $\rho(0) = 0$. If $\rho$ is continuous from below, then $\rho_\infty$ is continuous from below and admits a representation in terms of absolutely continuous probability measures as $\rho_\infty(\Psi) = \max_{Q \in \mathcal{M}_{1,ac}} \{E_Q[-\Psi] \mid \alpha(Q) = 0\}$, and the set $\{Q \in \mathcal{M}_{1,ac} \mid \alpha(Q) = 0\}$ is nonempty and weakly compact in $L^1(P)$.

Proof. i) Thanks to Theorem 3.4, for any $\Psi \in \mathcal{X}$, $\rho_\gamma(\Psi) \downarrow \rho_\infty(\Psi)$ when $\gamma \to +\infty$. Given the fact that $-m \ge \rho_\gamma(\Psi) \ge -M$ when $m \le \Psi \le M$, we also have $-m \ge \rho_\infty(\Psi) \ge -M$, and $\rho_\infty$ is finite. Convexity, monotonicity and cash translation invariance properties are preserved when taking the limit. Therefore, $\rho_\infty$ is a convex risk measure with $\rho_\infty(0) = 0$. Moreover, given that $(\rho_\delta)_\gamma = \rho_{\delta\gamma} = (\rho_\gamma)_\delta$, we have that $(\rho_\delta)_\infty = \rho_\infty = (\rho_\infty)_\delta$ and $\rho_\infty$ is a coherent risk measure.
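Since $\alpha \ge 0$, the minimal penalty function is
$$\alpha_\infty(Q) = \sup_{\xi} \{E_Q[-\xi] - \rho_\infty(\xi)\} = \sup_{\xi} \sup_{\gamma > 0} \Big\{E_Q[-\xi] - \gamma\rho\Big(\frac{\xi}{\gamma}\Big)\Big\} = \sup_{\gamma > 0} \{\gamma\alpha(Q)\} = \begin{cases} 0 & \text{if } \alpha(Q) = 0, \\ +\infty & \text{if not.} \end{cases}$$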
Moreover, $\alpha_\infty$ is not identically equal to $+\infty$ since the set $\{Q \in \mathcal{M}_{1,f} \mid \alpha(Q) = 0\}$ is not empty, given that $\rho(0) = 0 = \max\{-\alpha(Q)\} = -\alpha(Q_0)$ for some additive measure $Q_0 \in \mathcal{M}_{1,f}$, from Theorem 3.2. Assume now that $\rho$ is continuous from below and consider a nondecreasing sequence $(\xi_n \in \mathcal{X})$ with limit $\xi \in \mathcal{X}$. By monotonicity,
$$\rho_\infty(\xi) = \inf_{\gamma} \rho_\gamma(\xi) = \inf_{\gamma} \inf_{\xi_n} \rho_\gamma(\xi_n) = \inf_{\xi_n} \inf_{\gamma} \rho_\gamma(\xi_n) = \inf_{\xi_n} \rho_\infty(\xi_n).$$
Then, $\rho_\infty$ is also continuous from below.
ii) When $\rho$ is an $L^\infty(P)$-risk measure, continuous from below, $\rho$ is also continuous from above and the dual representation holds in terms of absolutely continuous probability measures. Using the same argument as above, we can prove that $\rho_\infty$ is a coherent $L^\infty(P)$-risk measure, continuous from below, with minimal penalty function: for $Q \in \mathcal{M}_{1,ac}$, $\alpha_\infty(Q) = 0$ if $\alpha(Q) = 0$, and $= +\infty$ otherwise.
Moreover, thanks to Theorem 3.3, the set $\{Q \in \mathcal{M}_{1,ac} \mid \alpha(Q) = 0\}$ is nonempty and weakly compact in $L^1(P)$. □

To gain some intuition about the interpretation in terms of marginal risk measure, it is better to refer to the risk aversion coefficient $\varepsilon = 1/\gamma$. $\rho_\infty(\Psi)$ appears as the limit, as $\varepsilon$ tends to 0, of $\frac{1}{\varepsilon}(\rho(\varepsilon\Psi) - \rho(0))$, i.e., the right-derivative at 0 in the direction of $\Psi$ of the risk measure $\rho$, or equivalently, the marginal risk measure. For instance, $e_\infty(\Psi) = E_P(-\Psi)$. In some cases, in particular when the set $\mathcal{Q}_{\alpha_\infty}$ has a single element, the pricing rule $\rho_\infty(-\Psi)$ is a linear pricing rule and can be seen as an extension of the notion of marginal utility pricing and of the Davis price (see Davis [60] or Karatzas and Kou [150]).

Subdifferential and Its Support Function

Subdifferential

Let us first recall the definition of the subdifferential of a convex functional.

Definition 3.6 Let $\phi$ be a convex functional on $\mathcal{X}$. The subdifferential of $\phi$ at $X$ is the set $\partial\phi(X) = \{q \in \mathcal{X}' \mid \forall Y \in \mathcal{X},\ \phi(X + Y) \ge \phi(X) + q(-Y)\}$.
The subdifferential of a convex risk measure $\rho$ with penalty function $\alpha(q) = \sup_Y \{q(-Y) - \rho(Y)\}$ is included in $\mathrm{Dom}(\alpha)$, since when $q \in \partial\rho(\xi)$, then $\alpha(q) - (q(-\xi) - \rho(\xi)) \le 0$. So, we always refer to finitely additive measures $Q$ when working with the subdifferential of a risk measure. In fact, we have the well-known characterization of the subdifferential: $q \in \partial\rho(\xi)$ if and only if $q \in \mathcal{M}_{1,f}$ is optimal for the maximization program $E_Q[-\xi] - \alpha(Q) \to \max_{Q \in \mathcal{M}_{1,f}}$. We can also relate it to the notion of marginal risk measure, when the root risk measure is now centered around a given element $\xi \in \mathcal{X}$, i.e., $\rho_\xi(X) = \rho(X + \xi) - \rho(\xi)$, by defining
$$\rho_{\infty,\xi}(\Psi) \equiv \lim_{\gamma \to +\infty} \gamma \Big( \rho\Big(\xi + \frac{\Psi}{\gamma}\Big) - \rho(\xi) \Big).$$
Using Proposition 3.5, since the penalty function of $\rho_\xi$ is $\alpha_\xi(Q) \equiv \alpha(Q) - E_Q[-\xi] + \rho(\xi)$, $\rho_{\infty,\xi}$ is coherent and
$$\rho_{\infty,\xi}(\Psi) = \sup_{Q \in \mathcal{M}_{1,f}} \{E_Q[-\Psi] \mid \rho(\xi) = E_Q[-\xi] - \alpha(Q)\}.$$
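Proposition 3.6 The coherent risk measure $\rho_{\infty,\xi}(\Psi) \equiv \lim_{\gamma \to +\infty} \gamma(\rho(\xi + \frac{\Psi}{\gamma}) - \rho(\xi))$ is the support function of the subdifferential $\partial\rho(\xi)$ of the convex risk measure $\rho$ at $\xi$:
$$\rho_{\infty,\xi}(\Psi) = \sup_{Q \in \mathcal{M}_{1,f}} \{E_Q[-\Psi] \mid \rho(\xi) = E_Q[-\xi] - \alpha(Q)\} = \sup_{Q \in \partial\rho(\xi)} E_Q[-\Psi].$$

Proof. From the definition of the subdifferential,
$$\partial\rho(\xi) = \{q \in \mathcal{X}' \mid \forall \Psi \in \mathcal{X},\ \rho(\xi + \Psi) \ge \rho(\xi) + q(-\Psi)\} = \{q \in \mathcal{X}' \mid \forall \Psi \in \mathcal{X},\ \rho_{\infty,\xi}(\Psi) \ge q(-\Psi)\} = \partial\rho_{\infty,\xi}(0).$$
But $q \in \partial\rho_{\infty,\xi}(0)$ iff $\alpha_{\infty,\xi}(Q_q) = 0$. So the proof is complete. □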
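Proposition 3.6 can be checked numerically on a simple case. For the entropic risk measure $e(\xi) = \log E_P[\exp(-\xi)]$ on a finite sample space, the functional is differentiable (it is a log-sum-exp function), so its subdifferential at $\xi$ reduces to the single measure $Q_\xi$ with density proportional to $\exp(-\xi)$. The sketch below compares the finite-γ quotient with $E_{Q_\xi}[-\Psi]$; the three-point sample space, the vectors `xi` and `psi`, and the helper names are illustrative assumptions.

```python
import numpy as np

def entropic(x, p):
    """Entropic risk measure e(X) = log E_P[exp(-X)] (tolerance 1)."""
    z = -np.asarray(x, dtype=float)
    m = z.max()  # shift for a numerically stable log-sum-exp
    return float(m + np.log(np.sum(p * np.exp(z - m))))

def marginal(xi, psi, p, gamma):
    """Finite-gamma approximation gamma * (rho(xi + psi / gamma) - rho(xi))."""
    return gamma * (entropic(xi + psi / gamma, p) - entropic(xi, p))

# Hypothetical three-state example (assumption for illustration only).
p = np.array([0.2, 0.5, 0.3])
xi = np.array([1.0, -0.5, 0.3])
psi = np.array([2.0, -1.0, 0.5])

# Measure Q_xi with density proportional to exp(-xi) with respect to P.
w = p * np.exp(-xi)
q = w / w.sum()

limit = float(np.sum(q * (-psi)))          # E_{Q_xi}[-psi], the support function value
approx = marginal(xi, psi, p, gamma=1e4)   # quotient for a large tolerance

print(approx, limit)
```

By convexity the quotient decreases towards its limit as γ grows, so `approx` should sit slightly above `limit`. Taking `xi = 0` recovers the marginal risk measure $e_\infty(\Psi) = E_P(-\Psi)$.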
The $L^\infty(P)$ Case

When working with $L^\infty(P)$-risk measures, following Delbaen [66] (Section 8), the natural definition of the subdifferential is the following:
$$\partial\rho(\xi) = \{f \in L^1(P) \mid \forall \Psi \in L^\infty(P),\ \rho(\xi + \Psi) \ge \rho(\xi) + E_P[f(-\Psi)]\}.$$
Using the same arguments as above, we can prove that every $f \in \partial\rho(\xi)$ is nonnegative with a $P$-expectation equal to 1. Since $\partial\rho(\xi)$ is also the subdifferential of $\rho_{\infty,\xi}$ at 0, the properties of $\partial\rho(\xi)$ may be deduced from those of the coherent risk measure $\rho_{\infty,\xi}$, for which we have already shown that if $\rho$ is continuous from below
and $\rho(0) = 0$, then for any $\xi$ the effective domain of $\alpha_{\infty,\xi}$ is nonempty. Then, under this assumption, $\partial\rho(\xi)$ is nonempty and we have the same characterization of the subdifferential as $Q \in \partial\rho(\xi) \iff \rho(\xi) = E_Q[-\xi] - \alpha(Q)$. We now summarize these results in the following proposition.

Proposition 3.7 Let $\rho$ be an $L^\infty(P)$-risk measure, continuous from below. Then, for any $\xi \in L^\infty(P)$, $\rho_{\infty,\xi}$ is the support function of the nonempty subdifferential $\partial\rho(\xi)$, i.e., $\rho_{\infty,\xi}(\Psi) = \sup\{E_Q[-\Psi];\ Q \in \partial\rho(\xi)\}$, and the supremum is attained by some $Q \in \partial\rho(\xi)$.

3.2.3 Conservative Risk Measures and Superprice

We now focus on the properties of the γ-tolerant risk measures when the risk tolerance coefficient tends to 0 or, equivalently, when the risk aversion coefficient goes to +∞. The conservative risk measures then obtained can be reinterpreted in terms of superpricing rules. Using vocabulary from convex analysis, these risk measures are related to recession (or asymptotic) functions.

Proposition 3.8 i) When $\gamma$ tends to 0, the family of γ-tolerant risk measures $(\rho_\gamma)$ admits a limit $\rho_{0^+}$, which is a coherent risk measure. This conservative risk measure $\rho_{0^+}$ is simply the "superprice" of $-\Psi$:
$$\rho_{0^+}(\Psi) = \lim_{\gamma \downarrow 0} (\rho_\gamma(\Psi) - \gamma\rho(0)) = \sup_{Q \in \mathcal{M}_{1,f}} \{E_Q[-\Psi] \mid \alpha(Q) < \infty\}.$$
Its minimal penalty function is $\alpha_{0^+}(Q) = 0$ if $\alpha(Q) < +\infty$, and $= +\infty$ if not.
ii) If $\rho$ is continuous from above on $L^\infty(P)$, then $\rho_{0^+}$ is continuous from above and
$$\rho_{0^+}(\Psi) = \sup_{Q \in \mathcal{M}_{1,ac}} \{E_Q[-\Psi] \mid \alpha(Q) < \infty\}.$$
Proof. Let us first observe that $\rho_\gamma(\xi) = \gamma(\rho(\frac{\xi}{\gamma}) - \rho(0)) + \gamma\rho(0)$ is the sum of two terms. The first term is monotonic while the second one goes to 0. The functional $\rho_{0^+}$ is coherent (same proof as for $\rho_\infty$) with the acceptance set $\mathcal{A}_{\rho_{0^+}} = \{\xi : \forall \lambda \ge 0,\ \lambda\xi \in \mathcal{A}_\rho - \rho(0)\}$. On the other hand, by monotonicity, the minimal penalty function satisfies $\alpha_{0^+} \ge \gamma\alpha \ge 0$; so, $\alpha_{0^+}(Q) = 0$ on $\mathrm{Dom}(\alpha)$, and $\alpha_{0^+}(Q) = +\infty$ if not. In other words, $\alpha_{0^+}$ is the convex indicator of $\mathrm{Dom}(\alpha)$. If $\rho$ is continuous from above on $L^\infty(P)$, then the same type of dual characterization holds for $\rho_{0^+}$ but in terms of $\mathcal{M}_{1,ac}$. So, $\alpha_{0^+}(Q) = 0$ on $\mathrm{Dom}(\alpha)$, and $\alpha_{0^+}(Q) = +\infty$ if not. We could have proved directly the continuity from above of
$\rho_{0^+}$, since $\rho_{0^+}$ is the nondecreasing limit of continuous from above risk measures $(\rho_\gamma - \gamma\rho(0))$. □

Remark. A nice illustration of this result can be obtained when considering the entropic risk measure $e_\gamma$. In this case, it follows immediately that $e_{0^+}(\Psi) = \sup_Q \{E_Q[-\Psi] \mid h(Q \mid P) < +\infty\} = P\text{-ess sup}(-\Psi) = \rho_{\max}(\Psi)$, where $\rho_{\max}$ is here the $L^\infty(P)$-worst case measure. This also corresponds to the weak superreplication price as defined by Biagini and Frittelli in [23]. Note that this conservative risk measure $e_{0^+}(\Psi)$ cannot be realized as $E_{Q_0}[-\Psi]$ for some $Q_0 \in \mathcal{M}_{1,ac}$. It is a typical example where the continuity from below fails.

3.3 INF-CONVOLUTION

A useful tool in convex analysis is the inf-convolution operation. While the classical convolution acts on Fourier transforms by multiplication, the inf-convolution acts on Fenchel transforms by addition, as we will see later.

3.3.1 Definition and Main Properties

The inf-convolution of two convex functionals $\phi_A$ and $\phi_B$ may be viewed as the functional value of the minimization program
$$\phi_{A,B}(X) = \inf_{H \in \mathcal{X}} \{\phi_A(X - H) + \phi_B(H)\}. \qquad (3.14)$$
This program is the functional extension of the classical inf-convolution operator acting on real convex functions, $f \,\square\, g(x) = \inf_y \{f(x - y) + g(y)\}$.

Illustrative Example

Let us assume that the risk measure $\rho_A$ is the linear one, $q_A(X) = E_{Q_A}[-X]$, whose penalty function is the functional $\alpha_A(Q) = 0$ if $Q = Q_A$, $= +\infty$ if not. Given a convex risk measure $\rho_B$ with penalty functional $\alpha_B$, we deduce from the definition of the inf-convolution that $q_A \,\square\, \rho_B(X) = E_{Q_A}[-X] - \alpha_B(Q_A)$.
• Then, $q_A \,\square\, \rho_B$ is identically $-\infty$ if $\alpha_B(Q_A) = +\infty$.
• If it is not the case, the minimal penalty function $\alpha_{A,B}$ associated with this measure is $\alpha_{A,B}(Q) = \alpha_B(Q_A) + \alpha_A(Q) = \alpha_B(Q) + \alpha_A(Q)$.
• Moreover, the infimum is attained in the inf-convolution program by any $H^*$ such that $\alpha_B(Q_A) = E_{Q_A}[-H^*] - \rho_B(H^*)$, that is, $H^*$ is optimal for the maximization program defining $\alpha_B(Q_A)$. In terms of subdifferential, we have the first-order condition: $Q_A \in \partial\rho_B(H^*)$.
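The classical inf-convolution of real convex functions recalled above can be explored with a brute-force grid search. The sketch below is an illustration under assumptions that are not in the text: the functions are the quadratics $f(t) = t^2/(2a)$ and $g(t) = t^2/(2b)$, for which elementary calculus gives $f \,\square\, g(x) = x^2/(2(a+b))$ with minimizer $y^* = bx/(a+b)$, and the helper `inf_convolution` is a hypothetical name.

```python
import numpy as np

def inf_convolution(f, g, x, grid):
    """Grid approximation of (f [] g)(x) = inf_y { f(x - y) + g(y) }.

    Returns the approximate infimum and the grid point where it is attained.
    """
    values = f(x - grid) + g(grid)
    k = int(np.argmin(values))
    return float(values[k]), float(grid[k])

# Hypothetical parameters (assumption for illustration only).
a, b = 1.0, 3.0
f = lambda t: t**2 / (2.0 * a)
g = lambda t: t**2 / (2.0 * b)
grid = np.linspace(-10.0, 10.0, 20001)  # step 0.001

x = 1.5
value, y_star = inf_convolution(f, g, x, grid)
closed_form = x**2 / (2.0 * (a + b))    # known value of f [] g for these quadratics

print(value, closed_form, y_star)
```

The grid search is only meant for one-dimensional experiments; it does not scale to the functional setting of (3.14), where the infimum runs over a space of random variables.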
Inf-Convolution and Duality

In our setting, convex functionals are generally convex risk measures, but we are also concerned with convex indicators of convex subsets, which take infinite values. In what follows, we always assume that the convex functionals $\phi$ we consider are proper (i.e., not identically $+\infty$) and in general closed or lower semicontinuous (in the sense that the level sets $\{X \mid \phi(X) \le c\}$, $c \in \mathbb{R}$, are weak*-closed). To be consistent with the risk measure notations, we define their Fenchel transforms on $\mathcal{X}'$ as
$$\beta(q) = \sup_{X \in \mathcal{X}} \{q(-X) - \phi(X)\}.$$
When the linear form $q$ is related to an additive finite measure $Q \in \mathcal{M}_{1,f}$, we use the notation $q_Q(X) = E_Q[X]$. For a general treatment of inf-convolution of convex functionals, the interested reader may refer to the enlightening paper of Borwein and Zhu [32]. The following theorem extends these results to the inf-convolution of convex functionals at least one of which is a convex risk measure.

Theorem 3.7 Let $\rho_A$ be a convex risk measure with penalty function $\alpha_A$ and $\phi_B$ be a proper closed convex functional with Fenchel transform $\beta_B$. Let $\rho_A \,\square\, \phi_B$ be the inf-convolution of $\rho_A$ and $\phi_B$ defined as
$$X \to \rho_A \,\square\, \phi_B(X) = \inf_{H \in \mathcal{X}} \{\rho_A(X - H) + \phi_B(H)\}, \qquad (3.15)$$
and assume that $\rho_A \,\square\, \phi_B(0) > -\infty$. Then,
i) $\rho_A \,\square\, \phi_B$ is a convex risk measure which is finite for all $X \in \mathcal{X}$.
ii) The associated penalty function $\alpha_{A,B}$ takes the value $+\infty$ for any $q$ outside of $\mathcal{M}_{1,f}$, and
$$\forall Q \in \mathcal{M}_{1,f},\quad \alpha_{A,B}(Q) = \alpha_A(Q) + \beta_B(q_Q), \qquad \text{and} \qquad \exists Q \in \mathcal{M}_{1,f} \text{ s.t. } \alpha_A(Q) + \beta_B(q_Q) < \infty.$$
iii) Moreover, if the risk measure $\rho_A$ is continuous from below, then $\rho_A \,\square\, \phi_B$ is also continuous from below.

Proof. We give here the main steps of the proof of this theorem.
• The monotonicity and translation invariance properties of $\rho_A \,\square\, \phi_B$ are immediate from the definition, since at least one of the two functionals has these properties.
• The convexity property simply comes from the fact that, if $\rho_A$ and $\phi_B$ are convex functionals, then for any $X_A$, $X_B$, $H_A$ and $H_B$ in $\mathcal{X}$ and any $\lambda \in [0, 1]$, it holds that
$$\rho_A((\lambda X_A + (1 - \lambda)X_B) - (\lambda H_A + (1 - \lambda)H_B)) \le \lambda\rho_A(X_A - H_A) + (1 - \lambda)\rho_A(X_B - H_B)$$
as well as
$$\phi_B(\lambda H_A + (1 - \lambda)H_B) \le \lambda\phi_B(H_A) + (1 - \lambda)\phi_B(H_B).$$
By adding both inequalities and taking the infimum in $H_A$ and $H_B$ on the left-hand side and separately in $H_A$ and in $H_B$ on the right-hand side, we obtain:
$$\rho_A \,\square\, \phi_B(\lambda X_A + (1 - \lambda)X_B) \le \lambda\, \rho_A \,\square\, \phi_B(X_A) + (1 - \lambda)\, \rho_A \,\square\, \phi_B(X_B).$$
• Using Equation (3.3), the associated penalty function is given, for any $Q \in \mathcal{M}_{1,f}$, by
$$\begin{aligned}
\alpha_{A,B}(Q) &= \sup_{X \in \mathcal{X}} \{E_Q[-X] - \rho_{A,B}(X)\} \\
&= \sup_{X \in \mathcal{X}} \{E_Q[-X] - \inf_{H \in \mathcal{X}} \{\rho_A(X - H) + \phi_B(H)\}\} \\
&= \sup_{X \in \mathcal{X}} \sup_{H \in \mathcal{X}} \{E_Q[-(X - H)] + E_Q[-H] - \rho_A(X - H) - \phi_B(H)\},
\end{aligned}$$
and by setting $\tilde{X} = X - H \in \mathcal{X}$ we get
$$\alpha_{A,B}(Q) = \sup_{\tilde{X} \in \mathcal{X}} \sup_{H \in \mathcal{X}} (E_Q[-\tilde{X}] - \rho_A(\tilde{X}) + E_Q[-H] - \phi_B(H)) = \alpha_A(Q) + \beta_B(q_Q).$$
When $q \notin \mathcal{M}_{1,f}$, the same equalities hold true. Since $\rho_A$ is a convex risk measure, $\alpha_A(q) = +\infty$, and since $\beta_B$ is a proper functional, $\beta_B(q)$ is bounded from below; so, $\alpha_{A,B}(q) = +\infty$. This equality $\alpha_{A,B} = \alpha_A + \beta_B$ holds even when $\alpha_A$ and $\beta_B$ take infinite values.
• The continuity from below is directly obtained upon considering an increasing sequence $(X_n) \in \mathcal{X}$ converging to $X$. Using the monotonicity property, we have
$$\inf_n \rho_A \,\square\, \phi_B(X_n) = \inf_n \inf_H \{\rho_A(X_n - H) + \phi_B(H)\} = \inf_H \inf_n \{\rho_A(X_n - H) + \phi_B(H)\} = \inf_H \{\rho_A(X - H) + \phi_B(H)\} = \rho_A \,\square\, \phi_B(X).$$
□
We can now give an inf-convolution interpretation of the convex risk measure $\nu^{\mathcal{H}}$ generated by a convex set $\mathcal{H}$, as in Corollary 3.5, as the inf-convolution of the convex indicator function of $\mathcal{H}$ and the worst-case risk measure. This regularization may be applied to any proper convex functional.

Proposition 3.9 [Regularization by inf-convolution with $\rho_{worst}$] Let
$$\rho_{worst}(X) = \sup_{\omega} (-X(\omega))$$
be the worst-case risk measure.
i) $\rho_{worst}$ is a neutral element for the infimal convolution of convex risk measures.
ii) Let $\mathcal{H}$ be a convex set such that $\inf\{m \mid \exists \xi \in \mathcal{H},\ m \ge \xi\} > -\infty$. The convex risk measure generated by $\mathcal{H}$, $\nu^{\mathcal{H}}$, is the inf-convolution of the convex indicator functional of $\mathcal{H}$ with the worst-case risk measure, $\nu^{\mathcal{H}} = \rho_{worst} \,\square\, l^{\mathcal{H}}$.
iii) More generally, let $\phi$ be a proper convex functional such that, for any $H$, we have
$$\phi(H) \ge -\sup_{\omega} H(\omega) - c.$$
The infimal convolution of $\rho_{worst}$ and $\phi$, $\rho_\phi = \rho_{worst} \,\square\, \phi$, is the largest convex risk measure dominated by $\phi$.
iv) Let $\beta$ be the penalty functional associated with $\phi$. Then, the penalty functional associated with $\rho_\phi$ is the functional $\alpha_\phi$, the restriction of $\beta$ to the set $\mathcal{M}_{1,f}$:
$$\alpha_\phi(q) = \beta(q) + l^{\mathcal{M}_{1,f}}(q) = \begin{cases} \beta(q_Q) & \text{if } q_Q \in \mathcal{M}_{1,f}, \\ +\infty & \text{if not.} \end{cases}$$
v) Given a general risk measure $\rho_A$ such that $\rho_A \,\square\, \phi(0) > -\infty$, then $\rho_A \,\square\, \phi = \rho_A \,\square\, \rho_{worst} \,\square\, \phi = \rho_A \,\square\, \rho_\phi$.

Proof. i) We start by proving that $\rho \,\square\, \rho_{worst} = \rho$. By definition,
$$\rho \,\square\, \rho_{worst}(X) = \inf_Y \{\sup_{\omega}(-Y(\omega)) + \rho(X - Y)\} = \inf_Y \{\rho(X - (Y + \sup_{\omega}(-Y(\omega))))\} = \inf_{Y \ge 0} \{\rho(X - Y)\} = \rho(X).$$
To conclude, we have used the cash invariance of $\rho$ and the fact that $\rho(X - Y) \ge \rho(X)$ whenever $Y \ge 0$.
ii) has been proved in Corollary 3.5.
iii) By Theorem 3.7, $\rho_\phi = \rho_{worst} \,\square\, \phi$ is a convex risk measure. Since $\rho_{worst}$ is a neutral element for the inf-convolution of risk measures, any risk measure $\rho$ dominated by $\phi$ is also dominated by $\rho_{worst} \,\square\, \phi$, since $\rho = \rho_{worst} \,\square\, \rho \le \rho_{worst} \,\square\, \phi = \rho_\phi$. Hence the result. □

Therefore, in the following, we only consider the infimal convolution of convex risk measures. The following result makes Theorem 3.7 more precise and plays a key role in our analysis.

Theorem 3.8 [Sandwich Theorem] Let $\rho_A$ and $\rho_B$ be two convex risk measures. Under the assumptions of Theorem 3.7 (i.e., $\rho_{A,B}(0) = \rho_A \,\square\, \rho_B(0) > -\infty$):
i) There exists $Q \in \partial\rho_{A,B}(0)$ such that, for any $X$ and any $Y$,
$$\rho_A \,\square\, \rho_B(0) \le (\rho_A(X) - E_Q[-X]) + (\rho_B(Y) - E_Q[-Y]).$$
ii) Assume $\rho_{A,B}(0) \ge c$. There is an affine function, $a_Q(X) = E_Q[-X] + r$, with $Q \in \partial\rho_{A,B}(0)$, satisfying
$$\rho_A(.) \ge a_Q \ge -\rho_B(-.) + c. \qquad (3.16)$$
Moreover, for any $X$ such that $\rho_A(X) + \rho_B(-X) = \rho_{A,B}(0)$, $Q \in \partial\rho_B(-X) \cap \partial\rho_A(X)$. The inf-convolution is said to be exact at $X$.
iii) Interpretation of the Condition $\rho_A \,\square\, \rho_B(0) > -\infty$.
The following properties are equivalent:
• $\rho_A \,\square\, \rho_B(0) > -\infty$.
• The sandwich property (3.16) holds for some affine function $a_Q(X) = E_Q[-X] + r$.
• There exists $Q \in \mathrm{Dom}(\alpha_A) \cap \mathrm{Dom}(\alpha_B)$.
• Let $\rho^A_{0^+}$ (resp. $\rho^B_{0^+}$) be the conservative risk measure associated with $\rho_A$ (resp. $\rho_B$). Then $\rho^A_{0^+}(X) + \rho^B_{0^+}(-X) \ge 0$.

Before proving this theorem, let us make the following comment: the inf-convolution risk measure $\rho_{A,B}$, given in Equation (3.15), may also be defined, for instance, as the value functional of the program
$$\rho_{A,B}(\Psi) = \rho_A \,\square\, \rho_B(\Psi) = \rho_A \,\square\, \nu^{\mathcal{A}_{\rho_B}}(\Psi) = \inf\{\rho_A(\Psi - H),\ H \in \mathcal{A}_{\rho_B}\},$$
where $\nu^{\mathcal{A}_{\rho_B}}$ is the risk measure with acceptance set $\mathcal{A}_{\rho_B}$. This emphasizes again, if needed, the key role played by the risk measures generated by a convex set.

Proof. i) Using Theorem 3.7, we see that the convex risk measure $\rho_{A,B}$ is finite. Hence, its subdifferential $\partial\rho_{A,B}(0)$ is nonempty. More precisely, there exists $Q_0 \in \partial\rho_{A,B}(0)$ such that $\rho_{A,B}(X) \ge \rho_{A,B}(0) + E_{Q_0}(-X)$. In other words,
$$\rho_{A,B}(0) \le \rho_{A,B}(X) + E_{Q_0}(X) \le \rho_A(X - Y) - E_{Q_0}[-(X - Y)] + \rho_B(Y) - E_{Q_0}[-Y].$$
ii.a) Assume that $\rho_A \,\square\, \rho_B(0) \ge c$. Applying the previous inequality at $Y = -Z$ and $X = U + Y = U - Z$, we have $\rho_A(U) - E_{Q_0}[-U] \ge -\rho_B(-Z) + E_{Q_0}[Z] + \rho_{A,B}(0)$. Then,
$$-\alpha_A(Q_0) := \inf_U \{\rho_A(U) - E_{Q_0}[-U]\} \ge \alpha_B(Q_0) + \rho_{A,B}(0) := \sup_Z \{-\rho_B(-Z) + E_{Q_0}[Z] + \rho_{A,B}(0)\}.$$
By Theorem 3.7 this inequality is in fact an equality. Picking $r = -\alpha_A(Q_0)$ and defining $a_{Q_0}(X) = E_{Q_0}[-X] + r$ yields an affine function that separates $\rho_A$ and $-\rho_B(-.) + c$.
ii.b) Finally, when $\rho_A(X) + \rho_B(-X) = \rho_{A,B}(0)$, by the above inequalities, we obtain $-\rho_B(-X) - E_{Q_0}[-X] \ge -\rho_B(-Z) + E_{Q_0}[Z]$. In other words, $Q_0$ belongs to $\partial\rho_B(-X)$. By symmetry, $Q_0$ also belongs to $\partial\rho_A(X)$.
iii) • The implication (1) ⇒ (2) ⇒ (3) is clear, using the results (i) and (ii) of this theorem.
• Very naturally, one obtains (3) ⇒ (2) and (3) ⇒ (1), as the existence of $Q_0 \in \mathrm{Dom}(\alpha_A) \cap \mathrm{Dom}(\alpha_B)$ implies that for any $X$, $\rho_A(X) \ge E_{Q_0}[-X] - \alpha_A(Q_0)$ and $\rho_B(-X) \ge E_{Q_0}[X] - \alpha_B(Q_0)$. Considering $r = \sup\{\alpha_A(Q_0);\ \alpha_B(Q_0)\}$, one obtains (2). Moreover, $\rho_A(X) + \rho_B(-X) \ge -(\alpha_A(Q_0) + \alpha_B(Q_0))$, and
taking the infimum with respect to $X$, we see that $\rho_A \,\square\, \rho_B(0) > -\infty$, i.e., property (1).
• Let us now look at the implication (2) ⇒ (4). We first observe that (2), i.e., $\rho_A(X) \ge E_{Q_0}[-X] + r$, implies $\rho^A_{0^+}(X) \ge E_{Q_0}[-X]$, and $\rho_B(-X) \ge E_{Q_0}[X] - r + c$ implies $\rho^B_{0^+}(-X) \ge E_{Q_0}[X]$. Therefore, we obtain (4) as $\rho^A_{0^+}(X) + \rho^B_{0^+}(-X) \ge 0$.
• The converse implication (4) ⇒ (2) is obtained by applying the sandwich property (3.16) to $\rho^A_{0^+}$ and $\rho^B_{0^+}$. □

On risk measures on $L^\infty(P)$

Let us consider the inf-convolution between two risk measures $\rho_A$ and $\rho_B$, where one of them, for instance $\rho_A$, is continuous from below (and consequently from above) and therefore is defined on $L^\infty(P)$. In this case, as the inf-convolution maintains the property of continuity from below (see Theorem 3.7), the risk measure $\rho_A \,\square\, \rho_B$ is also continuous from below and therefore is a risk measure on $L^\infty(P)$, having a dual representation on $\mathcal{M}_{1,ac}(P)$.

γ-Tolerant Risk Measures and Inf-Convolution

In this subsection, we come back to the particular class of γ-tolerant convex risk measures $\rho_\gamma$ to give an explicit solution to the exact inf-convolution. Recall that this family of risk measures is generated from a root risk measure $\rho$ by the transformation $\rho_\gamma(\xi_T) = \gamma\rho(\frac{\xi_T}{\gamma})$, where $\gamma$ is the risk tolerance coefficient with respect to the size of the exposure. These risk measures satisfy the following semigroup property for the inf-convolution.

Proposition 3.10 Let $(\rho_\gamma, \gamma > 0)$ be the family of γ-tolerant risk measures generated from $\rho$. Then, the following properties hold:
i) For any $\gamma_A, \gamma_B > 0$, $\rho_{\gamma_A} \,\square\, \rho_{\gamma_B} = \rho_{\gamma_A + \gamma_B}$.
ii) Moreover, $F^* = \frac{\gamma_B}{\gamma_A + \gamma_B} X$ is an optimal structure for the minimization program:
$$\rho_{\gamma_A + \gamma_B}(X) = \rho_{\gamma_A} \,\square\, \rho_{\gamma_B}(X) = \inf_F \{\rho_{\gamma_A}(X - F) + \rho_{\gamma_B}(F)\} = \rho_{\gamma_A}(X - F^*) + \rho_{\gamma_B}(F^*).$$
The inf-convolution is said to be exact at $F^*$.
iii) Let $\rho$ and $\rho'$ be two convex risk measures. Then, for any $\gamma > 0$, $\rho_\gamma \,\square\, \rho'_\gamma = (\rho \,\square\, \rho')_\gamma$.
iv) Assume $\rho(0) = 0$ and $\rho'(0) = 0$. When $\gamma = +\infty$, this relationship still holds: $\rho_\infty \,\square\, \rho'_\infty = (\rho \,\square\, \rho')_\infty$.
v) If $\rho_{0^+} \,\square\, \rho'_{0^+}(0) > -\infty$, we also have $\rho_{0^+} \,\square\, \rho'_{0^+} = (\rho \,\square\, \rho')_{0^+}$.

Proof. Both (i) and (iii) are immediate consequences of the definition of infimal convolution. (ii) We first study the stability property of the functional $\rho_\gamma$ by studying the optimization program $\rho_{\gamma_A}(X - F) + \rho_{\gamma_B}(F) \to \min_F$ restricted to the family $\{\alpha X,\ \alpha \in \mathbb{R}\}$. Then, given the expression of the functional $\rho_\gamma$, a natural candidate
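becomes $F^* = \frac{\gamma_B}{\gamma_A + \gamma_B} X$, since
$$\rho_{\gamma_A}(X - F^*) + \rho_{\gamma_B}(F^*) = (\gamma_A + \gamma_B)\, \rho\Big(\frac{1}{\gamma_A + \gamma_B} X\Big) = \rho_{\gamma_A + \gamma_B}(X).$$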
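The optimal structure in (ii) can be checked numerically in a simple case. The sketch below assumes that the root risk measure is the entropic one on a three-point sample space (an illustrative choice, as are the probabilities `p`, the exposure `x`, the tolerances, and the helper `total_risk`). It verifies that the proportional structure $F^*$ attains $\rho_{\gamma_A + \gamma_B}(X)$ and that random perturbations of $F^*$ never do better.

```python
import numpy as np

def entropic(x, p, gamma):
    """Entropic risk measure e_gamma(X) = gamma * log E_P[exp(-X / gamma)]."""
    z = -np.asarray(x, dtype=float) / gamma
    m = z.max()  # shift for a numerically stable log-sum-exp
    return float(gamma * (m + np.log(np.sum(p * np.exp(z - m)))))

# Hypothetical three-state example (assumption for illustration only).
p = np.array([0.2, 0.5, 0.3])
x = np.array([-1.0, 0.5, 2.0])
gamma_a, gamma_b = 1.0, 3.0

def total_risk(f):
    """Objective of the inf-convolution program: rho_A(X - F) + rho_B(F)."""
    return entropic(x - f, p, gamma_a) + entropic(f, p, gamma_b)

# Candidate optimum: share the exposure in proportion to the risk tolerances.
f_star = gamma_b / (gamma_a + gamma_b) * x
at_star = total_risk(f_star)
target = entropic(x, p, gamma_a + gamma_b)

# Random perturbations of F* should never achieve a lower total risk.
rng = np.random.default_rng(0)
perturbed = [total_risk(f_star + d) for d in rng.normal(scale=0.5, size=(200, x.size))]

print(abs(at_star - target) < 1e-12, min(perturbed) >= target - 1e-9)
```

A random search is of course not a proof of optimality; it only gives a quick sanity check of the closed-form candidate.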
(iv) The asymptotic properties are based on the fact that the map $\gamma \to \rho_\gamma$ is nonincreasing. Then, when $\gamma$ goes to infinity, passing to the limit is equivalent to taking the infimum with respect to $\gamma$, and changing the order of minimization justifies the passage to the limit.
(v) When $\gamma$ goes to 0, the problem becomes a minimax problem, and we only obtain the inequality. When the finiteness assumption holds, by Theorem 3.7, the minimal penalty function of $\rho_{0^+} \,\square\, \rho'_{0^+}$ is $\alpha_{0^+} + \alpha'_{0^+}$. By the properties of conservative risk measures, $\alpha_{0^+}$ is the convex indicator of $\mathrm{Dom}(\alpha)$. So, $\alpha_{0^+} + \alpha'_{0^+} = l^{\mathrm{Dom}(\alpha) \cap \mathrm{Dom}(\alpha')}$. On the other hand, the minimal penalty function of $(\rho \,\square\, \rho')_{0^+}$ is the indicator of $\mathrm{Dom}(\alpha + \alpha')$. Since $\alpha$ and $\alpha'$ are bounded from below (by $-\rho(0)$ and $-\rho'(0)$ respectively), $\mathrm{Dom}(\alpha + \alpha') = \mathrm{Dom}(\alpha) \cap \mathrm{Dom}(\alpha')$. Both risk measures have the same minimal penalty function. This completes the proof. □

An Example of Inf-Convolution: The Market Modified Risk Measure

We now consider a particular inf-convolution that is closely related to Subsection 3.1.4, as it also deals with the question of optimal hedging. More precisely, the minimization problem
$$\inf_{H \in \mathcal{V}_T} \rho(X - H)$$
can be seen as a hedging problem, where $\mathcal{V}_T$ corresponds to the set of hedging instruments. It somehow consists of restricting the risk measure $\rho$ to a particular set of admissible variables and is in fact the inf-convolution $\rho \,\square\, \nu^{\mathcal{V}_T}$. Using Proposition 3.9, it can also be seen as the inf-convolution $\rho \,\square\, l^{\mathcal{V}_T} \,\square\, \rho_{worst}$. The main role of $\rho_{worst}$ is to transform the convex indicator $l^{\mathcal{V}_T}$, which is not a convex risk measure (in particular, it is not translation invariant), into the convex risk measure $\nu^{\mathcal{V}_T}$. The following corollary is an immediate extension of Theorem 3.7, as it establishes that the value functional of the problem, denoted by $\rho^m$, is a convex risk measure, called market modified risk measure.

Corollary 3.9 Let $\mathcal{V}_T$ be a convex subset of $L^\infty(P)$ and $\rho$ be a convex risk measure with penalty function $\alpha$ such that $\inf\{\rho(-\xi_T),\ \xi_T \in \mathcal{V}_T\} > -\infty$. The inf-convolution of $\rho$ and $\nu^{\mathcal{V}_T}$, $\rho^m \equiv \rho \,\square\, \nu^{\mathcal{V}_T}$, also defined as
$$\rho^m(\Psi) \equiv \inf\{\rho(\Psi - \xi_T) \mid \xi_T \in \mathcal{V}_T\} = \rho \,\square\, l^{\mathcal{V}_T}(\Psi), \qquad (3.17)$$
is a convex risk measure, called market modified risk measure, with minimal penalty function defined on $\mathcal{M}_{1,ac}(P)$, $\alpha^m(Q) = \alpha(Q) + \alpha^{\mathcal{V}_T}(Q)$. Moreover, if $\rho$ is continuous from below, $\rho^m$ is also continuous from below.

This corollary makes precise the direct impact on the risk measure of the agent of the opportunity to invest optimally in a financial market.
Remark. Note that the set $\mathcal{V}_T$ is rather general. In most cases, additional assumptions will be added and the framework will be similar to those described in Subsection 3.1.5.

Acceptability and market modified risk measure: The market modified risk measure has to be related to the notion of acceptability introduced by Carr, Geman, and Madan in [211]. In this paper, they relax the strict notion of hedging in the following way: instead of imposing that the final outcome of an acceptable position, suitably hedged, should always be nonnegative, they simply require that it remains greater than an acceptable position. More precisely, using the same notations as in Subsection 3.1.5 and denoting by $\mathcal{A}$ a given acceptance set and by $\rho_{\mathcal{A}}$ its related risk measure, we can define the convex risk measure:
$$\bar{\nu}^{\mathcal{H}}(X) = \inf\{m \in \mathbb{R},\ \exists \theta \in K,\ \exists A \in \mathcal{A} : m + X + G(\theta) \ge A\ \ P\text{-a.s.}\}.$$
To have a clearer picture of what this risk measure really is, let us first fix $G(\theta)$. In this case, we simply look at $\rho_{\mathcal{A}}(X + G(\theta))$. Then, the risk measure $\bar{\nu}^{\mathcal{H}}$ is defined by taking the infimum of $\rho_{\mathcal{A}}(X + G(\theta))$ with respect to $\theta$,
$$\bar{\nu}^{\mathcal{H}}(X) = \inf_{\theta \in K} \rho_{\mathcal{A}}(X + G(\theta)) = \inf_{H \in \mathcal{H}} \rho_{\mathcal{A}}(X - H).$$
Therefore, the risk measure $\bar{\nu}^{\mathcal{H}}$ is in fact the particular market-modified risk measure $\rho^m = \nu^{\mathcal{H}} \,\square\, \rho_{\mathcal{A}}$. We obtain directly the following result of Föllmer and Schied [90] (Proposition 4.98): the minimal penalty function of this convex risk measure $\bar{\nu}^{\mathcal{H}}$ is given by $\bar{\alpha}^{\mathcal{H}}(Q) = \alpha^{\mathcal{H}}(Q) + \alpha(Q)$, where $\alpha^{\mathcal{H}}$ is the minimal penalty function of $\nu^{\mathcal{H}}$ and $\alpha$ is the minimal penalty function of the convex risk measure with acceptance set $\mathcal{A}$.

3.4 OPTIMAL DERIVATIVE DESIGN

In this section, we present our main problem, that of optimal derivative design (and pricing). The framework we generally consider involves two economic agents, at least one of them being exposed to a nontradable risk. The risk transfer between both agents takes place through a structured contract denoted by $F$ for an initial price $\pi$. The problem is therefore to design the transaction, in other words, to find the structure $F$ and its price $\pi$. This transaction may occur only if both agents find some interest in it. They express their satisfaction or interest in terms of the expected utility of their terminal wealth after the transaction, or more generally in terms of risk measures.

3.4.1 General Modelling Framework

Two economic agents, respectively denoted by A and B, are evolving in an uncertain universe modeled by a standard measurable space $(\Omega, \mathcal{F})$ or, if a reference
probability measure is given, by a probability space (, , P). In the following, for the sake of simplicity in our argumentation, we will make no distinction between both situations. More precisely, in the second case, all properties should hold P − a.s.. Both agents are taking part in trade talks to improve the distribution and management of their own risk. The nature of both agents can be quite freely chosen. It is possible to look at them in terms of a classical insured-insurer relationship, but from a more financial point of view, we may think of agent A as a market maker or a trader managing a particular book and of agent B as a traditional investor or as another trader. More precisely, we assume that at a future time horizon T , the value of agent A’s terminal wealth, denoted by XTA , is sensitive to a nontradable risk. Agent B may also have her own exposure XTB at time T . Note that by “terminal wealth,” we mean the terminal value at the time horizon T of all capitalized cash flows paid or received between the initial time and T ; no particular sign constraint is imposed. Agent A wants to issue a structured contract (financial derivative, insurance contract) F with maturity T and forward price π to reduce her exposure XTA . Therefore, she calls on agent B. Hence, when a transaction occurs, the terminal wealth of the agent A and B are WTA = XTA − F + π,
WTB = XTB + F − π.
As before, we assume that all the quantities we consider belong to the Banach space X , or, if a reference probability measure is given, to L∞ (P). The problem is therefore to find the optimal structure of the risk transfer (F, π) according to a given choice criterion, which is in our study a convex risk measure. More precisely, assuming that agent A (resp. agent B) assesses her risk exposure using a convex risk measure ρA (resp. ρB ), agent A’s objective is to choose the optimal structure (F, π ) in order to minimize the risk measure of her final wealth ρA (XTA − F + π) → inf . F ∈X ,π
Her constraint is then to find a counterpart. Hence, agent B should have an interest in doing this transaction. At least, the $F$-structure should not worsen her risk measure. Consequently, agent B simply compares the risk measures of two terminal wealths: the first corresponds to her initial exposure $X_T^B$ and the second to her new wealth if she enters the $F$-transaction,

$$\rho_B(X_T^B + F - \pi) \le \rho_B(X_T^B).$$

Transaction Feasibility and Optimization Program

The optimization program described above,

$$\inf_{F \in \mathcal{X},\, \pi} \rho_A(X_T^A - F + \pi) \quad \text{subject to} \quad \rho_B(X_T^B + F - \pi) \le \rho_B(X_T^B), \tag{3.18}$$
can be simplified using the cash-translation invariance property. More precisely, binding the constraint imposed by agent B at the optimum and using the translation
invariance property of $\rho_B$, we find directly the optimal pricing rule for a structure $F$:

$$\pi_B(F) = \rho_B(X_T^B) - \rho_B(X_T^B + F). \tag{3.19}$$

This pricing rule is an indifference pricing rule for agent B. It gives for any structure $F$ the maximum amount agent B is ready to pay in order to enter the transaction. Note also that this optimal pricing rule, together with the cash translation invariance property of the functional $\rho_A$, enables us to rewrite the optimization program (3.18) as follows, without any need for a Lagrange multiplier:

$$\inf_{F \in \mathcal{X}} \{\rho_A(X_T^A - F) + \rho_B(X_T^B + F) - \rho_B(X_T^B)\},$$

or, to within the constant $\rho_B(X_T^B)$, as

$$R_{AB}(X_T^A, X_T^B) = \inf_{F \in \mathcal{X}} \{\rho_A(X_T^A - F) + \rho_B(X_T^B + F)\}. \tag{3.20}$$
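The reduction from Program (3.18) to Program (3.20) can be checked numerically. The following is only a minimal sketch under stated assumptions: both agents are assumed to use an entropic risk measure $\rho_\gamma(X) = \gamma \log E[\exp(-X/\gamma)]$ on a hypothetical three-state sample space, and every number (probabilities, exposures, candidate structure, tolerance coefficients) is illustrative rather than taken from the text.

```python
import math

def entropic(x, p, gamma):
    """Entropic risk measure: rho(X) = gamma * log E[exp(-X / gamma)]."""
    return gamma * math.log(sum(pi * math.exp(-xi / gamma) for xi, pi in zip(x, p)))

# Hypothetical three-state example; all numbers are illustrative assumptions.
p = [0.2, 0.5, 0.3]            # reference probabilities
x_a = [-4.0, 1.0, 3.0]         # agent A's exposure X_T^A
x_b = [0.5, 0.0, -1.0]         # agent B's exposure X_T^B
f = [-2.0, 0.5, 1.0]           # an arbitrary candidate structure F
gamma_a, gamma_b = 1.0, 2.0    # assumed risk tolerance coefficients

x_a_minus_f = [a - v for a, v in zip(x_a, f)]
x_b_plus_f = [b + v for b, v in zip(x_b, f)]

# Indifference price of agent B, Equation (3.19).
rho_b_before = entropic(x_b, p, gamma_b)
price = rho_b_before - entropic(x_b_plus_f, p, gamma_b)

# At this price the constraint of agent B is binding.
rho_b_after = entropic([w - price for w in x_b_plus_f], p, gamma_b)

# Agent A's objective at this price versus the reduced objective preceding (3.20).
objective_a = entropic([w + price for w in x_a_minus_f], p, gamma_a)
reduced = (entropic(x_a_minus_f, p, gamma_a)
           + entropic(x_b_plus_f, p, gamma_b)
           - rho_b_before)

print(price, rho_b_after - rho_b_before, objective_a - reduced)
```

Both printed differences vanish up to rounding, which is exactly the cash translation invariance argument used above; any other cash-invariant risk measure could replace the entropic one in this sketch.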
Interpretation in Terms of Indifference Prices

This optimization program (Program (3.20)) can be reinterpreted in terms of indifference prices, using the notation introduced in the exponential utility framework in Subsection 3.1.1. To show this, we introduce the constants $\rho_A(X_T^A)$ and $\rho_B(X_T^B)$ in such a way that Program (3.20) is equivalent to

$$\inf_{F \in \mathcal{X}} \{\rho_A(X_T^A - F) - \rho_A(X_T^A) + \rho_B(X_T^B + F) - \rho_B(X_T^B)\}.$$

Then, using the previous comments, it is possible to interpret $\rho_A(X_T^A - F) - \rho_A(X_T^A)$ as $\pi_A^s(F \mid X_T^A)$, i.e., the seller's indifference pricing rule for $F$ given agent A's initial exposure $X_T^A$, while $\rho_B(X_T^B + F) - \rho_B(X_T^B)$ is simply the opposite of $\pi_B^b(F \mid X_T^B)$, the buyer's indifference pricing rule for $F$ given agent B's initial exposure $X_T^B$. For agent A, everything then consists of choosing the structure so as to minimize the difference between her (seller's) indifference price (given $X_T^A$) and the (buyer's) indifference price imposed by agent B:

$$\inf_{F \in \mathcal{X}} \{\pi_A^s(F \mid X_T^A) - \pi_B^b(F \mid X_T^B)\} \le 0. \tag{3.21}$$
Note that for $F \equiv 0$, the spread between both transaction indifference prices is equal to 0. Hence, the infimum is always nonpositive. This is completely coherent with the idea that the optimal transaction reduces the risk of agent A. The transaction may occur since the minimal seller price is less than the maximal buyer price. For agent A, everything can also be expressed as the following maximization program:

$$\sup_{F \in \mathcal{X}} \{\pi_B^b(F \mid X_T^B) - \pi_A^s(F \mid X_T^A)\}. \tag{3.22}$$

The interpretation then becomes more obvious, since the issuer has to choose the structure optimally in order to maximize the "ask-bid" spread associated with the transaction.
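The sign property in (3.21) can be illustrated with a small computation. This is a sketch under assumptions: entropic risk measures are assumed for both agents, the three-state sample space and all numbers are hypothetical, and the candidate structures are restricted to quota shares $F = t X_T^A$ purely for illustration.

```python
import math

def entropic(x, p, gamma):
    """Entropic risk measure: rho(X) = gamma * log E[exp(-X / gamma)]."""
    return gamma * math.log(sum(pi * math.exp(-xi / gamma) for xi, pi in zip(x, p)))

# Hypothetical data; every number is an illustrative assumption.
p = [0.2, 0.5, 0.3]
x_a = [-4.0, 1.0, 3.0]         # X_T^A
x_b = [0.5, 0.0, -1.0]         # X_T^B
gamma_a, gamma_b = 1.0, 2.0

def seller_price_a(f):
    """pi_A^s(F | X_T^A) = rho_A(X_T^A - F) - rho_A(X_T^A)."""
    return entropic([a - v for a, v in zip(x_a, f)], p, gamma_a) - entropic(x_a, p, gamma_a)

def buyer_price_b(f):
    """pi_B^b(F | X_T^B) = rho_B(X_T^B) - rho_B(X_T^B + F)."""
    return entropic(x_b, p, gamma_b) - entropic([b + v for b, v in zip(x_b, f)], p, gamma_b)

def spread(f):
    """Seller price minus buyer price, the quantity minimized in (3.21)."""
    return seller_price_a(f) - buyer_price_b(f)

no_trade = [0.0, 0.0, 0.0]
# Candidate structures: quota shares F = t * X_T^A for t = 0, 0.25, ..., 1.
candidates = [[0.25 * k * a for a in x_a] for k in range(5)]
best_spread = min(spread(f) for f in candidates)

print(spread(no_trade), best_spread)
```

Since the no-trade structure belongs to the candidate family and has zero spread, the best spread over the family cannot be positive, which mirrors the argument given for the full infimum.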
Relationships with the Insurance Literature and the Principal-Agent Problem

The relationship between both agents is very similar to a principal-agent framework. Agent A plays an active role in the transaction. She chooses the "payment structure" and is therefore the "principal" in our framework. Agent B, on the other hand, is the "agent," as she simply imposes a price constraint on the principal and in this sense is rather passive. Such a modeling framework is also very similar to an insurance problem: Agent A is looking for an optimal "insurance" policy to cover her risk (extending here the simple notion of loss as previously mentioned). In this sense, she can be seen as the "insured." On the other hand, Agent B accepts to bear some risk. She plays the same role as an "insurer" for Agent A. In fact, this optimal risk transfer problem is closely related to the standard issue of optimal policy design in insurance, which has been widely studied in the literature (see, for instance, Borch [31], Bühlman [35], [36] and [37], Bühlman and Jewell [38], Gerber [106], Raviv [226]). One of the fundamental characteristics of an insurance policy design problem is the sign constraint imposed on the risk, which should represent a loss. Other specifications, such as moral hazard or adverse selection problems, have to be taken into account when designing a policy (for more details, among a wide literature, refer, for instance, to the two papers on the principal-agent relationship by Rees [227] and [228]). These are related to the potential influence of the insured on the considered risk. Transferring risk in finance is somewhat different. Risk is then taken in a wider sense, as it represents the uncertain outcome. The sign of the realization does not a priori matter in the design of the transfer. The derivative market is a good illustration of this aspect: forwards, options, and swaps have particular payoffs that are not directly related to any particular loss of the contract's seller.
3.4.2 Optimal Transaction

This subsection aims at solving explicitly the optimization Program (3.20):

$$R_{AB}(X_T^A, X_T^B) = \inf_{F \in \mathcal{X}} \{\rho_A(X_T^A - F) + \rho_B(X_T^B + F)\}.$$

The value functional $R_{AB}(X_T^A, X_T^B)$ can be seen as the residual risk measure after the $F$-transaction, or equivalently as a measure of the risk remaining after the transaction. It depends on both initial exposures $X_T^A$ and $X_T^B$, since the transaction consists of an optimal redistribution of the respective risk of both agents. Let us denote $\tilde{F} \equiv X_T^B + F \in \mathcal{X}$. The program to be solved becomes

$$R_{AB}(X_T^A, X_T^B) = \inf_{\tilde{F} \in \mathcal{X}} \{\rho_A(X_T^A + X_T^B - \tilde{F}) + \rho_B(\tilde{F})\},$$

or, equivalently, using Section 3.3, it can be written as the following inf-convolution problem:

$$R_{AB}(X_T^A, X_T^B) = \rho_A \square \rho_B (X_T^A + X_T^B). \tag{3.23}$$
As previously mentioned in Theorem 3.7, the condition $\rho_A \square \rho_B(0) > -\infty$ is required when considering this inf-convolution problem. This condition is equivalent to

$$\forall \xi \in \mathcal{X}, \quad \rho^A_{0^+}(\xi) + \rho^B_{0^+}(-\xi) \ge 0$$

(Theorem 3.8 iii). This property has a nice economic interpretation, since it says that the inf-convolution program makes sense if and only if for any derivative $\xi$, the conservative seller price of agent A, $-\rho^A_{0^+}(\xi)$, is less than the conservative buyer price of agent B, $\rho^B_{0^+}(-\xi)$. In the following, we assume such a condition to be satisfied. The problem is not to study the residual risk measure as previously but to characterize the optimal structure $\tilde{F}^*$ or $F^*$ such that the inf-convolution is exact at this point. To do so, we first consider a particular framework where the optimal transaction can be explicitly identified. This corresponds to a well-studied situation in economics where both agents belong to the same family.

Optimal Transaction between Agents with Risk Measures in the Same Family

More precisely, we now assume that both agents have $\gamma$-tolerant risk measures $\rho_{\gamma_A}$ and $\rho_{\gamma_B}$ from the same root risk measure $\rho$ with risk tolerance coefficients $\gamma_A$ and $\gamma_B$, as introduced in Subsection 3.2.1. In this framework, the optimization program (3.23) is written as follows:

$$R_{AB}(X_T^A, X_T^B) = \rho_{\gamma_A} \square \rho_{\gamma_B} (X_T^A + X_T^B).$$

In this framework, the optimal risk transfer is consistent with the so-called Borch's theorem. In this sense, the following result can be seen as an extension of this theorem, since the framework we consider here is different from that of utility functions. In his paper [31], Borch indeed obtained, in a utility framework, an optimal exchange of risk leading in many cases to the familiar linear quota-sharing of total pooled losses.

Theorem 3.10 (Borch [31]) The residual risk measure after the transaction is given by

$$R_{AB}(X_T^A, X_T^B) = \inf_{F \in \mathcal{X}} \{\rho_{\gamma_A}(X_T^A - F) + \rho_{\gamma_B}(X_T^B + F)\} = \rho_{\gamma_C}(X_T^A + X_T^B),$$

with $\gamma_C = \gamma_A + \gamma_B$. The optimal structure is given as a proportion of the initial exposures $X_T^A$ and $X_T^B$, depending only on the risk tolerance coefficients of both agents:

$$F^* = \frac{\gamma_B}{\gamma_A + \gamma_B} X_T^A - \frac{\gamma_A}{\gamma_A + \gamma_B} X_T^B \quad \text{(to within a constant)}. \tag{3.24}$$

The equality in Equation (3.24) has to be understood $\mathbb{P}$-a.s. if the space of structured products is $L^\infty(\mathbb{P})$.

Proof. The optimization program (3.20) to be solved (with $\tilde{F} \equiv X_T^B + F \in \mathcal{X}$) is

$$R_{AB}(X_T^A, X_T^B) = \inf_{\tilde{F} \in \mathcal{X}} \left(\rho_{\gamma_A}(X_T^A + X_T^B - \tilde{F}) + \rho_{\gamma_B}(\tilde{F})\right).$$

Using Proposition 3.10, the optimal structure $\tilde{F}^*$ is $\tilde{F}^* = \frac{\gamma_B}{\gamma_A + \gamma_B}(X_T^A + X_T^B)$. The result is then obtained by setting $F^* = \tilde{F}^* - X_T^B$. □
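Theorem 3.10 lends itself to a quick numerical sanity check. The sketch below assumes that the root risk measure is the entropic one, so that $\rho_\gamma(X) = \gamma \log E[\exp(-X/\gamma)]$ is its $\gamma$-tolerant version; the three-state sample space, the exposures, and the tolerance coefficients are hypothetical. It compares the total risk at the quota-share structure (3.24) with $\rho_{\gamma_C}(X_T^A + X_T^B)$ and with randomly perturbed structures.

```python
import math
import random

def entropic(x, p, gamma):
    """Entropic risk measure: rho_gamma(X) = gamma * log E[exp(-X / gamma)]."""
    return gamma * math.log(sum(pi * math.exp(-xi / gamma) for xi, pi in zip(x, p)))

# Hypothetical data; every number is an illustrative assumption.
p = [0.2, 0.5, 0.3]
x_a = [-4.0, 1.0, 3.0]         # X_T^A
x_b = [0.5, 0.0, -1.0]         # X_T^B
gamma_a, gamma_b = 1.0, 2.0
gamma_c = gamma_a + gamma_b

def total_risk(f):
    """rho_{gamma_A}(X_T^A - F) + rho_{gamma_B}(X_T^B + F), the objective in (3.20)."""
    return (entropic([a - v for a, v in zip(x_a, f)], p, gamma_a)
            + entropic([b + v for b, v in zip(x_b, f)], p, gamma_b))

# Quota-share structure of Equation (3.24).
f_star = [gamma_b / gamma_c * a - gamma_a / gamma_c * b for a, b in zip(x_a, x_b)]

# Residual risk predicted by the theorem.
pooled = [a + b for a, b in zip(x_a, x_b)]
residual = entropic(pooled, p, gamma_c)

# Randomly perturbed structures should never do better than f_star.
random.seed(0)
perturbed = [total_risk([v + random.uniform(-1.0, 1.0) for v in f_star])
             for _ in range(200)]

print(total_risk(f_star), residual, min(perturbed))
```

The first two printed values agree up to rounding, and no perturbed structure achieves a lower total risk. Adding a constant to `f_star` leaves `total_risk` unchanged, which is the "to within a constant" qualification in (3.24).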
Comments and properties:

i) Both agents are transferring a part of their initial risk according to their relative tolerance. The optimal risk transfer underlines the symmetry of the framework for both agents. Moreover, even if the issuer, agent A, has no exposure, a transaction will occur between both agents. The structure $F$ enables them to exchange a part of their respective risk. Note that if none of the agents is initially exposed, no transaction will occur. In this sense, the transaction has a nonspeculative underlying logic.

ii) Note also that the composite parameter $\gamma_C$ is simply equal to the sum of both risk tolerance coefficients $\gamma_A$ and $\gamma_B$. This may justify the use of risk tolerance instead of risk aversion, for which the harmonic mean has to be used.

Individual Hedging as a Risk Transfer

In this subsection, we focus on the individual hedging problem of agent A and see how this problem can be interpreted as a particular risk transfer problem. The question of optimal hedging has been widely studied in the literature under the name of "hedging in incomplete markets and pricing via utility maximization" in some particular frameworks. Most of the studies have considered exponential utility functions. Among the numerous papers, we may quote the papers by Frittelli [97], El Karoui and Rouge [156], Delbaen et al. [81], Kabanov and Stricker [146], or Becherer [16]. We assume that agent A assesses her risk using an $L^\infty(\mathbb{P})$ risk measure $\rho_A$. She can (partially) hedge her initial exposure $X_T^A$ using instruments from a convex subset $V_T^A$ of $L^\infty(\mathbb{P})$. Her objective is to minimize the risk measure of her terminal wealth:

$$\inf_{\xi \in V_T^A} \rho_A(X_T^A - \xi). \tag{3.25}$$
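A toy version of the hedging program (3.25) can be solved by brute force. This is only a sketch under assumptions: the risk measure is assumed entropic, the convex set is taken to be $V_T^A = \{\theta S : |\theta| \le 2\}$ for a single hypothetical instrument with payoff $S$, and the infimum is approximated by a grid search over $\theta$; none of these choices come from the text.

```python
import math

def entropic(x, p, gamma):
    """Entropic risk measure: rho(X) = gamma * log E[exp(-X / gamma)]."""
    return gamma * math.log(sum(pi * math.exp(-xi / gamma) for xi, pi in zip(x, p)))

# Hypothetical data; every number is an illustrative assumption.
p = [0.2, 0.5, 0.3]
x_a = [-4.0, 1.0, 3.0]         # X_T^A
s = [-1.0, 0.2, 1.0]           # payoff of one unit of the hedging instrument
gamma_a = 1.0

def hedged_risk(theta):
    """rho_A(X_T^A - theta * S), the objective of (3.25) on the segment."""
    return entropic([x - theta * v for x, v in zip(x_a, s)], p, gamma_a)

# Grid approximation of the bounded convex set {theta * S : |theta| <= 2}.
thetas = [k / 100.0 for k in range(-200, 201)]
theta_star = min(thetas, key=hedged_risk)

rho_market_modified = hedged_risk(theta_star)   # approximates the infimum in (3.25)
rho_unhedged = entropic(x_a, p, gamma_a)

print(theta_star, rho_market_modified, rho_unhedged)
```

Because $\theta = 0$ belongs to the grid, the hedged risk can never exceed the unhedged one. The payoff `s` takes both signs, so this toy market admits an equivalent probability measure under which `s` has zero mean, in line with the arbitrage-free assumption.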
The $L^\infty(\mathbb{P})$ framework has been carefully described in Subsection 3.1.5. In particular, to have coherent transaction prices, we assume in the following that the market is arbitrage-free. As already mentioned in Subsection 3.3.1, the opportunity to invest optimally in a financial market has a direct impact on the risk measure of the agent and transforms her initial risk measure $\rho_A$ into the market modified risk measure $\rho_A^m = \rho_A \square \nu^A$. This inf-convolution problem makes sense if the condition $\rho_A^m(0) > -\infty$ is satisfied. The hedging problem of agent A is identical to the previous risk transfer problem (3.20), agent B being now the financial market with the associated risk measure $\nu^A$.

Existence of an Optimal Hedge

The question of the existence of an optimal hedge can be answered using different approaches. One of them is based on analysis techniques, and we present it in this subsection. In the following, however, when introducing dynamic risk measures, we will consider other methods leading to a more constructive answer. In this subsection, we are interested in studying the existence of a solution for the hedging problem of agent A (Program (3.25)) or equivalently for the inf-convolution problem in $L^\infty(\mathbb{P})$. The following existence result can be obtained.
Theorem 3.11 Let $V_T$ be a convex subset of $L^\infty(\mathbb{P})$ and $\rho$ be a convex risk measure on $L^\infty(\mathbb{P})$ continuous from below, such that $\inf_{\xi \in V_T} \rho(-\xi) > -\infty$. Assume that the convex set $V_T$ is bounded in $L^\infty(\mathbb{P})$. The infimum of the hedging program

$$\rho^m(X) \triangleq \inf_{\xi \in V_T} \rho(X - \xi)$$

is "attained" for a random variable $\xi_T^*$ in $L^\infty(\mathbb{P})$, belonging to the closure of $V_T$ with respect to the a.s. convergence.
Proof. First note that the proof of this theorem relies on arguments similar to those used by Kabanov and Stricker [146]. In particular, a key argument is the Komlos Theorem (Komlos [161]):

Lemma 3.12 (Komlos) Let $(\phi_n)$ be a sequence in $L^1(\mathbb{P})$ such that $\sup_n E_{\mathbb{P}}(|\phi_n|) < +\infty$. Then there exists a subsequence $(\phi_{n'})$ of $(\phi_n)$ and a function $\phi^* \in L^1(\mathbb{P})$ such that for every further subsequence $(\phi_{n''})$ of $(\phi_{n'})$, the Cesaro means of these subsequences converge to $\phi^*$, that is,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n''=1}^{N} \phi_{n''}(\omega) = \phi^*(\omega) \quad \text{for almost every } \omega \in \Omega.$$

We first show that the set $S_r = \{\xi \in L^\infty(\mathbb{P}) \mid \rho(X - \xi) \le r\}$ is closed for the weak*-topology. To do that, by the Krein-Smulian theorem ([90] Theorem A.63), it is sufficient to show that $S_r \cap \{\xi ; \|\xi\|_\infty \le C\}$ is closed in $L^\infty(\mathbb{P})$. Let $(\xi_n \in V_T)$ be a sequence bounded by $C$, converging in $L^\infty$ to $\xi^*$. A subsequence still denoted by $\xi_n$ converges a.s. to $\xi^*$. Since $\rho_A$ is continuous from below, $\rho$ is continuous with respect to pointwise convergence of bounded sequences and then $\xi^*$ belongs to $S_r$. Hence $S_r$ is weak*-closed.

Given the assumption that $(\xi_n)$ is $L^\infty$-bounded, we can apply the Komlos lemma: there exists a subsequence $(\xi_{j_k} \in V_T^A)$ such that the Cesaro means $\tilde{\xi}_n \triangleq \frac{1}{n} \sum_{k=1}^{n} \xi_{j_k}$ converge almost surely to $\xi^* \in L^\infty(\mathbb{P})$. Note that $\tilde{\xi}_n$ belongs to $V_T^A$ as a convex combination of elements of $V_T^A$. So $\xi^*$ belongs to the a.s. closure of $V_T^A$. Since $\rho_A$ is continuous from below, $\rho$ is continuous with respect to pointwise convergence of bounded sequences:

$$\limsup_n \rho_A(X - \tilde{\xi}_n) \le \rho_A(X - \xi^*) = \rho_A\left(\lim_n (X - \tilde{\xi}_n)\right) \le \liminf_n \rho_A(X - \tilde{\xi}_n).$$

Then,

$$\rho_A^m(X) \le \rho_A(X - \xi^*) \le \liminf_n \rho_A\left(\frac{1}{n} \sum_{k=1}^{n} (X - \xi_{j_k})\right) \le \liminf_n \frac{1}{n} \sum_{k=1}^{n} \rho_A(X - \xi_{j_k})$$

by the Jensen inequality. Finally, given the convergence of $\rho_A(X - \xi_{j_k})$ to $\rho_A^m(X)$, the Cesaro means also converge and $\rho_A(X - \xi^*) = \inf_{\xi \in V_T^A} \rho_A(X - \xi)$. □
γ-Tolerant Risk Measures: Derivatives Design with Hedging Opportunities

We now consider the situation where both agents A and B have a $\gamma$-dilated risk measure, defined on $L^\infty(\mathbb{P})$ and continuous from above. Moreover, they may reduce their risk by transferring it between themselves but also by investing in the financial market, choosing optimally their financial investments.
The investment opportunities of both agents are described by two convex subsets $V_T^A$ and $V_T^B$ of $L^\infty(\mathbb{P})$. In order to have coherent transaction prices, we assume that the market is arbitrage-free. In our framework, this can be expressed as the existence of a probability measure that is equivalent to $\mathbb{P}$ in both sets of probability measures

$$\mathcal{M}_{V_T^i} = \{Q \in \mathcal{M}_{1,e}(\mathbb{P});\ \forall \xi \in V_T^i,\ E_Q[-\xi] \le 0\} \quad \text{for } i = A, B.$$

Equivalently,

$$\exists Q \sim \mathbb{P} \ \text{ s.t. } \ Q \in \mathcal{M}_{V_T^A} \cap \mathcal{M}_{V_T^B}.$$

This opportunity to invest optimally in a financial market reduces the risk of both agents. To assess their respective risk exposure, they now refer to the market modified risk measures $\rho^m_{\gamma_A}$ and $\rho^m_{\gamma_B}$ defined as

$$\rho^m_{\gamma_A}(\Psi) = \rho_{\gamma_A} \square \nu^A(\Psi) \quad \text{and} \quad \rho^m_{\gamma_B}(\Psi) = \rho_{\gamma_B} \square \nu^B(\Psi).$$

Let us consider directly the optimal risk transfer problem with these market modified risk measures, i.e.,

$$R^m_{AB}(X_T^A + X_T^B) = \inf_{F \in \mathcal{X}} \{\rho^m_{\gamma_A}(X_T^A - F) + \rho^m_{\gamma_B}(X_T^B + F)\}. \tag{3.26}$$
The details of this computation will be given in the next subsection, when considering the general framework. The residual risk measure $R^m_{AB}(X_T^A + X_T^B)$ defined in Equation (3.26) may be simplified using the commutativity property of the inf-convolution and the semigroup property of $\gamma$-tolerant risk measures:

$$\begin{aligned}
R^m_{AB}(X_T^A + X_T^B) &= \rho^m_{\gamma_A} \square \rho^m_{\gamma_B}(X_T^A + X_T^B) \\
&= \rho_{\gamma_A} \square \nu^A \square \rho_{\gamma_B} \square \nu^B (X_T^A + X_T^B) \\
&= \rho_{\gamma_A} \square \rho_{\gamma_B} \square \nu^A \square \nu^B (X_T^A + X_T^B) \\
&= \rho_{\gamma_C} \square \nu^A \square \nu^B (X_T^A + X_T^B),
\end{aligned}$$

where $\rho_{\gamma_C}$ is the $\gamma$-tolerant risk measure associated with the risk tolerance coefficient $\gamma_C = \gamma_A + \gamma_B$. This inf-convolution program makes sense under the initial condition $\rho^m_{\gamma_A} \square \rho^m_{\gamma_B}(0) > -\infty$. Such an assumption is made. The following theorem gives the optimal risk transfer in different situations depending on the access both agents have to the financial markets.

Theorem 3.13 Let both agents have $\gamma$-tolerant risk measures with respective risk tolerance coefficients $\gamma_A$ and $\gamma_B$.

i) If both agents have the same access to the financial market from a cone $V_T$, then an optimal structure, solution of the minimization Program (3.26), is given by

$$F^* = \frac{\gamma_B}{\gamma_A + \gamma_B} X_T^A - \frac{\gamma_A}{\gamma_A + \gamma_B} X_T^B.$$

ii) Assume that both agents have different access to the financial market via two convex sets $V_T^A$ and $V_T^B$. Suppose $\xi^* = \eta_A^* + \eta_B^*$ is an optimal solution of

$$\inf_{\xi \in V_T^{(A+B)}} \rho_{\gamma_C}(X_T^A + X_T^B - \xi),$$

with $\eta_A^* \in V_T^{(A)}$, $\eta_B^* \in V_T^{(B)}$ and $V_T^{(A+B)} = \{\xi_T^A + \xi_T^B \mid \xi_T^A \in V_T^{(A)},\ \xi_T^B \in V_T^{(B)}\}$. Then,

$$F^* = \frac{\gamma_B}{\gamma_A + \gamma_B} X_T^A - \frac{\gamma_A}{\gamma_A + \gamma_B} X_T^B - \frac{\gamma_B}{\gamma_A + \gamma_B} \eta_A^* + \frac{\gamma_A}{\gamma_A + \gamma_B} \eta_B^*$$

is an optimal structure. Moreover,

i) $\eta_B^*$ is an optimal hedging portfolio of $X_T^B + F^*$ for agent B:

$$\frac{1}{\gamma_B} \rho_{\gamma_B}(X_T^B + F^* - \eta_B^*) = \frac{1}{\gamma_B} \inf_{\xi_B \in V_T^{(B)}} \rho_{\gamma_B}(X_T^B + F^* - \xi_B) = \frac{1}{\gamma_C} \rho_{\gamma_C}(X_T^A + X_T^B - \xi^*).$$

ii) $\eta_A^*$ is an optimal hedging portfolio of $X_T^A - F^*$ for agent A:

$$\frac{1}{\gamma_A} \rho_{\gamma_A}(X_T^A - F^* - \eta_A^*) = \frac{1}{\gamma_A} \inf_{\xi_A \in V_T^{(A)}} \rho_{\gamma_A}(X_T^A - F^* - \xi_A) = \frac{1}{\gamma_C} \rho_{\gamma_C}(X_T^A + X_T^B - \xi^*).$$
Proof. To prove this theorem, we proceed in several steps:

Step 1. Let us first observe that

$$R^m_{AB}(X_T^A + X_T^B) = \rho_{\gamma_C}(X_T^A + X_T^B - \xi^*) = \inf_{\tilde{F} \in \mathcal{X}} \{\rho_{\gamma_A}(X_T^A + X_T^B - \xi^* - \tilde{F}) + \rho_{\gamma_B}(\tilde{F})\},$$

where $\tilde{F} = F + X_T^B - \xi_B$. Given Proposition 3.10, we obtain directly an expression for the optimal "structure" $\tilde{F}^*$ as

$$\tilde{F}^* = \frac{\gamma_B}{\gamma_A + \gamma_B}(X_T^A + X_T^B - \xi^*) = \frac{\gamma_B}{\gamma_C}(X_T^A + X_T^B - \xi^*).$$

Moreover, $\rho_{\gamma_B}(\tilde{F}^*) = \frac{\gamma_B}{\gamma_C} \rho_{\gamma_C}(X_T^A + X_T^B - \xi^*)$.

Step 2. Rewriting in the reverse order, we naturally set $F^* = \tilde{F}^* - X_T^B + \eta_B^*$. We then want to prove that $\eta_B^*$ is an optimal investment for agent B. For the sake of simplicity in our notation, we consider

$$G_X(\xi_A, \xi_B, F) \triangleq \rho_{\gamma_A}(X_T^A - F - \xi_A) + \rho_{\gamma_B}(X_T^B + F - \xi_B).$$

Given the optimality of $\xi^* = \eta_A^* + \eta_B^*$ and $\tilde{F}^* = F^* + X_T^B - \eta_B^*$, we have

$$\begin{aligned}
R^m_{AB}(X_T^A + X_T^B) = G_X(\eta_A^*, \eta_B^*, F^*) &= \inf_{F \in \mathcal{X},\ \xi_A \in V_T^{(A)},\ \xi_B \in V_T^{(B)}} G_X(\xi_A, \xi_B, F) \\
&\le \inf_{\xi_B \in V_T^{(B)}} G_X(\eta_A^*, \xi_B, F^*) \le G_X(\eta_A^*, \eta_B^*, F^*).
\end{aligned}$$

Then $\eta_B^*$ is optimal for the problem $\rho_{\gamma_B}(X_T^B + F^* - \xi_B) \to \inf_{\xi_B \in V_T^{(B)}}$. The optimality of $\eta_A^*$ can be proved using the same arguments. □
Remark. (a) We first assume that both agents have the same access to the financial market from a cone $H$. Given the fact that the risk measure generated by $H$ is coherent and thus invariant by dilatation, the market-modified risk measures of both agents are generated from the root risk measure $\rho \square \nu^H = \rho^H$ as

$$\rho_A^H = \rho_{\gamma_A} \square \nu^H = \rho_{\gamma_A} \square \nu^H_{\gamma_A} = (\rho \square \nu^H)_{\gamma_A} = \rho^H_{\gamma_A} \quad \text{and} \quad \rho_B^H = \rho^H_{\gamma_B}.$$

(b) In a more general framework, when both agents have different access to the financial market, the convex set $V_T^{(A+B)}$ associated with the risk measure $\nu^{(A+B)} = \nu^A \square \nu^B$ plays the same role as the set $H$ above, since

$$\rho_{\gamma_C} \square \nu^A \square \nu^B (X_T^A + X_T^B) = \rho_{\gamma_C} \square \nu^{(A+B)} (X_T^A + X_T^B).$$

Comments: Note that when both agents have the same access to the financial market, it is optimal to transfer the same proportion of the initial risk as in the problem without market. This result is very strong, as it does not require any specific assumption either for the nontradable risk or the financial market. Moreover, the optimal structure $F^*$ does not depend on the financial market. The impact of the financial market is simply visible through the pricing rule, which depends on the market modified risk measure of agent B. Standard diversification will also occur in exchange economies as soon as agents have proportional penalty functions. The regulator has to impose very different rules on agents so as to generate risk measures with nonproportional penalty functions if she wants to increase the diversification in the market. In other words, diversification occurs when agents are very different from one another. This result supports, for instance, the intervention of reinsurance companies on financial markets in order to increase the diversification on the reinsurance market.

Optimal Transaction in the General Framework

We now come back to our initial problem of optimal risk transfer between agent A and agent B, when they both have access to the financial market to hedge and diversify their respective portfolios.
General Framework

As in the dilated framework, we assume that both their risk measures $\rho_A$ and $\rho_B$ are defined on $L^\infty(\mathbb{P})$ and are continuous from above. The investment opportunities of both agents are described by two convex subsets $V_T^A$ and $V_T^B$ of $L^\infty(\mathbb{P})$, and the financial market is assumed to be arbitrage-free.

1. This opportunity to invest optimally in a financial market reduces the risk of both agents. To assess their respective risk exposure, they now refer to market modified risk measures $\rho_A^m$ and $\rho_B^m$ defined for $J = A, B$ as

$$\rho_J^m(\Psi) \triangleq \inf_{\xi_J \in V_T^{(J)}} \rho_J(\Psi - \xi_J).$$

As usual, we assume that $\rho_J^m(0) > -\infty$ for the individual hedging programs to make sense. Thanks to Corollary 3.9,

$$\rho_A^m(\Psi) = \rho_A \square \nu^A(\Psi) \quad \text{and} \quad \rho_B^m(\Psi) = \rho_B \square \nu^B(\Psi).$$

2. Consequently, the optimization program related to the $F$-transaction is simply

$$\inf_{F, \pi} \rho_A^m(X_T^A - F + \pi) \quad \text{subject to} \quad \rho_B^m(X_T^B + F - \pi) \le \rho_B^m(X_T^B).$$

As previously, using the cash translation invariance property and binding the constraint at the optimum, the pricing rule of the $F$-structure is fully determined by the buyer as

$$\pi^*(F) = \rho_B^m(X_T^B) - \rho_B^m(X_T^B + F). \tag{3.27}$$

It corresponds to an "indifference" pricing rule from agent B's market modified risk measure.

3. Using again the cash translation invariance property, the optimization program simply becomes

$$\inf_F \{\rho_A^m(X_T^A - F) + \rho_B^m(X_T^B + F) - \rho_B^m(X_T^B)\} \triangleq R^m_{AB}(X_T^A + X_T^B) - \rho_B^m(X_T^B).$$

With the functional $R^m_{AB}$, we are in the framework of Theorem 3.7:

$$\begin{aligned}
R^m_{AB}(X_T^A + X_T^B) &= \inf_F \{\rho_A^m(X_T^A - F) + \rho_B^m(X_T^B + F)\} && (3.28) \\
&= \inf_{\tilde{F}} \{\rho_A^m(X_T^A + X_T^B - \tilde{F}) + \rho_B^m(\tilde{F})\} = \rho_A^m \square \rho_B^m (X_T^A + X_T^B) \\
&= \rho_A \square \nu^A \square \rho_B \square \nu^B (X_T^A + X_T^B). && (3.29)
\end{aligned}$$

The value functional $R^m_{AB}$ of this program, resulting from the inf-convolution of four different risk measures, may be interpreted as the residual risk measure after all transactions. This inf-convolution problem makes sense if the initial condition $\rho_A^m \square \rho_B^m(0) > -\infty$ is satisfied.

4. Using the previous Theorem 3.7 on the stability of convex risk measures, provided the initial condition is satisfied, $R^m_{AB}$ is a convex risk measure with the penalty function $\alpha^m_{AB} = \alpha^m_A + \alpha^m_B = \alpha_A + \alpha_B + \alpha^{V_T^A} + \alpha^{V_T^B}$.
Comments: The general risk transfer problem can be viewed as a game involving four different agents if the access to the financial market is different for agent A and agent B (or three otherwise). As a consequence, we end up with an inf-convolution problem involving four different risk measures, two per agent.

Optimal Design Problem

Our problem is to find an optimal structure $F^*$ realizing the minimum of the Program (3.28):

$$R^m_{AB}(X_T^A + X_T^B) = \inf_F \{\rho_A^m(X_T^A - F) + \rho_B^m(X_T^B + F)\}.$$

Let us first consider the following simple inf-convolution problem between a convex risk measure $\rho_B$ and a linear function $q_A$ as introduced in Subsection 3.3.1:

$$q_A \square \rho_B(X) = \inf_F \{E_{Q_A}[-(X - F)] + \rho_B(F)\}. \tag{3.30}$$
Proposition 3.11 The necessary and sufficient condition to have an optimal solution $F^*$ to the linear inf-convolution problem (3.30) is expressed in terms of the subdifferential of $\rho_B$ as $Q_A \in \partial \rho_B(F^*)$.

This necessary and sufficient condition corresponds to the first order condition of the optimization problem. More generally, the following result is obtained.

Theorem 3.14 (Characterization of the optimal) Assume that $\rho_A^m \square \rho_B^m(0) > -\infty$. The inf-convolution program

$$R^m_{AB}(X_T^A + X_T^B) = \inf_F \{\rho_A^m(X_T^A - F) + \rho_B^m(X_T^B + F)\}$$

is exact at $F^*$ if and only if there exists $Q^X_{AB} \in \partial R^m_{AB}(X_T^A + X_T^B)$ such that $Q^X_{AB} \in \partial \rho_A^m(X_T^A - F^*) \cap \partial \rho_B^m(X_T^B + F^*)$. In other words, the necessary and sufficient condition to have an optimal solution $F^*$ to the inf-convolution program is that there exists an optimal additive measure $Q^X_{AB}$ for $(X_T^A + X_T^B, R^m_{AB})$ such that $X_T^B + F^*$ is optimal for $(Q^X_{AB}, \alpha^m_B)$ and $X_T^A - F^*$ is optimal for $(Q^X_{AB}, \alpha^m_A)$.

Both notions of optimality are rather intuitive, as they simply translate the fact that the dual representations of the risk measure on the one hand, and of the penalty function on the other, are exact respectively at a given additive measure and at a given exposure. A natural interpretation of this theorem is that both agents agree on the measure $Q^X_{AB}$ in order to value their respective residual risk. This agreement enables the transaction.

Proof. Let us denote by $Q^X_{AB}$ the optimal additive measure for $(X_T^A + X_T^B, R^m_{AB})$. In this case, $Q^X_{AB} \in \partial R^m_{AB}(X_T^A + X_T^B)$. As mentioned in Subsection 3.1.2, the existence of such an additive measure is guaranteed as soon as the penalty function is defined by Equation (3.10). This justifies the writing of the theorem in terms of additive measures rather than in terms of probability measures. In the proof, we write $X \triangleq X_T^A + X_T^B$ and denote by $\Psi^c$ the random variable $\Psi$ centered with respect to the given additive measure $Q^X_{AB}$ optimal for $(X, R_{AB})$: $\Psi^c = \Psi - E_{Q^X_{AB}}[\Psi]$. So, by definition,

$$\begin{aligned}
-R_{AB}(X^c) &= \alpha_A(Q^X_{AB}) + \alpha_B(Q^X_{AB}) \\
&= \sup_F \{-\rho_A(X^c - F^c)\} + \sup_F \{-\rho_B(F^c)\} \\
&\ge -\inf_F \{\rho_A(X^c - F^c) + \rho_B(F^c)\} = -R_{AB}(X^c).
\end{aligned}$$

In particular, all inequalities are equalities and

$$\sup_F \{-\rho_A(X^c - F^c)\} + \sup_F \{-\rho_B(F^c)\} = \sup_F \{-\rho_A(X^c - F^c) - \rho_B(F^c)\}.$$

Hence, $F^*$ is optimal for the inf-convolution problem, or equivalently for the program on the right-hand side of this equality, if and only if $F^*$ is optimal for both
problems:

$$\sup_F \{-\rho_B(F^c)\} \quad \text{and} \quad \sup_F \{-\rho_A(X^c - F^c)\}.$$

The second formulation is a straightforward application of Theorem 3.8 ii), considering the problem not at 0 but at $X_T^A + X_T^B$. □

In order to obtain an explicit representation of an optimal structure $F^*$, some technical methods involving a localization of convex risk measures have to be used. This is the aim of the second part of this chapter, which is based upon some technical results on BSDEs. Therefore, before localizing convex risk measures and studying our optimal risk transfer in this new framework, we present in a separate section a brief review of BSDEs, which is essential for a good understanding of the second part on dynamic risk measures.
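The agreement of both agents on a common measure in Theorem 3.14 can be made concrete in a special case. The sketch below is an illustration under assumptions, not the general construction: both agents are assumed to use entropic risk measures without market access, for which the subgradient at $X$ is the Gibbs measure with density proportional to $\exp(-X/\gamma)$; the sample space and all numbers are hypothetical, and $F^*$ is the quota-share structure of Theorem 3.10.

```python
import math

def gibbs_measure(x, p, gamma):
    """Probabilities q_i proportional to p_i * exp(-x_i / gamma): the measure at
    which the dual representation of the entropic risk measure is exact at x."""
    weights = [pi * math.exp(-xi / gamma) for xi, pi in zip(x, p)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data; every number is an illustrative assumption.
p = [0.2, 0.5, 0.3]
x_a = [-4.0, 1.0, 3.0]         # X_T^A
x_b = [0.5, 0.0, -1.0]         # X_T^B
gamma_a, gamma_b = 1.0, 2.0
gamma_c = gamma_a + gamma_b

# Quota-share structure of Theorem 3.10.
f_star = [gamma_b / gamma_c * a - gamma_a / gamma_c * b for a, b in zip(x_a, x_b)]

# Measure of agent A at X_T^A - F*, of agent B at X_T^B + F*,
# and of the residual risk measure at the pooled exposure.
q_a = gibbs_measure([a - v for a, v in zip(x_a, f_star)], p, gamma_a)
q_b = gibbs_measure([b + v for b, v in zip(x_b, f_star)], p, gamma_b)
q_ab = gibbs_measure([a + b for a, b in zip(x_a, x_b)], p, gamma_c)

print(q_a, q_b, q_ab)
```

The three measures coincide up to rounding, so in this special case both agents value their residual positions under the same measure at the optimum, as the theorem requires.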
PART II: DYNAMIC RISK MEASURES

We now consider dynamic convex risk measures. Quite recently, many authors have studied dynamic versions of static risk measures, focusing especially on the question of law invariance of these dynamic risk measures: among many other references, one may quote the papers by Cvitanic and Karatzas [58], Wang [264], Scandolo [242], Weber [265], Artzner et al. [209], Cheridito, Delbaen and Kupper [212] [214] or [213], Detlefsen and Scandolo [73], Frittelli and Gianin [99], Frittelli and Scandolo [100], Gianin [107], Riedel [229], Roorda, Schumacher, and Engwerda [6], or the lecture notes of Peng [220]. Very recently, extending the work of El Karoui and Quenez [79], Klöppel and Schweizer have related dynamic indifference pricing and BSDEs in [158].

In this second part, we extend the axiomatic approach adopted in the static framework and introduce some additional axioms for the risk measures to be time-consistent. We then relate the dynamic version of convex risk measures to BSDEs. The associated dynamic risk measure is called a g-conditional risk measure, where g is the BSDE coefficient. We will see how the properties of both the risk measure and the coefficient g are intimately connected. In particular, one of the key axioms in the characterization of the dynamic convex risk measure will be the translation invariance, and this will force the g-coefficient of the related BSDE to depend only on z.

In the last two sections, we come back to the essential point of this chapter, the optimal risk transfer problem. We first derive some results on the inf-convolution of dynamic convex risk measures and obtain the optimal structure as a solution to the inf-convolution problem. The idea behind our approach is to find a trade-off between static and very abstract risk measures so as to obtain tractable risk measures. Therefore, we are more interested in tractability issues and interpretations of the dynamic risk measures we obtain than in the most general results on BSDEs.
3.5 RECALLS ON BACKWARD STOCHASTIC DIFFERENTIAL EQUATIONS

In the rest of the chapter, we take into account more information on the risk structure. In particular, we assume that the σ-field $\mathcal{F}$ is generated by a d-dimensional Brownian motion on $[0, T]$. Since any bounded $\mathcal{F}_T$-measurable variable is a stochastic integral with respect to the Brownian motion, the risk measures of interest have to be robust with respect to this localization principle. To do that, we consider a family of risk measures described by backward stochastic differential equations (BSDEs). In this section, we introduce general BSDEs, defining them, recalling some key results on existence and uniqueness of a solution, and presenting the comparison theorem. Complete proofs and additional useful results are given in the chapter on BSDEs.

3.5.1 General Framework and Definition

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space on which is defined a d-dimensional Brownian motion $W := (W_t ; t \le T_H)$, where $T_H > 0$ is the time horizon of the study. Let us consider the natural Brownian filtration $(\mathcal{F}_t^0 = \sigma(W_s ; 0 \le s \le t) ; t \ge 0)$ and $(\mathcal{F}_t ; t \le T_H)$ its completion with the $\mathbb{P}$-null sets of $\mathcal{F}$. Denoting by $E$ the expected value with respect to $\mathbb{P}$, we introduce the following spaces, which will be important in the formal setting of BSDEs. Since the time horizon may sometimes be modified, the definitions refer to a generic time $T \le T_H$.

• $L^2_n(\mathcal{F}_t) = \{\eta : \mathcal{F}_t\text{-measurable } \mathbb{R}^n\text{-valued random variable s.t. } E(|\eta|^2) < \infty\}$.
• $\mathcal{P}_n(0, T) = \{(\phi_t ; 0 \le t \le T) : \text{progressively measurable process with values in } \mathbb{R}^n\}$.
• $\mathcal{S}^2_n(0, T) = \{(\phi_t ; 0 \le t \le T) : \phi \in \mathcal{P}_n \text{ s.t. } E[\sup_{t \le T} |\phi_t|^2] < \infty\}$.
• $\mathcal{H}^2_n(0, T) = \{(\phi_t ; 0 \le t \le T) : \phi \in \mathcal{P}_n \text{ s.t. } E[\int_0^T |\phi_s|^2 \, ds] < \infty\}$.
• $\mathcal{H}^1_n(0, T) = \{(\phi_t ; 0 \le t \le T) : \phi \in \mathcal{P}_n \text{ s.t. } E[(\int_0^T |\phi_s|^2 \, ds)^{1/2}] < \infty\}$.

Let us give the definition of the one-dimensional BSDE; the multidimensional case is considered in the book's chapter dedicated to BSDEs.
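Definition 3.15 Let $\xi_T \in L^2(\Omega, \mathcal{F}_T, \mathbb{P})$ be an $\mathbb{R}$-valued terminal condition and $g$ a $\mathcal{P}_1 \otimes \mathcal{B}(\mathbb{R}) \otimes \mathcal{B}(\mathbb{R}^d)$-measurable coefficient. A solution for the BSDE associated with $(g, \xi_T)$ is a pair of progressively measurable processes $(Y_t, Z_t)_{t \le T}$, with values in $\mathbb{R} \times \mathbb{R}^{1 \times d}$, such that

$$\begin{cases}
(Y_t) \in \mathcal{S}^2_1(0, T), \quad (Z_t) \in \mathcal{H}^2_{1 \times d}(0, T), \\
Y_t = \xi_T + \int_t^T g(s, Y_s, Z_s) \, ds - \int_t^T Z_s \, dW_s, \quad 0 \le t \le T.
\end{cases} \tag{3.31}$$

The following differential form is also useful:

$$-dY_t = g(t, Y_t, Z_t) \, dt - Z_t \, dW_t, \qquad Y_T = \xi_T. \tag{3.32}$$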
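To make the backward structure of (3.31) concrete, the following sketch discretizes a one-dimensional BSDE on a recombining binomial tree. This explicit scheme is a standard numerical device and is not taken from the chapter; the function name `solve_bsde_on_tree`, the choice of a terminal condition depending only on $W_T$, and the test coefficient $g(t, y, z) = a z$ are all assumptions made for illustration. For $\xi_T = W_T$ and this linear coefficient, one can check directly from (3.32) that $Y_t = W_t + a(T - t)$ and $Z_t = 1$, so that $Y_0 = aT$.

```python
import math

def solve_bsde_on_tree(terminal, g, horizon=1.0, n_steps=50):
    """Explicit backward scheme for -dY = g(t, Y, Z) dt - Z dW, Y_T = terminal(W_T),
    with Brownian increments approximated by +/- sqrt(dt) with probability 1/2."""
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    # y[j] is the value at the node reached after j up-moves.
    y = [terminal((2 * j - n_steps) * sqrt_dt) for j in range(n_steps + 1)]
    z = [0.0]
    for i in range(n_steps - 1, -1, -1):
        t = i * dt
        new_y, z = [], []
        for j in range(i + 1):
            up, down = y[j + 1], y[j]
            cond_mean = 0.5 * (up + down)             # E[Y_{t+dt} | F_t]
            z_node = (up - down) / (2.0 * sqrt_dt)    # E[Y_{t+dt} * dW | F_t] / dt
            new_y.append(cond_mean + g(t, cond_mean, z_node) * dt)
            z.append(z_node)
        y = new_y
    return y[0], z[0]

# Illustrative test case: xi_T = W_T and g(t, y, z) = a * z with a = 0.3, T = 1.
a = 0.3
y0, z0 = solve_bsde_on_tree(lambda w: w, lambda t, y, z: a * z)
print(y0, z0)
```

The conditional expectation step produces the adapted process $Y$, while the increment-weighted expectation recovers the control process $Z$; evaluating $g$ at the conditional mean keeps the scheme explicit, at the cost of a discretization error for nonlinear coefficients.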
Conventional notation: To simply the writing of the BSDE, we adopt the following notations: the Brownian motion W is described as a column vector (d, 1), and the Z vector is described as a row vector (1, d) such that the notation ZdW has to be understood as a matrix product with (1,1)-dimension. Remark. If ξT and g(t, y, z) are deterministic, then Zt ≡ 0, and (Yt ) is the solution of ODE dyt yT = ξT . = −g(t, yt , 0), dt If the final condition ξT is random, the previous solution t is FT -measurable, and so nonadapted. So we need to introduce the martingale 0 Zs dWs as a control process to obtain an adapted solution. 3.5.2 Some Key Results on BSDEs Before presenting key results of BSDEs, we first summarize the results concerning the existence and uniqueness of a solution. The proofs are given in the chapter 8 dedicated to BSDEs with some complementary results. Existence and Uniqueness Results In the following, we always assume the necessary condition on the terminal condition ξT ∈ L2 (, FT , P). 1. (H1): The standard case (uniformly Lipschitz): (g(t, 0, 0); 0 ≤ t ≤ T ) belongs to H2 (0, T ) and g uniformly Lipschitz continuous with respect to (y, z), i.e., there exists a constant C ≥ 0 such that dP × dt − a.s. ∀(y, y , z, z ) ≤ C(|y − y | + |z − z |).
|g(ω, t, y, z) − g(ω, t, y , z )|
Under these assumptions, Pardoux and Peng [216] proved in 1990 the existence and uniqueness of a solution.

2. (H2) The continuous case with linear growth: there exists a constant $k \ge 0$ such that, $d\mathbb{P} \times dt$-a.s.,

$$\forall (y, z), \qquad |g(\omega, t, y, z)| \le k(1 + |y| + |z|).$$
Moreover, we assume that, $d\mathbb{P} \times dt$-a.s., $g(\omega, t, \cdot, \cdot)$ is continuous in $(y, z)$. Then there exist a maximal and a minimal solution (for a precise definition, please refer to Chapter 8, dedicated to BSDEs), as proved by Lepeltier and San Martin in 1998 [172].

3. (H3) The continuous case with quadratic growth in $z$: In this case, the assumption of square integrability on the solution is too strong. So we only consider bounded solutions and, necessarily, a bounded terminal condition $\xi_T \in L^\infty$. We also suppose that there exists a constant $k \ge 0$ such that, $d\mathbb{P} \times dt$-a.s.,

$$\forall (y, z), \qquad |g(\omega, t, y, z)| \le k(1 + |y| + |z|^2).$$

Moreover, we assume that, $d\mathbb{P} \times dt$-a.s., $g(\omega, t, \cdot, \cdot)$ is continuous in $(y, z)$.
CHAPTER 3
Then there exist a maximal and a minimal bounded solution, as first proved by Kobylansky [159] in 2000 and extended by Lepeltier and San Martin [172] in 1998. The uniqueness of the solution was proved by Kobylansky [159] under the additional conditions that the coefficient $g$ is differentiable in $(y, z)$ on a compact interval $[-K, K] \times \mathbb{R}^d$ and that there exist $c_1 > 0$ and $c_2 > 0$ such that

$$\left|\frac{\partial g}{\partial z}\right| \le c_1(1 + |z|), \qquad \left|\frac{\partial g}{\partial y}\right| \le c_2(1 + |z|^2). \tag{3.33}$$

Comparison Theorem

We first present an important tool in the study of one-dimensional BSDEs: the so-called comparison theorem. It is the equivalent of the maximum principle when working with PDEs.

Theorem 3.16 (Comparison Theorem) Let $(\xi_T^1, g^1)$ and $(\xi_T^2, g^2)$ be two pairs (terminal condition, coefficient) satisfying one of the above conditions (H1), (H2), (H3) (but the same for both pairs). Let $(Y^1, Z^1)$ and $(Y^2, Z^2)$ be the maximal associated solutions.

i) We assume that $\xi_T^1 \le \xi_T^2$, $\mathbb{P}$-a.s., and that, $d\mathbb{P} \times dt$-a.s., $\forall (y, z)$, $g^1(\omega, t, y, z) \le g^2(\omega, t, y, z)$. Then we have

$$Y_t^1 \le Y_t^2 \quad \text{a.s.} \quad \forall t \in [0, T].$$

ii) Strict inequality: Under (H1), a strict version of this result holds: if in addition $Y_t^1 = Y_t^2$ on $B \in \mathcal{F}_t$, then, a.s. on $B$, $\xi_T^1 = \xi_T^2$, $\forall s \ge t$, $Y_s^1 = Y_s^2$, and $g^1(s, Y_s^1, Z_s^1) = g^2(s, Y_s^2, Z_s^2)$ $d\mathbb{P} \times ds$-a.s. on $B \times [t, T]$.

3.6 AXIOMATIC APPROACH AND g-CONDITIONAL RISK MEASURES

In this section, we give a general axiomatic approach for dynamic convex risk measures and see how they are connected to the existing notions of consistent convex price systems and nonlinear expectations, respectively introduced by El Karoui and Quenez [78] and Peng [218]. Then, we relate the dynamic risk measures with BSDEs and focus on the properties of the solution of some particular BSDEs associated with a convex coefficient $g$, called g-conditional risk measures.
3.6.1 Axiomatic Approach

Following the study of static risk measures by Föllmer and Schied [89] and [90], we now propose an axiomatic approach common to dynamic convex risk measures, nonlinear expectations, and convex price systems.
Definition 3.17 Let $(\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t; t \ge 0))$ be a filtered probability space. A dynamic $L^2$-operator ($L^\infty$-operator) $\mathcal{Y}$ with respect to $(\mathcal{F}_t; t \ge 0)$ is a family of continuous semimartingales which maps, for any bounded stopping time $T$, an $L^2(\mathcal{F}_T)$ (resp. $L^\infty(\mathcal{F}_T)$) variable $\xi_T$ onto a process $(\mathcal{Y}_t(\xi_T); t \in [0, T])$. Such an operator is said to be
1. (P1) Convex: For any stopping times $S \le T$, for any $(\xi_T^1, \xi_T^2)$, for any $0 \le \lambda \le 1$,
$\mathcal{Y}_S(\lambda \xi_T^1 + (1 - \lambda)\xi_T^2) \le \lambda \mathcal{Y}_S(\xi_T^1) + (1 - \lambda)\mathcal{Y}_S(\xi_T^2)$ $\mathbb{P}$-a.s.
2. (P2) Monotonic: For any stopping times $S \le T$, for any $(\xi_T^1, \xi_T^2)$ such that $\xi_T^1 \ge \xi_T^2$ a.s.,
(P2+): the operator is increasing if $\mathcal{Y}_S(\xi_T^1) \ge \mathcal{Y}_S(\xi_T^2)$ a.s.,
(P2-): the operator is decreasing if $\mathcal{Y}_S(\xi_T^1) \le \mathcal{Y}_S(\xi_T^2)$ a.s.
3. (P3) Translation invariant: For any stopping times $S \le T$ and any $\eta_S \in \mathcal{F}_S$, for any $\xi_T$,
(P3+) $\mathcal{Y}_S(\xi_T + \eta_S) = \mathcal{Y}_S(\xi_T) + \eta_S$ a.s.,
(P3-) $\mathcal{Y}_S(\xi_T + \eta_S) = \mathcal{Y}_S(\xi_T) - \eta_S$ a.s.
4. (P4) Time-consistent: For $S \le T \le U$ three bounded stopping times, for any $\xi_U$,
(P4+) $\mathcal{Y}_S(\xi_U) = \mathcal{Y}_S(\mathcal{Y}_T(\xi_U))$ a.s.,
(P4-) $\mathcal{Y}_S(\xi_U) = \mathcal{Y}_S(-\mathcal{Y}_T(\xi_U))$ a.s.
5. (P5) Arbitrage-free: For any stopping times $S \le T$, and for any $(\xi_T^1, \xi_T^2)$ such that $\xi_T^1 \ge \xi_T^2$,
$\mathcal{Y}_S(\xi_T^1) = \mathcal{Y}_S(\xi_T^2)$ on $A_S = \{S < T\}$ $\Rightarrow$ $\xi_T^1 = \xi_T^2$ a.s. on $A_S$.
6. (P6) Conditionally invariant: For any stopping times $S \le T$ and any $B \in \mathcal{F}_S$, for any $\xi_T$,
$\mathcal{Y}_S(\mathbf{1}_B \xi_T) = \mathbf{1}_B \mathcal{Y}_S(\xi_T)$ a.s.
7. (P7) Positive homogeneous: For any stopping times $S \le T$, for any $\lambda_S \ge 0$ ($\lambda_S \in \mathcal{F}_S$) and for any $\xi_T$,
$\mathcal{Y}_S(\lambda_S \xi_T) = \lambda_S \mathcal{Y}_S(\xi_T)$ a.s.

First, note that the property (P5) of no-arbitrage implies that the monotonicity property (P2) is strict. Most axioms have two different versions, depending on the sign involved. Making such a distinction is completely coherent with the previous observations in the static part of this chapter about the relationship between price and risk measure: since the opposite of a risk measure is a price, the axioms with a "+" sign are related to the characterization of a price system, while the axioms with a "−" sign are related to that of a dynamic risk measure.

In [78], when studying pricing problems under constraints, El Karoui and Quenez defined a consistent convex (forward) price system as a convex (P1), increasing (P2+), time-consistent (P4+) dynamic operator $\mathcal{P}_t$, without arbitrage (P5). Time consistency (P4+) may be viewed as a dynamic programming principle.

During the same period, Peng introduced the notion of nonlinear expectation as a translation invariant (P3+) convex price system, satisfying the conditional invariance property (P6), which is very intuitive in this framework (see, for instance, Peng [218]). Note that the conditional invariance property (P6) implies some additional assumptions on the operator $\mathcal{Y}$: in particular, for any $t$, $\mathcal{Y}_t(0) = 0$. In the following, we denote the nonlinear expectation by $\mathcal{E}$.

From now on, we focus on dynamic convex risk measures, for which only the properties with the "−" sign hold.

Definition 3.18 A dynamic operator satisfying the axioms of convexity (P1), decreasing monotonicity (P2-), translation invariance (P3-), time-consistency (P4-), and arbitrage-freeness (P5) is said to be a dynamic convex risk measure. It will be denoted by $\mathcal{R}$ in the following. If $\mathcal{R}$ also satisfies the positive homogeneity property (P7), then it is called a dynamic coherent risk measure.

Note that a nonlinear expectation defines a dynamic risk measure that is conditionally invariant and centered.

Remark. It is not obvious how to find a negligible set $N$ such that for any bounded stopping time $S$ and any bounded $\xi_T$, $\forall \omega \notin N$, $\xi_T \mapsto \mathcal{R}_S(\omega, \xi_T)$ is a static convex risk measure. The negligible sets may depend on the variable $\xi_T$ itself.

Dynamic Entropic Risk Measure

A typical example is the dynamic entropic risk measure, obtained by conditioning the static entropic risk measure. For any bounded $\xi_T$,

$$e_\gamma(\xi_T) = \gamma \ln \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right)\right] \quad \Rightarrow \quad e_{\gamma,t}(\xi_T) = \gamma \ln \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right) \Big| \mathcal{F}_t\right].$$

Since $\xi_T$ is bounded, $e_{\gamma,t}(\xi_T)$ is bounded for any $t$. Therefore, this dynamic operator defined on $L^\infty$ satisfies the properties of dynamic convex risk measures.
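Convexity, decreasing monotonicity, translation invariance, and no-arbitrage are obvious; the time-consistency property (P4-) results from the transitivity of conditional expectation:

$$\forall t \ge 0, \ \forall h > 0, \qquad e_{\gamma,t}(\xi_T) = e_{\gamma,t}\big(-e_{\gamma,t+h}(\xi_T)\big) \quad \text{a.s.}$$

We give the easy proof of this identity to help the reader understand the (−) sign in the formula.

Proof. Let

$$Y_{t+h} \equiv e_{\gamma,t+h}(\xi_T) = \gamma \ln \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right) \Big| \mathcal{F}_{t+h}\right].$$

Then

$$\begin{aligned}
e_{\gamma,t}\big(-e_{\gamma,t+h}(\xi_T)\big) &= \gamma \ln \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}(-Y_{t+h})\right) \Big| \mathcal{F}_t\right] \\
&= \gamma \ln \mathbb{E}\left[\exp\left(\frac{1}{\gamma}\,\gamma \ln \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right) \Big| \mathcal{F}_{t+h}\right]\right) \Big| \mathcal{F}_t\right] \\
&= \gamma \ln \mathbb{E}\left[\mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right) \Big| \mathcal{F}_{t+h}\right] \Big| \mathcal{F}_t\right] = e_{\gamma,t}(\xi_T). \qquad \square
\end{aligned}$$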
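The sign in the time-consistency identity for the dynamic entropic risk measure can be checked numerically. The following is a minimal sketch on a two-period coin-flip model; the payoff values, the probabilities, and the value of the risk tolerance are illustrative assumptions, not taken from the text.

```python
import math
from itertools import product

GAMMA = 2.0  # risk tolerance (illustrative value)

def entropic(values, probs, gamma=GAMMA):
    """Static entropic risk measure: gamma * ln E[exp(-X / gamma)]."""
    return gamma * math.log(sum(p * math.exp(-x / gamma) for x, p in zip(values, probs)))

# Two fair coin flips; xi_T is a bounded payoff depending on the whole path.
paths = list(product([0, 1], repeat=2))
payoff = {(0, 0): -1.0, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 3.0}

# Direct evaluation at time 0: e_{gamma,0}(xi_T).
direct = entropic([payoff[p] for p in paths], [0.25] * 4)

# Intermediate risk e_{gamma,1}(xi_T), conditional on the first flip.
e1 = {a: entropic([payoff[(a, b)] for b in (0, 1)], [0.5, 0.5]) for a in (0, 1)}

# Time consistency (P4-): apply e_{gamma,0} to MINUS the intermediate risk.
nested = entropic([-e1[a] for a in (0, 1)], [0.5, 0.5])

# Without the minus sign the identity fails.
wrong = entropic([e1[a] for a in (0, 1)], [0.5, 0.5])

print(direct, nested, wrong)
```

In this sketch `direct` and `nested` agree up to rounding, while `wrong` differs, which illustrates why the axiom (P4-) carries a minus sign for risk measures.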
Moreover, it is possible to relate the dynamic entropic risk measure $e_{\gamma,t}$ with the solution of a BSDE, as follows.

Proposition 3.12 The dynamic entropic measure $(e_{\gamma,t}(\xi_T); t \in [0, T])$ is solution of the following BSDE with the quadratic coefficient $g(t, z) = \frac{1}{2\gamma}\|z\|^2$ and bounded terminal condition $\xi_T$:

$$-de_{\gamma,t}(\xi_T) = \frac{1}{2\gamma}\|Z_t\|^2\,dt - Z_t\,dW_t, \qquad e_{\gamma,T}(\xi_T) = -\xi_T. \tag{3.34}$$

Proof. Let us denote $M_t(\xi_T) = \mathbb{E}\left[\exp\left(-\frac{1}{\gamma}\xi_T\right) \big| \mathcal{F}_t\right]$. As $M$ is a positive and bounded continuous martingale, one can use the multiplicative decomposition to get $dM_t = \frac{1}{\gamma} M_t (Z_t\,dW_t)$, where $(Z_t; t \ge 0)$ is a $1 \times d$ dimensional square-integrable process. By Itô's formula applied to the function $\gamma \ln(x)$, we obtain Equation (3.34). Note that, taking conditional expectations in (3.34), the conditional expectation of the quadratic variation satisfies

$$\frac{1}{2\gamma}\,\mathbb{E}\left[\int_t^T |Z_s|^2\,ds \,\Big|\, \mathcal{F}_t\right] = \mathbb{E}\left[e_{\gamma,t}(\xi_T) + \xi_T \,\big|\, \mathcal{F}_t\right]$$

and is therefore bounded. Conversely, if Equation (3.34) has a solution $(Y, Z)$ such that $Y_T$ and $\mathbb{E}\left[\int_t^T |Z_s|^2\,ds \,\big|\, \mathcal{F}_t\right]$ are bounded, then $Y$ is bounded. This point will be detailed in Theorem 3.25. $\square$

This relationship between the dynamic entropic risk measure and BSDEs can be extended to general dynamic convex risk measures, as we will see in the rest of this section.

3.6.2 Dynamic Convex Risk Measures and BSDEs

This section is about the relationship between dynamic convex risk measures and BSDEs. More precisely, we are interested in the correspondence between the properties of the "BSDE" operator and those of the coefficient. We consider the dynamic operator generated by the maximal solution of a BSDE:

Definition 3.19 Let $g$ be a standard coefficient. The g-dynamic operator, denoted by $\mathcal{Y}^g$, is such that $\mathcal{Y}_t^g(\xi_T)$ is the maximal solution of the BSDE($g, \xi_T$).

As a consequence, the adopted point of view is different from that of the section dedicated to recalls on BSDEs, where the terminal condition of the BSDE was fixed. It is easy to deduce properties of the g-dynamic operator from those of the coefficient $g$. The converse is more complex, and this study was initiated by Peng when considering g-expectations ([218]). Our characterization is based upon the following lemma.
Lemma 3.20 (Coefficient Uniqueness) Let $g^1$ and $g^2$ be two regular coefficients, such that uniqueness of solution for the BSDE($g^1$) holds. Let $\mathcal{Y}^{g^i}$ be the $g^i$-dynamic operator ($i = 1, 2$). Assume that

$$\forall (T, \xi_T), \qquad d\mathbb{P} \times dt\text{-a.s.}, \qquad \mathcal{Y}_t^{g^1}(\xi_T) = \mathcal{Y}_t^{g^2}(\xi_T).$$
i) If the coefficients $g^1$ and $g^2$ simply depend on $t$ and $z$, then, $d\mathbb{P} \times dt$-a.s., $\forall z$, $g^1(t, z) = g^2(t, z)$.

ii) In the general case, the same identity holds provided the coefficients are continuous w.r. to $t$: $d\mathbb{P} \times dt$-a.s., $\forall (y, z)$, $g^1(t, y, z) = g^2(t, y, z)$.

Proof. Suppose that both coefficients $g^1$ and $g^2$ generate the same solution $Y$ (but a priori different processes $Z^1$ and $Z^2$) for the BSDEs $(g^1, \xi_T)$ and $(g^2, \xi_T)$, for any $\xi_T$ in the appropriate space ($L^2$ or $L^\infty$). Given the uniqueness of the decomposition of the semimartingale $Y$, the martingale parts and the finite variation processes of both decompositions of $Y$ are indistinguishable. In particular, $\int_0^t Z_s^1\,dW_s = \int_0^t Z_s^2\,dW_s$ a.s., and $\int_0^t g^1(s, Y_s, Z_s^1)\,ds = \int_0^t g^2(s, Y_s, Z_s^2)\,ds$. Therefore, $\int_0^t g^1(s, Y_s, Z_s)\,ds = \int_0^t g^2(s, Y_s, Z_s)\,ds$. A priori, these equalities only hold for processes $(Y, Z)$ obtained through BSDEs.

i) Assume that $g^1$ and $g^2$ do not depend on $y$. Given a bounded adapted process $Z$, we consider the following locally bounded semimartingale $U$: $dU_t = g^1(t, Z_t)\,dt - Z_t\,dW_t$; $U_0 = u_0$. $(U, Z)$ is the solution of the BSDE($g^1, U_{T \wedge \tau}$), where $\tau$ is a stopping time such that $U_{T \wedge \tau}$ is bounded. By uniqueness, $\int_0^{t \wedge \tau} g^1(s, Z_s)\,ds = \int_0^{t \wedge \tau} g^2(s, Z_s)\,ds$. As shown in b) below, this equality implies that $g^1(s, Z_s) = g^2(s, Z_s)$, a.s. $ds \times d\mathbb{P}$. Thanks to the continuity of $g^1$ and $g^2$ w.r. to $z$, we can restrict attention to the denumerable set of rational $z$ to show that, with $dt \times d\mathbb{P}$ probability one, for any $z$, $g^1(s, z) = g^2(s, z)$.

ii1) In the general case, given a bounded process $Z$, we consider a solution $Y$ of the following forward stochastic differential equation: $dY_t = g^1(t, Y_t, Z_t)\,dt - Z_t\,dW_t$; $Y_0 = y_0$, and the stopping time $\tau_N$ defined as the first time when $|Y|$ crosses the level $N$. The pair of processes $(Y_{\tau_N \wedge t}, Z_t \mathbf{1}_{]0, \tau_N]}(t))$ is solution of the BSDE with bounded terminal condition $\xi_T = Y_{\tau_N \wedge T}$.
Consequently, the pair of stochastic processes $(Y_{\tau_N \wedge t}, Z_t \mathbf{1}_{]0, \tau_N]}(t))$ is also solution of the BSDE($g^2, Y_{\tau_N \wedge T}$). Hence, both stochastic processes $\int_0^t g^1(s, Y_s, Z_s)\,ds$ and $\int_0^t g^2(s, Y_s, Z_s)\,ds$ are indistinguishable on $]0, \tau_N \wedge T]$. Since $\tau_N$ goes to infinity with $N$, the equality holds at any time, for any bounded process $Z$.

ii2) Assume $g^1(s, y, z)$ and $g^2(s, y, z)$ continuous w.r. to $s$. Let $z$ be a given vector. Let $Y^z_{t+h}$ be a forward perturbation of a general solution $Y$, at the level $z$ between $t$ and $t + h$:

$$Y_u^z = Y_t + \int_t^u g^1(s, Y_s^z, z)\,ds - \int_t^u z\,dW_s \qquad \forall u \in [t, t + h].$$

By assumption, $(Y_u^z, z)$ is also solution of the BSDE($g^2, Y^z_{t+h}$), for $u \in [t, t + h]$, and

$$\int_t^u g^1(s, Y_s^z, z)\,ds = \int_t^u g^2(s, Y_s^z, z)\,ds.$$
Hence, by continuity, $\frac{1}{h}\mathbb{E}\left[Y^{z,i}_{t+h} - Y_t \,\big|\, \mathcal{F}_t\right]$ goes in $L^1$ to $g^i(t, Y_t, z)$ ($i = 1, 2$) as $h \to 0$. Then $g^1(t, Y_t, z) = g^2(t, Y_t, z)$ for any solution $Y_t$ of the BSDE, i.e., for any $\mathcal{F}_t$-measurable random variable. $\square$

Comments: i) Peng ([218] and [220]) and Briand et al. [210] were among the first to look at dynamic operators in order to deduce local properties through the coefficient $g$ of the associated BSDE, when considering nonlinear expectations. More recently, Jiang considered the applications of g-expectations in finance in his PhD thesis [142].

ii) In [210], Briand et al. proved a more accurate result for $g$ Lipschitz. More precisely, let $g$ be a standard coefficient such that, $\mathbb{P}$-a.s., $t \mapsto g(t, y, z)$ is continuous and $g(t, 0, 0) \in S^2$. Let us fix $(t, y, z) \in [0, T] \times \mathbb{R} \times \mathbb{R}^d$ and consider, for each $n \in \mathbb{N}^*$, $\{(Y_s^n, Z_s^n); s \in [t, t_n = t + \frac{1}{n}]\}$, solution of the BSDE $(g, X_n)$, where the terminal condition $X_{t_n}$ is given by $X_{t_n} = y + z(W_{t_n} - W_t)$. Then for each $(t, y, z) \in [0, T] \times \mathbb{R} \times \mathbb{R}^d$, we have

$$L^2\text{-}\lim_{n \to \infty} n\,(Y_t^n - y) = g(t, y, z).$$
Some properties automatically hold for the dynamic operator $\mathcal{Y}^g$ simply because it is the maximal solution of a BSDE. Some others can be obtained by imposing conditions on the coefficient $g$:

Theorem 3.21 Let $\mathcal{Y}^g$ be the g-dynamic operator.

i) Then, $\mathcal{Y}^g$ is increasing monotonic (P2+), time-consistent (P4+), and arbitrage-free (P5).

ii) Moreover, under the assumptions of Lemma 3.20,
1. $\mathcal{Y}^g$ is conditionally invariant (P6) if and only if for any $t \in [0, T]$, $z \in \mathbb{R}^n$, $g(t, 0, 0) = 0$.
2. $\mathcal{Y}^g$ is translation invariant (P3+) if and only if $g$ does not depend on $y$.
3. $\mathcal{Y}^g$ is homogeneous if and only if $g$ is homogeneous.

iii) For properties related to the order, only the following implications hold:
1. If $g$ is convex, then $\mathcal{Y}^g$ is convex (P1).
2. If $g^1 \le g^2$, then $\mathcal{Y}^{g^1} \le \mathcal{Y}^{g^2}$.

Therefore, if $g$ is a convex coefficient depending only on $z$, $\mathcal{R}^g(\xi_T) \equiv \mathcal{Y}^g(-\xi_T)$ is a dynamic convex risk measure, called g-conditional risk measure. Note that $\mathcal{Y}^g$ is a consistent convex price system and, moreover, if for any $t \in [0, T]$, $g(t, 0) = 0$, then $\mathcal{Y}^g$ is a nonlinear expectation, called g-expectation.

Proof. i) The strict version of the comparison Theorem 3.16 leads immediately to both properties (P2+) and (P5).
Up to now, we have defined and considered BSDEs with a terminal condition at a fixed given time $T$. It is always possible to consider such an equation as a BSDE with a time horizon $T_H \ge T$, even if $T_H$ is a bounded stopping time. The coefficient $g$ then has to be extended as $g\mathbf{1}_{[0,T]}$ and the terminal condition as $\xi_{T_H} = \xi_T$. Therefore, the solution $Y_t$ is constant on $[T, T_H]$.

To obtain the time-consistency property (P4), also called the flow property, we consider three bounded stopping times $S \le T \le U$ and write the solution of the BSDEs as a function of the terminal date. With obvious notations, we want to prove that

$$Y_S(T, Y_T(U, \xi_U)) = Y_S(U, \xi_U) \quad \text{a.s.}$$

By simply noting that

$$\begin{aligned}
Y_S(T, Y_T(U, \xi_U)) &= Y_T(U, \xi_U) + \int_S^T g(t, Z_t)\,dt - \int_S^T Z_t\,dW_t \\
&= \xi_U + \int_T^U g(t, Z_t)\,dt - \int_T^U Z_t\,dW_t + \int_S^T g(t, Z_t)\,dt - \int_S^T Z_t\,dW_t,
\end{aligned}$$
the process, which is defined as $Y_t(T, Y_T(U, \xi_U))$ on $[0, T]$ and by $Y_t(U, \xi_U)$ on $]T, U]$, is the maximal solution of the BSDE $(g, \xi_U, U)$. Uniqueness of the maximal solution implies (P4).

ii) The three properties (ii1), (ii2), and (ii3) involve the same type of arguments, so we simply present the proof for (ii2). Let $g_m(t, y, z) = g(t, y + m, z)$. We simply note that $Y_\cdot^m = Y_\cdot(\xi_T + m) - m$ is the maximal solution of the BSDE $(g_m, \xi_T)$. The translation invariance property is equivalent to the indistinguishability of both processes $Y$ and $Y^m$; by the uniqueness Lemma 3.20, this property is equivalent to the identity

$$g(t, y, z) = g_m(t, y, z) = g(t, y + m, z) \quad \text{a.s.}$$
This implies that $g$ does not depend on $y$.

iii) We first consider the proof of (iii1). For the convexity property, we consider different BSDEs: $(Y_t^1, Z_t^1)$ is the (maximal) solution of $(g, \xi_T^1)$ and $(Y_t^2, Z_t^2)$ is the (maximal) solution of $(g, \xi_T^2)$. Then, we look at $\tilde{Y}_t = \lambda Y_t^1 + (1 - \lambda) Y_t^2$, with $\lambda \in [0, 1]$, and similarly $\tilde{Z}_t = \lambda Z_t^1 + (1 - \lambda) Z_t^2$. We have

$$-d\tilde{Y}_t = \big(\lambda g(t, Y_t^1, Z_t^1) + (1 - \lambda) g(t, Y_t^2, Z_t^2)\big)\,dt - \big(\lambda Z_t^1 + (1 - \lambda) Z_t^2\big)\,dW_t, \qquad \tilde{Y}_T = \lambda \xi_T^1 + (1 - \lambda)\xi_T^2.$$

Since $g$ is convex, we can rewrite this BSDE as

$$-d\tilde{Y}_t = \big(g(t, \tilde{Y}_t, \tilde{Z}_t) + \alpha(t, Y_t^1, Y_t^2, Z_t^1, Z_t^2, \lambda)\big)\,dt - \tilde{Z}_t\,dW_t,$$

where $\alpha$ is an a.s. nonnegative process. Hence, using the comparison theorem, the solution $\tilde{Y}_t$ of this BSDE is, for any $t \in [0, T]$, a.s. greater than the solution $Y_t$ of the BSDE $(g, \lambda \xi_T^1 + (1 - \lambda)\xi_T^2)$. It is a supersolution in the sense of Definition 2.1 of El Karoui, Peng, and Quenez [202].

Finally, (iii2) is a direct consequence of the comparison Theorem 3.16. $\square$
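Some of the properties in Theorem 3.21 can be observed on a discretized g-dynamic operator. The sketch below is an assumption-laden illustration, not the construction used in the text: it approximates $\mathcal{Y}_0^g(\xi_T)$ for a one-dimensional Brownian motion by an explicit backward scheme on a recombining binomial tree, with $\xi_T$ a function of $W_T$ only, and with $Z$ approximated by the scaled increment of $Y$ across the two child nodes. The function name `g_operator`, the payoffs, and all parameter values are hypothetical.

```python
import math

def g_operator(terminal, g, T=1.0, n=50):
    """Approximate Y_0 for -dY = g(t, Z) dt - Z dW, Y_T = terminal(W_T),
    with an explicit backward scheme on a recombining binomial tree."""
    dt = T / n
    sq = math.sqrt(dt)
    # Node j at step i carries the Brownian level (2*j - i) * sqrt(dt).
    y = [terminal((2 * j - n) * sq) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):
        t = i * dt
        y = [
            0.5 * (y[j + 1] + y[j])                    # one-step conditional expectation
            + g(t, (y[j + 1] - y[j]) / (2 * sq)) * dt  # driver evaluated at the discrete Z
            for j in range(i + 1)
        ]
    return y[0]

GAMMA = 1.0
quad = lambda t, z: z * z / (2 * GAMMA)  # convex coefficient, independent of y

xi1 = lambda w: max(w, 0.0)
xi2 = lambda w: -min(abs(w), 1.0)
lam, m = 0.3, 0.7

y1 = g_operator(xi1, quad)
y2 = g_operator(xi2, quad)
y_mix = g_operator(lambda w: lam * xi1(w) + (1 - lam) * xi2(w), quad)
y_shift = g_operator(lambda w: xi1(w) + m, quad)
y_lin = g_operator(lambda w: -w, quad)  # terminal condition -W_T, for which Z is constant

print(y1, y2, y_mix, y_shift, y_lin)
```

With these choices, `y_shift` equals `y1 + m` up to rounding (translation invariance, since the coefficient does not depend on $y$), `y_mix` does not exceed `lam * y1 + (1 - lam) * y2` (convexity), and `y_lin` equals $T/(2\gamma)$, the value obtained from (3.34) when $Z$ is constant.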
Additional Comments on the Relationship between BSDEs and Dynamic Operators

Since 1995, Peng has focused on finding conditions on dynamic operators so that they are linear growth g-expectations. This difficult problem is solved in particular for dynamic operators satisfying a domination assumption introduced by Peng [218] in 1997, where $b_k(z) = k|z|$. For more details, please refer to his lecture notes on BSDEs and dynamic operators [220].

Theorem 3.22 Let $(\mathcal{E}_t; 0 \le t \le T)$ be a nonlinear expectation such that there exist $|\lambda| \in H^2$ and a sufficiently large real number $k > 0$ such that for any $t \in [0, T]$ and any $\xi_T \in L^2(\mathcal{F}_T)$:

$$\mathcal{E}_t^{-b_k + |\lambda|}(\xi_T) \le \mathcal{E}_t(\xi_T) \le \mathcal{E}_t^{b_k + |\lambda|}(\xi_T) \quad \text{a.s.},$$

and for any $(\xi_T^1, \xi_T^2) \in L^2(\mathcal{F}_T)$: $\mathcal{E}_t(\xi_T^1) - \mathcal{E}_t(\xi_T^2) \le \mathcal{E}_t^{b_k}(\xi_T^1 - \xi_T^2)$. Then, there exists a function $g(t, y, z)$ satisfying assumption (H1), such that for any $t \in [0, T]$,

$$\forall \xi_T \in L^2(\mathcal{F}_T), \quad \text{a.s.}, \quad \forall t, \qquad \mathcal{E}_t(\xi_T) = \mathcal{E}_t^g(\xi_T).$$
For a proof of this theorem, please refer to Peng [219].
Infinitesimal Risk Management

The coefficient of any g-conditional risk measure $\mathcal{R}^g$ can be viewed as the infinitesimal risk measure over a time interval $[t, t + dt]$, as

$$\mathbb{E}_{\mathbb{P}}\left[d\mathcal{R}_t^g \,\big|\, \mathcal{F}_t\right] = -g(t, Z_t)\,dt,$$

where $Z_t$ is the local volatility of the g-conditional risk measure. Therefore, choosing the coefficient $g$ carefully makes it possible to generate g-conditional risk measures that are locally compatible with the views and practice of the different agents in the market. In other words, knowing the infinitesimal measure of risk used by the agents is enough to generate a dynamic risk measure that is locally compatible. In this sense, the g-conditional risk measure may appear more tractable than static risk measures.

The following example gives a good intuition of this idea: the g-conditional risk measure corresponding to the mean-variance paradigm has a g-coefficient of the type $g(t, z) = -\lambda_t z + \frac{1}{2} z^2$. The process $\lambda_t$ can be interpreted as the correlation with the market numéraire.

Therefore, g-conditional risk measures are a way to construct a wide family of convex risk measures on a probability space with Brownian filtration, taking into account the ability to decompose the risk through intertemporal local risk measures $g(t, Z_t)$.

In the following, to study g-conditional risk measures, we adopt the same methodology as in the static framework. In particular, we start by developing a dual representation for these dynamic risk measures, in terms of the "dual function" of their coefficient. This study requires some general properties of convex functions on $\mathbb{R}^n$.
3.7 DUAL REPRESENTATION OF g-CONDITIONAL RISK MEASURES

Following the approach adopted in the first part of this chapter when studying static risk measures, we now focus on a dual representation for g-conditional risk measures. The main tool is the Legendre-Fenchel transform $G$ of the coefficient $g$, defined by

$$G(t, \mu) = \sup_{z \in \mathbb{Q}^n_{\text{rational}}} \{\langle \mu, -z \rangle - g(t, z)\}. \tag{3.35}$$

The convex function $G$ is also called the polar function or the conjugate of $g$. Provided that $g$ is continuous,

$$g(t, z) = \sup_{\mu \in \mathbb{Q}^n_{\text{rational}}} \{\langle \mu, -z \rangle - G(t, \mu)\}. \tag{3.36}$$
More precisely,

Definition 3.23 A g-conditional risk measure $\mathcal{R}^g$ is said to have a dual representation if there exists a set $\mathcal{A}$ of admissible controls such that for any bounded stopping time $S \le T$ and any $\xi_T$ in the appropriate space

$$\mathcal{R}_S^g(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}} \mathbb{E}_{\mathbb{Q}^\mu}\left[-\xi_T - \int_S^T G(t, \mu_t)\,dt \,\Big|\, \mathcal{F}_S\right], \tag{3.37}$$

where $\mathbb{Q}^\mu$ is a probability measure absolutely continuous with respect to $\mathbb{P}$. The dual representation is said to be exact at $\bar{\mu}$ if the ess sup is reached for $\bar{\mu}$.
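The conjugacy relations (3.35) and (3.36) can be checked numerically in dimension one. The sketch below is only an illustration under stated assumptions: the supremum is taken over a finite grid instead of the rationals, the helper names `conjugate` and `biconjugate` are hypothetical, and the closed forms used for comparison ($\frac{\gamma}{2}\mu^2$ for the entropic coefficient $\frac{1}{2\gamma}z^2$, and $\frac{1}{2}(\mu - \lambda)^2$ for the mean-variance coefficient $-\lambda z + \frac{1}{2}z^2$ with constant $\lambda$) follow from elementary calculus applied to (3.35).

```python
def conjugate(g, mu, grid):
    """G(mu) = sup_z { <mu, -z> - g(z) } in dimension one, sup taken over a finite grid."""
    return max(-mu * z - g(z) for z in grid)

def biconjugate(G, z, grid):
    """g(z) = sup_mu { <mu, -z> - G(mu) }, again over a finite grid."""
    return max(-mu * z - G(mu) for mu in grid)

GAMMA, LAM = 2.0, 0.4                           # illustrative parameter values
grid = [k / 100.0 for k in range(-2000, 2001)]  # step 0.01 on [-20, 20]

entropic_g = lambda z: z * z / (2 * GAMMA)      # coefficient of the entropic risk measure
meanvar_g = lambda z: -LAM * z + 0.5 * z * z    # mean-variance type coefficient

mu0, z0 = 0.75, 1.0
G_ent = conjugate(entropic_g, mu0, grid)        # compare with GAMMA * mu0**2 / 2
G_mv = conjugate(meanvar_g, mu0, grid)          # compare with (mu0 - LAM)**2 / 2
g_back = biconjugate(lambda mu: GAMMA * mu * mu / 2, z0, grid)  # compare with entropic_g(z0)

print(G_ent, GAMMA * mu0 ** 2 / 2)
print(G_mv, (mu0 - LAM) ** 2 / 2)
print(g_back, entropic_g(z0))
```

For the entropic coefficient, the supremum in (3.36) is attained at $\bar{\mu} = -z/\gamma$, which is the kind of optimal control that makes the dual representation exact.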
In order to obtain this representation, two intermediate steps are needed:

1. Refine results on Girsanov theorem and the integrability properties of martingales with respect to change of probability measures.
2. Refine results from convex analysis on the Legendre-Fenchel transform and the existence of an optimal control in both Formulas (3.35) and (3.36), including measurability properties.

The next paragraph gives a summary of the main results that are needed.

3.7.1 Girsanov Theorem and BMO-Martingales

Our main reference on Girsanov theorem and BMO-martingales is the book by Kazamaki [157]. The exponential martingale associated with the d-dimensional Brownian motion $W$,

$$\Gamma_t^\mu = \mathcal{E}\left(\int_0^t \mu_s\,dW_s\right) = \exp\left(\int_0^t \mu_s\,dW_s - \frac{1}{2}\int_0^t |\mu_s|^2\,ds\right),$$

solution of the forward stochastic equation

$$d\Gamma_t^\mu = \Gamma_t^\mu\,\mu_t^*\,dW_t, \qquad \Gamma_0^\mu = 1, \tag{3.38}$$

is a positive local martingale, if $\mu$ is an adapted process such that $\int_0^T |\mu_s|^2\,ds < \infty$.
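For a constant, one-dimensional $\mu$, the exponential martingale of (3.38) is an explicit function of $W_T$, so its role as a density can be checked by direct integration against the Gaussian law of $W_T$. The sketch below is a hedged illustration: the parameter values, the truncation range, and the helper `gaussian_expectation` are assumptions made for this example. It verifies that the density has unit expectation and that, once weighted by it, $W_T - \mu T$ is centered with variance $T$, in line with the Girsanov change of measure discussed next.

```python
import math

def gaussian_expectation(f, var, half_width=12.0, n=24000):
    """E[f(W)] for W ~ N(0, var), by the trapezoidal rule on a truncated range."""
    sd = math.sqrt(var)
    a, b = -half_width * sd, half_width * sd
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        w = a + k * h
        weight = 0.5 if k in (0, n) else 1.0
        pdf = math.exp(-w * w / (2 * var)) / math.sqrt(2 * math.pi * var)
        total += weight * f(w) * pdf
    return total * h

MU, T = 0.8, 2.0  # constant control and horizon (illustrative values)

def density(w):
    """Terminal value of the exponential martingale for constant mu, as a function of W_T."""
    return math.exp(MU * w - 0.5 * MU * MU * T)

mass = gaussian_expectation(density, T)                                    # should be 1
mean_w = gaussian_expectation(lambda w: density(w) * w, T)                 # mean of W_T after reweighting
var_shifted = gaussian_expectation(lambda w: density(w) * (w - MU * T) ** 2, T)

print(mass, mean_w, var_shifted)
```

Under these assumptions the three printed values are close to $1$, $\mu T$, and $T$ respectively.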
When $\Gamma^\mu$ is a uniformly integrable (u.i.) martingale, $\Gamma_T^\mu$ is the density (w.r. to $\mathbb{P}$) of a new probability measure denoted by $\mathbb{Q}^\mu$. Moreover, if $W$ is a $\mathbb{P}$-Brownian motion, then $W_t^\mu = W_t - \int_0^t \mu_s\,ds$ is a $\mathbb{Q}^\mu$-Brownian motion.

Questions around Girsanov theorem are of two main types. They mainly consist of
• first, finding conditions on $\mu$ so that $\Gamma^\mu$ is a u.i. martingale,
• second, giving some precision on the integrability properties that are preserved under the new probability measure.

The bounded case, which is recalled below, is well known. The BMO case is less standard, so we give more details.

Change of Probability Measures with Bounded Coefficient

When $\mu$ is bounded, it is well known that the exponential martingale belongs to all $H^p$-spaces. Moreover, if a process is in $H^2(\mathbb{P})$, it is in $H^{1+\varepsilon}(\mathbb{Q}^\mu)$. In particular, if $M_t^Z = \int_0^t Z_s\,dW_s$ is an $H^2(\mathbb{P})$-martingale, then $\tilde{M}_t^Z = \int_0^t Z_s\,dW_s^\mu$ is a u.i. martingale under $\mathbb{Q}^\mu$, with null $\mathbb{Q}^\mu$-expectation.

Change of Probability Measures with BMO-Martingale

The right extension of the space of bounded processes is the space of BMO processes, defined as

$$\mathrm{BMO}(\mathbb{P}) = \left\{\varphi \in H^2 \ \text{s.t.} \ \exists C \ \forall t \quad \mathbb{E}\left[\int_t^T |\varphi_s|^2\,ds \,\Big|\, \mathcal{F}_t\right] \le C \ \text{a.s.}\right\}.$$

The smallest constant $C$ such that the previous inequality holds is denoted by $C^* = \|\varphi\|^2_{\mathrm{BMO}}$. In terms of martingales, the stochastic integral $\int_0^t \varphi_s\,dW_s$ is said to be a BMO($\mathbb{P}$)-martingale if and only if the process $\varphi$ belongs to BMO($\mathbb{P}$). The following deep result is proved in Kazamaki [157] (Section 3.3).
Theorem 3.24 Let the adapted process $\mu$ be in BMO($\mathbb{P}$). Then

1. The exponential martingale $\Gamma^\mu$ is a u.i. martingale and defines a new equivalent probability measure $\mathbb{Q}^\mu$. Moreover, $W_t^\mu = W_t - \int_0^t \mu_s\,ds$ is a $\mathbb{Q}^\mu$-Brownian motion.
2. $M_t^\mu = \int_0^t \mu_s^*\,dW_s$, and more generally any BMO($\mathbb{P}$)-martingale $M_t^Z = \int_0^t Z_s\,dW_s$, are transformed into continuous processes $\tilde{M}_t^\mu = \int_0^t \mu_s^*\,dW_s^\mu$ and $\tilde{M}_t^Z = \int_0^t Z_s\,dW_s^\mu$ that are BMO($\mathbb{Q}^\mu$)-martingales.
3. The BMO-norms with respect to $\mathbb{P}$ and $\mathbb{Q}^\mu$ are equivalent:

$$k\,\|Z\|_{\mathrm{BMO}(\mathbb{Q}^\mu)} \le \|Z\|_{\mathrm{BMO}(\mathbb{P})} \le K\,\|Z\|_{\mathrm{BMO}(\mathbb{Q}^\mu)}.$$

The constants $k$ and $K$ only depend on the BMO-norm of $\mu$.

Hu, Imkeller, and Müller [268] were among the first to use the property that the martingale $dM_t^Z = Z_t\,dW_t$, which naturally appears in BSDEs associated with exponential hedging problems, is BMO. Since then, such a property has been used in different papers, mostly dealing with the question of dynamic hedging in an exponential utility framework (see, for instance, the recent papers by Mania, Santacroce, and Tevzadze [176] and Mania and Schweizer [179]). In the proposition below, we extend their results to general quadratic BSDEs.

Proposition 3.13 Let $(Y, Z)$ be the maximal solution of the quadratic BSDE (H3) with coefficient $g$, and $M^Z = \int_0^\cdot Z_s\,dW_s$ the stochastic integral $Z.W$:

$$-dY_t = g(t, Z_t)\,dt - dM_t^Z, \qquad Y_T = \xi_T.$$
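Given that by assumption $Y$ is bounded and $|g(t, 0)|^{1/2} \in \mathrm{BMO}(\mathbb{P})$, $M^Z$ is a BMO($\mathbb{P}$)-martingale.

Proof. Let $k$ be the constant such that $|g(t, z)| \le |g(t, 0)| + k|z|^2$. Thanks to Itô's formula applied to the solution $(Y, Z)$ and to the exponential function:

$$\begin{aligned}
\exp(\beta Y_t) &= \exp(\beta Y_T) + \beta \int_t^T \exp(\beta Y_s)\,g(s, Z_s)\,ds - \frac{\beta^2}{2}\int_t^T \exp(\beta Y_s)\,|Z_s|^2\,ds - \beta \int_t^T \exp(\beta Y_s)\,Z_s\,dW_s \\
&= \exp(\beta Y_T) + \beta \int_t^T \exp(\beta Y_s)\left(g(s, Z_s) - \frac{\beta}{2}|Z_s|^2\right)ds - \beta \int_t^T \exp(\beta Y_s)\,Z_s\,dW_s.
\end{aligned}$$

Given that $\frac{\beta}{2}|Z_s|^2 - g(s, Z_s) \ge \left(\frac{\beta}{2} - k\right)|Z_s|^2 - |g(s, 0)| \ge \varepsilon |Z_s|^2 - |g(s, 0)|$ for $\beta \ge 2(k + \varepsilon)$, and taking the conditional expected value, we obtain

$$\beta \varepsilon\,\mathbb{E}\left[\int_t^T \exp(\beta Y_s)\,|Z_s|^2\,ds \,\Big|\, \mathcal{F}_t\right] \le C + \beta\,\mathbb{E}\left[\int_t^T \exp(\beta Y_s)\,|g(s, 0)|\,ds \,\Big|\, \mathcal{F}_t\right] \le C,$$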
where $C$ is a universal constant that may change from place to place. Since $\exp(\beta Y_s)$ is bounded both from below and from above, the property holds. $\square$

3.7.2 Some Results in Convex Analysis

We shall need some key results in convex analysis to obtain the dual representation of g-conditional risk measures. They are presented in an appendix at the end of the chapter to preserve the continuity of the arguments in this part. More details or proofs may be found in Aubin [5], Hiriart-Urruty and Lemaréchal [127], or Rockafellar [231].

3.7.3 Dual Representation of Risk Measures

We now study the dual representation of g-conditional risk measures. The space of admissible controls depends on the assumption imposed on the coefficient $g$. We consider successively both situations (H1) and (H3). There is no need to look separately at (H2), as, under our assumptions, the condition (H2) implies the condition (H1) (for more details, please refer to the appendix in Section 3.9 at the end of the chapter). The (H1) case has been solved in [202], but the (H3) case is new.

Theorem 3.25 Let $g$ be a convex coefficient satisfying (H1) or (H3) and $G$ be the associated polar process, $G(t, \mu) = \sup_{z \in \mathbb{Q}^n_{\text{rational}}} \{\langle \mu, -z \rangle - g(t, z)\}$.

i) For almost all $(\omega, t)$, the program $g(\omega, t, z) = \sup_{\mu \in \mathbb{Q}^n_{\text{rational}}} [\langle \mu, -z \rangle - G(\omega, t, \mu)]$ has an optimal progressively measurable solution $\bar{\mu}(\omega, t)$ in the subdifferential of $g$ at $z$, $\partial g(\omega, t, z)$.

ii) Then $\mathcal{R}^g$ has the following dual representation, exact at $\bar{\mu}$,

$$\mathcal{R}_t^g(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}} \mathbb{E}_{\mathbb{Q}^\mu}\left[-\xi_T - \int_t^T G(s, \mu_s)\,ds \,\Big|\, \mathcal{F}_t\right] = \mathbb{E}_{\mathbb{Q}^{\bar{\mu}}}\left[-\xi_T - \int_t^T G(s, \bar{\mu}_s)\,ds \,\Big|\, \mathcal{F}_t\right],$$
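where,
1. Under (H1) ($|g(t, z)| \le |g(t, 0)| + k|z|$), $\mathcal{A}$ is the space of adapted processes $\mu$ bounded by $k$, and $\mathbb{Q}^\mu$ is the associated equivalent probability measure with density $\Gamma_T^\mu$, where $\Gamma^\mu$ is the exponential martingale defined in (3.38).
2. Under (H3) ($|g(t, z)| \le |g(t, 0)| + k|z|^2$), $\mathcal{A}$ is the space of BMO($\mathbb{P}$)-processes $\mu$ and $\mathbb{Q}^\mu$ is defined as above.

iii) Let $g(t, \cdot)$ be a strongly convex function (i.e., $g(t, z) - \frac{1}{2}C|z|^2$ is a convex function). Then the Fenchel-Legendre transform $G(t, \mu)$ has a quadratic growth in $\mu$ and the following dual representation holds true:

$$\mathbb{E}_{\mathbb{Q}^\mu}\left[\int_t^T G(s, \mu_s)\,ds \,\Big|\, \mathcal{F}_t\right] = \operatorname*{ess\,sup}_{\xi_T}\left(\mathbb{E}_{\mathbb{Q}^\mu}[\xi_T \mid \mathcal{F}_t] - \mathcal{R}_t^g(\xi_T)\right) = \mathbb{E}_{\mathbb{Q}^\mu}\left[\bar{\xi}_T \mid \mathcal{F}_t\right] - \mathcal{R}_t^g(\bar{\xi}_T).$$

Proof. i) Since $g$ is a proper function, the dual representation of $g$ with its polar function $G$ is exact at $\bar{\mu} \in \partial g(z)$:

$$g(t, z) = \sup_{\mu \in \mathbb{Q}^n_{\text{rational}}} [\langle \mu, -z \rangle - G(t, \mu)] = \langle \bar{\mu}, -z \rangle - G(t, \bar{\mu}),$$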
using classical results of convex analysis, recalled in the appendix at the end of the chapter. The measurability of $\bar{\mu}$ is separately studied in Lemma 3.26 just after this proof.

iia) Let us first consider a coefficient $g$ with linear growth (H1); so, $g(t, 0)$ is in $H^2$. By definition, $-G(t, \mu_t)$ is dominated from above by the square integrable process $g(t, 0)$. Then, let $\mathcal{R}_t^g(\xi_T) := Y_t$ be the solution of the BSDE $(g, -\xi_T)$,

$$-dY_t = g(t, Z_t)\,dt - Z_t\,dW_t = \big(g(t, Z_t) - \langle \mu_t, -Z_t \rangle\big)\,dt - Z_t\,dW_t^\mu, \qquad Y_T = -\xi_T. \tag{3.39}$$
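By Girsanov Theorem (Theorem 3.24), for $\mu \in \mathcal{A}$ the exponential martingale $\Gamma^\mu$ is u.i. and defines a probability measure $\mathbb{Q}^\mu$ on $\mathcal{F}_T$ such that the process $W^\mu = W - \int_0^\cdot \mu_s\,ds$ is a $\mathbb{Q}^\mu$-Brownian motion. Since $M^Z = \int_0^\cdot Z_s\,dW_s$ is in $H^2(\mathbb{P})$, $\tilde{M}^Z = \int_0^\cdot Z_s\,dW_s^\mu$ is a u.i. $\mathbb{Q}^\mu$-martingale. Moreover, since $\mu$ is bounded and $g$ uniformly Lipschitz, the process $(g(t, Z_t) - Z_t \mu_t)$ belongs to $H^2(\mathbb{P})$ but also to $H^{1+\varepsilon}(\mathbb{Q}^\mu)$. So we can use an integral representation of the BSDE (3.39) in terms of

$$Y_t = \mathbb{E}_{\mathbb{Q}^\mu}\left[-\xi_T + \int_t^T \big(g(s, Z_s) - \langle \mu_s, -Z_s \rangle\big)\,ds \,\Big|\, \mathcal{F}_t\right] \ge \mathbb{E}_{\mathbb{Q}^\mu}\left[-\xi_T - \int_t^T G(s, \mu_s)\,ds \,\Big|\, \mathcal{F}_t\right]. \tag{3.40}$$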
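We do not need to prove that the last term is finite. It is enough to recall that $(-G(s, \mu_s))^+$ is dominated from above by the $d\mathbb{Q} \times ds$ integrable process $(g(s, 0))^+$.

iib) Let $\bar{\mu}$ be an optimal control, bounded by $k$, such that $g(t, Z_t) = \langle \bar{\mu}_t, -Z_t \rangle - G(t, \bar{\mu}_t)$ (see Lemma 3.26 for measurability results). Then the process $-G(t, \bar{\mu}_t)$ belongs to $H^2(\mathbb{P})$ and so to $H^{1+\varepsilon}(\mathbb{Q}^{\bar{\mu}})$. By the previous result, $Y_t = \mathbb{E}_{\mathbb{Q}^{\bar{\mu}}}\left[-\xi_T - \int_t^T G(s, \bar{\mu}_s)\,ds \,\big|\, \mathcal{F}_t\right]$. So the process $Y$ is the value function of the maximization dual problem

$$Y_t = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}} \mathbb{E}_{\mathbb{Q}^\mu}\left[-\xi_T - \int_t^T G(s, \mu_s)\,ds \,\Big|\, \mathcal{F}_t\right].$$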
iic) We now consider a coefficient $g$ with quadratic growth (H3) and bounded solution $Y_t$. Using the same notation, we know by Girsanov Theorem 3.24 that if $\mu \in \mathrm{BMO}(\mathbb{P})$, $\Gamma^\mu$ is a u.i. martingale and the probability measure $\mathbb{Q}^\mu$ is well defined. The proof of the dual representation is very similar to that of the previous case, after solving some integrability questions. It is enough to notice that

• by assumption, $|g(\cdot, 0)|^{1/2}$ is BMO($\mathbb{P}$),
• by Proposition 3.13, $Z$ is BMO($\mathbb{P}$),
• by Girsanov Theorem 3.24, for any $\mu \in \mathrm{BMO}(\mathbb{P})$, the processes $\mu$, $Z$ and $|g(\cdot, 0)|^{1/2}$ are in BMO($\mathbb{Q}^\mu$).

So $|g(t, Z_t)|^{1/2}$ and $|\mu_t Z_t|^{1/2}$ are in BMO($\mathbb{Q}^\mu$). Moreover, the process $(-G(t, \mu))^+$, which is dominated from above by $|g(t, 0)|$, is a $\mathbb{Q}^\mu \times dt$-integrable process. Then the inequality (3.40) holds.

iid) Let $\bar{\mu}$ be an optimal control, such that $g(t, Z_t) = \langle \bar{\mu}_t, -Z_t \rangle - G(t, \bar{\mu}_t)$. Given that $g(t, \cdot)$ has quadratic growth, the polar function $G(t, \cdot)$ satisfies the following inequality:

$$G(t, \bar{\mu}_t) \ge -|g(t, 0)| + \frac{1}{4k}|\bar{\mu}_t|^2.$$

Then, for small $\varepsilon < 1/(4k)$,

$$\begin{aligned}
\left(\frac{1}{4k} - \varepsilon\right)|\bar{\mu}_t|^2 &\le G(t, \bar{\mu}_t) + |g(t, 0)| - \varepsilon|\bar{\mu}_t|^2 \\
&\le |g(t, 0)| - g(t, Z_t) + \langle \bar{\mu}_t, -Z_t \rangle - \varepsilon|\bar{\mu}_t|^2 \\
&\le |g(t, 0)| - g(t, Z_t) + \frac{1}{4\varepsilon}|Z_t|^2.
\end{aligned}$$
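Since both processes $|g(t, Z_t)|^{1/2}$ and $Z$ are BMO($\mathbb{P}$), $\bar{\mu}$ is also BMO($\mathbb{P}$), the other processes have good integrability properties with respect to both probability measures $\mathbb{P}$ and $\mathbb{Q}^\mu$, and the integral representation follows.

iii) Let $h(t, z) = g(t, z) - \frac{1}{2}C|z|^2$ be the convex function associated with $g$. Since $g$ is the sum of two convex functions $h$ and $\frac{1}{2}C|\cdot|^2$, its Fenchel-Legendre transform $G$ is the inf-convolution of the Fenchel-Legendre transforms of both $h$ and $\frac{1}{2}C|\cdot|^2$. But the Fenchel-Legendre transform of the quadratic function $\frac{1}{2}C|\cdot|^2$ is still a quadratic function, $\frac{1}{2C}|\mu|^2$, and $G$ has a quadratic growth (as the inf-convolution of a convex function $H$ with a quadratic function). Therefore, for a given $\mu \in \mathrm{BMO}(\mathbb{P})$, there exists $\bar{Z} \in \mathrm{BMO}(\mathbb{P})$ such that $G(t, \mu_t) = \langle \mu_t, -\bar{Z}_t \rangle - g(t, \bar{Z}_t)$ (in other words, $\mu \in \partial g(\bar{Z})$).

We now introduce the penalty function $\alpha^\mu$ defined by $\alpha_t^\mu = \mathbb{E}_{\mathbb{Q}^\mu}\left[\int_t^T G(s, \mu_s)\,ds \,\big|\, \mathcal{F}_t\right]$. Using the above duality result, we have

$$\alpha_t^\mu = \mathbb{E}_{\mathbb{Q}^\mu}\left[\int_t^T \big(\langle \mu_s, -\bar{Z}_s \rangle - g(s, \bar{Z}_s)\big)\,ds \,\Big|\, \mathcal{F}_t\right].$$

Since $\bar{\xi}_T = \int_0^T \big(\langle \mu_s, -\bar{Z}_s \rangle - g(s, \bar{Z}_s)\big)\,ds + \int_0^T \bar{Z}_s\,dW_s^\mu$ and $\mathcal{R}_t^g(\bar{\xi}_T) = \int_0^t \big(\langle \mu_s, -\bar{Z}_s \rangle - g(s, \bar{Z}_s)\big)\,ds + \int_0^t \bar{Z}_s\,dW_s^\mu$, we finally deduce that

$$\alpha_t^\mu = \mathbb{E}_{\mathbb{Q}^\mu}\left[\bar{\xi}_T \mid \mathcal{F}_t\right] - \mathcal{R}_t^g(\bar{\xi}_T).$$

Moreover, using Equation (3.40), we have

$$\alpha_S^\mu \ge \operatorname*{ess\,sup}_{\xi_T}\left(\mathbb{E}_{\mathbb{Q}^\mu}\left[\xi_T \mid \mathcal{F}_S\right] - \mathcal{R}_S^g(\xi_T)\right).$$

Hence, the result. $\square$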
The question of the measurability of the optimal solution(s) $\bar{\mu}$ is considered in the following lemma.

Lemma 3.26 Let $g$ be a convex coefficient satisfying (H1) or (H3) and $G$ be the associated polar function. There exists a progressively measurable optimal solution $\bar{\mu}$ such that $g(t, Z_t) = \langle \bar{\mu}_t, -Z_t \rangle - G(t, \bar{\mu}_t)$ a.s. $d\mathbb{P} \times dt$.

Proof. For each $(\omega, t) \in \Omega \times [0, T]$, the sets given by $\{\mu \in \mathbb{R}^n : g(\omega, t, Z_t) = \langle \mu, -Z_t \rangle - G(\omega, t, \mu)\}$ are nonempty. Hence, by a measurable selection theorem (see, for instance, Dellacherie and Meyer [70] or Benes [18]), there exists an $\mathbb{R}^n$-valued progressively measurable process $\bar{\mu}$ such that $g(\omega, t, Z_t) = \langle \bar{\mu}_t, -Z_t \rangle - G(\omega, t, \bar{\mu}_t)$, $d\mathbb{P} \times dt$-a.s. $\square$
3.7.4 g-Conditional γ-Tolerant Risk Measures and Asymptotics

In this subsection, we pursue our presentation and study of g-conditional risk measures using an approach similar to the one we adopted in the static framework.
CHAPTER 3
g-Conditional γ-Tolerant Risk Measures

As in the static framework, we can define dynamic versions for both coherent and γ-tolerant risk measures based on the properties of their coefficients using the uniqueness Lemma 3.20. More precisely, let $\gamma > 0$ be a risk-tolerance coefficient. As in the static framework, where the γ-dilated of any static convex risk measure $\rho$ is defined by $\rho_\gamma(\xi_T) = \gamma \rho\big(\frac{1}{\gamma} \xi_T\big)$, we can define the g-conditional risk measure $R^g_\gamma$, γ-tolerant of $R^g$, as the risk measure associated with the coefficient $g_\gamma$, which is the γ-dilated of $g$: $g_\gamma(t, z) = \gamma g\big(t, \frac{1}{\gamma} z\big)$.

Note that if $g$ is Lipschitz continuous (H1), $g_\gamma$ also satisfies (H1), and if $g$ is continuous with quadratic growth (H3) with parameter $k$, then $g_\gamma$ also satisfies (H3), but with parameter $\frac{k}{\gamma}$. Note also that the dual function of $g_\gamma$, $G_\gamma$, can be expressed in terms of $G$, the dual function of $g$, as $G_\gamma(\mu) = \gamma G(\mu)$.

A standard example of a g-conditional γ-tolerant risk measure is certainly the dynamic entropic risk measure $e_{\gamma,t}(\xi_T) = \gamma \ln E\big[\exp\big(-\frac{1}{\gamma} \xi_T\big) \,\big|\, \mathcal{F}_t\big]$, which is the γ-tolerant of $e_{1,t}$.

Asymptotic Behavior of Entropic Risk Measure

Let us look more closely at the dynamic entropic risk measure. Letting $\gamma$ go to $+\infty$, the BSDE coefficient $q_\gamma(z) = \frac{1}{2\gamma} |z|^2$ tends to 0 and we directly obtain the natural extension of the static case, $e_{\infty,t}(\xi_T) = E_P[-\xi_T \mid \mathcal{F}_t]$. Letting $\gamma$ tend to 0, the BSDE coefficient explodes if $|z| \neq 0$ and intuitively the martingale part of this BSDE has to be equal to 0. More precisely, since by definition $\exp(e_{\gamma,t}(\xi_T)) = E\big[\exp\big(-\frac{1}{\gamma} \xi_T\big) \,\big|\, \mathcal{F}_t\big]^{\gamma}$,
$$\lim_{\gamma \to 0} \exp(e_{\gamma,t}(\xi_T)) = \|\exp(-\xi_T)\|_t^\infty = \inf\{Y \in \mathcal{F}_t : Y \ge \exp(-\xi_T)\}.$$
So we have $e_{0^+,t}(\xi_T) = \|-\xi_T\|_t^\infty$. This conditional risk measure is a g-conditional risk measure associated with the indicator function of $\{0\}$. Let us also observe that $e_{0^+,t}(\xi_T)$ is an adapted nonincreasing process without martingale part.
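The two limits above can be checked numerically in a much simpler setting. The sketch below is only an illustration under stated assumptions: it uses the static entropic risk measure (trivial filtration) on a finite, equally weighted sample, and the payoff values and the function name `entropic_risk` are hypothetical. It suggests that $\gamma \ln E[\exp(-\xi/\gamma)]$ approaches $E[-\xi]$ for large $\gamma$ and the worst-case loss $\sup(-\xi)$ for small $\gamma$.

```python
import math

def entropic_risk(xi, gamma, probs=None):
    """Static entropic risk e_gamma(xi) = gamma * ln E[exp(-xi / gamma)]
    for a finite sample xi with probabilities probs (uniform if None)."""
    n = len(xi)
    if probs is None:
        probs = [1.0 / n] * n
    # log-sum-exp trick: factor out the largest exponent to avoid overflow
    exponents = [-x / gamma for x in xi]
    m = max(exponents)
    s = sum(p * math.exp(a - m) for p, a in zip(probs, exponents))
    return gamma * (m + math.log(s))

xi = [-2.0, 0.5, 1.0, 3.0]                   # hypothetical terminal payoffs
mean_loss = sum(-x for x in xi) / len(xi)    # E[-xi], the gamma -> +infinity limit
worst_loss = max(-x for x in xi)             # sup(-xi), the gamma -> 0 limit

for gamma in (1e-3, 0.1, 1.0, 10.0, 1e4):
    print(gamma, entropic_risk(xi, gamma))
print("mean loss:", mean_loss, "worst loss:", worst_loss)
```

The printed values decrease from roughly the worst loss to roughly the mean loss as $\gamma$ grows, in line with the interpretation of $\gamma$ as a risk tolerance.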
Marginal Risk Measure

In the general γ-tolerant case, assuming that the g-conditional risk measures are centered (equivalently $g(t, 0) = 0$, equivalently $G(t, \cdot) \ge 0$), the same type of results can be obtained concerning the asymptotic behavior of the γ-dilated coefficient and the duality. Then, the limit of $g_\gamma$ when $\gamma \to +\infty$ is the derivative of $g$ at the origin in the direction of $z$. $R^g_\infty$ is the nonincreasing limit of $R^g_\gamma$, defined by its dual representation,
$$R^g_{\infty,t}(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}} \big\{ E_{Q^\mu}[-\xi_T \mid \mathcal{F}_t] \;\big|\; G(u, \mu_u) = 0,\ \forall u \ge t,\ \text{a.s.} \big\}.$$
In some cases (in particular, in the quadratic case when the polar function $G$ has a unique 0, i.e., $G(u, 0) = 0$ is unique), $-R^g_\infty$ is a linear pricing rule and can be seen as an extension of the Davis price (see Davis [60]).
Conservative Risk Measures and Superpricing

We now focus on the properties of the g-conditional γ-tolerant risk measures when the risk tolerance coefficient goes to zero. To do so, we need some results in convex analysis regarding the so-called recession function, defined for any $z \in \mathrm{Dom}(g)$ by
$$g_{0^+}(z) := \lim_{\gamma \downarrow 0} \gamma g\big(\tfrac{1}{\gamma} z\big) = \lim_{\gamma \downarrow 0} \gamma \big(g\big(y + \tfrac{1}{\gamma} z\big) - g(y)\big).$$
The key properties of this function are recalled in the appendix.

Conservative Risk Measure

• Under assumption (H1), we may assume that $g(t, 0) = 0$. Therefore, the polar function $G$ is nonnegative. Since $g(t, \cdot)$ has a linear growth with constant $k$, the recession function $g_{0^+}(t, \cdot)$ is finite everywhere with linear growth, and the domain of the dual function $G$ is bounded by $k$. The BSDE($g_{0^+}, \xi_T$) has a unique solution $Y_t^0(\xi_T) \ge R_t^{g_\gamma}(\xi_T)$. Using their dual representation through their polar functions $l_{\mathrm{Dom}(G)}$ and $\gamma G$,
$$Y_t^0(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}_k} E_{Q^\mu}\Big[-\xi_T - \int_t^T l_{\mathrm{Dom}(G)}(u, \mu_u)\, du \,\Big|\, \mathcal{F}_t\Big],$$
$$R^g_{\gamma,t}(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}_k} E_{Q^\mu}\Big[-\xi_T - \gamma \int_t^T G(u, \mu_u)\, du \,\Big|\, \mathcal{F}_t\Big],$$
we can take the nondecreasing limit in the second line and show that
$$R^g_{0^+,t}(\xi_T) = \lim_{\gamma \downarrow 0} R^g_{\gamma,t}(\xi_T) = Y_t^0(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}_k} \big\{ E_{Q^\mu}[-\xi_T \mid \mathcal{F}_t] \;\big|\; G(u, \mu_u) < \infty\ \forall u \ge t,\ du\text{-a.s.} \big\} = \operatorname*{ess\,sup}_{\mu \in \mathcal{A}_k \cap \mathrm{Dom}(G)} E_{Q^\mu}[-\xi_T \mid \mathcal{F}_t].$$

• When the coefficient $g$ has a quadratic growth (H3), the recession function may be infinite on a set with positive measure and the BSDE($g_{0^+}, \xi_T$) is not well defined. However, we can still take the limit in the dual representation of $R^g_{\gamma,t}$, obtain the same characterization of $R^g_{0^+,t}$, and consider $R^g_{0^+}$ as a generalized solution of a BSDE whose coefficient $g_{0^+}$ may take infinite values. In particular if, as in the entropic case, $g_{0^+} = l_{\{0\}}$, $G$ is finite everywhere and any equivalent probability measure associated with a BMO coefficient, said to be in $\mathcal{Q}$(BMO), is admissible. Then,
$$R^{l_{\{0\}}}_{0^+,t}(\xi_T) = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}(\mathrm{BMO})} E_Q[-\xi_T \mid \mathcal{F}_t] = \|-\xi_T\|_t^\infty = e_{0^+,t}(\xi_T).$$
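The dichotomy between the two cases above is driven by the behavior of $\gamma g(z/\gamma)$ as $\gamma \downarrow 0$. The following sketch illustrates it for two scalar coefficients chosen purely for illustration (they are assumptions, not coefficients taken from the text): a centered coefficient with linear growth, whose recession function is finite, and a quadratic one, whose dilated values blow up for $z \neq 0$.

```python
import math

def dilated(g, gamma):
    """gamma-dilated coefficient g_gamma(z) = gamma * g(z / gamma)."""
    return lambda z: gamma * g(z / gamma)

def g_lin(z):
    """Illustrative coefficient with linear growth: g(0) = 0, Lipschitz constant 1."""
    return math.sqrt(1.0 + z * z) - 1.0

def g_quad(z):
    """Illustrative coefficient with quadratic growth."""
    return 0.5 * z * z

z = 2.0
for gamma in (1.0, 0.1, 0.01, 0.001):
    # first column tends to |z| (finite recession function),
    # second column grows like z^2 / (2 * gamma) (recession function l_{0})
    print(gamma, dilated(g_lin, gamma)(z), dilated(g_quad, gamma)(z))
```

In this toy case $\gamma\, g_{\mathrm{lin}}(z/\gamma) = \sqrt{\gamma^2 + z^2} - \gamma$ increases to $|z|$, while the quadratic coefficient has no finite limit away from the origin.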
Superprice System

Note that the conservative risk measure $R^g_{0^+,t}(\xi_T) = \operatorname*{ess\,sup}_{\mu \in \mathcal{A} \cap \mathrm{Dom}(G)} E_{Q^\mu}[-\xi_T \mid \mathcal{F}_t]$ is the equivalent of the superpricing rule of $-\xi_T$ (this notion was first introduced by El Karoui and Quenez [79] under the name "upper hedging price"). When the $\lambda_t$-translated of $\mathrm{Dom}(G)_t$ is a vector space, the recession function $g_{0^+}(t, z)$ is the indicator function of the orthogonal vector space $\mathrm{Dom}(G)_t^\perp$ plus a linear function $\langle z, -\lambda_t \rangle$. Then, $R^g_{0^+,t}(-\xi_T)$ is exactly the upper-hedging price associated with hedging portfolios constrained to live in $\mathrm{Dom}(G)^\perp$.
The conservative measure is the smallest coherent risk measure such that
$$R^g_t(-\xi_T) - R^g_t(-\eta_T) \le R^{\mathrm{coh}}_t(-\xi_T + \eta_T)$$
for any $(\xi_T, \eta_T)$ in the appropriate space.

Volume Perspective Risk Measure

It is also possible to associate a coherent risk measure $R^{\tilde{g}}$ with any convex risk measure $R^g$, using the perspective function $\tilde{g}$ of the coefficient $g$, which is assumed to be normalized for the sake of simplicity ($g(t, 0) = 0$). The perspective function $\tilde{g}$ is defined as
$$\tilde{g}(t, \gamma, z) = \begin{cases} \gamma g\big(t, \frac{z}{\gamma}\big) & \text{if } \gamma > 0, \\ \lim_{\gamma \to 0} \gamma g\big(t, \frac{z}{\gamma}\big) = g_{0^+}(t, z) & \text{if } \gamma = 0. \end{cases}$$
More details about $\tilde{g}$ can be found in the appendix. As a direct consequence, the $\tilde{g}$-conditional risk measure $R^{\tilde{g}}$ is a coherent risk measure.
3.8 INF-CONVOLUTION OF g-CONDITIONAL RISK MEASURES

In this section, we come back to inf-convolution of risk measures, when they are g-conditional risk measures. This study is based upon the inf-convolution of their respective coefficients. More precisely, we will study for any $t$ the inf-convolution of the g-conditional risk measures $R^A_t$ and $R^B_t$ defined as
$$(R^A \,\square\, R^B)_t(\xi_T) = \operatorname*{ess\,inf}_{F_T} \big\{ R^A_t(\xi_T - F_T) + R^B_t(F_T) \big\}, \qquad (3.41)$$
where both $\xi_T$ and $F_T$ are taken in the appropriate space, and show that this new dynamic risk measure is, under mild assumptions, the (maximal) solution $R^{A,B}$ of the BSDE $(g^A \,\square\, g^B, -\xi_T)$, where $(g^A \,\square\, g^B)(\cdot, t, z) = \operatorname*{ess\,inf}_x \big(g^A(\cdot, t, z - x) + g^B(\cdot, t, x)\big)$. Then, the next step is to characterize the optimal transfer of risk between both agents A and B, agent A being exposed to $\xi_T$ at time $T$.

Some key results on the inf-convolution of convex functions are recalled in the appendix at the end of the chapter, the main argument being summarized in the proposition below.

Proposition 3.14 Let $g^A$ and $g^B$ be two convex functions of $z$. Under the following condition,
$$g^A_{0^+}(t, z) + g^B_{0^+}(t, -z) > 0, \quad \forall z \neq 0,$$
$g^A \,\square\, g^B$ is exact for any $z$, as the infimum is attained by some $x^*$:
$$(g^A \,\square\, g^B)(z) = \inf_x \{g^A(z - x) + g^B(x)\} = g^A(z - x^*) + g^B(x^*).$$
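As an illustration of Proposition 3.14, the sketch below computes an inf-convolution by brute force on a grid. Everything in it is an assumption made for the example: scalar $z$, two dilated quadratics $z^2/(2\gamma_A)$ and $z^2/(2\gamma_B)$ (whose recession functions are $+\infty$ away from 0, so the condition holds), the grid, and the helper names. For this pair the inf-convolution should be the quadratic with tolerance $\gamma_A + \gamma_B$, attained at $x^* = \frac{\gamma_B}{\gamma_A + \gamma_B} z$.

```python
def inf_convolution(gA, gB, z, grid):
    """Brute-force (gA box gB)(z) = min over x in grid of gA(z - x) + gB(x).
    Returns the approximate value and the minimizing grid point."""
    best_x = min(grid, key=lambda x: gA(z - x) + gB(x))
    return gA(z - best_x) + gB(best_x), best_x

def dilated_quadratic(gamma):
    """gamma-dilated of g(z) = z^2 / 2, i.e. z^2 / (2 * gamma)."""
    return lambda z: z * z / (2.0 * gamma)

gamma_A, gamma_B = 1.0, 3.0                       # hypothetical risk tolerances
gA, gB = dilated_quadratic(gamma_A), dilated_quadratic(gamma_B)
grid = [i / 1000.0 for i in range(-5000, 5001)]   # x in [-5, 5], step 1e-3

z = 2.0
value, x_star = inf_convolution(gA, gB, z, grid)
print(value, dilated_quadratic(gamma_A + gamma_B)(z))   # numerical vs closed form
print(x_star, gamma_B / (gamma_A + gamma_B) * z)        # numerical vs closed-form minimizer
```

The minimizer splits $z$ in proportion to the two risk tolerances, which is the scalar counterpart of the quota-sharing rule discussed later in this section.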
3.8.1 Inf-convolution and Optima

We now focus on our main problem of inf-convolution of g-conditional risk measures as expressed in Equation (3.41). The following theorem gives us an explicit
characterization of an optimum for the inf-convolution problem provided such an optimum exists.

Theorem 3.27 Let $g^A$ and $g^B$ be two convex coefficients depending only on $z$ and satisfying the condition of Proposition 3.14. For a given $\xi_T$ in the appropriate space (either $L^2$ or $L^\infty$), let $(R^{A,B}_t(\xi_T), Z_t)$ be the maximal solution of the BSDE $(g^A \,\square\, g^B, -\xi_T)$ and $\hat{Z}^B_t$ be a measurable process such that $\hat{Z}^B_t = \arg\min_x \{g^A(t, Z_t - x) + g^B(t, x)\}$, $dt \times dP$-a.s. Then, the following results hold:

i) For any $t \in [0, T]$ and for any $F_T$ such that both $R^A_t(\xi_T - F_T)$ and $R^B_t(F_T)$ are well defined:
$$R^{A,B}_t(\xi_T) \le R^A_t(\xi_T - F_T) + R^B_t(F_T) \quad P\text{-a.s.}$$

ii) If the process $\hat{Z}^B$ is admissible, then for any $t \in [0, T]$,
$$R^{A,B}_t(\xi_T) = (R^A \,\square\, R^B)_t(\xi_T) \quad P\text{-a.s.},$$
and the structure $F_T^*$ defined by the forward equation
$$F_T^* = \int_0^T g^B(t, \hat{Z}^B_t)\, dt - \int_0^T \hat{Z}^B_t\, dW_t$$
is an optimal solution for the inf-convolution problem:
$$(R^A \,\square\, R^B)_t(\xi_T) = R^A_t(\xi_T - F_T^*) + R^B_t(F_T^*).$$

Proof. i) First, note that the existence of such a measurable process $\hat{Z}^B_t$ is guaranteed by Theorem 3.14. In the following, we consider any $F_T$ such that both $R^A_t(\xi_T - F_T)$ and $R^B_t(F_T)$ are well defined. Let us now focus on $R^A_t(\xi_T - F_T) + R^B_t(F_T)$. Writing $Z_t = Z^A_t + Z^B_t$, it satisfies
$$-d\big(R^A_t(\xi_T - F_T) + R^B_t(F_T)\big) = \big(g^A(t, Z^A_t) + g^B(t, Z^B_t)\big)\, dt - (Z^A_t + Z^B_t)\, dW_t = \big(g^A(t, Z_t - Z^B_t) + g^B(t, Z^B_t)\big)\, dt - Z_t\, dW_t,$$
and at time $T$, $R^A_T(\xi_T - F_T) + R^B_T(F_T) = -\xi_T$. Therefore, $(R^A_t(\xi_T - F_T) + R^B_t(F_T), Z_t)$ is a solution of the BSDE with terminal condition $-\xi_T$, which is also the terminal condition of the BSDE $(g^A \,\square\, g^B, -\xi_T)$, and a coefficient $g$ written in terms of the solution $Z^B_t$ of the BSDE $(g^B, F_T)$ as $g(t, z) = g^A(t, z - Z^B_t) + g^B(t, Z^B_t)$. By the definition of the inf-convolution, this coefficient is always greater than $g^A \,\square\, g^B$. Thus, we can compare $R^A_t(\xi_T - F_T) + R^B_t(F_T)$ with the solution of the BSDE $(g^A \,\square\, g^B, -\xi_T)$ using the comparison Theorem (3.16) and obtain the desired inequality.

ii) Let us now assume that the process $\hat{Z}^B_t$ is admissible, using different notions of admissibility under (H1) or (H3) (square integrability or BMO). Thanks to Theorem 3.14, we can show that both dynamic risk measures coincide. We now introduce the structure $F_T^*$ defined by the forward equation
$$F_t^* = \int_0^t g^B(s, \hat{Z}^B_s)\, ds - \int_0^t \hat{Z}^B_s\, dW_s.$$
Note first that thanks to the admissibility of the process $\hat{Z}^B_t$, such a structure is well defined and belongs to the appropriate space (either $L^2(\mathcal{F}_T)$ or $L^\infty(\mathcal{F}_T)$). Let us also observe that $-F_t^*$ is also a solution of the BSDE $(g^B, -F_T^*)$ since
$$-F_t^* = -F_T^* + \int_t^T g^B(u, \hat{Z}^B_u)\, du - \int_t^T \hat{Z}^B_u\, dW_u.$$
By uniqueness, this process is $R^B_t(F_T^*)$. Since $R^A_t(\xi_T - F_T^*) + R^B_t(F_T^*)$ is a solution of the BSDE with coefficient written as $g^A(t, Z_t - \hat{Z}^B_t) + g^B(t, \hat{Z}^B_t)$ and terminal condition $-\xi_T$, and given that
$$(g^A \,\square\, g^B)(t, Z_t) = g^A(t, Z_t - \hat{Z}^B_t) + g^B(t, \hat{Z}^B_t),$$
by uniqueness, we also have, for all $t \ge 0$,
$$R^{A,B}_t(\xi_T) = (R^A \,\square\, R^B)_t(\xi_T) \quad P\text{-a.s.}$$
The proof also gives the optimality for Problem (3.41) of the structure
$$F_T^* = \int_0^T g^B(t, \hat{Z}^B_t)\, dt - \int_0^T \hat{Z}^B_t\, dW_t. \qquad \square$$
On uniqueness of the optimum

Note that the optimal structure $F_T^*$ is determined to within a constant because of the translation invariance property (P3-) satisfied by both risk measures $R^A_t$ and $R^B_t$, since
$$\operatorname*{ess\,inf}_{F_T} \{R^A_t(\xi_T - (F_T + m)) + R^B_t(F_T + m)\} = \operatorname*{ess\,inf}_{F_T} \{R^A_t(\xi_T - F_T) + m + R^B_t(F_T) - m\} = (R^A \,\square\, R^B)_t(\xi_T).$$
Note also that $F_T^*$ is optimal for all the optimal structure problems for all stopping times $S$ such that $0 \le S \le T$ a.s.

The following theorem gives some sufficient conditions ensuring the admissibility of the process $\hat{Z}^B_t$.

Theorem 3.28 [Exact Inf-convolution] Let $g^B$ be a strongly convex coefficient. For any convex function $g^A$, the inf-convolution $g^A \,\square\, g^B$ is convex with quadratic growth (H3), so in particular, if $g^A$ satisfies (H3). In this case, the process $\hat{Z}^B_t$, defined in Theorem 3.27, is in BMO($P$).

Note that in this case, the optimal structure $F_T^*$, defined in Theorem 3.27, is quasi-bounded as it belongs to the BMO-closure of $L^\infty$ as defined by Kazamaki [157] (chapter 3).

Proof. From the duality Theorem 3.25, the optimal control $\mu^*$ of $G^{A,B}$, the polar function of $g^A \,\square\, g^B$, is in BMO($P$). From the inf-convolution, we deduce that this is also the optimal control for $G^A$ and $G^B$ in the following sense:
$$g^A(t, Z_t - \hat{Z}^B_t) = \langle \mu^*_t, -(Z_t - \hat{Z}^B_t) \rangle - G^A(t, \mu^*_t), \qquad g^B(t, \hat{Z}^B_t) = \langle \mu^*_t, -\hat{Z}^B_t \rangle - G^B(t, \mu^*_t).$$
Therefore, both $Z_t - \hat{Z}^B_t$ and $\hat{Z}^B_t$ are in BMO($P$) (from Proposition 3.13) and the process $\hat{Z}^B_t$ is admissible. □
Comments

i) Just as in the static framework, we obtain the same result when considering g-conditional γ-tolerant risk measures. The Borch theorem is therefore extremely robust since the quota sharing of the initial exposure remains an optimal way of transferring the risk between different agents.

ii) Under some assumptions, the underlying logic of the transaction is nonspeculative since there is no interest for the first agent to transfer some risk or equivalently to issue a structure if she is not initially exposed. This result is completely consistent with the result we have already obtained in the static framework.

3.8.2 Hedging Problem

As in Subsection 3.4.2, we consider the hedging problem of a single agent. She wants to hedge her terminal wealth $X_T$ by optimally investing in the financial market and assesses her risk using a general g-conditional risk measure $R^g$.

Framework

We consider the same framework as that introduced in Subsection 3.1.5 when looking at the question of dynamic hedging in the static part. More precisely, we assume that $d$ basic securities are traded on the market. Their forward (nonnegative) vector price process $S$ follows an Itô semi-martingale with a uniformly bounded drift coefficient and an invertible and bounded volatility matrix $\sigma_t$. Under $P$,
$$\frac{dS_t}{S_t} = \sigma_t (dW_t + \lambda_t\, dt); \quad S_0 \text{ given}. \qquad (3.42)$$
To avoid arbitrage, we assume (AAO): there exists a probability measure $Q$, equivalent to $P$, such that $S$ is a $Q$-local martingale. From the completeness of this basic arbitrage-free market, we deduce the uniqueness of $Q$, which is usually called the risk-neutral probability measure.

The agent can invest in dynamic strategies $\theta$, i.e., $d$-dimensional predictable processes, and $G_t(\theta) = (\theta \cdot S)_t$ denotes the associated gain process. We assume that not all strategies are admissible and that, for instance, the agent has some restriction imposed on the transaction size. These constraints create some market incompleteness in the framework we consider. $\mathcal{S}_T = \{G_T(\theta) \mid \theta \cdot S \text{ is bounded from below},\ \theta \in K\}$ is the set of admissible hedging gain processes. $K$ is a convex subset of BMO($P$) such that any admissible strategy $\theta$ is in $K$ (equivalently, $\forall t$, $\theta_t \in K_t$).

Hedging Problem

At time 0, the hedging problem of the agent can be expressed as the determination of an optimal admissible strategy $\theta$ so as to minimize the initial g-conditional risk measure of her terminal wealth:
$$\inf_{\theta \in K} R_0(X_T - G_T(\theta)). \qquad (3.43)$$
The value functional of this program is the dynamic market modified risk measure of agent A, denoted by $R^m$. Using the previous results, we can obtain the following proposition.

Proposition 3.15 i) Let $l_{\sigma_t^*(K_t)} = l_{\hat{K}_t}$ be the indicator function of the convex set $\hat{K}_t = \sigma_t^* K_t$. Provided that the inf-convolution $g \,\square\, l_{\sigma_t^*(K_t)}(Z_t)$ is well defined, the residual risk measure $R^m$ is given as the maximal solution of the following BSDE:
$$-dR^m_t(X) = g^m(t, Z_t)\, dt - Z_t\, dW_t; \quad R^m_T(X) = -X_T,$$
where $g^m$ is the restriction of $g$ to the admissible set, $g^m(t, Z_t) = g \,\square\, l_{\sigma_t^*(K_t)}(Z_t)$.

ii) If $g$ is strongly convex, then this hedging problem has a solution. In particular, in the entropic case, $g^m(t, z) = \frac{1}{2\gamma}\, d_{\frac{1}{\gamma}}(z, \hat{K}_t)^2$, where $\gamma$ is the risk tolerance coefficient and $d_{\frac{1}{\gamma}}(z, K)$ is the distance function to $K$. The optimal investment strategy $\theta$ is the projection on $K$ of $Z_t$, solution of the BSDE $(g^m, -X_T)$. The terminal value $G_T(\theta)$ of the associated portfolio is given by
$$G_T(\theta) = x + \int_0^T (\theta_t)^* \sigma_t \lambda_t\, dt + \int_0^T (\theta_t)^* \sigma_t\, dW_t.$$
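The link between the constrained coefficient and a projection can be illustrated in a toy case. The sketch below rests on assumptions that are not in the text: a scalar $z$, a constraint set given by an interval $[lo, hi]$, the plain Euclidean distance, and hypothetical function names. It uses the standard fact that the inf-convolution of the quadratic $|z|^2/(2\gamma)$ with the indicator of a closed convex set equals the squared distance to that set divided by $2\gamma$, the infimum being attained at the projection.

```python
def project_interval(z, lo, hi):
    """Euclidean projection of the scalar z onto the interval [lo, hi]."""
    return min(max(z, lo), hi)

def g_m_entropic(z, lo, hi, gamma):
    """inf over x in [lo, hi] of |z - x|^2 / (2 * gamma),
    i.e. dist(z, [lo, hi])^2 / (2 * gamma), attained at the projection of z."""
    dist = abs(z - project_interval(z, lo, hi))
    return dist * dist / (2.0 * gamma)

def g_m_brute_force(z, lo, hi, gamma, n=10001):
    """Same quantity by brute force over n equally spaced points of [lo, hi]."""
    xs = (lo + (hi - lo) * i / (n - 1) for i in range(n))
    return min((z - x) ** 2 / (2.0 * gamma) for x in xs)

gamma, lo, hi = 2.0, -1.0, 1.0        # hypothetical tolerance and constraint set
for z in (-3.0, 0.5, 3.0):
    print(z, project_interval(z, lo, hi),
          g_m_entropic(z, lo, hi, gamma), g_m_brute_force(z, lo, hi, gamma))
```

Inside the constraint set the restricted coefficient vanishes, and outside it grows quadratically with the distance to the set, so only the part of $z$ that cannot be reached by admissible strategies is penalized.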
Generalized BSDEs: In the static framework, we expressed the hedging problem as an inf-convolution between the seminal risk measure of the agent and the risk measure $\nu^{H}$ generated by $H$, the convex set of constrained terminal gains, or more generally the inf-convolution between the seminal risk measure of the agent and the convex indicator of $H$ (Proposition 3.9). From a dynamic point of view, the set $H$ can be seen as the set of all dynamic terminal values of portfolios with some constraint on the strategies. Everything can be formulated in the same way. Note that the natural candidate for $R^{H}$ would be the inf-convolution between the dynamic worst-case risk measure and the convex indicator of $H$: $l^{H} \,\square\, \lim_{\gamma \to \infty} \big(\frac{1}{2\gamma} |z|^2\big)$. This infimum is always strictly positive. Moreover, it is an increasing process at the limit. To model this "limit BSDE," an increasing process has to be introduced (for more details, please refer to El Karoui and Quenez [78] and Cvitanic and Karatzas [58]). As a consequence, the dynamic version of the risk measure generated by $H$ cannot be seen exactly as the solution of a standard BSDE, as previously defined, in the sense that the coefficient can take infinite values. This is however not a real problem here, as we focus on the inf-convolution. Therefore, we can simply consider the restriction of the seminal risk measure to a particular set. The powerful regularization impact of the inf-convolution is again visible here.

Hedging problem at any time t: Solving the hedging problem at time 0 leads to the characterization of a particular probability measure, which can be called the calibration probability measure, as the prices of any hedging instruments computed with respect to this measure coincide with the observed market prices on which all agents agree. Solving the hedging problem at any time $t$ is equivalent to solving the same problem at time 0 as soon as the prices of these hedging instruments at this time $t$ are
given as the expected value of their discounted future cash flows under the optimal calibration probability measure determined at time 0. This optimal probability measure is very robust as it remains the pricing measure for hedging instruments between 0 and $T$. Therefore, we can introduce the same problem at any time $t$:
$$\operatorname*{ess\,inf}_{\theta \in K} R_t\big(X_T - G_T(\theta)\big) = R^m_t(X_T).$$
Time consistency and uniqueness for BSDEs are the key arguments to show that if $\theta$ is optimal for the problem at time 0, then $\theta$ is optimal for the optimization program at any time $t$.

Dynamic entropic framework: The entropic hedging problem, lying at the core of this book, has been intensively studied in the literature, but only a few papers use a BSDE framework. After the seminal paper by El Karoui and Rouge [156], different authors have used BSDEs to solve this problem under various assumptions (see in particular Sekine [250], Mania et al. [176], and more recently Hu, Imkeller, and Müller [268] or Mania and Schweizer [179]). Another approach, different from what we have mentioned above, has been used to solve the hedging problem. It involves the dual representation for the dynamic entropic risk measure as given by Theorem 3.25:
$$e_{\gamma,t}(\xi_T) = \sup_{\mu \in \mathcal{A}^q} E_{Q^\mu}\Big[-\xi_T - \gamma \int_t^T \frac{|\mu_s|^2}{2}\, ds \,\Big|\, \mathcal{F}_t\Big].$$
Therefore, the hedging problem at any time $t$ can be rewritten as
$$\operatorname*{ess\,inf}_{\theta \in K}\ \operatorname*{ess\,sup}_{\mu \in \mathcal{A}^q} \big\{E_{Q^\mu}[-X_T + G_T(\theta) \mid \mathcal{F}_t] - \gamma h(Q^\mu \mid P)\big\},$$
and it may be solved by using dynamic programming arguments.
3.9 APPENDIX: SOME RESULTS IN CONVEX ANALYSIS

We now present some key results in convex analysis that will be useful to obtain the dual representation of g-conditional risk measures. More details or proofs may be found in Aubin [5], Hiriart-Urruty and Lemaréchal [127] or Rockafellar [231]. All the notations and definitions we introduce are consistent with the notations of risk measures. They may differ from the standard framework of convex analysis (especially regarding the sign).

Even if the coefficient of the BSDE is finite, we are also interested in convex functions taking infinite values. The main motivation is the definition of its convex polar function $G$. In what follows, as in [127], we always assume that the considered functions are not identically $+\infty$ and are bounded from below by an affine function (note that this assumption is rather general and does not necessarily require that the functions are convex).

The domain of a function $g$ is defined as the nonempty set $\mathrm{Dom}(g) = \{z : g(z) < +\infty\}$. The epigraph of a convex function is the subset of $\mathbb{R}^n \times \mathbb{R}$ defined as $\mathrm{epi}\, g = \{(x, \lambda) \mid g(x) \le \lambda\}$. When the convex functions are lower semicontinuous (lsc), $\mathrm{epi}\, g$ is closed, and they are said to be closed.
3.9.1 Duality

Legendre-Fenchel Transformation

Let $g$ be a convex function. The polar function $G$ is defined on $\mathbb{R}^n$ by
$$G(\mu) = \sup_z \big(\langle \mu, -z \rangle - g(z)\big) = \sup_{z \in \mathrm{Dom}(g)} \big(\langle \mu, -z \rangle - g(z)\big). \qquad (3.44)$$
The function $G$ is a closed convex function, which can take infinite values. The conjugacy operation induces a symmetric one-to-one correspondence in the class of all closed convex functions on $\mathbb{R}^n$:
$$g(z) = \sup_\mu \big(\langle \mu, -z \rangle - G(\mu)\big), \qquad G(\mu) = \sup_z \big(\langle \mu, -z \rangle - g(z)\big).$$
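The sign convention in (3.44) differs from the usual one, so a small numerical check may help. The sketch below approximates the polar function on a grid for the scalar quadratic $g(z) = z^2/(2\gamma)$; the grid, the value of $\gamma$, and the function name `polar` are assumptions made for the example. With this convention the supremum is attained at $z = -\gamma\mu$ and gives $G(\mu) = \gamma \mu^2 / 2$, and applying the same transformation to $G$ recovers $g$.

```python
def polar(g, mu, grid):
    """Discrete polar function with the sign convention of (3.44):
    G(mu) = sup over z of (-mu * z - g(z)), approximated on a grid."""
    return max(-mu * z - g(z) for z in grid)

gamma = 2.0                                       # hypothetical risk tolerance
g = lambda z: z * z / (2.0 * gamma)               # quadratic coefficient
G_exact = lambda mu: gamma * mu * mu / 2.0        # its polar function in closed form
grid = [i / 100.0 for i in range(-2000, 2001)]    # [-20, 20] with step 0.01

for mu in (-1.0, 0.0, 0.5, 1.5):
    print(mu, polar(g, mu, grid), G_exact(mu))    # numerical vs closed form

# The conjugacy is symmetric: the same formula applied to G gives back g.
for z in (-2.0, 0.0, 1.0):
    print(z, polar(G_exact, z, grid), g(z))
```

The closed form also matches the dilation rule $G_\gamma = \gamma G$ stated earlier for γ-dilated coefficients, since $z^2/(2\gamma)$ is the γ-dilated of $z^2/2$.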
Convex set and duality

Given a nonempty subset $S \subset \mathbb{R}^n$, the indicator function (in the convex analysis terminology) of $S$, $l_S : \mathbb{R}^n \to \mathbb{R}^+ \cup \{+\infty\}$, is defined by
$$l_S(z) = 0 \text{ if } z \in S \quad \text{and} \quad +\infty \text{ if not.}$$
$l_S$ is convex (closed) iff $S$ is convex (closed), since $\mathrm{epi}\, l_S = S \times \mathbb{R}^+$. The polar function of $l_S$ is the support function of $-S$:
$$\sigma_S(z) := \sup_{s \in S} \langle s, -z \rangle = \sup_s \{\langle s, -z \rangle - l_S(s)\}.$$
The support function is a closed, convex, homogeneous function: $\sigma_S(\lambda z) = \lambda \sigma_S(z)$ for all $\lambda > 0$. Its epigraph and its domain are convex cones.
Subdifferential and Optimization

The subdifferential of the convex function $g$ in $z$, whose elements are called subgradients of $g$ at $z$, is the set $\partial g(z)$ defined as
$$\partial g(z) = \{\mu \mid g(x) \ge g(z) - \langle \mu, x - z \rangle,\ \forall x\} = \{\mu \mid \langle \mu, -z \rangle - g(z) \ge G(\mu)\}. \qquad (3.45)$$
If $z \notin \mathrm{Dom}(g)$, $\partial g(z) = \emptyset$. But if $z$ is in the interior of $\mathrm{Dom}(g)$, the subdifferential $\partial g(z)$ is nonempty (see Section E in [127] or Chapter 23 in [231]); in fact, it is enough that $z$ belongs to the relative interior of $\mathrm{Dom}(g)$, where $\mathrm{ri\,dom}(g)$ is defined in Section A in [127] and in Chapter 6 in [231]. In particular, if $g$ is finite, then $\partial g(z)$ is nonempty for any $z$. When $\partial g(z)$ is reduced to a single point, the function is said to be differentiable in $z$.

Note that when the function $g$ is the indicator function of the convex set $C$, the subdifferential of $g$ in $z \in C$ is the positive normal cone $N_C^+(z)$ to $C$ at $z$,
$$N_C^+(z) = \{s \in \mathbb{R}^n \mid \forall y \in C,\ \langle -s, y - z \rangle \le 0\}.$$
Subgradients are solutions of minimization programs such as $\inf_z \big(g(z) - \langle \mu, -z \rangle\big)$ $(= -G(\mu))$, or its dual program, $\inf_\mu \big(G(\mu) - \langle \mu, -z \rangle\big)$ $(= -g(z))$. The precise result is the following (see Section E in [127]): Let $g$ be a closed convex function
and $G$ its polar function.

• $\hat{\mu} \in \partial g(\hat{z}) \iff \hat{\mu}$ is optimal for the following minimization program, that is
$$-g(\hat{z}) = \inf_\mu \big(G(\mu) - \langle \mu, -\hat{z} \rangle\big) = G(\hat{\mu}) - \langle \hat{\mu}, -\hat{z} \rangle.$$

• $\hat{z} \in \partial G(\hat{\mu}) \iff \hat{z}$ is optimal for the following minimization program, that is
$$-G(\hat{\mu}) = \inf_z \big(g(z) - \langle \hat{\mu}, -z \rangle\big) = g(\hat{z}) - \langle \hat{\mu}, -\hat{z} \rangle.$$

In the following, when working with BSDEs, we will denote by $z\mu$ the scalar product between the line vector $z$ and the column vector $\mu$.

3.9.2 Recession Function

Recession Function

The recession function associated with a closed convex function $g$ is the homogeneous convex function defined for $z \in \mathrm{Dom}(g)$ by
$$g_{0^+}(z) := \lim_{\gamma \downarrow 0} \gamma g\big(\tfrac{1}{\gamma} z\big) = \lim_{\gamma \downarrow 0} \gamma \big(g\big(y + \tfrac{1}{\gamma} z\big) - g(y)\big).$$
This function $g_{0^+}$ is the smallest homogeneous function $h$ such that for any $z, y \in \mathrm{Dom}(g)$, $g(z) - g(y) \le h(z - y)$. When $g(z) \le c + k|z|$, $g_{0^+}(z) \le k|z|$ is a finite convex function, and the function $g$ is Lipschitz continuous with Lipschitz coefficient $k$, since $g(z) - g(y) \le k|z - y|$. This property explains why any convex coefficient of a BSDE satisfying the assumption (H2) in fact satisfies (H1).

Let $G$ be the polar function of $g$. Using obvious notation, for any $\mu \in \mathrm{Dom}(G)$,
$$\mathrm{polar}\, g_{0^+}(\mu) = \lim_{\gamma \downarrow 0} (\gamma G(\mu)) = 0.$$
So, $\mathrm{polar}\, g_{0^+} = l_{\mathrm{dom}\, G}$. By the conjugacy relationship applied to closed functions, $g_{0^+}$ is the support function of $\mathrm{Dom}(G)$; so $g_{0^+}$ is finite everywhere iff $\mathrm{Dom}(G)$ is bounded, or iff $g$ is uniformly Lipschitz, or finally iff $g$ has linear growth.

The recession function of the quadratic function $q_k(z) = c + k|z|^2$ is infinite except in $z = 0$, and its polar function is the null function. More generally, convex functions such that $g_{0^+} = l_{\{0\}}$ admit a finite polar function $G$, and this condition is sufficient.

Perspective Function

Let us consider a closed convex function $g$ such that $g(0) = 0$. The perspective function associated with $g$ is the function $\tilde{g}$ defined on $\mathbb{R}^+ \times \mathbb{R}^n$ as
$$\tilde{g}(\gamma, z) = \begin{cases} \gamma g\big(\frac{z}{\gamma}\big) & \text{if } \gamma > 0, \\ \lim_{\gamma \to 0} \gamma g\big(\frac{z}{\gamma}\big) = g_{0^+}(z) & \text{if } \gamma = 0. \end{cases}$$
Note first that the perspective function of $g$ corresponds to the γ-dilated of $g$, seen as a function of both variables $z$ and $\gamma$, when $\gamma > 0$. It is extended for $\gamma = 0$ by
the recession function $g_{0^+}$. Note that the risk tolerance coefficient is considered as a risk factor itself. $\tilde{g}$ is a positive homogeneous convex function (for more details, please refer to Part B in [127]). The dual function of $\tilde{g}$, defined on $\mathbb{R} \times \mathbb{R}^n$, is given by
$$\tilde{G}(\theta, \mu) = 0 \text{ if } G(\mu) \le -\theta, \quad \text{and} \quad +\infty \text{ otherwise.}$$
If $g(0) < \infty$, note that $G(\mu)$ is bounded.

3.9.3 Infimal Convolution of Convex Functions and Minimization Programs

Addition and inf-convolution of closed convex functions are two dual operations with respect to the conjugacy relation. Let $g^A$ and $g^B$ be two closed convex functions from $\mathbb{R}^n$ to $\mathbb{R} \cup \{+\infty\}$. By definition, the infimal convolution of $g^A$ and $g^B$ is the function $g^A \,\square\, g^B$ defined as
$$(g^A \,\square\, g^B)(z) = \inf_{y^A + y^B = z} \big(g^A(y^A) + g^B(y^B)\big) = \inf_y \big(g^A(z - y) + g^B(y)\big). \qquad (3.46)$$
If $g^A \,\square\, g^B \not\equiv \infty$, then its polar function, denoted by $G^{A \square B}$, is simply the sum of the polar functions of $g^A$ and $g^B$: $G^{A \square B}(\mu) = G^A(\mu) + G^B(\mu)$.

Inf-Convolution as a Proper Convex Function

The function $g^A \,\square\, g^B$ may take the value $-\infty$, which is contrary to the assumption made in Subsection 3.7.2. To avoid this difficulty, we assume that both functions $g^A$ and $g^B$ have a common affine minorant $\langle s, \cdot \rangle - b$. This assumption may be expressed in terms of their recession functions, both of them being also bounded from below by $\langle s, \cdot \rangle$. Therefore, $g^A_{0^+}(z) + g^B_{0^+}(-z) \ge 0$ for any $z$ and consequently $(g^A_{0^+} \,\square\, g^B_{0^+})(0) \ge 0$. Note that this condition can also be expressed in terms of the polar functions of $g^A$ and $g^B$ as $\mathrm{dom}(G^A) \cap \mathrm{dom}(G^B) \neq \emptyset$.

Existence of Exact Inf-Convolution

We are interested in the existence of a solution to the inf-convolution problem (3.46). When a solution exists, the infimal convolution is said to be exact. The previous conditions are almost sufficient, as proved in Rockafellar [231], since, if we assume
$$g^A_{0^+}(z) + g^B_{0^+}(-z) > 0, \quad \forall z \neq 0, \qquad (3.47)$$
then $g^A \,\square\, g^B$ is a closed convex function, and for any $z$, the infimum is attained by some $x^*$:
$$(g^A \,\square\, g^B)(z) = \inf_x \{g^A(z - x) + g^B(x)\} = g^A(z - x^*) + g^B(x^*).$$
The condition (3.47) is satisfied if $\mathrm{int\,dom}(G^A) \cap \mathrm{int\,dom}(G^B) \neq \emptyset$ (in fact, the true interior corresponds to the relative interior defined in Section A by Hiriart-Urruty and Lemaréchal [127]).
Examples of exact inf-convolution

We now mention different cases where the inf-convolution has a solution.

• First, when both convex functions $g^A$ and $g^B$ are dilated, then their inf-convolution is exact without having to impose any particular assumption, as we have already noticed when working with static risk measures in the first part (see Proposition 3.10). More precisely, assume that $g^A$ and $g^B$ are dilated from a given convex function $g$ such that $g^A = g_{\gamma_A}$ and $g^B = g_{\gamma_B}$; then $g^A \,\square\, g^B = g_{\gamma_A + \gamma_B}$ and for any $z$, an optimal solution $x^*$ to the inf-convolution problem is given by $x^* = \frac{\gamma_B}{\gamma_A + \gamma_B}\, z$.

• More generally, if $g^A$ is bounded from below and if $g^B$ satisfies the qualification constraint ensuring that $\inf_z g^B(z)$ is reached for some $z$ (i.e., $g^B$ has a strictly positive recession function $g^B_{0^+}$), then the condition (3.47) is satisfied and the inf-convolution $g^A \,\square\, g^B$ has a nonempty compact set of solutions.

Characterization of Optima

We are now interested in the characterization of optima in the case of exact inf-convolution. This can be done in terms of the subdifferentials of the different convex functions involved. More precisely, let us consider $z^A$ and $z^B$ respectively in $\mathrm{dom}(g^A)$ and in $\mathrm{dom}(g^B)$ and $z = z^A + z^B$ in $\mathrm{dom}(g^A \,\square\, g^B)$. Then,
$$\partial g^A(z^A) \cap \partial g^B(z^B) \subset \partial(g^A \,\square\, g^B)(z).$$
Moreover, if $\partial g^A(z^A) \cap \partial g^B(z^B) \neq \emptyset$, then the inf-convolution $g^A \,\square\, g^B$ is exact at $z = z^A + z^B$ and $\partial g^A(z^A) \cap \partial g^B(z^B) = \partial(g^A \,\square\, g^B)(z)$. (For more details, please refer to [127].)

In particular, as 0 belongs to the domain of $g^A$ and $g^B$, if $\partial g^A(0) \cap \partial g^B(0) \neq \emptyset$, then $\partial(g^A \,\square\, g^B)(0) = \partial g^A(0) \cap \partial g^B(0)$ and the inf-convolution is exact at 0. Moreover, if both functions are centered, i.e. $g^A(0) = g^B(0) = 0$, then the inf-convolution is also centered as $(g^A \,\square\, g^B)(0) = g^A(0) + g^B(0) = 0$.

Regularization by Inf-Convolution

Like the usual convolution, the infimal convolution is used in regularization procedures. The most famous regularizations are certainly, on the one hand, the Lipschitz regularization $g_{(k)}$ of $g$ using the inf-convolution with the kernel $b_k(z) = k|z|$ and, on the other hand, the differentiable regularization, also called Moreau-Yosida regularization, $g_{[k]}$ of $g$ using the inf-convolution with the kernel $q_k(z) = \frac{k}{2} |z|^2$. The two regularizations do not, however, have the same "efficiency."

Lipschitz Regularization

We first consider the inf-convolution $g_{(k)}$ of $g$ using the kernel $b_k(z) = k|z|$ or more generally using functions whose polar's domain is bounded (or equivalently with a finite recession function). The function $g_{(k)}$ is finite, convex, and nondecreasing with respect to $k$. Moreover, the inf-convolution $g_{(k)}$ is Lipschitz continuous, with Lipschitz constant $k$. More generally,
the inf-convolution of two convex functions, one of them satisfying (H1), also satisfies (H1) without any condition on the other function. If $z_0 \in \mathrm{int\,dom}\, g$, then $g_{(k)}(z_0) = g(z_0)$ for $k$ large enough. When $g = l_C$ is the indicator function of a closed convex set $C$, $g_{(k)} = k\, \mathrm{dist}(\cdot, C)$. This regularization is used in the book's chapter dedicated to BSDEs to show the existence of solutions of BSDEs with continuous coefficient.

Moreau-Yosida Regularization

We now consider the inf-convolution $g_{[k]}$ of $g$ using the kernel $q_k(z) = \frac{k}{2} |z|^2$. The function $g_{[k]}$ is finite, convex, and nondecreasing with respect to $k$. Moreover, $g_{[k]}$ is differentiable and its gradient is Lipschitz continuous with Lipschitz constant $k$. In other words, the polar function of $g_{[k]}$ is strongly convex with module $k$; equivalently, $G_{[k]}(\cdot) - \frac{k}{2} |\cdot|^2$ is still a convex function (for more details, please refer to Cohen [50]). There exists a point $J_k(z)$ that attains the minimum in the inf-convolution problem with $q_k$. The maps $z \mapsto J_k(z)$ are Lipschitz continuous with a constant 1, independent of $k$, and monotonic in the following sense:
$$(J_k(z) - J_k(y))\, (z - y)^* \ge \|J_k(z) - J_k(y)\|^2.$$
Moreover, $\nabla g_{[k]}(z) = k(z - J_k(z))$. More generally, the inf-convolution of two convex functions, one of them being strongly convex, satisfies (H3) without any condition on the other function.
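A concrete scalar case may make the Moreau-Yosida regularization easier to picture. The sketch below treats $g(z) = |z|$, a choice made for illustration only; the function names are hypothetical. For this $g$, the minimizer $J_k(z)$ of $x \mapsto |x| + \frac{k}{2}(z - x)^2$ is the soft-thresholding of $z$ at level $1/k$, and the regularized function is quadratic near the origin and equal to $|z| - \frac{1}{2k}$ away from it. The code also compares the closed form with a brute-force minimization and with the gradient formula $k(z - J_k(z))$.

```python
def J(z, k):
    """Minimizer of x -> |x| + (k / 2) * (z - x)^2: soft thresholding at 1 / k."""
    shrink = max(abs(z) - 1.0 / k, 0.0)
    return shrink if z >= 0 else -shrink

def moreau_yosida_abs(z, k):
    """Moreau-Yosida regularization of |.| with kernel (k / 2) |.|^2."""
    x = J(z, k)
    return abs(x) + 0.5 * k * (z - x) ** 2

def grad_moreau_yosida_abs(z, k):
    """Gradient k * (z - J_k(z)); here it equals k * z clipped to [-1, 1]."""
    return k * (z - J(z, k))

def brute_force(z, k, n=40001, width=5.0):
    """Same infimum computed over a grid of n points in [-width, width]."""
    xs = (-width + 2.0 * width * i / (n - 1) for i in range(n))
    return min(abs(x) + 0.5 * k * (z - x) ** 2 for x in xs)

k = 4.0
for z in (-2.0, 0.1, 2.0):
    print(z, J(z, k), moreau_yosida_abs(z, k), brute_force(z, k),
          grad_moreau_yosida_abs(z, k))
```

In this example the gradient is bounded by 1 and has Lipschitz constant $k$, and the regularized values increase toward $|z|$ as $k$ grows, consistent with the properties listed above.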
Chapter Four

From Markovian to Partially Observable Models

René Carmona
In this chapter, we compute the value functions needed for indifference pricing in models of continuous time finance of increasing generality. We first implement the indifference pricing paradigm for the exponential utility function in a Markovian setting motivated by an application to weather derivatives, which will be discussed in Chapter 7. As an added bonus to our results, we consider static hedging with liquid options in the spirit of the analysis of Chapter 5, and we connect our results to the "maximum entropy philosophy" often used as a basic principle when nothing else can help with the calibration of a risk-neutral measure. The second part of the chapter is devoted to non-Markovian models. This generalization is forced on us by our desire to consider partially observable systems. Again, motivated by applications discussed in more detail in Chapter 7, we revisit our diffusion models to include more general adapted coefficients, and, in the process, we generalize our results to other utility functions. Moreover, because mean-variance hedging is very much in the spirit of the hedging procedures exhibited in this chapter, for each of the market models considered we analyze the mean-variance hedging problem and we provide bounds for the corresponding value functions.
4.1 A FIRST DIFFUSION MODEL The introduction of the mathematical model studied in the first two sections of this chapter was motivated by the weather markets, especially the market of temperature options. A brief overview of the latter is given in Chapter 7. Some familiarity with this market helps one understand the nature of the mathematical assumptions we introduce, and the significance of the results we prove. However, the reader willing to accept the assumptions of the models will fully benefit from the presentation of this chapter, even if she has little or no interest in the weather markets. Our presentation follows closely the line of an unpublished technical report [42] written with A. Danilova. These results will be generalized later in the chapter. We chose to present them first in a restricted form for historical reasons and to emphasize the progression from Markovian models to more general adapted setups provided by the case of partially observed systems. Let us assume that an agent is considering the purchase of an option written on a nontraded asset whose value at time $t$ we denote by $Y_t$, and let $T$ be the date of
maturity of this option. In the case of a European call with strike $K$, the payoff $\xi$ of the option is $(Y_T - K)^+$ if we use the standard notation of a plus exponent for the positive part of a real number. More generally, we can assume that the payoff is of the form $\xi = f(Y_T)$ for some measurable function $f$. In this section, we assume that the function $f$ is bounded. As explained in Section 7.1 of Chapter 7, this assumption is fully justified in the case of temperature options. Finally, we assume that the agent has access to a bank account and that she is allowed to trade in an asset whose price at time $t$ is denoted by $S_t$, and that she tries to maximize the expected utility of terminal wealth. Motivated by the case of the weather markets, we assume that the nontraded index affects the dynamics of the traded asset, for example through its drift and volatility coefficients. For convenience, we shall assume that the rate of interest of the bank account is a constant that we denote by $r$.

Stochastic Dynamics We assume that $Y_t$ and $S_t$ satisfy the following stochastic differential equations (SDEs in what follows):
\[
\begin{cases}
dS_t = S_t\,[\mu(t, Y_t)\,dt + \sigma(t, Y_t)\,dW_t^1] \\
dY_t = p(t, Y_t)\,dt + q(t, Y_t)\,dW_t^2,
\end{cases}
\tag{4.1}
\]
where the two correlated Wiener processes $\{W_t^i\}_t$ satisfy $d[W^1, W^2]_t = \rho\,dt$ for some constant $-1 \le \rho \le 1$.

Main Assumption The following assumption is not minimal. It is chosen for convenience.

Assumption 4.1 The functions $\mu(t, y)$, $\sigma(t, y)$, and $q(t, y)$ are Hölder continuous, bounded, and $\sigma(t, y)$ is bounded away from zero.

As a consequence of Assumption 4.1, the quantity
\[ m(t, y) = \frac{\mu(t, y) - r}{\sigma(t, y)}, \tag{4.2} \]
which is called the traded risk premium, is bounded and Hölder continuous. We assume that the agent’s risk preferences are given by a utility function, and for the sake of convenience we use the form U (x) = − exp(−γ x) of the standard exponential utility with risk aversion parameter γ > 0. Recall Section 2.2.1 of Chapter 2. As we shall see in Chapter 7, St could be the price F (t, T ) of a forward contract of natural gas, and Yt could be the average daily temperature at a meteorological station near the city gate where the natural gas is to be delivered. We now describe the hedging portfolios. At time t = 0, our agent is setting up a portfolio containing units/shares of the tradable asset and a bank deposit. Note that this amount could be negative if the agent borrows money from the bank instead of making a deposit. She dynamically rebalances the portfolio in continuous time. If
we denote by $\tilde\pi_t$ the amount of money invested in the traded asset at time $t$, and by $\tilde X_t^{\tilde\pi}$ the corresponding wealth, then under Assumption 4.1, the wealth process $\tilde X_t^{\tilde\pi}$ satisfies the following stochastic differential equation:
\[ d\tilde X_t^{\tilde\pi} = r \tilde X_t^{\tilde\pi}\,dt + [\mu(t, Y_t) - r]\,\tilde\pi_t\,dt + \sigma(t, Y_t)\,\tilde\pi_t\,dW_t^1 . \]
We shall only consider Markovian strategies for which $\tilde\pi_t$ is of the form $\tilde\pi_t = \tilde\pi(t, \tilde X_t^{\tilde\pi}, Y_t)$ for some deterministic function $(t, x, y) \mapsto \tilde\pi(t, x, y)$. It is more convenient to rewrite the above dynamical equation in terms of forward values. More precisely, we denote by $X_t^\pi$ the $T$-forward value of the hedging portfolio, i.e., $X_t^\pi = e^{r(T-t)} \tilde X_t^{\tilde\pi}$. Then the dynamics of the wealth process can be rewritten as
\[ dX_t^\pi = [\mu(t, Y_t) - r]\,\pi_t\,dt + \sigma(t, Y_t)\,\pi_t\,dW_t^1 , \]
provided that we also set $\pi_t = e^{r(T-t)} \tilde\pi_t$. Our goal is to price a contingent claim written on the nontraded index $Y$, and whether our agent acts as a buyer or seller of such a claim, she tries to hedge away the risk associated with this derivative with a portfolio invested in the tradable asset $S$. We analyze in the next section how she can also include a static position in liquid options on the nontraded asset. So, the investor's objective is to solve the classical Merton problem, which consists here in determining the trading strategy as the argument of the following maximization problem:
\[ \max_{\pi} E\{U(X_T^\pi + \xi) \mid X_t^\pi, Y_t\} . \]
Remarks
1. One should think of the pay-off $\xi$ as being positive for the buyer of the option and negative for the seller of the option.
2. As mentioned throughout the book, similar problems have been considered by many authors. Most often, as for example in [62], it is assumed that $\mu$ and $\sigma$ are constants and $p$ and $q$ are of the form $p(Y_t, t) = pY_t$, $q(Y_t, t) = qY_t$.

4.1.1 Hedging with the Traded Asset and the Savings Account In this section, we consider the case of a European option hedged with the traded asset and the savings account. Other contingent claims are treated in [42]. As explained earlier, we assume that the option payoff is of the form $\xi = f(Y_T)$ for some bounded function $f$. In fact, as the following computations show, it is enough to assume that the payoff function $f$ satisfies
\[ f(y) \ge \kappa(1 + |y|), \qquad y \in \mathbb{R}, \tag{4.3} \]
for some $\kappa < 0$. Our goal is to compute the value function
\[ \tilde u^f(t, x, y) = \sup_{\tilde\pi} E\{U(\tilde X_T^{\tilde\pi} + \xi) \mid \tilde X_t^{\tilde\pi} = x,\ Y_t = y\} . \]
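4.1.2 Computation of the Value Function Note that $\tilde u^f(t, x, y) = u^f(t, e^{r(T-t)} x, y)$, where the function $u^f$ is defined by
\[ u^f(t, x, y) = \sup_{\tilde\pi} E\{U(X_T^\pi + f(Y_T)) \mid \tilde X_t^{\tilde\pi} = x,\ Y_t = y\} . \tag{4.4} \]

Proposition 4.1 Assuming that (4.1) holds, we have
\[ u^f(t, x, y) = -e^{-\gamma e^{r(T-t)} x}\, E^{\tilde P}\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}}, \tag{4.5} \]
where $\gamma_\rho = \gamma(1 - \rho^2)$,
\[ M_\rho(t, T) = \frac{1-\rho^2}{2} \int_t^T m^2(s, Y_s)\,ds, \qquad\text{and}\qquad m(s, y) = \frac{\mu(s, y) - r}{\sigma(s, y)}, \tag{4.6} \]
and the probability measure $\tilde P$ is defined by
\[ \frac{d\tilde P}{dP}\bigg|_{\mathcal{F}_T} = \exp\left( -\rho \int_t^T m(s, Y_s)\,dW_s^2 - \frac{1}{2}\rho^2 \int_t^T m^2(s, Y_s)\,ds \right). \tag{4.7} \]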
Proof. Since this result will be proved in greater generality later, we skip some of the technical details of the proof, and we concentrate on the major steps of the approach based on the Hamilton-Jacobi-Bellman (HJB for short) equation. The value function defined in (4.4) is expected to satisfy the equation
\[ h_t + L_p h + H(t, x, y, h, h_x, h_{xx}, h_{xy}) = 0, \tag{4.8} \]
where $L_p h$, defined by
\[ L_p h = \frac{1}{2} q^2(y, t)\,h_{yy} + p(y, t)\,h_y, \tag{4.9} \]
is the linear second-order partial differential operator governing the diffusion process of the nontradable dynamics, and
\[ H(t, x, y, h, h_x, h_{xx}, h_{xy}) = \max_{\pi} \left[ h_x (\mu(t, y) - r)\pi + \frac{1}{2}\sigma^2(t, y)\,\pi^2 h_{xx} + \rho\,\pi\,\sigma(t, y)\,q(t, y)\,h_{xy} \right] \tag{4.10} \]
gives the nonlinear part of the HJB equation. See [82] for details. For the sake of notation, we use subscripts to denote partial derivatives. The optimization problem appearing in definition (4.10) of the nonlinear term can be solved explicitly since the function in question is merely quadratic in its argument $\pi$. This gives the optimal trading strategy $\pi^*$:
\[ \pi^* = -\frac{\mu(t, y) - r}{\sigma^2(t, y)} \frac{h_x}{h_{xx}} - \rho\,\frac{q(t, y)}{\sigma(t, y)} \frac{h_{xy}}{h_{xx}}, \tag{4.11} \]
and plugging this value into the HJB equation (4.8) we get
\[ h_t + L_p h - \frac{([\mu(t, y) - r]\,h_x + \rho\,q(t, y)\,\sigma(t, y)\,h_{xy})^2}{2\sigma(t, y)^2\,h_{xx}} = 0 \tag{4.12} \]
with terminal condition $h(T, x, y) = -e^{-\gamma(x + f(y))}$. The form of this terminal condition suggests that we look for solutions of the form $h(t, x, y) = -e^{-\gamma x} F(y, t)$. In order for such a function to be a solution, $F$ will have to satisfy
\[ F_t + L_{p_\rho} F - \frac{1}{2} m(t, y)^2 F - \frac{1}{2}\rho^2 q^2(t, y) \frac{F_y^2}{F} = 0 \]
with terminal condition $F(T, y) = e^{-\gamma f(y)}$, where we used the notation $p_\rho$ for the modified drift $p_\rho(t, y) = p(t, y) - \rho\,q(t, y)\,m(t, y)$. This transformed equation is still nonlinear, but the nonlinearity is more manageable, for a simple transformation of the Hopf-Cole type can linearize it. Indeed, setting $F(t, y) = v(t, y)^\delta$ with $\delta = 1/(1 - \rho^2)$, the function $v$ has to satisfy
\[ v_t + L_{p_\rho} v - \frac{1-\rho^2}{2} m(t, y)^2 v = 0 \tag{4.13} \]
with terminal condition $v(y, T) = e^{-\gamma(1-\rho^2) f(y)}$. This equation is a backward linear parabolic equation, and the solution admits a Feynman-Kac representation. If $\{\tilde Y_t\}_t$ is a diffusion process with generator $L_{p_\rho}$, then this representation reads
\[ v(t, y) = \tilde E_y\left\{ v(T, \tilde Y_T)\, e^{-\frac{1-\rho^2}{2} \int_t^T m^2(s, \tilde Y_s)\,ds} \right\}. \tag{4.14} \]
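As a quick check of the choice of the exponent $\delta$ (a sketch obtained by direct substitution in the equation for $F$ stated above, with the nonlinear terms grouped for convenience):

```latex
\begin{align*}
F = v^{\delta}: \quad
  & F_t = \delta v^{\delta-1} v_t, \qquad
    F_y = \delta v^{\delta-1} v_y, \qquad
    F_{yy} = \delta v^{\delta-1} v_{yy} + \delta(\delta-1) v^{\delta-2} v_y^2, \\
  & \tfrac12 q^2 F_{yy} - \tfrac12 \rho^2 q^2 \frac{F_y^2}{F}
    = \tfrac12 q^2 \delta v^{\delta-1} v_{yy}
      + \tfrac12 q^2 \delta \bigl[(\delta - 1) - \rho^2 \delta\bigr] v^{\delta-2} v_y^2 .
\end{align*}
% The quadratic term in v_y vanishes exactly when (1-\rho^2)\delta = 1,
% i.e. \delta = 1/(1-\rho^2). Dividing the remaining equation by
% \delta v^{\delta-1} then gives
\[
  v_t + L_{p_\rho} v - \frac{1}{2\delta}\, m^2 v
  = v_t + L_{p_\rho} v - \frac{1-\rho^2}{2}\, m^2 v = 0 ,
\]
% which is (4.13); the terminal condition F(T,y) = e^{-\gamma f(y)}
% becomes v(T,y) = e^{-\gamma (1-\rho^2) f(y)}.
```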
Because of the special form of the generator $L_{p_\rho}$, a convenient way to realize the diffusion process is to change the probability measure with a Girsanov transformation and still use the same process $\{Y_t\}_t$. Indeed, since $m$ is bounded, Novikov's condition is satisfied and formula (4.7) defines a probability measure $\tilde P$ under which the process $\{\tilde W_t\}_t$, defined by
\[ \tilde W_s = W_s^2 + \rho \int_t^s m(u, Y_u)\,du, \]
is a Brownian motion, and the process $\{Y_t\}_t$ satisfies the stochastic differential equation
\[ dY_t = [p(t, Y_t) - \rho\,m(t, Y_t)\,q(t, Y_t)]\,dt + q(t, Y_t)\,d\tilde W_t, \]
which is characteristic of the diffusion generated by $L_{p_\rho}$. □
Lemma 4.2 If we assume that Assumption 4.1 is satisfied and that $f$ satisfies condition (4.3) for some constant $\kappa < 0$, then the solution of (4.13) is given by
\[ v(t, y) = \tilde E\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \mid Y_t = y \}. \]
Note that condition (4.3) is an assumption on the derivative to be priced, not on the model. Moreover, this assumption is always satisfied in the case of temperature options since the payoffs of all of the commonly traded options are bounded. See, for example, [41].
Proof. After the change of variables $z = e^{c(t)} y$ the partial differential equation (4.13) becomes
\[ \xi_t + \alpha_1(z, t)\,\xi_z + \frac{\alpha_2^2(z, t)}{2}\,\xi_{zz} + \alpha_0(z, t)\,\xi = 0, \tag{4.15} \]
where
\[
\begin{aligned}
\alpha_1(z, t) &= c'(t) z + e^{c(t)} \bigl( p(e^{-c(t)} z, t) - \rho\,q(e^{-c(t)} z, t)\,m(e^{-c(t)} z, t) \bigr), \\
\alpha_2(z, t) &= e^{c(t)} q(e^{-c(t)} z, t), \qquad \alpha_0(z, t) = -\frac{1}{2} m^2(e^{-c(t)} z, t)(1 - \rho^2), \\
\xi(z, t) &= v(e^{-c(t)} z, t),
\end{aligned}
\tag{4.16}
\]
with terminal condition $\xi(z, T) = \exp(-\gamma_\rho f(z e^{-c(T)}))$. Since $\alpha_0$, $\alpha_1$ and $\alpha_2$ are bounded and Hölder continuous, Assumption 4.1 guarantees that the partial differential equation (4.15) has a unique classical solution $\xi \in C^{1,2}$ with growth condition $\xi(t, z) \le K_1 e^{a z^2}$. Hence we can use the Feynman-Kac representation (4.14) to conclude. □

Lemma 4.3 (Verification Theorem) If Assumption 4.1 is satisfied, $\mu(y, t)$, $\sigma(y, t)$, $p(y, t)$, $q(y, t)$ are Lipschitz continuous, $p_y(y, t)$, $m_y(y, t)$ and $q_y(y, t)$ are bounded and Lipschitz, $f(y) \ge \kappa(1 + |y|)$ for some $\kappa < 0$ and $|f(y) - f(x)| \le K|x - y|$ for some $K > 0$, then the value function $u$ is a solution of (4.12), and
\[ \pi^* = -\frac{\mu(y, t) - r}{\sigma^2(y, t)} \frac{u_x}{u_{xx}} - \rho\,\frac{q(y, t)}{\sigma(y, t)} \frac{u_{xy}}{u_{xx}} \]
gives an optimal Markov strategy.

Proof. The verification theorem for Markov control of [82] implies that, in order to prove this lemma, we need only to prove that the optimal strategy $\pi^*$ given by (4.11) is locally Lipschitz and satisfies a sublinear growth condition. Representing $\pi^*$ in terms of $v(t, y)$ gives
\[ \pi^* = \frac{m(y, t)}{\gamma\,\sigma(y, t)} + \frac{\rho}{\gamma(1 - \rho^2)} \frac{q(y, t)}{\sigma(y, t)} \frac{v_y}{v}. \]
By Lemma 4.2 and Jensen's inequality and due to the fact that $\tilde E_t\{f(Y_T)\}$ exists, we have
\[ v(t, y) \ge \exp[-\gamma_\rho \tilde E_t\{f(Y_T)\} - \tilde E_t\{M_\rho(t, T)\}] \ge L_1 \exp[-\tilde E_t\{M_\rho(t, T)\}], \]
where $L_1 > 0$. Since $M_\rho(t, T) = \frac{1-\rho^2}{2} \int_t^T m^2(Y_s, s)\,ds$ and since $m(y, t)$ is bounded, $v(t, y)$ is necessarily bounded away from zero. Lemma 4.2 implies that $v \in C^{1,2}$, and since all coefficients are Lipschitz continuous and $\sigma(y, t)$ is bounded away from zero, it follows that $\pi^*$ is locally Lipschitz. In order to check for sublinear growth, it is enough to prove that $v_y / v$ has sublinear growth since all coefficients are bounded and $\sigma(y, t)$ is bounded away from zero. For later convenience, we rewrite the ratio $v_y / v$ in the form $e^{c(t)} \xi_z / \xi$ (recall the notation of Lemma 4.2).
The assumptions on $m$, $q$, $p$ and $\sigma$ imply that $\alpha_i$, $i = 0, 1, 2$, are bounded and $\partial\alpha_i / \partial z$, $i = 0, 1, 2$, are bounded and Lipschitz. Because of the assumption made on $f$, Lemma 4.2 implies
\[ \frac{|\xi_z|}{\xi} \le C_2\, \frac{\tilde E_t\{\exp[-2\gamma_\rho f(Z_T e^{-c(T)})]\}^{\frac{1}{2}}}{\tilde E_t\{\exp[-\gamma_\rho f(Z_T e^{-c(T)})]\}}. \tag{4.17} \]
Since $f$ is globally Lipschitz, we have
\[ \frac{|\xi_z|}{\xi} \le C_2\, \frac{\tilde E_t\{\exp[2K e^{-c(T)} \gamma_\rho |Z_T - z|]\}^{\frac{1}{2}}}{\tilde E_t\{\exp[-K e^{-c(T)} \gamma_\rho |Z_T - z|]\}}, \]
and we can conclude that $\xi_z / \xi$ is bounded by Lemma 4.2. The verification theorem for Markov controls [82] can then be used to conclude the proof. □

From Lemma 4.2 and Lemma 4.3 we obtain
\[ u^f(t, x, y) = -e^{-\gamma e^{r(T-t)} x}\, \tilde E\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}}. \]
Using this expression, we can compute the seller's and buyer's indifference prices. Since their computations are very similar, we restrict ourselves to the computation of the seller's price. This indifference price is the amount $p$ that compensates for facing the liability $\xi = f(Y_T)$ at time $T$, i.e., it is the number $p$ satisfying $u^f(t, x, y) = u^0(t, x + p, y)$. With this definition we have
\[ e^{-\gamma e^{r(T-t)} p} = \frac{\tilde E\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}}}{\tilde E\{ e^{-M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}}}. \tag{4.18} \]
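Formula (4.18) lends itself to a plain Monte Carlo estimate. The following is a minimal sketch, not the method of [42] or [143]: it assumes an Euler discretization of $Y$ under the measure $\tilde P$ (drift $p - \rho m q$), and the function name `indifference_price` together with all of its parameters are illustrative choices. The same simulated paths are used in the numerator and the denominator.

```python
import math
import random


def indifference_price(f, m, p, q, y0, t, T, r, gamma, rho,
                       n_paths=2000, n_steps=50, seed=0):
    """Monte Carlo estimate of the number p defined by (4.18).

    f, m, p, q are user-supplied functions (hypothetical inputs):
    f(y) is the payoff, m(s, y) the traded risk premium, and p(s, y),
    q(s, y) the drift and volatility of the nontraded index Y.
    Requires |rho| < 1.
    """
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    gamma_rho = gamma * (1.0 - rho ** 2)
    num = 0.0  # accumulates exp(-gamma_rho * f(Y_T) - M_rho(t, T))
    den = 0.0  # accumulates exp(-M_rho(t, T))
    for _ in range(n_paths):
        y, s, integral_m2 = y0, t, 0.0
        for _ in range(n_steps):
            ms = m(s, y)
            integral_m2 += ms * ms * dt
            # Euler step under the indifference measure: drift p - rho*m*q
            y += ((p(s, y) - rho * ms * q(s, y)) * dt
                  + q(s, y) * math.sqrt(dt) * rng.gauss(0.0, 1.0))
            s += dt
        m_rho = 0.5 * (1.0 - rho ** 2) * integral_m2
        num += math.exp(-gamma_rho * f(y) - m_rho)
        den += math.exp(-m_rho)
    # (4.18): exp(-gamma * e^{r(T-t)} * p) = (num / den)^{1 / (1 - rho^2)}
    log_ratio = math.log(num / den) / (1.0 - rho ** 2)
    return -log_ratio / (gamma * math.exp(r * (T - t)))
```

A simple sanity check of the sketch: for a constant payoff $f \equiv c$ the ratio in (4.18) reduces to $e^{-\gamma c}$, so the estimate returns the discounted constant $c\,e^{-r(T-t)}$ whatever the simulated paths are.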
Note that this price does not depend upon the value of the initial wealth. This property is specific to the choice of exponential utility. Some view this feature as desirable, others dislike the fact that the price does not depend upon the current value of the book when the contract is entered. Note also that $u_{xx} < 0$, justifying the solution (4.11) of the optimization problem. Numerical Computations: The numerics of indifference price computations are very involved when closed form formulas do not exist for the value functions. Indeed, solving numerically continuous time stochastic control problems is usually very difficult. See nevertheless [143]. So the existence of formulas such as (4.18) is of great importance as they are amenable to numerical computations, for example by Monte Carlo particle methods. The probability measure $\tilde P$ is called the indifference measure. It provides a vehicle to assess the value of a risky portfolio (risky in the sense that it cannot be hedged perfectly). The indifference measure has an interesting property: as noticed by Musiela and Zariphopoulou it can be viewed as the projection of an equivalent martingale measure. A typical equivalent martingale measure has null market price of risk on
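the stochastic volatility of the traded asset:
\[ \frac{dP^*}{dP}\bigg|_{\mathcal{F}_T} = \exp\left( -\int_0^T m(Y_s, s)\,dW_s^1 - \frac{1}{2} \int_0^T m^2(Y_s, s)\,ds \right). \]
To verify this, first we factorize the change of measure. Writing $W^1 = \rho W^2 + \sqrt{1-\rho^2}\,W^0$ with $W^0$ a Wiener process independent of $W^2$, we have
\[ E\left\{ \frac{dP^*}{dP} \,\bigg|\, \mathcal{F}_T^{W^2} \right\} = E\left\{ \exp\left( -\rho M^{W^2} - \sqrt{1-\rho^2}\,M^{W^0} - \frac{\rho^2}{2} M - \frac{1-\rho^2}{2} M \right) \bigg|\, \mathcal{F}_T^{W^2} \right\}, \]
where $\mathcal{F}_T^{W^2}$ is the completion of the natural filtration of $W^2$ up to time $T$,
\[ M^{W^i} = \int_0^T m(Y_s, s)\,dW_s^i \qquad\text{and}\qquad M = \int_0^T m^2(Y_s, s)\,ds. \]
Note that $\exp(-\rho M^{W^2} - \frac{\rho^2}{2} M)$ and $m(Y_s, s)$, $s \le T$, are measurable with respect to $\mathcal{F}_T^{W^2}$ and that $W^0$ and $W^2$ are independent, so that the remaining factor $\exp(-\sqrt{1-\rho^2}\,M^{W^0} - \frac{1-\rho^2}{2} M)$ has conditional expectation one. Therefore we have the following projection:
\[ E\left\{ \frac{dP^*}{dP} \,\bigg|\, \mathcal{F}_T^{W^2} \right\} = \frac{d\tilde P}{dP}. \]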
4.2 STATIC HEDGING WITH LIQUID OPTIONS We consider the same problem as before, namely dynamic hedging and valuation by indifference pricing of an option with a European payoff. However, when entering the contract, the agent is now allowed to set up a static position in liquid options as part of the hedging strategy. We assume that European options with maturity dates $T_1, T_2, \ldots, T_n$ and payoffs $f_1(Y_{T_1}), f_2(Y_{T_2}), \ldots, f_n(Y_{T_n})$ are available for trading at time $t = 0$ at prices $c_1, c_2, \ldots, c_n$. We assume that these options are liquid in the sense that our agent can buy and sell these options in any quantity at these prices. The position is called static because contracts entered at time $t = 0$ in these options are kept until they mature. Our objective is, as before, to determine the value function
\[ u^f(t, x, y) = \sup_{\pi,\ \alpha = \{\alpha_i\}_{i=1,\ldots,n}} E\{ U(X_T^{\pi,\alpha} + f(Y_T)) \mid X_t^{\pi,\alpha} = x,\ Y_t = y \}, \tag{4.19} \]
with the important difference that the wealth process $X_t^{\pi,\alpha}$ now starts from
\[ X_0^{\pi,\alpha} = e^{rT} \left( x - \sum_{i=1}^n \alpha_i c_i \right), \]
jumps each time one of the liquid options matures, i.e.,
\[ X_{T_i}^{\pi,\alpha} = X_{T_i+}^{\pi,\alpha} + \alpha_i f_i(Y_{T_i})\,e^{r(T - T_i)}, \]
while it still satisfies the same stochastic differential equation,
\[ dX_t^{\pi,\alpha} = (\mu(t, Y_t) - r)\,\pi_t\,dt + \sigma(t, Y_t)\,\pi_t\,dW_t^1, \tag{4.20} \]
in between two successive maturity times. Note that, as before, we work with forward prices. The reason for this formulation of the problem is twofold. First, it can be proved that the indifference price of liquid options $f_i(Y_{T_i})$ is exactly $c_i$. Second, the value function is concave in $\alpha$, and if we knew that the supremum is in fact a maximum, i.e., that it is attained for a value $\alpha^*$ of $\alpha$, we could use numerical procedures to locate this optimum. Hence the major obstacle remaining is to find out whether or not the supremum is attained. This problem is studied below. We first consider the case where $\alpha = \{\alpha_i\}_{i=1,\ldots,n}$ is fixed, and we are adding one more derivative to an already existing portfolio. The corresponding value function is
\[ \tilde u^{f,\alpha}(t, x, y) = u^{f,\alpha}(t, e^{r(T-t)} x, y) = \sup_{\tilde\pi} E\{ -e^{-\gamma(X_T^{\pi,\alpha} + f(Y_T))} \mid X_t^{\pi,\alpha} = e^{r(T-t)} x,\ Y_t = y \}. \]
For the sake of definiteness, we suppose $\cdots < T_n < T_{n-1} < \cdots < T_1 < T$.

Lemma 4.4 The $\alpha$-fixed value function is given by
\[ u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^n \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}} \]
for $t \in (T_{n+1}, T_n]$ where $\tilde\alpha_i = \alpha_i e^{r(T - T_i)}$ and where $M_\rho$ is defined by (4.6) as before.

Proof. We proceed by induction. Let us first assume that $t \in (T_1, T]$. There is no option maturity $T_i$ between $T_1$ and $T$, and since the dynamics of the forward wealth are given by
\[ dX_t^\pi = (\mu(t, Y_t) - r)\,\pi_t\,dt + \sigma(t, Y_t)\,\pi_t\,dW_t^1, \]
the problem coincides with the problem we solved in the previous section. Hence, the value function is given by
\[ u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E\{ e^{-\gamma_\rho [f(Y_T)] - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}}, \]
and the claim of the lemma holds true. We now assume that the formula is true for $t \in (T_n, T_{n-1}]$, so that
\[ \tilde u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n-1} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(t, T)} \mid Y_t = y \}^{\frac{1}{1-\rho^2}} \]
for all $t \in (T_n, T_{n-1}]$. Next we fix $t \in (T_{n+1}, T_n]$. Using our induction hypothesis, since $u^{f,\alpha}(t, x, y)$ is continuous in $t \in (T_n, T_{n-1}]$, we have
\[ u^{f,\alpha}(T_n+, x, y) = \lim_{t \downarrow T_n} u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E_{T_n+}\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n-1} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(T_n+, T)} \}^{\frac{1}{1-\rho^2}}. \]
The jump condition (4.20) implies that
\[ u^{f,\alpha}(T_n, x, y) = -e^{-\gamma x}\, \tilde E_{T_n}\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(T_n, T)} \}^{\frac{1}{1-\rho^2}}, \]
and if we use momentarily the notation
\[ e^{-\gamma g(Y_{T_n}, n)} = \tilde E_{T_n}\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(T_n, T)} \}^{\frac{1}{1-\rho^2}}, \]
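we see that
\[ \tilde u^{f,\alpha}(t, x, y) = \max_{\pi} E_t\{ -e^{-\gamma(X_{T_n}^{\pi,\alpha} + g(Y_{T_n}, n))} \}. \]
For $t \in (T_{n+1}, T_n)$ we have
\[ dX_t^\pi = (\mu(t, Y_t) - r)\,\pi_t\,dt + \sigma(t, Y_t)\,\pi_t\,dW_t^1, \]
and hence we can use the induction hypothesis to obtain
\[ u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E_t\{ e^{-\gamma_\rho [g(Y_{T_n}, n)] - M_\rho(t, T_n)} \}^{\frac{1}{1-\rho^2}}. \]
Since
\[ e^{-\gamma_\rho g(Y_{T_n}, n)} = \tilde E_{T_n}\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(T_n, T)} \} \]
and $M_\rho(t, T_n) + M_\rho(T_n, T) = M_\rho(t, T)$, we can use the tower property to conclude that
\[ u^{f,\alpha}(t, x, y) = -e^{-\gamma x}\, \tilde E\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^{n} \tilde\alpha_i f_i(Y_{T_i})] - M_\rho(t, T)} \}^{\frac{1}{1-\rho^2}}, \]
concluding the proof. □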
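Hence, if at time $t$ we want to buy an option $f(Y_T)$ and a portfolio $\sum_{i=1}^n \tilde\alpha_i f_i(Y_{T_i})$ of liquid options, and invest what is left of the initial capital $x$, i.e., $X_t^{\pi,\alpha} = x - \sum_{i=1}^n \alpha_i c_i e^{r(T-t)}$, through continuous trading, we would have
\[ \tilde u^{f,\alpha}(t, x, y) = -e^{-\gamma e^{r(T-t)} x}\, \tilde E_t\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)] - M_\rho(t, T)} \}^{\frac{1}{1-\rho^2}}, \]
with $\beta_i = c_i e^{r(T_i - t)}$. Since $\tilde u^f(t, x, y)$ is the supremum of these $\tilde u^{f,\alpha}(t, x, y)$, we have
\[ \tilde u^f(t, x, y)\,e^{\gamma e^{r(T-t)} x} = \max_{\{\tilde\alpha_i\}_{i=1}^n} \left( -\tilde E_t\{ e^{-\gamma_\rho [f(Y_T) + \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)] - M_\rho(t, T)} \}^{\frac{1}{1-\rho^2}} \right). \]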
Then, by definition of the indifference price, one sees that the indifference price of the liquid options is given by their market prices, as we stated at the beginning of this subsection. Moreover, if we introduce the measure $P^*$ by its Radon-Nikodym derivative
\[ \frac{dP^*}{d\tilde P} = \frac{e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)}}{\tilde E_t\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \}} \]
with respect to $\tilde P$, we have:
\[ -\frac{\bigl( -\tilde u^f(t, x, y)\,e^{\gamma e^{r(T-t)} x} \bigr)^{1-\rho^2}}{\tilde E_t\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \}} = \max_{\{\tilde\alpha_i\}_{i=1}^n} E_t^*\{ -e^{-\gamma_\rho \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)} \}, \]
and, using the notation
\[ u(\tilde\alpha_1, \ldots, \tilde\alpha_n, \cdot) = -e^{-\gamma_\rho \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)}, \]
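duality gives
\[ u(\tilde\alpha_1, \ldots, \tilde\alpha_n, \cdot) - Z \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i) \le v(Z, \cdot) \tag{4.21} \]
for any $Z$, $\{\tilde\alpha_i\}$, where $v$ is dual to $u$, i.e.,
\[ v(Z, \cdot) = \frac{Z}{\gamma_\rho} \left( \ln\frac{Z}{\gamma_\rho} - 1 \right), \]
and (4.21) is in fact an equality if and only if $Z$ and $\{\tilde\alpha_i\}$ satisfy
\[ e^{-\gamma_\rho \sum_{i=1}^n \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)} = \frac{Z}{\gamma_\rho}. \]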
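The closed form of the dual function $v$ can be checked numerically. The sketch below is an illustration only: it treats the net payoff $a = \sum_i \tilde\alpha_i (f_i(Y_{T_i}) - \beta_i)$ as a single real variable, and the function names and the grid bounds are assumptions made for this example. It compares the closed form with a brute-force maximization of $-e^{-\gamma_\rho a} - Z a$ over a grid of values of $a$.

```python
import math


def u(a, gamma_rho):
    """Exponential utility -exp(-gamma_rho * a) of the net option payoff a."""
    return -math.exp(-gamma_rho * a)


def v_closed_form(z, gamma_rho):
    """Dual function v(Z) = (Z / gamma_rho) * (ln(Z / gamma_rho) - 1), Z > 0."""
    w = z / gamma_rho
    return w * (math.log(w) - 1.0)


def v_numeric(z, gamma_rho, lo=-10.0, hi=10.0, n=200001):
    """Brute-force supremum over a grid of a of u(a) - z * a."""
    step = (hi - lo) / (n - 1)
    return max(u(lo + k * step, gamma_rho) - z * (lo + k * step)
               for k in range(n))


def maximizer(z, gamma_rho):
    """Value of a at which (4.21) is an equality: exp(-gamma_rho * a) = z / gamma_rho."""
    return -math.log(z / gamma_rho) / gamma_rho
```

The grid supremum can never exceed the closed form, and it approaches it as the grid is refined, provided the maximizer $a = -\ln(Z/\gamma_\rho)/\gamma_\rho$ lies inside the grid.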
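In order to find an upper bound for $\tilde u^f(t_0, x, y)$, we assume that $Z$ is a nonnegative random variable such that $E_t^*\{Z(f_i(Y_{T_i}) - \beta_i)\} = 0$. Taking expectation of both sides in (4.21) we get $E_t^*\{u(\tilde\alpha_1, \ldots, \tilde\alpha_n, \cdot)\} \le E_t^*\{v(Z, \cdot)\}$, and since the left-hand side doesn't depend upon $Z$ and the right-hand side doesn't depend upon $\{\tilde\alpha_i\}$, we have
\[ \tilde u^f(t, x, y)\,e^{\gamma e^{r(T-t)} x} \le -\bigl( -V(t, y)\,\tilde E_t\{ e^{-\gamma_\rho f(Y_T) - M_\rho(t, T)} \} \bigr)^{\frac{1}{1-\rho^2}}, \tag{4.22} \]
where
\[ V(t, y) = \min_Z E_t^*\left\{ \frac{Z}{\gamma_\rho} \left( \ln\frac{Z}{\gamma_\rho} - 1 \right) \right\} \]
under the equality constraints $E^*\{Z(f_i(Y_{T_i}) - \beta_i) \mid Y_t = y\} = 0$ for all $i$. Choosing
\[ Z = z\,\frac{dQ}{dP^*} = z \exp\left( \int_t^T h_s\,d\tilde W_s - \int_t^T h_s\,ds \right), \]
we see that the dual problem is to find
\[ V(t, y) = \min_{z, Q} \left[ \frac{z}{\gamma_\rho} \left( \ln\frac{z}{\gamma_\rho} - 1 \right) + \frac{z}{\gamma_\rho}\,E_t^*\left\{ \frac{dQ}{dP^*} \ln\frac{dQ}{dP^*} \right\} \right] \]
under the constraints $E_t^*\left\{ \frac{dQ}{dP^*} (f_i(Y_{T_i}) - \beta_i) \right\} = 0$ for all $i$.

Lemma 4.5 The minimization problem
\[ M(t, y) = \arg\min_Q E^*\left\{ \frac{dQ}{dP^*} \ln\frac{dQ}{dP^*} \,\bigg|\, Y_t = y \right\} \]
under the constraints
\[ E^*\left\{ \frac{dQ}{dP^*} (f_i(Y_{T_i}) - \beta_i) \,\bigg|\, Y_t = y \right\} = 0 \]
has a solution if the problem is feasible (i.e., there exists a $Q$ such that the constraints are satisfied and $E^*\{ \frac{dQ}{dP^*} \ln(\frac{dQ}{dP^*}) \mid Y_t = y \}$ is finite) and $f_i(\cdot)$ are bounded.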
Proof. Assume that $h$ is a density under $Q$ and $h^*$ is a density under $P^*$; then we can restate the problem in a functional form: find $h$ that minimizes $I(h) = \int h \ln(h / h^*)\,dP$ under the constraints $\int h\,dP = 1$ and $\int h f_i\,dP = \beta_i$ for all $i$. If we define a sequence $\{h_k\}_k$ satisfying the constraints and such that $I(h_k)$ decreases toward $\xi = \inf_h I(h)$ with $I(h_k) \le \xi_k$ for all $k$, then the sequence $\{h_k / h^*\}_k$ has a weakly convergent subsequence. Indeed, its elements are bounded by 1 in $L^1(h^* dX)$, and the function $\psi(t) = t \ln(t)$ satisfies $\psi(t)/t \to +\infty$ as $t \to +\infty$ and $\int_{-\infty}^{\infty} \psi(\frac{h_k}{h^*})\,h^*\,dX \le \xi_0$ for all $k$. Hence, by the characterization of weakly compact subsets of $L^1$, we conclude that $\{h_k\}_k$ admits a subsequence which is weakly convergent in $L^1(dX)$. Since $I(h)$ is weakly lower semicontinuous (by the Donsker-Varadhan lemma), we have that if $h_k \rightharpoonup h_\infty$, then $I(h_\infty) \le \liminf_k I(h_k) = \xi$. Also, by definition of weak convergence and since the $f_i$ are bounded, if $h_k \rightharpoonup h_\infty$ and each $h_k$ satisfies the constraints, then $h_\infty$ satisfies the constraints as well. Hence $Q^*$ corresponding to $h_\infty$ is the solution to the initial problem and
\[ M(t, y) = E^*\left\{ \frac{dQ^*}{dP^*} \ln\frac{dQ^*}{dP^*} \,\bigg|\, Y_t = y \right\}. \]
□

Lemma 4.6 If the optimization problem
\[ \inf_{\int_{-\infty}^{\infty} h\,dx = 1,\ \int_{-\infty}^{\infty} h f_i\,dx = \beta_i,\ i = 1, \cdots, n} I(h), \]
where we set $I(h) = \int_{-\infty}^{\infty} h \ln(\frac{h}{h^*})\,dx$, has a solution, then this solution can be written in the form
\[ h_\infty = \frac{h^*\,e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)}}{E^*\{ e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \mid Y_t = y \}} \]
for some set of $\lambda_i$'s.

Proof. Since the optimization problem has a solution, it must be a solution of the following unconstrained optimization problem:
\[ \inf_{\lambda} \sup_{h} \left[ -\int_{-\infty}^{+\infty} h \ln\frac{h}{h^*}\,dx + \sum_{i=1}^n \lambda_i \int_{-\infty}^{+\infty} h (f_i - \beta_i)\,dx + \lambda \left( \int_{-\infty}^{+\infty} h\,dx - 1 \right) \right]. \]
Now, if we take the variational derivative with respect to $h$ and set it to 0 to obtain conditions on a stationary point, then we have
\[ -\int_{-\infty}^{+\infty} \left( \ln\frac{h}{h^*} + 1 \right) \delta h\,dx + \sum_{i=1}^n \lambda_i \int_{-\infty}^{+\infty} (f_i - \beta_i)\,\delta h\,dx + \lambda \int_{-\infty}^{+\infty} \delta h\,dx = 0 \]
for any $\delta h$. Hence, using the fact that $\int_{-\infty}^{\infty} h\,dx = 1$, we have
\[ h_\infty = \frac{h^*\,e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)}}{E^*\{ e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \mid Y_t = y \}} \]
for some set of $\lambda_i$'s. □

Consequently, if the dual optimization problem is feasible and the $f_i$ are bounded, then by Lemma 4.5 it has a solution, and by Lemma 4.6 its solution is of the postulated form. Since for any optimal set of $\lambda_i$'s we have
\[ E_t^*\{ \lambda_j (f_j(Y_{T_j}) - \beta_j)\,e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \} = 0 \]
for $j = 1, \ldots, n$ because of the form of the constraints, the dual problem becomes
\[ V(t, y) = \min_z \left[ \frac{z}{\gamma_\rho} \left( \ln\frac{z}{\gamma_\rho} - 1 \right) - \frac{z}{\gamma_\rho} \ln E_t^*\{ e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \} \right], \]
and since the optimal $z^*$ is
\[ z^* = \gamma_\rho\,E_t^*\{ e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \}, \]
the dual problem has a solution $V(t, y) = E_t^*\{ -e^{\sum_{i=1}^n \lambda_i (f_i(Y_{T_i}) - \beta_i)} \}$ and equality in (4.22) is reached if and only if $\alpha_i = -\frac{\lambda_i}{\gamma_\rho}$, and hence the primal problem has a solution. Therefore, if the dual problem is feasible and the $f_i$ are bounded, then there exists a set of $\alpha$ on which the primal problem reaches its maximum.
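The exponential form of Lemma 4.6 is easy to illustrate on a finite state space. The following sketch is a toy example under stated assumptions: a single liquid option ($n = 1$), a reference measure given by a list of probabilities `h_star`, and a Newton iteration on the convex function $\lambda \mapsto \ln \sum_j h^*_j e^{\lambda (f_j - \beta)}$, whose stationary point is the multiplier. The function name `tilt` and the numerical tolerances are illustrative, not part of the text.

```python
import math


def tilt(h_star, f, beta, tol=1e-12, max_iter=100):
    """Minimal-relative-entropy probabilities h subject to sum(h * f) = beta.

    h_star: reference probabilities; f: payoff in each state; beta: target
    (forward) price. Returns (lam, h) with h proportional to
    h_star * exp(lam * (f - beta)), the form given in Lemma 4.6.
    Assumes min(f) < beta < max(f), so that the problem is feasible.
    """
    lam = 0.0
    for _ in range(max_iter):
        w = [hs * math.exp(lam * (fi - beta)) for hs, fi in zip(h_star, f)]
        norm = sum(w)
        # First and second derivatives of lam -> log(sum(w))
        mean = sum(wi * (fi - beta) for wi, fi in zip(w, f)) / norm
        var = sum(wi * (fi - beta) ** 2 for wi, fi in zip(w, f)) / norm - mean ** 2
        if abs(mean) < tol:
            break
        lam -= mean / var  # Newton step toward the zero of the first derivative
    w = [hs * math.exp(lam * (fi - beta)) for hs, fi in zip(h_star, f)]
    norm = sum(w)
    return lam, [wi / norm for wi in w]


def relative_entropy(h, h_star):
    """I(h) = sum h * ln(h / h_star) on a finite state space."""
    return sum(hi * math.log(hi / hs) for hi, hs in zip(h, h_star) if hi > 0.0)
```

With the multiplier in hand, the relation stated above gives the candidate static position as $-\lambda/\gamma_\rho$. A plain Newton step is used here for brevity; a damped step would be a safer choice for payoffs with a wide range.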
4.3 NON-MARKOVIAN MODELS WITH FULL OBSERVATION We now tackle the problem of the analysis of indifference prices in the case of non-Markovian models. As explained in the abstract at the beginning of the chapter, our motivation comes from the analysis of partially observed systems, which we discuss in the following section. The generalizations presented here were obtained (more or less simultaneously) independently by M. Tehranchi in [258] and Brendle and Carmona in [34]. We follow this last presentation because it is better suited to the generalization to partially observed models, which we consider next. As before, we consider a pair of stochastic processes $\{S_t\}_{t \ge 0}$ and $\{Y_t\}_{t \ge 0}$ for the values of a traded asset and a nontraded index, respectively. As before, we assume that the dynamics of the nontraded index are of the form
\[ dY_t = p(t)\,dt + q(t)\,dW_t, \tag{4.23} \]
but the main difference is that we now allow the coefficients to depend upon the past. In other words, we merely assume that the processes $\{p(t)\}_{t \ge 0}$ and $\{q(t)\}_{t \ge 0}$ are adapted to the filtration $\mathbb{F}^Y = \{\mathcal{F}_t^Y\}_{t \ge 0}$ generated by the process $\{Y_t\}_t$. Similarly, we assume that the dynamics of $\{S_t\}_t$ are of the form
\[ \frac{dS_t}{S_t} = \mu(t)\,dt + \sigma(t)\,dW_t^S, \tag{4.24} \]
which we will sometimes use in the form
\[ \frac{dS_t}{S_t} = \sigma(t)\,[\lambda(t)\,dt + dW_t^S], \tag{4.25} \]
if we set $\lambda(t) = \mu(t)/\sigma(t)$ for the Sharpe ratio. Again, we assume that the coefficients $\{\mu(t)\}_{t \ge 0}$ and $\{\sigma(t)\}_{t \ge 0}$ are adapted to the filtration $\mathbb{F}^Y$ and we assume that
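$\{W_t\}_{t \ge 0}$ and $\{W_t^S\}_{t \ge 0}$ are Wiener processes with correlation $\rho$. It will be convenient to write
\[ W_t^S = \rho W_t + \sqrt{1 - \rho^2}\,W_t^\perp, \tag{4.26} \]
where $\{W_t\}_{t \ge 0}$ and $\{W_t^\perp\}_{t \ge 0}$ are independent Wiener processes. We shall denote by $\mathbb{F}^{Y,S} = \{\mathcal{F}^{S,Y}(t)\}_{t \ge 0}$ the natural filtration of $\{(S_t, Y_t)\}_{t \ge 0}$. We will assume that
\[ \lambda(t) = \frac{\mu(t)}{\sigma(t)} + \rho\,\frac{p(t)}{q(t)}. \tag{4.27} \]
Under this assumption, one can rewrite the dynamics of $\{S_t\}_t$ in the form
\[ \frac{dS_t}{S_t} = \mu(t)\,dt + \sqrt{1 - \rho^2}\,\sigma(t)\,dW_t^\perp + \rho\,\frac{\sigma(t)}{q(t)}\,dY_t. \tag{4.28} \]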
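The decomposition (4.26) is also the standard recipe for simulating the pair of correlated drivers. The sketch below is illustrative only (function names, sample size and seed are assumptions): it builds increments of $W^S$ from independent Gaussian increments of $W$ and $W^\perp$ and estimates their sample correlation with the increments of $W$, which should be close to $\rho$.

```python
import math
import random


def correlated_increments(rho, n, dt=1.0, seed=0):
    """Return increments (dW, dW_S) with dW_S = rho*dW + sqrt(1-rho^2)*dW_perp."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    dW = [rng.gauss(0.0, sd) for _ in range(n)]
    dW_perp = [rng.gauss(0.0, sd) for _ in range(n)]  # independent of dW
    scale = math.sqrt(1.0 - rho ** 2)
    dW_S = [rho * a + scale * b for a, b in zip(dW, dW_perp)]
    return dW, dW_S


def sample_correlation(x, y):
    """Plain sample correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)
```

For instance, `sample_correlation(*correlated_increments(0.6, 50000))` should return a value near 0.6, up to a sampling error of order $(1-\rho^2)/\sqrt{n}$.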
Agents invest in the traded asset only. Their wealth process $\{X_t\}_{t \ge 0}$ satisfies the stochastic differential equation
\[ dX_t = \theta(t)\,\frac{1}{\sigma(t)} \frac{dS_t}{S_t} = \theta(t)\lambda(t)\,dt + \theta(t)\,dW_t^S, \tag{4.29} \]
where $\theta(t)/\sigma(t)$ denotes the amount of wealth invested in the risky asset. The portfolios we consider admissible in this section are the square integrable portfolios satisfying $E\{\int_0^T |\theta(t)|^2\,dt\} < \infty$. Alternatively, we may write
\[ dX_t = X_t \pi(t)\,\frac{1}{\sigma(t)} \frac{dS_t}{S_t} = X_t \pi(t)\lambda(t)\,dt + X_t \pi(t)\,dW_t^S, \]
if $\pi(t)/\sigma(t)$ denotes the fraction of wealth invested in the risky asset.

4.3.1 The Optimal Control Problem Throughout this section, we consider a contingent claim written on the nontraded asset $Y$. Its payoff is modeled by a random variable $\xi$, which we assume to be $\mathcal{F}_T^Y$-measurable. Apart from European-style contingent claims considered earlier, this includes contingent claims with path-dependent payoff, such as Asian options or barrier options. We assume that the investor's risk preferences are described by a utility function. We consider three cases, which we treat separately: exponential, power, and logarithmic utility functions. In each case, we derive an explicit formula when the claim is written on the nontraded asset $Y$. When the claim payoff depends also upon the tradable asset, we derive an upper bound for the optimal utility of the stochastic control problem. In each of the cases we consider, we use the duality approach successfully implemented in [56] and thoroughly reviewed in [235] to identify the candidates for optimal portfolios and changes of measures, and we use direct probabilistic computations to derive formulas for the value functions. It is known from [199, 125, 42] that these formulas ought to be exact when the terminal wealth includes contingent claims written on the nontradable factors, but that only one-sided bounds can be proven when these claims also depend upon the tradable factors [252]. We extend these results to the general nonanticipative setting, and we prove that value function probabilistic representations exist beyond the Markovian framework.
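Constant Absolute Risk Aversion We first assume that the investor's risk preferences are modeled by the exponential utility function $U(x) = -\exp(-\gamma x)$. Recall the discussion in Chapter 2. According to the martingale representation theorem, there exists a process $\{\theta^*(t)\}_{0 \le t \le T}$ adapted to the filtration $\{\mathcal{F}_t^Y\}_{0 \le t \le T}$ and such that
\[
\begin{aligned}
& \exp\left( -\frac{1}{\rho} \int_0^T (\lambda(t) - \gamma(1 - \rho^2)\theta^*(t))\,dW_t - \frac{1}{2\rho^2} \int_0^T (\lambda(t) - \gamma(1 - \rho^2)\theta^*(t))^2\,dt \right) \\
& \qquad = \frac{\exp\left( -\gamma(1 - \rho^2)\,\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right)}{E\left\{ \exp\left( -\gamma(1 - \rho^2)\,\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\}}.
\end{aligned}
\]
Then, we define a process $\{Z(t)\}_{0 \le t \le T}$ by
\[
\begin{aligned}
Z(t) = \exp\bigg( & -\frac{1}{\rho} \int_0^t \lambda(u)\,dW(u) + \gamma\,\frac{\sqrt{1 - \rho^2}}{\rho} \int_0^t \theta^*(u)\,dW^{S\perp}(u) \\
& - \frac{1}{2} \frac{1}{\rho^2} \int_0^t \lambda(u)^2\,du + \gamma\,\frac{1 - \rho^2}{\rho^2} \int_0^t \theta^*(u)\lambda(u)\,du \\
& - \frac{1}{2} \gamma^2\,\frac{1 - \rho^2}{\rho^2} \int_0^t \theta^*(u)^2\,du \bigg),
\end{aligned}
\tag{4.30}
\]
where we set
\[ W_t^{S\perp} = \sqrt{1 - \rho^2}\,W_t - \rho\,W_t^\perp. \]
Note that, as defined, the process $\{W_t^{S\perp}\}_{0 \le t \le T}$ is also a Wiener process, and it is independent of $\{W_t^S\}_{0 \le t \le T}$. A straightforward computation gives:

Lemma 4.7 Let $\{X^*(t)\}_{0 \le t \le T}$ be the wealth process associated to the portfolio process $\{\theta^*(t)\}_{0 \le t \le T}$. Then we have
\[ X_T^* + \xi = x - \frac{1}{\gamma} \log Z_T - \frac{1}{\gamma} \frac{1}{1 - \rho^2} \log E\left\{ \exp\left( -\gamma(1 - \rho^2)\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\}. \tag{4.31} \]
Since $E\{Z_T X_T^*\} = x$, this implies
\[ E\{Z_T \log Z_T\} + \gamma E\{Z_T \xi\} = -\frac{1}{1 - \rho^2} \log E\left\{ \exp\left( -\gamma(1 - \rho^2)\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\}. \tag{4.32} \]
Next we note that whenever $\{\theta(t)\}_{0 \le t \le T}$ is an admissible portfolio process, the corresponding wealth process $\{X_t\}_{0 \le t \le T}$ satisfies
\[ E\{ \exp(-\gamma(X_T + \xi)) \mid \mathcal{F}^{S,Y}(t) \} \ge \exp\bigl( -\gamma X_t - E\{Z_T \log Z_T\} - \gamma E\{Z_T \xi\} \bigr). \]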
Indeed, using the identity $E\{Z_T X_T \mid X_0 = x\} = x$, and Jensen's inequality, we obtain
\[
\begin{aligned}
E\{\exp(-\gamma(X_T + \xi))\} &= E\{Z_T \exp(-\gamma X_T - \log Z_T - \gamma \xi)\} \\
&\ge \exp(-\gamma E\{Z_T X_T\} - E\{Z_T \log Z_T\} - \gamma E\{Z_T \xi\}) && (4.33) \\
&= \exp(-\gamma x - E\{Z_T \log Z_T\} - \gamma E\{Z_T \xi\}). && (4.34)
\end{aligned}
\]
Furthermore, if we put $\theta(t) = \theta^*(t)$, then equality holds. The value for the optimal control problem considered in this subsection is
\[ V(x) = \sup E\{ -\exp(-\gamma(X_T + \xi)) \mid X_0 = x \}, \tag{4.35} \]
where $x$ denotes the initial capital. Recalling the lower bound (4.33) together with the equality holding when $\theta(t) = \theta^*(t)$, we have proved the following result:

Proposition 4.2 The value function $V(x)$ defined in (4.35) is given by
\[ V(x) = -\exp(-\gamma x - E\{Z_T \log Z_T\} - \gamma E\{Z_T \xi\}) \tag{4.36} \]
or, equivalently, by
\[ V(x) = -e^{-\gamma x}\, E\left\{ \exp\left( -\gamma(1 - \rho^2)\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\}^{\frac{1}{1-\rho^2}}. \tag{4.37} \]
In particular, for $\xi = 0$, we obtain
\[ V(x) = -\exp(-\gamma x)\, E\left\{ \exp\left( -\rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\}^{\frac{1}{1-\rho^2}}, \tag{4.38} \]
which together with (4.37) proves that the indifference price of $\xi$ is given by:
\[
\frac{1}{\gamma(1 - \rho^2)} \left[ \log E\left\{ \exp\left( -\rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\} - \log E\left\{ \exp\left( -\gamma(1 - \rho^2)\,\xi - \rho \int_0^T \lambda(t)\,dW_t - \frac{1}{2} \int_0^T \lambda(t)^2\,dt \right) \right\} \right].
\]
This formula extends to the non-Markovian case the results obtained in [?, 119, 42] with various degrees of generality, and reproduced in part in the previous section. If the contingent claim is assumed to depend upon both the traded and the nontraded asset, namely, if we assume that the random variable $\xi$ is $\mathcal{F}_T^{S,Y}$-measurable instead of $\mathcal{F}_T^Y$-measurable, we do not have an explicit formula for the value function any longer. However, in the spirit of the bounds derived in [252], we can still derive an upper bound.

Lemma 4.8 Let $\{X_t\}_{0 \le t \le T}$ be the wealth process of an admissible portfolio $\{\theta(t)\}_{0 \le t \le T}$. We have
\[ E\{\exp(-\gamma(X_T + \xi))\} \ge \exp(-\gamma x - E\{Z_T \log Z_T\} - \gamma E\{Z_T \xi\}). \tag{4.39} \]
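Proof. As before, using the martingale identity $\mathbb{E}\{Z_T X_T \mid X_0 = x\} = x$ and Jensen's inequality, we obtain
\begin{align*}
\mathbb{E}\{\exp(-\gamma(X_T+\xi))\} &= \mathbb{E}\{Z_T\exp(-\gamma X_T - \log Z_T - \gamma\xi)\}\\
&\ge \exp\big(-\gamma\,\mathbb{E}\{Z_T X_T\} - \mathbb{E}\{Z_T\log Z_T\} - \gamma\,\mathbb{E}\{Z_T\xi\}\big)\\
&= \exp\big(-\gamma x - \mathbb{E}\{Z_T\log Z_T\} - \gamma\,\mathbb{E}\{Z_T\xi\}\big).
\end{align*}
This proves the assertion. □

Therefore, since the utility is $-\exp(-\gamma\,\cdot)$, the maximal expected utility satisfies the upper bound
$$V(x) \le -\exp\big(-\gamma x - \mathbb{E}\{Z_T\log Z_T\} - \gamma\,\mathbb{E}\{Z_T\xi\}\big). \tag{4.40}$$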
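As a numerical illustration of the indifference price formula stated after (4.38), the following sketch evaluates the two expectations by quadrature in a deliberately simple special case. Everything specific here is an assumption made for illustration and is not taken from the text: $\lambda$ is constant, the claim is a function of $W_T$ only, and the function names are hypothetical. For a linear claim $\xi = cW_T$ the two Gaussian expectations can be computed by hand, which gives $-c\rho\lambda T - \tfrac12\gamma(1-\rho^2)c^2T$ as a consistency check.

```python
import math


def gaussian_expectation(f, n=4001, z_max=10.0):
    """Approximate E[f(Z)] for Z ~ N(0, 1) with the trapezoidal rule."""
    step = 2.0 * z_max / (n - 1)
    total = 0.0
    for i in range(n):
        z = -z_max + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)


def indifference_price(payoff, gamma, rho, lam, T):
    """Exponential-utility indifference price of xi = payoff(W_T).

    Assumes a constant market price of risk `lam`, so that the stochastic
    integral of lambda against W reduces to lam * W_T (illustrative only).
    """
    root_T = math.sqrt(T)
    k = 1.0 - rho * rho

    def density_term(z):
        # exp(-rho * lam * W_T - lam^2 * T / 2) with W_T = sqrt(T) * z
        return math.exp(-rho * lam * root_T * z - 0.5 * lam * lam * T)

    def claim_term(z):
        return math.exp(-gamma * k * payoff(root_T * z)) * density_term(z)

    log_no_claim = math.log(gaussian_expectation(density_term))
    log_with_claim = math.log(gaussian_expectation(claim_term))
    return (log_no_claim - log_with_claim) / (gamma * k)


if __name__ == "__main__":
    c, gamma, rho, lam, T = 0.4, 1.0, 0.5, 0.3, 1.0
    numeric = indifference_price(lambda w: c * w, gamma, rho, lam, T)
    closed_form = -c * rho * lam * T - 0.5 * gamma * (1 - rho**2) * c**2 * T
    print(numeric, closed_form)
```

The closed form separates into a linear pricing term and a penalty proportional to $\gamma(1-\rho^2)$, which vanishes as $|\rho| \to 1$, that is, as the claim becomes perfectly hedgeable.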
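Constant Relative Risk Aversion

We now assume that the investor's risk preferences are given by a power-law utility function of the form $U(x) = x^\gamma$ for some $0 < \gamma < 1$. In the notation of Chapter 2, this means that we choose $R = 1-\gamma$ and we ignore the constant in front, as it is irrelevant to our computations. Our analysis parallels the above discussion of the CARA case. By virtue of the martingale representation theorem, we can find a process $\{\pi^*(t)\}_{0\le t\le T}$ adapted to the filtration $\{\mathcal{F}_t^Y\}_{0\le t\le T}$ and such that
$$\exp\Big(-\frac{1}{\rho}\int_0^T \big(\lambda(t) - (1-\gamma+\gamma\rho^2)\pi^*(t)\big)\,dW_t - \frac12\,\frac{1}{\rho^2}\int_0^T \big(\lambda(t) - (1-\gamma+\gamma\rho^2)\pi^*(t)\big)^2\,dt\Big) = \frac{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \lambda(t)\,dW_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \lambda(t)^2\,dt\Big)}{\mathbb{E}\Big\{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \lambda(t)\,dW_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \lambda(t)^2\,dt\Big)\Big\}}.$$

Furthermore, we define a process $\{Z(t)\}_{0\le t\le T}$ by
$$Z(t) = \exp\Big(-\frac{1}{\rho}\int_0^t \lambda(u)\,dW(u) + (1-\gamma)\frac{\sqrt{1-\rho^2}}{\rho}\int_0^t \pi^*(u)\,dW^{S\perp}(u) - \frac12\,\frac{1}{\rho^2}\int_0^t \lambda(u)^2\,du + (1-\gamma)\frac{1-\rho^2}{\rho^2}\int_0^t \pi^*(u)\lambda(u)\,du - \frac12(1-\gamma)^2\frac{1-\rho^2}{\rho^2}\int_0^t \pi^*(u)^2\,du\Big)$$
for $0 \le t \le T$. A straightforward computation shows the following:

Lemma 4.9 Let $\{X^*(t)\}_{0\le t\le T}$ be the wealth process corresponding to the portfolio process $\{\pi^*(t)\}_{0\le t\le T}$ constructed above. Then we have
$$X_T^* = x\,Z_T^{-\frac{1}{1-\gamma}}\;\mathbb{E}\Big\{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \lambda(t)\,dW_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \lambda(t)^2\,dt\Big)\Big\}^{-\frac{1}{1-\gamma+\gamma\rho^2}}.$$

In particular, since $\mathbb{E}\{Z_T X_T^*\} = x$, we have
$$\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\} = \mathbb{E}\Big\{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \lambda(t)\,dW_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \lambda(t)^2\,dt\Big)\Big\}^{\frac{1}{1-\gamma+\gamma\rho^2}}.$$

Note that, by the martingale identity $\mathbb{E}\{Z_T X_T \mid X_0 = x\} = x$ and using Hölder's inequality, we get
$$\mathbb{E}\{X_T^\gamma\} \le \mathbb{E}\{Z_T X_T\}^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma} = x^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma},$$
with equality when $\pi(t) = \pi^*(t)$. Hence, if we define the value function $V(x)$ by
$$V(x) = \sup \mathbb{E}\{X_T^\gamma\},$$
the above computations proved:

Proposition 4.3 The optimal utility $V(x)$ is given by
$$V(x) = x^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma}$$
or, equivalently,
$$V(x) = x^\gamma\;\mathbb{E}\Big\{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \lambda(t)\,dW_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \lambda(t)^2\,dt\Big)\Big\}^{\frac{1-\gamma}{1-\gamma+\gamma\rho^2}}.$$

Remark. Case of a more general claim. As before, we can also consider the case of a contingent claim whose payoff is a nonnegative $\mathcal{F}_T^{S,Y}$-measurable random variable $\xi$. As in the case of the exponential utility function, we derive an upper bound similar to (4.40). Indeed, using the martingale identity $\mathbb{E}\{Z_T X_T\} = x$ and Hölder's inequality, we obtain
$$\mathbb{E}\{(X_T+\xi)^\gamma\} \le \mathbb{E}\{Z_T(X_T+\xi)\}^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma} = \big(x + \mathbb{E}\{Z_T\xi\}\big)^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma},$$
which proves that the value function satisfies the upper bound
$$V(x) \le \big(x + \mathbb{E}\{Z_T\xi\}\big)^\gamma\;\mathbb{E}\big\{Z_T^{-\frac{\gamma}{1-\gamma}}\big\}^{1-\gamma}.$$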
Logarithmic Utility Function

If the investor's risk preferences are modeled by the utility function $U(x) = \log x$, we define the process $Z$ by
$$Z(t) = \exp\Big(-\int_0^t \lambda(u)\big[\rho\,dW(u) + \sqrt{1-\rho^2}\,dW^{\perp}(u)\big] - \frac12\int_0^t \lambda(u)^2\,du\Big)$$
for $0 \le t \le T$. Direct computations give:

Lemma 4.10 Let $\{X^*(t)\}_{0\le t\le T}$ be the wealth process corresponding to the portfolio process $\{\pi^*(t)\}_{0\le t\le T}$ defined by $\pi^*(t) = \lambda(t)$. Then we have $X_T^* = x\,Z_T^{-1}$, and hence
$$\mathbb{E}\{\log X_T^*\} = \log x - \mathbb{E}\{\log Z_T\}.$$

We follow the same procedure to identify the value function of Merton's problem. If $\{X_t\}_{0\le t\le T}$ is the wealth process associated to a generic admissible portfolio $\{\pi(t)\}_{0\le t\le T}$, using the martingale identity $\mathbb{E}\{Z_T X_T \mid X_0 = x\} = x$ and Jensen's inequality, we obtain
\begin{align*}
\mathbb{E}\{\log X_T\} &= \mathbb{E}\{\log(Z_T X_T)\} - \mathbb{E}\{\log Z_T\}\\
&\le \log \mathbb{E}\{Z_T X_T\} - \mathbb{E}\{\log Z_T\}\\
&= \log x - \mathbb{E}\{\log Z_T\},
\end{align*}
with equality when $\pi(t) = \pi^*(t)$. So, if we define the value function $V(x)$ by
$$V(x) = \sup \mathbb{E}\{\log X_T\},$$
then we proved:

Proposition 4.4 The value function $V(x)$ is given by
$$V(x) = \log x - \mathbb{E}\{\log Z_T\} = \log x + \mathbb{E}\Big\{\frac12\int_0^T \lambda(t)^2\,dt\Big\}.$$

Remark. As before, we can prove a one-sided inequality when the contingent claim $\xi$ is only assumed to be $\mathcal{F}_T^{S,Y}$-measurable. Indeed, using the martingale property of $\{Z(t)X_t\}_{0\le t\le T}$ and Jensen's inequality, we obtain
\begin{align*}
\mathbb{E}\{\log(X_T+\xi)\} &= \mathbb{E}\{\log Z_T(X_T+\xi)\} - \mathbb{E}\{\log Z_T\}\\
&\le \log \mathbb{E}\{Z_T(X_T+\xi)\} - \mathbb{E}\{\log Z_T\}\\
&= \log\big(x + \mathbb{E}\{Z_T\xi\}\big) - \mathbb{E}\{\log Z_T\},
\end{align*}
which implies that the value function satisfies the upper bound
$$V(x) \le \log\big(x + \mathbb{E}\{Z_T\xi\}\big) - \mathbb{E}\{\log Z_T\}. \tag{4.41}$$
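The logarithmic case can be checked by hand when $\lambda$ is constant. The sketch below is an illustration under assumptions that are not in the text: $\lambda$ and the volatility-scaled proportion $\pi$ are constants, so that $dX_t/X_t = \pi(\lambda\,dt + dW_t^S)$ and $\mathbb{E}\{\log X_T\} = \log x + (\pi\lambda - \tfrac12\pi^2)T$. The function names are hypothetical. A grid search over $\pi$ recovers $\pi^* = \lambda$ and the value $\log x + \tfrac12\lambda^2T$.

```python
import math


def expected_log_wealth(x, pi, lam, T):
    """E[log X_T] when dX/X = pi * (lam dt + dW^S), X_0 = x, pi and lam constant."""
    return math.log(x) + (pi * lam - 0.5 * pi * pi) * T


def best_constant_proportion(x, lam, T, grid_size=2001, pi_max=2.0):
    """Grid search for the constant proportion maximizing expected log wealth."""
    grid = [pi_max * i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda pi: expected_log_wealth(x, pi, lam, T))


if __name__ == "__main__":
    x, lam, T = 1.0, 0.5, 2.0
    pi_star = best_constant_proportion(x, lam, T)
    print(pi_star, expected_log_wealth(x, pi_star, lam, T))
    print(math.log(x) + 0.5 * lam * lam * T)  # value predicted by the closed form
```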
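4.3.2 Mean-Variance Hedging

We close this section with a discussion of mean-variance hedging. In other words, we assume that the investor balances risk and reward à la Markowitz. To be specific, the investor tries to minimize the variance
$$\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2$$
subject to the constraint $\mathbb{E}\{X_T\} = y$, where $y$ is a fixed positive real number.

As before, we use the martingale representation theorem to construct a stochastic process $\{\pi^*(t)\}_{0\le t\le T}$ adapted to the filtration $\{\mathcal{F}_t^Y\}_{0\le t\le T}$ and satisfying
$$\frac{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)}{\mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}} = \exp\Big(-\frac{1}{\rho}\int_0^T \big(\lambda(t) + (1-2\rho^2)\pi^*(t)\big)\,dW_t - \frac12\,\frac{1}{\rho^2}\int_0^T \big(\lambda(t) + (1-2\rho^2)\pi^*(t)\big)^2\,dt\Big).$$

Furthermore, we define a process $\{Z(t)\}_{0\le t\le T}$ by
$$Z_t = \exp\Big(-\frac{1}{\rho}\int_0^t \lambda(u)\,dW_u - \frac{\sqrt{1-\rho^2}}{\rho}\int_0^t \pi^*(u)\,dW_u^{S\perp} - \frac12\,\frac{1}{\rho^2}\int_0^t \lambda(u)^2\,du - \frac{1-\rho^2}{\rho^2}\int_0^t \pi^*(u)\lambda(u)\,du - \frac12\,\frac{1-\rho^2}{\rho^2}\int_0^t \pi^*(u)^2\,du\Big).$$

Lemma 4.11 If the wealth process $\{X_t\}_{0\le t\le T}$ of an admissible portfolio $\{\pi(t)\}_{0\le t\le T}$ satisfies $\mathbb{E}\{X_T\} = y$, then we have
$$\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2 \ge \frac{(y-x)^2}{\mathbb{E}\{Z_T^2\} - 1}.$$

Proof. Using the elementary inequality $X_T^2 \ge 2AX_T - A^2$ with $A = \alpha + (x-\alpha)Z_T\,\mathbb{E}\{Z_T^2\}^{-1}$, the martingale identity $\mathbb{E}\{Z_T X_T\} = x$, and the constraint $\mathbb{E}\{X_T\} = y$, we obtain
\begin{align*}
\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2 &\ge 2\,\mathbb{E}\big\{\big(\alpha + (x-\alpha)Z_T\,\mathbb{E}\{Z_T^2\}^{-1}\big)X_T\big\} - \mathbb{E}\big\{\big(\alpha + (x-\alpha)Z_T\,\mathbb{E}\{Z_T^2\}^{-1}\big)^2\big\} - y^2\\
&= 2\alpha y + 2(x-\alpha)x\,\mathbb{E}\{Z_T^2\}^{-1} - \alpha^2 - (x^2-\alpha^2)\,\mathbb{E}\{Z_T^2\}^{-1} - y^2\\
&= (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-1} - (y-\alpha)^2
\end{align*}
for all real numbers $\alpha$. Hence, if we choose
$$\alpha = \frac{y\,\mathbb{E}\{Z_T^2\} - x}{\mathbb{E}\{Z_T^2\} - 1},$$
we obtain
$$\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2 \ge \frac{(y-x)^2}{\mathbb{E}\{Z_T^2\} - 1}.$$
□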
Direct computations prove:

Lemma 4.12 Let $\alpha$ be a real number, and let $\{X^*(t)\}_{0\le t\le T}$ be the solution of the stochastic differential equation
$$dX_t^* = (X_t^* - \alpha)\,\pi^*(t)\,\frac{1}{\sigma(t)}\,\frac{dS_t}{S_t}$$
with the initial condition $X_0^* = x$. Then we have
$$X_T^* = \alpha + (x-\alpha)\,Z_T\;\mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}^{\frac{1}{1-2\rho^2}}.$$

In particular, since $\mathbb{E}\{Z_T X_T^*\} = x$, we must have
$$\mathbb{E}\{Z_T^2\} = \mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}^{-\frac{1}{1-2\rho^2}}.$$
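Lemma 4.13 Let $\alpha$ be a real number, and let $\{X^*(t)\}_{0\le t\le T}$ be the solution of the stochastic differential equation
$$dX_t^* = (X_t^* - \alpha)\,\pi^*(t)\,\frac{1}{\sigma(t)}\,\frac{dS_t}{S_t},$$
with the initial condition $X_0^* = x$. Then we have
$$X_T^* = \alpha + (x-\alpha)\,Z_T\,\mathbb{E}\{Z_T^2\}^{-1}.$$
In particular, we have
$$\mathbb{E}\{X_T^*\} = \alpha + (x-\alpha)\,\mathbb{E}\{Z_T^2\}^{-1}$$
and
$$\mathbb{E}\{X_T^{*2}\} - \mathbb{E}\{X_T^*\}^2 = (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-1} - (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-2}.$$

Proof. Using the relations
$$X_T^* = \alpha + (x-\alpha)\,Z_T\;\mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}^{\frac{1}{1-2\rho^2}}$$
and
$$\mathbb{E}\{Z_T^2\} = \mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}^{-\frac{1}{1-2\rho^2}},$$
we obtain
$$X_T^* = \alpha + (x-\alpha)\,Z_T\,\mathbb{E}\{Z_T^2\}^{-1},$$
from which it follows that
$$\mathbb{E}\{X_T^*\} = \alpha + (x-\alpha)\,\mathbb{E}\{Z_T^2\}^{-1}$$
and
$$\mathbb{E}\{X_T^{*2}\} - \mathbb{E}\{X_T^*\}^2 = (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-1} - (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-2}.$$
Hence, if we choose
$$\alpha = \frac{y\,\mathbb{E}\{Z_T^2\} - x}{\mathbb{E}\{Z_T^2\} - 1},$$
we obtain
$$\mathbb{E}\{X_T^*\} = \alpha + (x-\alpha)\,\mathbb{E}\{Z_T^2\}^{-1} = y$$
and
$$\mathbb{E}\{X_T^{*2}\} - \mathbb{E}\{X_T^*\}^2 = (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-1} - (x-\alpha)^2\,\mathbb{E}\{Z_T^2\}^{-2} = \frac{(y-x)^2}{\mathbb{E}\{Z_T^2\} - 1}.$$
□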
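The algebra behind Lemmas 4.11 and 4.13 does not depend on the continuous-time structure: it only uses $\mathbb{E}\{Z_T\} = 1$, the budget identity $\mathbb{E}\{Z_T X_T\} = x$, and the mean constraint. The following sketch verifies it exactly on a finite probability space. The two-state example and the function name are assumptions chosen for illustration; they are not part of the model above.

```python
from fractions import Fraction as F


def mean_variance_candidate(probs, z_vals, x, y):
    """Build X* = alpha + (x - alpha) * Z / E[Z^2] and report its moments.

    probs, z_vals: a discrete law for Z with E[Z] = 1 (illustrative assumption).
    Returns (alpha, E[Z X*], E[X*], Var[X*], lower bound of Lemma 4.11).
    """
    ez = sum(p * z for p, z in zip(probs, z_vals))
    assert ez == 1, "Z must have unit mean"
    ez2 = sum(p * z * z for p, z in zip(probs, z_vals))
    alpha = (y * ez2 - x) / (ez2 - 1)
    x_star = [alpha + (x - alpha) * z / ez2 for z in z_vals]
    budget = sum(p * z * v for p, z, v in zip(probs, z_vals, x_star))
    mean = sum(p * v for p, v in zip(probs, x_star))
    variance = sum(p * v * v for p, v in zip(probs, x_star)) - mean * mean
    bound = (y - x) ** 2 / (ez2 - 1)
    return alpha, budget, mean, variance, bound


if __name__ == "__main__":
    probs = [F(1, 2), F(1, 2)]
    z_vals = [F(1, 2), F(3, 2)]
    print(mean_variance_candidate(probs, z_vals, F(1), F(2)))
```

With exact rational arithmetic, the candidate meets the budget and mean constraints and its variance coincides with the lower bound, which is the content of the two lemmas taken together.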
So if we define the mean-variance value function by
$$V(x,y) = \inf\big(\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2\big),$$
where the infimum is taken over all portfolio processes with initial capital $x$ and prescribed expected return $\mathbb{E}\{X_T\} = y$, then we have the following result:

Proposition 4.5 The mean-variance value function is given by
$$V(x,y) = \frac{(y-x)^2}{\mathbb{E}\{Z_T^2\} - 1}$$
or, equivalently,
$$V(x,y) = (y-x)^2\,\bigg[\mathbb{E}\Big\{\exp\Big(-2\rho\int_0^T \lambda(t)\,dW_t - \int_0^T \lambda(t)^2\,dt\Big)\Big\}^{-\frac{1}{1-2\rho^2}} - 1\bigg]^{-1}.$$
We now consider a contingent claim with payoff given by an $\mathcal{F}_T^{S,Y}$-measurable random variable $\eta$. For every real number $x$, we define the minimal replication error by
$$R_\eta(x) = \inf \mathbb{E}\{(X_T - \eta)^2\},$$
where the infimum is taken over all admissible portfolio processes $\{\theta(t)\}_{t\ge 0}$ with initial capital $x$.

Proposition 4.6 The replication error $R_\eta(x)$ reaches a minimum for the initial wealth $x = \mathbb{E}\{Z_T\eta\}$.

Proof. This is a special case of a general result of M. Schweizer. In this special situation, we can give a direct proof based on Lemma 4.13. Let us start with
$$dX_t = \theta(t)\,\frac{1}{\sigma(t)}\,\frac{dS_t}{S_t},$$
with initial condition $X_0 = x$. Next, let $\{X^*(t)\}_{0\le t\le T}$ be the solution of the stochastic differential equation
$$dX_t^* = X_t^*\,\pi^*(t)\,\frac{1}{\sigma(t)}\,\frac{dS_t}{S_t},$$
with the initial condition $X_0^* = \mathbb{E}\{Z_T\eta\} - x$, so that, by Lemma 4.13 with $\alpha = 0$,
$$X_T^* = \frac{\mathbb{E}\{Z_T\eta\} - x}{\mathbb{E}\{Z_T^2\}}\,Z_T.$$
Then the sum $\{X_t + X_t^*\}_{0\le t\le T}$ satisfies the stochastic differential equation
$$d(X_t + X_t^*) = \big(\theta(t) + X_t^*\,\pi^*(t)\big)\,\frac{1}{\sigma(t)}\,\frac{dS_t}{S_t}$$
with $X_0 + X_0^* = \mathbb{E}\{Z_T\eta\}$. In particular, $\mathbb{E}\{Z_T(X_T + X_T^*)\} = \mathbb{E}\{Z_T\eta\}$, so the cross term below vanishes. From this it follows that
\begin{align*}
\mathbb{E}\{(X_T - \eta)^2\} &= \mathbb{E}\{(X_T + X_T^* - \eta)^2\} - 2\,\mathbb{E}\{X_T^*(X_T + X_T^* - \eta)\} + \mathbb{E}\{X_T^{*2}\}\\
&= \mathbb{E}\{(X_T + X_T^* - \eta)^2\} - 2\,\frac{\mathbb{E}\{Z_T\eta\} - x}{\mathbb{E}\{Z_T^2\}}\,\mathbb{E}\{Z_T(X_T + X_T^* - \eta)\} + \frac{(\mathbb{E}\{Z_T\eta\} - x)^2}{\mathbb{E}\{Z_T^2\}}\\
&= \mathbb{E}\{(X_T + X_T^* - \eta)^2\} + \frac{(\mathbb{E}\{Z_T\eta\} - x)^2}{\mathbb{E}\{Z_T^2\}}.
\end{align*}
Hence, if we take the infimum over all portfolio processes $\{\theta(t)\}_{0\le t\le T}$, we obtain
$$R_\eta(x) = R_\eta(\mathbb{E}\{Z_T\eta\}) + \frac{(\mathbb{E}\{Z_T\eta\} - x)^2}{\mathbb{E}\{Z_T^2\}}.$$
Note that we do not need to assume that the infimum is attained. From this the assertion follows immediately. □
4.4 OPTIMAL HEDGING IN PARTIALLY OBSERVED MARKETS

We now consider dynamics driven in part by factors that cannot be observed. We give an example of such a system in Section 7.3 of Chapter 7 when we discuss models for the convenience yield in energy markets. But beyond the role they play in the analysis of commodity markets, partially observable systems are often used as a testbed for some of the developments in robust control as applied to economic models with possible misspecifications like those discussed in the recent works of Hansen and his collaborators. See, for example, [167] and the references therein.

Typical results in optimal control rely on the analysis of equations of the Hamilton-Jacobi-Bellman type derived from the dynamic programming principle [19, 82]. These nonlinear partial differential equations do not always have solutions in the classical sense. The theory of viscosity solutions (see, for example, [82]) was invented in part to overcome this difficulty, but its technical nature is regarded as overkill by many financial engineers and probabilists. Besides this issue of taste or background, one of the disadvantages of this approach is the fact that it requires the factor dynamics to be Markovian. This is especially constraining in the case of partially observed systems. The latter are usually modeled by specifying the risk-neutral dynamics of factors by a system of stochastic differential equations of the form
$$dX_t = b(X_t)\,dt + \Sigma(X_t)\,dW_t, \qquad X_t = \begin{pmatrix} X_t^{(1)} \\ X_t^{(2)} \end{pmatrix},$$
with $X_t^{(1)}$ observable, and $X_t^{(2)}$ not observable. Such a system is obviously Markovian, and in order to remain in the Markovian setting, one needs to take a filtering point of view and replace the unobserved factors by their conditional distributions (the so-called filters) and rely on the Zakai and Kushner equations to retain Markovian dynamics. This filtering strategy amounts to

• Replacing unobserved factors by their conditional distributions;
• Jumping from a finite dimensional problem to an infinite dimensional one.

Filtering techniques have been applied to many financial problems of the kind considered in this book. However, most applications have been limited to Gaussian or conditionally Gaussian models (as, for example, in [168, 203, 251]) and regime switching models (as in [118]). The reason for these assumptions is to avoid the jump to infinite dimensional analysis. Indeed, in all these cases, the filter is Gaussian or finitely supported and can be characterized by a finite number of parameters. These applications include risk premium filtering as in [239] and estimation of the term structure of power prices as in [182].

So the existence of unobservable factors transforms the original finite dimensional optimal control problem into an infinite dimensional one. In particular, the Hamilton-Jacobi-Bellman equation is a nonlinear partial differential equation in infinite dimensions! See, for example, [169] and [45, 46] and Section 7.3 of Chapter 7, where some of the results of [45, 46] are presented.

In this section, we rely on an implicit assumption of conditional Gaussianity to avoid the jump to infinite dimensions. In this way, we can use a simple conditioning argument to replace the original finite dimensional control problem with partial observations by a control problem with full observations and the same dimension.
However, the coefficients of this new control problem depend on the past of the observed factors, even if the original dynamics were assumed to be Markovian.

4.4.1 The Model

As in Section 4.1, we start with a Markovian model. We consider a pair of stochastic processes $\{Y_t\}_{t\ge 0}$ and $\{S_t\}_{t\ge 0}$ whose dynamics are given by (4.23)–(4.24), but we now assume a specific form for the coefficients. We assume that the coefficients of (4.23) are of the form
$$p(t) = p(Y_t,t) + \varphi(Y_t,t)\,\zeta_t \quad\text{and}\quad q(t) = q(Y_t,t).$$
Similarly, we assume that the coefficients of (4.24) are of the form
$$\mu(t) = \mu(Y_t,t) + \varphi(Y_t,t)\,\zeta_t \quad\text{and}\quad \sigma(t) = \sigma(Y_t,t),$$
and the term $\lambda(t)$ of (4.28) is of the form
$$\lambda(t) = \lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\zeta_t,$$
where, as before, $\sigma > 0$ and $h \ge \underline{h}$ for some strictly positive constant $\underline{h}$. The process $\{\zeta_t\}_{t\ge 0}$ is assumed to satisfy a stochastic differential equation of the form
$$d\zeta_t = g(Y_t,\zeta_t,t)\,dt + h(Y_t,\zeta_t,t)\,dB_t,$$
and $\{B_t\}_{t\ge 0}$ is a Wiener process independent of $\{W_t\}_{t\ge 0}$ and $\{W_t^{\perp}\}_{t\ge 0}$.
The thrust of this section is the fact that the process $\{\zeta_t : t\ge 0\}$ is unobservable. Therefore, the investor's decisions must be based solely on the past observations of the processes $\{S_t : t\ge 0\}$ and $\{Y_t : t\ge 0\}$. This means that the portfolio processes $\{\theta(t)\}_{t\ge 0}$ and $\{\pi(t)\}_{t\ge 0}$ must be adapted to the filtration $\{\mathcal{F}_t^{S,Y} : t\ge 0\}$. Since the process $\{\zeta_t\}_{t\ge 0}$ is not necessarily adapted to the filtration $\{\mathcal{F}_t^{S,Y}\}_{t\ge 0}$, it is natural to introduce the conditional expectation of $\zeta_t$ given the $\sigma$-algebra $\mathcal{F}_t^{S,Y}$, i.e.,
$$\hat\zeta_t = \mathbb{E}\{\zeta_t \mid \mathcal{F}_t^{S,Y}\}.$$
By construction, the process $\{\hat\zeta_t\}_{t\ge 0}$ is adapted to the filtration $\{\mathcal{F}_t^{S,Y}\}_{t\ge 0}$. In the present situation, it is not hard to see that the process $\{\hat\zeta_t\}_{t\ge 0}$ is in fact adapted to the smaller filtration $\{\mathcal{F}_t^{Y}\}_{t\ge 0}$. In other words, the best guess for $\zeta_t$ depends only upon the past observations $\{Y_u\}_{0\le u\le t}$. The past observations $\{S_u\}_{0\le u\le t}$ are of no use for the filtering problem. To see this, we use the identity
$$\mathcal{F}_t^{S,Y} = \sigma\{Y_u, W_u^{\perp};\ 0\le u\le t\}.$$
Since the $\sigma$-algebra $\sigma\{\zeta_u, Y_u\}_{0\le u\le t}$ is independent of the $\sigma$-algebra $\sigma\{W_u^{\perp}\}_{0\le u\le t}$, we obtain
$$\hat\zeta_t = \mathbb{E}\{\zeta_t \mid \mathcal{F}_t^{S,Y}\} = \mathbb{E}\{\zeta_t \mid \sigma\{Y_u, W_u^{\perp};\ 0\le u\le t\}\} = \mathbb{E}\{\zeta_t \mid \sigma\{Y_u;\ 0\le u\le t\}\} = \mathbb{E}\{\zeta_t \mid \mathcal{F}_t^{Y}\},$$
which proves our claim.

4.4.2 Reduction to the Full Observation Case

Let us define a probability measure $\hat{\mathbb{P}}$ by
$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} = \exp\Big(-\int_0^T \frac{\varphi(Y_t,t)}{h(Y_t,t)}(\zeta_t - \hat\zeta_t)\,dW_t - \frac12\int_0^T \frac{\varphi(Y_t,t)^2}{h(Y_t,t)^2}(\zeta_t - \hat\zeta_t)^2\,dt\Big).$$
Furthermore, we define the process $\{\hat W_t\}_{0\le t\le T}$ by
$$\hat W_t = W_t + \int_0^t \frac{\varphi(Y_u,u)}{h(Y_u,u)}(\zeta_u - \hat\zeta_u)\,du.$$
It follows from Girsanov's theorem that $\{\hat W_t\}_{0\le t\le T}$ is a Wiener process relative to the probability measure $\hat{\mathbb{P}}$. Moreover, this same theorem tells us that the dynamics of $\{Y_t\}_{0\le t\le T}$ and $\{S_t\}_{0\le t\le T}$ under the probability measure $\hat{\mathbb{P}}$ are given by
$$dY_t = \big(g(Y_t,t) + \varphi(Y_t,t)\,\hat\zeta_t\big)\,dt + h(Y_t,t)\,d\hat W_t$$
and
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[\Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,dt + d\hat W_t^{S}\Big], \tag{4.42}$$
provided we set
$$\hat W_t^{S} = \rho\,\hat W_t + \sqrt{1-\rho^2}\,W_t^{\perp}. \tag{4.43}$$
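For the sake of convenience we introduce the notation
$$H(t) = \exp\Big(\int_0^t \frac{\varphi(Y_u,u)}{h(Y_u,u)}(\zeta_u - \hat\zeta_u)\,d\hat W_u - \frac12\int_0^t \frac{\varphi(Y_u,u)^2}{h(Y_u,u)^2}(\zeta_u - \hat\zeta_u)^2\,du\Big)$$
for all $0\le t\le T$. We have
\begin{align*}
H_T &= \exp\Big(\int_0^T \frac{\varphi(Y_t,t)}{h(Y_t,t)}(\zeta_t - \hat\zeta_t)\,d\hat W_t - \frac12\int_0^T \frac{\varphi(Y_t,t)^2}{h(Y_t,t)^2}(\zeta_t - \hat\zeta_t)^2\,dt\Big)\\
&= \exp\Big(\int_0^T \frac{\varphi(Y_t,t)}{h(Y_t,t)}(\zeta_t - \hat\zeta_t)\,dW_t + \frac12\int_0^T \frac{\varphi(Y_t,t)^2}{h(Y_t,t)^2}(\zeta_t - \hat\zeta_t)^2\,dt\Big)\\
&= \frac{d\mathbb{P}}{d\hat{\mathbb{P}}}.
\end{align*}
Note that the random variable $H_T$ is not $\mathcal{F}_T^{S,Y}$-measurable. However, also note the following:

Lemma 4.14 The process $\{H(t)\}_{0\le t\le T}$ satisfies
$$\hat{\mathbb{E}}\{H(t) \mid \mathcal{F}_T^{S,Y}\} = 1, \quad \text{a.s.}$$
for all $0\le t\le T$.

Proof. The process $\{H(t)\}_{0\le t\le T}$ satisfies the stochastic differential equation
$$dH(t) = H(t)\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}(\zeta_t - \hat\zeta_t)\,d\hat W_t,$$
which implies that
$$H(t) - 1 = \int_0^t H(u)\,\frac{\varphi(Y_u,u)}{h(Y_u,u)}(\zeta_u - \hat\zeta_u)\,d\hat W_u.$$
By definition of $\hat\zeta_u$, we have
$$\mathbb{E}\{\zeta_u - \hat\zeta_u \mid \mathcal{F}_u^{S,Y}\} = 0,$$
and since
$$\frac{d\mathbb{P}}{d\hat{\mathbb{P}}} = H_T,$$
we obtain
$$\hat{\mathbb{E}}\{H_T(\zeta_u - \hat\zeta_u) \mid \mathcal{F}_u^{S,Y}\} = 0.$$
Using the identity
$$\hat{\mathbb{E}}\{H_T \mid \mathcal{F}_u^{\zeta,S,Y}\} = H(u), \tag{4.44}$$
we also get
$$\hat{\mathbb{E}}\{H(u)(\zeta_u - \hat\zeta_u) \mid \mathcal{F}_u^{S,Y}\} = 0.$$
Since the $\sigma$-algebra $\sigma\{\hat W_v - \hat W_u,\ W_v^{\perp} - W_u^{\perp};\ u\le v\le T\}$ generated by the increments is independent of $\mathcal{F}_u^{\zeta,S,Y}$, we conclude that
$$\hat{\mathbb{E}}\big\{H(u)(\zeta_u - \hat\zeta_u) \,\big|\, \mathcal{F}_u^{S,Y} \vee \sigma\{\hat W_v - \hat W_u,\ W_v^{\perp} - W_u^{\perp};\ u\le v\le T\}\big\} = 0.$$
From this it follows that
$$\hat{\mathbb{E}}\{H(u)(\zeta_u - \hat\zeta_u) \mid \mathcal{F}_T^{S,Y}\} = 0.$$
Therefore, we obtain
$$\hat{\mathbb{E}}\Big\{\int_0^t H(u)\,\frac{\varphi(Y_u,u)}{h(Y_u,u)}(\zeta_u - \hat\zeta_u)\,d\hat W_u \,\Big|\, \mathcal{F}_T^{S,Y}\Big\} = 0,$$
hence
$$\hat{\mathbb{E}}\{H(t) \mid \mathcal{F}_T^{S,Y}\} = 1,$$
which is the desired result. □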
Let us assume that the investor's risk preferences are modeled by some utility function $U$. Furthermore, let us also consider a contingent claim whose payoff at maturity $T$ is given by a random variable $\xi$. Naturally, we assume that the payoff depends only on the observable variables. In other words, we assume that the random variable $\xi$ is $\mathcal{F}_T^{S,Y}$-measurable. The investor wants to find an admissible portfolio process $\{\theta(t)\}_{0\le t\le T}$ that maximizes the expected utility of terminal wealth
$$\mathbb{E}\Big\{U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\},$$
where $x$ is the initial endowment. A portfolio process $\{\theta(t)\}_{0\le t\le T}$ is admissible if, among other things, it is adapted to the filtration $\{\mathcal{F}_t^{S,Y}\}_{0\le t\le T}$. This means that the investor's portfolio decisions must be based exclusively on the past observations of the processes $\{S_t\}_{0\le t\le T}$ and $\{Y_t\}_{0\le t\le T}$. The optimal utility for this problem is defined as
$$V(x) = \sup \mathbb{E}\Big\{U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\},$$
where the supremum is taken over all admissible portfolio processes $\{\theta(t)\}_{0\le t\le T}$. Our aim is to identify the optimal utility $V(x)$ with the optimal utility $\hat V(x)$ of a stochastic control problem with full observation. To this end, we define
$$\hat V(x) = \sup \hat{\mathbb{E}}\Big\{U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\}. \tag{4.45}$$
As before, the supremum is computed over all admissible portfolios $\{\theta(t)\}_{0\le t\le T}$. We have:

Proposition 4.7 The optimal utility $V(x)$ coincides with $\hat V(x)$.
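Proof. Let $\{\theta(t)\}_{0\le t\le T}$ be an arbitrary admissible portfolio process. Since it is adapted to the filtration $\{\mathcal{F}_t^{S,Y}\}_{0\le t\le T}$, the random variable
$$\int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t}$$
is $\mathcal{F}_T^{S,Y}$-measurable. Since the random variable $\xi$ is $\mathcal{F}_T^{S,Y}$-measurable by assumption, the random variable
$$\int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi$$
is also $\mathcal{F}_T^{S,Y}$-measurable. From this and Lemma 4.14 it follows that
\begin{align*}
\mathbb{E}\Big\{U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\}
&= \hat{\mathbb{E}}\Big\{H_T\,U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\}\\
&= \hat{\mathbb{E}}\Big\{\hat{\mathbb{E}}\{H_T \mid \mathcal{F}_T^{S,Y}\}\,U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\}\\
&= \hat{\mathbb{E}}\Big\{U\Big(x + \int_0^T \theta(t)\,\frac{1}{\sigma(Y_t,t)}\,\frac{dS_t}{S_t} + \xi\Big)\Big\}.
\end{align*}
The desired result follows by taking the supremum over all admissible portfolio processes $\{\theta(t)\}_{0\le t\le T}$. □

4.5 THE CONDITIONALLY GAUSSIAN CASE

In this section, we look at a special case where the filtering problem can be reduced to a system of finitely many stochastic differential equations. To this end, we assume that the processes $\{Y_t\}_{0\le t\le T}$ and $\{S_t\}_{0\le t\le T}$ satisfy the stochastic differential equations
$$dY_t = \big(g(Y_t,t) + \varphi(Y_t,t)\,\zeta_t\big)\,dt + h(Y_t,t)\,dW_t$$
and
$$\frac{dS_t}{S_t} = \mu(Y_t,t)\,dt + \sqrt{1-\rho^2}\,\sigma(Y_t,t)\,dW_t^{\perp} + \rho\,\frac{\sigma(Y_t,t)}{h(Y_t,t)}\,dY_t,$$
where $\{W_t\}_{t\ge 0}$ and $\{W_t^{\perp}\}_{t\ge 0}$ are independent Wiener processes. The second equation can be written in the form
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[\lambda(Y_t,t)\,dt + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\zeta_t\,dt + dW_t^{S}\Big], \tag{4.46}$$
where
$$\lambda(Y_t,t) = \frac{\mu(Y_t,t)}{\sigma(Y_t,t)} + \rho\,\frac{g(Y_t,t)}{h(Y_t,t)},$$
and where the Wiener process $\{W_t^{S}\}_{t\ge 0}$ is defined as before.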
Furthermore, we assume that the dynamics of the process $\{\zeta_t\}_{0\le t\le T}$ are given by
$$d\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\zeta_t\big)\,dt + q(Y_t,t)\,dB_t,$$
where $\{B_t\}_{0\le t\le T}$ is a Wiener process independent of $\{W_t\}_{t\ge 0}$ and $\{W_t^{\perp}\}_{t\ge 0}$. For these dynamics, we assume that the drift is affine in $\zeta_t$ and that the diffusion coefficient is independent of $\zeta_t$. These two assumptions imply that we are dealing with a conditionally Gaussian model, and such models are amenable to Kalman filtering theory, which we use next.

4.5.1 Equations for the Optimal Filter

We denote by $\hat\zeta_t$ and $\omega_t$ the conditional mean and variance of $\zeta_t$, i.e.,
$$\hat\zeta_t = \mathbb{E}\{\zeta_t \mid \mathcal{F}_t^{S,Y}\} \quad\text{and}\quad \omega_t = \mathbb{E}\{(\zeta_t - \hat\zeta_t)^2 \mid \mathcal{F}_t^{S,Y}\}.$$
Following the classical presentation of the Kalman filter given in [174], one can prove that the dynamics of these first two conditional moments are given by
$$d\hat\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\hat\zeta_t\big)\,dt + \frac{1}{h(Y_t,t)^2}\,\varphi(Y_t,t)\,\omega_t\,\big(dY_t - g(Y_t,t)\,dt - \varphi(Y_t,t)\,\hat\zeta_t\,dt\big), \tag{4.47}$$
and
$$d\omega_t = \Big[q(Y_t,t)^2 + 2\,b(Y_t,t)\,\omega_t - \frac{1}{h(Y_t,t)^2}\,\varphi(Y_t,t)^2\,\omega_t^2\Big]\,dt. \tag{4.48}$$
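Equations (4.47) and (4.48) are straightforward to discretize. The sketch below uses an explicit Euler scheme with coefficients frozen at constants; the scheme, the constant coefficients, and the function names are assumptions made for illustration and are not prescribed by the text. With constant coefficients, the Riccati equation (4.48) has a positive stationary solution, which gives a simple check of the recursion.

```python
import math


def riccati_step(omega, q, b, phi, h, dt):
    """One Euler step of d(omega) = (q^2 + 2 b omega - phi^2 omega^2 / h^2) dt."""
    return omega + (q * q + 2.0 * b * omega - (phi * omega / h) ** 2) * dt


def filter_mean_step(zeta_hat, omega, dY, a, b, g, phi, h, dt):
    """One Euler step of the conditional mean equation (4.47).

    dY is the observed increment of Y over the step; the innovation is the
    part of dY not explained by the current estimate zeta_hat.
    """
    innovation = dY - g * dt - phi * zeta_hat * dt
    return zeta_hat + (a + b * zeta_hat) * dt + phi * omega / (h * h) * innovation


def stationary_variance(q, b, phi, h):
    """Positive root of q^2 + 2 b omega - phi^2 omega^2 / h^2 = 0 (constant coefficients)."""
    return h * h * (b + math.sqrt(b * b + (q * phi / h) ** 2)) / (phi * phi)


if __name__ == "__main__":
    q, b, phi, h, dt = 1.0, -1.0, 1.0, 1.0, 1e-3
    omega = 1.0
    for _ in range(20000):
        omega = riccati_step(omega, q, b, phi, h, dt)
    print(omega, stationary_variance(q, b, phi, h))
```

Note that the variance recursion is deterministic once the path of $Y$ is given, so in the constant-coefficient case it can be precomputed before the observations arrive.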
4.5.2 Formulas for the Value Function

We are now in a position to compute the optimal expected utility of terminal wealth
$$V(x) = \sup \mathbb{E}\{U(X_T + \xi) \mid \mathcal{F}_t^{S,Y}\}$$
for the three most commonly used utility functions $U$ already used in the previous section.

The Case of Constant Absolute Risk Aversion

Suppose that the investor's risk preferences are given by an exponential utility function $U(x) = -\exp(-\gamma x)$ for some $\gamma > 0$. We consider a contingent claim whose payoff is modeled by an $\mathcal{F}_T^{Y}$-measurable random variable $\xi$, and we compute the optimal utility
$$V(x) = \sup \mathbb{E}\{-\exp(-\gamma(X_T + \xi))\},$$
where the supremum is taken over all admissible portfolio processes $\{\theta(t)\}_{0\le t\le T}$. To this end, we replace $V(x)$ by $\hat V(x)$ defined by
$$\hat V(x) = \sup \hat{\mathbb{E}}\{-\exp(-\gamma(X_T + \xi))\}.$$
As above, the supremum runs over all admissible portfolio processes $\{\theta(t)\}_{0\le t\le T}$. This is a stochastic control problem with full observation. By Proposition 4.2, its solution is given by
$$\hat V(x) = -\exp(-\gamma x)\,\hat{\mathbb{E}}\Big\{\exp\Big(-\gamma(1-\rho^2)\xi - \rho\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t - \frac12\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{\frac{1}{1-\rho^2}}.$$
Note that, under the probability measure $\hat{\mathbb{P}}$, the dynamics of $\{S_t\}_{0\le t\le T}$, $\{Y_t\}_{0\le t\le T}$, and $\{\hat\zeta_t\}_{0\le t\le T}$ are given by
$$dY_t = \big(g(Y_t,t) + \varphi(Y_t,t)\,\hat\zeta_t\big)\,dt + h(Y_t,t)\,d\hat W_t,$$
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[\lambda(Y_t,t)\,dt + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt + d\hat W_t^{S}\Big],$$
where we use the notation (4.43) for $\hat W^{S}$, and
$$d\hat\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\hat\zeta_t\big)\,dt + \frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\omega_t\,d\hat W_t.$$
To simplify this result, it is convenient to introduce the probability measure $\hat{\mathbb{P}}_1$ defined by
$$\frac{d\hat{\mathbb{P}}_1}{d\hat{\mathbb{P}}} = \exp\Big(-\rho\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t - \frac12\,\rho^2\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big).$$
With this definition, the optimal utility can be written as
$$\hat V(x) = -\exp(-\gamma x)\,\hat{\mathbb{E}}_1\Big\{\exp\Big(-\gamma(1-\rho^2)\xi - \frac12(1-\rho^2)\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{\frac{1}{1-\rho^2}}.$$
This formula is of the Feynman-Kac type and as such it is amenable to computations, either by Monte Carlo methods or by numerical methods for partial differential equations. Note that, under the probability measure $\hat{\mathbb{P}}_1$, the dynamics of the processes $\{S_t\}_{0\le t\le T}$ and $\{Y_t\}_{0\le t\le T}$ are given by
$$dY_t = \big(g(Y_t,t) - \rho\,h(Y_t,t)\,\lambda(Y_t,t) + (1-\rho^2)\,\varphi(Y_t,t)\,\hat\zeta_t\big)\,dt + h(Y_t,t)\,d\hat W_{1,t},$$
and
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[(1-\rho^2)\,\lambda(Y_t,t)\,dt + \rho(1-\rho^2)\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt + \rho\,d\hat W_{1,t} + \sqrt{1-\rho^2}\,dW_t^{\perp}\Big],$$
where $\{\hat W_{1,t}\}_{0\le t\le T}$ is a Brownian motion relative to the probability measure $\hat{\mathbb{P}}_1$. Moreover, the conditional expectation $\hat\zeta_t$ satisfies the stochastic differential equation
$$d\hat\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\hat\zeta_t\big)\,dt + \omega_t\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\Big[d\hat W_{1,t} - \rho\,\lambda(Y_t,t)\,dt - \rho^2\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt\Big].$$

Constant Relative Risk Aversion

We now assume that the investor's risk preferences are described by the utility function $U(x) = x^\gamma$ for some $0 < \gamma < 1$, and we compute the optimal utility
$$V(x) = \sup \mathbb{E}\{X_T^\gamma\},$$
where the supremum is taken over all admissible portfolio processes $\{\pi(t)\}_{0\le t\le T}$. As before, we can replace $V(x)$ by the optimal utility $\hat V(x)$ of a fully observable model, and the value of the latter is given by
$$\hat V(x) = x^\gamma\,\hat{\mathbb{E}}\Big\{\exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t + \frac12\,\frac{\gamma}{1-\gamma}\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{\frac{1-\gamma}{1-\gamma+\gamma\rho^2}},$$
where the expectation is taken under the probability measure $\hat{\mathbb{P}}$. To simplify this result, it is convenient to define a probability measure $\hat{\mathbb{P}}_2$ by
$$\frac{d\hat{\mathbb{P}}_2}{d\hat{\mathbb{P}}} = \exp\Big(\frac{\gamma\rho}{1-\gamma}\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t - \frac12\,\frac{\gamma^2\rho^2}{(1-\gamma)^2}\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big).$$
With this definition, the optimal utility can be written as
$$\hat V(x) = x^\gamma\,\hat{\mathbb{E}}_2\Big\{\exp\Big(\frac12\,\frac{\gamma}{1-\gamma}\,\frac{1-\gamma+\gamma\rho^2}{1-\gamma}\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{\frac{1-\gamma}{1-\gamma+\gamma\rho^2}},$$
and for $\hat{\mathbb{P}}_2$, the dynamics of the processes $\{S_t\}_{0\le t\le T}$ and $\{Y_t\}_{0\le t\le T}$ are given by
$$dY_t = \Big(g(Y_t,t) + \frac{\gamma\rho}{1-\gamma}\,h(Y_t,t)\,\lambda(Y_t,t) + \frac{1-\gamma+\gamma\rho^2}{1-\gamma}\,\varphi(Y_t,t)\,\hat\zeta_t\Big)\,dt + h(Y_t,t)\,d\hat W_{2,t},$$
and
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[\frac{1-\gamma+\gamma\rho^2}{1-\gamma}\,\lambda(Y_t,t)\,dt + \rho\,\frac{1-\gamma+\gamma\rho^2}{1-\gamma}\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt + \rho\,d\hat W_{2,t} + \sqrt{1-\rho^2}\,dW_t^{\perp}\Big],$$
where $\{\hat W_{2,t}\}_{0\le t\le T}$ is a Wiener process relative to the probability measure $\hat{\mathbb{P}}_2$. Moreover, the conditional expectation $\hat\zeta_t$ satisfies the stochastic differential equation
$$d\hat\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\hat\zeta_t\big)\,dt + \omega_t\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\Big[d\hat W_{2,t} + \frac{\gamma\rho}{1-\gamma}\,\lambda(Y_t,t)\,dt + \frac{\gamma\rho^2}{1-\gamma}\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt\Big].$$

Logarithmic Utility Function

We next consider the logarithmic utility function $U(x) = \log x$. In this case, the optimal utility is
$$V(x) = \sup \mathbb{E}\{\log X_T\},$$
where the supremum is over all admissible portfolio processes $\{\pi(t)\}_{0\le t\le T}$. In order to derive a formula for $V(x)$, we use the identity $V(x) = \hat V(x)$, where
$$\hat V(x) = \sup \hat{\mathbb{E}}\{\log X_T\}.$$
Again, the supremum runs over all admissible portfolio processes $\{\pi(t)\}_{0\le t\le T}$. This is a stochastic control problem with full observation, the solution of which is given by
$$\hat V(x) = \log x + \frac12\,\hat{\mathbb{E}}\Big\{\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big\}.$$
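To make the remark about Monte Carlo computation concrete, here is a minimal sketch of how the Feynman-Kac representation of the exponential-utility value function under $\hat{\mathbb{P}}_1$ might be evaluated. It is an illustration under several assumptions that are not in the text: an Euler scheme for $(Y, \hat\zeta, \omega)$, a claim of the form $\xi = \texttt{xi}(Y_T)$, and a user-supplied callable `coeffs(y, t)` returning the tuple $(g, h, \varphi, \lambda, a, b, q)$. All of these names are hypothetical.

```python
import math
import random


def cara_value_mc(x, gamma, rho, T, coeffs, xi, y0, zeta0, omega0,
                  n_steps=200, n_paths=2000, seed=0):
    """Monte Carlo estimate of the CARA value function under P1-hat.

    coeffs(y, t) -> (g, h, phi, lam, a, b, q): hypothetical coefficient
    callable with h > 0; xi(y_T) is the claim payoff, assumed to depend
    on the terminal value of Y only (a simplification of F^Y_T-measurability).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    k = 1.0 - rho * rho
    acc = 0.0
    for _ in range(n_paths):
        y, zh, om, integral = y0, zeta0, omega0, 0.0
        for i in range(n_steps):
            g, h, phi, lam, a, b, q = coeffs(y, i * dt)
            m = lam + rho * phi / h * zh          # market price of risk given the filter
            integral += m * m * dt                # running integral of m^2
            dw = rng.gauss(0.0, math.sqrt(dt))    # increment of W1-hat
            dy = (g - rho * h * lam + k * phi * zh) * dt + h * dw
            dzh = (a + b * zh) * dt + om * phi / h * (
                dw - rho * lam * dt - rho * rho * phi / h * zh * dt)
            dom = (q * q + 2.0 * b * om - (phi * om / h) ** 2) * dt
            y, zh, om = y + dy, zh + dzh, om + dom
        acc += math.exp(-gamma * k * xi(y) - 0.5 * k * integral)
    return -math.exp(-gamma * x) * (acc / n_paths) ** (1.0 / k)


if __name__ == "__main__":
    # Degenerate check: phi = 0 and no claim, so the value is -exp(-gamma x - lam^2 T / 2).
    const = lambda y, t: (0.0, 1.0, 0.0, 0.3, 0.0, 0.0, 0.0)
    print(cara_value_mc(1.0, 2.0, 0.5, 1.0, const, lambda y: 0.0, 0.0, 0.0, 0.0,
                        n_steps=50, n_paths=20))
    print(-math.exp(-2.0 * 1.0 - 0.5 * 0.3 ** 2 * 1.0))
```

In the degenerate case used in the check, the integrand is deterministic, so the estimate agrees with the closed form up to rounding. For genuinely random coefficients the usual Monte Carlo and discretization errors apply, and the power $1/(1-\rho^2)$ applied to the sample mean introduces a small bias.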
4.5.3 Mean-Variance Hedging

We finally assume that the investor's risk preferences are modeled by the mean-variance principle. We define the value function $V(x,y)$ by
$$V(x,y) = \inf\big(\mathbb{E}\{X_T^2\} - \mathbb{E}\{X_T\}^2\big),$$
where the infimum is taken over all admissible portfolio processes with initial capital $x$ and expected return $\mathbb{E}\{X_T\} = y$.

As above, one can show that $V(x,y) = \hat V(x,y)$, where
$$\hat V(x,y) = \inf\big(\hat{\mathbb{E}}\{X_T^2\} - \hat{\mathbb{E}}\{X_T\}^2\big),$$
where the infimum runs over all admissible portfolio processes with initial capital $x$ and expected return $\hat{\mathbb{E}}\{X_T\} = y$. Hence, we have reduced the problem to a stochastic control problem with full observation, the solution of which is given by
$$\hat V(x,y) = (y-x)^2\,\bigg[\hat{\mathbb{E}}\Big\{\exp\Big(-2\rho\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t - \int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{-\frac{1}{1-2\rho^2}} - 1\bigg]^{-1}.$$
To simplify this result, we define a probability measure $\hat{\mathbb{P}}_3$ by
$$\frac{d\hat{\mathbb{P}}_3}{d\hat{\mathbb{P}}} = \exp\Big(-2\rho\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)\,d\hat W_t - 2\rho^2\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big).$$
With this definition, the optimal value $\hat V(x,y)$ can be written in the form
$$\hat V(x,y) = (y-x)^2\,\bigg[\hat{\mathbb{E}}_3\Big\{\exp\Big(-(1-2\rho^2)\int_0^T \Big(\lambda(Y_t,t) + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\Big)^2\,dt\Big)\Big\}^{-\frac{1}{1-2\rho^2}} - 1\bigg]^{-1}.$$
Furthermore, the dynamics of $\{Y_t\}_{0\le t\le T}$ and $\{S_t\}_{0\le t\le T}$ are given by
$$dY_t = \big(g(Y_t,t) - 2\rho\,h(Y_t,t)\,\lambda(Y_t,t) + (1-2\rho^2)\,\varphi(Y_t,t)\,\hat\zeta_t\big)\,dt + h(Y_t,t)\,d\hat W_{3,t}$$
and
$$\frac{dS_t}{S_t} = \sigma(Y_t,t)\Big[(1-2\rho^2)\,\lambda(Y_t,t)\,dt + \rho(1-2\rho^2)\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt + \rho\,d\hat W_{3,t} + \sqrt{1-\rho^2}\,dW_t^{\perp}\Big],$$
where $\{\hat W_{3,t}\}_{0\le t\le T}$ is a Wiener process under the probability measure $\hat{\mathbb{P}}_3$. Moreover, the conditional expectation $\hat\zeta_t$ satisfies the stochastic differential equation
$$d\hat\zeta_t = \big(a(Y_t,t) + b(Y_t,t)\,\hat\zeta_t\big)\,dt + \omega_t\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\Big[d\hat W_{3,t} - 2\rho\,\lambda(Y_t,t)\,dt - 2\rho^2\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t\,dt\Big].$$

Consider now a contingent claim, settling at time $T$, written on the observable processes $S$ and $Y$. Its payoff is modeled by an $\mathcal{F}_T^{S,Y}$-measurable random variable $\eta$.
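For every initial capital $x$, we define the minimal replication error
$$R_\eta(x) = \inf \mathbb{E}\{(X_T - \eta)^2\},$$
where the infimum is taken over all admissible portfolios $\{\theta(t)\}_{0\le t\le T}$, and $\{X_t\}_{0\le t\le T}$ is the wealth process associated with $\{\theta(t)\}_{0\le t\le T}$ and the initial capital $x$. Arguing as above, we obtain $R_\eta(x) = \hat R_\eta(x)$, where
$$\hat R_\eta(x) = \inf \hat{\mathbb{E}}\{(X_T - \eta)^2\},$$
where the infimum is taken over all admissible portfolio processes $\{\theta(t)\}_{0\le t\le T}$. For abbreviation, let
$$\hat\lambda(t) = \frac{\mu(Y_t,t)}{\sigma(Y_t,t)} + \rho\,\frac{g(Y_t,t)}{h(Y_t,t)} + \rho\,\frac{\varphi(Y_t,t)}{h(Y_t,t)}\,\hat\zeta_t,$$
and assume that $\{\hat\pi^*(t)\}_{0\le t\le T}$ is chosen such that
$$\exp\Big(-\frac{1}{\rho}\int_0^T \big(\hat\lambda(t) + (1-2\rho^2)\,\hat\pi^*(t)\big)\,d\hat W_t - \frac12\,\frac{1}{\rho^2}\int_0^T \big(\hat\lambda(t) + (1-2\rho^2)\,\hat\pi^*(t)\big)^2\,dt\Big) = \hat{\mathbb{E}}\Big\{\exp\Big(-2\rho\int_0^T \hat\lambda(t)\,d\hat W_t - \int_0^T \hat\lambda(t)^2\,dt\Big)\Big\}^{-1}\exp\Big(-2\rho\int_0^T \hat\lambda(t)\,d\hat W_t - \int_0^T \hat\lambda(t)^2\,dt\Big).$$
Finally, we introduce the exponential martingale $\{\hat Z_t\}_{0\le t\le T}$,
$$\hat Z_t = \exp\Big(-\frac{1}{\rho}\int_0^t \hat\lambda(u)\,d\hat W_u - \frac{\sqrt{1-\rho^2}}{\rho}\int_0^t \hat\pi^*(u)\big[\sqrt{1-\rho^2}\,d\hat W_u - \rho\,dW_u^{\perp}\big] - \frac12\,\frac{1}{\rho^2}\int_0^t \hat\lambda(u)^2\,du - \frac{1-\rho^2}{\rho^2}\int_0^t \hat\pi^*(u)\,\hat\lambda(u)\,du - \frac12\,\frac{1-\rho^2}{\rho^2}\int_0^t \hat\pi^*(u)^2\,du\Big).$$
Then the minimal replication error $\hat R_\eta(x)$ is smallest for
$$x = \hat{\mathbb{E}}\{\hat Z_T\,\eta\}.$$
This is the equivalent for our partially observable market of the Föllmer-Schweizer hedging price [91] of $\eta$.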
PART 3
Applications
Chapter Five

Portfolio Optimization

Aytac Ilhan, Mattias Jonsson, and Ronnie Sircar
We study the problem of portfolio optimization in an incomplete market using derivatives as well as basic assets such as stocks. In such markets, an investor may want to use derivatives, as a proxy for trading volatility, for instance, but they should be traded statically, or relatively infrequently, compared with assumed continuous trading of stocks, because of the much larger transaction costs. We discuss the computational tractability obtained by assuming exponential utility, and connection to the method of utility-indifference pricing. In particular, we show that the optimal number of derivatives to invest in is given by the optimizer in the Legendre transform of the indifference price as a function of quantity, evaluated at the market price. This is illustrated in a standard diffusion stochastic volatility model, when the indifference price is the solution of a quasilinear PDE problem. We suggest some asymptotic approximations for the optimal derivative holding, first when it might be small, and second in the case of slowly varying volatility.
5.1 INTRODUCTION In this chapter, we study the problem of portfolio optimization using stocks and derivatives, where the performance is measured by expected exponential utility. 5.1.1 Background Portfolio optimization problems within the context of continuous-time stochastic models of financial variables have been, and continue to be, the subject of much research activity in financial mathematics and engineering. A wide-ranging theory has been developed since the papers of Merton [186,187] for understanding the issue of optimal asset allocation for maximizing expected utility under various sources of market incompleteness (such as transaction costs, trading constraints, or stochastic volatility), different utility prescriptions, the presence of random endowments, and so on. Key references, in particular for the duality theory used heavily to obtain existence and uniqueness results in incomplete markets, are [153, 162], and recent extensions can be found in [59, 243].
[Figure 5.1 Strangle payoff function. Horizontal axis: stock price at time $T$, with the strikes $K_l$ and $K_u$ marked; vertical axis: payoff.]
Typically, these results study the problem of optimizing over continuous trading strategies in primitive (or underlying) securities such as stocks. However, traders have long been using derivative securities as a proxy for some of the untradable components in an incomplete market. A standard example is a strangle that involves long positions in a European call option with strike Ku and a European put option with the same maturity date T and a lower strike Kl . Both options are out of the money at the time of their purchase, so we assume the current stock price S0 ∈ (Kl , Ku ). The terminal payoff as a function of the stock price on date T is shown in Figure 5.1. Such a position is often described as being “long volatility,” since the holder is rewarded by a significant move in the stock price, either up or down. In this chapter, we study the problem of incorporating derivatives along with stocks in the investment problem. A crucial difference between the two asset classes is that transaction costs on derivatives trades are significantly higher than on basic stocks. In addition, there may be greater liquidity issues in the less frequently traded derivatives markets. We shall assume, therefore, that stocks can be traded continuously, ignoring transaction costs, but that options can only be bought or sold statically. Other authors have studied a similar problem, but under different assumptions. Carr and Madan [48] assumed the availability for trading of European options of all strikes, thereby completing the market, and in a one-period equilibrium model. Liu and Pan [175] assumed continuous trading of the derivatives, again completing the market in a different way. We shall assume a given finite set of contingent claims available for purchase (or short sale) at given unit market prices. For tractability, we will also assume the investor’s preferences to be described by an exponential utility function. 
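For concreteness, the strangle payoff described above can be written in a few lines. This is only an illustrative sketch: the function name and the numerical strikes are hypothetical, and the payoff is the sum of a long put struck at $K_l$ and a long call struck at $K_u$, as in the text.

```python
def strangle_payoff(s_T, k_l, k_u):
    """Terminal payoff of a long put struck at k_l plus a long call struck at k_u."""
    if k_l > k_u:
        raise ValueError("the put strike k_l should not exceed the call strike k_u")
    return max(k_l - s_T, 0.0) + max(s_T - k_u, 0.0)


if __name__ == "__main__":
    # The payoff is zero between the strikes and grows linearly outside them.
    for s in (80.0, 90.0, 100.0, 110.0, 125.0):
        print(s, strangle_payoff(s, 90.0, 110.0))
```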
In a complete market, derivatives are redundant because they can be replicated by dynamic trading in the underlying. In that case, the problem studied here is
not well posed. The utility-indifference pricing mechanism, introduced by Hodges and Neuberger [131], asks at what price an investor is indifferent, with respect to maximum expected utility, about a given derivatives position in an incomplete market. It turns out that this question, rephrased in terms of quantity of derivatives for given market prices, yields the answer as to the optimal static position to take in the derivatives.

5.1.2 The Investment Problem

We describe here the problem in the simplest setting of one stock and one derivative. The more general problem where many options are available is studied in [137]. We suppose there is an investor with initial investment capital $v > 0$. He can trade dynamically in a stock and a bank account. The stock price process is denoted $S$, and is defined on a probability space $(\Omega, \mathcal{F}, P)$, where $P$ is the investor’s subjective measure. For simplicity, we take the interest rate to be zero throughout. The analysis can be modified for a nonzero interest rate by switching to discounted variables. He can also trade statically in a derivative security that pays the random amount $G$ on date $T$. The market price of the derivative is $p$.

Assumption 5.1 The payoff $G$ is bounded.

Remark. Although the strangle given as an example in the previous section, and other common strategies involving call option payouts, are not bounded, we shall assume they are replaced by their cutoff versions, where the cutoffs are conditioned on sufficiently extreme events so as not to affect practical accuracy. For some relaxations of this assumption, see [81], [16], or [138].

Assumption 5.2 The investor has an exponential utility function $U(x) = -e^{-\gamma x}$, where $\gamma > 0$ is his risk-aversion parameter.

The investor buys $\alpha$ derivatives for price $\alpha p$ at time zero, and holds them until expiration at time $T$. With his remaining capital $v - \alpha p$, he trades continuously in the Merton portfolio, that is, the stock and bank account.
Let $\theta_t$ be the amount held in the stock at time $t$, and $(X_t)_{0 \le t \le T}$ the value of this latter portfolio,
$$X_t = v - \alpha p + \int_0^t \theta_u \, dS_u.$$
Then the investor’s problem is to maximize his expected utility of terminal wealth over both the dynamic control $\theta$ and the static derivative quantity $\alpha$. Let
$$u(x, \alpha G, \gamma) = \sup_{\theta} E\{-e^{-\gamma (X_T + \alpha G)}\}$$
be the optimal expected utility from trading the stock with initial capital $x$ and an option payout (or random endowment) $\alpha G$ at the terminal time. Then the investor’s problem is to find
$$\max_{\alpha} \, u(v - \alpha p, \alpha G, \gamma). \tag{5.1}$$
This is an optimization of a function that is itself the value function of a stochastic control problem. In the next section, we show that it is closely related to another control problem, namely that of finding the utility-indifference price of the derivatives.
5.2 INDIFFERENCE PRICING AND THE DUAL FORMULATION

We consider a market with two tradable instruments: the stock, or the risky asset, $S$ and the riskless bond. We assume that $S$ is a locally bounded $(P, \mathbb{F})$-semi-martingale, where $\mathbb{F}$ is a filtration on the given probability space satisfying the usual conditions.

5.2.1 Utility Indifference Prices

We start by defining the sets of absolutely continuous and equivalent local martingale measures, $\mathcal{P}_a$ and $\mathcal{P}_e$, as
$$\mathcal{P}_a = \{Q \ll P \mid S \text{ is a local } (Q, \mathbb{F})\text{-martingale}\}, \qquad \mathcal{P}_e = \{Q \sim P \mid S \text{ is a local } (Q, \mathbb{F})\text{-martingale}\}.$$
The indifference price of the claim $G$ is specified through the solutions of two stochastic control problems. The first is the classical Merton optimal investment problem
$$M(x, \gamma) = \sup_{\theta \in \Theta} E\{-e^{-\gamma X_T}\}, \tag{5.2}$$
where $\Theta$ is a suitable set of trading strategies made precise below. The second is the optimal investment problem for the buyer of the claim,
$$u(x, G, \gamma) = \sup_{\theta \in \Theta} E\{-e^{-\gamma (X_T + G)}\}. \tag{5.3}$$
Clearly, $u(x, 0, \gamma) = M(x, \gamma)$. Then the (buyer’s) indifference price $h$ of $G$ is defined by
$$u(x - h(G, \gamma), G, \gamma) = M(x, \gamma). \tag{5.4}$$
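As an illustration of definition (5.4), the following is a minimal sketch in a hypothetical one-period, three-state market, which is not part of the chapter's model. The probabilities, price moves, payoffs, and function names are all assumptions chosen for the example. For exponential utility the initial wealth factors out as $e^{-\gamma x}$, so (5.4) reduces to $h = \frac{1}{\gamma}\log\bigl(M(0,\gamma)/u(0,G,\gamma)\bigr)$, which is what the code evaluates.

```python
import math

# Hypothetical one-period market with zero interest rate: the stock moves by
# DS[i] in state i, which has subjective probability PROBS[i].
PROBS = [0.4, 0.3, 0.3]
DS = [20.0, 0.0, -20.0]


def min_expected_exp(gamma, payoff, iters=200):
    """Return min over theta of E[exp(-gamma * (theta * dS + payoff))].

    This equals -u(0, G, gamma) for the terminal payoff G = payoff. The
    objective is convex in theta, so a ternary search on a bracket suffices.
    """
    def objective(theta):
        return sum(p * math.exp(-gamma * (theta * d + g))
                   for p, d, g in zip(PROBS, DS, payoff))

    lo, hi = -1.0 / gamma, 1.0 / gamma   # assumed bracket for the optimal theta
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


def indifference_price(gamma, payoff):
    """Buyer's price h solving u(x - h, G, gamma) = M(x, gamma), as in (5.4)."""
    merton = min_expected_exp(gamma, [0.0] * len(PROBS))   # equals -M(0, gamma)
    with_claim = min_expected_exp(gamma, payoff)           # equals -u(0, G, gamma)
    return (math.log(merton) - math.log(with_claim)) / gamma


strangle = [10.0, 0.0, 10.0]      # bounded, strangle-like payoff
replicable = [25.0, 5.0, -15.0]   # equals 5 + dS, so it can be hedged exactly
print(indifference_price(0.1, strangle))
print(indifference_price(0.1, replicable))
```

The replicable payoff is a sanity check: its indifference price should equal its replication cost of 5, whatever the risk aversion, while the strangle's price should fall strictly inside the range of its payoffs.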
To be specific about the permissible trading strategies, we first introduce $\mathcal{P}_f(P)$, the set of measures in $\mathcal{P}_a$ with finite relative entropy with respect to $P$, where the relative entropy $H(Q|P)$ of a measure $Q$ is defined by
$$H(Q|P) = \begin{cases} E\left\{ \dfrac{dQ}{dP} \log \dfrac{dQ}{dP} \right\}, & Q \ll P, \\ \infty, & \text{otherwise.} \end{cases} \tag{5.5}$$

Assumption 5.3 There exists an equivalent local martingale measure with finite relative entropy:
$$\mathcal{P}_f(P) \cap \mathcal{P}_e \neq \emptyset. \tag{5.6}$$
We denote by $\Theta$ the set of $S$-integrable trading strategies for which the corresponding wealth process is a martingale under all measures in $\mathcal{P}_f(P)$ with respect to the filtration $\mathbb{F}$.

5.2.2 Dual Problem: Relative Entropy Minimization

It is convenient for interpretation and computation to study the dual of the buyer’s stochastic control problem (5.3). The dual of the maximization of expected exponential utility over trading strategies with a static option position is the problem of minimizing the expected payoff of this option over a space of measures, penalized by the relative entropy of the measure. Before stating these results and related references, we sketch the relation between these problems.

Sketch of Duality Theory

Let us start by weakening the martingale equality constraint to an inequality:
$$u(x, G, \gamma) = \sup_{\theta} E\{U(X_T + G)\}, \tag{5.7}$$
$$\text{s.t. } E^Q\{X_T\} \le x, \quad \forall\, Q \in \mathcal{P}_f(P). \tag{5.8}$$
As $G$ is bounded, we define $\xi = X_T + G$. For a measure $Q \in \mathcal{P}_f(P)$, a positive constant $\delta$, and a trading strategy such that (5.8) holds,
$$E\{U(\xi)\} \le E\left\{ U(\xi) - \delta \frac{dQ}{dP} (\xi - G - x) \right\},$$
since we are adding a nonnegative quantity to the right-hand side. Taking the supremum of the integrand $\omega$ by $\omega$, which bounds the supremum over all $\xi$’s, we get
$$\sup_{\xi} E\{U(\xi)\} \le E\left\{ V\!\left( \delta \frac{dQ}{dP} \right) \right\} + \delta \left( x + E^Q\{G\} \right), \quad \forall\, Q \in \mathcal{P}_f(P), \ \forall\, \delta \ge 0, \tag{5.9}$$
where $V(\delta)$ is the Fenchel-Legendre transform of $U(x)$,
$$V(\delta) = \sup_{x \in \mathbb{R}} \left( U(x) - \delta x \right) = \frac{\delta}{\gamma} \log \frac{\delta}{\gamma} - \frac{\delta}{\gamma}, \quad \delta \ge 0. \tag{5.10}$$
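The closed form in (5.10) can be checked by a direct first-order computation. The following worked derivation is a sketch added for illustration; it writes $\delta$ for the dual variable (a symbol chosen here, since any positive multiplier plays the same role) and uses $U(x) = -e^{-\gamma x}$ from Assumption 5.2.

```latex
\begin{align*}
\frac{d}{dx}\Bigl(-e^{-\gamma x} - \delta x\Bigr)
  = \gamma e^{-\gamma x} - \delta = 0
  &\iff x^{*} = -\frac{1}{\gamma}\log\frac{\delta}{\gamma}, \\
V(\delta) = U(x^{*}) - \delta x^{*}
  &= -\frac{\delta}{\gamma} + \frac{\delta}{\gamma}\log\frac{\delta}{\gamma}.
\end{align*}
```

Since $U$ is strictly concave, the critical point $x^{*}$ is the maximizer for every $\delta > 0$. For $\delta = 0$ the supremum of $-e^{-\gamma x}$ is $0$, which agrees with the formula under the convention $0 \log 0 = 0$.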
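Note that $E\{V(\delta \frac{dQ}{dP})\}$ is finite for $\delta < \infty$ as $Q$ is in $\mathcal{P}_f(P)$. Furthermore, taking the infimum in (5.9) over all possible $Q$’s and $\delta$’s, we get
$$\sup_{\xi} E\{U(\xi)\} \le \inf_{\delta \ge 0} \, \inf_{Q \in \mathcal{P}_f(P)} \left[ E\left\{ V\!\left( \delta \frac{dQ}{dP} \right) \right\} + \delta \left( x + E^Q\{G\} \right) \right]. \tag{5.11}$$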
Writing $V(\delta)$ explicitly as defined in (5.10), the above reduces to
$$\sup_{\xi} E\{U(\xi)\} \le \inf_{\delta \ge 0} \, \inf_{Q \in \mathcal{P}_f(P)} \left[ \frac{\delta}{\gamma} \log \frac{\delta}{\gamma} - \frac{\delta}{\gamma} + \delta x + \delta \left( \frac{1}{\gamma} H(Q|P) + E^Q\{G\} \right) \right]. \tag{5.12}$$
The optimizing $Q$ in (5.12) does not depend on $\delta$, which will not be true for other common utility functions, and in particular log and power.

One part of the problem is showing the existence and uniqueness of $\hat{Q} \in \mathcal{P}_f(P)$ that minimizes
$$\frac{1}{\gamma} H(Q|P) + E^Q\{G\}$$
under Assumption 5.3 and the boundedness assumption on $G$. In [81], Delbaen et al. reduce this problem to the case where $G = 0$ with a measure transformation, and they conclude as needed by using the results of Frittelli [97]. Given $\hat{Q}$, the parameter $\hat{\delta}$ that minimizes the right-hand side of (5.12) is given by
$$\hat{\delta} = \gamma \exp\left( -\gamma \left( x + E^{\hat{Q}}\{G\} + \frac{1}{\gamma} H(\hat{Q}|P) \right) \right),$$
which is strictly positive. From (5.10), the corresponding maximizer $\hat{\xi}$ of the right-hand side of (5.9) is given by
$$\hat{\xi} = -\frac{1}{\gamma} \log\left( \frac{\hat{\delta}}{\gamma} \frac{d\hat{Q}}{dP} \right) = \hat{X}_T + G.$$
With the above representation, we conclude that $\hat{X}_T$ is a martingale under $\hat{Q}$. Concluding that the wealth is a stochastic integral of a trading strategy in $\Theta$ with respect to $S$ is more involved, and we refer the reader to Lemma 3.3 of [81], or Proposition 1.2.3 of [16]. Then $\hat{X}_T$ is optimal for the primal problem, because for any trading strategy in $\Theta$, we have
$$E\{U(\xi)\} \le E\left\{ U(\xi) - \hat{\delta} \frac{d\hat{Q}}{dP} (\xi - G - x) \right\} \le E\left\{ U(\hat{\xi}) - \hat{\delta} \frac{d\hat{Q}}{dP} (\hat{\xi} - G - x) \right\} = E\{U(\hat{\xi})\}.$$

Main Duality Formula

In the case of exponential utility, a duality result including a contingent claim in a general semi-martingale setting was shown by Delbaen et al. [81]. They show the equality of the solutions
$$u(x, G, \gamma) = -\exp\left( -\gamma \inf_{Q \in \mathcal{P}_f(P)} \left( E^Q\{G\} + \frac{1}{\gamma} H(Q|P) \right) - \gamma x \right), \tag{5.13}$$
and they conclude that the optimizers in both problems are achieved in their feasible sets. Moreover, the minimizing measure is equivalent to $P$. They give different theorems corresponding to different feasible sets of strategies. El Karoui and Rouge [156] studied indifference pricing with a Brownian filtration using backward stochastic differential equations. Kabanov and Stricker [146] showed that some of the assumptions in [81] were superfluous. Becherer [16] extended the results of [81] by using the extensions of [146]. It is worth noting that our set of feasible trading strategies corresponds to $\Theta_2$ in [16]. The duality relation for general utility functions defined on $\mathbb{R}_+$ was considered in [243], and for utility functions defined on $\mathbb{R}$ in [244]. However, these papers do not involve a claim. The results were extended to include a claim in [59] and [207].

5.2.3 Expressions for Indifference Prices

It is easy to see from the duality formula (5.13) that the value function $u$ in (5.3) depends on the initial wealth $x$ only through the multiplicative factor $e^{-\gamma x}$. This is the typical ansatz one would make in a dynamic programming approach to solving the problem in Markovian models, and we see that the separation of variables is quite general. Since the indifference price $h$, defined in (5.4), is merely an adjustment in initial wealth level for setting $G$ to zero, it immediately follows that
$$h(G, \gamma) = \frac{1}{\gamma} \log \frac{M(0, \gamma)}{u(0, G, \gamma)}, \tag{5.14}$$
and substituting the specific expressions for the duals of the buyer’s and Merton problems, we can write
$$h(G, \gamma) = \inf_{Q \in \mathcal{P}_f(P)} \left( E^Q\{G\} + \frac{1}{\gamma} H(Q|P) \right) - \frac{1}{\gamma} \inf_{Q \in \mathcal{P}_f(P)} H(Q|P). \tag{5.15}$$
Note that $h$ is independent of the initial wealth.

Indifference Price with an Alternative Expression

The entropy terms in (5.15) can be combined into one entropy term with a different prior measure, the minimal entropy martingale measure, which is the measure minimizing the relative entropy in $\mathcal{P}_f(P)$:
$$Q^0 = \arg\min_{Q \in \mathcal{P}_f(P)} H(Q|P). \tag{5.16}$$
Results on the existence and uniqueness of this measure can be found in Frittelli [97] and Grandits and Rheinländer [110].

Theorem 5.4 (Theorem 2.2-5 of Frittelli [97] and Theorem 2.2 of Delbaen et al. [81]) Under assumption (5.6), $Q^0$ exists, is unique, is in $\mathcal{P}_f(P) \cap \mathcal{P}_e$, and its density has the form
$$\frac{dQ^0}{dP} = c_0 \, e^{-\gamma X_T^0}, \tag{5.17}$$
where $X_T^0$ is the optimal terminal wealth associated with the solution of the Merton problem (5.2) and $\log c_0 = H(Q^0|P) < \infty$. Moreover, $X_T^0$ is attained by a trading strategy in $\Theta$.

Proposition 5.1 Assume
$$\frac{dQ^0}{dP} \in L^2(P). \tag{5.18}$$
The indifference price $h(G, \gamma)$ of the bounded claim $G$ is equal to
$$h(G, \gamma) = \inf_{Q \in \mathcal{P}_f(Q^0)} \left( E^Q\{G\} + \frac{1}{\gamma} H(Q|Q^0) \right). \tag{5.19}$$

Proof. From (5.17), the relative entropy of a measure $Q \ll P$ with respect to $P$ can be written in terms of its relative entropy with respect to $Q^0$ as
$$H(Q|P) = H(Q|Q^0) + H(Q^0|P) - \gamma E^Q\{X_T^0\}. \tag{5.20}$$
If we choose $Q$ in $\mathcal{P}_f(P)$, the last term on the right-hand side of (5.20) is zero as $X^0$ is a martingale under $Q$. Moreover, $Q$ is also in $\mathcal{P}_f(Q^0)$, as all terms in (5.20) other than $H(Q|Q^0)$ are finite and $Q^0 \sim P$.

To deduce the reverse conclusion, we note that if the assumption given in (5.18) holds, $e^{\gamma |X_T^0|}$ is in $L^1(Q^0)$ because
$$E^{Q^0}\{e^{\gamma X_T^0}\} = E\left\{ \frac{dQ^0}{dP} e^{\gamma X_T^0} \right\} = c_0 < \infty,$$
while $E^{Q^0}\{e^{-\gamma X_T^0}\} = c_0^{-1} E\{(dQ^0/dP)^2\}$ is finite by (5.18). Using Lemma 3.5 of Delbaen et al. [81] for the random variable $|X_T^0|$, we deduce that
$$E^Q\{|X_T^0|\} \le H(Q|Q^0) + e^{-1} E^{Q^0}\{e^{\gamma |X_T^0|}\}.$$
In other words, $X_T^0$ is in $L^1(Q)$ for all $Q \in \mathcal{P}_f(Q^0)$. As the last term on the right-hand side of (5.20) is now guaranteed to be finite for all $Q \in \mathcal{P}_f(Q^0)$, we conclude that $\mathcal{P}_f(Q^0) \subset \mathcal{P}_f(P)$. But then the last term on the right-hand side of (5.20) is zero for all $Q \in \mathcal{P}_f(Q^0)$. □

The expression (5.19) points out a buyer’s tendency to price the claim with its worst-case expectation penalized by the entropic distance from the prior risk-neutral measure, $Q^0$. As $Q^0 \sim P$, we can also apply the duality result to (5.19), and obtain
$$h(G, \gamma) = -\frac{1}{\gamma} \log\left( -\sup_{\theta \in \Theta} E^{Q^0}\{-e^{-\gamma (X_T + G)}\} \right), \tag{5.21}$$
where the wealth process starts from zero initial capital.

5.3 UTILITY INDIFFERENCE PRICING

The investor who is contemplating buying $\alpha$ options at market price $p$ will maximize her expected terminal utility by choosing the optimal number of options (assuming it exists)
$$\alpha^* = \arg\max_{\alpha} \, u(x - \alpha p, \alpha G, \gamma). \tag{5.22}$$
Throughout, we assume a linear pricing rule in the market. From the definition (5.4) of $h$, this is equivalent to
$$\alpha^* = \arg\max_{\alpha} \, M(x - \alpha p + h(\alpha G, \gamma), \gamma) = \arg\max_{\alpha} \left( -e^{-\gamma (x - \alpha p + h(\alpha G, \gamma)) - H(Q^0|P)} \right),$$
using (5.13) with $G \equiv 0$ and the definition (5.16) of $Q^0$. Extracting the terms which depend on $\alpha$, this reduces to
$$\alpha^* = \arg\max_{\alpha} \left( h(\alpha G, \gamma) - \alpha p \right). \tag{5.23}$$
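The maximization in (5.23) can be sketched numerically in a hypothetical one-period, three-state market. Everything below (the probabilities, price moves, payoff, risk aversion, search brackets, and function names) is an assumption made for illustration, not the chapter's model. The sketch computes $h(\alpha G, \gamma)$ from the primal definition and then maximizes $h(\alpha G, \gamma) - \alpha p$ by a ternary search, which is adequate because the objective is concave in $\alpha$.

```python
import math

# Hypothetical one-period market (zero interest rate) and claim.
PROBS = [0.4, 0.3, 0.3]
DS = [20.0, 0.0, -20.0]
G = [10.0, 0.0, 10.0]
GAMMA = 0.1


def ternary_min(func, lo, hi, iters=100):
    """Minimise a convex function on [lo, hi]; return the minimising argument."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def min_expected_exp(payoff):
    """min over theta of E[exp(-GAMMA * (theta * dS + payoff))]."""
    def objective(theta):
        return sum(p * math.exp(-GAMMA * (theta * d + g))
                   for p, d, g in zip(PROBS, DS, payoff))
    # Assumed bracket for the optimal stock holding.
    return objective(ternary_min(objective, -1.0 / GAMMA, 1.0 / GAMMA))


def h(alpha):
    """Indifference price h(alpha * G, GAMMA) of alpha units of the claim."""
    merton = min_expected_exp([0.0] * len(G))
    with_claim = min_expected_exp([alpha * g for g in G])
    return (math.log(merton) - math.log(with_claim)) / GAMMA


def optimal_position(price, bound=20.0):
    """alpha* = argmax over alpha of h(alpha G) - alpha * price, as in (5.23)."""
    return ternary_min(lambda a: a * price - h(a), -bound, bound)


print(optimal_position(6.0))   # market price below the marginal price: buy
print(optimal_position(8.0))   # market price above the marginal price: sell
```

In this example the no-arbitrage interval for the claim is $(0, 10)$, so both market prices admit a finite optimum. At the optimum the slope of $h$ in $\alpha$ matches the market price, which is the first-order condition behind the Fenchel-Legendre interpretation.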
In other words, the optimal derivatives position is found from the Fenchel-Legendre transform of the indifference price as a function of quantity, evaluated at the market price. From this, it is clear that existence and uniqueness of the solution to our optimization problem (5.22) will depend on the strict concavity of the indifference price as a function of quantity, and the value of the market price $p$. To this end, in the next few sections, we study some properties of the indifference price $h(\alpha G, \gamma)$ as a function of $\alpha$ and $\gamma$.

5.3.1 Dependence on the Risk-Aversion Parameter

Large Risk-Aversion Limit

As investors become more risk averse, the price they are willing to pay for a contingent claim tends to the subhedging price of the claim:
$$\lim_{\gamma \uparrow \infty} h(G, \gamma) = \inf_{Q \in \mathcal{P}_e} E^Q\{G\}. \tag{5.24}$$
In fact, it is easy to show from (5.15) that the limit is the infimum of the expected payoff over all measures in $\mathcal{P}_f(P) \cap \mathcal{P}_e$; however, to show that it is also the infimum over all measures in $\mathcal{P}_e$ requires more work. For a proof of the result, we refer the reader to Corollary 5.1 of [81].

Zero Risk-Aversion Limit

It was proved by Becherer [16] that in the limit as the risk aversion parameter tends to zero, the indifference price goes to the expected payoff under the minimal entropy martingale measure:
$$\lim_{\gamma \downarrow 0} h(G, \gamma) = E^{Q^0}\{G\}. \tag{5.25}$$
Monotonicity

Monotonicity of the indifference price as a function of the risk aversion parameter can be seen directly from (5.15). For $\gamma_1$, let $Q^1$ denote the measure that attains the minimum in (5.15), which is guaranteed to exist by the duality result. Then,
$$h(G, \gamma_1) = E^{Q^1}\{G\} + \frac{1}{\gamma_1} \left( H(Q^1|P) - H(Q^0|P) \right). \tag{5.26}$$
The last term on the right-hand side is nonnegative as $Q^0$ is the minimal entropy martingale measure. Dividing this nonnegative term by $\gamma_2 > \gamma_1$ instead of $\gamma_1$ can only make the right-hand side of (5.26) smaller:
$$h(G, \gamma_1) \ge E^{Q^1}\{G\} + \frac{1}{\gamma_2} \left( H(Q^1|P) - H(Q^0|P) \right). \tag{5.27}$$
As $h(G, \gamma_2)$ is the infimum of
$$E^Q\{G\} + \frac{1}{\gamma_2} \left( H(Q|P) - H(Q^0|P) \right)$$
over $Q$ in $\mathcal{P}_f(P)$ and as $Q^1$ is in this feasible set, we conclude that
$$E^{Q^1}\{G\} + \frac{1}{\gamma_2} \left( H(Q^1|P) - H(Q^0|P) \right) \ge h(G, \gamma_2). \tag{5.28}$$
Combining equations (5.27) and (5.28), we conclude the monotonicity of the indifference price as a function of the risk aversion parameter: $h(G, \gamma_1) \ge h(G, \gamma_2)$ for $\gamma_2 > \gamma_1$. The results on the limits of the indifference price and its monotonicity are also given by El Karoui and Rouge [156] in the context of Itô process models. They also show that the indifference price takes all values between these bounds for different levels of the risk-aversion parameter $\gamma$.
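The monotonicity in $\gamma$ and the two limits (5.24) and (5.25) can be observed in a small numerical experiment. The sketch below uses a hypothetical one-period, three-state market; the probabilities, price moves, payoff, and function names are assumptions made for illustration. In this toy market the martingale measures put equal mass on the up and down states, so the subhedging price of the chosen payoff is $0$, and $Q^0$ is obtained from the Merton optimizer through the density form (5.17).

```python
import math

# Hypothetical one-period market (zero interest rate) and bounded claim.
PROBS = [0.4, 0.3, 0.3]
DS = [20.0, 0.0, -20.0]
G = [10.0, 0.0, 10.0]


def solve_investment(gamma, payoff, iters=200):
    """Return (theta*, min over theta of E[exp(-gamma (theta dS + payoff))])."""
    def objective(theta):
        return sum(p * math.exp(-gamma * (theta * d + g))
                   for p, d, g in zip(PROBS, DS, payoff))

    lo, hi = -1.0 / gamma, 1.0 / gamma   # assumed bracket, scaled with 1/gamma
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    theta = 0.5 * (lo + hi)
    return theta, objective(theta)


def indifference_price(gamma, payoff):
    _, merton = solve_investment(gamma, [0.0] * len(payoff))
    _, with_claim = solve_investment(gamma, payoff)
    return (math.log(merton) - math.log(with_claim)) / gamma


def davis_price(gamma, payoff):
    """E^{Q0}[G], with dQ0/dP proportional to exp(-gamma X_T^0) as in (5.17)."""
    theta0, _ = solve_investment(gamma, [0.0] * len(payoff))
    weights = [p * math.exp(-gamma * theta0 * d) for p, d in zip(PROBS, DS)]
    return sum(w * g for w, g in zip(weights, payoff)) / sum(weights)


gammas = [0.01, 0.1, 1.0, 10.0]
prices = [indifference_price(g, G) for g in gammas]
for g, price in zip(gammas, prices):
    print(g, price)
print("E^{Q0}[G] =", davis_price(0.1, G))
```

The printed prices should decrease as $\gamma$ grows, staying below $E^{Q^0}\{G\}$ and above the subhedging price, in line with (5.24), (5.25), and the monotonicity argument.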
5.3.2 Extreme Quantity Asymptotics

Becherer [16] also notes that the indifference price is a decreasing function of the risk-aversion parameter and satisfies the property
$$h(\alpha G, \gamma) = \alpha \, h(G, \alpha \gamma) \quad \text{for } \alpha > 0, \tag{5.29}$$
which follows easily from (5.15). Therefore, the fair price of the claim $G$, introduced by Davis in [60], and defined as
$$\lim_{\alpha \downarrow 0} \frac{h(\alpha G, \gamma)}{\alpha},$$
the marginal value of introducing $G$, is given by
$$\lim_{\alpha \downarrow 0} \frac{h(\alpha G, \gamma)}{\alpha} = \lim_{\gamma \downarrow 0} h(G, \gamma) = E^{Q^0}\{G\}.$$
Equations (5.24) and (5.29) imply that
$$\lim_{\alpha \uparrow \infty} \frac{1}{\alpha} h(\alpha G, \gamma) = \inf_{Q \in \mathcal{P}_e} E^Q\{G\}. \tag{5.30}$$
Moreover,
$$\lim_{\alpha \downarrow -\infty} \frac{1}{\alpha} h(\alpha G, \gamma) = -\lim_{\alpha \uparrow \infty} \frac{1}{\alpha} h(\alpha (-G), \gamma) = -\inf_{Q \in \mathcal{P}_e} E^Q\{-G\} = \sup_{Q \in \mathcal{P}_e} E^Q\{G\}. \tag{5.31}$$
This limit is known as the superhedging price of the claim.
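5.3.3 Differentiability

In this section, we prove the following:

Proposition 5.2 The derivative of the indifference price of $\alpha G$ with respect to $\alpha$ exists for $\alpha \in \mathbb{R}$ and
$$\frac{\partial}{\partial \alpha} h(\alpha G, \gamma) = E^{Q^\alpha}\{G\}, \tag{5.32}$$
where $Q^\alpha$ is the measure minimizing entropy with respect to $P^\alpha$ over $\mathcal{P}_f(P^\alpha)$, with
$$\frac{dP^\alpha}{dP} = c_\alpha e^{-\gamma \alpha G} \quad \text{and} \quad c_\alpha = \left( E\{e^{-\gamma \alpha G}\} \right)^{-1}. \tag{5.33}$$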
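Proof. We directly calculate the limits in the definition of the derivative:
$$\lim_{\epsilon \downarrow 0} \frac{h((\alpha + \epsilon) G, \gamma) - h(\alpha G, \gamma)}{\epsilon} = \lim_{\epsilon \uparrow 0} \frac{h((\alpha + \epsilon) G, \gamma) - h(\alpha G, \gamma)}{\epsilon} = E^{Q^\alpha}\{G\}. \tag{5.34}$$
By (5.14), the first limit is equal to
$$\lim_{\epsilon \downarrow 0} \frac{1}{\epsilon \gamma} \log \frac{\sup_{\theta} E\{-e^{-\gamma (X_T + \alpha G)}\}}{\sup_{\theta} E\{-e^{-\gamma (X_T + (\alpha + \epsilon) G)}\}}. \tag{5.35}$$
In terms of $P^\alpha$, (5.35) can be expressed as
$$\lim_{\epsilon \downarrow 0} \frac{1}{\epsilon \gamma} \log \frac{\sup_{\theta} E^{P^\alpha}\{-e^{-\gamma X_T}\}}{\sup_{\theta} E^{P^\alpha}\{-e^{-\gamma (X_T + \epsilon G)}\}}. \tag{5.36}$$
This expression is the limit as $\epsilon$ goes to zero of the per-unit indifference price of $\epsilon$ units of the option $G$, from the viewpoint of an investor with subjective measure $P^\alpha$ (compare with (5.14)), if we can show that for this investor the set of allowable trading strategies is $\Theta$. In other words, we need to show that $\mathcal{P}_f(P) = \mathcal{P}_f(P^\alpha)$. Using the definition of $P^\alpha$ given in (5.33) and that $G$ is bounded, the relative entropy of a measure $Q$ with respect to $P^\alpha$ can be written in terms of its entropy with respect to $P$ as
$$H(Q|P^\alpha) = H(Q|P) - \log c_\alpha + E^Q\{\gamma \alpha G\}. \tag{5.37}$$
Equality of the sets follows trivially.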
As $P^\alpha \sim P$, (5.6) is satisfied with the new prior $P^\alpha$, and we can use the duality result to rewrite (5.36) as
$$\lim_{\epsilon \downarrow 0} \left[ \inf_{Q \in \mathcal{P}_f(P^\alpha)} \left( E^Q\{G\} + \frac{1}{\epsilon \gamma} H(Q|P^\alpha) \right) - \frac{1}{\epsilon \gamma} \inf_{Q \in \mathcal{P}_f(P^\alpha)} H(Q|P^\alpha) \right]. \tag{5.38}$$
Taking the limit as $\epsilon$ goes to zero is equivalent to taking the limit as the risk-aversion parameter goes to zero with the prior $P^\alpha$ fixed, and we conclude by (5.25) that the limit exists and is equal to $E^{Q^\alpha}\{G\}$. The result for the second limit in (5.34) follows similarly. □

5.3.4 Strict Concavity

The indifference price is concave in $\alpha$ as it is the infimum over $\mathcal{P}_f(P)$ of the affine functions of $\alpha$
$$\alpha E^Q\{G\} + \frac{1}{\gamma} \left( H(Q|P) - H(Q^0|P) \right).$$
Since it also has a well-defined derivative at every $\alpha$ (Proposition 5.2), the indifference price is in fact continuously differentiable. Moreover, the derivative of the indifference price is bounded between the limits given in (5.30) and (5.31). Therefore, the existence of $\alpha^*$ as defined in (5.23) is guaranteed for market prices that are between these limits, in other words for market prices that are between the subhedging and superhedging prices of the option $G$.

Another interpretation of this result follows from Theorem 5.3 in [245], which states that the interval given by the superhedging and subhedging prices of an option is exactly the interval of no-arbitrage prices of that option. Therefore, the existence of $\alpha^*$ is guaranteed for arbitrage-free market prices $p$. In this theorem, Schachermayer also points out that there are two cases. Either the subhedging price of an option is equal to its superhedging price, in which case the option is replicable and the set of no-arbitrage prices of the option is a single point, or the subhedging price of an option is strictly less than its superhedging price, in which case the set of no-arbitrage prices is the open interval
$$\left( \inf_{Q \in \mathcal{P}_e} E^Q\{G\}, \ \sup_{Q \in \mathcal{P}_e} E^Q\{G\} \right). \tag{5.39}$$
In the following proposition, we show strict concavity of the indifference price of an option that is not replicable in the sense specified by Schachermayer in [245].

Proposition 5.3 The indifference price of $\alpha G$ is a strictly concave function of $\alpha \in \mathbb{R}$ if the subhedging price of $G$ is strictly less than its superhedging price.

Proof. Let us start by fixing $\alpha_2 > \alpha_1$. We will assume that $h(\alpha G, \gamma)$ is a linear function of $\alpha$ on the line segment between $\alpha_1$ and $\alpha_2$ and derive a contradiction. We define the measures $P^1 = P^{\alpha_1}$ and $P^2 = P^{\alpha_2}$ as in (5.33), and $Q^1$ and $Q^2$ as the measures that minimize the entropy with respect to $P^1$ and $P^2$, respectively. From (5.37), we get
$$\left( H(Q^2|P^1) - H(Q^1|P^1) \right) + \left( H(Q^1|P^2) - H(Q^2|P^2) \right) = \gamma (\alpha_1 - \alpha_2) \left( E^{Q^2}\{G\} - E^{Q^1}\{G\} \right).$$
As $E^{Q^i}\{G\}$ is the slope of the indifference price at $\alpha_i$, the right-hand side is equal to zero, by our linearity assumption. Now, $Q^1$ is the minimizer of $H(Q|P^{\alpha_1})$ over $\mathcal{P}_f(P)$, which includes $Q^2$, so the first term on the left-hand side is nonnegative. The same conclusion applies to the second term, therefore both terms are zero. Then the uniqueness of the minimal entropy martingale measure (see Theorem 5.4) implies that $Q^1 = Q^2$. Using (5.17) and (5.33), the density of $Q^i$ can be specified as follows:
$$\frac{dQ^i}{dP} = c_i \, e^{-\gamma (X_T^i + \alpha_i G)}, \quad \text{for } i = 1, 2,$$
where $X_T^1$ and $X_T^2$ are the optimal terminal wealths in the optimization problems defining $u(0, \alpha_1 G, \gamma)$ and $u(0, \alpha_2 G, \gamma)$. Therefore, these terminal wealths are attained by two trading strategies in $\Theta$. Combining the above density representation with the equality of $Q^1$ and $Q^2$, we get
$$(\alpha_2 - \alpha_1) G = \text{const} + X_T^1 - X_T^2.$$
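Figure 5.2 The indifference price is a strictly concave function of the number of derivatives. The limit of the slope as $\alpha$ goes to infinity is the subhedging price of the derivative, and the superhedging price as $\alpha$ goes to minus infinity. The slope at $\alpha$ equal to zero is the Davis fair price. (Plot of $h(\alpha G)$ against $\alpha$, annotated with $\lim_{\alpha \uparrow \infty} h(\alpha G)/\alpha = \inf_{Q \in \mathcal{P}_e} E^Q\{G\}$, $\lim_{\alpha \downarrow 0} h(\alpha G)/\alpha = E^{Q^0}\{G\}$, and $\lim_{\alpha \downarrow -\infty} h(\alpha G)/\alpha = \sup_{Q \in \mathcal{P}_e} E^Q\{G\}$.)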
Then for all $Q \in \mathcal{P}_f(P)$, $E^Q\{G\}$ is a constant (and is equal to the Davis fair price). However, Corollary 5.1 in Delbaen et al. [81] states that the supremum of $E^Q\{G\}$ over $Q \in \mathcal{P}_e$ is equal to the supremum over $Q \in \mathcal{P}_f(P) \cap \mathcal{P}_e$, and therefore the former is also equal to the Davis fair price. This implies that the set of no-arbitrage prices is a single point and the option is replicable, which is a contradiction. We conclude that the indifference price is a strictly concave function of $\alpha$. □

5.3.5 Several Contingent Claims

In realistic situations, many contingent claims are available for an investor to incorporate in her portfolio. Suppose there are $N$ options in the market with bounded payoffs $G_i$ and market prices $p_i$ for $i = 1, \ldots, N$. The optimal number of each option to hold can be formulated in a similar way to the single option case. Let $\alpha = (\alpha_1, \ldots, \alpha_N)$ denote a static position in the options. Then it is clear from our previous analysis that the optimal static position $\alpha^*$ in the derivatives is given by
$$\alpha^* = \arg\max_{\alpha} \left( h(\alpha \cdot G, \gamma) - \alpha \cdot p \right), \tag{5.40}$$
where $\alpha \cdot G = \sum_{i=1}^{N} \alpha_i G_i$. The problem is now to find conditions on the indifference price as a function of the vector $\alpha$ and the market price vector $p$ of the derivatives $G$ for existence and uniqueness of an optimal investment strategy. In [137], we show:

• Assuming that none of the claims $G_i$ is redundant (in a sense made precise there), the set of no-arbitrage price vectors is an open convex subset $V$ of $\mathbb{R}^N$.
• The indifference price is a strictly concave function of $\alpha$ with a well-defined gradient.
• For each market price vector $p$ in $V$, there exists a unique optimal derivatives position $\alpha^*$.

Figure 5.3 The indifference price over $\alpha$. The limit of the slope as $\alpha$ goes to infinity is the subhedging price of the derivative, and the superhedging price as $\alpha$ goes to minus infinity. The slope at $\alpha$ equal to zero is the Davis fair price.
5.4 STOCHASTIC VOLATILITY MODELS

Stochastic volatility models are popular because they capture the deviation of stock price data from the Black-Scholes geometric Brownian motion model in a parsimonious way. They were originally introduced in the late 1980s by Hull and White [135] and others for option pricing. Much of their success derives from their predicted option prices exhibiting the implied volatility skew that is observed in many options markets. See [92], for example, for details. The risky asset $S$ is modeled by the following stochastic differential equations (SDEs)
$$dS_t = \mu S_t \, dt + \sigma(Y_t) S_t \, dW_t^1, \tag{5.41}$$
$$dY_t = b(Y_t) \, dt + a(Y_t) \left( \rho \, dW_t^1 + \rho' \, dW_t^2 \right), \tag{5.42}$$
where $Y$ is the volatility driving factor correlated with the stock price and $\rho' = \sqrt{1 - \rho^2}$. $W^1$ and $W^2$ are two independent Brownian motions on the given space, and we take the filtration to be the augmented natural filtration of these Brownian motions. We will throughout assume the following.

Assumption 5.5
i. $\sigma$ and $a$ are smooth and bounded with bounded derivatives,
ii. $0 < L \le \sigma$, for some constant $L < \infty$,
iii. $b$ is smooth with bounded first derivative.

In this class of models, the wealth process satisfies
$$dX_t = \mu \pi_t \, dt + \sigma(Y_t) \pi_t \, dW_t^1, \quad X_0 = x, \tag{5.43}$$
where $\pi_t$ represents the dollar amount invested in the stock at time $t$. The set of admissible policies in this model is the set of trading strategies that satisfy the integrability constraint $E\{\int_0^T \pi_t^2 \, dt\} < \infty$. We consider European claims $G = g(S_T, Y_T)$, where $g$ is smooth and bounded with bounded derivatives. However, many path-dependent claims can be treated in a similar manner with additional variables or boundary conditions. For example, indifference pricing of barrier options was studied in [138]. We do not discuss American-style options here.

The indifference price in this model can be characterized as the solution of a quasi-linear partial differential equation by using the HJB equations related to the value functions $u$ and $M$ as in [252]. An alternative is solving the corresponding dual control problems given in (5.15), which is the approach we will follow here. We start by finding the minimal entropy martingale measure $Q^0$.
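5.4.1 $Q^0$ within the Stochastic Volatility Model

The so-called minimal martingale measure, $P^0$, which was introduced in [91], is defined by the following Girsanov transformation
$$\frac{dP^0}{dP} = \exp\left( -\int_0^T \frac{\mu}{\sigma(Y_s)} \, dW_s^1 - \frac{1}{2} \int_0^T \frac{\mu^2}{\sigma^2(Y_s)} \, ds \right).$$
By our assumptions on $\sigma(y)$, $P^0$ has finite relative entropy and is equivalent to $P$. Therefore, $P^0$ is in $\mathcal{P}_f(P) \cap \mathcal{P}_e$, and Assumption 5.3 is satisfied. For a given equivalent local martingale measure $P^\lambda$, there exists $\lambda$ with $\int_0^T \lambda_t^2 \, dt < \infty$ a.s. such that
$$\frac{dP^\lambda}{dP} = \exp\left( -\int_0^T \frac{\mu}{\sigma(Y_s)} \, dW_s^1 + \int_0^T \lambda_s \, dW_s^2 - \frac{1}{2} \int_0^T \left( \frac{\mu^2}{\sigma^2(Y_s)} + \lambda_s^2 \right) ds \right). \tag{5.44}$$
For the moment we shall consider $\lambda$ in $H^2(P^\lambda)$, where $H^2(Q)$ consists of all adapted processes $u$ that satisfy the integrability condition $E^Q\{\int_0^T u_t^2 \, dt\} < \infty$. The entropy of such a measure $P^\lambda$ with respect to $P$ is
$$E^{P^\lambda}\left\{ \frac{1}{2} \int_0^T \left( \frac{\mu^2}{\sigma(Y_t)^2} + \lambda_t^2 \right) dt \right\}. \tag{5.45}$$
We introduce the stochastic control problem related to maximizing the negative of relative entropy
$$\psi(t, y) = \sup_{\lambda \in H^2(P^\lambda)} E^{P^\lambda}\left\{ -\frac{1}{2} \int_t^T \left( \frac{\mu^2}{\sigma^2(Y_s)} + \lambda_s^2 \right) ds \,\Big|\, Y_t = y \right\}. \tag{5.46}$$
The Hamilton-Jacobi-Bellman (HJB) equation associated with this stochastic control problem is
$$\psi_t + L_y^0 \psi + \sup_{\lambda} \left[ \rho' a(y) \psi_y \lambda - \frac{1}{2} \lambda^2 \right] = \frac{\mu^2}{2 \sigma^2(y)}, \quad t < T, \qquad \psi(T, y) = 0, \tag{5.47}$$
where $L_y^0$ is the infinitesimal generator of the process $(Y_t)$ under $P^0$ and is given by
$$L_y^0 = \frac{1}{2} a^2(y) \frac{\partial^2}{\partial y^2} + \left( b(y) - \rho a(y) \frac{\mu}{\sigma(y)} \right) \frac{\partial}{\partial y}.$$
Performing the maximization in (5.47), we obtain
$$\psi_t + L_y^0 \psi + \frac{1}{2} (1 - \rho^2) a^2(y) \psi_y^2 = \frac{\mu^2}{2 \sigma^2(y)}, \quad t < T, \qquad \psi(T, y) = 0, \tag{5.48}$$
with the corresponding optimal control
$$\lambda_t^* = \rho' a(Y_t) \psi_y(t, Y_t). \tag{5.49}$$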
The PDE in (5.48) can be linearized by a logarithmic transformation:
$$\psi(t, y) = \frac{1}{1 - \rho^2} \log f(t, y).$$
Then $f$ satisfies
$$f_t + L_y^0 f = (1 - \rho^2) \frac{\mu^2}{2 \sigma^2(y)} f, \quad t < T, \qquad f(T, y) = 1. \tag{5.50}$$
Using the probabilistic representation of the solution of (5.50), we have
$$\psi(t, y) = \frac{1}{1 - \rho^2} \log E^{P^0}\left\{ \exp\left( -\int_t^T \frac{(1 - \rho^2) \mu^2}{2 \sigma^2(Y_s)} \, ds \right) \Big|\, Y_t = y \right\}. \tag{5.51}$$
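The step from (5.48) to (5.50) can be verified by direct substitution. The following worked computation is a sketch added for illustration; it only uses the chain rule and the form of $L_y^0$ given above, and suppresses the arguments $(t, y)$.

```latex
\begin{align*}
\psi = \tfrac{1}{1-\rho^2}\log f \ \Longrightarrow\ 
\psi_t &= \frac{f_t}{(1-\rho^2)\,f},\qquad
\psi_y = \frac{f_y}{(1-\rho^2)\,f},\qquad
\psi_{yy} = \frac{1}{1-\rho^2}\left(\frac{f_{yy}}{f} - \frac{f_y^2}{f^2}\right),\\[4pt]
\psi_t + L^0_y\psi + \tfrac12(1-\rho^2)a^2\psi_y^2
&= \frac{f_t + L^0_y f}{(1-\rho^2)\,f}
 - \frac{a^2 f_y^2}{2(1-\rho^2)\,f^2}
 + \frac{a^2 f_y^2}{2(1-\rho^2)\,f^2}
 = \frac{f_t + L^0_y f}{(1-\rho^2)\,f}.
\end{align*}
```

The quadratic terms cancel exactly because the coefficient of the transformation is $1/(1-\rho^2)$. Setting the result equal to $\mu^2 / (2\sigma^2(y))$ and multiplying through by $(1-\rho^2) f$ gives the linear equation (5.50), and $\psi(T, y) = 0$ becomes $f(T, y) = 1$.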
Lemma 5.6 Under Assumption 5.5, the density of the minimal entropy martingale measure is
$$\frac{dQ^0}{dP} = \exp\left( -\int_0^T \frac{\mu}{\sigma(Y_t)} \, dW_t^1 + \int_0^T \lambda^*(t, Y_t) \, dW_t^2 - \frac{1}{2} \int_0^T \left( \frac{\mu^2}{\sigma^2(Y_t)} + (\lambda^*(t, Y_t))^2 \right) dt \right) \tag{5.52}$$
with $\lambda^*$ and $\psi$ given in (5.49) and (5.51), respectively. The minimum relative entropy $H(Q^0|P)$ is equal to $-\psi(0, y)$.

Proof. From Theorem 2.9.10 in [164], under Assumption 5.5, $f$, which is given by
$$f(t, y) = E^{P^0}\left\{ \exp\left( -\int_t^T \frac{(1 - \rho^2) \mu^2}{2 \sigma^2(Y_s)} \, ds \right) \Big|\, Y_t = y \right\},$$
is in $C^{1,2}([0, T) \times \mathbb{R})$ and satisfies a polynomial growth condition in $y$. Moreover, $f$ is the unique solution in this class of functions. As $\psi$ is obtained by a logarithmic transformation of $f$, which is strictly positive, it will satisfy the same conclusions. Then, the optimality of the solution can be concluded by Theorem IV.3.1 in Fleming and Soner [82]. Note that $f(t, y)$ is bounded. Under Assumption 5.5, taking the derivative of (5.50) with respect to $y$ and using the probabilistic representation of the solution, we conclude that $\psi_y(t, y)$, and hence $\lambda^*(t, y)$, are also bounded. Therefore, $\lambda^*$ defined in (5.49) is an optimizer of (5.46). The final step, that $Q^0$ is given by (5.52) (in other words, that it was sufficient to consider $\lambda \in H^2(P^\lambda)$), follows from Proposition 3.2 of [110]. We refer to [20] and [138] for detailed calculations. □

5.4.2 Indifference Pricing Partial Differential Equation

In similar fashion, we introduce the stochastic control problem
$$\nu(t, S, y) = \inf_{\lambda} E^{P^\lambda}\left\{ \alpha g(S_T, Y_T) + \frac{1}{2\gamma} \int_t^T \left( \frac{\mu^2}{\sigma^2(Y_s)} + \lambda_s^2 \right) ds \,\Big|\, S_t = S, \ Y_t = y \right\}. \tag{5.53}$$
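Given the solution of this stochastic control problem, the indifference price is
$$h(\alpha G, \gamma) = \nu(0, S, y) + \frac{1}{\gamma} \psi(0, y).$$
The HJB equation associated with the stochastic control problem in (5.53) is as follows:
$$\nu_t + L_{S,y}^0 \nu + \inf_{\lambda} \left[ \rho' a(y) \nu_y \lambda + \frac{1}{2\gamma} \lambda^2 \right] = -\frac{\mu^2}{2 \gamma \sigma^2(y)}, \quad t < T, \qquad \nu(T, S, y) = \alpha g(S, y), \tag{5.54}$$
where $L_{S,y}^0$ is the generator of $(S_t, Y_t)$ under $P^0$,
$$L_{S,y}^0 = \frac{1}{2} \sigma^2(y) S^2 \frac{\partial^2}{\partial S^2} + \rho \sigma(y) a(y) S \frac{\partial^2}{\partial S \partial y} + L_y^0.$$
Performing the minimization in (5.54), we have
$$\nu_t + L_{S,y}^0 \nu - \frac{1}{2} \gamma (1 - \rho^2) a^2(y) \nu_y^2 = -\frac{\mu^2}{2 \gamma \sigma^2(y)}, \quad t < T, \qquad \nu(T, S, y) = \alpha g(S, y), \tag{5.55}$$
with the corresponding optimal control $\lambda^\alpha(t, S, y) = -\gamma \rho' a(y) \nu_y(t, S, y)$. For $h(t, S, y) = \nu(t, S, y) + \frac{1}{\gamma} \psi(t, y)$, it follows from (5.55) and (5.48) that $h(t, S, y)$ solves
$$h_t + L_{S,y}^{Q^0} h - \frac{1}{2} \gamma (1 - \rho^2) a^2(y) h_y^2 = 0, \quad t < T, \qquad h(T, S, y) = \alpha g(S, y), \tag{5.56}$$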
where
$$L_{S,y}^{Q^0} = L_{S,y}^0 + (1 - \rho^2) a^2(y) \psi_y(t, y) \frac{\partial}{\partial y}$$
is the generator of $(S_t, Y_t)$ under $Q^0$; the additional drift is $\rho' a(y) \lambda^*$ with $\lambda^*$ as in (5.49). The PDE in (5.56) is the HJB equation associated with the stochastic control problem in (5.19). If the claim is contingent on $Y_T$ only ($G = g(Y_T)$), $S$ vanishes from the PDE in (5.56), and by a logarithmic transformation, the nonlinear PDE in (5.56) can be linearized. Hence, in this case an explicit solution for the indifference price can be found (see Chapter 3 in this volume),
$$h(t, y) = -\frac{1}{\gamma (1 - \rho^2)} \log E^{Q^0}\left\{ e^{-\gamma (1 - \rho^2) g(Y_T)} \,\Big|\, Y_t = y \right\}. \tag{5.57}$$
Such a situation arises when Y is not considered as a volatility driving process, but rather as a nontraded asset, where another correlated asset S is available for trading. If further σ is assumed to be independent of y, the minimal entropy martingale measure coincides with the minimal martingale measure (since ψy is zero). The canonical example is modeling the traded asset price process as a geometric Brownian motion (see, for example, [198]).
Regularity of the Value Function The PDE in (5.55) does not admit an explicit solution and, in this section, we verify the existence and uniqueness of a solution. The traditional existence results for HJB equations require the feasible set of controls to be compact, which prevents us from direct use of these results. Therefore, we will first consider bounded subsets and study the limiting behavior of the value function as this bound is taken to infinity. A similar analysis was conducted by Pham [221] for power utility. In this section, we will further impose that a(y) = a is a constant. However, as Pham suggests in Remark 2.1 in [221], this assumption is not restrictive as the original model can be rewritten in this form with the change of variable suggested in this remark. Then one needs to be careful in verifying the assumptions that guarantee existence and uniqueness of a solution to the new system. Under P0 , the corresponding stochastic differential equations for the stock price process and the volatility driving process are dSt = σ (Yt )St dWt0,1 , µ dt + a ρdWt0,1 + ρ dWt0,2 , dYt = b(Yt ) − ρa σ (Yt )
(5.58) (5.59)
where $W^{0,1}$ and $W^{0,2}$ are two independent Brownian motions on $(\Omega, \mathcal{F}, P^0)$. Let us rewrite the PDE in (5.55) as
\[ \nu_t + \mathcal{L}^0_{S,y} \nu + H(\nu_y) = -\frac{\mu^2}{2\gamma\sigma^2(y)}, \quad t < T, \tag{5.60} \]
\[ \nu(T, S, y) = \alpha g(S, y), \]
where
\[ H(p) = -\tfrac{1}{2} \gamma (1 - \rho^2) a^2 p^2. \tag{5.61} \]
We also define $L$ by
\[ L(q) = \max_{p \in \mathbb{R}} [H(p) + qp], \tag{5.62} \]
which is the Fenchel-Legendre transform of $H$ up to the sign of the $qp$ term. The explicit form for $L$ is
\[ L(q) = \frac{q^2}{2\gamma(1 - \rho^2)a^2}. \]
As $H$ is concave in $p$, we have the following duality relation:
\[ H(p) = \min_{q \in \mathbb{R}} [L(q) + qp]. \]
Let us introduce the truncated functions
\[ H^k(p) = \min_{q \in B_k} [L(q) + qp], \]
where $B_k$ is the (compact) interval of length $2k$,
\[ B_k = \{ q \in \mathbb{R} : |q| \le k \}, \quad k > 0. \tag{5.63} \]
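The duality between $H$ and $L$ and the effect of the truncation can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the parameter values are illustrative assumptions. It uses the fact that $q \mapsto L(q) + qp$ is a convex quadratic, so its minimum over $B_k$ is attained at the unconstrained minimizer clipped to $[-k, k]$.

```python
def H(p, gamma, rho, a):
    """Hamiltonian (5.61): H(p) = -0.5 * gamma * (1 - rho^2) * a^2 * p^2."""
    return -0.5 * gamma * (1.0 - rho**2) * a**2 * p**2


def L(q, gamma, rho, a):
    """Closed form of (5.62): L(q) = q^2 / (2 * gamma * (1 - rho^2) * a^2)."""
    return q**2 / (2.0 * gamma * (1.0 - rho**2) * a**2)


def H_truncated(p, k, gamma, rho, a):
    """H^k(p): minimum of L(q) + q*p over the interval |q| <= k.

    The objective is a convex quadratic in q, so the constrained minimizer
    is the unconstrained one, -gamma*(1-rho^2)*a^2*p, clipped to [-k, k].
    """
    q_hat = -gamma * (1.0 - rho**2) * a**2 * p
    q_star = max(-k, min(k, q_hat))
    return L(q_star, gamma, rho, a) + q_star * p


# Illustrative (assumed) parameter values.
gamma, rho, a, p = 2.0, 0.5, 0.3, 1.5
print(H(p, gamma, rho, a))                 # unconstrained value H(p)
print(H_truncated(p, 1.0, gamma, rho, a))  # k large enough: equals H(p)
print(H_truncated(p, 0.1, gamma, rho, a))  # k too small: strictly above H(p)
```

Once $k$ exceeds $\gamma(1-\rho^2)a^2|p|$ the truncation is inactive and $H^k(p) = H(p)$, which is the mechanism used in the proof of Theorem 5.8.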
We consider the following differential equations
\[ \nu^k_t + \mathcal{L}^0_{S,y} \nu^k + H^k(\nu^k_y) = -\frac{\mu^2}{2\gamma\sigma^2(y)}, \quad t < T, \tag{5.64} \]
\[ \nu^k(T, S, y) = \alpha g(S, y), \]
and assume the following:

Assumption 5.7 Assume that $|\nu^k_y(t, S, y)| \le C$, for a positive constant $C$ independent of $k$.

Theorem 5.8 Under Assumption 5.5 and Assumption 5.7, equation (5.60) has a unique solution $\nu \in C_p^{1,2,2}([0, T) \times \mathbb{R}_+ \times \mathbb{R})$ with $\nu$ continuous in $[0, T] \times \mathbb{R}_+ \times \mathbb{R}$.

Proof. Under Assumption 5.5, the function $b(y) - \rho a \mu / \sigma(y)$ is $C^2$ with a bounded first derivative. By Theorem VI.6.2 in Fleming and Rishel [83], there exist unique solutions $\nu^k \in C^{1,2,2}([0, T) \times \mathbb{R}_+ \times \mathbb{R})$, which satisfy polynomial growth conditions in $S$ and $y$, which are continuous on $[0, T] \times \mathbb{R}_+ \times \mathbb{R}$, and which solve (5.64). Moreover, applying Theorem IV.3.1 in Fleming and Soner [82], these solutions have the following stochastic control representations
\[ \nu^k(t, S, y) = \inf_{q \in B_k} E^{Q^k} \Big\{ \int_t^T \Big( L(q_s) + \frac{\mu^2}{2\gamma\sigma^2(Y_s)} \Big) ds + \alpha g(S_T, Y_T) \,\Big|\, S_t = S, Y_t = y \Big\}, \tag{5.65} \]
where the controlled dynamics under $Q^k$ are given by
\[ dS_t = \sigma(Y_t) S_t \, dW_t^{k,1}, \tag{5.66} \]
\[ dY_t = \Big( q_t + b(Y_t) - \rho a \frac{\mu}{\sigma(Y_t)} \Big) dt + a \big( \rho \, dW_t^{k,1} + \rho' \, dW_t^{k,2} \big), \tag{5.67} \]
with $W^{k,1}$ and $W^{k,2}$ being two independent Brownian motions under $Q^k$. The function $q \mapsto L(q) + q\nu^k_y$ attains its minimum in $\mathbb{R}$ at
\[ \hat{q}^k(t, S, y) = -\gamma(1 - \rho^2) a^2 \nu^k_y(t, S, y). \]
From Assumption 5.7, there exists a positive constant $C$ independent of $k$ such that
\[ |\hat{q}^k(t, S, y)| \le C, \quad \text{for all } t \in [0, T],\ S \in \mathbb{R}_+,\ y \in \mathbb{R}. \]
For $k \ge C$,
\[ H^k(\nu^k_y) = \min_{q \in B_k} [L(q) + q\nu^k_y] = \min_{q \in \mathbb{R}} [L(q) + q\nu^k_y] = H(\nu^k_y), \]
for all $(t, S, y) \in [0, T] \times \mathbb{R}_+ \times \mathbb{R}$. We deduce that $\nu^k$ is a solution to (5.60) with the desired smoothness conditions.
Assume $\nu^{(1)}$ and $\nu^{(2)}$ are two solutions to (5.60) that are in $C^{1,2,2}([0, T) \times \mathbb{R}_+ \times \mathbb{R})$ and that satisfy a polynomial growth condition on $[0, T] \times \mathbb{R}_+ \times \mathbb{R}$, and let $\zeta = \nu^{(1)} - \nu^{(2)}$. Then, $\zeta$ solves
\[ \zeta_t + \mathcal{L}^0_{S,y} \zeta + \tfrac{1}{2} \gamma (1 - \rho^2) a^2 \big( \nu^{(1)}(t, S, y) + \nu^{(2)}(t, S, y) \big) \zeta_y = 0, \quad t < T, \]
\[ \zeta(T, S, y) = 0. \]
The probabilistic representation of the solution indicates that $\zeta(t, S, y)$ is the conditional expectation of zero under the measure defined by the following Girsanov transformation:
\[ \frac{dP^\nu}{dP^0} = \exp \Big\{ \int_0^T \tfrac{1}{2} \gamma (1 - \rho^2) a^2 \big( \nu^{(1)}(t, S_t, Y_t) + \nu^{(2)}(t, S_t, Y_t) \big) dW_t - \frac{1}{4} \int_0^T \gamma (1 - \rho^2) a^2 \big( \nu^{(1)}(t, S_t, Y_t) + \nu^{(2)}(t, S_t, Y_t) \big) dt \Big\}, \]
and therefore is equal to zero. Note that $P^\nu$ is well defined under the assumptions on $\nu^{(1)}$ and $\nu^{(2)}$. This guarantees uniqueness of the solution and completes the proof. □

Corollary 5.9 Let $\lambda^\alpha(t, S, y) = -\gamma \rho' a \nu_y(t, S, y)$, where $\nu$ is the unique solution to (5.60) in the class of $C^{1,2,2}([0, T) \times \mathbb{R}_+ \times \mathbb{R})$ functions that satisfy a polynomial growth condition on $[0, T] \times \mathbb{R}_+ \times \mathbb{R}$. Define $Q^\alpha$ as
\[ Q^\alpha = \arg\min_{Q \in P_f} \Big\{ E^Q\{\alpha G\} + \frac{1}{\gamma} H(Q|P) \Big\}, \tag{5.68} \]
in the stochastic volatility model given in (5.41) and (5.42) with $a(y) = a$. Then, the density of $Q^\alpha$ is given by
\[ \frac{dQ^\alpha}{dP} = \exp \Big\{ -\int_0^T \frac{\mu}{\sigma(Y_t)} dW_t^1 + \int_0^T \lambda^\alpha(t, S_t, Y_t) \, dW_t^2 - \frac{1}{2} \int_0^T \Big( \frac{\mu^2}{\sigma^2(Y_t)} + (\lambda^\alpha(t, S_t, Y_t))^2 \Big) dt \Big\}. \tag{5.69} \]

Proof. Note that $\nu_y$ is bounded. Therefore, so is $\lambda^\alpha$. From the Verification Theorem (Theorem IV.3.1 and Corollary IV.3.1 in [82]), $\nu$ is the optimal solution to (5.53), and $\lambda^\alpha$ is an optimal Markov control policy. As in the case without claims in Section 5.4.1, the final step to show that the optimal measure $Q^\alpha$ defined in (5.68) is given by (5.69) again follows from Proposition 3.2 of [110]. We refer to [138] for details. □

Corollary 5.10 Under Assumption 5.5 and Assumption 5.7, Equation (5.56) has a unique solution $h \in C_p^{1,2,2}([0, T) \times \mathbb{R}_+ \times \mathbb{R})$, with $h$ continuous in $[0, T] \times \mathbb{R}_+ \times \mathbb{R}$.

Proof. Write $h(t, S, y) = \nu(t, S, y) + \frac{1}{\gamma} \psi(t, y)$, where $\nu$ and $\psi$ are the unique solutions of (5.60) and (5.48). The result follows trivially. □
5.4.3 Asymptotic Expansions

We end this section with two concave approximations of the indifference price as a function of quantity that can be used to compute the approximate optimal derivatives positions under appropriate assumptions. The first is based on a direct power series expansion in the quantity $\alpha$ and so is valid for small quantities; the second is based on the slow timescale of fluctuation of an important factor of market volatility.

Small α Approximation

The dependence of the indifference price on $\alpha$ appears in the terminal condition of (5.56), and the indifference price is zero when $\alpha$ is zero. To gain an understanding of the price, we construct a power series expansion for small $\alpha$:
\[ h(t, S, y) = \alpha h^{(1)}(t, S, y) + \alpha^2 h^{(2)}(t, S, y) + \cdots. \]
Inserting this approximation into (5.56) and grouping order $\alpha$ terms, we deduce that $h^{(1)}$ satisfies
\[ h^{(1)}_t + \mathcal{L}^{Q^0}_{S,y} h^{(1)} = 0, \quad t < T,\ S > 0, \tag{5.70} \]
\[ h^{(1)}(T, S, y) = g(S, y), \]
and the solution is given by
\[ h^{(1)}(t, S, y) = E^{Q^0}\{ g(S_T, Y_T) \mid S_t = S, Y_t = y \}. \tag{5.71} \]
Note that $h^{(1)}(t, S, y)$ is the Davis fair price of the claim [61]. Considering terms of order $\alpha^2$, we deduce that $h^{(2)}$ satisfies
\[ h^{(2)}_t + \mathcal{L}^{Q^0}_{S,y} h^{(2)} = \tfrac{1}{2} \gamma (1 - \rho^2) a^2(y) \big( h^{(1)}_y \big)^2, \quad t < T,\ S > 0, \tag{5.72} \]
\[ h^{(2)}(T, S, y) = 0, \]
and by the Feynman-Kac formula,
\[ h^{(2)}(t, S, y) = E^{Q^0} \Big\{ -\tfrac{1}{2} \gamma (1 - \rho^2) \int_t^T a^2(Y_s) \big( h^{(1)}_y(s, S_s, Y_s) \big)^2 ds \,\Big|\, S_t = S, Y_t = y \Big\}. \tag{5.73} \]
Note that $h^{(2)}$ is negative, reflecting the concavity of $h$. Considering only these first two terms, the optimal number $\alpha^*$ to hold is given by, approximately,
\[ \alpha^* = \frac{p - h^{(1)}(t, S, y)}{2 h^{(2)}(t, S, y)}. \]
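The approximate optimal position follows from maximizing the quadratic $\alpha h^{(1)} + \alpha^2 h^{(2)} - \alpha p$ in $\alpha$. A minimal sketch of that calculation is given here; the function names and the numerical inputs are hypothetical and serve only as an illustration.

```python
def net_value(alpha, p, h1, h2):
    """Two-term approximation of the indifference price minus the cost alpha*p."""
    return alpha * h1 + alpha**2 * h2 - alpha * p


def optimal_quantity_small_alpha(p, h1, h2):
    """First-order condition h1 + 2*alpha*h2 - p = 0, valid as a maximum when h2 < 0."""
    if h2 >= 0.0:
        raise ValueError("h2 must be negative for the quadratic to be concave")
    return (p - h1) / (2.0 * h2)


# Hypothetical inputs: Davis price 1.00, second-order term -0.05, market price 0.90.
alpha_star = optimal_quantity_small_alpha(p=0.90, h1=1.00, h2=-0.05)
print(alpha_star)
```

With a market price below the Davis price, the sketch returns a positive quantity, that is, a long position in the derivative.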
For $G = g(Y_T)$, an explicit expression for $h^{(2)}(t, y)$ for the indifference price can be found as
\[ h^{(2)}(t, y) = \tfrac{1}{2} \gamma (1 - \rho^2) \Big( \big( h^{(1)}(t, y) \big)^2 - E^{Q^0}\{ g(Y_T)^2 \mid Y_t = y \} \Big) = -\tfrac{1}{2} \gamma (1 - \rho^2) \, \mathrm{var}^{Q^0}_{t,y}(g(Y_T)), \]
where $\mathrm{var}^{Q^0}_{t,y}$ denotes the conditional variance given $\{Y_t = y\}$, under the measure $Q^0$. Of course, in this case, it is easier to obtain the terms directly by a Taylor series expansion on (5.57). A similar expansion using Malliavin calculus is studied by Davis in [61], where the case with high correlation between the two assets is considered.

Slow Volatility Approximation

There are a number of approaches for constructing stochastic volatility models that reflect historical and option price data in a parsimonious manner. For example, [7, 74] advocate one-factor stochastic volatility jump-diffusion models, while [12] employ Ornstein-Uhlenbeck-Lévy processes. The motivation for departing from “traditional” one-dimensional diffusion models [135] is to bridge the seeming inconsistency between slow mean reversion estimated from daily stock returns and pronounced implied volatility skews at short maturities. Another way of capturing these observations is to allow for two-factor stochastic volatility models in which one factor is varying slowly and the other is fast mean reverting. The advantage is in remaining within a diffusion framework (at the cost of increased dimensionality), where statistical, analytical, and simulation tools are extremely convenient. In addition, many problems can be tackled by constructing asymptotic approximations, using singular perturbation techniques for the fast factor, and regular perturbation for the slow one. The asymptotic analysis with just the fast factor is studied for a variety of derivative pricing problems in [92], for partial hedging and utility maximization problems in [144, 145], for exotic options pricing (and in particular passport options with their embedded portfolio optimization problems) in [136], and for indifference prices in [252]. The joint asymptotics for no-arbitrage European option pricing with both scales appears in [93]. Here, we shall ignore the fast factor and concentrate on the slow-scale asymptotics.
The assumption is that the time horizon of the investor’s problem is long enough that the effect of the fast ergodic factor averages out. To this end, we introduce a small parameter $\delta > 0$ representing the slow scale and replace $b$ and $a$ in (5.42) by $\delta b(Y_t)$ and $\sqrt{\delta} a(Y_t)$, respectively. Therefore, the dynamics of our volatility driving factor is given by
\[ dY_t = \delta b(Y_t) \, dt + \sqrt{\delta} \, a(Y_t) \big( \rho \, dW_t^1 + \rho' \, dW_t^2 \big). \]
We first construct the expansion for the function $\psi(t, y)$, which is related to the value function of the plain Merton problem by $M(x, \gamma) = -e^{-\gamma x + \psi(0, y)}$. It is the solution of the PDE problem (5.48), which we rewrite as
\[ \psi_t + \delta \mathcal{M}_2 \psi - \sqrt{\delta} \, \frac{\rho \mu a(y)}{\sigma(y)} \psi_y + \tfrac{1}{2} \delta (1 - \rho^2) \psi_y^2 = \frac{\mu^2}{2\sigma(y)^2}, \tag{5.74} \]
in $t < T$, with $\psi(T, y) = 0$. Here, we define
\[ \mathcal{M}_2 = \tfrac{1}{2} a(y)^2 \frac{\partial^2}{\partial y^2} + b(y) \frac{\partial}{\partial y}, \tag{5.75} \]
the infinitesimal generator of $Y$ on the unit timescale (that is, $\delta = 1$).
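We look for a formal expansion
\[ \psi = \psi^{(0)} + \sqrt{\delta} \, \psi^{(1)} + \delta \psi^{(2)} + \cdots, \tag{5.76} \]
which, for fixed $(t, y)$, converges as $\delta \downarrow 0$. In fact, we want to construct the expansion of the indifference price $h$ up to order $\delta$ (to obtain some concavity as a function of $\alpha$), and so we shall only need the first two terms of the $\psi$ expansion. Inserting the expansion (5.76) into (5.74) and comparing powers of $\delta$, we find that $\psi^{(0)}$ should be chosen to solve
\[ \psi^{(0)}_t = \frac{\mu^2}{2\sigma(y)^2}, \]
with zero terminal condition. This yields
\[ \psi^{(0)}(t, y) = -(T - t) \frac{\mu^2}{2\sigma(y)^2}. \]
Moving to terms of order $\sqrt{\delta}$, we find that $\psi^{(1)}$ should be chosen to solve
\[ \psi^{(1)}_t = \frac{\rho \mu a(y)}{\sigma(y)} \psi^{(0)}_y, \]
again with zero terminal condition. The solution is
\[ \psi^{(1)}(t, y) = (T - t)^2 \, \frac{\rho \mu^3 a(y)}{4\sigma(y)} \frac{\partial}{\partial y} \sigma(y)^{-2}. \]
Now we construct an expansion in powers of $\sqrt{\delta}$ for the indifference pricing function $h$. We first rewrite the PDE (5.56) as
\[ h_t + \tfrac{1}{2} \sigma(y)^2 S^2 h_{SS} + \sqrt{\delta} \, \mathcal{M}_1 h + \delta \overline{\mathcal{M}}_2 h - \tfrac{1}{2} \delta \gamma (1 - \rho^2) a(y)^2 h_y^2 = 0, \tag{5.77} \]
with $h(T, S, y) = \alpha g(S)$. Here,
\[ \mathcal{M}_1 = \rho \sigma(y) a(y) S \frac{\partial^2}{\partial S \partial y} + a(y) \Big( \rho'^2 a(y) \psi^{(0)}_y - \frac{\rho \mu}{\sigma(y)} \Big) \frac{\partial}{\partial y}, \tag{5.78} \]
\[ \overline{\mathcal{M}}_2 = \mathcal{M}_2 + \rho'^2 a(y)^2 \psi^{(1)}_y \frac{\partial}{\partial y}, \tag{5.79} \]
and we have substituted the expansion (5.76) for $\psi$. We look for an expansion
\[ h = h^{(0)} + \sqrt{\delta} \, h^{(1)} + \delta h^{(2)} + \cdots. \tag{5.80} \]
Inserting (5.80) into (5.77) and comparing order one terms yields that $h^{(0)}$ should be chosen to solve
\[ \mathcal{L}_{BS}(\sigma(y)) h^{(0)} = 0, \tag{5.81} \]
with $h^{(0)}(T, S, y) = \alpha g(S)$, and where
\[ \mathcal{L}_{BS}(\sigma(y)) = \frac{\partial}{\partial t} + \tfrac{1}{2} \sigma(y)^2 S^2 \frac{\partial^2}{\partial S^2}, \]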
the Black-Scholes differential operator at volatility level $\sigma(y)$. Since (5.81) is simply the Black-Scholes PDE with volatility coefficient $\sigma(y)$, $h^{(0)}$ is the Black-Scholes price of the European contract with payoff $\alpha g$, which we denote
\[ h^{(0)}(t, S, y) = h_{BS}(t, S; \sigma(y)). \]
Comparing terms of order $\sqrt{\delta}$ yields that $h^{(1)}$ should be chosen to solve
\[ \mathcal{L}_{BS}(\sigma(y)) h^{(1)} = -\mathcal{M}_1 h^{(0)}, \]
with zero terminal condition. At this stage, it is convenient to introduce the notation
\[ D_k = S^k \frac{\partial^k}{\partial S^k}, \]
the $k$th logarithmic derivative. We will write $D$ for $D_1$. As used extensively in the singular perturbation analysis in [92], the solution of the PDE problem
\[ \mathcal{L}_{BS}(\sigma(y)) u_{k,\ell} = -(T - t)^\ell D_k h_{BS}, \qquad u_{k,\ell}(T, S, y) = 0, \]
in $t < T$ is given by
\[ u_{k,\ell}(t, S, y) = \frac{(T - t)^{\ell+1}}{\ell + 1} D_k h_{BS}(t, S; \sigma(y)), \]
as can be verified by direct substitution. Using that $h^{(0)}$ solves the Black-Scholes PDE and the relation
\[ \frac{\partial}{\partial \sigma} h_{BS} = (T - t) \sigma S^2 \frac{\partial^2}{\partial S^2} h_{BS} \]
between the Greeks Vega and Gamma of Black-Scholes European option prices, we have
\[ \mathcal{M}_1 h^{(0)} = \rho \sigma^2 a \sigma' (T - t) D D_2 h^{(0)} + (T - t) a \sigma \sigma' \Big( a (1 - \rho^2) c_0' (T - t) - \frac{\rho \mu}{\sigma} \Big) D_2 h^{(0)}, \]
where
\[ c_0(y) = -\frac{\mu^2}{2\sigma(y)^2}. \]
Therefore,
\[ h^{(1)} = \tfrac{1}{2} \rho a \sigma^2 \sigma' (T - t)^2 D D_2 h^{(0)} + a \sigma \sigma' \Big( \tfrac{1}{3} a (1 - \rho^2) c_0' (T - t)^3 - \tfrac{1}{2} (T - t)^2 \frac{\rho \mu}{\sigma} \Big) D_2 h^{(0)}. \]
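The Vega-Gamma relation used in this step can be checked numerically. The sketch below is an illustration under stated assumptions: a European put with zero interest rate, hypothetical parameter values, and central finite differences in place of the analytic Greeks.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(S, K, sigma, tau):
    """Black-Scholes put price with zero interest rate and time to maturity tau."""
    d1 = math.log(S / K) / (sigma * math.sqrt(tau)) + 0.5 * sigma * math.sqrt(tau)
    d2 = d1 - sigma * math.sqrt(tau)
    return K * norm_cdf(-d2) - S * norm_cdf(-d1)


def vega_fd(S, K, sigma, tau, h=1e-5):
    """Central finite difference of the price in sigma."""
    return (bs_put(S, K, sigma + h, tau) - bs_put(S, K, sigma - h, tau)) / (2.0 * h)


def gamma_fd(S, K, sigma, tau, h=1e-2):
    """Central second difference of the price in S."""
    up, mid, down = (bs_put(S + h, K, sigma, tau),
                     bs_put(S, K, sigma, tau),
                     bs_put(S - h, K, sigma, tau))
    return (up - 2.0 * mid + down) / h**2


# Hypothetical contract: S = 100, K = 95, sigma = 20%, six months to maturity.
S, K, sigma, tau = 100.0, 95.0, 0.2, 0.5
lhs = vega_fd(S, K, sigma, tau)
rhs = tau * sigma * S**2 * gamma_fd(S, K, sigma, tau)
print(lhs, rhs)  # the two sides of Vega = (T - t) * sigma * S^2 * Gamma
```

In the notation of the text, the right-hand side is $(T - t)\sigma D_2 h_{BS}$, which is why derivatives in $y$ of $h^{(0)}$ can be traded for the logarithmic derivative $D_2$.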
Note that, so far, the nonlinear term in (5.77) has played no role, and the approximation $h^{(0)} + \sqrt{\delta} \, h^{(1)}$ is linear in $\alpha$. To pick up concavity in $\alpha$, we need to proceed to the next term in the expansion. Comparing terms of order $\delta$ in the PDE (5.77)
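with the substituted expansion (5.80) gives
\[ \mathcal{L}_{BS}(\sigma(y)) h^{(2)} = -\overline{\mathcal{M}}_2 h^{(0)} - \mathcal{M}_1 h^{(1)} + \tfrac{1}{2} \gamma (1 - \rho^2) a(y)^2 \big( h^{(0)}_y \big)^2, \]
with zero terminal condition. We write
\[ h^{(2)} = h^{(2,1)} + h^{(2,2)} + \alpha^2 F(t, S, y), \]
where
\[ \mathcal{L}_{BS}(\sigma(y)) h^{(2,1)} = -\overline{\mathcal{M}}_2 h^{(0)}, \]
\[ \mathcal{L}_{BS}(\sigma(y)) h^{(2,2)} = -\mathcal{M}_1 h^{(1)}, \]
\[ \mathcal{L}_{BS}(\sigma(y)) F = \tfrac{1}{2} \gamma (1 - \rho^2) a(y)^2 \big( h^{(0)}_y \big)^2, \]
each with zero terminal condition. Using that
\[ \frac{\partial^2}{\partial \sigma^2} h_{BS} = (T - t) D_2 h_{BS} + \sigma^2 (T - t)^2 D_2^2 h_{BS}, \]
and
\[ h^{(0)}_{yy} = \sigma'' \frac{\partial}{\partial \sigma} h_{BS} + (\sigma')^2 \frac{\partial^2}{\partial \sigma^2} h_{BS}, \]
we calculate
\[ \overline{\mathcal{M}}_2 h^{(0)} = A (T - t) D_2 h_{BS} + \tfrac{1}{2} a^2 \sigma^2 (\sigma')^2 (T - t)^2 D_2^2 h_{BS} + a^2 (1 - \rho^2) c_1' \sigma \sigma' (T - t)^3 D_2 h_{BS}, \]
where
\[ A(y) = \tfrac{1}{2} a(y)^2 \big( \sigma(y) \sigma''(y) + \sigma'(y)^2 \big) + b(y) \sigma(y) \sigma'(y), \]
\[ c_1(y) = \frac{\rho \mu^3 a(y)}{4\sigma(y)} \frac{\partial}{\partial y} \sigma(y)^{-2}. \]
Then,
\[ h^{(2,1)} = \tfrac{1}{2} A (T - t)^2 D_2 h_{BS} + \tfrac{1}{6} a^2 \sigma^2 (\sigma')^2 (T - t)^3 D_2^2 h_{BS} + \tfrac{1}{4} a^2 (1 - \rho^2) c_1' \sigma \sigma' (T - t)^4 D_2 h_{BS}. \]
Similarly, writing
\[ h^{(1)} = c_2(y) (T - t)^2 D D_2 h^{(0)} + \big( c_3(y) (T - t)^3 + c_4(y) (T - t)^2 \big) D_2 h^{(0)}, \]
and
\[ \mathcal{M}_1 = \big( c_5(y) D + (c_6(y) (T - t) + c_7(y)) I \big) \frac{\partial}{\partial y}, \]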
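where
\[ c_2 = \tfrac{1}{2} \rho a \sigma^2 \sigma', \quad c_3 = \tfrac{1}{3} a^2 (1 - \rho^2) \sigma \sigma' c_0', \quad c_4 = -\tfrac{1}{2} a \sigma' \rho \mu, \]
\[ c_5 = \rho \sigma a, \quad c_6 = a^2 (1 - \rho^2) c_0', \quad c_7 = -\rho a \mu / \sigma, \]
and $I = D_0$ is the identity operator, we obtain
\[
\begin{aligned}
\mathcal{M}_1 h^{(1)} = \big( c_5(y) D + (c_6(y)(T - t) + c_7(y)) I \big) \Big( & c_2' (T - t) D D_2 h^{(0)} + \big( c_3' (T - t)^3 + c_4' (T - t)^2 \big) D_2 h^{(0)} \\
& + \tilde{c}_2 (T - t)^3 D D_2^2 h^{(0)} + \big( \tilde{c}_3 (T - t)^4 + \tilde{c}_4 (T - t)^3 \big) D_2^2 h^{(0)} \Big),
\end{aligned}
\]
where $\tilde{c}_i = c_i \sigma \sigma'$, and therefore
\[
\begin{aligned}
h^{(2,2)} ={} & \tfrac{1}{2} c_5 c_2' (T - t)^2 D^2 D_2 h^{(0)} + \tfrac{1}{4} c_5 \tilde{c}_2 (T - t)^4 D^2 D_2^2 h^{(0)} \\
& + \Big( \tfrac{1}{4} c_5 c_3' (T - t)^4 + \tfrac{1}{3} c_5 c_4' (T - t)^3 + \tfrac{1}{3} c_6 c_2' (T - t)^3 + \tfrac{1}{2} c_7 c_2' (T - t)^2 \Big) D D_2 h^{(0)} \\
& + \Big( \tfrac{1}{5} c_5 \tilde{c}_3 (T - t)^5 + \tfrac{1}{4} c_5 \tilde{c}_4 (T - t)^4 + \tfrac{1}{5} c_6 \tilde{c}_2 (T - t)^5 + \tfrac{1}{4} c_7 \tilde{c}_2 (T - t)^4 \Big) D D_2^2 h^{(0)} \\
& + \Big( \tfrac{1}{6} c_6 \tilde{c}_3 (T - t)^6 + \tfrac{1}{5} c_6 \tilde{c}_4 (T - t)^5 + \tfrac{1}{5} c_7 \tilde{c}_3 (T - t)^5 + \tfrac{1}{4} c_7 \tilde{c}_4 (T - t)^4 \Big) D_2^2 h^{(0)} \\
& + \Big( \tfrac{1}{5} c_6 c_3' (T - t)^5 + \tfrac{1}{4} c_6 c_4' (T - t)^4 + \tfrac{1}{4} c_7 c_3' (T - t)^4 + \tfrac{1}{3} c_7 c_4' (T - t)^3 \Big) D_2 h^{(0)}.
\end{aligned}
\]
Finally, $F$ is given by
\[ (\sigma'(y))^2 \, E^{P^0} \Big\{ \int_t^T \tfrac{1}{2} \gamma (1 - \rho^2) a(y)^2 \, \mathcal{V}^2(u, \tilde{S}_u) \, du \Big\}, \tag{5.82} \]
and the Vega $\mathcal{V} = \frac{\partial}{\partial \sigma} h_{BS}$. In (5.82), $\tilde{S}$ is the solution of
\[ d\tilde{S}_t = \sigma(y) \tilde{S}_t \, d\tilde{W}_t, \]
with $\tilde{W}$ a standard Brownian motion in $(\Omega, \mathcal{F}, P^0)$. In other words, $\tilde{S}$ is a geometric Brownian motion with constant volatility $\sigma(y)$. For example, if the option $g(\tilde{S}_T, Y_T)$ were a put option written on $\tilde{S}$ with strike price $K$, the explicit form of $\mathcal{V}$ would be as follows:
\[ \mathcal{V}(t, S) = \frac{S \sqrt{T - t}}{\sqrt{2\pi}} \exp\Big( -\tfrac{1}{2} d_1(t, S)^2 \Big), \]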
where
\[ d_1(t, S) = \frac{\log(S/K)}{\sigma(y) \sqrt{T - t}} + \tfrac{1}{2} \sigma(y) \sqrt{T - t}. \]
In this case, the expectation in (5.82) reduces to
\[
\int_t^T \tfrac{1}{2} \gamma (1 - \rho^2) a(y)^2 (\sigma_y(y))^2 \, \frac{1}{2\pi} \frac{(T - u)^{3/2}}{\sqrt{T - 2t + u}} \, S^{\frac{T - t}{T - 2t + u}} K^{\frac{T - 3t + 2u}{T - 2t + u}} \exp\Big( -\frac{(T - t)^2 \sigma(y)^4 / 4 + \log^2(K/S)}{(T - 2t + u) \sigma(y)^2} \Big) du,
\]
which can be calculated very fast. Given the market price $p$ per unit of the derivative security with payoff $g(S_T)$, the optimal number of derivatives $\alpha^*$ to hold is approximately given by the maximizer of
\[ \alpha \big( \hat{h}^{(0)} + \sqrt{\delta} \, \hat{h}^{(1)} + \delta h^{(2,1)} \big) + \alpha^2 \delta F - \alpha p, \]
where $\hat{h}^{(0)} = h^{(0)} / \alpha$ is the $\alpha$-independent Black-Scholes price of one contract, and similarly $\hat{h}^{(1)}$. Therefore
\[ \alpha^* \approx \frac{p - \big( \hat{h}^{(0)} + \sqrt{\delta} \, \hat{h}^{(1)} + \delta h^{(2,1)} \big)}{2 \delta F}. \]
Clearly, this blows up as $\delta \downarrow 0$, unless $p$ is the Black-Scholes price of the contract (with today’s volatility). This is what we would expect in the convergence to a complete market, when derivatives become redundant, and induce arbitrage opportunities for infinite profit unless priced correctly by the market.
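The closed-form integrand for the put can be cross-checked against a direct numerical expectation of the squared Vega. The sketch below is only an illustration: it assumes zero interest rate, reads the last term in the exponent as $\log^2(K/S)$, and uses hypothetical function names and parameter values. The expectation over the Gaussian increment is computed with a plain trapezoid rule.

```python
import math


def vega_put(S, K, sigma, tau):
    """Black-Scholes Vega with zero interest rate and time to maturity tau."""
    d1 = math.log(S / K) / (sigma * math.sqrt(tau)) + 0.5 * sigma * math.sqrt(tau)
    return S * math.sqrt(tau) * math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)


def mean_sq_vega_quadrature(t, u, T, S, K, sigma, n=4001, zmax=10.0):
    """E[Vega(u, S_u)^2 | S_t = S] for a driftless geometric Brownian motion.

    S_u = S * exp(sigma * sqrt(u - t) * Z - sigma^2 (u - t) / 2), Z standard normal;
    the Gaussian integral is approximated by the trapezoid rule on [-zmax, zmax].
    """
    s = u - t
    dz = 2.0 * zmax / (n - 1)
    total = 0.0
    for i in range(n):
        z = -zmax + i * dz
        S_u = S * math.exp(sigma * math.sqrt(s) * z - 0.5 * sigma**2 * s)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * vega_put(S_u, K, sigma, T - u) ** 2 * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)


def mean_sq_vega_closed_form(t, u, T, S, K, sigma):
    """Closed form of the same expectation (the integrand without the constant factors)."""
    D = T - 2.0 * t + u
    prefactor = (T - u) ** 1.5 / (2.0 * math.pi * math.sqrt(D))
    powers = S ** ((T - t) / D) * K ** ((T - 3.0 * t + 2.0 * u) / D)
    exponent = -((T - t) ** 2 * sigma**4 / 4.0 + math.log(K / S) ** 2) / (D * sigma**2)
    return prefactor * powers * math.exp(exponent)


# Hypothetical inputs.
args = dict(t=0.0, u=0.3, T=1.0, S=100.0, K=110.0, sigma=0.25)
print(mean_sq_vega_quadrature(**args), mean_sq_vega_closed_form(**args))
```

Multiplying the closed form by $\tfrac{1}{2}\gamma(1-\rho^2)a(y)^2(\sigma_y(y))^2$ and integrating over $u \in [t, T]$ then gives the one-dimensional integral stated in the text.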
Chapter Six
Indifference Pricing of Defaultable Claims
Tomasz R. Bielecki and Monique Jeanblanc
Our goal in this chapter is to give an application of the theory of indifference prices in the context of defaultable claims within the reduced-form approach. In this approach the defaultable market is incomplete and there does not exist a (perfect) hedging strategy for claims that depend on the occurrence of the default. An important issue is the choice of relevant information. The chapter is organized as follows. Section 6.1 contains a brief description of the basic concepts of default risk that are used in the chapter. The second section is devoted to indifference pricing in the filtration of default-free assets. The following section studies the case where the investor additionally uses the information on the default in the choice of the portfolio and is endowed with an exponential utility function. In the last section, we present the quadratic hedging problem.
6.1 PRELIMINARIES In this section, we introduce the basic notions that will be used in what follows. First, we define a default-free market model. Then, we examine the concept of a default time and we present the associated hazard process. We make precise the choice of the filtration, which is an important aspect of our presentation.
6.1.1 Default-Free Market

Consider an economy in continuous time, with the time parameter $t \in \mathbb{R}_+$. A probability space $(\Omega, \mathcal{G}, P)$ endowed with a one-dimensional standard Brownian motion $\{W_t, t \ge 0\}$ is given. We assume that the reference filtration $\mathbb{F} = \{\mathcal{F}_t, t \ge 0\}$ is the $P$-augmented version of the natural filtration generated by $W$. We have $\mathcal{F}_t \subset \mathcal{G}$, for any $t \in \mathbb{R}_+$; however, we do not assume that $\mathcal{G} = \mathcal{F}_\infty$. In the first step, we introduce a Black and Scholes arbitrage-free default-free market. In this market, we have the following primary assets:
• A money market account $B$ satisfying
\[ dB_t = r B_t \, dt, \quad B_0 = 1, \]
or, equivalently, $B_t = \exp(rt)$, where the interest rate $r$ is assumed to be constant.
• A default-free asset whose price $\{S_t, t \ge 0\}$ follows a geometric Brownian motion dynamics
\[ dS_t = S_t (\nu \, dt + \sigma \, dW_t), \]
where $\nu$ and $\sigma$ are two constants, with $\sigma \ne 0$.

It is not difficult to extend the study to the case where $r$, $\nu$, and $\sigma$ are $\mathbb{F}$-adapted processes, provided some regularity is assumed in order to guarantee that the default-free market is arbitrage free. In the last part of the chapter, we shall turn to a more general model of the primary market. As is well known, the Black and Scholes default-free market is arbitrage free and complete, and the (unique) risk-neutral probability $Q$ is obtained via its Radon-Nikodym density, i.e., $dQ|_{\mathcal{F}_t} = \eta_t \, dP|_{\mathcal{F}_t}$, where $(\eta_t, t \ge 0)$ is the $(P, \mathbb{F})$-martingale given as
\[ \eta_t = \exp\Big( -\theta W_t - \tfrac{1}{2} \theta^2 t \Big), \tag{6.1} \]
where $\theta = (\nu - r)\sigma^{-1}$ is the risk premium. From Girsanov’s theorem, the process $W^Q_t = W_t + \theta t$ is a $(Q, \mathbb{F})$-Brownian motion.
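Two consequences of (6.1) are easy to verify numerically: the density has unit expectation under $P$, and $W_T + \theta T$ has zero mean under $Q$. The following sketch is an illustration only; the helper names and the parameter values are assumptions, and the Gaussian expectations are computed with a trapezoid rule.

```python
import math


def eta(w, t, theta):
    """Radon-Nikodym density (6.1) evaluated at W_t = w."""
    return math.exp(-theta * w - 0.5 * theta**2 * t)


def expectation_under_P(f, T, n=4001, zmax=10.0):
    """E_P[f(W_T)] with W_T = sqrt(T) * Z, Z standard normal (trapezoid rule)."""
    dz = 2.0 * zmax / (n - 1)
    total = 0.0
    for i in range(n):
        z = -zmax + i * dz
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(math.sqrt(T) * z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)


# Hypothetical market parameters.
nu, r, sigma, T = 0.08, 0.03, 0.2, 2.0
theta = (nu - r) / sigma  # risk premium

mass = expectation_under_P(lambda w: eta(w, T, theta), T)
mean_WQ = expectation_under_P(lambda w: eta(w, T, theta) * (w + theta * T), T)
print(mass)     # total mass of Q, approximately 1
print(mean_WQ)  # Q-mean of W_T + theta*T, approximately 0
```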
6.1.2 Default Time

The default time $\tau$ is defined as a nonnegative random variable on the probability space $(\Omega, \mathcal{G}, P)$. We introduce the default process $H_t = 1_{\{\tau \le t\}}$ and we denote by $\mathbb{H} = (\mathcal{H}_t, t \ge 0)$ the filtration generated by this process (this filtration is right-continuous, and, as usual, we take the completion of this filtration). Note that $\mathcal{H}_t = \sigma(t \wedge \tau)$; hence, any $\mathcal{H}_t$-measurable random variable is a deterministic function of the random variable $\tau \wedge t$.
6.1.3 Hazard Process It is generally assumed that the investor knows when the default takes place, that is, the observation of the investor includes the filtration H. At time t, the investor knows whether or not the default has occurred. If the default has not occurred yet, the investor has no information on the date when the default will possibly take place. Therefore, we consider the filtration of information that takes into account the information on the asset price and of the occurrence of default: G = F ∨ H so that Gt = Ft ∨ Ht = σ (Ft ∪ Ht ) for every t ∈ R+ . The filtration G is referred to as the full filtration. It is clear that τ is an H-stopping time, as well as a G-stopping time (but not necessarily an F-stopping time). The concept of hazard process of a
random time $\tau$ is closely related to the process $F = \{F_t, t \ge 0\}$, which is defined as follows:
\[ F_t = P\{\tau \le t \mid \mathcal{F}_t\}, \quad \forall t \in \mathbb{R}_+. \tag{6.2} \]
Let us denote by $G_t = 1 - F_t = P\{\tau > t \mid \mathcal{F}_t\}$ the survival probability and let us assume that $G_t > 0$ for every $t \in \mathbb{R}_+$ (hence, we exclude the case where $\tau$ is an $\mathbb{F}$-stopping time, a case that corresponds to the so-called structural approach). Then the process $\Gamma = \{\Gamma_t, t \ge 0\}$, given by the formula
\[ \Gamma_t = -\ln(1 - F_t) = -\ln G_t, \quad \forall t \ge 0, \]
is well defined. It is termed the hazard process of the random time $\tau$ with respect to the reference filtration $\mathbb{F}$. We postulate that $P(\tau < \infty) = 1$ (i.e., $\tau$ is finite with probability one). We now formulate an important

Hypothesis: We assume in this chapter that the Brownian motion $\{W_t, t \ge 0\}$ is a $(P, \mathbb{G})$-Brownian motion. In other words, we assume that any $\mathbb{F}$-martingale is a $\mathbb{G}$-martingale.

This assumption is known in the literature of enlargement of filtration as hypothesis (H). It can be proved that it is equivalent to
\[ P\{\tau \le t \mid \mathcal{F}_t\} = P\{\tau \le t \mid \mathcal{F}_\infty\}, \quad \forall t \ge 0, \]
and, as a consequence, the process $F$ (hence $\Gamma$) is increasing. We do not comment here on that hypothesis; we simply mention that this hypothesis is necessary to guarantee that there is no arbitrage in the default-free market using $\mathbb{G}$-adapted strategies. See Elliott et al. [230] or Bielecki et al. [261] for comments. Note that, due to hypothesis (H), the process $\{\eta_t, t \ge 0\}$ defined in (6.1) is a $(P, \mathbb{G})$-martingale. This allows us to define the probability $Q^*$ whose restriction to $\mathcal{G}_t$ is $dQ^*|_{\mathcal{G}_t} = \eta_t \, dP|_{\mathcal{G}_t}$. Obviously, the restriction of $Q^*$ to $\mathbb{F}$ is equal to $Q$. We shall omit the superscript $*$ in what follows. Moreover, for simplicity, we assume that the process $\{F_t, t \ge 0\}$ is absolutely continuous, that is,
\[ F_t = \int_0^t f_u \, du \]
for some density process $f : \mathbb{R}_+ \to \mathbb{R}_+$. Then we have
\[ F_t = 1 - e^{-\Gamma_t} = 1 - \exp\Big( -\int_0^t \gamma_u \, du \Big), \quad \forall t \ge 0, \]
where
\[ \gamma_t = \frac{f_t}{1 - F_t}, \quad \forall t \ge 0. \]
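The relations between $F$, $f$, the hazard process, and the hazard rate can be illustrated on the simplest case. The sketch below assumes a deterministic, exponentially distributed default time with a hypothetical constant intensity; the function names are illustrative.

```python
import math


def hazard_process(F_t):
    """Gamma_t = -ln(1 - F_t), defined whenever the survival probability is positive."""
    return -math.log(1.0 - F_t)


def hazard_rate(f_t, F_t):
    """gamma_t = f_t / (1 - F_t)."""
    return f_t / (1.0 - F_t)


# Assumed example: exponential default time with constant intensity lam.
lam = 0.03


def F_exp(t):
    """Conditional default probability F_t = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)


def f_exp(t):
    """Density f_t = lam * exp(-lam * t), so that F_t is the integral of f."""
    return lam * math.exp(-lam * t)


for t in (0.5, 2.0, 10.0):
    # The hazard rate is constant and the hazard process is linear in t.
    print(t, hazard_rate(f_exp(t), F_exp(t)), hazard_process(F_exp(t)))
```

In this example the hazard rate recovers the constant intensity and $\Gamma_t = \lambda t$, consistent with $F_t = 1 - e^{-\Gamma_t}$.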
The process $\gamma$ is nonnegative and satisfies $\int_0^\infty \gamma_u \, du = \infty$. It is called the hazard rate of $\tau$ (or the stochastic intensity). It can be checked by direct calculations that the process
\[ M_t = H_t - \int_0^{t \wedge \tau} \gamma_u \, du = H_t - \int_0^t (1 - H_{u-}) \gamma_u \, du \tag{6.3} \]
is a (purely discontinuous) $(P, \mathbb{G})$-martingale. This implies that the random time $\tau$ is totally inaccessible in the filtration $\mathbb{G}$. We emphasize that, in our setting, the intensity process is uniquely defined up to infinity and is $\mathbb{F}$-adapted. Moreover, from the definition of $Q$ (relative to the full filtration $\mathbb{G}$) the process $M$ is a $(Q, \mathbb{G})$-martingale. Indeed, the change of probability has an effect only on the Brownian motion $W$ and no effect on the martingale $M$, which is orthogonal to $W$. In the particular case where the random time $\tau$ is independent of the filtration $\mathbb{F}$, the hazard process is deterministic.

Furthermore, for any $\mathbb{G}$-predictable process $\psi$ such that $\psi_s > -1$, $\forall s$, a.s., and $\int_0^t (1 + \psi_s)(1 - H_s) \gamma_s \, ds < \infty$, the process $M^\psi_t = M_t - \int_0^t (1 - H_s) \psi_s \gamma_s \, ds$ is a $Q^\psi$-martingale if the probability measure $Q^\psi$ is defined by $dQ^\psi|_{\mathcal{G}_t} = \eta_t \, \mathcal{E}(\psi \bullet M)_t \, dP|_{\mathcal{G}_t}$. Here, the process $\mathcal{E}(\psi \bullet M)_t$, the Doléans-Dade exponential of $\psi \bullet M$, is the unique process $Y$ that is the solution of $dY_t = Y_{t-} \psi_t \, dM_t$, $Y_0 = 1$. The restriction of $Q^\psi$ to the σ-algebra $\mathcal{F}_t$ is equal to $Q$.

6.1.4 Defaultable Claims

A defaultable claim $(X_1, X_2, \tau)$ with maturity date $T$ consists of the following.
• The default time $\tau$ specifying the random time of default and thus also the default events $\{\tau \le t\}$ for every $t \in [0, T]$. It is always assumed that $\tau$ is strictly positive with probability 1.
• The promised payoff $X_1$, which represents the random payoff received by the owner of the claim at time $T$, if there is no default prior to or at time $T$. The actual payoff at time $T$ associated with $X_1$ thus equals $X_1 1_{\{\tau > T\}}$. We assume that $X_1$ is an $\mathcal{F}_T$-measurable random variable.
• The recovery payoff $X_2$, where $X_2$ is an $\mathcal{F}_T$-measurable random variable, which is the amount received by the owner of the claim at maturity, provided that the default occurs prior to or at maturity date $T$.

In what follows, we shall denote by X = X1 1T 0 in the default-free financial market using a self-financing strategy.
The associated optimization problem is
\[ (P): \quad \mathcal{V}(v) := \sup_{\phi \in \Phi(\mathbb{F})} E_P\{ U(V_T^v(\phi)) \}, \]
where the wealth process $\{V_t = V_t^v(\phi), t \le T\}$ is the solution of
\[ dV_t = r V_t \, dt + \phi_t (dS_t - r S_t \, dt), \quad V_0 = v. \tag{6.4} \]
Here $\Phi(\mathbb{F})$ is the class of all $\mathbb{F}$-adapted, self-financing trading strategies.

Problem $(P^X_{\mathbb{F}})$: Optimization in the default-free market using $\mathbb{F}$-adapted strategies and buying the defaultable claim. The agent buys the defaultable claim $X$ at price $p$, and invests his remaining wealth $v - p$ in the default-free financial market, using a trading strategy $\phi \in \Phi(\mathbb{F})$. The resulting global terminal wealth will be
\[ V_T^{v-p,X}(\phi) = V_T^{v-p}(\phi) + X. \]
The associated optimization problem is
\[ (P^X_{\mathbb{F}}): \quad \mathcal{V}_X^{\mathbb{F}}(v - p) := \sup_{\phi \in \Phi(\mathbb{F})} E_P\{ U(V_T^{v-p}(\phi) + X) \}, \]
where the process $V^{v-p}(\phi)$ is the solution of (6.4) that satisfies the initial condition $V_0^{v-p}(\phi) = v - p$. We emphasize that the class $\Phi(\mathbb{F})$ of admissible strategies is the same as in the problem (P), that is, we restrict here our attention to trading strategies that are adapted to the reference filtration $\mathbb{F}$.

Problem $(P^X_{\mathbb{G}})$: Optimization in the default-free market using $\mathbb{G}$-adapted strategies and buying the defaultable claim. The agent buys the defaultable contingent claim $X$ at price $p$, and invests the remaining wealth $v - p$ in the financial market, using a strategy adapted to the enlarged filtration $\mathbb{G}$. The associated optimization problem is
\[ (P^X_{\mathbb{G}}): \quad \mathcal{V}_X^{\mathbb{G}}(v - p) := \sup_{\phi \in \Phi(\mathbb{G})} E_P\{ U(V_T^{v-p}(\phi) + X) \}, \]
where $\Phi(\mathbb{G})$ is the class of all $\mathbb{G}$-admissible trading strategies.

Remark. It is easy to check that the solution of
\[ (P_{\mathbb{G}}): \quad \sup_{\phi \in \Phi(\mathbb{G})} E_P\{ U(V_T^v(\phi)) \}, \]
is the same as the solution of (P).
Definition 6.1 For a given initial endowment $v$, the $\mathbb{F}$-indifference buying price of the defaultable claim $X$ is the real number $p^*_{\mathbb{F}}(v)$ such that $\mathcal{V}(v) = \mathcal{V}_X^{\mathbb{F}}(v - p^*_{\mathbb{F}}(v))$. Similarly, the $\mathbb{G}$-indifference buying price of $X$ is the real number $p^*_{\mathbb{G}}(v)$ such that $\mathcal{V}(v) = \mathcal{V}_X^{\mathbb{G}}(v - p^*_{\mathbb{G}}(v))$.

Remark. We can define the $\mathbb{F}$-indifference selling price $p_*^{\mathbb{F}}(v)$ of $X$ by considering $-p$, where $p$ is the buying price of $-X$, as specified in Definition 6.1. If the contingent claim $X$ is $\mathcal{F}_T$-measurable, then (see Rouge and El Karoui [156] or Chapter 2 by Henderson and Hobson in this volume) the $\mathbb{F}$- and the $\mathbb{G}$-indifference selling and buying prices coincide with the hedging price of $X$, i.e.,
\[ p^*_{\mathbb{F}}(v) = p^*_{\mathbb{G}}(v) = E_P\{\zeta_T X\} = E_Q\{X\} = p_*^{\mathbb{G}}(v) = p_*^{\mathbb{F}}(v), \]
where we denote by $\zeta$ the deflator process $\zeta_t = \eta_t e^{-rt}$.
6.2 INDIFFERENCE PRICES RELATIVE TO THE REFERENCE FILTRATION

In this section, we study the problem $(P^X_{\mathbb{F}})$ (i.e., we use strategies adapted to the reference filtration). First, we compute the value function, i.e., $\mathcal{V}_X^{\mathbb{F}}(v - p)$. Next, we establish a quasi-explicit representation for the indifference price of $X$ in the case of exponential utility. Finally, we compare the credit spread obtained via the risk-neutral valuation with the credit spread determined by the indifference price of a defaultable zero-coupon bond.

6.2.1 Solution of Problem $(P^X_{\mathbb{F}})$

In view of the particular form of the defaultable claim $X$ it follows that
\[ V_T^{v-p,X}(\phi) = 1_{\{\tau > T\}} \big( V_T^{v-p}(\phi) + X_1 \big) + 1_{\{\tau \le T\}} \big( V_T^{v-p}(\phi) + X_2 \big). \]
Since the trading strategies are $\mathbb{F}$-adapted, the terminal wealth $V_T^{v-p}(\phi)$ is an $\mathcal{F}_T$-measurable random variable. Consequently, it holds that
\[
\begin{aligned}
E_P\{ U(V_T^{v-p,X}(\phi)) \} &= E_P\{ U(V_T^{v-p}(\phi) + X_1) 1_{\{\tau > T\}} + U(V_T^{v-p}(\phi) + X_2) 1_{\{\tau \le T\}} \} \\
&= E_P\{ E_P\{ U(V_T^{v-p}(\phi) + X_1) 1_{\{\tau > T\}} + U(V_T^{v-p}(\phi) + X_2) 1_{\{\tau \le T\}} \mid \mathcal{F}_T \} \} \\
&= E_P\{ U(V_T^{v-p}(\phi) + X_1)(1 - F_T) + U(V_T^{v-p}(\phi) + X_2) F_T \}.
\end{aligned}
\]
Recall that $F_T = P\{\tau \le T \mid \mathcal{F}_T\}$ by definition (6.2). Thus, problem $(P^X_{\mathbb{F}})$ is equivalent to the following problem:
\[ (P^X_{\mathbb{F}}): \quad \mathcal{V}_X^{\mathbb{F}}(v - p) := \sup_{\phi \in \Phi(\mathbb{F})} E_P\{ J_X(V_T^{v-p}(\phi), \cdot) \}, \]
where
\[ J_X(y, \omega) = U(y + X_1(\omega))(1 - F_T(\omega)) + U(y + X_2(\omega)) F_T(\omega), \]
for every $\omega \in \Omega$ and $y \in \mathbb{R}$. The real-valued mapping $J_X(\cdot, \omega)$ is strictly concave and increasing. Consequently, for any $\omega \in \Omega$, we can define the mapping $I_X(z, \omega)$ by setting $I_X(z, \omega) = (J_X'(\cdot, \omega))^{-1}(z)$ for $z \in \mathbb{R}$, where $(J_X'(\cdot, \omega))^{-1}$ denotes the inverse mapping of the derivative of $J_X$ with respect to the first variable. To simplify the notation, we shall suppress the second variable, and we write $I_X(\cdot)$ in lieu of $I_X(\cdot, \omega)$. The following lemma provides the form of the optimal solution for the problem $(P^X_{\mathbb{F}})$.

Lemma 6.2 The optimal terminal wealth for the problem $(P^X_{\mathbb{F}})$ is a random variable of the form $V_T^{v-p,*} = I_X(\lambda^* \zeta_T)$, $P$-a.s., for some $\lambda^*$ such that
\[ v - p = E_P\{ \zeta_T V_T^{v-p,*} \}. \tag{6.5} \]
Thus the optimal global wealth equals $V_T^{v-p,X,*} = V_T^{v-p,*} + X = I_X(\lambda^* \zeta_T) + X$ and the value function of the objective criterion for the problem $(P^X_{\mathbb{F}})$ is
\[ \mathcal{V}_X^{\mathbb{F}}(v - p) = E_P\{ U(V_T^{v-p,X,*}) \} = E_P\{ U(I_X(\lambda^* \zeta_T) + X) \}. \tag{6.6} \]
Proof. It is well known (see, e.g., Karatzas and Shreve [153]) that, in order to find the optimal wealth, it is enough to maximize $U(\xi)$ over the set of square-integrable and $\mathcal{F}_T$-measurable random variables $\xi$, subject to the budget constraint, given by $E_P\{\zeta_T \xi\} \le v - p$. The mapping $J_X(\cdot)$ is strictly concave (for all $\omega$). Hence, for every pair of $\mathcal{F}_T$-measurable random variables $(\xi, \xi^*)$ subject to the budget constraint, by the tangent inequality we have
\[ E_P\{ J_X(\xi) - J_X(\xi^*) \} \le E_P\{ (\xi - \xi^*) J_X'(\xi^*) \}. \]
For $\xi^* = V_T^{v-p,*}$ given in the formulation of the lemma we obtain
\[ E_P\{ J_X(\xi) - J_X(V_T^{v-p,*}) \} \le \lambda^* E_P\{ \zeta_T (\xi - V_T^{v-p,*}) \} \le 0, \]
where the last inequality follows from the budget constraint and the choice of $\lambda^*$. Hence, for any $\phi \in \Phi(\mathbb{F})$,
\[ E_P\{ J_X(V_T^{v-p}(\phi)) - J_X(V_T^{v-p,*}) \} \le 0. \]
To conclude the proof, it remains to observe that the first-order conditions are also sufficient in the case of a concave criterion. Moreover, by virtue of strict concavity of the function $J_X$, the optimal strategy is unique. □

6.2.2 Exponential Utility: Explicit Computation of the Indifference Price

For the sake of simplicity, we assume here that $r = 0$.
Proposition 6.1 Let $U(x) = -\exp(-\gamma x)$ for some $\gamma > 0$. Assume that the random variables $\zeta_T e^{-\gamma X_i}$, $i = 1, 2$, are $P$-integrable. Then the $\mathbb{F}$-indifference buying price is given by
\[ p^*_{\mathbb{F}}(v) = -\frac{1}{\gamma} E_P\{ \zeta_T \ln((1 - F_T) e^{-\gamma X_1} + F_T e^{-\gamma X_2}) \} = E_P\{ \zeta_T \Psi \}, \]
where the $\mathcal{F}_T$-measurable random variable $\Psi$ equals
\[ \Psi = -\frac{1}{\gamma} \ln((1 - F_T) e^{-\gamma X_1} + F_T e^{-\gamma X_2}). \tag{6.7} \]
Thus, the $\mathbb{F}$-indifference buying price $p^*_{\mathbb{F}}(v)$ is the arbitrage price of the associated claim $\Psi$. In addition, the claim $\Psi$ enjoys the following meaningful property
\[ E_P\{ U(X - \Psi) \mid \mathcal{F}_T \} = 0. \tag{6.8} \]
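The structure of (6.7) is that of a conditional certainty equivalent under exponential utility. A minimal numerical sketch follows; it assumes, purely for illustration, that $F_T$, $X_1$, and $X_2$ take single known values (so that the random variable in (6.7) is a number), and the function names and inputs are hypothetical.

```python
import math


def indifference_claim_buy(X1, X2, F_T, gamma):
    """Value of (6.7): -(1/gamma) * ln((1 - F_T) e^{-gamma X1} + F_T e^{-gamma X2})."""
    mix = (1.0 - F_T) * math.exp(-gamma * X1) + F_T * math.exp(-gamma * X2)
    return -math.log(mix) / gamma


def conditional_mean_payoff(X1, X2, F_T):
    """(1 - F_T) X1 + F_T X2, the conditional mean of the payoff given F_T."""
    return (1.0 - F_T) * X1 + F_T * X2


# Hypothetical inputs: promised payoff 1, recovery 0.4, default probability 20%.
X1, X2, F_T = 1.0, 0.4, 0.2
for gamma in (0.01, 1.0, 5.0):
    print(gamma, indifference_claim_buy(X1, X2, F_T, gamma))
print(conditional_mean_payoff(X1, X2, F_T))
```

In this sketch the value decreases as the risk aversion $\gamma$ grows and approaches the conditional mean payoff as $\gamma \downarrow 0$; by construction, $(1 - F_T) e^{-\gamma(X_1 - \Psi)} + F_T e^{-\gamma(X_2 - \Psi)} = 1$.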
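Proof. In view of the form of the solution to the problem (P), we obtain
\[ V_T^{v,*} = -\frac{1}{\gamma} \ln\Big( \frac{\mu^* \zeta_T}{\gamma} \Big). \]
The budget constraint $E_P\{\zeta_T V_T^{v,*}\} = v$ implies that the Lagrange multiplier $\mu^*$ satisfies
\[ \frac{1}{\gamma} \ln\Big( \frac{\mu^*}{\gamma} \Big) = -\frac{1}{\gamma} E_P\{ \zeta_T \ln \zeta_T \} - v. \tag{6.9} \]
The solution to the problem $(P^X_{\mathbb{F}})$ is obtained in a general setting in Lemma 6.2. In the case of an exponential utility, we have (recall that the variable $\omega$ is suppressed)
\[ J_X(y) = -e^{-\gamma(y + X_1)}(1 - F_T) - e^{-\gamma(y + X_2)} F_T, \]
so that
\[ J_X'(y) = \gamma e^{-\gamma y} \big( e^{-\gamma X_1}(1 - F_T) + e^{-\gamma X_2} F_T \big). \]
Thus, setting $A = e^{-\gamma X_1}(1 - F_T) + e^{-\gamma X_2} F_T = e^{-\gamma \Psi}$, we obtain
\[ I_X(z) = -\frac{1}{\gamma} \ln\Big( \frac{z}{A\gamma} \Big) = -\frac{1}{\gamma} \ln\Big( \frac{z}{\gamma} \Big) - \Psi. \]
It follows that the optimal terminal wealth for the initial endowment $v - p$ is
\[ V_T^{v-p,*} = -\frac{1}{\gamma} \ln\Big( \frac{\lambda^* \zeta_T}{A\gamma} \Big) = -\frac{1}{\gamma} \ln\Big( \frac{\lambda^*}{\gamma} \Big) - \frac{1}{\gamma} \ln \zeta_T - \Psi, \]
where the Lagrange multiplier $\lambda^*$ is chosen so that the budget constraint is satisfied, that is, $E_P\{\zeta_T V_T^{v-p,*}\} = v - p$ or
\[ \frac{1}{\gamma} \ln\Big( \frac{\lambda^*}{\gamma} \Big) = -\frac{1}{\gamma} E_P\{ \zeta_T \ln \zeta_T \} - E_P\{ \zeta_T \Psi \} - v + p. \tag{6.10} \]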
By definition, the $\mathbb{F}$-indifference buying price is a real number $p^* = p^*_{\mathbb{F}}(v)$ such that
\[ E_P\{ \exp(-\gamma V_T^{v,*}) \} = E_P\{ \exp(-\gamma (V_T^{v-p^*,*} + X)) \}, \]
where $\mu^*$ and $\lambda^*$ are given by (6.9) and (6.10), respectively. After substitution and simplifications, we arrive at the following equality:
\[ E_P\{ \exp(-\gamma (E_P\{\zeta_T \Psi\} - p^* + X - \Psi)) \} = 1. \tag{6.11} \]
It is easy to check that
\[ E_P\{ e^{-\gamma(X - \Psi)} \mid \mathcal{F}_T \} = 1 \tag{6.12} \]
so that equality (6.8) holds, and $E_P\{ e^{-\gamma(X - \Psi)} \} = 1$. Combining (6.11) and (6.12), we conclude that $p^*_{\mathbb{F}}(v) = E_P\{\zeta_T \Psi\}$. □

We briefly provide the analog of (6.7) for the $\mathbb{F}$-indifference selling price of $X$. We have $p_*^{\mathbb{F}}(v) = E_P\{\zeta_T \tilde{\Psi}\}$, where
\[ \tilde{\Psi} = \frac{1}{\gamma} \ln((1 - F_T) e^{\gamma X_1} + F_T e^{\gamma X_2}). \tag{6.13} \]
Remark. The $\mathbb{F}$-indifference prices $p^*_{\mathbb{F}}(v)$ and $p_*^{\mathbb{F}}(v)$ do not depend on the initial endowment $v$. This is a property of the exponential utility function that we already encountered several times. In view of (6.8), the random variable $\Psi$ will be called the indifference conditional hedge. From concavity of the logarithm function we obtain
\[ \ln((1 - F_T) e^{-\gamma X_1} + F_T e^{-\gamma X_2}) \ge (1 - F_T)(-\gamma X_1) + F_T(-\gamma X_2). \]
Hence, using the fact that $\zeta_T$ is $\mathcal{F}_T$-measurable,
\[ p^*_{\mathbb{F}}(v) \le E_P\{ \zeta_T ((1 - F_T) X_1 + F_T X_2) \} = E_Q(X). \]

Comparison with marginal price. Let us present the results that can be derived from the marginal utility pricing approach. The Davis or marginal price (see Davis [60] or Chapter 2 by Henderson and Hobson in this volume) is given by
\[ d^*(v) = \frac{E_P\{ U'(V_T^{v,*}) X \}}{\mathcal{V}'(v)}. \]
In our context, this yields
\[ d^*(v) = E_P\{ \zeta_T (X_1 (1 - F_T) + X_2 F_T) \}. \]
In this case, the risk aversion $\gamma$ has no influence on the pricing of the contingent claim. In particular, when $F$ is deterministic, the Davis price reduces to the arbitrage price of each (default-free) financial asset $X_i$, $i = 1, 2$, weighted by the corresponding probabilities $1 - F_T$ and $F_T$.

6.2.3 Risk-Neutral Spread versus Indifference Spreads

In our setting the price process of the $T$-maturity unit discount Treasury (default-free) bond is $B(t, T) = e^{-r(T-t)}$. Let us consider the case of a defaultable bond with
220
CHAPTER 6
zero recovery, i.e., X1 = 1 and X2 = 0. It follows from (6.13) that the F-indifference buying and selling prices of the bond are (it will be convenient here to indicate the dependence of the indifference price on maturity T ) 1 DF∗ (0, T ) = − EP {ζT ln(e−γ (1 − FT ) + FT )} γ and D∗F (0, T ) =
1 EP {ζT ln(eγ (1 − FT ) + FT )}, γ
respectively. be a risk-neutral probability for the filtration G, that is, for the enlarged Let Q market. The “market” price at time t = 0 of defaultable bond, denoted as D 0 (0, T ), of its discounted payoff, that is, is thus equal to the expectation under Q T )RT ), D 0 (0, T ) = EQ{1{τ >T } RT } = EQ((1 − F ≤ t|Ft } for every t ∈ [0, T ]. Let us emphasize that the risk-neutral t = Q{τ where F is chosen by the market, via the price of the defaultable asset. The probability Q indifference buying and selling spreads at time t = 0 are defined as S ∗ (0, T ) = −
1 DF∗ (0, T ) ln T B(0, T )
and S∗ (0, T ) = −
1 D∗F (0, T ) ln , T B(0, T )
respectively. Likewise, the risk-neutral spread at time t = 0 is given as S 0 (0, T ) = −
1 D 0 (0, T ) ln . T B(0, T )
Since DF∗ (0, 0) = D∗F (0, 0) = D 0 (0, 0) = 1, the respective short spreads at time t = 0 are given by the following limits (provided the limits exist) d + ln DF∗ (0, T ) s ∗ (0) = lim S ∗ (0, T ) = − −r T ↓0 dT T =0 and
d + ln D∗F (0, T ) s∗ (0) = lim S∗ (0, T ) = − −r, T ↓0 dT T =0
respectively. We also set s 0 (0) = lim S 0 (0, T ) = − T ↓0
d + ln D 0 (0, T ) −r. dT T =0
Assuming, as we do, that the processes $\widetilde F_T$ and $F_T$ are absolutely continuous with respect to the Lebesgue measure, with densities $\widetilde f$ and $f$, and using the observation that the restriction of $\widetilde Q$ to $\mathcal{F}_T$ is equal to $Q$, we find out that
$$\frac{D_F^*(0,T)}{B(0,T)} = -\frac{1}{\gamma}E_Q\{\ln(e^{-\gamma}(1-F_T) + F_T)\} = -\frac{1}{\gamma}E_Q\Big\{\ln\Big(e^{-\gamma}\Big(1-\int_0^Tf_t\,dt\Big) + \int_0^Tf_t\,dt\Big)\Big\},$$
and
$$\frac{D_*^F(0,T)}{B(0,T)} = \frac{1}{\gamma}E_Q\{\ln(e^{\gamma}(1-F_T) + F_T)\} = \frac{1}{\gamma}E_Q\Big\{\ln\Big(e^{\gamma}\Big(1-\int_0^Tf_t\,dt\Big) + \int_0^Tf_t\,dt\Big)\Big\}.$$
Furthermore,
$$\frac{D^0(0,T)}{B(0,T)} = E_Q\{1-\widetilde F_T\} = E_Q\Big\{1-\int_0^T\widetilde f_t\,dt\Big\}.$$
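The three ratios above can be checked numerically in the simplest case. The sketch below assumes a deterministic default distribution with a constant hazard rate, so that $F_T = 1 - e^{-hT}$ and $f_0 = h$; this specification, the parameter values, and the function names are illustrative assumptions, not part of the text. It evaluates the spreads for a very short maturity and compares them with $\gamma^{-1}(e^{\gamma}-1)f_0$, $\gamma^{-1}(1-e^{-\gamma})f_0$ and $f_0$, which are the short-maturity limits of the respective ratios.

```python
import math

def buying_ratio(gamma, F):
    """D^*_F(0,T)/B(0,T) when F_T = F is deterministic."""
    return -math.log(math.exp(-gamma) * (1.0 - F) + F) / gamma

def selling_ratio(gamma, F):
    """D_*^F(0,T)/B(0,T) when F_T = F is deterministic."""
    return math.log(math.exp(gamma) * (1.0 - F) + F) / gamma

def spread(ratio, T):
    """Spread -(1/T) ln(ratio) associated with a price ratio."""
    return -math.log(ratio) / T

def default_prob(hazard, T):
    """F_T = 1 - exp(-hazard * T); expm1 avoids cancellation for small T."""
    return -math.expm1(-hazard * T)

gamma, hazard, T = 2.0, 0.03, 1e-6   # illustrative values
F = default_prob(hazard, T)
s_buy = spread(buying_ratio(gamma, F), T)
s_sell = spread(selling_ratio(gamma, F), T)
s_plain = spread(1.0 - F, T)         # spread of the ratio 1 - F_T
print(s_sell, s_plain, s_buy)
```

With the same hazard rate used for all three ratios, the selling spread lies below the plain spread and the buying spread above it, which is the ordering behind the comparison of intensities that follows.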
Consequently,
$$s^*(0) = \frac{1}{\gamma}(e^{\gamma}-1)f_0, \qquad s_*(0) = \frac{1}{\gamma}(1-e^{-\gamma})f_0,$$
and $s^0(0) = \widetilde f_0$. Now, if we postulate, for instance, that $s_*(0) = s^0(0)$ (it would be the case if the market price is the selling indifference price), then we must have
$$\widetilde f_0 = \frac{1}{\gamma}(1-e^{-\gamma})f_0 = \frac{1}{\gamma}(1-e^{-\gamma})\gamma_0$$
so that $\widetilde\gamma_0 < \gamma_0$. Similar calculations can be made for any $t\in[0,T]$. It can be noticed that, if the market price is the selling indifference price, $\widetilde f_0$ corresponds to the risk-neutral intensity at time 0, whereas $\gamma_0$ is the historical intensity. The reader may refer to Bernis and Jeanblanc [21] for other comments.

6.2.4 Recovery Paid at Time of Default

Assume now that the recovery payment is made at time $\tau$, if $\tau\le T$. More precisely, let $\{X_t^3, t\ge0\}$ be some F-adapted process. If $\tau<T$, the payoff $X_t^3$ is paid at time $t=\tau$ and re-invested in the riskless asset. The terminal global wealth is now
(VT
v−p
(π ) + X1 )1T −1 and $\theta$ is the risk premium $\theta = \nu/\sigma$. Indeed, using Kusuoka's representation theorem (Kusuoka [166]), we know that any strictly positive martingale can be written in the form $dL_t = L_{t-}(\ell_t\,dW_t + \psi_t\,dM_t)$. The discounted price of the default-free asset is a martingale under the change of probability, hence, it is easy to check that $\ell_t = -\theta$. (We have already noticed that the restriction of any emm to the filtration F is equal to $Q$.) Let us denote $W_t^Q = W_t + \theta t$ and $\widehat M_t = M_t - \int_0^t\psi_s\xi_s\,ds$. The processes $W^Q$ and $\widehat M$ are $Q^\psi$-martingales. Then,
$$L_t = \exp\Big(-\theta W_t - \frac12\theta^2t + \int_0^t\ln(1+\psi_s)\,dH_s - \int_0^t\psi_s\xi_s\,ds\Big)$$
$$= \exp\Big(-\theta W_t^Q + \frac{\theta^2t}{2}\Big)\times\exp\Big(\int_0^t\ln(1+\psi_s)\,d\widehat M_s + \int_0^t[(1+\psi_s)\ln(1+\psi_s) - \psi_s]\xi_s\,ds\Big).$$
Hence, the relative entropy of $Q^\psi$ with respect to $P$ is equal to $E_{Q^\psi}(\ln L_T)$, i.e.,
$$H(Q^\psi|P) = E_{Q^\psi}\Big(\frac12\theta^2T + \int_0^T[(1+\psi_s)\ln(1+\psi_s) - \psi_s]\xi_s\,ds\Big).$$
From duality theory, the optimization problem
$$\inf_{\pi\in\Pi(\mathbb{G})}E_P\big(e^{-\gamma V_T^v(\pi)}e^{-\gamma X}\big)$$
reduces to maximization w.r.t. $\psi$ of
$$E_{Q^\psi}\Big(X - \frac{1}{\gamma}H(Q^\psi|P)\Big),$$
that is, maximization w.r.t. $\psi$ of
$$E_{Q^\psi}\Big(X - \frac{1}{2\gamma}\theta^2T - \frac{1}{\gamma}\int_0^T[(1+\psi_s)\ln(1+\psi_s) - \psi_s]\xi_s\,ds\Big).$$
We solve this latter problem by considering
$$dU_t = \frac{1}{\gamma}[(1+\psi_t)\ln(1+\psi_t) - \psi_t]\xi_t\,dt + \widehat u_t\,dW_t^Q + \widetilde u_t\,d\widehat M_t, \qquad U_T = X - \frac{1}{2\gamma}\theta^2T.$$
Setting $Y_t = \gamma U_t$ we obtain
$$dY_t = \big([(1+\psi_t)\ln(1+\psi_t) - \psi_t]\xi_t\big)dt + \widehat y_t\,dW_t^Q + \widetilde y_t\,d\widehat M_t, \qquad Y_T = \gamma X - \frac12\theta^2T.$$
In terms of the martingale $M$, we get
$$dY_t = \big([(1+\psi_t)\ln(1+\psi_t) - \psi_t(1+\widetilde y_t)]\xi_t\big)dt + \widehat y_t\,dW_t^Q + \widetilde y_t\,dM_t.$$
The solution is obtained by maximization of the drift in the above equation w.r.t. $\psi$, which leads to $\ln(1+\psi_s) = \widetilde y_s$. Consequently, the BSDE reads
$$dY_t = -(e^{\widetilde y_t} - 1 - \widetilde y_t)\xi_t\,dt + \widehat y_t\,dW_t^Q + \widetilde y_t\,dM_t, \qquad Y_T = \gamma X - \frac12\theta^2T,$$
and setting $Z_t^* = \exp(-Y_t)$ we conclude that dZt∗
1 Q ∗ (e#yt − 1)dMt , yt dWt + Zt− = Zt∗# yt2 dt − Zt∗# 2
∗ zt = Zt− (e#yt − 1) yt , or, denoting # zt = −Zt∗#
dZt∗ =
ZT∗
1 2 = exp −γ X + θ T , 2
1 ZT∗ = exp −γ X + θ 2 T , 2
1 2 Q # zt dWt +# zt dMt , z dt +# 2Zt∗ t
1 2 (T −t)
which is equivalent to (6.22). (Note that Zt = Zt∗ e− 2 θ
.)
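The pointwise optimisation over $\psi$ used in the derivation above can be checked numerically. The sketch below takes the $\xi_t$-coefficient of the drift of $Y$, namely $(1+\psi)\ln(1+\psi) - \psi(1+\widetilde y)$, and confirms that its stationary point is given by $\ln(1+\psi) = \widetilde y$ and that the value at this point is $-(e^{\widetilde y} - 1 - \widetilde y)$. The function names and the test values of $\widetilde y$ are assumptions made for the illustration.

```python
import math

def drift_integrand(psi, y):
    """(1+psi) ln(1+psi) - psi (1+y), defined for psi > -1."""
    return (1.0 + psi) * math.log1p(psi) - psi * (1.0 + y)

def stationary_psi(y):
    """Solution of ln(1+psi) = y, i.e. psi = e^y - 1."""
    return math.expm1(y)

def stationary_value(y):
    """Value of the integrand at the stationary point: -(e^y - 1 - y)."""
    return -(math.exp(y) - 1.0 - y)

def central_difference(y, psi, h=1e-6):
    """Numerical derivative of the integrand with respect to psi."""
    return (drift_integrand(psi + h, y) - drift_integrand(psi - h, y)) / (2.0 * h)

for y in (-1.0, 0.5, 2.0):
    psi_star = stationary_psi(y)
    print(y, psi_star, drift_integrand(psi_star, y), central_difference(y, psi_star))
```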
6.3.2 Indifference Buying and Selling Prices

6.3.2.1 Particular Case: Attainable Claims

Assume, as before, that $r = 0$ and let us check that the indifference buying price is the hedging price in case of attainable claims. Assume that a claim $X$ is $\mathcal{F}_T$-measurable. By virtue of the predictable representation theorem, there exists a pair $(x, \widehat x)$, where $x$ is a constant and $\widehat x_t$ is an F-adapted process, such that $X = x + \int_0^T\widehat x_u\,dW_u^Q$, where $W_t^Q = W_t + \theta t$. Here $x = E_QX$ is the arbitrage price of $X$ and the replicating portfolio is obtained through $\widehat x$. Hence, the time $t$ value of $X$ is $X_t = x + \int_0^t\widehat x_u\,dW_u^Q$. Then $dX_t = \widehat x_t\,dW_t^Q$ and the process
$$Z_t = e^{-\theta^2(T-t)/2}e^{-\gamma X_t}$$
satisfies
$$dZ_t = Z_t\Big(\Big(\frac12\theta^2 + \frac12\gamma^2\widehat x_t^2\Big)dt + \gamma\widehat x_t\,dW_t^Q\Big) = \frac{1}{2\sigma^2Z_t}(\nu Z_t + \sigma\gamma Z_t\widehat x_t)^2\,dt + \gamma Z_t\widehat x_t\,dW_t, \qquad Z_T = e^{-\gamma X}.$$
Hence $(Z_t, \gamma Z_t\widehat x_t, 0)$ is the solution of (6.22) with the terminal condition $e^{-\gamma X}$, and $Z_0 = e^{-\theta^2T/2}e^{-\gamma x}$. Note that, for $X = 0$, we get $Z_0 = e^{-\theta^2T/2}$, therefore,
$$\inf_{\pi\in\Pi(\mathbb{G})}E_P\{e^{-\gamma V_T^v(\pi)}\} = e^{-\gamma v}e^{-\theta^2T/2}.$$
The G-indifference buying price of $X$ is the value of $p$ such that
$$\inf_{\pi\in\Pi(\mathbb{G})}E_P\{e^{-\gamma V_T^v(\pi)}\} = \inf_{\pi\in\Pi(\mathbb{G})}E_P\{e^{-\gamma(V_T^{v-p}(\pi)+X)}\},$$
that is,
$$e^{-\gamma v}e^{-\theta^2T/2} = e^{-\gamma(v-p+E_QX)}e^{-\theta^2T/2}.$$
We conclude easily that $p_G^*(X) = E_QX$. Similar arguments show that $p_*^G(X) = E_QX$.
6.3.2.2 General Case

Assume now that a claim $X$ is $\mathcal{G}_T$-measurable. Assuming that the process $Z$ introduced in (6.22) is strictly positive, we can use its logarithm. Let us denote $\widehat\psi_t = \widehat z_t/Z_t$, $\widetilde\psi_t = \widetilde z_t/Z_t$, and
$$\kappa_t = \frac{\widetilde\psi_t}{\ln(1+\widetilde\psi_t)} \ge 0.$$
Then we get
$$d(\ln Z_t) = \frac12\theta^2\,dt + \widehat\psi_t\,dW_t^Q + \ln(1+\widetilde\psi_t)\big(dM_t + \xi_t(1-\kappa_t)\,dt\big),$$
and thus
$$d(\ln Z_t) = \frac12\theta^2\,dt + \widehat\psi_t\,dW_t^Q + \ln(1+\widetilde\psi_t)\,d\widehat M_t,$$
where
$$d\widehat M_t = dM_t + \xi_t(1-\kappa_t)\,dt = dH_t - \xi_t\kappa_t\,dt.$$
The process $\widehat M$ is a martingale under the probability measure $\widehat Q$ defined as $d\widehat Q|_{\mathcal{G}_t} = \widehat\eta_t\,dP|_{\mathcal{G}_t}$, where $\widehat\eta$ satisfies
$$d\widehat\eta_t = -\widehat\eta_{t-}\big(\theta\,dW_t + \xi_t(1-\kappa_t)\,dM_t\big)$$
with $\widehat\eta_0 = 1$.

Proposition 6.2 The G-indifference buying price of $X$ with respect to the exponential utility is the real number $p$ such that $e^{-\gamma(v-p)}Z_0^X = e^{-\gamma v}Z_0^0$, that is, $p_G^*(X) = \gamma^{-1}\ln(Z_0^0/Z_0^X)$ or, equivalently, $p_G^*(X) = E_{\widehat Q}X$.

Our previous study establishes that the dynamic hedging price of a claim $X$ is the process $X_t = E_{\widehat Q}\{X|\mathcal{G}_t\}$. This price is the expectation of the payoff, under some martingale measure, as is any price in the range of no-arbitrage prices.

Remark. All the results presented in this section remain valid if $\nu$ and $\sigma$ are adapted processes.

Comments: See also the papers of Stricker [257] and Fujiwara and Miyahara [101] where the link between indifference prices and minimal entropy measure is presented.
6.4 QUADRATIC HEDGING

We work under the same hypothesis as before; in particular, the wealth process follows
$$dV_t^v(\pi) = \pi_t(\nu\,dt + \sigma\,dW_t), \qquad V_0^v(\pi) = v.$$
In the last part of this section we shall study a more general case. The objective of this section is to examine the issue of quadratic pricing and hedging. Specifically, for a given $P$-square-integrable claim $X\in\mathcal{G}_T$, we study the following problems:

• For a given initial endowment $v$, solve the minimization problem:
$$\min_{\pi}E_P\{(V_T^v(\pi) - X)^2\}.$$
A solution to this problem provides the portfolio that, among the portfolios with a given initial wealth, has the closest terminal wealth to a given claim $X$, in the sense of $L^2$-norm under the historical probability $P$. The solution of this problem exists, since the set of stochastic integrals of the form $\int_0^T\phi_s\,dS_s$ is closed in $L^2$.

• Solve the minimization problem:
$$\min_{\pi,v}E_P\{(V_T^v(\pi) - X)^2\}.$$
The optimal value of $v$ is called the quadratic hedging price and the optimal $\pi$ the quadratic hedging strategy.

The quadratic hedging problem was examined in a fairly general framework of incomplete markets by means of BSDEs in several papers; see, for example, Mania [178], Mania and Tevzadze [181], Bobrovnytska and Schweizer [30], Hu and Zhou [132], or Lim [173] (2004). Since this list is by no means exhaustive, the interested reader is referred to the references quoted in the above-mentioned papers. The reader may refer to Bielecki et al. [260] for a study of the same problem under a constraint on the expectation. Also, some additional references can be found in that paper.

6.4.1 Quadratic Hedging with F-Adapted Strategies

We shall first solve, for a given initial endowment $v$, the following minimization problem:
$$\min_{\pi\in\Pi(\mathbb{F})}E_P\{(V_T^v(\pi) - X)^2\},$$
where $X$ is given as
$$X = X_1\mathbf{1}_{\{\tau>T\}} + X_2\mathbf{1}_{\{\tau\le T\}}$$
for some $\mathcal{F}_T$-measurable, $P$-square-integrable, random variables $X_1$ and $X_2$. Using the same approach as in Section 6.2.1, we define
$$J_X(y) = (y-X_1)^2(1-F_T) + (y-X_2)^2F_T$$
and its derivative
$$J_X'(y) = 2[(y-X_1)(1-F_T) + (y-X_2)F_T] = 2[y - X_1(1-F_T) - X_2F_T].$$
Hence, the inverse of $J_X'(y)$ is
$$I_X(z) = \frac12z + X_1(1-F_T) + X_2F_T,$$
and thus the optimal terminal wealth equals
$$V_T^{v,*} = \frac12\lambda^*\zeta_T + X_1(1-F_T) + X_2F_T,$$
where $\lambda^*$ is specified through the budget constraint:
$$E_P\{\zeta_TV_T^{v,*}\} = \frac12\lambda^*E_P\{\zeta_T^2\} + E_P\{\zeta_TX_1(1-F_T)\} + E_P\{\zeta_TX_2F_T\} = v.$$
The optimal strategy is the one that hedges the $\mathcal{F}_T$-measurable contingent claim
$$\frac{\lambda^*}{2}\zeta_T + X_1(1-F_T) + X_2F_T = e^{-\theta^2T}(v - E_Q(X))\zeta_T + X_1(1-F_T) + X_2F_T.$$
We deduce that
$$\min_{\pi}E_P\{(V_T^v - X)^2\} = E_P\Big\{\Big(\frac12\lambda^*\zeta_T + X_1(1-F_T) + X_2F_T - X_1\Big)^2(1-F_T)\Big\} + E_P\Big\{\Big(\frac12\lambda^*\zeta_T + X_1(1-F_T) + X_2F_T - X_2\Big)^2F_T\Big\}$$
$$= \frac14(\lambda^*)^2E_P\{\zeta_T^2\} + E_P\{(X_1-X_2)^2F_T(1-F_T)\}$$
$$= \frac{1}{E_P\{\zeta_T^2\}}\big(v - E_P\{\zeta_T(X_1 + F_T(X_2-X_1))\}\big)^2 + E_P\{(X_1-X_2)^2F_T(1-F_T)\}.$$
It remains to minimize over $v$ the right-hand side, which is now simple. Therefore, we obtain the following result.

Proposition 6.3 If one restricts attention to F-adapted strategies, the quadratic hedging price of the claim $X = X_1\mathbf{1}_{\{\tau>T\}} + X_2\mathbf{1}_{\{\tau\le T\}}$ equals
$$E_P\{\zeta_T(X_1 + F_T(X_2-X_1))\} = E_Q\{X_1(1-F_T) + F_TX_2\}.$$
The optimal quadratic hedging of $X$ is the strategy which replicates the $\mathcal{F}_T$-measurable contingent claim $X_1(1-F_T) + F_TX_2$.

Let us now examine the case of a generic $\mathcal{G}_T$-measurable random variable $X$. Here, we shall only examine the solution of the second problem introduced above, that is,
$$\min_{v,\pi}E_P\{(V_T^v(\pi) - X)^2\}.$$
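Returning briefly to the two-state claim of Proposition 6.3, the pointwise computation behind it can be checked with a few lines of code. The sketch below fixes numerical values for $F_T$, $X_1$, $X_2$ (an assumption made only for the illustration; the function names are hypothetical) and verifies that $I_X$ inverts $J_X'$, that $J_X$ is minimal at $X_1(1-F_T) + X_2F_T$, and that the minimal value is the residual term $(X_1-X_2)^2F_T(1-F_T)$.

```python
def J(y, F, x1, x2):
    """J_X(y) = (y - x1)^2 (1 - F) + (y - x2)^2 F."""
    return (y - x1) ** 2 * (1.0 - F) + (y - x2) ** 2 * F

def J_prime(y, F, x1, x2):
    """Derivative of J_X: 2 [y - x1 (1 - F) - x2 F]."""
    return 2.0 * (y - x1 * (1.0 - F) - x2 * F)

def I(z, F, x1, x2):
    """Inverse of J_X': the y solving J_X'(y) = z."""
    return 0.5 * z + x1 * (1.0 - F) + x2 * F

def residual_error(F, x1, x2):
    """(x1 - x2)^2 F (1 - F), the part of the error that cannot be hedged."""
    return (x1 - x2) ** 2 * F * (1.0 - F)

F, x1, x2 = 0.3, 1.0, 0.4        # illustrative values
y_star = I(0.0, F, x1, x2)       # minimiser of J_X
print(y_star, J(y_star, F, x1, x2), residual_error(F, x1, x2))
```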
As explained in Bielecki et al. [260], this problem is essentially equivalent to a problem where we restrict our attention to the terminal wealth so that we may reduce the problem to $\min_{V\in\mathcal{F}_T}E_P\{(V-X)^2\}$. From the properties of conditional expectations, we have
$$\min_{V\in\mathcal{F}_T}E_P\{(V-X)^2\} = E_P\{(E_P\{X|\mathcal{F}_T\} - X)^2\},$$
and the initial value of the strategy with terminal value $E_P\{X|\mathcal{F}_T\}$ is
$$E_P\{\zeta_TE_P\{X|\mathcal{F}_T\}\} = E_P\{\zeta_TX\}$$
(recall that $\zeta$ is the deflator process defined at the end of the preliminary in the first section). In essence, the latter statement is a consequence of the completeness of the default-free market model. Indeed, the fact that the conditional expectation $E_P\{X|\mathcal{F}_T\}$ can be written as a stochastic integral w.r.t. $S$ follows directly from the completeness of the default-free market. In conclusion, the quadratic hedging price equals $E_P\{\zeta_TX\} = E_QX$ and the quadratic hedging strategy is the replicating strategy of the attainable claim $E_P\{X|\mathcal{F}_T\}$ associated with $X$.

6.4.2 Quadratic Hedging with G-adapted Strategies

Similarly, as in the previous subsection, we assume here that the price process of the underlying asset obeys $dS_t = S_t(\nu\,dt + \sigma\,dW_t)$, so that the wealth process follows
$$dV_t^v(\pi) = \pi_t(\nu\,dt + \sigma\,dW_t), \qquad V_0^v(\pi) = v.$$
We shall first solve, for a given initial endowment $v$, the following minimization problem
$$\min_{\pi\in\Pi(\mathbb{G})}E_P\{(V_T^v(\pi) - X)^2\}.$$
As discussed in Bielecki et al. [260], one way of solving this problem is to project the random variable $X$ on the closed set of stochastic integrals of the form $\int_0^T\varphi_s\,dS_s$. Here, we present an alternative approach. We are looking for G-adapted processes $X$, $\Theta$, and $\Psi$ such that the process
$$J_t(\pi,v) = (V_t^v(\pi) - X_t)^2\Theta_t + \Psi_t, \qquad \forall t\in[0,T] \tag{6.24}$$
is a G-submartingale for any G-adapted trading strategy $\pi$ and a G-martingale for some strategy $\pi^*$. In addition, we require that $X_T = X$, $\Theta_T = 1$, $\Psi_T = 0$ so that $J_T(\pi,v) = (V_T^v(\pi) - X)^2$. Let us assume that the dynamics of these processes are of the form
$$dX_t = x_t\,dt + \widehat x_t\,dW_t + \widetilde x_t\,dM_t, \tag{6.25}$$
$$d\Theta_t = \Theta_{t-}(\vartheta_t\,dt + \widehat\vartheta_t\,dW_t + \widetilde\vartheta_t\,dM_t), \tag{6.26}$$
$$d\Psi_t = \psi_t\,dt + \widehat\psi_t\,dW_t + \widetilde\psi_t\,dM_t, \tag{6.27}$$
where the drifts $x_t$, $\vartheta_t$, and $\psi_t$ are yet to be determined. From Itô's formula, we obtain (recall that $\xi_t = \gamma_t\mathbf{1}_{\{\tau>t\}}$)
$$d(V_t - X_t)^2 = 2(V_t-X_t)(\pi_t\sigma - \widehat x_t)\,dW_t - 2(V_t - X_{t-})\widetilde x_t\,dM_t + [(V_t - X_{t-} - \widetilde x_t)^2 - (V_t - X_{t-})^2]\,dM_t$$
$$+ \big(2(V_t-X_t)(\pi_t\nu - x_t) + (\pi_t\sigma - \widehat x_t)^2 + \xi_t[(V_t - X_t - \widetilde x_t)^2 - (V_t-X_t)^2]\big)dt,$$
where we denote $V_t = V_t^v(\pi)$. Then, using integration by parts formula, we obtain by straightforward calculations
$$dJ_t(\pi) = k(t,\pi_t,\vartheta_t,x_t,\psi_t)\,dt + \text{martingale},$$
where
$$k(t,\pi_t,\vartheta_t,x_t,\psi_t) = \psi_t + \Theta_t\big[\vartheta_t(V_t-X_t)^2 + 2(V_t-X_t)[(\pi_t\nu - x_t) + \widehat\vartheta_t(\pi_t\sigma - \widehat x_t) + \xi_t\widetilde x_t]$$
$$+ (\pi_t\sigma - \widehat x_t)^2 + \xi_t(\widetilde\vartheta_t+1)[(V_t - X_t - \widetilde x_t)^2 - (V_t-X_t)^2]\big].$$
The process $\{J_t(\pi), t\ge0\}$ is a (local) martingale if and only if its drift term $k(t,\pi_t,x_t,\vartheta_t,\psi_t)$ equals 0 for every $t\in[0,T]$. In the first step, for any $t\in[0,T]$ we find $\pi_t^*$ such that the minimum of $k(t,\pi_t,x_t,\vartheta_t,\psi_t)$ is attained. Subsequently, we shall choose the processes $x = x^*$, $\vartheta = \vartheta^*$ and $\psi = \psi^*$ in such a way that $k(t,\pi_t^*,x_t^*,\vartheta_t^*,\psi_t^*) = 0$. This choice will imply that $k(t,\pi_t,x_t^*,\vartheta_t^*,\psi_t^*)\ge0$ for any trading strategy $\pi$ and any $t\in[0,T]$. The strategy $\pi^*$ that minimizes $k(t,\pi_t,x_t,\vartheta_t,\psi_t)$ is the solution of the following equation:
$$(V_t^v(\pi) - X_t)(\nu + \widehat\vartheta_t\sigma) + \sigma(\pi_t\sigma - \widehat x_t) = 0, \qquad \forall t\in[0,T].$$
Hence, the strategy $\pi^*$ is implicitly given by
$$\pi_t^* = \sigma^{-1}\widehat x_t - \sigma^{-2}(\nu + \widehat\vartheta_t\sigma)(V_t^v(\pi^*) - X_t) = A_t - B_t(V_t^v(\pi^*) - X_t),$$
where we denote
$$A_t = \sigma^{-1}\widehat x_t, \qquad B_t = \sigma^{-2}(\nu + \widehat\vartheta_t\sigma).$$
After some computations, we see that the drift term of the process $J(\pi^*)$ admits the following representation:
$$k(t,\pi_t,\vartheta_t,x_t,\psi_t) = \psi_t + \Theta_t(V_t-X_t)^2(\vartheta_t - \sigma^2B_t^2) + 2\Theta_t(V_t-X_t)(\sigma^2A_tB_t - \widehat\vartheta_t\widehat x_t - \widetilde\vartheta_t\widetilde x_t\xi_t - x_t) + \Theta_t\xi_t(\widetilde\vartheta_t+1)\widetilde x_t^2.$$
From now on, we shall assume that the auxiliary processes $\vartheta$, $x$, and $\psi$ are chosen as follows:
$$\vartheta_t = \vartheta_t^* = \sigma^2B_t^2,$$
$$x_t = x_t^* = \sigma^2A_tB_t - \widehat\vartheta_t\widehat x_t - \widetilde\vartheta_t\widetilde x_t\xi_t,$$
$$\psi_t = \psi_t^* = -\Theta_t\xi_t(\widetilde\vartheta_t+1)\widetilde x_t^2.$$
Straightforward computation verifies that if the drift coefficients $\vartheta$, $x$, $\psi$ in (6.25)–(6.27) are chosen as above, then the drift term in dynamics of $J$ is always nonnegative, and it is equal to 0 for $\pi_t^* = A_t - B_t(V_t^v(\pi^*) - X_t)$. Our next goal is to solve Equations (6.25)–(6.27). Since $\vartheta_t = \sigma^2B_t^2$, the three-dimensional process $(\Theta, \widehat\vartheta, \widetilde\vartheta)$ is the unique solution to the linear BSDE (6.26)
$$d\Theta_t = \Theta_t\big(\sigma^{-2}(\nu + \widehat\vartheta_t\sigma)^2\,dt + \widehat\vartheta_t\,dW_t + \widetilde\vartheta_t\,dM_t\big), \qquad \Theta_T = 1.$$
It is obvious that a solution is
$$\widehat\vartheta_t = 0, \qquad \widetilde\vartheta_t = 0, \qquad \Theta_t = \exp(-\theta^2(T-t)), \qquad \forall t\in[0,T]. \tag{6.28}$$
The three-dimensional process $(X, \widehat x, \widetilde x)$ solves Equation (6.25) with $x_t = x_t^* = \sigma^2A_t(\nu/\sigma^2) = \theta\widehat x_t$. This means that $(X, \widehat x, \widetilde x)$ is the unique solution to the linear BSDE
$$dX_t = \theta\widehat x_t\,dt + \widehat x_t\,dW_t + \widetilde x_t\,dM_t, \qquad X_T = X.$$
The unique solution to the last equation is $X_t = E_Q\{X|\mathcal{G}_t\}$. The components $\widehat x$ and $\widetilde x$ are given by the integral representation of the G-martingale $\{X_t, t\ge0\}$ with respect to $W^Q$ and $M$, where $W_t^Q = W_t + \theta t$. Notice also that since $\widehat\vartheta = 0$, the optimal portfolio $\pi^*$ is given by the feedback formula
$$\pi_t^* = \sigma^{-1}\big(\widehat x_t - \theta(V_t^v(\pi^*) - X_t)\big).$$
Finally, since $\widetilde\vartheta = 0$, we have $\psi_t = -\xi_t\widetilde x_t^2\Theta_t$. Therefore, we can solve explicitly the BSDE (6.27) for the process $\Psi$. Indeed, we are now looking for a three-dimensional process $(\Psi, \widehat\psi, \widetilde\psi)$, which is the unique solution of the BSDE
$$d\Psi_t = -\Theta_t\xi_t\widetilde x_t^2\,dt + \widehat\psi_t\,dW_t + \widetilde\psi_t\,dM_t, \qquad \Psi_T = 0.$$
Noting that the process
$$\Psi_t + \int_0^t\Theta_s\xi_s\widetilde x_s^2\,ds$$
is a G-martingale under $P$ with terminal value $\int_0^T\Theta_s\xi_s\widetilde x_s^2\,ds$, we obtain the value of $\Psi$ in a closed form:
$$\Psi_t = E_P\Big\{\int_t^T\Theta_s\xi_s\widetilde x_s^2\,ds\,\Big|\,\mathcal{G}_t\Big\} = \int_t^Te^{-\theta^2(T-s)}E_P\{\gamma_s\widetilde x_s^2\mathbf{1}_{\{\tau>s\}}\,|\,\mathcal{G}_t\}\,ds$$
$$= \mathbf{1}_{\{\tau>t\}}\int_t^Te^{-\theta^2(T-s)}E_P\{\gamma_s\widetilde x_s^2e^{\Gamma_t-\Gamma_s}\,|\,\mathcal{F}_t\}\,ds, \tag{6.29}$$
where we have identified the process $\widetilde x$ with its F-adapted version (recall that any G-predictable process is equal, prior to default, to an F-predictable process). Substituting (6.28) and (6.29) in (6.24), we conclude that for a fixed $v$ the value function for our problem is $J_t^*(v) = J_t(\pi^*,v)$, where in turn
$$J_t(\pi^*,v) = (V_t^v(\pi^*) - X_t)^2e^{-\theta^2(T-t)} + \mathbf{1}_{\{\tau>t\}}\int_t^Te^{-\theta^2(T-s)}E_P\{\gamma_s\widetilde x_s^2e^{\Gamma_t-\Gamma_s}\,|\,\mathcal{F}_t\}\,ds.$$
In particular,
$$J_0^*(v) = e^{-\theta^2T}\Big((v-X_0)^2 + E_P\Big\{\int_0^Te^{\theta^2s}\gamma_s\widetilde x_s^2e^{-\Gamma_s}\,ds\Big\}\Big).$$
The quadratic hedging price, say $v^*$, is obtained by minimizing $J_0^*(v)$ with respect to $v$. From the last formula, it is obvious that the quadratic hedging price is $v^* = X_0 = E_QX$. We are in the position to formulate the main result of this section. A corresponding theorem for a default-free financial model was established by Kohlmann and Zhou [160].

Proposition 6.4 Let a claim $X$ be $\mathcal{G}_T$-measurable and square-integrable under $P$. The optimal trading strategy $\pi^*$, which solves the quadratic problem
$$\min_{\pi\in\Pi(\mathbb{G})}E_P\{(V_T^v(\pi) - X)^2\},$$
is given by the feedback formula
$$\pi_t^* = \sigma^{-1}\big(\widehat x_t - \theta(V_t^v(\pi^*) - X_t)\big),$$
where $X_t = E_Q(X|\mathcal{G}_t)$ for every $t\in[0,T]$, and the process $\widehat x_t$ is specified by
$$dX_t = \widehat x_t\,dW_t^Q + \widetilde x_t\,dM_t.$$
The quadratic hedging price of $X$ is $E_QX$.

6.4.2.1 Example: Survival Claim

Let us consider a simple survival claim $X = \mathbf{1}_{\{\tau>T\}}$, and let us assume that $\Gamma$ is deterministic, specifically, $\Gamma(t) = \int_0^t\gamma(s)\,ds$. In that case, from the representation theorem (see Bielecki and Rutkowski [25], p. 159), we have $dX_t = \widetilde x_t\,dM_t$ with $\widetilde x_t = -e^{\Gamma(t)-\Gamma(T)}$. Hence,
$$\Psi_t = E_P\Big\{\int_t^T\Theta_s\xi_s\widetilde x_s^2\,ds\,\Big|\,\mathcal{G}_t\Big\} = E_P\Big\{\int_t^T\Theta_s\gamma(s)\mathbf{1}_{\{\tau>s\}}e^{2\Gamma(s)-2\Gamma(T)}\,ds\,\Big|\,\mathcal{G}_t\Big\}$$
$$= \mathbf{1}_{\{\tau>t\}}e^{\Gamma(t)-2\Gamma(T)}E_P\Big\{\int_t^Te^{-\theta^2(T-s)}\gamma(s)e^{\Gamma(s)}\,ds\,\Big|\,\mathcal{F}_t\Big\}$$
$$= \mathbf{1}_{\{\tau>t\}}e^{\Gamma(t)-2\Gamma(T)}\int_t^Te^{-\theta^2(T-s)}\gamma(s)e^{\Gamma(s)}\,ds.$$
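For a constant intensity the last integral can be evaluated in closed form, which gives a simple way to compare the two residual hedging errors at time 0. The sketch below assumes $\gamma(s)\equiv\lambda$ (so $\Gamma(s) = \lambda s$ and $1-F_T = e^{-\lambda T}$), computes the time-0 value of the integral term by a midpoint rule and by the closed form $e^{-2\lambda T}\lambda(e^{\lambda T} - e^{-\theta^2T})/(\theta^2+\lambda)$, and compares it with the term $F_T(1-F_T)$ obtained in Section 6.4.1 for $X_1 = 1$, $X_2 = 0$. The constant-intensity assumption, the parameter values, and the function names are illustrative only.

```python
import math

def psi0_closed_form(lam, theta, T):
    """e^{-2 lam T} * lam * (e^{lam T} - e^{-theta^2 T}) / (theta^2 + lam)."""
    num = lam * (math.exp(lam * T) - math.exp(-theta ** 2 * T))
    return math.exp(-2.0 * lam * T) * num / (theta ** 2 + lam)

def psi0_quadrature(lam, theta, T, n=20000):
    """Midpoint rule for e^{-2 lam T} * int_0^T e^{-theta^2 (T-s)} lam e^{lam s} ds."""
    h = T / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += math.exp(-theta ** 2 * (T - s)) * lam * math.exp(lam * s)
    return math.exp(-2.0 * lam * T) * total * h

def f_adapted_error(lam, T):
    """F_T (1 - F_T) with 1 - F_T = e^{-lam T}: residual term for F-adapted strategies."""
    survival = math.exp(-lam * T)
    return survival * (1.0 - survival)

lam, theta, T = 0.05, 0.4, 5.0   # illustrative values
print(psi0_closed_form(lam, theta, T), psi0_quadrature(lam, theta, T), f_adapted_error(lam, T))
```

The two residual terms coincide when $\theta = 0$, and the G-adapted one is strictly smaller as soon as $\theta\neq0$.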
One can check that, at time 0, the value function is indeed smaller than the one obtained with F-adapted portfolios.

6.4.2.2 Case of an Attainable Claim

Assume now that a claim $X$ is $\mathcal{F}_T$-measurable. Then $X_t = E_Q(X|\mathcal{G}_t)$ is the price of $X$, and it satisfies $dX_t = \widehat x_t\,dW_t^Q$. The optimal strategy is, in a feedback form,
$$\pi_t^* = \sigma^{-1}\big(\widehat x_t - \theta(V_t - X_t)\big),$$
and the associated wealth process satisfies
$$dV_t = \pi_t^*(\nu\,dt + \sigma\,dW_t) = \pi_t^*\sigma\,dW_t^Q = \sigma^{-1}\big(\sigma\widehat x_t - \nu(V_t - X_t)\big)\,dW_t^Q.$$
Therefore,
$$d(V_t - X_t) = -\theta(V_t - X_t)\,dW_t^Q.$$
Hence, if we start with an initial wealth equal to the arbitrage price $E_QX$ of $X$, then we see that $V_t = X_t$ for every $t\in[0,T]$, as expected.
6.4.2.3 Indifference Price

Let us emphasize that the indifference price has no real meaning here, since the problem $\min E_P((V_T^v)^2)$ has no financial interpretation. We have studied in Bielecki et al. [260] a more pertinent problem, with a constraint on the expected value of $V_T^v$ under $P$. Nevertheless, from a mathematical point of view, the indifference price would be the value of $p$ such that
$$(v^2 - (v-p)^2) = \int_0^Te^{\theta^2s}E_P(\gamma_s\widetilde x_s^2e^{-\Gamma_s})\mathbf{1}_{\{\tau>t\}}\,ds.$$
In the case of the example studied in Section 6.4.2.1, the indifference price would be the nonnegative value of $p$ such that
$$2vp - p^2 = e^{-2\Gamma_T}\int_0^Te^{\theta^2s}\gamma_se^{\Gamma_s}\,ds.$$
Let us also mention that our results are different from results of Lim [173]. Indeed, Lim studies a model with Poisson component, and thus in his approach the intensity of this process does not vanish after the first jump.
6.4.3 Jump-Dynamics of Price

We assume here that the price process follows
$$dS_t = S_{t-}(\nu\,dt + \sigma\,dW_t + \varphi\,dM_t), \qquad S_0 > 0,$$
where $M$ is the compensated martingale of the default process and where the constant $\varphi$ satisfies $\varphi > -1$ so that the price $S_t$ is strictly positive. Hence, the primary market, where the savings account and the asset $S$ are traded, is arbitrage free, but incomplete (in general). It follows that the wealth process satisfies
$$dV_t^v(\pi) = \pi_t(\nu\,dt + \sigma\,dW_t + \varphi\,dM_t), \qquad V_0^v(\pi) = v.$$
As in the previous subsection, our aim is, for a given initial endowment $v$, to solve the minimization problem:
$$\min_{\pi}E_P\{(V_T^v(\pi) - X)^2\}.$$
In order to characterize the value function, we proceed as before. That is, we are looking for processes $X$, $\Theta$, and $\Psi$ such that the process (for simplicity, we write $V_t$ in place of $V_t^v(\pi)$)
$$J(t,V_t) = (V_t - X_t)^2\Theta_t + \Psi_t$$
is a submartingale for any $\pi$ and a martingale for some $\pi^*$, and such that $\Psi_T = 0$, $X_T = X$, $\Theta_T = 1$. (Note that Mania and Tevzadze [181] used a similar approach for continuous processes, with a value function of the form $J_t = \Phi_0(t) + \Phi_1(t)V_t + \Phi_2(t)V_t^2$.) Let us assume that the dynamics of these processes are of the form
$$dX_t = f_t\,dt + \widehat x_t\,dW_t + \widetilde x_t\,dM_t, \tag{6.30}$$
$$d\Theta_t = \Theta_t(\vartheta_t\,dt + \widehat\vartheta_t\,dW_t + \widetilde\vartheta_t\,dM_t), \tag{6.31}$$
$$d\Psi_t = \psi_t\,dt + \widehat\psi_t\,dW_t + \widetilde\psi_t\,dM_t, \tag{6.32}$$
where the drifts $f_t$, $\vartheta_t$, and $\psi_t$ have to be determined. From Itô's formula we obtain
$$d(V_t - X_t)^2 = 2(V_t-X_t)(\pi_t\sigma - \widehat x_t)\,dW_t + \big[(V_t + \pi_t\varphi - X_t - \widetilde x_t)^2 - (V_t-X_t)^2\big]\,dM_t$$
$$+ \big(2(V_t-X_t)(\pi_t\mu - f_t) + (\pi_t\sigma - \widehat x_t)^2 + \xi_t[(V_t + \pi_t\varphi - X_t - \widetilde x_t)^2 - (V_t-X_t)^2 - 2(V_t-X_t)(\pi_t\varphi - \widetilde x_t)]\big)dt.$$
Process $\Theta_t(V_t-X_t)^2 + \Psi_t$ is a (local) martingale iff $k(\pi_t,f_t,\vartheta_t,\psi_t) = 0$ for all $t$, where
$$k(\pi,\vartheta,f,\psi) = \psi + \Theta_t\big[\vartheta_t(V_t-X_t)^2 + 2(V_t-X_t)\big((\pi\mu - f) + \widehat\vartheta_t(\pi\sigma - \widehat x_t) - \xi_t(\pi\varphi - \widetilde x_t)\big)$$
$$+ (\pi\sigma - \widehat x_t)^2 + \xi_t(\widetilde\vartheta_t+1)\big((V_t + \pi\varphi - X_t - \widetilde x_t)^2 - (V_t-X_t)^2\big)\big].$$
In the first step, we find $\pi^{\star}$ such that the maximum of $k(\pi)$ is obtained. Then, one defines $(f^*,\vartheta^*,\psi^*)$ such that $k(\pi^{\star},f^*,\vartheta^*,\psi^*) = 0$. This implies that, for any $\pi$, $k(\pi,f^*,\vartheta^*,\psi^*)\le0$, and that $k(\pi^{\star},f^*,\vartheta^*,\psi^*) = 0$. The optimal $\pi^{\star}$ is the solution of
$$(V_t-X_t)(\mu - \xi_t\varphi + \widehat\vartheta_t\sigma) + \sigma(\pi\sigma - \widehat x_t) + \xi_t(\widetilde\vartheta_t+1)\varphi(V_t + \pi\varphi - X_t - \widetilde x_t) = 0,$$
hence
$$\pi_t^{\star} = \frac{1}{\sigma^2 + \varphi^2\xi_t(\widetilde\vartheta_t+1)}\Big(\big(\sigma\widehat x_t + \xi_t\varphi(\widetilde\vartheta_t+1)\widetilde x_t\big) - (\mu + \widehat\vartheta_t\sigma + \xi_t\varphi\widetilde\vartheta_t)(V_t-X_t)\Big) = A_t - B_t(V_t-X_t),$$
with
$$A_t = \big(\sigma\widehat x_t + \xi_t\varphi(\widetilde\vartheta_t+1)\widetilde x_t\big)\Delta_t^{-1}, \qquad B_t = (\mu + \widehat\vartheta_t\sigma + \xi_t\varphi\widetilde\vartheta_t)\Delta_t^{-1}, \qquad \Delta_t = \sigma^2 + \varphi^2\xi_t(\widetilde\vartheta_t+1).$$
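The feedback form of the optimal strategy can be verified by direct substitution into the first-order condition. The sketch below does this for one arbitrary set of numerical values (all of them illustrative assumptions, as are the function and argument names), and also checks that with $\varphi = 0$ and a vanishing Brownian coefficient of $\Theta$ the formula reduces to the feedback rule $\sigma^{-1}(\widehat x_t - \theta(V_t - X_t))$ of the continuous case, where $\theta$ is the drift divided by $\sigma$.

```python
def first_order_condition(pi, *, V, X, mu, sigma, phi, xi, x_hat, x_til, th_hat, th_til):
    """Left-hand side of the equation characterising the optimal pi."""
    return ((V - X) * (mu - xi * phi + th_hat * sigma)
            + sigma * (pi * sigma - x_hat)
            + xi * (th_til + 1.0) * phi * (V + pi * phi - X - x_til))

def optimal_pi(*, V, X, mu, sigma, phi, xi, x_hat, x_til, th_hat, th_til):
    """pi = A - B (V - X) with A, B and Delta as defined in the text."""
    delta = sigma ** 2 + phi ** 2 * xi * (th_til + 1.0)
    A = (sigma * x_hat + xi * phi * (th_til + 1.0) * x_til) / delta
    B = (mu + th_hat * sigma + xi * phi * th_til) / delta
    return A - B * (V - X)

# Arbitrary illustrative values.
params = dict(V=1.2, X=0.7, mu=0.08, sigma=0.25, phi=-0.3, xi=0.5,
              x_hat=0.1, x_til=-0.2, th_hat=0.05, th_til=0.3)
pi_star = optimal_pi(**params)
print(pi_star, first_order_condition(pi_star, **params))

# Continuous case: no jump in the price and no Brownian coefficient in Theta.
continuous = dict(params, phi=0.0, th_hat=0.0)
pi_cont = optimal_pi(**continuous)
```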
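After some computations, the drift term of $\Theta_t(V_t-X_t)^2 + \Psi_t$ is found to be
$$\Theta_t(V_t-X_t)^2(\vartheta_t - B_t^2\Delta_t) + 2\Theta_t(V_t-X_t)(A_tB_t\Delta_t - \widehat\vartheta_t\widehat x_t - \xi_t\widetilde\vartheta_t\widetilde x_t - f_t)$$
$$+ \Theta_t\xi_t(\widetilde\vartheta_t+1)(A_t\varphi - \widetilde x_t)^2 + \Theta_t(A_t\sigma - \widehat x_t)^2 + \psi_t.$$
Then, we choose
$$\vartheta_t^* = B_t^2\Delta_t,$$
$$f_t^* = A_tB_t\Delta_t - \widehat\vartheta_t\widehat x_t - \xi_t\widetilde\vartheta_t\widetilde x_t,$$
$$\psi_t^* = -\Theta_t\xi_t(\widetilde\vartheta_t+1)(A_t\varphi - \widetilde x_t)^2 - \Theta_t(A_t\sigma - \widehat x_t)^2.$$
Let us suppose that with this choice of drifts, Equations (6.31) and (6.32) admit solutions (we shall discuss this issue below). Next, let us denote these solutions as $(\Theta^*,\widehat\vartheta^*,\widetilde\vartheta^*)$, $(X^*,\widehat x^*,\widetilde x^*)$ and $(\Psi^*,\widehat\psi^*,\widetilde\psi^*)$; the corresponding processes $A$, $B$, and $\Delta$ will be denoted as $A^*$, $B^*$, and $\Delta^*$. Consequently, the drift term of $\Theta_t^*(V_t^*(\pi) - X_t^*)^2 + \Psi_t^*$ is nonpositive for any admissible $\pi$ and it is equal to 0 for $\pi^* = A_t^* - B_t^*(V_t^{v,*}(\pi^*) - X_t^*)$.

The three-dimensional process $(\Theta^*,\widehat\vartheta^*,\widetilde\vartheta^*)$ is supposed to satisfy the BSDE
$$d\Theta_t = \Theta_t\Big(\frac{(\mu + \widehat\vartheta_t\sigma + \xi_t\varphi\widetilde\vartheta_t)^2}{\sigma^2 + \varphi^2\xi_t(\widetilde\vartheta_t+1)}\,dt + \widehat\vartheta_t\,dW_t + \widetilde\vartheta_t\,dM_t\Big), \qquad \Theta_T = 1. \tag{6.33}$$
We shall discuss this equation later.

The three-dimensional process $(X^*,\widehat x^*,\widetilde x^*)$ is a solution of the linear BSDE
$$dX_t = \frac{1}{\Delta_t}(\kappa_{1,t}\widehat x_t + \kappa_{2,t}\widetilde x_t)\,dt + \widehat x_t\,dW_t + \widetilde x_t\,dM_t, \qquad X_T = X,$$
where
$$\kappa_{1,t} = \sigma\mu + \sigma\varphi\xi_t\widetilde\vartheta_t - \varphi^2\widehat\vartheta_t\xi_t(1+\widetilde\vartheta_t), \qquad \kappa_{2,t} = \varphi\xi_t(1+\widetilde\vartheta_t)(\mu + \sigma\widehat\vartheta_t) - \sigma^2\xi_t\widetilde\vartheta_t.$$
Thus, $X_t^* = E_{Q^\kappa}\{X|\mathcal{G}_t\}$, where $dQ^\kappa|_{\mathcal{G}_t} = L_t^{(\kappa)}\,dP|_{\mathcal{G}_t}$ and
$$dL_t^{(\kappa)} = -L_{t-}^{(\kappa)}\Big(\frac{\kappa_{1,t}}{\Delta_t}\,dW_t + \frac{\kappa_{2,t}}{\xi_t\Delta_t}\,dM_t\Big).$$
The three-dimensional process $(\Psi^*,\widehat\psi^*,\widetilde\psi^*)$ is solution of
$$d\Psi_t = -\Theta_t\big(\xi_t(\widetilde\vartheta_t+1)(A_t\varphi - \widetilde x_t)^2 + (A_t\sigma - \widehat x_t)^2\big)\,dt + \widehat\psi_t\,dW_t + \widetilde\psi_t\,dM_t, \qquad \Psi_T = 0.$$
Thus, noting that
$$\Psi_t^* + \int_0^t\Theta_s\big(\xi_s(\widetilde\vartheta_s+1)(A_s\varphi - \widetilde x_s)^2 + (A_s\sigma - \widehat x_s)^2\big)\,ds$$
is a G-martingale, we obtain that
$$\Psi_t^* = E\Big\{\int_t^T\Theta_s\big(\xi_s(\widetilde\vartheta_s+1)(A_s\varphi - \widetilde x_s)^2 + (A_s\sigma - \widehat x_s)^2\big)\,ds\,\Big|\,\mathcal{G}_t\Big\}. \tag{6.34}$$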
6.4.3.1 Discussion of Equation (6.33): Duality Approach

Our aim here is to prove that BSDE (6.33) has a solution. We take the opportunity to correct a mistake in Bielecki et al. [260], where we claim that, in the particular case where the intensity $\gamma_t$ is constant, we get a solution of the form $\theta_t$ constant. The solution that appears in Bielecki et al. is valid only in the case $P(\tau<T) = 1$. We proceed using the duality approach. The set of equivalent martingale measures is determined by the set of densities. From Kusuoka's representation theorem [166] it follows that any strictly positive martingale in the filtration G can be written as
$$dL_t = L_{t-}(\ell_t\,dW_t + \chi_t\,dM_t) \tag{6.35}$$
for a G-predictable process $\chi$ satisfying $\chi_t > -1$. In order that $L$ corresponds to the Radon-Nikodym density of an emm, a relation between $\ell$ and $\chi$ has to be satisfied in order to imply that process $L_tS_t$ is a $P$ (local) martingale. (Recall that $r = 0$.) Straightforward application of integration by parts formula proves that the drift term of $LS$ vanishes iff
$$\varphi\chi_t\xi_t + \sigma\ell_t + \nu = 0.$$
Recall that by definition the variance optimal measure for $L$ is a probability measure $Q^*$ that minimizes $E_{Q^*}(L_T) = E_P(L_T^2)$. At this moment we are unable to verify existence/uniqueness of such a measure in the context of our model. We thus assume that the measure exists.

Hypothesis: We assume that the variance optimal measure exists.

In what follows we use the same argument as in Bobrovnytska and Schweizer [30]. Toward this end we denote by $L^*$ the Radon-Nikodym density of the variance optimal martingale measure. Let $Z$ be the martingale $Z_t = E_{Q^*}(L_T^*|\mathcal{G}_t)$ and $U = L^*/Z$. It is proved in Delbaen and Schachermayer (Lemma 2.2) [68] that, if the variance optimal martingale measure exists, then there exists a predictable process $\widehat z$ such that
$$dZ_t/Z_{t-} = \widehat z_t\,dS_t = z_t(\sigma\,dW_t + \varphi\,dM_t + \nu\,dt),$$
where $z_t = \widehat z_tS_{t-}$ (in the proof of Lemma 2.2, the hypothesis of continuity of the asset is not required). The process $L^*$ is a $(P,\mathbb{G})$ martingale, hence there exist $\ell$ and $\chi$ such that
$$dL_t^* = L_{t-}^*(\ell_t\,dW_t + \chi_t\,dM_t).$$
From Itô's calculus, setting $U = L^*/Z$, we obtain
$$dU_t = U_{t-}\Big(A_t\,dt + (\ell_t - z_t\sigma)\,dW_t + \Big(\frac{\chi_t+1}{1+z_t\varphi} - 1\Big)dM_t\Big),$$
where
$$A_t = z_t^2\sigma^2 + \xi_t(1+\chi_t)\Big(z_t\varphi + \frac{1}{1+z_t\varphi} - 1\Big) = z_t^2\sigma^2 + \xi_t(1+\chi_t)\frac{z_t^2\varphi^2}{1+z_t\varphi} = z_t^2\Big(\sigma^2 + \xi_t(1+\chi_t)\frac{\varphi^2}{1+z_t\varphi}\Big).$$
We recall that $\varphi\chi_t\xi_t + \sigma\ell_t + \nu = 0$. Hence, letting
$$\widehat u_t = \ell_t - z_t\sigma, \qquad \widetilde u_t = \frac{\chi_t+1}{1+z_t\varphi} - 1,$$
we get
$$z_t = -\frac{\nu + \sigma\widehat u_t + \varphi\xi_t\widetilde u_t}{\sigma^2 + \varphi^2\xi_t(1+\widetilde u_t)}.$$
It follows that
$$A_t = z_t^2\Big(\sigma^2 + \xi_t(1+\chi_t)\frac{\varphi^2}{1+z_t\varphi}\Big) = z_t^2\Big(\sigma^2 + \xi_t(1+z_t\varphi)(1+\widetilde u_t)\frac{\varphi^2}{1+z_t\varphi}\Big) = z_t^2\big(\sigma^2 + \xi_t(1+\widetilde u_t)\varphi^2\big) = \frac{(\nu + \sigma\widehat u_t + \varphi\xi_t\widetilde u_t)^2}{\sigma^2 + \varphi^2\xi_t(1+\widetilde u_t)}$$
so that process $U$ is a solution of
$$dU_t = U_{t-}\Big(\frac{(\nu + \sigma\widehat u_t + \varphi\xi_t\widetilde u_t)^2}{\sigma^2 + \varphi^2\xi_t(1+\widetilde u_t)}\,dt + \widehat u_t\,dW_t + \widetilde u_t\,dM_t\Big), \qquad U_T = 1,$$
which establishes that the BSDE (6.33) has a solution as long as the variance optimal martingale measure exists in our setup.
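The algebra leading from $A_t$ to the drift of (6.33) is easy to get wrong by hand, so a numerical spot check may be useful. The sketch below picks arbitrary values of the coefficients (illustrative assumptions, as are the function names), builds $z$, $\chi$ and the Brownian coefficient of $L^*$ from $\widehat u$ and $\widetilde u$, and checks both the martingale condition $\varphi\chi\xi + \sigma\ell + \nu = 0$ and the equality of the two expressions for $A_t$.

```python
def z_from_u(nu, sigma, phi, xi, u_hat, u_til):
    """z = -(nu + sigma u_hat + phi xi u_til) / (sigma^2 + phi^2 xi (1 + u_til))."""
    return -(nu + sigma * u_hat + phi * xi * u_til) / (sigma ** 2 + phi ** 2 * xi * (1.0 + u_til))

def drift_A(z, sigma, phi, xi, chi):
    """A = z^2 sigma^2 + xi (1 + chi) (z phi + 1/(1 + z phi) - 1)."""
    return z ** 2 * sigma ** 2 + xi * (1.0 + chi) * (z * phi + 1.0 / (1.0 + z * phi) - 1.0)

def drift_closed_form(nu, sigma, phi, xi, u_hat, u_til):
    """(nu + sigma u_hat + phi xi u_til)^2 / (sigma^2 + phi^2 xi (1 + u_til))."""
    return (nu + sigma * u_hat + phi * xi * u_til) ** 2 / (sigma ** 2 + phi ** 2 * xi * (1.0 + u_til))

# Arbitrary illustrative values.
nu, sigma, phi, xi, u_hat, u_til = 0.06, 0.2, 0.4, 0.5, 0.1, 0.3
z = z_from_u(nu, sigma, phi, xi, u_hat, u_til)
chi = (1.0 + z * phi) * (1.0 + u_til) - 1.0   # inverts the definition of u_til
ell = u_hat + z * sigma                        # inverts the definition of u_hat
print(z, chi, ell, drift_A(z, sigma, phi, xi, chi), drift_closed_form(nu, sigma, phi, xi, u_hat, u_til))
```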
Chapter Seven

Applications to Weather Derivatives and Energy Contracts

René Carmona
In this chapter, we consider three applications of the theory developed in this book. We bring to bear some of the theoretical concepts studied earlier in rather concrete settings. Indifference pricing has recently been applied to many incomplete market models. We discuss only a small sample, and we refer the reader interested in insurance problems to [189] and [140] and in pricing catastrophic bonds to [269]. The numerical results presented in this chapter are borrowed from [43] and [45].
7.1 APPLICATION I: TEMPERATURE OPTIONS

The crux of option pricing theory in complete markets is that every contingent claim can be perfectly replicated, and in the absence of arbitrage opportunity, its price has to be equal to the cost of setting up the replicating portfolio. In emerging markets, this is hardly the case due to the presence of many unhedgeable factors and illiquidity of underlying assets. A portfolio of weather derivatives consisting of HDD and CDD options is a case in point. See below for a definition of these technical terms. Although futures contracts are traded on the Chicago Mercantile Exchange, and despite the fact that some HDD swaps are actively traded on the OTC market, except possibly for a few weather locations, the lack of an active secondary market makes it very difficult to use them as hedging instruments. Also, basis risk remains when risk mitigation is at a location different from the location of the risk exposure. Similar problems exist with most physical commodities: the prices of commodity forward contracts with physical delivery differ from those of the corresponding exchange traded futures that are used as hedging instruments. As a result, perfect replication is not possible. We credit Mark Davis for being one of the first, if not the first, to realize that the indifference pricing paradigm offers a natural framework for the analysis of the basis conundrum, and for using the notion of marginal price as a possible answer. See, for example, [60], [61], and [62]. We shall use the mathematical framework introduced in Section 4.1 of Chapter 4.

• Options are written on an index.
• There is a tradable asset that is correlated to the underlying index.
• There are traded derivatives on the underlying index.
242
CHAPTER 7
As explained throughout the book, the problem differs from classical option theory in that there is no perfect hedging strategy. The goal is to find a trading strategy that minimizes the risk quantified by a risk measure (as discussed in Chapter 3), or maximizes expected terminal utility under the physical measure. Due to lack of liquidity, marking to market is not feasible, and it is reasonable to use for value a cash amount that an investor is willing to swap with a risky portfolio. Derivative indentures can be extremely complex in commodity markets. The simplest contracts are forwards, futures, swaps, European options on forwards and futures, quarterly and annual swaptions, annual swaptions, etc. Some of these contracts are settled financially, others by physical delivery of the commodity, and the complexity of the valuation is compounded by the physical constraints of the delivery. This is especially true for natural gas and electricity, which we shall consider later. Delivery is usually spread over a long period of time, typically one month, and the notion of spot price is rather elusive. Contract terms include day-ahead, end-ofthe-week, end-of-the-month, etc. Futures and options on futures are exchange-traded instruments: although these are not as liquid as fixed-income products, they are primary hedging instruments. Depending upon the delivery location, there are forward contracts that are usually priced as futures price plus basis. Our goal in this section is to force the pricing of temperature options into the framework of Section 4.1 of Chapter 4 so that results developed there can be applied. 7.1.1 Introduction to the Weather Markets It is estimated that as many as 2,000 deals were struck on weather derivatives over the last three years, and the size of the market is currently estimated between $5 to $7 billion, depending on the sources of information. 
It is impossible to have a precise estimate of this figure because most of the deals have taken place on the highly unregulated over-the-counter (OTC) market. The most commonly traded weather derivatives are plain vanilla calls and puts, floors, caps, collars, and swaps on the cumulative number of HDDs or CDDs at a specific location. Other deals have been based on weather events such as rainfall, snowpack, water flow, wind speed, precipitation, and humidity, but the overwhelming majority of traded options have been on temperatures and heating and cooling degree days. See, nevertheless, Section 7.2 for a discussion of precipitation options. The Chicago Mercantile Exchange (CME) introduced monthly futures contracts (and options on these futures) based on monthly temperature indexes computed in a certain number of American cities. More recently, as business suffers from weather uncertainty overseas as well, the CME added contracts on the major European capitals such as Amsterdam, Berlin, Stockholm, Paris, and London, which became the object of significant trading activity. Weather derivatives are the object of intensive research, both in academia, where they provide a challenging multidisciplinary problem of financial engineering, and in the banking and insurance worlds, where they are regarded as a tool of choice to control and hedge some of the risks associated with weather. This chapter can only hope to scratch the surface by reviewing potential applications of indifference pricing. The reader interested in all the facts of these markets should consult the
collective book [9] and J. Porter's review article [223] for a historical perspective on the weather markets.

7.1.2 Heating and Cooling Degree Days

This subsection is devoted to the introduction of the terminology used in weather-derivative transactions. Complements and numerical examples of statistical analyses can be found in Chapter 5 of [41]. Most meteorological stations located near major international airports provide reliable information on the intraday fluctuations of the temperature. For each calendar day $t$, they publish the minimum and the maximum temperatures recorded that day. We will denote them by $\mathrm{Min}_t$ and $\mathrm{Max}_t$. The quantity $\mathrm{Avg}_t$ defined by
\[
\mathrm{Avg}_t = \frac{1}{2}\,(\mathrm{Min}_t + \mathrm{Max}_t) \tag{7.1}
\]
is called the average temperature of the day (at this location). Note that this number can be very different from what one might expect when extreme temperatures are attained for only a short period of time in the day. In any case, this is the number used to define the underlying indexes. On each day $t$, one defines the heating degree day $\mathrm{HDD}_t$ and the cooling degree day $\mathrm{CDD}_t$ for that day by the formulas
\[
\mathrm{HDD}_t = (65 - \mathrm{Avg}_t)^+ \qquad\text{and}\qquad \mathrm{CDD}_t = (\mathrm{Avg}_t - 65)^+ . \tag{7.2}
\]
So $\mathrm{HDD}_t$ is zero if the average temperature of the day is above 65°F, and it is the number of degrees by which it falls under 65°F when it does. If we think of 65°F as a threshold above which we need to cool and below which we need to heat, $\mathrm{HDD}_t$ gives the number of degrees by which one should heat on that day. The interpretation of $\mathrm{CDD}_t$ is symmetric, and one should view $\mathrm{CDD}_t$ as the number of degrees by which one should cool.

Remark. The choice of 65°F for the threshold is quite artificial. We follow this convention for the sake of convenience in the description of the most standard contracts. However, one should be aware of the importance of tailor-made contracts involving various combinations of intraday temperature patterns. Some of these contracts drew a lot of attention, both in the industry and in academia, as they were highly publicized. Also, the introduction of critical days, cross-commodity contracts, and double trigger options added to the need for a better understanding of the pricing issues.

The following notations are needed for the definitions of the payoff functions of the degree-day derivatives. Given a strike period $SP$, the aggregate degree days over $SP$ are defined by the formulas
\[
\mathrm{HDD}(SP) = \sum_{t \in SP} \mathrm{HDD}_t \qquad\text{and}\qquad \mathrm{CDD}(SP) = \sum_{t \in SP} \mathrm{CDD}_t . \tag{7.3}
\]
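Formulas (7.1) to (7.3) translate directly into a few lines of code. The following is a minimal sketch in Python; the function names and the five-day sample of minimum and maximum temperatures are hypothetical and serve only to illustrate the arithmetic.

```python
def degree_days(t_min, t_max, base=65.0):
    """Return (HDD_t, CDD_t) for one day from its min and max temperatures (deg F)."""
    avg = 0.5 * (t_min + t_max)   # average temperature, formula (7.1)
    hdd = max(base - avg, 0.0)    # heating degree day, formula (7.2)
    cdd = max(avg - base, 0.0)    # cooling degree day, formula (7.2)
    return hdd, cdd


def aggregate_degree_days(daily_min_max, base=65.0):
    """Sum the daily HDD and CDD over a strike period, formula (7.3)."""
    total_hdd = 0.0
    total_cdd = 0.0
    for t_min, t_max in daily_min_max:
        hdd, cdd = degree_days(t_min, t_max, base)
        total_hdd += hdd
        total_cdd += cdd
    return total_hdd, total_cdd


# Hypothetical strike period of five days: (Min_t, Max_t) pairs in deg F.
strike_period = [(50, 60), (60, 70), (62, 75), (70, 90), (40, 55)]
hdd_sp, cdd_sp = aggregate_degree_days(strike_period)
print(hdd_sp, cdd_sp)  # prints: 27.5 18.5
```

Note that a day whose average is exactly 65°F, such as the second day of the sample, contributes to neither index.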
In the original OTC contracts, the strike period SP typically extended from May to September for CDDs, and from November to March for HDDs. It is one month in the CME futures contracts. A typical contract states that, at the end of the strike
period $SP$, the owner of the contract will receive an amount $\xi = f(DD)$, where $DD$ is either $\mathrm{HDD}(SP)$ if the option is written on heating degree-days, or $\mathrm{CDD}(SP)$ if the option is written on cooling degree-days. We discuss examples of functions $f$ in the next subsection. Finally, the parties agree on a premium $P$. This is the amount paid by the buyer of the option to the seller at the inception of the contract.

Typical Payoff Functions

Here are some examples of frequently used payoff functions.

Call with a Cap. The payoff $\xi = f(DD)$ is of the form
\[
\xi = \min\{\alpha (DD - K)^+,\; C\}. \tag{7.4}
\]
The buyer of such an option pays the up-front premium at the signature of the contract, and, if $DD > K$ at the end of the strike period, the writer (seller) of the option pays the buyer the nominal payoff rate $\alpha$ times the amount by which $DD$ exceeds the strike, the overall payment being limited by the cap $C$. The option is said to be at the money when the strike $K$ is near the historical average of $DD$, and out of the money when $K$ is far from this average. As far as we know, most of the options traded on the OTC have been deep out of the money: buyers are willing to give up some upside for some downside protection. Where to strike an option? There is no agreement on what is a reasonable period over which to compute the so-called historical average. Some recommend using no more than ten to twelve years in order to capture recent warming trends, while others suggest using longer periods to increase the chances of having a stable historical average.

Put with a Floor. The payoff $\xi = f(DD)$ is given by
\[
\xi = \min\{\alpha (K - DD)^+,\; F\}. \tag{7.5}
\]
In exchange for the premium, the buyer of such an option receives at the end of the strike period, if $DD < K$, $\alpha$ times the amount by which the strike exceeds $DD$, the overall payment being limited by the floor $F$.

Collar. Buying a collar contract is essentially equivalent to being long a call with a cap and short a put with a floor. More precisely, the payoff is given by
\[
\xi = \min\{\alpha (DD - K_c)^+,\; C\} - \min\{\beta (K_p - DD)^+,\; F\}. \tag{7.6}
\]
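The three payoff functions (7.4), (7.5), and (7.6) are simple enough to be written out explicitly. The sketch below does so in Python; the function names and the contract parameters (payoff rates, strikes, cap, and floor) are hypothetical values chosen only for illustration.

```python
def capped_call(dd, strike, rate, cap):
    """Payoff (7.4): min(rate * (DD - K)^+, C)."""
    return min(rate * max(dd - strike, 0.0), cap)


def floored_put(dd, strike, rate, floor):
    """Payoff (7.5): min(rate * (K - DD)^+, F)."""
    return min(rate * max(strike - dd, 0.0), floor)


def collar(dd, call_strike, put_strike, alpha, beta, cap, floor):
    """Payoff (7.6): long a call with a cap, short a put with a floor."""
    return (capped_call(dd, call_strike, alpha, cap)
            - floored_put(dd, put_strike, beta, floor))


# Hypothetical collar: 5,000 dollars per degree day on each leg.
terms = dict(call_strike=1000.0, put_strike=900.0,
             alpha=5000.0, beta=5000.0, cap=1.0e6, floor=5.0e5)

for dd in (800.0, 950.0, 1100.0, 1300.0):
    print(dd, collar(dd, **terms))
```

With these terms the holder receives nothing when the index ends between the two strikes, is paid by the seller above the call strike (up to the cap), and pays the seller below the put strike (up to the floor).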
The parameters $\alpha$, $\beta$, $C$, $F$, $K_c$, $K_p$ are usually adjusted for the premium to be zero. Since there is no up-front cost to get into one of these contracts, this form of collar was very popular in the early days of the temperature markets.

7.1.3 Modeling the Temperature

We discuss later the pricing of temperature options by modeling the changes in the yearly aggregate index $DD$. For the time being, we view the temperature $\theta_t$ as the underlying on which the options are written. In this case, they are more of the Asian
type than of the European type, and pricing may be more difficult. In any case, a better understanding of the underlying is necessary before we can go any further. Theoretical models for the historical evolution of the temperature at a given location can be easily derived from the statistical analysis of temperature data. Contrary to the situation in Europe, historical temperature data are readily available in the United States at little or no cost from the National Oceanic and Atmospheric Administration (NOAA). The daily average temperature quotes provided by the National Weather Service (NWS) are usually rounded to the nearest integer, while the $\mathrm{HDD}_t$ and $\mathrm{CDD}_t$ are computed from average daily temperatures with one decimal place when used as underlyings for futures contracts and options on these futures. To be more specific, the minimum and maximum temperatures on a given day are given as whole integers, while their average is computed with one decimal place.

The statistical analysis of these temperature time series is not too difficult. Once the long-term (multidecade) trends have been removed, once our gut feelings about global warming have been dealt with, it is relatively easy to remove a seasonal component (obviously based on the yearly weather cycles) and analyze the stationary remainder. The net result is an autoregressive model of order 3 or 4, depending upon the location of the meteorological station. We refer the interested reader to Section 5.6 of Chapter 5 of the textbook [41] for details and examples. She can also consult [204] for the use of auto-regressive moving-average time series. The simplest model consistent with these statistical analyses is a model of the form
\[
dY_t = -\lambda\,(Y_t - \theta(t))\,dt + \sigma\,dW^2_t ,
\]
given by a form of Ornstein-Uhlenbeck process mean reverting toward a deterministic function $\theta(t)$ that represents the aggregate of a mean level, a possible trend due to global warming, and a strong seasonal component.

It is argued in [41] (see also [62]) that temperature data is mostly Gaussian, but if in some locations, for some periods of time, it is believed that marginal distribution tails are heavier than normal, it is straightforward to fit more general models without constant and deterministic volatility. In any case, all these stochastic models are of the form
\[
dY_t = p(t, Y_t)\,dt + q(t, Y_t)\,dW^2_t ,
\]
considered in Section 4.1 of Chapter 4.

Risk Management

A good understanding of the temperature statistics is very important for risk management. Being able to simulate effortlessly a large number of scenarios for the time evolution of the daily average temperature is a necessary step toward the understanding of the statistics of the cumulative HDDs and CDDs over the strike periods. In this way, risk managers can estimate the probability that a given derivative be needed (exercised) and quantify how much it is worth to the potential buyer. As we have seen earlier, this is not enough to unlock the mysteries of risk-adjusted statistics needed for pricing; but equipped with the right utility function, this approach may lead to pricing formulas.
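The scenario generation described above can be sketched with a daily Euler discretization of the mean-reverting model. Everything numerical in the block below is an assumption made for illustration only: the sinusoidal form of $\theta(t)$ (with no trend), the values of $\lambda$ and $\sigma$, the one-month strike period, and the strike. The simulated value $Y_t$ is treated as the daily average temperature.

```python
import math
import random


def seasonal_mean(day, level=55.0, amplitude=20.0, peak_day=200.0):
    """Hypothetical theta(t): mean level plus a yearly sinusoid, no trend."""
    return level + amplitude * math.cos(2.0 * math.pi * (day - peak_day) / 365.0)


def simulate_temperatures(n_days, start_day, y0, lam, sigma, rng):
    """Euler scheme (one-day step) for dY = -lam * (Y - theta(t)) dt + sigma dW."""
    path = []
    y = y0
    for k in range(n_days):
        path.append(y)
        drift = -lam * (y - seasonal_mean(start_day + k))
        y += drift + sigma * rng.gauss(0.0, 1.0)  # dt = 1 day
    return path


def aggregate_hdd(path, base=65.0):
    """Cumulative heating degree days along one simulated scenario."""
    return sum(max(base - y, 0.0) for y in path)


rng = random.Random(0)  # fixed seed so that the run is reproducible
n_scenarios = 2000
totals = []
for _ in range(n_scenarios):
    path = simulate_temperatures(n_days=31, start_day=1,
                                 y0=seasonal_mean(1), lam=0.25,
                                 sigma=4.0, rng=rng)
    totals.append(aggregate_hdd(path))

strike = 900.0  # hypothetical strike of an HDD call
mean_hdd = sum(totals) / n_scenarios
prob_itm = sum(1 for x in totals if x > strike) / n_scenarios
print("mean HDD(SP):", round(mean_hdd, 1))
print("P(HDD(SP) > K):", round(prob_itm, 3))
```

These statistics are computed under the historical dynamics. As stressed in the text, they are an input to risk management, not risk-adjusted prices.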
7.1.4 Pricing

Despite many efforts to restore confidence in a market that lost volume, transparency, and liquidity after the demise of Enron in December 2001, the market of weather derivatives is still a mystery to many. Some enter this market in search of a diversification tool completely uncorrelated with their main risk exposures, say to the equity, fixed income, or even credit markets. This obvious lack of correlation is the reason many hedge funds and several endowment funds got a taste of the weather market. It is also the rationale behind actuarial pricing algorithms based on historical dynamics, justifying the lack of risk premium by the lack of correlation with the financial markets. We review a couple of possible approaches to pricing, trying to emphasize the mathematical underpinnings and the data needed to actually implement these ideas.

Estimating State Price Densities

Temperature options can be viewed as European vanilla options when viewed as written on the aggregate number of degree days $\mathrm{HDD}(SP)$ or $\mathrm{CDD}(SP)$ over the specified strike period $SP$. As in the case of the pricing of derivatives on liquid underlying assets, it is expected that arbitrage theory will lead to expectation pricing and option prices of the form
\[
V_t = \int f_K(x)\,\varphi_{T-t,SP}(x)\,dx , \tag{7.7}
\]
where we used the notation $f_K(x)$ for the payoff of the claim when the aggregate index over the period $SP$ is $x$, and where $\varphi_{T-t,SP}(x)$ denotes the density of the underlying (cumulative HDDs or CDDs over the strike period $SP$, for example) over which the contingent claim is written. Since the weather derivative markets are (as of today) incomplete, this density is not unique, and any practical pricing recipe will have to provide a systematic way to estimate which such density is being used by the market makers to price outstanding claims. In the case of liquid markets, it is reasonable to use nonparametric procedures to estimate such an implied state price density.
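To make formula (7.7) concrete, the following sketch evaluates the integral numerically for a call with a cap. The normal form of the density and the values of its mean and standard deviation are assumptions made purely for illustration; they are not estimates from market quotes. The uncapped case is compared with the classical closed-form expression for $E\{(X - K)^+\}$ when $X$ is Gaussian.

```python
import math


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def capped_call_payoff(x, strike, alpha, cap):
    return min(alpha * max(x - strike, 0.0), cap)


def expectation_price(payoff, mean, std, n_steps=20000, width=10.0):
    """Trapezoidal rule for the integral of payoff(x) * density(x), as in (7.7),
    over the interval mean +/- width * std (assumed normal density)."""
    lo = mean - width * std
    h = 2.0 * width * std / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * payoff(x) * normal_pdf(x, mean, std)
    return total * h


# Hypothetical parameters of the density and of the contract.
mean_dd, std_dd = 900.0, 60.0
strike, alpha = 950.0, 1.0

price_capped = expectation_price(
    lambda x: capped_call_payoff(x, strike, alpha, cap=100.0), mean_dd, std_dd)
price_uncapped = expectation_price(
    lambda x: capped_call_payoff(x, strike, alpha, cap=float("inf")), mean_dd, std_dd)

# Closed form for the uncapped call under a normal density.
d = (mean_dd - strike) / std_dd
closed_form = (std_dd * normal_pdf(d, 0.0, 1.0)
               + (mean_dd - strike) * 0.5 * (1.0 + math.erf(d / math.sqrt(2.0))))

print(price_capped, price_uncapped, closed_form)
```

In a calibration exercise, the two parameters of the density would be adjusted so that prices computed in this way match the few available quotes.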
Unfortunately, lack of liquidity and lack of transparency plague the OTC market of weather derivatives, and these nonparametric estimation procedures are out of the question. In fact, given a strike period and location, only a very small number (typically between 3 and 5) of quotes are available. For this reason, parametric procedures are a must. Most options traded OTC have taken the form of calls, puts, or collars written on the aggregate HDDs or CDDs over a several-months strike period. The fact that the underlying is a sum of a relatively large number of individual terms being functions (typically combinations of hockey stick functions) of a process of the diffusion type (recall formulas (7.3) above) is screaming for a central limit theorem argument and the use of normal distributions. Empirical evidence that the underlying aggregate number of degree days is normally distributed is not very convincing. As the results vary from one meteorological station to another, it still takes a leap of faith to
jump to the conclusion that after risk adjustment, pricing should still be done with a normal distribution. But the example of the Black-Scholes-Merton model shows that going from one type of probability to the other can sometimes be done by merely changing some of the parameters of the distribution. This argument is very popular among the risk managers of portfolios that include weather derivatives: short of any other practical method, assuming that the state price density $\varphi_{T-t,SP}$ is normal, and estimating its parameters to be consistent with the knowledge of (a few) existing trades, is as practical as it sounds reasonable.

Could the Black-Scholes Formula Do It?

Let us denote momentarily the daily average temperature (7.1) by $\theta_t$, and let us switch from time series to continuous models. The continuous time analogs of autoregressive models can be written in the form
\[
d\tilde\theta_t = \mu_t(\tilde\theta)\,dt + \sigma\,dW_t , \tag{7.8}
\]
where $W = \{W_t\}_t$ is a standard Wiener process, where the value $\mu_t(\tilde\theta)$ of the functional $\mu_t$ depends upon the past up to time $t$ of the path $\tilde\theta$, and where $\tilde\theta_t = \theta_t - s(t)$ is the remainder term when a seasonal component $s(t)$, which is assumed to be a smooth deterministic periodic function of bounded variation, has been removed. In order to extend to the continuous time case the AR($p$) model alluded to earlier, one uses the functional
\[
\mu_t(\tilde\theta) = \int_0^{t}\!\int_0^{t_1}\!\cdots\int_0^{t_{p-1}} \tilde\theta_{t_{p-1}}\,dt_{p-1}\cdots dt_1 .
\]
Obviously, $\mu_t(\tilde\theta)$ is nonanticipative, and, as a consequence, $\tilde\theta_t$ and $\theta_t$ are Itô processes. Puts, calls, and swaps on HDDs and CDDs for a strike period $SP$ are thus Asian-type $T$-claims if $T$ is the last day of the strike period $SP$. Indeed, the payoff function of each such degree day option is of the following form:
\[
\xi = f_T(\theta) = \Big( \int_{SP} \chi(\theta_t)\,dt - K \Big)^{+} , \tag{7.9}
\]
which we make explicit in the case of a call by choosing $\chi(x) = (65 - x)^+$ in the case of HDDs, and $\chi(x) = (x - 65)^+$ in the case of CDDs, in agreement with (7.2). Tempting as it may be, the analogy with the setting of the Black-Scholes theory stops here. Indeed, the underlying $\theta_t$ is not tradable, and we propose to use the indifference pricing approach discussed in Section 4.1 of Chapter 4. See also [39]. It is clear that the market model provided by formula (7.8) is not exactly of the simplest form of incomplete models discussed in Section 4.1 of Chapter 4. This is mostly due to the fact that the payoff functions reviewed in Subsection 7.1.2 are not of the form $\xi = f(\theta_T)$ for some fixed $T$, and some extra work needs to be done if we want to price these options. The main idea is to add state variables to reduce the path dependence to a Markovian dependence. This is done by adding (as in the case of Asian options) time averages to the mix of state variables. We refer the interested reader to [42] where the details are spelled out.
Calibration of a More Sophisticated Incomplete Model

This last subsection is devoted to an application of the calibration procedure already presented in the case of stochastic volatility models. Using the notation of the theoretical discussion of Subsection 7.1.4, we rely on an incomplete model of the form
\[
\begin{cases}
d\tilde\theta_t = \psi_t(\tilde\theta)\,dt + \sigma_\theta\,dW^{(1)}_t \\
dZ_t = h(t, \tilde\theta, Z)\,dt + \sigma_Z\,dW^{(2)}_t ,
\end{cases} \tag{7.10}
\]
and we assume that we have prices for options on the temperature and on the underlying $Z$. We first assume that the drift of the underlying $Z$ is known. In fact, in order to get as close as possible to a risk neutral pricing model, we shall assume that this drift is constant and equal to the short interest rate corrected by an appropriate convenience yield. We also assume that the volatilities $\sigma_Z$ and $\sigma_\theta$ are known, having been estimated from historical data, and that the correlation $\rho$ between the two Wiener processes has been estimated as well. So the only remaining unknown in the model is the (risk neutral) drift $\psi_t$. Our goal is to choose this drift in such a way that the (risk neutral) pricing model so obtained prices correctly all the options (on $\theta_t$ and $Z_t$) whose prices are known to us, while minimizing a loss functional used to penalize the departure of the model from a given model chosen as a reasonable starting point.

There are at least two ways to approach the problem. In the first approach, at least when $Z$ is one-dimensional, we derive an HJB partial differential equation in this abstract framework. Only then do we discretize time and compute the value function (and subsequently the Lagrange multipliers and the optimal drift) on a discrete grid for $\theta$ and $Z$. As explained earlier, this procedure is riddled with pitfalls and stability issues that make practical implementations extremely complex and highly unstable.
The second approach relies on the analysis of a discretized model based on the generation of scenarios obtained from the simulation of samples from the system of stochastic differential equations (7.10). The main advantages of this approach are the simplicity of its rationale and its robustness to variations in the Lagrange multipliers. Moreover, it is reasonable to use it for multivariate $Z$'s, while the PDE approach of the HJB equation cannot handle one time dimension and three or more space dimensions. Once a specific (large but fixed) number of samples of the system (7.10) have been generated, our probabilistic world is limited to this finite set of scenarios, and stable numerical methods of calibration are available in such a finite setting.

Open Problem

We close this section with a short discussion of an interesting mathematical challenge prompted by a recent work of Barrieu and El Karoui [13, 14] in which they discuss the optimal design of a protection by derivatives using temperature options as a motivation. In their one-period, two-agent model, the temperature option appears as an insurance contract. This feature is the subject of a heated debate, and even
though we believe that temperature options are derivatives (as opposed to insurance contracts), we shall not weigh in on this controversy. Instead, we would like to propose an extension of the model used in [14] to a multiperiod, multiagent setting, in so doing introducing a secondary market in the model. In most cases, the inception of a contract takes place months before the period covered by the contract. Typically, CDD contracts for a given summer are initially traded during the fall or winter preceding the summer in question. At that time, and for quite some time, meteorological predictions are essentially based on historical data for the temperatures during previous summer seasons, and it is very difficult to mark these instruments to market on a daily basis, as nothing changes from one day to the next in terms of prediction of the temperature during the period covered by the contract. So at this stage of their lives, temperature options behave pretty much like insurance contracts: we enter into such a contract and we wait for maturity and possible settlement and cash exchange. However, when we get close to the period covered by the contract, say ten days to two weeks, not only does the holder of the options get a very good indication of whether the temperature is at, above, or below the expected historical average, but meteorological predictions start kicking in with some accuracy. At that time, some investors revise their positions (some do panic), an active secondary market suddenly appears, and these options change hands, possibly several times. An interesting mathematical model would extend the one-period, two-agent model of [14] to a multiperiod model in which the terminal distribution of the payoff is updated at each period conditionally on the appearance of the new weather predictions, and in which a third agent allows the nervous investor to unload the derivative he does not want to carry to maturity.
Stochastic control arguments (typically dynamic programming) should be brought to bear to solve such a model in the spirit of [14]. Such a relatively simple extension would produce a more realistic model of the weather market.
7.2 APPLICATION II: RAINFALL OPTIONS

In this section we discuss a variation on the model introduced in Section 4.1 of Chapter 4 and used above in the case of temperature options. In order to illustrate the versatility of the indifference pricing paradigm, we consider the case of a class of weather derivatives that are not exchange traded: options on rainfall. We report on a case study performed in collaboration with Pavel Diko and published in [43]. Precipitation data are not as readily available as temperature data, but they are far more accessible and reliable than wind and snowfall data. As a result, more precipitation contracts have been traded. Pricing of precipitation derivatives poses several challenges that have not been completely resolved. Precipitation is a very localized phenomenon, and most hydrological models do not offer the spatial resolution needed for the analysis of the amount of rain falling at a given station, typically a bucket in the backyard of an accredited meteorologist. Theoretical difficulties lie in the fact that, despite the existence of intensive hydrological research, it is not straightforward to develop a tractable mathematical model for rainfall at a specific location. In [43], Carmona and Diko propose a variant of
a standard hydrological model due to Ignacio Rodriguez-Iturbe. They use simulations to argue that their model is better suited to pricing issues, and they show how maximum likelihood can be used to fit such a model to real data. We use this model for the purpose of the present discussion. Once we settle on a model for precipitation, we develop a pricing methodology that can incorporate the idiosyncrasies of the underlying precipitation process. Since precipitation cannot be traded directly, the market model is most likely to be incomplete, and we use the utility indifference pricing paradigm to price options.
7.2.1 Precipitation Data

Data serving as underlying for precipitation contracts come from meteorological stations. Although made at discrete time intervals, the observations come close enough to continuous observations of the rainfall intensity. A period of rainfall, when its intensity stays constant, can be described by a pair $(\xi, \beta)$ of real numbers. Here $\beta > 0$ is the length of time for which the rainfall intensity stays constant and equal to $\xi > 0$. Let us denote by $M$ the sequence $\{(\xi_i, \beta_{i+1})\}_{i=1,2,3,\ldots,n}$ of pairs describing the consecutive periods of constant rainfall intensity. The sequence $M$ comprises the statistical data for which we want to formulate a parsimonious yet well-fitting probabilistic model.

Precipitation has been the subject of intensive research for years. Several types of models have been developed [206], which can be divided into four categories: meteorological models that seek to capture the dynamics of large scale atmospheric processes controlling precipitation [185]; multi-scale models that use multifractal cascades to describe rainfall [112, 113]; statistical models that use purely statistical techniques to fit the rainfall data to well known distribution types with little emphasis on underlying physical processes [267]; and finally, stochastic process-based models that try to describe rainfall behavior by a small set of physically meaningful parameters. The latter were introduced by LeCam in 1961 [170] and further developed by Rodriguez-Iturbe et al. [233, 234].
7.2.2 Modeling Rainfall Dynamics

A version of the Bartlett-Lewis-Poisson cluster (BLPC) process model of Rodriguez-Iturbe, Cox, and Isham [233] forms the basis of the model proposed by Carmona and Diko in [43]. In this model, rainfall is assumed to be composed of storms that are, in turn, composed of rainfall cells. The storms arrive according to a Poisson process; within each storm, cells arrive according to another Poisson process, and the duration of the activity of the storm is random. Each cell has random duration and random depth. Storms and rain cells can overlap. If two or more cells overlap, their depths add up. Exponential distributions for both the storm and cell durations are used in practice ([54, 233]). Although the BLPC model is physically intuitive, parameter fitting is an issue. We follow [43] in using a variant of the Rodriguez-Iturbe model for which maximum likelihood estimation (MLE) is possible. In [43] the authors propose to model rainfall intensity within a single storm by a homogeneous jump
Markov process $\{Y_t\}$ whose infinitesimal generator $G$ is given by
\[
[G\phi](x) = \int_{\mathbb{R}_+} (\phi(y) - \phi(x))\,A(x, y)\,\nu(dy), \tag{7.11}
\]
where
\[
A(x, y) = \lambda_1 \lambda_u e^{-\lambda_u (y - x)}\,1_{(x,\infty)}(y) + \bar\lambda_2(x)\,\lambda_d e^{-\lambda_d (x - y)}\,1_{(0,x)}(y) + \bar\lambda_2(x)\,e^{-\lambda_d x}\,1_{\{0\}}(y)
\]
with
\[
\bar\lambda_2(\xi) = 1_{(0,\infty)}(\xi)\,\big(\lambda_2^{(I)} + \lambda_2^{(II)} \kappa(\xi)\big), \tag{7.12}
\]
where $\nu$ is Lebesgue's measure on $[0, \infty)$ plus an added unit point mass at 0, and $\kappa(\xi)$ is defined on $\mathbb{R}_+$ and satisfies the following: $\kappa(x) = x$ for $0 \le x \le K$ (for some large $K$); it is bounded; it is three times differentiable; and $(\partial/\partial x)\kappa(x) > 0$. The model is determined by the parameter vector $(\lambda_1, \lambda_2^{(I)}, \lambda_2^{(II)}, \lambda_u, \lambda_d) \in \mathbb{R}_+^5$. Furthermore, we replace the constant cell arrival rate $\lambda_1$ by a continuous-time, two-state $\{0, \lambda_1\}$ Markov chain $\bar\lambda_1$ with transition rates $q_d$ and $q_p$. MLE for $\lambda_1$, $q_d$, $q_p$ for such a model has been done in [3]. Assuming one observes the interarrival times $U = (U_1, U_2, \ldots, U_n)$ of a Cox process $N$ directed by a switching Markov process $M$, the likelihood function can be shown to have the following form:
\[
L(U, \lambda_1, q_d, q_p) = \prod_{i=1}^{n} f(U_i, \lambda_1, q_d, q_p),
\]
where
\[
f(x, \lambda_1, q_d, q_p) = \frac{1}{r_2 - r_1}\Big[ (q_d + q_p - r_1)\,r_1 e^{-r_1 x} - (q_d + q_p - r_2)\,r_2 e^{-r_2 x} \Big],
\]
\[
r_1 = \frac{1}{2}(q_d + q_p + \lambda_1) - \sqrt{\frac{1}{4}(q_d + q_p + \lambda_1)^2 - \lambda_1 q_d},
\]
\[
r_2 = \frac{1}{2}(q_d + q_p + \lambda_1) + \sqrt{\frac{1}{4}(q_d + q_p + \lambda_1)^2 - \lambda_1 q_d}.
\]
Once the parameters of the process $\bar\lambda_1$ are estimated, the MLE method for estimating the remaining parameters can be used.
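The likelihood above is easy to evaluate numerically. The sketch below codes the interarrival density $f$ and the log-likelihood exactly as stated in the text; the function names, the sample of interarrival times, and the parameter values are hypothetical. It is not the estimation procedure of [3], only an evaluation of the objective that such a procedure would maximize.

```python
import math


def decay_rates(lam1, qd, qp):
    """The two rates r1 < r2 entering the interarrival density."""
    s = qd + qp + lam1
    root = math.sqrt(0.25 * s * s - lam1 * qd)
    return 0.5 * s - root, 0.5 * s + root


def interarrival_density(x, lam1, qd, qp):
    """Density f(x, lam1, qd, qp): a mixture of two exponentials.
    Undefined in the degenerate case r1 == r2 (e.g. qp == 0 and lam1 == qd)."""
    r1, r2 = decay_rates(lam1, qd, qp)
    q = qd + qp
    return ((q - r1) * r1 * math.exp(-r1 * x)
            - (q - r2) * r2 * math.exp(-r2 * x)) / (r2 - r1)


def log_likelihood(interarrivals, lam1, qd, qp):
    """Logarithm of L(U, lam1, qd, qp), the product of the densities f(U_i)."""
    return sum(math.log(interarrival_density(u, lam1, qd, qp))
               for u in interarrivals)


# Hypothetical observed interarrival times (arbitrary time unit).
sample = [0.2, 1.5, 0.1, 3.0, 0.4, 0.3, 2.2]

# Compare two candidate parameter sets; the larger value fits the sample better.
print(log_likelihood(sample, lam1=2.0, qd=0.5, qp=1.0))
print(log_likelihood(sample, lam1=1.0, qd=1.0, qp=0.1))
```

Two sanity checks are worth keeping in mind: the density integrates to one, and when $q_p = 0$ it reduces to the exponential density with rate $\lambda_1$. A maximum likelihood estimate can then be obtained by passing the negative of `log_likelihood` to any numerical optimizer.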
7.2.3 Security Price as Random Evolution

As in our discussion of temperature options, we assume the existence of a liquidly traded asset $S$ whose dynamics are given by a stochastic differential equation of the form
\[
dS_t = S_t\,\mu(Y_t)\,dt + S_t\,\sigma(Y_t)\,dW_t , \tag{7.13}
\]
where the process $Y = \{Y_t\}$ is a continuous-time Markov process in $\mathbb{R}^n$ independent of the Wiener process $W$. Models of this form are common in financial applications: regime switching models and stochastic volatility models are examples. The pair $(S_t, Y_t)$ forms a Markov process that is known in the literature as a random evolution. As in Section 4.1 of Chapter 4, we assume that the coefficients $\mu$ and $\sigma$ are bounded so that a strong solution exists and is unique.
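A simple instance of such a random evolution is a regime switching model in which $Y$ takes only two values. The sketch below simulates one path of $(S_t, Y_t)$ under that assumption; the two-state chain, its switching rates, and the values of $\mu$ and $\sigma$ in each regime are hypothetical and are not the rainfall model of Subsection 7.2.2. Over each small time step $Y$ is frozen, the price is updated with the exact lognormal step, and a regime switch occurs with probability approximately equal to the switching rate times the step.

```python
import math
import random


def simulate_random_evolution(s0, y0, mu, sigma, switch_rate,
                              horizon, n_steps, rng):
    """Simulate (S_t, Y_t) on a grid, with Y a two-state chain in {0, 1}."""
    dt = horizon / n_steps
    s, y = s0, y0
    s_path, y_path = [s], [y]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # Brownian increment, independent of Y
        s *= math.exp((mu[y] - 0.5 * sigma[y] ** 2) * dt
                      + sigma[y] * math.sqrt(dt) * z)
        if rng.random() < switch_rate[y] * dt:  # first-order switching rule
            y = 1 - y
        s_path.append(s)
        y_path.append(y)
    return s_path, y_path


rng = random.Random(1)
s_path, y_path = simulate_random_evolution(
    s0=100.0, y0=0,
    mu=[0.10, 0.02],         # hypothetical drift in each regime
    sigma=[0.20, 0.30],      # hypothetical volatility in each regime
    switch_rate=[2.0, 1.0],  # hypothetical exit rate of each regime (per year)
    horizon=1.0, n_steps=1000, rng=rng)

print(s_path[-1], sum(y_path) / len(y_path))
```

The second printed number is the fraction of grid points spent in regime 1 along this particular path.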
7.2.4 Utility Maximization as a Stochastic Control Problem

We now formalize the problem described in the introduction. We are given a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$, where $\mathcal{F}_t$ is the sigma field $\sigma\{(S_s, Y_s) : 0 \le s \le t\}$ augmented by the null sets of $\mathcal{F} = \mathcal{F}_T$. Trading in $S_t$ is allowed, and we assume for simplicity that the interest rate of the bank account is zero. The economic agent possesses initial wealth $x$ and uses a self-financing trading strategy $\phi$, the amount of money invested in the risky asset. She chooses the strategy in order to maximize the utility $U$ of her terminal wealth. For any strategy, the dynamics of the wealth process are given by
\[
dX_t = \phi_t\,(\mu(Y_t)\,dt + \sigma(Y_t)\,dW_t), \tag{7.14}
\]
and the objective is to maximize
\[
E\{U(X_T)\}. \tag{7.15}
\]
Restricting the optimization problem (7.15) to Markovian controls $\pi$ leads to the HJB equation. The generator $A^\pi$ of the (controlled) random evolution $(X, Y)$ reads
\[
A^\pi V(t, x, y) = \frac{\partial V}{\partial t} + \pi \mu(y) \frac{\partial V}{\partial x} + \frac{1}{2} \pi^2 \sigma^2(y) \frac{\partial^2 V}{\partial x^2} + G V(t, x, y), \tag{7.16}
\]
where $G$ is the infinitesimal generator of the process $Y$. Hence the HJB equation for the value function $V$ of the optimization problem (7.15) is
\[
0 = \sup_u A^u V = \frac{\partial V}{\partial t} + G V + \sup_u \Big[ u \mu(y) \frac{\partial V}{\partial x} + \frac{1}{2} u^2 \sigma^2(y) \frac{\partial^2 V}{\partial x^2} \Big]. \tag{7.17}
\]
As in Section 4.1 of Chapter 4 we perform the maximization with respect to $\pi$ explicitly to get the optimal control $\pi^*$:
\[
\pi^* = - \frac{\mu(y)}{\sigma^2(y)}\,\frac{\partial V/\partial x}{\partial^2 V/\partial x^2}, \tag{7.18}
\]
and the HJB equation becomes
\[
0 = \frac{\partial V}{\partial t} + G V - \frac{1}{2} \frac{\mu^2(y)}{\sigma^2(y)}\,\frac{(\partial V/\partial x)^2}{\partial^2 V/\partial x^2}, \tag{7.19}
\]
with terminal condition $V(T, x, y) = U(x)$. Using the special form of the process $Y$ described earlier we get
\[
0 = \frac{\partial V}{\partial t} + \lambda(y) \int_{\mathbb{R}} [V(t, x, z) - V(t, x, y)]\,\Pi(y, dz) - \frac{1}{2} \frac{\mu^2(y)}{\sigma^2(y)}\,\frac{(\partial V/\partial x)^2}{\partial^2 V/\partial x^2}, \tag{7.20}
\]
where $\lambda$ is the jump rate function and $\Pi$ is the jump transition kernel for the process $Y$.

The Case of Exponential Utility

In this subsection we solve the HJB equation (7.19) in the case of the exponential utility function $U(x) = -e^{-\gamma x}$ with $\gamma > 0$. We can get an explicit solution in this
case by linearizing the HJB equation by a Hopf-Cole transformation. We hypothesize the form of the solution as $V(t, x, y) = -e^{-\gamma x} F(t, y)$. Substituting in (7.19), we get
\[
\frac{1}{2} \frac{\mu^2(y)}{\sigma^2(y)}\,F(t, y) = \frac{\partial F}{\partial t}(t, y) + G F, \tag{7.21}
\]
with the terminal condition $F(T, y) = 1$. The solution of this equation is given by the so-called Feynman-Kac formula. Indeed, its right-hand side is the backward evolution operator of $Y$ (the justification will be given below). Such a formula already appeared in [42], where the zero-order term found in the exponential was called the traded risk premium. The solution is
\[
F(t, y) = E\Big\{ e^{-\int_t^T \frac{1}{2} \frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\,ds} \,\Big|\, Y_t = y \Big\}.
\]
Substituting out, we see that the candidate for the expression of the value function solving the HJB equation (7.19) with exponential utility is
\[
V(t, x, y) = -e^{-\gamma x}\,E\Big\{ e^{-\int_t^T \frac{1}{2} \frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\,ds} \,\Big|\, Y_t = y \Big\},
\]
the corresponding optimal trading strategy being given by the time varying Sharpe ratio
\[
\phi_t = \frac{\mu(Y_{t-})}{\gamma\,\sigma^2(Y_{t-})},
\]
where $Y_{t-} = \lim_{h \downarrow 0} Y_{t-h}$ is the left-hand limit assuring the predictability of the trading strategy. The proof is completed by way of a verification theorem, which is plain under the assumption $\sigma(y) > \epsilon > 0$ for all $y$.

The Case of Power Utility

In the case of the power utility function $U(x) = x^\gamma/\gamma$ with $\gamma < 1$, we search for a value function in the form $V(t, x, y) = (x^\gamma/\gamma) F(t, y)$. Since $(\partial V/\partial x)^2 / (\partial^2 V/\partial x^2) = x^\gamma F/(\gamma - 1)$ for such a $V$, the linearized equation becomes
\[
-\frac{\gamma}{2(1 - \gamma)} \frac{\mu^2(y)}{\sigma^2(y)}\,F(t, y) = \frac{\partial F}{\partial t}(t, y) + G F, \tag{7.22}
\]
with terminal condition $F(T, y) = 1$. Again, we derive a Feynman-Kac formula for the solution
\[
F(t, y) = E\Big\{ e^{\int_t^T \frac{\gamma}{2(1 - \gamma)} \frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\,ds} \,\Big|\, Y_t = y \Big\},
\]
and the value function reads
\[
V(t, x, y) = \frac{x^\gamma}{\gamma}\,E\Big\{ e^{\int_t^T \frac{\gamma}{2(1 - \gamma)} \frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\,ds} \,\Big|\, Y_t = y \Big\},
\]
with the corresponding optimal strategy
\[
\phi_t = \frac{\mu(Y_{t-})}{\sigma^2(Y_{t-})(1 - \gamma)}\,X_t .
\]
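The Feynman-Kac representation of $F$ can be checked numerically in a toy setting. The sketch below assumes, purely for illustration, that $Y$ is a two-state continuous-time Markov chain with generator matrix `Q`, so that $GF$ becomes a matrix-vector product; this is not the rainfall process of Subsection 7.2.2, and all parameter values are hypothetical. The function $F$ of the exponential utility case is computed in two ways: by integrating the linear equation (7.21) backward from the terminal condition with a Runge-Kutta scheme, and by Monte Carlo simulation of the conditional expectation.

```python
import math
import random


def feynman_kac_ode(Q, c, horizon, n_steps=2000):
    """Solve dF/dtau = (Q - diag(c)) F, F(0) = 1, by classical RK4.
    Here tau = T - t, so the result is F(t, .) with T - t = horizon."""
    n = len(c)

    def rhs(f):
        return [sum(Q[i][j] * f[j] for j in range(n)) - c[i] * f[i]
                for i in range(n)]

    f = [1.0] * n
    h = horizon / n_steps
    for _ in range(n_steps):
        k1 = rhs(f)
        k2 = rhs([f[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = rhs([f[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = rhs([f[i] + h * k3[i] for i in range(n)])
        f = [f[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
             for i in range(n)]
    return f


def feynman_kac_mc(Q, c, horizon, start, n_paths, rng):
    """Monte Carlo estimate of E{exp(-int_0^horizon c(Y_s) ds) | Y_0 = start}
    for a chain with exactly two states, 0 and 1."""
    total = 0.0
    for _ in range(n_paths):
        t, state, integral = 0.0, start, 0.0
        while True:
            rate = -Q[state][state]  # exit rate of the current state
            hold = rng.expovariate(rate) if rate > 0.0 else float("inf")
            if t + hold >= horizon:
                integral += c[state] * (horizon - t)
                break
            integral += c[state] * hold
            t += hold
            state = 1 - state  # two states only
        total += math.exp(-integral)
    return total / n_paths


# Hypothetical regimes: drift and volatility of the traded asset in each state.
mu = [0.10, 0.02]
sigma = [0.20, 0.30]
c = [0.5 * (m / s) ** 2 for m, s in zip(mu, sigma)]  # zero-order term of (7.21)
Q = [[-2.0, 2.0], [1.0, -1.0]]                       # hypothetical generator of Y

F_ode = feynman_kac_ode(Q, c, horizon=1.0)
F_mc = feynman_kac_mc(Q, c, horizon=1.0, start=0,
                      n_paths=20000, rng=random.Random(2))
print(F_ode, F_mc)

# Exponential utility: value function and optimal amount invested in state 0.
gamma, x = 2.0, 1.0
value = -math.exp(-gamma * x) * F_ode[0]
phi = mu[0] / (gamma * sigma[0] ** 2)
print(value, phi)
```

The two estimates of $F$ should agree up to the Monte Carlo error. The same two routines apply to the power utility case after replacing the rate `c` by $-\gamma\mu^2/(2(1-\gamma)\sigma^2)$.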
7.2.5 Utility Maximization with a Derivative

Indifference pricing requires solving the above utility maximization problem with the payoff of the derivative as part of the cash flow. The payoffs of rainfall derivatives are usually of the form $\xi = \big(\int_{t'}^{t''} Y_t\,dt - K\big)^+$ when the underlying index is the amount of rain during a given period, or $\xi = \big(\int_{t'}^{t''} 1_{\{Y_t > \epsilon\}}\,dt - K\big)^+$ when the derivative is written on the amount of time it rains (does not rain) during a given period. Here $\epsilon$ is the minimal precipitation level that constitutes a "rainy day." For the sake of simplicity, we assume that $K = 0$. See [43] for the discussion of the steps that need to be taken to include the case $K > 0$. So we assume that the payoff is of the form $\xi = \int_{t'}^{t''} h(Y_s)\,ds$ with $0 \le t' \le t'' \le T$, where the function $h$ could be $h(y) = y$, $h(y) = 1_{(\epsilon,\infty)}(y)$, or any measurable function $h \ge 0$ with polynomial growth. Also, we limit ourselves to the case of the buyer who is maximizing her terminal utility $E\{U(X_T + \xi)\}$. It turns out that the solution $(\phi^*, V^*)$ of the buyer problem is the same as the solution $(\hat\phi^*, \hat V^*)$ of the following random endowment optimal investment problem:
\[
\sup_{\hat\phi} E\{U(\hat X_T)\} \tag{7.23}
\]
for the dynamics $d\hat X_t = g(t, Y_t)\,dt + \hat\phi_t\,(\mu(Y_t)\,dt + \sigma(Y_t)\,dW_t)$, where $g(t, y) = 1_{(t',t'')}(t)\,h(y)$. The HJB equation for problem (7.23) is given by
\[
0 = \frac{\partial V}{\partial t} + g(t, y) \frac{\partial V}{\partial x} + G V + \sup_u \Big[ u \mu(y) \frac{\partial V}{\partial x} + \frac{1}{2} u^2 \sigma^2(y) \frac{\partial^2 V}{\partial x^2} \Big]. \tag{7.24}
\]
The optimal strategy $\pi^*$ is still given by (7.18) and the HJB equation becomes
\[
0 = \frac{\partial V}{\partial t} + g(t, y) \frac{\partial V}{\partial x} + \lambda(y) \int_{\mathbb{R}} [V(t, x, z) - V(t, x, y)]\,\Pi(y, dz) - \frac{1}{2} \frac{\mu^2(y)}{\sigma^2(y)}\,\frac{(\partial V/\partial x)^2}{\partial^2 V/\partial x^2}, \tag{7.25}
\]
with the terminal condition $V(T, x, y) = U(x)$.

The Case of Exponential Utility

As always, the case of the exponential utility is simpler because of the separation of the variables in the equation. Searching for a value function of the form $V(t, x, y) = -e^{-\gamma x} F(t, y)$, we find that $F$ has to satisfy
\[
\Big( \gamma g(t, y) + \frac{1}{2} \frac{\mu^2(y)}{\sigma^2(y)} \Big) F(t, y) = \frac{\partial F}{\partial t}(t, y) + G F, \tag{7.26}
\]
with terminal condition $F(T, y) = 1$. The solution is given by the Feynman-Kac formula:
\[
F(t, y) = E\Big\{ e^{-\int_t^T \big( \gamma g(s, Y_s) + \frac{1}{2} \frac{\mu^2(Y_s)}{\sigma^2(Y_s)} \big)\,ds} \,\Big|\, Y_t = y \Big\},
\]
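and the value function is
$$V(t,x,y) = -e^{-\gamma x}\, E\left\{ e^{-\int_t^T \left[\gamma\,g(s,Y_s) + \frac{1}{2}\frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\right]ds} \,\Big|\, Y_t = y \right\},$$
with the corresponding optimal trading strategy $\phi_t = \mu(y)/(\sigma^2(y)\gamma)$.

The Case of Power Utility
The situation is more complicated in the case of power utility, and one can only derive upper and lower bounds for the value function. See [43] for details.

7.2.6 Indifference Prices
Proceeding as in Section 4.1 of Chapter 4, one finds an explicit formula for the indifference price of the derivative in the case of the exponential utility function. It reads:
$$p = \frac{1}{\gamma}\ln\frac{E\left\{ e^{-\int_0^T \frac{1}{2}\frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\,ds} \,\Big|\, Y_0 = y \right\}}{E\left\{ e^{-\int_0^T \left[\gamma\,g(s,Y_s) + \frac{1}{2}\frac{\mu^2(Y_s)}{\sigma^2(Y_s)}\right]ds} \,\Big|\, Y_0 = y \right\}}.$$
As usual, the indifference price is independent of the initial wealth. In the case of power utility, we can only derive lower and upper bounds for $p$. These bounds depend on the initial wealth $x$ and are expressed in terms of the bound $V_L(0,x,y)$ on the value function, the Feynman-Kac expectation $E\{ e^{\int_0^T \frac{\gamma}{2(1-\gamma)}\frac{\mu^2(Y_s)}{\sigma^2(Y_s)}ds} \mid Y_0 = y \}$, the relative endowment $\int_0^T g(t,Y_t)\,dt / x$, and the density $dQ^0/dP$; see [43] for the explicit expressions.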
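The exponential-utility price formula of Section 7.2.6 is a ratio of two Feynman-Kac expectations, which lends itself to Monte Carlo. The following is a minimal sketch under stated assumptions, not the computation of [43]: the rainfall process is replaced by a hypothetical toy process that toggles between a dry state and an exponentially distributed wet intensity, the payoff window is taken to be the whole interval $[0,T]$ so that $g(t,y) = h(y)$, and all function names and parameter values are illustrative.

```python
import math
import random


def simulate_rain_path(rng, T, n_steps, jump_rate, mean_intensity):
    """Toy stand-in for Y (an assumption, not the fitted jump Markov model):
    the state toggles between dry (0) and an exponentially distributed intensity."""
    dt = T / n_steps
    y, path = 0.0, []
    for _ in range(n_steps):
        path.append(y)
        if rng.random() < jump_rate * dt:
            y = 0.0 if y > 0.0 else rng.expovariate(1.0 / mean_intensity)
    return path, dt


def exp_indifference_price(gamma, mu, sigma, h, T=1.0, n_steps=200,
                           n_paths=1000, jump_rate=5.0, mean_intensity=2.0,
                           seed=0):
    """Buyer's price p = (1/gamma) * ln(E[exp(-A)] / E[exp(-gamma*xi - A)]),
    where A is the time integral of mu^2 / (2 sigma^2) along the path of Y
    and xi is the time integral of h(Y). Both expectations use the same paths."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_paths):
        path, dt = simulate_rain_path(rng, T, n_steps, jump_rate, mean_intensity)
        risk = sum(0.5 * (mu(y) / sigma(y)) ** 2 for y in path) * dt
        payoff = sum(h(y) for y in path) * dt
        num += math.exp(-risk)
        den += math.exp(-gamma * payoff - risk)
    return math.log(num / den) / gamma


# Illustrative drift of the form a*ln(eps + y) + b with constant volatility.
mu = lambda y: -0.05 * math.log(0.1 + y) + 0.1
sigma = lambda y: 0.2
price = exp_indifference_price(gamma=0.1, mu=mu, sigma=sigma, h=lambda y: y)
```

Using the same simulated paths in the numerator and the denominator keeps the ratio stable; in particular, a deterministic payoff is priced at its face value regardless of the drift and volatility functions.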
7.2.7 Numerical Example
Finally, for the sake of illustration, we reproduce the numerical example given in [43]. Note that, except for the highly publicized example of the three-year deal offered by Asterix's amusement park in Paris, not much public information is available on the terms and conditions of precipitation options and their prices. We consider a sample call option on the amount of rainfall during June and July of 2004, as recorded by the Bergen weather station in Norway, with a strike of $K$ mm and tick price 1 NOK, i.e., the seller of the contract pays the buyer one NOK for each millimeter of cumulative rainfall above $K$ mm during June and July 2004 in Bergen. The rainfall data were provided by the Norwegian Meteorological Institute in the form of high-frequency rainfall intensity records from the "pluviometer"-equipped weather station in Bergen for calendar year 2002. We fit the jump Markov model to the rainfall data. Then we identify a traded asset influenced by the amount of rainfall in Bergen in June and July. The results are reproduced in Table 7.1. For the tradable $S_t$ we chose the fourth quarter 2002 forward power contract traded on the NordPool electric power exchange of four Nordic
Table 7.1 Maximum likelihood parameter estimates for Bergen rainfall data.
Parameter
Value
λ1 qd qp ) λ(I 2 (II ) λ2 λu λd
78.62 0.846 5.028 0.000 2.819 0.012 0.011
day−1 day−1 day−1 day−1 mm− 1 daymm− 1 daymm− 1
Table 7.2 Parameter estimates for Q402 drift and volatility.
Parameter
Value
a b σ
−0.55 0.40 0.10 0.20
countries: Norway, Sweden, Finland, and Denmark. We calibrated the functional form of $\mu$ and $\sigma$ to the data, and we used Monte Carlo computations to compute indifference prices. We used a constant volatility function $\sigma(Y) \equiv \sigma$ and a drift of the form $\mu(y) = a\ln(\epsilon + y) + b$, where $\epsilon$ represents the cutoff level for no rainfall, and $a$ and $b$ are obtained by ordinary least-squares regression. The values of the parameters are given in Table 7.2. Table 7.3 gives the prices obtained by Monte Carlo computations of the Feynman-Kac expectations. In the last column, what we call the risk-neutral price is a price under a measure that makes the tradable price $S$ a martingale and preserves the historical statistics of the rainfall process $Y$.
7.3 APPLICATION III: COMMODITY DERIVATIVES
Commodity markets offer natural examples of models involving both nontradable and nonobservable indexes used directly or indirectly as underliers for derivative contracts. This last section is devoted to a brief introduction to practical problems lending themselves to analyses in the spirit of Chapter 4. It is patterned after several joint works with Mike Ludkovski [45,46]. For the sake of completeness, we remark that the indifference pricing paradigm has been proposed in [1] as a framework for power price formation. There, Markov-perfect equilibrium is used to analyze price formation for hydroelectric generation. However, this approach is far afield from the spirit of this book. The discussion of this last section is more in the spirit of the partially observed models of Chapter 4 and the limitations of indifference pricing in this setup.
Table 7.3 Bergen rainfall pricing results.

                            Utility indifference price
                  With power hedge     Without power hedge   Seller's hedging   Risk-neutral
Strike  Risk      Seller     Buyer     Seller     Buyer      discount           price
        aversion
200     0.1       232.68     33.54     235.20     36.51      99%                145.97
200     0.01      158.63     117.26    167.00     124.85     95%                145.21
200     0.001     133.62     129.96    140.99     137.15     95%                139.05
250     0.1       167.90     27.94     169.87     29.95      99%                98.98
250     0.01      111.48     68.15     122.32     75.69      91%                96.07
250     0.001     84.16      80.90     91.19      87.82      92%                89.49
300     0.1       174.13     11.31     177.71     12.72      98%                57.29
300     0.01      60.81      34.12     70.52      39.01      86%                50.58
300     0.001     43.82      41.43     49.52      46.84      88%                48.16
350     0.1       162.90     4.37      167.38     5.18       97%                26.21
350     0.01      24.82      14.68     28.79      17.15      86%                21.92
350     0.001     27.82      25.90     34.17      31.86      81%                32.99
400     0.1       116.06     1.89      120.36     2.31       96%                9.09
400     0.01      6.37       3.60      8.20       4.61       78%                5.99
400     0.001     6.44       6.15      8.04       7.68       80%                7.86
450     0.1       14.53      0.36      17.01      0.48       85%                1.65
450     0.01      2.90       1.25      4.30       1.81       68%                2.65
450     0.001     3.27       2.97      4.98       4.50       66%                4.73
500     0.1       0.38       0.07      0.54       0.10       71%                0.20
500     0.01      0.38       0.30      0.55       0.43       70%                0.48
500     0.001     0.21       0.21      0.30       0.29       72%                0.30
7.3.1 Illiquid Spot Commodities as Nontradable Assets
Examples of derivative contracts written on commodity spot prices abound. However, because of the physical nature of many of these commodities, most of the trading is done by means of forward contracts, and spot markets are merely used as buffers to accommodate unexpected fluctuations in demand or supply. This phenomenon is exacerbated in the case of electricity because of the lack of storage solutions. In order to illustrate why the setup of indifference pricing is well suited to many commodity contract valuations, we consider the example of a gas-fired power plant owner who chooses to mitigate the risk associated with rising gas prices and declining power prices by purchasing a string of spark spread options. So for each day $T$ in a time interval $[T_1, T_2]$, the plant owner buys a (European) call that pays $S^{(e)}(T) - H_{\mathrm{eff}}\,S^{(g)}(T) - K$ whenever $S^{(e)}(T) - H_{\mathrm{eff}}\,S^{(g)}(T) > K$. In our idealized example, $S^{(e)}(T)$ and $S^{(g)}(T)$ stand for the spot prices of power and natural gas at nodes and interchanges closest to the plant, while $H_{\mathrm{eff}}$ denotes the heat
rate measuring the efficiency of the plant in MBtu/MWh. $H_{\mathrm{eff}}$ gives the number of MBtu of gas needed to produce one MWh of electricity. The difference $S^{(e)}(T) - H_{\mathrm{eff}}\,S^{(g)}(T)$ is called the spot spark spread at the given location. According to the classical theory, in order to hedge his position, the seller of such a string of spread options should take positions in the power and gas markets. However, these hedges may be difficult to implement, especially if the nodes and interchanges where he plans to set up the hedges are not liquid. A natural alternative frequently used by practitioners is to use forward contracts with the nearest maturities and delivery locations as proxies for the spot prices. Let us denote them by $F^{(e)}(T, T_j)$ and $F^{(g)}(T, T_j)$ for the sake of definiteness. The latter are better behaved in the sense that they exhibit fewer spikes, their liquidities can be reasonable, and their dynamics are very similar to the spot dynamics with a strong correlation. The reader interested in mathematical models for the dynamics of electricity spot prices is referred to Barlow's intriguing model [11], and the recent paper [104] of Geman and Roncoroni. Despite its obvious importance, this problem has not been addressed in the academic literature, and certainly not with the methods presented in this book. However, we are clearly in the situation studied in Section 4.1 of Chapter 4, the role of the tradable $S_T$ being played by the forward spark spread $F^{(e)}(T, T_j) - H_{\mathrm{eff}}\,F^{(g)}(T, T_j)$ and the role of the nontradable $Y(T)$ being played by the spot spread $S^{(e)}(T) - H_{\mathrm{eff}}\,S^{(g)}(T)$. The only difference of any consequence is the lack of natural diffusion equations for the stochastic dynamics of spark spreads. Reasonable approximations have been proposed, and they make it possible to use the results of Section 4.1 of Chapter 4 as derived there.
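The payoff of the string of spark spread calls described above can be written in a few lines. This is only an illustrative sketch: the function names and the sample numbers are hypothetical, with power quoted per MWh, gas per MBtu, and the heat rate in MBtu/MWh.

```python
def spark_spread_call(power_spot, gas_spot, heat_rate, strike):
    """Payoff of one daily call on the spot spark spread S_e - H_eff * S_g."""
    return max(power_spot - heat_rate * gas_spot - strike, 0.0)


def spark_spread_strip(power_spots, gas_spots, heat_rate, strike):
    """Total payoff of a string of daily calls, one per day in [T1, T2]."""
    return sum(spark_spread_call(e, g, heat_rate, strike)
               for e, g in zip(power_spots, gas_spots))


# Two hypothetical days: the first ends in the money, the second does not.
total = spark_spread_strip([50.0, 30.0], [4.0, 4.0], heat_rate=10.0, strike=5.0)
```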
However, a more realistic approach is to model the joint dynamics of the four prices by a four-dimensional diffusion process, and to use a multivariate extension of the results of Section 4.1 of Chapter 4 in which the nontradable is the bivariate random vector formed by $S^{(e)}(T)$ and $S^{(g)}(T)$, and the tradable is the bivariate random vector formed by $F^{(e)}(T, T_j)$ and $F^{(g)}(T, T_j)$. The interested reader is referred to the review article of Carmona and Durrleman [44] for an extensive discussion of mathematical models and numerical methods used in practice in the pricing of spark spread options.

7.3.2 Convenience Yields as Nonobservable Factors
Next, we consider one of the most commonly cited models of partially observed factors in the technical literature on commodity markets.

A First Convenience Yield Model
For the purpose of illustration, we present the two-factor convenience yield model originally proposed by Gibson and Schwartz [108]. According to this model, the joint risk-neutral dynamics of the commodity spot price $S_t$ and the convenience yield $\delta_t$ are given by stochastic differential equations of the form
$$dS_t = (r_t - \delta_t)S_t\,dt + \sigma S_t\,dW_t^1, \tag{7.27}$$
$$d\delta_t = \kappa(\theta - \delta_t)\,dt + \gamma\,dW_t^2. \tag{7.28}$$
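To make the dynamics (7.27)-(7.28) concrete, here is a minimal Euler simulation sketch. It rests on assumptions that are not part of the text: the interest rate is taken constant, the two Brownian motions are given a correlation `rho`, the spot price is advanced through its logarithm, and the function name and parameter values are hypothetical.

```python
import math
import random


def simulate_gibson_schwartz(s0, delta0, r, kappa, theta, sigma, gamma, rho,
                             T, n_steps, seed=0):
    """Euler scheme for (log S_t, delta_t) under the risk-neutral dynamics.
    Returns the list of (S, delta) pairs on the time grid, including time 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    x, delta = math.log(s0), delta0
    path = [(s0, delta0)]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Second Gaussian increment, correlated with the first one.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        x += (r - delta - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z1
        delta += kappa * (theta - delta) * dt + gamma * math.sqrt(dt) * z2
        path.append((math.exp(x), delta))
    return path


path = simulate_gibson_schwartz(s0=30.0, delta0=0.05, r=0.03, kappa=1.5,
                                theta=0.05, sigma=0.3, gamma=0.2, rho=0.5,
                                T=1.0, n_steps=250)
```

Working with $\log S_t$ keeps the simulated price positive, which a direct Euler step on $S_t$ would not guarantee.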
Figure 7.1 Expected spot price with respect to risk-neutral dynamics. Parameter values are from [248]. Today's price is set at $30. [Plot of price in $/bbl against years to maturity, with one curve for partial observations and one for full information.]
At this stage, a few important remarks are in order.
• For all practical purposes this model is complete if we are only interested in pricing (and hedging) derivatives written on the spot price $S_t$. Indeed, a Girsanov change of measure can remove the drift from the dynamics of $S_t$, and for all practical pricing purposes the model becomes complete.
• Replacing the spot price $S_t$ by its logarithm $X_t = \log S_t$ leads to a simple linear equation, and the model for $(X_t, \delta_t)$ appears as a simple affine Gaussian model. In particular, if we assume that it is fully observed (i.e., the convenience yield can be observed), explicit formulae give most of the derivative prices, including the prices of the forward contracts.
• Closed form formulae are not available if we assume that the system is only partially observed because of the presence of the convenience yield. However, Kalman filtering can be used to replace the dynamics of $\delta_t$ by the dynamics of its conditional expectation and variance, leading to a three-dimensional fully observed model, which we consider later in this section.
• As a matter of illustration we computed the forward curves in two of the cases discussed in the bullet points above: assuming first that the convenience yield can be observed, and then assuming that it cannot be observed, in which case we use a Kalman filter to estimate the convenience yield. We shall give details of the filtering procedure later in this section. For the time being, we note that the curves are very different for short maturities, after which they seem to differ by a parallel shift. The spread between the two curves can be interpreted as a premium for the lack of observability of the convenience yield.

As explained earlier, the Gibson-Schwartz model is an exponential affine model (see, for example, [47]) and, assuming that the convenience yield is observed, forward contract prices are given by exponentials of linear combinations of $\log S_t$ and $\delta_t$:
$$F(t,T) = S_t\, e^{\int_t^T r_s\,ds}\, e^{B(t,T)\delta_t + A(t,T)}, \tag{7.29}$$
where $A(t,T)$ and $B(t,T)$ are given by
$$B(t,T) = \frac{e^{-\kappa(T-t)} - 1}{\kappa},$$
$$A(t,T) = \frac{\kappa\theta + \rho\sigma\gamma}{\kappa^2}\left(1 - e^{-\kappa(T-t)} - \kappa(T-t)\right) + \frac{\gamma^2}{4\kappa^3}\left(2\kappa(T-t) - 3 + 4e^{-\kappa(T-t)} - e^{-2\kappa(T-t)}\right),$$
and $\rho$ denotes the correlation between $W^1$ and $W^2$. Note that $B(T,T) = A(T,T) = 0$, so that $F(T,T) = S_T$. For later purposes we note that the risk-neutral dynamics of the forward contract prices are of the form
$$dF(t,T) = F(t,T)\left[r_t\,dt + \sigma\,dW_t^1 + \gamma\,\frac{e^{-\kappa(T-t)} - 1}{\kappa}\,dW_t^2\right]. \tag{7.30}$$
The above formulas can easily be inverted and, given a fixed time to maturity, say 3 months or 6 months, on any given day t, one can recover the convenience yield δt from the spot price St and the forward contract price F (t, t + τ ). So, using (7.29), we can calculate implied convenience yields from the observed futures prices. Numerical computations show that each futures contract carries its own source of risk as evidenced by the sharp spikes in Figure 7.2. Moreover, as Figure 7.3 shows, there is a further inconsistency between the δt ’s implied from futures contracts of different maturities. This remark is screaming for the introduction of a term structure of convenience yield as introduced for example in [29].
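The inversion described above can be sketched as follows. The code assumes a constant interest rate $r$, writes $\tau = T - t$ for the time to maturity, and uses the coefficient $\gamma^2/(4\kappa^3)$ in $A$, which is what one obtains by computing $E[S_T]$ under (7.27)-(7.28) with correlation $\rho$ between the two Brownian motions. All function names and parameter values are hypothetical.

```python
import math


def coef_b(tau, kappa):
    """Loading of the convenience yield in the log forward price."""
    return (math.exp(-kappa * tau) - 1.0) / kappa


def coef_a(tau, kappa, theta, sigma, gamma, rho):
    """Deterministic term of the log forward price."""
    e1 = math.exp(-kappa * tau)
    e2 = math.exp(-2.0 * kappa * tau)
    first = (kappa * theta + rho * sigma * gamma) / kappa ** 2
    second = gamma ** 2 / (4.0 * kappa ** 3)
    return (first * (1.0 - e1 - kappa * tau)
            + second * (2.0 * kappa * tau - 3.0 + 4.0 * e1 - e2))


def forward_price(s, delta, r, tau, kappa, theta, sigma, gamma, rho):
    """Forward price as in (7.29) with a constant rate r."""
    a = coef_a(tau, kappa, theta, sigma, gamma, rho)
    return s * math.exp(r * tau + coef_b(tau, kappa) * delta + a)


def implied_convenience_yield(s, fwd, r, tau, kappa, theta, sigma, gamma, rho):
    """Solve (7.29) for delta given the spot and one forward price."""
    a = coef_a(tau, kappa, theta, sigma, gamma, rho)
    return (math.log(fwd / s) - r * tau - a) / coef_b(tau, kappa)


params = dict(kappa=1.5, theta=0.05, sigma=0.3, gamma=0.2, rho=0.5)
fwd = forward_price(30.0, 0.10, 0.03, 0.25, **params)
delta_implied = implied_convenience_yield(30.0, fwd, 0.03, 0.25, **params)
```

Because the log forward price is affine in $\delta_t$, the inversion is a single division; applying it to forwards of two different maturities on the same day gives two implied yields whose difference is the quantity plotted in Figure 7.3.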
Figure 7.2 Implied convenience yield using a 3-month futures and the same parameters as in Figure 7.3. [Plot of the implied convenience yield against time, 1994 to 2002.]
Figure 7.3 Difference in implied convenience yields between 3- and 12-month futures. [Plot of the difference in implied convenience yield against time, 1994 to 2002.]
7.3.3 Convenience Yield Models Revisited
The above analysis was based on the computation of the implied convenience yield from (7.29). It shows that the basic model (7.27) is not consistent with the forward curve. Each futures contract appears to carry its own source of risk, as evidenced by the sharp spikes in Figure 7.2, which result from sudden uncorrelated movements in the futures and the spot, and by the differences in implied $\delta_t$'s evidenced by Figure 7.2 and Figure 7.3. This prompts us to consider a different model, hoping to capture the maturity-specific nature of convenience yield. First, we fix a maturity $T_0$, and we consider the forward price $F_t = F(t, T_0)$ instead of the spot price $S_t$. Second, we model historical dynamics instead of risk-neutral dynamics. We assume that
$$dF_t = (\mu_t - \delta_t)F_t\,dt + \sigma F_t\,dW_t^1,$$
$$d\delta_t = \kappa(\theta - \delta_t)\,dt + \sigma_\delta\,dW_t^2,$$
or, more generally,
$$d\delta_t = b(\delta_t, F_t)\,dt + \sigma_\delta(\delta_t, F_t)\,dW_t^2.$$
We also assume that
• the forward contract is tradable (and hence its price $F_t$ is observable);
• the (forward) convenience yield $\delta_t$ is not observable.
We consider a contingent claim $\xi$ written on a nontradable index. In this respect, the situation is quite similar to the diffusion model proposed in Chapter 4 for temperature options. The difference is that we now have a nonobservable factor, namely the convenience yield. Motivated by the discussion of [45,46] we propose the following example. Let us assume that the operator of the Princeton University power plant
heard rumors about the higher than expected likelihood of a warm summer, and that he wants to protect his operating budget from high gas prices for that summer. He enters into a contract locking the price of gas at the nearest exchange. Very few trades are recorded at this location, and for all practical purposes, we can assume that the index on which the contract is written is not traded. Instead of writing the payoff as a function of the price of gas at this location, we choose to write the payoff as a function of the price at a location where contracts are liquidly traded, and the difference in price between the two locations. This difference is usually called the basis. So we write the payoff in the form $\xi = \phi(S_T, B_T)$, where $B_T$ denotes the (observed) basis factor. So our goal is to compute the value function
$$V^\phi(t,s,x,\xi;T) = \sup_{\pi \in \mathcal{A}_t^T} E\left\{ \int_{-\infty}^{\infty} U\left(X_T^{x,\pi} + \phi(S_T, B_T)\right) dP_B \,\Big|\, S_t = s,\ X_t = x,\ \delta_t \sim \xi \right\}. \tag{7.31}$$
This formulation is motivated by a short note of Lasry and Lions, whose results we discuss in detail below. In our discussion we consider two specific cases:
• $B_T$ independent of $T$, $S$, and $\delta$.
• $B_T = \alpha\delta_T + \epsilon$.

Lasry-Lions Approach ($B$ independent)
In a short note, Lasry and Lions suggested a model for which they outlined without proofs the steps described in this section, namely,
• Using filtering theory (specifically Kushner and/or Zakai's equation), derive a dynamical equation in infinite dimensions, and replace the unobserved convenience yield $\delta_t$ by its conditional distribution;
• Work with the exponential utility function to capture the agent's risk preferences and define the corresponding value function $V(t,s,x,\rho)$;
• Use the dynamic programming principle to derive an infinite dimensional nonlinear PDE of the Hamilton-Jacobi-Bellman type;
• Postulate that $V(t,s,x,\rho) = -\exp(-\gamma[x + \psi(t,s,\rho)])$ and derive a PDE for $\psi$ that can be linearized by a Hopf-Cole transformation.
The various steps of this strategy can be implemented and, at least in an informal way, they lead to an expression of the form
$$\psi(t,s,\phi) = E\left\{ \frac{1}{\gamma}\log\int e^{-\gamma\phi(F_T,b)}\,dP_B(b) + \frac{1}{\gamma}\log\int \rho_T(x)\,dx \,\Big|\, F_t = s,\ \rho_t = \phi \right\}$$
for the solution.

Remark. The value function separates as a sum of two terms with clear intuitive interpretations.
• A first term giving the certainty equivalent of the payoff of the option
$$E\left\{ \frac{1}{\gamma}\log\int e^{-\gamma\phi(F_T,b)}\,dP_B(b) \,\Big|\, F_t = s,\ \rho_t = \phi \right\}.$$
• A second term representing the cost of partial observations
$$E\left\{ \frac{1}{\gamma}\log\int \rho_T(x)\,dx \,\Big|\, F_t = s,\ \rho_t = \phi \right\},$$
which is obviously negative.
Reflecting on indifference prices, one should note that as the cost of partial observation is independent of the claim, the indifference price $p^\phi$ is trivial in the sense that
$$p^\phi = \frac{1}{\gamma}\,E\left\{ \log\int e^{-\gamma\phi(F_T,b)}\,dP_B(b) \right\}.$$
In fact, if $\phi(F_T, B) = \phi(F_T) + B$, then necessarily the indifference price $p^\phi$ is of the form $E[\phi(F_T)]$ minus a constant. More generally, and as we already noted in our discussion of Chapter 4, any two-factor $(F_t, \delta_t)$ stochastic drift model gives trivial indifference prices to claims on $F_t$ if we use exponential utility.

Generalizations
As explained in [45, 46], it is possible to extend some of the above results to cases for which the basis $B$ depends upon the convenience yield, and to more general dynamics of the forward prices. In particular, we studied numerically non-Gaussian diffusion models, such as the CEV model
$$dF_t = F_t(r - \delta_t)\,dt + \sigma F_t^{1+\beta}\,dW_t.$$
In this case the exponent $\psi^\phi$ reads
$$\psi^\phi(t,s,\phi) = \frac{1}{\gamma}\,E\left\{ \log\int\!\!\int_{\mathbb{R}} e^{-\gamma\phi(F_T,\,a\delta+\epsilon)}\,dP(\epsilon)\,\rho_T(\delta)\,d\delta \,\Big|\, F_t = s,\ \rho_t \sim \phi \right\}.$$
The difficulties come from the fact that $F_T$ and $B$ do not separate, and we cannot use Kalman filtering. For the sake of illustration we reproduce in Figure 7.4 numerical
Figure 7.4 Varying the initial value of $\delta_t$ (left). Varying the mean-reversion speed $\kappa$ (right). [Two plots against time in years, 0 to 6. Left panel: curves for delta0 = −0.06, −0.03, 0.02, 0.07. Right panel: curves for k = 1.1, 1.5, 2.]
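results obtained with particle filters (see, for example, [55]) for the computation of the cost of partial observation
$$\frac{1}{\gamma}\,\tilde E\left\{ \log\int \rho_T(\delta)\,d\delta - \log\int e^{-\gamma a\delta}\rho_T(\delta)\,d\delta \right\}.$$
For this computation we used $\gamma = 0.1$, $a = 10$, 10,000 simulations, and 500 particles.

Remark. As a final remark we point out that in the special case $\phi(F_T, B) = \phi(F_T) + a\delta_T + \epsilon$, it is possible to compute the indifference price. Since $\frac{1}{\gamma}\log E[e^{-\gamma\epsilon}] = -\mu_\epsilon + \gamma\sigma_\epsilon^2/2$ for a Gaussian $\epsilon$, we get
$$P^\phi = \frac{1}{\gamma}\,E\left\{ \log\int_{\mathbb{R}} \rho_T(x)\,dx \right\} - \psi^\phi = E\left\{ \phi(F_T) + \mu_\epsilon - \frac{\gamma\sigma_\epsilon^2}{2} + \frac{1}{\gamma}\log\frac{\int \rho_T(\delta)\,d\delta}{\int e^{-\gamma a\delta}\rho_T(\delta)\,d\delta} \right\},$$
where $\epsilon \sim N(\mu_\epsilon, \sigma_\epsilon^2)$.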
PART 4
Complements
Chapter Eight BSDEs and Applications Nicole El Karoui Said Hamadène Anis Matoussi
Backward stochastic differential equations (BSDEs) were first introduced by J. M. Bismut in 1973 [28] as equations for the adjoint processes in the stochastic version of the Pontryagin maximum principle. Pardoux and Peng [216] generalized the notion in 1990 and were the first to consider general BSDEs and to solve the questions of existence and uniqueness. A solution for a BSDE associated with a coefficient $g(t,\omega,y,z)$ and a terminal value $\xi_T$ is a pair of square integrable, adapted (w.r.t. the Brownian filtration) processes $(Y_t, Z_t)_{t \le T}$ such that
$$Y_t = \xi_T + \int_t^T g(s,\omega,Y_s,Z_s)\,ds - \int_t^T Z_s\,dB_s, \quad t \le T. \tag{8.1}$$
When the function $g$ is Lipschitz continuous with respect to $(y,z)$ and $\xi_T \in L^2(\Omega)$, Pardoux and Peng proved in their pioneering paper [216] the existence and uniqueness of a solution for (8.1). Since then BSDEs have been widely used in stochastic control and especially in mathematical finance, as any pricing problem involving a replication argument can be written in terms of linear BSDEs, or nonlinear BSDEs when portfolio constraints are taken into account, as in El Karoui, Peng, and Quenez [202]. Another problem that has attracted a lot of attention in this area, especially in connection with applications, is how to improve the existence/uniqueness conditions of a solution for (8.1). When the functions $g$ and $\xi_T$ are valued in $\mathbb{R}$, many articles weaken the existence conditions of a solution for (8.1). Basically, in those papers it is assumed that $g$ is just continuous and satisfies a linear or a quadratic growth condition. Among them we quote Hamadène [114], Lepeltier and San Martin [171], [172], and Kobylanski [159]. These works are based on the comparison theorem of solutions of BSDEs. Nevertheless, in general, we do not have uniqueness of the solution. In [159], Kobylanski has shown that the BSDE associated with a bounded terminal value $\xi_T$ and a continuous generator $g$ with quadratic growth, i.e., satisfying $|g(t,\omega,y,z)| \le C(1 + |y| + |z|^2)$, has a solution. In that paper a uniqueness result is also given. This latter model of BSDEs is very useful in mathematical finance theory, especially when we deal with exponential utilities or risk measures and applications to
weather derivatives (see, e.g., the paper [156] by El Karoui and Rouge, Sekine [250], and Chapter 3 of this book). Actually it has been shown in [156] that in a market model with constraints on the portfolios, if we define the indifference price for a claim $\zeta$ as the smallest number $p$ such that
$$\sup_\pi E\left[-e^{-\gamma\left(X_T^{x+p,\pi} - \zeta\right)}\right] \ge \sup_\pi E\left[-e^{-\gamma X_T^{x,\pi}}\right],$$
where $X^{x,\pi}$ is the wealth associated with the portfolio $\pi$ and initial value $x$, then this problem gives rise to the solution of a BSDE with quadratic growth coefficient. Finally, let us point out that risk-sensitive control problems also turn into BSDEs that fall in the same framework, as in El Karoui and Hamadène [155].

Another type of BSDEs are the reflected ones. Introduced by El Karoui, Kapoudjian, Pardoux, Peng, and Quenez in [201], the setting of those equations is the following: let us consider an adapted stochastic process $S := (S_t)_{t \le T}$, which stands for a barrier. A solution for the reflected BSDE associated with $(g, \xi, S)$ is a triple of adapted stochastic processes $(Y_t, Z_t, K_t)_{t \le T}$ such that
$$Y_t = \xi + \int_t^T g(s,\omega,Y_s,Z_s)\,ds + K_T - K_t - \int_t^T Z_s\,dB_s, \quad t \le T,$$
$$Y_t \ge S_t \quad\text{and}\quad \int_0^T (Y_t - S_t)\,dK_t = 0. \tag{8.2}$$
The process $K$ is continuous, increasing, and its role is to push $Y$ upward in order to keep it above the barrier $S$. The requirement $\int_0^T (Y_t - S_t)\,dK_t = 0$ means that the action of $K$ is made with a minimal energy. The development of reflected BSDEs has been especially motivated by pricing American contingent claims by replication, especially in constrained markets. Actually, it has been shown by El Karoui, Pardoux, and Quenez in [200] that the price of an American contingent claim $(S_t)_{t \le T}$ whose strike is $\gamma$ in a standard complete financial market is $Y_0$, where $(Y_t, \pi_t, K_t)_{t \le T}$ is the solution of the reflected BSDE
$$-dY_t = b(t, Y_t, \pi_t)\,dt + dK_t - \pi_t\,dW_t, \quad Y_T = (S_T - \gamma)^+,$$
$$Y_t \ge (S_t - \gamma)^+ \quad\text{and}\quad \int_0^T \left(Y_t - (S_t - \gamma)^+\right)dK_t = 0$$
for an appropriate choice of the function $b$. The process $\pi$ allows one to construct a replication strategy and $K$ is a consumption process that the buyer of the option could pursue. In a standard financial market, the function $b$ is given by $b(t,\omega,y,z) = r_t y + z\theta_t$, where $\theta_t$ is the risk premium and $r_t$ the spot rate to invest or borrow. Now when the market is constrained, e.g., when interest rates are not the same whether one borrows or invests money, then the function $b$ is given by
$$b(t,\omega,y,z) = r_t y + z\theta_t - (R_t - r_t)\left(y - z\,\sigma_t^{-1}\mathbf{1}\right)^-,$$
where $R_t$ (resp. $r_t$) is the spot rate to borrow (resp. invest) and $\sigma$ the volatility.

Finally, one can also consider BSDEs with two reflecting barriers (Cvitanic and Karatzas [57], Hamadène, Lepeltier, and Matoussi [241]). Indeed, in Equation (8.2), instead of having just one reflecting barrier we can introduce another barrier $U := (U_t)_{t \le T}$ and we require that the component $Y$ should stay between $S$ and $U$. However, we should add an increasing process $K^-$ in the equation in order to keep $Y$
below $U$. Those BSDEs have been developed especially in connection with Dynkin games, mixed differential games, and recallable options in Hamadène [115].

In conclusion, we can say that BSDEs have many applications in different fields of mathematics despite the fact that they have been intensively studied only during the last decade. There exist many research papers on BSDEs, to which we will refer in the following, but only a few lecture notes on the subject: Pitman Research Notes in Mathematics Series (1997) [225], Forward Backward SDEs and their application by Ma and Yong [177], BSDEs, weak convergence, and homogenization of semilinear PDEs of Pardoux [215]. On the other hand, let us point out that the relationships between finance and BSDEs have been studied in detail in several research papers (see, for instance, El Karoui, Peng, and Quenez [202]), or lecture notes, for example of the CIME: El Karoui and Quenez [78], and Peng [220].

Acknowledgments: The authors thank Pauline Barrieu for reading and correcting earlier versions of the manuscript.
8.1 GENERAL RESULTS ON BACKWARD STOCHASTIC DIFFERENTIAL EQUATIONS

8.1.1 BSDE: Existence and Uniqueness of a Solution
Let $(\Omega, \mathcal{F}, P)$ be a probability space on which is defined a $d$-dimensional Brownian motion $W := (W_t)_{t \le T}$. Let us denote by $(\mathcal{F}_t^W)_{t \le T}$ the natural filtration of $W$ and $(\mathcal{F}_t)_{t \le T}$ its completion with the $P$-null sets of $\mathcal{F}$. We define the following spaces:
• $\mathcal{P}_n$, the set of $\mathcal{F}_t$-progressively measurable, $\mathbb{R}^n$-valued processes on $\Omega \times [0,T]$;
• $L_n^2(\mathcal{F}_t) = \{\eta : \mathcal{F}_t\text{-measurable } \mathbb{R}^n\text{-valued random variable s.t. } E[|\eta|^2] < \infty\}$;
• $S_n^2(0,T) = \{\varphi \in \mathcal{P}_n \text{ with continuous paths, s.t. } E[\sup_{t \le T}|\varphi_t|^2] < \infty\}$;
• $H_n^2(0,T) = \{Z \in \mathcal{P}_n \text{ s.t. } E[\int_0^T |Z_s|^2\,ds] < \infty\}$;
• $H_n^1(0,T) = \{Z \in \mathcal{P}_n \text{ s.t. } E[(\int_0^T |Z_s|^2\,ds)^{1/2}] < \infty\}$.
Let us now introduce the notion of multidimensional BSDEs.

Definition 8.1 Let $\xi_T \in L_m^2(\mathcal{F}_T)$ be an $\mathbb{R}^m$-valued terminal condition and let $g$ be an $\mathbb{R}^m$-valued coefficient, $\mathcal{P}_m \otimes \mathcal{B}(\mathbb{R}^m \times \mathbb{R}^{m \times d})$-measurable. A solution for the $m$-dimensional BSDE associated with parameters $(g, \xi_T)$ is a pair of progressively measurable processes $(Y, Z) := (Y_t, Z_t)_{t \le T}$ with values in $\mathbb{R}^m \times \mathbb{R}^{m \times d}$ such that
$$Y \in S_m^2, \quad Z \in H_{m \times d}^2, \qquad Y_t = \xi_T + \int_t^T g(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s, \quad 0 \le t \le T. \tag{8.3}$$
The differential form of this equation is
$$-dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dW_t, \quad Y_T = \xi_T. \tag{8.4}$$
Hereafter $g$ is called the coefficient and $\xi$ the terminal value of the BSDE.
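To see what solving (8.4) backward from the terminal condition means in practice, here is a minimal numerical sketch for the scalar case $m = d = 1$. It is not taken from this chapter: it replaces the Brownian motion by a symmetric random walk on a recombining tree, restricts the terminal value to a function of $W_T$, and uses an explicit backward Euler step. The function name and all parameters are hypothetical.

```python
import math


def solve_bsde_on_tree(g, terminal, T, n_steps):
    """Approximate (Y_0, Z_0) for -dY = g(t, Y, Z) dt - Z dW, Y_T = terminal(W_T).
    Node j at step i carries the walk value (2j - i) * sqrt(dt)."""
    dt = T / n_steps
    sq = math.sqrt(dt)
    y = [terminal((2 * j - n_steps) * sq) for j in range(n_steps + 1)]
    z0 = 0.0
    for i in range(n_steps - 1, -1, -1):
        new_y = []
        for j in range(i + 1):
            up, down = y[j + 1], y[j]
            cond_mean = 0.5 * (up + down)   # E[Y_{i+1} | F_i]
            z = (up - down) / (2.0 * sq)    # E[Y_{i+1} * dW | F_i] / dt
            new_y.append(cond_mean + g(i * dt, cond_mean, z) * dt)
            if i == 0:
                z0 = z
        y = new_y
    return y[0], z0


# Linear coefficient g = -r*y with terminal value 1: Y_0 approximates exp(-r*T).
y0, z0 = solve_bsde_on_tree(lambda t, y, z: -0.05 * y, lambda w: 1.0, 1.0, 200)
```

With $g \equiv 0$ the scheme reduces to taking conditional expectations of the terminal value, and $Z$ is the discrete analogue of the integrand in the martingale representation of $\xi_T$.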
Under some specific assumptions on the coefficient $g$, the BSDE (8.3) has a unique solution. The standard assumptions (H1) are the following:
(i) $(g(t,0,0))_{t \le T} \in H_m^2$;
(ii) $g$ is uniformly Lipschitz with respect to $(y,z)$: there exists a constant $C \ge 0$ s.t. for all $(y, y', z, z')$,
$$|g(\omega,t,y,z) - g(\omega,t,y',z')| \le C(|y - y'| + |z - z'|), \quad dt \otimes dP \text{ a.e.}$$

Theorem 8.2 (Pardoux and Peng [216]) Under the standard assumptions (H1), there exists a unique solution $(Y, Z)$ of the BSDE with parameters $(g, \xi_T)$.

Proof. i) First assume that $g \equiv 0$. A solution $(Y, Z)$ of equation (8.3) satisfies:
$$Y_t = \xi_T - \int_t^T Z_s\,dW_s, \quad Y_T = \xi_T.$$
So, the process $Y$ is necessarily the continuous version of the $S^2$-martingale $E[\xi_T \mid \mathcal{F}_t]$. As a direct consequence of the martingale representation theorem with respect to a Brownian filtration (see for instance Theorem 4.15 in Karatzas and Shreve [152]), there exists a unique process $Z \in H_{m \times d}^2$ such that
$$Y_t = E[\xi_T \mid \mathcal{F}_t] = E[\xi_T] + \int_0^t Z_s\,dW_s.$$
Therefore, when $g \equiv 0$, we have proved the existence and uniqueness of a solution for the BSDE (8.3).
ii) The extension to the case where $g$ does not depend on $(y,z)$, i.e., $g(t,\omega,y,z) := g(t,\omega) = g(t,\omega,0,0) \in H_m^2$, is obvious given that $(Y_t + \int_0^t g(s)\,ds, Z_t)$ is a solution of the BSDE with 0-coefficient and terminal value $\int_0^T g(s)\,ds + \xi_T$.
iii) Let us now consider a general Lipschitz coefficient $g(t,\omega,y,z)$. As in the deterministic case, the solution will be obtained as the fixed point of an appropriate map defined on $H_m^2 \times H_{m \times d}^2$. However, instead of working with the usual norm on this space, we will use an equivalent norm involving a weight on time similar to a capitalization factor, parametrized by a positive constant $\alpha$. Let $D_\alpha$ be the space $H_m^2 \times H_{m \times d}^2$ equipped with the norm
$$\|(Y,Z)\|_\alpha = \left( E\left[\int_0^T e^{\alpha s}\left(|Y_s|^2 + |Z_s|^2\right)ds\right] \right)^{1/2}.$$
In the following, the terminal condition $\xi_T$ is given and is not referred to anymore. Let $\Phi$ be the map from $D_\alpha$ into $D_\alpha$ defined by
$$(u,v) := (u_t, v_t)_{t \le T} \in D_\alpha \longmapsto \Phi(u,v) = (Y_t^{u,v}, Z_t^{u,v})_{t \le T},$$
where $(Y^{u,v}, Z^{u,v})$ is the $D_\alpha$-valued solution of the BSDE with coefficient $g^{u,v}(t) = g(t, u_t, v_t)$. Note that $g^{u,v}$ does not depend upon $(y,z)$. Thus the existence of a solution follows from step ii). As usual, the existence of the fixed point we are interested in relies upon the contraction property of the map $\Phi$ for a norm $\|\cdot\|_\alpha$ with a large enough parameter $\alpha$.
271
BSDEs AND APPLICATIONS
− Ytu ,v )2
So let ((u, v), (u , v )) ∈ Dα × Dα . Applying Itô’s formula to yields T T
αt u,v u ,v 2 αs u,v u ,v 2 e |Yt − Yt | + e |Zs − Zs | ds + α eαs |Ysu,v − Ysu ,v |2 ds eαt (Ytu,v
t
= (MT − Mt ) + 2
t T
u ,v
eαs (Ysu,v − Ys
)(g(s, us , vs ) − g(s, u s , vs ))ds,
t
t
where Mt = 2 0 eαs (Ysu,v − Ysu ,v )(Zsu,v − Zsu ,v )dWs is a uniformly integrable martingale. Indeed, using the inequality 2ab ≤ a 2 + b2 , we directly obtain T 12
E e2αs |Ysu,v − Ysu ,v |2 |Zsu,v − Zsu ,v |2 ds 0
u ,v
≤ cE sup |Ysu,v − Ys
|
0≤s≤T
c ≤ E 2
sup 0≤s≤T
|Ysu,v
0
− Ysu ,v |2
T
u ,v 2
|Zsu,v − Zs
c + E 2
T 0
| ds
|Zsu,v
12
− Zsu ,v |2 ds
.
But we know that $\sup_{s\le T}|Y^{u,v}_s - Y^{u',v'}_s|$ is square integrable and $(Z^{u,v} - Z^{u',v'}) \in H^2_{m\times d}$. Therefore, the process $\{e^{\alpha s}|(Y^{u,v}_s - Y^{u',v'}_s)(Z^{u,v}_s - Z^{u',v'}_s)|\}_s$ is in $H^1_d$. Finally, from classical results, the associated stochastic integral is a uniformly integrable martingale, with expectation equal to zero.

We can now use (H1), i.e., the fact that $g$ is Lipschitz continuous with constant $C$. By taking expectations and using the polarization identity
\[ -\alpha a^2 + 2Cab = -\alpha\Big(a - \frac{C}{\alpha}\,b\Big)^2 + \frac{C^2}{\alpha}\,b^2 \le \frac{C^2}{\alpha}\,b^2, \]
we get
\[ E\big[e^{\alpha t}|Y^{u,v}_t - Y^{u',v'}_t|^2\big] + E\Big[\int_t^T e^{\alpha s}|Z^{u,v}_s - Z^{u',v'}_s|^2\,ds\Big] \]
\[ \le E\Big[\int_t^T e^{\alpha s}\big(-\alpha|Y^{u,v}_s - Y^{u',v'}_s|^2 + 2C|Y^{u,v}_s - Y^{u',v'}_s|(|u_s - u'_s| + |v_s - v'_s|)\big)\,ds\Big] \]
\[ \le \frac{C^2}{\alpha}\,E\Big[\int_t^T e^{\alpha s}(|u_s - u'_s| + |v_s - v'_s|)^2\,ds\Big] \le \frac{2C^2}{\alpha}\,E\Big[\int_t^T e^{\alpha s}(|u_s - u'_s|^2 + |v_s - v'_s|^2)\,ds\Big]. \]
Therefore, in terms of the $\alpha$-norms of the solutions, we have
\[ \|Z^{u,v} - Z^{u',v'}\|^2_\alpha \le \frac{2C^2}{\alpha}\,\|(u - u', v - v')\|^2_\alpha. \]
This inequality yields different estimates for the process $Y^{u,v} - Y^{u',v'}$. The most obvious is that at any time $t$, $e^{\alpha t}E[|Y^{u,v}_t - Y^{u',v'}_t|^2] \le \frac{2C^2}{\alpha}\|(u - u', v - v')\|^2_\alpha$. In particular, since $Y^{u,v}_0$ and $Y^{u',v'}_0$ are deterministic, we have
\[ |Y^{u,v}_0 - Y^{u',v'}_0|^2 \le \frac{2C^2}{\alpha}\,\|(u - u', v - v')\|^2_\alpha. \]
Moreover, by integrating between 0 and $T$ both sides of this inequality, we obtain
\[ \|Y^{u,v} - Y^{u',v'}\|^2_\alpha \le \frac{2C^2 T}{\alpha}\,\|(u - u', v - v')\|^2_\alpha. \]
Therefore, there exists $K > 0$ depending only on $C$ and $T$ such that
\[ \|(Y^{u,v} - Y^{u',v'}, Z^{u,v} - Z^{u',v'})\|^2_\alpha \le \frac{K}{\alpha}\,\|(u - u', v - v')\|^2_\alpha. \qquad (8.5) \]
Thus for any $\alpha > K$, the map $\Phi$ is contracting on the Hilbert space $D_\alpha$. The fixed point theorem ensures the existence of a unique pair $(Y, Z) \in D_\alpha$ such that $\Phi(Y, Z) = (Y, Z)$. The uniqueness has to be understood $dt \otimes dP$-almost everywhere. By construction, the first component $\bar Y$ of $\Phi(Y, Z)$ is a continuous process equal to $Y$ $dt \otimes dP$-a.e., and the pair $(\bar Y, Z)$ is the unique solution of the BSDE $(g, \xi_T)$ with a continuous first component. $\Box$
Remark. If instead of assumption (H1)-(ii), we simply require that first $g$ satisfies the following monotonicity condition with respect to $y$:
\[ \big(g(s, y^1, z) - g(s, y^2, z)\big)(y^1 - y^2) \le K(y^1 - y^2)^2, \]
and that the mapping $y \mapsto g(t, y, z)$ is continuous with a mild growth condition, and second, the mapping $z \mapsto g(t, y, z)$ is uniformly Lipschitz continuous, then the BSDE associated with $(g, \xi_T)$ has a unique solution. For more details, the interested reader is referred, for instance, to Pardoux and Darling [240] or Pardoux [215].

Instead of using a fixed point theorem to prove the existence of the solution of BSDE (8.4), we could have used an explicit approximation method, such as the well-known Picard approximation. The interest of such an approach is to construct a sequence converging almost surely to the solution.

Proposition 8.1 If $(Y^k, Z^k)$ is the Picard sequence recursively defined by $(Y^0 = 0, Z^0 = 0)$, and
\[ -dY^{k+1}_t = g(t, Y^k_t, Z^k_t)\,dt - Z^{k+1}_t\,dW_t, \qquad Y^{k+1}_T = \xi_T, \qquad (8.6) \]
then this sequence $(Y^k, Z^k)$ converges in $H^2_m \otimes H^2_{m\times d}$, and $dt \otimes dP$-a.e., to the solution $(Y, Z)$ of the BSDE$(g, \xi_T)$. Moreover, the sequence $(Y^k)$ converges uniformly almost surely.
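The Picard scheme (8.6) can be illustrated numerically in a degenerate special case. The sketch below assumes deterministic data (so that $Z = 0$ and the BSDE reduces to a backward ODE) and a hypothetical linear coefficient $g(t, y) = a\,y$; the function name `picard_backward`, the grid sizes and the constants are illustrative choices, not taken from the text. It is not a general BSDE solver.

```python
import numpy as np

def picard_backward(g, xi, T=1.0, n_steps=200, n_iter=30):
    """Picard iterates Y^{k+1}_t = xi + int_t^T g(s, Y^k_s) ds on a uniform grid.

    Deterministic data only (so Z = 0): an illustrative special case,
    not a general BSDE solver.
    """
    t = np.linspace(0.0, T, n_steps + 1)
    dt = T / n_steps
    y = np.zeros_like(t)                      # Y^0 = 0, as in Proposition 8.1
    sup_increments = []
    for _ in range(n_iter):
        gv = g(t, y)
        # trapezoidal rule on each cell, then tail sums give int_t^T g ds
        cell = 0.5 * (gv[:-1] + gv[1:]) * dt
        tail = np.concatenate([np.cumsum(cell[::-1])[::-1], [0.0]])
        y_next = xi + tail
        sup_increments.append(float(np.max(np.abs(y_next - y))))
        y = y_next
    return t, y, sup_increments

a, xi, T = 0.7, 2.0, 1.0                      # hypothetical constants, g(t, y) = a * y
t, y, incs = picard_backward(lambda s, v: a * v, xi, T)
exact = xi * np.exp(a * (T - t))              # closed-form solution of the backward ODE
max_error = float(np.max(np.abs(y - exact)))
```

In this example the sup-norm increments stored in `incs` shrink rapidly from one iteration to the next, in line with the geometric domination used in the proof, and `max_error` measures the remaining discretization error of the grid.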
Proof. Let the Picard sequence $(Y^k, Z^k)$ be recursively defined by $(Y^0 = 0, Z^0 = 0)$ and
\[ -dY^{k+1}_t = g(t, Y^k_t, Z^k_t)\,dt - Z^{k+1}_t\,dW_t, \qquad Y^{k+1}_T = \xi_T. \]
Let $\delta^k(Y) = Y^k - Y^{k-1}$, $\delta^k(Z) = Z^k - Z^{k-1}$ be the processes of increments between two iterations. Then, applying the previous estimates (8.5) to $(u^k_t, v^k_t) = (Y^{k-1}_t, Z^{k-1}_t)$ and $(Y^k, Z^k) = (Y^{(u^k, v^k)}, Z^{(u^k, v^k)})$, we obtain
\[ \|(\delta^{k+1}(Y), \delta^{k+1}(Z))\|^2_\alpha \le \frac{K}{\alpha}\,\|(\delta^k(Y), \delta^k(Z))\|^2_\alpha \le \varepsilon^k\,\|(Y^1, Z^1)\|^2_\alpha, \qquad (8.7) \]
where $\varepsilon = K/\alpha$ is a real constant $< 1$. The norms of the increments of the series are dominated by a geometric sequence. Therefore both series $\sum_k (Y^{k+1} - Y^k)$ and $\sum_k (Z^{k+1} - Z^k)$ converge in the Hilbert space $D_\alpha$. In particular, the sequence of partial sums of the series converges in $D_\alpha$, and $dt \otimes dP$-a.e.

Moreover, it is possible to obtain the same kind of estimates for the uniform norm of the semimartingale $\delta^k(Y)$ defined by $\|\delta^k(Y)\|^2_\infty := E[\sup_{0\le t\le T}|\delta^k_t(Y)|^2]$.

First let us observe that for any Itô semimartingale $\Upsilon_t = \Upsilon_0 + \int_0^t \phi_s\,ds + \int_0^t \sigma_s\,dW_s$, with $\phi$ and $\sigma$ in $H^2_m$ and $H^2_{m\times d}$, respectively, the maximal inequality of martingales ensures that
\[ \Big\|\int_0^{\cdot} \sigma_s\,dW_s\Big\|^2_\infty := E\Big[\sup_{0\le t\le T}\Big|\int_0^t \sigma_s\,dW_s\Big|^2\Big] \le 4E\Big[\int_0^T |\sigma_s|^2\,ds\Big] := 4\|\sigma\|^2_{H^2_{m\times d}}. \]
The same estimate holds for the semimartingale $\Upsilon$ since
\[ \|\Upsilon\|^2_\infty \le 2\Big(E[|\Upsilon_0|^2] + E\Big[\Big(\int_0^T |\phi_s|\,ds\Big)^2\Big] + \Big\|\int_0^{\cdot} \sigma_s\,dW_s\Big\|^2_\infty\Big) \le 2E[|\Upsilon_0|^2] + 2T\|\phi\|^2_{H^2_m} + 8\|\sigma\|^2_{H^2_{m\times d}}. \]
Using the semimartingale representation $\Upsilon_t = \Upsilon_T - \int_t^T \phi_s\,ds - \int_t^T \sigma_s\,dW_s$, the same estimates may be used, given the inequality $|\int_t^T \sigma_s\,dW_s|^2 \le 2(|\int_0^T \sigma_s\,dW_s|^2 + |\int_0^t \sigma_s\,dW_s|^2)$ and
\[ \|\Upsilon\|^2_\infty \le 2E[|\Upsilon_T|^2] + 2T\|\phi\|^2_{H^2_m} + 10\|\sigma\|^2_{H^2_{m\times d}}. \qquad (8.8) \]
Then, we apply this last inequality to the semimartingale $\delta^{k+1}(Y)$ with decomposition $\phi^k_t = g(t, Y^k_t, Z^k_t) - g(t, Y^{k-1}_t, Z^{k-1}_t)$ and $\sigma^k_t = Z^k_t - Z^{k-1}_t = \delta^k_t(Z)$. Since $g$ is Lipschitz continuous with Lipschitz constant $C$, $\|\phi\|^2 \le C\|(\delta^k(Y), \delta^k(Z))\|^2$. So we obtain
\[ \|\delta^{k+1}(Y)\|^2_\infty \le 2TC\,\|(\delta^k(Y), \delta^k(Z))\|^2_2 + 20\,\|\delta^k(Z)\|^2_2 \le K\,\|(\delta^k(Y), \delta^k(Z))\|^2_2 \le K\varepsilon^{k-1}, \qquad (8.9) \]
where $K$ is a universal constant that can change from one inequality to another. The last estimate is given by Equation (8.7) in terms of the $\alpha$-norm, which is equivalent to the $H^2$-norm. As a consequence, the series $\sum_k (Y^{k+1} - Y^k)$ is uniformly convergent. $\Box$

Using Itô's formula and the same kind of estimates as above, we obtain the following a priori estimate.
Proposition 8.2 [A priori estimate] Let $(Y, Z)$ be a solution of BSDE$(g, \xi_T)$. Then there exists a constant $c > 0$ such that
\[ E\Big[\sup_{0\le t\le T}|Y_t|^2 + \int_0^T |Z_t|^2\,dt\Big] \le c\,E\Big[|\xi_T|^2 + \int_0^T |g(t, 0, 0)|^2\,dt\Big]. \qquad (8.10) \]
8.1.2 One-Dimensional Linear BSDEs

When the coefficient is linear, we can obtain the component $Y$ of the solution explicitly. As shown by El Karoui et al. in [202], this representation is useful to prove a strict comparison theorem.

Proposition 8.3 [Linear BSDE] Let $(\beta, \mu)$ be a bounded $(R, R^d)$-valued progressively measurable process, $\varphi$ be an element of $H^2_1(0, T)$ and $\xi_T \in L^2_1(0, T)$. We consider the following linear BSDE:
\[ -dY_t = (\varphi_t + Y_t\beta_t + Z_t\mu_t)\,dt - Z_t\,dW_t; \qquad Y_T = \xi_T. \qquad (8.11) \]
i) Equation (8.11) has a unique solution $(Y, Z) \in S^2_1(0, T) \times H^2_d(0, T)$, and $Y$ is given explicitly by
\[ Y_t = E\Big[\xi_T\,\Gamma_{t,T} + \int_t^T \Gamma_{t,s}\,\varphi_s\,ds \,\Big|\, F_t\Big], \qquad (8.12) \]
where $(\Gamma_{t,s})_{s\ge t}$ is the adjoint process defined by the forward linear SDE
\[ d\Gamma_{t,s} = \Gamma_{t,s}(\beta_s\,ds + \mu_s\,dW_s); \qquad \Gamma_{t,t} = 1, \qquad (8.13) \]
satisfying the flow property: $\forall\, t \le s \le u$, $\Gamma_{t,s}\Gamma_{s,u} = \Gamma_{t,u}$ $P$-a.s.

ii) If $\xi_T$ and $\varphi$ are both non-negative, then the process $(Y_t)_{t\le T}$ is non-negative. Moreover, if in addition $Y_t = 0$ on $B \in F_t$, then $P$-a.s. on $B$, for any $s \ge t$, $Y_s = 0$, $\xi_T = 0$, and $\varphi_s = 0$, $Z_s = 0$ $dP \otimes ds$-a.e.

Proof. i) By Theorem 8.2, the BSDE (8.11) has a unique solution $(Y, Z)$. Writing $\Gamma_t := \Gamma_{0,t}$ and using Itô's calculus, it can be easily seen that $(\Gamma_t Y_t + \int_0^t \Gamma_s\varphi_s\,ds)_t$ is a local martingale. Moreover, using the fact that both $\sup_{s\le T}|Y_s|$ and $\sup_{s\le T}|\Gamma_s|$ belong to $L^2(\Omega)$, the product $\sup_{s\le T}|Y_s| \times \sup_{s\le T}|\Gamma_s|$ belongs to $L^1(\Omega)$. Therefore, the local martingale $(\Gamma_t Y_t + \int_0^t \Gamma_s\varphi_s\,ds)_t$ is a uniformly integrable martingale, whose value at time $t$ is the $F_t$-conditional expectation of its terminal value.

ii) Note that $\Gamma$ is a positive process. By the explicit representation (8.12) of $Y$, if $\xi_T$ and $\varphi$ are nonnegative, $Y$ is also nonnegative as the conditional expectation of the nonnegative random variable $\xi_T\Gamma_{t,T} + \int_t^T \Gamma_{t,s}\varphi_s\,ds$. Moreover, this conditional expectation is equal to 0 only if this random variable is identically 0, $P$-a.s. More precisely, if $Y_t = 0$ on $B$, then for any $s \ge t$, $Y_s = 0$, $\xi_T = 0$, and $\varphi_s = 0$, $Z_s = 0$ $dP \otimes ds$-a.e. $\Box$

8.1.3 Comparison Theorem

In the one-dimensional case, i.e., when $m = 1$, we have a comparison result between the solutions $Y$ as soon as we can compare the associated coefficients and terminal values. More precisely,
Theorem 8.3 (El Karoui, Peng, and Quenez [202]) Let us consider the solutions $(Y, Z)$ and $(Y', Z')$ of two BSDEs associated with parameters $(g, \xi_T)$ and $(g', \xi'_T)$. Only $g$ is assumed to satisfy (H1), and $g'(s, Y'_s, Z'_s)$ is only required to be an element of $H^2_1$. If $\xi_T \le \xi'_T$ $P$-a.s. and $g(t, Y'_t, Z'_t) \le g'(t, Y'_t, Z'_t)$ $dt \otimes dP$-a.e., then
\[ Y_t \le Y'_t, \qquad \forall\, t \in [0, T] \quad P\text{-a.s.} \]
Moreover, the comparison is strict in the sense that, if in addition $Y_0 = Y'_0$, then $\xi_T = \xi'_T$, $g(s, Y'_s, Z'_s) = g'(s, Y'_s, Z'_s)$, and $Y_s = Y'_s$, $\forall\, s \in [0, T]$ $P$-a.s. In particular, when either $P(\xi_T < \xi'_T) > 0$ or $g(t, Y'_t, Z'_t) < g'(t, Y'_t, Z'_t)$ on a set of positive $dt \otimes dP$ measure, then $Y_0 < Y'_0$.

Proof. First note that both the condition $g(t, Y'_t, Z'_t) - g'(t, Y'_t, Z'_t) \le 0$ and the Lipschitz property of $g$ imply that
\[ g(t, Y_t, Z_t) - g'(t, Y'_t, Z'_t) \le g(t, Y_t, Z_t) - g(t, Y'_t, Z'_t) \le C(|Y_t - Y'_t| + |Z_t - Z'_t|). \]
Itô-Tanaka's formula applied to $e^{\alpha t}((Y_t - Y'_t)^+)^2$ yields
\[ e^{\alpha t}((Y_t - Y'_t)^+)^2 = e^{\alpha T}((\xi_T - \xi'_T)^+)^2 - 2\int_t^T e^{\alpha s}(Y_s - Y'_s)^+(Z_s - Z'_s)\,dW_s + V_{t,T}, \]
where
\[ V_{t,T} = \int_t^T e^{\alpha s}\Big(-\alpha((Y_s - Y'_s)^+)^2 - 1_{\{Y_s > Y'_s\}}|Z_s - Z'_s|^2 + 2(Y_s - Y'_s)^+\big[g(s, Y_s, Z_s) - g'(s, Y'_s, Z'_s)\big]\Big)\,ds, \]
and then,
\[ V_{t,T} \le \int_t^T e^{\alpha s}\Big(-\alpha((Y_s - Y'_s)^+)^2 - 1_{\{Y_s > Y'_s\}}|Z_s - Z'_s|^2 + 2C(Y_s - Y'_s)^+(|Y_s - Y'_s| + |Z_s - Z'_s|)\Big)\,ds. \]
By the polarization formula, for a large enough $\alpha$ and for any $a, b$, we have
\[ -\alpha(a^+)^2 - 1_{\{a>0\}}b^2 + 2Ca^+(|a| + |b|) = 1_{\{a>0\}}\big(-\alpha|a|^2 - |b|^2 + 2C|a|(|a| + |b|)\big) = 1_{\{a>0\}}\big(-(|b| - C|a|)^2 - (\alpha - 2C - C^2)|a|^2\big) \le 0. \]
Therefore, using the above inequality and taking $\alpha > 2C + C^2$ we obtain $V_{t,T} \le 0$. Using the same argument as in the proof of Theorem 8.2, we show that the martingale
\[ \int_t^T e^{\alpha s}(Y_s - Y'_s)^+(Z_s - Z'_s)\,dW_s \]
is uniformly integrable, with zero expectation. Taking into account the assumption on the terminal conditions, we deduce that $E[e^{\alpha t}((Y_t - Y'_t)^+)^2] \le 0$ and the comparison result.
For the remainder of the proof, we assume, for the sake of simplicity, that $d = 1$. Let $\bar\xi_T = \xi'_T - \xi_T$ and $\varphi_t = g'(t, Y'_t, Z'_t) - g(t, Y'_t, Z'_t)$, and set
\[ \beta_t = (Y_t - Y'_t)^{-1}\big(g(t, Y_t, Z_t) - g(t, Y'_t, Z_t)\big) \text{ if } Y_t \ne Y'_t, \quad \beta_t = 0 \text{ if } Y_t = Y'_t, \]
\[ \mu_t = (Z_t - Z'_t)^{-1}\big(g(t, Y'_t, Z_t) - g(t, Y'_t, Z'_t)\big) \text{ if } Z_t \ne Z'_t, \quad \mu_t = 0 \text{ if } Z_t = Z'_t. \]
Since the coefficient $g$ is uniformly Lipschitz with respect to $(y, z)$, it follows that $(\beta, \mu)$ is a real valued bounded progressively measurable process. Note also that since $g(s, 0, 0) \in H^2_1$ and $g'(s, Y'_s, Z'_s) \in H^2_1$, we have that $\varphi$ is in $H^2_1$. Moreover, it is obvious that $(\bar Y_t, \bar Z_t) := (Y'_t - Y_t, Z'_t - Z_t)$ is the unique solution of the following linear BSDE with parameters $(\varphi_s + y\beta_s + z\mu_s,\ \xi'_T - \xi_T)$:
\[ \bar Y_t = \bar\xi_T + \int_t^T (\varphi_s + \bar Y_s\beta_s + \bar Z_s\mu_s)\,ds - \int_t^T \bar Z_s\,dW_s. \]
By Proposition 8.3, we know that $\bar Y$ is given explicitly by
\[ \bar Y_t = E\Big[\bar\xi_T\,\Gamma_{t,T} + \int_t^T \Gamma_{t,s}\,\varphi_s\,ds \,\Big|\, F_t\Big], \]
where $(\Gamma_{t,s})_{s\ge t}$ is given by (8.13) with $\beta$ and $\mu$ as above. Therefore, the strict comparison theorem follows from this formula and the fact that $\bar\xi$ and $\varphi$ are nonnegative. $\Box$

8.1.4 One-Dimensional BSDEs with Non-Lipschitz Coefficients

We now consider one-dimensional BSDEs whose coefficients $g$ are merely continuous. We work under the following assumption (H2):
- $g$ has linear growth: there exists a constant $k$ such that, $\forall (y, z)$, $|g(t, \omega, y, z)| \le |g(t, \omega, 0, 0)| + k(|y| + |z|)$ $dt \otimes dP$-a.e.,
- $(g(t, 0, 0))_{t\le T} \in H^2_1$,
- $dt \otimes dP$-a.e., the function $(y, z) \mapsto g(\omega, t, y, z)$ is continuous.

Then, the following result holds:

Theorem 8.4 (Lepeltier and San Martin [171]) Under the assumption (H2), the BSDE associated with parameters $(g, \xi_T)$ has a minimal solution $(Y, Z)$, i.e., if $(Y', Z')$ is another solution for (8.3), then $Y \le Y'$ $P$-a.s.

The proof is based on regularization by inf-convolution techniques.

Lemma 8.5 (Lepeltier and San Martin [171]) Let $g : R^d \to R$ be a continuous function with linear growth $|g(x)| \le k(1 + |x|) = b_k(x)$. The sequence $g_n(x) = \inf_{y \in Q^d}(g(y) + n|x - y|) = (g \,\Box\, b^0_n)(x)$, defined as the inf-convolution of the function $g$ with the function $b^0_n(x) = n|x|$, is finite for $n \ge k$, with the following properties:
i) $|g_n(x)| \le k(1 + |x|) = b_k(x)$ (the same $k$),
ii) the sequence $g_n$ is non-decreasing,
iii) each $g_n$ is Lipschitz continuous and $|g_n(x) - g_n(y)| \le n|x - y|$,
iv) if $x_n \to x$, then $g_n(x_n) \to g(x)$.

Proof of Theorem 8.4. Using Lemma 8.5, the functions $g_n$, $b_k$ and $-b_k$ satisfy (H1). For $n \ge k$, let $(Y^n, Z^n)$ be the solution of the BSDE$(g_n, \xi_T)$, let $(U^1, V^1)$ (resp. $(U^2, V^2)$) be the solution of the BSDE$(-b_k, \xi_T)$ (resp. $(b_k, \xi_T)$). By the Comparison Theorem 8.3, for any $n \ge k$, $U^2 \ge Y^{n+1} \ge Y^n \ge U^1$. Hence, the sequence $(Y^n)_{n\ge k}$ converges a.s. and in $H^2_1$ to a lower semicontinuous process $Y$ (as a nondecreasing limit of continuous processes). Observe that for $n \ge k$, $E[\sup_{0\le s\le T}|Y^n_s|^2] \le E[\sup_{0\le s\le T}(|U^1_s| + |U^2_s|)^2] < \infty$. Now, applying Itô's rule to $(Y^n_t)^2$, and using both the inequality $2k|ab| \le 2k^2|a|^2 + \frac{1}{2}|b|^2$ and the linear growth of $g$, we obtain the following uniform estimates for $Y^n$ and $Z^n$:
\[ E\Big[|Y^n_0|^2 + \int_0^T |Z^n_s|^2\,ds\Big] \le E[\xi_T^2] + 2E\Big[\int_0^T |Y^n_s|\big(|g(s, 0, 0)| + k(|Y^n_s| + |Z^n_s|)\big)\,ds\Big] \]
\[ \le c + C\,E\Big[\int_0^T |Y^n_s|^2\,ds\Big] + 2k\,E\Big[\int_0^T |Y^n_s||Z^n_s|\,ds\Big] \]
\[ \le c + C\,E\Big[\int_0^T |U^1_s + U^2_s|^2\,ds\Big] + \frac{1}{2}\,E\Big[\int_0^T |Z^n_s|^2\,ds\Big], \qquad (8.14) \]
where $c$ and $C$ are positive constants that may change from one line to another. So, both sequences $(\sup_{0\le s\le T}|Y^n_s|^2)_{n\ge k}$ and $(\int_0^T |Z^n_s|^2\,ds)_{n\ge k}$ are bounded in $L^1$. Thanks to the uniform linear growth of $g_n$, and the fact that $g(t, 0, 0)$ belongs to $H^2$, $(\int_0^T g_n(s, Y^n_s, Z^n_s)^2\,ds)_{n\ge k}$ is also bounded in $L^1$ by a constant $K_g$. Using these estimates, it is easy to show that the sequence $(Z^n)_{n\ge k}$ is a Cauchy sequence in $H^2_d(0, T)$. Indeed, using Itô's formula with $n, m \ge k$ yields
\[ E\Big[\int_0^T |Z^n_s - Z^m_s|^2\,ds\Big] \le |Y^n_0 - Y^m_0|^2 + 2E\Big[\int_0^T (Y^n_s - Y^m_s)\big(g_n(s, Y^n_s, Z^n_s) - g_m(s, Y^m_s, Z^m_s)\big)\,ds\Big] \]
\[ \le |Y^n_0 - Y^m_0|^2 + 2\,E\Big[\int_0^T |Y^n_s - Y^m_s|^2\,ds\Big]^{1/2} \times E\Big[\int_0^T |g_n(s, Y^n_s, Z^n_s) - g_m(s, Y^m_s, Z^m_s)|^2\,ds\Big]^{1/2} \]
\[ \le |Y^n_0 - Y^m_0|^2 + 2(K_g)^{1/2}\,E\Big[\int_0^T |Y^n_s - Y^m_s|^2\,ds\Big]^{1/2}. \]
Thanks to the convergence theorem of $H^2$-martingales, for any stopping time $\nu \le T$, we have
\[ \lim_{n\to\infty} E\Big[\Big(\int_0^\nu (Z^n_s - Z_s)\,dW_s\Big)^2\Big] = 0. \]
Since we also have that $\lim_{n\to\infty} E[|Y^n_\nu - Y_\nu|^2] = 0$, in order to show that the limit process $(Y, Z)$ is the solution of the BSDE$(g, \xi_T)$, we only have to prove that $(\int_0^\nu (g_n(s, Y^n_s, Z^n_s) - g(s, Y_s, Z_s))\,ds)_{n\ge k}$ goes to 0 in $L^1$. For any truncation level $\kappa > 0$,
\[ E\Big[\Big|\int_0^\nu \big(g_n(s, Y^n_s, Z^n_s) - g(s, Y_s, Z_s)\big)\,ds\Big|\Big] \le E\Big[\int_0^T |g_n(s, Y^n_s, Z^n_s) - g(s, Y_s, Z_s)|\,1_{\{|Y^n_s| + |Z^n_s| \le \kappa\}}\,ds\Big] \]
\[ + E\Big[\int_0^T |g_n(s, Y^n_s, Z^n_s) - g(s, Y_s, Z_s)|\,1_{\{|Y^n_s| + |Z^n_s| > \kappa\}}\,ds\Big]. \]
There exists a subsequence, which we still denote by $n$ for the sake of simplicity, such that the first term in the right-hand side converges to 0 as $n \to \infty$, since $P$-a.s. and for any $t \le T$, $(g_n(t, \omega, y, z))_{n\ge k}$ converges uniformly to $g(t, \omega, y, z)$ on compact subsets of $R^{1+d}$. Concerning the second term in the right-hand side, we use the linear growth of $g_n$ to observe that $|g_n(s, Y^n_s, Z^n_s) - g(s, Y_s, Z_s)| \le k(|Y^n_s| + |Z^n_s|) + |\psi_s|$, where $\psi$ belongs to $H^2$. Since $1_{\{|Y^n_s| + |Z^n_s| > \kappa\}} \le \frac{1}{\kappa}(|Y^n_s| + |Z^n_s|)$, and since the sequence $(|Y^n_s| + |Z^n_s|)$ is bounded in $H^2$, the second term is bounded by a constant $\delta/\kappa$ which converges to 0 as $\kappa \to \infty$. Hence, the left-hand side also converges to 0 as $n \to \infty$. It follows that for any stopping time $\nu$ we have
\[ Y_\nu = Y_0 - \int_0^\nu g(s, Y_s, Z_s)\,ds + \int_0^\nu Z_s\,dW_s. \]
Now we can use the optional section theorem (see, for instance, Dellacherie and Meyer [69]), which, loosely speaking, means that two optional processes $X$ and $X'$ are indistinguishable whenever they satisfy $X_\nu = X'_\nu$ for any stopping time $\nu$. This implies that $Y$ and $(Y_0 - \int_0^t g(s, Y_s, Z_s)\,ds + \int_0^t Z_s\,dW_s)_{t\le T}$ are indistinguishable, and therefore $(Y, Z)$ is a solution for the BSDE associated with $(g, \xi_T)$. Now, if $(Y', Z')$ is another solution for the BSDE (8.3), given that the coefficients $g_n$ are Lipschitz, we can apply the Comparison Theorem 8.3 and deduce that $Y^n \le Y'$ since $g_n \le g$. Hence, by taking the limit we obtain $Y \le Y'$, and then $Y$ is the minimal solution. $\Box$

Remark. Just as previously, if we had used a nonincreasing scheme we would have constructed the maximal solution of the BSDE (8.3).

The way in which the minimal (maximal) solution for (8.3) has been constructed allows us to deduce a comparison result.

Proposition 8.4 Let $(Y, Z)$ be the minimal solution of the BSDE$(g, \xi_T)$, where the pair $(g, \xi_T)$ satisfies (H2). Let us consider a solution $(\bar Y, \bar Z)$ of another BSDE associated with $(\bar g(t, y, z), \bar\xi_T)$. Assume that
\[ \xi_T \le \bar\xi_T \quad P\text{-a.s.} \quad \text{and} \quad g(t, \bar Y_t, \bar Z_t) \le \bar g(t, \bar Y_t, \bar Z_t) \quad dP \otimes dt\text{-a.e.} \]
Then, $Y \le \bar Y$ $P$-a.s.

Proof. Note that under the assumptions of the proposition, for any nondecreasing approximation $g_n$ of the coefficient $g$, $g_n(t, \bar Y_t, \bar Z_t) \le \bar g(t, \bar Y_t, \bar Z_t)$. Then, with the Lipschitz continuous coefficients $g_n$ defined in Lemma 8.5 and their solutions $(Y^n, Z^n)$, the assumptions of Theorem 8.3 are satisfied, and $P$-a.s. $Y^n \le \bar Y$. The same property holds for the limit process $Y$. $\Box$
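The inf-convolution regularization of Lemma 8.5, on which this subsection relies, can be explored numerically. The following is a minimal one-dimensional sketch, assuming the hypothetical coefficient $g(x) = \sqrt{|x|}$ (continuous, of linear growth, but not Lipschitz at 0) and replacing the infimum over $Q^d$ by a minimum over a finite grid; the function name `inf_convolution` and the grid are illustrative choices.

```python
import numpy as np

def inf_convolution(g, n, xs, ys):
    """g_n(x) = min over the grid ys of g(y) + n * |x - y| (one-dimensional sketch)."""
    # rows index the evaluation points x, columns index the candidates y
    return np.min(g(ys)[None, :] + n * np.abs(xs[:, None] - ys[None, :]), axis=1)

g = lambda x: np.sqrt(np.abs(x))          # hypothetical coefficient, not Lipschitz at 0
grid = np.arange(-400, 401) / 200.0       # finite grid on [-2, 2] standing in for Q^d
dx = 1.0 / 200.0
approx = {n: inf_convolution(g, n, grid, grid) for n in (1, 2, 4, 8, 16)}
```

On this grid one can check properties ii) and iii) directly: `approx[n]` is nondecreasing in `n`, stays below `g(grid)`, and its increments between neighbouring grid points are at most `n * dx`. For this particular $g$ the minimum is attained either at $y = x$ or at $y = 0$, so that $g_n(x) = \min(\sqrt{|x|}, n|x|)$, which makes the convergence $g_n \uparrow g$ visible by hand.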
8.2 APPLICATIONS TO OPTIMIZATION PROBLEMS

8.2.1 Financial Mathematics: Pricing of a European Option

Let us consider a financial market where there are one nonrisky asset $P^0$ and a finite family of risky assets $P^i$ ($i = 1, \ldots, m$):
\[ dP^0_t = P^0_t\,r_t\,dt, \qquad dP^i_t = P^i_t(\mu^i_t\,dt + \sigma^i_t\,dW_t), \quad t \le T; \qquad P^0_0 > 0 \text{ and } P^i_0 > 0. \]
The processes $r$ and $\mu^i$ are supposed $\mathcal{P}_1$-measurable and bounded. Precise assumptions on the volatility matrix $\sigma$ will be introduced in the next theorem. Let $\bar\pi = (\pi^0_t, \pi^1_t, \ldots, \pi^m_t)_{t\le T}$ be a portfolio for a small investor. The process $\pi^0$ (resp. $\pi^i$) is the amount invested in the bond (resp. in the risky asset $i$). The value of the portfolio $\bar\pi$ is given by the process $V_t = V^{\bar\pi}_t := \pi^0_t + \pi^1_t + \cdots + \pi^m_t$. A portfolio $\bar\pi$ is called admissible if $(\pi^0, \pi^1, \ldots, \pi^m)$ are $\mathcal{P}_{m+1}$-measurable and if, $P$-a.s.,
\[ \int_0^T |\pi^0_t|\,dt < \infty, \qquad \forall\, i \ge 1 \ \int_0^T |\pi^i_s\sigma^i_s|^2\,ds < \infty, \qquad \text{and} \quad V^{\bar\pi}_t \ge 0 \text{ for any } t \le T. \]
An admissible portfolio $\bar\pi$ is called self-financing if its value $(V^{\bar\pi}_t)_{t\le T}$ satisfies
\[ dV_t = \pi^0_t\,\frac{dP^0_t}{P^0_t} + \pi^1_t\,\frac{dP^1_t}{P^1_t} + \cdots + \pi^m_t\,\frac{dP^m_t}{P^m_t} = r_tV_t\,dt + \sum_{i=1}^m \big(\pi^i_t(\mu^i_t - r_t)\,dt + \pi^i_t\sigma^i_t\,dW_t\big) \]
\[ = r_tV_t\,dt + \pi_t(\mu_t - r_t)\,dt + \pi_t\sigma_t\,dW_t, \qquad (8.15) \]
where $\pi = (\pi^1, \ldots, \pi^m)$ is the risky portfolio. In the following, the set of admissible self-financing portfolios is denoted by $\mathcal{A}$. Sometimes we also use the notation $(V, \pi) \in \mathcal{A}$ with the meaning $(\pi^0, \pi) \in \mathcal{A}$ where $\pi^0_t = V_t - \sum_{i=1}^m \pi^i_t$.

Let us now consider $\xi_T$, the payoff at time $T$ of a nonnegative contingent claim, for instance $\xi = (P^1_T - K)^+$. We want to find the initial endowment $X_0$ the investor should have in order to get the final wealth $\xi_T$ at time $T$, by making a self-financing investment in the market. Formally, the value $X_0$ is defined by
\[ X_0 = \inf\{x;\ \exists (V, \pi) \text{ s.t. } (V, \pi) \in \mathcal{A},\ V_T = \xi \text{ and } V_0 = x\}. \]
Let us first emphasize that even if $(V, \pi\sigma)$ satisfies (8.15) and $V_T = (P^1_T - K)^+$, since there is a lack of integrability of the processes we cannot claim that $(V, \pi\sigma)$ is a solution of a BSDE, namely the one that looks like (8.15).
When the financial market is assumed to be arbitrage free and complete, which is related to the assumption that the volatility matrix $\sigma_t$ is invertible and its inverse is bounded, the next theorem shows that the value of the contingent claim $\xi = (P^1_T - K)^+$ is given by the solution of a linear BSDE.

Theorem 8.6 (El Karoui, Peng, and Quenez [202]) Suppose that $m = d$, that the matrix $\sigma_t = (\sigma^1_t, \sigma^2_t, \ldots, \sigma^m_t)$ is invertible, and that the risk premium vector defined by $\theta_t = (\sigma_t)^{-1}(\mu_t - r_t)$ is bounded. For any $0 \le \xi_T \in L^2_1(F_T)$, the value $X_0$ is equal to $Y_0$, where $(Y_t, \pi_t\sigma_t)_{t\le T}$ is the solution of the following linear BSDE:
\[ dY_t = \{r_tY_t + \pi_t\sigma_t\theta_t\}\,dt + \pi_t\sigma_t\,dW_t, \qquad Y_T = \xi, \qquad (8.16) \]
and $Y_t$ is the value of the same contingent claim at $t$ and $\pi_t$ the risky portfolio. Moreover $Y$ is the minimum value of all admissible portfolios, i.e., if $(V^\pi, \pi)$ is the solution of equation (8.15) such that $(V^\pi, \pi) \in \mathcal{A}$, then $\forall\, t \in [0, T]$, $Y_t \le V^\pi_t$ $P$-a.s.

Proof. By Theorem 8.2, there exists a unique couple of square integrable processes $(Y, Z)$ solution of the BSDE (8.16) with linear coefficient $g(t, y, z) = -r_ty - z\theta_t$. Since the terminal value $\xi_T$ is nonnegative, by Proposition 8.3 ii) the process $Y$ is nonnegative, and the solution is the value of a self-financing admissible portfolio. For $t \le T$ let us set $\pi_t = Z_t\sigma_t^{-1}$; then $\int_0^T |\pi_s\sigma_s|^2\,ds < \infty$ and $(Y, \pi\sigma)$ satisfies equation (8.16). As earlier, the state price density process $\Gamma^{r,\theta}$ (financial name for the adjoint process) plays a key role in comparison results:
\[ d\Gamma^{r,\theta}_{t,s} = \Gamma^{r,\theta}_{t,s}(-r_s\,ds - \theta_s\,dW_s), \qquad \Gamma^{r,\theta}_{t,t} = 1. \qquad (8.17) \]
Then $(\Gamma^{r,\theta}_{0,t}Y_t)$ is a uniformly integrable martingale.

Now let $\tilde\pi = (\tilde\pi^0, \tilde\pi)$ be an admissible portfolio such that $V^{\tilde\pi}_T = \xi$. Since $(V^{\tilde\pi}_t)$ satisfies the same linear equation (8.15), we can use Itô's formula to show that $(\Gamma^{r,\theta}_{0,t}V^{\tilde\pi}_t)$ is a nonnegative local martingale with terminal value $\Gamma^{r,\theta}_{0,T}\xi_T = \Gamma^{r,\theta}_{0,T}Y_T$. Since any nonnegative local martingale is a supermartingale, we have at any time $t$:
\[ \Gamma^{r,\theta}_{0,t}V^{\tilde\pi}_t \ge E[\Gamma^{r,\theta}_{0,T}\xi_T \,|\, F_t] = \Gamma^{r,\theta}_{0,t}Y_t, \]
and $Y_t$ is the price at time $t$ of the contingent claim $\xi_T$. $\Box$

Remark. From the classical mathematical finance point of view, the state price density process is split into the product of two processes, $\Gamma^{r,\theta}_{t,s} = D_{t,s} \times \tilde L_{t,s}$. The first one is the discount factor $D_{t,s} = \exp(-\int_t^s r_u\,du)$ and the second one is the likelihood $\tilde L_{0,T}$ of the so-called risk neutral probability measure $\tilde P$ (also called the equivalent martingale measure):
\[ d\tilde L_{t,s} = \tilde L_{t,s}(-\theta_s\,dW_s), \text{ with } \tilde L_{t,t} = 1, \quad \text{and} \quad dD_{t,s} = -D_{t,s}\,r_s\,ds, \text{ with } D_{t,t} = 1. \qquad (8.18) \]
The explicit representation of the price may be rewritten as
\[ Y_t = E[\Gamma^{r,\theta}_{t,T}\,\xi_T \,|\, F_t] = \tilde E[D_{t,T}\,\xi_T \,|\, F_t]. \qquad (8.19) \]
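The representation of the price through the state price density can be checked numerically in the simplest setting. The sketch below assumes one risky asset with constant coefficients $r$, $\mu$, $\sigma$ (so that $P_T$ and $\Gamma^{r,\theta}_{0,T}$ have explicit lognormal expressions) and compares a Monte Carlo estimate of $E[\Gamma^{r,\theta}_{0,T}(P_T - K)^+]$, computed under the historical measure $P$, with the Black-Scholes formula. The function names and all numerical values are hypothetical choices made for the illustration.

```python
import math
import numpy as np

def bs_call(p0, strike, r, sigma, T):
    """Black-Scholes price of a European call (constant coefficients)."""
    d1 = (math.log(p0 / strike) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return p0 * cdf(d1) - strike * math.exp(-r * T) * cdf(d2)

def state_price_call(p0, strike, r, mu, sigma, T, n_paths=400_000, seed=0):
    """Monte Carlo estimate of Y_0 = E[Gamma_{0,T} (P_T - K)^+] under P."""
    rng = np.random.default_rng(seed)
    w_T = math.sqrt(T) * rng.standard_normal(n_paths)
    theta = (mu - r) / sigma                                      # risk premium
    p_T = p0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * w_T)    # asset under P
    gamma = np.exp(-r * T - theta * w_T - 0.5 * theta**2 * T)     # state price density
    return float(np.mean(gamma * np.maximum(p_T - strike, 0.0)))

params = dict(p0=100.0, strike=100.0, r=0.03, sigma=0.2, T=1.0)   # hypothetical market
mc_price = state_price_call(mu=0.08, **params)
bs_price = bs_call(**params)
```

Up to Monte Carlo error, `mc_price` and `bs_price` should agree even though the drift `mu` enters the simulation: the weight `gamma` removes the dependence on `mu`, which is the content of the change of measure described in the Remark.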
8.2.2 Optimal Control

Let $S$ be a system whose evolution is described by an $R^d$-valued stochastic process $(x_t)_{t\le T}$, solution of the following standard SDE:
\[ dx_t = \sigma(t, x.)\,dW_t, \quad t \le T; \qquad x_0 = x \in R^d. \qquad (8.20) \]
The matrix $\sigma$ is Lipschitz continuous with respect to $x$ and sublinearly growing with respect to the appropriate norm. In addition, it is invertible and its inverse is bounded (note that this latter condition can be relaxed). A controller intervenes on this system via an adapted stochastic process $u := (u_t)_{t\le T}$ taking its values in a compact metric space $U$. Such controls are called admissible and their set is denoted by $\mathcal{U}$. When the controller acts with $u$, the dynamics of the controlled system is the same as that of $x$ under the probability measure $P^u$, whose density with respect to $P$ is given by
\[ \frac{dP^u}{dP} = \exp\Big(\int_0^T \sigma^{-1}(s, x.)f(s, x., u_s)\,dW_s - \frac{1}{2}\int_0^T |\sigma^{-1}(s, x.)f(s, x., u_s)|^2\,ds\Big) := \mathcal{E}\Big(\int_0^{\cdot} \sigma^{-1}(s, x.)f(s, x., u_s)\,dW_s\Big)_T, \qquad (8.21) \]
where $\mathcal{E}(M)$ is the exponential local martingale associated with the martingale $M$. The function $f$ is assumed to be measurable and bounded. Under the new probability measure $P^u$, the process $x$ is again solution of a forward stochastic differential equation, now driven by the $P^u$-Brownian motion $W^u$, and $\forall\, t \in [0, T]$:
\[ dx_t = f(t, x., u_t)\,dt + \sigma(t, x.)\,dW^u_t, \qquad x_0 = x, \]
where $dW^u_t = dW_t - \sigma^{-1}(t, x.)f(t, x., u_t)\,dt$. Those considerations imply that the action of the controller gives rise to a drift in the dynamics of the system. On the other hand, the control action is not free and generates a profit for the agent, denoted by $J(u)$ and equal to
\[ J(u) := E^u\Big[\int_0^T h(s, x., u_s)\,ds + \Psi(x_T)\Big]. \qquad (8.22) \]
The problem is now to find $u^* \in \mathcal{U}$ such that $J(u^*) \ge J(u)$ for any $u \in \mathcal{U}$.

In this model, let us first stress that the functions $f$, $\sigma$, and $h$ depend on $x$ through its path up to $t$ and not only through the value $x_t$ of $x$ at time $t$. In this framework, the appropriate norm is $[x]_t = \sup_{0\le s\le t}|x_s|$. For the sake of simplicity, we assume the following strong properties: $f$ and $h$ are bounded and continuous functions of $u$; $\Psi$ is also assumed to be bounded. We can extend these conditions to the case when the function $\sigma^{-1}f$ is sublinearly growing (resp. has polynomial growth) with respect to $x$.

The way to find an optimal control $u^*$ such that $J(u^*) = \sup_{u\in\mathcal{U}} J(u) = J^*$ is to introduce two BSDEs, with solutions $Y^u$ and $Y^*$, such that $J(u) = Y^u_0$ and $J^* = Y^*_0$. Then, by Comparison Theorem 8.3, the problem can be reduced to the maximization of their coefficients with respect to $u$.
It is easy to represent $J(u)$. Indeed, from the representation of the solution of a linear BSDE in Proposition 8.3, a natural candidate for the coefficient of the BSDE associated with $Y^u$ ($u \in \mathcal{U}$) is the Hamiltonian process:
\[ H(t, x., z, u_t) := z\,\sigma^{-1}(t, x.)f(t, x., u_t) + h(t, x., u_t). \qquad (8.23) \]
Since $\sigma^{-1}(t, x.)f(t, x., u_t)$, $h(t, x., u)$ and $\Psi(x_T)$ are uniformly bounded, $H(t, x., z, u_t)$ is a linear coefficient of a BSDE satisfying (H1). Proposition 8.3 gives the uniqueness of a $P$-solution $(Y^u_t, Z^u_t)$ of BSDE$(H(t, x., z, u_t), \Psi(x_T))$, i.e.,
\[ -dY^u_t = H(t, x., Z^u_t, u_t)\,dt - Z^u_t\,dW_t, \qquad Y^u_T = \Psi(x_T), \]
and we have that $J(u) = Y^u_0$.

It is more difficult to find a BSDE associated with $J^*$. Comparison Theorem 8.3 suggests taking $H^*(t, x., z) = \sup_{u\in U} H(t, x., z, u)$ as coefficient of this BSDE.

i) First, let us show that $H^*$ satisfies (H1). As supremum of uniformly Lipschitz affine coefficients, it is clear that $H^*(t, x., z)$ is a convex uniformly Lipschitz coefficient, since
\[ \Big|\sup_{u\in U} H(t, x., z, u) - \sup_{u\in U} H(t, x., z', u)\Big| \le \sup_{u\in U} |H(t, x., z, u) - H(t, x., z', u)| \le k|z - z'|. \]
Moreover $H^*(t, x., 0) = \sup_{u\in U} h(t, x., u)$ is bounded.

ii) It now remains to verify the measurability of $H^*(t, x., z)$, given that we take a supremum over an uncountable set. There are different ways to prove this result. With our (strong) assumptions, we give below a complete answer to this point. For the use of ess supremum, see for instance [202].

Therefore, thanks to Theorem 8.2, there is a unique $P$-solution $(Y^*_t, Z^*_t)$ of the BSDE$(H^*(t, x., z), \Psi(x_T))$:
\[ -dY^*_t = H^*(t, x., Z^*_t)\,dt - Z^*_t\,dW_t, \qquad Y^*_T = \Psi(x_T). \]
Since for any $u \in \mathcal{U}$, $H^*(t, x., z) \ge H(t, x., z, u)$, we can use Comparison Theorem 8.3 and obtain $Y^*_t \ge Y^u_t$ $\forall\, t \in [0, T]$. In particular $J^* \le Y^*_0$. It remains to show that we have the converse inequality. We do this by providing an optimal control. Since the function $u \mapsto H(t, x., z, u)$ is continuous on the compact set $U$, $H^*(t, x., z)$ is a maximum, and we can select a measurable version $u^*(t, x., z)$ of $\arg\max_u H(t, x., z, u)$, such that $H^*(t, x., z) = H(t, x., z, u^*(t, x., z)) = \sup_{u\in U} H(t, x., z, u)$. The mapping $u^*$ exists through V. Benes' selection theorem, since the function $(t, \omega, z, u) \in [0, T] \times \Omega \times R^d \times U \mapsto H(t, x., z, u)$ is $\mathcal{P} \otimes \mathcal{B}(R^d) \otimes \mathcal{B}(U)/\mathcal{B}(R)$-measurable, the mapping $u \mapsto H(t, x., z, u)$ is continuous, and finally the set $U$ is compact (see, for instance, Benes [17] or Elliot [154] Lemma 16.30 for more details). We summarize these results in the following theorem:

Theorem 8.7 (Hamadène-Lepeltier [117], El Karoui-Peng-Quenez [202]) Under the previous assumptions, for any admissible control $u \in \mathcal{U}$, the Hamiltonian processes $H(t, x., z, u_t)$ and the maximal Hamiltonian process $H^*(t, x., z)$ defined by
\[ H(t, x., z, u_t) := z\,\sigma^{-1}(t, x.)f(t, x., u_t) + h(t, x., u_t), \qquad H^*(t, x., z) := \sup_{u\in U} H(t, x., z, u) \]
give a family of BSDEs with terminal condition $\Psi(x_T)$, and linear or convex coefficients satisfying (H1). The associated solutions are denoted by $(Y^u_t, Z^u_t)$ and $(Y^*_t, Z^*_t)$. Moreover, there exists a measurable control process $u^*(t, x., z)$ such that at any time $t \in [0, T]$ and for any $z \in R^d$, $H^*(t, x., z) = H(t, x., z, u^*(t, x., z))$. The process $u^*_t = u^*(t, x., Z^*_t)$ is an optimal control process since at any time $t \le T$
\[ Y^*_t = Y^{u^*}_t = \operatorname{ess\,sup}_{u\in\mathcal{U}} Y^u_t; \]
in particular $Y^*_0 = \sup_{u\in\mathcal{U}} J(u) = J^*$.
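The pointwise maximization that defines $H^*$ and the selector $u^*$ can be sketched on a toy example. The code below assumes a scalar setting with the time and path dependence suppressed, $\sigma^{-1} = 1$, control set $U = [-1, 1]$ replaced by a finite grid, and the hypothetical choices $f(u) = u$ and $h(u) = -u^2/2$; none of these come from the text. For this choice the maximizer is $u^*(z) = \max(-1, \min(1, z))$, and $H^*(z) = z^2/2$ for $|z| \le 1$, $H^*(z) = |z| - 1/2$ otherwise.

```python
import numpy as np

def max_hamiltonian(z, controls, sigma_inv, f, h):
    """Return (H*(z), u*(z)) for H(z, u) = z * sigma_inv * f(u) + h(u), u on a grid."""
    # rows index z, columns index the candidate controls
    values = z[:, None] * sigma_inv * f(controls)[None, :] + h(controls)[None, :]
    best = np.argmax(values, axis=1)
    return values[np.arange(len(z)), best], controls[best]

controls = np.linspace(-1.0, 1.0, 2001)      # grid on the compact set U = [-1, 1]
z = np.linspace(-3.0, 3.0, 601)
h_star, u_star = max_hamiltonian(
    z, controls, 1.0,
    lambda u: u,                              # hypothetical drift f(u) = u
    lambda u: -0.5 * u**2,                    # hypothetical running reward h(u)
)
```

As a maximum of functions that are affine in `z` with slopes bounded by 1, the array `h_star` is convex and 1-Lipschitz in `z`, which mirrors step i) of the argument above; `u_star` plays the role of the measurable selector $u^*(t, x., z)$.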
Notes: This result has been extended by Hamadène and Lepeltier in [116] to the case when the function $\sigma^{-1}f$ is sub-linearly growing (resp. has polynomial growth) with respect to $x$. Note that the same type of result is used under a different set of assumptions in Chapter 3 by Barrieu and El Karoui when defining $g$-conditional risk measures as the maximal solution of a BSDE associated with a convex coefficient $g$. In the book edited by El Karoui and Mazliak, several papers are dedicated to the relationship between BSDEs and stochastic control. In particular, the paper of Quenez provides a link between martingale methods in stochastic control (in [154] or [80]) and BSDEs.

8.2.3 Zero-Sum Stochastic Differential Games

We now consider a situation where two agents, denoted by $c_1$ and $c_2$, intervene on the system $S$ but have antagonistic interests. To control the system, the agent $c_1$ (resp. $c_2$) acts with a $\mathcal{P}$-measurable stochastic process $u := (u_t)_{t\le T}$ (resp. $v := (v_t)_{t\le T}$) taking values in a compact metric space $U$ (resp. $V$). The set of admissible controls for $c_1$ (resp. $c_2$) is denoted by $\mathcal{U}$ (resp. $\mathcal{V}$). When the agents act on $S$ with a pair of strategies $(u, v) \in \mathcal{U} \times \mathcal{V}$, the dynamics of the controlled system has the same law as that of $(x_t)_{t\le T}$ under the probability $P^{u,v}$ defined by its density with respect to $P$ in the following way:
\[ \frac{dP^{u,v}}{dP} = \mathcal{E}\Big(\int_0^{\cdot} \sigma^{-1}(s, x.)f(s, x., u_s, v_s)\,dW_s\Big)_T. \]
The function $f$ is measurable, continuous with respect to $(u, v)$, and sublinearly growing with respect to $x$. As in the control setting, the interventions of the controllers generate a drift in the dynamics of the evolution of the system. More precisely,
\[ dx_t = f(t, x., u_t, v_t)\,dt + \sigma(t, x.)\,dW^{u,v}_t, \quad t \le T; \qquad x_0 = x, \]
where $W^{u,v}$ is a Brownian motion under $P^{u,v}$.

Between the agents $c_1$ and $c_2$, depending on the pair of strategies $(u, v)$ used, there is a payoff $J(u, v)$, which is a reward for $c_1$ and a cost for $c_2$. Its expression is given by
\[ J(u, v) := E^{u,v}\Big[\int_0^T h(s, x., u_s, v_s)\,ds + \Psi(x_T)\Big], \]
where $h$ is a bounded, measurable function. In addition, it is continuous with respect to $(u, v)$. The problem is now to find a fair strategy $(u^*, v^*)$ for both agents, i.e., a strategy that satisfies for any $u \in \mathcal{U}$ and $v \in \mathcal{V}$: $J(u, v^*) \le J(u^*, v^*) \le J(u^*, v)$. This is a zero-sum game problem and $(u^*, v^*)$ is called a saddle-point for the game. For an admissible pair of strategies $(u, v)$, let us define the Hamiltonian of the zero-sum game
\[ H(t, x., z, u_t, v_t) := z\,\sigma^{-1}(t, x.)f(t, x., u_t, v_t) + h(t, x., u_t, v_t). \]
Note that $H$ is an affine function of $z$, uniformly Lipschitz continuous with respect to $z$. Let us now assume:

(J1): the function $H$ satisfies the following Isaacs' condition:
\[ \inf_{v\in V}\sup_{u\in U} H(t, x., z, u, v) = \sup_{u\in U}\inf_{v\in V} H(t, x., z, u, v). \]
Under (J1), there exist two measurable functions $u^*(t, x., z)$ and $v^*(t, x., z)$ taking values respectively in $U$ and $V$ such that
• (i) $\inf_{v\in V}\sup_{u\in U} H(t, x., z, u, v) = H(t, x., z, (u^*, v^*)(t, x., z))$,
• (ii) for any $(u, v) \in U \times V$, for any $t, z$: $H(t, x., z, u, v^*(t, x., z)) \le H(t, x., z, (u^*, v^*)(t, x., z)) \le H(t, x., z, u^*(t, x., z), v)$.

We have the following result related to the existence of a saddle-point for the zero-sum stochastic differential game:

Theorem 8.8 (Hamadène and Lepeltier [116]) Assume that (J1) holds. Then the BSDE associated with $(H(t, x., z, u^*(t, x., z), v^*(t, x., z)), \Psi(x_T))$ has a solution $(Y^*_t, Z^*_t)$ in the sense that
\[ -dY^*_t = H(t, x., Z^*_t, (u^*, v^*)(t, x., Z^*_t))\,dt - Z^*_t\,dW_t, \qquad Y^*_T = \Psi(x_T). \]
In addition, the pair of strategies $(u^*, v^*) := (u^*(t, x., Z^*_t), v^*(t, x., Z^*_t))$ is a saddle-point for the zero-sum stochastic differential game and
\[ J(u^*, v^*) = Y^*_0 = \inf_{v\in\mathcal{V}}\sup_{u\in\mathcal{U}} J(u, v) = \sup_{u\in\mathcal{U}}\inf_{v\in\mathcal{V}} J(u, v), \]
i.e., $Y^*_0$ is the value of the game.

Short proof. For the sake of simplicity, we only consider here the case of $f$ bounded. The existence of $(Y^*, Z^*)$ is a direct consequence of Theorem 8.2, since by point (i) the function that associates $H(t, x., z, u^*(t, x., z), v^*(t, x., z))$ with $z$ is uniformly Lipschitz continuous. In addition, we have $Y^*_0 = J(u^*, v^*)$. Now if $u$ is another control for $c_1$, then there exists $(Y^u, Z^u)$ solution of the BSDE associated with $(H(t, x., z, u_t, v^*(t, x., Z^*_t)), \Psi(x_T))$ and $Y^u_0 = J(u, v^*)$. On the other hand, both the Comparison Theorem 8.3 and point (ii) imply that $Y^u_0 \le Y^*_0$, i.e., $J(u, v^*) \le J(u^*, v^*)$. In a symmetric way, we can prove that $J(u^*, v^*) \le J(u^*, v)$ for any admissible strategy $v$ for $c_2$. Hence $(u^*, v^*)$ is a saddle-point for the game. The fact that $Y^*_0$ is the value of the game follows from the inequalities
\[ Y^*_0 \le \sup_{u\in\mathcal{U}}\inf_{v\in\mathcal{V}} J(u, v) \quad \text{and} \quad Y^*_0 \ge \inf_{v\in\mathcal{V}}\sup_{u\in\mathcal{U}} J(u, v), \]
which hold because $(u^*, v^*)$ is a saddle-point for the zero-sum game. Since $\sup\inf \le \inf\sup$ always holds, both inequalities are in fact equalities. $\Box$
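Isaacs' condition (J1) can be tested numerically on finite control grids. The sketch below assumes scalar controls in $U = V = [-1, 1]$, the time and path dependence suppressed, and two hypothetical Hamiltonians: a separable one, $H = z(u - v) - u^2/2 + v^2/2$, for which the lower and upper values coincide, and a coupled one, $H = (u - v)^2$, for which they do not. The function `lower_upper_values` and both examples are illustrative assumptions, not taken from the text.

```python
import numpy as np

def lower_upper_values(H, z, us, vs):
    """Return (sup_u inf_v H, inf_v sup_u H) on finite control grids."""
    table = H(z, us[:, None], vs[None, :])        # rows: u, columns: v
    lower = float(table.min(axis=1).max())        # sup over u of inf over v
    upper = float(table.max(axis=0).min())        # inf over v of sup over u
    return lower, upper

us = np.linspace(-1.0, 1.0, 201)
vs = np.linspace(-1.0, 1.0, 201)

# separable Hamiltonian: drift u - v, running payoff -u^2/2 + v^2/2
separable = lambda z, u, v: z * (u - v) - 0.5 * u**2 + 0.5 * v**2
# coupled running payoff (u - v)^2, no drift term
coupled = lambda z, u, v: (u - v) ** 2 + 0.0 * z

lo_sep, up_sep = lower_upper_values(separable, 0.4, us, vs)
lo_cpl, up_cpl = lower_upper_values(coupled, 0.4, us, vs)
```

For the separable Hamiltonian the two values agree up to rounding, so (J1) holds and a saddle-point $(u^*, v^*)$ exists. For the coupled example the lower value is 0 while the upper value is 1, so (J1) fails and Theorem 8.8 does not apply.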
8.3 MARKOVIAN BSDEs When the BSDE coefficients are deterministic functions of a diffusion process, the solution (Y,Z) is also a deterministic function of the same process. If, in addition, a certain regularity on the coefficients is introduced, it is possible to relate these functions with the pair (solution, gradient) of some semi-linear PDE. In this section, we will show this relationship, starting from very general results to end up with fine regularity results. The basic framework is the following: the randomness of the coefficient and the terminal value of a Markovian BSDE come from a diffusion process (Xst,x , 0 ≤ s ≤ T ), which is the strong solution of a standard (forward) Itô stochastic differential equation (SDE): ) dXst,x = b(s, Xst,x )ds + σ (s, Xst,x )dWs , t ≤ s ≤ T (8.24) 0 ≤ s ≤ t. Xst,x = x, For any given (t, x) ∈ [0, T ] × Rd , we will denote by (Yst,x , Zst,x , 0 ≤ s ≤ T ) a solution of the following BSDE: ) −dYs = g(s, Xst,x , Ys , Zs )1{s≥t} ds − Zs dWs (8.25) YT = (XTt,x ). We still assume that the SDE (8.24) has a unique strong solution. Moreover, in order to have good estimates of the solution, we assume that the conditions (M1f ) are satisfied, where the symbol f is used to refer to forward equation. (i) b and σ are uniformly Lipschitz continuous with respect to x. (M1f ) (ii) there exists a constant c s.t. for any (s, x), |σ (s, x)| + |b(s, x)| ≤ c(1 + |x|). The system of equations (8.25) and (8.24) is called a forward-backward stochastic differential equation (FBSDE). Note: More general FBSDEs are studied in Ma and Yong [177] when the coefficients of the FSDE also depend on the solution (Y, Z) of the BSDE. The problem is then
much more complicated and does not admit a solution in all cases (see [177] for a complete review of this problem).

Let us now introduce the Markovian version of the previous assumptions (H1):

(M1b)
(i) The function $g : [0, T] \times \mathbb{R}^d \times \mathbb{R}^m \times \mathbb{R}^{m\times d} \to \mathbb{R}^m$ is uniformly Lipschitz in (y, z) with Lipschitz constant C, i.e., $|g(s, x, y_1, z_1) - g(s, x, y_2, z_2)| \le C(|y_1 - y_2| + |z_1 - z_2|)$.
(ii) There exist a constant c and a real constant $p \ge 1/2$ such that $|g(s, x, 0, 0)| + |\Psi(x)| \le c(1 + |x|^p)$.

or the Markovian version of assumptions (H2) formulated earlier:

(M2b)
(i) The real function $g : [0, T] \times \mathbb{R}^d \times \mathbb{R}^m \times \mathbb{R}^{m\times d} \to \mathbb{R}$ has uniform linear growth: $|g(s, x, y, z)| \le |g(s, x, 0, 0)| + C(|y| + |z|)$.
(ii) The real function $(y, z) \mapsto g(t, x, y, z)$ is continuous.
(iii) There exist a constant c and a real constant $p \ge 1/2$ such that $|g(s, x, 0, 0)| + |\Psi(x)| \le c(1 + |x|^p)$.
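As an illustration of the forward equation (8.24), the following minimal sketch simulates $X^{t,x}$ with an Euler-Maruyama scheme in one dimension. The helper name `euler_maruyama`, the drift `b`, the volatility `sigma`, and all numerical values are assumptions chosen for the example (they satisfy (M1f)); none of them comes from the text.

```python
import math
import random

def euler_maruyama(b, sigma, t, x, T, n_steps, rng):
    """Simulate one path of dX = b(s, X) ds + sigma(s, X) dW on [t, T], X_t = x.

    Returns the list of (time, value) pairs on a uniform grid.
    """
    dt = (T - t) / n_steps
    s, X = t, x
    path = [(s, X)]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))      # Brownian increment
        X = X + b(s, X) * dt + sigma(s, X) * dW
        s += dt
        path.append((s, X))
    return path

# Hypothetical coefficients: both are Lipschitz in x with linear growth.
def b(s, x):
    return -0.5 * x

def sigma(s, x):
    return 1.0 + 0.2 * math.sin(x)

rng = random.Random(0)
paths = [euler_maruyama(b, sigma, 0.0, 1.0, 1.0, 200, rng) for _ in range(2000)]
mean_XT = sum(p[-1][1] for p in paths) / len(paths)
print(mean_XT, math.exp(-0.5))   # sample mean of X_T versus x * exp(-0.5 * T)
```

Because the drift is linear and the stochastic integral has zero mean, $E[X_T^{0,1}] = e^{-0.5}$ here, which gives a simple sanity check on the simulated paths.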
8.3.1 A General Result

This general result is the main application of the Picard approximation studied in Proposition 8.1. Note that under assumption (M1b) (resp. (M2b)) the stochastic coefficient $g(s, X_s^{t,x}, y, z)$ and the terminal condition $\Psi(X_T^{t,x})$ of the BSDE (8.25) satisfy (H1) (resp. (H2)). We now show that the solution $(Y_s^{t,x}, Z_s^{t,x})$ is Markovian, in the sense that these processes can be expressed through deterministic functions of s and $X_s^{t,x}$. More precisely:

Theorem 8.9 Under (M1f) and (M1b), or (M1f) and (M2b), there exist two measurable deterministic functions u(t, x) and d(t, x) such that the solution $(Y_s^{t,x}, Z_s^{t,x})$ of BSDE (8.25) is given by

$$\forall s \le T, \quad Y_s^{t,x} = u(s, X_s^{t,x}) \quad\text{and}\quad Z_s^{t,x} = d(s, X_s^{t,x})\,\sigma(s, X_s^{t,x}), \quad ds \otimes dP\text{-a.e.}$$

Furthermore, for any $\mathcal{F}_t$-measurable random variable $\chi \in L^2$, the solution $(Y_s^{t,\chi}, Z_s^{t,\chi})_{s \ge t}$ is given by $(u(s, X_s^{t,\chi}), d(s, X_s^{t,\chi}))_{s \ge t}$.

Proof. a) Let us first assume that g does not depend on (y, z), i.e., g(t, x, y, z) = g(t, x). The result can be obtained from Theorem 6.27 in Cinlar et al. [77] (one can also refer to Dellacherie and Meyer [70]). More precisely, in our framework, this theorem can be written as follows:

Lemma 8.10 Let $\mathcal{B}_e$ be the σ-field on $\mathbb{R}^n$ generated by the functions $E\big[\int_t^T \phi(s, X_s^{t,x})\,ds\big]$, where φ is a continuous bounded $\mathbb{R}^n$-valued function. Then the semimartingale

$$Y_s^{t,x} = E\Big[\Psi(X_T^{t,x}) + \int_s^T g(r, X_r^{t,x})\,dr \,\Big|\, \mathcal{F}_s\Big]$$

admits a continuous version given by $u(s, X_s^{t,x})$, where

$$u(t, x) = E\Big[\Psi(X_T^{t,x}) + \int_t^T g(r, X_r^{t,x})\,dr\Big]$$

is $\mathcal{B}_e$-measurable. Moreover, $\int_t^s g(r, X_r^{t,x})\,dr + Y_s^{t,x} - u(t, x)$ is an additive martingale with the following representation:

$$\int_t^s g(r, X_r^{t,x})\,dr + Y_s^{t,x} = u(t, x) + \int_t^s d(r, X_r^{t,x})\,\sigma(r, X_r^{t,x})\,dW_r, \quad t \le s \le T,$$

where d(t, x) is $\mathcal{B}_e$-measurable. In particular, $Z_s^{t,x} = d(s, X_s^{t,x})\,\sigma(s, X_s^{t,x})$, $ds \otimes dP$-a.e.

The use of the σ-field $\mathcal{B}_e$ instead of the Borel σ-field ensures that, for any bounded $\mathcal{B}_e$-measurable function f, the process $f(X_\cdot^{t,x})$ is optional; in particular, if f is an excessive function, $f(X_\cdot^{t,x})$ is a càdlàg supermartingale.

b) Let us now consider a general Lipschitz coefficient g(s, x, y, z) satisfying (M1b). By applying Lemma 8.10 to the Picard approximations $(Y_s^n, Z_s^n)$ introduced in Proposition 8.1, we conclude by a recursive argument that there exist $\mathcal{B}_e$-measurable functions $u_n$, $d_n$ such that, for all $s \in [t, T]$,

$$Y_s^{(t,x),n} = u_n(s, X_s^{t,x}), \qquad Z_s^{(t,x),n} = d_n(s, X_s^{t,x})\,\sigma(s, X_s^{t,x}).$$
We have shown in Proposition 8.1 the $ds \otimes dP$-almost everywhere convergence of this sequence to $(Y^{(t,x)}, Z^{(t,x)})$. Let us denote

$$u(s, x) := \limsup_{n\to+\infty} u_n(s, x) \quad\text{and}\quad d(s, x) := \limsup_{n\to+\infty} d_n(s, x).$$

First, note that the optional processes $(u(s, X_s^{t,x}), d(s, X_s^{t,x}))_{s\le T}$ are $ds \otimes dP$-almost everywhere equal to $(Y_s^{(t,x)}, Z_s^{(t,x)})_{s\le T}$. In the right-hand side of equation (8.25) we can replace $(Y_s^{(t,x)}, Z_s^{(t,x)})$ by $(u(s, X_s^{t,x}), d(s, X_s^{t,x}))$, and $Y_s^{(t,x)}$ appears as the solution of a BSDE with given coefficient $g(s, X_s^{t,x}, u(s, X_s^{t,x}), d(s, X_s^{t,x}))$. But the previous lemma implies that $Y_s^{(t,x)}$ only depends on $(s, X_s^{t,x})$, that is, $Y_s^{(t,x)} = \hat{u}(s, X_s^{t,x})$. Moreover, $\hat{u}(s, X_s^{t,x}) = u(s, X_s^{t,x})$, $ds \otimes dP$-a.e. Hence, $\hat{u}$ may be used instead of u in the coefficient g. In conclusion, $(\hat{u}(s, X_s^{t,x}), d(s, X_s^{t,x}))$ is a solution of the BSDE (8.25).

c) We now suppose that g is a real function and that (M2b) holds. Using approximation by inf-convolution as in Theorem 8.4, we still have the Markovian property for the approximating BSDEs, whose coefficients are of the form $g_n(s, X_s^{t,x}, y, z)$. Let us denote by $u_n$ and $d_n$ the functions associated with the solutions of the Lipschitz BSDEs as in (b) above, and let us set $\tilde{u} = \liminf u_n$ and $\tilde{d} = \liminf d_n$. By exactly the same argument as above, we conclude that $(\tilde{u}(s, X_s^{t,x}), \tilde{d}(s, X_s^{t,x}))$ is the minimal solution of the BSDE. □

We now introduce some continuity assumptions on g and Ψ in order to show that the solution is continuous with respect to (t, x) and, more importantly, that the solution of the BSDE is the viscosity solution of some nonlinear PDE.
(M1b, iii): the mapping $x \mapsto (g(t, x, 0, 0), \Psi(x))$ is continuous.

We refer to the global assumption (M1b) + (M1b, iii) as (M1bc).

8.3.2 Viscosity Solution of Semilinear PDEs

Let us now consider the following system of semilinear parabolic PDEs, where u is an $\mathbb{R}^m$-valued function defined on $[0, T] \times \mathbb{R}^d$:

$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) + Lu(t, x) + g(t, x, u(t, x), D_\sigma u(t, x)) = 0, & \forall t \in [0, T],\ x \in \mathbb{R}^d,\\[4pt] u(T, x) = \Psi(x), & \forall x \in \mathbb{R}^d. \end{cases} \tag{8.26}$$

Here L is the second-order differential operator given by

$$L := \frac{1}{2}\sum_{i,j=1}^{d} (\sigma\sigma^*)_{i,j}\,\frac{\partial^2}{\partial x^i \partial x^j} + \sum_{i=1}^{d} b_i\,\frac{\partial}{\partial x^i} \quad\text{and}\quad D_\sigma u := \nabla u\,\sigma. \tag{8.27}$$
Under the assumptions (M1f) + (M1bc) on the coefficients, we can only consider the solution of PDE (8.26) in the viscosity sense. Moreover, we need to make the following restriction: for $1 \le i \le m$, the i-th coordinate of g, denoted by $g_i$, depends only on the i-th row of the matrix z. Therefore, equation (8.26) can be written as

$$\begin{cases} \dfrac{\partial u_i}{\partial t}(t, x) + Lu_i(t, x) + g_i(t, x, u(t, x), (\nabla u_i\,\sigma)(t, x)) = 0, & i = 1, \ldots, m,\\[4pt] u(T, x) = \Psi(x). \end{cases}$$

Let us first introduce the definition of a viscosity solution:

Definition 8.11 Assume $u \in C([0, T] \times \mathbb{R}^d; \mathbb{R}^m)$ and $u(T, x) = \Psi(x)$ for all $x \in \mathbb{R}^d$. u is called a viscosity subsolution (resp. supersolution) of PDE (8.26) if, for any $1 \le i \le m$, $\varphi \in C^{1,2}([0, T] \times \mathbb{R}^d)$ and $(t, x) \in [0, T] \times \mathbb{R}^d$ such that $\varphi(t, x) = u_i(t, x)$ and (t, x) is a local maximum (resp. minimum) of $u_i - \varphi$,

$$-\frac{\partial\varphi}{\partial t}(t, x) - L\varphi(t, x) - g_i(t, x, u(t, x), (\nabla\varphi\,\sigma)(t, x)) \le 0 \tag{8.28}$$

$$\Big(\text{resp. } -\frac{\partial\varphi}{\partial t}(t, x) - L\varphi(t, x) - g_i(t, x, u(t, x), (\nabla\varphi\,\sigma)(t, x)) \ge 0\Big).$$

Moreover, $u \in C([0, T] \times \mathbb{R}^d; \mathbb{R}^m)$ is called a viscosity solution of PDE (8.26) if it is both a viscosity subsolution and a viscosity supersolution.

We now give the probabilistic interpretation of the viscosity solution of PDE (8.26) using the solution $(Y_s^{t,x}, Z_s^{t,x})$ of the BSDE (8.25):

Theorem 8.12 (Pardoux and Peng [217]) Under the assumptions (M1f) and (M1bc), $u(t, x) := Y_t^{t,x}$ is a viscosity solution of PDE (8.26) and grows at most polynomially at infinity.
Proof. First, it is well known (see for instance Kunita [165]) that, for any $p \ge 2$, there exists a constant $c_p$ such that for any $t, t' \in [0, T]$ and $x, x' \in \mathbb{R}^d$,

$$E\Big[\sup_{0\le s\le T} |X_s^{t,x} - X_s^{t',x'}|^p\Big] \le c_p\,(1 + |x|^p)\,\big(|x - x'|^p + |t - t'|^{p/2}\big). \tag{8.29}$$

Therefore, using (M1bc), (8.29) and the $L^p$-estimates of the solution of BSDE (8.25) (see Pardoux and Peng [217], Lemma 2.1 and Corollary 2.10 for all the technical details), we get that there exists a constant q such that

$$E\Big[\sup_{0\le s\le T} |Y_s^{t,x} - Y_s^{t',x'}|^p\Big] \le c_p\,(1 + |x|^q)\,\big(|x - x'|^p + |t - t'|^{p/2}\big). \tag{8.30}$$

Thus, $u(t, x) := Y_t^{t,x}$ is locally Lipschitz in x and Hölder continuous in t; in particular, u is a continuous function of (t, x). The polynomial growth of u follows from the $L^p$-estimates for $Y_\cdot^{t,x}$, the growth assumptions on g(t, x, 0, 0) and Ψ, and the $L^p$-estimates for the diffusion $X_\cdot^{t,x}$.

Let us now prove that u is a viscosity subsolution of (8.26). For the sake of simplicity, we consider only the one-dimensional case, i.e., m = 1. Take $\varphi \in C^{1,2}([0, T] \times \mathbb{R}^d)$ and (t, x) a local maximum point of $u - \varphi$. We assume without loss of generality that $u(t, x) = \varphi(t, x)$. We suppose that

$$\frac{\partial\varphi}{\partial t}(t, x) + L\varphi(t, x) + g(t, x, u(t, x), (\nabla\varphi\,\sigma)(t, x)) < 0, \tag{8.31}$$

and we will prove that this leads to a contradiction. By continuity, (8.31) still holds in a neighborhood of the point (t, x). Indeed, let $0 < \alpha \le T - t$ be such that for all $t \le s \le t + \alpha$ and $|y - x| \le \alpha$,

$$u(s, y) \le \varphi(s, y) \quad\text{and}\quad \frac{\partial\varphi}{\partial t}(s, y) + L\varphi(s, y) + g(s, y, u(s, y), (\nabla\varphi\,\sigma)(s, y)) < 0. \tag{8.32}$$

Now let us define the stopping time $\tau = \inf\{s \ge t;\ |X_s^{t,x} - x| \ge \alpha\} \wedge (t + \alpha)$ and the pair

$$(\bar{Y}_s, \bar{Z}_s) := \big(Y_{s\wedge\tau}^{t,x},\ 1_{[0,\tau]}(s)\,Z_s^{t,x}\big), \quad t \le s \le t + \alpha,$$

which is the solution of the BSDE

$$\bar{Y}_s = u(\tau, X_\tau^{t,x}) + \int_s^{t+\alpha} 1_{[0,\tau]}(r)\,g(r, X_r^{t,x}, u(r, X_r^{t,x}), \bar{Z}_r)\,dr - \int_s^{t+\alpha} \bar{Z}_r\,dW_r, \quad t \le s \le t + \alpha.$$

Using Itô's formula, the pair

$$(\tilde{Y}_s, \tilde{Z}_s) := \big(\varphi(s\wedge\tau, X_{s\wedge\tau}^{t,x}),\ 1_{[0,\tau]}(s)\,(\nabla\varphi\,\sigma)(s, X_s^{t,x})\big), \quad t \le s \le t + \alpha,$$

solves the BSDE

$$\tilde{Y}_s = \varphi(\tau, X_\tau^{t,x}) - \int_s^{t+\alpha} 1_{[0,\tau]}(r)\Big(\frac{\partial\varphi}{\partial t} + L\varphi\Big)(r, X_r^{t,x})\,dr - \int_s^{t+\alpha} \tilde{Z}_r\,dW_r, \quad t \le s \le t + \alpha.$$

Now using (8.32) and the strict Comparison Theorem 8.3, we get that $\bar{Y}_t < \tilde{Y}_t$, i.e., $u(t, x) < \varphi(t, x)$, which contradicts our assumption. □
8.3.3 Regularity Results on BSDEs

We are now interested in the relationship between BSDEs and regular solutions of semilinear PDEs. We study in detail the regularity of the BSDE solution with respect to the initial conditions and provide a probabilistic method for studying solutions of the semilinear PDE (8.26).

Classical Solutions of Semilinear PDEs

We first study classical solutions, so we need regularity results for solutions of BSDEs. The objective is to find probabilistic estimates with respect to the initial conditions (in time and space) in order to show that the BSDE yields a regular solution of the PDE. The BSDE estimates require precise $L^p$-estimates of both the diffusion and its first and second derivatives. The proof of these results is technical, so we do not present it here and refer to Pardoux and Peng ([217], Lemma 2.5 and Theorem 2.9) for a complete proof.

Theorem 8.13 (Pardoux and Peng [217]) In addition to assumptions (M1f) and (M1b), we assume that the functions b, σ, g and Ψ are $C^3$ with bounded derivatives. Then:
i) The process $\{Y_s^{t,x};\ (s, t) \in [0, T]^2,\ x \in \mathbb{R}^d\}$ has a version that belongs almost surely to $C^{0,0,2}([0, T]^2 \times \mathbb{R}^d)$. In particular, $Y_t^{t,x}$ is of class $C^2$ in x a.s., and the derivatives up to order 2 are a.s. continuous in (t, x).
ii) The random field $\{Z_s^{t,x},\ 0 \le t \le s \le T\}$ has an a.s. continuous version such that

$$Z_s^{t,x} = \nabla Y_s^{t,x}\,(\nabla X_s^{t,x})^{-1}\,\sigma(s, X_s^{t,x}). \tag{8.33}$$

In particular, taking s = t, we obtain $Z_t^{t,x} = \nabla Y_t^{t,x}\,\sigma(t, x)$.

We now give a probabilistic interpretation for solutions of the semilinear PDE (8.26) using the solution of the Markovian BSDE (8.25):

Theorem 8.14 (Pardoux and Peng [217]) In addition to assumptions (M1f) and (M1b), we assume that the functions b, σ, g and Ψ are $C^3$ with bounded derivatives. Then:
i) If $u \in C^{1,2}([0, T] \times \mathbb{R}^d, \mathbb{R}^m)$ is a classical solution of PDE (8.26), then the couple $(u(s, X_s^{t,x}), D_\sigma u(s, X_s^{t,x}))$ is the solution of the BSDE (8.25) on the time interval [t, T]. In addition, for any $t \le T$, $u(t, x) = Y_t^{t,x}$.
ii) If $(Y_s^{t,x}, Z_s^{t,x})$ is the unique solution of the BSDE (8.25), then $u(t, x) := Y_t^{t,x}$, $0 \le t \le T$, $x \in \mathbb{R}^d$, belongs to $C^{1,2}([0, T] \times \mathbb{R}^d, \mathbb{R}^m)$ and is a classical solution of the PDE (8.26).

Proof. i) This result is a direct consequence of Itô's formula applied to $u(s, X_s^{t,x})$, $s \in [t, T]$.
ii) The second result is the nonlinear Feynman-Kac formula for the semilinear PDE (8.26). For the complete proof, see Pardoux and Peng [217]. For $t \in [0, T]$ and $x \in \mathbb{R}^d$, let $u(t, x) := Y_t^{t,x}$. From Theorem 8.13, the function u belongs to $C^{0,2}([0, T] \times \mathbb{R}^d)$; it remains to show that u is differentiable in t and satisfies PDE (8.26). Using the flow property of $\{X_s^{t,x};\ t \le s \le T,\ x \in \mathbb{R}^d\}$ and the uniqueness of the solution of the BSDE (8.25), we get for
each $t \le t + h \le T$,

$$Y_{t+h}^{t,x} = Y_{t+h}^{t+h,\,X_{t+h}^{t,x}} = u(t + h, X_{t+h}^{t,x}).$$

Furthermore, let $\Delta := \{t = t_0 < t_1 < \cdots < t_n = T\}$ be a subdivision of the interval [t, T]. Since $u(T, x) = \Psi(x)$, a telescoping sum gives

$$u(T, x) - u(t, x) = \sum_{i=0}^{n-1} \big(u(t_{i+1}, x) - u(t_i, x)\big) = \sum_{i=0}^{n-1} \big(u(t_{i+1}, x) - u(t_{i+1}, X_{t_{i+1}}^{t_i,x})\big) + \sum_{i=0}^{n-1} \big(u(t_{i+1}, X_{t_{i+1}}^{t_i,x}) - u(t_i, x)\big).$$

By Itô's formula and the BSDE (8.25) we get

$$u(T, x) - u(t, x) = -\sum_{i=0}^{n-1} \bigg\{ \int_{t_i}^{t_{i+1}} \big(Lu(t_{i+1}, X_r^{t_i,x}) + g(r, X_r^{t_i,x}, Y_r^{t_i,x}, Z_r^{t_i,x})\big)\,dr - \int_{t_i}^{t_{i+1}} \big(Z_r^{t_i,x} - D_\sigma u(t_{i+1}, X_r^{t_i,x})\big)\,dW_r \bigg\}.$$

Now, letting $\lim_{n\to\infty} \sup_i (t_{i+1}^n - t_i^n) = 0$ in the above equation and using $Z_s^{t,x} = D_\sigma u(s, X_s^{t,x})$, we obtain

$$u(T, x) - u(t, x) = -\int_t^T \big(Lu(r, x) + g(r, x, u(r, x), D_\sigma u(r, x))\big)\,dr,$$

which implies that $u \in C^{1,2}([0, T] \times \mathbb{R}^d)$ and solves the PDE (8.26). □
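The correspondence $u(t, x) = Y_t^{t,x}$ of Theorem 8.14 can be checked numerically on a toy case. The sketch below assumes $d = m = 1$, $b = 0$, $\sigma = 1$, a linear driver $g(t, x, y, z) = A\,y$ and $\Psi(x) = x^2$; these choices, the constants `A` and `T`, and the function names are assumptions made for the example. For this driver the BSDE is linear, so $Y_t^{t,x} = E[e^{A(T-t)}\Psi(x + W_{T-t})]$, and the candidate classical solution is $u(t, x) = e^{A(T-t)}(x^2 + T - t)$.

```python
import math
import random

A, T = 0.3, 1.0   # illustrative constants (assumptions)

def psi(x):
    """Terminal function Psi(x) = x^2."""
    return x * x

def g(t, x, y, z):
    """Linear driver g = A * y, Lipschitz in (y, z)."""
    return A * y

def u(t, x):
    """Candidate classical solution of du/dt + (1/2) u_xx + g = 0, u(T, x) = Psi(x)."""
    return math.exp(A * (T - t)) * (x * x + (T - t))

def pde_residual(t, x, h=1e-4):
    """Central finite differences for du/dt + (1/2) u_xx + g(t, x, u, u_x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t + 0.5 * u_xx + g(t, x, u(t, x), u_x)

def y_monte_carlo(t, x, n, rng):
    """Monte Carlo estimate of Y_t^{t,x} = E[exp(A (T - t)) Psi(x + W_{T-t})]."""
    sd = math.sqrt(T - t)
    total = sum(psi(x + sd * rng.gauss(0.0, 1.0)) for _ in range(n))
    return math.exp(A * (T - t)) * total / n

residual = max(abs(pde_residual(t, x))
               for t in (0.1, 0.5, 0.9) for x in (-1.0, 0.0, 2.0))
mc = y_monte_carlo(0.2, 0.5, 50_000, random.Random(1))
print(residual, mc, u(0.2, 0.5))
```

The PDE residual should be at the level of finite-difference error, and the Monte Carlo value should agree with `u(0.2, 0.5)` up to sampling noise.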
Sobolev Solutions of Semilinear PDEs

For smooth coefficients, the PDE (8.26) has a classical solution, but if we assume that the coefficient g is simply a Lipschitz function, one has to consider solutions in a weak sense. More precisely, u is a weak solution of Equation (8.26) with terminal value Ψ if the following relation holds for all $\phi \in C_c^{1,\infty}([0, T] \times \mathbb{R}^d)$:

$$\int_s^T (u(r, \cdot), \partial_r\phi(r, \cdot))\,dr + (u(s, \cdot), \phi(s, \cdot)) - (\Psi(\cdot), \phi(\cdot, T)) + \int_s^T e(u(r, \cdot), \phi(r, \cdot))\,dr = \int_s^T (g(r, \cdot, u(r, \cdot), D_\sigma u(r, \cdot)), \phi(r, \cdot))\,dr, \tag{8.34}$$

where $(\phi, \psi) = \int_{\mathbb{R}^d} \phi(x)\psi(x)\,dx$ denotes the scalar product in $L^2(\mathbb{R}^d, dx)$ and

$$e(\psi, \phi) = \int_{\mathbb{R}^d} \Big( (D_\sigma\psi\, D_\sigma\phi) + \phi\,\nabla\Big(\Big(\frac{1}{2}\sigma^*\nabla\sigma + b\Big)\psi\Big) \Big)\,dx$$

is the energy of the system associated with the PDE. This is a variational formulation of PDE (8.26).

So now assume that, instead of (M1f) and (M1b), the following assumptions hold:

(M1′f): b (resp. σ) is $C^2(\mathbb{R}^d; \mathbb{R}^d)$ (resp. $C^3(\mathbb{R}^d; \mathbb{R}^{d\times d})$) with bounded derivatives.
and

(M1′b)
(i) The function $g : [0, T] \times \mathbb{R}^d \times \mathbb{R}^m \times \mathbb{R}^{m\times d} \to \mathbb{R}^m$ is uniformly Lipschitz in (y, z) with Lipschitz constant C, i.e., $|g(s, x, y_1, z_1) - g(s, x, y_2, z_2)| \le C(|y_1 - y_2| + |z_1 - z_2|)$.
(ii) $\Psi \in L^2(\mathbb{R}^d, dx)$ and $g(t, x, 0, 0) \in L^2([0, T] \times \mathbb{R}^d, dt \otimes dx)$.
Under the above assumptions, the BSDE (8.25) still has a unique solution provided that we give a sense to $\Psi(X_T^{t,x})$ and $g(s, X_s^{t,x}, y, z)$. This difficulty was tackled by Barles and Lesigne [10] and Bally and Matoussi [8], who used the following norm equivalence result:

Proposition 8.5 There exist two constants $c_1, c_2 > 0$ such that, for every $t \le s \le T$ and $\phi \in L^1(\mathbb{R}^d)$, we have

$$c_1 \int_{\mathbb{R}^d} |\phi(x)|\,dx \le \int_{\mathbb{R}^d} E[|\phi(X_s^{t,x})|]\,dx \le c_2 \int_{\mathbb{R}^d} |\phi(x)|\,dx. \tag{8.35}$$

Moreover, for every $\psi \in L^1([0, T] \times \mathbb{R}^d)$,

$$c_1 \int_t^T\!\!\int_{\mathbb{R}^d} |\psi(s, x)|\,dx\,ds \le \int_t^T\!\!\int_{\mathbb{R}^d} E[|\psi(s, X_s^{t,x})|]\,dx\,ds \le c_2 \int_t^T\!\!\int_{\mathbb{R}^d} |\psi(s, x)|\,dx\,ds. \tag{8.36}$$

The constants $c_1$, $c_2$ depend on T and on the bounds of the first (resp. first and second) derivatives of b (resp. σ).

Proof. Under the regularity assumption (M1′f), the stochastic flow $(X_s^{t,x})$ is differentiable in x (see Kunita [165]). We denote by $(\hat{X}_s^{t,x})$ the inverse flow and by $J(\hat{X}_s^{t,x})$ the determinant of its Jacobian matrix. $J(\hat{X}_s^{t,x})$ is positive because it is a continuous function of $s \in [t, T]$ and $J(\hat{X}_t^{t,x}) = 1$. Moreover, we know (see Lemma 5.1 in Bally and Matoussi [8]) that there exist two constants c > 0 and C > 0 such that

$$c \le E[J(\hat{X}_s^{t,x})] \le C, \quad \forall x \in \mathbb{R}^d,\ t \le s \le T. \tag{8.37}$$

Using the change of variable $x = \hat{X}_s^{t,y}$, we get for any $\phi \in L^1(\mathbb{R}^d)$

$$\int_{\mathbb{R}^d} E[|\phi(X_s^{t,x})|]\,dx = \int_{\mathbb{R}^d} |\phi(y)|\,E[J(\hat{X}_s^{t,y})]\,dy,$$

which, in view of the estimates (8.37), yields the inequalities (8.35). Moreover, using (8.35) for $x \mapsto \psi(s, x)$ and then integrating with respect to $s \in [t, T]$, we get the inequalities (8.36). □

The natural space for the weak solutions of the semilinear PDE (8.34) is the following Hilbert space:

$$\mathcal{H} := \{u \in L^2([0, T] \times \mathbb{R}^d) \mid D_\sigma u \in L^2([0, T] \times \mathbb{R}^d)\},$$

endowed with the norm $\|u\|_{\mathcal{H}}^2 := \|u\|_{L^2}^2 + \|D_\sigma u\|_{L^2}^2$.
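As a sanity check of the norm equivalence (8.35), here is a worked example in which the constants can be computed by hand. The choice $d = 1$, $b(s, x) = -a x$ with $a > 0$, and $\sigma \equiv 1$ is an assumption made for illustration only; it satisfies the smoothness and bounded-derivative requirements on b and σ.

```latex
% Ornstein-Uhlenbeck flow: X_s^{t,x} = e^{-a(s-t)} x + N_s, where the noise term
% N_s = \int_t^s e^{-a(s-r)}\,dW_r does not depend on x.
\[
  \hat{X}_s^{t,y} = e^{a(s-t)}\,(y - N_s),
  \qquad
  J(\hat{X}_s^{t,y}) = e^{a(s-t)} .
\]
% Change of variable y = e^{-a(s-t)} x + N_s for fixed \omega, then Fubini:
\[
  \int_{\mathbb{R}} E\big[|\phi(X_s^{t,x})|\big]\,dx
  = E\Big[\int_{\mathbb{R}} \big|\phi\big(e^{-a(s-t)}x + N_s\big)\big|\,dx\Big]
  = e^{a(s-t)} \int_{\mathbb{R}} |\phi(y)|\,dy .
\]
% Since 1 \le e^{a(s-t)} \le e^{aT} for t \le s \le T, (8.35) holds with
\[
  c_1 = 1, \qquad c_2 = e^{aT}.
\]
```

In this example the Jacobian is deterministic, so the bound (8.37) holds with $c = 1$ and $C = e^{aT}$; in general only the expectation of the Jacobian is controlled.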
The goal of Proposition 8.5 is to make a natural connection between the norm used to study the BSDE, namely the $H^2$-norm, and the one used for the PDE, namely the $\mathcal{H}$-norm. On the other hand, in order to connect the weak solution of PDE (8.34) with the BSDE (8.25), we introduce random test functions $(\phi_t)$ defined, for any $\phi \in C_c^\infty(\mathbb{R}^d)$, by

$$\phi_t(s, x) := \phi(\hat{X}_s^{t,x})\,J(\hat{X}_s^{t,x}), \quad \forall s \in [0, T],\ \forall x \in \mathbb{R}^d. \tag{8.38}$$

This random test function appears naturally in the definition of the composition of $v \in L^2(\mathbb{R}^d)$ with the stochastic flow $(X_s^{t,x})$, i.e., $(v \circ X_s^{t,\cdot}, \phi) := (v, \phi_t(s, \cdot))$. Indeed, by a change of variable, we have

$$(v \circ X_s^{t,\cdot}, \phi) = \int_{\mathbb{R}^d} v(y)\,\phi(\hat{X}_s^{t,y})\,J(\hat{X}_s^{t,y})\,dy = \int_{\mathbb{R}^d} v(X_s^{t,x})\,\phi(x)\,dx.$$

The main idea is to use $\phi_t$ as a test function in Equation (8.34). The problem is that $s \mapsto \phi_t(s, x)$ is not differentiable, so that the term $\int_t^T (u_s, \partial_s\phi)\,ds$ makes no sense. However, $\phi_t(s, x)$ is a semimartingale and we have the following result (see Lemma 2.1 in Bally and Matoussi [8] for the proof):

Lemma 8.15 For every function $\phi \in C_c^2(\mathbb{R}^d)$,

$$\phi_t(s, x) = \phi(x) - \sum_{j=1}^{d} \int_t^s \Big( \sum_{i=1}^{d} \frac{\partial}{\partial x_i}\big(\sigma_{ij}(x)\,\phi_t(r, x)\big) \Big)\,dW_r^j + \int_t^s L^*\phi_t(r, x)\,dr, \tag{8.39}$$

where $L^*$ is the adjoint operator of L.
Then we may replace $\partial_s\phi\,ds$ by the Itô stochastic integral with respect to $d\phi_t(s, x)$ in Equation (8.34) and get the following result, which will play the same role as Itô's formula in Theorem 8.14-i) (see Proposition 2.3 in [8] for the proof).

Proposition 8.6 (Bally and Matoussi [8]) Let $u \in \mathcal{H}$ be a weak solution of PDE (8.34). Then for any $s \in [0, T]$ and $\phi \in C_c^2(\mathbb{R}^d)$,

$$\int_s^T (u(r, \cdot), d\phi_t(r, \cdot)) - (\Psi(\cdot), \phi_t(T, \cdot)) + (u(s, \cdot), \phi_t(s, \cdot)) - \int_s^T e(u(r, \cdot), \phi_t(r, \cdot))\,dr = \int_s^T (g(r, \cdot, u(r, \cdot), D_\sigma u(r, \cdot)), \phi_t(r, \cdot))\,dr, \tag{8.40}$$

where

$$\int_s^T (u(r, \cdot), d\phi_t(r, \cdot)) := -\sum_{j=1}^{d} \int_s^T \Big( \sum_{i=1}^{d} \Big(u(r, \cdot), \frac{\partial}{\partial x_i}\big(\sigma_{ij}(x)\,\phi_t(r, \cdot)\big)\Big) \Big)\,dW_r^j + \int_s^T (u(r, \cdot), L^*\phi_t(r, \cdot))\,dr. \tag{8.41}$$
We now give the weak Feynman-Kac formula for the solution of (8.34):

Theorem 8.16 (Bally and Matoussi [8], Barles and Lesigne [10]) Let us assume that (M1′f) and (M1′b) hold. There exists a unique solution $u \in \mathcal{H}$ of the PDE (8.34). Moreover, $u(t, x) = Y_t^{t,x}$ and $D_\sigma u(t, x) = Z_t^{t,x}$, where $\{(Y_s^{t,x}, Z_s^{t,x}),\ t \le s \le T\}$ is the solution of BSDE (8.25), and

$$Y_s^{t,x} = u(s, X_s^{t,x}), \qquad Z_s^{t,x} = D_\sigma u(s, X_s^{t,x}), \qquad ds \otimes dx \otimes dP\text{-a.e.} \tag{8.42}$$
Proof. i) Uniqueness. Let $u^1$ and $u^2 \in \mathcal{H}$ be two weak solutions of the PDE (8.34). Using Proposition 8.6, for any $\phi \in C_c^2(\mathbb{R}^d)$ and i = 1, 2, we have

$$\int_s^T (u^i(r, \cdot), d\phi_t(r, \cdot)) - (\Psi(\cdot), \phi_t(T, \cdot)) + (u^i(s, \cdot), \phi_t(s, \cdot)) - \int_s^T e(u^i(r, \cdot), \phi_t(r, \cdot))\,dr = \int_s^T (g(r, \cdot, u^i(r, \cdot), D_\sigma u^i(r, \cdot)), \phi_t(r, \cdot))\,dr. \tag{8.43}$$

Substituting (8.41) in the above equation, we get

$$\int_{\mathbb{R}^d} u^i(s, x)\,\phi_t(s, x)\,dx = (\Psi(\cdot), \phi_t(T, \cdot)) - \int_s^T\!\!\int_{\mathbb{R}^d} D_\sigma u^i(r, x)\,\phi_t(r, x)\,dx\,dW_r + \int_s^T\!\!\int_{\mathbb{R}^d} \phi_t(r, x)\,g(r, x, u^i(r, x), D_\sigma u^i(r, x))\,dx\,dr.$$

Then, by the change of variable $y = \hat{X}_r^{t,x}$, we obtain

$$\int_{\mathbb{R}^d} u^i(s, X_s^{t,y})\,\phi(y)\,dy = \int_{\mathbb{R}^d} \Psi(X_T^{t,y})\,\phi(y)\,dy + \int_{\mathbb{R}^d}\!\int_s^T \phi(y)\,g(r, X_r^{t,y}, u^i(r, X_r^{t,y}), D_\sigma u^i(r, X_r^{t,y}))\,dr\,dy - \int_{\mathbb{R}^d}\!\int_s^T D_\sigma u^i(r, X_r^{t,y})\,\phi(y)\,dW_r\,dy.$$

Since φ is arbitrary, $(u^i(s, X_s^{t,y}), D_\sigma u^i(s, X_s^{t,y}))$ solves the following BSDE for i = 1, 2:

$$u^i(s, X_s^{t,y}) = \Psi(X_T^{t,y}) + \int_s^T g(r, X_r^{t,y}, u^i(r, X_r^{t,y}), D_\sigma u^i(r, X_r^{t,y}))\,dr - \int_s^T D_\sigma u^i(r, X_r^{t,y})\,dW_r.$$

Then, by the uniqueness of the solution of the BSDE, we get $u^1(s, X_s^{t,y}) = u^2(s, X_s^{t,y})$ and $D_\sigma u^1(s, X_s^{t,y}) = D_\sigma u^2(s, X_s^{t,y})$. Taking s = t, we deduce that $u^1(t, y) = u^2(t, y)$, dy-a.e.

ii) Existence. For each $(t, x) \in [0, T] \times \mathbb{R}^d$, we define $u(t, x) := Y_t^{t,x}$ and $v(t, x) := Z_t^{t,x}$, where $(Y_s^{t,x}, Z_s^{t,x})_{t\le s\le T}$ is the solution of BSDE (8.25). By using the flow property $X_r^{s, X_s^{t,x}} = X_r^{t,x}$, $\forall t \le s \le r$, and the uniqueness of the solution of (8.25), it is easy to check that $Y_s^{t,x} = u(s, X_s^{t,x})$ and $Z_s^{t,x} = v(s, X_s^{t,x})$. Let us prove that u is the weak solution of PDE (8.34) and that $v = D_\sigma u$.
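By the a priori estimate (8.10), we have

$$E|Y_s^{t,x}|^2 + E\int_s^T |Z_r^{t,x}|^2\,dr \le c\,\Big( E|\Psi(X_T^{t,x})|^2 + E\int_s^T |g(r, X_r^{t,x}, 0, 0)|^2\,dr \Big). \tag{8.44}$$

Using the equivalence of norms results (8.35)-(8.36) and (8.44), we get

$$\int_{\mathbb{R}^d} E|\Psi(X_T^{t,x})|^2\,dx \le c_2 \int_{\mathbb{R}^d} |\Psi(x)|^2\,dx < \infty,$$

and

$$\int_{\mathbb{R}^d}\!\int_t^T \big(|u(r, x)|^2 + |v(r, x)|^2\big)\,dr\,dx \le \frac{1}{c_1} \int_{\mathbb{R}^d}\!\int_t^T \big(E|u(r, X_r^{t,x})|^2 + E|v(r, X_r^{t,x})|^2\big)\,dr\,dx$$
$$= \frac{1}{c_1} \int_{\mathbb{R}^d}\!\int_t^T \big(E|Y_r^{t,x}|^2 + E|Z_r^{t,x}|^2\big)\,dr\,dx \le K \Big( \int_{\mathbb{R}^d} |\Psi(x)|^2\,dx + \int_{\mathbb{R}^d}\!\int_t^T |g(r, x, 0, 0)|^2\,dr\,dx \Big) < \infty,$$

where K is a positive constant depending only on c, $c_1$, $c_2$ and T. This proves that u and v belong to $L^2([0, T] \times \mathbb{R}^d)$. Therefore the function $f(s, x) := g(s, x, u(s, x), v(s, x))$ belongs to $L^2([0, T] \times \mathbb{R}^d)$ in view of the assumption (M1′b).

We now consider a sequence $(Y_s^{\varepsilon,t,x}, Z_s^{\varepsilon,t,x})_{t\le s\le T}$ of solutions of linear BSDEs with terminal value $\Psi_\varepsilon$ and coefficient $f_\varepsilon$:

$$Y_s^{\varepsilon,t,x} = \Psi_\varepsilon(X_T^{t,x}) + \int_s^T f_\varepsilon(r, X_r^{t,x})\,dr - \int_s^T Z_r^{\varepsilon,t,x}\,dW_r. \tag{8.45}$$

Here $(f_\varepsilon)$ and $(\Psi_\varepsilon)$ are sequences of smooth functions with compact support that approximate f and Ψ in $L^2([0, T] \times \mathbb{R}^d)$ and in $L^2(\mathbb{R}^d)$, respectively. It is easy to check that $(Y_s^{\varepsilon,t,x}, Z_s^{\varepsilon,t,x})$ converges to $(Y_s^{t,x}, Z_s^{t,x})$ in $H_m^2 \times H_{m\times d}^2$ as $\varepsilon \to 0$. By Theorem 8.14, $u_\varepsilon(t, x) := Y_t^{\varepsilon,t,x}$ is a classical solution of PDE (8.26) with (g, Ψ) replaced by $(f_\varepsilon, \Psi_\varepsilon)$, and so it satisfies

$$\int_s^T (u_\varepsilon(r, \cdot), \partial_r\phi(r, \cdot))\,dr + (u_\varepsilon(s, \cdot), \phi(s, \cdot)) - (\Psi_\varepsilon(\cdot), \phi(\cdot, T)) + \int_s^T e(u_\varepsilon(r, \cdot), \phi(r, \cdot))\,dr = \int_s^T (f_\varepsilon(r, \cdot), \phi(r, \cdot))\,dr \tag{8.46}$$

for all $\phi \in C_c^{1,\infty}([0, T] \times \mathbb{R}^d)$. Moreover, we also get

$$v_\varepsilon(t, x) := Z_t^{\varepsilon,t,x} = D_\sigma u_\varepsilon(t, x). \tag{8.47}$$

Therefore, by Proposition 8.5 and a standard argument, the sequences $(u_\varepsilon)$ and $(v_\varepsilon)$ converge in the Hilbert space $L^2([0, T] \times \mathbb{R}^d)$ to u and v, respectively. Thus, taking the limit in (8.46) and (8.47), we get that $u(t, x) := Y_t^{t,x}$ is a solution of PDE (8.34) and $v(t, x) := Z_t^{t,x} = D_\sigma u(t, x)$. □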
8.4 BSDEs WITH QUADRATIC GROWTH WITH RESPECT TO Z

We come back to general BSDEs, with different assumptions on the coefficient. This extension is particularly useful in the study of convex risk measures, as in Chapter 3 by Barrieu and El Karoui. We now assume that g and ξ satisfy the following conditions:

(H3)
(i) The function g takes its values in $\mathbb{R}$.
(ii) There exists C > 0 such that, P-a.s., $|g(t, y, z)| \le C(1 + |z|^2)$ for any (t, y, z) in $[0, T] \times \mathbb{R}^{1+d}$.
(iii) P-a.s., for any $t \in [0, T]$, the function $(y, z) \mapsto g(t, y, z)$ is continuous.
(iv) The random variable $\xi_T$ is bounded.
8.4.1 Existence of a Maximal Solution

Once again we are going to show that the BSDE associated with (g, ξ) has a solution. Let $\mathcal{B}$ be the set of bounded, continuous and P-measurable processes. Then we have:

Theorem 8.17 (Kobylanski [159], Lepeltier and San Martin [172]) Under (H3) the BSDE associated with (g, ξ) has a maximal solution $(Y_t, Z_t)_{t\le T}$. In addition, the process Y belongs to $\mathcal{B}$.

Proof. The proof is based on an exponential transform. Since the random variable ξ is bounded, there exist two constants $\gamma_1$ and $\gamma_2$ such that, P-a.s., $\gamma_1 \le \xi \le \gamma_2$. Let (Y, Z) be a solution of the BSDE associated with (g, ξ). If we set, for $t \le T$, $\bar{Y}_t = \exp(2CY_t)$ and $\bar{Z}_t = 2C\bar{Y}_t Z_t$, where C is the constant of quadratic growth of g, then we have

$$\bar{Y}_t = \exp(2C\xi) + \int_t^T 2C\bar{Y}_s \Big( g\Big(s, \frac{\ln \bar{Y}_s}{2C}, \frac{\bar{Z}_s}{2C\bar{Y}_s}\Big) - \frac{|\bar{Z}_s|^2}{4C\bar{Y}_s^2} \Big)\,ds - \int_t^T \bar{Z}_s\,dW_s, \quad \forall t \le T.$$

Now, for $(t, y, z) \in [0, T] \times ]0, \infty[ \times \mathbb{R}^d$, let us set

$$G(t, \omega, y, z) = 2Cy \Big( g\Big(t, \omega, \frac{\ln y}{2C}, \frac{z}{2Cy}\Big) - \frac{|z|^2}{4Cy^2} \Big).$$

Then G satisfies the following condition: P-a.s.,

$$-2C^2 y - \frac{|z|^2}{y} \le G(t, \omega, y, z) \le 2C^2 y \quad\text{for any } (t, y, z) \in [0, T] \times ]0, \infty[ \times \mathbb{R}^d. \tag{8.48}$$

Let α, β be two real constants such that $0 < \alpha \le e^{2C(\gamma_1 - CT)}$ and $\beta \ge e^{2C(\gamma_2 + CT)}$. On the other hand, let ρ be the truncation function defined for $y \in \mathbb{R}$ by

$$\rho(y) = \alpha\,1_{[y < \alpha]} + y\,1_{[\alpha \le y \le \beta]} + \beta\,1_{[y > \beta]}.$$

Let $\eta = e^{2C\xi}$ and $\tilde{G}(t, \omega, y, z) = G(t, \omega, \rho(y), z)$ for any $(t, y, z) \in [0, T] \times \mathbb{R}^{1+d}$. To begin with, we are going to show that the BSDE associated with $(\tilde{G}, \eta)$ has a solution (Y, Z).
For $p \ge 0$, let $k_p$ and $\tilde{G}^p$ be the functions defined as follows: $k_p$ is continuous with

$$0 \le k_p \le 1, \quad k_p(z) = 1 \text{ if } |z| \le p \quad\text{and}\quad k_p(z) = 0 \text{ if } |z| \ge p + 1,$$

and

$$\tilde{G}^p(t, \omega, y, z) = 2C^2\rho(y)\,(1 - k_p(z)) + k_p(z)\,\tilde{G}(t, \omega, y, z) \quad\text{for any } (t, y, z).$$

It can easily be seen, through (8.48), that $\tilde{G}^p$ is continuous and bounded, that the sequence $(\tilde{G}^p)_{p\ge 0}$ is decreasing, and that $\lim_{p\to\infty} \tilde{G}^p = \tilde{G}$.

Now, for $p \ge 0$, let $(Y_t^p, Z_t^p)_{t\le T}$ be the maximal solution of the BSDE associated with $(\tilde{G}^p, \eta)$, which exists according to Theorem 8.4. Then we have $Y^p \in \mathcal{B}$, $Z^p \in H_d^2$ and

$$Y_t^p = \eta + \int_t^T \tilde{G}^p(s, Y_s^p, Z_s^p)\,ds - \int_t^T Z_s^p\,dW_s, \quad \forall t \le T. \tag{8.49}$$

As $\tilde{G}^p \ge \tilde{G}^{p+1}$, using the comparison result of Proposition 8.4, we have $Y^p \ge Y^{p+1}$.

Now for $t \le T$, let us set $a_t = e^{2C^2(t - T) + 2\gamma_1 C}$ and $\tilde{a}_t = e^{2C^2(T - t) + 2\gamma_2 C}$; then the functions a and $\tilde{a}$ satisfy, for any $t \le T$,

$$a_t = e^{2\gamma_1 C} - 2C^2 \int_t^T a_s\,ds = e^{2\gamma_1 C} - 2C^2 \int_t^T \rho(a_s)\,ds$$

and

$$\tilde{a}_t = e^{2\gamma_2 C} + 2C^2 \int_t^T \tilde{a}_s\,ds = e^{2\gamma_2 C} + 2C^2 \int_t^T \rho(\tilde{a}_s)\,ds,$$

since $a \le \tilde{a}$, $\alpha \le a_0$ and $\beta \ge \tilde{a}_0$.

Now, as $\tilde{G}^p(t, y, z) \ge \tilde{G}(t, y, z) \ge -2C^2\rho(y) - \frac{|z|^2}{\rho(y)}$, once again by comparison we have, for any $t \le T$, $Y_t^p \ge a_t \ge a_0$, since (a, 0) is a solution of the BSDE associated with $\big(-2C^2\rho(y) - \frac{|z|^2}{\rho(y)},\ e^{2C\gamma_1}\big)$. Therefore there exists an upper semicontinuous process $(Y_t)_{t\le T}$ such that, P-a.s., for any $t \le T$, $Y_t^p \to Y_t$ as $p \to \infty$. In addition, the sequence $(Y^p)_{p\ge 0}$ converges in $H_1^2$ to Y. On the other hand, thanks to (8.48), there exists a constant c > 0 such that for any $p \ge 0$, P-a.s.,

$$-2c^2 Y_t^p - c|Z_t^p|^2 \le \tilde{G}(t, Y_t^p, Z_t^p) \le 2c^2 Y_t^p \quad\text{for any } t \in [0, T]. \tag{8.50}$$
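The truncation and cutoff construction above can be checked numerically. The sketch below assumes a scalar toy driver $g(t, y, z) = C\sin(y)(1 + z^2)$, a piecewise linear cutoff $k_p$, and arbitrary values for `C`, `ALPHA` and `BETA`; all of these are assumptions for the example, and the function names are hypothetical. It evaluates $G$, $\tilde{G}$ and $\tilde{G}^p$ on a small grid so that the bounds (8.48) and the monotonicity in p can be tested.

```python
import math

C = 0.5                      # quadratic-growth constant of g (illustrative)
ALPHA, BETA = 0.2, 5.0       # truncation levels, 0 < ALPHA < BETA (illustrative)

def g(t, y, z):
    """Toy driver with |g| <= C (1 + z^2); an assumption, not taken from the text."""
    return C * math.sin(y) * (1.0 + z * z)

def G(t, y, z):
    """Exponentially transformed driver, defined for y > 0."""
    return 2 * C * y * (g(t, math.log(y) / (2 * C), z / (2 * C * y))
                        - z * z / (4 * C * y * y))

def rho(y):
    """Truncation of y to the interval [ALPHA, BETA]."""
    return min(max(y, ALPHA), BETA)

def G_tilde(t, y, z):
    """Truncated driver G(t, rho(y), z), defined for every real y."""
    return G(t, rho(y), z)

def k(p, z):
    """Continuous cutoff: 1 for |z| <= p, 0 for |z| >= p + 1, linear in between."""
    return min(1.0, max(0.0, p + 1.0 - abs(z)))

def G_tilde_p(p, t, y, z):
    """Bounded approximation that decreases to G_tilde as p grows."""
    return 2 * C * C * rho(y) * (1.0 - k(p, z)) + k(p, z) * G_tilde(t, y, z)

grid = [(y, z) for y in (-1.0, 0.1, 0.5, 1.0, 3.0, 7.0)
               for z in (-6.0, -2.5, -0.3, 0.0, 0.7, 1.5, 4.2)]
for y, z in grid[:3]:
    print(y, z, [round(G_tilde_p(p, 0.0, y, z), 4) for p in range(4)])
```

The monotonicity follows from $\tilde{G}^p - \tilde{G}^{p+1} = (k_{p+1} - k_p)(2C^2\rho - \tilde{G}) \ge 0$, which uses only the upper bound in (8.48) and the fact that $k_p$ increases with p.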
From now on the proof is divided into four steps.

Step 1: The sequence $(Z^p)_{p\ge 0}$ is uniformly bounded in $H_d^2$.

For any $t \le T$ we have

$$Y_t^p = Y_0^p - \int_0^t \tilde{G}^p(s, Y_s^p, Z_s^p)\,ds + \int_0^t Z_s^p\,dW_s. \tag{8.51}$$

Let $\psi(x) = \exp(-3cx)$. Using Itô's formula with $\psi(Y_t^p)$ and taking t = T, we obtain

$$\psi(\eta) = \psi(Y_0^p) - \int_0^T \psi'(Y_s^p)\,\tilde{G}^p(s, Y_s^p, Z_s^p)\,ds + \int_0^T \psi'(Y_s^p)\,Z_s^p\,dW_s + \frac{1}{2}\int_0^T \psi''(Y_s^p)\,|Z_s^p|^2\,ds. \tag{8.52}$$
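As $\psi' < 0$ and $\tilde{G}^p \ge \tilde{G}$, and taking into account (8.50), we obtain

$$\int_0^T \Big(\frac{1}{2}\psi''(Y_s^p) + c\,\psi'(Y_s^p)\Big)|Z_s^p|^2\,ds \le \psi(\eta) - \psi(Y_0^p) + 2c^2 \int_0^T |Y_s^p\,\psi'(Y_s^p)|\,ds - \int_0^T \psi'(Y_s^p)\,Z_s^p\,dW_s.$$

But $\frac{1}{2}\psi''(x) + c\,\psi'(x) = \frac{3}{2}c^2\,\psi(x)$ and $Y^p$ is uniformly bounded, so $\psi(Y^p)$ is bounded away from zero; taking expectations in the previous inequality then implies that

$$\sup_{p\ge 0} E\int_0^T |Z_s^p|^2\,ds < \infty.$$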
Step 2: There exist a subsequence of $(Z^p)_{p\ge 0}$ and a process $Z \in H_d^2$ such that $(Z^p)_{p\ge 0}$ converges to Z in $H_d^2$.

We know that the sequence $(Z^p)_{p\ge 0}$ is uniformly bounded in $H_d^2$. Therefore, there exists a subsequence, still denoted by $(Z^p)_{p\ge 0}$, which converges weakly in $H_d^2$ to a stochastic process $(Z_t)_{t\le T}$ belonging to $H_d^2$.

Let $p \le q$ be two integers. Through the definition of $\tilde{G}^p$ and the relation (8.50), we have

$$\tilde{G}^p(s, Y_s^p, Z_s^p) - \tilde{G}^q(s, Y_s^q, Z_s^q) = 2C^2\rho(Y_s^p)(1 - k_p(Z_s^p)) + k_p(Z_s^p)\,\tilde{G}(s, Y_s^p, Z_s^p) - \tilde{G}^q(s, Y_s^q, Z_s^q)$$
$$\le \lambda(1 + |Z_s^q|^2) \le \lambda\big(1 + |Z_s^q - Z_s^p|^2 + |Z_s^p - Z_s|^2 + |Z_s|^2\big),$$

where $\lambda \ge \frac{1}{16}$ is a real constant. For $x \in \mathbb{R}$, let us set $\psi(x) = e^{4\lambda x} - \frac{x}{4} - 1$. Using Itô's formula, we obtain

$$E[\psi(Y_0^p - Y_0^q)] + \frac{1}{2}E\int_0^T \psi''(Y_s^p - Y_s^q)\,|Z_s^p - Z_s^q|^2\,ds = E\int_0^T \psi'(Y_s^p - Y_s^q)\big(\tilde{G}^p(s, Y_s^p, Z_s^p) - \tilde{G}^q(s, Y_s^q, Z_s^q)\big)\,ds.$$

But since $\lambda \ge \frac{1}{16}$ and $Y^p \ge Y^q$, we have $\psi'(Y_s^p - Y_s^q) \ge 0$, and therefore

$$E[\psi(Y_0^p - Y_0^q)] + E\int_0^T \Big(\frac{1}{2}\psi'' - \lambda\psi'\Big)(Y_s^p - Y_s^q)\,|Z_s^p - Z_s^q|^2\,ds \le E\int_0^T \psi'(Y_s^p - Y_s^q)\,\lambda\big(1 + |Z_s^p - Z_s|^2 + |Z_s|^2\big)\,ds.$$

However, we have $\frac{1}{2}\psi''(x) - \lambda\psi'(x) = 4\lambda^2 e^{4\lambda x} + \frac{\lambda}{4}$, so the bounded process $(\frac{1}{2}\psi'' - \lambda\psi')^{1/2}(Y^p - Y^q)$ converges strongly in $H_1^2$, as q tends to ∞, to the process $(\frac{1}{2}\psi'' - \lambda\psi')^{1/2}(Y^p - Y)$. Hence $\{(\frac{1}{2}\psi'' - \lambda\psi')^{1/2}(Y^p - Y^q)\}(Z^p - Z^q)$ converges weakly, as $q \to \infty$, to $\{(\frac{1}{2}\psi'' - \lambda\psi')^{1/2}(Y^p - Y)\}(Z^p - Z)$. Now, since for any sequence $(u_n)_{n\ge 0}$ of $H_1^2$ that converges weakly to u we have $\|u\|^2 \le \liminf_{n\to\infty} \|u_n\|^2$, we get

$$E\int_0^T \Big(\frac{1}{2}\psi'' - \lambda\psi'\Big)(Y_s^p - Y_s)\,|Z_s^p - Z_s|^2\,ds \le \liminf_{q\to\infty} E\int_0^T \Big(\frac{1}{2}\psi'' - \lambda\psi'\Big)(Y_s^p - Y_s^q)\,|Z_s^p - Z_s^q|^2\,ds$$
$$\le E\int_0^T \psi'(Y_s^p - Y_s)\,\lambda\big(1 + |Z_s^p - Z_s|^2 + |Z_s|^2\big)\,ds - E[\psi(Y_0^p - Y_0)].$$

Therefore,

$$E\int_0^T \Big(\frac{1}{2}\psi'' - 2\lambda\psi'\Big)(Y_s^p - Y_s)\,|Z_s^p - Z_s|^2\,ds \le E\int_0^T \psi'(Y_s^p - Y_s)\,\{\lambda + \lambda|Z_s|^2\}\,ds - E[\psi(Y_0^p - Y_0)].$$

Finally, using the Lebesgue dominated convergence theorem and the fact that $\frac{1}{2}\psi'' - 2\lambda\psi' = \frac{\lambda}{2}$, we deduce that

$$\lim_{p\to\infty} E\int_0^T |Z_s^p - Z_s|^2\,ds = 0. \tag{8.53}$$
Step 3: The pair of processes (Y, Z) is a solution of the BSDE associated with (G, η).

The sequence $(Z^p)_{p\ge 0}$ converges strongly to Z in $H_d^2$, so there exists a subsequence, which we still denote by $(Z^p)_{p\ge 0}$, such that

$$Z^p \to Z \text{ as } p \to \infty,\ dt \otimes dP\text{-a.e.,} \quad\text{and}\quad \tilde{Z}_t = \sup_{p\ge 0} |Z_t^p| \in H_d^2. \tag{8.54}$$

Let ν be a stopping time bounded by T; then we have

$$Y_\nu^p = Y_0^p - \int_0^\nu \tilde{G}^p(s, Y_s^p, Z_s^p)\,ds + \int_0^\nu Z_s^p\,dW_s.$$

The sequence $(\int_0^\nu Z_s^p\,dW_s)_{p\ge 0}$ converges in $L^2(dP)$ to $\int_0^\nu Z_s\,dW_s$, since $(Z^p)_{p\ge 0}$ converges in $H_d^2$ to the process Z. On the other hand,

$$\int_0^\nu \big(\tilde{G}^p(s, Y_s^p, Z_s^p) - \tilde{G}(s, Y_s, Z_s)\big)\,ds = \int_0^\nu \big(\tilde{G}^p(s, Y_s^p, Z_s^p) - \tilde{G}(s, Y_s^p, Z_s^p)\big)\,ds + \int_0^\nu \big(\tilde{G}(s, Y_s^p, Z_s^p) - \tilde{G}(s, Y_s, Z_s)\big)\,ds.$$

The last term on the right-hand side converges to 0 in $L^1(dP)$ as $p \to \infty$, by the Lebesgue dominated convergence theorem: indeed, $dt \otimes dP$-a.e., $\tilde{G}(t, Y_t^p, Z_t^p) \to \tilde{G}(t, Y_t, Z_t)$ and $|\tilde{G}(t, Y_t^p, Z_t^p)| \le C(1 + |\tilde{Z}_t|^2)$.

Besides, $(\int_0^\nu (\tilde{G}^p(s, Y_s^p, Z_s^p) - \tilde{G}(s, Y_s^p, Z_s^p))\,ds)_{p\ge 0}$ also converges to 0 in $L^1(dP)$ as $p \to \infty$, since through Dini's theorem, $dt \otimes dP$-a.e., the sequence $\tilde{G}^p(t, y, z) - \tilde{G}(t, y, z)$ converges uniformly on compact subsets of $\mathbb{R}^{1+d}$, and the sequence $(Y^p)_{p\ge 1}$ (resp. $(Z^p)_{p\ge 1}$) is bounded uniformly (resp. in $H_d^2$). This implies that for any stopping time ν we have

$$Y_\nu = Y_0 - \int_0^\nu \tilde{G}(s, Y_s, Z_s)\,ds + \int_0^\nu Z_s\,dW_s.$$

Finally, once more from the optional section theorem (Dellacherie and Meyer [69], 220) we obtain, P-a.s., for any $t \le T$,

$$Y_t = \eta + \int_t^T \tilde{G}(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s.$$

But for any $t \le T$ we have $Y_t \ge \alpha$, and using once again the comparison result we also get $Y_t \le \beta$, since $\tilde{G}(t, \omega, y, z) \le 2C^2\rho(y)$ and taking into account the equation satisfied by $\tilde{a}$. Therefore $\rho(Y_t) = Y_t$ for any $t \le T$, and thus

$$Y_t = \eta + \int_t^T G(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s.$$
Step 4: For $t \le T$ let us set $\tilde{Y}_t = \frac{\ln(Y_t)}{2C}$ and $\tilde{Z}_t = \frac{Z_t}{2CY_t}$. Then $(\tilde{Y}, \tilde{Z})$ is a solution of the BSDE associated with $(g, \xi_T)$.

The solution $(\tilde{Y}, \tilde{Z})$ is maximal. Indeed, if $(\bar{Y}, \bar{Z})$ is another solution of the BSDE associated with $(g, \xi_T)$, then $(Y'_t, Z'_t) := (\exp(2C\bar{Y}_t),\ 2CY'_t\bar{Z}_t)_{t\le T}$ is a solution of the BSDE associated with (G, η). In addition, we have $\alpha \le Y'_t \le \beta$, so $(Y', Z')$ is a solution of the BSDE associated with $(\tilde{G}, \eta)$. Now, as $\tilde{G}^p \ge \tilde{G}$, we have $Y^p \ge Y'$ for every p, and hence $Y \ge Y'$. Finally, we also have $\tilde{Y} \ge \bar{Y}$. □
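For readers who want to see where the transformed coefficient G and the bounds (8.48) come from, the Itô computation behind the exponential transform can be written out as follows. This is a routine verification under the notation of the proof ($\bar{Y} = e^{2CY}$, $\bar{Z} = 2C\bar{Y}Z$), added here as a worked step rather than quoted from the cited papers.

```latex
% Start from -dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dW_t and apply Ito's formula to e^{2CY_t}:
\begin{align*}
  d\bar{Y}_t
    &= 2C\bar{Y}_t\,dY_t + \tfrac{1}{2}(2C)^2\,\bar{Y}_t\,|Z_t|^2\,dt \\
    &= -\big(2C\bar{Y}_t\,g(t, Y_t, Z_t) - 2C^2\,\bar{Y}_t\,|Z_t|^2\big)\,dt
       + 2C\bar{Y}_t Z_t\,dW_t .
\end{align*}
% Substitute Y_t = \ln\bar{Y}_t/(2C) and Z_t = \bar{Z}_t/(2C\bar{Y}_t):
\[
  2C^2\,\bar{Y}_t\,|Z_t|^2 = \frac{|\bar{Z}_t|^2}{2\bar{Y}_t}
  = 2C\bar{Y}_t \cdot \frac{|\bar{Z}_t|^2}{4C\bar{Y}_t^2},
  \qquad\text{so}\qquad
  -d\bar{Y}_t = G(t, \bar{Y}_t, \bar{Z}_t)\,dt - \bar{Z}_t\,dW_t .
\]
% Bounds: |g(t, y', z')| \le C(1 + |z'|^2) with z' = z/(2Cy) gives, for y > 0,
\[
  G \le 2Cy\Big(C + \frac{|z|^2}{4Cy^2} - \frac{|z|^2}{4Cy^2}\Big) = 2C^2 y,
  \qquad
  G \ge 2Cy\Big(-C - \frac{|z|^2}{4Cy^2} - \frac{|z|^2}{4Cy^2}\Big)
     = -2C^2 y - \frac{|z|^2}{y}.
\]
```

The transform removes the quadratic term from the upper bound only; the lower bound keeps a term $|z|^2/y$, which is why the truncation ρ (keeping y away from 0) is needed before the bounded approximations $\tilde{G}^p$ are introduced.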
Assume now that, instead of the assumption (H3)-(ii), the function g satisfies:

(H3)-(ii′): there exists a constant C > 0 such that, for any (t, y, z), $|g(t, y, z)| \le C(1 + |y| + |z|^2)$.

Then we have the following result:

Theorem 8.18 Assume that (H3)-(i), (iii), (iv) and (H3)-(ii′) hold true. Then the BSDE associated with $(g, \xi_T)$ has a maximal bounded solution.

Proof. Let $u_1$ and $u_2$ be two deterministic functions satisfying, for all $t \le T$,

$$u_1(t) = \gamma_1 - C\int_t^T (1 + |u_1(s)|)\,ds \quad\text{and}\quad u_2(t) = \gamma_2 + C\int_t^T (1 + |u_2(s)|)\,ds.$$

Now let α and β be two real constants such that $\alpha \le u_1(0)$ and $\beta \ge u_2(0)$. On the other hand, let ρ be the function defined for $y \in \mathbb{R}$ by $\rho(y) = \alpha\,1_{[y < \alpha]} + y\,1_{[\alpha \le y \le \beta]} + \beta\,1_{[y > \beta]}$. The functions $u_1$ and $u_2$ also satisfy

$$du_1(t) = C(1 + |\rho(u_1(t))|)\,dt \quad\text{and}\quad du_2(t) = -C(1 + |\rho(u_2(t))|)\,dt.$$

Since $(t, y, z) \mapsto g(t, \rho(y), z)$ satisfies (H3), Theorem 8.17 implies that the BSDE associated with $(g(t, \rho(y), z), \xi)$ has a maximal bounded solution (Y, Z). Once again using the comparison theorem, we have, for any $t \le T$, $u_1(t) \le Y_t \le u_2(t)$, so that $\rho(Y_t) = Y_t$. Therefore, the pair of processes (Y, Z) is a maximal bounded solution associated with $(g(t, y, z), \xi)$. The proof is now complete. □
8.4.2 Application in Risk-Sensitive Control

Let us consider the setting of Section 8.2.2, where a controller intervenes on a system S whose evolution, when noncontrolled, is described by the same process $(x_t)_{t\le T}$, solution of (8.20). Assume now that the reward functional is not given by (8.22) but is equal to the following expression:

$$J(u) := E^u\Big[\exp\theta\Big\{\int_0^T h(s, x_\cdot, u_s)\,ds + \Psi(x_T)\Big\}\Big], \quad \forall u \in U,$$

where θ is a real parameter. The problem we are interested in is to find an optimal strategy for the controller, i.e., an admissible control $u^*$ that maximizes the reward J(u). Since the reward involves a utility function of exponential type, we call this a problem of risk-sensitive type. The parameter θ stands for the sensitivity of the controller with respect to risk. He/she is risk averse (resp. risk seeking) if θ < 0 (resp. θ > 0). There are several works related to these types of control problems and especially their applications; among others, one can quote [24], [84], [85], [266].

Moreover, we can write the risk-sensitive control problem in terms of the dynamic entropic risk measure, which is studied in Chapter 3 by Barrieu and El Karoui. Indeed, let $(e_{u,\gamma,t})$ (resp. $(e_{\gamma,t})$) be the $P^u$-dynamic (resp. P-dynamic) entropic risk measure given, for any bounded $\xi_T$, by

$$e_{u,\gamma,t}(\xi_T) = \gamma \ln E^u\Big[\exp\Big(-\frac{1}{\gamma}\xi_T\Big)\,\Big|\,\mathcal{F}_t\Big] \quad \Big(\text{resp. } e_{\gamma,t}(\xi_T) = \gamma \ln E\Big[\exp\Big(-\frac{1}{\gamma}\xi_T\Big)\,\Big|\,\mathcal{F}_t\Big]\Big).$$

Taking $\gamma = 1/\theta$ and $\xi_T = -\int_0^T h(s, x_\cdot, u_s)\,ds - \Psi(x_T)$, we get $J(u) = \exp\big(\frac{1}{\gamma}\,e_{u,\gamma,0}(\xi_T)\big)$. Barrieu and El Karoui proved that the dynamic entropic risk measure $(e_{\gamma,t}(\xi_T))$ is the solution under P of the following BSDE with the quadratic coefficient $g(t, z) = \frac{1}{2\gamma}|z|^2$ and bounded terminal condition $-\xi_T$:

$$-de_{\gamma,t}(\xi_T) = \frac{1}{2\gamma}|Z_t|^2\,dt - Z_t\,dW_t, \quad P\text{-a.s.,} \qquad e_{\gamma,T}(\xi_T) = -\xi_T. \tag{8.55}$$
We will now get a similar result that will give us the link between J (u) and the P−solution of some BSDEs. Actually, we have the following result: Proposition 8.7 Let H be the Hamiltonian process given by (8.23). There exists a unique P−solution (Ytu , Ztu )t≤T of the BSDE with coefficient g(t, ω, y, z) = H (t, x., z, ut ) + 12 |z|2 and terminal bounded condition (xT ), such that Y u is bounded, i.e., 1 u2 u u −dYt = H (t, x., Zt , ut ) + |Zt | dt − Ztu dWt , P−a.s. YTu = (xT ). 2 Moreover J (u) = exp{Y0u }. Proof. The existence stems from Theorem 8.17 since the function that with z associates H (t, x., z, ut ) is uniformly Lipschitz. Let us focus on uniqueness. For t ≤ T
CHAPTER 8
let us set Y˜tu = exp(Ytu ). Then applying Itô’s formula yields −d Y˜tu = Y˜tu H (t, x., Ztu , ut )dt − Y˜tu Ztu dWt , t ≤ T ; Y˜Tu = exp((xT )). Next making a change of probability measure implies that −d Y˜tu = Y˜tu h(t, x., ut )dt − Y˜tu Ztu dWtu , t ≤ T ; Y˜Tu = exp((xT )). t Finally, using once again Itô’s formula with Y˜tu exp 0 h(s, x., us )ds := Y¯tu yields d Y¯tu = Y¯tu Ztu dWtu , t ≤ T . Henceforth,
u u ¯ Yt = E exp
T
0
h(s, x., us )ds + (xT ) Ft , t ≤ T .
It implies that for any t ≤ T we have T h(s, x., us )ds + (xT ) Ft , t ≤ T . Ytu = ln Eu exp t
Then, on the one hand, we have uniqueness and, on the other hand, we have also J (u) = exp{Y0u }. 2 Now as in the risk-neutral case, in order to obtain an optimal control it is enough to maximize with respect to u ∈ U the function H (t, x., z, u). So let u∗ (t, x, z) be a measurable function such that H (t, x., z, u∗ (t, x, z)) = inf u∈U H (t, x., z, u). Besides let (Y ∗ , Z ∗ ) be the bounded solution of the BSDE associated with (H (t, x., z, u∗ (t, x, z)) + 12 |z|2 , (xT )). It exists through Theorem 8.17 since the mapping z → H (t, x., z, u∗ (t, x, z)) + 21 |z|2 is continuous with quadratic growth. Then we have: Theorem 8.19 The admissible control u∗ defined by u∗ := (u∗ (t, x, Zt∗ ))t≤T is optimal and (exp(Yt∗ ))t≤T is the value function of the risk-sensitive control problem, i.e., for any t ≤ T we have T h(s, x., us )ds + (xT ) Ft . exp(Yt∗ ) = ess sup Eu exp u∈U
t
Actually, for any u ∈ U, the comparison theorem implies that Yt∗ ≥ Ytu for any t ≤ T . Therefore, taking into account the characterization of Proposition 8.7, we have T h(s, x., us )ds + (xT ) Ft . Yt∗ ≥ ess sup Eu exp u∈U
t
u∗
But we also have Y ∗ = Y ; therefore, the previous inequality is just an equality, whence the desired result. 2 Remark. In the same spirit, we can also consider risk-sensitive zero-sum stochastic functional differential games. Under Isaacs’ assumption we can show the existence
of a saddle-point for the game. For details on this subject, one can refer to El Karoui and Hamadène [155].
8.5 REFLECTED BACKWARD STOCHASTIC DIFFERENTIAL EQUATIONS

8.5.1 One Barrier Reflected BSDEs

Throughout this section, the dimension $m$ is equal to 1. We are going to deal with solutions of BSDEs whose component $Y$ is forced to stay above a given barrier. Let $\xi \in L^2_1(\mathcal{F}_T)$ and let $g(t,\omega,y,z)$ be a function that satisfies assumption (H1). In addition, let us introduce another object, called the obstacle, which is a continuous, $\mathcal{P}$-measurable process $S := (S_t)_{t\le T}$ satisfying
\[ E\Big[\sup_{0\le t\le T} (S_t^+)^2\Big] < +\infty. \]
Let us now introduce the notion of reflected BSDE (RBSDE for short) associated with $(g,\xi,S)$. A solution of that equation is a triple of $\mathcal{P}$-measurable processes $(Y,Z,K) := (Y_t,Z_t,K_t)_{t\le T}$, with values in $\mathbb{R}^{1+d+1}$, such that $Y \in \mathcal{S}^2_1$, $Z \in \mathcal{H}^2_d$, $K \in \mathcal{S}^2_1$ is a non-decreasing process with $K_0 = 0$, and
\[
\begin{cases}
Y_t = \xi + \displaystyle\int_t^T g(s,Y_s,Z_s)\,ds + K_T - K_t - \int_t^T Z_s\,dW_s, \quad \forall t\in[0,T], \\[2mm]
Y_t \ge S_t \quad\text{and}\quad \displaystyle\int_0^T (Y_t - S_t)\,dK_t = 0.
\end{cases}
\tag{8.56}
\]
RBSDE: Existence and Uniqueness of a Solution

We are going to show that the RBSDE (8.56) has a unique solution. To begin with, let us deal with the uniqueness issue.

Proposition 8.8 [Uniqueness] The reflected BSDE (8.56) associated with $(g,\xi,S)$ has at most one solution.

Proof. Assume that $(Y,Z,K)$ and $(Y',Z',K')$ are two solutions of (8.56). First, let us point out that, for all $t\le T$,
\[ \{1_{[Y_t>Y'_t]} - 1_{[Y_t<Y'_t]}\}(dK_t - dK'_t) \le 0, \]
since $1_{[Y_t>Y'_t]}\,dK_t = 1_{[Y_t<Y'_t]}\,dK'_t = 0$. Now, using Tanaka's formula with $Y - Y'$ and taking into account the previous inequality yields
\[ |Y_t - Y'_t| \le \int_t^T \mathrm{sgn}(Y_s - Y'_s)\big(g(s,Y_s,Z_s) - g(s,Y'_s,Z'_s)\big)\,ds - \int_t^T \mathrm{sgn}(Y_s - Y'_s)(Z_s - Z'_s)\,dW_s. \]
On the other hand, since $g$ is Lipschitz, there exist two bounded processes $(a_t)_{t\le T}$ and $(b_t)_{t\le T}$ such that
\[ g(t,Y_t,Z_t) - g(t,Y'_t,Z'_t) = a_t(Y_t - Y'_t) + b_t(Z_t - Z'_t). \]
Therefore, we have
\begin{align*}
|Y_t - Y'_t| &\le \int_t^T \{a_s|Y_s - Y'_s| + b_s\,\mathrm{sgn}(Y_s - Y'_s)(Z_s - Z'_s)\}\,ds - \int_t^T \mathrm{sgn}(Y_s - Y'_s)(Z_s - Z'_s)\,dW_s \\
&\le C\int_t^T |Y_s - Y'_s|\,ds - \int_t^T \mathrm{sgn}(Y_s - Y'_s)(Z_s - Z'_s)\,d\tilde B_s, \quad t\le T,
\end{align*}
where $\tilde B$ is a Brownian motion under a new probability $\tilde P$ equivalent to $P$. Now, taking expectations on both sides implies
\[ \tilde E[|Y_t - Y'_t|] \le C\int_t^T \tilde E[|Y_s - Y'_s|]\,ds, \quad \forall t\le T. \]
Finally, Gronwall's inequality implies that $\tilde E[(Y_t - Y'_t)^2] = 0$ for any $t\le T$. Hence, $P$-a.s., $Y = Y'$, and then also $Z = Z'$ and $K = K'$, since $P$ and $\tilde P$ are equivalent probability measures; thus we have uniqueness. □

We are now going to show that Equation (8.56) has a solution using two methods: the first one is based on penalization and the other one on the well-known Snell envelope theory of processes. We only present the second method, and we give the main ideas of the proof by penalization.

Theorem 8.20 [Existence] The reflected BSDE associated with $(g,\xi,S)$ has a unique solution.

Proof. A. Existence via penalization. For $n\in\mathbb{N}$, let $(Y^n,Z^n) := (Y^n_t,Z^n_t)_{t\le T}$ be the $\mathcal{P}$-measurable process of $\mathcal{S}^2_1\times\mathcal{H}^2_d$ such that
\[ Y^n_t = \xi + \int_t^T g(s,Y^n_s,Z^n_s)\,ds + n\int_t^T (Y^n_s - S_s)^-\,ds - \int_t^T Z^n_s\,dW_s, \quad t\le T. \]
First, let us point out that, through the comparison Theorem 8.3, we have $Y^n \le Y^{n+1}$ for any $n\ge 0$. On the other hand, let us set $K^n_t = \int_0^t n(Y^n_s - S_s)^-\,ds$, $t\le T$.

Step 1: There exists a constant $C\ge 0$ such that, for all $n\ge 0$ and all $t\le T$,
\[ E\Big[|Y^n_t|^2 + \int_0^T |Z^n_s|^2\,ds + (K^n_T)^2\Big] \le C. \tag{8.57} \]
In addition, there exists a P-measurable process (Yt )t≤T , pointwise limit of the sequence (Y n )n≥0 also converging in H12 . Indeed, applying Itô rule to (Y n )2 and taking expectation yields T T n 2 n 2 2 n n n |Zs | ds ≤ E[ξ ] + 2E |Ys g(s, Ys , Zs )|ds E |Yt | + t
t
T
+ 2E t
|Ysn |n(Ysn
−
− Ss ) ds
T
T
1 |Ysn |2 ds + E ≤ E[ξ 2 ] + C + CE 2 t t ! + −1 E sup (Ss+ )2 + E (KTn − Ktn )2 ,
305 n 2 |Zs | ds
t≤s≤T
where is a universal nonnegative real constant. But, for any t ≤ T , we have 2 T 2 T ˜ E[(KTn − Ktn )2 ] ≤ CE |g(s, Ysn , Zsn )|ds + Zsn dWs ξ 2 + |Ytn |2 + t
˜ ≤ CE 1 + ξ 2 + |Ytn |2 +
T
|Ysn |2 ds +
t
t
T
|Zsn |2 ds ,
t
where C˜ is a constant. Now plug this inequality into the previous one and choose C˜ = 1/4. We obtain T T E |Ytn |2 + |Zsn |2 ds ≤ C¯ 1 + E (Ysn )2 ds , t ≤ T , t
t
where C¯ is an appropriate real constant. Finally, Gronwall’s inequality leads to the t desired result for E[|Ytn |2 ] and then also for both E[ 0 |Zsn |2 ds] and E[(KTn )2 ]. Now the existence of the process (Yt )t≤T stems from the monotonicity of the sequence (Y n )n≥0 , the estimate (8.57), and Fatou’s Lemma. Step 2: limn→∞ E[supt≤T |(Ytn − St )− |2 ] = 0. This property is essential; therefore, we give its entire proof. Let (Y¯tn , Z¯ tn ) be the solution of the following standard BSDE: T T Y¯tn = ξT + Z¯ sn dWs . {g(s, Ysn , Zsn ) − n(Y¯sn − Ss )}ds − t
t
Y¯tn ,
By comparison, we have P−a.s., ∀t ≤ T , ≥ for any n ≥ 0. Now let τ be an Ft -stopping time such that τ ≤ T . Then, T Y¯τn = E ξT e−n(T −τ ) + (g(s, Ysn , Zsn ) + nSs )e−n(s−τ ) ds Fτ . Ytn
τ
T Since S is continuous, then ξT e−n(T −τ ) + τ e−n(s−τ ) (g(s, Ysn , Zsn ) + nS s )ds → ξT 1[τ =T ] + Sτ 1[τ Ys ] [g(s, Ys , Zs ) − g (s, Ys , Zs )]ds +
t
T
1 t
[Ys >Ys ]
(dKs − dKs ) − 2
t
T
1[Ys >Ys ] (Zs − Zs )dWs , t ≤ T .
But (ξT − ξT )+ = 0 and 1[Ys >Ys ] (dKs − dKs ) ≤ 0; therefore, we have T T
+
+ (Yt − Yt ) ≤ as (Ys − Ys ) ds + 1[Ys >Ys ] (Zs − Zs )d B˜ s t
since g(t, Yt , Zt ) − g(t, Yt , Zt )
t
at (Yt − Yt ) + bt (Zt
= − Zt ). Now taking the expec˜ tation with respect to P and using Gronwall’s inequality, we have (Yt − Yt )+ = 0 for any t ≤ T . Therefore Y ≤ Y . Let us now focus on the second point. Since g is Lipschitz, the solution (Y , Z , K ) is unique. On the otherhand, we know that, via the penalizationscheme, t t for any t ≤ T , Kt = limn→∞ n 0 (Ysn − Ss )− ds and Kt = limn→∞ n 0 (Ys n −
− n n
n
n Ss ) ds, where (Y , Z ) and (Y , Z ) are the approximating sequences of (Y, Z) and (Y , Z ), respectively. But comparison of solutions t of standard BSDEs implies that Y n ≤ Y n . Therefore, Kt − Ks = limn→∞ n s (Ysn − Ss )− ds ≥ Kt − t Ks = limn→∞ n 0 (Ys n − Ss )− ds. 2 8.5.2 Connection with Mixed Stochastic Control Assume now the functions g(t, y, z) = δt + βt y + γt · z, where δ ∈ H12 and β (resp. γ ) are bounded and measurable with respect to P1 and Pd , respectively. Let (Y, Z, K) be the solution of reflected BSDE associated with (g, ξT , S). Then we have: Proposition 8.10 For any t ≤ T , τ β β β γ Yt = ess supτ ≥t E δs t,s ds + t,τ Sτ 1[τ −∞}. Note we allow ε to be −∞. On the subdomain (ε, ∞) we suppose U is strictly increasing, concave, continuous, continuously differentiable, and satisfies lim U (x) = ∞ x↓ε
\[ \lim_{x\to\infty} U'(x) = 0. \]
$U'$ is defined on $(\varepsilon,\infty)$. The continuous, strictly decreasing inverse of the function $U'$ will be denoted by $I : (0,\infty)\to(\varepsilon,\infty)$. Then
\[ I(0) := \lim_{y\downarrow 0} I(y) = \infty, \qquad I(\infty) := \lim_{y\to\infty} I(y) = \varepsilon. \]
For $0<y<\infty$ define
\[ \tilde U(y) := \max_{x>\varepsilon}[U(x) - xy] = U(I(y)) - yI(y). \tag{9.11} \]
Then $\tilde U'(y) = -I(y)$, $0<y<\infty$. Also,
\[ \min_{y>0}[\tilde U(y) + xy] = \tilde U(U'(x)) + xU'(x) = U(x), \quad\text{for all } x\in\mathbb{R}. \tag{9.12} \]
From (9.11) and (9.12) we have
\[ U(I(y)) \ge U(x) + y[I(y) - x], \quad\text{for all } x \text{ and } y>0, \]
\[ \tilde U(U'(x)) \le \tilde U(y) - x[U'(x) - y], \quad\text{for } x>\varepsilon \text{ and } y>0. \]
As stated, we assume $r = 0$. However, our discussion immediately extends to the case $r \ne 0$ by considering a modified utility function:
\[ U_R(x) := U(Rx). \]
Then $\tilde U_R(y) = \tilde U(y/R)$.
CHAPTER 9
Example 1 In the sequel we shall usually consider the following utility functions:

U1: $U(x) = -e^{-\gamma x}$, for $x\in\mathbb{R}$, $\gamma>0$,
U2: $U(x) = \frac{1}{\alpha}x^{\alpha}$, for $x>0$, $\alpha<1$,
U3: $U(x) = \log x$, for $x>0$.

For these cases, with $I = (U')^{-1}$:

U1: $I(x) = -\frac{1}{\gamma}\log\frac{x}{\gamma}$, $x>0$,
U2: $I(x) = x^{\frac{1}{\alpha-1}}$, $x>0$,
U3: $I(x) = \frac{1}{x}$, $x>0$.

In Example 2, we consider the quadratic utility
\[ U(x) = -(x-a)^2, \quad\text{for } x<a. \]
Then $I(x) = a - \frac{x}{2}$ for $x\ge 0$. Further, with
\[ \tilde U(y) = \sup_x[U(x) - xy] = U(I(y)) - yI(y) \]
we have in the three cases:

U1: $\tilde U(y) = \frac{y}{\gamma}\log\frac{y}{\gamma} - \frac{y}{\gamma}$, $y>0$,
U2: $\tilde U(y) = \frac{1-\alpha}{\alpha}\,y^{\frac{\alpha}{\alpha-1}}$, $y>0$,
U3: $\tilde U(y) = -1 - \log y$, $y>0$.

The relation (9.12),
\[ U(x) = \min_{y>0}[\tilde U(y) + xy] = \tilde U(U'(x)) + xU'(x), \]
can be verified in cases U1, U2, and U3.
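The closed forms of the dual functions listed in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original text: the parameter values `GAMMA` and `ALPHA`, the search intervals, and the helper names are illustrative assumptions. It maximizes $U(x) - xy$ by ternary search, which is valid here because the objective is concave in $x$.

```python
import math

GAMMA, ALPHA = 2.0, 0.5  # illustrative parameters (assumptions, not from the text)

def U1(x):  # exponential utility
    return -math.exp(-GAMMA * x)

def U2(x):  # power utility, x > 0
    return x ** ALPHA / ALPHA

def U3(x):  # logarithmic utility, x > 0
    return math.log(x)

def dual_closed_form(name, y):
    """Closed-form dual functions for cases U1, U2, U3."""
    if name == "U1":
        return (y / GAMMA) * math.log(y / GAMMA) - y / GAMMA
    if name == "U2":
        return (1 - ALPHA) / ALPHA * y ** (ALPHA / (ALPHA - 1))
    if name == "U3":
        return -1.0 - math.log(y)
    raise ValueError(name)

def dual_numeric(U, y, lo, hi, iters=300):
    """Ternary search for sup_x [U(x) - x*y] on [lo, hi] (objective is concave)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if U(m1) - m1 * y < U(m2) - m2 * y:
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return U(x) - x * y

CASES = [("U1", U1, -30.0, 30.0), ("U2", U2, 1e-9, 1e4), ("U3", U3, 1e-9, 1e4)]
for name, U, lo, hi in CASES:
    print(name, dual_numeric(U, 0.7, lo, hi), dual_closed_form(name, 0.7))
```

The two printed numbers in each row should agree up to floating-point error.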
9.4 PRICING CLAIMS

Consider $L$ nontradable assets $Y^1, Y^2, \dots, Y^L$ with
\[ Y_0 = (Y_0^1, Y_0^2, \dots, Y_0^L) \]
and
\[ Y_1(\omega) = (Y_1^1(\omega), Y_1^2(\omega), \dots, Y_1^L(\omega)), \]
where $\omega\in\Omega = \{\omega_1,\omega_2,\dots,\omega_N\}$. Suppose $G(Y_1(\omega))$ is some claim based on $Y_1$. Then $G(Y_1(\omega))$ takes the $N$ values
\[ g_i = G(Y_1(\omega_i)), \quad i = 1,\dots,N. \]
DUALITY METHODS
Definition 4.1 For $X = X^{x,\alpha} = x + \alpha\cdot(S_1 - S_0)$ define
\begin{align*}
V_G(x) &= \sup_{\alpha} E_p[U(X^{x,\alpha} - G)] \\
&= \sup_{\alpha} \sum_{i=1}^N p_i\,U(X^{x,\alpha}(\omega_i) - g_i) \\
&= \sup_{X:\ E_Q[X]=x,\ Q\in\mathcal{M}} E_p[U(X - G)] \\
&= \sup_{\mathbf{x}\in\mathbb{R}^k,\ \mathbf{x}\cdot q = x} E_p[U(\mathbf{x}\cdot a_1 - G)],
\end{align*}
where we recall that $a_1(\omega) = (a_1^1(\omega), a_1^2(\omega), \dots, a_1^k(\omega))$ is the vector of values of the Arrow-Debreu securities at time $t = 1$. Then $V_G(x)$ represents the optimal investment in terms of the utility $U$ starting with initial wealth $x$ and where the now traded asset $G(Y_1(\omega))$ must be delivered at time 1.

Notation 4.2 We consider the special case when $G = 0$ and then write
\begin{align*}
V_0(x) &= \sup_{\alpha} E_p[U(X^{x,\alpha})] \\
&= \sup_{\mathbf{x}:\ \mathbf{x}\cdot q = x} E_p[U(\mathbf{x}\cdot a_1)].
\end{align*}
Remark 4.3 We first discuss the maximization problem in the definition of $V_G(x)$ and show it is tractable in the case of utility U1. Recall
\begin{align*}
V_G(x) &= \sup_{\mathbf{x}:\ \mathbf{x}\cdot q = x} E_p[U(\mathbf{x}\cdot a_1 - G)] \\
&= \sup_{x_1,\dots,x_k:\ x_1q_1+\cdots+x_kq_k = x}\ \sum_{s=1}^k \sum_{\ell\in A_s} U(x_s - g_\ell)\,p_\ell.
\end{align*}
We can obtain equations for $x_1,\dots,x_k$ from first-order conditions using Lagrange multipliers. Write
\[ F(x_1,\dots,x_k) = \sum_{s=1}^k \sum_{\ell\in A_s} U(x_s - g_\ell)\,p_\ell + \lambda\Big(x - \sum_{s=1}^k x_s q_s\Big). \]
Then $\partial F/\partial x_s = 0$ implies
\[ \sum_{\ell\in A_s} U'(x_s - g_\ell)\,p_\ell - \lambda q_s = 0, \quad\text{for } s = 1,\dots,k. \]
For any $\lambda>0$ there is then a unique $x_s(\lambda)$, $s = 1,\dots,k$. We then find $\lambda$ so that $\sum_{s=1}^k x_s(\lambda)q_s = x$. Unfortunately, this approach gives explicit results only for utility U1. We shall investigate this equality for the different utility functions.
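Case U1: The first-order condition is then
\[ \sum_{\ell\in A_s} \gamma e^{-\gamma(x_s - g_\ell)}\,p_\ell = \lambda q_s, \]
so
\[ \gamma e^{-\gamma x_s} \sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell} = \lambda q_s \]
and
\[ \frac{\gamma}{\lambda q_s} \sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell} = e^{\gamma x_s}. \]
Note then that $\lambda>0$. Consequently,
\[ x_s = \frac{1}{\gamma}\log\Big(\frac{\gamma}{q_s}\sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell}\Big) - \frac{1}{\gamma}\log\lambda \]
and also
\[ x = -\frac{1}{\gamma}\log\lambda + \frac{1}{\gamma}\sum_{s=1}^k q_s\log\Big(\frac{\gamma}{q_s}\sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell}\Big). \]
Therefore,
\[ x_s = x + \frac{1}{\gamma}\log\Big(\frac{\gamma}{q_s}\sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell}\Big) - \frac{1}{\gamma}\sum_{r=1}^k q_r\log\Big(\frac{\gamma}{q_r}\sum_{\ell\in A_r} p_\ell e^{\gamma g_\ell}\Big), \]
and
\begin{align*}
V_G(x) &= -\sum_{s=1}^k e^{-\gamma x_s}\sum_{\ell\in A_s} e^{\gamma g_\ell}\,p_\ell \\
&= -\frac{1}{\gamma}\,e^{-\gamma x}\exp\Big(\sum_{r=1}^k q_r\log\Big(\frac{\gamma}{q_r}\sum_{\ell\in A_r} p_\ell e^{\gamma g_\ell}\Big)\Big).
\end{align*}
Further, taking $G = 0$ (so $g_\ell = 0$ for all $\ell$),
\[ V_0(x) = -\frac{1}{\gamma}\,e^{-\gamma x}\exp\Big(\sum_{r=1}^k q_r\log\frac{\gamma\tilde p_r}{q_r}\Big), \]
where $\tilde p_r = \sum_{\ell\in A_r} p_\ell$.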
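The Case U1 formulas can be exercised on a small example. The sketch below is illustrative only: the four-state model, the two atoms `ATOMS`, and the numbers in `P`, `G`, `Q` are assumptions chosen for the demonstration, not data from the text. It computes the allocation $x_s$ from the closed form, and compares the directly evaluated expected utility with the closed-form value of $V_G(x)$.

```python
import math

# Hypothetical model: 4 states grouped into k = 2 atoms A_1, A_2.
ATOMS = [[0, 1], [2, 3]]
P = [0.2, 0.3, 0.1, 0.4]   # historical probabilities p_l
G = [0.0, 1.0, 0.5, 2.0]   # claim values g_l
Q = [0.45, 0.55]           # Arrow-Debreu prices q_s (sum to 1)

def expected_utility(xs, gamma, atoms, p, g):
    """E_p[U(x_s - g_l)] for the exponential utility U(x) = -exp(-gamma x)."""
    return sum(-math.exp(-gamma * (xs[s] - g[l])) * p[l]
               for s, A in enumerate(atoms) for l in A)

def _logs(gamma, atoms, p, g, q):
    # log( (gamma / q_s) * sum_{l in A_s} p_l e^{gamma g_l} ) for each s
    return [math.log(gamma / qs * sum(p[l] * math.exp(gamma * g[l]) for l in A))
            for A, qs in zip(atoms, q)]

def optimal_allocation(x, gamma, atoms, p, g, q):
    """Closed-form x_s for Case U1."""
    logs = _logs(gamma, atoms, p, g, q)
    avg = sum(qs * L for qs, L in zip(q, logs))
    return [x + (L - avg) / gamma for L in logs]

def value_closed_form(x, gamma, atoms, p, g, q):
    """Closed-form V_G(x) for Case U1."""
    logs = _logs(gamma, atoms, p, g, q)
    avg = sum(qs * L for qs, L in zip(q, logs))
    return -math.exp(-gamma * x + avg) / gamma

xs = optimal_allocation(1.0, 0.5, ATOMS, P, G, Q)
print(xs, expected_utility(xs, 0.5, ATOMS, P, G), value_closed_form(1.0, 0.5, ATOMS, P, G, Q))
```

Shifting wealth between the two Arrow-Debreu securities while keeping the budget $\sum_s q_s x_s = x$ fixed should only lower the expected utility, which gives a simple sanity check of optimality.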
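Case U2: The first-order condition gives
\[ \sum_{\ell\in A_s} p_\ell\,(x_s - g_\ell)^{\alpha-1} = \lambda q_s. \]
This has a unique solution $x_s(\lambda)$ with
\[ \max_{\ell\in A_s} g_\ell < x_s(\lambda) < \infty. \]
However, an explicit expression for $x_s(\lambda)$ is not available, so $\lambda$ and $x_s$ cannot be found analytically.

Case U3: The first-order condition gives
\[ \sum_{\ell\in A_s} \frac{p_\ell}{x_s - g_\ell} = \lambda q_s. \]
Again, this has a unique solution $x_s(\lambda)$ for any $\lambda>0$ with
\[ \max_{\ell\in A_s} g_\ell < x_s < \infty. \]
However, again an explicit expression is not available, so $\lambda$ and $x_s$ cannot be found analytically.

Remark 4.4 We shall now compute $V_G(x)$ using duality theory. We have defined for $y>0$
\[ \tilde U(y) = \max_{x>\varepsilon}[U(x) - xy] = U(I(y)) - yI(y). \]
For $y>0$ define, with $U_i(x) = U(x - g_i)$,
\begin{align*}
\tilde U_i(y) &= \sup_x[U_i(x) - xy] \tag{9.13} \\
&= \sup_x[U(x - g_i) - xy].
\end{align*}
This supremum occurs when $x = g_i + I(y)$, so
\[ \tilde U_i(y) = U(I(y)) - (g_i + I(y))y = \tilde U(y) - g_i y. \]

Definition 4.5 For $y>0$ define (with $Q_i = Q(\omega_i)$) the dual quantity
\begin{align*}
\tilde V_G(y) &= \inf_{Q\in\mathcal{M}} \sum_{i=1}^N p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) \\
&= \inf_{Q\in\mathcal{M}} \sum_{i=1}^N p_i\Big[\tilde U\Big(y\frac{Q_i}{p_i}\Big) - y\,g_i\frac{Q_i}{p_i}\Big] \\
&= \inf_{Q\in\mathcal{M}} \Big\{E_p\Big[\tilde U\Big(y\frac{Q}{P}\Big)\Big] - yE_Q[G]\Big\}. \tag{9.14}
\end{align*}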
Remark 4.6 Again, for $G = 0$ we define
\[ \tilde V_0(y) = \inf_{Q\in\mathcal{M}} E_p\Big[\tilde U\Big(y\frac{Q}{P}\Big)\Big]. \]
We now establish a key result that relates $V_G(x)$ and $\tilde V_G(y)$. The following theorem relates the value functions of the original and dual problems:

Theorem 9.3
\[ \tilde V_G(y) = \sup_x[V_G(x) - xy] \tag{9.15} \]
and, for $x$ in the domain of the utility U$i$, $i = 1, 2, 3$,
\[ V_G(x) = \inf_{y>0}[\tilde V_G(y) + xy]. \tag{9.16} \]
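Proof. From (9.13), for any $y>0$,
\[ \tilde U_i(y) \ge U_i(x) - xy, \quad\text{so}\quad U_i(x) \le \tilde U_i(y) + xy. \]
Then for any strategy $\alpha$, any $y>0$, any $Q\in\mathcal{M}$, and any $x$ in the domain of $U$:
\[ U_i(X^{x,\alpha}(\omega_i)) \le \tilde U_i\Big(y\frac{Q_i}{p_i}\Big) + X^{x,\alpha}(\omega_i)\,y\frac{Q_i}{p_i}, \quad 1\le i\le N. \]
Therefore,
\begin{align*}
E_p[U(X^{x,\alpha} - G)] &= \sum_{i=1}^N p_i\,U_i(X^{x,\alpha}(\omega_i)) \\
&\le \sum_{i=1}^N p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) + yE_Q[X^{x,\alpha}] \\
&= \sum_{i=1}^N p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) + yx,
\end{align*}
so
\[ V_G(x) = \sup_{\alpha} E_p[U(X^{x,\alpha} - G)] \le \tilde V_G(y) + xy \]
and
\[ \tilde V_G(y) \ge \sup_x[V_G(x) - xy]. \tag{9.17} \]
We now establish the converse inequality. For this we need two lemmas.

Lemma 9.4 For the $\hat Q = \hat Q(G,y)$ in $\mathcal{M}$ (which is compact), which attains the minimum of
\[ E_p\Big[\tilde U\Big(y\frac{Q}{P}\Big)\Big] - yE_Q[G], \tag{9.18} \]
we have $\hat Q(\omega_i) = \hat Q_i > 0$ for $i = 1,\dots,N$.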
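Proof. We wish to minimize
\[ E_p\Big[\tilde U\Big(y\frac{Q}{P}\Big)\Big] - yE_Q[G] \]
subject to $\sum_{i=1}^N Q_i = 1$, $Q_i > 0$, $i = 1,\dots,N$, and $\sum_{i=1}^N Q_i S_1^j(\omega_i) = S_0^j$, $j = 1,\dots,M$. The minimum exists as we are minimizing over a compact set. Using Lagrange multipliers, consider
\[ F(Q) = \sum_{i=1}^N p_i\,\tilde U\Big(y\frac{Q_i}{p_i}\Big) - y\sum_{i=1}^N Q_i g_i + \lambda_0\Big(\sum_{i=1}^N Q_i - 1\Big) + \sum_{j=1}^M \lambda_j\Big(\sum_{i=1}^N Q_i S_1^j(\omega_i) - S_0^j\Big). \]
Then $\partial F/\partial Q_i = 0$ implies
\[ y\,\tilde U'\Big(y\frac{Q_i}{p_i}\Big) - g_i y + \lambda_0 + \sum_{j=1}^M \lambda_j S_1^j(\omega_i) = 0. \]
As $\tilde U'(y) = -I(y)$ this gives
\[ \hat Q_i = \frac{p_i}{y}\,U'\Big(\frac{1}{y}\Big(\lambda_0 + \sum_{j=1}^M \lambda_j S_1^j(\omega_i) - g_i y\Big)\Big) > 0. \]
Consequently, when normalized,
\[ \hat Q_i \in (0,1), \quad i = 1,\dots,N. \]
□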
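Remark 4.9 This lemma justifies the differentiation below: for any other $Q\in\mathcal{M}$ and for $\varepsilon$ in a small enough neighborhood of 0,
\[ (1-\varepsilon)\hat Q + \varepsilon Q \in \mathcal{M}. \]

Lemma 9.5 For $\hat Q = \hat Q(G,y)$ as in Lemma 9.4 and for any other $Q\in\mathcal{M}$,
\[ E_Q\Big[I\Big(y\frac{\hat Q}{P}\Big) + G\Big] = E_{\hat Q}\Big[I\Big(y\frac{\hat Q}{P}\Big) + G\Big]. \]

Proof. Write
\[ f(\varepsilon) = E_p\Big[\tilde U\Big(y\cdot\frac{(1-\varepsilon)\hat Q + \varepsilon Q}{P}\Big) - yG\,\frac{(1-\varepsilon)\hat Q + \varepsilon Q}{P}\Big]. \]
Then
\[ f'(\varepsilon) = E_p\Big[y\,\tilde U'\Big(y\cdot\frac{(1-\varepsilon)\hat Q + \varepsilon Q}{P}\Big)\frac{Q - \hat Q}{P} - yG\,\frac{Q - \hat Q}{P}\Big]. \]
However, $\hat Q$ minimizes (9.18), so $f'(0) = 0$ and
\[ E_{\hat Q}\Big[y\,\tilde U'\Big(y\frac{\hat Q}{P}\Big) - yG\Big] = E_Q\Big[y\,\tilde U'\Big(y\frac{\hat Q}{P}\Big) - yG\Big]. \]
Recall $\tilde U'(y) = -I(y)$, so
\[ E_{\hat Q}\Big[I\Big(y\frac{\hat Q}{P}\Big) + G\Big] = E_Q\Big[I\Big(y\frac{\hat Q}{P}\Big) + G\Big]. \]
□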
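We now conclude the proof of Theorem 9.3. As $I(y\hat Q/P) + G$ has the same expected value for all $Q\in\mathcal{M}$ from Lemma 9.5, it is an attainable claim. That is, with $x = E_{\hat Q}[I(y\hat Q/P) + G]$ there is an $\hat\alpha\in\mathbb{R}^N$ such that
\[ I\Big(y\frac{\hat Q}{P}\Big) + G = x + \hat\alpha\cdot(S_1 - S_0) = X^{x,\hat\alpha}. \]
Now
\[ \tilde U\Big(y\frac{\hat Q}{P}\Big) = U\Big(I\Big(y\frac{\hat Q}{P}\Big)\Big) - y\frac{\hat Q}{P}\,I\Big(y\frac{\hat Q}{P}\Big), \]
so
\[ \tilde U\Big(y\frac{\hat Q}{P}\Big) - y\frac{\hat Q}{P}G = U\Big(I\Big(y\frac{\hat Q}{P}\Big) + G - G\Big) - y\frac{\hat Q}{P}\Big(I\Big(y\frac{\hat Q}{P}\Big) + G\Big). \]
Therefore,
\begin{align*}
\tilde V_G(y) &= \min_{Q\in\mathcal{M}} E_p\Big[\tilde U\Big(y\frac{Q}{P}\Big) - y\frac{Q}{P}G\Big] \\
&= E_p\Big[\tilde U\Big(y\frac{\hat Q}{P}\Big) - y\frac{\hat Q}{P}G\Big] \\
&= E_p\Big[U\Big(I\Big(y\frac{\hat Q}{P}\Big) + G - G\Big) - y\frac{\hat Q}{P}\Big(I\Big(y\frac{\hat Q}{P}\Big) + G\Big)\Big] \\
&= E_p[U(X^{x,\hat\alpha} - G)] - yE_{\hat Q}[X^{x,\hat\alpha}] \\
&= E_p[U(X^{x,\hat\alpha} - G)] - xy \\
&\le \max_{\alpha} E_p[U(X^{x,\alpha} - G)] - xy \\
&= V_G(x) - xy \tag{9.19} \\
&\le \sup_x[V_G(x) - xy]. \tag{9.20}
\end{align*}
From (9.17) and (9.20),
\[ \tilde V_G(y) = \sup_x[V_G(x) - xy]. \]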
Define $\bar V_G(x) = \inf_{y>0}[\tilde V_G(y) + xy]$. From (9.19),
\[ \bar V_G(x) \le V_G(x). \tag{9.21} \]
For any $\varepsilon > 0$ there is a $y>0$ such that
\begin{align*}
\bar V_G(x) &> \tilde V_G(y) + xy - \varepsilon = \sup_z[V_G(z) - zy] + xy - \varepsilon \\
&\ge V_G(x) - xy + xy - \varepsilon = V_G(x) - \varepsilon.
\end{align*}
Therefore,
\[ \bar V_G(x) \ge V_G(x). \tag{9.22} \]
From (9.21) and (9.22),
\[ \bar V_G(x) = V_G(x) = \inf_{y>0}[\tilde V_G(y) + xy] \]
and the proof is complete. □

Corollary 9.6 Specializing to the case $G = 0$ we have
\[ \tilde V_0(y) = \sup_x[V_0(x) - xy], \qquad V_0(x) = \inf_{y>0}[\tilde V_0(y) + xy]. \]
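9.5 THE DUAL COST FUNCTION

In Theorem 9.3 we introduced the dual cost function $\tilde V_G(y)$ and established a relation with $V_G(x)$. We now investigate $\tilde V_G(y)$ for the three utility functions U1, U2, U3. For the exponential utility U1 we see the relation with the results of Chapter 1.

U1: For the exponential utility, $U(x) = -e^{-\gamma x}$. Write
\[ U_i(x) = U(x - g_i) = -e^{-\gamma(x - g_i)}. \]
For $y>0$, $\tilde U_i(y) = \sup_x[U_i(x) - xy]$. The maximizing $x$ is when $U_i'(x) = y$, so
\[ \hat x = -\frac{1}{\gamma}\log\frac{y}{\gamma} + g_i \]
and
\[ \tilde U_i(y) = U_i(\hat x) - \hat x y = -\frac{y}{\gamma} - g_i y + \frac{y}{\gamma}\log\frac{y}{\gamma}. \]
Then
\[ \sum_{i=1}^N p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) = -\frac{y}{\gamma} - yE_Q[G] + \frac{y}{\gamma}E_Q\Big[\log\frac{Q}{P}\Big] + \frac{y}{\gamma}\log\frac{y}{\gamma}. \]
Recalling that the entropy is $h(Q\,\|\,P) = E_Q[\log\frac{Q}{P}]$, this is
\[ -\frac{y}{\gamma} + \frac{y}{\gamma}\log\frac{y}{\gamma} + \frac{y}{\gamma}\big[h(Q\,\|\,P) - \gamma E_Q[G]\big]. \]
Therefore,
\[ \tilde V_G(y) = -\frac{y}{\gamma} + \frac{y}{\gamma}\log\frac{y}{\gamma} + \frac{y}{\gamma}\inf_{Q\in\mathcal{M}}\big[h(Q\,\|\,P) - \gamma E_Q[G]\big]. \]
Note the minimizing $Q$ is independent of $y$.

U2: For the power utility,
\[ U(x) = \frac{1}{\alpha}x^{\alpha}, \quad \alpha<1,\ x>0, \]
\[ U_i(x) = U(x - g_i) = \frac{1}{\alpha}(x - g_i)^{\alpha}, \qquad \tilde U_i(y) = \sup_x[U_i(x) - xy], \quad y>0. \]
The maximizing $x$ occurs when $U_i'(x) = y$, so
\[ \hat x = y^{\frac{1}{\alpha-1}} + g_i \]
and
\[ \tilde U_i(y) = U_i(\hat x) - \hat x y = \frac{1-\alpha}{\alpha}\,y^{\frac{\alpha}{\alpha-1}} - g_i y. \]
Then
\begin{align*}
\sum_i p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) &= \frac{1-\alpha}{\alpha}\,y^{\frac{\alpha}{\alpha-1}}\sum_i p_i\Big(\frac{Q_i}{p_i}\Big)^{\frac{\alpha}{\alpha-1}} - \sum_i p_i\,g_i\,y\frac{Q_i}{p_i} \\
&= y^{\frac{\alpha}{\alpha-1}}\,\frac{1-\alpha}{\alpha}\,E_Q\Big[\Big(\frac{Q}{P}\Big)^{\frac{1}{\alpha-1}}\Big] - yE_Q[G]
\end{align*}
and
\[ \tilde V_G(y) = \inf_{Q\in\mathcal{M}}\Big\{y^{\frac{\alpha}{\alpha-1}}\,\frac{1-\alpha}{\alpha}\,E_Q\Big[\Big(\frac{Q}{P}\Big)^{\frac{1}{\alpha-1}}\Big] - yE_Q[G]\Big\}. \]
Here the optimizing $Q$ depends on $y$.

U3: For the logarithmic utility,
\[ U(x) = \begin{cases} \log x & \text{for } x>0, \\ -\infty & \text{for } x\le 0, \end{cases} \qquad U_i(x) = U(x - g_i) = \log(x - g_i). \]
For $y>0$, $\tilde U_i(y) = \sup_x[U_i(x) - xy]$. The maximizing $x$ occurs when $U_i'(x) = y$, so
\[ \hat x = \frac{1}{y} + g_i. \]
Then
\[ \tilde U_i(y) = U_i(\hat x) - \hat x y = -\log y - 1 - y g_i \]
and
\begin{align*}
\sum_i p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) &= -\sum_i \Big[p_i\log\Big(y\frac{Q_i}{p_i}\Big) + p_i + y g_i Q_i\Big] \\
&= -\log y + \sum_i p_i\log\frac{p_i}{Q_i} - 1 - yE_Q[G] \\
&= -\log y + h(P\,\|\,Q) - 1 - yE_Q[G].
\end{align*}
Consequently,
\[ \tilde V_G(y) = \inf_{Q\in\mathcal{M}}\sum_i p_i\,\tilde U_i\Big(y\frac{Q_i}{p_i}\Big) = -\log y - 1 + \inf_{Q\in\mathcal{M}}\big[h(P\,\|\,Q) - yE_Q[G]\big]. \]
Again, the optimizing $Q$ depends on $y$.

We now compute the minimizing $\hat Q_G$ and $\hat Q_0 = \hat Q$ for the different utility functions. Recall from Theorem 9.2 that for $Q\in\mathcal{M}$
\[ Q(\omega_\ell) = Q_\ell = q_s e^{m_\ell} p_\ell \quad\text{for } \ell\in A_s. \tag{9.23} \]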
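For a fixed measure $Q$, the exponential-utility identity derived in case U1 above (the sum $\sum_i p_i\tilde U_i(yQ_i/p_i)$ written in terms of the entropy $h(Q\,\|\,P)$) can be checked numerically. The following is a small sketch under stated assumptions: the probability vectors, the claim values, and the function names are hypothetical and chosen only for illustration.

```python
import math

def dual_objective_direct(y, gamma, p, Q, g):
    """sum_i p_i * U~_i(y Q_i / p_i), with U~_i(z) = (z/gamma) log(z/gamma) - z/gamma - g_i z."""
    total = 0.0
    for pi, Qi, gi in zip(p, Q, g):
        z = y * Qi / pi
        total += pi * ((z / gamma) * math.log(z / gamma) - z / gamma - gi * z)
    return total

def dual_objective_entropy(y, gamma, p, Q, g):
    """-y/gamma + (y/gamma) log(y/gamma) + (y/gamma) [h(Q||P) - gamma E_Q[G]]."""
    h = sum(Qi * math.log(Qi / pi) for pi, Qi in zip(p, Q))  # relative entropy h(Q||P)
    eq_g = sum(Qi * gi for Qi, gi in zip(Q, g))              # E_Q[G]
    return -y / gamma + (y / gamma) * math.log(y / gamma) + (y / gamma) * (h - gamma * eq_g)

# Hypothetical data: 4 states, P and Q_EX strictly positive and summing to 1.
P = [0.2, 0.3, 0.1, 0.4]
Q_EX = [0.25, 0.25, 0.3, 0.2]
G = [0.0, 1.0, 0.5, 2.0]

print(dual_objective_direct(0.8, 0.5, P, Q_EX, G),
      dual_objective_entropy(0.8, 0.5, P, Q_EX, G))
```

The two printed values should coincide up to rounding, for any $y>0$ and any strictly positive $Q$.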
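Case U1: For the exponential utility
\[ \tilde V_G(y) = -\frac{y}{\gamma} + \frac{y}{\gamma}\log\frac{y}{\gamma} + \frac{y}{\gamma}\inf_{Q\in\mathcal{M}}\big[h(Q\,\|\,P) - \gamma E_Q[G]\big]. \]
Recalling the form (9.23) for $Q$, we are finding an infimum over the $m_\ell$ subject to $\sum_{\ell\in A_s} e^{m_\ell}p_\ell = 1$. Write
\begin{align*}
J &= h(Q\,\|\,P) - \gamma E_Q[G] + \sum_{s=1}^k \lambda_s\Big(\sum_{\ell\in A_s} e^{m_\ell}p_\ell - 1\Big) \\
&= \sum_{s=1}^k q_s\sum_{\ell\in A_s} p_\ell e^{m_\ell}\log(q_s e^{m_\ell}) - \gamma\sum_{s=1}^k q_s\sum_{\ell\in A_s} e^{m_\ell}p_\ell g_\ell + \sum_{s=1}^k \lambda_s\Big(\sum_{\ell\in A_s} e^{m_\ell}p_\ell - 1\Big).
\end{align*}
For $\ell\in A_s$,
\[ \frac{\partial J}{\partial m_\ell} = q_s p_\ell e^{m_\ell}\log(q_s e^{m_\ell}) + q_s p_\ell e^{m_\ell} - \gamma q_s p_\ell e^{m_\ell} g_\ell + \lambda_s e^{m_\ell}p_\ell, \]
so $\partial J/\partial m_\ell = 0$ when
\[ q_s\log(q_s e^{m_\ell}) + q_s - \gamma q_s g_\ell + \lambda_s = 0. \]
That is,
\[ \log q_s + m_\ell + 1 - \gamma g_\ell = -\frac{\lambda_s}{q_s} \]
and
\[ m_\ell - \gamma g_\ell = -\frac{\lambda_s}{q_s} - 1 - \log q_s, \]
so
\[ e^{m_\ell} = \frac{e^{\gamma g_\ell}\,e^{-\frac{\lambda_s}{q_s} - 1}}{q_s}. \]
Now $\partial J/\partial\lambda_s = 0$ implies $\sum_{\ell\in A_s} e^{m_\ell}p_\ell = 1$. That is,
\[ \frac{e^{-\frac{\lambda_s}{q_s} - 1}}{q_s}\sum_{\ell\in A_s} e^{\gamma g_\ell}p_\ell = 1, \]
so
\[ \frac{e^{-\frac{\lambda_s}{q_s} - 1}}{q_s} = \Big(\sum_{\ell\in A_s} e^{\gamma g_\ell}p_\ell\Big)^{-1} \]
and
\[ e^{m_\ell} = e^{\gamma g_\ell}\Big(\sum_{r\in A_s} e^{\gamma g_r}p_r\Big)^{-1}. \]
The optimal $\hat Q_G(\ell)$ is then
\[ \hat Q_G(\ell) = q_s e^{m_\ell}p_\ell = q_s e^{\gamma g_\ell}p_\ell\Big(\sum_{r\in A_s} e^{\gamma g_r}p_r\Big)^{-1}. \]
When $G = 0$ the optimal $\hat Q_\ell$ is
\[ \hat Q(\ell) = q_s p_\ell\Big(\sum_{r\in A_s} p_r\Big)^{-1} = \frac{q_s p_\ell}{\tilde p_s}, \]
where $\tilde p_s = \sum_{r\in A_s} p_r$.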
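The minimizing measure for Case U1 can be illustrated numerically. The sketch below is an illustration under assumptions: the four-state model with two atoms and all numerical values are hypothetical. It builds $\hat Q_G$ from the closed form, and evaluates the objective $h(Q\,\|\,P) - \gamma E_Q[G]$, whose minimum should equal $\sum_s q_s\log(q_s/\gamma_s)$ with $\gamma_s = \sum_{r\in A_s} e^{\gamma g_r}p_r$.

```python
import math

# Hypothetical model: 4 states grouped into 2 atoms.
ATOMS = [[0, 1], [2, 3]]
P = [0.2, 0.3, 0.1, 0.4]
G = [0.0, 1.0, 0.5, 2.0]
Q_ATOM = [0.45, 0.55]      # Arrow-Debreu prices q_s
GAMMA = 0.5

def q_hat(gamma, atoms, p, g, q):
    """Q^_G(l) = q_s e^{gamma g_l} p_l / sum_{r in A_s} e^{gamma g_r} p_r."""
    out = [0.0] * len(p)
    for s, A in enumerate(atoms):
        gam_s = sum(p[r] * math.exp(gamma * g[r]) for r in A)
        for l in A:
            out[l] = q[s] * p[l] * math.exp(gamma * g[l]) / gam_s
    return out

def objective(Qm, gamma, p, g):
    """h(Q||P) - gamma * E_Q[G]."""
    return sum(Qi * math.log(Qi / pi) - gamma * Qi * gi
               for Qi, pi, gi in zip(Qm, p, g))

def minimum_closed_form(gamma, atoms, p, g, q):
    """sum_s q_s log(q_s / gamma_s)."""
    return sum(qs * math.log(qs / sum(p[r] * math.exp(gamma * g[r]) for r in A))
               for A, qs in zip(atoms, q))

Q_OPT = q_hat(GAMMA, ATOMS, P, G, Q_ATOM)
print(Q_OPT, objective(Q_OPT, GAMMA, P, G), minimum_closed_form(GAMMA, ATOMS, P, G, Q_ATOM))
```

Moving a small amount of mass between two states of the same atom keeps the measure inside the constraint set (each atom still has mass $q_s$) and should only increase the objective, since the objective is convex in $Q$.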
Case U2: For the power utility, 1 α−1 α Q 1 − α G (y) = inf y α−1 − yEQ [G] . EQ V Q∈M α P It can be shown that the minimizing Q ∈ M is given by #G (") = Q
(yqs
)α (λ
qs p" , 1−α s − yqs g" )
where λs is the unique solution of the equation φs (λs ) = (yqs )α and φs (λ) = α−1 . When G = 0 we see from (9.24) that "∈As p" (λ − yqs g" ) As "∈As p" em" 1/ ps . Then
λα−1 = (yqs )α · em" . s s λα−1 s = "∈As p" we have p = (yqs )α giving em" = = 1 and p s #" = qs p" em" = qs p" . Q s p
Case U3: For the logarithmic utility G (y) = − log y − 1 + inf [h(P Q) − yE Q [G]], V Q∈M
the minimizing Q is given by #G (") = Q
qs p" . (λs − yqs g" )
As in the previous case, when G = 0 #" = qs p" . Q s p Of course, case U3 is really the limiting version of U2 as α tends to 0. Case U4: The general utility function. Write k Q y λs e m " p " − 1 − yE Q [G] + J = Ep U P "∈As s=1 k (yqs em" ) − y = p" U qs p " e m " g " + λs em" p" − 1 . "
"
"∈As
s=1
The minimizing Q has the form #G (") = p" U
Q y
λs − yqs g" yqs
.
338
CHAPTER 9
The case G = 0 can be treated as before, giving λs . yqs em" = U
yqs Then
1 λs
· p" U e p" = yqs yqs "∈As "∈As s λs p U =1= yqs yqs s 0 p = U (θs (y)), yqs
m"
where θs0 (y) = λs /(yqs ) so that
λs = yqs I
yqs s p
.
#" = qs p" / ps . This gives, as in the previous cases, that Q G (y) will now be computed for y > 0. The value of V
#G (") = qs eγ g" p" · ( Case U1: With Q γ gr γs = r∈As e pr ,
r∈As
eγ gr pr )−1 and writing
#G P ) − γ EQ# [G]] G (y) = − y + y log y + y [h(Q V G γ γ γ γ γ g" y y y qs p " e γ g " qs e y = − + log + log γ γ γ " γs γs γ y q s p" e γ g " −γ · g" γ " γs k qs y y y y qs log . = − + log + γ γ γ γ s=1 γs #" = When G = 0, we saw Q
q s p" s p
so
# ) 0 (y) = − y + y log y + y h(Q P V γ γ γ γ y y y qs p " qs y log = − + log + s s γ γ γ γ " p p k qs y y y y , qs log = − + log + s p γ γ γ γ s=1
s = where p
"∈As
p" .
(9.24)
(9.25)
339
DUALITY METHODS
#G (") = Case U2: With Q
q s p" (yqs )α (λs −yqs g" )1−α
α G (y) = y α−1 V
=
1−α α
#G Q EQ#G P
1 α−1
− yE Q# [G] G
k 1 p" − λs , α " (yqs )α (λs − yqs g" )−α s=1
where λs = λG s (y) is the unique solution of p" (yqs )α (λs − yqs g" )1−α
"∈As
=1
for s = 1, . . . , k. We shall write θsG (y) = so
λG s (y) yqs
p" G yqs (θs (y) − g" )(1−α) "∈As
= 1.
When G = 0, 0 (y) = y V
α α−1
1−α α
#G Q EQ#G P
1 α−1
.
The minimizing Q is of the form #" = qs p" Q s p k 0 (y) = 1 − α so V λ0 (y), α s=1 s α
1
α
s1−α . where λ0s (y) = y α−1 qsα−1 p With G = 0,
θs0 (y)
=
yqs s p
1−α .
Case U3: Here, G (y) = − log y − 1 + h(P Q #G ) − yEQ# (G) V G qs p " g " qs = − log y − 1 − −y p" log , λs − yqs g" (λs − yqs g" ) " "
340
CHAPTER 9
where λs = λG s (y) is the unique solution of p" = 1. (λ − yqs g" ) s "∈A s
Then G (y) = V
"
k λs − yqs g" − λG p" log s (y). yqs s=1
Writing θsG (y) =
λG s (y) yqs
we have
G (y) = V
p" log(θsG (y) − g" ) − y
"
k
qs θsG (g).
s=1
When G = 0, # 0 (y) = − log y − 1 + h(P Q) V p" = − log y − 1 + p" log #" Q " qs = − log y − 1 − p" log s p " k qs s log . p = − log y − 1 − s p s=1
Here
λ0s (y)
s , so with =p
θs0 (y)
=
s p yqs
we have
0 (y) = −1 + V
k
s log θs0 (y). p
s=1
Case 4: For a general utility, #G Q y G (y) = Ep U − yE Q#G [G] V P p" λs − yqs g" U λs − yqs g" = −y g" U
p" U yq y yqs s " "
λs − yqs g"
λs − yqs g" − p" g" . = p" U U U yqs yqs " " Here λs = λG s (y) is the unique solution of p"
λs − yqs g" = 1. U yqs yqs "∈A s
341
DUALITY METHODS
In terms of θsG (y) =
λG s (y) yqs
, that is, p" "∈As
When G = 0,
yqs
U (θsG (y) − g" ) = 1.
# Q y 0 (y) = Ep U V P y qs = p" U s p " k y qs . s U p = s p s=1
In terms of θs0 (y) =
s p , yqs
that is, 0 (y) = V
k
s U p
s=1
1 . θs0 (y)
(z) = −1 − log z, z > 0. Example 2 Note for Case U3, the logarithmic utility, U Then k k qs qs s −1 − log y s log = −1 − log y − p p s s p p s=1
s=1
= −1 +
k
s log θs0 (y), p
s=1
agreeing with the previous result.
G (y) AND V 0 (y) 9.6 THE MINIMUM OF V Recall from Theorem 9.3 that G (y) + xy] VG (x) = inf [V y>0
G (# y ) + x# y, =V (# where x + V y for the dual problem is a solution of this G y ) = 0. The optimizing # (y). equation. In this section we investigate the derivatives V G Case 1. We have in this case
k y y y y qs VG (y) = − + log + qs log , γ γ γ γ s=1 γs
342
CHAPTER 9
with γs =
pr e γ g r .
r∈As
Then
k qs 1 1 y
. qs log VG (y) = log + γs γ γ s=1 γ
The λG s (y) quantities, s = 1, . . . , k, which arise in Cases U2 and U3, do not appear in Case U1. However, write m"
λG s (y) = yqs g" − yqs U (yqs e )
= yqs g" + yqs I (yqs em" ), so θsG (y) =
λG s (y) = g" + I (yqs em" ). yqs
For the exponential utility I (z) = − γ1 log γz , z > 0, so yqs yqs em" (y) = yq g − log λG s " s γ γ and θsG (y) = g" − As em" =
eγ g" γs
1 yqs em" log . γ γ
, λG s (y)
so −
yqs y qs = yqs g" − log + log + γ g" γ γ γs y y y qs = −qs log − qs log γ γ γ γs
k k 1 1 qs y 1 G λs (y) = log + qs log γ γ s=1 γs y s=1 γ G (y) =V
as above. We shall see this result holds in general. When G = 0 we saw k qs 0 (y) = − y + y log y + y qs log V s p γ γ γ γ s=1
so
k y 1 1 qs
. qs log V0 (y) = log + s p γ γ γ s=1
343
DUALITY METHODS
Here the appropriate λ0s (y) is
yqs s p y qs log s γ p qs y . log + log s γ p
λ0s (y) = yqs I yqs γ yqs =− γ =−
That is, λ0s (y) yqs y qs 1 . log + log = s γ γ p
θs0 (y) =
Again we see 0 (y) = − 1 V y =−
k
λ0s (y)
s=1
k
qs θs0 (y).
s=1
Case U2: For the power utility we have seen that k p" 1 − λs VG (y) = α " (yqs )α (λs − yqs g" )−α s=1
so G (y) = − 1 V y
k
λG s (y) = −
s=1
k s=1
where we recall λs = λG s (y) is the unique solution of p" "∈As
qs θsG (y),
(yqs )α (λs − yqs g" )1−α
=1
for each s = 1, . . . , k. When G = 0 k 0 (y) = 1 − α λ0 (y), V α s=1 s
where α
α
1
s1−α . λ0s (y) = y α−1 qsα−1 p
344
CHAPTER 9
Therefore, α 1 1 ∂λ0s (y) α s1−α = y α−1 qsα−1 p ∂y α−1 α 1 0 · λ (y) = α−1 y s
so k ∂λ0s (y) 0 (y) = 1 − α V α s=1 ∂y
=
k 1−α α 1 0 λ (y) · · α α − 1 y s=1 s
=−
k k 1 0 λs (y) = − qs θs0 (y). y s=1 s=1
Case U3: For the logarithmic utility G (y) = V
p" log
"
k λs − yqs g" − λs yqs s=1
so k
G (y) = − 1 V y as λs =
λG s (y)
λG s (y) = −
s=1
qs θsG (y),
s=1
is the unique solution of p" "∈As
k
(λs − yqs g" )
= 1.
When G = 0, 0 (y) = − log y − 1 − V
k
s log p
s=1
qs s p
and s . λ0s (y) = p Therefore, k s=1
λ0s (y) =
k
s = 1 p
s=1
so k k 0 (y) = − 1 = − 1 λ0s (y) = − qs θs0 (y). V y y s=1 s=1
345
DUALITY METHODS
Case 4. The general utility. We have seen that
λs − yqs g"
λs − yqs g" VG (y) = − p" g" p" U U U yqs yqs " " so G (y) = V
λs yqs g" ∂ λs − yqs g" U λs − yqs g" · U
· p" U yqs yqs ∂y yqs " λs − yqs g" ∂ λs − yqs g" − . p" g" U
yqs ∂y yqs "
= −I. Then Recall U G (y) = − 1 V y
p " λs
U
qs
"
λs − yqs g" yqs
·
∂ ∂y
λs − yqs g" . yqs
Now recall λs = λG s (y) was the unique solution of p"
λs − yqs g" =1 U yqs yqs "∈A s
for s = 1, . . . , k. Differentiating, we see ∂ λs − yqs g"
λs − yqs g" · = qs . p" U yqs ∂y yqs "∈A s
(y) = − 1 Therefore, V G y eral case,
k s=1
λG s (y) = − k
0 (y) = V
s=1
k s=1
qs θsG (y). When G = 0 in the gen-
qs s U y p s p
so y qs · qs s U p s s p p s=1 k qs qs s I y · . =− p s s p p
0 (y) = V
k
s=1
Now from (9.24)
s p λs U ( yq ) yqs s
= 1 so λs =I yqs
yqs s p
and 0 (y) = − 1 V y
k s=1
λ0s (y) = −
k s=1
qs θs0 (y).
346
CHAPTER 9
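9.7 THE CALCULATION OF $V_0(x)$

In this section we determine $V_0(x)$ for the different utility functions. Recall from Theorem 9.3:
\[ V_G(x) = \inf_{y>0}[\tilde V_G(y) + xy] \quad\text{and}\quad V_0(x) = \inf_{y>0}[\tilde V_0(y) + xy]. \]
The infimum occurs when $\tilde V_G'(y) + x = 0$ (resp. $\tilde V_0'(y) + x = 0$), that is, when
\[ x = \frac{1}{y}\sum_{s=1}^k \lambda_s^G(y) = \sum_{s=1}^k q_s\theta_s^G(y) \tag{9.26} \]
\[ \Big(\text{resp. } x = \frac{1}{y}\sum_{s=1}^k \lambda_s^0(y) = \sum_{s=1}^k q_s\theta_s^0(y)\Big). \tag{9.27} \]
If $\hat y$ is a solution of (9.26) then $V_G(x) = \tilde V_G(\hat y) + x\hat y$, and similarly for $V_0(x)$. We first calculate $V_0(x)$ for the different utility functions.

Case U1: Here $x = \frac{1}{\hat y}\sum_{s=1}^k \lambda_s^0(\hat y)$, so that
\[ -x = \frac{1}{\gamma}\log\frac{\hat y}{\gamma} + \frac{1}{\gamma}\sum_{s=1}^k q_s\log\frac{q_s}{\tilde p_s}. \]
Therefore,
\[ \hat y = \gamma\exp\Big(-\gamma x - \sum_{s=1}^k q_s\log\frac{q_s}{\tilde p_s}\Big) = \gamma e^{-\gamma x}\mu, \]
where
\[ \mu = \prod_{s=1}^k \Big(\frac{\tilde p_s}{q_s}\Big)^{q_s}. \]
So
\[ \tilde V_0(\hat y) = -\frac{\hat y}{\gamma} + \frac{\hat y}{\gamma}\log\frac{\hat y}{\gamma} - \frac{\hat y}{\gamma}\log\mu = -\frac{\hat y}{\gamma} - \hat y x \]
and, therefore,
\begin{align*}
V_0(x) &= \tilde V_0(\hat y) + x\hat y \\
&= -\frac{1}{\gamma}\hat y \\
&= -\mu e^{-\gamma x} \\
&= -e^{-\gamma x}\prod_{s=1}^k \Big(\frac{\tilde p_s}{q_s}\Big)^{q_s},
\end{align*}
and this equals
\[ -e^{-\gamma x - h(\hat Q\,\|\,P)}, \]
as
\[ \exp(-h(\hat Q\,\|\,P)) = \exp\Big(-\sum_{\ell} \frac{q_s p_\ell}{\tilde p_s}\log\frac{q_s}{\tilde p_s}\Big) = \prod_{s=1}^k \Big(\frac{\tilde p_s}{q_s}\Big)^{q_s}. \]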
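The duality relation $V_0(x) = \inf_{y>0}[\tilde V_0(y) + xy]$ for the exponential utility can be verified numerically. This is a minimal sketch under assumptions: the atom probabilities `P_TILDE`, the prices `Q_ATOM`, and the search interval are illustrative choices. The dual function used is $\tilde V_0(y) = -\frac{y}{\gamma} + \frac{y}{\gamma}\log\frac{y}{\gamma} + \frac{y}{\gamma}\sum_s q_s\log\frac{q_s}{\tilde p_s}$, which is convex in $y$, so a ternary search suffices for the minimization.

```python
import math

P_TILDE = [0.5, 0.5]    # hypothetical atom probabilities p~_s
Q_ATOM = [0.45, 0.55]   # hypothetical Arrow-Debreu prices q_s

def dual_v0(y, gamma, p_tilde, q):
    """Dual value function V~_0(y) for the exponential utility."""
    c = sum(qs * math.log(qs / ps) for qs, ps in zip(q, p_tilde))
    return -y / gamma + (y / gamma) * math.log(y / gamma) + (y / gamma) * c

def v0_closed_form(x, gamma, p_tilde, q):
    """V_0(x) = -exp(-gamma x) * prod_s (p~_s / q_s)^{q_s}."""
    mu = math.prod((ps / qs) ** qs for qs, ps in zip(q, p_tilde))
    return -mu * math.exp(-gamma * x)

def v0_by_duality(x, gamma, p_tilde, q, lo=1e-12, hi=1e3, iters=300):
    """Minimize the convex map y -> V~_0(y) + x*y over [lo, hi] by ternary search."""
    f = lambda y: dual_v0(y, gamma, p_tilde, q) + x * y
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) > f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

print(v0_by_duality(1.0, 0.5, P_TILDE, Q_ATOM), v0_closed_form(1.0, 0.5, P_TILDE, Q_ATOM))
```

The search interval must contain the minimizer $\hat y = \gamma e^{-\gamma x}\mu$; for very negative $x$ the upper bound `hi` would need to be enlarged.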
Case U2: For the power utility, 1
α
α
s1−α . λ0s (y) = (y) α−1 qsα−1 p Then
1−α k α 1 1 α α−1 1−α s qs p V0 (x) = x α s=1 k 1 1−α s 1−α p 1 α = x qs qs α s=1 1 1−α # α−1 Q 1 α EQ# = x . α P
Case U3: For the logarithmic utility, 0 (y) = − log y − 1 − V
k s=1
qs s log p s p
(y) + x = 0 when # (y) = −1/y and V y = 1/x. Consequently, so V 0 0 0 (# V0 (x) = V y ) + x# y = − log # y −1−
k s=1
= log x −
k s=1
s log p
qs s log p s p
qs s p
+1
.
Case 4: For the general utility, 0 (y) = V
k s=1
and
qs s U y p s p
λ0s (y)
qs = yqs I y s p
.
Suppose 1 0 x= λs (# y) = qs I # y s=1 s=1 k
k
# y qs s p
,
(9.28)
348
CHAPTER 9
and suppose this can be solved for # y . Then 0 (# V0 (x) = V y ) + x# y k k # y qs qs s U # # = + p y qs I y s s p p s=1 s=1 k # y qs qs s U # = . p +# y qs I y s s p p s=1
From equation (9.11) in Section 9.2 (y) + yI (y) = U (I (y)) U so k
V0 (x) =
s=1
From (9.24)
I
# y qs s p
# y qs s U I . p s p
=
y) λ0s (# = θs0 (# y) # y qs
so V0 (x) =
k
s U (θs0 (# y )), p
s=1
where x=
k s=1
qs I
# y qs s p
1 0 λs (# y) = qs θs0 (# y ). # y s=1 s=1 k
=
k
The calculation of # y is discussed in Section 9.12. Once # y is determined, k 1 s U V0 (x) = fs (# y) . p qs s=1
9.8 THE INDIFFERENCE ASKING PRICE FOR CLAIMS In Section 9.7 we have discussed V0 (x) and the numerical methods are given in Section 9.12. The indifference asking price of the claim G at wealth x is the number ν such that VG (x + ν) = V0 (x). For utility U1 the price ν will be independent of x, but for other utilities this is not always the case. We now describe an algorithm to compute ν. Recall that G (# VG (x + ν) = V y ) + (x + ν)# y G G (# =V y ) + f (# y )# y,
349
DUALITY METHODS
where y) = x + ν = f G (#
k 1 G G (y). λ (y) = −V y s=1 s
y so that Given x first compute V0 (x) as in Section 9.7. Then find # G (# y ) + f G (# y )# y = V0 (x). V Then ν = f G (# y ) − x. Now in the general case,
λs − yqs g"
λs − yqs g" − g" VG (y) = p" U U p" U yqs yqs " " (U (θsG (y) − g" )) − = p" U p" U (θsG (y) − g" )g" . "
Also,
"
p"
λs − yqs g" =1 U yqs yqs "∈As p" = U (θsG (y) − g" ). yq s "∈A s
(z) = U (I (z)) − yI (z) so that U [U (θ )] = U (θ) − From (9.11) of Section 9.2, U
θ U (θ ). Therefore, λs − yqs g" G (y) = p" U V yqs " λs − yqs g"
λs − yqs g"
λs − yqs g" − U − p" p" g " U yqs yqs yqs " " k p" λs − yqs g" λs − yqs g" − p" U λs U
= yqs yqs yqs " "∈A s=1 =
"
=
"
p" U
λs − yqs g" yqs
s
−
p" U (θsG (y) − g" ) − y
k
λs
s=1 k
qs θsG (y).
s=1
Therefore, G (# y ) + (x + ν)# y VG (x + ν) = V λs − yqs g" = p" U yqs " = p" U (θsG (g) − g" ) "
350
CHAPTER 9
as k
λs =
s=1
k s=1
Note for G = 0, V0 (x) =
p" U
"
=
λG y. s (y) = (x + ν)#
k
y) λ0s (# # y qs
s U p
s=1
Now VG (x + ν) =
y) λ0s (# # y qs
=
k
s U (θs0 (# y )). p
s=1
y ) − g" ) p" U (θsG (#
(9.29)
"
and x +ν =
k
y ), qs θsG (#
(9.30)
s=1
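so $\theta_s^G(\hat y)$ is the optimum investment in the $k$ Arrow-Debreu securities. The existence and computation of $\hat y$ is provided in Section 9.3. We now consider cases U1, U2, U3.

Case U1: For the exponential utility, $V_0(x) = -e^{-\gamma x}\prod_{s=1}^k \big(\frac{\tilde p_s}{q_s}\big)^{q_s}$ and
\[ V_G(x+\nu) = \sum_{\ell} p_\ell\,U\Big(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\Big) = \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell). \]
Now
\[ x + \nu = \frac{1}{y}\sum_{s=1}^k \lambda_s^G(y) \]
with
\[ \lambda_s^G(y) = -q_s\frac{y}{\gamma}\log\frac{y}{\gamma} - q_s\frac{y}{\gamma}\log\frac{q_s}{\gamma_s} \]
and
\[ \gamma_s = \sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell}. \]
Then
\begin{align*}
U\Big(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\Big) &= -\exp\Big(-\gamma\Big(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\Big)\Big) \\
&= -\exp\Big(\log\frac{y}{\gamma} + \log\frac{q_s}{\gamma_s} + \gamma g_\ell\Big) \\
&= -\frac{y}{\gamma}\,\frac{q_s}{\gamma_s}\,e^{\gamma g_\ell}.
\end{align*}
Therefore,
\begin{align*}
\sum_{\ell} p_\ell\,U\Big(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\Big) &= \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) = V_G(x+\nu) \\
&= -\frac{y}{\gamma}\sum_{\ell} p_\ell\,\frac{q_s}{\gamma_s}\,e^{\gamma g_\ell} \\
&= -\frac{y}{\gamma}\sum_{s=1}^k q_s = -\frac{y}{\gamma},
\end{align*}
and the condition $V_G(x+\nu) = V_0(x)$ is
\[ -\frac{\hat y}{\gamma} = -e^{-\gamma x}\prod_{s=1}^k \Big(\frac{\tilde p_s}{q_s}\Big)^{q_s}, \]
giving
\[ \hat y = \gamma e^{-\gamma x}\prod_{s=1}^k \Big(\frac{\tilde p_s}{q_s}\Big)^{q_s}. \]
Then
\begin{align*}
\frac{1}{\hat y}\sum_{s=1}^k \lambda_s^G(\hat y) &= x + \nu \\
&= -\frac{1}{\gamma}\log\frac{\hat y}{\gamma} - \frac{1}{\gamma}\sum_{s=1}^k q_s\log\frac{q_s}{\gamma_s} \\
&= -\frac{1}{\gamma}\Big(-\gamma x + \sum_{s=1}^k q_s\log\frac{\tilde p_s}{q_s}\Big) - \frac{1}{\gamma}\sum_{s=1}^k q_s\log\frac{q_s}{\gamma_s} \\
&= x - \frac{1}{\gamma}\sum_{s=1}^k q_s\log\frac{\tilde p_s}{\gamma_s}.
\end{align*}
Therefore, the asking price for the claim $G$ is
\[ \nu = \frac{1}{\gamma}\sum_{s=1}^k q_s\log\frac{\gamma_s}{\tilde p_s} \]
with
\[ \gamma_s = \sum_{\ell\in A_s} p_\ell e^{\gamma g_\ell}. \]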
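Case U2: For the power utility we have seen that $V_0(x) = \frac{1}{\alpha}\mu x^{\alpha}$ with
\[
\mu = \left(\sum_{s=1}^{k} q_s^{\frac{\alpha}{\alpha-1}}\,\hat p_s^{\frac{1}{1-\alpha}}\right)^{1-\alpha},
\]
and
\[
V_G(x+\nu) = \sum_{\ell} p_\ell\,U\!\left(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\right) = \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell).
\]
We first find $\hat y$ by setting $V_G(x+\nu) = V_0(x)$, that is,
\[
\sum_{\ell} p_\ell\,U\!\left(\frac{\lambda_s^G(\hat y)}{\hat yq_s} - g_\ell\right) = \sum_{\ell} p_\ell\,U(\theta_s^G(\hat y) - g_\ell) = \frac{1}{\alpha}\mu x^{\alpha}.
\]
The left side is
\[
\frac{1}{\alpha}\sum_{\ell} p_\ell\left(\frac{\lambda_s^G(\hat y)}{\hat yq_s} - g_\ell\right)^{\alpha}.
\]
Now $\lambda_s^G(y) = \lambda_s$ is the unique solution of
\[
\sum_{\ell\in A_s}\frac{p_\ell}{(yq_s)^{\alpha}\,(\lambda_s - yq_sg_\ell)^{1-\alpha}} = 1,
\quad\text{so}\quad
\sum_{\ell\in A_s} p_\ell\left(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\right)^{\alpha-1} = yq_s .
\]
The left side of the last equation is decreasing in $\lambda_s^G(y)/y$ because $\alpha < 1$. Therefore,
\[
\frac{d}{dy}\left(\frac{\lambda_s^G(y)}{y}\right) < 0 .
\]
Newton's method, or interval division, can be applied to find $\hat y$. This enables us to calculate
\[
\frac{1}{\hat y}\sum_{s=1}^{k}\lambda_s^G(\hat y),
\]
which equals $x+\nu$. Therefore, the asking price for $G$ is determined.

Case U3: For the logarithmic utility,
\[
V_0(x) = \log x - \sum_{s=1}^{k}\hat p_s\log\frac{q_s}{\hat p_s},
\]
\[
V_G(x+\nu) = \sum_{\ell} p_\ell\,U\!\left(\frac{\lambda_s^G(\hat y)}{\hat yq_s} - g_\ell\right) = \sum_{\ell} p_\ell\log\!\left(\frac{\lambda_s^G(\hat y)}{\hat yq_s} - g_\ell\right) = \sum_{\ell} p_\ell\log(\theta_s^G(\hat y) - g_\ell).
\]
We determine $\hat y$ from the condition $V_G(x+\nu) = V_0(x)$. Now $\lambda_s^G(\hat y)$ is the unique solution of
\[
\sum_{\ell\in A_s}\frac{p_\ell}{\lambda_s^G(\hat y) - \hat yq_sg_\ell} = 1 .
\]
Therefore,
\[
\sum_{\ell\in A_s} p_\ell\left(\frac{\lambda_s^G(y)}{yq_s} - g_\ell\right)^{-1} = yq_s,
\quad\text{implying}\quad
\frac{d}{dy}\left(\frac{\lambda_s^G(y)}{y}\right) < 0 .
\]
Having found $\hat y$ we calculate
\[
\frac{1}{\hat y}\sum_{s=1}^{k}\lambda_s^G(\hat y).
\]
This equals $x+\nu$, and so $\nu$ is found.

Write $\theta = \theta_s^G(y) = \lambda_s^G(y)/(yq_s)$. As before,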
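\[
\tilde V_G(y) = \inf_{Q\in\mathcal M}\left\{E_P\!\left[\tilde U\!\left(y\frac{Q}{P}\right)\right] - yE_Q[G]\right\},
\qquad
\tilde V_0(y) = \inf_{Q\in\mathcal M}E_P\!\left[\tilde U\!\left(y\frac{Q}{P}\right)\right].
\]
In the calculation of the minimizing $Q$ in Section 9.5,
\[
\hat Q^G(\ell) = \frac{p_\ell}{y}\,U'(\theta_s^G(y) - g_\ell) \quad\text{for } \ell\in A_s .
\]
Also,
\[
\sum_{\ell\in A_s}\frac{p_\ell}{yq_s}\,U'(\theta_s^G(y) - g_\ell) = 1. \tag{9.30}
\]
As $\hat Q^G(\ell) = q_s\,p_\ell\,e^{m_\ell}$ for some $m_\ell$, from (9.30)
\[
\theta_s^G(y) = g_\ell + I(yq_s e^{m_\ell}).
\]
Now
\[
\sum_{\ell\in A_s} p_\ell\,e^{m_\ell} = 1,
\]
so when $G = 0$,
\[
\sum_{\ell\in A_s}\hat Q(\ell) = \sum_{\ell\in A_s} q_s\,p_\ell\,e^{m_\ell} = \sum_{\ell\in A_s}\frac{p_\ell}{y}\,U'(\theta_s^0(y)).
\]
That is,
\[
q_s = \frac{\hat p_s}{y}\,U'(\theta_s^0(y)),
\quad\text{so}\quad
\theta_s^0(y) = I\!\left(\frac{yq_s}{\hat p_s}\right).
\]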
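Rewriting the results of the U3 case discussed in Section 9.3,
\[
\tilde V_G'(y) = -\sum_{s=1}^{k} q_s\,\theta_s^G(y),
\qquad
\tilde V_0'(y) = -\sum_{s=1}^{k} q_s\,\theta_s^0(y).
\]
We saw earlier that
\[
V_0(x) = \sum_{s=1}^{k}\hat p_s\,U(\theta_s^0(y)) \quad\text{with}\quad x = \sum_{s=1}^{k} q_s\,\theta_s^0(y).
\]
Also,
\[
V_G(x) = \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) \quad\text{with}\quad x = \sum_{s=1}^{k} q_s\,\theta_s^G(y).
\]
We now solve $V_G(x+\nu) = V_0(x)$. We first compute $V_0(x)$. Find $\hat y_1$ such that
\[
\sum_{s=1}^{k} q_s\,I\!\left(\frac{\hat y_1 q_s}{\hat p_s}\right) = x .
\]
Then
\[
V_0(x) = \sum_{s=1}^{k}\hat p_s\,U\!\left(I\!\left(\frac{\hat y_1 q_s}{\hat p_s}\right)\right).
\]
Now compute $\hat y$ such that
\[
\sum_{\ell} p_\ell\,U(\theta_s^G(\hat y) - g_\ell) = V_0(x)
\quad\text{and}\quad
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(\hat y) - g_\ell) = q_s\hat y .
\]
Then
\[
\nu = \sum_{s=1}^{k} q_s\,\theta_s^G(\hat y) - x .
\]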
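As an illustration of Case U1, the following sketch evaluates the closed-form asking price and checks the indifference condition $V_G(x+\nu) = V_0(x)$ numerically. The two-atom, four-state model and all numbers are hypothetical, and the function names are introduced here only for illustration; the value function is computed from the first-order condition for $\theta_s^G(y)$ and the budget constraint rather than from the closed form.

```python
import math

def atom_sums(p, g, gamma):
    """gamma_s = sum over the states l of atom A_s of p_l * exp(gamma * g_l)."""
    return [sum(pl * math.exp(gamma * gl) for pl, gl in zip(ps, gs))
            for ps, gs in zip(p, g)]

def ask_price_exp(p, g, q, gamma):
    """Case U1 closed form: nu = (1/gamma) * sum_s q_s * log(gamma_s / p_hat_s)."""
    gam = atom_sums(p, g, gamma)
    return sum(qs * math.log(gm / sum(ps)) for ps, qs, gm in zip(p, q, gam)) / gamma

def value_exp(p, g, q, gamma, wealth):
    """V_G(wealth) = sum_l p_l U(theta_s - g_l) for U(x) = -exp(-gamma x).

    theta_s solves sum_{l in A_s} p_l U'(theta_s - g_l) = y q_s, and y is fixed
    by the budget constraint sum_s q_s theta_s = wealth.
    """
    gam = atom_sums(p, g, gamma)
    m = sum(qs * math.log(gm / qs) for qs, gm in zip(q, gam))
    log_y = math.log(gamma) + m - gamma * wealth
    total = 0.0
    for ps, gs, qs, gm in zip(p, g, q, gam):
        theta = (math.log(gamma * gm / qs) - log_y) / gamma
        total -= sum(pl * math.exp(-gamma * (theta - gl)) for pl, gl in zip(ps, gs))
    return total

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]    # p[s][j]: physical probabilities of the states in A_s
g = [[1.0, 0.0], [2.0, 0.5]]    # g[s][j]: payoff of the claim G in those states
q = [0.5, 0.5]                  # risk-neutral probabilities of the atoms
zero = [[0.0, 0.0], [0.0, 0.0]]
gamma = 0.7

nu = ask_price_exp(p, g, q, gamma)
# Indifference gap at wealth x = 3; it should vanish up to rounding.
gap = value_exp(p, g, q, gamma, 3.0 + nu) - value_exp(p, zero, q, gamma, 3.0)
```

Repeating the check at a different wealth level gives the same $\nu$, in line with the remark that for U1 the price does not depend on $x$.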
9.9 THE INDIFFERENCE BID PRICE

In Section 9.8 we defined the indifference asking price. We now define
\[
V_G^B(x) = \sup_{\alpha}E_P[U(X^{x,\alpha} + G)],
\]
and, as before,
\[
V_0(x) = \sup_{\alpha}E_P[U(X^{x,\alpha})].
\]
The bid price, $\nu^b$, is the number such that $V_G^B(x - \nu^b) = V_0(x)$.

Theorem 9.7 $\nu^b(G) = -\nu^a(-G)$.

Proof. Write $H = -G$, so
\[
V_G^B(x) = \sup_{\alpha}E_P[U(X^{x,\alpha} - H)] = V_H(x).
\]
Then
\[
V_G^B(x - \nu^b(G)) = V_0(x) = V_H(x - \nu^b(G)),
\quad\text{and}\quad
V_H(x + \nu^a(H)) = V_0(x).
\]
As $y \mapsto V_H(y)$ is strictly monotone, $x - \nu^b(G) = x + \nu^a(H)$, that is, $\nu^b(G) = -\nu^a(-G)$. □

All the computations for $\nu^b$ are as before. In fact, we have that
\[
V_G^B(x) = \sum_{\ell} p_\ell\,U(\bar\theta_s^G(y) + g_\ell)
\quad\text{with}\quad
x = \sum_{s=1}^{k} q_s\,\bar\theta_s^G(y)
\]
and $\bar\theta_s = \bar\theta_s^G(y)$ the unique solution of
\[
yq_s = \sum_{\ell\in A_s} p_\ell\,U'(\bar\theta_s^G(y) + g_\ell).
\]
The procedure, therefore, is

a) Compute $V_0(x)$.

b) Compute $\hat y$ so that
\[
\sum_{\ell} p_\ell\,U(\bar\theta_s^G(\hat y) + g_\ell) = V_0(x),
\qquad
\hat yq_s = \sum_{\ell\in A_s} p_\ell\,U'(\bar\theta_s^G(\hat y) + g_\ell).
\]
Then set
\[
x - \nu^b = \sum_{s=1}^{k} q_s\,\bar\theta_s^G(\hat y)
\]
to compute $\nu^b$.
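The identity $\nu^b(G) = -\nu^a(-G)$ means that any routine for the asking price also delivers the bid price. A minimal sketch for the exponential utility follows, using the closed-form asking price $\nu^a = \frac{1}{\gamma}\sum_s q_s\log(\gamma_s/\hat p_s)$ with $\gamma_s = \sum_{\ell\in A_s}p_\ell e^{\gamma g_\ell}$; the model data and function names are hypothetical.

```python
import math

def ask_price_exp(p, g, q, gamma):
    """Exponential-utility asking price: (1/gamma) * sum_s q_s * log(gamma_s / p_hat_s)."""
    total = 0.0
    for ps, gs, qs in zip(p, g, q):
        gamma_s = sum(pl * math.exp(gamma * gl) for pl, gl in zip(ps, gs))
        total += qs * math.log(gamma_s / sum(ps))
    return total / gamma

def bid_price_exp(p, g, q, gamma):
    """Bid price obtained through Theorem 9.7: nu_b(G) = -nu_a(-G)."""
    minus_g = [[-gl for gl in gs] for gs in g]
    return -ask_price_exp(p, minus_g, q, gamma)

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]    # physical probabilities, grouped by atom
g = [[1.0, 0.0], [2.0, 0.5]]    # payoffs of G, grouped by atom
q = [0.5, 0.5]                  # risk-neutral probabilities of the atoms

ask = ask_price_exp(p, g, q, 0.7)
bid = bid_price_exp(p, g, q, 0.7)
spread = ask - bid              # non-negative bid-ask spread

# A claim that is constant on each atom can be replicated: bid and ask coincide.
g_hedgeable = [[1.0, 1.0], [2.0, 2.0]]
spread_hedgeable = (ask_price_exp(p, g_hedgeable, q, 0.7)
                    - bid_price_exp(p, g_hedgeable, q, 0.7))
```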
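9.10 EXAMPLES

For quadratic and exponential utilities explicit formulas for the indifference prices are provided in the following examples.

Example 3 Consider the quadratic utility
\[
U(x) = -(x-a)^2, \qquad x < a .
\]
Then $U'(x) = -2(x-a)$ and $U''(x) = -2$. The risk aversion is
\[
-\frac{U''(x)}{U'(x)} = \frac{1}{a-x},
\qquad\text{and}\qquad
I(z) = a - \frac{z}{2}.
\]
Let us compute $\theta_s^G(y)$ and $\bar\theta_s^G(y)$. Now for $s = 1,\dots,k$,
\[
-2\sum_{\ell\in A_s} p_\ell\,[\theta_s^G(y) - g_\ell - a] = yq_s,
\]
so
\[
-2\hat p_s\,\theta_s^G(y) + 2\sum_{\ell\in A_s} p_\ell g_\ell + 2a\sum_{\ell\in A_s} p_\ell = yq_s
\]
and
\[
2\hat p_s\,\theta_s^G(y) = 2\sum_{\ell\in A_s} p_\ell g_\ell + 2a\sum_{\ell\in A_s} p_\ell - yq_s,
\]
and
\[
\theta_s^G(y) = \hat G_s + a - \frac{yq_s}{2\hat p_s},
\quad\text{where}\quad
\hat G_s = \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell g_\ell .
\]
The formula for $\theta_s^G(y)$ is well defined only if $\hat G_s - \frac{yq_s}{2\hat p_s} < 0$ for all $s$. Then
\begin{align*}
V_G(x) &= \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell)
= \sum_{\ell} p_\ell\,U\!\left(\hat G_s + a - \frac{yq_s}{2\hat p_s} - g_\ell\right) \\
&= -\sum_{\ell} p_\ell\left(\hat G_s - \frac{yq_s}{2\hat p_s} - g_\ell\right)^2 \\
&= -\sum_{\ell} p_\ell(\hat G_s - g_\ell)^2 - \frac{1}{4}y^2\sum_{\ell} p_\ell\,\frac{q_s^2}{\hat p_s^2} \\
&= -\sum_{\ell} p_\ell(\hat G_s - g_\ell)^2 - \frac{1}{4}y^2\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}
\end{align*}
and
\[
x = \sum_{s=1}^{k}\theta_s^G(y)\,q_s = \sum_{s=1}^{k} q_s\hat G_s + a - \frac{y}{2}\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}.
\]
Therefore,
\[
V_G(x) = -\sum_{\ell} p_\ell(\hat G_s - g_\ell)^2 - \left(\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}\right)^{-1}\left(x - a - \sum_{s=1}^{k} q_s\hat G_s\right)^2 .
\]
For $G = 0$,
\[
V_0(x) = -\left(\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}\right)^{-1}[x-a]^2 .
\]
The asking price $\nu = \nu^a$ is defined by
\[
V_G(x+\nu^a) = V_0(x).
\]
That is,
\[
\left(x + \nu^a - a - \sum_{s=1}^{k} q_s\hat G_s\right)^2 = [x-a]^2 - \Lambda\Delta,
\]
where
\[
\Lambda = \sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}
\quad\text{and}\quad
\Delta = \sum_{\ell} p_\ell(\hat G_s - g_\ell)^2 .
\]
Therefore,
\[
\nu^a = -(x-a) + \sum_{s=1}^{k} q_s\hat G_s - [(x-a)^2 - \Lambda\Delta]^{1/2}.
\]
Note that when $G$ is $\mathcal F_s$ measurable, $g_\ell = \hat G_s$ for $\ell\in A_s$, so then $\Delta = 0$ and
\[
\nu = \nu^a = \sum_{s=1}^{k} q_s\hat G_s .
\]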
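Therefore, the negative sign must be taken in the square root term.

The bid price $\nu^b$ is defined by $V_G^B(x - \nu^b) = V_0(x)$, where
\[
V_G^B(x) = \sum_{\ell} p_\ell\,U(\bar\theta_s^G(y) + g_\ell)
\quad\text{and}\quad
\bar\theta_s^G(y) = -\hat G_s + a - \frac{yq_s}{2\hat p_s}.
\]
Therefore, with $\Lambda = \sum_{s=1}^{k} q_s^2/\hat p_s$ and $\Delta = \sum_{\ell} p_\ell(\hat G_s - g_\ell)^2$ as above,
\[
V_G^B(x - \nu^b) = -\sum_{\ell} p_\ell(\hat G_s - g_\ell)^2 - \Lambda^{-1}\left(x - \nu^b - a + \sum_{s=1}^{k} q_s\hat G_s\right)^2 .
\]
With $V_0(x) = -\Lambda^{-1}(x-a)^2$ the price $\nu^b$ is determined by
\[
x - \nu^b - a + \sum_{s=1}^{k} q_s\hat G_s = -[(x-a)^2 - \Lambda\Delta]^{1/2},
\]
so that
\[
\nu^b = (x-a) + \sum_{s=1}^{k} q_s\hat G_s + [(x-a)^2 - \Lambda\Delta]^{1/2}.
\]
Here the positive sign is taken in the square root so that when $\Delta = 0$, $\nu^b$ reduces to
\[
\sum_{s=1}^{k} q_s\hat G_s .
\]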
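Note that, with $\Lambda = \sum_{s=1}^{k} q_s^2/\hat p_s$ and $\Delta = \sum_{\ell} p_\ell(\hat G_s - g_\ell)^2$,
\[
\nu^a - \nu^b = 2\big[(a-x) - [(a-x)^2 - \Lambda\Delta]^{1/2}\big] = \frac{2\Lambda\Delta}{(a-x) + [(a-x)^2 - \Lambda\Delta]^{1/2}}.
\]
Note this is 0 when $\Delta = 0$. Now
\[
\frac{\partial\nu^a}{\partial x} = -1 + \frac{a-x}{[(a-x)^2 - \Lambda\Delta]^{1/2}} = \frac{\Lambda\Delta}{[(a-x)^2 - \Lambda\Delta]^{1/2}\,\big[(a-x) + [(a-x)^2 - \Lambda\Delta]^{1/2}\big]} \ge 0 .
\]
Also, since $\partial\nu^b/\partial x = -\partial\nu^a/\partial x$, we have $\frac{\partial}{\partial x}(\nu^a - \nu^b) = 2\,\partial\nu^a/\partial x \ge 0$, with strict inequality when $\Delta > 0$: the spread widens as $x$ approaches $a$, where the risk aversion $1/(a-x)$ grows. Now any $Q\in\mathcal M$ is of the form
\[
Q(\ell) = q_s\,p_\ell\,e^{m_\ell}
\quad\text{with}\quad
\sum_{\ell\in A_s} p_\ell\,e^{m_\ell} = 1,
\]
and
\[
\sum_{\ell} Q(\ell)\,\theta_s^G(\hat y) = \sum_{s=1}^{k} q_s\,\theta_s^G(\hat y) = x + \nu^a .
\]
Therefore, $\theta_s^G(y)$ is an attainable claim.

Example 4 Consider Case U1, the exponential utility. Then
\[
U(x) = -e^{-\gamma x} \quad\text{and}\quad U'(x) = \gamma e^{-\gamma x}.
\]
Hence,
\[
yq_s = \sum_{\ell\in A_s} p_\ell\,\gamma\exp[-\gamma(\theta_s^G(y) - g_\ell)] = e^{-\gamma\theta_s^G(y)}\,\gamma\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}.
\]
Then
\[
e^{\gamma\theta_s^G(y)} = \frac{\gamma}{q_sy}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}
\quad\text{and}\quad
\theta_s^G(y) = \frac{1}{\gamma}\log\!\left(\frac{\gamma}{q_sy}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right).
\]
When $G = 0$,
\[
\theta_s^0(y) = \frac{1}{\gamma}\log\!\left(\frac{\gamma\hat p_s}{q_sy}\right).
\]
Now
\[
\sum_{s=1}^{k} q_s\,\theta_s^G(\hat y) = x = \frac{1}{\gamma}\log\gamma - \frac{1}{\gamma}\log y + \frac{1}{\gamma}M,
\quad\text{where}\quad
M = \sum_{s=1}^{k} q_s\log\!\left(\frac{1}{q_s}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right),
\]
so
\[
\theta_s^G(y) = \frac{1}{\gamma}\log\gamma - \frac{1}{\gamma}\log y + \frac{1}{\gamma}\log\!\left(\frac{1}{q_s}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right)
= x + \frac{1}{\gamma}\log\!\left(\frac{1}{q_s}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right) - \frac{1}{\gamma}M .
\]
As
\begin{align*}
V_G(x) &= \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) \\
&= -\sum_{\ell} p_\ell\exp\!\left(-\gamma\left[x + \frac{1}{\gamma}\log\!\left(\frac{1}{q_s}\sum_{\ell'\in A_s} p_{\ell'}\,e^{\gamma g_{\ell'}}\right) - \frac{1}{\gamma}M - g_\ell\right]\right) \\
&= -e^{-\gamma x}\,e^{M}\sum_{s=1}^{k}\left(\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right) q_s\left(\sum_{\ell'\in A_s} p_{\ell'}\,e^{\gamma g_{\ell'}}\right)^{-1} \\
&= -e^{-\gamma x}\,e^{M}
= -e^{-\gamma x}\exp\!\left(\sum_{s=1}^{k} q_s\log\!\left(\frac{1}{q_s}\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right)\right).
\end{align*}
In particular, when $G = 0$,
\[
V_0(x) = -e^{-\gamma x}\exp M_0,
\quad\text{where}\quad
M_0 = \sum_{s=1}^{k} q_s\log\!\left(\frac{\hat p_s}{q_s}\right).
\]
We are looking for the asking price $\nu = \nu^a$ defined by $V_G(x+\nu) = V_0(x)$. That is,
\[
-e^{-\gamma(x+\nu)}\,e^{M} = -e^{-\gamma x}\,e^{M_0},
\]
so $e^{\gamma\nu} = e^{M - M_0}$ and
\begin{align*}
\nu &= \frac{1}{\gamma}(M - M_0) \\
&= \frac{1}{\gamma}\sum_{r=1}^{k} q_r\log\!\left(\frac{1}{q_r}\sum_{\ell\in A_r} p_\ell\,e^{\gamma g_\ell}\right) - \frac{1}{\gamma}\sum_{r=1}^{k} q_r\log\!\left(\frac{\hat p_r}{q_r}\right) \\
&= \frac{1}{\gamma}\sum_{r=1}^{k} q_r\log\!\left(\frac{1}{\hat p_r}\sum_{\ell\in A_r} p_\ell\,e^{\gamma g_\ell}\right).
\end{align*}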
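The quadratic-utility prices of Example 3 can be checked numerically in the same spirit. The sketch below is a hedged illustration with hypothetical data and function names: `lam` stands for $\sum_s q_s^2/\hat p_s$, `delta` for $\sum_\ell p_\ell(\hat G_s - g_\ell)^2$, and the value function is evaluated from $\theta_s^G(y) = \hat G_s + a - yq_s/(2\hat p_s)$ with $y$ taken from the budget constraint. It assumes $(a-x)^2$ exceeds `lam * delta` and that all terminal wealths stay below $a$.

```python
import math

def quadratic_summary(p, g, q):
    """Return (lam, delta, mean_q) for the quadratic example."""
    p_hat = [sum(ps) for ps in p]
    g_hat = [sum(pl * gl for pl, gl in zip(ps, gs)) / ph
             for ps, gs, ph in zip(p, g, p_hat)]
    lam = sum(qs * qs / ph for qs, ph in zip(q, p_hat))
    delta = sum(pl * (gh - gl) ** 2
                for ps, gs, gh in zip(p, g, g_hat) for pl, gl in zip(ps, gs))
    mean_q = sum(qs * gh for qs, gh in zip(q, g_hat))
    return lam, delta, mean_q

def quadratic_prices(p, g, q, a, x):
    """Asking and bid prices for U(x) = -(x - a)^2, x < a."""
    lam, delta, mean_q = quadratic_summary(p, g, q)
    root = math.sqrt((a - x) ** 2 - lam * delta)
    return (a - x) + mean_q - root, (x - a) + mean_q + root

def value_quadratic(p, g, q, a, wealth):
    """V_G(wealth) computed from theta_s and the budget constraint."""
    lam, _, mean_q = quadratic_summary(p, g, q)
    y = 2.0 * (mean_q + a - wealth) / lam
    total = 0.0
    for ps, gs, qs in zip(p, g, q):
        ph = sum(ps)
        theta = sum(pl * gl for pl, gl in zip(ps, gs)) / ph + a - y * qs / (2.0 * ph)
        total -= sum(pl * (theta - gl - a) ** 2 for pl, gl in zip(ps, gs))
    return total

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]
g = [[1.0, 0.0], [2.0, 0.5]]
q = [0.5, 0.5]
zero = [[0.0, 0.0], [0.0, 0.0]]
a, x = 10.0, 0.0

ask, bid = quadratic_prices(p, g, q, a, x)
ask_gap = value_quadratic(p, g, q, a, x + ask) - value_quadratic(p, zero, q, a, x)
# The buyer of G has the value function of the seller of -G (Theorem 9.7).
minus_g = [[-gl for gl in gs] for gs in g]
bid_gap = value_quadratic(p, minus_g, q, a, x - bid) - value_quadratic(p, zero, q, a, x)
```

With these numbers the asking price exceeds the bid price, and both collapse to $\sum_s q_s\hat G_s$ when the payoff is constant on each atom.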
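9.11 PROPERTIES OF ν

We have noted that if $G$ is $\mathcal F_s$ measurable, then $\nu^a(G) = \nu^b(G) = \sum_{s=1}^{k} q_s\hat G_s$, where $g_\ell = \hat G_s$ for $\ell\in A_s$.

Theorem 9.8 $G \mapsto \nu^a(G)$ is convex.

Proof. $\nu^a(G) = \sum_{s=1}^{k} q_s\,\theta_s^G(y) - x$, where
\[
q_sy = \sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell)
\quad\text{and}\quad
\sum_{\ell=1}^{N} p_\ell\,U(\theta_s^G(y) - g_\ell) = V_0(x).
\]
Write
\[
\varphi(t) = \nu^a(tG_1 + (1-t)G_2), \qquad 0\le t\le 1,
\]
and $G(t) = G_2 + t(G_1 - G_2)$, with components $g_\ell(t) = g_\ell^2 + t(g_\ell^1 - g_\ell^2)$, and define $\hat y(t)$ by
\[
\sum_{\ell} p_\ell\,U\big(\theta_s^{G(t)}(\hat y(t)) - g_\ell(t)\big) = V_0(x).
\]
Write $\psi_s(t) = \theta_s^{G(t)}(\hat y(t))$, so that
\[
\nu^a(tG_1 + (1-t)G_2) = \varphi(t) = \sum_{s=1}^{k} q_s\,\psi_s(t) - x
\quad\text{and}\quad
\varphi''(t) = \sum_{s=1}^{k} q_s\,\psi_s''(t).
\]
Also, as
\[
\sum_{\ell} p_\ell\,U(\psi_s(t) - g_\ell(t)) = V_0(x),
\]
differentiating in $t$ gives
\[
\sum_{\ell} p_\ell\,U'(\psi_s(t) - g_\ell(t))\,(\psi_s'(t) - g_\ell'(t)) = 0 \tag{9.31}
\]
and
\[
\sum_{\ell} p_\ell\,U''(\psi_s(t) - g_\ell(t))\,(\psi_s'(t) - g_\ell'(t))^2 + \sum_{\ell} p_\ell\,U'(\psi_s(t) - g_\ell(t))\,\psi_s''(t) = 0,
\]
as $g_\ell''(t) = 0$. Now
\[
q_s\hat y(t) = \sum_{\ell\in A_s} p_\ell\,U'(\psi_s(t) - g_\ell(t)),
\]
so we have
\[
\sum_{\ell} p_\ell\,U''(\psi_s(t) - g_\ell(t))\,(\psi_s'(t) - g_\ell'(t))^2 + \hat y(t)\sum_{s=1}^{k} q_s\,\psi_s''(t) = 0,
\]
giving
\[
\hat y(t)\sum_{s=1}^{k} q_s\,\psi_s''(t) = \hat y(t)\,\varphi''(t) = -\sum_{\ell} p_\ell\,U''(\psi_s(t) - g_\ell(t))\,(\psi_s'(t) - g_\ell'(t))^2 \ge 0 .
\]
Therefore, $\varphi(t)$ is convex. That is, the map $G \mapsto \nu^a(G)$ is convex. □

The fact that $\nu^b$ is concave follows from Theorem 9.7. Moreover,

Lemma 9.9 $G \mapsto \nu^a(G)$ is monotonically increasing.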
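Proof. We say $G_1 \ge G_2$ if $g_\ell^1 \ge g_\ell^2$ for $\ell = 1,\dots,N$. Suppose $G_1 \ge G_2$ and write $\varphi(t) = \nu^a(G_2 + t(G_1 - G_2))$. We must show $\varphi(1) \ge \varphi(0)$. This is the case if $\varphi'(t) \ge 0$. Now
\[
\varphi'(t) = \sum_{s=1}^{k} q_s\,\psi_s'(t).
\]
However, we have seen in (9.31) that
\[
\hat y(t)\sum_{s=1}^{k} q_s\,\psi_s'(t) = \sum_{\ell} p_\ell\,U'(\psi_s(t) - g_\ell(t))\,(g_\ell^1 - g_\ell^2) \ge 0 .
\]
Therefore, $\varphi'(t) \ge 0$ and the result follows. □

Corollary 9.10 $G \mapsto \nu^b(G)$ is monotonically increasing.

Again this follows from Theorem 9.7.

Example 5 With $U(x) = -e^{-\gamma x}$,
\[
\nu^b(G) = -\frac{1}{\gamma}\sum_{s=1}^{k} q_s\log\!\left(\sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,e^{-\gamma g_\ell}\right)
\]
and clearly $\nu^b$ is an increasing function of $G$.

Lemma 9.11 Suppose $G$ is a constant $C$. Then $\nu^a(C) = C$.

Proof. For each $y$ we must determine $\theta_s^G(y)$ so that
\[
yq_s = \sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - C) = \hat p_s\,U'(\theta_s^G(y) - C).
\]
Therefore,
\[
\theta_s^G(y) = C + I\!\left(\frac{yq_s}{\hat p_s}\right).
\]
We then wish to find $\hat y$ such that
\[
\sum_{\ell} p_\ell\,U(\theta_s^G(\hat y) - C) = \sum_{\ell} p_\ell\,U\!\left(I\!\left(\frac{\hat yq_s}{\hat p_s}\right)\right) = V_0(x).
\]
This same $\hat y$ will work for $\theta_s^0$, and
\[
\theta_s^0(\hat y) = I\!\left(\frac{\hat yq_s}{\hat p_s}\right).
\]
Then
\[
\nu^a(G) = \sum_{s=1}^{k} q_s\,(\theta_s^G(\hat y) - \theta_s^0(\hat y)) = C . \qquad\square
\]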
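The properties of this section (convexity, monotonicity, and $\nu^a(C) = C$) can be spot-checked for the exponential utility, whose asking price is available in closed form. The sketch below is illustrative only: the model data, the two claims `g1` and `g2`, and the helper names are assumptions made for the example, and a finite check of this kind does not replace the proofs.

```python
import math

def ask_price_exp(p, g, q, gamma):
    """Exponential-utility asking price: (1/gamma) * sum_s q_s * log(gamma_s / p_hat_s)."""
    total = 0.0
    for ps, gs, qs in zip(p, g, q):
        gamma_s = sum(pl * math.exp(gamma * gl) for pl, gl in zip(ps, gs))
        total += qs * math.log(gamma_s / sum(ps))
    return total / gamma

def mix(g1, g2, t):
    """State-by-state convex combination t*G1 + (1-t)*G2."""
    return [[t * u + (1.0 - t) * v for u, v in zip(r1, r2)]
            for r1, r2 in zip(g1, g2)]

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]
q = [0.5, 0.5]
gamma = 0.7
g1 = [[1.0, 0.0], [2.0, 0.5]]
g2 = [[0.5, -1.0], [2.0, 0.0]]      # g1 >= g2 in every state

nu1 = ask_price_exp(p, g1, q, gamma)
nu2 = ask_price_exp(p, g2, q, gamma)
nu_mid = ask_price_exp(p, mix(g1, g2, 0.5), q, gamma)           # convexity check
nu_const = ask_price_exp(p, [[1.5, 1.5], [1.5, 1.5]], q, gamma)  # constant claim C = 1.5
```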
9.12 NUMERICAL METHODS

We now describe how to calculate the indifference prices. The procedure to calculate $\nu^a$ is, for a given $x$, to compute $V_0(x)$. Then $y$ is determined so that
\[
\sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) = V_0(x).
\]
This involves calculating the $\theta_s^G(y)$ for $s = 1,\dots,k$. Then
\[
\nu^a = \sum_{s=1}^{k} q_s\,\theta_s^G(y) - x .
\]
We shall discuss the computation of $\theta_s = \theta_s^G(y)$. Now $\theta_s$ is determined by the condition
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s - g_\ell) = q_sy . \tag{9.32}
\]
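The whole procedure can be sketched for the logarithmic utility (Case U3). The version below deliberately replaces the Newton-Raphson iterations with interval division (bisection), which is slower but needs no derivatives; the bracketing choices, the fixed iteration count, the model data, and the function names are assumptions made for this illustration.

```python
import math

def bisect_decreasing(f, lo, hi, iters=200):
    """Root of a decreasing function f on [lo, hi] with f(lo) >= 0 >= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def theta_log(ps, gs, qs, y):
    """Solve sum_{l in A_s} p_l / (theta - g_l) = q_s * y for theta > max g_l."""
    g_max = max(gs)
    p_star = ps[gs.index(g_max)]
    lo = g_max + p_star / (qs * y)      # left end: the condition's residual is >= 0
    hi = g_max + sum(ps) / (qs * y)     # right end: the residual is <= 0
    resid = lambda th: sum(pl / (th - gl) for pl, gl in zip(ps, gs)) - qs * y
    return bisect_decreasing(resid, lo, hi)

def ask_price_log(p, g, q, x):
    """Asking price for U = log: find y with F(y) = 0, then nu = sum_s q_s theta_s - x."""
    v0 = math.log(x) - sum(sum(ps) * math.log(qs / sum(ps)) for ps, qs in zip(p, q))

    def F(y):
        total = 0.0
        for ps, gs, qs in zip(p, g, q):
            th = theta_log(ps, gs, qs, y)
            total += sum(pl * math.log(th - gl) for pl, gl in zip(ps, gs))
        return total - v0

    lo = hi = 1.0 / x                   # the solution when G = 0
    while F(lo) <= 0.0:                 # F is decreasing: widen until it brackets 0
        lo *= 0.5
    while F(hi) >= 0.0:
        hi *= 2.0
    y_hat = bisect_decreasing(F, lo, hi)
    return sum(qs * theta_log(ps, gs, qs, y_hat) for ps, gs, qs in zip(p, g, q)) - x

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]
q = [0.5, 0.5]
nu = ask_price_log(p, [[1.0, 0.0], [2.0, 0.5]], q, 5.0)
nu_const = ask_price_log(p, [[1.0, 1.0], [1.0, 1.0]], q, 5.0)   # constant claim C = 1
```

For the constant claim the routine returns the constant itself, and for the risky claim the price lies between $\sum_s q_s\hat G_s$ and the superreplication cost $\sum_s q_s\max_{\ell\in A_s}g_\ell$.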
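9.12.1 The Calculation of $V_0(x)$

Recall $\theta_s^0(y) = \lambda_s^0(y)/(yq_s)$. For now write $\theta_s(y) = \theta_s^0(y)$. Suppose $\hat y$ is a solution of
\[
x = \sum_{s=1}^{k} q_s\,\theta_s(y).
\]
Then
\[
V_0(x) = \sum_{\ell} p_\ell\,U(\theta_s(\hat y)) = \sum_{s=1}^{k}\hat p_s\,U(\theta_s(\hat y)),
\quad\text{where again}\quad
\hat p_s = \sum_{\ell\in A_s} p_\ell .
\]
From (9.24),
\[
U'(\theta_s(\hat y)) = \frac{q_s\hat y}{\hat p_s},
\]
and from (9.28),
\[
\sum_{s=1}^{k} q_s\,\theta_s(y) = x .
\]
Write
\[
\theta(y) := \sum_{s=1}^{k} q_s\,\theta_s(y).
\]
We shall use the Newton-Raphson method to compute $\hat y$. From (9.24),
\[
U'(\theta_s(y)) = \frac{q_sy}{\hat p_s},
\quad\text{so}\quad
U''(\theta_s(y))\,\theta_s'(y) = \frac{q_s}{\hat p_s} > 0,
\]
so $\theta_s'(y) < 0$ as $U''(z) < 0$. Differentiating again,
\[
\theta_s''(y) = -\frac{U'''(\theta_s(y))\,(\theta_s'(y))^2}{U''(\theta_s(y))} \ge 0 \quad\text{if } U''' \ge 0 .
\]
Remark. $U'''(\cdot) \ge 0$ is not really necessary. However, it is satisfied by the utility functions U1, U2, and U3 and it ensures monotone convergence of the Newton-Raphson method.

Set
\[
\hat y_0 = U'(x)\,\min_s\frac{\hat p_s}{q_s}
\quad\text{and}\quad
\hat y_{n+1} = \hat y_n - [\theta(\hat y_n) - x]\,\theta'(\hat y_n)^{-1}.
\]
With the above choice of $\hat y_0$,
\begin{align*}
\theta(\hat y_0) &= \sum_{s=1}^{k} q_s\,\theta_s(\hat y_0) = \sum_{s=1}^{k} q_s\,I\!\left(\frac{\hat y_0 q_s}{\hat p_s}\right) && \text{by (9.25)} \\
&\ge \sum_{s=1}^{k} q_s\,I\!\left(U'(x)\cdot\frac{\hat p_s}{q_s}\cdot\frac{q_s}{\hat p_s}\right) && \text{as } I' < 0, \\
&= \sum_{s=1}^{k} q_s\,I(U'(x)) = x\sum_{s=1}^{k} q_s = x, && \text{as } \sum_{s=1}^{k} q_s = 1 .
\end{align*}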
Consequently, $\hat y_0 \le \hat y$ and $\hat y_n$ converges to $\hat y$ as $n\to\infty$.

9.12.2 $V_G(x)$

From equation (9.29) we wish to find $\hat y$, which solves
\[
\sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) = V_G(x+\nu) = V_0(x). \tag{9.32}
\]
Then, as in (9.30), with this choice of $\hat y$,
\[
x + \nu = \sum_{s=1}^{k} q_s\,\theta_s^G(\hat y).
\]
Consequently, $\nu = \nu^a$, the asking price, can be found.

a) We now establish the existence and uniqueness of the solution $\hat y$ of (9.32). Write
\[
F(y) := \sum_{\ell} p_\ell\,U(\theta_s^G(y) - g_\ell) - V_0(x).
\]
Then
\[
F'(y) = \sum_{\ell} p_\ell\,U'(\theta_s^G(y) - g_\ell)\,(\theta_s^G)'(y).
\]
As in Section 9.5,
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell) = yq_s,
\quad\text{so}\quad
\sum_{\ell\in A_s} p_\ell\,U''(\theta_s^G(y) - g_\ell)\,(\theta_s^G)'(y) = q_s
\]
and, therefore, $(\theta_s^G)'(y) < 0$ for $s = 1,2,\dots,k$. Further,
\[
F'(y) = \sum_{s=1}^{k} yq_s\,(\theta_s^G)'(y) < 0,
\]
so the solution of (9.32) is unique if it exists. If a finite solution of (9.32) exists, then $F$ must have a sign change. Let us recall some properties of $\theta_s^G(y)$ as $y\to 0+$ and $y\to+\infty$. Now for each $s = 1,2,\dots,k$,
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell) = yq_s .
\]
U1: If $U(x) = -e^{-\gamma x}$, $\gamma > 0$, this states that
\[
\gamma\sum_{\ell\in A_s} p_\ell\exp(-\gamma(\theta_s^G(y) - g_\ell)) = yq_s,
\quad\text{so that}\quad
\theta_s^G(y) = \frac{1}{\gamma}\left[\ln\!\left(\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right) - \ln\frac{yq_s}{\gamma}\right].
\]
Therefore, as $y\to 0+$ we see $\theta_s^G(y)\to+\infty$, and as $y\to+\infty$, $\theta_s^G(y)\to-\infty$.

U2 and U3: If $U(x) = \frac{1}{\alpha}x^{\alpha}$ for $\alpha < 1$ or $U(x) = \ln x$, $x > 0$ (corresponding to $\alpha = 0$), then for $s = 1,\dots,k$,
\[
\sum_{\ell\in A_s} p_\ell\,(\theta_s^G(y) - g_\ell)^{\alpha-1} = yq_s .
\]
Then
\[
\theta_s^G(y)\to+\infty \ \text{ as } y\to 0+
\quad\text{and}\quad
\theta_s^G(y)\to\max_{\ell\in A_s}\{g_\ell\} \ \text{ as } y\to+\infty .
\]
Quadratic Utility: If $U(x) = -(a-x)^2$ for $x\le a$, then $U'(x) = -2(x-a)$, and the condition
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell) = yq_s
\quad\text{becomes}\quad
-2\sum_{\ell\in A_s} p_\ell\,[\theta_s^G(y) - a - g_\ell] = yq_s .
\]
This gives
\[
\theta_s^G(y) = a + \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell g_\ell - \frac{1}{2}\,\frac{q_s}{\hat p_s}\,y .
\]
However, we must have $\theta_s^G(y)\le a$, so $\theta_s^G(y)$ is only defined when
\[
y \ge \frac{2}{q_s}\sum_{\ell\in A_s} p_\ell g_\ell \quad\text{for each } s .
\]
That is, when
\[
y \ge \max_s\frac{2}{q_s}\sum_{\ell\in A_s} p_\ell g_\ell = \underline y .
\]
We then require $F(\underline y)\ge 0$. However, for $y\ge\underline y$,
\[
\sum_{\ell\in A_s} p_\ell\,U(\theta_s^G(y) - g_\ell)
\]
is then independent of $a$. If $a$ (or $a - x$) is large enough, then $F(\underline y) > 0$.

For Case U1, $\lim_{y\to\infty}F(y) = -\infty$. For Case U3, when $U(x) = \ln x$, $\lim_{y\to\infty}F(y) = -\infty$. For Case U2, when $U(x) = \frac{1}{\alpha}x^{\alpha}$, $\alpha < 1$,
\[
\lim_{y\to\infty}F(y) = \sum_{\ell} p_\ell\,U\!\left(\max_{\ell'\in A_s}g_{\ell'} - g_\ell\right) - V_0(x) = \sum_{s=1}^{k}\sum_{\ell\in A_s} p_\ell\,U\!\left(\max_{\ell'\in A_s}g_{\ell'} - g_\ell\right) - V_0(x).
\]
This may not be negative, in which case $F(y) = 0$ does not have a solution. However, $V_0(x)\ge U(x)$, which goes to $+\infty$ as $x\to+\infty$. Therefore, if $x$ is large enough, $F(+\infty)$ is negative. Indeed, in Section 9.7, Case U2, it is shown that $V_0(x) = Cx^{\alpha}$ for a constant $C$ independent of $x$, so it is possible to estimate how large $x$ must be to ensure $\lim_{y\to\infty}F(y) < 0$.

Quadratic Utility: For the quadratic utility $\lim_{y\to\infty}F(y) = -\infty$.

b) We now discuss the computation of $\theta_s^G(y)$. Recall that for each $s = 1,2,\dots,k$, $\theta = \theta_s^G(y)$ is the unique solution of
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell) = q_sy . \tag{9.33}
\]
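For Case U1 we know
\[
\theta_s^G(y) = \frac{1}{\gamma}\left[\log\!\left(\sum_{\ell\in A_s} p_\ell\,e^{\gamma g_\ell}\right) - \log\frac{yq_s}{\gamma}\right].
\]
For the quadratic utility, if $y\ge\underline y$,
\[
\theta_s^G(y) = a + \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell g_\ell - \frac{1}{2}\,\frac{q_sy}{\hat p_s}.
\]
Consider then Cases U2 and U3, the power and logarithm utilities. Then (9.33) becomes
\[
\sum_{\ell\in A_s} p_\ell\,(\theta_s^G(y) - g_\ell)^{\alpha-1} = q_sy,
\]
with $\alpha = 0$ in Case U3. We know $\theta_s^G(y)$ is unique and
\[
\max_{\ell\in A_s} g_\ell < \theta_s^G(y) < \infty \quad\text{for each } y > 0 .
\]
Also
\[
\lim_{y\to+\infty}\theta_s^G(y) = \max_{\ell\in A_s} g_\ell .
\]
As
\[
\sum_{\ell\in A_s} p_\ell\,U''(\theta_s^G(y) - g_\ell)\,(\theta_s^G)'(y) = q_s \tag{9.34}
\]
for each $s = 1,2,\dots,k$, we must have $(\theta_s^G)'(y) < 0$. Differentiating (9.34) in $y$ gives
\[
\sum_{\ell\in A_s} p_\ell\,U'''(\theta_s^G(y) - g_\ell)\,((\theta_s^G)'(y))^2 + \sum_{\ell\in A_s} p_\ell\,U''(\theta_s^G(y) - g_\ell)\,(\theta_s^G)''(y) = 0,
\]
so if $U'''\ge 0$ then $(\theta_s^G)''(y)\ge 0$. Note that $U'''\ge 0$ holds for all the utilities we have considered. Recall the Arrow-Pratt risk aversion $A(x)$ is given in terms of the utility as
\[
A(x) = -\frac{U''(x)}{U'(x)}.
\]
Then $A'(x)\le 0$ implies $U'''(x)\ge 0$. To find $\hat\theta = \theta_s^G(y)$, fix $y$ and solve $G(\theta) = 0$, where
\[
G(\theta) = \sum_{\ell\in A_s} p_\ell\,U'(\theta - g_\ell) - q_sy .
\]
(Here $G(\cdot)$ denotes this auxiliary function of $\theta$, not the claim.) Then
\[
G'(\theta) = \sum_{\ell\in A_s} p_\ell\,U''(\theta - g_\ell) < 0
\quad\text{and}\quad
G''(\theta) = \sum_{\ell\in A_s} p_\ell\,U'''(\theta - g_\ell) \ge 0 .
\]
The equation $G(\theta) = 0$ will be solved by the Newton-Raphson method: choose $\theta_0$ with $G(\theta_0)\ge 0$ so that $\theta_{n+1}\ge\theta_n$ for all $n$, and so $\theta_n$ is an increasing sequence with limit $\hat\theta = \theta_s^G(y)$, where $\theta_{n+1} = \theta_n - G(\theta_n)/G'(\theta_n)$.

[Figure: graph of the function against $\theta$, with the Newton iterates $\theta_0$, $\theta_1$, $\theta_2$ marked on the $\theta$ axis.]

To select $\theta_0$ write $\hat g_s = \max_{\ell\in A_s} g_\ell = g_{\ell^*}$ and choose
\[
\theta_0 = \hat g_s + I\!\left(\frac{yq_s}{p_{\ell^*}}\right).
\]
Then
\[
G(\theta_0) \ge p_{\ell^*}\,U'(\theta_0 - g_{\ell^*}) - yq_s = p_{\ell^*}\,U'\!\left(I\!\left(\frac{yq_s}{p_{\ell^*}}\right)\right) - yq_s = 0,
\]
and
\[
\theta_{n+1} = \theta_n - \frac{\sum_{\ell\in A_s} p_\ell\,U'(\theta_n - g_\ell) - yq_s}{\sum_{\ell\in A_s} p_\ell\,U''(\theta_n - g_\ell)}.
\]
Lemma 9.12 $G(\theta_n) > 0$ implies $G(\theta_{n+1})\ge 0$.

Proof. With $\hat\theta = \theta_s^G(y)$,
\[
\frac{G(\hat\theta) - G(\theta_n)}{\hat\theta - \theta_n} = G'(\bar\theta_n) \ge G'(\theta_n)
\]
for some $\bar\theta_n$, $\theta_n < \bar\theta_n < \hat\theta$, as $G''\ge 0$. Then, since $G(\hat\theta) = 0$ and $G' < 0$,
\[
\theta_{n+1} = \theta_n - \frac{G(\theta_n)}{G'(\theta_n)} = \frac{\theta_n G'(\theta_n) - G(\theta_n)}{G'(\theta_n)} \le \frac{\hat\theta\,G'(\theta_n)}{G'(\theta_n)} = \hat\theta .
\]
Therefore, $G(\theta_{n+1})\ge 0$ as $G' < 0$. The lemma implies that $\theta_n$ is an increasing sequence with limit $\hat\theta$. □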
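The Newton-Raphson iteration of part (b) can be sketched for the power utility $U(x) = x^{\alpha}/\alpha$, for which $U'(z) = z^{\alpha-1}$ and $I(z) = z^{1/(\alpha-1)}$. The stopping rule, the iteration cap, and the numbers below are assumptions made for illustration; the starting point is the one proposed above, $\theta_0 = \max_{\ell\in A_s}g_\ell + I(yq_s/p_{\ell^*})$.

```python
def theta_newton(ps, gs, qs, y, alpha, tol=1e-13, max_iter=200):
    """Newton-Raphson for sum_{l in A_s} p_l (theta - g_l)^(alpha-1) = q_s * y.

    Returns the final iterate and the list of all iterates.
    """
    g_max = max(gs)
    p_star = ps[gs.index(g_max)]
    # Starting point to the left of the root, where the residual is >= 0.
    theta = g_max + (y * qs / p_star) ** (1.0 / (alpha - 1.0))
    path = [theta]
    for _ in range(max_iter):
        resid = sum(pl * (theta - gl) ** (alpha - 1.0) for pl, gl in zip(ps, gs)) - qs * y
        slope = (alpha - 1.0) * sum(pl * (theta - gl) ** (alpha - 2.0)
                                    for pl, gl in zip(ps, gs))
        step = -resid / slope           # non-negative while resid >= 0, since slope < 0
        theta += step
        path.append(theta)
        if abs(step) <= tol * max(1.0, abs(theta)):
            break
    return theta, path

# Hypothetical atom with two states (illustrative numbers only).
theta_hat, path = theta_newton([0.2, 0.3], [1.0, 0.0], 0.5, 0.4, 0.5)
```

The recorded iterates increase towards the root, as Lemma 9.12 predicts for a decreasing convex function approached from the left.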
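c) We now discuss the computation of $\hat y$, the solution of $F(y) = 0$. We have shown
\[
F'(y) = \sum_{\ell} p_\ell\,U'(\theta_s^G(y) - g_\ell)\,(\theta_s^G)'(y) = \sum_{s=1}^{k} yq_s\,(\theta_s^G)'(y).
\]
Therefore,
\[
F''(y) = \sum_{s=1}^{k} q_s\,\frac{d}{dy}\big[y\,(\theta_s^G)'(y)\big].
\]
Write
\[
N_k(y) := \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell\,U^{(k)}(\theta(y) - g_\ell)
\]
for $k = 1,2,\dots$, where $\theta(y) = \theta_s^G(y)$. Now
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) - g_\ell) = yq_s,
\quad\text{so}\quad
N_1(y) = \frac{yq_s}{\hat p_s}.
\]
Also,
\[
\sum_{\ell\in A_s} p_\ell\,U''(\theta_s^G(y) - g_\ell)\,(\theta_s^G)'(y) = q_s,
\quad\text{so}\quad
y\,(\theta_s^G)'(y) = \frac{q_sy}{\sum_{\ell\in A_s} p_\ell\,U''(\theta(y) - g_\ell)} = \frac{N_1(y)}{N_2(y)}.
\]
Then
\[
(y\,\theta'(y))' = \frac{[N_2(y)^2 - N_1(y)N_3(y)]\,\theta'(y)}{N_2(y)^2},
\]
as $N_1'(y) = N_2(y)\,\theta'(y)$ and $N_2'(y) = N_3(y)\,\theta'(y)$.

We wish to show $F''(y)\ge 0$, which is the case if $(y\,\theta'(y))'\ge 0$. Since $\theta'(y) < 0$, this holds if $N_2(y)^2\le N_1(y)N_3(y)$. Now this is not the case when $U(x) = -(a-x)^2$, as then $N_3(y) = 0$. However, it is true for Cases U1, U2, and U3 because for these utilities,
\[
(U''(x))^2 \le U'(x)\,U'''(x), \quad\text{with } U'\ge 0 \text{ and } U'''\ge 0 .
\]
U1: With $U(x) = -e^{-\gamma x}$, $\gamma > 0$,
\[
U'(x) = \gamma e^{-\gamma x}, \qquad U''(x) = -\gamma^2 e^{-\gamma x}, \qquad U'''(x) = \gamma^3 e^{-\gamma x},
\]
so $(U''(x))^2 = U'(x)\,U'''(x)$. Then
\begin{align*}
\pm N_2(y) &= \pm\sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,U''(\theta(y) - g_\ell)
\le \sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,|U''(\theta(y) - g_\ell)| \\
&\le \sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,\sqrt{U'(\theta(y) - g_\ell)}\,\sqrt{U'''(\theta(y) - g_\ell)} \\
&\le \left(\sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,U'(\theta(y) - g_\ell)\right)^{1/2}\left(\sum_{\ell\in A_s}\frac{p_\ell}{\hat p_s}\,U'''(\theta(y) - g_\ell)\right)^{1/2}
= N_1(y)^{1/2}\,N_3(y)^{1/2}.
\end{align*}
Therefore, $|N_2(y)|\le N_1(y)^{1/2}N_3(y)^{1/2}$ and so $N_2(y)^2\le N_1(y)N_3(y)$, as required.

U2: With $U(x) = \frac{1}{\alpha}x^{\alpha}$, $\alpha < 1$,
\[
U'(x) = x^{\alpha-1}, \qquad U''(x) = (\alpha-1)x^{\alpha-2}, \qquad U'''(x) = (\alpha-1)(\alpha-2)x^{\alpha-3}.
\]
Then, for $x > 0$,
\[
(U''(x))^2 = (\alpha-1)^2x^{2\alpha-4} \le x^{\alpha-1}\,(\alpha-1)(\alpha-2)x^{\alpha-3} = U'(x)\,U'''(x).
\]
The proof that $N_2(y)^2\le N_1(y)N_3(y)$ then follows as in Case U1.

U3: With $U(x) = \log x$, $x > 0$,
\[
U'(x) = \frac{1}{x}, \qquad U''(x) = -\frac{1}{x^2}, \qquad U'''(x) = \frac{2}{x^3}.
\]
Then
\[
(U''(x))^2 = \frac{1}{x^4} \le \frac{1}{x}\cdot\frac{2}{x^3}, \qquad x > 0 .
\]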
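Again, the proof follows as in Case U1.

The Newton-Raphson Method for $y$: Choose $y_0$ so that
\[
\theta_s^G(y_0) \ge g_{\ell^*} + U^{-1}\!\left(\frac{V_0(x)}{p_{\ell^*}}\right), \tag{9.33}
\]
where $g_{\ell^*} = \max_\ell g_\ell$, and suppose $\ell^*\in A_s$. This is possible for U1, U2, and U3 but not necessarily so for $U(x) = -(x-a)^2$. However, there we can take $y_0 = \underline y$ provided $(a-x)$ is large enough. Then
\[
F(y_0) \ge p_{\ell^*}\,U\!\left(g_{\ell^*} + U^{-1}\!\left(\frac{V_0(x)}{p_{\ell^*}}\right) - g_{\ell^*}\right) - V_0(x) = p_{\ell^*}\,\frac{V_0(x)}{p_{\ell^*}} - V_0(x) = 0 .
\]
Now
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y_0) - g_\ell) = y_0q_s,
\quad\text{so}\quad
p_{\ell^*}\,U'(\theta_s^G(y_0) - g_{\ell^*}) \le y_0q_s .
\]
As $I$ is a decreasing function,
\[
\theta_s^G(y_0) \ge g_{\ell^*} + I\!\left(\frac{q_sy_0}{p_{\ell^*}}\right).
\]
If
\[
y_0 = \frac{p_{\ell^*}}{q_s}\,U'\!\left(U^{-1}\!\left(\frac{V_0(x)}{p_{\ell^*}}\right)\right)
\]
(which can be computed), we have
\[
g_{\ell^*} + I\!\left(\frac{q_sy_0}{p_{\ell^*}}\right) = g_{\ell^*} + U^{-1}\!\left(\frac{V_0(x)}{p_{\ell^*}}\right).
\]
Lemma 9.13 Choose
\[
y_0 = \frac{p_{\ell^*}}{q_s}\,U'\!\left(U^{-1}\!\left(\frac{V_0(x)}{p_{\ell^*}}\right)\right),
\]
where $g_{\ell^*} = \max_\ell g_\ell$ and $\ell^*\in A_s$. Then $F(y_0) > 0$. Set
\[
y_{n+1} := y_n - \frac{F(y_n)}{F'(y_n)}.
\]
Then
\[
F(y_n) = \sum_{\ell} p_\ell\,U(\theta_s^G(y_n) - g_\ell) - V_0(x),
\]
\[
F'(y_n) = \sum_{\ell} p_\ell\,U'(\theta_s^G(y_n) - g_\ell)\,(\theta_s^G)'(y_n) = \sum_{s=1}^{k} q_sy_n\,(\theta_s^G)'(y_n),
\]
and
\[
(\theta_s^G)'(y_n) = \frac{q_s}{\sum_{\ell\in A_s} p_\ell\,U''(\theta_s^G(y_n) - g_\ell)}.
\]
Summary: The algorithm is as follows. Start with the $y_0$ determined above. If $y_n$ has been obtained, compute $\theta_s^G(y_n)$ for each $s = 1,\dots,k$ by the algorithm in part (b). Then $F(y_n)$ and $F'(y_n)$ can be computed, and $y_{n+1}$ is given as $y_n - F(y_n)/F'(y_n)$. The sequence $y_n$ is monotonically increasing to $\hat y$ as $n\to\infty$. As $\theta_s^G(\cdot)$ is a decreasing function, $\sum_{s=1}^{k} q_s\,\theta_s^G(y_n) - x$ is decreasing to the asking price $\nu^a$. Note that for $U(x) = -e^{-\gamma x}$ $(\gamma > 0)$ and $U(x) = -(a-x)^2$ $(x\le a)$, explicit formulas are available for $\theta_s^G(y)$.

9.13 APPROXIMATE FORMULAS

We shall obtain approximate values for $\nu^a(G)$, $\nu^b(G)$. Here we focus on $\nu^b(G)$; $\nu^a(G)$ is treated similarly.

Lemma 9.14
\[
\nu^b \cong \sum_{s} q_s\,E[G|A_s] + \frac{1}{2}\sum_{s} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}[G|A_s]. \tag{9.34}
\]
Proof. We first recall:

i)
\[
E[G|A_s] = \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell g_\ell,
\qquad
\mathrm{var}[G|A_s] = \frac{1}{\hat p_s}\sum_{\ell\in A_s} p_\ell\,[g_\ell - E[G|A_s]]^2 .
\]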
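ii) The Arrow-Pratt risk aversion is given by $A(x) = -U''(x)/U'(x)$, so the second term in (9.34) is
\[
-\frac{1}{2}\sum_{s} q_s\,A(\theta_s^0)\,\mathrm{var}[G|A_s].
\]
Now $\theta_s^0$ is the solution of
\[
U'(\theta_s^0) = \lambda_1\frac{q_s}{\hat p_s}, \tag{9.35}
\]
\[
\sum_{s=1}^{k} q_s\,\theta_s^0 = x, \tag{9.36}
\]
where $\hat p_s = \sum_{\ell\in A_s} p_\ell$. Also $\theta_s^G(y)$ is the solution of
\[
\sum_{\ell\in A_s} p_\ell\,U'(\theta_s^G(y) + g_\ell) = \lambda_2 q_s, \tag{9.37}
\]
\[
\sum_{s=1}^{k} q_s\,\theta_s^G(y) = y . \tag{9.38}
\]
We then select $y$ so that
\[
E[U(\theta^G(y) + G)] = E[U(\theta^0)] = V_0(x), \tag{9.39}
\]
and then $y = x - \nu^b(G)$. Suppose we approximate (9.37) by
\[
\sum_{\ell\in A_s} p_\ell\,\{U'(\theta_s^0) + U''(\theta_s^0)[\theta_s^G(y) + g_\ell - \theta_s^0]\} \cong \lambda_2 q_s \tag{9.40}
\]
or
\[
\hat p_s\,U'(\theta_s^0) + \hat p_s\,U''(\theta_s^0)[\theta_s^G(y) + E[G|A_s] - \theta_s^0] \cong \lambda_2 q_s . \tag{9.41}
\]
Then
\[
(\theta_s^G(y) - \theta_s^0 + E[G|A_s])\,U''(\theta_s^0) = \lambda\frac{q_s}{\hat p_s},
\]
where $\lambda = \lambda_2 - \lambda_1$ by (9.35). Therefore,
\[
\theta_s^G(y) - \theta_s^0 + E[G|A_s] = \lambda\,\frac{q_s}{\hat p_s}\,\frac{1}{U''(\theta_s^0)}. \tag{9.42}
\]
This implies, using (9.36) and (9.38), that
\[
y - x + \sum_{s=1}^{k} q_s\,E[G|A_s] = \lambda\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}\,\frac{1}{U''(\theta_s^0)} = \lambda M, \tag{9.43}
\]
where
\[
M = \sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}\,\frac{1}{U''(\theta_s^0)} < 0 .
\]
Consequently, $\lambda = M^{-1}\big(y - x + \sum_{s=1}^{k} q_s\,E[G|A_s]\big)$, so from (9.42):
\[
\theta_s^G(y) - \theta_s^0 + E[G|A_s] = \frac{y - x + \sum_{s} q_s\,E[G|A_s]}{M}\,\frac{q_s}{\hat p_s}\,\frac{1}{U''(\theta_s^0)}. \tag{9.44}
\]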
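We now approximate:
\begin{align*}
V_G^B(x - \nu^b) &= \sum_{\ell} p_\ell\,U(\theta_s^G(y) + g_\ell) \\
&\cong \sum_{\ell} p_\ell\Big[U(\theta_s^0) + U'(\theta_s^0)[\theta_s^G(y) + g_\ell - \theta_s^0] + \frac{1}{2}U''(\theta_s^0)[\theta_s^G(y) + g_\ell - \theta_s^0]^2\Big] \\
&= V_0(x) + \sum_{s=1}^{k}\hat p_s\,U'(\theta_s^0)[\theta_s^G(y) + E[G|A_s] - \theta_s^0] \\
&\quad + \frac{1}{2}\sum_{s=1}^{k}\hat p_s\,U''(\theta_s^0)\{\mathrm{var}(G|A_s) + (\theta_s^G(y) + E[G|A_s] - \theta_s^0)^2\}.
\end{align*}
Write
\[
Z = y - x + \sum_{s} q_s\,E[G|A_s],
\]
so from (9.44)
\[
\theta_s^G(y) + E[G|A_s] - \theta_s^0 = \frac{Z}{M}\,\frac{q_s}{\hat p_s}\,\frac{1}{U''(\theta_s^0)}. \tag{9.45}
\]
Then, from (9.45),
\[
\sum_{s=1}^{k}\hat p_s\,U'(\theta_s^0)[\theta_s^G(y) + E[G|A_s] - \theta_s^0] = \sum_{s=1}^{k}\hat p_s\,\frac{U'(\theta_s^0)}{U''(\theta_s^0)}\,\frac{q_s}{\hat p_s}\,\frac{Z}{M} = \frac{Z}{M}\sum_{s=1}^{k} q_s\,\frac{U'(\theta_s^0)}{U''(\theta_s^0)}. \tag{9.46}
\]
Also, using (9.45) again:
\begin{align*}
\frac{1}{2}\sum_{s=1}^{k}\hat p_s\,U''(\theta_s^0)[\theta_s^G(y) + E[G|A_s] - \theta_s^0]^2
&= \frac{1}{2}\sum_{s=1}^{k}\hat p_s\,U''(\theta_s^0)\,\frac{Z^2}{M^2}\,\frac{q_s^2}{\hat p_s^2}\,\frac{1}{(U''(\theta_s^0))^2} \\
&= \frac{1}{2}\,\frac{Z^2}{M^2}\sum_{s=1}^{k}\frac{q_s^2}{\hat p_s}\,\frac{1}{U''(\theta_s^0)} = \frac{1}{2}\,\frac{Z^2}{M} \tag{9.47}
\end{align*}
from the definition of $M$. Now we are looking for the $\theta_s^G(y)$, $s = 1,\dots,k$, so that $V_G^B(x - \nu) = V_0(x)$, which implies that approximately
\[
\sum_{\ell} p_\ell\Big[U'(\theta_s^0)(\theta_s^G(y) + g_\ell - \theta_s^0) + \frac{1}{2}U''(\theta_s^0)(\theta_s^G(y) + g_\ell - \theta_s^0)^2\Big] = 0,
\]
and we have
\[
\frac{Z}{M}\sum_{s=1}^{k} q_s\,\frac{U'(\theta_s^0)}{U''(\theta_s^0)} + \frac{1}{2}\sum_{s=1}^{k}\hat p_s\,U''(\theta_s^0)\,\mathrm{var}[G|A_s] + \frac{1}{2}\,\frac{Z^2}{M} = 0 . \tag{9.48}
\]
That is,
\[
Z^2 + 2Z\sum_{s=1}^{k} q_s\,\frac{U'(\theta_s^0)}{U''(\theta_s^0)} + M\sum_{s=1}^{k}\hat p_s\,U''(\theta_s^0)\,\mathrm{var}(G|A_s) = 0 .
\]
Using (9.35) we have
\[
\sum_{s=1}^{k} q_s\,\frac{U'(\theta_s^0)}{U''(\theta_s^0)} = \lambda_1 M .
\]
Also from (9.35),
\[
\hat p_s\,U''(\theta_s^0) = \lambda_1 q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)},
\]
so
\[
Z^2 + 2\lambda_1 MZ + \lambda_1 M\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s) = 0 .
\]
Therefore,
\[
(Z + \lambda_1 M)^2 = \lambda_1^2M^2 - \lambda_1 M\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s)
= \lambda_1^2M^2\left[1 - \frac{1}{\lambda_1 M}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s)\right]. \tag{9.49}
\]
Consequently,
\[
Z + \lambda_1 M = \lambda_1 M\left[1 - \frac{1}{\lambda_1 M}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s)\right]^{1/2}
\cong \lambda_1 M\left[1 - \frac{1}{2}\,\frac{1}{\lambda_1 M}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s)\right].
\]
Then
\[
Z \cong -\frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s). \tag{9.50}
\]
So
\[
Z = y - x + \sum_{s=1}^{k} q_s\,E[G|A_s] \approx -\frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s). \tag{9.51}
\]
As $y = x - \nu^b(G)$,
\[
\nu^b(G) \cong \sum_{s=1}^{k} q_s\,E[G|A_s] + \frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s). \tag{9.52}
\]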
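□

Corollary 9.15 Suppose $U(x) = -e^{-\gamma x}$, $\gamma > 0$. Then
\[
\nu^b(G) \cong \sum_{s=1}^{k} q_s\,E[G|A_s] - \frac{\gamma}{2}\sum_{s} q_s\,\mathrm{var}(G|A_s). \tag{9.53}
\]
Corollary 9.16 Suppose $U(x) = \frac{x^{\alpha}}{\alpha}$, $\alpha < 1$. Then
\[
\frac{U''(\theta_s^0)}{U'(\theta_s^0)} = -\frac{1-\alpha}{x}\left(\frac{\hat p_s}{q_s}\right)^{\frac{1}{\alpha-1}}\sum_{r=1}^{k} q_r\left(\frac{q_r}{\hat p_r}\right)^{\frac{1}{\alpha-1}}. \tag{9.54}
\]
Remark 13.4 Note that $f(\theta_s^0) \cong f(x) + (\theta_s^0 - x)f'(x)$, so
\[
\sum_{s} q_s\,f(\theta_s^0) \cong f(x), \tag{9.55}
\]
since $\sum_{s} q_s = 1$ and $\sum_{s} q_s(\theta_s^0 - x) = 0$.

Corollary 9.17 If
\[
A(\theta_s^0) = -\frac{U''(\theta_s^0)}{U'(\theta_s^0)}
\]
is (approximately) independent of $s$, then from Remark 13.4
\[
\nu^b(G) \cong \sum_{s=1}^{k} q_s\,E[G|A_s] - \frac{1}{2}A(x)\sum_{s=1}^{k} q_s\,\mathrm{var}(G|A_s). \tag{9.56}
\]
Corollary 9.18 If $G$ is attainable, it is $\mathcal F_s$ measurable, so $\mathrm{var}(G|A_s) = 0$ and
\[
\nu^b(G) = \sum_{s=1}^{k} q_s\,E[G|A_s]. \tag{9.57}
\]
Corollary 9.19 If there are no hedging instruments, then
\[
\nu^b(G) \cong E(G) - \frac{1}{2}A(x)\,\mathrm{var}(G). \tag{9.58}
\]
Remark. This is a classic actuarial approximation.

Lemma 9.20 Another approximation is
\[
\nu^b(G) \cong \sum_{s} q_s\,\{U^{-1}(E[U(\theta_s^0 + G)|A_s]) - \theta_s^0\}. \tag{9.59}
\]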
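Proof.
\[
E[U(\theta_s^0 + g_\ell)|A_s] \cong U(\theta_s^0) + U'(\theta_s^0)\,E[G|A_s] + \frac{1}{2}U''(\theta_s^0)\,E[G^2|A_s] = U(\theta_s^0) + \varepsilon_s, \text{ say.}
\]
Now for small $\varepsilon_s$, again using the Taylor expansion:
\begin{align*}
U^{-1}(U(\theta_s^0) + \varepsilon_s) &\cong U^{-1}(U(\theta_s^0)) + \varepsilon_s\,(U^{-1})'(U(\theta_s^0)) + \frac{1}{2}\varepsilon_s^2\,(U^{-1})''(U(\theta_s^0)) + \cdots \\
&= \theta_s^0 + \frac{\varepsilon_s}{U'(\theta_s^0)} + \frac{1}{2}\varepsilon_s^2\left(\frac{-U''(\theta_s^0)}{U'(\theta_s^0)^3}\right) + \cdots \tag{9.60}
\end{align*}
since
\[
(U^{-1})'(U(\theta_s^0)) = \frac{1}{U'(\theta_s^0)}
\quad\text{and}\quad
(U^{-1})''(U(x)) = -\frac{U''(x)}{U'(x)^3}.
\]
In fact, using the chain rule, $U^{-1}(U(x)) = x$, so that $(U^{-1})'(U(x))\,U'(x) = 1$ and therefore $(U^{-1})'(U(x)) = 1/U'(x)$. Differentiating again,
\[
(U^{-1})''(U(x))\,U'(x) = -\frac{1}{U'(x)^2}\,U''(x),
\quad\text{giving}\quad
(U^{-1})''(U(x)) = -\frac{1}{U'(x)^3}\,U''(x).
\]
Therefore, from (9.60):
\begin{align*}
\sum_{s=1}^{k} q_s\,&\{U^{-1}(E[U(\theta_s^0 + G)|A_s]) - \theta_s^0\}
\cong \sum_{s=1}^{k} q_s\left[\frac{\varepsilon_s}{U'(\theta_s^0)} - \frac{1}{2}\varepsilon_s^2\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)^3}\right] \\
&= \sum_{s=1}^{k} q_s\left[E[G|A_s] + \frac{1}{2}\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,E[G^2|A_s] - \frac{1}{2}\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)^3}\Big(U'(\theta_s^0)\,E[G|A_s] + \frac{1}{2}U''(\theta_s^0)\,E[G^2|A_s]\Big)^2\right] \\
&= \sum_{s=1}^{k} q_s\,E[G|A_s] + \frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,E[G^2|A_s] - \frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,E[G|A_s]^2 \\
&\quad - \frac{1}{2}\sum_{s=1}^{k} q_s\left(\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\right)^2 E[G|A_s]\,E[G^2|A_s] - \frac{1}{8}\sum_{s=1}^{k} q_s\left(\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\right)^3 E[G^2|A_s]^2 \\
&\cong \sum_{s=1}^{k} q_s\,E[G|A_s] + \frac{1}{2}\sum_{s=1}^{k} q_s\,\frac{U''(\theta_s^0)}{U'(\theta_s^0)}\,\mathrm{var}(G|A_s). \tag{9.61}
\end{align*}
This follows because, if we assume that the Arrow-Pratt risk aversion $A(\theta_s^0) = -U''(\theta_s^0)/U'(\theta_s^0)$ is small, we can ignore the higher powers of $A(\theta_s^0)$. This last expression (9.61) is the same as the approximation established in Lemma 9.14, and hence Lemma 9.20 follows. □

Remark. The interpretation of Equation (9.59) is this: compute the certainty equivalent of $\theta_s^0 + G$ given $A_s$ and then subtract $\theta_s^0$. This is an attainable claim whose present value is the right-hand side of (9.59).

Corollary 9.21 If $G$ is attainable, then $U^{-1}(E[U(\theta_s^0 + G)|A_s]) = \theta_s^0 + G$, and so
\[
\nu^b(G) = \sum_{s=1}^{k} q_s\,E[G|A_s] = \sum_{s=1}^{k} q_s\,g_s,
\]
where $g_\ell = g_s$ for $\ell\in A_s$.

Corollary 9.22 If there are no tradeable assets, then $\nu^b(G) \cong U^{-1}E[U(G)]$, which agrees with actuarial pricing.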
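For the exponential utility the quality of the mean-variance approximation (9.53) can be examined directly, because the exact bid price is known from Example 5. The sketch below compares the two for a small risk aversion; the model data and function names are hypothetical, and the comparison is only an illustration of the expected second-order accuracy in $\gamma$.

```python
import math

def bid_exact_exp(p, g, q, gamma):
    """Exact bid price: -(1/gamma) sum_s q_s log( sum_{l in A_s} (p_l/p_hat_s) e^{-gamma g_l} )."""
    total = 0.0
    for ps, gs, qs in zip(p, g, q):
        ph = sum(ps)
        total += qs * math.log(sum(pl / ph * math.exp(-gamma * gl)
                                   for pl, gl in zip(ps, gs)))
    return -total / gamma

def bid_approx_exp(p, g, q, gamma):
    """Approximation (9.53): sum_s q_s ( E[G|A_s] - (gamma/2) var(G|A_s) )."""
    total = 0.0
    for ps, gs, qs in zip(p, g, q):
        ph = sum(ps)
        mean = sum(pl * gl for pl, gl in zip(ps, gs)) / ph
        var = sum(pl * (gl - mean) ** 2 for pl, gl in zip(ps, gs)) / ph
        total += qs * (mean - 0.5 * gamma * var)
    return total

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]
g = [[1.0, 0.0], [2.0, 0.5]]
q = [0.5, 0.5]

err_large = abs(bid_exact_exp(p, g, q, 0.10) - bid_approx_exp(p, g, q, 0.10))
err_small = abs(bid_exact_exp(p, g, q, 0.05) - bid_approx_exp(p, g, q, 0.05))
```

Halving $\gamma$ reduces the error by roughly a factor of four, consistent with a leading error term proportional to $\gamma^2$.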
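9.14 AN ALTERNATIVE REPRESENTATION FOR $V_G(x)$

We shall now derive an alternative expression for $V_G(x)$ using our duality formulas. Some generalizations of pricing formulas given by Musiela and Zariphopoulou [197] will then be given. Recall from Theorem 9.3,
\[
V_G(x) = \inf_{y>0}\,[\tilde V_G(y) + xy]. \tag{9.62}
\]
From Definition 4.5,
\[
\tilde V_G(y) = \inf_{Q\in\mathcal M}\left\{E_P\!\left[\tilde U\!\left(\frac{Q}{P}y\right)\right] - yE_Q(G)\right\}. \tag{9.63}
\]
From Equation (9.11), for $0 < z < \infty$,
\[
\tilde U(z) = U(I(z)) - zI(z). \tag{9.64}
\]
Substituting (9.64) in (9.63),
\[
\tilde V_G(y) = \inf_{Q\in\mathcal M}\left\{E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}y\right)\right)\right] - E_P\!\left[\frac{Q}{P}\,y\,I\!\left(\frac{Q}{P}y\right)\right] - yE_Q(G)\right\} \tag{9.65}
\]
and substituting in (9.62),
\[
V_G(x) = \inf_{y>0}\,\inf_{Q\in\mathcal M}\left\{E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}y\right)\right)\right] - yE_Q\!\left[I\!\left(\frac{Q}{P}y\right)\right] - yE_Q(G) + xy\right\}. \tag{9.66}
\]
Reversing the order of the infima:
\[
V_G(x) = \inf_{Q\in\mathcal M}\,\inf_{y>0}\left\{E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}y\right)\right)\right] - yE_Q\!\left[I\!\left(\frac{Q}{P}y\right)\right] - yE_Q(G) + xy\right\}. \tag{9.67}
\]
Given $Q$, the first-order condition for the minimizing $\hat y = \hat y(Q)$ is
\[
E_P\!\left[U'\!\left(I\!\left(\frac{Q}{P}\hat y\right)\right)I'\!\left(\frac{Q}{P}\hat y\right)\frac{Q}{P}\right] - E_Q\!\left[I\!\left(\frac{Q}{P}\hat y\right)\right] - \hat y\,E_Q\!\left[I'\!\left(\frac{Q}{P}\hat y\right)\frac{Q}{P}\right] - E_Q(G) + x = 0 .
\]
That is,
\[
E_Q\!\left[I\!\left(\frac{Q}{P}\hat y\right)\right] = x - E_Q(G). \tag{9.68}
\]
Here we used $U'\big(I\big(\frac{Q}{P}\hat y\big)\big) = \frac{Q}{P}\hat y$ and
\[
E_P\!\left[\frac{Q}{P}\,\frac{Q}{P}\,\hat y\,I'\!\left(\frac{Q}{P}\hat y\right)\right] = \hat y\,E_Q\!\left[\frac{Q}{P}\,I'\!\left(\frac{Q}{P}\hat y\right)\right].
\]
Therefore, with this choice of $\hat y = \hat y(Q)$ the final three terms in (9.67) cancel, so
\[
V_G(x) = \inf_{Q\in\mathcal M}E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y(Q)\right)\right)\right]. \tag{9.69}
\]
Write
\[
\varphi(y) := E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}y\right)\right)\right] - yE_Q\!\left[I\!\left(\frac{Q}{P}y\right)\right] - yE_Q(G) + xy .
\]
Then
\[
\varphi'(y) = -E_Q\!\left[I\!\left(\frac{Q}{P}y\right)\right] - E_Q(G) + x
\quad\text{and}\quad
\varphi''(y) = -E_Q\!\left[\frac{Q}{P}\,I'\!\left(\frac{Q}{P}y\right)\right] > 0,
\]
as $U''(I(y))\,I'(y) = 1$ implies $I'(y) < 0$. Consequently $y\mapsto\varphi(y)$ is convex. Thus (9.68) characterizes the minimization with respect to $y$. We summarize our results as

Theorem 9.23
\[
V_G(x) = \inf_{Q\in\mathcal M}E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y(Q)\right)\right)\right], \tag{9.70}
\]
where $\hat y(Q)$ is the solution of
\[
E_Q\!\left[I\!\left(\frac{Q}{P}\hat y(Q)\right)\right] = x - E_Q(G).
\]
This result provides a new recipe for the asking price for $G$, $\nu = \nu^a(G)$ (and similarly for the bid price $\nu^b(G) = -\nu^a(-G)$). This is: find $\nu$ so that
\[
\inf_{Q\in\mathcal M}E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right)\right] = \inf_{Q\in\mathcal M}E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right)\right], \tag{9.71}
\]
where
\[
E_Q\!\left[I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right] = x + \nu - E_Q(G) \tag{9.72}
\]
and
\[
E_Q\!\left[I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right] = x . \tag{9.73}
\]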
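Example 6 With $U(x) = -e^{-\gamma x}$ $(\gamma > 0,\ x\in\mathbb R)$,
\[
U'(x) = \gamma e^{-\gamma x}, \qquad I(y) = -\frac{1}{\gamma}\log\frac{y}{\gamma}, \qquad U(I(y)) = -\frac{y}{\gamma},
\]
as before. Writing
\[
h(Q|P) = E_Q\!\left[\log\frac{Q}{P}\right],
\]
we can show
\[
\hat y_1(Q) = \gamma\exp[-\gamma(x + \nu - E_Q(G)) - h(Q|P)]
\quad\text{and}\quad
\hat y_2(Q) = \gamma\exp[-\gamma x - h(Q|P)].
\]
Then
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right)\right] = -\exp[-\gamma(x + \nu - E_Q(G)) - h(Q|P)]
\]
and
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right)\right] = -\exp[-\gamma x - h(Q|P)],
\]
so from (9.71):
\[
-\gamma\nu + \sup_{Q\in\mathcal M}\,[\gamma E_Q(G) - h(Q|P)] = -\inf_{Q\in\mathcal M}h(Q|P).
\]
Therefore,
\[
\nu = \sup_{Q\in\mathcal M}\left\{E_Q(G) - \frac{1}{\gamma}\Big[h(Q|P) - \inf_{Q\in\mathcal M}h(Q|P)\Big]\right\} = \sup_{Q\in\mathcal M}\,[E_Q(G) - \theta(Q)],
\]
where
\[
\theta(Q) = \frac{1}{\gamma}\Big[h(Q|P) - \inf_{Q\in\mathcal M}h(Q|P)\Big].
\]
This expression is similar to that in Chapter 1.

Example 7 With $U(x) = \frac{1}{\alpha}x^{\alpha}$ $(0 < \alpha < 1,\ x > 0)$,
\[
U'(x) = x^{\alpha-1}, \qquad I(y) = y^{\frac{1}{\alpha-1}},\ y > 0, \qquad U(I(y)) = \frac{1}{\alpha}\,y^{\frac{\alpha}{\alpha-1}}.
\]
In this case we can easily find
\[
\hat y_1(Q) = \frac{\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha}}{(x + \nu - E_Q(G))^{1-\alpha}}, \tag{9.74}
\]
\[
\hat y_2(Q) = \frac{1}{x^{1-\alpha}}\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha}, \tag{9.75}
\]
where, as noted in Section 9.12.2, $\hat y_1(Q)$ is defined provided $x$ is sufficiently large. Then
\begin{align*}
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right)\right]
&= \frac{1}{\alpha}\,\hat y_1(Q)^{\frac{\alpha}{\alpha-1}}\,E_P\!\left[\left(\frac{P}{Q}\right)^{\frac{\alpha}{1-\alpha}}\right]
= \frac{1}{\alpha}\,\hat y_1(Q)^{\frac{\alpha}{\alpha-1}}\,E_Q\!\left[\frac{P}{Q}\left(\frac{P}{Q}\right)^{\frac{\alpha}{1-\alpha}}\right] \\
&= \frac{1}{\alpha}\,\hat y_1(Q)^{\frac{\alpha}{\alpha-1}}\,E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]
= \frac{1}{\alpha}\,(x + \nu - E_Q(G))^{\alpha}\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha}
\end{align*}
and
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right)\right] = \frac{1}{\alpha}\,x^{\alpha}\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha}.
\]
Then from (9.71),
\[
\inf_{Q\in\mathcal M}\,(x + \nu - E_Q(G))^{\alpha}\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha} = x^{\alpha}\inf_{Q\in\mathcal M}\left(E_Q\!\left[\left(\frac{P}{Q}\right)^{\frac{1}{1-\alpha}}\right]\right)^{1-\alpha}
\]
or
\[
\inf_{Q\in\mathcal M}\,(x + \nu - E_Q(G))^{\alpha}\left(E_P\!\left[\left(\frac{P}{Q}\right)^{\frac{\alpha}{1-\alpha}}\right]\right)^{1-\alpha} = x^{\alpha}\inf_{Q\in\mathcal M}\left(E_P\!\left[\left(\frac{P}{Q}\right)^{\frac{\alpha}{1-\alpha}}\right]\right)^{1-\alpha}.
\]
The characterization of $Q\in\mathcal M$ given in Theorem 9.2 can then be used to calculate these infima.

Example 8 With $U(x) = \log x$, $x > 0$,
\[
U'(x) = \frac{1}{x}, \qquad I(y) = \frac{1}{y}, \qquad U(I(y)) = -\log y .
\]
We can then check that
\[
\hat y_1(Q) = \frac{1}{x + \nu - E_Q(G)}
\quad\text{and}\quad
\hat y_2(Q) = \frac{1}{x}.
\]
Then
\begin{align*}
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right)\right]
&= -E_P\!\left[\log\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right]
= -\log\hat y_1(Q) - E_P\!\left[\log\frac{Q}{P}\right] \\
&= -\log\hat y_1(Q) + E_P\!\left[\log\frac{P}{Q}\right]
= -\log\hat y_1(Q) + h(P|Q) \\
&= \log(x + \nu - E_Q(G)) + h(P|Q).
\end{align*}
Also,
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right)\right] = \log x + h(P|Q).
\]
Therefore, (9.71) becomes
\[
\inf_{Q\in\mathcal M}\,[\log(x + \nu - E_Q(G)) + h(P|Q)] = \inf_{Q\in\mathcal M}\,[\log x + h(P|Q)] = \log x + \inf_{Q\in\mathcal M}h(P|Q).
\]
Example 9 With $U(x) = -(a-x)^2$, $x\le a$,
\[
U'(x) = -2(x-a), \qquad I(y) = a - \frac{y}{2}, \qquad U(I(y)) = -\frac{y^2}{4}.
\]
We can then check that
\[
\hat y_1(Q) = \frac{2[a - \nu - x + E_Q(G)]}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]}
\quad\text{and}\quad
\hat y_2(Q) = \frac{2(a-x)}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]}.
\]
Then
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_1(Q)\right)\right)\right] = -\frac{1}{4}\,E_P\!\left[\left(\frac{Q}{P}\right)^2\right]\cdot\frac{4(a - x - \nu + E_Q(G))^2}{\left(E_P\!\left[\left(\frac{Q}{P}\right)^2\right]\right)^2} = -\frac{(a - x - \nu + E_Q(G))^2}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]}
\]
and
\[
E_P\!\left[U\!\left(I\!\left(\frac{Q}{P}\hat y_2(Q)\right)\right)\right] = \frac{-(a-x)^2}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]}.
\]
Therefore, (9.71) in this case becomes
\[
\max_{Q\in\mathcal M}\frac{(a - x - \nu + E_Q(G))^2}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]} = \max_{Q\in\mathcal M}\frac{(a-x)^2}{E_P\!\left[\left(\frac{Q}{P}\right)^2\right]}.
\]
Again, the characterization of $Q\in\mathcal M$ given in Theorem 9.2 can be used to calculate these maxima.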
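The dual formula of Example 6, $\nu = \sup_{Q\in\mathcal M}[E_Q(G) - \theta(Q)]$, can be illustrated in the finite model, where every $Q\in\mathcal M$ has the form $Q(\ell) = q_s\pi_\ell$ with $\pi$ a conditional distribution on each atom $A_s$. The sketch below makes two assumptions that are stated here rather than derived: the entropy is minimized by $\pi_\ell = p_\ell/\hat p_s$, so that $h(Q|P) - \inf h = \sum_s q_s\sum_{\ell\in A_s}\pi_\ell\log(\pi_\ell\hat p_s/p_\ell)$, and the supremum is attained at the exponentially tilted weights $\pi_\ell\propto p_\ell e^{\gamma g_\ell}$. The model data and function names are hypothetical.

```python
import math
import random

def dual_objective(p, g, q, gamma, pi):
    """E_Q[G] - (1/gamma) * (h(Q|P) - inf h) for Q(l) = q_s * pi[s][j]."""
    value = 0.0
    for ps, gs, qs, cond in zip(p, g, q, pi):
        ph = sum(ps)
        for pl, gl, c in zip(ps, gs, cond):
            value += qs * c * (gl - math.log(c * ph / pl) / gamma)
    return value

def tilted_measure(p, g, gamma):
    """Conditional weights proportional to p_l * exp(gamma * g_l) on each atom."""
    out = []
    for ps, gs in zip(p, g):
        w = [pl * math.exp(gamma * gl) for pl, gl in zip(ps, gs)]
        out.append([wl / sum(w) for wl in w])
    return out

def random_measure(p, rng):
    """A random strictly positive conditional distribution on each atom."""
    out = []
    for ps in p:
        w = [0.01 + rng.random() for _ in ps]
        out.append([wl / sum(w) for wl in w])
    return out

# Hypothetical model (illustrative numbers only).
p = [[0.2, 0.3], [0.1, 0.4]]
g = [[1.0, 0.0], [2.0, 0.5]]
q = [0.5, 0.5]
gamma = 0.7

best = dual_objective(p, g, q, gamma, tilted_measure(p, g, gamma))
closed_form = sum(
    qs * math.log(sum(pl * math.exp(gamma * gl) for pl, gl in zip(ps, gs)) / sum(ps))
    for ps, gs, qs in zip(p, g, q)) / gamma
rng = random.Random(0)
trials = [dual_objective(p, g, q, gamma, random_measure(p, rng)) for _ in range(1000)]
```

The value at the tilted measure agrees with the closed-form asking price $\frac{1}{\gamma}\sum_s q_s\log(\gamma_s/\hat p_s)$, and none of the randomly drawn measures gives a larger value.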
Bibliography
[1] E. Campos A. Garcia and J. Reitzes. Dynamic pricing and learning in the electricity markets. Technical report, University of Virginia, 2003. [2] C. Acerbi and D. Tasche. On the coherence of expected shortfall. Journal of Banking and Finance, 26:1487–1503, 2002. [3] C.W. Anderson, N. Mole and S. Nadarajah. A switching poisson process model for high concentrations in short-range atmospheric dispersion. Atmospheric Environment, 31(6):813–824, 1997. [4] K. Arrow. Essays in the Theory of Risk Bearing. North-Holland, Amsterdam, 1970. [5] J.P. Aubin. Optima and Equilibria. Springer-Verlag, 1998. [6] H. Schumacher B. Roorda and J. Engwerda. Coherent acceptability measures in multiperiod models. Mathematical Finance, to appear in 2005. [7] G. Bakshi, C. Cao and Z. Chen. Empirical performance of alternative option pricing models. J. Fin., 52(5), December 1997. [8] V. Bally and A. Matoussi. Weak solutions for spde’s and backward doubly sde’s. J. of Theoretical Probab., 14:125–164, 2001. [9] Erik Banks, editor. Weather Risk Management: Markets, products and applications. Palgrave, 2002. [10] G. Barles and E. Lesigne. Sde, bsde and pde. [11] M. Barlow. A diffusion model for electricity prices. Mathematical Finance, 12(4):287–298, 2002. [12] O.E. Barndorff-Nielsen and N. Shephard. Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B, 64:253–280, 2002. [13] P. Barrieu and N. El Karoui. Structuration optimale de produits financiers et diversification en présence de sources de risque non-négociables. C. R. de l’Acad. Sci. Paris, Sér. I Math., 336:493–498, 2003.
388
BIBLIOGRAPHY
[14] P. Barrieu and N. El Karoui. Optimal derivatives design under dynamic risk measures. In Mathematics of Finance, Contemporary Mathematics. Amer. Math. Soc., 2004. [15] P. Barrieu and N. El Karoui. Inf-convolution of risk measures and optimal risk transfer. Finance and Stochastics, 9:269–298, 2005. [16] D. Becherer. Rational Hedging and Valuaton with Utility-Based Preferences. PhD thesis, Technical University of Berlin, 2001. [17] V.E. Benes. Existence of optimal strategies based on specific information for a class of stochastic decision problems. SIAM Journal of Computation and Optimization. [18] V.E. Benes. Existence of optimal stochastic control law. SIAM Journal of Control, 9:446–472, 1971. [19] A. Bensoussan. Stochastic Control of Partially Observable Systems. Cambridge University Press, 1992. [20] F.E. Benth and K.H. Karslen. A pde representation of the density of the minimal entropy martingale measure in stochastic volatility markets. Stochastics and Stochastic Reports, 77:109–137, 2005. [21] G. Bernis and M. Jeanblanc. Hedging defaultable derivatives via utility theory, 2002. [22] D. Bernoulli. Specimen theoriae novae de mensura sortis. Commentarii Academiae Sientiarum Imperialis Petropolitanae, (and translated in Econometrica), 5 (and 22):175–192 and 23–56, 1738 (and 1954). [23] S. Biagini and M. Frittelli. On the super replication price of unbounded claims. The Annals of Applied Probability, 14:1970–1991, 2004. [24] T.R. Bielecki and S.R. Pliska. Risk sensitive dynamic asset management. App. Math. Optim., 39:337–360, 1999. [25] T.R. Bielecki and M. Rutkowski. Credit Risk: Modeling, Valuation and Hedging. Springer-Verlag, 2002. [26] J. Bion-Nadal. Conditional risk measure and robust representation of convex conditional risk measures, 2005. [27] J. Bion-Nadal. Pricing functions and risk measures in incomplete markets, 2005. [28] J.M. Bismut. Conjugate convex functions in optimal stochastic control. J. Math. Anal. Appl., 44:384–404, 1973. [29] T. 
Bjork and C. Landen. On the term structure of futures and forward prices. Technical report, Stockholm School of Economics, 2001.
[30] O. Bobrovnytska and M. Schweizer. Mean-variance hedging and stochastic control: Beyond the Brownian setting. IEEE Transactions on Automatic Control, 49:396–408, 2004. [31] F. Borch. Equilibrium in a reinsurance market. Econometrica, 30:424–444, 1962. [32] J. Borwein and Q. Zhu. Variational Methods in Convex Analysis and Techniques of Variational Analysis. Springer-Verlag, 2004. [33] N. Branger. Pricing derivative securities using cross entropy—an economic analysis. International Journal of Theoretical and Applied Finance, 7:63–82, 2004. [34] S. Brendle and R. Carmona. Hedging in partially observable markets. Technical report, Princeton University, 2005. [35] H. Bühlmann. Mathematical Methods in Risk Theory. Springer-Verlag, 1970. [36] H. Bühlmann. An economic premium principle. Astin Bulletin, 11:52–60, 1980. [37] H. Bühlmann. The general economic premium principle. Astin Bulletin, 14:13–21, 1984. [38] H. Bühlmann and W.S. Jewell. Optimal risk exchanges. Astin Bulletin, 10:243–262, 1979. [39] P. Brockett C. Yang and M. Wang. Indifference pricing of weather derivatives. Technical report. [40] J.Y. Campbell and L.M. Viceira. Strategic Asset Allocation: Portfolio Choice for Long Term Investors. OUP, 2002. [41] R. Carmona. Statistical Analysis of Financial Data. Springer-Verlag, New York, N.Y., 2004. [42] R. Carmona and A. Danilova. Hedging financial instruments written on non-tradable indexes. Technical report, SIAM 50th Anniversary and Annual Meeting, Philadelphia, 2002. [43] R. Carmona and P. Diko. Pricing precipitation based derivatives. International Journal of Theoretical and Applied Finance, 7:959–988, 2005. [44] R. Carmona and V. Durrleman. Pricing and hedging spread options. SIAM Review, 45(4):627–685, 2003. [45] R. Carmona and M. Ludkovski. Spot convenience yield models for energy assets. In Mathematics of Finance, AMS-SIAM Joint Summer Conferences, Providence, RI, 2004.
[46] R. Carmona and M. Ludkovski. Commodity forwards with partial observation and exponential utility. International Journal of Theoretical and Applied Finance, to appear, 2005. [47] R. Carmona and M. Tehranchi. Interest Rate Models: An Infinite Dimensional Stochastic Analysis Perspective. Springer-Verlag, New York, N.Y., 2006. [48] P. Carr and D. Madan. Optimal positioning in derivative securities. Quantitative Finance, 1:19–37, 2001. [49] J.H. Cochrane. Asset Pricing. Princeton University Press, 2001. [50] G. Cohen. Convexité et optimisation, 2000. [51] G.M. Constantinides and T. Zariphopoulou. Bounds on prices of contingent claims in an intertemporal economy with proportional transactions costs and general preferences. Finance and Stochastics, 3:345–369, 1999. [52] R. Cont. Model uncertainty and its impact on the pricing of derivative instruments. Mathematical Finance, to appear in 2005. [53] T.E. Copeland and J.F. Weston. Financial Theory and Corporate Policy. Addison-Wesley, 1992. [54] P.S.P. Cowpertwait. A generalized point process model for rainfall. Proceedings: Mathematical and Physical Sciences, 447(1929):23–37, October 1994. [55] D. Crisan, J. Gaines and T. Lyons. Convergence of a branching particle method to the solution of the Zakai equation. SIAM Journal of Applied Mathematics, 58(5):1568–1590, October 1998. [56] J. Cvitanić and I. Karatzas. Convex duality in constrained portfolio optimization. Annals of Applied Probability, 2(4):767–818, 1992. [57] J. Cvitanić and I. Karatzas. Backward stochastic differential equations with reflection and Dynkin games. Annals of Probability, 24:2024–2056, 1996. [58] J. Cvitanić and I. Karatzas. On dynamic risk measures. Finance and Stochastics, 3:451–482, 1999. [59] J. Cvitanić, W. Schachermayer and H. Wang. Utility maximization in incomplete markets with random endowment. Finance and Stochastics, 5(2):259–272, 2001. [60] M. Davis. Option pricing in incomplete markets. In M.A.H. Dempster and S.R.
Pliska, editors, Mathematics of Derivative Securities, 216–227. Cambridge University Press, 1997. [61] M.H.A. Davis. Option valuation and basis risk. In T.E. Djaferis and L.C. Shick, editors, System Theory: Modelling, Analysis and Control. Academic Publishers, 1999.
[62] M.H.A. Davis. Pricing weather derivatives by marginal value. Quantitative Finance, 1:1–4, 2001. [63] M.H.A. Davis and A.R. Norman. Portfolio selection with transactions costs. Mathematics of Operations Research, 15:676–713, 1990. [64] M.H.A. Davis, V.G. Panas and T. Zariphopoulou. European option pricing with transactions costs. SIAM Journal on Control and Optimization, 31:470–493, 1993. [65] M.H.A. Davis and T. Zariphopoulou. American options and transaction fees. In Mathematical Finance IMA Volumes in Mathematics and Its Applications. Springer-Verlag, 1995. [66] F. Delbaen. Coherent Risk Measures. Lecture Notes. Cattedra Galileiana. Scuola Normale Superiore. Classe di Scienze, Pisa, 2000. [67] F. Delbaen and W. Schachermayer. Arbitrage and free lunch with bounded risk for unbounded continuous processes. Mathematical Finance, 4:343–348, 1994. [68] F. Delbaen and W. Schachermayer. The variance-optimal martingale measure for continuous processes. Bernoulli, 2:81–105, 1996. [69] C. Dellacherie and P.A. Meyer. Probabilités et Potentiel. Chaps. I-IV. Hermann, 1975. [70] C. Dellacherie and P.A. Meyer. Probabilités et Potentiel. Chaps. V-VIII. Hermann, 1980. [71] O. Deprez and H.U. Gerber. On convex principles of premium calculation. Insurance: Mathematics and Economics, 4:179–189, 1985. [72] J. Detemple and S. Sundaresan. Nontraded asset valuation with portfolio constraints: a binomial approach. Review of Financial Studies, 12:835–872, 1999. [73] K. Detlefsen and G. Scandolo. Conditional and dynamic convex risk measures. Finance and Stochastics, to appear. [74] D. Duffie, J. Pan and K. Singleton. Transform analysis and option pricing for affine jump-diffusions. Econometrica, 68:1343–1376, 2000. [75] D. Duffie and H.R. Richardson. Mean-variance hedging in continuous time. Annals of Applied Probability, 1:1–15, 1991. [76] D. Duffie and K. Singleton. Credit Risk: Pricing, Measurement and Management. Princeton University Press, 2003. [77] P. Protter E. Cinlar, J.
Jacod and M.J. Sharpe. Semimartingales and Markov processes. Zeitschrift Wahrscheinlichkeitstheorie, 54:161–219, 1980.
[78] N. El-Karoui and M.C. Quenez. Non-linear pricing theory and backward stochastic differential equations in financial mathematics. Vol. 1656 of Lecture Notes in Mathematics, 191–246. Springer-Verlag, 1996. [79] N. El-Karoui and M.C. Quenez. Dynamic programming and pricing of contingent claims in an incomplete market. 1995. [80] R.J. Elliott. Stochastic Calculus and Applications. Vol. 18 of Application of Math. Springer-Verlag, 1982. [81] Th. Rheinländer, D. Sampieri, M. Schweizer, F. Delbaen, P. Grandits, and Ch. Stricker. Exponential hedging and entropic penalties. Math. Finance, 12:99–124, 2002. [82] W. Fleming and M. Soner. Controlled Markov Processes and Viscosity Solutions. Springer-Verlag, 1993. [83] W. H. Fleming and R. W. Rishel. Deterministic and Stochastic Optimal Control. Springer-Verlag, 1975. [84] W.H. Fleming and S.J. Sheu. Risk-sensitive control and an optimal investment model. Mathematical Finance, 10:197–213, 2000. [85] W.H. Fleming and S.J. Sheu. Risk-sensitive control and an optimal investment model ii. Annals of Applied Probability, 12:730–767, 2002. [86] H. Föllmer and Y. Kabanov. Optional decomposition and Lagrange multipliers. Finance and Stochastics, 2:69–81, 1998. [87] H. Föllmer and D. Kramkov. Optional decompositions under constraints. Probability Theory and Related Fields, 109:1–25, 1997. [88] H. Föllmer and P. Leukert. Efficient hedging: cost versus shortfall risk. Finance and Stochastics, 1999. [89] H. Föllmer and A. Schied. Convex measures of risk and trading constraints. Finance and Stochastics, 6:429–447, 2002. [90] H. Föllmer and A. Schied. Stochastic Finance: An Introduction in Discrete Time. De Gruyter Studies in Mathematics. 2002, second revised edition 2004. [91] H. Föllmer and M. Schweizer. Hedging of contingent claims under incomplete information. In M.H.A. Davis and R.J. Elliott, editors, Applied Stochastic Analysis, 389–414. Gordon and Breach, 1990. [92] J.-P. Fouque, G. Papanicolaou and R. Sircar.
Derivatives in Financial Markets with Stochastic Volatility. Cambridge University Press, 2000. [93] J.-P. Fouque, G. Papanicolaou, R. Sircar and K. Solna. Multiscale stochastic volatility asymptotics. SIAM J. Multiscale Modeling and Simulation, 2(1):22–42, 2003.
[94] R. Frey. Derivative asset analysis in models with level-dependent and stochastic volatility. CWI Quarterly, 10:1–34, 1997. [95] R. Frey and C. Sin. Bounds on european option prices under stochastic volatility. Mathematical Finance, 9:97–116, 1999. [96] M. Frittelli. Introduction to a theory of value coherent with the no-arbitrage principle. Finance and Stochastics, 4:275–297, 2000. [97] M. Frittelli. The minimal entropy martingale measure and the valuation problem in incomplete markets. Mathematical Finance, 10(1):39–52, 2000. [98] M. Frittelli and E.R. Gianin. Putting order in risk measures. Journal of Banking and Finance, 26:1473–1486, 2002. [99] M. Frittelli and E.R. Gianin. Dynamic convex risk measures in risk measures for the 21st century. Wiley Finance, John Wiley & Sons, 2004. [100] M. Frittelli and G. Scandolo. Risk measures and capital requirement for processes, 2005. [101] T. Fujiwara and Y. Miyahara. The minimal entropy martingale measure for geometric Lévy processes. Finance Stochast., 27:509–531, 2003. [102] R. Buckdahn, G. Barles, and E. Pardoux. Backward stochastic differential equations and integral-partial differential equations. Stochastics and Stochastics Reports, 60:57–83, 1997. [103] H. Geman, N. El Karoui and J-C Rochet. Changes of numéraire, changes of probability and option pricing. Journal of Applied Probability, 32:443–458, 1995. [104] H. Geman and A. Roncoroni. Understanding the fine structure of electricity prices. Journal of Business, 79, 2006. [105] H.U. Gerber. An introduction to mathematical risk theory. Huebner Foundation for Insurance Education, 1979. [106] H.U. Gerber. Risk exchange induced by an external agent. In Transactions of the International Congress of Actuaries, 385–392. 1980. [107] E.R. Gianin. Some examples of risk measures via g-expectations. Technical report, Universita di Milano Bicocca (Italy). [108] R. Gibson and E. S. Schwartz. Stochastic convenience yield and the pricing of oil contingent claims. 
Journal of Finance, XLV(3):959–976, 1990. [109] T. Goll and L. Rüschendorf. Minimax and minimal distance martingale measures and their relationship to portfolio optimisation. Finance and Stochastics, 5:557–581, 2001.
[110] P. Grandits and T. Rheinländer. On the minimal entropy martingale measure. The Annals of Probability, 30(3):1003–1038, 2002. [111] M.R. Grasselli and T.R. Hurd. Indifference pricing and hedging in stochastic volatility models. Technical report, McMaster University, 2004. [112] V.K. Gupta and E.C. Waymire. Multiscaling properties of spatial rainfall and river flow distributions. Journal of Geophysical Research, 95(D3):1999– 2010, 1990. [113] V.K. Gupta and E.C. Waymire. A statistical analysis of mesoscale rainfall as a random cascade. Journal of Applied Meteorology, 32:251–267, 1994. [114] S. Hamadène. Equations différentielles stochastiques rétrogrades: Le cas localement lipschitzien. Ann. Inst. Henri Poincaré, 32:645–659, 1996. [115] S. Hamadène. Reflected bsde’s with discontinuous barrier and applications. Stoch. and Stoc. Reports, 74:571–596, 2002. [116] S. Hamadène and J.-P. Lepeltier. Backward equations, stochastic control and zero-sum stochastic differential games. Stochastic and Stochastic Reports, 54:221–231, 1995. [117] S. Hamadène and J.-P. Lepeltier. Zero-sum stochastic differential games and backward equations. Systems and Control Letters, 24:259–263, 1995. [118] U.G. Haussmann and J.P. Lepeltier. On the existence of optimal controls. SIAM J. Control Optim., 28:851–902, 1990. [119] V. Henderson. Valuation of claims on nontraded assets using utility maximization. Mathematical Finance, 12(4):351–373, 2002. [120] V. Henderson. Valuing the option to invest in an incomplete market. Technical report, Princeton University, 2004. [121] V. Henderson. Analytical comparisons of option prices in stochastic volatility models. Mathematical Finance, 15:49–59, 2005. [122] V. Henderson. Explicit solutions to an optimal portfolio choice problem with stochastic income. Journal of Economic Dynamics and Control, 29:1237– 1266, 2005. [123] V. Henderson. The impact of the market portfolio on the valuation, incentives and optimality of executive stock options. 
Quantitative Finance, 5:1–13, 2005. [124] V. Henderson. Executive exercise explained: Patterns for executive stock options. preprint, May 2006. [125] V. Henderson and D. Hobson. Substitute hedging. RISK, 15(5):71–75, 2002.
[126] S.L. Heston. A closed-form solution for options with stochastic volatility and applications to bond and currency options. Review of Financial Studies, 6:327–343, 1993. [127] J.B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of Convex Analysis. Springer-Verlag, Grundlehren Text Edition, 2001. [128] D. Hobson and V. Henderson. Real options with constant relative risk aversion. Journal of Economic Dynamics and Control, 27:329–355, 2002. [129] D.G. Hobson. Stochastic volatility, correlation and the q-optimal measure. Mathematical Finance, 14:537–556, 2004. [130] D.G. Hobson. Bounds for the utility indifference prices of non-traded assets in incomplete markets. Technical report, 2005. [131] S. Hodges and A. Neuberger. Optimal replication of contingent claims under transactions costs. Review of Futures Markets, 8:222–239, 1989. [132] Y. Hu and X.Y. Zhou. Constrained stochastic lq control with random coefficients and application to mean-variance portfolio selection. Preprint, 2004. [133] C. Huang and R.H. Litzenberger. Foundations for Financial Economics. North-Holland, 1988. [134] F. Hubalek and W. Schachermayer. The limitations of no-arbitrage arguments for real options. International Journal of Theoretical and Applied Finance, 4:361–373, 2001. [135] J. Hull and A. White. The pricing of options on assets with stochastic volatilities. Journal of Finance, 42(2):281–300, 1987. [136] A. İlhan, M. Jonsson and R. Sircar. Singular perturbations for boundary value problems arising from exotic options. SIAM Journal on Applied Mathematics, 64(4):1268–1293, 2004. [137] A. İlhan, M. Jonsson and R. Sircar. Optimal investment with derivative securities. Finance and Stochastics, 9:585–595, 2005. [138] A. İlhan and R. Sircar. Optimal static-dynamic hedges for barrier options. Mathematical Finance, 16(2):359–385, 2006. [139] D. Kramkov, J. Hugonnier, and W. Schachermayer. On utility based pricing of contingent claims in incomplete markets.
Mathematical Finance, 15:203–212, 2004. [140] S. Jaimungal and S. Nayak. On valuing equity-linked insurance and reinsurance contracts, 2004.
[141] W.S. Jevons. The Theory of Political Economy. Reprinted Pelican Books, London, 1970, [1871]. [142] L. Jiang. Non-linear—g-expectation theory and its applications in finance. PhD Thesis, Shandong University, Jinan, 2005. [143] J. Lim. A numerical algorithm for indifference pricing in incomplete markets. Technical report, University of Texas, Austin, 2005. [144] M. Jonsson and R. Sircar. Optimal investment problems and volatility homogenization approximations. In A. Bourlioux, M. Gander and G. Sabidussi, editors, Modern Methods in Scientific Computing and Applications, volume 75 of NATO Science Series II, 255–281. Kluwer, 2002. [145] M. Jonsson and R. Sircar. Partial hedging in a stochastic volatility environment. Mathematical Finance, 12(4):375–409, October 2002. [146] Y.M. Kabanov and C. Stricker. On the optimal portfolio for the exponential utility maximization: remarks to the six-author. Mathematical Finance, 12:125–134, 2002. [147] A. Kadam, P. Lakner and A. Srinivasan. Executive stock options: value to the executive and cost to the firm. Technical report. Preprint, 2003. [148] M. Kahl, J. Liu and F. Longstaff. Paper millionaires: How valuable is stock to a stockholder who is restricted from selling it? Journal of Financial Economics, 67:385–410, 2003. [149] J. Kallsen. Derivative pricing based on local utility maximisation. Finance and Stochastics, 6:115–140, 2002. [150] I. Karatzas and S. Kou. On the pricing of contingent claims under constraints. Annals of Applied Probability, 6(2):321–369, 1996. [151] I. Karatzas, J.P. Lehoczky, S.E. Shreve and G. Xu. Martingale and duality methods for utility maximization in an incomplete market. SIAM Journal of Control and Optimization, 29(3):702–730, 1991. [152] I. Karatzas and S. Shreve. Brownian Motion and Stochastic Calculus. Springer-Verlag, 1987. [153] I. Karatzas and S. Shreve. Methods of Mathematical Finance. Applications of Mathematics, Vol 39. Springer-Verlag, New York, 1998. [154] N. El Karoui. 
Les aspects probabilistes du contrôle stochastique. In Ecole d’été de Saint-Flour, vol. 876 of Lecture Notes in Mathematics, pages 73–238. Springer-Verlag. [155] N. El Karoui and S. Hamadène. Bsde and risk-sensitive control, zero-sum and non-zero-sum game problems of stochastic functional differential equations. Stochastic Processes and their Applications, 107:145–169, 2003.
[156] N. El Karoui and R. Rouge. Pricing via utility maximization and entropy. Mathematical Finance, 10(2):259–276, 2000. [157] N. Kazamaki. Continuous Exponential Martingales and BMO. Vol. 1579 of Lecture Notes in Mathematics. Springer-Verlag, 1994. [158] S. Kloeppel and M. Schweizer. Dynamic utility indifference valuation via convex risk measures, 2005. [159] M. Kobylanski. [160] M. Kohlmann and X.Y. Zhou. Relationship between backward stochastic differential equations and stochastic controls: A linear-quadratic approach. SIAM Journal on Control and Optimization, 38:1392–1407, 2000. [161] J. Komlos. A generalisation of a problem of Steinhaus. Acta Math. Acad. Sci. Hungar., 18:217–229, 1967. [162] D. Kramkov and W. Schachermayer. The asymptotic elasticity of utility functions and optimal investment in incomplete markets. Annals of Applied Probability, forthcoming. [163] D. Kramkov and M. Sirbu. Sensitivity analysis of utility based prices and risk tolerance wealth processes. Annals of Applied Probability, 9(3):904–950, 1999. [164] N. V. Krylov. Controlled Diffusion Processes. Springer-Verlag, 1980. [165] H. Kunita. Stochastic differential equations and stochastic flows of diffeomorphisms. In Ecole d’été de Probabilité de Saint-Flour, Vol. 1097 of Lecture Notes in Mathematics, pages 143–303. Springer-Verlag, 1982. [166] S. Kusuoka. A remark on default risk models. Advances in Mathematical Economics, 1:69–82, 1999. [167] G. Turmuhambetova, L. Hansen, T. J. Sargent, and N. Williams. Robust control and model misspecification, 2004. [168] P. Lakner. Optimal trading strategy for an investor: the case of partial information. Stochastic Processes and their Applications, 76:77–97, 1998. [169] J.-M. Lasry and P.-L. Lions. Contrôle stochastique avec informations partielles et applications à la finance. C. R. Acad. Sci. Paris, Série I, 328:1003–1010, 1999. [170] L. LeCam. A stochastic description of precipitation. In J.
Neyman, ed., Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 3:165–186, 1961. [171] J.P. Lepeltier and J. San Martin. Backward stochastic differential equations with continuous coefficient. Statist. Probab. Lett., 32:425–430, 1997.
[172] J.P. Lepeltier and J. San Martin. Existence of bsde with superlinear quadratic coefficient. Stochastics and Stochastics Reports, 63:227–260, 1998. [173] A. Lim. Hedging default risk. Preprint, 2004. [174] Liptser and Shiryaev. Statistics of Random Processes. (2d ed). Springer-Verlag, 2001. [175] J. Liu and J. Pan. Dynamic derivatives strategies. Journal of Financial Economics, 69:401–430, 2003. [176] M. Santacroce, M. Mania, and R. Tevzadze. A semimartingale bsde related to the minimal entropy martingale measure. Finance and Stochastics, 7:385–402, 2003. [177] J. Ma and J. Yong. Forward-backward stochastic differential equations and their applications. Vol. 1702 of Lecture Notes in Math. Springer-Verlag, 1999. [178] M. Mania. A general problem of an optimal equivalent change of measure and contingent claim pricing in an incomplete market. Stochastic Processes and their Applications, 90:19–42, 2000. [179] M. Mania and M. Schweizer. Dynamic exponential utility indifference valuation. Annals of Applied Probability, to appear. [180] M. Mania and R. Tevzadze. Backward stochastic pde and imperfect hedging. International Journal of Theoretical and Applied Finance, 6:663–692, 2003. [181] M. Mania and R. Tevzadze. A unified characterization of q-optimal and minimal entropy martingale measures by semimartingale backward equation. Georgian Mathematical Journal, 10(2):289–310, 2003. [182] M. Manoliu and S. Tompaidis. Energy futures prices: term structure models with Kalman filter estimation. Applied Mathematical Finance, 9(1):21–43, 2002. [183] W. Margrabe. The value of an option to exchange one asset for another. Journal of Finance, 33:177–186, 1978. [184] A. Mas-Colell, M.D. Whinston and J.R. Green. Microeconomic Theory. OUP, 1995. [185] J. Mason. Numerical weather prediction. Proceeding of Royal Society in London. Series A, 407:51–60, 1986. [186] R.C. Merton. Lifetime portfolio selection under uncertainty: the continuous-time case. Rev. Econom. Statist., 51:247–257, 1969.
[187] R.C. Merton. Optimum consumption and portfolio rules in a continuous-time model. J. Economic Theory, 3(1/2):373–413, Jan./March 1971.
[188] R.C. Merton. Continuous Time Finance. Basil Blackwell Ltd UK, 1992. [189] T. Møller. Indifference pricing of insurance contracts in a product space model. Finance and Stochastics, 7:197–217, 2003. [190] M. Monoyios. Efficient option pricing with transactions costs. Journal of Computational Finance, 7(1), 2003. [191] M. Monoyios. Option pricing with transactions costs using a markov chain approximation. Journal of Economic Dynamics and Control, 28(5):889–913, 2004. [192] C. Munk. The valuation of contingent claims under portfolio constraints: reservation buying and selling prices. European Finance Review, 3:347–388, 1999. [193] C. Munk. Optimal consumption/investment policies with undiversifiable income risk and liquidity constraints. Journal of Economic Dynamics and Control, 24:1315–1343, 2000. [194] K. Murphy. Executive compensation. In O. Ashenfelter and D. Card, editors, Handbook of Labor Economics. North-Holland, 1999. [195] M. Musiela and T. Zariphopoulou. Optimal investment and pricing in incomplete markets: one-period binomial case. Preprint, 2002. [196] M. Musiela and T. Zariphopoulou. Pricing and risk management of derivatives written on non-traded assets. Technical Report, 2002. [197] M. Musiela and T. Zariphopoulou. Backward indifference valuation in incomplete binomial models. Technical report, 2003. [198] M. Musiela and T. Zariphopoulou. An example of indifference prices under exponential preferences. Finance and Stochastics, 8(2):229–239, 2004. [199] M. Musiela and T. Zariphopoulou. A valuation algorithm for indifference prices in incomplete markets. Finance and Stochastics, 8(3):399–414, 2004. [200] E. Pardoux, N. El Karoui, and M.C. Quenez. Reflected backward sdes and american options, in numerical methods in finance. In Publ. Newton Inst., pages 215–231. Cambridge Univ. Press, Cambridge, 1997. [201] E. Pardoux, S. Peng, N. El-Karoui, C.Kapoudjian, and M.-C. Quenez. Reflected solutions of backward sde’s and related obstacle problems for pde’s. 
Annals of Probability, 1997. [202] S. Peng, N. El-Karoui, and M.-C. Quenez. Backward stochastic differential equations in finance. Mathematical Finance, 1997.
[203] H. Nagai. Risk-sensitive dynamic asset management with partial information. In T. Hida and B.S. Rajput et al., editors, Stochastics in Finite and Infinite Dimension. A Volume in Honor of G. Kallianpur, 321–340. Birkhäuser, Boston, MA, 2000. [204] X. Bay, O. Roustant, J.P. Laurent, and L. Carraro. A bootstrap approach to price the uncertainty of weather derivatives. Technical report, 2003. [205] A. Oberman and T. Zariphopoulou. Pricing early exercise contracts in incomplete markets. Computational Management Science, 2003. To appear. [206] C. Onof, R.E. Chandler, A. Kakou, P. Northrop, H.S. Wheater and V. Isham. Rainfall modelling using Poisson-cluster processes: A review of developments. Stochastic Environmental Research and Risk Management, 14:384–411, 2000. [207] M.P. Owen. Utility based optimal hedging in incomplete markets. The Annals of Applied Probability, 12(2):691–709, 2002. [208] J.M. Eber, P. Artzner, F. Delbaen, and D. Heath. Coherent measures of risk. Mathematical Finance, 9:203–228, 1999. [209] J.M. Eber, P. Artzner, F. Delbaen and D. Heath. Coherent multiperiod risk adjusted values and Bellman’s principle, 2004. [210] J. Memin, P. Briand, F. Coquet, and S. Peng. A converse comparison theorem for bsdes and related properties of g-expectations. Electron. Comm. Probab., 5:101–117, 2000. [211] H. Geman, P. Carr, and D. Madan. Pricing and hedging in incomplete markets. Journal of Financial Economics, 62:131–167, 2001. [212] F. Delbaen, P. Cheridito, and M. Kupper. Coherent and convex risk measures for bounded càdlàg processes. Stochastic Processes and their Applications, 112:1–22, 2004. [213] F. Delbaen, P. Cheridito, and M. Kupper. Coherent and convex risk measures for unbounded càdlàg processes, 2004. [214] F. Delbaen, P. Cheridito, and M. Kupper. Dynamic monetary risk measures for bounded discrete-time processes, 2004. [215] E. Pardoux. Bsdes, weak convergence and homogenization of semilinear pdes.
In Nonlinear Analysis, Differential Equations and Control (Montreal, QC, 1998), volume 528 of NATO Sci. Ser. C Math. Phys. Sci., 503–549. Kluwer Acad. Publ., 1999. [216] E. Pardoux and S. Peng. Adapted solutions of backward stochastic differential equations. Systems and Control Letters, 14:51–61, 1990.
[217] E. Pardoux and S. Peng. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic Partial Differential Equations and Their Applications. Vol. 176 of Lect. Notes in Control Inf. Sci., 200–217. Springer-Verlag, 1992. [218] S. Peng. Backward sde and related g-expectation. In Backward Stochastic Differential Equations. Vol. 364 of Pitman Res. Notes Math. Ser., 141–159. Longman, Harlow, 1997. [219] S. Peng. Open problems on backward stochastic differential equations. In Control of Distributed Parameter and Stochastic Systems (Hangzhou, 1998), 265–273. Kluwer, 1999. [220] S. Peng. Nonlinear Expectations, Nonlinear Evaluations and Risk Measures. Lecture Notes in Mathematics 1856, Springer-Verlag, 2004. [221] H. Pham. Smooth solutions to optimal investment models with stochastic volatilities and portfolio constraints. Applied Mathematics and Optimization, 46:55–78, 2002. [222] S. Pliska. Introduction to Mathematical Finance. Blackwell, U.K., 1997. [223] J. Porter. Evolution of the global weather derivatives market. In The History of Risk Management. 2004. [224] J. Pratt. Risk aversion in the small and in the large. Econometrica, 32:122– 136, 1964. [225] M.C. Quenez. Stochastic control and bsdes. In Backward Stochastic Differential Equations. Vol. 364 of Pitman Res. Notes Math. Ser., 83–99. Longman, Harlow, 1997. [226] A. Raviv. The design of an optimal insurance policy. American Economic Review, 69:84–96, 1979. [227] R. Rees. The theory of principal and agent, part i. Bulletin of Economic Research, 37:1:3–26, 1985. [228] R. Rees. The theory of principal and agent, part ii. Bulletin of Economic Research, 37:2:75–95, 1985. [229] F. Riedel. Dynamic coherent risk measures. Stochastic Processes and their Applications. [230] M. Jeanblanc R.J. Elliott and M. Yor. On models of default risk. Mathematical Finance, 2000. [231] R.T. Rockafellar. Convex Analysis. Princeton Landmarks in Mathematics. 1970.
[232] R.T. Rockafellar and S. Uryasev. Optimization of conditional value at risk. Journal of Risk, 2:21–41, 2000. [233] I. Rodriguez-Iturbe, D.R. Cox and V. Isham. Some models for rainfall based on stochastic point processes. Proceeding of the Royal Society of London. Series A., 410(1839):269–288, April 1987. [234] I. Rodriguez-Iturbe, D.R. Cox and V. Isham. A point process model for rainfall: Further developments. Proc. R. Soc. London A, 417:283–298, 1988. [235] L.C.G. Rogers. Duality in constrained optimal investment and consumption problems: a synthesis. In Paris Princeton Lectures in Mathematical Finance. Vol. 1814 of Lect. Notes in Math., 101–137. 2003. [236] S. Rong. On solutions of backward stochastic differential equations with jumps and applications. Stochastic Processes and their Applications, 66:209– 236, 1997. [237] R. Rouge and N. El Karoui. Pricing via utility maximization and entropy. Mathematical Finance, 10:259–276, 2000. [238] M. Royer. Equations différentielles stochastiques rétrogrades et martingales non-linéaires. Doctoral thesis, 2003. [239] W.J. Runggaldier. Estimation via stochastic filtering in financial market models. In Mathematics of Finance, AMS-SIAM Joint Summer Conferences, Providence, RI, 2004. [240] R.W.R. Darling and E. Pardoux. Backward sde with random terminal time and applications to semilinear elliptic pde. Ann. Probability, 25:1135–1159, 1997. [241] J.P. Lepeltier, S. Hamadène, and A. Matousi. Double barrier backward sdes with continuous coefficient. Vol. 364 of Pitman Research Notes in Mathematics Series. Cambridge University Press, 1997. [242] G. Scandolo. Risk measures in a dynamic setting. PhD thesis, Milano University, 2003. [243] W. Schachermayer. Optimal investment in incomplete markets when wealth may become negative. Annals of Applied Probability, 11:694–734, 2001. [244] W. Schachermayer. Optimal investment in incomplete financial markets. In Mathematical Finance Bachelier Congress 2000, 427–462. 
Springer-Verlag, 2002. [245] W. Schachermayer. Introduction to the mathematics of financial markets. In Pierre Bernard, editor, Lectures on Probability Theory and Statistics, Saint-Flour Summer School 2000, number 1816 in Lecture Notes in Mathematics, 111–177. Springer-Verlag, 2003.
[246] A. Schied. Risk Measures and Robust Optimization Problems. Springer-Verlag, Lecture Notes, to appear. [247] Ph.J. Schönbucher. Credit derivatives pricing models. Wiley Finance, 2003. [248] E. Schwartz. The stochastic behavior of commodity prices: implications for valuation and hedging. Journal of Finance, LII(3):922–973, 1997. [249] M. Schweizer. Approximation pricing and the variance optimal martingale measure. Annals of Probability, 24:206–236, 1996. [250] J. Sekine. An approximation for exponential hedging. In Proceedings of the Symposium Stochastic Analysis and Related Topics. RIMS, Tokyo, 2002. [251] J. Sekine. Power-utility maximization for linear Gaussian factor model under partial information. Technical report, Osaka University, 2003. [252] R. Sircar and T. Zariphopoulou. Bounds and asymptotic approximations for utility prices when volatility is random. SIAM J. Control & Optimization, 43:1328–1353, 2005. [253] J. Smith and K.F. McCardle. Valuing oil properties: integrating option pricing and decision analysis approaches. Operations Research, 46:198–217, 1998. [254] J.E. Smith and R.F. Nau. Valuing risky projects: option pricing theory and analysis. Management Science, 41(5):795–816, 1995. [255] E.M. Stein and J.C. Stein. Stock-price distributions with stochastic volatility: an analytic approach. Review Financial Studies, 4:727–752, 1991. [256] S.F. Stoikov. Option pricing from the point of view of a trader. [257] Ch. Stricker. Indifference pricing with exponential utility. In Progress in Probability, vol. 58, 323–328. Birkhäuser, 2004. [258] M. Tehranchi. Explicit solution of some utility maximization problems in incomplete markets. Stochastic Processes and Their Applications, 1568–1590, October 2004. [259] L. Teplá. Optimal hedging and valuation of non-traded assets. European Finance Review, 4:221–251, 2000. [260] M. Jeanblanc, T.R. Bielecki, and M. Rutkowski. Hedging of defaultable claims. In Paris-Princeton Lectures in Mathematical Finance. Vol.
1814, 1–132. Springer-Verlag, 2004. [261] M. Jeanblanc, T.R. Bielecki, and M. Rutkowski. Modeling and Valuation of Credit Risk. Springer-Verlag, 2004. [262] B. Fernandez, V. Bally, M.E. Caballero, and N. El Karoui. Reflected bsde’s, pde’s and variational inequalities. Rapport de recherche INRIA, 2002.
[263] S. Howison, V. Henderson, D. Hobson, and T. Kluge. A comparison of q-optimal option prices in a stochastic volatility model with correlation. Technical report, 2005.
[264] T. Wang. A class of dynamic risk measures, 1999.
[265] S. Weber. Distribution-invariant dynamic risk measures. Working Paper, 2003.
[266] P. Whittle. Risk-Sensitive Optimal Control. Wiley-Interscience series in Systems and Optimization. Wiley & Sons, 1990.
[267] D.S. Wilks and R.L. Wilby. The weather generation game: a review of stochastic weather models. Progress in Physical Geography, 23(3):329–357, 1999.
[268] P. Imkeller, Y. Hu, and M. Müller. Utility maximization in incomplete markets. Technical report, http://wws.mathematik.hu-berlin.de/~imkeller/research/papers/utilityHIM.pdf, 2003.
[269] V.R. Young. Pricing in an incomplete market with an affine term structure. Mathematical Finance, 14(3):359–, 2004.
[270] T. Zariphopoulou. A solution approach to valuation with unhedgeable risks. Finance & Stochastics, 5:61–82, 2001.
Contributors
Pauline Barrieu
Statistics Department
London School of Economics
Houghton Street, London WC2A 2AE, UK
Research supported in part by the EPSRC, under Grant GR-T23879/01.

Tomasz R. Bielecki
Department of Applied Mathematics
Illinois Institute of Technology
Chicago, IL, USA
Partially supported by NSF Grant 0202851.

René Carmona
Bendheim Center for Finance
ORFE, Princeton University
Princeton, NJ 08544, USA

Nicole El Karoui
C.M.A.P., Ecole Polytechnique
91128 Palaiseau Cédex, France

Robert J. Elliott
Haskayne School of Business
University of Calgary
Calgary, Alberta, Canada
Partially supported by the Social Sciences and Humanities Research Council of Canada.

Said Hamadène
Laboratoire de Statistique et Processus
Université du Maine
72085 Le Mans Cédex 9, France

Vicky Henderson
ORFE and Bendheim Center for Finance
E-Quad, Princeton University
Princeton, NJ 08544, USA

David Hobson
Department of Mathematical Sciences
University of Bath
Bath BA2 7AY, UK

Aytac Ilhan
Department of Operations Research & Financial Engineering
Princeton University
Princeton, NJ 08544, USA
Partially supported by NSF grant DMS-0306357.

Monique Jeanblanc
Equipe d’Analyse et Probabilités
Université d’Évry-Val-d’Essonne
91025 Évry, France

Mattias Jonsson
Department of Mathematics
University of Michigan
Ann Arbor, MI 48109-1109, USA

Anis Matoussi
Laboratoire de Statistique et Processus
Université du Maine
72085 Le Mans Cédex 9, France

Marek Musiela
BNP Paribas
London

Ronnie Sircar
Department of Operations Research & Financial Engineering
Princeton University
Princeton, NJ 08544, USA
Work partially supported by NSF grant DMS-0306357.

John van der Hoek
Department of Applied Mathematics
University of Adelaide, Adelaide, Australia
Partially supported by the Social Sciences and Humanities Research Council of Canada.

Thaleia Zariphopoulou
Department of Mathematics
University of Texas at Austin
1 University Station C 1200
Austin, TX 78712-0257, USA
Notation Index
(H1), 119
(H2), 119
(H3), 119
(P1), 121
(P2), 121
(P2+), 121
(P2−), 121
(P3), 121
(P3+), 121
(P3−), 121
(P4), 121
(P4+), 121
(P4−), 121
(P5), 121
(P6), 121
(P7), 121
F^(e)(T, T_j), 258
F^(g)(T, T_j), 258
G, 128
L^2_n(F_t), 118
SP, 246
TH, 118
W, 118
α, 82
E, 118
P, 118
A_ρ, 87
F_t, 118
R, 122
R^g_γ, 134
R^g_∞, 134
R^g, 125
R^g_{0+}, 135
V_T^A, 112
V_T^B, 112
X, 81
Y, 121
Y^g, 123
ν^H, 87
∂g, 142
ρ, 81
ρ^m, 104
ρ_{0+}, 97
ρ_γ, 93
ρ_∞, 94
e_{γ,t}, 122
e_γ, 80
g_{0+}, 135
l MH, 88
H^1_n(0, T), 118
H^2_n(0, T), 118
P_n(0, T), 118
S^2_n(0, T), 118
BMO(P), 129
Dom(g), 141
lsc, 141
u.i., 129
BLPC, 250
CDD, 241, 243
CEV, 263
CME, 242
HDD, 241, 243
HJB, 150, 252
MLE, 250
NOAA, 245
NWS, 245
OTC, 242
SDE, 148
Author Index
Acerbi, 85
Artzner, 71, 78, 81, 85, 117
Aubin, 130, 141
Bally, 292, 293, 315, 316
Barles, 223, 292, 293
Barlow, 258
Barrieu, 93, 248, 283, 296, 301
Becherer, 60, 77, 110, 189, 191, 192
Bellman, 56, 150
Benes, 133, 282, 312
Bernis, 221
Bernoulli, 44
Bielecki, 213, 214, 230, 235, 239, 301
Bion-Nadal, 91, 93
Bismut, 267
Black, 49
Bobrovnytska, 230, 239
Borch, 108, 109
Borwein, 99
Branger, 55
Brendle, 159
Briand, 125
Bühlman, 108
Campbell, 66
Carmona, 159, 250, 258
Carr, 105, 184
Cheridito, 78, 117
Cinlar, 286
Cochrane, 73
Cohen, 146
Cole, 151
Cont, 93
Copeland, 45
Coquet, 125
Cox, 250
Cvitanic, 59, 78, 117, 140, 268, 317, 318
Danilova, 147
Darling, 272
Davis, 51, 65, 66, 70, 95, 134, 219, 241
Delbaen, 60, 77, 78, 81, 83, 85, 92, 96, 110, 117, 188, 196, 226
Dellacherie, 133, 278, 286, 300, 305–307
Deprez, 80
Detemple, 49, 51, 66, 68
Detlefsen, 117
Diko, 249, 250
Duffie, 53, 214
Durrleman, 258
Eber, 78, 81, 85, 117
El Karoui, 60, 68, 77, 80, 93, 110, 117, 199, 120, 121, 126, 135, 140, 141, 189, 192, 216, 248, 267–269, 274, 275, 280, 282, 283, 296, 301, 303, 307, 313
Elliot, 282, 283
Elliott, 213
Engwerda, 117
Feynman, 151
Fleming, 199, 301
Föllmer, 47, 52, 55, 58, 68, 71, 77, 81–83, 85, 86, 88, 89, 91, 92, 105, 120
Frey, 54
Frittelli, 52, 60, 77, 78, 83, 110, 117, 188, 189
Geman, 55, 105, 258
Gerber, 44, 80, 108
Gianin, 77, 78, 83, 117
Gibson, 258
Girsanov, 151, 259
Goll, 69
Grandits, 77, 110, 189
Grasselli, 63, 64
Hamadène, 267–269, 282–284, 303
Hamilton, 56, 150
HARA, 46
Heath, 78, 81, 85, 117
Henderson, 54, 57, 58, 66, 68, 71
Heston, 54
Hiriart-Urruty, 130, 141, 144
Hobson, 54, 57, 63, 70
Hodges, 44, 65, 77, 78, 185
Hopf, 151
Hu, 129, 141, 230
Huang, 46
Hull, 54, 197
Hurd, 63, 64
Imkeller, 129, 141
Isham, 250
Jacobi, 56, 150
Jacod, 286
Jeanblanc, 221
Jevons, 44
Jewell, 108
Jiang, 125
Kabanov, 110, 111, 189
Kac, 151
Kadam, 67, 68
Kallsen, 51
Kamazaki, 129
Kapoudjian, 268
Karatzas, 51, 59, 70, 78, 95, 117, 140, 217, 268, 270, 317, 318
Kazamaki, 128
Klöppel, 117
Kobylanski, 63, 120, 267, 296
Kohlmann, 235
Komlos, 111
Kou, 51, 70, 95
Kramkov, 48, 58, 92
Kunita, 289, 292
Kupper, 78, 117
Kushner, 170
Kusuoka, 227, 239
Lasry, 262
Legendre, 48
Lemaréchal, 130, 141, 144
Lepeltier, 119, 120, 267, 268, 276, 282–284, 296
Lesigne, 292, 293
Leukert, 47, 68
Lim, 230, 236
Lions, 262
Litzenberger, 46
Liu, 184
Ludkovski, 256
Ma, 269, 285
Madan, 105, 184
Mania, 64, 77, 130, 141, 225, 226, 230, 237
Margrabe, 55
Markovitz, 165
Mas-Colell, 44, 45
Matoussi, 268, 292, 293
Mazliak, 283
Mémin, 125
Merton, 46, 51, 65, 165
Meyer, 133, 278, 286, 300, 305–307
Møller, 65
Müller, 129, 141
Munk, 49, 65, 66
Murphy, 67
Musiela, 51, 54, 64, 77, 153, 321
Nau, 51, 64
Neuberger, 44, 65, 77, 78, 185
Norman, 65
Novikov, 151
Oberman, 66
Owen, 59
Pan, 184
Pardoux, 119, 267–270, 272, 288–290, 313
Peng, 78, 117, 119, 120, 122, 125–127, 267–270, 274, 275, 280, 282, 288–290
Pham, 201
Pliska, 301
Porter, 243
Protter, 286
Quenez, 68, 80, 117, 120, 121, 126, 135, 140, 267–269, 274, 275, 280, 282, 283, 313
Raviv, 108
Rees, 108
Rheinländer, 54, 77, 110, 189
Richardson, 53
Riedel, 78, 117
Rockafellar, 85, 130, 141, 144
Rodriguez-Iturbe, 250
Roncoroni, 258
Rong, 223
Roorda, 117
Rouge, 60, 77, 110, 141, 189, 192, 216, 268
Royer, 223
Rüschendorf, 69
Rutkowski, 214, 235
Samperi, 77, 110
San Martin, 119, 120, 267, 276, 296
Santacroce, 130, 141
Scandolo, 78, 117
Schachemayer, 92
Schachermayer, 48, 194
Schied, 71, 77, 81–83, 85, 86, 88, 89, 91, 92, 105, 120
Schönbucher, 214
Scholes, 49
Schumacher, 117
Schwartz, 258
Schweizer, 52, 55, 58, 77, 110, 117, 130, 141, 230, 239
Sekine, 141, 268
Sharpe, 53, 159, 286
Sheu, 301
Shreve, 217, 270
Singleton, 214
Sirbu, 58
Sircar, 64
Smith, 51, 64
Soner, 199
Stein, 54
Stricker, 77, 110, 111, 189
Sundaresan, 49, 51, 66, 68
Tasche, 85
Tehranchi, 64, 159
Teplá, 49, 54
Tevzadze, 130, 141, 225, 226, 230, 237
Uryasev, 85
Viceira, 66
Wang, 117
Weber, 78, 117
Weston, 45
White, 54, 197
Whittle, 301
Yong, 285
Young, 269
Zakai, 170
Zariphopoulou, 51, 54, 57, 64, 66, 77, 153, 321
Zhou, 230, 235
Zhu, 99
Subject Index
γ-tolerant risk measure, 93
g-dynamic operator, 123
acceptance set, 87
actuarial pricing, 246
admissible, 160
asymptotic elasticity, 48
asymptotic expansion, 204
at the money, 244
autoregressive, 245
average value at risk, 85
basis, 262
budget constraint, 59
cap, 244
capital requirement, 86
certainty equivalent, 262
coherent risk measure, 81
compensated martingale, 236
complete market, 49
concave approximation, 204
conditional value at risk, 85
conservative risk measure, 97, 135
consistent convex price system, 121
convex conjugate, 48
convex premium principle, 80
convex risk measure, 71, 81
cooling degree day, 243
cost of partial observations, 263
default-free asset, 212
default-free market, 211
defaultable claim, 214
default time, 212, 214
dilatation, 93
dilated risk measure, 93
distortion, 57
dual approach, 59
dual formulation, 186
duality approach, 51
dual representation, 81
dynamic convex risk measure, 117
dynamic entropic risk measure, 122
dynamic operator, 121
entropic distance, 190
entropic risk measure, 85
epigraph, 141
exact inf-convolution, 144
executive stock option, 67
expected shortfall, 85
exponential utility, 45, 148
Fenchel-Legendre transform, 86, 187, 191
Feynman-Kac formula, 151, 204, 253, 254
floor, 244
full filtration, 212
g-conditional risk measure, 125
gamma, 207
Greeks, 207
HARA utility, 46
hazard process, 213
hazard rate, 214
heating degree day, 243
heat rate, 258
hedge, 51
historical average, 244
HJB, 56
Hopf-Cole transformation, 57, 151
incomplete market, 184
indifference buying price, 216
indifference measure, 153
indifference price, 45, 78
indifference pricing rule, 79
indifference seller’s price, 80
infinitesimal risk measure, 127
Lagrange multiplier, 59
Legendre-Fenchel transform, 47, 59
loss function, 86
marginal price, 219
marginal risk measure, 94
market modified risk measure, 104
mean reverting, 245
Merton problem, 190
minimal entropy measure, 45, 60
minimal martingale measure, 45
money market account, 211
National Oceanic and Atmospheric Administration, 245
National Weather Service, 245
nominal payoff rate, 244
nonlinear expectation, 122
NordPool power exchange, 256
numéraire, 55
optimal design, 105
Ornstein-Uhlenbeck process, 245
out of the money, 244
over the counter (OTC), 242
penalty function, 82
perspective function, 136
plant efficiency, 258
pluviometer, 255
positive part, 148
power utility, 45
primal approach, 56
principal-agent problem, 108
private valuation, 49
promised payoff, 214
quadratic hedging strategy, 230
quasi-linear PDE, 197
rainfall cell, 250
recession function, 135
recession super price system, 135
recovery payoff, 214
reference filtration, 211
relative entropy, 187
reservation price, 49
residual risk measure, 109
risk aversion, 45, 148
risk aversion limit, 191
risk preference, 148
risk premium traded, 148
risk tolerance coefficient, 79
risk transfer, 105
Sharpe ratio, 53
shortfall hedging, 47
shortfall risk, 86
spark spread, 257, 258
spread option, 257
state price density, 54
stochastic intensity, 214
stochastic volatility, 54, 197
storm, 250
strangle, 184
strike period, 243, 246
strongly convex, 146
subhedging price, 191, 194
superprice, 97
superprice system, 135
superhedging price, 194
superreplication, 45
support function, 142
tolerance, 93
traded risk premium, 148
transaction feasibility, 106
translation invariance, 79
utility function, 45, 148
value at risk, 84
Vega, 207
volume perspective risk measure, 136
wealth dependence, 45
weather market, 147