Static & Dynamic Game Theory: Foundations & Applications

Series Editor
Tamer Başar, University of Illinois, Urbana-Champaign, IL, USA

Editorial Advisory Board
Daron Acemoglu, MIT, Cambridge, MA, USA
Pierre Bernhard, INRIA, Sophia-Antipolis, France
Maurizio Falcone, Università degli Studi di Roma “La Sapienza”, Rome, Italy
Alexander Kurzhanski, University of California, Berkeley, CA, USA
Ariel Rubinstein, Tel Aviv University, Ramat Aviv, Israel; New York University, New York, NY, USA
William H. Sandholm, University of Wisconsin, Madison, WI, USA
Yoav Shoham, Stanford University, Palo Alto, CA, USA
Georges Zaccour, GERAD, HEC Montréal, Canada
For further volumes: www.springer.com/series/10200
David W.K. Yeung · Leon A. Petrosyan
Subgame Consistent Economic Optimization An Advanced Cooperative Dynamic Game Analysis
David W.K. Yeung
SRS Consortium for Advanced Study in Cooperative Dynamic Games
Hong Kong Shue Yan University
Hong Kong, People’s Republic of China

Leon A. Petrosyan
Faculty of Applied Mathematics and Control Processes
St. Petersburg State University
Saint Petersburg 198904, Russia

Center of Game Theory
St. Petersburg State University
Saint Petersburg 198904, Russia
ISBN 978-0-8176-8261-3 e-ISBN 978-0-8176-8262-0 DOI 10.1007/978-0-8176-8262-0 Springer New York Dordrecht Heidelberg London Library of Congress Control Number: 2011942872 Mathematics Subject Classification (2010): 91A12, 92A25, 91B62, 78M50, 91B70 © Springer Science+Business Media, LLC 2012 All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.birkhauser-science.com)
To Stella and Nina
Foreword
In the middle of the twentieth century, the seminal work of Rufus Isaacs on differential games established the foundation for the analysis of interactive behavior over time. Since then, dynamic game theory has been applied in many disciplines; in the past three and a half decades, its applications in economics and business have been growing rapidly.

Market failures give rise to the need for cooperation in conducting economic activities. Only limited success can be expected from dynamic cooperation if there is no guarantee that the participants’ agreed-upon scheme will be maintained amid changes over time throughout the entire duration of the cooperation. The solution mechanism for obtaining time consistent and subgame consistent cooperative schemes developed by Leon Petrosyan and David Yeung, the authors of this text, provides an effective prescription for this “classic” game-theoretic problem. Practical policy menus can be formulated with this mechanism to tackle some of the serious problems facing the global market economy.

This text covers the main developments in subgame consistent economic optimization and includes most of the existing economic studies involving subgame consistent solutions. Illuminating applications are presented to illustrate the detailed workings of the fundamental theorems established. Unlike most current mainstream economic studies, which adopt existing mathematics in their analysis, this text uses novel game-theoretic mathematical techniques developed by its authors.

The text is truly a world-leading treatise in the field of dynamically consistent economic optimization and a Russian classic in mathematics and economics. It is a timely publication that tackles the increasingly crucial issues of consistency and dynamic stability in collaborative activities in the economic arena. The elegant mathematics developed by the authors and its practical applications in economics pervade the analysis. The text significantly expands L.V. Kantorovich’s award-winning work in economic optimization in the new directions of game-theoretic interaction, dynamic evolution, stochasticity, and subgame consistency.
Subgame Consistent Economic Optimization is undoubtedly a needed and important addition to the field of dynamic interactive optimization in economics.

Vladimir Mazalov
Karelian Institute of Applied Mathematical Research, Russian Academy of Sciences, Russia
Preface
The postulation that individually rational, self-maximizing behaviors bring about group (Pareto) optimality constitutes one of the most appealing characteristics of the perfectly competitive market. The market is often regarded as an effective means to allocate economic resources efficiently. However, in the presence of an imperfect market structure, externalities, imperfect information, and public goods, the market fails to provide an effective mechanism for efficient resource use. Not only have inefficient outcomes appeared, but gravely detrimental events—such as the global financial crisis and catastrophe-bound industrial pollution problem—have also emerged under the current market system. With market failures prevailing, optimization in economic activities is one of the remedies available. Strategic behaviors in the market are increasingly pervasive, and as a result, game theory has emerged as one of the fundamental tools in pure and applied research in economics. Because economic activities in the modern corporate world are dynamic processes, economic decisions are more appropriately analyzed in an intertemporal framework. Dynamic cooperation suggests the possibility of socially optimal and group efficient solutions to economic decision problems involving strategic action. In dynamic cooperation, a stringent condition is required for a scheme to be dynamically stable. In particular, the optimality principle must remain optimal throughout the game, that is, at any instant of time along the optimal state trajectory determined at the outset. This condition is known as time consistency. In the presence of stochastic elements, a more stringent condition—that of subgame consistency—is required for a credible cooperative solution. In particular, a cooperative solution is subgame consistent if an extension of the solution policy to a situation with a later starting time, and any realizable state brought about by prior optimal behavior, would remain optimal. 
The notion of subgame consistency originated in Yeung and Petrosyan (2004), which develops a generalized theorem for the derivation of an analytically tractable “payoff distribution procedure” leading to subgame consistent solutions. Time consistency for the economic optimization problem requires dynamic consistency in all subgames along the group optimal trajectory; time consistency in this context therefore amounts to optimal-trajectory-subgame consistency.
This book provides a treatise on subgame consistent economic optimization. In particular, dynamically stable game-theoretic optimization techniques are developed to establish the foundation for an effective policy menu to tackle suboptimal problems that the conventional market mechanism fails to resolve. The book is intended as an analytical tool for advanced graduate students, game theorists, economists, mathematicians, and researchers in this field.

We are very grateful to our esteemed friends George Leitmann and John Nash for inspiration from their classic work in game theory, on which many of the results in the book are based. Our families have been an enormous and continuing source of encouragement throughout our careers. We thank Stella and Patricia (DY) and Nina and Ovanes (LP) for their love and patience during this and other projects, which, on occasion, may have diverted our attention away from them. We thank Cynthia Yingxuan Zhang for her outstanding research assistance and manuscript formatting. Financial support from the Research Grants Council of the HKSAR (Grant Number 32-07-028), the European Union Research Commission (Contract Number 044287), and HKSYU is gratefully acknowledged.

Finally, we would like to dedicate this text to the memory of a pioneering researcher and Nobel Laureate in the field of economic optimization, our late Saint Petersburg colleague Leonid Vitalyevich Kantorovich, in tribute on the 100th anniversary of his birth.

Saint Petersburg, Russia
D.W.K. Yeung
L.A. Petrosyan
Contents

1 Introduction

2 Dynamic Strategic Interactions in Economic Systems
  2.1 Dynamic Interactive Economic System
    2.1.1 Basic Formulation
    2.1.2 Typical Dynamic Economic Game Paradigms
    2.1.3 Market Equilibrium
  2.2 Market Outcomes Under Open-Loop Nash Equilibria
    2.2.1 Characterization of Open-Loop Equilibria
    2.2.2 Open-Loop Solution in Competitive Advertising
  2.3 Market Outcomes Under Feedback Equilibria
    2.3.1 Characterization of Feedback Equilibria
    2.3.2 Feedback Equilibria in Resource Extraction
    2.3.3 Feedback Solution in Competitive Advertising
    2.3.4 Duopolistic Competition in Infinite Horizon
  2.4 Dynamic Stochastic Interactive Economic System
    2.4.1 Game Formulation and Solution Characterization
    2.4.2 An Application of Stochastic Differential Games in Resource Extraction
    2.4.3 Infinite-Horizon Resource Extraction
  2.5 Exercises

3 Dynamic Economic Optimization: Group Optimality and Individual Rationality
  3.1 Group Optimality
    3.1.1 Optimal Strategies and Cooperative State Trajectories
    3.1.2 Group Optimality in Resource Extraction
    3.1.3 Group Optimality in Infinite-Horizon Problems
  3.2 Individual Rationality
    3.2.1 Lump-Sum and Continuous Transfer Payments
    3.2.2 Individually Rational Imputation in Cooperative Resource Extraction
  3.3 Individual Rationality Under Infinite Horizon
    3.3.1 Individually Rational at the Outset
    3.3.2 Individually Rational Throughout the Cooperative Duration
    3.3.3 Individual Rationality in Resource Extraction
  3.4 Cooperative Economic Games Satisfying Individual Rationality and Group Optimality
    3.4.1 Cooperative Resource Extraction Game
    3.4.2 Fully Coordinated Pollution Control
  3.5 Exercises

4 Time Consistency and Optimal-Trajectory-Subgame Consistent Economic Optimization
  4.1 Solution in Dynamic Economic Optimization
  4.2 Principle of Time Consistency
    4.2.1 Characterization of Time Consistent Solution
    4.2.2 Time Consistent Cooperative Strategies
    4.2.3 Time Consistency in Imputation and Payoff Distribution Procedure
  4.3 Payoff Distribution Procedure Derivation and Time (Optimal-Trajectory-Subgame) Consistent Solutions
    4.3.1 Derivation of Payoff Distribution Procedures
    4.3.2 Time (Optimal-Trajectory-Subgame) Consistent Solution
  4.4 Solutions from Specific Optimality Principle
  4.5 An Illustration in Cooperative Fishery
  4.6 Consistent Economic Optimization Under Infinite Horizon
    4.6.1 Group Optimal Cooperative Strategies
    4.6.2 Consistent Imputation and Payoff Distribution Procedure
    4.6.3 Derivation of Consistent Payoff Distribution Procedure
    4.6.4 Time (Optimal-Trajectory-Subgame) Consistent Solution
  4.7 Infinite-Horizon Resource Extraction Optimization
  4.8 Exercises

5 Dynamically Stable Cost-Saving Joint Venture
  5.1 A Dynamic Model of Corporate Joint Venture
  5.2 Time (Optimal-Trajectory-Subgame) Consistent Solution in Joint Venture
    5.2.1 Imputation Scheme
    5.2.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Distribution
  5.3 A Cost-Saving Joint Venture
    5.3.1 Joint Venture Profit and Cost Saving
    5.3.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
  5.4 A Shapley Value Solution to Joint Venture
    5.4.1 Dynamic Shapley Value Imputation
    5.4.2 The PDP for Shapley Value
  5.5 A Joint Venture with Shapley Value Profit Sharing
    5.5.1 Coalition Payoffs
    5.5.2 PDP for Shapley Value
  5.6 Infinite-Horizon Analysis
    5.6.1 Dynamic Joint Venture
    5.6.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
    5.6.3 Shapley Value Profit Sharing
  5.7 An Infinite-Horizon Joint Venture
    5.7.1 Joint Venture and Costs
    5.7.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
    5.7.3 Shapley Value Solution
  5.8 Exercises
  Appendix: Proof of Proposition 5.2

6 Collaborative Environmental Management
  6.1 An Analytical Framework
    6.1.1 The Industrial Sector
    6.1.2 Impacts and Accumulation Dynamics of Pollutants
    6.1.3 The Governments’ Objectives
  6.2 Noncooperative Outcomes
  6.3 Cooperative Arrangement
    6.3.1 Group Optimality and Cooperative State Trajectory
    6.3.2 Individually Rational and Time (Optimal-Trajectory-Subgame) Consistent Imputation
  6.4 Benefit Distribution in Collaborative Environmental Management
  6.5 Policy Implications
  6.6 A Model of Transboundary Industrial Pollution Management
    6.6.1 A Multinational Economy with Industrial Pollution
    6.6.2 Noncooperative Outcomes
  6.7 Collaborative Scheme in Transboundary Industrial Pollution Management
    6.7.1 Cooperative Optimization and State Trajectory
    6.7.2 Consistent Imputation and Benefit Distribution
  6.8 Exercises
  Appendix 1: Proof of Proposition 6.2
  Appendix 2: Proof of Proposition 6.3

7 Dynamically Stable Dormant Firm Cartel
  7.1 A Dynamic Oligopoly
    7.1.1 Basic Settings
    7.1.2 Market Outcome
  7.2 Time (Optimal-Trajectory-Subgame) Consistent Cartel
    7.2.1 Pareto Optimal Output Path
    7.2.2 Imputation Scheme and Cartel Profit Sharing
  7.3 A Dormant-Firm Cartel
  7.4 Infinite-Horizon Cartel
    7.4.1 Pareto Optimal Trajectory
    7.4.2 Imputation Scheme and Cartel Profit Sharing
  7.5 An Infinite-Horizon Dormant-Firm Cartel
    7.5.1 Cartel Output and Optimal Resource Path
    7.5.2 Sharing of Cartel Profits
  7.6 Exercises

8 Subgame Consistent Economic Optimization Under Uncertainty
  8.1 Dynamic Economic Optimization Under Uncertainty
  8.2 Principle of Subgame Consistency
    8.2.1 Subgame Consistent Solution
    8.2.2 Subgame Consistency in Imputation and Payoff Distribution Procedure
  8.3 Payoff Distribution Procedure and Subgame Consistent Solutions
    8.3.1 Payoff Distribution Procedures Leading to Subgame Consistent Solutions
    8.3.2 Subgame Consistent Solution
    8.3.3 Instantaneous Transfer Payments
  8.4 An Illustration in Cooperative Fishery Under Uncertainty
    8.4.1 Cooperative Extraction Under Uncertainty
    8.4.2 Subgame Consistent Cooperative Extraction
  8.5 Infinite-Horizon Consistent Economic Optimization Under Uncertainty
    8.5.1 Group Optimal Cooperative Strategies
    8.5.2 Subgame Consistent Imputation and Payoff Distribution Procedure
    8.5.3 Payoff Distribution Procedure Leading to Subgame Consistency
    8.5.4 Subgame Consistent Solution
  8.6 Infinite-Horizon Cooperative Fishery Under Uncertainty
    8.6.1 Cooperative Extraction
    8.6.2 Subgame Consistent Payoff Distribution
  8.7 Exercises

9 Cost-Saving Joint Venture Under Uncertainty
  9.1 Dynamic Corporate Joint Venture Under Uncertainty
    9.1.1 Joint Venture and Expected Profit Maximization
    9.1.2 Subgame Consistent Joint Venture
  9.2 A Cost-Saving Joint Venture with Stochasticity
    9.2.1 Expected Venture Profit and Cost Savings
    9.2.2 Subgame Consistent Venture Profit Sharing
  9.3 A Shapley Value Solution to a Joint Venture Under Uncertainty
    9.3.1 Expected Joint Venture Profits and Optimal Trajectory
    9.3.2 The PDP for Shapley Value
  9.4 A Stochastic Joint Venture with Shapley Value Profit Sharing
    9.4.1 Expected Coalition Payoffs
    9.4.2 Subgame Consistent Shapley Value Solution
  9.5 Infinite-Horizon Analysis
    9.5.1 Infinite-Horizon Dynamic Joint Venture
    9.5.2 Subgame Consistent Venture Profit Sharing
    9.5.3 Shapley Value Profit Sharing
  9.6 An Infinite-Horizon Stochastic Joint Venture
    9.6.1 Cost-Saving Joint Venture
    9.6.2 Subgame Consistent Venture Profit Sharing
    9.6.3 Shapley Value Solution
  9.7 Exercises

10 Collaborative Environmental Management Under Uncertainty
  10.1 An Analytical Framework
    10.1.1 The Industrial Sector
    10.1.2 Impacts of Pollution and Stochastic Accumulation Dynamics
    10.1.3 The Governments’ Objectives
  10.2 Noncooperative Outcomes
  10.3 Cooperative Arrangement
    10.3.1 Group Optimality and Cooperative State Trajectory
    10.3.2 Imputation Scheme
  10.4 Subgame Consistent Collaborative Environmental Management
  10.5 A Model of Stochastic Industrial Pollution Management
    10.5.1 A Multinational Economy with Industrial Pollution
    10.5.2 Noncooperative Outcomes
  10.6 Collaborative Scheme in Stochastic Industrial Pollution Management
    10.6.1 Cooperative Optimization and State Trajectory
    10.6.2 Subgame Consistent Solution and Benefit Distribution
  10.7 Exercises

11 Subgame Consistent Dormant Firm Cartel
  11.1 A Stochastic Dynamic Oligopoly
    11.1.1 Basic Settings
    11.1.2 Market Outcome
  11.2 Subgame Consistent Cartel
    11.2.1 Pareto Optimal Output Path
    11.2.2 Subgame Consistent Cartel Profit Sharing
  11.3 A Dormant-Firm Cartel
  11.4 Infinite-Horizon Cartel
    11.4.1 Pareto Optimal Trajectory
    11.4.2 Subgame Consistent Cartel Profit Sharing
  11.5 An Infinite-Horizon Dormant-Firm Cartel
    11.5.1 Cartel Output
    11.5.2 Subgame Consistent Cartel Profits Sharing
  11.6 Exercises

12 Dynamic Consistency in Discrete-Time Cooperative Games
  12.1 Dynamic Games
    12.1.1 Game Formulation
    12.1.2 Noncooperative Outcome
  12.2 Dynamic Cooperation
    12.2.1 Group Optimality
    12.2.2 Individual Rationality
  12.3 Time Consistent Solutions and Payment Mechanism
    12.3.1 Payoff Distribution Procedure
    12.3.2 Time (Optimal-Trajectory-Subgame) Consistent Solution
  12.4 An Illustration in Cooperative Resource Extraction
  12.5 Exercises
  Appendix 1: Proof of Proposition 12.1
  Appendix 2: Proof of Proposition 12.2

13 Discrete-Time Cooperative Games Under Uncertainty
  13.1 Stochastic Dynamic Games
    13.1.1 Game Formulation
    13.1.2 Noncooperative Solution
  13.2 Dynamic Cooperation Under Uncertainty
    13.2.1 Group Optimality
    13.2.2 Individual Rationality
  13.3 Subgame Consistent Solutions and Payment Mechanism
    13.3.1 Payoff Distribution Procedure
    13.3.2 Subgame Consistent Solution
  13.4 Cooperative Resource Extraction Under Uncertainty
  13.5 Exercises
  Appendix 1: Proof of Theorem 13.1
  Appendix 2: Proof of Proposition 13.1
  Appendix 3: Proof of Proposition 13.2

Technical Appendixes: Dynamic Optimization Techniques
  A.1 Dynamic Programming
  A.2 Optimal Control
  A.3 Stochastic Control

References

Index
Chapter 1
Introduction
Emile de Laveleye (1882): “Political economy may be defined as the science which determines what laws men ought to adopt in order that they may, with the least possible exertion, procure the greatest abundance of things useful for the satisfaction of their wants, may distribute them justly and consume them rationally.”
The most appealing characteristic of the perfectly competitive market is perhaps the postulation that individually rational self-maximizing behaviors bring about group (Pareto) optimality. Hence the market is regarded as an effective means to allocate economic resources efficiently. However, a competitive market will fail to provide an efficient allocation mechanism if there exists an imperfect market structure, externalities, imperfect information, or public goods. These phenomena are prevalent in the current global economy. As a result, though the market is perceived to be the most effective instrument for conducting economic activities, it fails to guarantee efficiency under many current conditions. Not only have inefficient market outcomes appeared, but gravely detrimental events, such as the worldwide financial crisis and catastrophe-bound industrial pollution problems, have also emerged under the conventional market system.

The 2008 worldwide financial tsunami revealed perhaps the gravest situation generated by the global market economy. The event led to serious challenges to the performance of existing market systems. In one of the most thought-provoking publications about financial markets in the last century, Soros (1998) presented the thesis that the global capitalist system is in crisis. In particular, he pointed out that financial markets are inherently unstable. Market fundamentalism is defined as the widespread belief that markets are self-correcting, that a global economy can flourish under the belief that the common interest is served by allowing everyone to look out for his or her interests, and that attempts to protect the common interest by collective decision making distort the market mechanism. “This belief is false,” said Soros. “Instead of acting like a pendulum, financial markets have recently acted more like a wrecking ball, knocking over one economy after another.”

D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_1, © Springer Science+Business Media, LLC 2012
Iceland, Greece, Italy, Spain, and Portugal are current examples of economies knocked over in the midst of the 2008 financial tsunami.

In the presence of market failures, the optimization of economic activities provides an effective remedial measure. One of the most successful pioneers in economic optimization was Leonid Vitalyevich Kantorovich. His approach to optimizing resource use was of crucial importance in the economy of the former Soviet Union. Moreover, the work of Kantorovich played a significant role in introducing mathematics into economic optimization. One of the main features of Kantorovich’s economic optimization paradigm was the development of novel mathematical tools to handle real-life economic problems rather than just borrowing techniques from existing mathematics.

Since the appearance of Hurwicz’s (1973) pioneering work on mechanism design, economic analysis has no longer treated the economic system as a given. The term “design” stresses that the structure of the economic system is to be regarded as an unknown. The design point of view enlarges our vision and helps economics avoid a narrow focus on existing institutions. After the 2008 financial tsunami, Soros (2009) stressed that the global market needs new international rules if another global financial collapse is to be avoided. Like Soros, many believe that international cooperation is the way out of the current global financial crisis. International endeavors such as the Kyoto Protocol, the Copenhagen Accord, the Organisation for Economic Co-operation and Development (OECD), Asia-Pacific Economic Cooperation (APEC), and the G-20 summits are vivid examples of joint optimization initiatives seeking remedies for the failed market mechanism. In a broad sense, economic optimization can be applied in a micro framework involving a particular industry or a group of firms, at the national macro level, and in a global international framework.
Hurwicz (1973) also pointed out that in economics, one deals with goal conflicts due to the multiplicity of consumers and one-objective-function problems, which fail to address the crucial issue of goal conflict. The role of strategic interactions in the current market system is increasingly recognized in theory and practice. As a result, game theory has emerged as one of the fundamental tools in pure and applied research in economics. The discipline of game theory studies decision making in an interactive environment. In canonical form, an economics game arises when an economic agent pursues an objective(s) in a situation in which other agents concurrently pursue other (possibly conflicting, possibly overlapping) objectives: The problem is then to determine each agent’s optimal decision, how these decisions interact to produce equilibria, and the properties of such outcomes. The foundations of game theory were established some 60 years ago by von Neumann and Morgenstern (1944). Advances in technology, communications, industrial organization, regulation methodology, international trade, economic integration, and political reform have created rapidly expanding social and economic networks incorporating crosspersonal and cross-country interactions. From a decision- and policy-maker’s perspective, it has become increasingly important to recognize and accommodate the interdependencies and interactions of human decisions under such circumstances. The strategic aspects of decision making are often crucial in areas as diverse as
trade negotiation, foreign and domestic investment, multinational pollution planning, market development and integration, joint venture, technological research and development (R&D), resource extraction, competitive marketing, and regional cooperation. Game theory is perhaps one of the most sophisticated and fertile paradigms that applied mathematics can offer to study and analyze decision making under real-world conditions. Since economic activities in the modern corporate world are dynamic processes, economic decisions are more appropriately analyzed in an inter-temporal framework. One particularly complex and fruitful branch of game theory is dynamic or differential games, which investigate interactive decision making over time. Nash’s (1950) noncooperative equilibrium theory clearly demonstrated that individually rational behavior by players may deviate from a jointly optimal outcome when strategic interactions are present. Even worse, there is no guarantee that such self-maximizing equilibrium behavior will not bring about highly undesirable outcomes, such as those illustrated by the Prisoner’s Dilemma. Economic optimization continues to be a much needed remedy for many current economic activities in which strategic interactions are significant. Cooperative optimization points to the possibility of socially optimal and individually rational solutions to decision-making problems involving strategic action over time. However, it is hard to be convinced that dynamic cooperation can offer a long-term solution unless there is a guarantee that participants will always be better off throughout the entire cooperation period and that the agreed-upon optimality principle will be maintained from beginning to end. Many cooperation schemes become unstable and may fail at any time within the cooperation period because this kind of guarantee is lacking.
To guarantee that cooperation will last throughout the agreement period, a stringent condition is required: The optimality principle agreed upon at the outset must remain effective throughout the game, at any instant of time along the optimal state trajectory. This condition is known as time consistency. In other words, the dynamic stability of solutions in cooperative differential games involves the property that, as the game proceeds along an optimal trajectory, players are guided by the same optimality principle at each instant of time, and hence do not possess incentives to deviate from the previously adopted optimal behavior throughout the game. In the presence of stochastic elements, a more stringent condition—that of subgame consistency—is required for a credible cooperative solution. In particular, the optimality principle agreed upon at the outset must remain effective in any subgame starting at a later time with a realizable state brought about by prior optimal behavior. The notion of subgame consistency originated in our 2004 article (Yeung and Petrosyan 2004), in which a generalized theorem for the derivation of an analytically tractable “payoff distribution procedure” leading to subgame consistent solutions is developed. A series of further developments and extensions can be found in Petrosyan and Yeung (2006, 2007), Yeung (2005, 2006, 2007, 2008, 2011a), Yeung and Petrosyan (2005, 2006a, 2006b, 2007a, 2007b, 2007c, 2008, 2010), and Yeung et al. (2007).
Because time consistency for the economic optimization problem requires dynamical consistency for all subgames along the group optimal trajectory, time consistency in this context reflects an optimal-trajectory-subgame consistency. This book presents a treatise on subgame consistent economic optimization. In particular, game-theoretic optimization is developed to establish the foundation for an effective policy menu to tackle suboptimal problems that the conventional market mechanism fails to resolve. The book expands Kantorovich’s single-agent economic optimization paradigm to a multiple-agent framework. Moreover, novel mathematics developed by the authors is provided for dealing with game-theoretic economic optimization problems.
The text is organized as follows. Chapter 2 examines the dynamic strategic interactions in the economic system and typical dynamic economic game paradigms. Market equilibrium outcomes and the characterization of the solutions in open-loop and feedback strategies are provided. The dynamic stochastic interactive economic system and characterization of corresponding market outcomes are also presented. Chapter 3 examines two fundamental elements in dynamic economic optimization: group optimality and individual rationality. Group optimal cooperative strategies and cooperative state trajectories are characterized. The conditions under which individual rationality is maintained throughout the cooperation period are identified. In Chap. 4, time (optimal-trajectory-subgame) consistent economic optimization is analyzed. The principle of time consistency, time consistent cooperative strategies, imputation, and optimality principles are scrutinized. Payoff distribution procedures leading to time consistent solutions are derived. Infinite-horizon problems are also analyzed. Chapter 5 presents a dynamically stable joint venture.
Time (optimal-trajectory-subgame) consistent profit sharing in joint ventures and instantaneous venture transfer payments are developed. Because the sizes and earning potentials of the firms in a corporate joint venture may vary significantly, the problem of profit sharing is inescapable in virtually every joint venture. The analysis first considers the case in which the venture agrees to share the excess of the total cooperative payoff over the sum of individual noncooperative payoffs in proportion to the firms’ noncooperative payoffs. A Shapley value solution is also provided. In Chap. 6, economic optimization is applied to collaborative environmental management. It is hard to be convinced that current multinational joint initiatives, such as the Kyoto Protocol or the Copenhagen Accord, can offer a long-term solution because there is no guarantee that the participants will always be better off within the entire duration of the agreement. Because of the lack of these kinds of incentives, current cooperative schemes fail to provide an effective means to solve the problem. This is a “classic” game-theoretic problem. A theoretical framework capturing the essence of a transboundary industrial pollution paradigm in the form of a differential game is adopted. A time (optimal-trajectory-subgame) consistent cooperative solution is illustrated in the chapter. Benefit distributions in collaborative environmental abatements fulfilling time consistency are obtained. Policy implications are also analyzed.
Chapter 7 looks into the special case of a cartel with dormant firms. Optimization in cartels, which restrict output to enhance joint profits, is examined. In particular, some firms have absolute cost disadvantages, which forces them to become dormant partners. The Pareto optimal output path, time (optimal-trajectory-subgame) consistent imputation scheme, and cartel profit sharing are examined. Chapter 8 considers dynamic economic optimization under uncertainty. The principles of subgame consistency, subgame consistent cooperative strategies, and imputation are scrutinized. Mechanisms for the derivation of payoff distribution procedures leading to subgame consistent solutions and instantaneous transfer payments are presented. Chapter 9 investigates a dynamic corporate joint venture under uncertainty and the corresponding subgame consistent solutions. Chapter 10 analyzes collaborative environmental management under uncertainty. The group optimal cooperative state trajectory, subgame consistent imputation, and benefit distribution are derived. Chapter 11 considers the dormant firm cartel under uncertainty and the corresponding subgame consistent solutions. Chapter 12 considers the analysis in a discrete-time framework. In some economic situations, the economic process is in discrete time rather than in continuous time. The chapter presents a general formulation of dynamic economic games in discrete time and derives time (optimal-trajectory-subgame) consistent cooperative solutions with the corresponding payoff distribution procedures. An illustration in cooperative resource extraction is also provided. Chapter 13 extends the analysis in discrete time to a stochastic framework. A general formulation of stochastic dynamic economic games in discrete time, subgame consistent cooperative solutions with corresponding payoff distribution procedures, and an illustration in stochastic cooperation are provided.
Additionally, dynamic optimization techniques are provided in the Technical Appendixes at the end of the book.
Chapter 2
Dynamic Strategic Interactions in Economic Systems
The recent globalization and emergence of multinational corporations turned many major economic activities into dynamic interactive endeavors. The number of decision makers involved is relatively small and it leads to significant strategic interdependence. With human life being lived over time, and institutions like markets, firms, and governments changing over time, the economic system is definitely a dynamic interactive entity. Section 2.1 provides a general overview of dynamic interactive economic systems. Market outcomes under open-loop equilibria are investigated in Sect. 2.2 and those under feedback equilibria are examined in Sect. 2.3. An extension of the analysis to a stochastic framework is provided in Sect. 2.4.
2.1 Dynamic Interactive Economic System
In this section, we provide the formulation of dynamic interactive economic systems, the typical dynamic game paradigm in economic analysis, and the characterization of market equilibria.
2.1.1 Basic Formulation
A fruitful way of modeling a dynamic interactive situation is by differential games. Differential games study a class of decision problems under which the evolution of the state is described by a differential equation and the players act throughout a time interval. In economic analysis the general form of n-person differential games can be characterized as follows. Economic agent $i$ seeks to maximize its objective
$$\int_{t_0}^{T} g^i\bigl[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\bigl(x(T)\bigr), \quad \text{for } i \in N = \{1, 2, \ldots, n\}, \tag{2.1}$$
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_2, © Springer Science+Business Media, LLC 2012
where $r(y)$ is the discount rate, $x(s) \in X \subset R^m$ denotes the state variables of the game, $q^i(x(T))$ is agent $i$’s valuation of the state at terminal time $T$, and $u_i \in U^i$ is the control of agent $i$, for $i \in N$. The state variable evolves according to the dynamics
$$\dot{x}(s) = f\bigl[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr], \quad x(t_0) = x_0, \tag{2.2}$$
where $x(s) \in X \subset R^m$ denotes the state variables of the game and $u_i \in U^i$ is the control of agent $i$, for $i \in N$. The functions $f[s, x, u_1, u_2, \ldots, u_n]$, $g^i[s, \cdot, u_1, u_2, \ldots, u_n]$, and $q^i(\cdot)$, for $i \in N$ and $s \in [t_0, T]$, are differentiable functions.
Examples of economic state variables include capital stock, resource biomass or deposits, the level of technology, market shares, economic assets, equity, prices, pollutants, and company goodwill. Examples of controls include investment, resource extraction rate, research and development (R&D) efforts, advertising rate, output produced, input used, taxes, subsidy, and expenditures.
In many economic situations, the terminal time of the game, $T$, is either very far in the future or unknown to the agents. For example, the value of a publicly listed firm is the present value of its discounted expected future earnings. Nobody knows when the firm will be out of business. As argued by Dockner et al. (2000), in this case setting $T = \infty$ may very well be the best approximation for the true game horizon. Even if the firm’s management restricts itself to considering profit maximization over the next year, it should value its asset positions at the end of the year by the earning potential of these assets in the years to come. In the case when the terminal horizon $T$ approaches infinity, an autonomous game structure with constant discounting will replace (2.1) and (2.2). In particular, the game becomes
$$\max_{u_i} \int_{t_0}^{\infty} g^i\bigl[x(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr] \exp\bigl[-r(s - t_0)\bigr]\,ds, \quad \text{for } i \in N, \tag{2.3}$$
subject to the state dynamics
$$\dot{x}(s) = f\bigl[x(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr], \quad x(t_0) = x_0, \tag{2.4}$$
where $r$ is a constant discount rate. Since time $s$ does not appear explicitly in the agent’s payoff $g^i[x(s), u_1(s), u_2(s), \ldots, u_n(s)]$ and the state dynamics $f[x(s), u_1(s), u_2(s), \ldots, u_n(s)]$, the problem is an autonomous problem. Theoretical research and the applications of differential games proceeded apace in the past in many areas of economics. An in-depth survey and analysis on differential games in economics and management science can be found in Dockner et al. (2000). A detailed account and a comprehensive list of differential games in marketing can be found in Zaccour (2003). A thorough survey of models of dynamic games in economics is given in Long (2010).
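As a minimal numerical sketch of the objective in (2.1), the following Python function approximates the discounted integral payoff plus the discounted terminal value for given control paths, integrating the state equation (2.2) with a simple Euler scheme. The names `payoff`, `g`, `f`, `q`, `r`, and `controls`, the scalar state, and the Euler discretization are illustrative assumptions and do not come from the text.

```python
import math

def payoff(g, f, q, r, controls, x0, t0, T, steps=20000):
    """Approximate the objective (2.1) of one agent for given control paths.

    g(s, x, u): instantaneous payoff, f(s, x, u): state dynamics (2.2),
    q(x): terminal valuation, r(y): discount rate,
    controls: list of functions s -> u_j(s), one per agent.
    All names are hypothetical; Euler stepping is an assumed discretization.
    """
    h = (T - t0) / steps
    x, s = x0, t0
    cum_rate = 0.0   # running approximation of the integral of r(y) over [t0, s]
    total = 0.0
    for _ in range(steps):
        u = [c(s) for c in controls]
        total += g(s, x, u) * math.exp(-cum_rate) * h
        x += f(s, x, u) * h          # Euler step for the state equation
        cum_rate += r(s) * h
        s += h
    # add the discounted terminal valuation of the final state
    return total + math.exp(-cum_rate) * q(x)
```

With a constant payoff flow of 1, a constant discount rate, and no terminal value, the result should be close to $(1 - e^{-rT})/r$, which gives a quick sanity check of the discretization.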
2.1.2 Typical Dynamic Economic Game Paradigms
In this section we present the general model structures and specific examples of some typical dynamic economic game paradigms.
2.1.2.1 Investment Games
A general structure of investment games can be characterized as follows. Let $K^i(s)$ be the physical capital stock of firm $i \in N$ at time $s$. Each firm in the industry accumulates capital according to the equation
$$\dot{K}^i(s) = I^i(s) - \delta^i K^i(s), \quad \text{for } i \in N, \tag{2.5}$$
where $I^i(s)$ is the gross investment of firm $i$ at time $s$ and $\delta^i \ge 0$ is the constant rate of depreciation. At initial time $t_0$, the capital stock $K^i(t_0) = K_0^i$, for $i \in N$, is given. Denote the output of firm $i$ by $q_i(s)$, the industry output by $Q(s) = \sum_{j=1}^{n} q_j(s)$, and the output price by $P[Q(s)]$. The output of firm $i$ is governed by the production function $q_i(s) = f^i[L^i(s), K^i(s)]$, where $L^i(s)$ is the quantity of noncapital input (like labor) employed at time $s$. The cost of production is $c^i\{f^i[L^i(s), K^i(s)]\}$. At time instant $s$, the operating profit of firm $i$ becomes
$$P\left[\sum_{j=1}^{n} f^j\bigl[L^j(s), K^j(s)\bigr]\right] f^i\bigl[L^i(s), K^i(s)\bigr] - c^i\bigl\{f^i\bigl[L^i(s), K^i(s)\bigr]\bigr\}, \quad \text{for } i \in N. \tag{2.6}$$
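The optimal choice of noncapital input by firm $i$ satisfies
$$P'\left[\sum_{j=1}^{n} f^j\bigl[L^j(s), K^j(s)\bigr]\right] f^i_{L^i}\bigl[L^i(s), K^i(s)\bigr]\, f^i\bigl[L^i(s), K^i(s)\bigr] + P\left[\sum_{j=1}^{n} f^j\bigl[L^j(s), K^j(s)\bigr]\right] f^i_{L^i}\bigl[L^i(s), K^i(s)\bigr] - c^i_{L^i}\bigl\{f^i\bigl[L^i(s), K^i(s)\bigr]\bigr\} = 0, \tag{2.7}$$
for $i \in N$, where $P'$ is the derivative of the price function and a subscript $L^i$ denotes differentiation with respect to $L^i$. If an instantaneous industry equilibrium exists, the optimal choice of noncapital inputs by these $n$ firms can be found by solving (2.7) and be expressed as
$$L^{i*}(s) = \ell^i\bigl[K^1(s), K^2(s), \ldots, K^n(s)\bigr] = \ell^i\bigl[K(s)\bigr], \quad \text{for } i \in N. \tag{2.8}$$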
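Substituting $L^{i*}(s) = \ell^i[K(s)]$ from (2.8) into the firm’s instantaneous operating profit in (2.6) yields
$$\pi^i\bigl[K(s)\bigr] = P\left[\sum_{j=1}^{n} f^j\bigl[\ell^j[K(s)], K^j(s)\bigr]\right] f^i\bigl[\ell^i[K(s)], K^i(s)\bigr] - c^i\bigl\{f^i\bigl[\ell^i[K(s)], K^i(s)\bigr]\bigr\}, \quad \text{for } i \in N. \tag{2.9}$$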
The cost of investment is $m^i[I^i(s)]$. Firms will choose an investment path over the time period $[t_0, T]$ to maximize their future streams of profits. The present value of future profits of firm $i$ can then be expressed as
$$\int_{t_0}^{T} \bigl\{\pi^i\bigl[K(s)\bigr] - m^i\bigl[I^i(s)\bigr]\bigr\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds, \quad \text{for } i \in N. \tag{2.10}$$
The maximization of (2.10) by firm $i \in N$ subject to (2.5) forms a differential game. If the time horizon approaches infinity, that is, $T = \infty$, an infinite-horizon version of the game in (2.5) and (2.10) can be set up as
$$\int_{t_0}^{\infty} \bigl\{\pi^i\bigl[K(s)\bigr] - m^i\bigl[I^i(s)\bigr]\bigr\} \exp(-rs)\,ds, \quad \text{for } i \in N, \tag{2.11}$$
subject to (2.5).
Example 2.1 Consider a specific example of investment games in which the demand function, the cost of investment, and the cost of production are given as
$$P\bigl[Q(s)\bigr] = a - Q(s), \qquad m^i\bigl[I^i(s)\bigr] = a_i \bigl[I^i(s)\bigr]^2, \qquad c^i\bigl\{f^i\bigl[L^{i*}(s), K^i(s)\bigr]\bigr\} = b_i K^i(s) + \hat{b}_i \bigl[K^i(s)\bigr]^2.$$
The interest rate is $r$. With these specifications, different investment games of a linear quadratic type can be formulated.
Example 2.2 A knowledge investment game, with knowledge being a public good, can be formulated as follows. The level of knowledge $K(s)$ will change according to the accumulation equation
$$\dot{K}(s) = \sum_{j=1}^{n} I^j(s) - \delta K(s), \quad K(t_0) = K_0.$$
Economic agent $i$’s cost of investment in the public knowledge capital is
$$m^i\bigl[I^i(s)\bigr] = \rho I^i(s) + \frac{1}{2}\bigl[I^i(s)\bigr]^2, \quad \text{for } i \in N.$$
Economic agent $i$’s instantaneous operating net revenue is
$$\pi^i\bigl[K(s)\bigr] = K(s)\bigl[a^i - K(s)\bigr].$$
Once again the interest rate is $r$. Dockner (1992), Fershtman and Muller (1984, 1986), Fudenberg and Tirole (1983, 1986, 1991), Reynolds (1987, 1991), and Spence (1979) presented various examples of this class of investment games.
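To illustrate the structure of Example 2.2, the sketch below simulates the public knowledge stock under constant investment rates and evaluates one agent's discounted stream of net revenue minus investment cost. The constant investment paths, the Euler scheme, and all function and parameter names are assumptions made for illustration; they are not equilibrium strategies derived in the text.

```python
import math

def knowledge_path(investments, delta, K0, horizon, steps=10000):
    """Euler simulation of dK/ds = sum_j I_j - delta*K with constant I_j."""
    h = horizon / steps
    K, path = K0, [K0]
    for _ in range(steps):
        K += (sum(investments) - delta * K) * h
        path.append(K)
    return path

def agent_payoff(i, investments, a, rho, delta, r, K0, horizon, steps=10000):
    """Discounted net revenue K*(a_i - K) minus cost rho*I_i + I_i**2/2."""
    h = horizon / steps
    path = knowledge_path(investments, delta, K0, horizon, steps)
    I = investments[i]
    cost = rho * I + 0.5 * I ** 2
    # left Riemann sum of the discounted profit flow (hypothetical constant r)
    return sum((K * (a[i] - K) - cost) * math.exp(-r * k * h) * h
               for k, K in enumerate(path[:-1]))
```

Under constant investments the stock converges to $\sum_j I^j/\delta$, so the simulated path can be compared with the closed form $K(s) = \bar{K} + (K_0 - \bar{K})e^{-\delta(s - t_0)}$, where $\bar{K} = \sum_j I^j/\delta$.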
2.1.2.2 Renewable Resource Extraction Games
A general structure of renewable resource extraction games can be characterized as follows. Consider an economy endowed with a single renewable resource, with $n \ge 2$ resource extractors (firms). Let $u_i(s)$ denote the quantity of the resource extracted by firm $i$ at time $s$, for $i \in N$, where each firm controls its rate of extraction. Let $U^i$ be the set of admissible extraction rates and $x(s)$ the size of the resource stock at time $s$. In particular, we have $U^i \subseteq R^+$ for $x(s) > 0$, and $U^i = \{0\}$ for $x(s) = 0$. The growth dynamics of the renewable resource stock becomes
$$\dot{x}(s) = f\bigl[s, x(s)\bigr] - \sum_{j=1}^{n} u_j(s), \quad x(t_0) = x_0 > 0, \tag{2.12}$$
where $f[s, x(s)]$ is the natural rate of evolution of the resource. The extraction cost for firm $i \in N$ depends on the quantity of the resource extracted $u_i(s)$, the resource stock size $x(s)$, and some other input parameters. In particular, the extraction cost can be specified as
$$C^i\bigl[u_i(s), x(s)\bigr]. \tag{2.13}$$
The cost per unit of the resource extracted by firm $i$ is negatively related to the size of the resource stock. The market price of the resource depends on the total amount of the resource extracted and supplied to the market. The price-output relationship at time $s$ is given by the following downward-sloping demand curve:
$$p = P\bigl[s, Q(s)\bigr], \tag{2.14}$$
where $p$ is the market price of the resource and $Q(s) = \sum_{j=1}^{n} u_j(s)$ is the total amount of the resource extracted and marketed at time $s$. The firm’s horizon is $[t_0, T]$, and at time $T$ a terminal payment $q^i[x(T)]$ will be given to firm $i$.
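Firm $i$ seeks to maximize the present value of its profits
$$\int_{t_0}^{T} \bigl\{P\bigl[s, Q(s)\bigr] u_i(s) - C^i\bigl[u_i(s), x(s)\bigr]\bigr\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\bigl[x(T)\bigr], \quad \text{for } i \in N, \tag{2.15}$$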
subject to (2.12). If the time horizon approaches infinity, that is, $T = \infty$, an infinite-horizon version of the game in (2.12) and (2.15) can be set up as
$$\int_{t_0}^{\infty} \bigl\{P\bigl[Q(s)\bigr] u_i(s) - C^i\bigl[u_i(s), x(s)\bigr]\bigr\} \exp\bigl[-r(s - t_0)\bigr]\,ds, \quad \text{for } i \in N, \tag{2.16}$$
subject to
$$\dot{x}(s) = f\bigl[x(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr], \quad x(t_0) = x_0 > 0. \tag{2.17}$$
Example 2.3 Consider the deterministic version of the Jørgensen and Yeung (1996) renewable resource game in which the growth dynamics is governed by
$$\dot{x}(s) = a x(s)^{1/2} - b x(s) - \sum_{j=1}^{n} u_j(s) \quad \text{and} \quad x(t_0) = x_0 > 0. \tag{2.18}$$
The natural growth function is $a x^{1/2} - bx = x[a x^{-1/2} - b]$. This function exhibits pure compensation, viz., the proportional growth rate $a x^{-1/2} - b$ is a decreasing function of $x$. The extraction cost for firm $i \in N$ depends on the quantity of the resource extracted $u_i(s)$, the resource stock size $x(s)$, and a parameter $c$. In particular, the extraction cost can be specified as follows:
$$C^i\bigl[u_i(s), x(s)\bigr] = \frac{c}{x(s)^{1/2}}\, u_i(s).$$
This specification implies that the cost per unit of the resource extracted by firm $i$, $c\,x(s)^{-1/2}$, decreases when $x(s)$ increases. The above cost structure was also adopted by Jørgensen and Yeung (1996). A decreasing unit cost follows from two assumptions: (i) the cost of extraction is proportional to the extraction effort and (ii) the amount of the resource extracted, seen as the output of a production function of two inputs (effort and stock level), is increasing in both inputs (cf. Clark 1990). The market price of the resource depends on the total amount extracted and supplied to the market. The price-output relationship at time $s$ is given by the following downward-sloping inverse demand curve:
$$P(s) = Q(s)^{-1/2},$$
where $Q(s) = \sum_{i \in N} u_i(s)$ is the total amount of the resource extracted and marketed at time $s$. The objective of extractor $i \in N$ is to maximize the present value of the stream of future profits
$$\int_{t_0}^{T} \left\{\left[\sum_{j=1}^{n} u_j(s)\right]^{-1/2} u_i(s) - \frac{c}{x(s)^{1/2}}\, u_i(s)\right\} e^{-r(s - t_0)}\,ds + e^{-r(T - t_0)}\, x(T)^{1/2}, \quad \text{for } i \in N, \tag{2.19}$$
subject to the stock dynamics of (2.18). Chiarella et al. (1984), Reinganum and Stokey (1985), Clemhout and Wan (1985a, 1994), Dockner and Kaitala (1989), Plourde and Yeung (1989), Jørgensen and Sorger (1990), Fischer and Mirman (1992), Kaitala (1993), and Dockner et al. (1989) presented specific dynamic resource extraction games.
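The stock dynamics (2.18) and the profit integrand in (2.19) can be explored numerically. The following Python sketch integrates the resource stock with an Euler scheme for user-supplied extraction rules; the rule signature `(s, x) -> u`, the function names, and the discretization are hypothetical choices and not part of the model in the text. The profit function assumes a strictly positive total extraction and stock.

```python
import math

def simulate_stock(a, b, x0, extraction, horizon, steps=20000):
    """Euler simulation of (2.18): dx/ds = a*sqrt(x) - b*x - sum_j u_j(s, x).

    extraction: list of hypothetical rules (s, x) -> u_j >= 0, one per firm.
    Returns the list of stock levels on the time grid.
    """
    h = horizon / steps
    x, s, path = x0, 0.0, [x0]
    for _ in range(steps):
        # no extraction is admissible once the stock is exhausted
        harvest = sum(u(s, x) for u in extraction) if x > 0 else 0.0
        x = max(x + (a * math.sqrt(x) - b * x - harvest) * h, 0.0)
        s += h
        path.append(x)
    return path

def instantaneous_profit(u_i, total_u, x, c):
    """Integrand of (2.19): price Q**(-1/2) times u_i minus c*u_i/sqrt(x)."""
    return total_u ** -0.5 * u_i - c * u_i / math.sqrt(x)
```

Without extraction the stock should approach the natural steady state $(a/b)^2$, at which $a x^{1/2} - bx = 0$; with a proportional rule $u = kx$ the steady state becomes $[a/(b + k)]^2$.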
2.1.2.3 Marketing Games
Three major types of marketing games are presented below.
(i) Market Share Models
Market share models derive their name from the fact that the state variables of the game are the firms’ market shares. In an n-firm oligopoly, let $x^i(s)$ denote the market share of firm $i \in N$. The state space $X$ is represented by
$$X = \left\{ x^i(s) \in R \;\middle|\; x^i(s) \in [0, 1],\ i \in N,\ \sum_{j=1}^{n} x^j(s) = 1 \right\}. \tag{2.20}$$
Let $u_i(s) \in R^m$ denote the advertising efforts of firm $i$ at time $s$; a general version of the market shares dynamics can be expressed as
$$\dot{x}^i(s) = \bigl[1 - x^i(s)\bigr] f^i\bigl[u_i(s)\bigr] - x^i(s) \sum_{\substack{j=1 \\ j \ne i}}^{n} f^j\bigl[u_j(s)\bigr] = f^i\bigl[u_i(s)\bigr] - x^i(s) \sum_{j=1}^{n} f^j\bigl[u_j(s)\bigr], \quad \text{for } i \in N. \tag{2.21}$$
The advertising response function $f^i[u_i(s)]$ is positive for positive advertising efforts. A diminishing (or nonincreasing) marginal product of advertising efforts is assumed, leading to the second-order derivative of $f^i[u_i(s)]$ being nonpositive.
Instead of market shares, the state variable may also represent the sales rates, denoted here by $S^i(s)$, in a market where sales are fixed, say at level $\bar{m}$, and therefore
$$\sum_{j=1}^{n} S^j(s) = \bar{m}. \tag{2.22}$$
Equation (2.22) reflects a market at its maturity stage with a stationary total sales volume of $\bar{m}$. The market share of firm $i$ is then $S^i(s)/\bar{m} = x^i(s)$. The dynamics of the change in sales rates can be formulated as
$$\dot{S}^i(s) = \bigl[\bar{m} - S^i(s)\bigr] f^i\bigl[u_i(s)\bigr] - S^i(s) \sum_{\substack{j=1 \\ j \ne i}}^{n} f^j\bigl[u_j(s)\bigr] = \bar{m} f^i\bigl[u_i(s)\bigr] - S^i(s) \sum_{j=1}^{n} f^j\bigl[u_j(s)\bigr], \quad \text{for } i \in N. \tag{2.23}$$
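Firm $i$’s cost of advertising efforts is $c^i[u_i(s)]$ and the gross profit of a unit of sales is $P_i$. The terminal valuation of the sales (or market shares) yields firm $i$ a value $q^i[S^i(T)]$. With $S^i(s)$ denoting the sales rate of firm $i$, the profit to firm $i$ can be expressed as
$$\int_{t_0}^{T} \bigl\{P_i S^i(s) - c^i\bigl[u_i(s)\bigr]\bigr\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\bigl[S^i(T)\bigr]. \tag{2.24}$$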
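If the time horizon approaches infinity, that is, $T = \infty$, an infinite-horizon version of the game can be set up as
$$\max_{u_i} \int_{t_0}^{\infty} \bigl\{P_i S^i(s) - c^i\bigl[u_i(s)\bigr]\bigr\} \exp\bigl[-r(s - t_0)\bigr]\,ds, \quad \text{for } i \in N, \tag{2.25}$$
subject to (2.23), where $S^i(s)$ is the sales rate of firm $i$.
Example 2.4 A popular specification of the response function is
$$f^i\bigl[u_i(s)\bigr] = \beta_i \bigl[u_i(s)\bigr]^{\alpha_i},$$
with $\beta_i > 0$ and $\alpha_i \in (0, 1]$. The cost of advertising is
$$c^i\bigl[u_i(s)\bigr] = c_i \bigl[u_i(s)\bigr]^2,$$
where $c_i$ is a positive constant.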
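A short simulation helps to see how the market share dynamics (2.21) behave under the response function of Example 2.4. The sketch below holds advertising efforts constant, which is an illustrative assumption rather than an equilibrium outcome, and the names `response` and `simulate_shares` as well as the Euler scheme are hypothetical.

```python
def response(u, beta, alpha):
    """Response function of Example 2.4: f_i(u_i) = beta_i * u_i**alpha_i."""
    return beta * u ** alpha

def simulate_shares(x0, efforts, betas, alphas, horizon, steps=10000):
    """Euler simulation of (2.21) under constant advertising efforts."""
    h = horizon / steps
    f = [response(u, b, a) for u, b, a in zip(efforts, betas, alphas)]
    total = sum(f)
    x = list(x0)
    for _ in range(steps):
        # dx_i/ds = f_i - x_i * sum_j f_j
        x = [xi + (fi - xi * total) * h for xi, fi in zip(x, f)]
    return x
```

Summing (2.21) over all firms gives $\sum_i \dot{x}^i = \bigl[1 - \sum_i x^i\bigr] \sum_j f^j$, so shares that start on the simplex (2.20) stay on it, and with constant efforts each share tends to $f^i / \sum_j f^j$.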
Case (1979), Chintagunta and Jain (1995), Chintagunta and Vilcassim (1994), Erickson (1985, 1992, 1993, 1997), Fruchter (1999a, 1999b, 2001), Fruchter and Kalish (1998), Fruchter et al. (2001), Mesak and Calloway (1995), Mesak and Darrat (1993), Olsder (2001), and Sorger (1989) developed and analyzed market share models along this line.
A model closely related to the market shares model is the sales response model. A sales response game model specifies the rate of change of a firm’s sales rate $S^i(s)$ as a function of the marketing instruments of all the firms in the market. Let $u_i(s) \in R^m$ be the marketing instruments of firm $i$; a general specification of the sales dynamics is
$$\dot{S}^i(s) = f^i\bigl[s, S^1(s), S^2(s), \ldots, S^n(s), u_1(s), u_2(s), \ldots, u_n(s)\bigr], \quad S^i(t_0) = S_0^i, \quad \text{for } i \in N. \tag{2.26}$$
Example 2.5 Mukundan and Elsner (1975) presented a model with sales dynamics
$$\dot{S}^i(s) = \gamma_i u_i(s)\left[1 - \frac{S^i(s)}{S^1(s) + S^2(s)}\right] - \delta_i S^i(s), \quad \text{for } i \in \{1, 2\}.$$
Erickson (1995) presented a model with sales dynamics
$$\dot{S}^i(s) = \gamma_i u_i(s)\left[\bar{m}(s) - \sum_{j=1}^{n} S^j(s)\right] - \delta_i S^i(s), \quad \text{for } i \in N,$$
where $\bar{m}(s)$ is the time-varying market potential. Deal (1979), Feichtinger and Dockner (1984), Jørgensen (1982), Little (1979), Sethi (1973), and Wang and Wu (2001) presented various sales response models.
(ii) New Product Diffusion Models
New product diffusion models are paradigms in which new products or services are introduced and their reputations built up in the market. The cumulative sales affect the current instantaneous sales as the market becomes more mature and the knowledge of the products becomes more available. Using $x^i(s)$ to denote the cumulative sales of product $i$ at time $s$, the time derivative of $x^i(s)$ then represents the sales rate at time $s$. A general diffusion process governing the sales dynamics can be expressed as
$$\dot{x}^i(s) = f^i\bigl[s, u_1(s), u_2(s), \ldots, u_n(s), x^1(s), x^2(s), \ldots, x^n(s)\bigr], \quad x^i(t_0) = x_0^i, \tag{2.27}$$
for $i \in N$, where $u_j(s)$ is the advertising strategy of firm $j$ at time $s$. The function $f^i$ is assumed to satisfy the conditions $f^i_{u_i} > 0$, $f^i_{u_i u_j} < 0$. In a market with all products being substitutes of each other, $f^i_{u_j} < 0$ for $i \ne j$.
The instantaneous profit to firm $i$ is $\pi^i[\dot{x}^i(s), u_i(s), x^i(s)]$. In particular, $\pi^i[\dot{x}^i(s), u_i(s), x^i(s)]$ can take on a formulation like $R^i[\dot{x}^i(s), x^i(s)] - c^i[u_i(s)]$, where $R^i[\dot{x}^i(s), x^i(s)]$ is the instantaneous net revenue from sales $\dot{x}^i(s)$ and $c^i[u_i(s)]$ is the cost of advertising. The cumulative sales may affect the cost of production if experience counts. A general dynamic game model can be formulated as
$$\max_{u_i} \int_{t_0}^{T} \pi^i\bigl[\dot{x}^i(s), u_i(s), x^i(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds, \quad \text{for } i \in N, \tag{2.28}$$
subject to the dynamics in (2.27).
Example 2.6 Consider an oligopolistic extension of the Horsky and Simon (1983) model in which firm $i$ seeks to maximize
$$\int_{t_0}^{T} \bigl[\pi_i \dot{x}^i(s) - u_i(s)\bigr] \exp\bigl[-r(s - t_0)\bigr]\,ds, \quad \text{for } i \in N,$$
where $\pi_i$ is the nonnegative unit margin of firm $i$’s product. The sales dynamics is
$$\dot{x}^i(s) = \left[\alpha + \beta \ln u_i(s) + \gamma \sum_{j=1}^{n} x^j(s)\right]\left[\bar{m} - \sum_{j=1}^{n} x^j(s)\right], \quad \text{for } i \in N.$$
Industry-wide positive effects are realized as the new products’ cumulative sales increase. For other new product diffusion models one can see Dockner and Jørgensen (1988, 1992).
(iii) Goodwill Models
Another class of advertising games is one that deals with the accumulation of a stock of goodwill or brand image. Let $G^i(s)$ denote the stock of goodwill of firm $i$ at time $s$. A general form of the dynamics of the goodwill of firm $i$ is
$$\dot{G}^i(s) = h^i\bigl[s, u_i(s), G^i(s), x^i(s)\bigr], \quad \text{for } i \in N, \tag{2.29}$$
where $u_i(s)$ is the effort on the creation of goodwill and $x^i(s)$ is the market share or sales rate of firm $i$. The market share (sales rate) of firm $i$ may be affected by all the firms’ goodwill stocks and market shares. The dynamics of the market share or sales rate of firm $i$ yields the relationships
$$\dot{x}^i(s) = f^i\bigl[s, u_1(s), u_2(s), \ldots, u_n(s), x^1(s), x^2(s), \ldots, x^n(s), G^1(s), G^2(s), \ldots, G^n(s)\bigr], \tag{2.30}$$
for $i \in N$. Firm $i$ seeks to maximize
$$\int_{t_0}^{T} \pi^i\bigl[x^i(s), u_i(s), G^1(s), G^2(s), \ldots, G^n(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds, \quad \text{for } i \in N, \tag{2.31}$$
subject to (2.29) and (2.30). The term $\pi^i[x^i(s), u_i(s), G^1(s), G^2(s), \ldots, G^n(s)]$ represents the instantaneous net revenue of firm $i$. An infinite-horizon game problem can be formulated with $T = \infty$, a constant discount rate, and autonomous versions of the goodwill dynamics in (2.29) and of the market share dynamics in (2.30).
Example 2.7 Fornell et al. (1985) exploited the concept of consumption as a form of production and assumed that production learning took place. This resulted in consumption experience. In an oligopolistic market, brand-specific consumption experience stocks are denoted by $G^1, G^2, \ldots, G^n$. The dynamics of these experience stocks is
$$\dot{G}^i(s) = x^i(s) - \delta G^i(s), \quad G^i(t_0) = G_0^i, \quad \text{for } i \in N,$$
where $x^i(s)$ is the market share of firm $i$. Firm $i$ controls the ratio of its advertising expenditure to unit sales $a^i(s)$ and the ratio of its promotion expense to unit sales $b^i(s)$. The market shares of firm $i$ evolve according to
$$\dot{x}^i(s) = \sum_{j=1}^{n} x^i(s) x^j(s) \bigl\{ f\bigl[a^i(s), x(s), G^i(s)\bigr] - f\bigl[a^j(s), x(s), G^j(s)\bigr] + g\bigl[b^i(s)\bigr] - g\bigl[b^j(s)\bigr] \bigr\}, \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N,$$
where $x(s) = \{x^1(s), x^2(s), \ldots, x^n(s)\}$.
Example 2.8 Consider a duopoly in which the goodwill dynamics is
$$\dot{G}^i(s) = u_i(s) - \delta G^i(s), \quad G^i(t_0) = G_0^i, \quad \text{for } i \in \{1, 2\}.$$
The sales rate of firm $i$ at time $s$, denoted here by $S^i$, is
$$S^i\bigl[G^1(s), G^2(s)\bigr] = \alpha_i G^i(s) - \beta_i G^j(s) - \gamma_i \bigl[G^i(s)\bigr]^2 + \theta_i \bigl[G^j(s)\bigr]^2 + \varsigma_i G^i(s) G^j(s),$$
for $i, j \in \{1, 2\}$ and $i \ne j$, where $\alpha_i$, $\beta_i$, $\gamma_i$, $\theta_i$, and $\varsigma_i$ are positive constants. The objectives of the duopolists are
$$\int_{t_0}^{\infty} \bigl\{\pi_i S^i\bigl[G^1(s), G^2(s)\bigr] - u_i(s)\bigr\} \exp\bigl[-r(s - t_0)\bigr]\,ds, \quad \text{for } i, j \in \{1, 2\},$$
where $\pi_i$ is the constant unit margin of firm $i$. Chintagunta (1993), Feichtinger et al. (1994), Fershtman (1984), Sethi and Thompson (2000), and Tapiero (1979) considered games involving goodwill.
2.1.3 Market Equilibrium
The outcome in the economic system (often known as market outcome when the system is driven by markets) is characterized by an equilibrium in which each participant is maximizing its objective given the other participants’ optimal choices of controls/strategies. A set of strategies $\{\upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_n^*(s)\}$ is said to constitute a noncooperative Nash equilibrium solution for the n-person differential game in (2.1) and (2.2) if the following inequalities are satisfied for all $\upsilon_i(s) \in U^i$, $i \in N$:
$$\begin{aligned}
&\int_{t_0}^{T} g^i\bigl[s, x^*(s), \upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_n^*(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\bigl(x^*(T)\bigr) \\
&\quad \ge \int_{t_0}^{T} g^i\bigl[s, \hat{x}^i(s), \upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_{i-1}^*(s), \upsilon_i(s), \upsilon_{i+1}^*(s), \ldots, \upsilon_n^*(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds \\
&\qquad + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\bigl(\hat{x}^i(T)\bigr),
\end{aligned} \tag{2.32}$$
where, on the time interval $[t_0, T]$,
$$\dot{x}^*(s) = f\bigl[s, x^*(s), \upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_n^*(s)\bigr], \quad x^*(t_0) = x_0,$$
and
$$\dot{\hat{x}}^i(s) = f\bigl[s, \hat{x}^i(s), \upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_{i-1}^*(s), \upsilon_i(s), \upsilon_{i+1}^*(s), \ldots, \upsilon_n^*(s)\bigr], \quad \hat{x}^i(t_0) = x_0.$$
Similarly, a set of strategies $\{\upsilon_1^*(s), \upsilon_2^*(s), \ldots, \upsilon_n^*(s)\}$ constitutes a noncooperative Nash equilibrium solution for the infinite-horizon n-person differential game in (2.3) and (2.4) if a set of inequalities similar to (2.32) holds with $T = \infty$, discount factor $\exp[-r(s - t_0)]$, objective functions $g^i$ and state growth $f$ as in (2.3) and (2.4), and the terminal valuation $q^i$ omitted. Since the game is played over time, the conditions on the commitment of the agents’ strategies at the beginning of the game duration have to be specified. If economic agents choose to commit their strategies from the outset, they are using open-loop strategies. If economic agents can revise their strategies contingent upon the state variables, they are using feedback strategies.
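The distinction between the two information structures can be sketched in a few lines of Python. The scalar dynamics $\dot{x} = -u$, the particular strategy rules, and the state shock below are hypothetical and serve only to show that an open-loop rule depends on time and the initial state, whereas a feedback rule responds to the current state.

```python
import math

def run(strategy, x0, horizon=1.0, steps=1000, shock_at=None, shock=0.0):
    """Euler path of dx/ds = -u; 'strategy' maps (s, x, x0) to a control u."""
    h = horizon / steps
    x, controls = x0, []
    for k in range(steps):
        s = k * h
        if shock_at is not None and k == shock_at:
            x += shock                  # unexpected change in the state
        u = strategy(s, x, x0)
        controls.append(u)
        x -= u * h
    return x, controls

def open_loop(s, x, x0):
    """Committed at the outset: depends on time s and the initial state only."""
    return x0 * math.exp(-s)

def feedback(s, x, x0):
    """Revised contingent on the current state x."""
    return x
```

Along the undisturbed path both rules prescribe approximately the same control, since $x(s) = x_0 e^{-s}$ under the feedback rule. After a shock to the state, only the feedback rule adjusts; the open-loop control path is unchanged.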
2.2 Market Outcomes Under Open-Loop Nash Equilibria
If the agents have to commit their strategies from the outset, the agents’ information structure can be seen as an open-loop pattern in which $\eta_i(s) = \{x_0\}$, $s \in [t_0, T]$. Their strategies become functions of the initial state $x_0$ and time $s$ and can be expressed as $\{u_i(s) = \vartheta_i(s, x_0), \text{ for } i \in N\}$.
2.2.1 Characterization of Open-Loop Equilibria
An open-loop Nash equilibrium for the game in (2.1) and (2.2) is characterized as follows.
Theorem 2.1 If a set of strategies $\{u_i^*(s) = \zeta_i^*(s, x_0), \text{ for } i \in N\}$ provides an open-loop Nash equilibrium solution to the game in (2.1) and (2.2), and $\{x^*(s), t_0 \le s \le T\}$ is the corresponding optimal state trajectory, then there exist m costate functions $\Lambda^i(s) : [t_0, T] \to R^m$, for $i \in N$, such that the following relations are satisfied:
$$\begin{aligned}
\zeta_i^*(s, x_0) \equiv u_i^*(s) = \arg\max_{u_i \in U^i} \biggl\{ & g^i\bigl[s, x^*(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] \\
& + \Lambda^i(s) f\bigl[s, x^*(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\bigr] \biggr\}, \\
\dot{x}^*(s) = {} & f\bigl[s, x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\bigr], \quad x^*(t_0) = x_0, \\
\dot{\Lambda}^i(s) = {} & -\frac{\partial}{\partial x^*} \biggl\{ g^i\bigl[s, x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\bigr] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] \\
& + \Lambda^i(s) f\bigl[s, x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\bigr] \biggr\}, \\
\Lambda^i(T) = {} & \frac{\partial}{\partial x^*} q^i\bigl(x^*(T)\bigr) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right];
\end{aligned} \tag{2.33}$$
for i ∈ N . Proof Consider the problem of choosing a control path υi∗ (s) = u∗i (s) = ζi∗ (s, x0 ) that maximizes T g i s, x(s), u∗1 (s), u∗2 (s), . . . , u∗i−1 (s), ui (s), u∗i+1 (s), . . . , u∗n (s) t0
s × exp − r(y) dy ds + exp − t0
T
r(y) dy q i x(T ) ,
t0
subject to the state dynamics x(s) ˙ = f s, x(s), u∗1 (s), u∗2 (s), . . . , u∗i−1 (s), ui (s), u∗i+1 (s), . . . , u∗n (s) , x(t0 ) = x0 , for i ∈ N . This is a standard optimal control problem for agent i, treating u∗j (s) for j ∈ N and j = i as time paths given at the beginning of the game. Invoking Theorem A.3 in the Technical Appendixes, the conditions for a maximum for agent i’s problem is characterized by the ith set of equalities in Theorem 2.1. Since the set of equalities for all n agents holds, a Nash equilibrium as in (2.32) will arise. There may be multiple Nash equilibria. We assume that the agents will choose an equilibrium at time t0 and stick with the corresponding strategies for the entire game interval. The derivation of open-loop equilibria in nonzero-sum deterministic differential games first appeared in Berkovitz (1964) and Ho et al. (1965), with open-loop and feedback Nash equilibria in nonzero-sum deterministic differential games being presented in Case (1967, 1969) and Starr and Ho (1969a, 1969b). In the case when the game horizon approaches infinity, we can characterize an open-loop equilibrium solution to the infinite-horizon game in (2.3) and (2.4) as follows. Theorem 2.2 If a set of strategies {u∗i (s) = ζi∗ (s, xt ), for i ∈ N } provides an openloop Nash equilibrium solution to the infinite-horizon game in (2.3) and (2.4) and {x ∗ (s), t ≤ s ≤ T } is the corresponding optimal state trajectory, then there exist m
costate functions $\lambda^i(s): [t, \infty) \to R^m$, for $i \in N$, such that the following relations are satisfied:

$$\zeta_i^*(s, x_t) \equiv u_i^*(s) = \arg\max_{u_i \in U^i} \left\{ g^i\left[x^*(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\right] + \lambda^i(s) f\left[x^*(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\right] \right\},$$

$$\dot{x}^*(s) = f\left[x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\right], \quad x^*(t) = x_t,$$

$$\dot{\lambda}^i(s) = r\lambda^i(s) - \frac{\partial}{\partial x^*} \left\{ g^i\left[x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\right] + \lambda^i(s) f\left[x^*(s), u_1^*(s), u_2^*(s), \ldots, u_n^*(s)\right] \right\}, \quad \text{for } i \in N. \tag{2.34}$$

Proof Consider the problem of choosing a control path $\upsilon_i^*(s) = u_i^*(s)$ that maximizes

$$\int_{t}^{\infty} g^i\left[x(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\right] \exp\left[-r(s - t)\right] ds$$

subject to the state dynamics

$$\dot{x}(s) = f\left[x(s), u_1^*(s), u_2^*(s), \ldots, u_{i-1}^*(s), u_i(s), u_{i+1}^*(s), \ldots, u_n^*(s)\right], \quad x(t) = x_t,$$

for $i \in N$. This is an infinite-horizon optimal control problem for agent i, treating $u_j^*(s)$, for $j \in N$ and $j \ne i$, as time paths given at the beginning of the game. Invoking Theorem A.4 in the Technical Appendixes, the conditions for a maximum of agent i's problem are characterized by the ith set of equalities in Theorem 2.2. Since the sets of equalities for all n agents hold, a Nash equilibrium as in (2.32) will arise.

A detailed account of the applications of open-loop equilibria in marketing, economics, and management science can be found in Zaccour (2003) and Dockner et al. (2000).
2.2.2 Open-Loop Solution in Competitive Advertising

Consider the competitive dynamic advertising game in Sorger (1989). There are two firms in a market, and the profits of firm 1 and firm 2 are, respectively,

$$\int_{0}^{T} \left[ q_1 x(s) - \frac{c_1}{2} u_1(s)^2 \right] \exp(-rs)\, ds + \exp(-rT) S_1 x(T),$$

and

$$\int_{0}^{T} \left[ q_2 \left(1 - x(s)\right) - \frac{c_2}{2} u_2(s)^2 \right] \exp(-rs)\, ds + \exp(-rT) S_2 \left[1 - x(T)\right], \tag{2.35}$$

where $r$, $q_i$, $c_i$, $S_i$, for $i \in \{1, 2\}$, are positive constants, $x(s)$ is the market share of firm 1 at time $s$, $[1 - x(s)]$ is that of firm 2, and $u_i(s)$ is the advertising rate of firm $i \in \{1, 2\}$. It is assumed that market potential is constant over time. The only marketing instrument used by the firms is advertising. Advertising has diminishing returns, since the quadratic cost function implies increasing marginal costs of advertising. The dynamics of firm 1's market share is governed by

$$\dot{x}(s) = u_1(s)\left[1 - x(s)\right]^{1/2} - u_2(s)\, x(s)^{1/2}, \quad x(0) = x_0. \tag{2.36}$$
Suppose that the firms seek an open-loop solution. Using open-loop strategies requires the firms to determine their action paths at the outset. This is realistic only if there are restrictive commitments concerning advertising. Invoking Theorem 2.1, an open-loop solution to the game in (2.35) and (2.36) has to satisfy the following conditions:

$$u_1^*(s) = \arg\max_{u_1} \left\{ \left[ q_1 x^*(s) - \frac{c_1}{2} u_1(s)^2 \right] \exp(-rs) + \Lambda^1(s) \left[ u_1(s)\left(1 - x^*(s)\right)^{1/2} - u_2^*(s)\, x^*(s)^{1/2} \right] \right\},$$

$$u_2^*(s) = \arg\max_{u_2} \left\{ \left[ q_2 \left(1 - x^*(s)\right) - \frac{c_2}{2} u_2(s)^2 \right] \exp(-rs) + \Lambda^2(s) \left[ u_1^*(s)\left(1 - x^*(s)\right)^{1/2} - u_2(s)\, x^*(s)^{1/2} \right] \right\},$$

$$\dot{x}^*(s) = u_1^*(s)\left[1 - x^*(s)\right]^{1/2} - u_2^*(s)\, x^*(s)^{1/2}, \quad x^*(0) = x_0,$$

$$\dot{\Lambda}^1(s) = -q_1 \exp(-rs) + \Lambda^1(s) \left[ \frac{1}{2} u_1^*(s)\left(1 - x^*(s)\right)^{-1/2} + \frac{1}{2} u_2^*(s)\, x^*(s)^{-1/2} \right],$$

$$\dot{\Lambda}^2(s) = q_2 \exp(-rs) + \Lambda^2(s) \left[ \frac{1}{2} u_1^*(s)\left(1 - x^*(s)\right)^{-1/2} + \frac{1}{2} u_2^*(s)\, x^*(s)^{-1/2} \right],$$

$$\Lambda^1(T) = \exp(-rT) S_1, \quad \Lambda^2(T) = -\exp(-rT) S_2. \tag{2.37}$$
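The first-order conditions of the two maximization problems in (2.37) are $-c_1 u_1 \exp(-rs) + \Lambda^1(s)[1 - x^*(s)]^{1/2} = 0$ and $-c_2 u_2 \exp(-rs) - \Lambda^2(s)\, x^*(s)^{1/2} = 0$. Using them, we obtain

$$u_1^*(s) = \frac{\Lambda^1(s)}{c_1} \left[1 - x^*(s)\right]^{1/2} \exp(rs) \quad \text{and} \quad u_2^*(s) = -\frac{\Lambda^2(s)}{c_2}\, x^*(s)^{1/2} \exp(rs).$$

Substituting $u_1^*(s)$ and $u_2^*(s)$ into the costate equations in (2.37) yields

$$\dot{\Lambda}^1(s) = -q_1 \exp(-rs) + \exp(rs) \left\{ \frac{[\Lambda^1(s)]^2}{2c_1} - \frac{\Lambda^1(s)\Lambda^2(s)}{2c_2} \right\},$$

$$\dot{\Lambda}^2(s) = q_2 \exp(-rs) + \exp(rs) \left\{ \frac{\Lambda^1(s)\Lambda^2(s)}{2c_1} - \frac{[\Lambda^2(s)]^2}{2c_2} \right\}, \tag{2.38}$$

with boundary conditions $\Lambda^1(T) = \exp(-rT) S_1$ and $\Lambda^2(T) = -\exp(-rT) S_2$. The game equilibrium state dynamics becomes

$$\dot{x}^*(s) = \frac{\Lambda^1(s) \exp(rs)}{c_1} \left[1 - x^*(s)\right] + \frac{\Lambda^2(s) \exp(rs)}{c_2}\, x^*(s), \quad x^*(0) = x_0. \tag{2.39}$$

Solving the system of differential equations in (2.38) gives the solution time paths of $\Lambda^1(s)$ and $\Lambda^2(s)$. Using these time paths in (2.39), a solution time path for $x^*(s)$ can be derived. Substituting these solution paths into $u_1^*(s)$ and $u_2^*(s)$ yields the open-loop game equilibrium strategies.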
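The solution procedure just described (costates backward from T, then the market share forward from 0) can be sketched numerically. The sketch below is only illustrative: it assumes the controls implied by the first-order conditions of (2.37), namely u1* = Λ1 exp(rs)(1 − x*)^{1/2}/c1 and u2* = −Λ2 exp(rs)(x*)^{1/2}/c2, and the function name `solve_open_loop`, the step count, and all parameter values are hypothetical choices rather than anything taken from the text.

```python
import math


def solve_open_loop(q1, q2, c1, c2, S1, S2, r, T, x0, steps=2000):
    """Backward RK4 for the costates, then forward Heun for the market share.

    Assumes the costate system obtained by inserting the first-order
    conditions of (2.37) into its costate equations (an assumption of
    this sketch, stated in the lead-in).
    """
    h = T / steps

    def costate_rhs(s, lam1, lam2):
        e = math.exp(r * s)
        d1 = -q1 / e + e * (lam1 * lam1 / (2 * c1) - lam1 * lam2 / (2 * c2))
        d2 = q2 / e + e * (lam1 * lam2 / (2 * c1) - lam2 * lam2 / (2 * c2))
        return d1, d2

    lam1 = [0.0] * (steps + 1)
    lam2 = [0.0] * (steps + 1)
    lam1[steps] = math.exp(-r * T) * S1    # terminal condition for firm 1
    lam2[steps] = -math.exp(-r * T) * S2   # terminal condition for firm 2
    for k in range(steps, 0, -1):          # integrate from s = T back to 0
        s, a, b = k * h, lam1[k], lam2[k]
        k1 = costate_rhs(s, a, b)
        k2 = costate_rhs(s - h / 2, a - h / 2 * k1[0], b - h / 2 * k1[1])
        k3 = costate_rhs(s - h / 2, a - h / 2 * k2[0], b - h / 2 * k2[1])
        k4 = costate_rhs(s - h, a - h * k3[0], b - h * k3[1])
        lam1[k - 1] = a - h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        lam2[k - 1] = b - h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])

    def x_rhs(k, x):
        # Equilibrium market-share dynamics with the controls substituted in.
        e = math.exp(r * k * h)
        return lam1[k] * e / c1 * (1 - x) + lam2[k] * e / c2 * x

    x = [x0] * (steps + 1)
    for k in range(steps):                 # Heun (trapezoidal) step forward
        f0 = x_rhs(k, x[k])
        x[k + 1] = x[k] + h / 2 * (f0 + x_rhs(k + 1, x[k] + h * f0))

    u1, u2 = [], []
    for k in range(steps + 1):
        e = math.exp(r * k * h)
        u1.append(lam1[k] * e / c1 * math.sqrt(max(0.0, 1 - x[k])))
        u2.append(-lam2[k] * e / c2 * math.sqrt(max(0.0, x[k])))
    return lam1, lam2, x, u1, u2
```

With identical firms and x0 = 1/2 the two costates mirror each other, so the market share should stay at 1/2; this gives a quick sanity check on any implementation.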
2.3 Market Outcomes Under Feedback Equilibria

In many economic analyses we cannot assume that agents would commit to fixed control paths at the outset of the game, as in the case of the open-loop solution. In particular, there are hardly any means of preventing the agents from revising their strategies during the game. Instead, agents would consider adopting feedback strategies, which are decision rules that depend upon the current state $x(t)$ and current time $t$, for $t_0 \le t \le T$.
2.3.1 Characterization of Feedback Equilibria

For the n-person differential game of (2.1) and (2.2), an n-tuple of feedback strategies $\{u_i^*(s) = \phi_i^*(s, x) \in U^i, \text{ for } i \in N\}$ constitutes a Nash equilibrium solution if the following relations for each $i \in N$ are satisfied:

$$\int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \phi_2^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right]$$

$$\ge \int_{t}^{T} g^i\left[s, x^{i}(s), \phi_1^*(s, x^{i}(s)), \phi_2^*(s, x^{i}(s)), \ldots, \phi_{i-1}^*(s, x^{i}(s)), \phi_i(s, x^{i}(s)), \phi_{i+1}^*(s, x^{i}(s)), \ldots, \phi_n^*(s, x^{i}(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^{i}(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right],$$

$$\forall \phi_i(s, x) \in U^i, \quad x \in R^m, \tag{2.40}$$

where, on the interval $[t, T]$,

$$\dot{x}^*(s) = f\left[s, x^*(s), \phi_1^*(s, x^*(s)), \phi_2^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right], \quad x^*(t) = x;$$

and

$$\dot{x}^{i}(s) = f\left[s, x^{i}(s), \phi_1^*(s, x^{i}(s)), \phi_2^*(s, x^{i}(s)), \ldots, \phi_{i-1}^*(s, x^{i}(s)), \phi_i(s, x^{i}(s)), \phi_{i+1}^*(s, x^{i}(s)), \ldots, \phi_n^*(s, x^{i}(s))\right], \quad x^{i}(t) = x, \quad \text{for } i \in N.$$

One salient feature of the concept introduced above is that if an n-tuple $\{\phi_i^*; i \in N\}$ provides a feedback Nash equilibrium solution (FNES) to an n-person differential game with duration $[t_0, T]$, its restriction to the time interval $[t, T]$ provides an FNES to the same differential game defined on the shorter time interval $[t, T]$, with the initial state taken as $x(t)$, and this is so for all $t_0 \le t \le T$. An immediate consequence of this observation is that feedback Nash equilibrium strategies depend only on the time variable and the current value of the state, but not on memory (including the initial state $x_0$). Therefore the agents' strategies can be expressed as $\{u_i(s) = \phi_i(s, x), \text{ for } i \in N\}$.

The following theorem provides a set of conditions characterizing a feedback Nash equilibrium solution for the game in (2.1) and (2.2).

Theorem 2.3 An n-tuple of strategies $\{u_i^*(s) = \phi_i^*(t, x) \in U^i, \text{ for } i \in N\}$ provides a feedback Nash equilibrium solution to the game in (2.1) and (2.2) if there exist continuously differentiable functions $V^{(t_0)i}(t, x): [t_0, T] \times R^m \to R$, $i \in N$, satisfying the following set of partial differential equations:

$$-V_t^{(t_0)i}(t, x) = \max_{u_i} \left\{ g^i\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_{i-1}^*(t, x), u_i(t, x), \phi_{i+1}^*(t, x), \ldots, \phi_n^*(t, x)\right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + V_x^{(t_0)i}(t, x) f\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_{i-1}^*(t, x), u_i(t, x), \phi_{i+1}^*(t, x), \ldots, \phi_n^*(t, x)\right] \right\}$$

$$= g^i\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_n^*(t, x)\right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + V_x^{(t_0)i}(t, x) f\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_n^*(t, x)\right],$$

$$V^{(t_0)i}(T, x) = q^i(x) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right], \quad i \in N.$$
Proof Invoking Theorem A.1 in the Technical Appendixes, $V^{(t_0)i}(t, x)$ is the maximized payoff associated with the optimal control problem of agent i for given strategies $\{u_j^*(s) = \phi_j^*(t, x) \in U^j, \text{ for } j \in N \text{ and } j \ne i\}$ of the other $n - 1$ agents. The conditions in Theorem 2.3 imply the expressions in (2.40), and hence yield a Nash equilibrium.

Again, there may be multiple Nash equilibria; the agents are assumed to choose an equilibrium at time $t_0$ and stick with the corresponding strategies for the entire game interval. Moreover, $V^{(t_0)i}(t, x)$ is the game equilibrium payoff of agent i at time $t \in [t_0, T]$ with the state being $x$, that is,

$$V^{(t_0)i}(t, x) = \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \phi_2^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right].$$

We also call it the value function of agent i in the game. A remark that will be utilized in the subsequent analysis is given below.

Remark 2.1 Let $V^{(\tau)i}(t, x)$ denote the value function of agent i in a game with the payoffs in (2.1) and dynamics in (2.2), which starts at time $\tau$ for $\tau \in [t_0, T)$. Note that the equilibrium feedback strategies are Markovian in the sense that they depend on the current time and current state. One can readily verify that

$$\exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] V^{(t_0)i}(t, x) = \exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] \left\{ \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \right\}$$

$$= \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{\tau}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{\tau}^{T} r(y)\,dy\right] = V^{(\tau)i}(t, x), \quad \text{for } \tau \in [t_0, T).$$

We now turn to the infinite-horizon autonomous game in (2.3) and (2.4). First, consider the infinite-horizon subgame that starts at time $\tau \in [t_0, \infty)$ with initial state $x(\tau) = x$:

$$\max_{u_i} \int_{\tau}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - \tau)\right] ds, \quad \text{for } i \in N, \tag{2.41}$$

subject to the dynamics

$$\dot{x}(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \quad x(\tau) = x. \tag{2.42}$$
The infinite-horizon autonomous game in (2.41) and (2.42) is independent of the choice of $\tau$ and dependent only upon the state at the starting time, that is, $x$. In the infinite-horizon optimization problem in Sect. A.1 in the Technical Appendixes, the feedback control is shown to be a function of the state variable $x$ only. With the validity of the game equilibrium $\{u_i^*(s) = \phi_i^*(x) \in U^i, \text{ for } i \in N\}$ to be verified later, we first define the following.

Definition 2.1 For the n-person differential game in (2.41) and (2.42), an n-tuple of feedback strategies $\{u_i^*(s) = \phi_i^*(x) \in U^i, \text{ for } i \in N\}$ constitutes a feedback Nash equilibrium solution if the following relations for each $i \in N$ are satisfied:

$$\int_{\tau}^{\infty} g^i\left[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))\right] \exp\left[-r(s - \tau)\right] ds$$

$$\ge \int_{\tau}^{\infty} g^i\left[x^{i}(s), \phi_1^*(x^{i}(s)), \phi_2^*(x^{i}(s)), \ldots, \phi_{i-1}^*(x^{i}(s)), \phi_i(x^{i}(s)), \phi_{i+1}^*(x^{i}(s)), \ldots, \phi_n^*(x^{i}(s))\right] \exp\left[-r(s - \tau)\right] ds,$$

$$\forall \phi_i(\cdot) \in \Gamma^i, \quad x \in R^m, \tag{2.43}$$

where, on the interval $[\tau, \infty)$,

$$\dot{x}^*(s) = f\left[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))\right], \quad x^*(\tau) = x;$$

$$\dot{x}^{i}(s) = f\left[x^{i}(s), \phi_1^*(x^{i}(s)), \phi_2^*(x^{i}(s)), \ldots, \phi_{i-1}^*(x^{i}(s)), \phi_i(x^{i}(s)), \phi_{i+1}^*(x^{i}(s)), \ldots, \phi_n^*(x^{i}(s))\right], \quad x^{i}(\tau) = x.$$
We can express the value function of agent i as

$$V^{(\tau)i}(t, x) = \exp\left[-r(t - \tau)\right] \int_{t}^{\infty} g^i\left[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))\right] \exp\left[-r(s - t)\right] ds, \quad \text{for } x(t) = x^*(t) = x.$$

Since $\int_{t}^{\infty} g^i[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))] \exp[-r(s - t)]\, ds$ is independent of the choice of $t$ and dependent only upon the state at the starting time $x$, we can write

$$\hat{V}^i(x) = \int_{t}^{\infty} g^i\left[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))\right] \exp\left[-r(s - t)\right] ds.$$

It follows that

$$V^{(\tau)i}(t, x) = \exp\left[-r(t - \tau)\right] \hat{V}^i(x),$$

$$V_t^{(\tau)i}(t, x) = -r \exp\left[-r(t - \tau)\right] \hat{V}^i(x), \quad \text{and}$$

$$V_x^{(\tau)i}(t, x) = \exp\left[-r(t - \tau)\right] \hat{V}_x^i(x), \quad \text{for } i \in N. \tag{2.44}$$
A feedback Nash equilibrium solution for the infinite-horizon autonomous game in (2.41) and (2.42) can be characterized as follows.

Theorem 2.4 An n-tuple of strategies $\{u_i^* = \phi_i^*(\cdot) \in U^i, \text{ for } i \in N\}$ provides a feedback Nash equilibrium solution to the infinite-horizon game in (2.3) and (2.4) if there exist continuously differentiable functions $\hat{V}^i(x): R^m \to R$, $i \in N$, satisfying the following set of partial differential equations:

$$r\hat{V}^i(x) = \max_{u_i} \left\{ g^i\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_{i-1}^*(x), u_i, \phi_{i+1}^*(x), \ldots, \phi_n^*(x)\right] + \hat{V}_x^i(x) f\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_{i-1}^*(x), u_i, \phi_{i+1}^*(x), \ldots, \phi_n^*(x)\right] \right\}$$

$$= g^i\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_n^*(x)\right] + \hat{V}_x^i(x) f\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_n^*(x)\right], \quad \text{for } i \in N.$$

Proof By Theorem A.2 in the Technical Appendixes, $\hat{V}^i(x)$ is the value function associated with the optimal control problem of agent i, $i \in N$. Together with the expressions in Definition 2.1, the conditions in Theorem 2.4 imply a Nash equilibrium.

Since time $t$ does not appear explicitly in the partial differential equations in Theorem 2.4, the feedback Nash equilibrium strategies $\{u_i^* = \phi_i^*(x), \text{ for } i \in N\}$ are indeed functions independent of time.
Substituting the game equilibrium strategies in Theorem 2.4 into (2.4) yields the game equilibrium dynamics of the state path as

$$\dot{x}(s) = f\left[x(s), \phi_1^*(x(s)), \phi_2^*(x(s)), \ldots, \phi_n^*(x(s))\right], \quad x(t_0) = x_0.$$

Solving the above dynamics yields the optimal state trajectory $\{x^*(t)\}_{t \ge t_0}$ as

$$x^*(t) = x_0 + \int_{t_0}^{t} f\left[x^*(s), \phi_1^*(x^*(s)), \phi_2^*(x^*(s)), \ldots, \phi_n^*(x^*(s))\right] ds, \quad \text{for } t \ge t_0. \tag{2.45}$$

We denote the term $x^*(t)$ by $x_t^*$. The feedback Nash equilibrium strategies for the infinite-horizon game in (2.3) and (2.4) can be obtained as

$$\left[\phi_1^*(x_t^*), \phi_2^*(x_t^*), \ldots, \phi_n^*(x_t^*)\right], \quad \text{for } t \ge t_0.$$
2.3.2 Feedback Equilibria in Resource Extraction

Consider an economy endowed with a renewable resource and with $n \ge 2$ resource extractors (firms). The lease for resource extraction begins at time $t_0$ and ends at time $T$. Let $u_i(s)$ denote the rate of resource extraction of firm i at time $s$, $i \in N = \{1, 2, \ldots, n\}$, where each extractor controls its rate of extraction. Let $U^i$ be the set of admissible extraction rates and $x(s)$ the size of the resource stock at time $s$. In particular, we have $U^i \subseteq R^+$ for $x > 0$ and $U^i = \{0\}$ for $x = 0$. The extraction cost for firm $i \in N$ depends on the quantity of the resource extracted $u_i(s)$, the resource stock size $x(s)$, and a parameter $c$. In particular, the extraction cost can be specified as $C^i = c\,u_i(s)/x(s)^{1/2}$.

The market price of the resource depends on the total amount extracted and supplied to the market. The price-output relationship at time $s$ is given by the downward-sloping inverse demand curve $P(s) = Q(s)^{-1/2}$, where $Q(s) = \sum_{j=1}^{n} u_j(s)$ is the total amount of the resource extracted and marketed at time $s$. A terminal bonus $w\,x(T)^{1/2}$ is offered to each extractor, and $r$ is a discount rate that is common to all extractors.

Extractor i seeks to maximize the present value of the profits

$$\int_{t_0}^{T} \left[ \left( \sum_{j=1}^{n} u_j(s) \right)^{-1/2} u_i(s) - \frac{c}{x(s)^{1/2}}\, u_i(s) \right] \exp\left[-r(s - t_0)\right] ds + \exp\left[-r(T - t_0)\right] w\, x(T)^{1/2}, \quad \text{for } i \in N, \tag{2.46}$$

subject to the resource dynamics

$$\dot{x}(s) = a\, x(s)^{1/2} - b\, x(s) - \sum_{j=1}^{n} u_j(s), \quad x(t_0) = x_0 \in X. \tag{2.47}$$
The model is a deterministic version of the Jørgensen and Yeung (1996) fishery game model. Invoking Theorem 2.3, a set of feedback strategies $\{u_i^*(t) = \phi_i^*(t, x); i \in N\}$ constitutes a feedback Nash equilibrium solution for the game in (2.46) and (2.47) if there exist functions $V^{(t_0)i}(t, x): [t_0, T] \times R \to R$, for $i \in N$, which satisfy the following set of partial differential equations:

$$-V_t^{(t_0)i}(t, x) = \max_{u_i \in U^i} \left\{ \left[ u_i \left( \sum_{\substack{j=1 \\ j \ne i}}^{n} \phi_j^*(t, x) + u_i \right)^{-1/2} - \frac{c}{x^{1/2}}\, u_i \right] \exp\left[-r(t - t_0)\right] + V_x^{(t_0)i} \left[ a x^{1/2} - b x - \sum_{\substack{j=1 \\ j \ne i}}^{n} \phi_j^*(t, x) - u_i \right] \right\}, \quad \text{and}$$

$$V^{(t_0)i}(T, x) = \exp\left[-r(T - t_0)\right] w\, x^{1/2}. \tag{2.48}$$
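Applying the maximization operator on the right-hand side of the first equation in (2.48) for agent i yields the condition for a maximum as

$$\left[ \left( \sum_{\substack{j=1 \\ j \ne i}}^{n} \phi_j^*(t, x) + \frac{1}{2} \phi_i^*(t, x) \right) \left( \sum_{j=1}^{n} \phi_j^*(t, x) \right)^{-3/2} - \frac{c}{x^{1/2}} \right] \exp\left[-r(t - t_0)\right] - V_x^{(t_0)i} = 0, \tag{2.49}$$

for $i \in N$. Summing over $i = 1, 2, \ldots, n$ in (2.49) yields

$$\left( \sum_{j=1}^{n} \phi_j^*(t, x) \right)^{1/2} = \left( n - \frac{1}{2} \right) \left( \sum_{j=1}^{n} \left[ \frac{c}{x^{1/2}} + \exp\left[r(t - t_0)\right] V_x^{(t_0)j} \right] \right)^{-1}. \tag{2.50}$$

Substituting (2.50) into (2.49) produces

$$\left( \sum_{\substack{j=1 \\ j \ne i}}^{n} \phi_j^*(t, x) + \frac{1}{2} \phi_i^*(t, x) \right) \left( n - \frac{1}{2} \right)^{-3} \left( \sum_{j=1}^{n} \left[ \frac{c}{x^{1/2}} + \exp\left[r(t - t_0)\right] V_x^{(t_0)j} \right] \right)^{3} - \frac{c}{x^{1/2}} - \exp\left[r(t - t_0)\right] V_x^{(t_0)i} = 0, \quad \text{for } i \in N. \tag{2.51}$$

Rearranging the terms in (2.51) yields

$$\sum_{\substack{j=1 \\ j \ne i}}^{n} \phi_j^*(t, x) + \frac{1}{2} \phi_i^*(t, x) = \left( n - \frac{1}{2} \right)^{3} \frac{\left[ c + \exp\left[r(t - t_0)\right] V_x^{(t_0)i} x^{1/2} \right] x}{\left( \sum_{j=1}^{n} \left[ c + \exp\left[r(t - t_0)\right] V_x^{(t_0)j} x^{1/2} \right] \right)^{3}}, \quad \text{for } i \in N. \tag{2.52}$$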
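Condition (2.52) represents a system of equations that is linear in $\{\phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_n^*(t, x)\}$. Solving (2.52) yields

$$\phi_i^*(t, x) = \frac{x (2n - 1)^2}{2 \left[ \sum_{j=1}^{n} \left( c + \exp\left[r(t - t_0)\right] V_x^{(t_0)j} x^{1/2} \right) \right]^{3}} \left\{ \sum_{\substack{j=1 \\ j \ne i}}^{n} \left[ c + \frac{V_x^{(t_0)j} x^{1/2}}{\exp\left[-r(t - t_0)\right]} \right] - \left( n - \frac{3}{2} \right) \left[ c + \frac{V_x^{(t_0)i} x^{1/2}}{\exp\left[-r(t - t_0)\right]} \right] \right\}, \quad \text{for } i \in N. \tag{2.53}$$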
Substituting $\phi_i^*(t, x)$ in (2.53) into (2.48) and solving yields the following.

Proposition 2.1 The system in (2.48) admits a solution

$$V^{(t_0)i}(t, x) = \exp\left[-r(t - t_0)\right] \left[ A(t)\, x^{1/2} + B(t) \right], \quad \text{for } i \in N, \tag{2.54}$$

where $A(t)$ and $B(t)$ satisfy

$$\dot{A}(t) = \left[ r + \frac{b}{2} \right] A(t) - \frac{(2n - 1)}{2n^2} \left[ c + \frac{A(t)}{2} \right]^{-1} + \frac{c (2n - 1)^2}{4n^3} \left[ c + \frac{A(t)}{2} \right]^{-2} + \frac{(2n - 1)^2 A(t)}{8n^2 \left[ c + A(t)/2 \right]^2},$$

$$\dot{B}(t) = r B(t) - \frac{a}{2} A(t),$$

$$A(T) = w \quad \text{and} \quad B(T) = 0. \tag{2.55}$$

Proof Substituting $V^{(t_0)i}(t, x)$ and the relevant derivatives $V_t^{(t_0)i}(t, x)$ and $V_x^{(t_0)i}(t, x)$ into (2.53) and (2.48) yields the results in Proposition 2.1.

The first equation in (2.55) can be further reduced to

$$\dot{A}(t) = \left\{ \left[ r + \frac{b}{2} \right] \frac{[A(t)]^3}{4} + \left[ r + \frac{b}{2} \right] c\, [A(t)]^2 + \left[ \left( r + \frac{b}{2} \right) c^2 + \frac{(4n^2 - 8n + 3)}{8n^2} \right] A(t) - \frac{(2n - 1) c}{4n^3} \right\} \Bigg/ \left[ c + \frac{A(t)}{2} \right]^2. \tag{2.56}$$

The denominator of the right-hand side of (2.56) is always positive. Denote the numerator of the right-hand side of (2.56) by

$$F\left[A(t)\right] - \frac{(2n - 1) c}{4n^3}. \tag{2.57}$$
In particular, $F[A(t)]$ is a polynomial function in $A(t)$ of degree 3. Moreover, $F[A(t)] = 0$ for $A(t) = 0$, and for any $A(t) \in (0, \infty)$,

$$\frac{dF[A(t)]}{dA(t)} = \left[ r + \frac{b}{2} \right] \frac{3[A(t)]^2}{4} + 2 \left[ r + \frac{b}{2} \right] c\, A(t) + \left[ \left( r + \frac{b}{2} \right) c^2 + \frac{(4n^2 - 8n + 3)}{8n^2} \right] > 0. \tag{2.58}$$

Therefore, there exists a unique level of $A(t)$, denoted by $A^*$, at which

$$F\left[A^*\right] - \frac{(2n - 1) c}{4n^3} = 0. \tag{2.59}$$

For values of $A(t)$ less than $A^*$, $\dot{A}(t)$ is negative. If $A(t)$ equals $A^*$, $\dot{A}(t) = 0$. For values of $A(t)$ greater than $A^*$, $\dot{A}(t)$ is positive. A phase diagram depicting the relationship between $\dot{A}(t)$ and $A(t)$ is provided in Fig. 2.1, while the time paths of $A(t)$ in relation to $A^*$ are illustrated in Fig. 2.2.

Fig. 2.1 Phase diagram for $\dot{A}(t)$ and $A(t)$

For a given value of $w$ that is less than $A^*$, the time path $\{A(t)\}_{t=t_0}^{T}$ will start at a value $A(t_0)$ that is greater than $w$ and less than $A^*$. The value of $A(t)$ will decrease over time and reach $w$ at time $T$. On the other hand, for a given value of $w$ that is greater than $A^*$, the time path $\{A(t)\}_{t=t_0}^{T}$ will start at a value $A(t_0)$ that is less than $w$ and greater than $A^*$. The value of $A(t)$ will increase over time and reach $w$ at time $T$. Therefore $A(t)$ is a monotonic function and $A(t) > 0$, for $t \in [t_0, T]$. Using $A(t)$, the solution to $B(t)$ can be readily obtained as

$$B(t) = \exp(rt) \left[ K - \int_{t_0}^{t} \frac{a}{2} A(s) \exp(-rs)\, ds \right], \tag{2.60}$$

where $K = \int_{t_0}^{T} \frac{a}{2} A(s) \exp(-rs)\, ds$.
Fig. 2.2 Time paths of $A(t)$

Substituting the relevant derivatives of the value functions in Proposition 2.1 into the game equilibrium strategies of (2.53) gives the feedback Nash equilibrium of the resource extraction game of (2.46) and (2.47).
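The qualitative behaviour of A(t) described above can be checked numerically. The following sketch integrates the first equation in (2.55) backward from A(T) = w with a classical Runge-Kutta scheme and locates A* by bisection. It is an illustration under stated assumptions: the function names, step counts, and parameter values are hypothetical, and the helper `extraction_rate` uses the symmetric strategy φ*(t, x) = (2n − 1)² x / (4n³ [c + A(t)/2]²), which is what (2.53) reduces to when every agent has the value function (2.54).

```python
def a_dot(A, n, c, r, b):
    """Right-hand side of the first equation in (2.55)."""
    w = c + A / 2.0
    m = 2 * n - 1
    return ((r + b / 2.0) * A
            - m / (2.0 * n ** 2 * w)
            + c * m ** 2 / (4.0 * n ** 3 * w ** 2)
            + m ** 2 * A / (8.0 * n ** 2 * w ** 2))


def stationary_level(n, c, r, b, hi=1e3, tol=1e-12):
    """Bisection for A* with a_dot(A*) = 0 (a_dot < 0 at A = 0, > 0 at hi)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if a_dot(mid, n, c, r, b) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_A(w, n, c, r, b, t0, T, steps=2000):
    """Backward RK4 from the terminal condition A(T) = w; returns A on a grid."""
    h = (T - t0) / steps
    A = [0.0] * (steps + 1)
    A[steps] = w
    for k in range(steps, 0, -1):
        a = A[k]
        k1 = a_dot(a, n, c, r, b)
        k2 = a_dot(a - h / 2 * k1, n, c, r, b)
        k3 = a_dot(a - h / 2 * k2, n, c, r, b)
        k4 = a_dot(a - h * k3, n, c, r, b)
        A[k - 1] = a - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return A


def extraction_rate(A_t, x, n, c):
    """Symmetric feedback extraction rate implied by (2.53) and (2.54)."""
    return (2 * n - 1) ** 2 * x / (4.0 * n ** 3 * (c + A_t / 2.0) ** 2)
```

For example, with n = 2, c = 1, r = 0.05 and b = 1, a terminal bonus w below A* should produce a path that starts between w and A* and falls toward w, while w above A* should produce a rising path.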
2.3.3 Feedback Solution in Competitive Advertising

Consider the competitive advertising game in Sect. 2.2.2. Instead of an open-loop solution we seek a feedback Nash equilibrium. Invoking Theorem 2.3, a set of feedback strategies $\{u_i^*(t) = \phi_i^*(t, x); i \in \{1, 2\}\}$ constitutes a feedback Nash equilibrium solution for the game in (2.35) and (2.36) if there exist functions $V^{(t_0)i}(t, x): [t_0, T] \times R \to R$, for $i \in \{1, 2\}$, which satisfy the following set of partial differential equations:

$$-V_t^{(t_0)1}(t, x) = \max_{u_1} \left\{ \left[ q_1 x - \frac{c_1}{2} u_1^2 \right] \exp(-rt) + V_x^{(t_0)1}(t, x) \left[ u_1 (1 - x)^{1/2} - \phi_2^*(t, x)\, x^{1/2} \right] \right\},$$

$$V^{(t_0)1}(T, x) = \exp(-rT) S_1 x, \quad \text{and}$$

$$-V_t^{(t_0)2}(t, x) = \max_{u_2} \left\{ \left[ q_2 (1 - x) - \frac{c_2}{2} u_2^2 \right] \exp(-rt) + V_x^{(t_0)2}(t, x) \left[ \phi_1^*(t, x) (1 - x)^{1/2} - u_2\, x^{1/2} \right] \right\},$$

$$V^{(t_0)2}(T, x) = \exp(-rT) S_2 (1 - x). \tag{2.61}$$
Performing the indicated maximization in (2.61) yields the condition for a maximum as

$$\phi_1^*(t, x) = \frac{\exp(rt)}{c_1} V_x^{(t_0)1}(t, x) (1 - x)^{1/2} \quad \text{and} \quad \phi_2^*(t, x) = -\frac{\exp(rt)}{c_2} V_x^{(t_0)2}(t, x)\, x^{1/2}. \tag{2.62}$$

Substituting $\phi_1^*(t, x)$ and $\phi_2^*(t, x)$ into (2.61) yields

$$-V_t^{(t_0)1}(t, x) = \left[ q_1 x - \frac{(\exp rt)^2}{2c_1} \left[ V_x^{(t_0)1}(t, x) \right]^2 (1 - x) \right] \exp(-rt) + V_x^{(t_0)1}(t, x) \left[ \frac{\exp rt}{c_1} V_x^{(t_0)1}(t, x) (1 - x) + \frac{\exp rt}{c_2} V_x^{(t_0)2}(t, x)\, x \right],$$

$$V^{(t_0)1}(T, x) = \exp(-rT) S_1 x;$$

$$-V_t^{(t_0)2}(t, x) = \left[ q_2 (1 - x) - \frac{(\exp rt)^2}{2c_2} \left[ V_x^{(t_0)2}(t, x) \right]^2 x \right] \exp(-rt) + V_x^{(t_0)2}(t, x) \left[ \frac{\exp rt}{c_1} V_x^{(t_0)1}(t, x) (1 - x) + \frac{\exp rt}{c_2} V_x^{(t_0)2}(t, x)\, x \right],$$

$$V^{(t_0)2}(T, x) = \exp(-rT) S_2 (1 - x). \tag{2.63}$$

Proposition 2.2 The system in (2.63) admits a solution

$$V^{(t_0)1}(t, x) = \exp(-rt) \left[ A_1(t)\, x + B_1(t) \right],$$

$$V^{(t_0)2}(t, x) = \exp(-rt) \left[ A_2(t) (1 - x) + B_2(t) \right],$$

where $A_i(t)$ and $B_i(t)$, for $i \in \{1, 2\}$, satisfy

$$\dot{A}_1(t) = r A_1(t) - q_1 + \frac{[A_1(t)]^2}{2c_1} + \frac{A_1(t) A_2(t)}{c_2},$$

$$\dot{B}_1(t) = r B_1(t) - \frac{[A_1(t)]^2}{2c_1},$$

$$A_1(T) = S_1, \quad B_1(T) = 0;$$

$$\dot{A}_2(t) = r A_2(t) - q_2 + \frac{[A_2(t)]^2}{2c_2} + \frac{A_1(t) A_2(t)}{c_1},$$

$$\dot{B}_2(t) = r B_2(t) - \frac{[A_2(t)]^2}{2c_2},$$

$$A_2(T) = S_2, \quad B_2(T) = 0.$$

Proof Substituting $V^{(t_0)i}(t, x)$ and the relevant derivatives $V_t^{(t_0)i}(t, x)$ and $V_x^{(t_0)i}(t, x)$ into (2.63) yields the results in Proposition 2.2.
With the value functions in Proposition 2.2, one can characterize the game equilibrium strategies in (2.62) over the game interval $[t_0, T]$, the equilibrium state path, and the profits of the firms over time.
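A minimal numerical sketch of that characterization follows. It integrates the four ordinary differential equations of Proposition 2.2 backward from their terminal conditions and then simulates the market share forward. Two assumptions are made and should be read as such: firm 2's value function is taken to be exp(−rt)[A2(t)(1 − x) + B2(t)], which matches the terminal condition exp(−rT)S2(1 − x), and the resulting strategies are taken to be φ1* = A1(t)(1 − x)^{1/2}/c1 and φ2* = A2(t)x^{1/2}/c2. All function names and parameter values are hypothetical.

```python
def rk4_backward(rhs, y_T, h, steps):
    """Integrate an autonomous system y' = rhs(*y) backward from y(T) = y_T."""
    path = [None] * (steps + 1)
    path[steps] = tuple(y_T)
    for k in range(steps, 0, -1):
        y = path[k]
        k1 = rhs(*y)
        k2 = rhs(*[a - h / 2 * d for a, d in zip(y, k1)])
        k3 = rhs(*[a - h / 2 * d for a, d in zip(y, k2)])
        k4 = rhs(*[a - h * d for a, d in zip(y, k3)])
        path[k - 1] = tuple(
            a - h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
            for a, d1, d2, d3, d4 in zip(y, k1, k2, k3, k4)
        )
    return path


def solve_feedback(q1, q2, c1, c2, S1, S2, r, T, x0, steps=2000):
    """Coefficient paths of Proposition 2.2 and the equilibrium market share."""
    h = T / steps

    def rhs(A1, A2, B1, B2):
        dA1 = r * A1 - q1 + A1 * A1 / (2 * c1) + A1 * A2 / c2
        dA2 = r * A2 - q2 + A2 * A2 / (2 * c2) + A1 * A2 / c1
        dB1 = r * B1 - A1 * A1 / (2 * c1)
        dB2 = r * B2 - A2 * A2 / (2 * c2)
        return dA1, dA2, dB1, dB2

    coeffs = rk4_backward(rhs, (S1, S2, 0.0, 0.0), h, steps)

    def x_rhs(k, x):
        # Market-share dynamics (2.36) under the assumed feedback strategies.
        A1, A2 = coeffs[k][0], coeffs[k][1]
        return A1 * (1 - x) / c1 - A2 * x / c2

    x = [x0] * (steps + 1)
    for k in range(steps):  # Heun (trapezoidal) step forward in time
        f0 = x_rhs(k, x[k])
        x[k + 1] = x[k] + h / 2 * (f0 + x_rhs(k + 1, x[k] + h * f0))
    return coeffs, x
```

Under these assumptions the time-0 equilibrium profit of firm 1 is A1(0)x0 + B1(0), read off the first grid point of `coeffs`. Compared with the open-loop costate system, the cross term A1A2 enters with coefficient 1/c2 rather than 1/(2c2), which is where the two solution concepts differ.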
2.3.4 Duopolistic Competition in Infinite Horizon

Consider a dynamic duopoly in which there are two publicly listed firms selling a homogeneous good. The value of a publicly listed firm is the present value of its discounted expected future earnings. The terminal time of the game $T$ may be very far in the future, and nobody knows when the firms will be out of business. Therefore, setting $T = \infty$ may very well be the best approximation for the true game horizon. Even if the firm's management restricts itself to considering profit maximization over the next year, it should value its asset positions at the end of the year by the earning potential of these assets in the years to come. There is a lag in price adjustment, so the evolution of market price over time is assumed to be a function of the current market price and the price specified by the current demand condition. In particular, we follow Tsutsui and Mino (1990) and assume that

$$\dot{P}(s) = k \left[ a - u_1(s) - u_2(s) - P(s) \right], \quad P(t_0) = P_0, \tag{2.64}$$

where $P(s)$ is the market price at time $s$, $u_i(s)$ is the output supplied by firm $i \in \{1, 2\}$, the current demand condition is specified by the instantaneous inverse demand function $P(s) = [a - u_1(s) - u_2(s)]$, and $k > 0$ represents the price adjustment velocity. The payoff of firm i is given as the present value of the stream of discounted profits

$$\int_{t_0}^{\infty} \left\{ P(s) u_i(s) - c\, u_i(s) - (1/2) \left[ u_i(s) \right]^2 \right\} \exp\left[-r(s - t_0)\right] ds, \quad \text{for } i \in \{1, 2\}, \tag{2.65}$$

where $c\, u_i(s) + (1/2)[u_i(s)]^2$ is the cost of producing output $u_i(s)$ and $r$ is the interest rate.

Once again, we consider the infinite-horizon game that starts at time $t \in [t_0, \infty)$ with initial state $P(t) = P$:

$$\max_{u_i} \int_{t}^{\infty} \left\{ P(s) u_i(s) - c\, u_i(s) - (1/2) \left[ u_i(s) \right]^2 \right\} \exp\left[-r(s - t)\right] ds, \quad \text{for } i \in \{1, 2\}, \tag{2.66}$$

subject to

$$\dot{P}(s) = k \left[ a - u_1(s) - u_2(s) - P(s) \right], \quad P(t) = P. \tag{2.67}$$
The infinite-horizon game in (2.66) and (2.67) has an autonomous structure and a constant discount rate. Therefore, we can apply Theorem 2.4 to characterize a feedback Nash equilibrium solution as

$$r \hat{V}^i(P) = \max_{u_i} \left\{ P u_i - c\, u_i - (1/2)(u_i)^2 + \hat{V}_P^i\, k \left[ a - u_i - \phi_j^*(P) - P \right] \right\}, \quad \text{for } i, j \in \{1, 2\} \text{ and } j \ne i. \tag{2.68}$$

Performing the indicated maximization in (2.68), we obtain

$$\phi_i^*(P) = P - c - k \hat{V}_P^i(P), \quad \text{for } i \in \{1, 2\}. \tag{2.69}$$

Substituting the results from (2.69) into (2.68), and upon solving (2.68), yields

$$\hat{V}^i(P) = \frac{1}{2} A P^2 - B P + C, \tag{2.70}$$

where

$$A = \frac{r + 6k - \sqrt{(r + 6k)^2 - 12k^2}}{6k^2}, \quad B = \frac{-akA + c - 2kcA}{r - 3k^2 A + 3k}, \quad \text{and} \quad C = \frac{c^2 + 3k^2 B^2 - 2kB(2c + a)}{2r}.$$

Again, one can readily verify that $\hat{V}^i(P)$ in (2.70) indeed solves (2.68) by substituting $\hat{V}^i(P)$ and its derivative into (2.68) and (2.69). The game equilibrium strategy can then be expressed as

$$\phi_i^*(P) = P - c - k(AP - B), \quad \text{for } i \in \{1, 2\}.$$

Substituting the game equilibrium strategies above into (2.64) yields the game equilibrium state dynamics of the game in (2.64) and (2.65) as

$$\dot{P}(s) = k \left[ a + 2(c - kB) - (3 - 2kA) P(s) \right], \quad P(t_0) = P_0.$$

Solving the above dynamics yields the optimal state trajectory as

$$P^*(t) = \left[ P_0 - \frac{a + 2(c - kB)}{3 - 2kA} \right] \exp\left[-k(3 - 2kA)(t - t_0)\right] + \frac{a + 2(c - kB)}{3 - 2kA}.$$

We denote the term $P^*(t)$ by $P_t^*$. The feedback Nash equilibrium strategies for the infinite-horizon game in (2.64) and (2.65) can be obtained as

$$\phi_i^*(P_t^*) = P_t^* - c - k\left(A P_t^* - B\right), \quad \text{for } i \in \{1, 2\}.$$
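The verification step mentioned above can be carried out numerically. The sketch below computes A, B and C from the closed forms following (2.70), evaluates the residual of the Hamilton-Jacobi-Bellman equation (2.68) at the strategy (2.69), and computes the long-run price directly from the price dynamics (2.64) with both firms playing φ*. Function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def duopoly_coefficients(a, c, k, r):
    """Coefficients of the quadratic value function (2.70)."""
    A = (r + 6 * k - math.sqrt((r + 6 * k) ** 2 - 12 * k ** 2)) / (6 * k ** 2)
    B = (-a * k * A + c - 2 * k * c * A) / (r - 3 * k ** 2 * A + 3 * k)
    C = (c ** 2 + 3 * k ** 2 * B ** 2 - 2 * k * B * (2 * c + a)) / (2 * r)
    return A, B, C


def strategy(P, a, c, k, r):
    """Feedback output rule (2.69) with V_P = A*P - B."""
    A, B, _ = duopoly_coefficients(a, c, k, r)
    return P - c - k * (A * P - B)


def hjb_residual(P, a, c, k, r):
    """Left-hand side minus right-hand side of (2.68) at the equilibrium."""
    A, B, C = duopoly_coefficients(a, c, k, r)
    V = 0.5 * A * P * P - B * P + C
    V_P = A * P - B
    u = strategy(P, a, c, k, r)       # both firms use the same rule
    rhs = P * u - c * u - 0.5 * u * u + V_P * k * (a - 2 * u - P)
    return r * V - rhs


def steady_state_price(a, c, k, r):
    """Price at which a - 2*phi(P) - P = 0, writing phi(P) = alpha*P + beta."""
    A, B, _ = duopoly_coefficients(a, c, k, r)
    alpha, beta = 1 - k * A, k * B - c
    return (a - 2 * beta) / (1 + 2 * alpha)
```

A residual that is zero up to rounding at several prices is a practical check that the three coefficients were transcribed correctly; the steady-state helper is derived from (2.64) itself, so it does not depend on any closed-form trajectory.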
2.4 Dynamic Stochastic Interactive Economic System

One way to incorporate stochastic elements in dynamic interactive economic systems is to introduce stochastic dynamics. Uncertainties in the evolution of economic state variables are prevalent. For instance, the natural growth rate of renewable resources, the development of technology, capital accumulation, the build-up of goodwill, and special skills are often subject to stochastic impacts.

2.4.1 Game Formulation and Solution Characterization

A stochastic formulation of state dynamics is obtained by adopting a vector-valued stochastic differential equation

$$dx(s) = f\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] ds + \sigma\left[s, x(s)\right] dz(s), \quad x(t_0) = x_0, \tag{2.71}$$
where $\sigma[s, x(s)]$ is an $m \times \Theta$ matrix, $z(s)$ is a $\Theta$-dimensional Wiener process, and the initial state $x_0$ is given. Let $\Omega[s, x(s)] = \sigma[s, x(s)]\, \sigma[s, x(s)]^T$ denote the covariance matrix with its element in row $h$ and column $\zeta$ denoted by $\Omega^{h\zeta}[s, x(s)]$. Moreover, $E[dz_\varpi] = 0$, $E[dz_\varpi\, dt] = 0$ and $E[(dz_\varpi)^2] = dt$ for $\varpi \in [1, 2, \ldots, \Theta]$; $E[dz_\varpi\, dz_\omega] = 0$ for $\varpi \in [1, 2, \ldots, \Theta]$, $\omega \in [1, 2, \ldots, \Theta]$, and $\varpi \ne \omega$. Given the stochastic nature of the state dynamics, economic agent i's objective becomes

$$E_{t_0} \left\{ \int_{t_0}^{T} g^i\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i\left(x(T)\right) \right\}, \quad \text{for } i \in N, \tag{2.72}$$

with $E_{t_0}\{\cdot\}$ denoting the expectation operation taken at time $t_0$. The system in (2.71) and (2.72) is a stochastic differential game. Başar (1977a, 1977b, 1980) was the first to derive explicit results for stochastic linear quadratic differential games. Examples of solvable stochastic differential games in economics include Clemhout and Wan (1985b), Kaitala (1993), Jørgensen and Yeung (1996, 1999), and Yeung (1998, 1999, 2001).

A Nash equilibrium of the stochastic game in (2.71) and (2.72) can be characterized as follows.

Theorem 2.5 An n-tuple of feedback strategies $\{\phi_i^*(t, x) \in U^i; i \in N\}$ provides a Nash equilibrium solution to the game in (2.71) and (2.72) if there exist suitably
smooth functions $V^{(t_0)i}(t, x): [t_0, T] \times R^m \to R$, $i \in N$, satisfying the partial differential equations

$$-V_t^{(t_0)i}(t, x) - \frac{1}{2} \sum_{h, \zeta = 1}^{m} \Omega^{h\zeta}(t, x)\, V_{x^h x^\zeta}^{(t_0)i}(t, x) = \max_{u_i} \left\{ g^i\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_{i-1}^*(t, x), u_i(t), \phi_{i+1}^*(t, x), \ldots, \phi_n^*(t, x)\right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + V_x^{(t_0)i}(t, x) f\left[t, x, \phi_1^*(t, x), \phi_2^*(t, x), \ldots, \phi_{i-1}^*(t, x), u_i(t), \phi_{i+1}^*(t, x), \ldots, \phi_n^*(t, x)\right] \right\},$$

$$V^{(t_0)i}(T, x) = q^i(x) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right], \quad i \in N.$$

Proof This result follows readily from the definition of the Nash equilibrium and from the stochastic control result in Theorem A.5 of the Technical Appendixes.

In particular, $V^{(t_0)i}(t, x)$ represents the expected game equilibrium payoff of agent i at time $t \in [t_0, T]$ with the state being $x$, that is,

$$V^{(t_0)i}(t, x) = E_{t_0} \left\{ \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \phi_2^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \right\}.$$
A remark that will be utilized in the subsequent analysis is given below.

Remark 2.2 Let $V^{(\tau)i}(t, x)$ denote the value function of agent i in a game with the stochastic dynamics in (2.71) and expected payoffs in (2.72), which starts at time $\tau$ for $\tau \in [t_0, T)$. Note that the equilibrium feedback strategies are Markovian in the sense that they depend on the current time and the current state. One can readily verify that

$$\exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] V^{(t_0)i}(t, x) = \exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] E_{t_0} \left\{ \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \right\}$$

$$= E_{t} \left\{ \int_{t}^{T} g^i\left[s, x^*(s), \phi_1^*(s, x^*(s)), \ldots, \phi_n^*(s, x^*(s))\right] \exp\left[-\int_{\tau}^{s} r(y)\,dy\right] ds + q^i\left(x^*(T)\right) \exp\left[-\int_{\tau}^{T} r(y)\,dy\right] \right\} = V^{(\tau)i}(t, x), \quad \text{for } \tau \in [t_0, T).$$
In the case when the terminal horizon $T$ approaches infinity, an autonomous game structure with constant discounting will replace (2.71) and (2.72). In particular, the game becomes

$$\max_{u_i} E_{t_0} \left\{ \int_{t_0}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t_0)\right] ds \right\}, \quad \text{for } i \in N, \tag{2.73}$$

subject to the stochastic dynamics

$$dx(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] ds + \sigma\left[x(s)\right] dz(s), \quad x(t_0) = x_0. \tag{2.74}$$

Consider the alternative infinite-horizon game that starts at time $t \in [t_0, \infty)$ with initial state $x(t) = x$:

$$\max_{u_i} E_{t} \left\{ \int_{t}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t)\right] ds \right\}, \quad \text{for } i \in N, \tag{2.75}$$

subject to the stochastic dynamics

$$dx(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] ds + \sigma\left[x(s)\right] dz(s), \quad x(t) = x. \tag{2.76}$$

Let $\Omega[x(s)] = \sigma[x(s)]\, \sigma[x(s)]^T$ denote the covariance matrix with its element in row $h$ and column $\zeta$ denoted by $\Omega^{h\zeta}[x(s)]$. The infinite-horizon autonomous game in (2.75) and (2.76) is independent of the choice of $t$ and dependent only upon the state at the starting time, that is, $x$.

A Nash equilibrium solution for the infinite-horizon stochastic differential game in (2.75) and (2.76) can be characterized as follows.

Theorem 2.6 An n-tuple of strategies $\{u_i^* = \phi_i^*(\cdot) \in U^i, \text{ for } i \in N\}$ provides a Nash equilibrium solution to the game in (2.75) and (2.76) if there exist twice continuously differentiable functions $\hat{V}^i(x): R^m \to R$, $i \in N$, satisfying the following set of partial differential equations:

$$r \hat{V}^i(x) - \frac{1}{2} \sum_{h, \zeta = 1}^{m} \Omega^{h\zeta}(x)\, \hat{V}_{x^h x^\zeta}^i(x) = \max_{u_i} \left\{ g^i\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_{i-1}^*(x), u_i(x), \phi_{i+1}^*(x), \ldots, \phi_n^*(x)\right] + \hat{V}_x^i(x) f\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_{i-1}^*(x), u_i(x), \phi_{i+1}^*(x), \ldots, \phi_n^*(x)\right] \right\}$$

$$= g^i\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_n^*(x)\right] + \hat{V}_x^i(x) f\left[x, \phi_1^*(x), \phi_2^*(x), \ldots, \phi_n^*(x)\right], \quad \text{for } i \in N.$$

Proof This result follows readily from the definition of a Nash equilibrium and from the infinite-horizon stochastic control Theorem A.6 in the Technical Appendixes.
2.4.2 An Application of Stochastic Differential Games in Resource Extraction

Consider the resource extraction game in Sect. 2.3.2. To present a stochastic model, we replace the deterministic state dynamics with the stochastic dynamics

$$
dx(s) = \left[a x(s)^{1/2} - b x(s) - \sum_{j=1}^{n} u_j(s)\right] ds + \sigma x(s)\, dz(s), \qquad x(t_0) = x_0 \in X. \tag{2.77}
$$

In the absence of human harvesting, the resource stock will grow according to the dynamics

$$
dx(s) = \left[a x(s)^{1/2} - b x(s)\right] ds + \sigma x(s)\, dz(s).
$$

The deterministic part of the natural growth function is $G(x) = a x^{1/2} - b x = x\left[a x^{-1/2} - b\right]$. This function represents pure compensation, viz., the proportional growth rate $G(x)/x$ is a decreasing function of $x$. The stochastic term reflects the randomness in the death rate $b$. The resource stock has a nondegenerate stationary equilibrium level, which is characterized by the stationary density function $\varphi(x)$ (see Jørgensen and Yeung 1996)

$$
\varphi(x) = \frac{K}{\sigma^2 x^{2(1 + b/\sigma^2)}} \exp\left[-\frac{4a}{\sigma^2}\, x^{-1/2}\right],
$$

where $K$ is a normalization factor such that $\int_0^{\infty} \varphi(x)\, dx = 1$.

Extractor $i$ seeks to maximize the expected payoff

$$
E_{t_0}\left\{\int_{t_0}^{T} \left[\left(\sum_{j=1}^{n} u_j(s)\right)^{-1/2} u_i(s) - \frac{c}{x(s)^{1/2}}\, u_i(s)\right] \exp\left[-r(s - t_0)\right] ds
+ \exp\left[-r(T - t_0)\right] w\, x(T)^{1/2}\right\}, \quad \text{for } i \in N, \tag{2.78}
$$

subject to the resource dynamics in (2.77).
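The normalization factor $K$ in the stationary density can be made concrete. The following is a minimal sketch, not part of the original text: substituting $y = x^{-1/2}$ turns $\int_0^\infty \varphi(x)\,dx$ into a gamma integral, which suggests $K = \sigma^2 \beta^{m} / (2\Gamma(m))$ with $\beta = 4a/\sigma^2$ and $m = 2 + 4b/\sigma^2$. This closed form is our own derivation, and the parameter values and function names below are illustrative assumptions. The code checks the normalization numerically.

```python
import math


def normalization_constant(a, b, sigma):
    """K for the stationary density, derived here via y = x**-0.5 (an assumption of this sketch)."""
    beta = 4.0 * a / sigma**2
    m = 2.0 + 4.0 * b / sigma**2
    return sigma**2 * beta**m / (2.0 * math.gamma(m))


def stationary_density(x, a, b, sigma):
    """phi(x) = K / (sigma^2 x^{2(1 + b/sigma^2)}) * exp(-(4a/sigma^2) x^{-1/2})."""
    K = normalization_constant(a, b, sigma)
    power = 2.0 * (1.0 + b / sigma**2)
    return K / (sigma**2 * x**power) * math.exp(-(4.0 * a / sigma**2) * x**-0.5)


def total_probability(a, b, sigma, y_max=20.0, steps=20000):
    """Midpoint rule for the integral of phi over (0, inf), in the variable y = x**-0.5.

    With x = y**-2 we have dx = 2 y**-3 dy; y_max truncates the negligible tail near x = 0.
    """
    h = y_max / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        x = y**-2
        total += stationary_density(x, a, b, sigma) * 2.0 * y**-3 * h
    return total


# Illustrative parameter values only; they are not taken from the text.
mass = total_probability(a=1.0, b=0.5, sigma=1.0)
```

If the derivation is right, `mass` should be very close to one for these values; other parameter choices may need a larger `y_max` or more steps.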
Invoking Theorem A.5 in the Technical Appendixes, a set of feedback strategies $\{u_i^*(t) = \phi_i^*(t, x);\ i \in N\}$ constitutes a Nash equilibrium solution for the game in (2.77) and (2.78) if there exist functions $V^{(t_0)i}(t, x) : [t_0, T] \times R \to R$, for $i \in N$, which satisfy the following set of partial differential equations:

$$
\begin{aligned}
-V_t^{(t_0)i}(t, x) - \frac{1}{2} \sigma^2 x^2 V_{xx}^{(t_0)i}(t, x)
&= \max_{u_i \in U^i}\Bigg\{\Bigg[u_i \Bigg(\sum_{j=1,\, j \neq i}^{n} \phi_j^*(t, x) + u_i\Bigg)^{-1/2} - \frac{c}{x^{1/2}}\, u_i\Bigg] \exp\left[-r(t - t_0)\right] \\
&\qquad\qquad + V_x^{(t_0)i}\Bigg[a x^{1/2} - b x - \sum_{j=1,\, j \neq i}^{n} \phi_j^*(t, x) - u_i\Bigg]\Bigg\}, \quad \text{and} \\
V^{(t_0)i}(T, x) &= \exp\left[-r(T - t_0)\right] w\, x^{1/2}.
\end{aligned} \tag{2.79}
$$

Applying the maximization operator on the right-hand side of the first equation in (2.79) for agent $i$ yields the condition for a maximum as

$$
\Bigg\{\Bigg[\sum_{j=1,\, j \neq i}^{n} \phi_j^*(t, x) + \frac{1}{2} \phi_i^*(t, x)\Bigg]\Bigg(\sum_{j=1}^{n} \phi_j^*(t, x)\Bigg)^{-3/2} - \frac{c}{x^{1/2}}\Bigg\} \exp\left[-r(t - t_0)\right] - V_x^{(t_0)i} = 0, \tag{2.80}
$$
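for $i \in N$. Following the analysis in Sect. 2.3.2, we obtain

$$
\phi_i^*(t, x) = \frac{x (2n - 1)^2}{2\Big[\sum_{j=1}^{n}\big[c + \exp[r(t - t_0)]\, V_x^{(t_0)j} x^{1/2}\big]\Big]^3}
\Bigg\{\sum_{j=1,\, j \neq i}^{n}\Big[c + \exp[r(t - t_0)]\, V_x^{(t_0)j} x^{1/2}\Big] - \left(n - \frac{3}{2}\right)\Big[c + \exp[r(t - t_0)]\, V_x^{(t_0)i} x^{1/2}\Big]\Bigg\}, \tag{2.81}
$$

for $i \in N$. Every occurrence of $V_x^{(t_0)j}$ is multiplied by $\exp[r(t - t_0)]$, which converts the present-value shadow price in (2.80) into current-value terms.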
Substituting $\phi_i^*(t, x)$ in (2.81) into (2.79), and upon solving it, yields the following.

Proposition 2.3 The system in (2.79) admits a solution

$$
V^{(t_0)i}(t, x) = \exp\left[-r(t - t_0)\right]\left[A(t)\, x^{1/2} + B(t)\right], \quad \text{for } i \in N, \tag{2.82}
$$

where $A(t)$ and $B(t)$ satisfy

$$
\begin{aligned}
\dot A(t) &= \left[r + \frac{1}{8}\sigma^2 + \frac{b}{2}\right] A(t) - \frac{(2n - 1)}{2n^2}\left[c + \frac{A(t)}{2}\right]^{-1}
+ \frac{c (2n - 1)^2}{4n^3}\left[c + \frac{A(t)}{2}\right]^{-2} + \frac{(2n - 1)^2 A(t)}{8n^2\left[c + \frac{A(t)}{2}\right]^2}, \\
\dot B(t) &= r B(t) - \frac{a}{2} A(t), \\
A(T) &= w, \quad \text{and} \quad B(T) = 0.
\end{aligned} \tag{2.83}
$$

Proof Substituting $V^{(t_0)i}(t, x)$ and the relevant derivatives $V_t^{(t_0)i}(t, x)$, $V_x^{(t_0)i}(t, x)$, and $V_{xx}^{(t_0)i}(t, x)$ into (2.81) and (2.79) yields the results in Proposition 2.3.

Substituting the relevant derivatives of the value functions in Proposition 2.3 into the game equilibrium strategies in (2.81) gives a Nash equilibrium of the stochastic resource extraction game in (2.77) and (2.78).
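The terminal-value problem (2.83) generally has to be solved numerically. The following is a minimal sketch, under assumed parameter values that do not come from the text, of integrating (2.83) backward from $T$ to $t_0$ with a hand-written classical Runge-Kutta step. The function names are hypothetical.

```python
def a_dot(A, n, r, sigma, b, c):
    """Right-hand side of the A-equation in (2.83)."""
    D = c + A / 2.0
    k = 2 * n - 1
    return ((r + sigma**2 / 8.0 + b / 2.0) * A
            - k / (2.0 * n**2 * D)
            + c * k**2 / (4.0 * n**3 * D**2)
            + k**2 * A / (8.0 * n**2 * D**2))


def solve_backward(n, r, sigma, a, b, c, w, t0, T, steps=2000):
    """Integrate (2.83) from T back to t0 with classical RK4.

    Returns (ts, As, Bs), each ordered from t0 to T.
    """
    h = (T - t0) / steps

    def rhs(A, B):
        # (A_dot, B_dot) as given in (2.83)
        return a_dot(A, n, r, sigma, b, c), r * B - 0.5 * a * A

    A, B = w, 0.0  # terminal conditions A(T) = w and B(T) = 0
    ts, As, Bs = [T], [A], [B]
    for step in range(1, steps + 1):
        k1 = rhs(A, B)
        k2 = rhs(A - 0.5 * h * k1[0], B - 0.5 * h * k1[1])
        k3 = rhs(A - 0.5 * h * k2[0], B - 0.5 * h * k2[1])
        k4 = rhs(A - h * k3[0], B - h * k3[1])
        A -= h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        B -= h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        ts.append(T - step * h)
        As.append(A)
        Bs.append(B)
    return ts[::-1], As[::-1], Bs[::-1]


# Illustrative parameter values only; none of them are taken from the text.
ts, As, Bs = solve_backward(n=2, r=0.05, sigma=0.1, a=1.0, b=0.1, c=1.0,
                            w=1.0, t0=0.0, T=2.0)
```

With $A(t)$ in hand, the equilibrium strategy follows from (2.81) by noting that $\exp[r(t - t_0)]\, V_x^{(t_0)i} x^{1/2} = A(t)/2$ under (2.82). A library ODE solver could replace the hand-written step; the plain loop is used here only to keep the sketch dependency-free.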
2.4.3 Infinite-Horizon Resource Extraction

Consider the infinite-horizon game in which extractor $i$ seeks to maximize the expected payoff

$$
E_{t_0}\left\{\int_{t_0}^{\infty} \left[\left(\sum_{j=1}^{n} u_j(s)\right)^{-1/2} u_i(s) - \frac{c}{x(s)^{1/2}}\, u_i(s)\right] \exp\left[-r(s - t_0)\right] ds\right\}, \quad \text{for } i \in N, \tag{2.84}
$$

subject to the resource dynamics

$$
dx(s) = \left[a x(s)^{1/2} - b x(s) - \sum_{j=1}^{n} u_j(s)\right] ds + \sigma x(s)\, dz(s), \qquad x(t_0) = x_0 \in X. \tag{2.85}
$$

Consider the alternative problem that starts at time $t \in [t_0, \infty)$ with initial state $x(t) = x$:

$$
E_t\left\{\int_t^{\infty} \left[\left(\sum_{j=1}^{n} u_j(s)\right)^{-1/2} u_i(s) - \frac{c}{x(s)^{1/2}}\, u_i(s)\right] \exp\left[-r(s - t)\right] ds\right\}, \quad \text{for } i \in N, \tag{2.86}
$$

subject to the resource dynamics

$$
dx(s) = \left[a x(s)^{1/2} - b x(s) - \sum_{j=1}^{n} u_j(s)\right] ds + \sigma x(s)\, dz(s), \qquad x(t) = x \in X. \tag{2.87}
$$
Invoking Theorem 2.6, a set of feedback strategies $\{\phi_i^*(x),\ i \in N\}$ constitutes a Nash equilibrium solution for the game in (2.86) and (2.87) if there exist functions $\hat V^i(x) : R \to R$, for $i \in N$, that satisfy the following set of partial differential equations:

$$
\begin{aligned}
r \hat V^i(x) - \frac{1}{2}\sigma^2 x^2\, \hat V^i_{xx}(x)
&= \max_{u_i \in U^i}\Bigg\{u_i\Bigg(\sum_{j=1,\, j \neq i}^{n} \phi_j^*(x) + u_i\Bigg)^{-1/2} - \frac{c}{x^{1/2}}\, u_i \\
&\qquad\qquad + \hat V^i_x\Bigg[a x^{1/2} - b x - \sum_{j=1,\, j \neq i}^{n} \phi_j^*(x) - u_i\Bigg]\Bigg\}, \quad \text{for } i \in N.
\end{aligned} \tag{2.88}
$$
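Applying the maximization operator in (2.88) for agent $i$ yields the condition for a maximum as

$$
\Bigg[\sum_{j=1,\, j \neq i}^{n} \phi_j^*(x) + \frac{1}{2}\phi_i^*(x)\Bigg]\Bigg(\sum_{j=1}^{n} \phi_j^*(x)\Bigg)^{-3/2} - \frac{c}{x^{1/2}} - \hat V^i_x = 0, \quad \text{for } i \in N. \tag{2.89}
$$

Summing over $i = 1, 2, \ldots, n$ in (2.89) yields

$$
\Bigg(\sum_{j=1}^{n} \phi_j^*(x)\Bigg)^{1/2} = \left(n - \frac{1}{2}\right)\Bigg(\sum_{j=1}^{n}\left[\frac{c}{x^{1/2}} + \hat V^j_x\right]\Bigg)^{-1}. \tag{2.90}
$$

Substituting (2.90) into (2.89) produces

$$
\Bigg[\sum_{j=1,\, j \neq i}^{n} \phi_j^*(x) + \frac{1}{2}\phi_i^*(x)\Bigg]\left(n - \frac{1}{2}\right)^{-3}\Bigg(\sum_{j=1}^{n}\left[\frac{c}{x^{1/2}} + \hat V^j_x\right]\Bigg)^{3} - \frac{c}{x^{1/2}} - \hat V^i_x = 0, \quad \text{for } i \in N. \tag{2.91}
$$

Rearranging the terms in (2.91) yields

$$
\sum_{j=1,\, j \neq i}^{n} \phi_j^*(x) + \frac{1}{2}\phi_i^*(x) = \left(n - \frac{1}{2}\right)^{3} \frac{\left[c + \hat V^i_x x^{1/2}\right] x}{\Big(\sum_{j=1}^{n}\left[c + \hat V^j_x x^{1/2}\right]\Big)^{3}}, \quad \text{for } i \in N. \tag{2.92}
$$

The condition in (2.92) represents a system of equations that is linear in $\{\phi_1^*(x), \phi_2^*(x), \ldots, \phi_n^*(x)\}$. Solving (2.92) yields the game equilibrium strategies

$$
\phi_i^*(x) = \frac{x (2n - 1)^2}{2\Big[\sum_{j=1}^{n}\left[c + \hat V^j_x x^{1/2}\right]\Big]^{3}}
\Bigg\{\sum_{j=1,\, j \neq i}^{n}\left[c + \hat V^j_x x^{1/2}\right] - \left(n - \frac{3}{2}\right)\left[c + \hat V^i_x x^{1/2}\right]\Bigg\}, \quad \text{for } i \in N. \tag{2.93}
$$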
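The step from the first-order conditions (2.89) to the closed-form strategies (2.93) can be checked numerically. The sketch below evaluates (2.93) for an assumed, hypothetical set of value-function slopes $\hat V^j_x$ and confirms that the residual of (2.89) vanishes. The numbers are illustrative only, and the slopes are treated as given inputs rather than solved for.

```python
def equilibrium_strategies(x, c, value_slopes):
    """Strategies from (2.93), given the slopes V_x^j of the value functions."""
    n = len(value_slopes)
    D = [c + v * x**0.5 for v in value_slopes]  # D_j = c + V_x^j x^{1/2}
    total = sum(D)
    scale = x * (2 * n - 1)**2 / (2.0 * total**3)
    return [scale * ((total - D[i]) - (n - 1.5) * D[i]) for i in range(n)]


def foc_residuals(x, c, value_slopes, phi):
    """Left-hand side of (2.89) for each agent; zero at the equilibrium."""
    S = sum(phi)
    return [((S - phi[i]) + 0.5 * phi[i]) * S**-1.5 - c / x**0.5 - value_slopes[i]
            for i in range(len(phi))]


# Hypothetical inputs: stock level, cost parameter, and slopes for three agents.
x, c = 4.0, 1.0
slopes = [0.10, 0.12, 0.15]
phi = equilibrium_strategies(x, c, slopes)
residuals = foc_residuals(x, c, slopes, phi)
```

The formula only describes an interior equilibrium when every resulting extraction rate is positive, which holds for these inputs but can fail when the slopes differ widely across agents.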
Substituting $\phi_i^*(x)$ in (2.93) into (2.88), and upon solving it, yields the following.

Proposition 2.4 The system in (2.88) admits a solution

$$
\hat V^i(x) = \left[A x^{1/2} + B\right], \quad \text{for } i \in N, \tag{2.94}
$$

where $A$ and $B$ satisfy

$$
\begin{aligned}
0 &= \left[r + \frac{1}{8}\sigma^2 + \frac{b}{2}\right] A - \frac{(2n - 1)}{2n^2}\left[c + \frac{A}{2}\right]^{-1}
+ \frac{c (2n - 1)^2}{4n^3}\left[c + \frac{A}{2}\right]^{-2} + \frac{(2n - 1)^2 A}{8n^2\left[c + \frac{A}{2}\right]^2}, \\
B &= \frac{a}{2r} A.
\end{aligned} \tag{2.95}
$$

Proof Substituting $\hat V^i(x)$ and the relevant derivatives $\hat V^i_x(x)$ and $\hat V^i_{xx}(x)$ into (2.93) and (2.88) yields the results in Proposition 2.4.

A feedback Nash equilibrium can be readily obtained by substituting the relevant derivatives of the value functions in Proposition 2.4 into the game equilibrium strategies of (2.93).
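The first equation in (2.95) is a scalar nonlinear equation in $A$. A minimal sketch of solving it by bisection follows. The parameter values are assumptions, and so is the bracketing logic, which relies on the residual being negative at $A = 0$ (true for $c > 0$) and eventually positive. The symmetric strategy $\phi^*(x) = (2n-1)^2 x / (4n^3 [c + A/2]^2)$ used below is our own substitution of $\hat V^j_x = A/(2x^{1/2})$ into (2.93).

```python
def stationary_residual(A, n, r, sigma, b, c):
    """Right-hand side of the first equation in (2.95); zero at the solution."""
    D = c + A / 2.0
    k = 2 * n - 1
    return ((r + sigma**2 / 8.0 + b / 2.0) * A
            - k / (2.0 * n**2 * D)
            + c * k**2 / (4.0 * n**3 * D**2)
            + k**2 * A / (8.0 * n**2 * D**2))


def solve_stationary(n, r, sigma, a, b, c):
    """Bisection for A in (2.95), then B = a A / (2 r)."""
    lo, hi = 0.0, 1.0
    while stationary_residual(hi, n, r, sigma, b, c) <= 0.0:
        hi *= 2.0  # expand the bracket until the residual changes sign
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if stationary_residual(mid, n, r, sigma, b, c) > 0.0:
            hi = mid
        else:
            lo = mid
    A = 0.5 * (lo + hi)
    return A, a * A / (2.0 * r)


def feedback_extraction(x, A, n, c):
    """Symmetric strategy implied by (2.93) when V_x = A / (2 x^{1/2})."""
    return (2 * n - 1)**2 * x / (4.0 * n**3 * (c + A / 2.0)**2)


# Illustrative parameter values only; none of them are taken from the text.
A_hat, B_hat = solve_stationary(n=2, r=0.05, sigma=0.1, a=1.0, b=0.1, c=1.0)
rate = feedback_extraction(x=10.0, A=A_hat, n=2, c=1.0)
```

Bisection is chosen for transparency; any bracketing root finder would serve equally well.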
2.5 Exercises

2.1 Consider the competitive dynamic advertising game in which there are two firms in a market. The profits of firm 1 and firm 2 are, respectively,

$$
\int_0^5 \left[10 x(s) - 2 u_1(s)^2\right] \exp(-0.05 s)\, ds + \exp\left[(-0.05)5\right] 12\, x(5),
$$

and

$$
\int_0^5 \left[8\left(1 - x(s)\right) - u_2(s)^2\right] \exp(-0.05 s)\, ds + \exp\left[(-0.05)5\right] 9\left[1 - x(5)\right],
$$

where $x(s)$ is the market share of firm 1 at time $s$, $[1 - x(s)]$ is that of firm 2, and $u_i(s)$ is the advertising rate for firm $i \in \{1, 2\}$. It is assumed that market potential is constant over time. The only marketing instrument used by the firms is advertising. Advertising has diminishing returns since there are increasing marginal costs of advertising, as reflected through the quadratic cost function. The dynamics of firm 1's market share is governed by

$$
\dot x(s) = u_1(s)\left[1 - x(s)\right]^{1/2} - u_2(s)\, x(s)^{1/2}, \qquad x(0) = 0.6.
$$

Derive an open-loop solution for the market equilibrium.

2.2 Consider an economy endowed with a renewable resource and with two resource extractors (firms). The lease for resource extraction begins at time 0 and ends at time 10. Let $u_i(s)$ denote the rate of resource extraction of firm $i$ at time $s$, $i \in \{1, 2\}$, and let $x(s)$ denote the size of the resource stock. The extraction cost is $C^i = 2 u_i(s)/x(s)^{1/2}$ for firm $i \in \{1, 2\}$. The demand for the resource is $P(s) = Q(s)^{-1/2}$, where $Q(s) = \sum_{j=1}^{2} u_j(s)$. A terminal bonus $4 x(T)^{1/2}$ is offered to each extractor and the discount rate is 0.1. Extractor $i$ seeks to maximize the present value of profits

$$
\int_0^{10} \left[\left(\sum_{j=1}^{2} u_j(s)\right)^{-1/2} u_i(s) - \frac{2}{x(s)^{1/2}}\, u_i(s)\right] \exp(-0.1 s)\, ds + \exp\left[-0.1(10)\right] 4\, x(T)^{1/2}, \quad \text{for } i \in \{1, 2\},
$$

subject to the resource dynamics

$$
\dot x(s) = 5 x(s)^{1/2} - x(s) - \sum_{j=1}^{2} u_j(s), \qquad x(0) = 100.
$$

Derive a feedback Nash equilibrium solution.

2.3 Consider a dynamic duopoly in which there are two publicly listed firms selling a homogeneous good. The payoff of firm $i$ is given as the present value of the stream of discounted profits

$$
\int_0^{\infty} \left\{P(s) u_i(s) - 2 u_i(s) - (1/2)\left[u_i(s)\right]^2\right\} \exp[-0.05 s]\, ds, \quad \text{for } i \in \{1, 2\},
$$

where $2 u_i(s) + (1/2)[u_i(s)]^2$ is the cost of producing output $u_i(s)$ and the interest rate is 0.05. There is a lag in price adjustment, so the evolution of the market price over time is assumed to be a function of the current market price and the price specified by the current demand condition. In particular, the price dynamics follows

$$
\dot P(s) = 0.5\left[50 - u_1(s) - u_2(s) - P(s)\right], \qquad P(0) = 5.
$$

Characterize a feedback equilibrium for the duopoly.

2.4 Consider an economy endowed with a renewable resource and with two resource extraction firms. The lease for resource extraction begins at time 0 and ends at time 3. Let $u_i(s)$ denote the rate of resource extraction of firm $i$ at time $s \in [0, 3]$, $i \in \{1, 2\}$, where each extractor controls its rate of extraction. Let $x(s)$ denote the size of the resource stock at time $s$; the resource growth dynamics is stochastic. The extraction cost depends on the quantity of the resource extracted $u_i(s)$ and the resource stock size $x(s)$. In particular, the extraction cost can be specified as $C^i = u_i(s)/x(s)^{1/2}$ for firm $i \in \{1, 2\}$. The market price of the resource depends on the total amount extracted and supplied to the market. The price-output relationship at time $s$ is given by the downward-sloping inverse demand curve $P(s) = 0.5\, Q(s)^{-1/2}$, where $Q(s) = \sum_{j=1}^{2} u_j(s)$ is the total amount of the resource extracted and marketed at time $s$. A terminal bonus $2 x(T)^{1/2}$ is offered to each extractor and the discount rate is $r = 0.1$. Extractor $i \in \{1, 2\}$ seeks to maximize the present value of the expected profits

$$
E_0\left\{\int_0^3 \left[0.5\left(\sum_{j=1}^{2} u_j(s)\right)^{-1/2} u_i(s) - \frac{u_i(s)}{x(s)^{1/2}}\right] \exp(-r s)\, ds + \exp\left[-3(0.1)\right] 2\, x(T)^{1/2}\right\},
$$

subject to the stochastic resource dynamics

$$
dx(s) = \left[10 x(s)^{1/2} - x(s) - \sum_{j=1}^{2} u_j(s)\right] ds + 0.4\, x(s)\, dz(s), \qquad x(0) = 120.
$$

Derive a feedback equilibrium for the above stochastic dynamic economy.
Chapter 3
Dynamic Economic Optimization: Group Optimality and Individual Rationality
The most appealing characteristic of perfectly competitive markets is that individually rational behaviors bring about group (Pareto) optimality in economic resource allocation. However, the market fails to provide an effective mechanism for optimal resource use because of the prevalence of imperfect market structures, externalities, imperfect information, and public goods in the current global economy. As a result, though the market is one of the most effective instruments in conducting economic activities, it fails to guarantee efficiency under the current arrangement. The noncooperative outcomes characterized in Chap. 2 vividly demonstrate that Pareto optimality could not be achieved by markets. Removing market suboptimality is not just a task of achieving a better alternative; sometimes it can be an absolute necessity. For instance, efforts to alleviate the worldwide financial tsunami and catastrophe-bound industrial pollution are currently pressing issues. Cooperative games suggest the possibility of socially optimal and group-efficient solutions to decision problems involving strategic action. The formulation of optimal behavior for players (economic agents) is a fundamental element in this theory. Two essential factors for economic optimization are group optimality and individual rationality. Group optimality ensures that all potential gains from cooperation are captured. The failure to fulfill group optimality leads to the condition where the participants prefer to deviate from the agreed-upon solution plan to extract the unexploited gains. Individual rationality is required to hold so that the payoff allocated to an economic agent under cooperation will be no less than its noncooperative payoff. The failure to guarantee individual rationality leads to the condition where the concerned participants will reject the agreed-upon solution plan and play noncooperatively.
For the optimization scheme to be upheld throughout the game horizon both group rationality and individual rationality are required to be satisfied at any time. Section 3.1 examines the notion of group optimality and shows the derivation of group optimal strategies and the cooperative state trajectory. Individual rationality and transfer payments leading to the satisfaction of individual rationality are given in Sect. 3.2. The analysis is extended to the infinite-horizon scenario in Sect. 3.3 and Sect. 3.4 presents cooperative economic games satisfying group optimality and individual rationality. D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_3, © Springer Science+Business Media, LLC 2012
3.1 Group Optimality

Consider the general form of n-person differential games in economics characterized in Chap. 2. Economic agent $i \in N$ seeks to maximize its objective

$$
\int_{t_0}^{T} g^i\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^i\left(x(T)\right), \tag{3.1}
$$

subject to the state dynamics

$$
\dot x(s) = f\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(t_0) = x_0. \tag{3.2}
$$
Now consider the case when the agents agree to act cooperatively. The agents agree to act according to an agreed-upon optimality principle. The agreement on how to act cooperatively and allocate cooperative payoff constitutes the solution optimality principle of a cooperative scheme. In particular, the solution optimality principle includes (i) an agreement on a set of cooperative strategies/controls and (ii) a mechanism to distribute the total payoff among agents.
3.1.1 Optimal Strategies and Cooperative State Trajectories

Since payoffs are transferable, group optimality requires the agents to maximize their joint payoff. The agents must then solve the following optimal control problem:

$$
\max_{u_1, u_2, \ldots, u_n}\left\{\int_{t_0}^{T} \sum_{j=1}^{n} g^j\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x(T)\right)\right\}, \tag{3.3}
$$

subject to (3.2). Both optimal control and dynamic programming can be used to solve the problem in (3.2) and (3.3). The technique of optimal control is given in Sect. A.2 of the Technical Appendixes. For the sake of comparison with other derived results, and for expositional convenience, a dynamic programming technique is adopted. The set of group optimal control strategies can be characterized as follows.

Theorem 3.1 A set of controls $\{\psi_i^*(t, x), \text{ for } i \in N \text{ and } t \in [t_0, T]\}$ provides an optimal solution to the control problem in (3.2) and (3.3) if there exists a continuously differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R^m \to R$ satisfying the following Bellman equation:

$$
\begin{aligned}
-W_t^{(t_0)}(t, x) &= \max_{u_1, u_2, \ldots, u_n}\left\{\sum_{j=1}^{n} g^j\left[t, x, u_1, u_2, \ldots, u_n\right] \exp\left[-\int_{t_0}^{t} r(y)\, dy\right] + W_x^{(t_0)} f\left[t, x, u_1, u_2, \ldots, u_n\right]\right\}, \\
W^{(t_0)}(T, x) &= \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j(x).
\end{aligned}
$$

Proof Follow the proof of Theorem A.1 in the Technical Appendixes.
Hence the agents will adopt the cooperative control $\{\psi_i^*(t, x), \text{ for } i \in N \text{ and } t \in [t_0, T]\}$ to obtain the maximized level of joint profit. In a cooperative framework, the issue of the nonuniqueness of the optimal controls can be resolved by the agreement between the agents on a particular set of controls. Substituting this set of controls into (3.2) yields the dynamics of the optimal (cooperative) trajectory as

$$
\dot x(s) = f\left[s, x(s), \psi_1^*(s, x(s)), \psi_2^*(s, x(s)), \ldots, \psi_n^*(s, x(s))\right], \qquad x(t_0) = x_0. \tag{3.4}
$$

Let $x^*(t)$ denote the solution to (3.4). The optimal trajectory $\{x^*(t)\}_{t = t_0}^{T}$ can be expressed as

$$
x^*(t) = x_0 + \int_{t_0}^{t} f\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] ds.
$$

For notational convenience, we use the terms $x^*(t)$ and $x_t^*$ interchangeably. The cooperative control for the game in (3.1) and (3.2) over the time interval $[t_0, T]$ can be expressed more precisely as

$$
\left\{\psi_i^*\left(t, x^*(t)\right), \text{ for } i \in N \text{ and } t \in [t_0, T]\right\}. \tag{3.5}
$$

Note that, for group optimality to be achievable, the cooperative controls $\{\psi_i^*(t, x^*(t)), \text{ for } i \in N \text{ and } t \in [t_0, T]\}$ must be exercised throughout the time interval $[t_0, T]$. The cooperative payoff over the interval $[t, T]$, for $t \in [t_0, T)$, can be expressed as

$$
W^{(t_0)}\left(t, x_t^*\right) = \int_t^T \sum_{j=1}^{n} g^j\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x^*(T)\right). \tag{3.6}
$$
To verify whether the agents would find it optimal to adopt the cooperative controls in (3.5) throughout the cooperative duration, we consider an optimal control problem with the dynamics in (3.2) and payoff in (3.3), which begins at time $\tau \in [t_0, T]$ with initial state $x_\tau^*$. At time $\tau$, the optimality principle ensuring group rationality requires the agents to solve the problem

$$
\max_{u_1, u_2, \ldots, u_n}\left\{\int_{\tau}^{T} \sum_{j=1}^{n} g^j\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{\tau}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{\tau}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x(T)\right)\right\}, \tag{3.7}
$$

subject to

$$
\dot x(s) = f\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(\tau) = x_\tau^*. \tag{3.8}
$$
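The problem in (3.7) can alternatively be written as

$$
\begin{aligned}
&\max_{u_1, u_2, \ldots, u_n}\Bigg\{\exp\left[\int_{t_0}^{\tau} r(y)\, dy\right]\Bigg[\int_{\tau}^{T} \sum_{j=1}^{n} g^j\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x(T)\right)\Bigg]\Bigg\} \\
&\quad = \exp\left[\int_{t_0}^{\tau} r(y)\, dy\right] \max_{u_1, u_2, \ldots, u_n}\Bigg\{\int_{\tau}^{T} \sum_{j=1}^{n} g^j\left[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x(T)\right)\Bigg\}.
\end{aligned} \tag{3.9}
$$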
Invoking the backward induction property of the principle of optimality, one can readily verify that the optimal control strategies for the problem in (3.8) and (3.9) are analogous to the optimal control strategies for the problem in (3.1) and (3.2) in the time interval $[\tau, T]$. A remark that will be utilized in the subsequent analysis is given below.

Remark 3.1 Let $W^{(\tau)}(t, x_t^*)$ denote the total cooperative payoff function of the control problem in (3.8) and (3.9). One can readily verify that

$$
\begin{aligned}
\exp\left[\int_{t_0}^{\tau} r(y)\, dy\right] W^{(t_0)}\left(t, x_t^*\right)
&= \exp\left[\int_{t_0}^{\tau} r(y)\, dy\right]\Bigg\{\int_t^T \sum_{j=1}^{n} g^j\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds \\
&\qquad\qquad + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x^*(T)\right)\Bigg\} \\
&= \int_t^T \sum_{j=1}^{n} g^j\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] \exp\left[-\int_{\tau}^{s} r(y)\, dy\right] ds \\
&\qquad\qquad + \exp\left[-\int_{\tau}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j\left(x^*(T)\right) \\
&= W^{(\tau)}\left(t, x_t^*\right),
\end{aligned}
$$

for $\tau \in [t_0, T]$ and $t \in [\tau, T)$.

Next, we present a cooperative economic game yielding group optimality.
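The second equality in Remark 3.1 rests on a single property of the discount factor, spelled out here as a short worked step (our own addition, using only the additivity of the integral). For $t_0 \le \tau \le s$:

```latex
\exp\left[\int_{t_0}^{\tau} r(y)\,dy\right]
\exp\left[-\int_{t_0}^{s} r(y)\,dy\right]
= \exp\left[\int_{t_0}^{\tau} r(y)\,dy - \int_{t_0}^{\tau} r(y)\,dy - \int_{\tau}^{s} r(y)\,dy\right]
= \exp\left[-\int_{\tau}^{s} r(y)\,dy\right].
```

Applying this inside the integral for each $s \in [t, T]$, and once more with $s = T$ for the terminal term, re-bases the discounting from $t_0$ to $\tau$ without changing the controls or the state path.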
3.1.2 Group Optimality in Resource Extraction

Consider the two-firm version of the resource extraction game in Sect. 2.3.2 in Chap. 2. The resource stock $x(s) \in X \subset R$ follows the dynamics

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(t_0) = x_0 \in X, \tag{3.10}
$$

where $u_1(s)$ is the harvest rate of economic agent 1 and $u_2(s)$ is the harvest rate of economic agent 2. The instantaneous payoffs at time $s \in [t_0, T]$ for agents 1 and 2 are, respectively,

$$
\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\, u_1(s)\right] \quad \text{and} \quad \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\, u_2(s)\right],
$$

where $c_1$ and $c_2$ are constants and $c_1 \neq c_2$. At time $T$, each agent will receive a termination bonus $q\, x(T)^{1/2}$, which depends on the resource remaining at the terminal time.

Payoffs are transferable between agents 1 and 2 over time. Given the discount rate $r$, the values received at time $t$ after $t_0$ have to be discounted by the factor $\exp[-r(t - t_0)]$. Consider the case when these two economic agents agree to cooperate and maximize the sum of their payoffs

$$
\int_{t_0}^{T} \left\{\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\, u_1(s)\right] + \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\, u_2(s)\right]\right\} \exp\left[-r(s - t_0)\right] ds + 2 \exp\left[-r(T - t_0)\right] q\, x(T)^{1/2}, \tag{3.11}
$$
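subject to (3.10). Let $[\psi_1^*(t, x), \psi_2^*(t, x)]$ denote a set of controls that provide a solution to the optimal control problem in (3.10) and (3.11), and let $W^{(t_0)}(t, x) : [t_0, T] \times R \to R$ denote the maximized joint payoff function that satisfies the equations (see Theorem 3.1)

$$
\begin{aligned}
-W_t^{(t_0)}(t, x) &= \max_{u_1, u_2}\left\{\left[\left(u_1^{1/2} - \frac{c_1}{x^{1/2}}\, u_1\right) + \left(u_2^{1/2} - \frac{c_2}{x^{1/2}}\, u_2\right)\right] \exp\left[-r(t - t_0)\right] + W_x^{(t_0)}(t, x)\left[a x^{1/2} - b x - u_1 - u_2\right]\right\}, \quad \text{and} \\
W^{(t_0)}(T, x) &= 2 \exp\left[-r(T - t_0)\right] q\, x^{1/2}.
\end{aligned} \tag{3.12}
$$

Performing the indicated maximization we obtain

$$
\psi_1^*(t, x) = \frac{x}{4\left[c_1 + W_x^{(t_0)} \exp[r(t - t_0)]\, x^{1/2}\right]^2}, \quad \text{and} \quad
\psi_2^*(t, x) = \frac{x}{4\left[c_2 + W_x^{(t_0)} \exp[r(t - t_0)]\, x^{1/2}\right]^2}.
$$

Substituting $\psi_1^*(t, x)$ and $\psi_2^*(t, x)$ above into (3.12) yields the value function

$$
W^{(t_0)}(t, x) = \exp\left[-r(t - t_0)\right]\left[\hat A(t)\, x^{1/2} + \hat B(t)\right], \tag{3.13}
$$

where

$$
\begin{aligned}
\dot{\hat A}(t) &= \left[r + \frac{b}{2}\right] \hat A(t) - \frac{1}{2\left[c_1 + \hat A(t)/2\right]} - \frac{1}{2\left[c_2 + \hat A(t)/2\right]}
+ \frac{c_1}{4\left[c_1 + \hat A(t)/2\right]^2} + \frac{c_2}{4\left[c_2 + \hat A(t)/2\right]^2} \\
&\quad + \frac{\hat A(t)}{8\left[c_1 + \hat A(t)/2\right]^2} + \frac{\hat A(t)}{8\left[c_2 + \hat A(t)/2\right]^2}, \\
\dot{\hat B}(t) &= r \hat B(t) - \frac{a}{2} \hat A(t), \qquad \hat A(T) = 2q, \quad \text{and} \quad \hat B(T) = 0.
\end{aligned}
$$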
The optimal cooperative controls can then be obtained as

$$
\psi_1^*(t, x) = \frac{x}{4\left[c_1 + \hat A(t)/2\right]^2}, \quad \text{and} \quad \psi_2^*(t, x) = \frac{x}{4\left[c_2 + \hat A(t)/2\right]^2}.
$$

Substituting these control strategies into (3.10) yields the dynamics of the state trajectory under cooperation

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - \frac{x(s)}{4\left[c_1 + \hat A(s)/2\right]^2} - \frac{x(s)}{4\left[c_2 + \hat A(s)/2\right]^2}, \qquad x(t_0) = x_0. \tag{3.14}
$$

Solving (3.14) yields the optimal cooperative state trajectory as

$$
x^*(s) = \Phi(t_0, s)^2\left[x_0^{1/2} + \int_{t_0}^{s} \Phi^{-1}(t_0, t)\, H_1\, dt\right]^2, \quad \text{for } s \in [t_0, T], \tag{3.15}
$$

where $\Phi(t_0, s) = \exp\left[\int_{t_0}^{s} H_2(\tau)\, d\tau\right]$, $H_1 = \frac{1}{2} a$, and

$$
H_2(s) = -\left[\frac{1}{2} b + \frac{1}{8\left[c_1 + \hat A(s)/2\right]^2} + \frac{1}{8\left[c_2 + \hat A(s)/2\right]^2}\right].
$$

The outcome of cooperation in the game in (3.10) and (3.11) is completely characterized by its optimal cooperative state trajectory, cooperative strategies, and joint payoff above.
3.1.3 Group Optimality in Infinite-Horizon Problems

Consider the n-person infinite-horizon general economic problem in which agent $i$'s payoff is

$$
\int_{t_0}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t_0)\right] ds, \quad \text{for } i \in N. \tag{3.16}
$$

The state dynamics is

$$
\dot x(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(t_0) = x_0. \tag{3.17}
$$

In the case where the agents agree to cooperate, group optimality can be achieved if the agents agree to maximize the sum of their payoffs, that is,

$$
\max_{u_1, u_2, \ldots, u_n}\left\{\int_{t_0}^{\infty} \sum_{j=1}^{n} g^j\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t_0)\right] ds\right\}, \tag{3.18}
$$

subject to (3.17).
Now consider the alternative infinite-horizon problem that starts at time $t \in [t_0, \infty)$ with initial state $x(t) = x$:

$$
\max_{u_1, u_2, \ldots, u_n}\left\{\int_t^{\infty} \sum_{j=1}^{n} g^j\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t)\right] ds\right\},
$$

subject to

$$
\dot x(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(t) = x. \tag{3.19}
$$

The infinite-horizon autonomous problem in (3.19) is independent of the choice of $t$ and dependent only upon the state $x$. We define

$$
W(x) = \max_{u_1, u_2, \ldots, u_n}\left\{\int_t^{\infty} \sum_{j=1}^{n} g^j\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right] \exp\left[-r(s - t)\right] ds \;\Bigg|\; x(t) = x\right\}.
$$

Again, for the sake of comparison in later analysis in the book, we characterize the group optimal control strategies as follows.

Theorem 3.2 A set of controls $\{\psi_i^*(x), \text{ for } i \in N\}$ provides a solution to the optimal control problem in (3.19) if there exists a continuously differentiable function $W(x) : R^m \to R$ satisfying the infinite-horizon Bellman equation

$$
r W(x) = \max_{u_1, u_2, \ldots, u_n}\left\{\sum_{j=1}^{n} g^j\left[x, u_1, u_2, \ldots, u_n\right] + W_x\, f\left[x, u_1, u_2, \ldots, u_n\right]\right\}.
$$

Proof Follow the proof of Theorem A.2 in the Technical Appendixes.
Hence the agents will adopt the cooperative control $\{\psi_i^*(x), \text{ for } i \in N\}$ characterized in Theorem 3.2. Note that these controls are functions of the current state $x$ only. Substituting this set of controls into (3.17) yields the dynamics of the optimal (cooperative) trajectory as

$$
\dot x(s) = f\left[x(s), \psi_1^*(x(s)), \psi_2^*(x(s)), \ldots, \psi_n^*(x(s))\right], \qquad x(t_0) = x_0. \tag{3.20}
$$

Let $x^*(t)$ denote the solution to (3.20). The optimal trajectory $\{x^*(t)\}_{t = t_0}^{\infty}$ can be expressed as

$$
x^*(t) = x_0 + \int_{t_0}^{t} f\left[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\right] ds.
$$

Once again, for notational convenience, we use the terms $x^*(t)$ and $x_t^*$ interchangeably. The cooperative control for the game in (3.16) and (3.17) can be expressed more precisely as

$$
\left\{\psi_i^*\left(x_t^*\right), \text{ for } i \in N \text{ and } t \in [t_0, \infty)\right\}.
$$

Note that these controls are functions of the current state $x_t^*$ only. The current-value maximized cooperative payoff at current time $t \in [t_0, \infty)$, given that the state is $x_t^*$ at $t$, can be expressed as

$$
W\left(x_t^*\right) = \int_t^{\infty} \sum_{j=1}^{n} g^j\left[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\right] \exp\left[-r(s - t)\right] ds. \tag{3.21}
$$
An illustration of an infinite-horizon cooperative economic game achieving group optimality is provided in the following example.

Example 3.1 Consider an infinite-horizon version of the resource extraction example in Sect. 3.1.2. At time $t_0$, the payoff functions of agent 1 and agent 2 are, respectively,

$$
\int_{t_0}^{\infty} \left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\, u_1(s)\right] \exp\left[-r(s - t_0)\right] ds,
\quad \text{and} \quad
\int_{t_0}^{\infty} \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\, u_2(s)\right] \exp\left[-r(s - t_0)\right] ds. \tag{3.22}
$$

The resource stock $x(s) \in X \subset R$ follows the dynamics in (3.10). These two extractors agree to cooperate and maximize the sum of their payoffs. The agents have to solve the control problem of maximizing

$$
\int_{t_0}^{\infty} \left\{\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\, u_1(s)\right] + \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\, u_2(s)\right]\right\} \exp\left[-r(s - t_0)\right] ds, \tag{3.23}
$$

subject to (3.10). Consider the alternative problem of maximizing

$$
\int_t^{\infty} \left\{\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\, u_1(s)\right] + \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\, u_2(s)\right]\right\} \exp\left[-r(s - t)\right] ds, \tag{3.24}
$$

subject to

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(t) = x_t.
$$
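Invoking Theorem 3.2 we obtain

$$
r W(x) = \max_{u_1, u_2}\left\{\left(u_1^{1/2} - \frac{c_1}{x^{1/2}}\, u_1\right) + \left(u_2^{1/2} - \frac{c_2}{x^{1/2}}\, u_2\right) + W_x(x)\left[a x^{1/2} - b x - u_1 - u_2\right]\right\}. \tag{3.25}
$$

Performing the indicated maximization we obtain

$$
\psi_1^*(x) = \frac{x}{4\left[c_1 + W_x(x)\, x^{1/2}\right]^2}, \quad \text{and} \quad \psi_2^*(x) = \frac{x}{4\left[c_2 + W_x(x)\, x^{1/2}\right]^2}.
$$

Substituting $\psi_1^*(x)$ and $\psi_2^*(x)$ above into (3.25) yields the value function

$$
W(x) = \left[\hat A x^{1/2} + \hat B\right], \tag{3.26}
$$

where

$$
\begin{aligned}
&\left[r + \frac{b}{2}\right] \hat A - \frac{1}{2\left[c_1 + \hat A/2\right]} - \frac{1}{2\left[c_2 + \hat A/2\right]}
+ \frac{c_1}{4\left[c_1 + \hat A/2\right]^2} + \frac{c_2}{4\left[c_2 + \hat A/2\right]^2}
+ \frac{\hat A}{8\left[c_1 + \hat A/2\right]^2} + \frac{\hat A}{8\left[c_2 + \hat A/2\right]^2} = 0, \quad \text{and} \\
&\hat B = \frac{a}{2r} \hat A.
\end{aligned}
$$

The optimal cooperative controls can then be obtained as

$$
\psi_1^*(x) = \frac{x}{4\left[c_1 + \hat A/2\right]^2} \quad \text{and} \quad \psi_2^*(x) = \frac{x}{4\left[c_2 + \hat A/2\right]^2}. \tag{3.27}
$$

Substituting these control strategies into (3.10) yields the dynamics of the state trajectory under cooperation

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - \frac{x(s)}{4\left[c_1 + \hat A/2\right]^2} - \frac{x(s)}{4\left[c_2 + \hat A/2\right]^2}, \qquad x(t_0) = x_0. \tag{3.28}
$$

Solving (3.28) yields the optimal cooperative state trajectory for the infinite-horizon control problem in (3.10) and (3.24) as

$$
x^*(s) = \left[\frac{a}{2H} + \left(x_0^{1/2} - \frac{a}{2H}\right) \exp\left[-H(s - t_0)\right]\right]^2, \tag{3.29}
$$

where

$$
H = \frac{b}{2} + \frac{1}{8\left[c_1 + \hat A/2\right]^2} + \frac{1}{8\left[c_2 + \hat A/2\right]^2}.
$$

With this positive $H$, the substitution $y = x^{1/2}$ turns (3.28) into the linear equation $\dot y = a/2 - H y$, whose solution is the bracketed term in (3.29).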
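The closed-form trajectory (3.29) can be cross-checked against a direct numerical integration of (3.28). The sketch below does this for assumed parameter values (none taken from the text), with $H$ taken as the positive decay rate $b/2 + 1/(8[c_1 + \hat A/2]^2) + 1/(8[c_2 + \hat A/2]^2)$ of $x^{1/2}$. It first finds $\hat A$ by bisection on the stationary equation below (3.26). Function names are hypothetical.

```python
import math


def cooperative_A_hat(r, b, c1, c2):
    """Bisection on the stationary equation for A-hat stated below (3.26)."""
    def residual(A):
        D1, D2 = c1 + A / 2.0, c2 + A / 2.0
        return ((r + b / 2.0) * A - 1.0 / (2.0 * D1) - 1.0 / (2.0 * D2)
                + c1 / (4.0 * D1**2) + c2 / (4.0 * D2**2)
                + A / (8.0 * D1**2) + A / (8.0 * D2**2))

    lo, hi = 0.0, 1.0
    while residual(hi) <= 0.0:  # residual is negative at A = 0 for positive costs
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def decay_rate(A_hat, b, c1, c2):
    """H in (3.29), taken here as a positive rate."""
    return (b / 2.0 + 1.0 / (8.0 * (c1 + A_hat / 2.0)**2)
            + 1.0 / (8.0 * (c2 + A_hat / 2.0)**2))


def closed_form_state(s, t0, x0, a, H):
    """x*(s) from (3.29)."""
    root = a / (2.0 * H) + (math.sqrt(x0) - a / (2.0 * H)) * math.exp(-H * (s - t0))
    return root**2


def simulated_state(s_end, t0, x0, a, b, c1, c2, A_hat, steps=5000):
    """Classical RK4 integration of the cooperative dynamics (3.28)."""
    def drift(x):
        return (a * math.sqrt(x) - b * x
                - x / (4.0 * (c1 + A_hat / 2.0)**2)
                - x / (4.0 * (c2 + A_hat / 2.0)**2))

    h = (s_end - t0) / steps
    x = x0
    for _ in range(steps):
        k1 = drift(x)
        k2 = drift(x + 0.5 * h * k1)
        k3 = drift(x + 0.5 * h * k2)
        k4 = drift(x + h * k3)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x


# Illustrative parameter values only; none of them are taken from the text.
a, b, r, c1, c2, x0 = 1.0, 0.1, 0.05, 1.0, 1.5, 10.0
A_hat = cooperative_A_hat(r, b, c1, c2)
H = decay_rate(A_hat, b, c1, c2)
x_closed = closed_form_state(5.0, 0.0, x0, a, H)
x_numeric = simulated_state(5.0, 0.0, x0, a, b, c1, c2, A_hat)
```

Agreement between `x_closed` and `x_numeric` supports reading $H$ as positive: the stock then converges to the steady state $[a/(2H)]^2$ instead of diverging.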
In Sect. 3.1, the notion of group optimality and the derivation of group optimal strategies and the cooperative state trajectory for finite and infinite horizons are presented. The issue of individual rationality will be examined in the following section.
3.2 Individual Rationality

After the economic agents agree to cooperate and maximize their joint payoff, they have to distribute the cooperative payoff among themselves. At time $t_0$, with the state being $x_0$, the term $\xi^{(t_0)i}(t_0, x_0)$ is used to denote the imputation of the payoff (received over the time interval $[t_0, T]$) to agent $i$. A necessary condition for group optimality and individual rationality to be upheld is

$$
\begin{aligned}
&\text{(i)} \quad \sum_{j=1}^{n} \xi^{(t_0)j}(t_0, x_0) = W^{(t_0)}(t_0, x_0), \quad \text{and} \\
&\text{(ii)} \quad \xi^{(t_0)i}(t_0, x_0) \ge V^{(t_0)i}(t_0, x_0), \quad \text{for } i \in N.
\end{aligned} \tag{3.30}
$$

Condition (i) of (3.30) ensures group optimality and condition (ii) guarantees individual rationality at time $t_0$. Failure to ensure group optimality leads to disputes over the unexploited gains. If individual rationality is not guaranteed, economic agents whose cooperative payoff is less than their noncooperative payoff will not participate in the cooperative plan at the outset. For the optimization scheme to be upheld throughout the game horizon, both group rationality and individual rationality are required to be satisfied throughout the cooperation period $[t_0, T]$. At time $\tau \in [t_0, T]$, let $\xi^{(\tau)i}(\tau, x_\tau^*)$ denote the imputation of the payoff to agent $i$ over the time interval $[\tau, T]$. Therefore, the conditions

$$
\begin{aligned}
&\text{(i)} \quad \sum_{j=1}^{n} \xi^{(\tau)j}(\tau, x_\tau^*) = W^{(\tau)}(\tau, x_\tau^*), \quad \text{and} \\
&\text{(ii)} \quad \xi^{(\tau)i}(\tau, x_\tau^*) \ge V^{(\tau)i}(\tau, x_\tau^*), \quad \text{for } i \in N \text{ and } \tau \in [t_0, T],
\end{aligned} \tag{3.31}
$$

have to be fulfilled. In particular, condition (i) ensures Pareto optimality and condition (ii) guarantees individual rationality throughout the cooperation period $[t_0, T]$. Failure to guarantee individual rationality leads to the condition where the concerned participants would reject the agreed-upon solution plan and play noncooperatively.
3.2.1 Lump-Sum and Continuous Transfer Payments

To guarantee individual rationality, transfer payments have to be arranged. First consider the case when individual rationality is required to hold only at $t_0$. In the case where part (ii) of (3.30) is required to be satisfied, lump-sum transfers can be used. With agents using the cooperative strategies $\{\psi_i^*(s, x_s^*), \text{ for } s \in [t_0, T] \text{ and } i \in N\}$, agent $i$ would derive a payoff

$$
W^{(t_0)i}(t_0, x_0) = \int_{t_0}^{T} g^i\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^i\left(x^*(T)\right), \tag{3.32}
$$

for $i \in N$. Given that the agreed-upon imputation to agent $i$ is $\xi^{(t_0)i}(t_0, x_0)$, a lump-sum transfer $\bar\chi^i$ has to be made to agent $i$. In particular,

$$
\bar\chi^i = \xi^{(t_0)i}(t_0, x_0) - W^{(t_0)i}(t_0, x_0), \quad \text{for } i \in N; \quad \text{and} \quad \sum_{j=1}^{n} \bar\chi^j = 0. \tag{3.33}
$$
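The bookkeeping in (3.33) can be illustrated with a small numerical sketch. All payoff figures below are hypothetical, and the sharing rule used to pick the imputation (each agent receives its noncooperative value plus an equal share of the cooperative surplus) is an assumption made for illustration, not a rule stated in this section.

```python
def lump_sum_transfers(imputations, cooperative_payoffs):
    """chi_bar^i = xi^i - W^i for each agent, as in (3.33)."""
    return [xi - w for xi, w in zip(imputations, cooperative_payoffs)]


def is_individually_rational(imputations, noncooperative_values):
    """Part (ii) of (3.30): every imputation is at least the noncooperative value."""
    return all(xi >= v for xi, v in zip(imputations, noncooperative_values))


# Hypothetical figures for two agents.
cooperative_payoffs = [6.0, 4.0]    # W^i: what each agent earns under the joint plan
noncooperative_values = [3.0, 4.5]  # V^i: what each agent could secure alone

# Illustrative sharing rule (an assumption): split the surplus equally.
joint_payoff = sum(cooperative_payoffs)
surplus = joint_payoff - sum(noncooperative_values)
imputations = [v + surplus / len(noncooperative_values) for v in noncooperative_values]

transfers = lump_sum_transfers(imputations, cooperative_payoffs)
rational = is_individually_rational(imputations, noncooperative_values)
```

In this example agent 2 earns less under the joint plan than it could alone, so without the transfer from agent 1 it would decline to cooperate. The transfers sum to zero because the imputations exhaust the joint payoff, as part (i) of (3.30) requires.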
Individual rationality would hold at time $t_0$ if $\xi^{(t_0)i}(t_0, x_0) \ge V^{(t_0)i}(t_0, x_0)$. On the other hand, transfer payments can be paid continuously to satisfy part (ii) of (3.30). Let $\chi^i(s)$ denote the instantaneous transfer payment allocated to agent $i$ at time $s \in [t_0, T]$. With agents using the cooperative strategies $\{\psi_i^*(s, x_s^*), \text{ for } s \in [t_0, T] \text{ and } i \in N\}$, for a given agreed-upon imputation $\xi^{(t_0)i}(t_0, x_0)$, we can express agent $i$'s cooperative payoff as

$$
\begin{aligned}
\xi^{(t_0)i}(t_0, x_0) &= \int_{t_0}^{T} \left\{g^i\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] + \chi^i(s)\right\} \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds \\
&\quad + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^i\left(x^*(T)\right), \quad \text{for } i \in N; \quad \text{and} \\
&\sum_{j=1}^{n} \int_{t_0}^{T} \chi^j(s) \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds = 0.
\end{aligned} \tag{3.34}
$$
Once again, individual rationality would hold at time $t_0$ if $\xi^{(t_0)i}(t_0, x_0) \ge V^{(t_0)i}(t_0, x_0)$. However, requiring individual rationality to hold only at $t_0$ does not guarantee that individual rationality will hold for the rest of the cooperation duration. Credible threats must be created to deter agents from abandoning the cooperative strategies at a later time in the cooperation duration.

Now we consider the case when individual rationality is required to be satisfied throughout the cooperation period $[t_0, T]$. In general, only continuous instantaneous transfer payments can guarantee the satisfaction of (3.31). To uphold part (ii) of (3.31) one has to devise a set of instantaneous transfer payments $\chi^i(s)$ for $s \in [t_0, T]$ satisfying

$$
\int_{\tau}^{T} \left\{g^i\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] + \chi^i(s)\right\} \exp\left[-\int_{\tau}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{\tau}^{T} r(y)\, dy\right] q^i\left(x^*(T)\right) \ge V^{(\tau)i}\left(\tau, x_\tau^*\right), \quad \text{for } i \in N; \tag{3.35}
$$

and

$$
\sum_{j=1}^{n} \int_{\tau}^{T} \chi^j(s) \exp\left[-\int_{\tau}^{s} r(y)\, dy\right] ds = 0, \quad \text{for } \tau \in [t_0, T]. \tag{3.36}
$$
Remark 3.2 In the special case when the economic agents are identical, transfer payments may not be necessary for the maintenance of individual rationality because

$$
W^{(\tau)}\left(\tau, x_\tau^*\right) \ge \sum_{j=1}^{n} V^{(\tau)j}\left(\tau, x_\tau^*\right) = n V^{(\tau)i}\left(\tau, x_\tau^*\right),
$$

and

$$
\int_{\tau}^{T} g^i\left[s, x^*(s), \psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\right] \exp\left[-\int_{\tau}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{\tau}^{T} r(y)\, dy\right] q^i\left(x^*(T)\right)
= \frac{1}{n} W^{(\tau)}\left(\tau, x_\tau^*\right) \ge V^{(\tau)i}\left(\tau, x_\tau^*\right),
$$

for $i \in N$ and $\tau \in [t_0, T]$.

An illustration of a cooperative economic game involving individually rational imputation is provided in the next section.
3.2.2 Individually Rational Imputation in Cooperative Resource Extraction

Consider the two firms’ resource extraction game in Sect. 3.1.2. Let $[\phi_1^*(t,x), \phi_2^*(t,x)]$, for $t\in[t_0,T]$, denote a set of strategies that provide a feedback Nash equilibrium solution to the game in (3.10) and (3.36). Invoking Theorem 2.3 in Chap. 2, the value function $V^{(t_0)i}(t,x): [t_0,T]\times R^m \to R$, for $i\in\{1,2\}$, satisfies the equations

$$
-V_t^{(t_0)i}(t,x) = \max_{u_i}\Big\{ \Big[u_i(t)^{1/2} - \frac{c_i}{x^{1/2}}\,u_i(t)\Big]\exp[-r(t-t_0)] + V_x^{(t_0)i}(t,x)\big[a x^{1/2} - bx - u_i(t) - \phi_j^*(t,x)\big] \Big\}, \quad \text{and}
$$

$$
V^{(t_0)i}(T,x) = \exp[-r(T-t_0)]\, q\, x(T)^{1/2}, \tag{3.37}
$$

for $i\in\{1,2\}$ and $j\in\{1,2\}$, and $j\neq i$.

Performing the indicated maximization in (3.37) yields

$$
\phi_i^*(t,x) = \frac{x}{4\big[c_i + V_x^{(t_0)i}\exp[r(t-t_0)]\,x^{1/2}\big]^2}, \quad \text{for } i\in\{1,2\}. \tag{3.38}
$$

Proposition 3.1 The value function of agent $i\in\{1,2\}$ satisfying (3.37) is

$$
V^{(t_0)i}(t,x) = \exp[-r(t-t_0)]\big[A_i(t)x^{1/2} + B_i(t)\big], \tag{3.39}
$$

where for $i,j\in\{1,2\}$ and $i\neq j$, $A_i(t)$, $B_i(t)$, $A_j(t)$, and $B_j(t)$ satisfy

$$
\dot A_i(t) = \Big[r + \frac{b}{2}\Big]A_i(t) - \frac{1}{2[c_i + A_i(t)/2]} + \frac{c_i}{4[c_i + A_i(t)/2]^2} + \frac{A_i(t)}{8[c_i + A_i(t)/2]^2} + \frac{A_i(t)}{8[c_j + A_j(t)/2]^2},
$$

$$
\dot B_i(t) = rB_i(t) - \frac{a}{2}A_i(t), \qquad A_i(T) = q, \quad \text{and} \quad B_i(T) = 0. \tag{3.40}
$$
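The coupled system for $A_i(t)$ and $B_i(t)$ in Proposition 3.1 can be integrated numerically backward from the terminal conditions $A_i(T)=q$, $B_i(T)=0$. The following is only a minimal sketch under stated assumptions: the function names `rhs`, `rk4_step`, and `solve_backward` are hypothetical, the integrator is a hand-written classical Runge-Kutta scheme rather than anything prescribed by the text, and any parameter values passed in are illustrative.

```python
import math

def rhs(state, a, b, r, c):
    """Right-hand side of the ODE system for state = [A1, A2, B1, B2]."""
    A, B = state[:2], state[2:]
    dA, dB = [], []
    for i in (0, 1):
        j = 1 - i
        d_i = c[i] + A[i] / 2.0      # c_i + A_i/2
        d_j = c[j] + A[j] / 2.0      # c_j + A_j/2
        dA.append((r + b / 2.0) * A[i]
                  - 1.0 / (2.0 * d_i)
                  + c[i] / (4.0 * d_i ** 2)
                  + A[i] / (8.0 * d_i ** 2)
                  + A[i] / (8.0 * d_j ** 2))
        dB.append(r * B[i] - a / 2.0 * A[i])
    return dA + dB

def rk4_step(y, h, a, b, r, c):
    """One classical fourth-order Runge-Kutta step of size h (h may be negative)."""
    k1 = rhs(y, a, b, r, c)
    k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], a, b, r, c)
    k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], a, b, r, c)
    k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)], a, b, r, c)
    return [yi + h / 6.0 * (p + 2.0 * q2 + 2.0 * q3 + s)
            for yi, p, q2, q3, s in zip(y, k1, k2, k3, k4)]

def solve_backward(a, b, r, q, c, t0, T, steps=2000):
    """Integrate from t = T down to t = t0; returns [A1(t0), A2(t0), B1(t0), B2(t0)]."""
    h = (t0 - T) / steps             # negative step size: backward in time
    y = [q, q, 0.0, 0.0]             # terminal conditions A_i(T) = q, B_i(T) = 0
    for _ in range(steps):
        y = rk4_step(y, h, a, b, r, c)
    return y

def noncooperative_value(A_i, B_i, x):
    """V^{(t0)i}(t0, x) = A_i(t0) x^{1/2} + B_i(t0) (discount factor equals 1 at t = t0)."""
    return A_i * math.sqrt(x) + B_i
```

With identical cost parameters the two firms’ coefficients coincide, which gives a quick sanity check on any implementation of this kind.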
Proof Substituting $\phi_1^*(t,x)$ and $\phi_2^*(t,x)$ into (3.37) and upon solving (3.37) one obtains Proposition 3.1.

Now consider the situation when the firms agree to cooperate and want to share their profits with individual rationality being upheld. As mentioned in Sect. 3.2.1, individual rationality may just hold at the outset of the game or be maintained throughout the game. We shall consider transfer schemes leading to the former case first and then schemes leading to the latter case.

(i) Transfers Satisfying Individual Rationality at the Outset

We begin with transfers that satisfy individual rationality at initial time $t_0$ and consider the case when lump-sum transfers are given out so that (3.30) is fulfilled. The imputation to agent $i$ satisfying individual rationality and group optimality as in (3.30) requires

$$
\xi^{(t_0)i}(t_0,x_0) = V^{(t_0)i}(t_0,x_0) + \varpi_i, \quad \text{for } i\in\{1,2\} \text{ and } \varpi_i \ge 0, \tag{3.41}
$$

where

$$
\sum_{j=1}^{2}\varpi_j = W^{(t_0)}(t_0,x_0) - \sum_{j=1}^{2} V^{(t_0)j}(t_0,x_0) = \big[\hat A(t_0)x_0^{1/2} + \hat B(t_0)\big] - \sum_{j=1}^{2}\big[A_j(t_0)x_0^{1/2} + B_j(t_0)\big].
$$

Under cooperation, agent $i$ would derive a payoff

$$
W^{(t_0)i}(t_0,x_0) = \int_{t_0}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s))\Big]\exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q\,x^*(T)^{1/2}, \tag{3.42}
$$

where $\psi_i^*(s,x^*(s)) = \dfrac{x^*(s)}{4[c_i + \hat A(s)/2]^2}$ for $i\in\{1,2\}$, and $x^*(s)$ as in (3.15).

Given that the agreed-upon imputation to agent $i$ is $\xi^{(t_0)i}(t_0,x_0) = V^{(t_0)i}(t_0,x_0) + \varpi_i$, a lump-sum transfer $\bar\chi^i$ has to be made to agent $i$. In particular,

$$
\bar\chi^i = \xi^{(t_0)i}(t_0,x_0) - W^{(t_0)i}(t_0,x_0) = V^{(t_0)i}(t_0,x_0) + \varpi_i - W^{(t_0)i}(t_0,x_0), \quad \text{for } i\in\{1,2\}; \quad \text{and} \quad \sum_{j=1}^{2}\bar\chi^j = 0.
$$
Now we consider the case of continuous instantaneous transfer payments satisfying (3.30). Let $\chi^i(s)$ denote the instantaneous transfer payment allocated to agent $i$ at time $s\in[t_0,T]$. To maintain part (ii) of (3.30) the chosen $\chi^i(s)$ must satisfy

$$
\xi^{(t_0)i}(t_0,x_0) = V^{(t_0)i}(t_0,x_0) + \varpi_i = \big[A_i(t_0)x_0^{1/2} + B_i(t_0)\big] + \varpi_i
$$

$$
= \int_{t_0}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s)) + \chi^i(s)\Big]\exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q\,x^*(T)^{1/2} \ge V^{(t_0)i}(t_0,x_0), \tag{3.43}
$$

for $i\in\{1,2\}$, and

$$
\sum_{j=1}^{2}\int_{t_0}^{T}\chi^j(s)\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds = 0.
$$
Then we proceed to consider the case with transfer payments satisfying individual rationality throughout the cooperative period.

(ii) Transfers Satisfying Individual Rationality Throughout

For individual rationality and group optimality to be satisfied throughout the cooperation period, (3.31) has to be maintained. The use of a lump sum cannot secure an outcome fulfilling (3.31). Hence continuous instantaneous transfer payments will be considered. Given that an instantaneous transfer payment $\chi^i(s)$ is allocated to agent $i$ at time $s\in[t_0,T]$, the imputation to agent $i$ over the period $[\tau,T]$ as viewed at time $\tau\in[t_0,T]$ can be expressed as

$$
\xi^{(\tau)i}(\tau,x_\tau^*) = \int_{\tau}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s)) + \chi^i(s)\Big]\exp[-r(s-\tau)]\,ds + \exp[-r(T-\tau)]\,q\,x^*(T)^{1/2}, \tag{3.44}
$$

for $i\in\{1,2\}$ and $\tau\in[t_0,T]$, and

$$
\sum_{j=1}^{2}\int_{t_0}^{T}\chi^j(s)\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds = 0.
$$

For individual rationality to be satisfied throughout the cooperation period, it is required that $\xi^{(\tau)i}(\tau,x_\tau^*) \ge V^{(\tau)i}(\tau,x_\tau^*)$, for $i\in\{1,2\}$ and $\tau\in[t_0,T]$. Invoking Remark 2.1 in Chap. 2 and (3.39), we can obtain the value functions $V^{(\tau)i}(\tau,x_\tau^*)$, for $i\in\{1,2\}$, as

$$
V^{(\tau)i}(\tau,x_\tau^*) = \big[A_i(\tau)(x_\tau^*)^{1/2} + B_i(\tau)\big]. \tag{3.45}
$$

Therefore any set of chosen instantaneous transfer payments $\chi^i(s)$, for $s\in[t_0,T]$, satisfying $\xi^{(\tau)i}(\tau,x_\tau^*) \ge V^{(\tau)i}(\tau,x_\tau^*)$, for $i\in\{1,2\}$ and $\tau\in[t_0,T]$, will ensure individual rationality throughout the cooperation duration.
3.3 Individual Rationality Under Infinite Horizon

In many economic situations, the terminal time of the game $T$ is either very far in the future or unknown to the agents. Consider the case in Sect. 3.1.3 where the agents agree to cooperate and maximize the sum of their payoffs in (3.18) subject to (3.17). They have to distribute the cooperative payoff among themselves.

3.3.1 Individually Rational at the Outset

At time $t_0$, with the state being $x_0$, the term $\xi^{(t_0)i}(x_0) = \xi^i(x_0)$ is used to denote the imputation of a payoff (over the time interval $[t_0,\infty)$) to agent $i$.

A necessary condition for group optimality and individual rationality to be upheld is

$$
\text{(i)}\quad \sum_{j=1}^{n}\xi^{(t_0)j}(x_0) = W(x_0), \quad \text{and}
$$

$$
\text{(ii)}\quad \xi^{(t_0)i}(x_0) \ge \hat V^{(t_0)i}(x_0) = \hat V^i(x_0), \quad \text{for } i\in N, \tag{3.46}
$$

where $\hat V^{(t_0)i}(x_0) = \hat V^i(x_0)$ is the payoff value function of agent $i$ in the $n$-person infinite-horizon general economic game problem

$$
\max_{u_i}\int_{t_0}^{\infty} g^i\big[x(s), u_1(s), u_2(s), \ldots, u_n(s)\big]\exp[-r(s-t_0)]\,ds, \quad \text{for } i\in N,
$$

subject to the state dynamics

$$
\dot x(s) = f\big[x(s), u_1(s), u_2(s), \ldots, u_n(s)\big], \qquad x(t_0) = x_0.
$$
Condition (i) of (3.46) ensures group optimality and condition (ii) guarantees individual rationality at time $t_0$.

Consider first the case where lump-sum transfer payments are given out at time $t_0$. With agents using the cooperative strategies $\{\psi_i^*(x_s^*)$, for $s\in[t_0,\infty)$ and $i\in N\}$, agent $i$ would derive a payoff

$$
W^{(t_0)i}(x_0) = \int_{t_0}^{\infty} g^i\big[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\big]\exp[-r(s-t_0)]\,ds = W^i(x_0), \tag{3.47}
$$

for $i\in N$. Given that the agreed-upon imputation to agent $i$ is $\xi^{(t_0)i}(x_0)$, a lump-sum transfer $\bar\chi^i$ has to be made to agent $i$. In particular,

$$
\bar\chi^i = \xi^{(t_0)i}(x_0) - W^i(x_0), \quad \text{for } i\in N; \quad \text{and} \quad \sum_{j=1}^{n}\bar\chi^j = 0. \tag{3.48}
$$
Individual rationality would hold at time $t_0$ if $\xi^{(t_0)i}(x_0) \ge \hat V^i(x_0)$.

On the other hand, transfer payments can be paid continuously to satisfy (3.46). Let $\chi^i(s)$ denote the instantaneous transfer payment allocated to agent $i$ at time $s\in[t_0,\infty)$. With agents using the cooperative strategies $\{\psi_i^*(x_s^*)$, for $s\in[t_0,\infty)$ and $i\in N\}$ and given an agreed-upon imputation $\xi^{(t_0)i}(x_0)$, we can express agent $i$’s cooperative payoff as

$$
\xi^{(t_0)i}(x_0) = \int_{t_0}^{\infty}\Big\{ g^i\big[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\big] + \chi^i(s)\Big\}\exp[-r(s-t_0)]\,ds, \quad \text{for } i\in N;
$$

and

$$
\sum_{j=1}^{n}\int_{t_0}^{\infty}\chi^j(s)\exp[-r(s-t_0)]\,ds = 0. \tag{3.49}
$$

Once again, individual rationality would hold at time $t_0$ with $\xi^{(t_0)i}(x_0) \ge \hat V^i(x_0)$. Note that requiring individual rationality to hold only at $t_0$ does not guarantee that individual rationality will hold for the rest of the cooperation period. Credible threats must be created to deter agents from abandoning the cooperative strategies at a later time in the cooperation period.
3.3.2 Individually Rational Throughout the Cooperative Duration

For the optimization scheme to be upheld throughout the game horizon, both group rationality and individual rationality are required to be satisfied throughout the cooperation period $[t_0,\infty)$. As mentioned earlier, only continuous instantaneous transfer payments can guarantee the satisfaction of individual rationality throughout the duration of cooperation. Therefore along the optimal cooperative path $\{x_\tau^*\}_{\tau\in[t_0,\infty)}$, one has to devise a set of instantaneous transfer payments $\chi^i(s)$ for $s\in[t_0,\infty)$ satisfying

$$
\xi^{(\tau)i}(x_\tau^*) = \xi^i(x_\tau^*) = \int_{\tau}^{\infty}\Big\{ g^i\big[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\big] + \chi^i(s)\Big\}\exp[-r(s-\tau)]\,ds \ge \hat V^i(x_\tau^*), \quad \text{for } i\in N;
$$

and

$$
\sum_{j=1}^{n}\int_{\tau}^{\infty}\chi^j(s)\exp[-r(s-\tau)]\,ds = 0, \quad \text{for } \tau\in[t_0,\infty). \tag{3.50}
$$

A set of instantaneous transfer payments $\chi^i(s)$ for $s\in[t_0,\infty)$ satisfying (3.50) satisfies

$$
\text{(i)}\quad \sum_{j=1}^{n}\xi^j(x_\tau^*) = W(x_\tau^*), \quad \text{and}
$$

$$
\text{(ii)}\quad \xi^i(x_\tau^*) \ge \hat V^i(x_\tau^*); \tag{3.51}
$$

for $i\in N$ and $\tau\in[t_0,\infty)$ along the path $\{x_\tau^*\}_{\tau\in[t_0,\infty)}$.
In particular, condition (i) ensures Pareto optimality and condition (ii) guarantees individual rationality throughout the cooperation period $[t_0,\infty)$. The failure to guarantee individual rationality leads to the condition where the concerned participants will reject the agreed-upon solution plan and play noncooperatively.

In the steady state of the infinite-horizon problem, the state dynamics under cooperation become

$$
\dot x^*(s) = f\big[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\big] = 0. \tag{3.52}
$$

Let $x_\infty^*$ denote the steady-state level of $x^*$ that satisfies (3.52). In the steady state, the joint cooperative payoffs can be expressed as $W(x_\infty^*)$ and the noncooperative payoff value function of agent $i$ as $\hat V^i(x_\infty^*)$.

At any time $t\in[t_0,\infty)$ at which a steady state has been attained, agents will use the cooperative strategies $\{\psi_i^*(x_\infty^*)$, for $i\in N\}$, and agent $i$ will derive a payoff

$$
W^i(x_\infty^*) = \int_{t}^{\infty} g^i\big[x_\infty^*, \psi_1^*(x_\infty^*), \psi_2^*(x_\infty^*), \ldots, \psi_n^*(x_\infty^*)\big]\exp[-r(s-t)]\,ds, \tag{3.53}
$$

for $i\in N$. To fulfill individual rationality in a steady state an infinite stream of constant instantaneous transfer payments $\chi_\infty^i$ satisfying

$$
\xi^i(x_\infty^*) = \int_{\tau}^{\infty}\Big\{ g^i\big[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\big] + \chi_\infty^i\Big\}\exp[-r(s-\tau)]\,ds, \qquad \xi^i(x_\infty^*) \ge \hat V^i(x_\infty^*), \quad \text{and}
$$

$$
\sum_{j=1}^{n}\int_{\tau}^{\infty}\chi_\infty^j\exp[-r(s-\tau)]\,ds = 0, \quad \text{for } i\in N, \tag{3.54}
$$

will be given out. An illustration with an economic cooperation involving the satisfaction of individual rationality is shown below.
3.3.3 Individual Rationality in Resource Extraction

Consider the infinite-horizon resource extraction game in (3.10) and (3.22) in Sect. 3.1. Again, we first examine the alternative game problem that starts at time $t\in[t_0,\infty)$ with initial state $x(t) = x_t$:

$$
\max_{u_1}\int_{t}^{\infty}\Big[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\,u_1(s)\Big]\exp[-r(s-t)]\,ds, \quad \text{and}
$$

$$
\max_{u_2}\int_{t}^{\infty}\Big[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\,u_2(s)\Big]\exp[-r(s-t)]\,ds, \quad \text{where } t\in[t_0,\infty), \tag{3.55}
$$

subject to

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(t) = x_t. \tag{3.56}
$$
Let $[\phi_1^*(x), \phi_2^*(x)]$, for $t\in[t_0,\infty)$, denote a set of strategies that provides a feedback Nash equilibrium solution to the game in (3.55) and (3.56). Invoking Theorem 2.4 in Chap. 2, the value function $\hat V^i(x): R\to R$, for $i\in\{1,2\}$, satisfies the Isaacs–Bellman equations

$$
r\hat V^i(x) = \max_{u_i}\Big\{ u_i(t)^{1/2} - \frac{c_i}{x^{1/2}}\,u_i(t) + \hat V_x^i(x)\big[a x^{1/2} - bx - u_i(t) - \phi_j^*(x)\big]\Big\}, \tag{3.57}
$$

for $i\in\{1,2\}$ and $j\in\{1,2\}$, and $j\neq i$. Performing the indicated maximization in (3.57) yields

$$
\phi_i^*(x) = \frac{x}{4\big[c_i + \hat V_x^i(x)\,x^{1/2}\big]^2}, \quad \text{for } i\in\{1,2\}. \tag{3.58}
$$

Proposition 3.2 The value function of agent $i\in\{1,2\}$ satisfying (3.57) is

$$
\hat V^i(x) = \big[\hat A_i x^{1/2} + \hat B_i\big], \tag{3.59}
$$

where for $i,j\in\{1,2\}$ and $i\neq j$, $\hat A_i$, $\hat B_i$, $\hat A_j$, and $\hat B_j$ satisfy

$$
0 = \Big[r + \frac{b}{2}\Big]\hat A_i - \frac{1}{2[c_i + \hat A_i/2]} + \frac{c_i}{4[c_i + \hat A_i/2]^2} + \frac{\hat A_i}{8[c_i + \hat A_i/2]^2} + \frac{\hat A_i}{8[c_j + \hat A_j/2]^2}, \quad \text{and} \quad \hat B_i = \frac{a}{2r}\hat A_i.
$$
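The stationary conditions in Proposition 3.2 are nonlinear algebraic equations in $\hat A_1$ and $\hat A_2$. As an illustration only, the sketch below treats the symmetric special case $c_1 = c_2 = c$ (an assumption made here, not in the text), in which $\hat A_1 = \hat A_2$ and the condition reduces to a single equation whose residual is negative at $\hat A = 0$ and increasing for $\hat A \ge 0$, so plain bisection suffices. The function names are hypothetical.

```python
def stationary_residual(A_i, A_j, c_i, c_j, b, r):
    """Right-hand side of the stationary condition for A_i in Proposition 3.2."""
    d_i = c_i + A_i / 2.0
    d_j = c_j + A_j / 2.0
    return ((r + b / 2.0) * A_i
            - 1.0 / (2.0 * d_i)
            + c_i / (4.0 * d_i ** 2)
            + A_i / (8.0 * d_i ** 2)
            + A_i / (8.0 * d_j ** 2))

def solve_symmetric(c, a, b, r, tol=1e-12):
    """Bisection for A_hat when both firms share the same cost parameter c."""
    lo, hi = 0.0, 1.0
    # Expand the bracket until the residual changes sign.
    while stationary_residual(hi, hi, c, c, b, r) < 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stationary_residual(mid, mid, c, c, b, r) < 0.0:
            lo = mid
        else:
            hi = mid
    A_hat = 0.5 * (lo + hi)
    B_hat = a * A_hat / (2.0 * r)    # B_hat_i = (a / 2r) A_hat_i
    return A_hat, B_hat
```

For asymmetric costs the two conditions would have to be solved jointly, for example with a two-dimensional root finder; that case is not covered by this sketch.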
Proof Substituting $\phi_1^*(x)$ and $\phi_2^*(x)$ into (3.57) and upon solving (3.57) one obtains Proposition 3.2.

The joint payoff of the firms can be obtained as in (3.26) and the cooperative strategies are given in (3.27). Again, we consider both the case where individual rationality holds at the outset of the game and the case where it is maintained throughout the game.

(i) Transfers Satisfying Individual Rationality at the Outset

We first consider transfers that satisfy individual rationality at initial time $t_0$, beginning with the case when lump-sum transfers are given out so that (3.46) is fulfilled. The imputation to agent $i$ satisfying individual rationality and group optimality, as in (3.46), requires

$$
\xi^i(x_0) = \hat V^i(x_0) + \varpi_i, \quad \text{for } i\in\{1,2\} \text{ and } \varpi_i \ge 0,
$$

where

$$
\sum_{j=1}^{2}\varpi_j = W(x_0) - \sum_{j=1}^{2}\hat V^j(x_0) = \big[\hat A x_0^{1/2} + \hat B\big] - \sum_{j=1}^{2}\big[\hat A_j x_0^{1/2} + \hat B_j\big].
$$

Under cooperation, agent $i$ would derive a payoff

$$
W^i(x_0) = \int_{t_0}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s))\Big]\exp[-r(s-t_0)]\,ds, \tag{3.60}
$$

for $i\in\{1,2\}$, where

$$
\psi_1^*(x^*(s)) = \frac{x^*(s)}{4[c_1 + \hat A/2]^2} \quad \text{and} \quad \psi_2^*(x^*(s)) = \frac{x^*(s)}{4[c_2 + \hat A/2]^2},
$$

as given in (3.26), and $x^*(s)$ is given in (3.28).

With the agreed-upon imputation to agent $i$ being $\xi^i(x_0) = \hat V^i(x_0) + \varpi_i$, a lump-sum transfer $\bar\chi^i$ has to be made to agent $i$. In particular,

$$
\bar\chi^i = \xi^i(x_0) - W^i(x_0) = \hat V^i(x_0) + \varpi_i - W^i(x_0), \quad \text{for } i\in\{1,2\}; \quad \text{and} \quad \sum_{j=1}^{2}\bar\chi^j = 0.
$$

Now we consider the case of continuous instantaneous transfer payments satisfying (3.46). Let $\chi^i(s)$ denote the instantaneous transfer payment allocated to agent $i$ at time $s\in[t_0,\infty)$. To fulfill (3.46) the chosen $\chi^i(s)$ must satisfy

$$
\xi^i(x_0) = \hat V^i(x_0) + \varpi_i = \big[\hat A_i x_0^{1/2} + \hat B_i\big] + \varpi_i = \int_{t_0}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s)) + \chi^i(s)\Big]\exp[-r(s-t_0)]\,ds, \tag{3.61}
$$

for $i\in\{1,2\}$, and

$$
\sum_{j=1}^{2}\int_{t_0}^{\infty}\chi^j(s)\exp[-r(s-t_0)]\,ds = 0.
$$
Next, we proceed to consider the case with transfer payments satisfying individual rationality throughout the cooperative period.

(ii) Transfers Satisfying Individual Rationality Throughout

For individual rationality and group optimality to be satisfied throughout the cooperation period, (3.51) has to be maintained. The use of a lump sum cannot secure an outcome fulfilling (3.51). Hence continuous instantaneous transfer payments will be considered. Given that an instantaneous transfer payment $\chi^i(s)$ is allocated to agent $i$ at time $s\in[t_0,\infty)$, the imputation to agent $i$ over the period $[\tau,\infty)$ as viewed at time $\tau\in[t_0,\infty)$ can be expressed as

$$
\xi^i(x_\tau^*) = \int_{\tau}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s)) + \chi^i(s)\Big]\exp[-r(s-\tau)]\,ds, \tag{3.62}
$$

for $i\in\{1,2\}$ and $\tau\in[t_0,\infty)$. For individual rationality to be satisfied throughout the cooperation period, it is required that $\xi^i(x_\tau^*) \ge \hat V^i(x_\tau^*)$, for $i\in\{1,2\}$ and $\tau\in[t_0,\infty)$. To fulfill (3.51) the chosen $\chi^i(s)$ must satisfy

$$
\xi^i(x_\tau^*) = \int_{\tau}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s)) + \chi^i(s)\Big]\exp[-r(s-\tau)]\,ds \ge \hat V^i(x_\tau^*) = \big[\hat A_i (x_\tau^*)^{1/2} + \hat B_i\big], \tag{3.63}
$$

for $i\in\{1,2\}$ and $\tau\in[t_0,\infty)$, and

$$
\sum_{j=1}^{2}\int_{\tau}^{\infty}\chi^j(s)\exp\Big[-\int_{\tau}^{s} r(y)\,dy\Big]\,ds = 0.
$$

Now consider the case in a steady state. The state dynamics under cooperation become

$$
\dot x^*(s) = a\,x^*(s)^{1/2} - b\,x^*(s) - \frac{x^*(s)}{4[c_1 + \hat A/2]^2} - \frac{x^*(s)}{4[c_2 + \hat A/2]^2} = 0. \tag{3.64}
$$

Solving (3.64) yields the steady-state level of the state variable as

$$
x_\infty^* = a^2\Big/\Big[b + \frac{1}{4[c_1 + \hat A/2]^2} + \frac{1}{4[c_2 + \hat A/2]^2}\Big]^2. \tag{3.65}
$$
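The steady-state formula (3.65) follows from dividing (3.64) by $x^*(s)^{1/2}$, and it is easy to check numerically that the resulting stock makes the drift vanish. The sketch below is illustrative only: the function names are hypothetical, and the value passed for $\hat A$ is a placeholder rather than a solution of the cooperative problem in (3.26).

```python
import math

def extraction_rate_sum(c1, c2, A_hat):
    """k = 1/(4[c1 + A_hat/2]^2) + 1/(4[c2 + A_hat/2]^2), the total harvest per unit of stock."""
    return (1.0 / (4.0 * (c1 + A_hat / 2.0) ** 2)
            + 1.0 / (4.0 * (c2 + A_hat / 2.0) ** 2))

def steady_state(a, b, c1, c2, A_hat):
    """Steady-state stock from (3.65): a^2 / (b + k)^2."""
    k = extraction_rate_sum(c1, c2, A_hat)
    return a ** 2 / (b + k) ** 2

def cooperative_drift(x, a, b, c1, c2, A_hat):
    """Right-hand side of (3.64) evaluated at stock level x."""
    k = extraction_rate_sum(c1, c2, A_hat)
    return a * math.sqrt(x) - b * x - k * x

# Placeholder parameter values, chosen only to exercise the functions.
x_inf = steady_state(a=1.0, b=0.5, c1=1.0, c2=2.0, A_hat=0.8)
residual = cooperative_drift(x_inf, a=1.0, b=0.5, c1=1.0, c2=2.0, A_hat=0.8)
```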
The joint cooperative payoffs in a steady state are

$$
W(x_\infty^*) = \big[\hat A (x_\infty^*)^{1/2} + \hat B\big],
$$

and the payoff value function of agent $i$ is

$$
\hat V^i(x_\infty^*) = \big[\hat A_i (x_\infty^*)^{1/2} + \hat B_i\big], \quad \text{for } i\in\{1,2\}.
$$

At any time $t\in[t_0,\infty)$ at which a steady state has been attained, agents will use the cooperative strategies $\{\psi_1^*(x_\infty^*), \psi_2^*(x_\infty^*)\}$, and agent $i$ will derive a payoff

$$
W^i(x_\infty^*) = \int_{t}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s))\Big]\exp[-r(s-t)]\,ds, \tag{3.66}
$$

for $i\in\{1,2\}$.

To fulfill individual rationality in a steady state an infinite stream of constant instantaneous transfer payments $\chi_\infty^i$ must satisfy

$$
\xi^i(x_\infty^*) = \int_{t}^{\infty}\Big[\psi_i^*(x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(x^*(s)) + \chi_\infty^i\Big]\exp[-r(s-t)]\,ds \ge \hat V^i(x_\infty^*) = \big[\hat A_i (x_\infty^*)^{1/2} + \hat B_i\big], \quad \text{for } i\in\{1,2\},
$$

$$
\sum_{j=1}^{2}\int_{t}^{\infty}\chi_\infty^j\exp[-r(s-t)]\,ds = 0.
$$
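Because the integrands are constant once the steady state has been reached, the steady-state conditions can be reduced to simple inequalities. The following worked step is a sketch under the assumption that $x^*(s) = x_\infty^*$ for all $s \ge t$ and that $r > 0$; the symbol $\pi_i^\infty$ is introduced here only for illustration and does not appear in the text.

```latex
\pi_i^\infty := \psi_i^*(x_\infty^*)^{1/2} - \frac{c_i}{(x_\infty^*)^{1/2}}\,\psi_i^*(x_\infty^*),
\qquad
W^i(x_\infty^*) = \int_t^\infty \pi_i^\infty\, e^{-r(s-t)}\,ds = \frac{\pi_i^\infty}{r},

\xi^i(x_\infty^*) = \int_t^\infty \big[\pi_i^\infty + \chi_\infty^i\big] e^{-r(s-t)}\,ds
                  = \frac{\pi_i^\infty + \chi_\infty^i}{r},

\xi^i(x_\infty^*) \ge \hat V^i(x_\infty^*)
\iff \chi_\infty^i \ge r\,\hat V^i(x_\infty^*) - \pi_i^\infty,
\qquad
\sum_{j=1}^{2}\int_t^\infty \chi_\infty^j\, e^{-r(s-t)}\,ds = 0
\iff \chi_\infty^1 + \chi_\infty^2 = 0.
```

Adding the two inequalities shows that such a pair of constant transfers exists exactly when $\pi_1^\infty + \pi_2^\infty \ge r[\hat V^1(x_\infty^*) + \hat V^2(x_\infty^*)]$, that is, when the steady-state joint cooperative payoff is at least the sum of the noncooperative values.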
3.4 Cooperative Economic Games Satisfying Individual Rationality and Group Optimality

Cooperative games suggest the possibility of socially optimal and group efficient solutions to decision problems involving strategic action. As discussed above, individual rationality and group optimality are essential elements of a cooperative game solution. Dockner and Jørgensen (1984), Dockner and Long (1993), Tahvonen (1994), Mäler and de Zeeuw (1998), and Rubio and Casino (2002) presented cooperative solutions satisfying group optimality in differential games. The majority of cooperative differential games adopt solutions satisfying the essential criteria for dynamic stability, namely group optimality and individual rationality. Haurie and Zaccour (1986, 1991), Kaitala and Pohjola (1988, 1990, 1995), Kaitala et al. (1995), and Jørgensen and Zaccour (2001) presented classes of transferable-payoff cooperative differential games with solutions that satisfy group optimality and individual rationality. In the following sections, cooperative economic games satisfying individual rationality and group optimality are presented.
3.4.1 Cooperative Resource Extraction Game

Consider the resource extraction game presented in Sects. 3.2 and 3.3. The growth rate of the fish biomass is characterized by the differential equation

$$
\dot x(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(t_0) = x_0 \in X, \tag{3.67}
$$

where $u_i\in U_i$ is the (nonnegative) amount of fish harvested by nation $i$, for $i\in\{1,2\}$; $a$ and $b$ are positive constants.

The harvesting cost for firm $i\in\{1,2\}$ depends on the quantity of resource extracted $u_i(s)$, the resource stock size $x(s)$, and a parameter $c_i$. In particular, firm $i$’s extraction cost can be specified as $c_i u_i(s) x(s)^{-1/2}$. The fish harvested by nation $i$ at time $s$ will generate a net benefit of the amount $[u_i(s)]^{1/2}$. The horizon of concern is $[t_0,T]$. At time $T$, nation $i$ will receive a termination bonus $q_i x(T)^{1/2}$, where $q_i$ is nonnegative. There exists a discount rate $r$, and profits received at time $t$ have to be discounted by the factor $\exp[-r(t-t_0)]$. At time $t_0$ the payoff of nation $i\in\{1,2\}$ is

$$
\int_{t_0}^{T}\Big[u_i(s)^{1/2} - \frac{c_i}{x(s)^{1/2}}\,u_i(s)\Big]\exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q_i\,x(T)^{1/2}. \tag{3.68}
$$

The game structure is a deterministic version of an example in Yeung and Petrosyan (2004). According to Proposition 3.1, the value function of agent $i\in\{1,2\}$ is

$$
V^{(t_0)i}(t,x) = \exp[-r(t-t_0)]\big[A_i(t)x^{1/2} + B_i(t)\big], \tag{3.69}
$$

where for $i,j\in\{1,2\}$ and $i\neq j$, $A_i(t)$, $B_i(t)$, $A_j(t)$, and $B_j(t)$ satisfy (3.40). According to (3.45), the value functions $V^{(\tau)i}(\tau,x_\tau^*)$, for $i\in\{1,2\}$, are

$$
V^{(\tau)i}(\tau,x_\tau^*) = \big[A_i(\tau)(x_\tau^*)^{1/2} + B_i(\tau)\big]. \tag{3.70}
$$
The firms agree to cooperate and seek to solve the following joint profit maximization problem to achieve a Pareto optimum:

$$
\max_{u_1,u_2}\int_{t_0}^{T}\Big\{\Big[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\,u_1(s)\Big] + \Big[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\,u_2(s)\Big]\Big\}\exp[-r(s-t_0)]\,ds + 2\exp[-r(T-t_0)]\,q\,x(T)^{1/2}, \tag{3.71}
$$

subject to (3.67). According to (3.13) the maximized value function under cooperation is obtained as

$$
W^{(t_0)}(t,x) = \exp[-r(t-t_0)]\big[\hat A(t)x^{1/2} + \hat B(t)\big], \tag{3.72}
$$

with $\hat A(t)$ and $\hat B(t)$ satisfying the corresponding differential equations in (3.13). The optimal cooperative controls can then be obtained as

$$
\psi_1^*(t,x) = \frac{x}{4[c_1 + \hat A(t)/2]^2} \quad \text{and} \quad \psi_2^*(t,x) = \frac{x}{4[c_2 + \hat A(t)/2]^2},
$$

with the optimal cooperative state trajectory $\{x^*(s)\}_{s=t_0}^{T}$ given in (3.15). Under cooperation, firm $i$ would derive a payoff

$$
W^{(t_0)i}(t_0,x_0) = \int_{t_0}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s))\Big]\exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q_i\,x^*(T)^{1/2}, \tag{3.73}
$$
where

$$
\psi_i^*(s,x^*(s)) = \frac{x^*(s)}{4[c_i + \hat A(s)/2]^2}, \quad \text{for } i\in\{1,2\}.
$$

The firms decide to share the excess gain from cooperation equally. There can be different methods of payment to achieve this.

(i) A Lump-sum Transfer at the Outset

First consider the case when a lump-sum transfer is arranged at the outset of the game. Given that the firms agree to share the excess gain from cooperation equally,

$$
\xi^{(t_0)i}(t_0,x_0) = V^{(t_0)i}(t_0,x_0) + \frac{1}{2}\Big[W^{(t_0)}(t_0,x_0) - \sum_{j=1}^{2}V^{(t_0)j}(t_0,x_0)\Big]
$$

$$
= \frac{1}{2}\Big\{\big[\hat A(t_0)x_0^{1/2} + \hat B(t_0)\big] + \big[A_i(t_0)x_0^{1/2} + B_i(t_0)\big] - \big[A_j(t_0)x_0^{1/2} + B_j(t_0)\big]\Big\}, \tag{3.74}
$$

for $i,j\in\{1,2\}$ and $j\neq i$. Since firm $i$’s receipt under cooperation is $W^{(t_0)i}(t_0,x_0)$, a lump-sum transfer $\bar\chi^i$ has to be made to agent $i$ to achieve $\xi^{(t_0)i}(t_0,x_0)$. In particular,

$$
\bar\chi^i = \xi^{(t_0)i}(t_0,x_0) - W^{(t_0)i}(t_0,x_0), \quad \text{for } i\in\{1,2\}.
$$
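The arithmetic of the equal-split rule in (3.74) and the resulting lump-sum transfers can be sketched as follows. This is an illustration only: the function name `equal_split_lump_sum` is hypothetical and the numbers in the usage lines are made up. The transfers sum to zero whenever the individual cooperative receipts add up to the joint payoff.

```python
def equal_split_lump_sum(W_joint, V, W_coop):
    """Equal sharing of the excess gain from cooperation.

    W_joint : joint cooperative payoff W(t0, x0)
    V       : list of noncooperative values V^i(t0, x0)
    W_coop  : list of individual cooperative receipts W^i(t0, x0)
    Returns the imputations xi^i and the lump-sum transfers chi_bar^i.
    """
    excess = W_joint - sum(V)                      # gain over noncooperation
    xi = [v + excess / len(V) for v in V]          # xi^i = V^i + excess / 2 for two firms
    chi_bar = [x - w for x, w in zip(xi, W_coop)]  # chi_bar^i = xi^i - W^i
    return xi, chi_bar

# Made-up numbers: joint payoff 10, noncooperative values 3 and 4,
# cooperative receipts 6 and 4 (which add up to the joint payoff).
xi, chi_bar = equal_split_lump_sum(10.0, [3.0, 4.0], [6.0, 4.0])
```

In this example firm 1 receives more under cooperation than its agreed imputation, so its transfer is negative and firm 2 receives the offsetting amount.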
Note that $\sum_{j=1}^{2}\bar\chi^j = 0$. Group optimality is satisfied and individual rationality holds at time $t_0$.

(ii) Continuous Instantaneous Transfer Satisfying Individual Rationality at the Outset

Now we consider the case of continuous instantaneous transfer payments satisfying individual rationality at the outset. Let $\chi^i(s)$ denote the instantaneous transfer payment allocated to agent $i$ at time $s\in[t_0,T]$. The chosen $\chi^i(s)$ must satisfy

$$
\xi^{(t_0)i}(t_0,x_0) = \frac{1}{2}\Big\{\big[\hat A(t_0)x_0^{1/2} + \hat B(t_0)\big] + \big[A_i(t_0)x_0^{1/2} + B_i(t_0)\big] - \big[A_j(t_0)x_0^{1/2} + B_j(t_0)\big]\Big\}
$$

$$
= \int_{t_0}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s)) + \chi^i(s)\Big]\exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q_i\,x^*(T)^{1/2}, \tag{3.75}
$$

for $i\in\{1,2\}$; and

$$
\sum_{j=1}^{2}\int_{t_0}^{T}\chi^j(s)\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds = 0.
$$
Once again, group optimality is satisfied and individual rationality holds at time $t_0$.

(iii) Transfers Satisfying Individual Rationality Throughout

Now we consider the case of continuous instantaneous transfer payments satisfying individual rationality throughout the cooperative period. Given that an instantaneous transfer payment $\chi^i(s)$ is allocated to agent $i$ at time $s\in[t_0,T]$, the imputation to agent $i$ over the period $[\tau,T]$ as viewed at time $\tau\in[t_0,T]$ can be expressed as

$$
\xi^{(\tau)i}(\tau,x_\tau^*) = \int_{\tau}^{T}\Big[\psi_i^*(s,x^*(s))^{1/2} - \frac{c_i}{x^*(s)^{1/2}}\,\psi_i^*(s,x^*(s)) + \chi^i(s)\Big]\exp[-r(s-\tau)]\,ds + \exp[-r(T-\tau)]\,q_i\,x^*(T)^{1/2}
$$

$$
\ge V^{(\tau)i}(\tau,x_\tau^*) = A_i(\tau)(x_\tau^*)^{1/2} + B_i(\tau), \tag{3.76}
$$

for $i\in\{1,2\}$, and

$$
\sum_{j=1}^{2}\int_{t_0}^{T}\chi^j(s)\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds = 0.
$$

Any set of chosen instantaneous transfer payments $\chi^i(s)$, for $s\in[t_0,T]$, satisfying (3.76) will ensure group optimality and individual rationality throughout the cooperation period.
3.4.2 Fully Coordinated Pollution Control

Dockner and Long (1993) presented a differential game of international pollution control. There are two nations and each nation produces a single consumption good with a given fixed factor of production and a given technology. Let the quantity of the good produced at time $s$ be denoted by $Q_i(s)$, for $i\in\{1,2\}$. The production of a unit of the consumption good results in $\varepsilon_i(s)$ amount of pollutants. An emission consumption trade-off function (see Forster 1973, 1975) states that

$$
Q_i(s) = f^i\big[\varepsilon_i(s)\big], \quad \text{for } i\in\{1,2\}. \tag{3.77}
$$

The function $f^i[\varepsilon_i(s)]$ is strictly concave in $\varepsilon_i(s)$ and satisfies $f^i[0] = 0$. Let $x(s)$ denote the level of pollution stock at time $s$. The pollution accumulation dynamics is governed by the kinematic equation

$$
\dot x(s) = \varepsilon_1(s) + \varepsilon_2(s) - \delta x(s), \qquad x(0) = x_0, \tag{3.78}
$$

where $\delta$ is the rate of natural purification.

In each nation there are $n$ identical consumers. The representative consumer in nation $i$ derives utility from consuming $q_i(s) = Q_i(s)/n$ and faces the costs of the polluted environment $c_i[x(s)]$. Consumer preference $U^i[Q_i(s)/n]$ is assumed to be strictly concave and the cost function $c_i[x(s)]$ is strictly convex. The net benefits of the representative consumer in nation $i$ are given by

$$
U^i\big[Q_i(s)/n\big] - c_i\big[x(s)\big] \equiv U^i\big[f^i[\varepsilon_i(s)]/n\big] - c_i\big[x(s)\big]. \tag{3.79}
$$

The objective of government $i$ is to choose a pollution control strategy $\varepsilon_i(s)$, or equivalently, an output strategy that maximizes the discounted stream of net benefits from consumption of a representative consumer, that is,

$$
\max_{\varepsilon_i}\int_{0}^{\infty}\Big\{U^i\big[f^i[\varepsilon_i(s)]/n\big] - c_i\big[x(s)\big]\Big\}\exp(-rs)\,ds, \quad \text{for } i\in\{1,2\}, \tag{3.80}
$$
subject to the pollutant accumulation dynamics in (3.78).

The outcome of a cooperative game is interpreted as the scenario in which the nations are able to reach a pollution control agreement (they coordinate their own control efforts) leading to a Pareto optimum. It is used as a reference scenario yielding a first-best solution. An explicit first-best solution is characterized with the normalization of $n$ to unity and the specification of the functional forms of preferences and technologies as

$$
c_i\big[x(s)\big] = \frac{c}{2}\,x(s)^2 \quad \text{and} \quad U^i\big[f^i[\varepsilon_i(s)]/n\big] = A\varepsilon_i(s) - \frac{1}{2}\,\varepsilon_i(s)^2,
$$

where $A > 0$. In particular, a first-best solution can be obtained by solving the optimization problem

$$
\max_{\varepsilon_1,\varepsilon_2}\int_{0}^{\infty}\sum_{j=1}^{2}\Big[A\varepsilon_j(s) - \frac{1}{2}\,\varepsilon_j(s)^2 - \frac{c}{2}\,x(s)^2\Big]\exp(-rs)\,ds, \tag{3.81}
$$

subject to (3.78). To solve the optimization problem in (3.78) and (3.81) we invoke Theorem 3.2 to characterize the solution as follows. A set of controls $\{\psi_1^*(x), \psi_2^*(x)\}$ provides a solution to the optimal control problem in (3.78) and (3.81) if there exists a continuously differentiable function $W(x): R\to R$ satisfying the infinite-horizon Bellman equation

$$
rW(x) = \max_{\varepsilon_1,\varepsilon_2}\Big\{\sum_{j=1}^{2}\Big[A\varepsilon_j - \frac{1}{2}\,\varepsilon_j^2 - \frac{c}{2}\,x^2\Big] + W_x(x)\big[\varepsilon_1 + \varepsilon_2 - \delta x\big]\Big\}. \tag{3.82}
$$

Performing the maximization operation in (3.82) yields

$$
\psi_1^*(x) = A + W_x(x) \quad \text{and} \quad \psi_2^*(x) = A + W_x(x). \tag{3.83}
$$

Substituting (3.83) into the Bellman equation yields

$$
rW(x) = 2A\big[A + W_x(x)\big] - \big[A + W_x(x)\big]^2 - cx^2 + W_x(x)\big\{2\big[A + W_x(x)\big] - \delta x\big\}.
$$

Upon the cancellation of terms we have

$$
rW(x) = A^2 + 2AW_x(x) + \big[W_x(x)\big]^2 - cx^2 - \delta x W_x(x). \tag{3.84}
$$
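Proposition 3.3

$$
W(x) = \frac{1}{2}\alpha x^2 + \beta x + \gamma, \tag{3.85}
$$

where

$$
\alpha = -\frac{1}{2}\Big[\Big(\Big(\delta + \frac{r}{2}\Big)^2 + 4c\Big)^{1/2} - \Big(\delta + \frac{r}{2}\Big)\Big] < 0, \qquad \beta = \frac{2A\alpha}{r + \delta - 2\alpha} < 0, \quad \text{and} \quad \gamma = \frac{(-\beta - A)^2}{r} > 0. \tag{3.86}
$$

Proof Substituting $W(x)$ and $W_x(x)$ from (3.85) into (3.84) allows the Bellman equation (3.82) to be expressed as

$$
r\gamma + r\beta x + \frac{r\alpha}{2}x^2 = A^2 + 2A\alpha x + 2A\beta + \alpha^2x^2 + 2\alpha\beta x + \beta^2 - cx^2 - \alpha\delta x^2 - \beta\delta x.
$$

Grouping terms together, one obtains

$$
\Big[\frac{1}{2}r\alpha - \alpha^2 + c + \alpha\delta\Big]x^2 + \big(r\beta - 2A\alpha - 2\alpha\beta + \beta\delta\big)x + r\gamma - A^2 - 2A\beta - \beta^2 = 0. \tag{3.87}
$$

For (3.87) to hold for every $x$, it is required that

$$
\frac{1}{2}r\alpha - \alpha^2 + c + \alpha\delta = 0, \qquad r\beta - 2A\alpha - 2\alpha\beta + \beta\delta = 0, \quad \text{and} \quad r\gamma - A^2 - 2A\beta - \beta^2 = 0. \tag{3.88}
$$

Solving (3.88) yields (3.86). Hence Proposition 3.3 follows.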
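The coefficients in Proposition 3.3 can be checked numerically by substituting the quadratic value function back into the reduced Bellman equation (3.84). The sketch below is illustrative: the function names are hypothetical and the parameter values in the usage lines are placeholders chosen only to exercise the formulas.

```python
import math

def quadratic_value_coefficients(A, c, r, delta):
    """alpha, beta, gamma of W(x) = alpha x^2 / 2 + beta x + gamma from (3.86)."""
    m = delta + r / 2.0
    alpha = -0.5 * (math.sqrt(m * m + 4.0 * c) - m)    # negative root of the x^2 condition
    beta = 2.0 * A * alpha / (r + delta - 2.0 * alpha)
    gamma = (A + beta) ** 2 / r
    return alpha, beta, gamma

def bellman_residual(x, A, c, r, delta, alpha, beta, gamma):
    """Left side minus right side of (3.84) at pollution stock x."""
    W = 0.5 * alpha * x * x + beta * x + gamma
    Wx = alpha * x + beta
    return r * W - (A * A + 2.0 * A * Wx + Wx * Wx - c * x * x - delta * x * Wx)

# Placeholder parameter values.
A, c, r, delta = 10.0, 1.0, 0.05, 0.1
alpha, beta, gamma = quadratic_value_coefficients(A, c, r, delta)
residuals = [bellman_residual(x, A, c, r, delta, alpha, beta, gamma)
             for x in (0.0, 0.5, 2.0)]
```

A residual that vanishes (up to rounding) at several stock levels indicates that all three coefficient conditions in (3.88) hold simultaneously.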
Using (3.85) and (3.83) the cooperative emission controls can be obtained as

$$
\psi_1^*(x) = A + \alpha x + \beta \quad \text{and} \quad \psi_2^*(x) = A + \alpha x + \beta. \tag{3.89}
$$

Substituting these controls into (3.78) yields the optimal pollution accumulation dynamics under cooperation

$$
\dot x(s) = 2\big[A + \alpha x(s) + \beta\big] - \delta x(s), \qquad x(0) = x_0. \tag{3.90}
$$

Let $x^*(s) = x_s^*$, for $s\in[t_0,\infty)$, denote the solution to (3.90). The net benefits to nation $i$ under cooperation can be obtained as

$$
W^i(x_0) = \int_{0}^{\infty}\Big\{A\big[A + \alpha x^*(s) + \beta\big] - \frac{1}{2}\big[A + \alpha x^*(s) + \beta\big]^2 - \frac{c}{2}\,x^*(s)^2\Big\}\exp(-rs)\,ds
$$

$$
= \frac{1}{2}W(x_0) = \frac{1}{4}\alpha x_0^2 + \frac{1}{2}\beta x_0 + \frac{1}{2}\gamma, \quad \text{for } i\in\{1,2\}. \tag{3.91}
$$
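Since (3.90) is a linear differential equation, its solution is $x^*(s) = \bar x + (x_0 - \bar x)\exp[-(\delta - 2\alpha)s]$ with $\bar x = 2(A+\beta)/(\delta - 2\alpha)$, and the equality in (3.91) can be checked by integrating the discounted net benefits along this path. The following is a rough numerical sketch, assuming placeholder parameter values, a truncated integration horizon, and a simple trapezoid rule; none of these choices come from the text.

```python
import math

A, c, r, delta, x0 = 10.0, 1.0, 0.05, 0.1, 0.5     # placeholder values

m = delta + r / 2.0
alpha = -0.5 * (math.sqrt(m * m + 4.0 * c) - m)
beta = 2.0 * A * alpha / (r + delta - 2.0 * alpha)
gamma = (A + beta) ** 2 / r

k = delta - 2.0 * alpha              # decay rate of the closed-loop dynamics (3.90)
x_bar = 2.0 * (A + beta) / k         # stationary pollution stock of (3.90)

def x_star(s):
    """Closed-form solution of (3.90) starting from x0."""
    return x_bar + (x0 - x_bar) * math.exp(-k * s)

def discounted_flow(s):
    """Integrand of (3.91): one nation's net benefit discounted to time 0."""
    x = x_star(s)
    eps = A + alpha * x + beta       # cooperative emission level from (3.89)
    return (A * eps - 0.5 * eps ** 2 - 0.5 * c * x ** 2) * math.exp(-r * s)

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# The infinite horizon is truncated at s = 400, where exp(-r s) is negligible.
numeric = trapezoid(discounted_flow, 0.0, 400.0, 40000)
closed_form = 0.25 * alpha * x0 ** 2 + 0.5 * beta * x0 + 0.5 * gamma
```

The two numbers agree only up to discretization and truncation error, so any comparison should use a tolerance rather than exact equality.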
Moreover, along the cooperative trajectory $\{x_t^*\}_{t\in[0,\infty)}$, the net benefits to nation $i$ under cooperation, valued at time $t$, can be obtained as

$$
W^i(x_t^*) = \int_{t}^{\infty}\Big\{A\big[A + \alpha x^*(s) + \beta\big] - \frac{1}{2}\big[A + \alpha x^*(s) + \beta\big]^2 - \frac{c}{2}\,x^*(s)^2\Big\}\exp[-r(s-t)]\,ds
$$

$$
= \frac{1}{2}W(x_t^*) = \frac{1}{4}\alpha (x_t^*)^2 + \frac{1}{2}\beta x_t^* + \frac{1}{2}\gamma, \quad \text{for } i\in\{1,2\}. \tag{3.92}
$$

Given that nations are symmetrical, splitting the cooperative gains equally would guarantee individual rationality throughout the cooperation period because

$$
W(x_t^*) \ge \hat V^1(x_t^*) + \hat V^2(x_t^*) \quad \text{and} \quad W^i(x_t^*) = \frac{1}{2}W(x_t^*) \ge \hat V^i(x_t^*), \quad \text{for } i\in\{1,2\} \text{ and } t\in[0,\infty).
$$

As mentioned in Remark 3.2, even without transfer payments, the individual rationality of identical agents can be maintained.
3.5 Exercises

3.1 Consider the resource stock $x(s)\in X\subset R$, which follows the dynamics

$$
\dot x(s) = 40x(s)^{1/2} - 2x(s) - u_1(s) - u_2(s), \qquad x(0) = 50,
$$

where $u_1(s)$ is the harvest rate of economic agent 1 and $u_2(s)$ is the harvest rate of economic agent 2. The extractors are entitled to harvest the resource in the period $[0,4]$. The instantaneous payoffs at time $s\in[0,4]$ for agents 1 and 2 are, respectively,

$$
u_1(s)^{1/2} - \frac{2}{x(s)^{1/2}}\,u_1(s) \quad \text{and} \quad u_2(s)^{1/2} - \frac{1}{x(s)^{1/2}}\,u_2(s).
$$

At terminal time 4, each agent will receive a termination bonus equaling $6x(4)^{1/2}$, which depends on the resource remaining at the terminal time. Payoffs are transferable between agents 1 and 2 and over time; the discount rate is 0.05. Consider the case when these two firms agree to cooperate and maximize the sum of their payoffs

$$
\int_{0}^{4}\Big\{\Big[u_1(s)^{1/2} - \frac{2}{x(s)^{1/2}}\,u_1(s)\Big] + \Big[u_2(s)^{1/2} - \frac{1}{x(s)^{1/2}}\,u_2(s)\Big]\Big\}\exp(-0.05s)\,ds + 2\exp[-0.05(4)]\,6x(4)^{1/2},
$$

subject to the resource dynamics above. Derive the optimal cooperative strategies and the optimal trajectory path of the resource stock.

3.2 Solve for a feedback equilibrium solution of the resource extraction game in Exercise 3.1.

3.3 The agents agree to cooperate and share the excess gain over the noncooperative profits equally. They also agree to distribute the excess gain at the end of the game. Compute the transfers.
Chapter 4
Time Consistency and Optimal-Trajectory-Subgame Consistent Economic Optimization
The noncooperative games discussed in Chap. 2 fail to reflect all the facets of optimal behavior in n-person market games. In particular, equilibria in noncooperative games do not take into consideration Pareto efficiency or group optimality. Chapter 3 considers cooperation in economic optimization and it is shown that group optimality and individual rationality are two essential properties for cooperation. However, merely satisfying group optimality and individual rationality does not necessarily bring about a dynamically stable solution in economic cooperation because there is no guarantee that the agreed-upon optimality principle is fulfilled throughout the cooperative period. In this chapter we consider dynamically stable economic optimization. The formulation of a solution for dynamic economic optimization is given in Sect. 4.1. The principle of time consistency and the characterization of time consistent solutions are provided in Sect. 4.2. The derivation of payoff distribution procedures leading to a time consistent solution is investigated in Sect. 4.3. Section 4.4 depicts solutions from specific optimality principles and Sect. 4.5 presents an illustration in the cooperative harvesting of a fishery. The analysis is extended to an infinite-horizon framework in Sect. 4.6 and an example of optimizing infinite-horizon resource extraction is presented in Sect. 4.7.
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_4, © Springer Science+Business Media, LLC 2012

4.1 Solution in Dynamic Economic Optimization

The formulation of optimal behaviors for participating agents is a fundamental element in the theory of cooperative games. The agents’ behaviors satisfying the agreed-upon optimality principles constitute a solution of the game. In other words, the solution of a cooperative economic game is generated by a set of optimality principles.

Consider again the situation when economic agents agree to optimize cooperatively in a dynamic context. Let $\Gamma_c(x_0, T-t_0)$ denote a cooperative game in which agent $i$’s payoff is

$$
\int_{t_0}^{T} g^i\big[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\big]\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds + \exp\Big[-\int_{t_0}^{T} r(y)\,dy\Big]\,q^i\big(x(T)\big), \quad \text{for } i\in N, \tag{4.1}
$$

and the dynamics of the state is

$$
\dot x(s) = f\big[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)\big], \qquad x(t_0) = x_0. \tag{4.2}
$$
The participating agents agree to act according to an agreed-upon optimality principle. The solution generated by the agreed-upon optimality principle includes agreements on how to act cooperatively and allocate the cooperative payoff. Let there be an optimality principle agreed upon by all agents in the cooperative game $\Gamma_c(x_0, T-t_0)$. Based on this optimality principle, the solution $P(x_0, T-t_0)$ of the game $\Gamma_c(x_0, T-t_0)$ at time $t_0$ includes the following.

(i) A set of cooperative strategies $u^{(t_0)*}(s) = [u_1^{(t_0)*}(s), u_2^{(t_0)*}(s), \ldots, u_n^{(t_0)*}(s)]$, for $s\in[t_0,T]$.
(ii) An imputation vector $\xi^{(t_0)}(t_0,x_0) = [\xi^{(t_0)1}(t_0,x_0), \xi^{(t_0)2}(t_0,x_0), \ldots, \xi^{(t_0)n}(t_0,x_0)]$ to allocate the cooperative payoff to the agents.
(iii) A payoff distribution procedure $B^{t_0}(s) = [B_1^{t_0}(s), B_2^{t_0}(s), \ldots, B_n^{t_0}(s)]$ for $s\in[t_0,T]$, where $B_i^{t_0}(s)$ is the instantaneous payment for agent $i$ at time $s$. In particular,

$$
\xi^{(t_0)i}(t_0,x_0) = \int_{t_0}^{T} B_i^{t_0}(s)\exp\Big[-\int_{t_0}^{s} r(y)\,dy\Big]\,ds + q^i(x_T)\exp\Big[-\int_{t_0}^{T} r(y)\,dy\Big], \tag{4.3}
$$

for $i\in N$.

This means that the agents agree at the outset on a set of cooperative strategies $u^{(t_0)*}(s)$, an imputation $\xi^{(t_0)i}(t_0,x_0)$ of the gains to the $i$th agent covering the time interval $[t_0,T]$, and a payoff distribution procedure $\{B^{t_0}(s)\}_{s=t_0}^{T}$ to allocate payments to the agents over the game interval.

Using the agreed-upon cooperative strategies the state evolves according to the state dynamics

$$
\dot x(s) = f\big[s, x(s), u_1^{(t_0)*}(s), u_2^{(t_0)*}(s), \ldots, u_n^{(t_0)*}(s)\big], \qquad x(t_0) = x_0. \tag{4.4}
$$

The solution to (4.4) yields the conditional optimal trajectory, which is denoted by $\{x^c(s)\}_{s=t_0}^{T}$. For notational convenience we use $x^c(s)$ and $x_s^c$ interchangeably.
When time $t \in (t_0, T]$ has arrived, the situation becomes a cooperative game in which economic agent $i$'s payoff is

$$\int_{t}^{T} g^i[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)] \exp\left[-\int_{t}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t}^{T} r(y)\,dy\right] q^i(x(T)), \quad \text{for } i \in N, \tag{4.5}$$

and the evolutionary dynamics of the state is

$$\dot{x}(s) = f[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)], \quad x(t) = x_t^c. \tag{4.6}$$
We use $\Gamma_c(x_t^c, T - t)$ to denote a cooperative game in which economic agent $i$'s objective is (4.5) with the state dynamics in (4.6). At time $t \in (t_0, T]$ when the state is $x_t^c$, according to the agreed-upon optimality principle, the solution $P(x_t^c, T - t)$ of the game $\Gamma_c(x_t^c, T - t)$ at time $t$ includes the following.

(i) A set of cooperative strategies $u^{(t)*}(s) = [u_1^{(t)*}(s), u_2^{(t)*}(s), \ldots, u_n^{(t)*}(s)]$, for $s \in [t, T]$.
(ii) An imputation vector $\xi^{(t)}(t, x_t^c) = [\xi^{(t)1}(t, x_t^c), \xi^{(t)2}(t, x_t^c), \ldots, \xi^{(t)n}(t, x_t^c)]$ to allocate the cooperative payoff to the agents.
(iii) A payoff distribution procedure $B^t(s) = [B_1^t(s), B_2^t(s), \ldots, B_n^t(s)]$ for $s \in [t, T]$, where $B_i^t(s)$ is the instantaneous payment to agent $i$ at time $s$. In particular,

$$\xi^{(t)i}(t, x_t^c) = \int_{t}^{T} B_i^t(s) \exp\left[-\int_{t}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{t}^{T} r(y)\,dy\right], \tag{4.7}$$

for $i \in N$ and $t \in [t_0, T]$.

This means that under the agreed-upon optimality principle, the agents agree on a set of cooperative strategies $u^{(t)*}(s)$, an imputation of the gains in such a way that the gain under cooperation of the $i$th agent over the time interval $[t, T]$ is equal to $\xi^{(t)i}(t, x_t^c)$, and a payoff distribution procedure $\{B^t(s)\}_{s=t}^{T}$ to allocate payments to the agents over the game interval $[t, T]$.

Let there exist solutions $P(x_t^c, T - t) \neq \emptyset$, $t_0 \le t \le T$, along the conditionally optimal trajectory $\{x^c(t)\}_{t=t_0}^{T}$. If this condition is not satisfied, it is impossible for the agents to adhere to the chosen principle of optimality, since at the very first instant $t$ when $P(x_t^c, T - t) = \emptyset$ the agents cannot follow this optimality principle.

For $\xi^{(t)}(t, x_t^c)$, $t \in [t_0, T]$, to be valid imputations, both group optimality and individual rationality must be satisfied. Hence a valid optimality principle would yield a solution $P(x_t^c, T - t)$ which contains
(i) $\sum_{j=1}^{n} \xi^{(t)j}(t, x_t^c) = W^{(t)}(t, x_t^c)$, for $t \in [t_0, T]$, and
(ii) $\xi^{(t)i}(t, x_t^c) \ge V^{(t)i}(t, x_t^c)$, for $i \in N$ and $t \in [t_0, T]$.

As discussed in Chap. 3, part (i) above guarantees group optimality, which yields the highest joint payoffs for the participating agents. Part (ii) yields individual rationality so that the payoff allocated to an economic agent under cooperation will be no less than its noncooperative payoff. The failure to guarantee group optimality and individual rationality would lead to the condition where participants will reject the agreed-upon optimality principle and play noncooperatively.
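The two requirements can be checked mechanically for any proposed imputation at a given time and state. The helper below is a minimal sketch; the name `is_valid_imputation`, the tolerance, and the sample numbers are illustrative assumptions.

```python
def is_valid_imputation(xi, W, V, tol=1e-9):
    """Check parts (i) and (ii) for one time instant.

    xi: proposed imputations, W: total cooperative payoff,
    V: noncooperative payoffs, in the same agent order as xi.
    """
    group_optimal = abs(sum(xi) - W) <= tol                      # part (i)
    individually_rational = all(x >= v - tol for x, v in zip(xi, V))  # part (ii)
    return group_optimal and individually_rational

ok = is_valid_imputation([6.0, 4.0], W=10.0, V=[5.0, 3.0])
wastes_payoff = is_valid_imputation([5.0, 4.0], W=10.0, V=[5.0, 3.0])
hurts_agent_2 = is_valid_imputation([8.0, 2.0], W=10.0, V=[5.0, 3.0])
```

Only the first vector passes: the second does not exhaust the cooperative payoff and the third leaves agent 2 below its noncooperative payoff.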
4.2 Principle of Time Consistency

To ensure stability in dynamic cooperation over time a stringent condition is required: the specific agreed-upon optimality principle must be maintained at any instant of time throughout the game along the optimal state trajectory. This condition is known as time consistency. Assume that at the start of the game the agents execute the solution $P(x_0, T - t_0)$ generated by an agreed-upon optimality principle (which includes a set of cooperative strategies, an imputation to distribute the cooperative payoff, and a payoff distribution procedure). When the game proceeds to time $t$, the continuation of the scheme in $P(x_0, T - t_0)$ has to be consistent with the solution $P(x_t^c, T - t)$ to the game $\Gamma_c(x_t^c, T - t)$ under the same optimality principle. If this consistency condition is violated, some of the agents will have an incentive to deviate from the initially chosen trajectory. If this happens, instability arises. In particular, the dynamic stability of a solution of a cooperative differential game is the property that, when the game proceeds along the cooperative state trajectory, at each instant of time the agents are guided by the same optimality principle; therefore, they do not have any incentive to deviate from the previously adopted optimal behavior.

The question of time consistency or dynamic stability in differential games has been explored rigorously in the past three decades. Haurie (1976) discussed the problem of dynamic instability in extending the Nash bargaining solution to differential games. Petrosyan (1977) formalized mathematically the notion of dynamic stability in solutions of differential games. Petrosyan and Danilov (1979, 1982) introduced the notion of "imputation distribution procedure" for a cooperative solution. Tolwinski et al. (1986) considered cooperative equilibria in differential games in which memory-dependent strategies and threats are introduced to maintain the agreed-upon control path.
Petrosyan and Zenkevich (1996) and Petrosyan (1997) provided a detailed analysis of dynamic stability in cooperative differential games. In particular, the method of regularization was introduced to construct time consistent solutions. Yeung and Petrosyan (2001) designed a time consistent solution in differential games and characterized the conditions that the allocation distribution procedure must satisfy. Petrosyan (1995, 2003) employed the regularization method to construct time consistent bargaining procedures. Yeung and Petrosyan (2006a) developed a generalized method for the derivation of analytically tractable time consistent solutions for games with transferable payoffs.
4.2.1 Characterization of Time Consistent Solution

Let there be an optimality principle agreed upon by all agents in the cooperative game $\Gamma_c(x_0, T - t_0)$. At time $t_0$, the solution generated by this optimality principle is $P(x_0, T - t_0)$. At time $t \in (t_0, T]$ when the state is $x_t^c$, according to the agreed-upon optimality principle, the solution of the game at time $t$ with state $x_t^c$ is $P(x_t^c, T - t)$. A cooperative game $\Gamma_c(x_0, T - t_0)$ has a time consistent solution $P(x_0, T - t_0)$ if the continuation of the scheme from the solution $P(x_0, T - t_0) = \{u^{(t_0)*}(s) \text{ and } B^{t_0}(s) \text{ for } s \in [t_0, T];\ \xi^{(t_0)}(t_0, x_0)\}$ over the time period $[t, T]$ coincides with the solution $P(x_t^c, T - t) = \{u^{(t)*}(s) \text{ and } B^{t}(s) \text{ for } s \in [t, T];\ \xi^{(t)}(t, x_t^c)\}$ generated by the same agreed-upon optimality principle at any time instant $t \in [t_0, T]$ along the conditional optimal trajectory $\{x_s^c\}_{s=t_0}^{T}$. If the two do not coincide, there is no guarantee that the agents will not abandon the solution $P(x_0, T - t_0)$ and switch to $P(x_t^c, T - t)$. Dynamical instability would arise as participants found that their agreed-upon optimality principle could not be maintained after cooperation has gone on for some time.

To verify whether the solution $P(x_0, T - t_0) = \{u^{(t_0)*}(s) \text{ and } B^{t_0}(s) \text{ for } s \in [t_0, T];\ \xi^{(t_0)}(t_0, x_0)\}$ is indeed time consistent, one has to verify whether the agreed-upon cooperative strategies, payoff distribution procedures, and imputations are all time consistent.
4.2.2 Time Consistent Cooperative Strategies

First, we consider the cooperative strategies adopted under the solution $P(x_0, T - t_0)$ generated by the agreed-upon optimality principle. At time $t_0$ when the initial state is $x_0$, the set of cooperative strategies according to $P(x_0, T - t_0)$ is

$$u^{(t_0)*}(s) = [u_1^{(t_0)*}(s), u_2^{(t_0)*}(s), \ldots, u_n^{(t_0)*}(s)], \quad \text{for } s \in [t_0, T].$$

Consider the case when the game has proceeded to time $t$ and the state variable becomes $x_t^c$. Then one has a cooperative game $\Gamma_c(x_t^c, T - t)$, which starts at time $t$ with initial state $x_t^c$. According to the solution $P(x_t^c, T - t)$ generated by the adopted optimality principle, a set of cooperative strategies

$$u^{(t)*}(s) = [u_1^{(t)*}(s), u_2^{(t)*}(s), \ldots, u_n^{(t)*}(s)], \quad \text{for } s \in [t, T],$$

will be adopted.

Definition 4.1 The set of cooperative strategies $u^{(t_0)*}(s) = [u_1^{(t_0)*}(s), u_2^{(t_0)*}(s), \ldots, u_n^{(t_0)*}(s)] \in P(x_0, T - t_0)$ is time consistent if, for $s \in [t, T]$ and $t \in [t_0, T]$,

$$[u_1^{(t_0)*}(s), u_2^{(t_0)*}(s), \ldots, u_n^{(t_0)*}(s)] = [u_1^{(t)*}(s), u_2^{(t)*}(s), \ldots, u_n^{(t)*}(s)] \in P(x_t^c, T - t).$$
If the condition in Definition 4.1 is satisfied at each instant of time $t \in [t_0, T]$ along the conditional optimal trajectory $\{x^c(t)\}_{t=t_0}^{T}$, the continuation of the original cooperative strategies $u^{(t_0)*}(s)$ coincides with the cooperative strategies $u^{(t)*}(s)$ in the cooperative game $\Gamma_c(x_t^c, T - t)$. Hence the set of cooperative strategies $u^{(t_0)*}(s) \in P(x_0, T - t_0)$ is time consistent.

Recall that to ensure group optimality the agents have to maximize their joint payoffs. An optimality principle that requires group optimality will yield a solution $P(x_0, T - t_0)$ which includes the set of cooperative controls that solves the problem

$$\max_{u_1, u_2, \ldots, u_n} \left\{ \int_{t_0}^{T} \sum_{j=1}^{n} g^j[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)] \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \sum_{j=1}^{n} q^j(x(T)) \right\}, \tag{4.8}$$

subject to

$$\dot{x}(s) = f[s, x(s), u_1(s), u_2(s), \ldots, u_n(s)], \quad x(t_0) = x_0. \tag{4.9}$$

A set of group optimal cooperative strategies $\{\psi_i^*(s, x^*(s)),\ \text{for } i \in N \text{ and } s \in [t_0, T]\}$, which solves the problem in (4.8) and (4.9), can be characterized by Theorem 3.1 in Chap. 3. In particular, $\{x^*(t)\}_{t=t_0}^{T}$ is the solution path of the optimal cooperative trajectory

$$\dot{x}(s) = f[s, x(s), \psi_1^*(s, x(s)), \psi_2^*(s, x(s)), \ldots, \psi_n^*(s, x(s))], \quad x(t_0) = x_0.$$

Invoking Remark 3.1 in Chap. 3, one can show that the joint payoff maximizing controls for the cooperative game $\Gamma_c(x_t^*, T - t)$ over the time interval $[t, T]$ are identical to the joint payoff maximizing controls for the cooperative game $\Gamma_c(x_0, T - t_0)$ over the same time interval. Therefore, the solution to an optimality principle that requires group optimality yields a system of time consistent cooperative strategies. Given that group optimality is an essential element in dynamic cooperation, a valid optimality principle will require the maximization of joint payoff and the cooperative strategies $u_i^{(t_0)*}(s) = u_i^{(t)*}(s) = \psi_i^*(s, x^*(s))$, for $i \in N$, $s \in [t, T]$ and $t \in [t_0, T]$. Hence the conditional optimal trajectory $\{x^c(t)\}_{t=t_0}^{T}$ coincides with $\{x^*(t)\}_{t=t_0}^{T}$ in games where the optimality principle requires group optimality.
4.2.3 Time Consistency in Imputation and Payoff Distribution Procedure

Now we consider time consistency in imputation and the payoff distribution procedure. At time $t_0$ when the initial state is $x_0$, according to the solution $P(x_0, T - t_0)$
generated by the agreed-upon optimality principle, the economic agents will use the payoff distribution procedure $\{B^{t_0}(s)\}_{s=t_0}^{T}$ to bring about an imputation to agent $i$ as

$$\xi^{(t_0)i}(t_0, x_0) = \int_{t_0}^{T} B_i^{t_0}(s) \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i(x_T) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right], \tag{4.10}$$

for $i \in N$. When the game proceeds to time $t \in (t_0, T]$, the current state is $x_t^c$. According to the solution $P(x_0, T - t_0)$, agent $i$ will receive an imputation (in the present value viewed at time $t_0$) equaling

$$\xi^{(t_0)i}(t, x_t^c) = \int_{t}^{T} B_i^{t_0}(s) \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right], \tag{4.11}$$

over the time interval $[t, T]$.

At time $t \in (t_0, T]$ when the current state is $x_t^c$, we have a cooperative game $\Gamma_c(x_t^c, T - t)$. According to the solution $P(x_t^c, T - t)$ generated by the agreed-upon optimality principle, the economic agents will use the payoff distribution procedure $\{B^t(s)\}_{s=t}^{T}$ to bring about an imputation to agent $i$ as

$$\xi^{(t)i}(t, x_t^c) = \int_{t}^{T} B_i^{t}(s) \exp\left[-\int_{t}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{t}^{T} r(y)\,dy\right], \tag{4.12}$$

for $i \in N$. For the imputation and payoff distribution procedure from $P(x_0, T - t_0)$ to be consistent with those from $P(x_t^c, T - t)$, it is essential that

$$\exp\left[\int_{t_0}^{t} r(y)\,dy\right] \xi^{(t_0)}(t, x_t^c) = \xi^{(t)}(t, x_t^c) \in P(x_t^c, T - t), \quad \text{for } t \in [t_0, T].$$

In addition, at time $t_0$ when the initial state is $x_0$, according to the solution $P(x_0, T - t_0)$ generated by the agreed-upon optimality principle, the payoff distribution procedure is

$$B^{t_0}(s) = [B_1^{t_0}(s), B_2^{t_0}(s), \ldots, B_n^{t_0}(s)], \quad \text{for } s \in [t_0, T].$$

Consider the case when the game has proceeded to time $t$ and the state variable becomes $x_t^c$. Then one has a cooperative game $\Gamma_c(x_t^c, T - t)$, which starts at time $t$ with initial state $x_t^c$. According to the solution $P(x_t^c, T - t)$ generated by the agreed-upon optimality principle, the payoff distribution procedure

$$B^{t}(s) = [B_1^{t}(s), B_2^{t}(s), \ldots, B_n^{t}(s)], \quad \text{for } s \in [t, T],$$

will be adopted.
For the continuation of the payoff distribution procedure $B^{t_0}(s)$ under $P(x_0, T - t_0)$ to be consistent with $B^{t}(s) \in P(x_t^c, T - t)$, it is required that

$$B^{t_0}(s) = B^{t}(s), \quad \text{for } s \in [t, T] \text{ and } t \in [t_0, T].$$

Therefore, a formal definition can be presented as follows.

Definition 4.2 The imputation and payoff distribution procedure $\{\xi^{(t_0)}(t_0, x_0) \text{ and } B^{t_0}(s), \text{ for } s \in [t_0, T]\} \in P(x_0, T - t_0)$ are time consistent if

(i)
$$\begin{aligned}
\exp\left[\int_{t_0}^{t} r(y)\,dy\right] \xi^{(t_0)i}(t, x_t^c)
&\equiv \exp\left[\int_{t_0}^{t} r(y)\,dy\right] \left\{ \int_{t}^{T} B_i^{t_0}(s) \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \right\} \\
&= \xi^{(t)i}(t, x_t^c) \equiv \int_{t}^{T} B_i^{t}(s) \exp\left[-\int_{t}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{t}^{T} r(y)\,dy\right] \in P(x_t^c, T - t),
\end{aligned} \tag{4.13}$$
for $i \in N$ and $t \in [t_0, T]$, and

(ii) the payoff distribution procedure $B^{t_0}(s) = [B_1^{t_0}(s), B_2^{t_0}(s), \ldots, B_n^{t_0}(s)]$, for $s \in [t, T]$, is identical to
$$B^{t}(s) = [B_1^{t}(s), B_2^{t}(s), \ldots, B_n^{t}(s)] \in P(x_t^c, T - t). \tag{4.14}$$

Thus cooperative strategies, payoff distribution procedures, and imputations satisfying the conditions in Definitions 4.1 and 4.2 are time consistent.
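Condition (4.13) can be explored numerically. The sketch below assumes a constant discount rate and two hypothetical payment streams (none of these numbers come from the text): when the payments agreed at $t_0$ are continued at time $t$, the revalued continuation matches the imputation of the subgame, and when the stream is replaced at $t$ it does not.

```python
import math

def present_value(B, q_T, r, start, base, T, steps=2000):
    """Integral of B(s) exp(-r (s - base)) over [start, T] plus the discounted
    terminal payment q_T, using the midpoint rule and a constant rate r."""
    dt = (T - start) / steps
    mids = (start + (k + 0.5) * dt for k in range(steps))
    integral = sum(B(s) * math.exp(-r * (s - base)) for s in mids) * dt
    return integral + q_T * math.exp(-r * (T - base))

r, t0, t, T, q_T = 0.05, 0.0, 4.0, 10.0, 3.0
B_original = lambda s: 1.0 + 0.1 * s      # hypothetical PDP agreed at t0
B_revised = lambda s: 2.0                 # hypothetical PDP proposed at t

# Left-hand side of (4.13): continuation of the t0 scheme, revalued at t.
lhs = math.exp(r * (t - t0)) * present_value(B_original, q_T, r, t, t0, T)
# Right-hand side of (4.13) when the same PDP is kept, and when it is revised.
rhs_same = present_value(B_original, q_T, r, t, t, T)
rhs_revised = present_value(B_revised, q_T, r, t, t, T)

consistent = abs(lhs - rhs_same) < 1e-9
inconsistent = abs(lhs - rhs_revised) > 1e-3
```

With the same payment stream the two sides differ only by rounding, which mirrors part (ii) of Definition 4.2 implying part (i) once the imputations are defined through the payments.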
4.3 Payoff Distribution Procedure Derivation and Time (Optimal-Trajectory-Subgame) Consistent Solutions

Crucial to obtaining a time consistent solution is the derivation of a payoff distribution procedure satisfying Definition 4.2 in Sect. 4.2.

4.3.1 Derivation of Payoff Distribution Procedures

Invoking part (ii) of Definition 4.2, we have $B^{t_0}(s) = B^{t}(s)$ for $t \in [t_0, T]$ and $s \in [t, T]$. We use $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ to denote $B^{t}(s)$ for all $t \in [t_0, T]$.
Along the conditional optimal trajectory $\{x^c(s)\}_{s=t_0}^{T}$ we then have

$$\xi^{(\tau)i}(\tau, x_\tau^c) = \int_{\tau}^{T} B_i(s) \exp\left[-\int_{\tau}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{\tau}^{T} r(y)\,dy\right], \tag{4.15}$$

for $i \in N$ and $\tau \in [t_0, T]$; and

$$\sum_{j=1}^{n} B_j(s) = \sum_{j=1}^{n} g^j[s, x_s^c, u_1^{(\tau)*}(s), u_2^{(\tau)*}(s), \ldots, u_n^{(\tau)*}(s)].$$

Moreover, for $t \in [\tau, T]$, we use the term

$$\xi^{(\tau)i}(t, x_t^c) = \int_{t}^{T} B_i(s) \exp\left[-\int_{\tau}^{s} r(y)\,dy\right] ds + q^i(x_T^c) \exp\left[-\int_{\tau}^{T} r(y)\,dy\right], \tag{4.16}$$

to denote the present value (with the initial time being $\tau$) of agent $i$'s payoff under cooperation over the time interval $[t, T]$ according to the solution $P(x_\tau^c, T - \tau)$ along the cooperative state trajectory. Invoking (4.15) and (4.16) we have

$$\xi^{(\tau)i}(t, x_t^c) = \exp\left[-\int_{\tau}^{t} r(y)\,dy\right] \xi^{(t)i}(t, x_t^c), \tag{4.17}$$

for $i \in N$, $\tau \in [t_0, T]$ and $t \in [\tau, T]$. One can readily verify that a payoff distribution procedure $\{B(s)\}_{s=t_0}^{T}$ that satisfies (4.17) will give rise to time consistent imputations satisfying part (i) of Definition 4.2.

The next task is the derivation of a payoff distribution procedure $\{B(s)\}_{s=t_0}^{T}$ that leads to the realization of (4.15)–(4.17). We first consider the following condition concerning the imputation $\xi^{(\tau)}(t, x_t^c)$, for $\tau \in [t_0, T]$ and $t \in [\tau, T]$.

Condition 4.1 For $i \in N$, $t \in [\tau, T]$, and $\tau \in [t_0, T]$, the imputation $\xi^{(\tau)i}(t, x_t^c)$ is a function that is continuously differentiable in $t$ and $x_t^c$.

A theorem characterizing a formula for $B_i(s)$, for $s \in [t_0, T]$ and $i \in N$, which yields (4.15)–(4.17), can be provided as follows.

Theorem 4.1 If Condition 4.1 is satisfied, a payoff distribution procedure (PDP) with a terminal payment $q^i(x_T^c)$ at time $T$ and an instantaneous payment at time $s \in [\tau, T]$

$$B_i(s) = -\left[\xi_t^{(s)i}(t, x_t^c)\Big|_{t=s}\right] - \left[\xi_{x_s^c}^{(s)i}(s, x_s^c)\right] f[s, x_s^c, \psi_1^*(s, x_s^c), \psi_2^*(s, x_s^c), \ldots, \psi_n^*(s, x_s^c)], \tag{4.18}$$

for $i \in N$, yields the imputation vector $\xi^{(\tau)}(\tau, x_\tau^c)$, for $\tau \in [t_0, T]$, which satisfies (4.15)–(4.17).
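Proof Invoking (4.15)–(4.17), one can obtain

$$\xi^{(\upsilon)i}(\upsilon, x_\upsilon^c) = \int_{\upsilon}^{\upsilon + \Delta t} B_i(s) \exp\left[-\int_{\upsilon}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{\upsilon}^{\upsilon + \Delta t} r(y)\,dy\right] \xi^{(\upsilon + \Delta t)i}(\upsilon + \Delta t, x_\upsilon^c + \Delta x_\upsilon^c), \tag{4.19}$$

for $\upsilon \in [\tau, T]$ and $i \in N$, where

$$\Delta x_\upsilon^c = f[\upsilon, x_\upsilon^c, \psi_1^*(\upsilon, x_\upsilon^c), \psi_2^*(\upsilon, x_\upsilon^c), \ldots, \psi_n^*(\upsilon, x_\upsilon^c)]\,\Delta t + o(\Delta t),$$

and $o(\Delta t)/\Delta t \to 0$ as $\Delta t \to 0$.

From (4.16) and (4.19), one obtains

$$\begin{aligned}
\int_{\upsilon}^{\upsilon + \Delta t} B_i(s) \exp\left[-\int_{\upsilon}^{s} r(y)\,dy\right] ds
&= \xi^{(\upsilon)i}(\upsilon, x_\upsilon^c) - \exp\left[-\int_{\upsilon}^{\upsilon + \Delta t} r(y)\,dy\right] \xi^{(\upsilon + \Delta t)i}(\upsilon + \Delta t, x_\upsilon^c + \Delta x_\upsilon^c) \\
&= \xi^{(\upsilon)i}(\upsilon, x_\upsilon^c) - \xi^{(\upsilon)i}(\upsilon + \Delta t, x_\upsilon^c + \Delta x_\upsilon^c),
\end{aligned} \tag{4.20}$$

for all $\upsilon \in [t_0, T]$ and $i \in N$. If the imputations $\xi^{(\upsilon)}(t, x_t^c)$, for $\upsilon \in [t_0, T]$, satisfy Condition 4.1, as $\Delta t \to 0$, one can express (4.20) as

$$B_i(\upsilon)\,\Delta t = -\left[\xi_t^{(\upsilon)i}(t, x_t^c)\Big|_{t=\upsilon}\right] \Delta t - \left[\xi_{x_\upsilon^c}^{(\upsilon)i}(\upsilon, x_\upsilon^c)\right] f[\upsilon, x_\upsilon^c, \psi_1^*(\upsilon, x_\upsilon^c), \psi_2^*(\upsilon, x_\upsilon^c), \ldots, \psi_n^*(\upsilon, x_\upsilon^c)]\,\Delta t - o(\Delta t). \tag{4.21}$$

Dividing (4.21) throughout by $\Delta t$, with $\Delta t \to 0$, yields (4.18). Thus the payoff distribution procedure $B_i(s)$ in (4.18) will lead to the realization of $\xi^{(\tau)i}(\tau, x_\tau^c)$, for $\tau \in [t_0, T]$, which satisfies (4.15)–(4.17).

Assigning the instantaneous payments according to the payoff distribution procedure in (4.18) leads to the realization of the imputation $\xi^{(\tau)}(\tau, x_\tau^c) \in P(x_\tau^c, T - \tau)$ for $\tau \in [t_0, T]$. Therefore, the payoff distribution procedure $B_i(s)$ in (4.18) yields time consistent imputations.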
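The formula in (4.18) lends itself to a numerical sanity check. The sketch below makes several assumptions that are not in the text: a constant discount rate `R`, hypothetical closed-loop dynamics `f`, and a hypothetical imputation function `phi` standing in for $\xi^{(t)i}(t, x)$ with terminal value $q^i(x) = x$. It computes $B_i(s)$ by central finite differences and then confirms that the discounted payments reproduce the imputation, as (4.15) requires.

```python
import math

R = 0.05   # constant discount rate (assumption)
T = 5.0    # terminal time (assumption)

def f(s, x):
    """Hypothetical state dynamics under the cooperative strategies."""
    return 1.0 - 0.4 * x

def phi(t, x):
    """Hypothetical imputation xi^{(t)i}(t, x); phi(T, x) = q(x) = x."""
    return x + 0.5 * (T - t)

def xi(tau, t, x):
    """xi^{(tau)i}(t, x) built from phi via (4.17) with a constant rate."""
    return math.exp(-R * (t - tau)) * phi(t, x)

def pdp(s, x, h=1e-5):
    """Instantaneous payment B_i(s) from (4.18), derivatives by central differences."""
    xi_t = (xi(s, s + h, x) - xi(s, s - h, x)) / (2 * h)
    xi_x = (xi(s, s, x + h) - xi(s, s, x - h)) / (2 * h)
    return -xi_t - xi_x * f(s, x)

def realized_imputation(tau, x_tau, steps=20000):
    """Right-hand side of (4.15): discounted payments plus terminal payment."""
    dt = (T - tau) / steps
    s, x, total = tau, x_tau, 0.0
    for _ in range(steps):
        total += pdp(s, x) * math.exp(-R * (s - tau)) * dt
        x += f(s, x) * dt          # forward Euler step along the trajectory
        s += dt
    return total + phi(T, x) * math.exp(-R * (T - tau))

gap = abs(realized_imputation(1.0, 2.0) - phi(1.0, 2.0))
```

The remaining gap reflects only discretization error; it shrinks as `steps` grows.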
4.3.2 Time (Optimal-Trajectory-Subgame) Consistent Solution

Given that group optimality has to be satisfied at every instant in dynamic cooperation, we consider the following optimality principle.

Principle PI Principle PI is an optimality principle that entails (i) group optimality and individual rationality, and (ii) the distribution of the total cooperative payoff according to an imputation that equals $\xi^{(\tau)}(\tau, x_\tau^*)$ for the subgame in $[\tau, T]$ along
the group optimal trajectory. Moreover, the function $\xi^{(\tau)}(\tau, x_\tau^*)$, for $\tau \in [t_0, T]$, is continuously differentiable in $\tau$ and $x_\tau^*$.

The term "time consistency" has been applied in a wide range of problems, like dynamic optimization, noncooperative differential games, noncooperative dynamic games, and rational choice theory. However, time consistency for the economic optimization problem presented in this chapter requires dynamical consistency for all subgames along the group optimal trajectory. Hence time consistency in this context reflects optimal-trajectory-subgame consistency. Therefore, we use the term optimal-trajectory-subgame consistency as a qualifier to the general term of time consistency.

A theorem characterizing a time (optimal-trajectory-subgame) consistent solution for the cooperative game $\Gamma_c(x_0, T - t_0)$ under optimality Principle PI is presented below.

Theorem 4.2 For the cooperative game $\Gamma_c(x_0, T - t_0)$ with optimality Principle PI, the solution $P(x_0, T - t_0) = \{u(s) \text{ and } B(s) \text{ for } s \in [t_0, T] \text{ and } \xi^{(t_0)}(t_0, x_0)\}$ in which

(i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ for the game $\Gamma_c(x_0, T - t_0)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [t_0, T]$ is given by

$$B_i(s) = -\left[\xi_t^{(s)i}(t, x_t^*)\Big|_{t=s}\right] - \left[\xi_{x_s^*}^{(s)i}(s, x_s^*)\right] f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*), \ldots, \psi_n^*(s, x_s^*)], \quad \text{for } i \in N, \tag{4.22}$$

where

$$\xi^{(s)}(s, x_s^*) = [\xi^{(s)1}(s, x_s^*), \xi^{(s)2}(s, x_s^*), \ldots, \xi^{(s)n}(s, x_s^*)] \in P(x_s^*, T - s)$$

is the imputation at time $s \in [t_0, T]$ with the state being $x_s^* \in \{x^*(t)\}_{t=t_0}^{T}$ according to optimality Principle PI,

is time (optimal-trajectory-subgame) consistent.

Proof Following the algorithm that specifies $P(x_0, T - t_0)$ as the solution to the game $\Gamma_c(x_0, T - t_0)$, one can readily obtain the solution of the cooperative game $\Gamma_c(x_\upsilon^*, T - \upsilon)$, for $\upsilon \in [t_0, T]$, as $P(x_\upsilon^*, T - \upsilon) = \{u(s) \text{ and } B(s) \text{ for } s \in [\upsilon, T] \text{ and } \xi^{(\upsilon)}(\upsilon, x_\upsilon^*)\}$ in which

(i) $u(s)$ for $s \in [\upsilon, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ for the game $\Gamma_c(x_\upsilon^*, T - \upsilon)$, and
(ii) $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [\upsilon, T]$ is given by

$$B_i(s) = -\left[\xi_t^{(s)i}(t, x_t^*)\Big|_{t=s}\right] - \left[\xi_{x_s^*}^{(s)i}(s, x_s^*)\right] f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*), \ldots, \psi_n^*(s, x_s^*)], \quad \text{for } i \in N, \tag{4.23}$$

where

$$\xi^{(s)}(s, x_s^*) = [\xi^{(s)1}(s, x_s^*), \xi^{(s)2}(s, x_s^*), \ldots, \xi^{(s)n}(s, x_s^*)] \in P(x_s^*, T - s)$$

is the imputation according to the agreed-upon optimality principle at time $s \in [\upsilon, T]$ with the state being $x_s^* \in \{x^*(t)\}_{t=\upsilon}^{T}$.
Invoking Remark 3.1 in Chap. 3 and Definition 4.1, one can show that the group optimal joint payoff maximizing strategies $\psi^*(s, x_s^*)$ for the cooperative game $\Gamma_c(x_0, T - t_0)$ over the time interval $[\upsilon, T]$ are identical to the joint payoff maximizing strategies for the cooperative game $\Gamma_c(x_\upsilon^*, T - \upsilon)$ over the same time interval. Comparing (4.22) and (4.23), one can show that the payoff distribution procedure $B(s)$ for the cooperative game $\Gamma_c(x_0, T - t_0)$ over the time interval $[\upsilon, T]$ is identical to the payoff distribution procedure $B(s)$ for the cooperative game $\Gamma_c(x_\upsilon^*, T - \upsilon)$ over the same time interval. Invoking Theorem 4.1, one can show that the payoff distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ in (4.22) will yield

$$\xi^{(\upsilon)i}(\upsilon, x_\upsilon^*) = \int_{\upsilon}^{T} B_i(s) \exp\left[-\int_{\upsilon}^{s} r(y)\,dy\right] ds + q^i(x_T^*) \exp\left[-\int_{\upsilon}^{T} r(y)\,dy\right] \in P(x_\upsilon^*, T - \upsilon),$$

for $i \in N$ and $\upsilon \in [t_0, T]$. Hence

$$\begin{aligned}
\exp\left[\int_{t_0}^{\upsilon} r(y)\,dy\right] \xi^{(t_0)i}(\upsilon, x_\upsilon^*)
&\equiv \exp\left[\int_{t_0}^{\upsilon} r(y)\,dy\right] \left\{ \int_{\upsilon}^{T} B_i(s) \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + q^i(x_T^*) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \right\} \\
&= \xi^{(\upsilon)i}(\upsilon, x_\upsilon^*) \in P(x_\upsilon^*, T - \upsilon).
\end{aligned}$$

In summary, the continuation of the solution $P(x_0, T - t_0)$ over the time interval $[\upsilon, T]$ coincides with the solution $P(x_\upsilon^*, T - \upsilon)$ of the game $\Gamma_c(x_\upsilon^*, T - \upsilon)$ under optimality Principle PI. Thus the solution $P(x_0, T - t_0)$ in Theorem 4.2 is indeed time consistent.

With agents using the cooperative strategies $\{\psi_i^*(\tau, x_\tau^*),\ \text{for } \tau \in [t_0, T] \text{ and } i \in N\}$, the instantaneous payment received by agent $i$ at time instant $\tau$ is

$$\zeta_i(\tau) = g^i[\tau, x_\tau^*, \psi_1^*(\tau, x_\tau^*), \psi_2^*(\tau, x_\tau^*), \ldots, \psi_n^*(\tau, x_\tau^*)], \tag{4.24}$$

for $\tau \in [t_0, T]$ and $i \in N$. According to Theorem 4.2, the instantaneous payment that agent $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, as stated in (4.22). Hence an instantaneous transfer payment

$$\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau) \tag{4.25}$$

has to be given to agent $i$ at time $\tau$, for $i \in N$ and $\tau \in [t_0, T]$.
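A small sketch of (4.25) at a single time instant follows. The function name and the numbers are illustrative assumptions. Because the payments $B_j(\tau)$ sum to the joint instantaneous payoff, as required of the payoff distribution procedure in Sect. 4.3.1, the transfers net out to zero across agents.

```python
def transfer_payments(B, zeta):
    """Transfers chi_i = B_i - zeta_i from (4.25) for one time instant.

    B: payments prescribed by the payoff distribution procedure,
    zeta: payoffs actually earned under the cooperative strategies (4.24).
    """
    return [b - z for b, z in zip(B, zeta)]

# Hypothetical two-agent instant: joint payoff 5.0 is earned as (4.0, 1.0)
# but the agreed procedure prescribes (3.0, 2.0).
chi = transfer_payments([3.0, 2.0], [4.0, 1.0])
```

Here agent 1 pays one unit and agent 2 receives it, so the scheme needs no outside funding.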
Under an optimal-trajectory-subgame consistent solution, the agreed-upon optimality principle remains effective at any instant of time throughout the game along the optimal state trajectory. Moreover, group and individual rationality are satisfied throughout the entire game interval. Theorem 4.2 provides a handy tool to obtain optimal-trajectory-subgame consistent or time consistent cooperative solutions. Examples of cooperative differential games with solutions satisfying time (optimal-trajectory-subgame) consistency can be found in Petrosyan (1997), Jørgensen and Zaccour (2001), Yeung (2005, 2007), Yeung and Petrosyan (2004, 2006a, 2006b), and Filar and Petrosjan (2000). Moreover, Theorem 4.2 can be applied to obtain a time (optimal-trajectory-subgame) consistent cooperative solution for the existing differential games in economic analysis.
4.4 Solutions from Specific Optimality Principle

In this section we present examples of time consistent solutions from some optimality principles.

Case 1. Joint Payoff Maximization and Equal Sharing of Gains from Cooperation

Consider the cooperative differential game $\Gamma_c(x_0, T - t_0)$. In particular, the agents agree with an optimality principle that entails (i) group optimality and (ii) the division of the excess of the total cooperative payoff over the sum of individual noncooperative payoffs equally. We denote the above optimality principle as Principle PI. Recall from Chap. 3 that the total cooperative payoff in the cooperative game $\Gamma_c(x_0, T - t_0)$ is $W^{(t_0)}(t_0, x_0)$, and the noncooperative payoff for agent $j$ is $V^{(t_0)j}(t_0, x_0)$ in the noncooperative game $\Gamma(x_0, T - t_0)$. According to optimality Principle PI the imputation to agent $i$ in $\Gamma_c(x_0, T - t_0)$ is

$$\xi^{(t_0)i}(t_0, x_0) = V^{(t_0)i}(t_0, x_0) + \frac{1}{n}\left[W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)\right], \quad \text{for } i \in N. \tag{4.26}$$

As the game progresses along the conditional optimal cooperative path $\{x_s^c\}_{s=t_0}^{T}$, according to Principle PI the imputation to agent $i$ in the cooperative game $\Gamma_c(x_\tau^c, T - \tau)$ is

$$\xi^{(\tau)i}(\tau, x_\tau^c) = V^{(\tau)i}(\tau, x_\tau^c) + \frac{1}{n}\left[W^{(\tau)}(\tau, x_\tau^c) - \sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^c)\right], \tag{4.27}$$

for $i \in N$ and $\tau \in (t_0, T]$. The imputation in (4.26) and (4.27) yields

(i) $\xi^{(\tau)i}(\tau, x_\tau^c) \ge V^{(\tau)i}(\tau, x_\tau^c)$, for $i \in N$ and $\tau \in [t_0, T]$; and
(ii) $\sum_{j=1}^{n} \xi^{(\tau)j}(\tau, x_\tau^c) = W^{(\tau)}(\tau, x_\tau^c)$ for $\tau \in [t_0, T]$.
Hence the imputation vector $\xi^{(\tau)}(\tau, x_\tau^*)$ satisfies individual rationality and group optimality. Applying Theorem 4.2, a time (optimal-trajectory-subgame) consistent solution under optimality Principle PI can be characterized as $P(x_0, T - t_0) = \{u(s) \text{ and } B(s) \text{ for } s \in [t_0, T] \text{ and } \xi^{(t_0)}(t_0, x_0)\}$ in which

(i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ in the game $\Gamma_c(x_0, T - t_0)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [t_0, T]$, where

$$\begin{aligned}
B_i(s) = {} & -\frac{\partial}{\partial t}\left\{ V^{(s)i}(t, x_t^*) + \frac{1}{n}\left[W^{(s)}(t, x_t^*) - \sum_{j=1}^{n} V^{(s)j}(t, x_t^*)\right] \right\}\Bigg|_{t=s} \\
& - \frac{\partial}{\partial x_s^*}\left\{ V^{(s)i}(s, x_s^*) + \frac{1}{n}\left[W^{(s)}(s, x_s^*) - \sum_{j=1}^{n} V^{(s)j}(s, x_s^*)\right] \right\} f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*), \ldots, \psi_n^*(s, x_s^*)],
\end{aligned} \tag{4.28}$$

for $i \in N$.

Case 2. Joint Payoff Maximization and Sharing Gains Proportional to Noncooperative Payoffs

Consider the cooperative differential game $\Gamma_c(x_0, T - t_0)$. In particular, the agents agree with an optimality principle that entails (i) group optimality and (ii) the sharing of the excess of the total cooperative payoff over the sum of the individual noncooperative payoffs proportional to the agents' noncooperative payoffs. We denote the above optimality principle as Principle PII. According to optimality Principle PII the imputation to agent $i$ in $\Gamma_c(x_0, T - t_0)$ is

$$\begin{aligned}
\xi^{(t_0)i}(t_0, x_0) &= V^{(t_0)i}(t_0, x_0) + \frac{V^{(t_0)i}(t_0, x_0)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)} \left[W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)\right] \\
&= \frac{V^{(t_0)i}(t_0, x_0)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)}\, W^{(t_0)}(t_0, x_0), \quad \text{for } i \in N.
\end{aligned} \tag{4.29}$$

As the game progresses along the conditional optimal cooperative path $\{x_s^c\}_{s=t_0}^{T}$, according to Principle PII the imputation to agent $i$ in the cooperative game $\Gamma_c(x_\tau^c, T - \tau)$ is

$$\xi^{(\tau)i}(\tau, x_\tau^c) = \frac{V^{(\tau)i}(\tau, x_\tau^c)}{\sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^c)}\, W^{(\tau)}(\tau, x_\tau^c), \quad \text{for } i \in N \text{ and } \tau \in [t_0, T]. \tag{4.30}$$
Again the imputation under optimality Principle PII satisfies individual rationality and group optimality. Applying Theorem 4.2, a time (optimal-trajectory-subgame) consistent solution under optimality Principle PII can be characterized as $P(x_0, T - t_0) = \{u(s) \text{ and } B(s) \text{ for } s \in [t_0, T] \text{ and } \xi^{(t_0)}(t_0, x_0)\}$ in which

(i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ in the game $\Gamma_c(x_0, T - t_0)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [t_0, T]$, where

$$\begin{aligned}
B_i(s) = {} & -\frac{\partial}{\partial t}\left\{ \frac{V^{(s)i}(t, x_t^*)}{\sum_{j=1}^{n} V^{(s)j}(t, x_t^*)}\, W^{(s)}(t, x_t^*) \right\}\Bigg|_{t=s} \\
& - \frac{\partial}{\partial x_s^*}\left\{ \frac{V^{(s)i}(s, x_s^*)}{\sum_{j=1}^{n} V^{(s)j}(s, x_s^*)}\, W^{(s)}(s, x_s^*) \right\} f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*), \ldots, \psi_n^*(s, x_s^*)],
\end{aligned} \tag{4.31}$$

for $i \in N$.

Case 3. Joint Payoff Maximization and Sharing Gains as a Combination of the Imputations in Principles PI and PII

Consider the cooperative differential game $\Gamma_c(x_0, T - t_0)$. In particular, the agents agree with an optimality principle that entails (i) group optimality and (ii) the sharing of the excess of the total cooperative payoff over the sum of individual noncooperative payoffs as a linear combination of the imputations in Principles PI and PII. We denote the above optimality principle as PIII. According to optimality Principle PIII the imputation to agent $i$ in $\Gamma_c(x_0, T - t_0)$ is

$$\xi^{(t_0)i}(t_0, x_0) = \alpha \left\{ V^{(t_0)i}(t_0, x_0) + \frac{1}{n}\left[W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)\right] \right\} + (1 - \alpha)\, \frac{V^{(t_0)i}(t_0, x_0)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0)}\, W^{(t_0)}(t_0, x_0), \quad \text{for } i \in N, \tag{4.32}$$

where $\alpha \in (0, 1)$. As the game progresses along the conditional optimal cooperative path $\{x_s^c\}_{s=t_0}^{T}$, according to PIII the imputation to agent $i$ in the cooperative game $\Gamma_c(x_\tau^c, T - \tau)$ is

$$\xi^{(\tau)i}(\tau, x_\tau^c) = \alpha \left\{ V^{(\tau)i}(\tau, x_\tau^c) + \frac{1}{n}\left[W^{(\tau)}(\tau, x_\tau^c) - \sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^c)\right] \right\} + (1 - \alpha)\, \frac{V^{(\tau)i}(\tau, x_\tau^c)}{\sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^c)}\, W^{(\tau)}(\tau, x_\tau^c), \tag{4.33}$$

for $i \in N$ and $\tau \in [t_0, T]$.

Again, the imputation under optimality Principle PIII satisfies individual rationality and group optimality.
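The three sharing rules are simple enough to state as functions of the payoffs at a given time and state. The sketch below uses hypothetical function names and sample payoffs; it assumes positive noncooperative payoffs and a cooperative payoff at least as large as their sum, which is what makes each rule individually rational.

```python
def equal_sharing(W, V):
    """Principle PI, as in (4.27): split the cooperative gain equally."""
    gain = W - sum(V)
    return [v + gain / len(V) for v in V]

def proportional_sharing(W, V):
    """Principle PII, as in (4.30): share W in proportion to V."""
    total = sum(V)
    return [v * W / total for v in V]

def combined_sharing(W, V, alpha):
    """Principle PIII, as in (4.33): convex combination of PI and PII."""
    eq, pr = equal_sharing(W, V), proportional_sharing(W, V)
    return [alpha * e + (1 - alpha) * p for e, p in zip(eq, pr)]

W, V = 12.0, [2.0, 4.0]            # illustrative payoffs only
xi_pi = equal_sharing(W, V)        # gain of 6 split 3 and 3
xi_pii = proportional_sharing(W, V)
xi_piii = combined_sharing(W, V, alpha=0.5)
```

Each vector sums to `W` and leaves every agent at or above its entry in `V`, matching the group optimality and individual rationality claims for the three principles.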
Applying Theorem 4.2, a time (optimal-trajectory-subgame) consistent solution under optimality Principle PIII can be characterized as $P(x_0, T - t_0) = \{u(s) \text{ and } B(s) \text{ for } s \in [t_0, T] \text{ and } \xi^{(t_0)}(t_0, x_0)\}$ in which

(i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ in the game $\Gamma_c(x_0, T - t_0)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [t_0, T]$, where

$$\begin{aligned}
B_i(s) = {} & -\frac{\partial}{\partial t}\left\{ \alpha \left[ V^{(s)i}(t, x_t^*) + \frac{1}{n}\left(W^{(s)}(t, x_t^*) - \sum_{j=1}^{n} V^{(s)j}(t, x_t^*)\right) \right] + (1 - \alpha)\, \frac{V^{(s)i}(t, x_t^*)}{\sum_{j=1}^{n} V^{(s)j}(t, x_t^*)}\, W^{(s)}(t, x_t^*) \right\}\Bigg|_{t=s} \\
& - \frac{\partial}{\partial x_s^*}\left\{ \alpha \left[ V^{(s)i}(s, x_s^*) + \frac{1}{n}\left(W^{(s)}(s, x_s^*) - \sum_{j=1}^{n} V^{(s)j}(s, x_s^*)\right) \right] + (1 - \alpha)\, \frac{V^{(s)i}(s, x_s^*)}{\sum_{j=1}^{n} V^{(s)j}(s, x_s^*)}\, W^{(s)}(s, x_s^*) \right\} \\
& \quad \times f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*), \ldots, \psi_n^*(s, x_s^*)],
\end{aligned} \tag{4.34}$$

for $i \in N$.

Case 4. Joint Payoff Maximization and Time Varying Sharing Weights

Consider the cooperative differential game $\Gamma_c(x_0, T - t_0)$ with two agents. In particular, the agents agree with an optimality principle that entails (i) group optimality and (ii) the division of the excess of the total cooperative payoff over the sum of individual noncooperative payoffs by the time-varying weights $\frac{\tau}{T + \alpha}$ for agent 1 and $\frac{T + \alpha - \tau}{T + \alpha}$ for agent 2 at time $\tau \in [t_0, T]$. We denote the above optimality principle as PIV. According to optimality Principle PIV the imputations to agents 1 and 2 in $\Gamma_c(x_0, T - t_0)$ are

$$\xi^{(t_0)1}(t_0, x_0) = V^{(t_0)1}(t_0, x_0) + \frac{t_0}{T + \alpha}\left[W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{2} V^{(t_0)j}(t_0, x_0)\right],$$

for agent 1, and

$$\xi^{(t_0)2}(t_0, x_0) = V^{(t_0)2}(t_0, x_0) + \frac{T + \alpha - t_0}{T + \alpha}\left[W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{2} V^{(t_0)j}(t_0, x_0)\right], \tag{4.35}$$

for agent 2. As the game progresses along the conditional optimal cooperative path $\{x_s^c\}_{s=t_0}^{T}$, according to PIV the imputations to agents 1 and 2 in the cooperative game $\Gamma_c(x_\tau^c, T - \tau)$ are

$$\xi^{(\tau)1}(\tau, x_\tau^c) = V^{(\tau)1}(\tau, x_\tau^c) + \frac{\tau}{T + \alpha}\left[W^{(\tau)}(\tau, x_\tau^c) - \sum_{j=1}^{2} V^{(\tau)j}(\tau, x_\tau^c)\right],$$
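for agent 1, and

$$\xi^{(\tau)2}(\tau, x_\tau^c) = V^{(\tau)2}(\tau, x_\tau^c) + \frac{T + \alpha - \tau}{T + \alpha}\left[W^{(\tau)}(\tau, x_\tau^c) - \sum_{j=1}^{2} V^{(\tau)j}(\tau, x_\tau^c)\right], \tag{4.36}$$

for agent 2; $\tau \in [t_0, T]$.

Again the imputation under optimality Principle PIV satisfies individual rationality and group optimality. Applying Theorem 4.2, a time (optimal-trajectory-subgame) consistent solution under optimality Principle PIV can be characterized as $P(x_0, T - t_0) = \{u(s) \text{ and } B(s) \text{ for } s \in [t_0, T] \text{ and } \xi^{(t_0)}(t_0, x_0)\}$ in which

(i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ in the game $\Gamma_c(x_0, T - t_0)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s)\}$ for $s \in [t_0, T]$, where

$$\begin{aligned}
B_1(s) = {} & -\frac{\partial}{\partial t}\left\{ V^{(s)1}(t, x_t^*) + \frac{t}{T + \alpha}\left[W^{(s)}(t, x_t^*) - \sum_{j=1}^{2} V^{(s)j}(t, x_t^*)\right] \right\}\Bigg|_{t=s} \\
& - \frac{\partial}{\partial x_s^*}\left\{ V^{(s)1}(s, x_s^*) + \frac{s}{T + \alpha}\left[W^{(s)}(s, x_s^*) - \sum_{j=1}^{2} V^{(s)j}(s, x_s^*)\right] \right\} f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*)], \\
B_2(s) = {} & -\frac{\partial}{\partial t}\left\{ V^{(s)2}(t, x_t^*) + \frac{T - t + \alpha}{T + \alpha}\left[W^{(s)}(t, x_t^*) - \sum_{j=1}^{2} V^{(s)j}(t, x_t^*)\right] \right\}\Bigg|_{t=s} \\
& - \frac{\partial}{\partial x_s^*}\left\{ V^{(s)2}(s, x_s^*) + \frac{T - s + \alpha}{T + \alpha}\left[W^{(s)}(s, x_s^*) - \sum_{j=1}^{2} V^{(s)j}(s, x_s^*)\right] \right\} f[s, x_s^*, \psi_1^*(s, x_s^*), \psi_2^*(s, x_s^*)].
\end{aligned} \tag{4.37}$$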
A variety of optimality principles with various imputation schemes like those in cases 1 to 4 can be constructed.
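The sharing rule of Case 4 can be sketched numerically. The snippet below is a minimal illustration, not part of the original analysis: the function name `piv_imputations` and all numbers ($T$, $\alpha$, $V^1$, $V^2$, $W$) are hypothetical, and it assumes $0 \le \tau \le T$ and $\alpha \ge 0$ so that both weights lie in $[0,1]$.

```python
def piv_imputations(tau, T, alpha, V1, V2, W):
    """Split the cooperative payoff W at time tau under Principle PIV.

    V1, V2 are the noncooperative payoffs of agents 1 and 2 and W is the
    joint cooperative payoff, all evaluated at (tau, x_tau).
    """
    excess = W - (V1 + V2)                 # gain from cooperation
    w1 = tau / (T + alpha)                 # weight of agent 1
    w2 = (T + alpha - tau) / (T + alpha)   # weight of agent 2, w1 + w2 = 1
    return V1 + w1 * excess, V2 + w2 * excess


# Hypothetical values, chosen only for illustration.
T, alpha = 4.0, 1.0
V1, V2, W = 10.0, 12.0, 25.0
xi1, xi2 = piv_imputations(1.0, T, alpha, V1, V2, W)
```

With a nonnegative excess, each agent receives at least the noncooperative payoff (individual rationality) and the two shares add up to $W$ (group optimality).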
4.5 An Illustration in Cooperative Fishery

Consider the case of two nations harvesting fish in common waters. The growth rate of the fish biomass is characterized by the differential equation
\[
\dot{x}(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(t_0) = x_0 \in X,
\tag{4.38}
\]
where $u_i \in U_i$ is the (nonnegative) amount of fish harvested by nation $i$, for $i \in \{1,2\}$, and $a$ and $b$ are positive constants.
The harvesting cost for nation $i \in \{1,2\}$ depends on the quantity of the resource extracted $u_i(s)$, the resource stock size $x(s)$, and a parameter $c_i$. In particular, nation $i$'s extraction cost can be specified as $c_i u_i(s) x(s)^{-1/2}$. The fish harvested by nation $i$ at time $s$ will generate a net benefit of the amount $[u_i(s)]^{1/2}$. The horizon of concern is $[t_0, T]$. At time $T$, nation $i$ will receive a termination bonus $q_i x(T)^{1/2}$, where $q_i$ is nonnegative. There exists a positive discount rate $r$. At time $t_0$ the payoff of nation $i \in \{1,2\}$ is
\[
\int_{t_0}^{T} \left[u_i(s)^{1/2} - \frac{c_i}{x(s)^{1/2}}\,u_i(s)\right] \exp[-r(s-t_0)]\,ds + \exp[-r(T-t_0)]\,q_i\,x(T)^{1/2}.
\tag{4.39}
\]
The game is a deterministic version of an example in Yeung and Petrosyan (2004). A set of feedback strategies $\{u_i^*(t) = \phi_i^*(t,x)$, for $i \in \{1,2\}\}$, provides a feedback Nash equilibrium solution to the game in (4.38) and (4.39), if there exist continuously differentiable functions $V^{(t_0)i}(t,x) : [t_0,T] \times R \to R$, $i \in \{1,2\}$, satisfying the following partial differential equations:
\[
\begin{aligned}
-V_t^{(t_0)i}(t,x) &= \max_{u_i}\left\{\left[u_i^{1/2} - \frac{c_i}{x^{1/2}}\,u_i\right]\exp[-r(t-t_0)] + V_x^{(t_0)i}(t,x)\left[a x^{1/2} - b x - u_i - \phi_j^*(t,x)\right]\right\}, \quad \text{and} \\
V^{(t_0)i}(T,x) &= q_i\,x^{1/2}\exp[-r(T-t_0)],
\end{aligned}
\tag{4.40}
\]
for $i \in \{1,2\}$, $j \in \{1,2\}$ and $j \neq i$.
Performing the indicated maximization yields the game equilibrium strategies
\[
\phi_i^*(t,x) = \frac{x}{4\left[c_i + V_x^{(t_0)i}\exp[r(t-t_0)]\,x^{1/2}\right]^2}, \quad \text{for } i \in \{1,2\}.
\tag{4.41}
\]
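The maximization behind (4.41) can be checked numerically. The following is a minimal sketch under stated assumptions: `lam` stands for the current-value shadow price $V_x^{(t_0)i}\exp[r(t-t_0)]$, the function names are illustrative, and the numbers are hypothetical. It compares the closed-form harvest with a brute-force grid search over the terms of (4.40) that depend on $u_i$.

```python
import math


def nash_harvest(x, c, lam):
    """Closed-form maximizer x / (4 [c + lam * x^{1/2}]^2) from (4.41)."""
    return x / (4.0 * (c + lam * math.sqrt(x)) ** 2)


def objective(u, x, c, lam):
    """Current-value terms of (4.40) that depend on u (others dropped)."""
    return math.sqrt(u) - c * u / math.sqrt(x) - lam * u


# Hypothetical stock, cost parameter, and shadow price.
x, c, lam = 50.0, 2.0, 0.05
u_star = nash_harvest(x, c, lam)

# Brute-force search on a grid with spacing 0.001 over (0, 20].
grid = [k * 1e-3 for k in range(1, 20001)]
u_grid = max(grid, key=lambda u: objective(u, x, c, lam))
```

Because the objective is strictly concave in $u$, the grid maximizer should sit within one grid step of `u_star`.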
Proposition 4.1 The value function of nation $i$ in the game in (4.38) and (4.39) is
\[
V^{(t_0)i}(t,x) = \exp[-r(t-t_0)]\left[A_i(t)\,x^{1/2} + C_i(t)\right], \quad \text{for } i \in \{1,2\} \text{ and } t \in [t_0,T],
\tag{4.42}
\]
where $A_i(t)$, $C_i(t)$, $A_j(t)$, and $C_j(t)$, for $i \in \{1,2\}$ and $j \in \{1,2\}$, and $i \neq j$, satisfy
\[
\begin{aligned}
\dot{A}_i(t) &= \left[r + \frac{b}{2}\right]A_i(t) - \frac{1}{2[c_i + A_i(t)/2]} + \frac{c_i}{4[c_i + A_i(t)/2]^2} + \frac{A_i(t)}{8[c_i + A_i(t)/2]^2} + \frac{A_i(t)}{8[c_j + A_j(t)/2]^2}, \\
\dot{C}_i(t) &= r\,C_i(t) - \frac{a}{2}A_i(t), \quad A_i(T) = q_i, \quad \text{and} \quad C_i(T) = 0.
\end{aligned}
\tag{4.43}
\]
Proof By substituting $\phi_1^*(t,x)$ and $\phi_2^*(t,x)$ into (4.40) and upon solving (4.40) one can obtain the results in Proposition 4.1 (see also the proof of Proposition 3.1 in Chap. 3).

Consider the alternative game $\Gamma(x_\tau, T-\tau)$ with the payoff structure in (4.39) and the dynamics in (4.38) starting at time $\tau \in [t_0,T]$ with initial state $x_\tau \in X$. Following the above analysis, the value function $V^{(\tau)i}(t,x) : [\tau,T] \times R \to R$, for $i \in \{1,2\}$ and $\tau \in [t_0,T]$, for the game $\Gamma(x_\tau, T-\tau)$ can be obtained as follows.

Proposition 4.2 The value function of nation $i \in \{1,2\}$ in the game $\Gamma(x_\tau, T-\tau)$ is
\[
V^{(\tau)i}(t,x) = \exp[-r(t-\tau)]\left[A_i(t)\,x^{1/2} + C_i(t)\right],
\tag{4.44}
\]
where for $i,j \in \{1,2\}$ and $i \neq j$, $A_i(t)$, $C_i(t)$, $A_j(t)$, and $C_j(t)$ are the same as those in Proposition 4.1.

Proof Follow the proof of Proposition 4.1.
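The coefficient functions $A_i(t)$ and $C_i(t)$ in Proposition 4.1 have no closed form in general, but the system (4.43) can be integrated backward from its terminal conditions. The sketch below uses a hand-written classical Runge-Kutta step; the helper names (`rhs`, `rk4_step`, `solve_backward`) and the parameter values are assumptions made only for illustration and are not taken from the text.

```python
def rhs(y, p):
    """Right-hand side of (4.43) for the stacked state y = [A1, A2, C1, C2]."""
    A1, A2, C1, C2 = y
    a, b, r, c1, c2 = p
    d1, d2 = c1 + A1 / 2, c2 + A2 / 2

    def dA(Ai, ci, di, dj):
        return ((r + b / 2) * Ai - 1 / (2 * di) + ci / (4 * di**2)
                + Ai / (8 * di**2) + Ai / (8 * dj**2))

    return [dA(A1, c1, d1, d2), dA(A2, c2, d2, d1),
            r * C1 - a / 2 * A1, r * C2 - a / 2 * A2]


def rk4_step(y, h, p):
    """One classical RK4 step of size h (h < 0 steps backward in time)."""
    k1 = rhs(y, p)
    k2 = rhs([yi + h / 2 * ki for yi, ki in zip(y, k1)], p)
    k3 = rhs([yi + h / 2 * ki for yi, ki in zip(y, k2)], p)
    k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)], p)
    return [yi + h / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
            for yi, s1, s2, s3, s4 in zip(y, k1, k2, k3, k4)]


def solve_backward(t0, T, q1, q2, p, n=400):
    """Integrate from A_i(T) = q_i, C_i(T) = 0 back to t0; returns [(t, y)]."""
    h = (T - t0) / n
    y = [q1, q2, 0.0, 0.0]
    path = [(T, y)]
    for k in range(n):
        y = rk4_step(y, -h, p)
        path.append((T - (k + 1) * h, y))
    return path[::-1]


# Hypothetical parameters (a, b, r, c1, c2) and terminal coefficients.
params = (3.0, 0.5, 0.05, 2.0, 1.0)
path = solve_backward(0.0, 4.0, 7.5, 5.0, params)
t_start, (A1_0, A2_0, C1_0, C2_0) = path[0]
```

Given the stored path, the equilibrium harvest at $(t,x)$ follows from (4.41) with $V_x^{(t_0)i}\exp[r(t-t_0)]\,x^{1/2} = A_i(t)/2$.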
Substituting the relevant derivatives of the value functions into the game equilibrium strategies of (4.41) yields a feedback Nash equilibrium for the game in (4.38) and (4.39).

Now consider the case when the nations agree to cooperate in harvesting the fishery. Let $\Gamma_c(x_0, T-t_0)$ denote a cooperative game with the game structure of $\Gamma(x_0, T-t_0)$ in which the agents agree to act according to the optimality principle that they would (i) maximize the sum of their payoffs and (ii) divide the excess of the total cooperative payoff over the sum of individual noncooperative payoffs equally. To maximize the joint payoffs, with a common terminal-bonus coefficient $q_1 = q_2 = q$, the nations would consider the optimal control problem of maximizing
\[
\int_{t_0}^{T}\left(\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\,u_1(s)\right] + \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\,u_2(s)\right]\right)\exp[-r(s-t_0)]\,ds + 2\exp[-r(T-t_0)]\,q\,x(T)^{1/2},
\tag{4.45}
\]
subject to (4.38).

Let $[\psi_1^*(t,x), \psi_2^*(t,x)]$ denote a set of controls that provides a solution to the optimal control problem in (4.38) and (4.45) and $W^{(t_0)}(t,x) : [t_0,T] \times R \to R$ denote the value function that satisfies the equations
\[
\begin{aligned}
-W_t^{(t_0)}(t,x) &= \max_{u_1,u_2}\left\{\left(\left[u_1^{1/2} - \frac{c_1}{x^{1/2}}\,u_1\right] + \left[u_2^{1/2} - \frac{c_2}{x^{1/2}}\,u_2\right]\right)\exp[-r(t-t_0)] + W_x^{(t_0)}(t,x)\left[a x^{1/2} - b x - u_1 - u_2\right]\right\}, \quad \text{and} \\
W^{(t_0)}(T,x) &= 2\exp[-r(T-t_0)]\,q\,x^{1/2}.
\end{aligned}
\tag{4.46}
\]
Performing the indicated maximization we obtain
\[
\psi_1^*(t,x) = \frac{x}{4\left[c_1 + W_x^{(t_0)}\exp[r(t-t_0)]\,x^{1/2}\right]^2}, \quad \text{and} \quad \psi_2^*(t,x) = \frac{x}{4\left[c_2 + W_x^{(t_0)}\exp[r(t-t_0)]\,x^{1/2}\right]^2}.
\]
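Substituting $\psi_1^*(t,x)$ and $\psi_2^*(t,x)$ above into (4.46) yields the value function
\[
W^{(t_0)}(t,x) = \exp[-r(t-t_0)]\left[\hat{A}(t)\,x^{1/2} + \hat{C}(t)\right],
\tag{4.47}
\]
where
\[
\begin{aligned}
\dot{\hat{A}}(t) &= \left[r + \frac{b}{2}\right]\hat{A}(t) - \frac{1}{2[c_1 + \hat{A}(t)/2]} - \frac{1}{2[c_2 + \hat{A}(t)/2]} + \frac{c_1}{4[c_1 + \hat{A}(t)/2]^2} + \frac{c_2}{4[c_2 + \hat{A}(t)/2]^2} \\
&\quad + \frac{\hat{A}(t)}{8[c_1 + \hat{A}(t)/2]^2} + \frac{\hat{A}(t)}{8[c_2 + \hat{A}(t)/2]^2}, \\
\dot{\hat{C}}(t) &= r\,\hat{C}(t) - \frac{a}{2}\hat{A}(t), \quad \hat{A}(T) = 2q, \quad \text{and} \quad \hat{C}(T) = 0.
\end{aligned}
\]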
The optimal cooperative controls can then be obtained as
\[
\psi_1^*(t,x) = \frac{x}{4[c_1 + \hat{A}(t)/2]^2}, \quad \text{and} \quad \psi_2^*(t,x) = \frac{x}{4[c_2 + \hat{A}(t)/2]^2}.
\tag{4.48}
\]
Substituting these control strategies into (4.38) yields the dynamics of the state trajectory under cooperation
\[
\dot{x}(s) = a x(s)^{1/2} - b x(s) - \frac{x(s)}{4[c_1 + \hat{A}(s)/2]^2} - \frac{x(s)}{4[c_2 + \hat{A}(s)/2]^2}, \qquad x(t_0) = x_0.
\tag{4.49}
\]
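Solving (4.49) yields the optimal cooperative state trajectory for $\Gamma_c(x_0, T-t_0)$ as
\[
\begin{aligned}
x^*(s) &= \Phi(t_0,s)^2\left[x_0^{1/2} + \int_{t_0}^{s}\Phi^{-1}(t_0,t)\,H_1\,dt\right]^2, \quad \text{for } s \in [t_0,T], \\
\text{where } \Phi(t_0,s) &= \exp\left[\int_{t_0}^{s} H_2(\tau)\,d\tau\right], \quad H_1 = \frac{1}{2}a, \quad \text{and} \\
H_2(s) &= -\left[\frac{1}{2}b + \frac{1}{8[c_1 + \hat{A}(s)/2]^2} + \frac{1}{8[c_2 + \hat{A}(s)/2]^2}\right].
\end{aligned}
\tag{4.50}
\]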
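The closed form (4.50) can be cross-checked against a direct integration of (4.49). The sketch below is illustrative only: parameter values are hypothetical, $\hat{A}(t)$ is obtained by a backward Runge-Kutta sweep of its differential equation with $\hat{A}(T) = 2q$, the two integrals in (4.50) are accumulated with the trapezoid rule, and (4.49) is integrated with Heun's method on the same grid.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
a, b, r, c1, c2, q = 3.0, 0.5, 0.05, 2.0, 1.0, 5.0
t0, T, x0, n = 0.0, 4.0, 50.0, 2000
h = (T - t0) / n


def A_hat_dot(A):
    """Right-hand side of the differential equation for A-hat(t)."""
    d1, d2 = c1 + A / 2, c2 + A / 2
    return ((r + b / 2) * A - 1 / (2 * d1) - 1 / (2 * d2)
            + c1 / (4 * d1**2) + c2 / (4 * d2**2)
            + A / (8 * d1**2) + A / (8 * d2**2))


# Backward RK4 sweep from the terminal condition A-hat(T) = 2q.
A_hat = [0.0] * (n + 1)
A_hat[n] = 2 * q
for k in range(n, 0, -1):
    A = A_hat[k]
    k1 = A_hat_dot(A)
    k2 = A_hat_dot(A - h / 2 * k1)
    k3 = A_hat_dot(A - h / 2 * k2)
    k4 = A_hat_dot(A - h * k3)
    A_hat[k - 1] = A - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

H1 = a / 2
H2 = [-(b / 2 + 1 / (8 * (c1 + A / 2)**2) + 1 / (8 * (c2 + A / 2)**2))
      for A in A_hat]

# Closed form (4.50); log_phi accumulates the integral of H2 from t0.
x_closed = [x0]
log_phi, integral = 0.0, 0.0
for k in range(n):
    new_log_phi = log_phi + h / 2 * (H2[k] + H2[k + 1])
    integral += h / 2 * H1 * (math.exp(-log_phi) + math.exp(-new_log_phi))
    log_phi = new_log_phi
    x_closed.append(math.exp(2 * log_phi) * (math.sqrt(x0) + integral)**2)


def drift(x, k):
    """Right-hand side of (4.49) at grid point k, written as a*sqrt(x) + 2*H2*x."""
    return a * math.sqrt(x) + 2 * H2[k] * x


# Heun (improved Euler) integration of (4.49) on the same grid.
x_heun = [x0]
for k in range(n):
    x = x_heun[-1]
    pred = x + h * drift(x, k)
    x_heun.append(x + h / 2 * (drift(x, k) + drift(pred, k + 1)))

max_gap = max(abs(u - v) for u, v in zip(x_closed, x_heun))
```

Both routes are second-order accurate in the step size, so `max_gap` should shrink roughly fourfold when `n` is doubled.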
The cooperative control for the game $\Gamma_c(x_0, T-t_0)$ over the time interval $[t_0,T]$ along the optimal trajectory $\{x^*(t)\}_{t=t_0}^{T}$ can be expressed precisely as
\[
\psi_1^*(t,x_t^*) = \frac{x_t^*}{4[c_1 + \hat{A}(t)/2]^2}, \quad \text{and} \quad \psi_2^*(t,x_t^*) = \frac{x_t^*}{4[c_2 + \hat{A}(t)/2]^2}.
\tag{4.51}
\]
Following the above analysis, the value function of the optimal control problem with the dynamics structure of (4.38) and the payoff structure in (4.45) which starts at time $\tau$ with initial state $x_\tau^*$ can be obtained as $W^{(\tau)}(t,x) = \exp[-r(t-\tau)][\hat{A}(t)\,x^{1/2} + \hat{C}(t)]$, and the corresponding optimal controls as
\[
\psi_1^*(t,x_t^*) = \frac{x_t^*}{4[c_1 + \hat{A}(t)/2]^2}, \quad \text{and} \quad \psi_2^*(t,x_t^*) = \frac{x_t^*}{4[c_2 + \hat{A}(t)/2]^2},
\]
over the time interval $[\tau,T]$.

The agreed-upon optimality principle entails an imputation
\[
\xi^{(\tau)i}(\tau,x_\tau^*) = V^{(\tau)i}(\tau,x_\tau^*) + \frac{1}{2}\left[W^{(\tau)}(\tau,x_\tau^*) - \sum_{j=1}^{2} V^{(\tau)j}(\tau,x_\tau^*)\right], \quad i \in \{1,2\},
\tag{4.52}
\]
in the cooperative game $\Gamma_c(x_\tau^*, T-\tau)$ for $\tau \in [t_0,T]$.

Applying Theorem 4.2, a time (optimal-trajectory-subgame) consistent solution under the above optimality principle for the cooperative game $\Gamma_c(x_0, T-t_0)$ can be obtained as $P(x_0, T-t_0) = \{u(s)$ and $B(s)$ for $s \in [t_0,T]$ and $\xi^{(t_0)}(t_0,x_0)\}$ in which
(i) $u(s)$ for $s \in [t_0,T]$ is the set of group optimal strategies
\[
\psi_1^*(s,x_s^*) = \frac{x_s^*}{4[c_1 + \hat{A}(s)/2]^2}, \quad \text{and} \quad \psi_2^*(s,x_s^*) = \frac{x_s^*}{4[c_2 + \hat{A}(s)/2]^2};
\]
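(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s)\}$ for $s \in [t_0,T]$, where
\[
\begin{aligned}
B_i(s) = {} & -\frac{1}{2}\Bigg\{\left[\dot{A}_i(s)(x_s^*)^{1/2} + \dot{C}_i(s)\right] - r\left[A_i(s)(x_s^*)^{1/2} + C_i(s)\right] \\
& \qquad + \frac{1}{2}A_i(s)(x_s^*)^{-1/2}\left[a(x_s^*)^{1/2} - b x_s^* - \frac{x_s^*}{4[c_i + \hat{A}(s)/2]^2} - \frac{x_s^*}{4[c_j + \hat{A}(s)/2]^2}\right]\Bigg\} \\
& -\frac{1}{2}\Bigg\{\left[\dot{\hat{A}}(s)(x_s^*)^{1/2} + \dot{\hat{C}}(s)\right] - r\left[\hat{A}(s)(x_s^*)^{1/2} + \hat{C}(s)\right] \\
& \qquad + \frac{1}{2}\hat{A}(s)(x_s^*)^{-1/2}\left[a(x_s^*)^{1/2} - b x_s^* - \frac{x_s^*}{4[c_i + \hat{A}(s)/2]^2} - \frac{x_s^*}{4[c_j + \hat{A}(s)/2]^2}\right]\Bigg\} \\
& +\frac{1}{2}\Bigg\{\left[\dot{A}_j(s)(x_s^*)^{1/2} + \dot{C}_j(s)\right] - r\left[A_j(s)(x_s^*)^{1/2} + C_j(s)\right] \\
& \qquad + \frac{1}{2}A_j(s)(x_s^*)^{-1/2}\left[a(x_s^*)^{1/2} - b x_s^* - \frac{x_s^*}{4[c_i + \hat{A}(s)/2]^2} - \frac{x_s^*}{4[c_j + \hat{A}(s)/2]^2}\right]\Bigg\},
\end{aligned}
\tag{4.53}
\]
for $i,j \in \{1,2\}$ and $i \neq j$, where $\dot{A}_i(s)$ and $\dot{C}_i(s)$ are given in (4.43) and $\dot{\hat{A}}(s)$ and $\dot{\hat{C}}(s)$ are given in (4.47). The terms in $r$ enter each brace with a minus sign because differentiating $\exp[-r(t-s)]$ with respect to $t$ at $t = s$ contributes the factor $-r$.

With agents using the cooperative strategies, the instantaneous receipt of agent $i$ at time instant $\tau$ is
\[
\zeta_i(\tau) = \frac{(x_\tau^*)^{1/2}}{2[c_i + \hat{A}(\tau)/2]} - \frac{c_i\,(x_\tau^*)^{1/2}}{4[c_i + \hat{A}(\tau)/2]^2}.
\tag{4.54}
\]
Under cooperation the instantaneous payment that agent $i$ should receive is $B_i(\tau)$, as stated in (4.53). Hence an instantaneous transfer payment
\[
\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau)
\tag{4.55}
\]
has to be given to agent $i$ at time $\tau$, for $i \in \{1,2\}$ and $\tau \in [t_0,T]$.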
4.6 Consistent Economic Optimization Under Infinite Horizon

In many economic situations, the terminal time of the game $T$ is either very far in the future or unknown to the agents. In this section, time consistent cooperation for games with infinite horizon is considered. Consider the $n$-person infinite-horizon general economic problem in which economic agent $i$'s payoff is
\[
\int_{\tau}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right]\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N.
\tag{4.56}
\]
The state dynamics is
\[
\dot{x}(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(\tau) = x_\tau.
\tag{4.57}
\]
Since $s$ does not appear explicitly in $g^i[x(s), u_1(s), u_2(s), \ldots, u_n(s)]$ or the state dynamics, the game in (4.56) and (4.57) is an autonomous problem. Consider the alternative game $\Gamma(x)$ that starts at time $t \in [t_0,\infty)$ with initial state $x(t) = x$:
\[
\max_{u_i}\int_{t}^{\infty} g^i\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right]\exp[-r(s-t)]\,ds, \quad \text{for } i \in N,
\tag{4.58}
\]
subject to the state dynamics
\[
\dot{x}(s) = f\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right], \qquad x(t) = x.
\tag{4.59}
\]
The infinite-horizon autonomous game $\Gamma(x)$ is independent of the choice of $t$ and dependent only upon the state at the starting time, that is, $x$.

Now consider the case when the economic agents agree to act cooperatively. Let $\Gamma_c(\tau,x_\tau)$ denote a cooperative game in which agent $i$'s payoff is (4.56) and the state dynamics is (4.57). The agents agree to act according to an agreed-upon optimality principle. As noted before, group optimality is an essential factor in cooperation and we let the agreed-upon optimality principle be as follows.

Principle PII It is an optimality principle that entails (i) group optimality and individual rationality, and (ii) the distribution of the total cooperative payoff according to an imputation that equals $\xi^{(\upsilon)}(\upsilon,x_\upsilon^*)$ for $\upsilon \in [\tau,\infty)$ over the game duration. Moreover, the function $\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) \in \xi^{(\upsilon)}(\upsilon,x_\upsilon^*)$, for $i \in N$, is continuously differentiable in $\upsilon$ and $x_\upsilon^*$.

The solution $P(\tau,x_\tau)$ of the cooperative game $\Gamma_c(\tau,x_\tau)$ under optimality Principle PII includes the following.
(i) A set of group optimal cooperative strategies $u^{(\tau)*}(s) = [u_1^{(\tau)*}(s), u_2^{(\tau)*}(s), \ldots, u_n^{(\tau)*}(s)]$, for $s \in [\tau,\infty)$.
(ii) An imputation vector $\xi^{(\tau)}(\tau,x_\tau) = [\xi^{(\tau)1}(\tau,x_\tau), \xi^{(\tau)2}(\tau,x_\tau), \ldots, \xi^{(\tau)n}(\tau,x_\tau)]$ to allocate the cooperative payoff to the agents.
(iii) A payoff distribution procedure $B^\tau(s) = [B_1^\tau(s), B_2^\tau(s), \ldots, B_n^\tau(s)]$, for $s \in [\tau,\infty)$, where $B_i^\tau(s)$ is the instantaneous payment for agent $i$ at time $s$. In particular,
\[
\xi^{(\tau)i}(\tau,x_\tau) = \int_{\tau}^{\infty} B_i^\tau(s)\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N.
\tag{4.60}
\]
In the following sections, we explicitly characterize the solution P (τ, xτ ) of the cooperative game Γc (τ, xτ ) under the optimality principle.
4.6.1 Group Optimal Cooperative Strategies

To ensure group rationality, the agents maximize the sum of their payoffs; that is, they solve the problem
\[
\max_{u_1,u_2,\ldots,u_n}\left\{\int_{\tau}^{\infty}\sum_{j=1}^{n} g^j\left[x(s), u_1(s), u_2(s), \ldots, u_n(s)\right]\exp[-r(s-\tau)]\,ds\right\},
\tag{4.61}
\]
subject to (4.57).

Following Theorem 3.2 in Chap. 3, we note that a set of controls $\{\psi_i^*(x)$, for $i \in N\}$ provides a solution to the optimal control problem in (4.61) if there exists a continuously differentiable function $W(x) : R^m \to R$ satisfying the infinite-horizon Bellman equation
\[
r\,W(x) = \max_{u_1,u_2,\ldots,u_n}\left\{\sum_{j=1}^{n} g^j[x, u_1, u_2, \ldots, u_n] + W_x\,f[x, u_1, u_2, \ldots, u_n]\right\}.
\tag{4.62}
\]
According to optimality Principle PII the agents will adopt the cooperative control $\{\psi_i^*(x)$, for $i \in N\}$ characterized in (4.62). Note that these controls are functions of the current state $x$ only. Substituting this set of controls into the state dynamics yields the optimal (cooperative) trajectory as
\[
\dot{x}(s) = f\left[x(s), \psi_1^*(x(s)), \psi_2^*(x(s)), \ldots, \psi_n^*(x(s))\right], \qquad x(\tau) = x_\tau.
\tag{4.63}
\]
Let $x^*(s)$ denote the solution to (4.63). The optimal trajectory $\{x^*(s)\}_{s=\tau}^{\infty}$ can be expressed as
\[
x^*(s) = x_\tau + \int_{\tau}^{s} f\left[x^*(\upsilon), \psi_1^*(x^*(\upsilon)), \psi_2^*(x^*(\upsilon)), \ldots, \psi_n^*(x^*(\upsilon))\right]d\upsilon.
\]
For notational convenience, we use the terms $x^*(s)$ and $x_s^*$ interchangeably. The cooperative control for the game can be expressed more precisely as
\[
\left\{\psi_i^*(x_s^*), \text{ for } i \in N \text{ and } s \in [\tau,\infty)\right\},
\]
which are functions of the current state $x_s^*$ only. The term
\[
W(x_\tau^*) = \int_{\tau}^{\infty}\sum_{j=1}^{n} g^j\left[x^*(s), \psi_1^*(x^*(s)), \psi_2^*(x^*(s)), \ldots, \psi_n^*(x^*(s))\right]\exp[-r(s-\tau)]\,ds
\]
is the maximized cooperative payoff at current time $\tau$, given that the state is $x_\tau^*$. Moreover, one can easily verify that the joint payoff maximizing controls for the cooperative game $\Gamma_c(\tau,x_\tau)$ over the time interval $[t,\infty)$ are identical to the joint payoff maximizing controls for the cooperative game $\Gamma_c(t,x_t^*)$ over the same time interval.
4.6.2 Consistent Imputation and Payoff Distribution Procedure

Let $P(\tau,x_\tau)$ denote the solution to the cooperative game $\Gamma_c(\tau,x_\tau)$ under the agreed-upon optimality Principle PII. According to $P(\tau,x_\tau)$, the economic agents would use the Payoff Distribution Procedure $\{B^\tau(s)\}_{s=\tau}^{\infty}$ to bring about an imputation to agent $i$ as
\[
\xi^{(\tau)i}(\tau,x_\tau) = \int_{\tau}^{\infty} B_i^\tau(s)\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N.
\tag{4.64}
\]
We define
\[
\xi^{(\tau)i}(t,x_t^*) = \int_{t}^{\infty} B_i^\tau(s)\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N,
\tag{4.65}
\]
where $t > \tau$ and $x_t^* \in \{x^*(s)\}_{s=\tau}^{\infty}$. According to $P(\tau,x_\tau)$, agent $i$ is supposed to receive a payoff $\xi^{(\tau)i}(t,x_t^*)$ over the remaining time interval $[t,\infty)$.

Consider the case when the game has proceeded to time $t$ and the state variable became $x_t^*$. Then one has a cooperative game $\Gamma_c(t,x_t^*)$ that starts at time $t$ with initial state $x_t^*$. According to the solution $P(t,x_t^*)$, an imputation
\[
\xi^{(t)i}(t,x_t^*) = \int_{t}^{\infty} B_i^t(s)\exp[-r(s-t)]\,ds
\]
will be allotted to agent $i$, for $i \in N$. However, according to $P(\tau,x_\tau)$, the imputation (in the present value viewed at time $\tau$) to agent $i$ over the period $[t,\infty)$ is
\[
\xi^{(\tau)i}(t,x_t^*) = \int_{t}^{\infty} B_i^\tau(s)\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N.
\tag{4.66}
\]
For the imputation from $P(\tau,x_\tau)$ to be consistent with those from $P(t,x_t^*)$, it is essential that
\[
\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*) = \xi^{(t)i}(t,x_t^*) \in P(t,x_t^*), \quad \text{for } t \in (\tau,\infty).
\]
In addition, at time $\tau$ when the initial state is $x_\tau$, according to the solution $P(\tau,x_\tau)$ generated by optimality Principle PII, the payoff distribution procedure is
\[
B^\tau(s) = \left[B_1^\tau(s), B_2^\tau(s), \ldots, B_n^\tau(s)\right], \quad \text{for } s \in [\tau,\infty).
\]
When the game has proceeded to time $t$ and the state variable has become $x_t^*$, according to the solution $P(t,x_t^*)$ generated by optimality Principle PII, the payoff distribution procedure
\[
B^t(s) = \left[B_1^t(s), B_2^t(s), \ldots, B_n^t(s)\right], \quad \text{for } s \in [t,\infty),
\]
will be adopted.
For the continuation of the payoff distribution procedure $B^\tau(s)$ under $P(\tau,x_\tau)$ to be consistent with $B^t(s) \in P(t,x_t^*)$, it is required that
\[
B^\tau(s) = B^t(s), \quad \text{for } s \in [t,\infty) \text{ and } t \in [\tau,\infty).
\]
Definition 4.3 The imputation and payoff distribution procedure $\{\xi^{(\tau)}(\tau,x_\tau)$ and $B^\tau(s)$ for $s \in [\tau,\infty)\} \in P(\tau,x_\tau)$ are time consistent if
(i)
\[
\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*) \equiv \exp[r(t-\tau)]\int_{t}^{\infty} B_i^\tau(s)\exp[-r(s-\tau)]\,ds = \xi^{(t)i}(t,x_t^*) \in P(t,x_t^*),
\tag{4.67}
\]
for $t \in (\tau,\infty)$ and $i \in N$; and
(ii) the payoff distribution procedure $B^\tau(s) = [B_1^\tau(s), B_2^\tau(s), \ldots, B_n^\tau(s)]$ for $s \in [t,\infty)$ is identical to $B^t(s) = [B_1^t(s), B_2^t(s), \ldots, B_n^t(s)] \in P(t,x_t^*)$.

Definition 4.3 is the infinite-horizon counterpart of Definition 4.2 in characterizing the time consistent imputation and payoff distribution procedure.
4.6.3 Derivation of Consistent Payoff Distribution Procedure

A payoff distribution procedure leading to the time consistent imputation has to satisfy Definition 4.3. Invoking Definition 4.3, we have $B_i^\tau(s) = B_i^t(s) = B_i(s)$, for $s \in [\tau,\infty)$, $t \in [\tau,\infty)$, and $i \in N$. Therefore, along the cooperative trajectory $\{x^*(t)\}_{t \geq \tau}$,
\[
\begin{aligned}
\xi^{(\tau)i}(\tau,x_\tau^*) &= \int_{\tau}^{\infty} B_i(s)\exp[-r(s-\tau)]\,ds, \quad \text{for } i \in N, \\
\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) &= \int_{\upsilon}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds, \quad \text{for } i \in N, \text{ and} \\
\xi^{(t)i}(t,x_t^*) &= \int_{t}^{\infty} B_i(s)\exp[-r(s-t)]\,ds, \quad \text{for } i \in N \text{ and } t \geq \upsilon \geq \tau.
\end{aligned}
\tag{4.68}
\]
Moreover, for $i \in N$ and $t \in [\tau,\infty)$, we define the term
\[
\xi^{(\upsilon)i}(t,x_t^*) = \left.\int_{t}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds\,\right|_{x(t) = x_t^*}
\tag{4.69}
\]
to denote the present value of agent $i$'s cooperative payoff over the time interval $[t,\infty)$, given that the state is $x_t^*$ at time $t \in [\upsilon,\infty)$, under the solution $P(\upsilon,x_\upsilon^*)$.
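Invoking (4.68) and (4.69), one can readily verify that $\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*) = \xi^{(t)i}(t,x_t^*)$, for $i \in N$, $\tau \in [t_0,\infty)$, and $t \in [\tau,\infty)$.

The next task is to derive $B_i(s)$, for $s \in [\tau,\infty)$, so that (4.69) can be realized. Consider again the following condition.

Condition 4.2 For $i \in N$, $t \geq \upsilon$, and $\upsilon \in [\tau,\infty)$, the term $\xi^{(\upsilon)i}(t,x_t^*)$ is a function that is continuously differentiable in $t$ and $x_t^*$.

Lemma 4.1 If Condition 4.2 is satisfied, a PDP with instantaneous payments at time $s$ equaling
\[
B_i(s) = -\left[\xi_t^{(s)i}(t,x_t^*)\Big|_{t=s}\right] - \xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\left[x_s^*, \psi_1^*(x_s^*), \psi_2^*(x_s^*), \ldots, \psi_n^*(x_s^*)\right],
\tag{4.70}
\]
for $i \in N$ and $s \in [\upsilon,\infty)$, yields imputation $\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$, for $\upsilon \in [\tau,\infty)$, which satisfies (4.69).

Proof Note that along the cooperative trajectory $\{x^*(t)\}_{t \geq \tau}$
\[
\xi^{(\upsilon)i}(t,x_t^*) = \int_{t}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds = \exp[-r(t-\upsilon)]\,\xi^{(t)i}(t,x_t^*),
\tag{4.71}
\]
for $i \in N$ and $t \in [\upsilon,\infty)$. For $\Delta t \to 0$, (4.69) can be expressed as
\[
\begin{aligned}
\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) &= \int_{\upsilon}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds \\
&= \int_{\upsilon}^{\upsilon+\Delta t} B_i(s)\exp[-r(s-\upsilon)]\,ds + \xi^{(\upsilon)i}(\upsilon+\Delta t,\, x_\upsilon^* + \Delta x_\upsilon^*),
\end{aligned}
\tag{4.72}
\]
where
\[
\Delta x_\upsilon^* = f\left[x_\upsilon^*, \psi_1^*(x_\upsilon^*), \psi_2^*(x_\upsilon^*), \ldots, \psi_n^*(x_\upsilon^*)\right]\Delta t + o(\Delta t), \quad \text{and} \quad o(\Delta t)/\Delta t \to 0 \text{ as } \Delta t \to 0.
\]
Replacing the term $x_\upsilon^* + \Delta x_\upsilon^*$ with $x_{\upsilon+\Delta t}^*$ and rearranging (4.72) yields
\[
\int_{\upsilon}^{\upsilon+\Delta t} B_i(s)\exp[-r(s-\upsilon)]\,ds = \xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) - \xi^{(\upsilon)i}(\upsilon+\Delta t,\, x_{\upsilon+\Delta t}^*),
\tag{4.73}
\]
for all $\upsilon \in [\tau,\infty)$ and $i \in N$.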
With Condition 4.2 holding for $\xi^{(\upsilon)i}(t,x_t^*)$, $\upsilon \in [\tau,\infty)$ and $t \in [\upsilon,\infty)$, and with $\Delta t \to 0$, (4.73) can be expressed as
\[
B_i(\upsilon)\,\Delta t = -\left[\xi_t^{(\upsilon)i}(t,x_t^*)\Big|_{t=\upsilon}\right]\Delta t - \xi_{x_\upsilon^*}^{(\upsilon)i}(\upsilon,x_\upsilon^*)\,f\left[x_\upsilon^*, \psi_1^*(x_\upsilon^*), \psi_2^*(x_\upsilon^*), \ldots, \psi_n^*(x_\upsilon^*)\right]\Delta t - o(\Delta t).
\tag{4.74}
\]
Dividing (4.74) throughout by $\Delta t$, with $\Delta t \to 0$, yields (4.70). Thus the payoff distribution procedure $B_i(\upsilon)$ in (4.70) leads to the realization of imputations that satisfy (4.69), and it therefore yields time consistent imputations satisfying Definition 4.3.

A more succinct form of Lemma 4.1 can be derived as follows. If Condition 4.2 is satisfied, a PDP with instantaneous payments at time $s$ equaling
\[
B_i(s) = r\,\xi^{(s)i}(s,x_s^*) - \xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\left[x_s^*, \psi_1^*(x_s^*), \psi_2^*(x_s^*), \ldots, \psi_n^*(x_s^*)\right],
\tag{4.75}
\]
for $i \in N$ and $s \in [\upsilon,\infty)$, yields imputation $\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$, for $\upsilon \in [\tau,\infty)$, which satisfies (4.69).

To demonstrate that (4.75) is an alternative form for (4.70) in Lemma 4.1, we first define
\[
\begin{aligned}
\hat{\xi}^i(x_\upsilon^*) &= \left.\int_{\upsilon}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds\,\right|_{x(\upsilon) = x_\upsilon^*} = \xi^{(\upsilon)i}(\upsilon,x_\upsilon^*), \quad \text{and} \\
\hat{\xi}^i(x_t^*) &= \left.\int_{t}^{\infty} B_i(s)\exp[-r(s-t)]\,ds\,\right|_{x(t) = x_t^*} = \xi^{(t)i}(t,x_t^*),
\end{aligned}
\]
for $i \in N$, $\upsilon \in [\tau,\infty)$, and $t \in [\upsilon,\infty)$ along the optimal cooperative trajectory $\{x_s^*\}_{s=\tau}^{\infty}$. We then have
\[
\xi^{(\upsilon)i}(t,x_t^*) = \exp[-r(t-\upsilon)]\,\hat{\xi}^i(x_t^*).
\]
Differentiating $\xi^{(\upsilon)i}(t,x_t^*)$ with respect to $t$ yields
\[
\xi_t^{(\upsilon)i}(t,x_t^*) = -r\exp[-r(t-\upsilon)]\,\hat{\xi}^i(x_t^*) = -r\,\xi^{(\upsilon)i}(t,x_t^*).
\]
At $t = \upsilon$, $\xi^{(\upsilon)i}(t,x_t^*) = \xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$; therefore,
\[
\xi_t^{(\upsilon)i}(t,x_t^*)\Big|_{t=\upsilon} = -r\,\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*).
\tag{4.76}
\]
Substituting (4.76) into (4.70) yields (4.75). Using (4.75), a time (optimal-trajectory-subgame) consistent solution in an infinite-horizon framework is characterized in the next section.
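The relation between the PDP in (4.75) and the imputation in (4.69) can be illustrated on a toy case. The sketch below is not from the text: it assumes a scalar state with hypothetical closed-loop dynamics $f(x) = -kx$ and a hypothetical linear imputation $\xi(x) = m x$ for one agent, builds $B_i$ from (4.75), and recovers $\xi(x_\tau)$ by discounting the payment stream numerically over a long truncated horizon.

```python
import math

# Hypothetical discount rate, decay rate, and imputation slope.
r, k, m = 0.05, 0.3, 2.0


def f(x):
    """Closed-loop state dynamics under the cooperative controls (assumed)."""
    return -k * x


def xi(x):
    """Stationary imputation xi^{(s)i}(s, x) of one agent (assumed linear)."""
    return m * x


def xi_x(x):
    """Derivative of the imputation with respect to the state."""
    return m


def pdp(x):
    """Instantaneous payment B_i = r*xi - xi_x*f, as in (4.75)."""
    return r * xi(x) - xi_x(x) * f(x)


def discounted_pdp(x0, horizon=100.0, n=100_000):
    """Trapezoid approximation of the discounted payment stream from x0."""
    h = horizon / n
    total = 0.0
    for j in range(n + 1):
        s = j * h
        x = x0 * math.exp(-k * s)          # exact solution of x' = -k x
        w = 0.5 if j in (0, n) else 1.0    # trapezoid end-point weights
        total += w * pdp(x) * math.exp(-r * s) * h
    return total


x0 = 10.0
recovered = discounted_pdp(x0)
```

In this toy case the discounted stream of payments reproduces $\xi(x_0) = m x_0$ up to truncation and quadrature error, which is the property that Lemma 4.1 asserts.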
4.6.4 Time (Optimal-Trajectory-Subgame) Consistent Solution

A theorem characterizing a time (optimal-trajectory-subgame) consistent solution $P(\tau,x_\tau)$ for the cooperative game $\Gamma_c(\tau,x_\tau)$ under optimality Principle PII is presented below.

Theorem 4.3 For the cooperative game $\Gamma_c(\tau,x_\tau)$ with optimality Principle PII, the solution $P(\tau,x_\tau) = \{u(s)$ and $B(s)$ for $s \in [\tau,\infty)$ and $\xi^{(\tau)}(\tau,x_\tau)\}$ in which
(i) $u(s)$ for $s \in [\tau,\infty)$ is the set of group optimal strategies $\psi^*(x_s^*)$ for the game $\Gamma_c(\tau,x_\tau)$, and
(ii) the imputation distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [\tau,\infty)$, where
\[
B_i(s) = r\,\xi^{(s)i}(s,x_s^*) - \xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\left[x_s^*, \psi_1^*(x_s^*), \psi_2^*(x_s^*), \ldots, \psi_n^*(x_s^*)\right],
\tag{4.77}
\]
for $i \in N$, and
\[
\xi^{(s)}(s,x_s^*) = \left[\xi^{(s)1}(s,x_s^*), \xi^{(s)2}(s,x_s^*), \ldots, \xi^{(s)n}(s,x_s^*)\right] \in P(s,x_s^*)
\]
is the imputation at time $s \in [\tau,\infty)$ with the state being $x_s^* \in \{x^*(t)\}_{t \geq \tau}$ under optimality Principle PII,
is time (optimal-trajectory-subgame) consistent.

Proof Following the algorithm that specifies $P(\tau,x_\tau)$ as the solution to the game $\Gamma_c(\tau,x_\tau)$ one can readily obtain the solution of the cooperative game $\Gamma_c(\upsilon,x_\upsilon^*)$, for $\upsilon > \tau$, as $P(\upsilon,x_\upsilon^*) = \{u(s)$ and $B(s)$ for $s \in [\upsilon,\infty)$ and $\xi^{(\upsilon)}(\upsilon,x_\upsilon^*)\}$ in which
(i) $u(s)$ for $s \in [\upsilon,\infty)$ is the set of group optimal strategies $\psi^*(x_s^*)$ for the game $\Gamma_c(\upsilon,x_\upsilon^*)$, and
(ii) $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ for $s \in [\upsilon,\infty)$, where
\[
B_i(s) = -\left[\xi_t^{(s)i}(t,x_t^*)\Big|_{t=s}\right] - \xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\left[x_s^*, \psi_1^*(x_s^*), \psi_2^*(x_s^*), \ldots, \psi_n^*(x_s^*)\right],
\tag{4.78}
\]
for $i \in N$, and
\[
\xi^{(s)}(s,x_s^*) = \left[\xi^{(s)1}(s,x_s^*), \xi^{(s)2}(s,x_s^*), \ldots, \xi^{(s)n}(s,x_s^*)\right] \in P(s,x_s^*)
\]
is the imputation at time $s \in [\upsilon,\infty)$ with the state being $x_s^* \in \{x^*(t)\}_{t \geq \upsilon}$.

Using the characterization of optimal control strategies in (4.62), one can show that the group optimal joint payoff maximizing strategies $\psi^*(x_s^*)$ for the cooperative game $\Gamma_c(\tau,x_\tau)$ over the time interval $[\upsilon,\infty)$ are identical to the joint payoff maximizing strategies for the cooperative game $\Gamma_c(\upsilon,x_\upsilon^*)$ over the same time interval.
Comparing (4.77) and (4.78), and using (4.76), one can show that the payoff distribution procedure $B(s)$ for the cooperative game $\Gamma_c(\tau,x_\tau)$ over the time interval $[\upsilon,\infty)$ is identical to the payoff distribution procedure $B(s)$ for the cooperative game $\Gamma_c(\upsilon,x_\upsilon^*)$ over the same time interval.

Invoking Lemma 4.1 and (4.75) one can show that the payoff distribution procedure $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ in (4.77) would yield
\[
\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) = \int_{\upsilon}^{\infty} B_i(s)\exp[-r(s-\upsilon)]\,ds \in P(\upsilon,x_\upsilon^*), \quad \text{for } i \in N \text{ and } \upsilon \in [\tau,\infty).
\]
Hence
\[
\exp[r(\upsilon-\tau)]\,\xi^{(\tau)i}(\upsilon,x_\upsilon^*) \equiv \exp[r(\upsilon-\tau)]\int_{\upsilon}^{\infty} B_i(s)\exp[-r(s-\tau)]\,ds = \xi^{(\upsilon)i}(\upsilon,x_\upsilon^*) \in P(\upsilon,x_\upsilon^*),
\]
for $i \in N$ and $\upsilon \in [\tau,\infty)$.

In summary, the continuation of the solution $P(\tau,x_\tau)$ over the time interval $[\upsilon,\infty)$ is consistent with the solution $P(\upsilon,x_\upsilon^*)$ of the game $\Gamma_c(\upsilon,x_\upsilon^*)$ under optimality Principle PII. Thus the solution $P(\tau,x_\tau)$ in Theorem 4.3 is indeed time (optimal-trajectory-subgame) consistent.

With agents using the cooperative strategies $\{\psi_i^*(x_\upsilon^*)$, for $i \in N$ and $\upsilon \in [\tau,\infty)\}$, the instantaneous receipt of agent $i$ at time instant $\upsilon$ is
\[
\zeta_i(\upsilon) = g^i\left[x_\upsilon^*, \psi_1^*(x_\upsilon^*), \psi_2^*(x_\upsilon^*), \ldots, \psi_n^*(x_\upsilon^*)\right], \quad \text{for } i \in N.
\tag{4.79}
\]
According to Theorem 4.3, the instantaneous payment that agent $i$ should receive under the agreed-upon optimality principle is $B_i(\upsilon)$, as stated in (4.77). Hence an instantaneous transfer payment
\[
\chi^i(\upsilon) = B_i(\upsilon) - \zeta_i(\upsilon)
\tag{4.80}
\]
has to be given to agent $i$ at time $\upsilon$, for $i \in N$.
4.7 Infinite-Horizon Resource Extraction Optimization

Consider an infinite-horizon version of the cooperative fishery game in Sect. 4.5. At initial time $\tau$, the payoff functions of nations 1 and 2 are, respectively,
\[
\int_{\tau}^{\infty}\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\,u_1(s)\right]\exp[-r(s-\tau)]\,ds, \quad \text{and} \quad \int_{\tau}^{\infty}\left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\,u_2(s)\right]\exp[-r(s-\tau)]\,ds.
\tag{4.81}
\]
The resource stock $x(s) \in X \subset R$ follows the dynamics
\[
\dot{x}(s) = a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s), \qquad x(\tau) = x_\tau \in X.
\tag{4.82}
\]
Invoking Theorem 2.4 in Chap. 2, a noncooperative feedback Nash equilibrium solution of the game in (4.81) and (4.82) can be characterized as
\[
r\,\hat{V}^i(x) = \max_{u_i}\left\{\left[u_i^{1/2} - \frac{c_i}{x^{1/2}}\,u_i\right] + \hat{V}_x^i(x)\left[a x^{1/2} - b x - u_i - \phi_j^*(x)\right]\right\},
\tag{4.83}
\]
for $i,j \in \{1,2\}$ and $i \neq j$. Performing the indicated maximization in (4.83) yields
\[
\phi_i^*(x) = \frac{x}{4\left[c_i + \hat{V}_x^i(x)\,x^{1/2}\right]^2}, \quad \text{for } i \in \{1,2\}.
\]
Substituting $\phi_1^*(x)$ and $\phi_2^*(x)$ above into (4.83) and upon solving (4.83) one obtains the value function of nation $i \in \{1,2\}$ as
\[
\hat{V}^i(x) = \left[A_i\,x^{1/2} + C_i\right],
\tag{4.84}
\]
where, for $i,j \in \{1,2\}$ and $i \neq j$, $A_i$, $C_i$, $A_j$, and $C_j$ satisfy
\[
\left[r + \frac{b}{2}\right]A_i - \frac{1}{2[c_i + A_i/2]} + \frac{c_i}{4[c_i + A_i/2]^2} + \frac{A_i}{8[c_i + A_i/2]^2} + \frac{A_i}{8[c_j + A_j/2]^2} = 0, \quad \text{and} \quad C_i = \frac{a}{2r}A_i.
\]
The game equilibrium strategies can be obtained as
\[
\phi_1^*(x) = \frac{x}{4[c_1 + A_1/2]^2}, \quad \text{and} \quad \phi_2^*(x) = \frac{x}{4[c_2 + A_2/2]^2}.
\tag{4.85}
\]
Consider the case when these two nations agree to act according to an agreed-upon optimality principle that entails (i) group optimality and (ii) the distribution of the cooperative payoff according to the imputation that equally divides the excess of the total cooperative payoff over the sum of individual noncooperative payoffs. To maximize their joint payoff for group optimality, the nations have to solve the control problem of maximizing
\[
\int_{\tau}^{\infty}\left(\left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}}\,u_1(s)\right] + \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}}\,u_2(s)\right]\right)\exp[-r(s-\tau)]\,ds,
\tag{4.86}
\]
subject to (4.82).
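Invoking Theorem 3.2 in Chap. 3, we obtain
\[
r\,W(x) = \max_{u_1,u_2}\left\{\left[u_1^{1/2} - \frac{c_1}{x^{1/2}}\,u_1\right] + \left[u_2^{1/2} - \frac{c_2}{x^{1/2}}\,u_2\right] + W_x(x)\left[a x^{1/2} - b x - u_1 - u_2\right]\right\}.
\]
Following similar procedures in previous analyses, one can obtain $W(x) = [A\,x^{1/2} + C]$, where
\[
\begin{aligned}
&\left[r + \frac{b}{2}\right]A - \frac{1}{2[c_1 + A/2]} - \frac{1}{2[c_2 + A/2]} + \frac{c_1}{4[c_1 + A/2]^2} + \frac{c_2}{4[c_2 + A/2]^2} + \frac{A}{8[c_1 + A/2]^2} + \frac{A}{8[c_2 + A/2]^2} = 0, \quad \text{and} \\
&C = \frac{a}{2r}A.
\end{aligned}
\]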
The optimal cooperative controls can then be obtained as
\[
\psi_1^*(x) = \frac{x}{4[c_1 + A/2]^2} \quad \text{and} \quad \psi_2^*(x) = \frac{x}{4[c_2 + A/2]^2}.
\tag{4.87}
\]
Substituting these control strategies into (4.82) yields the dynamics of the state trajectory under cooperation
\[
\dot{x}(s) = a x(s)^{1/2} - b x(s) - \frac{x(s)}{4[c_1 + A/2]^2} - \frac{x(s)}{4[c_2 + A/2]^2}, \qquad x(\tau) = x_\tau.
\tag{4.88}
\]
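Solving (4.88) yields the optimal cooperative state trajectory $\{x^*(s)\}_{s=\tau}^{\infty}$ for the cooperative game in (4.81) and (4.82) as
\[
x^*(s) = \left[\frac{a}{2H} + \left((x_\tau)^{1/2} - \frac{a}{2H}\right)\exp[-H(s-\tau)]\right]^2,
\tag{4.89}
\]
where
\[
H = \frac{b}{2} + \frac{1}{8[c_1 + A/2]^2} + \frac{1}{8[c_2 + A/2]^2}.
\]
Here $H$ is positive, so that $x^*(s)^{1/2}$ converges to $a/(2H)$ as $s \to \infty$; this follows because $y = x^{1/2}$ satisfies the linear equation $\dot{y} = a/2 - Hy$ under (4.88).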
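The stationary equation for $A$ and the trajectory (4.89) lend themselves to a quick numerical check. The sketch below is illustrative and rests on assumptions: the parameter values are hypothetical, $A$ is found by bisection on $[0,10]$ (the residual is negative at $0$ and positive at $10$ for these values), and $H$ is taken as the positive decay rate $b/2 + 1/(8[c_1+A/2]^2) + 1/(8[c_2+A/2]^2)$. The closed-form path is compared with a Runge-Kutta integration of (4.88).

```python
import math

# Hypothetical parameter values, chosen only for illustration.
a, b, r, c1, c2 = 3.0, 0.5, 0.05, 2.0, 1.0


def coop_residual(A):
    """Left-hand side of the stationary equation for A in W(x) = A x^{1/2} + C."""
    d1, d2 = c1 + A / 2, c2 + A / 2
    return ((r + b / 2) * A - 1 / (2 * d1) - 1 / (2 * d2)
            + c1 / (4 * d1**2) + c2 / (4 * d2**2)
            + A / (8 * d1**2) + A / (8 * d2**2))


def bisect(g, lo, hi, iters=200):
    """Root of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


A = bisect(coop_residual, 0.0, 10.0)
H = b / 2 + 1 / (8 * (c1 + A / 2)**2) + 1 / (8 * (c2 + A / 2)**2)


def x_closed(s, tau, x_tau):
    """Closed-form cooperative stock path (4.89)."""
    y_bar = a / (2 * H)
    return (y_bar + (math.sqrt(x_tau) - y_bar) * math.exp(-H * (s - tau)))**2


def x_numeric(s, tau, x_tau, n=4000):
    """Classical RK4 integration of the cooperative dynamics (4.88)."""
    def f(x):
        return (a * math.sqrt(x) - b * x
                - x / (4 * (c1 + A / 2)**2) - x / (4 * (c2 + A / 2)**2))

    h, x = (s - tau) / n, x_tau
    for _ in range(n):
        k1 = f(x)
        k2 = f(x + h / 2 * k1)
        k3 = f(x + h / 2 * k2)
        k4 = f(x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x


steady_state = (a / (2 * H))**2
```

For these values the stock declines from an initial level of 50 toward `steady_state`, which the closed form and the numerical path both approach.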
According to the agreed-upon optimality principle these nations will distribute the cooperative payoff according to the imputation that equally divides the excess of the total cooperative payoff over the sum of individual noncooperative payoffs. Hence the imputation $\xi(\upsilon,x_\upsilon^*) = [\xi^1(\upsilon,x_\upsilon^*), \xi^2(\upsilon,x_\upsilon^*)]$ has to satisfy the following:

Condition 4.3
\[
\xi^i(\upsilon,x_\upsilon^*) = \hat{V}^i(x_\upsilon^*) + \frac{1}{2}\left[W(x_\upsilon^*) - \sum_{j=1}^{2}\hat{V}^j(x_\upsilon^*)\right],
\tag{4.90}
\]
for $i \in \{1,2\}$ and $\upsilon \in [\tau,\infty)$.

Applying Theorem 4.3 a time (optimal-trajectory-subgame) consistent solution for the cooperative game $\Gamma_c(\tau,x_\tau)$ can be obtained as $P(\tau,x_\tau) = \{u(s)$ and $B(s)$ for $s \in [\tau,\infty)$ and $\xi^{(\tau)}(\tau,x_\tau)\}$ in which
(i) $u(s)$ for $s \in [\tau,\infty)$ is the set of group optimal strategies
\[
\psi_1^*(x_s^*) = \frac{x_s^*}{4[c_1 + A/2]^2} \quad \text{and} \quad \psi_2^*(x_s^*) = \frac{x_s^*}{4[c_2 + A/2]^2};
\]
and (ii) $B(s) = \{B_1(s), B_2(s)\}$ for $s \in [\tau,\infty)$ where
\[
\begin{aligned}
B_i(s) = {} & \frac{1}{2}\left\{r\left[A_i(x_s^*)^{1/2} + C_i\right] + r\left[A(x_s^*)^{1/2} + C\right] - r\left[A_j(x_s^*)^{1/2} + C_j\right]\right\} \\
& - \frac{1}{4}\left[A_i(x_s^*)^{-1/2} + A(x_s^*)^{-1/2} - A_j(x_s^*)^{-1/2}\right] \\
& \quad \times \left[a(x_s^*)^{1/2} - b x_s^* - \frac{x_s^*}{4[c_1 + A/2]^2} - \frac{x_s^*}{4[c_2 + A/2]^2}\right],
\end{aligned}
\tag{4.91}
\]
for $i,j \in \{1,2\}$ and $i \neq j$.

With agents using the cooperative strategies $\{\psi_i^*(x_\upsilon^*)$, $i \in \{1,2\}\}$ along the cooperative trajectory, the instantaneous receipt of agent $i$ at time instant $\upsilon$ becomes
\[
\zeta_i(\upsilon) = \frac{(x_\upsilon^*)^{1/2}}{2[c_i + A/2]} - \frac{c_i\,(x_\upsilon^*)^{1/2}}{4[c_i + A/2]^2}.
\tag{4.92}
\]
According to (4.91), the instantaneous payment that agent $i$ should receive under the agreed-upon optimality principle is $B_i(\upsilon)$. Hence an instantaneous transfer payment
\[
\chi^i(\upsilon) = B_i(\upsilon) - \zeta_i(\upsilon),
\tag{4.93}
\]
has to be given to agent $i$ at time $\upsilon \in [\tau,\infty)$, for $i \in \{1,2\}$.
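The payments (4.91) to (4.93) can be evaluated once the constants $A_1$, $A_2$, and $A$ are known. The sketch below is a hedged illustration with hypothetical parameters: the cooperative constant is found by bisection, the two coupled noncooperative equations are solved by alternating bisection sweeps (an assumed solution method, not one given in the text), and $C$, $C_i$ use $C = aA/(2r)$. Since $B_1 + B_2 = rW - W_x f$ equals the total instantaneous receipts by the Bellman equation, the two transfers should cancel.

```python
import math

# Hypothetical parameters; c = (c_1, c_2).
a, b, r, c = 3.0, 0.5, 0.05, (2.0, 1.0)


def bisect(g, lo=0.0, hi=10.0, iters=200):
    """Root of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return (lo + hi) / 2


def own_terms(Ai, ci):
    """Terms of the stationary equations that involve c_i + A_i/2 only."""
    d = ci + Ai / 2
    return -1 / (2 * d) + ci / (4 * d**2) + Ai / (8 * d**2)


def nash_residual(Ai, ci, Aj, cj):
    """Left-hand side of the stationary equation for A_i below (4.84)."""
    return (r + b / 2) * Ai + own_terms(Ai, ci) + Ai / (8 * (cj + Aj / 2)**2)


# Cooperative constant A, then the coupled noncooperative constants.
A = bisect(lambda z: (r + b / 2) * z + own_terms(z, c[0]) + own_terms(z, c[1]))
A1 = A2 = 0.5
for _ in range(50):  # alternate between the two equations until they settle
    A1 = bisect(lambda z: nash_residual(z, c[0], A2, c[1]))
    A2 = bisect(lambda z: nash_residual(z, c[1], A1, c[0]))
An = (A1, A2)
C = a * A / (2 * r)
Cn = tuple(a * Ai / (2 * r) for Ai in An)


def f_coop(x):
    """Cooperative state dynamics (4.88)."""
    return (a * math.sqrt(x) - b * x
            - x / (4 * (c[0] + A / 2)**2) - x / (4 * (c[1] + A / 2)**2))


def pdp(i, x):
    """Instantaneous payment B_i from (4.91); i is 0 or 1."""
    j, s = 1 - i, math.sqrt(x)
    level = r * (An[i] * s + Cn[i]) + r * (A * s + C) - r * (An[j] * s + Cn[j])
    slope = (An[i] + A - An[j]) / s
    return 0.5 * level - 0.25 * slope * f_coop(x)


def receipt(i, x):
    """Instantaneous receipt zeta_i from (4.92)."""
    d = c[i] + A / 2
    return math.sqrt(x) / (2 * d) - c[i] * math.sqrt(x) / (4 * d**2)


def transfer(i, x):
    """Transfer payment chi^i = B_i - zeta_i from (4.93)."""
    return pdp(i, x) - receipt(i, x)
```

A call such as `transfer(0, 50.0)` gives the side payment to nation 1 at a stock level of 50 under these assumed numbers; the payment to nation 2 is its negative.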
4.8 Exercises

4.1 Consider the case of two nations harvesting fish in common waters. The growth rate of the fish biomass is characterized by the differential equation
\[
\dot{x}(s) = 3x(s)^{1/2} - 0.5x(s) - u_1(s) - u_2(s), \qquad x(0) = 50,
\]
where $u_i \in U_i$ is the (nonnegative) amount of fish harvested by nation $i$, for $i \in \{1,2\}$. The horizon of the game is $[0,4]$. The harvesting cost for nation $i \in \{1,2\}$ depends on the quantity of resource extracted $u_i(s)$ and the resource stock size $x(s)$. In particular, nation 1's extraction cost is $2u_1(s)x(s)^{-1/2}$ and nation 2's is $u_2(s)x(s)^{-1/2}$. The fish harvested by nation $i$ at time $s$ will generate a net benefit of the amount $[u_i(s)]^{1/2}$. At terminal time 4, nations 1 and 2 will receive termination bonuses $7.5x(4)^{1/2}$ and $5x(4)^{1/2}$ while the interest rate is 0.05. At time 0 the payoffs of nation 1 and nation 2 are, respectively,
\[
\int_{0}^{4}\left[u_1(s)^{1/2} - \frac{4}{x(s)^{1/2}}\,u_1(s)\right]\exp(-0.05s)\,ds + \exp[-0.05(4)]\,7.5\,x(4)^{1/2}, \quad \text{and}
\]
\[
\int_{0}^{4}\left[u_2(s)^{1/2} - \frac{3}{x(s)^{1/2}}\,u_2(s)\right]\exp(-0.05s)\,ds + \exp[-0.05(4)]\,5\,x(4)^{1/2}.
\]
Obtain a feedback Nash equilibrium solution for this transnational market activity.

4.2 If these nations agree to cooperate and maximize their joint payoff, compute the optimal cooperative strategies and optimal stock path of the fish biomass.

4.3 Furthermore, if these nations agree to share the excess of their gain equally along the optimal trajectory, obtain a time (optimal-trajectory-subgame) consistent solution.

4.4 Consider the case when the game horizon in Exercise 4.1 is extended to infinity.
(i) Obtain a feedback Nash equilibrium solution for this transnational market activity.
(ii) If these nations agree to cooperate and maximize their joint payoff, compute the optimal cooperative strategies and optimal stock path of the fish biomass.
(iii) If these nations agree to share the excess of their gain equally along the optimal trajectory, obtain a time (optimal-trajectory-subgame) consistent solution.
Chapter 5
Dynamically Stable Cost-Saving Joint Venture
In this chapter, we consider a common economic activity involving cooperative optimization: the joint venture. It is often observed that after a certain time of cooperation some firms in a joint venture may gain sufficient skills and technology that they would do better by breaking away from the joint operation. An analysis of time (optimal-trajectory-subgame) consistent joint ventures is presented in the following sections.

As markets become increasingly globalized and firms become more multinational, corporate joint ventures are likely to yield opportunities to quickly create economies of scale and critical mass, incorporate new skills and technology, and facilitate rational resource sharing (see Bleeke and Ernst 1993). With joint ventures becoming a powerful force shaping global corporate strategy, partnerships between firms have significantly increased. Despite their purported benefits, however, joint ventures are highly unstable and have a consistently high rate of failure (Blodgett 1992; Parkhe 1993). In addition, other adverse effects, such as uncompensated transfers of technology, operational difficulties, disagreements, and anxiety over the loss of proprietary information, have been found (Hamel et al. 1989; Gomes-Casseres 1987). D'Aspremont and Jacquemin (1988), Kamien et al. (1992), and Suzumura (1992) have studied cooperative R&D with spillovers in joint ventures under a static framework. Cellini and Lambertini (2002, 2004) considered cooperative solutions to investment in product differentiation in a dynamic approach.

A dynamic model of a corporate joint venture resulting in cost saving is presented in Sect. 5.1. Time (optimal-trajectory-subgame) consistent solutions are derived in Sect. 5.2 and an example is given in Sect. 5.3. The derivation of a Shapley value solution to the joint venture is given in Sect. 5.4 and the Shapley value profit sharing is analyzed in Sect. 5.5.
An extension of the analysis to an infinite horizon is investigated in Sect. 5.6 and an example of an infinite-horizon joint venture is provided in Sect. 5.7.
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_5, © Springer Science+Business Media, LLC 2012
5.1 A Dynamic Model of Corporate Joint Venture
A fundamental premise is that joint ventures are formed primarily so that participating firms can readily gain core skills and technology that would be difficult for them to obtain on their own (Murray and Siehl 1989). Cost reduction is often an advantage gained by the firms in the joint venture. However, after a certain time of cooperation some firms may gain sufficient managerial and technological expertise that they would do better by breaking away from the joint venture. Thus a major source of instability is the lack of dynamically stable, or time consistent, cooperative solutions to the joint venture. Time (optimal-trajectory-subgame) consistency is a fundamental element in dynamic cooperation, and it ensures that (i) the extension of the solution policy to a later starting time along the optimal trajectory will remain optimal, and (ii) no participating firm has an incentive to deviate from the initial plan. The absence of a formal mechanism for obtaining time consistent cooperative solutions has precluded the rigorous analysis of the problem of corporate joint ventures.
Petrosyan and Zaccour (2003) provided a time consistent solution to a class of differential games involving pollution cost reduction. Yeung and Petrosyan (2006b) presented a dynamically stable joint venture involving cooperative R&D with spillovers. Yeung (2010) provided an analysis of time consistent cost-saving joint ventures. In this section, we present a framework of a dynamic joint venture in which there are $n$ firms. The venture horizon is $[t_0, T]$.
The state dynamics of the $i$th firm is characterized by the set of vector-valued differential equations
$$\dot{x}^i(s) = f^i[s, x^i(s), u_i(s)], \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N, \tag{5.1}$$
where $x^i(s) \in X_i \subset R_+^{m_i}$ denotes the state variables of firm $i$, and $u_i \in U_i \subset R_+^{\ell_i}$ is firm $i$'s investment in technology advancement. The state of firm $i$ includes its capital stock, level of technology, special skills, and productive resources. The objective of firm $i$ is
$$\int_{t_0}^{T} \left\{ g^i[s, x^i(s)] - c_i^{\{i\}}[u_i(s)] \right\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^i(x^i(T)), \tag{5.2}$$
for $i \in N$, where $\exp[-\int_{t_0}^{t} r(y)\,dy]$ is the discount factor, $g^i[s, x^i(s)]$ the instantaneous revenue, $c_i^{\{i\}}[u_i(s)]$ the cost of the firm's control $u_i(s)$ when it is operating on its own, and $q^i(x^i(T))$ the terminal payment. In particular, the firm's revenue $g^i[s, x^i]$ is affected by the state variables, such as capital stock, special skills, productive resources, and technologies.
Note that since the objectives and state dynamics of the firms in a noncooperative equilibrium are independent, the market outcome is represented by $n$ neoclassical theory-of-the-firm problems. Let $V^{(t_0)i}(t, x^i)$ and $\phi_i^*(t, x^i)$ denote the payoff and investment strategies of firm $i$, for $i \in N$, by which a firm's equilibrium is characterized (see Theorem A.1 in the Technical Appendixes) as follows.
A set of investment strategies $\phi_i^*(t, x^i)$ for firm $i$ constitutes an optimal solution to the neoclassical theory of the firm problem, which maximizes (5.2) subject to (5.1), if there exists a continuously differentiable function $V^{(t_0)i}(t, x^i): [t_0, T] \times R^{m_i} \to R$ satisfying the following Bellman equation:
$$-V_t^{(t_0)i}(t, x^i) = \max_{u_i} \left\{ \left[ g^i(t, x^i) - c_i^{\{i\}}(u_i) \right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + V_{x^i}^{(t_0)i}(t, x^i)\, f^i(t, x^i, u_i) \right\},$$
and
$$V^{(t_0)i}(T, x^i) = q^i(x^i) \exp\left[-\int_{t_0}^{T} r(y)\,dy\right].$$
Let $V^{(\tau)i}(t, x^i)$ denote the payoff function of firm $i$ in a game with the dynamics in (5.1) and the payoff of (5.2), which starts at time $\tau$ for $\tau \in [t_0, T)$. Note that the equilibrium feedback strategies are Markovian in the sense that they depend on the current time and the current state. Invoking Remark 2.1 of Chap. 2, one can obtain
$$\exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] V^{(t_0)i}(t, x^i) = V^{(\tau)i}(t, x^i),$$
for $\tau \in [t_0, T]$ and $i \in N$.
Consider a joint venture consisting of all these $n$ companies. The participating firms can gain core skills and technology that would be impossible for them to obtain on their own individually. Cost-saving opportunities are created under the joint venture, for instance, savings in joint R&D, administration, marketing, customer services, purchasing, financing, and economies of scale and scope. The cost of control of firm $j$ under the joint venture becomes $c_j^N[u_j(s)]$. With the absolute joint venture cost advantage we have
$$c_j^N(u_j) \le c_j^{\{j\}}(u_j), \quad \text{for } j \in N. \tag{5.3}$$
Moreover, marginal cost advantages lead to
$$\partial c_j^N(u_j)/\partial u_j \le \partial c_j^{\{j\}}(u_j)/\partial u_j, \quad \text{for } j \in N.$$
At time $t_0$, the joint venture would maximize the joint venture profit
$$\int_{t_0}^{T} \sum_{j=1}^{n} \left\{ g^j[s, x^j(s)] - c_j^N[u_j(s)] \right\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \sum_{j=1}^{n} \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^j(x^j(T)), \tag{5.4}$$
subject to (5.1). The model adopted for analysis concentrates on the reduction of costs within the joint venture; the profit of an outside firm is not affected by the actions of the joint venture. Such an outcome would appear in scenarios in which firms are selling different products, in which firms are making vertical integrations, or in which there exists a sizable world market. For the sake of clarity in exposition, we consider the case where $m_i = 1$, for $i \in N$.
5.2 Time (Optimal-Trajectory-Subgame) Consistent Solution in Joint Venture
We begin with the characterization of the profit of the joint venture. Let $x$ denote $\{x^1, x^2, \ldots, x^n\}$. Invoking Bellman's technique of dynamic programming in Sect. A.1 of the Technical Appendixes, the solution to the problem in (5.1) and (5.4) can be characterized as follows.
Corollary 5.1 A set of controls $\{\psi_i^*(t, x)$, for $i \in N$ and $t \in [t_0, T]\}$ provides an optimal solution to the control problem in (5.1) and (5.4) if there exists a continuously differentiable function $W^{(t_0)}(t, x): [t_0, T] \times R^n \to R$ satisfying the following Bellman equation:
$$-W_t^{(t_0)}(t, x) = \max_{u_1, u_2, \ldots, u_n} \left\{ \sum_{j=1}^{n} \left[ g^j(t, x^j) - c_j^N(u_j) \right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + \sum_{j=1}^{n} W_{x^j}^{(t_0)}(t, x)\, f^j(t, x^j, u_j) \right\},$$
$$W^{(t_0)}(T, x) = \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \sum_{j=1}^{n} q^j(x^j). \tag{5.5}$$
Hence the firms will adopt the cooperative control {ψi∗ (t, x), for i ∈ N and t ∈ [t0 , T ]} to obtain the maximized level of joint profit. In a cooperative framework, the issue of the nonuniqueness of the optimal controls can be resolved by the agreement between the firms on a particular set of controls. Substituting this set of controls into (5.1) yields the dynamics of technology advancement under cooperation as
$$\dot{x}^i(s) = f^i[s, x^i(s), \psi_i^*(s, x(s))], \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N. \tag{5.6}$$
Let $x^*(t) = \{x^{1*}(t), x^{2*}(t), \ldots, x^{n*}(t)\}$ denote the solution to (5.6). The optimal trajectory $\{x^*(t)\}_{t=t_0}^{T}$ can be expressed as
$$x^{i*}(t) = x_0^i + \int_{t_0}^{t} f^i[s, x^{i*}(s), \psi_i^*(s, x^*(s))]\,ds, \quad \text{for } i \in N. \tag{5.7}$$
For notational convenience, we use the terms $x^*(t)$ and $x_t^*$ interchangeably. The cooperative investment strategies for the cooperative game in (5.1) and (5.4) over the time interval $[t_0, T]$ can be expressed more precisely as
$$\left\{ \psi_i^*(t, x^*(t)), \text{ for } i \in N \text{ and } t \in [t_0, T] \right\}. \tag{5.8}$$
Note that for group optimality to be achievable, the cooperative investment strategies $\{\psi_i^*(t, x^*(t))$, for $i \in N$ and $t \in [t_0, T]\}$ must be exercised throughout the time interval $[t_0, T]$. Along the cooperative investment path $\{x^*(t)\}_{t=t_0}^{T}$, the total venture profit over the interval $[t, T]$, for $t \in [t_0, T)$, can be expressed as
$$W^{(t_0)}(t, x_t^*) = \int_{t}^{T} \sum_{j=1}^{n} \left\{ g^j[s, x^{j*}(s)] - c_j^N[\psi_j^*(s, x^*(s))] \right\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \sum_{j=1}^{n} q^j(x^{j*}(T)). \tag{5.9}$$
Let $W^{(\tau)}(t, x_t^*)$ denote the total venture profit from the control problem with the dynamics in (5.1) and the payoff in (5.4), which begins at time $\tau \in [t_0, T]$ with initial state $x_\tau^*$. Invoking Remark 3.1 of Chap. 3, one can readily obtain
$$\exp\left[\int_{t_0}^{\tau} r(y)\,dy\right] W^{(t_0)}(t, x_t^*) = W^{(\tau)}(t, x_t^*),$$
for $\tau \in [t_0, T]$ and $t \in [\tau, T)$. Next, we consider an imputation scheme to share the total venture profit.
5.2.1 Imputation Scheme
The problem of profit sharing is inescapable in virtually every joint venture. Since the sizes and earning potentials of the firms in a corporate joint venture may vary significantly, we consider the case when the venture agrees to share the excess of the total cooperative payoff over the sum of individual noncooperative payoffs proportional to the firms' noncooperative payoffs. The imputation scheme has to fulfill the following condition.
Condition 5.1 An imputation
$$\xi^{(t_0)i}(t_0, x_0) = V^{(t_0)i}(t_0, x_0^i) + \frac{V^{(t_0)i}(t_0, x_0^i)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0^j)} \left[ W^{(t_0)}(t_0, x_0) - \sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0^j) \right] = \frac{V^{(t_0)i}(t_0, x_0^i)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0^j)}\, W^{(t_0)}(t_0, x_0),$$
is assigned to firm $i$, for $i \in N$ at the outset and an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)}(\tau, x_\tau^*), \tag{5.10}$$
is assigned to firm $i$, for $i \in N$ at time $\tau \in (t_0, T]$.
The imputation in (5.10) satisfies
(i) $\xi^{(\tau)i}(\tau, x_\tau^*) \ge V^{(\tau)i}(\tau, x_\tau^{i*})$, for $i \in N$ and $\tau \in [t_0, T]$; and
(ii) $\sum_{j=1}^{n} \xi^{(\tau)j}(\tau, x_\tau^*) = W^{(\tau)}(\tau, x_\tau^*)$ for $\tau \in [t_0, T]$.
Hence the imputation vector $\xi^{(\tau)}(\tau, x_\tau^*)$ in (5.10) satisfies individual rationality and group optimality throughout the game horizon $[t_0, T]$. The solution to the optimality principle guiding the joint venture can then be expressed as
$$P(x_t^*, T - t) = \left\{ \psi^*(s, x^*(s)) \text{ and } B(s) \text{ for } s \in [t, T],\ \xi^{(t)}(t, x_t^*) \right\}, \quad \text{for } t \in [t_0, T],$$
where $\psi^*(s, x^*(s)) = \{\psi_1^*(s, x^*(s)), \psi_2^*(s, x^*(s)), \ldots, \psi_n^*(s, x^*(s))\}$ is the vector of the cooperative investment strategies maximizing joint profit, $\xi^{(t)}(t, x_t^*)$ is the imputation scheme satisfying Condition 5.1, and $B(s)$ is a profit distribution mechanism that will lead to the realization of Condition 5.1. All the participating firms in the joint venture will have no incentive to exit the venture if the agreed-upon optimality principle is maintained at every instant $t \in [t_0, T]$. A profit distribution mechanism that will lead to the realization of Condition 5.1 will be formulated in the next section.
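The proportional sharing rule in (5.10) and its two properties can be illustrated with a minimal sketch. The function name `proportional_imputation` and the numerical values of the noncooperative payoffs and the joint profit are hypothetical; they are not taken from the text and only stand in for $V^{(\tau)i}$ and $W^{(\tau)}$ evaluated at some instant $\tau$.

```python
def proportional_imputation(noncoop_payoffs, joint_payoff):
    """Split joint_payoff in proportion to the noncooperative payoffs, as in (5.10)."""
    total = sum(noncoop_payoffs)
    if total <= 0:
        raise ValueError("the sum of noncooperative payoffs must be positive")
    return [v / total * joint_payoff for v in noncoop_payoffs]


# Hypothetical values of V^(tau)i and W^(tau) at one instant tau (illustration only).
V = [40.0, 25.0, 15.0]
W = 100.0
xi = proportional_imputation(V, W)

# Property (ii): the shares add up to the joint profit.
group_optimal = abs(sum(xi) - W) < 1e-9
# Property (i): each firm receives at least its noncooperative payoff.
individually_rational = all(share >= v for share, v in zip(xi, V))
print(xi, group_optimal, individually_rational)
```

Property (i) holds in this sketch because the joint profit is at least the sum of the positive noncooperative payoffs; with a joint profit below that sum, the proportional rule would leave every firm worse off.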
5.2.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Distribution
To formulate a payoff distribution procedure (PDP) over time so that the agreed imputations satisfy Condition 5.1 we first obtain the following.
Lemma 5.1 A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$
$$B_i(\tau) = -\left[ \xi_t^{(\tau)i}(t, x_t^*)\big|_{t=\tau} \right] - \sum_{h=1}^{n} \left[ \xi_{x_t^{h*}}^{(\tau)i}(t, x_t^*)\big|_{t=\tau} \right] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)]$$
$$= -\frac{\partial}{\partial t} \left[ \frac{V^{(\tau)i}(t, x_t^{i*})}{\sum_{j=1}^{n} V^{(\tau)j}(t, x_t^{j*})}\, W^{(\tau)}(t, x_t^*) \right]_{t=\tau} - \sum_{h=1}^{n} \frac{\partial}{\partial x_\tau^{h*}} \left[ \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)}(\tau, x_\tau^*) \right] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)], \tag{5.11}$$
for $i \in N$, will lead to realization of the solution imputations $\xi^{(\tau)i}(\tau, x_\tau^*)$, for $i \in N$ and $\tau \in [t_0, T]$, satisfying Condition 5.1.
Proof Invoking Theorem 4.2 in Chap. 4, Lemma 5.1 follows.
A time (optimal-trajectory-subgame) consistent solution can be obtained as P (x0 , T − t0 ) = {u(s) and B(s) for s ∈ [t0 , T ] and ξ (t0 ) (t0 , x0 )} where (i) u(s) for s ∈ [t0 , T ] is the set of group optimal strategies ψ ∗ (s, xs∗ ) characterized in (5.6), and (ii) B(s) = {B1 (s), B2 (s), . . . , Bn (s)} for s ∈ [t0 , T ] is given as in (5.11). With firms using the cooperative investment strategies {ψi∗ (τ, xτ∗ ), for τ ∈ [t0 , T ] and i ∈ N}, the instantaneous receipt of firm i at time instant τ is
$$\zeta_i(\tau) = g^i[\tau, x_\tau^{i*}] - c_i^N[\psi_i^*(\tau, x_\tau^*)], \tag{5.12}$$
for $\tau \in [t_0, T]$ and $i \in N$. According to Lemma 5.1, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, T]$ and $i \in N$, as stated in (5.11). Hence an instantaneous transfer payment
$$\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau) \tag{5.13}$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in N$ and $\tau \in [t_0, T]$.
5.3 A Cost-Saving Joint Venture
Consider the case when there are three companies involved in a joint venture. The planning period is $[t_0, T]$. Company $i$'s profit is
$$\int_{t_0}^{T} \left\{ P_i [x^i(s)]^{1/2} - c_i^{\{i\}} u_i(s) \right\} \exp[-r(s - t_0)]\,ds + \exp[-r(T - t_0)]\, q_i [x^i(T)]^{1/2},$$
for $i \in \{1, 2, 3\}$, where $P_i$, $c_i^{\{i\}}$, and $q_i$ are positive constants, $r$ is the discount rate, $x^i(s) \subset R_+$ is the level of technology of company $i$ at time $s$, and $u_i(s) \subset R_+$ is its physical investment in technological advancement. The term $P_i [x^i(s)]^{1/2}$ reflects the net operating revenue of company $i$ at technology level $x^i(s)$, and $c_i^{\{i\}} u_i(s)$ is the cost of the investment. The salvage value of company $i$'s technology at time $T$ is given by $q_i [x^i(T)]^{1/2}$. The evolution of the technology level of company $i$ follows the dynamics
$$\dot{x}^i(s) = \alpha_i [u_i(s)\, x^i(s)]^{1/2} - \delta x^i(s), \quad \text{for } i \in \{1, 2, 3\}. \tag{5.14}$$
In the case when each of these three firms acts independently, and using Theorem A.1 in the Technical Appendixes, we obtain the Bellman equation as
$$-V_t^{(t_0)i}(t, x^i) = \max_{u_i} \left\{ \left[ P_i (x^i)^{1/2} - c_i^{\{i\}} u_i \right] \exp[-r(t - t_0)] + V_{x^i}^{(t_0)i}(t, x^i) \left[ \alpha_i (u_i x^i)^{1/2} - \delta x^i \right] \right\}, \tag{5.15}$$
$$V^{(t_0)i}(T, x^i) = \exp[-r(T - t_0)]\, q_i (x^i)^{1/2}, \quad \text{for } i \in \{1, 2, 3\}.$$
Performing the indicated maximization in (5.15) yields
$$u_i = \frac{\alpha_i^2}{4 (c_i^{\{i\}})^2} \left[ V_{x^i}^{(t_0)i}(t, x^i) \exp[r(t - t_0)] \right]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}.$$
Substituting $u_i$ into the Bellman equation yields
$$-V_t^{(t_0)i}(t, x^i) = P_i (x^i)^{1/2} \exp[-r(t - t_0)] - \frac{\alpha_i^2}{4 c_i^{\{i\}}} \left[ V_{x^i}^{(t_0)i}(t, x^i) \right]^2 \exp[r(t - t_0)]\, x^i + \frac{\alpha_i^2}{2 c_i^{\{i\}}} \left[ V_{x^i}^{(t_0)i}(t, x^i) \right]^2 \exp[r(t - t_0)]\, x^i - \delta V_{x^i}^{(t_0)i}(t, x^i)\, x^i,$$
for $i \in \{1, 2, 3\}$. Solving the above system of partial differential equations yields
$$V^{(t_0)i}(t, x^i) = \left[ A_i^{\{i\}}(t) (x^i)^{1/2} + C_i^{\{i\}}(t) \right] \exp[-r(t - t_0)], \quad \text{for } i \in \{1, 2, 3\}, \tag{5.16}$$
where
$$\dot{A}_i^{\{i\}}(t) = \left( r + \frac{\delta}{2} \right) A_i^{\{i\}}(t) - P_i, \qquad \dot{C}_i^{\{i\}}(t) = r C_i^{\{i\}}(t) - \frac{\alpha_i^2}{16 c_i^{\{i\}}} \left[ A_i^{\{i\}}(t) \right]^2,$$
$$A_i^{\{i\}}(T) = q_i \quad \text{and} \quad C_i^{\{i\}}(T) = 0. \tag{5.17}$$
The first equation in the block-recursive system in (5.17) is a first-order linear differential equation in $A_i^{\{i\}}(t)$ that can be solved independently by standard techniques. Substituting the solution of $A_i^{\{i\}}(t)$ into the second equation of (5.17) yields a first-order linear differential equation in $C_i^{\{i\}}(t)$. The solution of $C_i^{\{i\}}(t)$ can be readily obtained by standard techniques. Moreover, one can easily derive
$$V^{(\tau)i}(t, x^i) = \left[ A_i^{\{i\}}(t) (x^i)^{1/2} + C_i^{\{i\}}(t) \right] \exp[-r(t - \tau)],$$
for $i \in \{1, 2, 3\}$ and $\tau \in [t_0, T]$.
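The block-recursive system in (5.17) can be evaluated numerically along the following lines. This is a sketch under stated assumptions, not the authors' procedure: the closed form used for $A_i^{\{i\}}(t)$ is the standard solution of a linear equation with constant coefficients, $A(t) = P/k + (q - P/k)\exp[-k(T - t)]$ with $k = r + \delta/2$, and $C_i^{\{i\}}(t)$ is integrated backward from $C(T) = 0$ with a classical Runge-Kutta scheme. The function names and all parameter values are hypothetical.

```python
import math


def coeff_A(t, T, r, delta, P, q):
    """Closed-form solution of A'(t) = (r + delta/2) A(t) - P with A(T) = q."""
    k = r + delta / 2.0
    return P / k + (q - P / k) * math.exp(-k * (T - t))


def coeff_C(t, T, r, delta, P, q, alpha, c, steps=2000):
    """Integrate C'(s) = r C(s) - alpha^2 A(s)^2 / (16 c) backward from C(T) = 0 (RK4)."""
    def rhs(s, C):
        return r * C - alpha ** 2 * coeff_A(s, T, r, delta, P, q) ** 2 / (16.0 * c)

    h = (t - T) / steps  # negative step: march from T back to t
    s, C = T, 0.0
    for _ in range(steps):
        k1 = rhs(s, C)
        k2 = rhs(s + h / 2, C + h * k1 / 2)
        k3 = rhs(s + h / 2, C + h * k2 / 2)
        k4 = rhs(s + h, C + h * k3)
        C += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return C


def standalone_value(t, x, tau, T, r, delta, P, q, alpha, c):
    """V^(tau)i(t, x) = [A(t) x^(1/2) + C(t)] exp[-r (t - tau)]."""
    A = coeff_A(t, T, r, delta, P, q)
    C = coeff_C(t, T, r, delta, P, q, alpha, c)
    return (A * math.sqrt(x) + C) * math.exp(-r * (t - tau))


# Hypothetical parameter values, chosen only for illustration.
params = dict(T=5.0, r=0.05, delta=0.1, P=10.0, q=4.0, alpha=1.0, c=2.0)
print(standalone_value(0.0, 4.0, 0.0, **params))
```

Because the equation for $C_i^{\{i\}}(t)$ is linear with a forcing term proportional to $1/c_i^{\{i\}}$, halving the unit cost doubles $C_i^{\{i\}}(t)$, which gives a quick sanity check on any implementation.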
After characterizing the outcome when each of these three firms acts independently, we investigate the outcome when these firms form a joint venture.
5.3.1 Joint Venture Profit and Cost Saving
Consider the case when all three firms agree to form a joint venture and share their joint profit proportionally to their noncooperative profits. Cost-saving opportunities are created under the joint venture from joint R&D, administration, purchasing, financing, and economies of scale and scope. The cost of control of firm $j$ under the joint venture becomes $c_j^{\{1,2,3\}} u_j(s)$. With joint venture cost advantage
$$c_j^{\{1,2,3\}} \le c_j^{\{j\}}, \quad \text{for } j \in N. \tag{5.18}$$
The profit of the joint venture is the sum of the participating firms' profits
$$\int_{t_0}^{T} \sum_{j=1}^{3} \left\{ P_j [x^j(s)]^{1/2} - c_j^{\{1,2,3\}} u_j(s) \right\} \exp[-r(s - t_0)]\,ds + \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j [x^j(T)]^{1/2}. \tag{5.19}$$
The firms in the joint venture then act cooperatively to maximize (5.19) subject to (5.14). In particular, (5.14) and (5.19) become an optimization problem under the three firms' cost-saving joint venture. Using Theorem A.1 in the Technical Appendixes, we obtain the Bellman equation as
$$-W_t^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) = \max_{u_1, u_2, u_3} \left\{ \sum_{i=1}^{3} \left[ P_i (x^i)^{1/2} - c_i^{\{1,2,3\}} u_i \right] \exp[-r(t - t_0)] + \sum_{i=1}^{3} W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \left[ \alpha_i (u_i x^i)^{1/2} - \delta x^i \right] \right\}, \tag{5.20}$$
$$W^{(t_0)\{1,2,3\}}(T, x^1, x^2, x^3) = \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j (x^j)^{1/2}.$$
Performing the indicated maximization yields
$$u_i = \frac{\alpha_i^2}{4 (c_i^{\{1,2,3\}})^2} \left[ W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \exp[r(t - t_0)] \right]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.21}$$
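Substituting (5.21) into (5.20) yields
$$-W_t^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) = \sum_{i=1}^{3} \left\{ P_i (x^i)^{1/2} \exp[-r(t - t_0)] - \frac{\alpha_i^2}{4 c_i^{\{1,2,3\}}} \left[ W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \right]^2 \exp[r(t - t_0)]\, x^i \right\}$$
$$+ \sum_{i=1}^{3} \left\{ \frac{\alpha_i^2}{2 c_i^{\{1,2,3\}}} \left[ W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \right]^2 \exp[r(t - t_0)]\, x^i - \delta W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3)\, x^i \right\},$$
and
$$W^{(t_0)\{1,2,3\}}(T, x^1, x^2, x^3) = \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j (x^j)^{1/2}. \tag{5.22}$$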
Solving (5.22) yields
$$W^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) = \left[ A_1^{\{1,2,3\}}(t) (x^1)^{1/2} + A_2^{\{1,2,3\}}(t) (x^2)^{1/2} + A_3^{\{1,2,3\}}(t) (x^3)^{1/2} + C^{\{1,2,3\}}(t) \right] \exp[-r(t - t_0)], \tag{5.23}$$
where $A_1^{\{1,2,3\}}(t)$, $A_2^{\{1,2,3\}}(t)$, $A_3^{\{1,2,3\}}(t)$, and $C^{\{1,2,3\}}(t)$ satisfy
$$\dot{A}_i^{\{1,2,3\}}(t) = \left( r + \frac{\delta}{2} \right) A_i^{\{1,2,3\}}(t) - P_i, \quad \text{for } i, j, h \in \{1, 2, 3\} \text{ and } i \ne j \ne h,$$
$$\dot{C}^{\{1,2,3\}}(t) = r C^{\{1,2,3\}}(t) - \sum_{i=1}^{3} \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \left[ A_i^{\{1,2,3\}}(t) \right]^2, \tag{5.24}$$
$$A_i^{\{1,2,3\}}(T) = q_i \quad \text{for } i \in \{1, 2, 3\}, \quad \text{and} \quad C^{\{1,2,3\}}(T) = 0.$$
The first three equations in the block-recursive system in (5.24) are a system of three linear differential equations that can be solved explicitly by standard techniques. Upon solving $A_i^{\{1,2,3\}}(t)$ for $i \in \{1, 2, 3\}$ and substituting them into the fourth equation of (5.24), one has a linear differential equation in $C^{\{1,2,3\}}(t)$.
The investment strategies of the grand coalition joint venture can be derived as
$$\psi_i^{\{1,2,3\}}(t, x) = \frac{\alpha_i^2}{16 (c_i^{\{1,2,3\}})^2} \left[ A_i^{\{1,2,3\}}(t) \right]^2, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.25}$$
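The effect of the cost advantage in (5.18) on investment can be illustrated with a short sketch. It rests on two observations that follow from the equations above but are not stated in this form in the text: substituting (5.16) into the maximizer of (5.15) gives the stand-alone investment $\alpha_i^2 [A_i^{\{i\}}(t)]^2 / [16 (c_i^{\{i\}})^2]$, and the equations for $A_i^{\{i\}}(t)$ in (5.17) and $A_i^{\{1,2,3\}}(t)$ in (5.24) coincide, so the two coefficients are equal. The closed form for $A$ and all numbers below are assumptions made for illustration.

```python
import math


def coeff_A(t, T, r, delta, P, q):
    """A(t) solving A' = (r + delta/2) A - P with A(T) = q, as in (5.17) and (5.24)."""
    k = r + delta / 2.0
    return P / k + (q - P / k) * math.exp(-k * (T - t))


def investment(t, cost, T, r, delta, P, q, alpha):
    """Investment alpha^2 A(t)^2 / (16 cost^2), where cost is the unit cost of control."""
    A = coeff_A(t, T, r, delta, P, q)
    return alpha ** 2 * A ** 2 / (16.0 * cost ** 2)


# Hypothetical numbers: unit cost 2.0 when alone, 1.5 inside the joint venture.
common = dict(T=5.0, r=0.05, delta=0.1, P=10.0, q=4.0, alpha=1.0)
u_alone = investment(1.0, 2.0, **common)
u_joint = investment(1.0, 1.5, **common)
print(u_alone, u_joint, u_joint / u_alone)
```

Under these assumptions the ratio of joint to stand-alone investment equals the squared ratio of the unit costs, so a lower cost inside the venture raises investment at every date.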
The dynamics of the technological progress of the joint venture over the time interval $s \in [t_0, T]$ can be expressed as
$$\dot{x}^i(s) = \frac{\alpha_i^2}{4 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}}(s) [x^i(s)]^{1/2} - \delta x^i(s), \quad x^i(t_0) = x_0^i, \tag{5.26}$$
for $i \in \{1, 2, 3\}$. Applying the transformation $y^i(s) = [x^i(s)]^{1/2}$, for $i \in \{1, 2, 3\}$, the equation system in (5.26) can be expressed as
$$\dot{y}^i(s) = \frac{\alpha_i^2}{8 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}}(s) - \frac{\delta}{2} y^i(s), \quad y^i(t_0) = (x_0^i)^{1/2}, \tag{5.27}$$
for $i \in \{1, 2, 3\}$. Equation (5.27) is a system of linear differential equations that can be solved by standard techniques. Solving (5.27) yields the joint venture's state trajectory. Let $\{y^{1*}(t), y^{2*}(t), y^{3*}(t)\}$ denote the solution to (5.27). Transforming $x^i = (y^i)^2$, we obtain the state trajectories of the joint venture over the time interval $s \in [t_0, T]$ as
$$\{x^*(t)\}_{t=t_0}^{T} \equiv \{x^{1*}(t), x^{2*}(t), x^{3*}(t)\}_{t=t_0}^{T} = \left\{ [y^{1*}(t)]^2, [y^{2*}(t)]^2, [y^{3*}(t)]^2 \right\}_{t=t_0}^{T}. \tag{5.28}$$
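The transformation to (5.27) and back to (5.28) can be sketched numerically for a single firm. The code below is an illustration under assumptions: it uses the closed form $A(s) = P/k + (q - P/k)\exp[-k(T - s)]$, $k = r + \delta/2$, for the coefficient in (5.24), integrates (5.27) with a classical Runge-Kutta scheme, and squares the result. The function names and all parameter values are hypothetical.

```python
import math


def coeff_A(s, T, r, delta, P, q):
    """A(s) solving A' = (r + delta/2) A - P with A(T) = q."""
    k = r + delta / 2.0
    return P / k + (q - P / k) * math.exp(-k * (T - s))


def venture_trajectory(x0, t0, T, r, delta, P, q, alpha, c_joint, steps=1000):
    """Integrate (5.27) for y = x^(1/2) with RK4 and return (times, x values)."""
    def ydot(s, y):
        drift = alpha ** 2 * coeff_A(s, T, r, delta, P, q) / (8.0 * c_joint)
        return drift - 0.5 * delta * y

    h = (T - t0) / steps
    y = math.sqrt(x0)
    times, xs = [t0], [x0]
    for n in range(steps):
        s = t0 + n * h
        k1 = ydot(s, y)
        k2 = ydot(s + h / 2, y + h * k1 / 2)
        k3 = ydot(s + h / 2, y + h * k2 / 2)
        k4 = ydot(s + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        times.append(t0 + (n + 1) * h)
        xs.append(y * y)  # transform back: x = y^2
    return times, xs


# Hypothetical parameter values, chosen only for illustration.
times, xs = venture_trajectory(x0=4.0, t0=0.0, T=5.0, r=0.05, delta=0.1,
                               P=10.0, q=4.0, alpha=1.0, c_joint=1.5)
print(xs[0], xs[-1])
```

Working with $y$ rather than $x$ keeps the integrated equation linear, which is the reason the text introduces the transformation.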
Once again, we use the terms $x^{i*}(t)$ and $x_t^{i*}$ interchangeably.
Remark 5.1 One can readily verify that
$$W^{(t_0)\{1,2,3\}}(t, x^{1*}, x^{2*}, x^{3*}) = W^{(t)\{1,2,3\}}(t, x^{1*}, x^{2*}, x^{3*}) \exp[-r(t - t_0)].$$
5.3.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
Since the firms agree to share their joint profit proportionally to their noncooperative profits, the imputation scheme has to fulfill the following condition.
Condition 5.2 In the game $\Gamma_c(x_0, T - t_0)$, an imputation
$$\xi^{(t_0)i}(t_0, x_0) = \frac{V^{(t_0)i}(t_0, x_0^i)}{\sum_{j=1}^{3} V^{(t_0)j}(t_0, x_0^j)}\, W^{(t_0)\{1,2,3\}}(t_0, x_0^1, x_0^2, x_0^3),$$
is assigned to firm $i$, for $i \in \{1, 2, 3\}$, and in the subgame $\Gamma_c(x_\tau^*, T - \tau)$, for $\tau \in (t_0, T]$, an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}), \tag{5.29}$$
is assigned to firm $i$, for $i \in \{1, 2, 3\}$.
To formulate a payoff distribution procedure over time so that the agreed imputations in Condition 5.2 are satisfied we present the following proposition.
Proposition 5.1 A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$
$$B_i(\tau) = -\frac{\partial}{\partial t} \left[ \frac{V^{(\tau)i}(t, x_t^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(t, x_t^{j*})}\, W^{(\tau)\{1,2,3\}}(t, x_t^{1*}, x_t^{2*}, x_t^{3*}) \right]_{t=\tau}$$
$$- \sum_{\ell=1}^{3} \frac{\partial}{\partial x_\tau^{\ell*}} \left[ \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \right] \times \left[ \frac{\alpha_\ell^2}{4 c_\ell^{\{1,2,3\}}} A_\ell^{\{1,2,3\}}(\tau) (x_\tau^{\ell*})^{1/2} - \delta x_\tau^{\ell*} \right], \quad \text{for } i \in \{1, 2, 3\}, \tag{5.30}$$
would lead to the realization of the solution imputations $\xi^{(\tau)i}(\tau, x_\tau^*)$, for $i \in \{1, 2, 3\}$ and $\tau \in [t_0, T]$, satisfying Condition 5.2.
Proof Invoking Lemma 5.1, one obtains an equation similar to (5.11) with $i \in \{1, 2, 3\}$ and
$$f^\ell[\tau, x_\tau^{\ell*}, \psi_\ell^*(\tau, x_\tau^*)] = \frac{\alpha_\ell^2}{4 c_\ell^{\{1,2,3\}}} A_\ell^{\{1,2,3\}}(\tau) (x_\tau^{\ell*})^{1/2} - \delta x_\tau^{\ell*}.$$
Hence Proposition 5.1 follows.
In particular, at $t = \tau$ the expressions in (5.16) and (5.23) with starting time $\tau$ give
$$V^{(\tau)j}(\tau, x_\tau^{j*}) = A_j^{\{j\}}(\tau) (x_\tau^{j*})^{1/2} + C_j^{\{j\}}(\tau), \qquad W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) = \sum_{j=1}^{3} A_j^{\{1,2,3\}}(\tau) (x_\tau^{j*})^{1/2} + C^{\{1,2,3\}}(\tau),$$
and, writing $V^{(\tau)j}$ and $W^{(\tau)\{1,2,3\}}$ for these two quantities, from (5.17) and (5.24),
$$\frac{\partial}{\partial t} \left[ \frac{V^{(\tau)i}(t, x_t^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(t, x_t^{j*})}\, W^{(\tau)\{1,2,3\}}(t, x_t^{1*}, x_t^{2*}, x_t^{3*}) \right]_{t=\tau}$$
$$= W^{(\tau)\{1,2,3\}} \left\{ \frac{\dot{A}_i^{\{i\}}(\tau) (x_\tau^{i*})^{1/2} + \dot{C}_i^{\{i\}}(\tau) - r V^{(\tau)i}}{\sum_{j=1}^{3} V^{(\tau)j}} - \frac{V^{(\tau)i} \sum_{j=1}^{3} \left[ \dot{A}_j^{\{j\}}(\tau) (x_\tau^{j*})^{1/2} + \dot{C}_j^{\{j\}}(\tau) - r V^{(\tau)j} \right]}{\left[ \sum_{j=1}^{3} V^{(\tau)j} \right]^2} \right\}$$
$$+ \frac{V^{(\tau)i}}{\sum_{j=1}^{3} V^{(\tau)j}} \left\{ \sum_{j=1}^{3} \dot{A}_j^{\{1,2,3\}}(\tau) (x_\tau^{j*})^{1/2} + \dot{C}^{\{1,2,3\}}(\tau) - r W^{(\tau)\{1,2,3\}} \right\};$$
$$\frac{\partial}{\partial x_\tau^{i*}} \left[ \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \right]$$
$$= W^{(\tau)\{1,2,3\}} \left\{ \frac{\frac{1}{2} A_i^{\{i\}}(\tau) (x_\tau^{i*})^{-1/2}}{\sum_{j=1}^{3} V^{(\tau)j}} - \frac{V^{(\tau)i}\, \frac{1}{2} A_i^{\{i\}}(\tau) (x_\tau^{i*})^{-1/2}}{\left[ \sum_{j=1}^{3} V^{(\tau)j} \right]^2} \right\} + \frac{V^{(\tau)i}}{\sum_{j=1}^{3} V^{(\tau)j}}\, \frac{1}{2} A_i^{\{1,2,3\}}(\tau) (x_\tau^{i*})^{-1/2};$$
and
$$\frac{\partial}{\partial x_\tau^{h*}} \left[ \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \right]$$
$$= W^{(\tau)\{1,2,3\}} \left\{ - \frac{V^{(\tau)i}\, \frac{1}{2} A_h^{\{h\}}(\tau) (x_\tau^{h*})^{-1/2}}{\left[ \sum_{j=1}^{3} V^{(\tau)j} \right]^2} \right\} + \frac{V^{(\tau)i}}{\sum_{j=1}^{3} V^{(\tau)j}}\, \frac{1}{2} A_h^{\{1,2,3\}}(\tau) (x_\tau^{h*})^{-1/2},$$
for $h \ne i$.
A time (optimal-trajectory-subgame) consistent solution can be obtained as $P(x_0, T - t_0) = \{u(s)$ and $B(s)$ for $s \in [t_0, T]$ and $\xi^{(t_0)}(t_0, x_0)\}$ where (i) $u(s)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ in (5.25), and (ii) $B(s) = \{B_1(s), B_2(s), B_3(s)\}$ for $s \in [t_0, T]$ is given as in (5.30).
Using the cooperative strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ is
$$\zeta_i(\tau) = P_i (x_\tau^{i*})^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \left[ A_i^{\{1,2,3\}}(\tau) \right]^2, \tag{5.31}$$
for $\tau \in [t_0, T]$ and $i \in \{1, 2, 3\}$. According to Proposition 5.1, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, T]$ and $i \in \{1, 2, 3\}$, as stated in (5.30). Hence an instantaneous transfer payment
$$\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau), \tag{5.32}$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in \{1, 2, 3\}$ and $\tau \in [t_0, T]$.
5.4 A Shapley Value Solution to Joint Venture
Consider again the dynamic venture model in (5.1) and (5.2). Suppose that firms are allowed to form different coalitions consisting of a subset of companies $K \subseteq N$, with $k$ firms in the subset $K$. The participating firms in a coalition can gain core skills and technology from each other. In particular, they can obtain cost reductions, and with absolute joint venture cost advantage
$$c_j^K[u_j(s)] \le c_j^L[u_j(s)], \quad \text{for } j \in L \subseteq K, \tag{5.33}$$
where $c_j^K[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $K$ and $c_j^L[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $L$. Moreover, marginal cost advantages lead to
$$\partial c_j^K[u_j(s)]/\partial u_j(s) \le \partial c_j^L[u_j(s)]/\partial u_j(s), \quad \text{for } j \in L \subseteq K. \tag{5.34}$$
At time $t_0$, the profit to the joint venture $K$ becomes
$$\int_{t_0}^{T} \sum_{j \in K} \left\{ g^j[s, x^j(s)] - c_j^K[u_j(s)] \right\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \sum_{j \in K} \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] q^j(x^j(T)), \quad \text{for } K \subseteq N. \tag{5.35}$$
To compute the profit of the joint venture K we have to consider the optimal control problem [K; t0 , x0K ] which maximizes the joint venture profit in (5.35) subject to the technology accumulation dynamics in (5.1). Invoking Bellman’s technique of dynamic programming in Sect. A.1 of the Technical Appendixes, the solution to the optimal control problem [K; t0 , x0K ] can be characterized as follows.
Corollary 5.2 A set of controls $\{\psi_i^{K*}(t, x^K)$, for $i \in K$ and $t \in [t_0, T]\}$, provides an optimal solution to the control problem $[K; t_0, x_0^K]$ if there exists a continuously differentiable function $W^{(t_0)K}(t, x^K): [t_0, T] \times R^k \to R$ satisfying the following Bellman equation:
$$-W_t^{(t_0)K}(t, x^K) = \max_{u_K} \left\{ \sum_{j \in K} \left[ g^j(t, x^j) - c_j^K(u_j) \right] \exp\left[-\int_{t_0}^{t} r(y)\,dy\right] + \sum_{j \in K} W_{x^j}^{(t_0)K}(t, x^K)\, f^j(t, x^j, u_j) \right\},$$
$$W^{(t_0)K}(T, x^K) = \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \sum_{j \in K} q^j(x^j). \tag{5.36}$$
Following Corollary 5.2, one can characterize the maximized payoff $W^{(\tau)K}(t, x^K)$ to the optimal control problem $[K; \tau, x_\tau^K]$ which maximizes
$$\int_{\tau}^{T} \sum_{j \in K} \left\{ g^j[s, x^j(s)] - c_j^K[u_j(s)] \right\} \exp\left[-\int_{\tau}^{s} r(y)\,dy\right] ds + \sum_{j \in K} \exp\left[-\int_{\tau}^{T} r(y)\,dy\right] q^j(x^j(T)), \tag{5.37}$$
subject to
$$\dot{x}^j(s) = f^j[s, x^j(s), u_j(s)], \quad x^j(\tau) = x_\tau^j, \quad \text{for } j \in K. \tag{5.38}$$
Invoking Remark 3.1 of Chap. 3, one can readily obtain
$$W^{(t_0)K}(t, x^K) = W^{(\tau)K}(t, x^K) \exp\left[-\int_{t_0}^{\tau} r(y)\,dy\right],$$
for $\tau \in [t_0, T]$ and $t \in [\tau, T)$.
Now consider the case of a grand coalition $N$ in which all the $n$ firms are in the coalition. In the grand coalition, firms will adopt the cooperative control $\{\psi_i^{N*}(t, x^N)$, for $i \in N$ and $t \in [t_0, T]\}$, to obtain the maximized level of joint profit. The state dynamics of the grand coalition can be obtained as in (5.6) and the optimal trajectory $\{x^*(t)\}_{t=t_0}^{T}$ as in (5.7). Note that for group optimality to be achievable the cooperative investment strategies $\{\psi_i^{N*}(t, x^*(t))$, for $i \in N$ and $t \in [t_0, T]\}$, must be exercised throughout the time interval $[t_0, T]$. Along the cooperative control path $\{x^*(t)\}_{t=t_0}^{T}$ the total venture profit over the interval $[t, T]$, for $t \in [t_0, T)$, can be expressed as
$$W^{(t_0)N}(t, x_t^*) = \int_{t}^{T} \sum_{j=1}^{n} \left\{ g^j[s, x^{j*}(s)] - c_j^N[\psi_j^{N*}(s, x^*(s))] \right\} \exp\left[-\int_{t_0}^{s} r(y)\,dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\,dy\right] \sum_{j=1}^{n} q^j(x^{j*}(T)). \tag{5.39}$$
Moreover, the superadditivity of the coalition payoff can be demonstrated.
Proposition 5.2 The coalition profits $W^{(\tau)K}(t, x^K)$ are superadditive, that is,
$$W^{(\tau)K}(\tau, x^K) \ge W^{(\tau)L}(\tau, x^L) + W^{(\tau)K \setminus L}(\tau, x^{K \setminus L}), \quad \text{for } L \subset K \subseteq N,$$
where $K \setminus L$ is the relative complement of $L$ in $K$.
Proof See the Appendix of this chapter.
With joint profits under different venture coalitions characterized we proceed to consider the distribution of venture profits according to the Shapley Value (1953).
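The superadditivity property in Proposition 5.2 can be checked numerically once coalition profits are available. The following sketch assumes the profits at a given instant and state have already been computed and stored in a dictionary keyed by coalition; the function name `is_superadditive` and the numbers are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def is_superadditive(v, tol=1e-12):
    """Check v(K) >= v(L) + v(K minus L) for every nonempty proper subset L of K.

    v maps frozenset coalitions to profits and must contain every sub-coalition.
    """
    for K in v:
        members = sorted(K)
        for size in range(1, len(members)):
            for L in combinations(members, size):
                L = frozenset(L)
                if v[K] + tol < v[L] + v[K - L]:
                    return False
    return True


# Hypothetical coalition profits for three firms (illustration only).
v = {
    frozenset(): 0.0,
    frozenset({1}): 10.0, frozenset({2}): 20.0, frozenset({3}): 30.0,
    frozenset({1, 2}): 40.0, frozenset({1, 3}): 50.0, frozenset({2, 3}): 60.0,
    frozenset({1, 2, 3}): 90.0,
}
print(is_superadditive(v))
```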
5.4.1 Dynamic Shapley Value Imputation
Consider a joint venture involving $n$ firms. The member firms will maximize their joint profit and share their cooperative profits according to the Shapley Value (1953). The problem of profit sharing is inescapable in virtually every joint venture. The Shapley Value is one of the most commonly used sharing mechanisms in static cooperative games with transferable payoffs. Besides being individually rational and group rational, the Shapley Value is also unique. Specifically, the Shapley Value gives an imputation rule
$$\varphi^i(v) = \sum_{K \subseteq N} \frac{(k - 1)!(n - k)!}{n!} \left[ v(K) - v(K \setminus i) \right], \quad \text{for } i \in N, \tag{5.40}$$
where K\i is the relative complement of i in K, v(K) is the profit of coalition K, and [v(K) − v(K\i)] is the marginal contribution of firm i to the coalition K. Though the Shapley Value is used as the profit allocation mechanism, there exist two features that do not conform with the standard Shapley Value analysis. The first is that the present analysis is dynamic so that, instead of a one-time allocation of the Shapley Value, we have to consider the maintenance of the Shapley Value imputation over the joint venture horizon. The second is that the profit v(K) is the maximized profit to coalition K and is not a characteristic function (from the game in which coalition K is playing a zero-sum game against coalition N \K). Applications of the Shapley Value in cost allocation usually do not follow the characteristic function approach. Moreover, since profit maximization by coalition K is not affected by firms outside the coalition, the analysis does not have to adopt arbitrary assumptions like those in Petrosyan and Zaccour (2003) in which the left-out players are assumed to stick with their feedback Nash strategies in computing a nonstandard characteristic function. Consider the situation when the firms in the joint venture agree to adopt an optimality principle which (i) maximizes the joint venture profit, and (ii) shares the venture profit among participating firms according to the Shapley Value.
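The imputation rule in (5.40) can be sketched in a few lines for a small number of firms. The function name `shapley_value` and the coalition profits are hypothetical. The sum runs only over coalitions that contain firm $i$, since for any other coalition the marginal contribution $v(K) - v(K \setminus i)$ is zero.

```python
from itertools import combinations
from math import factorial


def shapley_value(players, v):
    """Shapley Value as in (5.40); v maps frozenset coalitions to profits."""
    n = len(players)
    phi = {}
    for i in players:
        total = 0.0
        for k in range(1, n + 1):
            weight = factorial(k - 1) * factorial(n - k) / factorial(n)
            for K in combinations(players, k):
                if i in K:
                    K = frozenset(K)
                    # marginal contribution of firm i to coalition K
                    total += weight * (v[K] - v[K - {i}])
        phi[i] = total
    return phi


# Hypothetical coalition profits for three firms (illustration only).
players = (1, 2, 3)
v = {
    frozenset(): 0.0,
    frozenset({1}): 10.0, frozenset({2}): 20.0, frozenset({3}): 30.0,
    frozenset({1, 2}): 40.0, frozenset({1, 3}): 50.0, frozenset({2, 3}): 60.0,
    frozenset({1, 2, 3}): 90.0,
}
phi = shapley_value(players, v)
print(phi)
```

In the dynamic setting of this section the same rule would be applied at each instant $\tau$ with $v(K)$ replaced by the maximized coalition profit $W^{(\tau)K}(\tau, x_\tau^{K*})$; computing those profits is outside the scope of this sketch.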
To maximize the joint venture's profits the firms will adopt the cooperative investment strategies $\{\psi_i^{N*}(t, x^*(t))$, for $i \in N$ and $t \in [t_0, T]\}$, and the corresponding cooperative investment path $\{x^*(t)\}_{t=t_0}^{T} \equiv \{x^{N*}(t)\}_{t=t_0}^{T}$ in (5.38) would result. To share the venture profit among participating firms according to the Shapley Value, the imputation has to satisfy the following condition.
Condition 5.3 In the game $\Gamma_c(x_0, T - t_0)$, an imputation
$$\xi^{(t_0)i}(t_0, x_0^N) = \sum_{K \subseteq N} \frac{(k - 1)!(n - k)!}{n!} \left[ W^{(t_0)K}(t_0, x_0^K) - W^{(t_0)K \setminus i}(t_0, x_0^{K \setminus i}) \right],$$
is assigned to firm $i$, for $i \in N$, and in the subgame $\Gamma_c(x_\tau^*, T - \tau)$, for $\tau \in (t_0, T]$, an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^{N*}) = \sum_{K \subseteq N} \frac{(k - 1)!(n - k)!}{n!} \left[ W^{(\tau)K}(\tau, x_\tau^{K*}) - W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \right], \tag{5.41}$$
is assigned to firm $i$, for $i \in N$.
Note that $\xi^{(\tau)}(\tau, x_\tau^{N*}) = [\xi^{(\tau)1}(\tau, x_\tau^{N*}), \xi^{(\tau)2}(\tau, x_\tau^{N*}), \ldots, \xi^{(\tau)n}(\tau, x_\tau^{N*})]$, as specified in (5.41), satisfies the basic properties of an imputation vector as follows:
(i) $\sum_{j=1}^{n} \xi^{(\tau)j}(\tau, x_\tau^{N*}) = W^{(\tau)N}(\tau, x_\tau^{N*})$, and
(ii) $\xi^{(\tau)i}(\tau, x_\tau^{N*}) \ge W^{(\tau)i}(\tau, x_\tau^{i*})$, for $i \in N$ and $\tau \in [t_0, T]$. (5.42)
Part (i) of (5.42) shows that ξ (τ ) (τ, xτN∗ ) satisfies the property of Pareto optimality throughout the game interval. Part (ii) demonstrates that ξ (τ ) (τ, xτN ∗ ) guarantees individual rationality throughout the game interval. Crucial to the analysis is the formulation of a profit distribution mechanism that would lead to the realization of Condition 5.3. This will be done in the next section.
5.4.2 The PDP for Shapley Value
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 5.3 we invoke Theorem 4.2 in Chap. 4 and obtain the following.
Lemma 5.2 A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$
$$B_i(\tau) = -\sum_{K \subseteq N} \frac{(k - 1)!(n - k)!}{n!} \Bigg\{ \left[ W_t^{(\tau)K}(t, x_t^{K*})\big|_{t=\tau} \right] - \left[ W_t^{(\tau)K \setminus i}(t, x_t^{K \setminus i*})\big|_{t=\tau} \right]$$
$$+ \sum_{h \in K} \frac{\partial}{\partial x_\tau^{h*}} \left[ W^{(\tau)K}(\tau, x_\tau^{K*}) \right] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)] - \sum_{h \in K \setminus i} \frac{\partial}{\partial x_\tau^{h*}} \left[ W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \right] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)] \Bigg\}, \tag{5.43}$$
for $i \in N$, will lead to the realization of the Shapley Value imputations $\xi^{(\tau)i}(\tau, x_\tau^{N*})$ in Condition 5.3.
Proof Invoking Theorem 4.2 in Chap. 4 one can obtain Lemma 5.2.
A time (optimal-trajectory-subgame) consistent solution can be obtained using the set of group optimal strategies ψ ∗ (s, xs∗ ) characterized in Corollary 5.2 and B(s) = {B1 (s), B2 (s), . . . , Bn (s)} in (5.43). With firms using the cooperative investment strategies {ψi∗ (τ, xτ∗ ), for τ ∈ [t0 , T ] and i ∈ N}, the instantaneous receipt of firm i at time instant τ is
$$\zeta_i(\tau) = g^i[\tau, x_\tau^{i*}] - c_i^N[\psi_i^*(\tau, x_\tau^*)], \quad \text{for } \tau \in [t_0, T] \text{ and } i \in N. \tag{5.44}$$
According to Lemma 5.2, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, T]$ and $i \in N$, as stated in (5.43). Hence an instantaneous transfer payment
$$\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau), \tag{5.45}$$
would be given or charged to firm $i$ at time $\tau$, for $i \in N$ and $\tau \in [t_0, T]$.
5.5 A Joint Venture with Shapley Value Profit Sharing
Consider the joint venture with technology spillovers in Sect. 5.3. In particular, the participating firms share their cooperative profits according to the Shapley Value (1953). In the case when each of these three firms acts independently, we follow the analysis in Sect. 5.3 and obtain
\[
W^{(t_0)i}\big(t, x^i\big) = \Big[ A_i^{\{i\}}(t)\big(x^i\big)^{1/2} + C_i^{\{i\}}(t) \Big] \exp[-r(t - t_0)], \quad \text{for } i \in \{1, 2, 3\}, \tag{5.46}
\]
where
\[
\begin{aligned}
\dot A_i^{\{i\}}(t) &= \Big(r + \frac{\delta}{2}\Big) A_i^{\{i\}}(t) - P_i, \\
\dot C_i^{\{i\}}(t) &= r C_i^{\{i\}}(t) - \frac{\alpha_i^2}{16 c_i^{\{i\}}} \big[A_i^{\{i\}}(t)\big]^2, \\
A_i^{\{i\}}(T) &= q_i, \quad \text{and} \quad C_i^{\{i\}}(T) = 0.
\end{aligned}
\tag{5.47}
\]
Moreover, one can easily derive, for $\tau \in [t_0, T]$,
\[
W^{(\tau)i}\big(t, x^i\big) = \Big[ A_i^{\{i\}}(t)\big(x^i\big)^{1/2} + C_i^{\{i\}}(t) \Big] \exp[-r(t - \tau)], \quad \text{for } i \in \{1, 2, 3\} \text{ and } \tau \in [t_0, T].
\]
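The terminal-value system in (5.47) can be integrated backward from $T$. The following is a minimal sketch, not the authors' procedure: it uses the closed form of the linear equation for $A_i^{\{i\}}(t)$ and a classical Runge-Kutta scheme for $C_i^{\{i\}}(t)$. The function name `solve_coefficients`, the grid size, and the numerical values in the usage note are illustrative assumptions.

```python
import math


def solve_coefficients(P, q, alpha, c, r, delta, t0, T, steps=2000):
    """Solve (5.47) backward from T; return (ts, A_vals, C_vals) on a uniform grid."""
    k = r + delta / 2.0
    a_bar = P / k  # stationary level of A

    def A(t):
        # Closed form of dA/dt = k*A - P with terminal condition A(T) = q.
        return a_bar + (q - a_bar) * math.exp(k * (t - T))

    def dC(t, C):
        # Right-hand side of dC/dt = r*C - alpha^2 / (16 c) * A(t)^2.
        return r * C - alpha ** 2 / (16.0 * c) * A(t) ** 2

    h = (T - t0) / steps
    t, C = T, 0.0  # terminal condition C(T) = 0
    ts, Cs = [t], [C]
    for _ in range(steps):
        # Classical Runge-Kutta step taken backward in time (step size -h).
        k1 = dC(t, C)
        k2 = dC(t - h / 2, C - h / 2 * k1)
        k3 = dC(t - h / 2, C - h / 2 * k2)
        k4 = dC(t - h, C - h * k3)
        C -= h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t -= h
        ts.append(t)
        Cs.append(C)
    ts.reverse()
    Cs.reverse()
    return ts, [A(s) for s in ts], Cs
```

For instance, `solve_coefficients(10.0, 2.0, 2.0, 2.0, 0.05, 0.01, 0.0, 2.0)` uses hypothetical values loosely patterned on firm 1 of Exercise 5.1; the value function at any grid point then follows from (5.46).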
5.5.1 Coalition Payoffs
Through knowledge diffusion, participating firms can gain core skills and technology that would be very difficult for them to obtain on their own. In particular, the cost savings in a joint venture are depicted as follows:
\[
\begin{aligned}
c_i^{\{i,j\}} &\le c_i^{\{i\}}, && \text{for } i, j \in \{1, 2, 3\} \text{ and } i \ne j, \\
c_i^{\{i,j,k\}} &\le c_i^{\{i,j\}}, && \text{for } i, j, k \in \{1, 2, 3\} \text{ and } i \ne j \ne k.
\end{aligned}
\tag{5.48}
\]
The firms in the joint venture maximize the sum of their profits
\[
\int_{t_0}^{T} \sum_{j=1}^{3} \Big[ P_j \big[x^j(s)\big]^{1/2} - c_j^{\{1,2,3\}} u_j(s) \Big] \exp[-r(s - t_0)]\, ds + \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j \big[x^j(T)\big]^{1/2}, \tag{5.49}
\]
subject to (5.48). Following the analysis in Sect. 5.3, one can obtain
\[
\begin{aligned}
W^{(t_0)\{1,2,3\}}\big(t, x^1, x^2, x^3\big) = \Big[ & A_1^{\{1,2,3\}}(t)\big(x^1\big)^{1/2} + A_2^{\{1,2,3\}}(t)\big(x^2\big)^{1/2} \\
& + A_3^{\{1,2,3\}}(t)\big(x^3\big)^{1/2} + C^{\{1,2,3\}}(t) \Big] \exp[-r(t - t_0)],
\end{aligned}
\tag{5.50}
\]
as in (5.24). The investment strategies of the grand coalition joint venture can be derived as in (5.25), and the dynamics of technological progress of the joint venture over the time interval $s \in [t_0, T]$ can be expressed as in (5.28).
Once again, we denote the state trajectories of the joint venture over the time interval $s \in [t_0, T]$ as $\{x^{1*}(t), x^{2*}(t), x^{3*}(t)\}_{t=t_0}^{T} \equiv \{x^*(t)\}_{t=t_0}^{T}$, and use the terms $x^{i*}(t)$ and $x_t^{i*}$ interchangeably. To compute the dynamic Shapley Value, we consider cases when two of the firms form a coalition $\{i, j\} \subset \{1, 2, 3\}$ to maximize joint profit
\[
\begin{aligned}
& \int_{t_0}^{T} \Big[ P_i \big[x^i(s)\big]^{1/2} - c_i^{\{i,j\}} u_i(s) + P_j \big[x^j(s)\big]^{1/2} - c_j^{\{i,j\}} u_j(s) \Big] \exp[-r(s - t_0)]\, ds \\
& \quad + \exp[-r(T - t_0)] \Big[ q_i \big[x^i(T)\big]^{1/2} + q_j \big[x^j(T)\big]^{1/2} \Big],
\end{aligned}
\tag{5.51}
\]
subject to
\[
\dot x^i(s) = \alpha_i \big[u_i(s)\, x^i(s)\big]^{1/2} - \delta x^i(s), \quad x^i(t_0) = x_0^i \in X^i, \tag{5.52}
\]
for $i, j \in \{1, 2, 3\}$ and $i \ne j$. Following the above analysis, we obtain the following value functions:
\[
W^{(t_0)\{i,j\}}\big(t, x^i, x^j\big) = \Big[ A_i^{\{i,j\}}(t)\big(x^i\big)^{1/2} + A_j^{\{i,j\}}(t)\big(x^j\big)^{1/2} + C^{\{i,j\}}(t) \Big] \exp[-r(t - t_0)], \tag{5.53}
\]
for $i, j \in \{1, 2, 3\}$ and $i \ne j$, where $A_i^{\{i,j\}}(t)$, $A_j^{\{i,j\}}(t)$, and $C^{\{i,j\}}(t)$ satisfy
\[
\begin{aligned}
\dot A_i^{\{i,j\}}(t) &= \Big(r + \frac{\delta}{2}\Big) A_i^{\{i,j\}}(t) - P_i, \quad \text{and} \quad A_i^{\{i,j\}}(T) = q_i, \quad \text{for } i, j \in \{1, 2, 3\} \text{ and } i \ne j; \\
\dot C^{\{i,j\}}(t) &= r C^{\{i,j\}}(t) - \sum_{h \in \{i,j\}} \frac{\alpha_h^2}{16 c_h^{\{i,j\}}} \big[A_h^{\{i,j\}}(t)\big]^2, \quad C^{\{i,j\}}(T) = 0.
\end{aligned}
\]
The block-recursive system above can be solved readily by standard techniques. Moreover, one can easily derive, for $\tau \in [t_0, T]$,
\[
W^{(t_0)\{i,j\}}\big(t, x^i, x^j\big) = \exp[-r(\tau - t_0)]\, W^{(\tau)\{i,j\}}\big(t, x^i, x^j\big), \quad \text{for } i, j \in \{1, 2, 3\} \text{ and } i \ne j.
\]
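5.5.2 PDP for Shapley Value
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 5.3, we present the following.
Proposition 5.3 A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$
\[
\begin{aligned}
B_i(\tau) = -\sum_{K \subseteq \{1,2,3\}} \frac{(k-1)!(3-k)!}{3!} \Bigg\{ & \Big[ W_t^{(\tau)K}\big(t, x_t^{K*}\big)\Big|_{t=\tau} \Big] - \Big[ W_t^{(\tau)K\setminus i}\big(t, x_t^{K\setminus i*}\big)\Big|_{t=\tau} \Big] \\
& + \sum_{h \in K} \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K}\big(\tau, x_\tau^{K*}\big) \bigg[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}}(\tau) \big(x_\tau^{h*}\big)^{1/2} - \delta x_\tau^{h*} \bigg] \\
& - \sum_{h \in K\setminus i} \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K\setminus i}\big(\tau, x_\tau^{K\setminus i*}\big) \bigg[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}}(\tau) \big(x_\tau^{h*}\big)^{1/2} - \delta x_\tau^{h*} \bigg] \Bigg\},
\end{aligned}
\tag{5.54}
\]
for $i \in \{1, 2, 3\}$, will lead to the realization of the Shapley Value imputations $\xi^{(\tau)i}(\tau, x_\tau^{N*})$ in Condition 5.3.
Proof Invoking Theorem 4.2 in Chap. 4 one can readily obtain Proposition 5.3. Since each $W^{(\tau)K}$ carries the discount factor $\exp[-r(t - \tau)]$, using (5.46), (5.50), and (5.53) gives
\[
W_t^{(\tau)i}\big(t, x_t^{i*}\big)\Big|_{t=\tau} = -r \Big[ A_i^{\{i\}}(\tau)\big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}}(\tau) \Big] + \Big[ \dot A_i^{\{i\}}(\tau)\big(x_\tau^{i*}\big)^{1/2} + \dot C_i^{\{i\}}(\tau) \Big],
\]
for $i \in \{1, 2, 3\}$;
\[
\begin{aligned}
W_t^{(\tau)\{i,j\}}\big(t, x_t^{i*}, x_t^{j*}\big)\Big|_{t=\tau} = {} & -r \Big[ A_i^{\{i,j\}}(\tau)\big(x_\tau^{i*}\big)^{1/2} + A_j^{\{i,j\}}(\tau)\big(x_\tau^{j*}\big)^{1/2} + C^{\{i,j\}}(\tau) \Big] \\
& + \Big[ \dot A_i^{\{i,j\}}(\tau)\big(x_\tau^{i*}\big)^{1/2} + \dot A_j^{\{i,j\}}(\tau)\big(x_\tau^{j*}\big)^{1/2} + \dot C^{\{i,j\}}(\tau) \Big],
\end{aligned}
\]
for $i, j \in \{1, 2, 3\}$ and $i \ne j$;
\[
\begin{aligned}
W_t^{(\tau)\{1,2,3\}}\big(t, x_t^{*}\big)\Big|_{t=\tau} = {} & -r \Big[ A_1^{\{1,2,3\}}(\tau)\big(x_\tau^{1*}\big)^{1/2} + A_2^{\{1,2,3\}}(\tau)\big(x_\tau^{2*}\big)^{1/2} + A_3^{\{1,2,3\}}(\tau)\big(x_\tau^{3*}\big)^{1/2} + C^{\{1,2,3\}}(\tau) \Big] \\
& + \Big[ \dot A_1^{\{1,2,3\}}(\tau)\big(x_\tau^{1*}\big)^{1/2} + \dot A_2^{\{1,2,3\}}(\tau)\big(x_\tau^{2*}\big)^{1/2} + \dot A_3^{\{1,2,3\}}(\tau)\big(x_\tau^{3*}\big)^{1/2} + \dot C^{\{1,2,3\}}(\tau) \Big];
\end{aligned}
\]
and
\[
\frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K}\big(\tau, x_\tau^{K*}\big) = \frac{1}{2} A_h^{K}(\tau) \big(x_\tau^{h*}\big)^{-1/2}, \quad \text{for } h \in K \subseteq \{1, 2, 3\}.
\]
A time (optimal-trajectory-subgame) consistent solution can be obtained using the set of group optimal strategies $\psi^*(s, x_s^*)$ and $B(s) = \{B_1(s), B_2(s), B_3(s)\}$ in (5.54).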
Finally, using the cooperative strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ is
\[
\zeta_i(\tau) = P_i \big(x_\tau^{i*}\big)^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \big[A_i^{\{1,2,3\}}(\tau)\big]^2, \tag{5.55}
\]
for $\tau \in [t_0, T]$ and $i \in \{1, 2, 3\}$. According to Proposition 5.3, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, T]$ and $i \in \{1, 2, 3\}$, as stated in (5.54). Hence an instantaneous transfer payment
\[
\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau) \tag{5.56}
\]
has to be given or charged to firm $i$ at time $\tau$, for $i \in \{1, 2, 3\}$ and $\tau \in [t_0, T]$.
5.6 Infinite-Horizon Analysis
Consider the case when the horizon of the analysis approaches infinity. The state dynamics of the $i$th firm is characterized by the set of vector-valued differential equations
\[
\dot x^i(s) = f^i\big[x^i(s), u_i(s)\big], \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N. \tag{5.57}
\]
The objective of firm $i$ to be maximized is
\[
\int_{t_0}^{\infty} \Big\{ g^i\big[x^i(s)\big] - c_i^{\{i\}}\big[u_i(s)\big] \Big\} \exp[-r(s - t_0)]\, ds, \tag{5.58}
\]
for $i \in N$. Consider the alternative formulation of (5.57) and (5.58) as
\[
\max_{u_i} \int_{t}^{\infty} \Big\{ g^i\big[x^i(s)\big] - c_i^{\{i\}}\big[u_i(s)\big] \Big\} \exp[-r(s - t)]\, ds, \quad \text{for } i \in N, \tag{5.59}
\]
subject to
\[
\dot x^i(s) = f^i\big[x^i(s), u_i(s)\big], \quad x^i(t) = x^i, \quad \text{for } i \in N. \tag{5.60}
\]
The infinite-horizon problem of the firm in (5.59) and (5.60) is independent of the choice of $t$ and dependent only upon the state at the starting time. Invoking Theorem 2.4 in Chap. 2, a noncooperative feedback Nash equilibrium solution can be characterized by a set of strategies $\{\phi_i^*(x^i)$, for $i \in N\}$, constituting a firm's equilibrium solution to the problem in (5.59) and (5.60), if there exist functionals $\hat V^i(x^i) : R^m \to R$ for $i \in N$, satisfying the following set of partial differential equations:
\[
r \hat V^i\big(x^i\big) = \max_{u_i} \Big\{ g^i\big[x^i\big] - c_i^{\{i\}}(u_i) + \hat V^i_{x^i}\big(x^i\big)\, f^i\big[x^i, u_i\big] \Big\}. \tag{5.61}
\]
5.6.1 Dynamic Joint Venture
Consider the case when all these $n$ companies form a joint venture. The cost of control of firm $j$ under the joint venture becomes $c_j^N(u_j)$, with the absolute joint venture cost advantage
\[
c_j^N(u_j) \le c_j^{\{j\}}(u_j), \quad \text{for } j \in N,
\]
and
\[
\partial c_j^N(u_j)/\partial u_j \le \partial c_j^{\{j\}}(u_j)/\partial u_j, \quad \text{for } j \in N. \tag{5.62}
\]
The joint venture would maximize the joint venture profit
\[
\int_{t}^{\infty} \sum_{j=1}^{n} \Big\{ g^j\big[x^j(s)\big] - c_j^N\big[u_j(s)\big] \Big\} \exp[-r(s - t)]\, ds, \tag{5.63}
\]
subject to (5.60). An optimal solution of the control problem in (5.60) and (5.63) can be characterized using Theorem A.2 in the Technical Appendixes as follows.
Corollary 5.3 A set of control strategies $\{\psi_i^*(x)$, for $i \in N\}$ provides a solution to the control problem in (5.57) and (5.63) if there exists a continuously differentiable function $W(x) : R^n \to R$ satisfying the following partial differential equation:
\[
r W(x) = \max_{u_1, u_2, \ldots, u_n} \Bigg\{ \sum_{j=1}^{n} \Big[ g^j\big[x^j\big] - c_j^N(u_j) \Big] + \sum_{j=1}^{n} W_{x^j}(x)\, f^j\big[x^j, u_j\big] \Bigg\}, \tag{5.64}
\]
where $x = \{x^1, x^2, \ldots, x^n\}$.
Hence the firms will adopt the cooperative control $\{\psi_i^*(x)$, for $i \in N\}$ to obtain the maximized level of joint profit. Substituting this set of controls into (5.57) yields the dynamics of technology advancement under cooperation as
\[
\dot x^i(s) = f^i\big[x^i(s), \psi_i^*\big(x(s)\big)\big], \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N. \tag{5.65}
\]
Let $x^*(t) = \{x^{1*}(t), x^{2*}(t), \ldots, x^{n*}(t)\}$ denote the solution to (5.65). The optimal trajectory $\{x^*(t)\}_{t=t_0}^{\infty}$ can be expressed as
\[
x^{i*}(t) = x_0^i + \int_{t_0}^{t} f^i\big[x^{i*}(s), \psi_i^*\big(x^*(s)\big)\big]\, ds, \quad \text{for } i \in N. \tag{5.66}
\]
For notational convenience, we use the terms $x^*(t)$ and $x_t^*$ interchangeably.
Substituting the optimal cooperative strategies $\{\psi_i^*(x)$, for $i \in N\}$ into (5.63) yields the venture profit as
\[
W\big(x_t^*\big) = \int_{t}^{\infty} \sum_{j=1}^{n} \Big\{ g^j\big[x^{j*}(s)\big] - c_j^N\big[\psi_j^*\big(x^*(s)\big)\big] \Big\} \exp[-r(s - t)]\, ds. \tag{5.67}
\]
5.6.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
Consider the case when the firms in the venture share the excess of the total cooperative payoff over the sum of individual noncooperative payoffs in proportion to the firms' noncooperative payoffs. The imputation scheme has to fulfill the following condition.
Condition 5.4 An imputation
\[
\xi^{(\tau)i}\big(\tau, x_\tau^*\big) = \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{n} \hat V^j\big(x_\tau^{j*}\big)}\, W\big(x_\tau^*\big) \tag{5.68}
\]
is assigned to firm $i$, for $i \in N$ at time $\tau \in [t_0, \infty)$.
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy Condition 5.4 we obtain the following.
Proposition 5.4 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
\[
B_i(\tau) = r\, \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{n} \hat V^j\big(x_\tau^{j*}\big)}\, W\big(x_\tau^*\big) - \sum_{h=1}^{n} \frac{\partial}{\partial x_\tau^{h*}} \Bigg[ \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{n} \hat V^j\big(x_\tau^{j*}\big)}\, W\big(x_\tau^*\big) \Bigg] f^h\big[x_\tau^{h*}, \psi_h^*\big(x_\tau^*\big)\big], \tag{5.69}
\]
for $i \in N$, will lead to the realization of the solution imputations in Condition 5.4.
Proof Invoking Theorem 4.3 in Chap. 4 one can obtain Proposition 5.4.
A time (optimal-trajectory-subgame) consistent solution can be obtained as $P(\tau, x_\tau) = \{u(s)$ and $B(s)$, for $s \in [\tau, \infty)$ and $\xi^{(\tau)}(\tau, x_\tau)\}$, with (i) $u(s)$ for $s \in [\tau, \infty)$ being the set of group optimal strategies $\psi^*(x_s^*)$ characterized in (5.64), and (ii) $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ in (5.69).
With firms using the cooperative investment strategies $\{\psi_i^*(x_\tau^*)$, for $i \in N\}$, the instantaneous receipt of firm $i$ at time instant $\tau$ is
\[
\zeta_i(\tau) = g^i\big[x_\tau^{i*}\big] - c_i^N\big[\psi_i^*\big(x_\tau^*\big)\big], \quad \text{for } \tau \in [t_0, \infty) \text{ and } i \in N.
\]
According to Proposition 5.4, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, \infty)$ and $i \in N$, as stated in (5.69). Hence an instantaneous transfer payment
\[
\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau)
\]
has to be given or charged to firm $i$ at time $\tau$, for $i \in N$.
5.6.3 Shapley Value Profit Sharing
Consider again the infinite-horizon dynamic venture model in (5.59) and (5.60). The member firms maximize their joint profit and share their cooperative profits according to the Shapley Value. Suppose firms are allowed to form different coalitions, each consisting of a subset of companies $K \subseteq N$ with $k$ firms. In particular, under the coalition they can obtain cost reductions, with the absolute joint venture cost advantage
\[
c_j^K\big[u_j(s)\big] \le c_j^L\big[u_j(s)\big], \quad \text{for } j \in L \subseteq K, \tag{5.70}
\]
where $c_j^K[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $K$ and $c_j^L[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $L$. Moreover, marginal cost advantages lead to
\[
\partial c_j^K\big[u_j(s)\big]/\partial u_j(s) \le \partial c_j^L\big[u_j(s)\big]/\partial u_j(s), \quad \text{for } j \in L \subseteq K.
\]
The profit to the joint venture $K$ becomes
\[
\int_{t}^{\infty} \sum_{j \in K} \Big\{ g^j\big[x^j(s)\big] - c_j^K\big[u_j(s)\big] \Big\} \exp[-r(s - t)]\, ds, \tag{5.71}
\]
for $K \subseteq N$. To compute the profit of the joint venture $K$ we have to consider the optimal control problem in (5.70) and (5.71). Invoking Bellman's technique of dynamic programming as in Theorem A.2 of the Technical Appendixes, the solution to the optimal control problem can be characterized as follows.
Corollary 5.4 A set of controls $\{\psi_i^{K*}(x^K)$, for $i \in K$ and $t \in [t_0, \infty)\}$ provides an optimal solution to the control problem in (5.57) and (5.71) if there exists a continuously differentiable function $W^K(x^K) : R^k \to R$ satisfying the following Bellman equation:
\[
r W^K\big(x^K\big) = \max_{u^K} \Bigg\{ \sum_{j \in K} \Big[ g^j\big[x^j\big] - c_j^K(u_j) \Big] + \sum_{j \in K} W^K_{x^j}\big(x^K\big)\, f^j\big[x^j, u_j\big] \Bigg\}. \tag{5.72}
\]
Now consider the case of a grand coalition $N$ in which all the $n$ firms are in the coalition. Using the result in Corollary 5.3, the cooperative state trajectory can be obtained as in (5.66). To share the venture profit among participating firms according to the Shapley Value the imputation has to satisfy the following condition.
Condition 5.5 An imputation
\[
\xi^{(\tau)i}\big(\tau, x_\tau^{N*}\big) = \sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \Big[ W^K\big(x_\tau^{K*}\big) - W^{K\setminus i}\big(x_\tau^{K\setminus i*}\big) \Big] \tag{5.73}
\]
is assigned to firm $i$, for $i \in N$ at time $\tau$ when the state is $x_\tau^*$.
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 5.5 we obtain the following.
Proposition 5.5 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
\[
\begin{aligned}
B_i(\tau) = -\sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \Bigg\{ & \Big[ r W^{K\setminus i}\big(x_\tau^{K\setminus i*}\big) - r W^K\big(x_\tau^{K*}\big) \Big] \\
& + \sum_{h \in K} \frac{\partial}{\partial x_\tau^{h*}} W^K\big(x_\tau^{K*}\big)\, f^h\big[x_\tau^{h*}, \psi_h^*\big(x_\tau^*\big)\big] \\
& - \sum_{h \in K\setminus i} \frac{\partial}{\partial x_\tau^{h*}} W^{K\setminus i}\big(x_\tau^{K\setminus i*}\big)\, f^h\big[x_\tau^{h*}, \psi_h^*\big(x_\tau^*\big)\big] \Bigg\}, \quad \text{for } i \in N,
\end{aligned}
\tag{5.74}
\]
will lead to the realization of the Shapley Value in Condition 5.5.
Proof Invoking Theorem 4.3 in Chap. 4 one can readily obtain Proposition 5.5.
A time (optimal-trajectory-subgame) consistent solution can be obtained with the group optimal strategies $\psi^*(x_s^*)$ characterized in (5.72) and $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ in (5.74).
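As a worked illustration of the weights in Condition 5.5, take $n = 3$ and firm 1. Terms with $1 \notin K$ vanish because then $K\setminus 1 = K$, so only the four coalitions containing firm 1 contribute. The convention $W^{\emptyset} = 0$ is an assumption made here for the illustration.

```latex
\[
\frac{(k-1)!\,(3-k)!}{3!} =
\begin{cases}
\dfrac{0!\,2!}{3!} = \dfrac{1}{3}, & k = 1,\\[2mm]
\dfrac{1!\,1!}{3!} = \dfrac{1}{6}, & k = 2,\\[2mm]
\dfrac{2!\,0!}{3!} = \dfrac{1}{3}, & k = 3,
\end{cases}
\]
\[
\xi^{(\tau)1}
= \tfrac{1}{3}\, W^{\{1\}}
+ \tfrac{1}{6}\big[ W^{\{1,2\}} - W^{\{2\}} \big]
+ \tfrac{1}{6}\big[ W^{\{1,3\}} - W^{\{3\}} \big]
+ \tfrac{1}{3}\big[ W^{\{1,2,3\}} - W^{\{2,3\}} \big],
\qquad
\tfrac{1}{3} + \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{1}{3} = 1 .
\]
```

Each value function is evaluated along the cooperative path, for example $W^{\{1,2\}} = W^{\{1,2\}}(x_\tau^{1*}, x_\tau^{2*})$.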
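5.7 An Infinite-Horizon Joint Venture
Consider the infinite-horizon version of the three-company joint venture in Sect. 5.3. The planning period is $[t_0, \infty)$. Company $i$'s profit is
\[
\int_{t_0}^{\infty} \Big[ P_i \big[x^i(s)\big]^{1/2} - c_i^{\{i\}} u_i(s) \Big] \exp[-r(s - t_0)]\, ds, \tag{5.75}
\]
for $i \in N = \{1, 2, 3\}$. The evolution of the technology level of company $i$ follows the dynamics
\[
\dot x^i(s) = \alpha_i \big[u_i(s)\, x^i(s)\big]^{1/2} - \delta x^i(s), \quad x^i(t_0) = x_0^i \in X^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.76}
\]
In the case when each of these three firms acts independently, using Theorem A.2 in the Technical Appendixes, we obtain the Bellman equation as
\[
r V^i\big(x^i\big) = \max_{u_i} \Big\{ P_i \big(x^i\big)^{1/2} - c_i^{\{i\}} u_i + V^i_{x^i}\big(x^i\big) \Big[ \alpha_i \big(u_i x^i\big)^{1/2} - \delta x^i \Big] \Big\}, \tag{5.77}
\]
for $i \in \{1, 2, 3\}$. Performing the indicated maximization yields
\[
u_i = \frac{\alpha_i^2}{4 \big(c_i^{\{i\}}\big)^2} \Big[ V^i_{x^i}\big(x^i\big) \Big]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}.
\]
Substituting $u_i$ into the Bellman equation yields
\[
r V^i\big(x^i\big) = P_i \big(x^i\big)^{1/2} - \frac{\alpha_i^2}{4 c_i^{\{i\}}}\, x^i \Big[ V^i_{x^i}\big(x^i\big) \Big]^2 + \frac{\alpha_i^2}{2 c_i^{\{i\}}} \Big[ V^i_{x^i}\big(x^i\big) \Big]^2 x^i - \delta V^i_{x^i}\big(x^i\big)\, x^i, \quad \text{for } i \in \{1, 2, 3\}.
\]
Solving the above system of partial differential equations yields
\[
V^i\big(x^i\big) = A_i^{\{i\}} \big(x^i\big)^{1/2} + C_i^{\{i\}},
\]
where
\[
0 = \Big(r + \frac{\delta}{2}\Big) A_i^{\{i\}} - P_i, \qquad r C_i^{\{i\}} = \frac{\alpha_i^2}{16 c_i^{\{i\}}} \big[A_i^{\{i\}}\big]^2, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.78}
\]
After obtaining the noncooperative outcome we move on to consider the formation of a joint venture with these three firms.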
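The closed form in (5.78) can be checked against the maximized Bellman equation (5.77) numerically. The sketch below is only an illustrative check; the function names `noncoop_coefficients` and `bellman_residual` and all parameter values in the usage note are assumptions, not part of the text.

```python
def noncoop_coefficients(P, alpha, c, r, delta):
    """Coefficients of V(x) = A * x**0.5 + C from (5.78)."""
    A = P / (r + delta / 2.0)
    C = alpha ** 2 * A ** 2 / (16.0 * c * r)
    return A, C


def bellman_residual(x, P, alpha, c, r, delta):
    """Left side minus right side of (5.77) at state x, using the maximizing control."""
    A, C = noncoop_coefficients(P, alpha, c, r, delta)
    V = A * x ** 0.5 + C
    Vx = 0.5 * A * x ** -0.5                      # derivative of V with respect to x
    u = alpha ** 2 / (4.0 * c ** 2) * Vx ** 2 * x  # maximizing investment
    rhs = P * x ** 0.5 - c * u + Vx * (alpha * (u * x) ** 0.5 - delta * x)
    return r * V - rhs
```

With hypothetical values such as `bellman_residual(20.0, 10.0, 2.0, 2.0, 0.05, 0.01)`, the residual should vanish up to floating-point rounding at any positive state, since the maximizing control reduces to the constant $\alpha_i^2 [A_i^{\{i\}}]^2 / (16 (c_i^{\{i\}})^2)$.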
5.7.1 Joint Venture and Costs
Consider the case when all three firms agree to form a joint venture and share their joint profit in proportion to their noncooperative profits. With joint venture cost advantage
\[
c_j^{\{1,2,3\}} \le c_j^{\{j\}}, \quad \text{for } j \in N, \tag{5.79}
\]
the profit of the joint venture is the sum of the participating firms' profits
\[
\int_{t_0}^{\infty} \sum_{j=1}^{3} \Big[ P_j \big[x^j(s)\big]^{1/2} - c_j^{\{1,2,3\}} u_j(s) \Big] \exp[-r(s - t_0)]\, ds. \tag{5.80}
\]
The firms in the joint venture then act cooperatively to maximize (5.80) subject to (5.76). Using Theorem A.2 in the Technical Appendixes, we obtain the Bellman equation as
\[
\begin{aligned}
r W^{\{1,2,3\}}\big(x^1, x^2, x^3\big) = \max_{u_1, u_2, u_3} \Bigg\{ & \sum_{i=1}^{3} \Big[ P_i \big(x^i\big)^{1/2} - c_i^{\{1,2,3\}} u_i \Big] \\
& + \sum_{i=1}^{3} W^{\{1,2,3\}}_{x^i}\big(x^1, x^2, x^3\big) \Big[ \alpha_i \big(u_i x^i\big)^{1/2} - \delta x^i \Big] \Bigg\}.
\end{aligned}
\tag{5.81}
\]
Performing the indicated maximization yields
\[
u_i = \frac{\alpha_i^2}{4 \big(c_i^{\{1,2,3\}}\big)^2} \Big[ W^{\{1,2,3\}}_{x^i}\big(x^1, x^2, x^3\big) \Big]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.82}
\]
Substituting (5.82) into (5.81) yields
\[
\begin{aligned}
r W^{\{1,2,3\}}\big(x^1, x^2, x^3\big) = {} & \sum_{i=1}^{3} \bigg[ P_i \big(x^i\big)^{1/2} - \frac{\alpha_i^2 x^i}{4 c_i^{\{1,2,3\}}} \Big[ W^{\{1,2,3\}}_{x^i}\big(x^1, x^2, x^3\big) \Big]^2 \bigg] \\
& + \sum_{i=1}^{3} W^{\{1,2,3\}}_{x^i}\big(x^1, x^2, x^3\big) \bigg[ \frac{\alpha_i^2}{2 c_i^{\{1,2,3\}}} W^{\{1,2,3\}}_{x^i}\big(x^1, x^2, x^3\big)\, x^i - \delta x^i \bigg].
\end{aligned}
\tag{5.83}
\]
Solving (5.83) yields
\[
W^{\{1,2,3\}}\big(x^1, x^2, x^3\big) = A_1^{\{1,2,3\}} \big(x^1\big)^{1/2} + A_2^{\{1,2,3\}} \big(x^2\big)^{1/2} + A_3^{\{1,2,3\}} \big(x^3\big)^{1/2} + C^{\{1,2,3\}}, \tag{5.84}
\]
where $A_1^{\{1,2,3\}}$, $A_2^{\{1,2,3\}}$, $A_3^{\{1,2,3\}}$, and $C^{\{1,2,3\}}$ satisfy
\[
0 = \Big(r + \frac{\delta}{2}\Big) A_i^{\{1,2,3\}} - P_i, \quad \text{for } i \in \{1, 2, 3\}, \qquad r C^{\{1,2,3\}} = \sum_{i=1}^{3} \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \big[A_i^{\{1,2,3\}}\big]^2. \tag{5.85}
\]
The investment strategies of the grand coalition joint venture can be derived as
\[
\psi_i^{\{1,2,3\}}(x) = \frac{\alpha_i^2}{16 \big(c_i^{\{1,2,3\}}\big)^2} \big[A_i^{\{1,2,3\}}\big]^2, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.86}
\]
The dynamics of the technological progress of the joint venture over the time interval $s \in [t_0, \infty)$ can be expressed as
\[
\dot x^i(s) = \frac{\alpha_i^2}{4 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}} \big[x^i(s)\big]^{1/2} - \delta x^i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.87}
\]
Applying the transformation $y^i(s) = [x^i(s)]^{1/2}$, for $i \in \{1, 2, 3\}$, the equation system in (5.87) can be expressed as
\[
\dot y^i(s) = \frac{\alpha_i^2}{8 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}} - \frac{\delta}{2}\, y^i(s), \quad y^i(t_0) = \big(x_0^i\big)^{1/2}, \quad \text{for } i \in \{1, 2, 3\}. \tag{5.88}
\]
Equation (5.88) is a system of linear differential equations that can be solved by standard techniques. Solving (5.88) yields the joint venture's state trajectory. Let $\{y^{1*}(t), y^{2*}(t), y^{3*}(t)\}$ denote the solution to (5.88). Transforming $x^i = (y^i)^2$, we obtain the state trajectories of the joint venture over the time interval $s \in [t_0, \infty)$ as
\[
\big\{x^*(t)\big\}_{t=t_0}^{\infty} \equiv \big\{x^{1*}(t), x^{2*}(t), x^{3*}(t)\big\}_{t=t_0}^{\infty} = \Big\{ \big[y^{1*}(t)\big]^2, \big[y^{2*}(t)\big]^2, \big[y^{3*}(t)\big]^2 \Big\}_{t=t_0}^{\infty}. \tag{5.89}
\]
Once again, we use the terms $x^{i*}(t)$ and $x_t^{i*}$ interchangeably.
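Because (5.88) is linear with constant coefficients, its solution can be written explicitly as $y^i(t) = \bar y^i + (y^i(t_0) - \bar y^i)\exp[-\delta (t - t_0)/2]$ with $\bar y^i = \alpha_i^2 A_i^{\{1,2,3\}} / (4 c_i^{\{1,2,3\}} \delta)$. The following sketch evaluates the resulting state path; the function name `state_trajectory` and the numbers in the usage note are illustrative assumptions.

```python
import math


def state_trajectory(t, t0, x0, alpha, c, A, delta):
    """Cooperative technology level x(t) obtained from the linear equation (5.88)."""
    y_bar = alpha ** 2 * A / (4.0 * c * delta)   # stationary level of y = x**0.5
    y = y_bar + (math.sqrt(x0) - y_bar) * math.exp(-0.5 * delta * (t - t0))
    return y * y                                  # transform back: x = y**2
```

For example, with the hypothetical values $\alpha = 2$, $c = 1$, $r = 0.05$, $\delta = 0.02$, and $P = 10$, one has $A = P/(r + \delta/2)$ and can call `state_trajectory(1.0, 0.0, 20.0, 2.0, 1.0, 10.0 / 0.06, 0.02)`.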
5.7.2 Time (Optimal-Trajectory-Subgame) Consistent Venture Profit Sharing
If the firms agree to share their joint profit in proportion to their noncooperative profits, the imputation scheme has to fulfill the following condition.
Condition 5.6 An imputation
\[
\xi^{(\tau)i}\big(\tau, x_\tau^*\big) = \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}}\big(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}\big) \tag{5.90}
\]
is assigned to firm $i$, for $i \in \{1, 2, 3\}$ at time $\tau$ when the state is $x_\tau^*$.
To formulate a payoff distribution procedure over time so that the agreed imputations in Condition 5.6 are satisfied we obtain the following.
Proposition 5.6 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
\[
\begin{aligned}
B_i(\tau) = {} & r\, \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}}\big(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}\big) \\
& - \sum_{h=1}^{3} \frac{\partial}{\partial x_\tau^{h*}} \Bigg[ \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}}\big(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}\big) \Bigg] \times \bigg[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} \big(x_\tau^{h*}\big)^{1/2} - \delta x_\tau^{h*} \bigg],
\end{aligned}
\tag{5.91}
\]
for $i \in \{1, 2, 3\}$, will lead to the realization of the imputation in Condition 5.6.
Proof Invoking Proposition 5.4 one can obtain Proposition 5.6. In particular, writing $S(x_\tau^*) = \sum_{j=1}^{3} \big[ A_j^{\{j\}} (x_\tau^{j*})^{1/2} + C_j^{\{j\}} \big]$ for the sum of the noncooperative payoffs,
\[
\frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}}\big(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}\big) = \frac{A_i^{\{i\}} \big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}}}{S(x_\tau^*)} \times \Big[ A_1^{\{1,2,3\}} \big(x_\tau^{1*}\big)^{1/2} + A_2^{\{1,2,3\}} \big(x_\tau^{2*}\big)^{1/2} + A_3^{\{1,2,3\}} \big(x_\tau^{3*}\big)^{1/2} + C^{\{1,2,3\}} \Big];
\]
\[
\begin{aligned}
\frac{\partial}{\partial x_\tau^{i*}} \Bigg[ \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}} \Bigg] = {} & \Big[ A_1^{\{1,2,3\}} \big(x_\tau^{1*}\big)^{1/2} + A_2^{\{1,2,3\}} \big(x_\tau^{2*}\big)^{1/2} + A_3^{\{1,2,3\}} \big(x_\tau^{3*}\big)^{1/2} + C^{\{1,2,3\}} \Big] \\
& \times \Bigg[ \frac{S(x_\tau^*)\, \tfrac{1}{2} A_i^{\{i\}} \big(x_\tau^{i*}\big)^{-1/2}}{S(x_\tau^*)^2} - \frac{\big[ A_i^{\{i\}} \big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}} \big] \tfrac{1}{2} A_i^{\{i\}} \big(x_\tau^{i*}\big)^{-1/2}}{S(x_\tau^*)^2} \Bigg] \\
& + \frac{A_i^{\{i\}} \big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}}}{S(x_\tau^*)}\, \frac{1}{2} A_i^{\{1,2,3\}} \big(x_\tau^{i*}\big)^{-1/2};
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{\partial}{\partial x_\tau^{h*}} \Bigg[ \frac{\hat V^i\big(x_\tau^{i*}\big)}{\sum_{j=1}^{3} \hat V^j\big(x_\tau^{j*}\big)}\, W^{\{1,2,3\}} \Bigg] = {} & \Big[ A_1^{\{1,2,3\}} \big(x_\tau^{1*}\big)^{1/2} + A_2^{\{1,2,3\}} \big(x_\tau^{2*}\big)^{1/2} + A_3^{\{1,2,3\}} \big(x_\tau^{3*}\big)^{1/2} + C^{\{1,2,3\}} \Big] \\
& \times \Bigg[ - \frac{\big[ A_i^{\{i\}} \big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}} \big] \tfrac{1}{2} A_h^{\{h\}} \big(x_\tau^{h*}\big)^{-1/2}}{S(x_\tau^*)^2} \Bigg] \\
& + \frac{A_i^{\{i\}} \big(x_\tau^{i*}\big)^{1/2} + C_i^{\{i\}}}{S(x_\tau^*)}\, \frac{1}{2} A_h^{\{1,2,3\}} \big(x_\tau^{h*}\big)^{-1/2},
\end{aligned}
\]
for $h \ne i$.
A time (optimal-trajectory-subgame) consistent solution can be obtained with the group optimal strategies $\psi^*(x_s^*)$ characterized in (5.86) and $B(s) = \{B_1(s), B_2(s), B_3(s)\}$ in (5.91). Using the cooperative strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ is
\[
\zeta_i(\tau) = P_i \big(x_\tau^{i*}\big)^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \big[A_i^{\{1,2,3\}}\big]^2, \tag{5.92}
\]
for $i \in \{1, 2, 3\}$ along the cooperative path $\{x^*(t)\}_{t=t_0}^{\infty}$. According to Proposition 5.6, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $i \in \{1, 2, 3\}$, as stated in (5.91). Hence an instantaneous transfer payment
\[
\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau) \tag{5.93}
\]
has to be given or charged to firm $i$ at time $\tau$, for $i \in \{1, 2, 3\}$ along the cooperative path $\{x^*(t)\}_{t=t_0}^{\infty}$.
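The proportional imputation, the payment stream in Proposition 5.6, and the transfer payments can be evaluated at any state with a few lines of code. The sketch below uses the closed forms in (5.78) and (5.85) and the partial derivatives worked out in the proof. All parameter values (a common depreciation rate, revenue coefficients, and cost coefficients) are hypothetical and chosen only for illustration. Since the imputations sum to the joint value, the payments should sum to the venture's instantaneous joint profit, so the transfers should net to zero up to rounding.

```python
R, DELTA = 0.05, 0.02                 # hypothetical discount and depreciation rates
P = [10.0, 9.0, 8.0]                  # hypothetical revenue coefficients P_i
ALPHA = [2.0, 1.0, 1.5]               # hypothetical investment efficiencies alpha_i
C_SOLO = [2.0, 3.0, 2.0]              # hypothetical stand-alone cost coefficients
C_JOINT = [1.0, 1.2, 0.8]             # hypothetical grand-coalition cost coefficients
FIRMS = range(3)

# A_i is the same in (5.78) and (5.85); only the constant terms differ.
A = [p / (R + DELTA / 2.0) for p in P]
C_I = [ALPHA[i] ** 2 * A[i] ** 2 / (16.0 * C_SOLO[i] * R) for i in FIRMS]
C_N = sum(ALPHA[i] ** 2 * A[i] ** 2 / (16.0 * C_JOINT[i]) for i in FIRMS) / R


def V(i, x):
    """Noncooperative payoff of firm i."""
    return A[i] * x[i] ** 0.5 + C_I[i]


def W(x):
    """Joint venture payoff (5.84)."""
    return sum(A[i] * x[i] ** 0.5 for i in FIRMS) + C_N


def imputation(i, x):
    """Proportional imputation (5.90)."""
    return V(i, x) / sum(V(j, x) for j in FIRMS) * W(x)


def drift(h, x):
    """Cooperative state dynamics (5.87)."""
    return ALPHA[h] ** 2 * A[h] / (4.0 * C_JOINT[h]) * x[h] ** 0.5 - DELTA * x[h]


def payment(i, x):
    """Instantaneous payment B_i in (5.91), using analytic partial derivatives."""
    S = sum(V(j, x) for j in FIRMS)
    total = R * imputation(i, x)
    for h in FIRMS:
        slope = 0.5 * A[h] * x[h] ** -0.5          # dV^h/dx^h and dW/dx^h coincide here
        d_share = ((S if h == i else 0.0) - V(i, x)) * slope / S ** 2
        total -= (W(x) * d_share + V(i, x) / S * slope) * drift(h, x)
    return total


def receipt(i, x):
    """Instantaneous receipt (5.92) under the cooperative strategies."""
    return P[i] * x[i] ** 0.5 - ALPHA[i] ** 2 * A[i] ** 2 / (16.0 * C_JOINT[i])


def transfer(i, x):
    """Transfer payment (5.93)."""
    return payment(i, x) - receipt(i, x)
```

Calling `[transfer(i, [20.0, 10.0, 15.0]) for i in FIRMS]` returns the side payments at that state under these assumed parameters.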
5.7.3 Shapley Value Solution
Consider the case when the participating firms agree to share their cooperative profits according to the Shapley Value. To compute the dynamic Shapley Value we consider cases when two of the firms form a coalition $\{i, j\} \subset \{1, 2, 3\}$. In particular, they can obtain cost reductions, with joint venture cost advantage
\[
\begin{aligned}
c_i^{\{i,j\}} &\le c_i^{\{i\}}, && \text{for } i, j \in \{1, 2, 3\} \text{ and } i \ne j, \\
c_i^{\{i,j,k\}} &\le c_i^{\{i,j\}}, && \text{for } i, j, k \in \{1, 2, 3\} \text{ and } i \ne j \ne k.
\end{aligned}
\tag{5.94}
\]
To maximize the joint profit of coalition $\{i, j\}$, the firms consider the problem of maximizing
\[
\int_{t_0}^{\infty} \Big[ P_i \big[x^i(s)\big]^{1/2} - c_i^{\{i,j\}} u_i(s) + P_j \big[x^j(s)\big]^{1/2} - c_j^{\{i,j\}} u_j(s) \Big] \exp[-r(s - t_0)]\, ds, \tag{5.95}
\]
subject to (5.76). Following the above analysis, we obtain the following value functions:
\[
W^{\{i,j\}}\big(x^i, x^j\big) = A_i^{\{i,j\}} \big(x^i\big)^{1/2} + A_j^{\{i,j\}} \big(x^j\big)^{1/2} + C^{\{i,j\}}, \tag{5.96}
\]
for $i, j \in \{1, 2, 3\}$ and $i \ne j$, where $A_i^{\{i,j\}}$, $A_j^{\{i,j\}}$, and $C^{\{i,j\}}$ satisfy
\[
0 = \Big(r + \frac{\delta}{2}\Big) A_i^{\{i,j\}} - P_i, \quad \text{for } i, j \in \{1, 2, 3\} \text{ and } i \ne j, \quad \text{and} \quad r C^{\{i,j\}} = \sum_{h \in \{i,j\}} \frac{\alpha_h^2}{16 c_h^{\{i,j\}}} \big[A_h^{\{i,j\}}\big]^2.
\]
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 5.5 we obtain the following.
Proposition 5.7 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
\[
\begin{aligned}
B_i(\tau) = -\sum_{K \subseteq \{1,2,3\}} \frac{(k-1)!(3-k)!}{3!} \Bigg\{ & \Big[ r W^{K\setminus i}\big(x_\tau^{K\setminus i*}\big) - r W^K\big(x_\tau^{K*}\big) \Big] \\
& + \sum_{h \in K} \frac{\partial}{\partial x_\tau^{h*}} W^K\big(x_\tau^{K*}\big) \bigg[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} \big(x_\tau^{h*}\big)^{1/2} - \delta x_\tau^{h*} \bigg] \\
& - \sum_{h \in K\setminus i} \frac{\partial}{\partial x_\tau^{h*}} W^{K\setminus i}\big(x_\tau^{K\setminus i*}\big) \bigg[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} \big(x_\tau^{h*}\big)^{1/2} - \delta x_\tau^{h*} \bigg] \Bigg\},
\end{aligned}
\tag{5.97}
\]
for $i \in \{1, 2, 3\}$, will lead to the realization of the Shapley Value in Condition 5.5.
Proof Invoking Proposition 5.5 the results in Proposition 5.7 follow. In particular, $W^K(x_\tau^{K*})$ is given in (5.78), (5.84), and (5.96), and
\[
\frac{\partial}{\partial x_\tau^{h*}} W^K\big(x_\tau^{K*}\big) = \frac{1}{2} A_h^K \big(x_\tau^{h*}\big)^{-1/2}, \quad \text{for } h \in K \subseteq \{1, 2, 3\}.
\]
A time (optimal-trajectory-subgame) consistent solution can be obtained with the group optimal strategies $\psi^*(x_s^*)$ characterized in (5.86) and the PDP $B(s) = \{B_1(s), B_2(s), B_3(s)\}$ in (5.97).
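The Shapley Value imputation in Condition 5.5 can be computed directly from the coalition value functions (5.78), (5.84), and (5.96). The sketch below is an illustration under stated assumptions: the cost coefficients follow the table in Exercise 5.3, while the remaining parameters (a common depreciation rate, the discount rate, $P_i$, and $\alpha_i$) and the convention $W^{\emptyset} = 0$ are assumptions made for this example. The names `coalition_value` and `shapley_imputation` are hypothetical.

```python
from itertools import combinations
from math import factorial

R, DELTA = 0.05, 0.02                         # hypothetical rates
P = {1: 10.0, 2: 9.0, 3: 8.0}                 # hypothetical revenue coefficients
ALPHA = {1: 2.0, 2: 1.0, 3: 1.5}              # hypothetical investment efficiencies
COST = {                                      # cost coefficients c_h^K by coalition K
    frozenset({1}): {1: 2.0},
    frozenset({2}): {2: 3.0},
    frozenset({3}): {3: 2.0},
    frozenset({1, 2}): {1: 1.5, 2: 2.0},
    frozenset({1, 3}): {1: 1.3, 3: 1.2},
    frozenset({2, 3}): {2: 1.8, 3: 1.1},
    frozenset({1, 2, 3}): {1: 1.0, 2: 1.2, 3: 0.8},
}
PLAYERS = (1, 2, 3)


def coalition_value(K, x):
    """W^K(x^K) = sum_h A_h * x_h**0.5 + C^K; the empty coalition is worth zero."""
    K = frozenset(K)
    if not K:
        return 0.0
    A = {h: P[h] / (R + DELTA / 2.0) for h in K}
    C = sum(ALPHA[h] ** 2 * A[h] ** 2 / (16.0 * COST[K][h]) for h in K) / R
    return sum(A[h] * x[h] ** 0.5 for h in K) + C


def shapley_imputation(i, x):
    """Shapley Value share of firm i at state x, as in (5.73) with n = 3."""
    n = len(PLAYERS)
    others = [p for p in PLAYERS if p != i]
    total = 0.0
    for size in range(n):
        for S in combinations(others, size):
            K = frozenset(S) | {i}            # coalition containing firm i
            k = len(K)
            weight = factorial(k - 1) * factorial(n - k) / factorial(n)
            total += weight * (coalition_value(K, x) - coalition_value(S, x))
    return total
```

Evaluating `shapley_imputation(i, {1: 20.0, 2: 10.0, 3: 15.0})` for each firm gives shares that add up to the grand coalition value. Because the state-dependent terms $A_h (x^h)^{1/2}$ are additive across coalitions in this model, only the constants $C^K$ are actually redistributed.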
5.8 Exercises
5.1 Consider the case when there are three companies involved in a joint venture. The planning period is $[0, 2]$. We use $x^i(s)$ to denote the level of technology of company $i$ at time $s \in [0, 2]$, and $u_i(s) \subset R^+$ is its physical investment in technological advancement. The discount rate is 0.05. The salvage values of the firms' technologies are $2[x^1(2)]^{1/2}$, $[x^2(2)]^{1/2}$, and $3[x^3(2)]^{1/2}$. If they act independently, the costs of the physical investment of these three companies are, respectively, $2u_1(s)$, $3u_2(s)$, and $2u_3(s)$.
The profits for companies 1, 2, and 3 are, respectively,
\[
\int_0^2 \Big[ 10 \big[x^1(s)\big]^{1/2} - 2u_1(s) \Big] \exp(-0.05 s)\, ds + \exp[-0.05(4)]\, 2 \big[x^1(2)\big]^{1/2},
\]
\[
\int_0^2 \Big[ 9 \big[x^2(s)\big]^{1/2} - 3u_2(s) \Big] \exp(-0.05 s)\, ds + \exp[-0.05(4)] \big[x^2(2)\big]^{1/2},
\]
and
\[
\int_0^2 \Big[ 8 \big[x^3(s)\big]^{1/2} - 2u_3(s) \Big] \exp(-0.05 s)\, ds + \exp[-0.05(4)]\, 3 \big[x^3(2)\big]^{1/2}.
\]
The evolution of the technology level of company $i \in \{1, 2, 3\}$ follows the dynamics
\[
\begin{aligned}
\dot x^1(s) &= 2 \big[u_1(s) x^1(s)\big]^{1/2} - 0.01 x^1(s), && x^1(0) = 20, \\
\dot x^2(s) &= \big[u_2(s) x^2(s)\big]^{1/2} - 0.03 x^2(s), && x^2(0) = 10, \quad \text{and} \\
\dot x^3(s) &= 1.5 \big[u_3(s) x^3(s)\big]^{1/2} - 0.02 x^3(s), && x^3(0) = 15.
\end{aligned}
\]
Compute a Nash equilibrium solution when these three firms act independently.
5.2 Consider the case when these three firms form a joint venture. The participating firms in a coalition can gain core skills and technology from each other. In particular, they can obtain cost reductions, giving an absolute joint venture cost advantage. With this cost advantage, the cost of investment of firm $j \in \{1, 2, 3\}$ under the joint venture becomes $c_j^{\{1,2,3\}} u_j(s)$, where $c_1^{\{1,2,3\}} = 1$, $c_2^{\{1,2,3\}} = 1.2$, and $c_3^{\{1,2,3\}} = 0.8$. If the joint venture firms agree to maximize their joint profit and share the excess gain equally, characterize a time (optimal-trajectory-subgame) consistent solution.
5.3 Consider the joint venture in Exercise 5.2. In particular, the firms would like to share the venture profit according to the Shapley Value. The costs under joint ventures in different coalitions $K \subseteq \{1, 2, 3\}$ are
\[
\begin{aligned}
& c_1^{\{1,2,3\}} = 1, \quad c_2^{\{1,2,3\}} = 1.2, \quad \text{and} \quad c_3^{\{1,2,3\}} = 0.8; \\
& c_1^{\{1,2\}} = 1.5 \quad \text{and} \quad c_2^{\{1,2\}} = 2; \\
& c_1^{\{1,3\}} = 1.3 \quad \text{and} \quad c_3^{\{1,3\}} = 1.2; \\
& c_2^{\{2,3\}} = 1.8 \quad \text{and} \quad c_3^{\{2,3\}} = 1.1; \\
& c_1^{\{1\}} = 2, \quad c_2^{\{2\}} = 3, \quad \text{and} \quad c_3^{\{3\}} = 2.
\end{aligned}
\]
Characterize a time (optimal-trajectory-subgame) consistent solution.
5.4 Prove that the coalition profits in Exercise 5.3 are superadditive.
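Appendix: Proof of Proposition 5.2
To prove Proposition 5.2 we first use $\hat x^{j(L)}$, for $j \in L$, to denote the optimal trajectory of the optimal control problem $[L; \tau, x_\tau^L]$, which maximizes
\[
\int_\tau^T \sum_{j \in L} \Big\{ g^j\big[s, x^j(s)\big] - c_j^L\big[u_j(s)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds + \sum_{j \in L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(x^j(T)\big)
\]
subject to
\[
\dot x^j(s) = f^j\big[s, x^j(s), u_j(s)\big], \quad x^j(\tau) = x_\tau^j, \quad \text{for } j \in L.
\]
Note that
\[
\begin{aligned}
W^{(\tau)L}\big(\tau, x_\tau^L\big) = {} & \int_\tau^T \sum_{j \in L} \Big\{ g^j\big[s, \hat x^{j(L)}(s)\big] - c_j^L\big[\psi_j^{(\tau)L*}\big(s, \hat x^{L(L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(L)}(T)\big) \\
\le {} & \int_\tau^T \sum_{j \in L} \Big\{ g^j\big[s, \hat x^{j(L)}(s)\big] - c_j^K\big[\psi_j^{(\tau)L*}\big(s, \hat x^{L(L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(L)}(T)\big),
\end{aligned}
\tag{5.98}
\]
because $c_j^K[u_j(s)] \le c_j^L[u_j(s)]$, for $j \in L \subseteq K$.
Similarly, for the optimal control problem $[K\setminus L; \tau, x_\tau^{K\setminus L}]$, we have
\[
\begin{aligned}
W^{(\tau)K\setminus L}\big(\tau, x_\tau^{K\setminus L}\big) = {} & \int_\tau^T \sum_{j \in K\setminus L} \Big\{ g^j\big[s, \hat x^{j(K\setminus L)}(s)\big] - c_j^{K\setminus L}\big[\psi_j^{(\tau)K\setminus L*}\big(s, \hat x^{K\setminus L(K\setminus L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in K\setminus L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(K\setminus L)}(T)\big) \\
\le {} & \int_\tau^T \sum_{j \in K\setminus L} \Big\{ g^j\big[s, \hat x^{j(K\setminus L)}(s)\big] - c_j^K\big[\psi_j^{(\tau)K\setminus L*}\big(s, \hat x^{K\setminus L(K\setminus L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in K\setminus L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(K\setminus L)}(T)\big),
\end{aligned}
\tag{5.99}
\]
because $c_j^K[u_j(s)] \le c_j^{K\setminus L}[u_j(s)]$, for $j \in K\setminus L \subseteq K$.
Now consider the optimal control problem $[K; \tau, x_\tau^K]$ that maximizes
\[
\int_\tau^T \sum_{j \in K} \Big\{ g^j\big[s, x^j(s)\big] - c_j^K\big[u_j(s)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds + \sum_{j \in K} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(x^j(T)\big),
\]
subject to
\[
\dot x^j(s) = f^j\big[s, x^j(s), u_j(s)\big], \quad x^j(\tau) = x_\tau^j, \quad \text{for } j \in K.
\]
Since $\psi_j^{(\tau)K*}(s, \hat x^{K(K)}(s))$ and $\hat x^{K(K)}(s)$ are, respectively, the optimal control and optimal state trajectory of the control problem $[K; \tau, x_\tau^K]$, and the controls of $L$ and $K\setminus L$ remain feasible for $K$,
\[
\begin{aligned}
W^{(\tau)K}\big(\tau, x_\tau^K\big) = {} & \int_\tau^T \sum_{j \in K} \Big\{ g^j\big[s, \hat x^{j(K)}(s)\big] - c_j^K\big[\psi_j^{(\tau)K*}\big(s, \hat x^{K(K)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in K} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(K)}(T)\big) \\
\ge {} & \int_\tau^T \sum_{j \in L} \Big\{ g^j\big[s, \hat x^{j(L)}(s)\big] - c_j^K\big[\psi_j^{(\tau)L*}\big(s, \hat x^{L(L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(L)}(T)\big) \\
& + \int_\tau^T \sum_{j \in K\setminus L} \Big\{ g^j\big[s, \hat x^{j(K\setminus L)}(s)\big] - c_j^K\big[\psi_j^{(\tau)K\setminus L*}\big(s, \hat x^{K\setminus L(K\setminus L)}(s)\big)\big] \Big\} \exp\Big[-\int_\tau^s r(y)\, dy\Big] ds \\
& + \sum_{j \in K\setminus L} \exp\Big[-\int_\tau^T r(y)\, dy\Big] q^j\big(\hat x^{j(K\setminus L)}(T)\big).
\end{aligned}
\tag{5.100}
\]
Invoking (5.98), (5.99), and (5.100), we have
\[
W^{(\tau)K}\big(\tau, x_\tau^K\big) \ge W^{(\tau)L}\big(\tau, x_\tau^L\big) + W^{(\tau)K\setminus L}\big(\tau, x_\tau^{K\setminus L}\big).
\]
Hence Proposition 5.2 follows.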
Chapter 6
Collaborative Environmental Management
After decades of rapid technological advancement and economic growth, alarming levels of pollution and environmental degradation are emerging globally. Due to the geographical diffusion of pollutants, the unilateral response of one nation or region is often ineffective. Reports portray the situation as an industrial civilization on the verge of suicide, destroying its environmental conditions of existence, with people being held as prisoners on a runaway catastrophe-bound train. Though global cooperation in environmental control holds out the best promise of effective action, limited success has been observed. This is the result of many hurdles, ranging from commitment, monitoring, and sharing of costs to disparities in future development under the cooperative plans. One finds it hard to be convinced that multinational joint initiatives, like the Kyoto Protocol, can offer a long-term solution because there is no guarantee that participants will always be better off within the entire extent of the agreement. More than anything else, it is due to the lack of these kinds of incentives that current cooperative schemes fail to provide an effective means to avert disaster. This is a “classic” game-theoretic problem. To construct a theoretical framework capturing the essence of a transboundary industrial pollution paradigm a differential game approach is adopted. Differential games provide an effective tool to study pollution control problems and to analyze the interactions between the participants’ strategic behaviors and the dynamic evolution of pollution. Applications of noncooperative differential games in environmental studies can be found in Yeung (1992), Dockner and Long (1993), Tahvonen (1994), Stimming (1999), Feenstra et al. (2001), and Dockner and Leitmann (2001). Cooperative differential games in environmental control are presented by Dockner and Long (1993), Jørgensen and Zaccour (2001), Fredj et al. (2004), Breton et al. 
(2005, 2006), Petrosyan and Zaccour (2003), Yeung (2007), and Yeung and Petrosyan (2008). To formulate the foundation for an effective policy to tackle one of the gravest problems facing the global market economy, this chapter presents a cooperative initiative involving a set of environmental policy instruments including taxes, subsidies, and pollution abatement activities. The implementation of such a scheme will inevitably bring about different implications in cost and benefit to each of the participating nations. To construct a cooperative solution that every party will commit to from beginning to end, the arrangements must guarantee that every participant will be better off and that the originally agreed-upon arrangement remains effective at any time within the cooperative period along the cooperative trajectory. An analytical framework for studying transboundary industrial pollution management is established in Sect. 6.1. Noncooperative outcomes appear in Sect. 6.2. Cooperative arrangements, cooperative state trajectory, and time consistent imputations are derived in Sect. 6.3. Benefit distributions leading to a time (optimal-trajectory-subgame) consistent collaborative environmental management scheme are obtained in Sect. 6.4. Policy implications are examined in the following section. An explicitly solvable model of transboundary industrial pollution management is given in Sect. 6.6 and a time (optimal-trajectory-subgame) consistent solution is presented in Sect. 6.7.
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_6, © Springer Science+Business Media, LLC 2012
6.1 An Analytical Framework In this section we present an analytical framework to study transboundary industrial pollution management.
6.1.1 The Industrial Sector Consider an international economy with n nations. At time instant s the demand system of the outputs of the nations is (6.1) Pi (s) = f i q1 (s), q2 (s), . . . , qn (s), s , i ∈ N ≡ {1, 2, . . . , n}, where qj (s) is the output of nation j and Pi (s) is the price of the output of nation i. The demand system of (6.1) shows that the multination economy is a form of a generalized differentiated products oligopoly. Industrial profits of nation i at time s can be expressed as f i q1 (s), q2 (s), . . . , qn (s), s qi (s) − ci qi (s), vi (s) , for i ∈ N, (6.2) where vi (s) is the set of environmental policy instruments of government i. Policy instruments may include tools like taxes, subsidies, technology choices, and pollution legislations. The cost of producing qi (s) under policy vi (s) is ci [qi (s), vi (s)]. Profit maximization by the industrial sectors yields f i q1 (s), q2 (s), . . . , qn (s), s + fqii q1 (s), q2 (s), . . . , qn (s), s qi (s) (6.3) − cqi i qi (s), vi (s) = 0, for i ∈ N. Equation (6.3) is a system of implicit functions in q(s) = [q1 (s), q2 (s), . . . , qn (s)] with government policies v(s) = [v1 (s), v2 (s), . . . , vn (s)] being regarded as parameters. The existence of a market equilibrium reflects the satisfaction of the Implicit
Function Theorem in (6.3) and nation i's instantaneous market equilibrium output can be expressed as
$$q_i^*(s) = \hat q_i\big[v_1(s), v_2(s), \ldots, v_n(s), s\big] \equiv \hat q_i\big[v(s), s\big], \quad \text{for } i \in N. \tag{6.4}$$
One can readily observe from (6.4) that each nation's output decision depends on the governments' environmental policies.
6.1.2 Impacts and Accumulation Dynamics of Pollutants
Industrial production emits pollutants into the environment and the amount of pollution created by different nations' outputs may be different. For an output of $q_i(s)$ produced by nation i, there will be an instantaneous damaging environmental impact of $\varepsilon_i^i[q_i(s)]$ on nation i itself and a damaging impact of $\varepsilon_j^i[q_i(s)]$ on its adjacent nation j for $j \in K^i$. On the other hand, nation i will receive instantaneous damaging environmental impacts from its adjacent nations measured as $\varepsilon_i^j[q_j(s)]$ for $j \in \bar K^i$. This first type of externality is typical in static analysis and the second in dynamic analysis. Moreover, the pollutant will then add to the stock of existing pollution. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s) \in R^m$ denote the level of pollution at time s. The dynamics of the pollution stock is governed by the differential equation
$$\dot x(s) = \sum_{j=1}^n a_j\big[q_j(s), v_j(s)\big] - \sum_{j=1}^n b_j\big[u_j(s), x(s)\big] - \delta\big[x(s)\big]\, x(s), \quad x(t_0) = x_{t_0}, \tag{6.5}$$
where $a_j[q_j(s), v_j(s)]$ is the amount of pollution created by $q_j(s)$ units of output produced under policy $v_j(s)$, $u_j(s)$ is the pollution abatement effort of nation j, $b_j[u_j(s), x(s)]$ is the amount of pollution removed by $u_j(s)$ units of abatement effort of nation j, and $\delta[x(s)]$ is the natural rate of decay of the pollutants. Moreover, $\delta(x)$ is negatively related to x, reflecting the phenomenon that the natural rate of decay declines as the level of pollution stock rises.
6.1.3 The Governments' Objectives
The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. In particular, each government maximizes the net gains in the industrial sector, plus tax revenue, minus expenditures on pollution abatement and damages from pollution. A lump-sum income tax is levied on the industrial sector to balance the government budget. The last item
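turns out to be a net transfer between the government and the public (with no effect on industrial output). The instantaneous objective of government i at time s can be expressed as
$$\begin{aligned} & f^i\big[q_1(s), q_2(s), \ldots, q_n(s), s\big]\, q_i(s) - c_i\big[q_i(s), v_i(s)\big] - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] \\ & \quad - \varepsilon_i^i\big[q_i(s)\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[q_j(s)\big] - h_i\big[x(s)\big], \quad i \in N, \end{aligned} \tag{6.6}$$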
where $c_i^P[v_i(s)]$ is the cost of implementing the vector policy instrument $v_i(s)$, $c_i^a[u_i(s)]$ is the cost of employing the $u_i$ amount of pollution abatement effort, and $h_i[x(s)]$ is the value of damage to country i from an $x(s)$ amount of pollution. The governments' planning horizon is $[t_0, T]$. It is possible that T may be very large. The discount rate is r. At time T, the terminal appraisal of pollution damage is $g^i[x(T)]$, where $\partial g^i / \partial x < 0$. Each one of the n governments seeks to maximize the integral of its instantaneous objective found in (6.6) over the planning horizon subject to the pollution dynamics in (6.5) with controls on the level of abatement effort and output tax. Substituting $q_i(s)$, for $i \in N$, from (6.4) into (6.5) and (6.6), one obtains a differential game in which government $i \in N$ seeks to
$$\begin{aligned} \max_{v_i(s),\, u_i(s)} \bigg\{ \int_{t_0}^{T} \Big[ & f^i\big[\hat q_1[v(s), s], \hat q_2[v(s), s], \ldots, \hat q_n[v(s), s], s\big]\, \hat q_i[v(s), s] \\ & - c_i\big[\hat q_i[v(s), s], v_i(s)\big] - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] - \varepsilon_i^i\big[\hat q_i[v(s), s]\big] \\ & - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[v(s), s]\big] - h_i\big[x(s)\big] \Big] e^{-r(s - t_0)}\, ds + g^i\big[x(T)\big] e^{-r(T - t_0)} \bigg\}, \end{aligned} \tag{6.7}$$
subject to
$$\dot x(s) = \sum_{j=1}^n a_j\big[\hat q_j[v(s), s], v_j(s)\big] - \sum_{j=1}^n b_j\big[u_j(s), x(s)\big] - \delta\big[x(s)\big]\, x(s), \quad x(t_0) = x_{t_0}. \tag{6.8}$$
Thus the economic interactions among nations in industrial production, pollution emission, and abatement are characterized as a differential game with the payoffs in (6.7) and pollution dynamics in (6.8).
6.2 Noncooperative Outcomes
In this section we discuss the solution to the noncooperative game in (6.7) and (6.8). Since the payoffs of nations are measured in monetary terms, the game is a transferable payoff game. Invoking Theorem 2.3 in Chap. 2, a feedback Nash equilibrium solution can be characterized as follows.
Corollary 6.1 A set of feedback strategies $\{u_i^*(t) = \mu_i(t, x),\ v_i^*(t) = \phi_i(t, x),\ \text{for } i \in N\}$ provides a feedback Nash equilibrium solution to the game in (6.7) and (6.8) if there exist suitably smooth functions $V^{(t_0)i}(t, x) : [t_0, T] \times R^m \to R$, $i \in N$, satisfying the following partial differential equations:
$$\begin{aligned} -V_t^{(t_0)i}(t, x) = \max_{v_i,\, u_i} \bigg\{ \Big[ & f^i\big[\hat q_1[v_i, \phi_{\neq i}(t, x), t], \hat q_2[v_i, \phi_{\neq i}(t, x), t], \ldots, \hat q_n[v_i, \phi_{\neq i}(t, x), t], t\big] \\ & \times \hat q_i[v_i, \phi_{\neq i}(t, x), t] - c_i\big[\hat q_i[v_i, \phi_{\neq i}(t, x), t], v_i\big] - c_i^P[v_i] - c_i^a[u_i] \\ & - \varepsilon_i^i\big[\hat q_i[v_i, \phi_{\neq i}(t, x), t]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[v_i, \phi_{\neq i}(t, x), t]\big] - h_i(x) \Big] e^{-r(t - t_0)} \\ & + V_x^{(t_0)i}(t, x) \Big[ \sum_{j=1}^n a_j\big[\hat q_j[v_i, \phi_{\neq i}(t, x), t], v_j\big] - b_i(u_i, x) \\ & \qquad - \sum_{j=1,\, j \neq i}^n b_j\big[\mu_j(t, x), x\big] - \delta(x)\, x \Big] \bigg\}, \end{aligned} \tag{6.9}$$
$$V^{(t_0)i}(T, x) = g^i[x]\, e^{-r(T - t_0)}, \tag{6.10}$$
where $\phi_{\neq i}(t, x) = [\phi_1(t, x), \phi_2(t, x), \ldots, \phi_{i-1}(t, x), \phi_{i+1}(t, x), \ldots, \phi_n(t, x)]$. In a prevailing Nash equilibrium the function $V^{(t_0)i}(t, x)$ is then the integral
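$$\begin{aligned} \int_t^T \Big[ & f^i\big[\hat q_1[\phi(s, x(s)), s], \hat q_2[\phi(s, x(s)), s], \ldots, \hat q_n[\phi(s, x(s)), s], s\big]\, \hat q_i[\phi(s, x(s)), s] \\ & - c_i\big[\hat q_i[\phi(s, x(s)), s], \phi_i(s, x(s))\big] - c_i^P\big[\phi_i(s, x(s))\big] - c_i^a\big[\mu_i(s, x(s))\big] \\ & - \varepsilon_i^i\big[\hat q_i[\phi(s, x(s)), s]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[\phi(s, x(s)), s]\big] - h_i\big[x(s)\big] \Big] e^{-r(s - t_0)}\, ds \\ & + g^i\big[x(T)\big]\, e^{-r(T - t_0)} \,\Big|_{x(t) = x}, \quad \text{for } i \in N. \end{aligned} \tag{6.11}$$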
The game equilibrium dynamics then becomes
$$\dot x(s) = \sum_{j=1}^n a_j\big[\hat q_j[\phi(s, x(s)), s], \phi_j(s, x(s))\big] - \sum_{j=1}^n b_j\big[\mu_j(s, x(s)), x(s)\big] - \delta\big[x(s)\big]\, x(s), \quad x(t_0) = x_{t_0}. \tag{6.12}$$
Remark 6.1 One can readily verify that $V^{(\tau)i}(t, x_t) = V^{(t_0)i}(t, x_t)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$, is the value function to nation i at time $t \in [\tau, T]$ when the state is $x(t) = x_t$ in the game of (6.7) and (6.8) which starts at time $\tau$.
With negative externalities in pollution a noncooperative outcome is suboptimal. International collaboration in industrial pollution management is an effective way out of this situation.
6.3 Cooperative Arrangement
Now consider the case when all the nations want to cooperate and agree to act so that an international optimum can be achieved. For the cooperative scheme to be upheld throughout the game horizon both group optimality and individual rationality are required to be satisfied at any time. Group optimality ensures that all potential gains from cooperation are captured. The failure to fulfill group optimality leads to the condition where the participants prefer to deviate from the agreed-upon solution plan to extract the unexploited gains. Individual rationality is required to hold so that the payoff allocated to a nation under cooperation will be no less than its noncooperative payoff. The failure to guarantee individual rationality leads to a condition where the concerned participants will reject the agreed-upon solution plan and play noncooperatively. In the absence of a punishment scheme, the cooperative plan will dissolve if any of the nations deviate from the agreed-upon plan. In addition, as mentioned above, an agreeable optimality principle must uphold group optimality and individual rationality throughout the period of cooperation.
6.3.1 Group Optimality and Cooperative State Trajectory
Consider the collaborative environmental scheme with the participating nations' payoff structure in (6.7) and the pollution dynamics in (6.8). To secure group optimality the participating nations seek to maximize their joint payoff by solving the
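following control problem:
$$\begin{aligned} \max_{v_1, \ldots, v_n;\, u_1, \ldots, u_n} \bigg\{ \int_{t_0}^{T} \sum_{i=1}^n \Big[ & f^i\big[\hat q_1[v(s), s], \hat q_2[v(s), s], \ldots, \hat q_n[v(s), s], s\big]\, \hat q_i[v(s), s] \\ & - c_i\big[\hat q_i[v(s), s], v_i(s)\big] - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] - \varepsilon_i^i\big[\hat q_i[v(s), s]\big] \\ & - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[v(s), s]\big] - h_i\big[x(s)\big] \Big] e^{-r(s - t_0)}\, ds + \sum_{i=1}^n g^i\big[x(T)\big]\, e^{-r(T - t_0)} \bigg\}, \end{aligned} \tag{6.13}$$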
subject to (6.8).
Invoking Theorem A.1 in the Technical Appendixes, a set of controls $\{[v_i^{**}(t), u_i^{**}(t)] = [\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N\}$ constitutes an optimal solution to the control problem in (6.13) and (6.8) if there exists a continuously differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R^m \to R$ satisfying the following partial differential equation:
$$\begin{aligned} -W_t^{(t_0)}(t, x) = \max_{v_1, \ldots, v_n;\, u_1, \ldots, u_n} \bigg\{ \sum_{i=1}^n \Big[ & f^i\big[\hat q_1(v, t), \hat q_2(v, t), \ldots, \hat q_n(v, t), t\big]\, \hat q_i(v, t) - c_i\big[\hat q_i(v, t), v_i\big] \\ & - c_i^P(v_i) - c_i^a(u_i) - \varepsilon_i^i\big[\hat q_i(v, t)\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j(v, t)\big] - h_i(x) \Big] e^{-r(t - t_0)} \\ & + W_x^{(t_0)}(t, x) \Big[ \sum_{j=1}^n a_j\big[\hat q_j(v, t), v_j\big] - \sum_{j=1}^n b_j(u_j, x) - \delta(x)\, x \Big] \bigg\}, \end{aligned} \tag{6.14}$$
and
$$W^{(t_0)}(T, x) = \sum_{i=1}^n g^i(x)\, e^{-r(T - t_0)}.$$
Hence the nations will adopt the cooperative control $\{[\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N \text{ and } t \in [t_0, T]\}$. The optimal trajectory under cooperation becomes
$$\dot x(s) = \sum_{j=1}^n a_j\big[\hat q_j[\psi(s, x(s)), s], \psi_j(s, x(s))\big] - \sum_{j=1}^n b_j\big[\varpi_j(s, x(s)), x(s)\big] - \delta\big[x(s)\big]\, x(s), \quad x(t_0) = x_{t_0}. \tag{6.15}$$
The solution to (6.15) can be expressed as
$$x^*(t) = x_0 + \int_{t_0}^{t} \Big[ \sum_{j=1}^n a_j\big[\hat q_j[\psi(s, x^*(s)), s], \psi_j(s, x^*(s))\big] - \sum_{j=1}^n b_j\big[\varpi_j(s, x^*(s)), x^*(s)\big] - \delta\big[x^*(s)\big]\, x^*(s) \Big]\, ds. \tag{6.16}$$
We use $\{x^*(t)\}_{t = t_0}^{T}$ to denote the solution path generated by (6.16). The terms $x_t^*$ and $x^*(t)$ are used interchangeably. The cooperative control for the game $\Gamma_c(x_0, T - t_0)$ over the time interval $[t_0, T]$ can be expressed more precisely as
$$\big[\psi_i(t, x^*(t)),\ \varpi_i(t, x^*(t))\big], \quad \text{for } t \in [t_0, T] \text{ and } i \in N. \tag{6.17}$$
Note that, for group optimality to be achievable, the cooperative controls in (6.17) must be exercised throughout the time interval $[t_0, T]$. The value function $W^{(t_0)}(t, x_t^*)$ is then the integral
$$\begin{aligned} \int_t^T \sum_{i=1}^n \Big[ & f^i\big[\hat q_1[\psi(s, x^*(s)), s], \hat q_2[\psi(s, x^*(s)), s], \ldots, \hat q_n[\psi(s, x^*(s)), s], s\big]\, \hat q_i[\psi(s, x^*(s)), s] \\ & - c_i\big[\hat q_i[\psi(s, x^*(s)), s], \psi_i(s, x^*(s))\big] - c_i^P\big[\psi_i(s, x^*(s))\big] - c_i^a\big[\varpi_i(s, x^*(s))\big] \\ & - \varepsilon_i^i\big[\hat q_i[\psi(s, x^*(s)), s]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[\psi(s, x^*(s)), s]\big] - h_i\big[x^*(s)\big] \Big] e^{-r(s - t_0)}\, ds \\ & + \sum_{i=1}^n g^i\big[x^*(T)\big]\, e^{-r(T - t_0)}, \end{aligned} \tag{6.18}$$
where $\psi(s, x^*(s)) = \{\psi_1(s, x^*(s)), \psi_2(s, x^*(s)), \ldots, \psi_n(s, x^*(s))\}$.
Remark 6.2 One can readily verify that $W^{(\tau)}(t, x_t^*) = W^{(t_0)}(t, x_t^*)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$, is the value function at time $t \in [\tau, T]$ of the control problem in (6.8) and (6.13) which starts at time $\tau$ with $x(t) = x_t^*$.
6.3.2 Individually Rational and Time (Optimal-Trajectory-Subgame) Consistent Imputation
An agreed-upon optimality principle must be sought to allocate the cooperative payoff. In a dynamic framework individual rationality has to be maintained at every
instant of time within the cooperative duration $[t_0, T]$ along the cooperative path $\{x_\tau^*\}_{\tau = t_0}^{T}$. For $\tau \in [t_0, T]$, let $\xi^{(\tau)i}(\tau, x_\tau^*)$ denote the imputation (the payoff under the agreed-upon optimality principle) over the period $[\tau, T]$ to nation $i \in N$ along the cooperative path. Individual rationality along the cooperative trajectory requires
$$\xi^{(\tau)i}(\tau, x_\tau^*) \ge V^{(\tau)i}(\tau, x_\tau^*), \quad \text{for } i \in N,\ x_\tau^* \in X_\tau^* \text{ and } \tau \in [t_0, T]. \tag{6.19}$$
Since nations are asymmetric and the number of nations may be large, a reasonable optimality principle for gain distribution is to share the gain from cooperation proportional to the nations' relative sizes of noncooperative payoffs. As mentioned before, time (optimal-trajectory-subgame) consistency is required for a credible cooperative solution under a dynamic framework. In particular, the agreed-upon optimality principle must be maintained in any subgame that starts at a later time along the cooperative state trajectory so that no nation has the incentive to deviate from the previously adopted optimal behavior throughout the game. Hence the solution imputation scheme $\{\xi^{(\tau)i}(\tau, x_\tau^*);\ \text{for } i \in N\}$ has to satisfy the following condition:
Condition 6.1
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\, W^{(\tau)}(\tau, x_\tau^*), \tag{6.20}$$
for i ∈ N, xτ∗ ∈ Xτ∗ and τ ∈ [t0 , T ]. The imputation scheme in Condition 6.1 satisfies individual rationality. Crucial to the analysis is the formulation of a payment distribution mechanism that will lead to the realization of Condition 6.1. This will be done in the next section.
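The proportional sharing rule in Condition 6.1 can be illustrated with a minimal Python sketch. The function name `proportional_imputation` and the numbers used are hypothetical and are not taken from the text. The sketch assumes that all noncooperative payoffs are positive and that the joint cooperative payoff is no smaller than their sum, in which case each share is at least the nation's noncooperative payoff, as in (6.19).

```python
def proportional_imputation(noncoop_values, joint_value):
    """Share the joint payoff W in proportion to the noncooperative payoffs V_i.

    Implements xi_i = V_i / sum_j V_j * W, as in (6.20).
    Assumes (for illustration) that every V_i is positive.
    """
    if any(v <= 0 for v in noncoop_values):
        raise ValueError("this sketch assumes positive noncooperative payoffs")
    total = sum(noncoop_values)
    return [v / total * joint_value for v in noncoop_values]


# Hypothetical payoffs for three nations at some instant tau.
V = [2.0, 3.0, 5.0]   # noncooperative payoffs V_i
W = 12.0              # joint cooperative payoff, here larger than sum(V)
shares = proportional_imputation(V, W)
```

Because every share equals the noncooperative payoff scaled by the common factor W / sum(V), individual rationality holds for all nations exactly when that factor is at least one.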
6.4 Benefit Distribution in Collaborative Environmental Management
The solution to the optimality principle guiding the multinational collaboration in environmental management can then be expressed as
$$P(x_t^*, T - t) = \Big\{ \big[\psi(s, x^*(s))\big]_{s=t}^{T},\ \big[\varpi(s, x^*(s))\big]_{s=t}^{T},\ \big[B(s)\big]_{s=t}^{T},\ \xi^{(t)}(t, x_t^*) \Big\}$$
for $t \in [t_0, T]$, where $\psi(s, x^*(s)) = \{\psi_1(s, x^*(s)), \psi_2(s, x^*(s)), \ldots, \psi_n(s, x^*(s))\}$ is the set of joint-payoff-maximizing environmental policy instruments, and
$$\varpi(s, x^*(s)) = \big\{\varpi_1(s, x^*(s)), \varpi_2(s, x^*(s)), \ldots, \varpi_n(s, x^*(s))\big\}$$
is the set of joint-payoff-maximizing pollution abatement efforts in the collaborative environmental scheme; $\xi^{(t)}(t, x_t^*)$ is an imputation scheme satisfying Condition 6.1 and $B(s)$ is the payoff distribution procedure leading to Condition 6.1. All the participating nations in a collaborative scheme will have no incentive to exit the venture if the agreed-upon optimality principle is maintained at every instant $t \in [t_0, T]$. To formulate a payoff distribution procedure over time so that the agreed imputations satisfy Condition 6.1 we obtain the following.
Proposition 6.1 A distribution scheme with a terminal payment $-g^i[x_T^* - \bar x^i]$ at time T and an instantaneous payment at time $\tau \in [t_0, T]$
$$\begin{aligned} B_i(\tau) = {} & -\frac{\partial}{\partial t} \bigg[ \frac{V^{(\tau)i}(t, x_t^*)}{\sum_{j=1}^n V^{(\tau)j}(t, x_t^*)}\, W^{(\tau)}(t, x_t^*) \bigg] \bigg|_{t = \tau} \\ & - \frac{\partial}{\partial x_\tau^*} \bigg[ \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\, W^{(\tau)}(\tau, x_\tau^*) \bigg] \\ & \quad \times \Big[ \sum_{j=1}^n a_j\big[\hat q_j[\psi(\tau, x_\tau^*), \tau], \psi_j(\tau, x_\tau^*)\big] - \sum_{j=1}^n b_j\big[\varpi_j(\tau, x_\tau^*), x_\tau^*\big] - \delta(x_\tau^*)\, x_\tau^* \Big], \quad \text{for } i \in N, \end{aligned} \tag{6.21}$$
will lead to a realization of the imputations ξ (τ )i (τ, xτ∗ ), for i ∈ N and τ ∈ [t0 , T ], satisfying Condition 6.1. Proof Invoking Theorem 4.3 in Chap. 4 the results in Proposition 6.1 follow.
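The structure of (6.21) can be sanity-checked numerically in a scalar toy setting. The sketch below is an illustration under stated assumptions rather than the model of this chapter: the imputation `xi`, the cooperative path `x_star`, and the parameter values are hypothetical choices, and the derivatives in (6.21) are approximated by central finite differences. The check is that the discounted stream of payments $B(s)$ over $[\tau, T]$ plus the discounted terminal payment reproduces the imputation at time $\tau$.

```python
import math

R = 0.05   # discount rate r (hypothetical value)
T = 2.0    # terminal time (hypothetical value)

def x_star(t):
    """Hypothetical cooperative pollution path x*(t)."""
    return 1.0 + 0.5 * math.exp(-t)

def x_star_dot(t):
    """Time derivative of the hypothetical path (plays the role of (6.15))."""
    return -0.5 * math.exp(-t)

def xi(tau, t, x):
    """Hypothetical imputation xi^(tau)(t, x), valued from the viewpoint of time tau."""
    return (-(2.0 - 0.3 * t) * x + 4.0) * math.exp(-R * (t - tau))

def payment(tau, eps=1e-5):
    """B(tau) = -d/dt xi^(tau)(t, x)|_{t=tau} - d/dx xi^(tau)(tau, x) * x_dot, as in (6.21)."""
    x = x_star(tau)
    d_t = (xi(tau, tau + eps, x) - xi(tau, tau - eps, x)) / (2.0 * eps)
    d_x = (xi(tau, tau, x + eps) - xi(tau, tau, x - eps)) / (2.0 * eps)
    return -d_t - d_x * x_star_dot(tau)

def realized_value(tau, steps=4000):
    """Discounted payments over [tau, T] (trapezoid rule) plus the terminal payment."""
    h = (T - tau) / steps
    total = 0.0
    for k in range(steps + 1):
        s = tau + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * payment(s) * math.exp(-R * (s - tau)) * h
    terminal = xi(T, T, x_star(T)) * math.exp(-R * (T - tau))
    return total + terminal
```

For any starting time in the horizon, `realized_value(tau)` should agree with `xi(tau, tau, x_star(tau))` up to discretization error, which is the sense in which the payment stream realizes the agreed imputation.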
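A time (optimal-trajectory-subgame) consistent solution can be obtained using the set of group optimal strategies $[\psi(s, x_s^*), \varpi(s, x_s^*)]$ characterized in (6.14) and $B(s) = \{B_1(s), B_2(s), \ldots, B_n(s)\}$ in (6.21). With the nations using the cooperative environmental policy instruments $\psi(s, x^*(s))$ and pollution abatement efforts $\varpi(s, x^*(s))$, the instantaneous receipt of nation i at time instant $\tau$ is
$$\begin{aligned} \zeta_i(\tau) = {} & f^i\big[\hat q_1[\psi(\tau, x_\tau^*), \tau], \hat q_2[\psi(\tau, x_\tau^*), \tau], \ldots, \hat q_n[\psi(\tau, x_\tau^*), \tau], \tau\big]\, \hat q_i[\psi(\tau, x_\tau^*), \tau] \\ & - c_i\big[\hat q_i[\psi(\tau, x_\tau^*), \tau], \psi_i(\tau, x_\tau^*)\big] - c_i^P\big[\psi_i(\tau, x_\tau^*)\big] - c_i^a\big[\varpi_i(\tau, x_\tau^*)\big] \\ & - \varepsilon_i^i\big[\hat q_i[\psi(\tau, x_\tau^*), \tau]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q_j[\psi(\tau, x_\tau^*), \tau]\big] - h_i(x_\tau^*), \end{aligned} \tag{6.22}$$
for $\tau \in [t_0, T]$ and $i \in N$.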
According to Proposition 6.1, the instantaneous payment that nation i should receive under the agreed-upon optimality principle is $B_i(\tau)$, for $\tau \in [t_0, T]$ and $i \in N$, as stated in (6.21). Hence an instantaneous transfer payment
$$\chi^i(\tau) = B_i(\tau) - \zeta_i(\tau) \tag{6.23}$$
would be given to nation i (if positive) or charged to it (if negative) at time $\tau$, for $i \in N$ and $\tau \in [t_0, T]$.
6.5 Policy Implications
Facing an increasing demand for a sustainable solution, the international community has responded to the deteriorating problem of global pollution. Over a decade ago most countries joined an international treaty, the United Nations Framework Convention on Climate Change (UNFCCC), to consider solutions to reduce global warming and to cope with whatever temperature increases are inevitable. Recently, a number of nations approved an addition to the treaty: the Kyoto Protocol, which has more powerful and legally binding measures. In brief, the Kyoto Protocol is an international agreement, which builds on the United Nations Framework Convention on Climate Change, and sets legally binding targets and timetables for cutting the greenhouse-gas emissions of industrialized countries. A condition of entry is that some UNFCCC parties cut greenhouse-gas emissions by at least 5% from 1990 levels in the commitment period 2008–2012. As of December 2006, 169 countries and other governmental entities had ratified the agreement. Notable exceptions include the United States. Other countries, like India and China, which have ratified the protocol, are not required to reduce carbon emissions under the present agreement despite their relatively large industrial production activities. As mentioned before, placing a constraint just on certain types of pollution emissions cannot offer a long-term solution because the plans are limited to a confined set of controls, like gas emissions and permits, which is unlikely to offer an effective means to reverse the accelerating trend of environmental deterioration. In addition, there is no guarantee that participants will always be better off and hence be committed within the entire duration of the agreement. Guided by the analysis shown previously, a grand coalition of all nations should be formed to pursue a comprehensive cooperative scheme of industrial pollution abatement.
In particular, the entire set of policy instruments available—including environmental taxes and charges, a subsidy for the replacement of polluting techniques, and the restoration and preservation of the natural ecosystem—will be used to achieve an optimal cooperative outcome. A payment distribution mechanism has to be formulated so that cooperative gains will be shared according to the proportions of the nations’ relative sizes of noncooperative payoffs throughout the planning horizon. In sum, the appropriate policy coordination will lead to the enhancement of economic performance and the realization of a cleaner environment. This analysis opens up a novel policy forum for the international community. A particularly relevant instance would be the formation of a United Nations Agency
to coordinate international cooperative actions on pollution and climate change. The proposed agency is to be comprised of three divisions. An executive branch will be established to coordinate the adoption and development of clean technology, pollution abatement activities, use of materials, waste disposal, mode of resource extraction, and cooperation in environmental R&D. A financial branch (or FUND) would be set up to handle pollution charges, clean technology subsidies, and allocate payoff distributions so that the agreed-upon optimality principle will be realized throughout the cooperative period. Lastly, a legislative body would be in place to enact regulations on the activities damaging the environment and in violation of the cooperative agreement. Finally, a large-scale scheme is in order for research in the mechanism design theory initiated by Hurwicz (1973) and refined and applied by Myerson (1989) and Maskin (1999). In particular, the mechanism designs for conventional markets in the face of the impact from a comprehensive set of environmental policy instruments, including taxes, subsidies, technology choices, pollution abatement activities, pollution legislations, and green technology R&D, have to be considered. In addition, the mechanism designs for intergovernment transfers, institution formation, like-market, and beyond-conventional market arrangements have also to be investigated.
6.6 A Model of Transboundary Industrial Pollution Management
As an illustration of a collaborative scheme for transboundary industrial pollution management we consider the deterministic version of the Yeung and Petrosyan (2008) game.
6.6.1 A Multinational Economy with Industrial Pollution
We first present a multinational economy with n asymmetric nations or regions. Industrial pollution is generated via the production process.
6.6.1.1 The Industrial Economy
Consider a multinational economy that is comprised of n nations. To allow different degrees of substitutability among the nations' outputs a differentiated products oligopoly model has to be adopted. The differentiated oligopoly model used by Dixit (1979) and Singh and Vives (1984) in industrial organization is adopted to characterize the interactions in this international market. In particular, the nations' outputs may range from a homogeneous product to n unrelated products. Specifically, the
inverse demand function of the output of nation $i \in N$ at time instant s is
$$P_i(s) = \alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s), \tag{6.24}$$
where $P_i(s)$ is the price of the output of nation i, $q_j(s)$ is the output of nation j, and $\alpha^i$ and $\beta_j^i$ for $i \in N$ and $j \in N$ are positive constants. The output choice $q_j(s) \in [0, \bar q_j]$ is nonnegative and bounded by a maximum output constraint $\bar q_j$. The output price is equal to zero if the right-hand side of (6.24) becomes negative. The demand system in (6.24) shows that the economy is a form of differentiated products oligopoly with substitute goods. In the case when $\alpha^i = \alpha^j$ and $\beta_j^i = \beta_i^j$ for all $i \in N$ and $j \in N$, the industrial outputs resemble a homogeneous good. In the case when $\beta_j^i = 0$ for $i \neq j$, the n nations produce n unrelated products. Industrial profits of nation i at time s can be expressed as
$$\pi_i(s) = \Big[\alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s)\Big] q_i(s) - c_i\, q_i(s) - v_i(s)\, q_i(s), \quad \text{for } i \in N, \tag{6.25}$$
where $v_i(s) \ge 0$ is the tax rate imposed by government i on its industrial output at time s and $c_i$ is the unit cost of production. At each time instant s, the industrial sector of nation $i \in N$ seeks to maximize (6.25). Note that each industrial sector would consider the information on the demand structure, each other's cost structures, and tax policies. The first-order condition for a Nash equilibrium for the n nations economy yields
$$\sum_{j=1}^n \beta_j^i\, q_j(s) + \beta_i^i\, q_i(s) = \alpha^i - c_i - v_i(s), \quad \text{for } i \in N. \tag{6.26}$$
With output tax rates $v(s) = \{v_1(s), v_2(s), \ldots, v_n(s)\}$ being regarded as parameters, (6.26) becomes a system of equations linear in $q(s) = \{q_1(s), q_2(s), \ldots, q_n(s)\}$. Solving (6.26) yields an industry equilibrium
$$q_i(s) = \phi_i\big(v(s)\big) = \bar\alpha^i + \sum_{j \in N} \bar\beta_j^i\, v_j(s), \tag{6.27}$$
where $\bar\alpha^i$ and $\bar\beta_j^i$, for $i \in N$ and $j \in N$, are constants involving the model parameters $\{\beta_1^1, \beta_2^1, \ldots, \beta_n^1; \beta_1^2, \beta_2^2, \ldots, \beta_n^2; \ldots; \beta_1^n, \beta_2^n, \ldots, \beta_n^n\}$, $\{\alpha^1, \alpha^2, \ldots, \alpha^n\}$, and $\{c_1, c_2, \ldots, c_n\}$. The industry equilibrium generated by this oligopoly model is computable and fully tractable. One can readily observe from (6.26) that an increase in the tax rate has the same effect as an increase in cost. Ceteris paribus, an increase in nation i's tax rate would depress the output of industrial sector i and vice versa. Given that outputs are substitutable products and the linear demand functions of (6.24), industrial sector i's output and nation j's tax rate, where $j \neq i$, are positively related.
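The comparative statics above can be illustrated by solving the linear system (6.26) directly. The following Python sketch is a minimal illustration under stated assumptions: the function names and the two-nation parameter values are hypothetical, the output bounds $[0, \bar q_j]$ are ignored (an interior equilibrium is assumed), and a small Gaussian elimination routine is used in place of a linear algebra library.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def industry_equilibrium(alpha, beta, cost, tax):
    """Outputs q solving (6.26); beta[i][j] stands for beta_j^i.

    Row i reads: sum_j beta_j^i q_j + beta_i^i q_i = alpha^i - c_i - v_i.
    """
    n = len(alpha)
    lhs = [[beta[i][j] + (beta[i][i] if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    rhs = [alpha[i] - cost[i] - tax[i] for i in range(n)]
    return solve_linear(lhs, rhs)


# Hypothetical symmetric two-nation example.
alpha = [10.0, 10.0]
beta = [[1.0, 0.5], [0.5, 1.0]]
cost = [2.0, 2.0]
q_no_tax = industry_equilibrium(alpha, beta, cost, [0.0, 0.0])
q_taxed = industry_equilibrium(alpha, beta, cost, [1.0, 0.0])
```

In this example, raising nation 1's tax lowers its own equilibrium output and raises the output of nation 2, in line with the signs described for (6.27).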
6.6.1.2 Local and Global Environmental Impacts
Industrial production emits pollutants into the environment. The emitted pollutants cause short-term local impacts on neighboring areas of the origin of production in forms like passing-by waste in waterways, wind-driven suspended particles in the air, unpleasant odor, noise, dust, and heat. For an output of $q_i(s)$ produced by nation i there will be a short-term local environmental impact (cost) of $\varepsilon_i^i q_i(s)$ on nation i itself and a local impact of $\varepsilon_j^i q_i(s)$ on its neighbor nation j. In particular, $\varepsilon_j^i$ is a positive constant. Nation i will receive short-term local environmental impacts from its adjacent nations measured as $\varepsilon_i^j q_j(s)$ for $j \in \bar K^i$. Thus $\bar K^i$ is the subset of nations whose outputs produce local environmental impacts to nation i. Moreover, industrial production will also create long-term global environmental impacts by building up existing pollution stocks, like greenhouse gases, chlorofluorocarbons (CFC), and atmospheric particulates. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s) \in R^+$ denote the level of pollution at time s. The dynamics of the pollution stock is governed by the differential equation
$$\dot x(s) = \sum_{j=1}^n a_j\, q_j(s) - \sum_{j=1}^n b_j\, u_j(s)\, x(s)^{1/2} - \delta\, x(s), \quad x(t_0) = x_{t_0}, \tag{6.28}$$
where $a_j > 0$, $b_j > 0$, and $\delta > 0$ are constants, $a_j q_j(s)$ is the amount added to the pollution stock by nation j's output $q_j(s)$, $u_j(s)$ is the pollution abatement effort of nation j, $b_j u_j(s)[x(s)]^{1/2}$ is the amount of pollution removed by $u_j(s)$ units of abatement effort of nation j, and $\delta$ is the natural rate of decay of the pollutants.
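To see how the stock dynamics in (6.28) behave, the following Python sketch integrates the equation with a simple forward Euler scheme. It is an illustration under stated assumptions: outputs and abatement efforts are held constant over time (in the game they are chosen strategically), and the function name and parameter values are hypothetical.

```python
import math

def simulate_stock(x0, outputs, efforts, a, b, delta, horizon, dt=0.01):
    """Forward Euler integration of (6.28) with constant q_j and u_j.

    x_dot = sum_j a_j q_j - sum_j b_j u_j sqrt(x) - delta x
    """
    x = x0
    emissions = sum(aj * qj for aj, qj in zip(a, outputs))
    abatement_scale = sum(bj * uj for bj, uj in zip(b, efforts))
    for _ in range(int(round(horizon / dt))):
        removal = abatement_scale * math.sqrt(max(x, 0.0))
        x += dt * (emissions - removal - delta * x)
        x = max(x, 0.0)  # the pollution stock cannot become negative
    return x


# Hypothetical two-nation parameters.
a = [0.6, 0.4]
b = [0.5, 0.5]
delta = 0.5
stock_without_abatement = simulate_stock(0.5, [1.0, 1.0], [0.0, 0.0], a, b, delta, 60.0)
stock_with_abatement = simulate_stock(0.5, [1.0, 1.0], [0.5, 0.5], a, b, delta, 60.0)
```

With no abatement the stock settles at total emissions divided by $\delta$; with positive abatement effort the long-run stock is lower, because the removal term grows with the square root of the stock.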
6.6.1.3 The Governments' Objectives
The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. In particular, each government maximizes the net gains in the industrial sector minus the sum of expenditures on pollution abatement and damages from pollution. The instantaneous objective of government i at time s can be expressed as
$$\Big[\alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s)\Big] q_i(s) - c_i\, q_i(s) - c_i^a\big[u_i(s)\big]^2 - \sum_{j \in \bar K^i} \varepsilon_i^j\, q_j(s) - h_i\, x(s), \quad \text{for } i \in N, \tag{6.29}$$
where cia > 0 and hi > 0 are constants, cia [ui (s)]2 is the cost of employing a ui amount of the pollution abatement effort, and hi x(s) is the value of damage to country i from an x(s) amount of pollution. The governments’ planning horizon is [t0 , T ]. It is possible that T may be very large. At time T , the terminal appraisal associated with the state of pollution is g i [x¯ i − x(T )], where g i ≥ 0 and x¯ i ≥ 0. The discount rate is r. Each one of the n
governments seeks to maximize the integral of its instantaneous objective in (6.29) over the planning horizon subject to the pollution dynamics in (6.28) with controls on the level of the abatement effort and output tax. By substituting qi (s), for i ∈ N , from (6.27) into (6.28) and (6.29) one obtains a differential game in which government i ∈ N seeks to
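$$\begin{aligned} \max_{v_i(s),\, u_i(s)} \bigg\{ \int_{t_0}^{T} \bigg[ & \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big]\Big) \Big[\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i\, v_h(s)\Big] \\ & - c_i \Big[\bar\alpha^i + \sum_{j \in N} \bar\beta_j^i\, v_j(s)\Big] - c_i^a\big[u_i(s)\big]^2 \\ & - \sum_{j \in \bar K^i} \varepsilon_i^j \Big[\bar\alpha^j + \sum_{\ell \in N} \bar\beta_\ell^j\, v_\ell(s)\Big] - h_i\, x(s) \bigg] e^{-r(s - t_0)}\, ds \\ & - g^i\big[x(T) - \bar x^i\big]\, e^{-r(T - t_0)} \bigg\}, \end{aligned} \tag{6.30}$$
subject to
$$\dot x(s) = \sum_{j=1}^n a_j \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big] - \sum_{j=1}^n b_j\, u_j(s)\, x(s)^{1/2} - \delta\, x(s), \quad x(t_0) = x_{t_0}. \tag{6.31}$$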
In the game in (6.30) and (6.31) one can readily observe that government i’s tax policy vi (s) is not only explicitly reflected in its own output but also on the outputs of other nations. This modeling formulation allows some intriguing scenarios to arise. For instance, an increase of vi (s) may just cause a minor drop in nation i’s industrial profit, but may cause significant increases in its neighbors’ outputs, which produce large, local, negative environmental impacts to nation i. This results in the nations’ reluctance to increase or impose taxes on industrial outputs.
6.6.2 Noncooperative Outcomes
In this section we discuss the solution to the noncooperative game in (6.30) and (6.31). Invoking Theorem 2.3 of Chap. 2 a feedback Nash equilibrium solution can be characterized as follows.
Corollary 6.2 A set of strategies $\{u_i^*(t) = \mu_i(t, x),\ v_i^*(t) = \phi_i(t, x),\ \text{for } i \in N\}$ provides a feedback Nash equilibrium solution to the game in (6.30) and (6.31) if there exist suitably smooth functions $V^{(t_0)i}(t, x) : [t_0, T] \times R \to R$, $i \in N$, satisfying the
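following partial differential equations:
$$\begin{aligned} -V_t^{(t_0)i}(t, x) = \max_{v_i,\, u_i} \bigg\{ \bigg[ & \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N,\, h \neq i} \bar\beta_h^j\, \phi_h(t, x) + \bar\beta_i^j\, v_i\Big]\Big) \\ & \times \Big[\bar\alpha^i + \sum_{h \in N,\, h \neq i} \bar\beta_h^i\, \phi_h(t, x) + \bar\beta_i^i\, v_i\Big] \\ & - c_i \Big[\bar\alpha^i + \sum_{j \in N,\, j \neq i} \bar\beta_j^i\, \phi_j(t, x) + \bar\beta_i^i\, v_i\Big] - c_i^a [u_i]^2 \\ & - \sum_{j \in \bar K^i} \varepsilon_i^j \Big[\bar\alpha^j + \sum_{\ell \in N,\, \ell \neq i} \bar\beta_\ell^j\, \phi_\ell(t, x) + \bar\beta_i^j\, v_i\Big] - h_i\, x \bigg] e^{-r(t - t_0)} \\ & + V_x^{(t_0)i}(t, x) \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N,\, h \neq i} \bar\beta_h^j\, \phi_h(t, x) + \bar\beta_i^j\, v_i\Big) \\ & \qquad - \sum_{j=1,\, j \neq i}^n b_j\, \mu_j(t, x)\, x^{1/2} - b_i\, u_i\, x^{1/2} - \delta x \bigg] \bigg\}, \end{aligned} \tag{6.32}$$
$$V^{(t_0)i}(T, x) = -g^i\big[x - \bar x^i\big]\, e^{-r(T - t_0)}. \tag{6.33}$$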
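Performing the indicated maximization in (6.32) yields
$$\mu_i(t, x) = -\frac{b_i}{2 c_i^a}\, V_x^{(t_0)i}(t, x)\, e^{r(t - t_0)}\, x^{1/2}, \tag{6.34}$$
$$\begin{aligned} & \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, \phi_h(t, x)\Big]\Big) \bar\beta_i^i - \Big[\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i\, \phi_h(t, x)\Big] \sum_{j=1}^n \beta_j^i\, \bar\beta_i^j \\ & \quad - c_i\, \bar\beta_i^i - \sum_{j \in \bar K^i} \varepsilon_i^j\, \bar\beta_i^j + V_x^{(t_0)i}(t, x) \sum_{j=1}^n a_j\, \bar\beta_i^j\, e^{r(t - t_0)} = 0, \end{aligned} \tag{6.35}$$
for $t \in [t_0, T]$ and $i \in N$. The system in (6.35) forms a set of equations linear in $\{\phi_1(t, x), \phi_2(t, x), \ldots, \phi_n(t, x)\}$ with $\{V_x^{(t_0)1}(t, x) e^{r(t - t_0)}, V_x^{(t_0)2}(t, x) e^{r(t - t_0)}, \ldots, V_x^{(t_0)n}(t, x) e^{r(t - t_0)}\}$ being taken as a set of parameters. Solving (6.35) yields
$$\phi_i(t, x) = \hat\alpha^i + \sum_{j \in N} \hat\beta_j^i\, V_x^{(t_0)j}(t, x)\, e^{r(t - t_0)}, \quad i \in N, \tag{6.36}$$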
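where $\hat\alpha^i$ and $\hat\beta_j^i$, for $i \in N$ and $j \in N$, are constants involving the constant coefficients in (6.35). Substituting the results in (6.34) and (6.36) into (6.32) and (6.33) we obtain the following.
Proposition 6.2 The system in (6.32) and (6.33) admits a solution
$$V^{(t_0)i}(t, x) = \big[A_i(t)\, x + C_i(t)\big]\, e^{-r(t - t_0)}, \quad \text{for } i \in N, \tag{6.37}$$
where $\{A_1(t), A_2(t), \ldots, A_n(t)\}$ satisfies the following set of constant coefficient quadratic ordinary differential equations:
$$\dot A_i(t) = (r + \delta) A_i(t) - \frac{b_i^2}{4 c_i^a} \big[A_i(t)\big]^2 - A_i(t) \sum_{j=1,\, j \neq i}^n \frac{b_j^2}{2 c_j^a}\, A_j(t) + h_i, \quad A_i(T) = -g^i, \quad \text{for } i \in N, \tag{6.38}$$
and $\{C_i(t);\ i \in N\}$ is given by
$$C_i(t) = e^{r(t - t_0)} \Big[ \int_{t_0}^{t} F_i(y)\, e^{-r(y - t_0)}\, dy + C_i^0 \Big], \tag{6.39}$$
where
$$C_i^0 = g^i\, \bar x^i\, e^{-r(T - t_0)} - \int_{t_0}^{T} F_i(y)\, e^{-r(y - t_0)}\, dy,$$
$$\begin{aligned} F_i(t) = {} & -\bigg(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\Big]\bigg) \\ & \times \Big[\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\Big] \\ & + c_i \Big[\bar\alpha^i + \sum_{j \in N} \bar\beta_j^i \Big(\hat\alpha^j + \sum_{k \in N} \hat\beta_k^j\, A_k(t)\Big)\Big] \\ & + \sum_{j \in \bar K^i} \varepsilon_i^j \Big[\bar\alpha^j + \sum_{\ell \in N} \bar\beta_\ell^j \Big(\hat\alpha^\ell + \sum_{k \in N} \hat\beta_k^\ell\, A_k(t)\Big)\Big] \\ & - A_i(t) \sum_{j=1}^n a_j \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\Big]. \end{aligned}$$
Proof See Appendix 1 in this chapter's appendixes.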
The corresponding feedback Nash equilibrium strategies of the game in (6.30) and (6.31) can be obtained as
$$\mu_i(t, x) = -\frac{b_i}{2 c_i^a}\, A_i(t)\, x^{1/2} \quad \text{and} \quad \phi_i(t, x) = \hat\alpha^i + \sum_{j \in N} \hat\beta_j^i\, A_j(t), \tag{6.40}$$
for $i \in N$ and $t \in [t_0, T]$. A remark that will be utilized in the subsequent analysis is given below.
Remark 6.3 Let $V^{(\tau)i}(t, x_t)$ denote the value function of nation i in a game with the payoffs in (6.30) and dynamics in (6.31) that starts at time $\tau$. One can readily verify that $V^{(\tau)i}(t, x_t) = V^{(t_0)i}(t, x_t)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$.
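The coefficients $A_i(t)$ in (6.38) have no closed form in general, but they can be obtained numerically by integrating backward from the terminal condition $A_i(T) = -g^i$. The Python sketch below is one possible way to do this, using a classical fourth-order Runge-Kutta scheme; the function names, the step count, and the parameter values are hypothetical and not taken from the text. The abatement rule from (6.40) is included to show how the coefficients feed into the equilibrium strategy.

```python
import math

def riccati_rhs(a_vals, r, delta, b, c_a, h):
    """Right-hand side of (6.38) for the vector (A_1, ..., A_n)."""
    n = len(a_vals)
    k = [b[j] ** 2 / (2.0 * c_a[j]) for j in range(n)]  # b_j^2 / (2 c_j^a)
    rhs = []
    for i in range(n):
        cross = sum(k[j] * a_vals[j] for j in range(n) if j != i)
        rhs.append((r + delta) * a_vals[i]
                   - 0.5 * k[i] * a_vals[i] ** 2   # b_i^2 / (4 c_i^a) * A_i^2
                   - a_vals[i] * cross
                   + h[i])
    return rhs


def solve_backward(r, delta, b, c_a, h, g, t0, T, steps=2000):
    """Integrate (6.38) from A_i(T) = -g^i back to t0 with RK4; returns A(t0)."""
    dt = (T - t0) / steps
    a_vals = [-gi for gi in g]

    def f(vals):
        return riccati_rhs(vals, r, delta, b, c_a, h)

    for _ in range(steps):
        k1 = f(a_vals)
        k2 = f([a - 0.5 * dt * d for a, d in zip(a_vals, k1)])
        k3 = f([a - 0.5 * dt * d for a, d in zip(a_vals, k2)])
        k4 = f([a - dt * d for a, d in zip(a_vals, k3)])
        a_vals = [a - dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
                  for a, d1, d2, d3, d4 in zip(a_vals, k1, k2, k3, k4)]
    return a_vals


def abatement_effort(b_i, c_a_i, a_i, x):
    """Equilibrium abatement mu_i(t, x) = -(b_i / (2 c_i^a)) A_i(t) sqrt(x), from (6.40)."""
    return -b_i / (2.0 * c_a_i) * a_i * math.sqrt(x)


# Hypothetical two-nation parameters.
A_t0 = solve_backward(r=0.05, delta=0.1, b=[1.0, 0.5], c_a=[1.0, 2.0],
                      h=[1.0, 0.5], g=[2.0, 1.0], t0=0.0, T=5.0)
```

In this example the coefficients come out negative, so the abatement efforts in (6.40) are positive: a larger pollution stock lowers each nation's value and calls for more abatement.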
6.7 Collaborative Scheme in Transboundary Industrial Pollution Management
Now consider the case when all the nations want to cooperate and agree to act so that an international optimum can be achieved. Since nations are asymmetric and the number of nations may be large, a reasonable optimality principle for gain distribution is to share the gain from cooperation proportional to the relative sizes of the nations' noncooperative payoffs. As mentioned before, to ensure that the cooperative solution is time consistent, the above optimality principle must be maintained throughout the game.
6.7.1 Cooperative Optimization and State Trajectory
To secure group optimality the participating nations seek to maximize their joint payoff by solving the following optimal control problem:
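$$\begin{aligned} \max_{v_1, \ldots, v_n;\, u_1, \ldots, u_n} \bigg\{ \int_{t_0}^{T} \sum_{\ell=1}^n \bigg[ & \Big(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big]\Big) \Big[\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell\, v_h(s)\Big] \\ & - c_\ell \Big[\bar\alpha^\ell + \sum_{j \in N} \bar\beta_j^\ell\, v_j(s)\Big] - c_\ell^a \big[u_\ell(s)\big]^2 \\ & - \sum_{j \in \bar K^\ell} \varepsilon_\ell^j \Big[\bar\alpha^j + \sum_{k \in N} \bar\beta_k^j\, v_k(s)\Big] - h_\ell\, x(s) \bigg] e^{-r(s - t_0)}\, ds \\ & - \sum_{\ell=1}^n g^\ell \big[x(T) - \bar x^\ell\big]\, e^{-r(T - t_0)} \bigg\}, \end{aligned} \tag{6.41}$$
subject to (6.31).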
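Invoking Theorem A.1 in the Technical Appendixes, a set of controls $\{[v_i^{**}(t), u_i^{**}(t)] = [\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N\}$ constitutes an optimal solution to the control problem in (6.41) and (6.31) if there exists a continuously differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R \to R$ satisfying the following partial differential equations:
$$\begin{aligned} -W_t^{(t_0)}(t, x) = \max_{v_1, \ldots, v_n;\, u_1, \ldots, u_n} \bigg\{ \sum_{\ell=1}^n \bigg[ & \Big(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h\Big]\Big) \Big[\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell\, v_h\Big] \\ & - c_\ell \Big[\bar\alpha^\ell + \sum_{j \in N} \bar\beta_j^\ell\, v_j\Big] - c_\ell^a [u_\ell]^2 \\ & - \sum_{j \in \bar K^\ell} \varepsilon_\ell^j \Big[\bar\alpha^j + \sum_{k \in N} \bar\beta_k^j\, v_k\Big] - h_\ell\, x \bigg] e^{-r(t - t_0)} \\ & + W_x^{(t_0)}(t, x) \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h\Big) - \sum_{j=1}^n b_j\, u_j\, x^{1/2} - \delta x \bigg] \bigg\}, \end{aligned} \tag{6.42}$$
$$W^{(t_0)}(T, x) = -\sum_{i=1}^n g^i \big[x(T) - \bar x^i\big]\, e^{-r(T - t_0)}. \tag{6.43}$$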
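Performing the indicated maximization in (6.42) yields the optimal controls under cooperation as
\[
\varpi_i(t,x)=-\frac{b_i}{2c_i^a}\,W_x^{(t_0)}(t,x)\,e^{r(t-t_0)}\,x^{1/2},\quad\text{for } i\in N;
\tag{6.44}
\]
\[
\begin{aligned}
&\sum_{\ell=1}^{n}\Bigl[\alpha^\ell-\sum_{j=1}^{n}\beta_j^\ell\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\psi_h(t,x)\Bigr)\Bigr]\bar\beta_i^\ell
-\sum_{\ell=1}^{n}\Bigl[\sum_{j=1}^{n}\beta_j^\ell\bar\beta_i^j\Bigr]\Bigl[\bar\alpha^\ell+\sum_{h\in N}\bar\beta_h^\ell\psi_h(t,x)\Bigr]\\
&\quad-\sum_{\ell=1}^{n}\Bigl[c_\ell\bar\beta_i^\ell+\sum_{j\in\bar K^\ell}\varepsilon_\ell^j\bar\beta_i^j\Bigr]
+W_x^{(t_0)}(t,x)\sum_{j=1}^{n}a_j\bar\beta_i^j\,e^{r(t-t_0)}=0,\quad\text{for } i\in N.
\end{aligned}
\tag{6.45}
\]
The system in (6.45) can be viewed as a set of equations linear in $\{\psi_1(t,x),\psi_2(t,x),\dots,\psi_n(t,x)\}$ with $W_x^{(t_0)}(t,x)e^{r(t-t_0)}$ being taken as a parameter. Solving (6.45) yields
\[
\psi_i(t,x)=\hat{\hat\alpha}^i+\hat{\hat\beta}^i\,W_x^{(t_0)}(t,x)\,e^{r(t-t_0)},
\tag{6.46}
\]
where $\hat{\hat\alpha}^i$ and $\hat{\hat\beta}^i$, for $i\in N$, are constants involving the model parameters.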
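Proposition 6.3 The system in (6.42) and (6.43) admits a solution
\[
W^{(t_0)}(t,x)=\bigl[A^*(t)x+C^*(t)\bigr]e^{-r(t-t_0)},
\tag{6.47}
\]
with
\[
A^*(t)=A_P^*+\Phi^*(t)\Bigl[\bar C^*-\int_{t_0}^{t}\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\,\Phi^*(y)\,dy\Bigr]^{-1}
\]
and
\[
C^*(t)=e^{r(t-t_0)}\Bigl[\int_{t_0}^{t}F^*(y)e^{-r(y-t_0)}\,dy+C_*^0\Bigr],
\]
where
\[
\Phi^*(t)=\exp\Bigl\{\int_{t_0}^{t}\Bigl[\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A_P^*+(r+\delta)\Bigr]dy\Bigr\},
\]
\[
\bar C^*=\frac{-\Phi^*(T)}{A_P^*+\sum_{j=1}^{n}g^j}+\int_{t_0}^{T}\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\,\Phi^*(y)\,dy,
\]
\[
A_P^*=\frac{(r+\delta)-\Bigl[(r+\delta)^2+4\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\sum_{j=1}^{n}h_j\Bigr]^{1/2}}{\sum_{j=1}^{n}b_j^2/c_j^a},
\]
\[
\begin{aligned}
F^*(t)=-\sum_{\ell=1}^{n}\Biggl\{
&\Bigl[\alpha^\ell-\sum_{j=1}^{n}\beta_j^\ell\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^\ell+\sum_{h\in N}\bar\beta_h^\ell\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr]\\
&-c_\ell\Bigl[\bar\alpha^\ell+\sum_{j\in N}\bar\beta_j^\ell\bigl[\hat{\hat\alpha}^j+\hat{\hat\beta}^jA^*(t)\bigr]\Bigr]
-\sum_{j\in\bar K^\ell}\varepsilon_\ell^j\Bigl[\bar\alpha^j+\sum_{k\in N}\bar\beta_k^j\bigl[\hat{\hat\alpha}^k+\hat{\hat\beta}^kA^*(t)\bigr]\Bigr]\Biggr\}\\
&-A^*(t)\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr),
\end{aligned}
\]
and
\[
C_*^0=\sum_{j=1}^{n}g^j\bar x^j e^{-r(T-t_0)}-\int_{t_0}^{T}F^*(y)e^{-r(y-t_0)}\,dy.
\]

Proof See Appendix 2 in this chapter's appendixes.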
Using (6.44), (6.46), and (6.47), the control strategy under cooperation can be obtained as
\[
\psi_i(t,x)=\hat{\hat\alpha}^i+\hat{\hat\beta}^iA^*(t)
\quad\text{and}\quad
\varpi_i(t,x)=-\frac{b_i}{2c_i^a}A^*(t)\,x^{1/2},
\tag{6.48}
\]
for $t\in[t_0,T]$ and $i=1,2,\dots,n$.

Substituting the optimal control strategy from (6.48) into (6.26) yields the dynamics of pollution accumulation under cooperation. Solving the cooperative pollution dynamics yields the cooperative state trajectory
\[
\begin{aligned}
x^*(t)=\exp\Bigl\{\int_{t_0}^{t}\Bigl[\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A^*(s)-\delta\Bigr]ds\Bigr\}
\times\Biggl\{x_{t_0}+\int_{t_0}^{t}
&\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(s)\bigr]\Bigr)\\
&\times\exp\Bigl\{\int_{t_0}^{s}\Bigl[\delta-\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A^*(\tau)\Bigr]d\tau\Bigr\}\,ds\Biggr\},
\end{aligned}
\tag{6.49}
\]
for $t\in[t_0,T]$. We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time $t$ generated by (6.49). The term $x_t^*$ is used to denote an element in the set $X_t^*$.

A remark that will be utilized in the subsequent analysis is given below.

Remark 6.4 Let $W^{(\tau)}(t,x_t)$ denote the value function of the optimal control problem with the objective in (6.41) and dynamics in (6.31), which starts at time $\tau$. One can readily verify that
\[
W^{(\tau)}(t,x_t^*)=W^{(t_0)}(t,x_t^*)\,e^{r(\tau-t_0)},\quad\text{for } \tau\in[t_0,T].
\]

A group optimal scenario will be realized with the nations adopting the cooperative strategies in (6.48). To construct a cooperative solution that every party will commit to throughout the cooperation period, a time consistent solution has to be sought. This will be investigated in the following subsection.
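The structure of the cooperative trajectory in (6.49) can be checked numerically. The following is a minimal sketch, not the authors' procedure: the function name `cooperative_stock` and all parameter values are hypothetical, `K` stands in for the sum of $b_j^2/(2c_j^a)$, and `G(s)` stands in for the emission term $\sum_j a_j(\bar\alpha^j+\dots)$ appearing inside the integral. Both integrals are approximated with the trapezoid rule.

```python
import math


def cooperative_stock(x0, t0, t, A_star, G, K, delta, n=2000):
    """Evaluate a (6.49)-style formula by trapezoid quadrature:
    x(t) = exp(int_{t0}^t [K*A_star(s) - delta] ds)
           * (x0 + int_{t0}^t G(s) * exp(int_{t0}^s [delta - K*A_star(u)] du) ds).
    A_star and G are callables; K and delta are scalars (all assumed inputs)."""
    h = (t - t0) / n
    grid = [t0 + k * h for k in range(n + 1)]
    rate = [K * A_star(s) - delta for s in grid]
    # Cumulative integral of the linear coefficient K*A_star(s) - delta.
    cum_rate = [0.0]
    for k in range(1, n + 1):
        cum_rate.append(cum_rate[-1] + 0.5 * h * (rate[k - 1] + rate[k]))
    # Integrand of the outer integral, discounted by the inverse growth factor.
    integrand = [G(s) * math.exp(-cum_rate[k]) for k, s in enumerate(grid)]
    inner = sum(0.5 * h * (integrand[k - 1] + integrand[k]) for k in range(1, n + 1))
    return math.exp(cum_rate[-1]) * (x0 + inner)


# Hypothetical constant inputs, chosen only so that a closed form is available.
x_at_2 = cooperative_stock(100.0, 0.0, 2.0, lambda s: -0.6, lambda s: 5.0, 0.5, 0.04)
```

With constant inputs the linear dynamics have an elementary closed form, which makes the sketch easy to validate before using time-varying `A_star` and `G`.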
6.7.2 Consistent Imputation and Benefit Distribution

To satisfy the property of time (optimal-trajectory-subgame) consistency, the optimality principle has to remain in effect throughout the cooperation period along the cooperative trajectory $\{x^*(\tau)\}_{\tau=t_0}^{T}$. Hence the solution imputation scheme $\{\xi^{(\tau)i}(\tau,x_\tau^*); \text{ for } i\in N\}$ has to satisfy the following:
Condition 6.2
\[
\xi^{(\tau)i}(\tau,x_\tau^*)=\frac{V^{(\tau)i}(\tau,x_\tau^*)}{\sum_{j=1}^{n}V^{(\tau)j}(\tau,x_\tau^*)}\,W^{(\tau)}(\tau,x_\tau^*),
\tag{6.50}
\]
for $i\in N$, $x_\tau^*\in X_\tau^*$ and $\tau\in[t_0,T]$.

To formulate a payoff distribution procedure over time so that the agreed imputations satisfy Condition 6.2 we obtain the following.

Proposition 6.4 A distribution scheme with a terminal payment $-g^i[x_T^*-\bar x^i]$ at time $T$ and an instantaneous payment at time $\tau\in[t_0,T]$
\[
\begin{aligned}
B_i(\tau)=
&-\frac{\partial}{\partial t}\Biggl[\frac{V^{(\tau)i}(t,x_t^*)}{\sum_{j=1}^{n}V^{(\tau)j}(t,x_t^*)}\,W^{(\tau)}(t,x_t^*)\Biggr]\Biggr|_{t=\tau}
-\frac{\partial}{\partial x_\tau^*}\Biggl[\frac{V^{(\tau)i}(\tau,x_\tau^*)}{\sum_{j=1}^{n}V^{(\tau)j}(\tau,x_\tau^*)}\,W^{(\tau)}(\tau,x_\tau^*)\Biggr]\\
&\times\Biggl[\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\psi_h(\tau,x_\tau^*)\Bigr)-\sum_{j=1}^{n}b_j\varpi_j(\tau,x_\tau^*)(x_\tau^*)^{1/2}-\delta x_\tau^*\Biggr],
\end{aligned}
\tag{6.51}
\]
for $i\in N$, will lead to the realization of the solution imputations $\xi^{(\tau)i}(\tau,x_\tau^*)$, for $i\in N$ and $\tau\in[t_0,T]$, satisfying Condition 6.2.

Proof Invoking Theorem 4.2 in Chap. 4, Proposition 6.4 follows.
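The proportional sharing rule in Condition 6.2 can be illustrated with a small numerical sketch. The function name `proportional_imputation` and every number below are hypothetical and are not taken from the text; the inputs stand in for the noncooperative values $V^{(\tau)j}(\tau,x_\tau^*)$ and the cooperative value $W^{(\tau)}(\tau,x_\tau^*)$ at one fixed $(\tau,x_\tau^*)$.

```python
def proportional_imputation(noncoop_values, coop_value):
    """Split coop_value in proportion to each nation's noncooperative payoff,
    in the spirit of (6.50). All inputs are hypothetical numbers."""
    total = sum(noncoop_values)
    if total == 0:
        raise ValueError("noncooperative payoffs must not sum to zero")
    return [v / total * coop_value for v in noncoop_values]


# Assumed payoffs for three nations at some fixed (tau, x_tau*).
v = [30.0, 50.0, 20.0]   # stand-ins for V^(tau)j
w = 120.0                # stand-in for W^(tau)
shares = proportional_imputation(v, w)
```

With these assumed numbers the shares add up to the cooperative payoff, and each nation receives at least its noncooperative payoff because `w` exceeds `sum(v)` and all payoffs are positive.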
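Using Propositions 6.2, 6.3, and (6.48), one can express $B_i(\tau)$ in Proposition 6.4 as
\[
\begin{aligned}
B_i(\tau)=
&-\frac{A_i(\tau)x_\tau^*+C_i(\tau)}{\sum_{j=1}^{n}[A_j(\tau)x_\tau^*+C_j(\tau)]}
\Bigl\{\bigl[\dot A^*(\tau)x_\tau^*+\dot C^*(\tau)\bigr]-r\bigl[A^*(\tau)x_\tau^*+C^*(\tau)\bigr]\Bigr\}\\
&-\frac{A^*(\tau)x_\tau^*+C^*(\tau)}{\sum_{j=1}^{n}[A_j(\tau)x_\tau^*+C_j(\tau)]}
\Bigl\{\bigl[\dot A_i(\tau)x_\tau^*+\dot C_i(\tau)\bigr]-r\bigl[A_i(\tau)x_\tau^*+C_i(\tau)\bigr]\Bigr\}\\
&+\frac{[A_i(\tau)x_\tau^*+C_i(\tau)][A^*(\tau)x_\tau^*+C^*(\tau)]}{\bigl(\sum_{j=1}^{n}[A_j(\tau)x_\tau^*+C_j(\tau)]\bigr)^2}
\sum_{j=1}^{n}\Bigl\{\bigl[\dot A_j(\tau)x_\tau^*+\dot C_j(\tau)\bigr]-r\bigl[A_j(\tau)x_\tau^*+C_j(\tau)\bigr]\Bigr\}\\
&-\Biggl\{\frac{A_i(\tau)[A^*(\tau)x_\tau^*+C^*(\tau)]+A^*(\tau)[A_i(\tau)x_\tau^*+C_i(\tau)]}{\sum_{j=1}^{n}[A_j(\tau)x_\tau^*+C_j(\tau)]}
-\frac{[A_i(\tau)x_\tau^*+C_i(\tau)][A^*(\tau)x_\tau^*+C^*(\tau)]}{\bigl(\sum_{j=1}^{n}[A_j(\tau)x_\tau^*+C_j(\tau)]\bigr)^2}\sum_{j=1}^{n}A_j(\tau)\Biggr\}\\
&\quad\times\Biggl[\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(\tau)\bigr]\Bigr)+\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A^*(\tau)x_\tau^*-\delta x_\tau^*\Biggr],
\end{aligned}
\tag{6.52}
\]
for $i\in N$.

A time (optimal-trajectory-subgame) consistent solution to the multinational collaboration in environmental management can then be expressed as
\[
P(x_t^*,T-t)=\Bigl\{\bigl[\psi(s,x^*(s))\bigr]_{s=t}^{T},\ \bigl[\varpi(s,x^*(s))\bigr]_{s=t}^{T},\ \bigl[B(s)\bigr]_{s=t}^{T},\ \xi^{(t)}(t,x_t^*)\Bigr\},
\]
for $t\in[t_0,T]$, where
\[
\psi_i(s,x^*(s))=\hat{\hat\alpha}^i+\hat{\hat\beta}^iA^*(s),\qquad
\varpi_i(s,x^*(s))=-\frac{b_i}{2c_i^a}A^*(s)\bigl[x^*(s)\bigr]^{1/2},
\]
$B_i(\tau)$ is given as in (6.52), and $\xi^{(t)}(t,x_t^*)$ is an imputation scheme satisfying Condition 6.2.

When all nations are adopting the cooperative strategies, the rate of instantaneous payment that nation $\ell\in N$ will realize at time $t$ with the state being $x_t^*$, denoted here by $\zeta_\ell(t)$, can be expressed as
\[
\begin{aligned}
\zeta_\ell(t)=
&\Bigl[\alpha^\ell-\sum_{j=1}^{n}\beta_j^\ell\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^\ell+\sum_{h\in N}\bar\beta_h^\ell\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr]\\
&-c_\ell\Bigl[\bar\alpha^\ell+\sum_{j\in N}\bar\beta_j^\ell\bigl[\hat{\hat\alpha}^j+\hat{\hat\beta}^jA^*(t)\bigr]\Bigr]
-c_\ell^a\Bigl[\frac{b_\ell}{2c_\ell^a}A^*(t)\Bigr]^2x_t^*\\
&-\sum_{j\in\bar K^\ell}\varepsilon_\ell^j\Bigl[\bar\alpha^j+\sum_{k\in N}\bar\beta_k^j\bigl[\hat{\hat\alpha}^k+\hat{\hat\beta}^kA^*(t)\bigr]\Bigr]-h_\ell\,x_t^*.
\end{aligned}
\tag{6.53}
\]
According to Proposition 6.4, under the cooperative scheme nation $\ell$ is to receive an instantaneous payment equaling $B_\ell(t)$ at time $t$. Hence a side payment of the value $B_\ell(t)-\zeta_\ell(t)$ will be offered to nation $\ell$.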
6.8 Exercises

6.1 Consider a two-nation international economy in which transboundary pollution is generated by the production process. The planning horizon is $[0,2]$. The inverse demand functions of the outputs of nations 1 and 2 at time instant $s\in[0,2]$ are, respectively,
\[
P_1(s)=100-2q_1(s)-0.5q_2(s)\quad\text{and}\quad P_2(s)=120-q_1(s)-0.5q_2(s),
\]
where $P_i(s)$ is the price of the output and $q_i(s)$ is the output of nation $i$. The unit costs of production in these nations are $c_1=1$ and $c_2=1.5$. The instantaneous industrial profits of nations 1 and 2 at time $s$ can be expressed as
\[
\pi_1(s)=\bigl[100-2q_1(s)-0.5q_2(s)\bigr]q_1(s)-q_1(s)-v_1(s)q_1(s)
\]
and
\[
\pi_2(s)=\bigl[120-0.5q_1(s)-q_2(s)\bigr]q_2(s)-1.5q_2(s)-v_2(s)q_2(s),
\]
where $v_i(s)\ge 0$ is the tax rate imposed by government $i$ on its industrial output at time $s$. At each time instant $s$, the industrial sector of nation $i\in\{1,2\}$ seeks to maximize its instantaneous profit. Derive the market equilibrium at time instant $s$.

6.2 Industrial production emits pollutants into the environment. The emitted pollutants cause short-term local impacts on neighboring areas of the origin of production in forms like passing-by waste in waterways, wind-driven suspended particles in the air, unpleasant odor, noise, dust, and heat. For an output of $q_1(s)$ produced by nation 1 there will be a short-term local environmental impact (cost) of $0.2q_1(s)$ on nation 1 itself and a local impact of $0.1q_1(s)$ on its neighbor nation 2. On the other hand, for an output of $q_2(s)$ produced by nation 2, there will be a short-term local environmental impact (cost) of $0.3q_2(s)$ on nation 2 itself and a local impact of $0.25q_2(s)$ on its neighbor nation 1. Moreover, industrial production will also create long-term global environmental impacts by building up existing pollution stocks, like greenhouse gases, CFC, and atmospheric particulates. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s)$ denote the level of pollution and $u_j(s)$ the pollution abatement effort of nation $j$ at time $s$. The dynamics of the pollution stock is governed by the differential equation
\[
\dot x(s)=4q_1(s)+3q_2(s)-0.05u_1(s)\,x(s)^{1/2}-0.02u_2(s)\,x(s)^{1/2}-0.04x(s),\qquad x(0)=100.
\]
The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. The damages to countries 1 and 2 from an $x(s)$ amount of pollution are $0.15x(s)$ and $0.1x(s)$. In particular, each government maximizes the net gains in the industrial sector minus the sum of the expenditures on pollution abatement and damages from pollution. The instantaneous objectives of governments 1 and 2 at time $s$ are, respectively,
\[
\bigl[100-2q_1(s)-0.5q_2(s)\bigr]q_1(s)-q_1(s)-0.2q_1(s)-0.25q_2(s)-0.15x(s),
\]
and
\[
\bigl[120-q_1(s)-q_2(s)\bigr]q_2(s)-1.5q_2(s)-0.1q_1(s)-0.3q_2(s)-0.1x(s).
\]
The governments' planning horizon is $[0,2]$. At terminal time 2, the terminal appraisal associated with the state of pollution is $5[60-x(2)]$ for nation 1 and $4[40-x(2)]$ for nation 2. The discount rate is 0.05. Each government seeks to maximize the integral of its instantaneous objective over the planning horizon subject to pollution stock dynamics. Construct a differential game of noncooperative pollution management by these two nations. Obtain a feedback Nash equilibrium solution for the game.

6.3 Consider the case when both nations want to cooperate and agree to act so that an international optimum can be achieved. Obtain the optimal cooperative levels of outputs and abatement efforts.

6.4 These cooperating nations adopt an optimality principle that distributes the gain from cooperation proportional to the relative sizes of the nations' noncooperative payoffs. Characterize a time (optimal-trajectory-subgame) consistent solution.
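Appendix 1: Proof of Proposition 6.2

Using (6.34), (6.36), and (6.37), the system in (6.32) and (6.33) can be expressed as
\[
\begin{aligned}
&r\bigl[A_i(t)x+C_i(t)\bigr]-\bigl[\dot A_i(t)x+\dot C_i(t)\bigr]\\
&=\Bigl[\alpha^i-\sum_{j=1}^{n}\beta_j^i\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^i+\sum_{h\in N}\bar\beta_h^i\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr]\\
&\quad-c_i\Bigl[\bar\alpha^i+\sum_{j\in N}\bar\beta_j^i\Bigl[\hat\alpha^j+\sum_{k\in N}\hat\beta_k^jA_k(t)\Bigr]\Bigr]
-c_i^a\Bigl[\frac{b_i}{2c_i^a}A_i(t)\Bigr]^2x\\
&\quad-\sum_{j\in\bar K^i}\varepsilon_i^j\Bigl[\bar\alpha^j+\sum_{\ell\in N}\bar\beta_\ell^j\Bigl[\hat\alpha^\ell+\sum_{k\in N}\hat\beta_k^\ell A_k(t)\Bigr]\Bigr]-h_ix\\
&\quad+A_i(t)\Biggl[\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr)+\sum_{j=1}^{n}b_j\frac{b_j}{2c_j^a}A_j(t)x-\delta x\Biggr],
\end{aligned}
\tag{6.54}
\]
\[
A_i(T)x+C_i(T)=-g^i\bigl[x-\bar x^i\bigr],\quad\text{for } i\in N.
\tag{6.55}
\]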
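For (6.54) and (6.55) to hold, it is required that
\[
\dot A_i(t)=(r+\delta)A_i(t)-A_i(t)\sum_{\substack{j=1\\ j\ne i}}^{n}\frac{b_j^2}{2c_j^a}A_j(t)-\frac{b_i^2}{4c_i^a}\bigl[A_i(t)\bigr]^2+h_i,
\tag{6.56}
\]
\[
A_i(T)=-g^i,
\tag{6.57}
\]
\[
\begin{aligned}
\dot C_i(t)=rC_i(t)
&-\Bigl[\alpha^i-\sum_{j=1}^{n}\beta_j^i\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^i+\sum_{h\in N}\bar\beta_h^i\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr]\\
&+c_i\Bigl[\bar\alpha^i+\sum_{j\in N}\bar\beta_j^i\Bigl[\hat\alpha^j+\sum_{k\in N}\hat\beta_k^jA_k(t)\Bigr]\Bigr]
+\sum_{j\in\bar K^i}\varepsilon_i^j\Bigl[\bar\alpha^j+\sum_{\ell\in N}\bar\beta_\ell^j\Bigl[\hat\alpha^\ell+\sum_{k\in N}\hat\beta_k^\ell A_k(t)\Bigr]\Bigr]\\
&-A_i(t)\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\Bigl[\hat\alpha^h+\sum_{k\in N}\hat\beta_k^hA_k(t)\Bigr]\Bigr)\\
=rC_i(t)&+F_i(t),
\end{aligned}
\tag{6.58}
\]
\[
C_i(T)=g^i\bar x^i.
\tag{6.59}
\]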
Equations (6.56)–(6.59) form a block recursive system of differential equations, with (6.56) and (6.57) being independent of (6.58) and (6.59). Solving $\{A_1(t),A_2(t),\dots,A_n(t)\}$ in (6.56) and (6.57) and substituting them into (6.58) and (6.59) yields a system of linear first-order differential equations
\[
\dot C_i(t)=rC_i(t)+F_i(t),
\tag{6.60}
\]
\[
C_i(T)=g^i\bar x^i,\quad\text{for } i\in N.
\tag{6.61}
\]
Since $C_i(t)$ is independent of $C_j(t)$ for $i\ne j$, $C_i(t)$ can be solved as
\[
C_i(t)=e^{r(t-t_0)}\Bigl[\int_{t_0}^{t}F_i(y)e^{-r(y-t_0)}\,dy+C_i^0\Bigr],
\tag{6.62}
\]
where
\[
C_i^0=g^i\bar x^ie^{-r(T-t_0)}-\int_{t_0}^{T}F_i(y)e^{-r(y-t_0)}\,dy.
\tag{6.63}
\]
Hence Proposition 6.2 follows.
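The closed form in (6.62) and (6.63) can be sanity-checked numerically. The following is a minimal sketch under stated assumptions: the function name `solve_c`, the forcing term `F`, and all numbers are hypothetical, and the integrals are approximated with the trapezoid rule. It evaluates the formula on a grid so that the terminal condition (6.61) and the differential equation (6.60) can both be inspected.

```python
import math


def solve_c(F, r, t0, T, terminal, n=2000):
    """Evaluate C(t) = e^{r(t-t0)} * (int_{t0}^t F(y) e^{-r(y-t0)} dy + C0) on a grid,
    with C0 = terminal * e^{-r(T-t0)} - int_{t0}^T F(y) e^{-r(y-t0)} dy.
    F is an assumed forcing term; terminal plays the role of g^i * xbar^i."""
    h = (T - t0) / n
    ts = [t0 + k * h for k in range(n + 1)]
    g = [F(t) * math.exp(-r * (t - t0)) for t in ts]
    # Cumulative trapezoid integral of the discounted forcing term.
    cum = [0.0]
    for k in range(1, n + 1):
        cum.append(cum[-1] + 0.5 * h * (g[k - 1] + g[k]))
    c0 = terminal * math.exp(-r * (T - t0)) - cum[-1]
    cs = [math.exp(r * (t - t0)) * (cum[k] + c0) for k, t in enumerate(ts)]
    return ts, cs


# Hypothetical inputs: a linear forcing term, r = 0.05, horizon [0, 2].
ts, cs = solve_c(lambda t: 1.0 + 0.5 * t, 0.05, 0.0, 2.0, 3.0)
```

The last grid value reproduces the assumed terminal value, and a central finite difference of `cs` agrees with `r * C + F` up to discretization error.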
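Appendix 2: Proof of Proposition 6.3

Substituting (6.44) and (6.46) into (6.42) and using (6.47) one obtains
\[
\begin{aligned}
&r\bigl[A^*(t)x+C^*(t)\bigr]-\bigl[\dot A^*(t)x+\dot C^*(t)\bigr]\\
&=\sum_{\ell=1}^{n}\Biggl\{
\Bigl[\alpha^\ell-\sum_{j=1}^{n}\beta_j^\ell\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^\ell+\sum_{h\in N}\bar\beta_h^\ell\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr]\\
&\qquad-c_\ell\Bigl[\bar\alpha^\ell+\sum_{j\in N}\bar\beta_j^\ell\bigl[\hat{\hat\alpha}^j+\hat{\hat\beta}^jA^*(t)\bigr]\Bigr]
-c_\ell^a\Bigl[\frac{b_\ell}{2c_\ell^a}A^*(t)\Bigr]^2x\\
&\qquad-\sum_{j\in\bar K^\ell}\varepsilon_\ell^j\Bigl[\bar\alpha^j+\sum_{k\in N}\bar\beta_k^j\bigl[\hat{\hat\alpha}^k+\hat{\hat\beta}^kA^*(t)\bigr]\Bigr]-h_\ell x\Biggr\}\\
&\quad+A^*(t)\Biggl[\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)+\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A^*(t)x-\delta x\Biggr],
\end{aligned}
\tag{6.64}
\]
\[
A^*(T)x+C^*(T)=-\sum_{i=1}^{n}g^i\bigl[x(T)-\bar x^i\bigr].
\tag{6.65}
\]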
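For (6.64) and (6.65) to hold, it is required that
\[
\dot A^*(t)=(r+\delta)A^*(t)-\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\bigl[A^*(t)\bigr]^2+\sum_{j=1}^{n}h_j,
\tag{6.66}
\]
\[
A^*(T)=-\sum_{j=1}^{n}g^j;
\tag{6.67}
\]
\[
\begin{aligned}
\dot C^*(t)=rC^*(t)
&-\sum_{\ell=1}^{n}\Biggl\{
\Bigl[\alpha^\ell-\sum_{j=1}^{n}\beta_j^\ell\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)\Bigr]
\Bigl[\bar\alpha^\ell+\sum_{h\in N}\bar\beta_h^\ell\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr]\\
&\qquad-c_\ell\Bigl[\bar\alpha^\ell+\sum_{j\in N}\bar\beta_j^\ell\bigl[\hat{\hat\alpha}^j+\hat{\hat\beta}^jA^*(t)\bigr]\Bigr]
-\sum_{j\in\bar K^\ell}\varepsilon_\ell^j\Bigl[\bar\alpha^j+\sum_{k\in N}\bar\beta_k^j\bigl[\hat{\hat\alpha}^k+\hat{\hat\beta}^kA^*(t)\bigr]\Bigr]\Biggr\}\\
&-A^*(t)\sum_{j=1}^{n}a_j\Bigl(\bar\alpha^j+\sum_{h\in N}\bar\beta_h^j\bigl[\hat{\hat\alpha}^h+\hat{\hat\beta}^hA^*(t)\bigr]\Bigr)\\
=rC^*(t)&+F^*(t),
\end{aligned}
\tag{6.68}
\]
\[
C^*(T)=\sum_{j=1}^{n}g^j\bar x^j.
\tag{6.69}
\]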
Equations (6.66)–(6.69) form a block recursive system of differential equations, with (6.66) and (6.67) being independent of (6.68) and (6.69). Moreover, (6.66) and (6.67) form a Riccati equation with constant coefficients, the solution to which can be obtained by standard methods as
\[
A^*(t)=A_P^*+\Phi^*(t)\Bigl[\bar C^*-\int_{t_0}^{t}\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\,\Phi^*(y)\,dy\Bigr]^{-1},
\tag{6.70}
\]
where
\[
\Phi^*(t)=\exp\Bigl\{\int_{t_0}^{t}\Bigl[\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}A_P^*+(r+\delta)\Bigr]dy\Bigr\},
\]
\[
\bar C^*=\frac{-\Phi^*(T)}{A_P^*+\sum_{j=1}^{n}g^j}+\int_{t_0}^{T}\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\,\Phi^*(y)\,dy,
\]
and
\[
A_P^*=\frac{(r+\delta)-\Bigl[(r+\delta)^2+4\sum_{j=1}^{n}\frac{b_j^2}{2c_j^a}\sum_{j=1}^{n}h_j\Bigr]^{1/2}}{\sum_{j=1}^{n}b_j^2/c_j^a}
\]
is a particular solution of (6.66).

Substituting $A^*(t)$ above into (6.68), the system in (6.68) and (6.69) becomes a system of linear first-order differential equations
\[
\dot C^*(t)=rC^*(t)+F^*(t),
\tag{6.71}
\]
\[
C^*(T)=\sum_{j=1}^{n}g^j\bar x^j.
\tag{6.72}
\]
Solving (6.71) and (6.72) yields
\[
C^*(t)=e^{r(t-t_0)}\Bigl[\int_{t_0}^{t}F^*(y)e^{-r(y-t_0)}\,dy+C_*^0\Bigr],
\]
where
\[
C_*^0=\sum_{j=1}^{n}g^j\bar x^je^{-r(T-t_0)}-\int_{t_0}^{T}F^*(y)e^{-r(y-t_0)}\,dy.
\tag{6.73}
\]
Hence Proposition 6.3 follows.
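A terminal-value Riccati equation of the form in (6.66) and (6.67) can also be integrated numerically as a cross-check on any closed-form expression. The following is a minimal sketch, not the authors' method: it uses a classical fourth-order Runge-Kutta scheme run backward from the terminal time, and the names `particular_solution`, `riccati_rhs`, `solve_backward` and all parameter values are hypothetical. Here `K` stands for the coefficient on $[A^*(t)]^2$, `H` for the sum of the $h_j$, and `G` for the sum of the $g^j$.

```python
import math


def riccati_rhs(A, r, delta, K, H):
    """Right-hand side of dA/dt = (r + delta) * A - K * A**2 + H."""
    return (r + delta) * A - K * A * A + H


def particular_solution(r, delta, K, H):
    """Constant solution (smaller root) of (r + delta) * A - K * A**2 + H = 0."""
    return ((r + delta) - math.sqrt((r + delta) ** 2 + 4.0 * K * H)) / (2.0 * K)


def solve_backward(r, delta, K, H, G, t0, T, n=4000):
    """Integrate the Riccati equation backward from A(T) = -G with RK4.
    Returns the values of A on a uniform grid from t0 to T."""
    h = (T - t0) / n
    A = -G
    path = [A]
    for _ in range(n):
        # One RK4 step of size -h (stepping from t to t - h).
        k1 = riccati_rhs(A, r, delta, K, H)
        k2 = riccati_rhs(A - 0.5 * h * k1, r, delta, K, H)
        k3 = riccati_rhs(A - 0.5 * h * k2, r, delta, K, H)
        k4 = riccati_rhs(A - h * k3, r, delta, K, H)
        A -= h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(A)
    path.reverse()  # path[0] is A(t0), path[-1] is A(T)
    return path


# Hypothetical parameter values, chosen only for illustration.
r, delta, K, H, G = 0.05, 0.04, 0.5, 0.25, 9.0
a_p = particular_solution(r, delta, K, H)
path = solve_backward(r, delta, K, H, G, 0.0, 2.0)
```

For these assumed values the terminal value lies below the constant solution `a_p`, so the numerical path rises toward `a_p` as time runs backward without crossing it.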
Chapter 7
Dynamically Stable Dormant Firm Cartel
In this chapter, optimization by cartels that restrict output to enhance their joint profit is examined. In particular, we consider oligopolies in which firms agree to form a cartel to restrain output and enhance their profits. Some firms have cost disadvantages that force them to become dormant partners. In Sect. 7.1 a dynamic oligopoly in which there are cost differentials among firms is presented. A Pareto optimal output path, imputation schemes, profit sharing arrangements, and a time (optimal-trajectory-subgame) consistent solution are derived for a dormant firm cartel in Sect. 7.2. An illustration is shown in the following section. The case when the planning horizon becomes infinite is analyzed in Sect. 7.4, followed by an illustration with an explicit solution in the subsequent section.
7.1 A Dynamic Oligopoly

For analytical purposes we develop a dynamic model of oligopoly in which cost differentials among firms are present.

7.1.1 Basic Settings

Consider an oligopoly in which $n$ firms are allowed to extract a renewable resource within the duration $[t_0,T]$. Among the $n$ firms, $n_1$ of them have cost advantages over the other $n_2=n-n_1$ firms. For notational convenience, the firms with cost advantages are numbered from 1 to $n_1$ and the firms with cost disadvantages are numbered from $n_1+1$ to $n$. The subset of firms with cost advantages is denoted by $N_1$ and that of firms with cost disadvantages is denoted by $N_2$. The firms with cost advantages are identical and so are the firms with cost disadvantages.

D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_7, © Springer Science+Business Media, LLC 2012
The dynamics of the resource is characterized by the differential equations
\[
\dot x(s)=f\Bigl[s,x(s),\sum_{j=1}^{n}u_j(s)\Bigr]
=f\Bigl[s,x(s),\sum_{j^1\in N_1}u_{j^1}(s)+\sum_{j^2\in N_2}u_{j^2}(s)\Bigr],\qquad x(t_0)=x_0\in X,
\tag{7.1}
\]
where $u_i\in U_i$ is the (nonnegative) amount of resource extracted by firm $i$, for $i\in N$, and $x(s)$ is the resource stock. The extraction cost depends on the quantity of resource extracted $u_i(s)$ and the resource stock size $x(s)$. In particular, the extraction cost for the $n_1$ firms with cost advantages is $c^{j^1}[u_{j^1}(s),x(s)]$, for $j^1\in N_1$, and the extraction cost for the $n_2$ firms with cost disadvantages is $c^{j^2}[u_{j^2}(s),x(s)]$, for $j^2\in N_2$.

This formulation of unit cost follows from two assumptions: (i) the cost of extraction is positively related to the extraction effort, and (ii) the amount of the resource extracted, seen as the output of a production function of two inputs (effort and stock level), is increasing in both inputs (see Clark 1976). In particular, firm $j^1\in N_1$ has a cost advantage so that
\[
\partial c^{j^1}[u_{j^1}(s),x(s)]/\partial u_{j^1}(s)<\partial c^{j^2}[u_{j^2}(s),x(s)]/\partial u_{j^2}(s),
\]
for all levels of $u_{j^1}\in U_{j^1}$ and $u_{j^2}\in U_{j^2}$ at any $x\in X$.

The market price of the resource depends on the total amount extracted and supplied to the market. The price-output relationship at time $s$ is given by the following downward-sloping inverse demand curve $P(s)=g[Q(s)]$, where $Q(s)=\sum_{j^1\in N_1}u_{j^1}(s)+\sum_{j^2\in N_2}u_{j^2}(s)$ is the total amount of the resource extracted and marketed at time $s$. At time $T$, firm $j^1\in N_1$ will receive a termination bonus $q^{j^1}[x(T)]$ and firm $j^2\in N_2$ will receive a termination bonus $q^{j^2}[x(T)]$. There exists a discount rate $r$, and the profits received at time $t$ have to be discounted by the factor $\exp[-r(t-t_0)]$.

At time $t_0$, firm $j^1\in N_1$, which has cost advantages, seeks to maximize its profit
\[
\int_{t_0}^{T}\Bigl\{g\Bigl[\sum_{h\in N_1}u_h(s)+\sum_{\ell\in N_2}u_\ell(s)\Bigr]u_{j^1}(s)-c^{j^1}\bigl[u_{j^1}(s),x(s)\bigr]\Bigr\}\exp[-r(s-t_0)]\,ds
+\exp[-r(T-t_0)]\,q^{j^1}[x(T)],
\tag{7.2}
\]
subject to (7.1).
At time $t_0$, firm $j^2\in N_2$, which has cost disadvantages, seeks to maximize profit
\[
\int_{t_0}^{T}\Bigl\{g\Bigl[\sum_{h\in N_1}u_h(s)+\sum_{\ell\in N_2}u_\ell(s)\Bigr]u_{j^2}(s)-c^{j^2}\bigl[u_{j^2}(s),x(s)\bigr]\Bigr\}\exp[-r(s-t_0)]\,ds
+\exp[-r(T-t_0)]\,q^{j^2}[x(T)],
\tag{7.3}
\]
subject to (7.1). The noncooperative market equilibrium of the oligopoly game in (7.1)–(7.3) will be characterized next.
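7.1.2 Market Outcome

We use $\Gamma(x_0,T-t_0)$ to denote the game in (7.1)–(7.3) and $\Gamma(x_\tau,T-\tau)$ to denote an alternative game with state dynamics in (7.1) and the payoff structures in (7.2) and (7.3), which starts at time $\tau\in[t_0,T]$ with initial state $x_\tau\in X$. Invoking Theorem 2.3 in Chap. 2, a noncooperative Nash equilibrium solution of the game $\Gamma(x_\tau,T-\tau)$ can be characterized as follows.

Corollary 7.1 A set of feedback strategies $\{\phi_{j^1}^*(t,x)$ for $j^1\in N_1$ and $\phi_{j^2}^*(t,x)$ for $j^2\in N_2\}$ provides a Nash equilibrium solution to the game $\Gamma(x_\tau,T-\tau)$ if there exist continuously differentiable functions $V^{(\tau)j^1}(t,x):[\tau,T]\times R^m\to R$ for $j^1\in N_1$ and $V^{(\tau)j^2}(t,x):[\tau,T]\times R^m\to R$ for $j^2\in N_2$, satisfying the following partial differential equations:
\[
\begin{aligned}
-V_t^{(\tau)j^1}(t,x)=\max_{u_{j^1}}\Biggl\{
&\Bigl[g\Bigl(\sum_{\substack{h\in N_1\\ h\ne j^1}}\phi_h^*(t,x)+u_{j^1}+\sum_{\ell\in N_2}\phi_\ell^*(t,x)\Bigr)u_{j^1}-c^{j^1}(u_{j^1},x)\Bigr]\exp[-r(t-\tau)]\\
&+V_x^{(\tau)j^1}(t,x)\,f\Bigl[t,x,\sum_{\substack{h\in N_1\\ h\ne j^1}}\phi_h^*(t,x)+u_{j^1}+\sum_{\ell\in N_2}\phi_\ell^*(t,x)\Bigr]\Biggr\},\quad\text{and}\\
V^{(\tau)j^1}(T,x)=\exp[&-r(T-\tau)]\,q^{j^1}(x),\quad\text{for } j^1\in N_1;\\
-V_t^{(\tau)j^2}(t,x)=\max_{u_{j^2}}\Biggl\{
&\Bigl[g\Bigl(\sum_{h\in N_1}\phi_h^*(t,x)+\sum_{\substack{\ell\in N_2\\ \ell\ne j^2}}\phi_\ell^*(t,x)+u_{j^2}\Bigr)u_{j^2}-c^{j^2}(u_{j^2},x)\Bigr]\exp[-r(t-\tau)]\\
&+V_x^{(\tau)j^2}(t,x)\,f\Bigl[t,x,\sum_{h\in N_1}\phi_h^*(t,x)+\sum_{\substack{\ell\in N_2\\ \ell\ne j^2}}\phi_\ell^*(t,x)+u_{j^2}\Bigr]\Biggr\},\quad\text{and}\\
V^{(\tau)j^2}(T,x)=\exp[&-r(T-\tau)]\,q^{j^2}(x),\quad\text{for } j^2\in N_2.
\end{aligned}
\tag{7.4}
\]

Conditions satisfying the indicated maximization in (7.4) yield
\[
\begin{aligned}
&\Bigl[g\Bigl(\sum_{\substack{h\in N_1\\ h\ne j^1}}\phi_h^*(t,x)+u_{j^1}+\sum_{\ell\in N_2}\phi_\ell^*(t,x)\Bigr)
+g'\Bigl(\sum_{\substack{h\in N_1\\ h\ne j^1}}\phi_h^*(t,x)+u_{j^1}+\sum_{\ell\in N_2}\phi_\ell^*(t,x)\Bigr)u_{j^1}
-\frac{\partial}{\partial u_{j^1}}c^{j^1}(u_{j^1},x)\Bigr]\exp[-r(t-\tau)]\\
&\quad+V_x^{(\tau)j^1}(t,x)\frac{\partial}{\partial u_{j^1}}f\Bigl[t,x,\sum_{\substack{h\in N_1\\ h\ne j^1}}\phi_h^*(t,x)+u_{j^1}+\sum_{\ell\in N_2}\phi_\ell^*(t,x)\Bigr]=0,\quad\text{for } j^1\in N_1;\\
&\Bigl[g\Bigl(\sum_{h\in N_1}\phi_h^*(t,x)+\sum_{\substack{\ell\in N_2\\ \ell\ne j^2}}\phi_\ell^*(t,x)+u_{j^2}\Bigr)
+g'\Bigl(\sum_{h\in N_1}\phi_h^*(t,x)+\sum_{\substack{\ell\in N_2\\ \ell\ne j^2}}\phi_\ell^*(t,x)+u_{j^2}\Bigr)u_{j^2}
-\frac{\partial}{\partial u_{j^2}}c^{j^2}(u_{j^2},x)\Bigr]\exp[-r(t-\tau)]\\
&\quad+V_x^{(\tau)j^2}(t,x)\frac{\partial}{\partial u_{j^2}}f\Bigl[t,x,\sum_{h\in N_1}\phi_h^*(t,x)+\sum_{\substack{\ell\in N_2\\ \ell\ne j^2}}\phi_\ell^*(t,x)+u_{j^2}\Bigr]=0,\quad\text{for } j^2\in N_2.
\end{aligned}
\tag{7.5}
\]

The profits of firm $j^1\in N_1$, which has cost advantages, can be expressed as
\[
\begin{aligned}
V^{(\tau)j^1}(\tau,x_\tau)=\int_{\tau}^{T}\Bigl\{
&g\Bigl[\sum_{h\in N_1}\phi_h^*(s,x(s))+\sum_{\ell\in N_2}\phi_\ell^*(s,x(s))\Bigr]\phi_{j^1}^*(s,x(s))
-c^{j^1}\bigl[\phi_{j^1}^*(s,x(s)),x(s)\bigr]\Bigr\}\exp[-r(s-\tau)]\,ds\\
&+\exp[-r(T-\tau)]\,q^{j^1}[x(T)],
\end{aligned}
\]
for $j^1\in N_1$. The profits of firm $j^2\in N_2$, which has cost disadvantages, can be expressed as
\[
\begin{aligned}
V^{(\tau)j^2}(\tau,x_\tau)=\int_{\tau}^{T}\Bigl\{
&g\Bigl[\sum_{h\in N_1}\phi_h^*(s,x(s))+\sum_{\ell\in N_2}\phi_\ell^*(s,x(s))\Bigr]\phi_{j^2}^*(s,x(s))
-c^{j^2}\bigl[\phi_{j^2}^*(s,x(s)),x(s)\bigr]\Bigr\}\exp[-r(s-\tau)]\,ds\\
&+\exp[-r(T-\tau)]\,q^{j^2}[x(T)],
\end{aligned}
\]
for $j^2\in N_2$, where
\[
\dot x(s)=f\Bigl[s,x(s),\sum_{j^1\in N_1}u_{j^1}(s)+\sum_{j^2\in N_2}u_{j^2}(s)\Bigr],\qquad x(\tau)=x_\tau\in X.
\]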
The dynamic oligopoly model presented above is an extension of the dormant firm duopoly model in Yeung (2005).
7.2 Time (Optimal-Trajectory-Subgame) Consistent Cartel

Assume that the firms in the oligopoly agree to form a cartel to restrain output and enhance their profits.
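7.2.1 Pareto Optimal Output Path

To achieve a group optimum these firms are required to solve the following joint profit maximization problem:
\[
\begin{aligned}
\max_{u_1,u_2,\dots,u_n}\Biggl\{\int_{t_0}^{T}\Bigl\{
&g\Bigl[\sum_{h\in N_1}u_h(s)+\sum_{\ell\in N_2}u_\ell(s)\Bigr]\Bigl[\sum_{h\in N_1}u_h(s)+\sum_{\ell\in N_2}u_\ell(s)\Bigr]\\
&-\Bigl[\sum_{h\in N_1}c^h\bigl[u_h(s),x(s)\bigr]+\sum_{\ell\in N_2}c^\ell\bigl[u_\ell(s),x(s)\bigr]\Bigr]\Bigr\}\exp[-r(s-t_0)]\,ds\\
&+\exp[-r(T-t_0)]\Bigl[\sum_{h\in N_1}q^h[x(T)]+\sum_{\ell\in N_2}q^\ell[x(T)]\Bigr]\Biggr\},
\end{aligned}
\tag{7.6}
\]
subject to (7.1).

An optimal solution of the problem in (7.1) and (7.6) can be characterized using Theorem A.1 in the Technical Appendixes as follows.

Corollary 7.2 A set of control strategies $\{\psi_{j^1}^*(t,x)$ for $j^1\in N_1$ and $\psi_{j^2}^*(t,x)$ for $j^2\in N_2\}$ provides a solution to the control problem in (7.1) and (7.6), if there exists a continuously differentiable function $W^{(t_0)}(t,x):[t_0,T]\times R^m\to R$, satisfying the following partial differential equation:
\[
\begin{aligned}
-W_t^{(t_0)}(t,x)=\max_{u_1,u_2,\dots,u_n}\Biggl\{
&\Bigl[g\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)
-\sum_{h\in N_1}c^h[u_h,x]-\sum_{\ell\in N_2}c^\ell[u_\ell,x]\Bigr]\exp[-r(t-t_0)]\\
&+W_x^{(t_0)}(t,x)\,f\Bigl[t,x,\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr]\Biggr\},\quad\text{and}\\
W^{(t_0)}(T,x)=\exp[&-r(T-t_0)]\Bigl[\sum_{h\in N_1}q^h(x)+\sum_{\ell\in N_2}q^\ell(x)\Bigr].
\end{aligned}
\tag{7.7}
\]

Conditions satisfying the indicated maximization in (7.7) include
\[
\begin{aligned}
&\Bigl[g\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)+g'\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)
-\frac{\partial}{\partial u_{j^1}}c^{j^1}(u_{j^1},x)\Bigr]\exp[-r(t-t_0)]\\
&\quad+W_x^{(t_0)}(t,x)\frac{\partial}{\partial u_{j^1}}f\Bigl[t,x,\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr]\le 0,\qquad u_{j^1}\ge 0,
\end{aligned}
\]
and if $u_{j^1}>0$, the equality sign must hold, for $j^1\in N_1$;
\[
\begin{aligned}
&\Bigl[g\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)+g'\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)
-\frac{\partial}{\partial u_{j^2}}c^{j^2}(u_{j^2},x)\Bigr]\exp[-r(t-t_0)]\\
&\quad+W_x^{(t_0)}(t,x)\frac{\partial}{\partial u_{j^2}}f\Bigl[t,x,\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr]\le 0,\qquad u_{j^2}\ge 0,
\end{aligned}
\tag{7.8}
\]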
and if uj 2 > 0, the equality sign must hold, for j 2 ∈ N2 . Since
∂ j1 ∂uj 1 c (uj 1 , x)
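0, the equality sign must hold, for $j^1\in N_1$;
\[
\begin{aligned}
&g\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)+g'\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)\Bigl(\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr)
-\frac{\partial}{\partial u_{j^2}}c^{j^2}(u_{j^2},x)\\
&\quad+W_x(x)\frac{\partial}{\partial u_{j^2}}f\Bigl[x,\sum_{h\in N_1}u_h+\sum_{\ell\in N_2}u_\ell\Bigr]\le 0,\qquad u_{j^2}\ge 0,
\end{aligned}
\tag{7.39}
\]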
and if uj 2 > 0, the equality sign must hold, for j 2 ∈ N2 . Since
∂ j1 ∂uj 1 c (uj 1 , x)
τ and $x_t^*\in\{x^*(s)\}_{s=\tau}^{\infty}$. According to $P(\tau,x_\tau)$, agent $i$ is supposed to receive a payoff $\xi^{(\tau)i}(t,x_t^*)$ over the remaining time interval $[t,\infty)$ if the state is $x_t^*\in X_t^*$.

Consider the case when the game has proceeded to time $t$ and the state variable became $x_t^*\in X_t^*$. Then one has a cooperative game $\Gamma_c(t,x_t^*)$, which starts at time $t$ with initial state $x_t^*$. According to the solution $P(t,x_t^*)$, an imputation
\[
\xi^{(t)i}(t,x_t^*)=E_t\Bigl\{\int_{t}^{\infty}B_i^t(s,x_s^*)\exp[-r(s-t)]\,ds\ \Big|\ x(t)=x_t^*\in X_t^*\Bigr\},
\]
will be allotted to agent $i$, for $i\in N$.
However, according to the solution $P(\tau,x_\tau)$, the imputation (in the present value viewed at time $\tau$) to agent $i$ over the period $[t,\infty)$ is (8.52). For the imputation from $P(\tau,x_\tau)$ to be consistent with those from $P(t,x_t^*)$, it is essential that
\[
\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*)=\xi^{(t)i}(t,x_t^*)\in P(t,x_t^*),\quad\text{for } t\in(\tau,\infty).
\tag{8.53}
\]
In addition, at time $\tau$ when the initial state is $x_\tau$, according to the solution $P(\tau,x_\tau)$ generated by optimality Principle PII, the payoff distribution procedure is
\[
B^\tau(s,x_s^*)=\bigl[B_1^\tau(s,x_s^*),B_2^\tau(s,x_s^*),\dots,B_n^\tau(s,x_s^*)\bigr],\quad\text{for } s\in[\tau,\infty)\text{ and } x_s^*\in X_s^*.
\]
When the game proceeds to time $t$ the state variable becomes $x_t^*\in X_t^*$. According to the solution $P(t,x_t^*)$ generated by optimality Principle PII, the payoff distribution procedure
\[
B^t(s,x_s^*)=\bigl[B_1^t(s,x_s^*),B_2^t(s,x_s^*),\dots,B_n^t(s,x_s^*)\bigr],\quad\text{for } s\in[t,\infty)\text{ and } x_s^*\in X_s^*,
\]
will be adopted. For the continuation of the payoff distribution procedure $B^\tau(s,x_s^*)\in P(\tau,x_\tau)$ to be consistent with $B^t(s,x_s^*)\in P(t,x_t^*)$, it is required that
\[
B^\tau(s,x_s^*)=B^t(s,x_s^*),\quad\text{for } s\in[t,\infty)\text{ and } t\in[\tau,\infty)\text{ and } x_s^*\in X_s^*.
\]
Therefore, we have the following definition.

Definition 8.2 The imputation and payoff distribution procedure $\{\xi^{(\tau)}(\tau,x_\tau)$ and $B^\tau(s,x_s^*)$ for $s\in[\tau,\infty)\}\in P(\tau,x_\tau)$ are subgame consistent if

(i)
\[
\begin{aligned}
\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*)
&\equiv\exp[r(t-\tau)]\,E_\tau\Bigl\{\int_{t}^{\infty}B_i^\tau(s,x_s^*)\exp[-r(s-\tau)]\,ds\ \Big|\ x(t)=x_t^*\in X_t^*\Bigr\}\\
&=\xi^{(t)i}(t,x_t^*)\in P(t,x_t^*),\quad\text{for } t\in(\tau,\infty)\text{ and } i\in N;
\end{aligned}
\tag{8.54}
\]
and (ii) the payoff distribution procedure $B^\tau(s,x_s^*)$ for $s\in[t,\infty)$ is identical to $B^t(s,x_s^*)\in P(t,x_t^*)$.

Thus a payoff distribution procedure leading to a subgame consistent imputation has to satisfy Definition 8.2.
8.5.3 Payoff Distribution Procedure Leading to Subgame Consistency

To derive a payoff distribution procedure leading to a subgame consistent imputation we invoke Definition 8.2 and obtain $B_i^\tau(s,x_s^*)=B_i^t(s,x_s^*)=B_i(s,x_s^*)$, for $s\in[\tau,\infty)$, $x_s^*\in X_s^*$ and $t\in[\tau,\infty)$ and $i\in N$. Therefore, along the cooperative trajectory,
\[
\begin{aligned}
\xi^{(\tau)i}(\tau,x_\tau)&=E_\tau\Bigl\{\int_{\tau}^{\infty}B_i(s,x_s^*)\exp[-r(s-\tau)]\,ds\Bigr\},\quad\text{for } i\in N,\text{ and}\\
\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)&=E_\upsilon\Bigl\{\int_{\upsilon}^{\infty}B_i(s,x_s^*)\exp[-r(s-\upsilon)]\,ds\ \Big|\ x(\upsilon)=x_\upsilon^*\in X_\upsilon^*\Bigr\},\quad\text{for } i\in N,\text{ and}\\
\xi^{(t)i}(t,x_t^*)&=E_t\Bigl\{\int_{t}^{\infty}B_i(s,x_s^*)\exp[-r(s-t)]\,ds\ \Big|\ x(t)=x_t^*\in X_t^*\Bigr\},
\end{aligned}
\tag{8.55}
\]
for $i\in N$ and $t\ge\upsilon\ge\tau$. Moreover, for $i\in N$ and $t\in[\tau,\infty)$, we define the term
\[
\xi^{(\upsilon)i}(t,x_t^*)=E_\upsilon\Bigl\{\int_{t}^{\infty}B_i(s,x_s^*)\exp[-r(s-\upsilon)]\,ds\ \Big|\ x(t)=x_t^*\Bigr\},
\tag{8.56}
\]
to denote the present value of agent $i$'s cooperative payoff over the time interval $[t,\infty)$, given that the state is $x_t^*$ at time $t\in[\upsilon,\infty)$, under the solution $P(\upsilon,x_\upsilon^*)$. Invoking (8.55) and (8.56) one can readily verify that $\exp[r(t-\tau)]\,\xi^{(\tau)i}(t,x_t^*)=\xi^{(t)i}(t,x_t^*)$, for $i\in N$ and $t\in[\tau,\infty)$.

The next task is to derive $B_i(s,x_s^*)$, for $s\in[\tau,\infty)$ and $t\in[\tau,\infty)$ so that (8.55) can be realized. Consider again the following condition.

Condition 8.3 For $i\in N$ and $t\ge\upsilon$ and $\upsilon\in[\tau,\infty)$, the term $\xi^{(\upsilon)i}(t,x_t^*)$ is a function that is continuously differentiable in $t$ and $x_t^*$.

Lemma 8.1 If Condition 8.3 is satisfied, a PDP with instantaneous payments at time $s$ with the state being $x_s^*\in X_s^*$ equaling
\[
\begin{aligned}
B_i(s,x_s^*)=
&-\Bigl[\xi_t^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]
-\Bigl[\xi_{x_t^*}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]f\bigl[x_s^*,\psi_1^*(s,x_s^*),\psi_2^*(s,x_s^*),\dots,\psi_n^*(s,x_s^*)\bigr]\\
&-\frac12\sum_{h,\zeta=1}^{m}\Omega^{h\zeta}(x_s^*)\Bigl[\xi_{x_t^hx_t^\zeta}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr],
\quad\text{for } i\in N\text{ and } s\in[\upsilon,\infty),
\end{aligned}
\tag{8.57}
\]
yields imputation $\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$ for $\upsilon\in[\tau,\infty)$ and $x_\upsilon^*\in X_\upsilon^*$, which satisfies (8.55).
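Proof Note that along the cooperative trajectory
\[
\begin{aligned}
\xi^{(\upsilon)i}(t,x_t^*)&=E_\upsilon\Bigl\{\int_{t}^{\infty}B_i(s,x_s^*)\exp[-r(s-\upsilon)]\,ds\ \Big|\ x(t)=x_t^*\in X_t^*\Bigr\}\\
&=\exp[-r(t-\upsilon)]\,\xi^{(t)i}(t,x_t^*),\quad\text{for } i\in N\text{ and } t\in[\upsilon,\infty).
\end{aligned}
\tag{8.58}
\]
For $\Delta t\to 0$, (8.55) can be expressed as
\[
\begin{aligned}
\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)&=E_\upsilon\Bigl\{\int_{\upsilon}^{\infty}B_i(s,x_s^*)\exp[-r(s-\upsilon)]\,ds\Bigr\}\\
&=E_\upsilon\Bigl\{\int_{\upsilon}^{\upsilon+\Delta t}B_i(s,x_s^*)\exp[-r(s-\upsilon)]\,ds+\xi^{(\upsilon)i}(\upsilon+\Delta t,x_\upsilon^*+\Delta x_\upsilon^*)\Bigr\},
\end{aligned}
\tag{8.59}
\]
where
\[
\Delta x_\upsilon^*=f\bigl[x_\upsilon^*,\psi_1^*(x_\upsilon^*),\psi_2^*(x_\upsilon^*),\dots,\psi_n^*(x_\upsilon^*)\bigr]\Delta t+\sigma[x_\upsilon^*]\Delta z_\upsilon+o(\Delta t),
\]
$\Delta z_\upsilon=z(\upsilon+\Delta t)-z(\upsilon)$, and $E_\upsilon[o(\Delta t)]/\Delta t\to 0$ as $\Delta t\to 0$.

Replacing the term $x_\upsilon^*+\Delta x_\upsilon^*$ with $x_{\upsilon+\Delta t}^*$ and rearranging (8.59) yields
\[
E_\upsilon\Bigl\{\int_{\upsilon}^{\upsilon+\Delta t}B_i(s)\exp[-r(s-\upsilon)]\,ds\Bigr\}
=E_\upsilon\Bigl\{\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)-\xi^{(\upsilon)i}(\upsilon+\Delta t,x_{\upsilon+\Delta t}^*)\Bigr\},
\tag{8.60}
\]
for all $\upsilon\in[\tau,\infty)$ and $i\in N$.

With Condition 8.3 holding and $\Delta t\to 0$, (8.60) can be expressed as
\[
\begin{aligned}
E_\upsilon\bigl\{B_i(s,x_s^*)\Delta t+o(\Delta t)\bigr\}
=E_\upsilon\Biggl\{
&-\Bigl[\xi_t^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]\Delta t
-\Bigl[\xi_{x_t^*}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]f\bigl[x_s^*,\psi_1^*(s,x_s^*),\psi_2^*(s,x_s^*),\dots,\psi_n^*(s,x_s^*)\bigr]\Delta t\\
&-\frac12\sum_{h,\zeta=1}^{m}\Omega^{h\zeta}(x_s^*)\Bigl[\xi_{x_t^hx_t^\zeta}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]\Delta t
-\Bigl[\xi_{x_t^*}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr]\sigma[x_\upsilon^*]\Delta z_\upsilon-o(\Delta t)\Biggr\}.
\end{aligned}
\tag{8.61}
\]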
Dividing (8.61) throughout by $\Delta t$, with $\Delta t\to 0$ and taking the expectation, yields (8.57). Thus the payoff distribution procedure $B_i(\upsilon,x_\upsilon^*)$ in (8.57) would lead to the realization of the imputations that satisfy (8.55).

Since the payoff distribution procedure $B_i(\tau)$ in (8.57) leads to the realization of (8.55), it would yield subgame consistent imputations satisfying Definition 8.2.

A more succinct form of Lemma 8.1 can be derived as follows. If Condition 8.3 is satisfied, a PDP with instantaneous payments at time $s$ equaling
\[
\begin{aligned}
B_i(s,x_s^*)=r\xi^{(s)i}(s,x_s^*)
&-\xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\bigl[x_s^*,\psi_1^*(x_s^*),\psi_2^*(x_s^*),\dots,\psi_n^*(x_s^*)\bigr]\\
&-\frac12\sum_{h,\zeta=1}^{m}\Omega^{h\zeta}(x_s^*)\Bigl[\xi_{x_t^hx_t^\zeta}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr],
\quad\text{for } i\in N,\ x_s^*\in X_s^*\text{ and } s\in[\upsilon,\infty),
\end{aligned}
\tag{8.62}
\]
yields the imputation $\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$, for $\upsilon\in[\tau,\infty)$, which satisfies (8.55).

To demonstrate that (8.62) is an alternative form for (8.57) in Lemma 8.1, we first define
\[
\begin{aligned}
\hat\xi^i(x_\upsilon^*)&=E_\upsilon\Bigl\{\int_{\upsilon}^{\infty}B_i(s)\exp[-r(s-\upsilon)]\,ds\ \Big|\ x(\upsilon)=x_\upsilon^*\Bigr\}=\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*),\quad\text{and}\\
\hat\xi^i(x_t^*)&=E_t\Bigl\{\int_{t}^{\infty}B_i(s)\exp[-r(s-t)]\,ds\ \Big|\ x(t)=x_t^*\Bigr\}=\xi^{(t)i}(t,x_t^*),
\end{aligned}
\]
for $i\in N$, and $\upsilon\in[\tau,\infty)$ and $t\in[\upsilon,\infty)$ along the optimal cooperative trajectory $\{x_s^*\}_{s=\tau}^{\infty}$. We then have
\[
\xi^{(\upsilon)i}(t,x_t^*)=\exp[-r(t-\upsilon)]\,\hat\xi^i(x_t^*).
\]
Differentiating the above condition with respect to $t$ yields
\[
\xi_t^{(\upsilon)i}(t,x_t^*)=-r\exp[-r(t-\upsilon)]\,\hat\xi^i(x_t^*)=-r\xi^{(\upsilon)i}(t,x_t^*).
\]
At $t=\upsilon$, $\xi^{(\upsilon)i}(t,x_t^*)=\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*)$, therefore,
\[
-\xi_t^{(\upsilon)i}(t,x_t^*)\Big|_{t=\upsilon}=r\xi^{(\upsilon)i}(\upsilon,x_\upsilon^*).
\tag{8.63}
\]
Substituting (8.63) into (8.57) yields (8.62). Using (8.62), a subgame consistent solution in an infinite-horizon framework is characterized in the section below.
8.5.4 Subgame Consistent Solution

A theorem characterizing a subgame consistent solution $P(\tau,x_\tau)$ for the cooperative game $\Gamma_c(\tau,x_\tau)$ under optimality Principle PII is presented below.

Theorem 8.3 For the cooperative game $\Gamma_c(\tau,x_\tau)$ with optimality Principle PII, the solution $P(\tau,x_\tau)=\{u(s,x_s^*)$ and $B(s,x_s^*)$ for $s\in[\tau,\infty)$ and $\xi^{(\tau)}(\tau,x_\tau)\}$ is subgame consistent, where

(i) $u(s,x_s^*)$ for $s\in[\tau,\infty)$ is the set of group optimal strategies $\psi^*(x_s^*)$ for the game $\Gamma_c(\tau,x_\tau)$, and

(ii) the imputation distribution procedure for $s\in[\tau,\infty)$ is $B(s,x_s^*)=\{B_1(s,x_s^*),B_2(s,x_s^*),\dots,B_n(s,x_s^*)\}$, where
\[
\begin{aligned}
B_i(s,x_s^*)=r\xi^{(s)i}(s,x_s^*)
&-\xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\bigl[x_s^*,\psi_1^*(x_s^*),\psi_2^*(x_s^*),\dots,\psi_n^*(x_s^*)\bigr]\\
&-\frac12\sum_{h,\zeta=1}^{m}\Omega^{h\zeta}(x_s^*)\Bigl[\xi_{x_t^hx_t^\zeta}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr],\quad\text{for } i\in N,
\end{aligned}
\tag{8.64}
\]
and $\xi^{(s)}(s,x_s^*)=[\xi^{(s)1}(s,x_s^*),\xi^{(s)2}(s,x_s^*),\dots,\xi^{(s)n}(s,x_s^*)]\in P(s,x_s^*)$ is the imputation at time $s\in[\tau,\infty)$ with the state being $x_s^*\in\{x^*(t)\}_{t\ge\tau}$ under optimality Principle PII.

Proof Following the algorithm that specifies $P(\tau,x_\tau)$ as the solution to the game $\Gamma_c(\tau,x_\tau)$ one can readily obtain the solution of the cooperative game $\Gamma_c(\upsilon,x_\upsilon^*)$, for $\upsilon>\tau$, as $P(\upsilon,x_\upsilon^*)=\{u(s,x_s^*)$ and $B(s,x_s^*)$ for $s\in[\upsilon,\infty)$ and $\xi^{(\upsilon)}(\upsilon,x_\upsilon^*)\}$ in which

(i) $u(s,x_s^*)$ for $s\in[\upsilon,\infty)$ is the set of group optimal strategies $\psi^*(x_s^*)$ for the game $\Gamma_c(\upsilon,x_\upsilon^*)$, and

(ii) $B(s,x_s^*)=\{B_1(s,x_s^*),B_2(s,x_s^*),\dots,B_n(s,x_s^*)\}$ for $s\in[\upsilon,\infty)$, where
\[
\begin{aligned}
B_i(s,x_s^*)=r\xi^{(s)i}(s,x_s^*)
&-\xi_{x_s^*}^{(s)i}(s,x_s^*)\,f\bigl[x_s^*,\psi_1^*(x_s^*),\psi_2^*(x_s^*),\dots,\psi_n^*(x_s^*)\bigr]\\
&-\frac12\sum_{h,\zeta=1}^{m}\Omega^{h\zeta}(x_s^*)\Bigl[\xi_{x_t^hx_t^\zeta}^{(s)i}(t,x_t^*)\Big|_{t=s}\Bigr],\quad\text{for } i\in N,
\end{aligned}
\tag{8.65}
\]
and $\xi^{(s)}(s,x_s^*)=[\xi^{(s)1}(s,x_s^*),\xi^{(s)2}(s,x_s^*),\dots,\xi^{(s)n}(s,x_s^*)]\in P(s,x_s^*)$ is the imputation at time $s\in[\upsilon,\infty)$ with the state being $x_s^*\in\{x^*(t)\}_{t\ge\upsilon}$.
Using the characterization of optimal control strategies in (8.49), one can show that the group optimal joint expected payoff maximizing strategies $\psi^*(x_s^*)$ for the cooperative game $\Gamma_c(\tau, x_\tau)$ over the time interval $[\upsilon, \infty)$ are identical to the joint payoff maximizing strategies for the cooperative game $\Gamma_c(\upsilon, x_\upsilon^*)$ over the same time interval. Comparing (8.64) and (8.65), one can show that the payoff distribution procedure $B(s, x_s^*)$ for the cooperative game $\Gamma_c(\tau, x_\tau)$ over the time interval $[\upsilon, \infty)$ is identical to the payoff distribution procedure $B(s, x_s^*)$ for the cooperative game $\Gamma_c(\upsilon, x_\upsilon^*)$ over the same time interval. Invoking Lemma 8.1 and (8.62), one can show that the payoff distribution procedure $B(s, x_s^*) = \{B_1(s, x_s^*), B_2(s, x_s^*), \ldots, B_n(s, x_s^*)\}$ in (8.64) will yield
$$\xi^{(\upsilon)i}(\upsilon, x_\upsilon^*) = E_\upsilon\left\{\int_\upsilon^\infty B_i(s, x_s^*) \exp[-r(s - \upsilon)]\, ds\right\} \in P(\upsilon, x_\upsilon^*),$$
for $i \in N$ and $\upsilon \in [\tau, \infty)$. Hence
$$\exp[r(\upsilon - \tau)]\, \xi^{(\tau)i}(\upsilon, x_\upsilon^*) \equiv \exp[r(\upsilon - \tau)]\, E_\tau\left\{\int_\upsilon^\infty B_i(s, x_s^*) \exp[-r(s - \tau)]\, ds \,\Big|\, x(\upsilon) = x_\upsilon^*\right\} = \xi^{(\upsilon)i}(\upsilon, x_\upsilon^*) \in P(\upsilon, x_\upsilon^*),$$
for $i \in N$ and $\upsilon \in [\tau, \infty)$.
In sum, the continuation of the solution $P(\tau, x_\tau)$ over the time interval $[\upsilon, \infty)$ is consistent with the solution $P(\upsilon, x_\upsilon^*)$ of the game $\Gamma_c(\upsilon, x_\upsilon^*)$ under optimality Principle PII. Thus the solution $P(\tau, x_\tau)$ in Theorem 8.3 is indeed subgame consistent.

With agents using the cooperative strategies $\{\psi_i^*(x_\upsilon^*)$, for $i \in N$ and $\upsilon \in [\tau, \infty)\}$, the instantaneous receipt of agent $i$ at time instant $\upsilon$ is
$$\zeta_i(\upsilon, x_\upsilon^*) = g^i[x_\upsilon^*, \psi_1^*(x_\upsilon^*), \psi_2^*(x_\upsilon^*), \ldots, \psi_n^*(x_\upsilon^*)], \quad \text{for } i \in N, \tag{8.66}$$
when the state is $x_\upsilon^* \in X_\upsilon^*$. According to Theorem 8.3, the instantaneous payment that agent $i$ should receive under the agreed-upon optimality principle is $B_i(\upsilon, x_\upsilon^*)$ as stated in (8.64). Hence an instantaneous transfer payment
$$\chi^i(\upsilon, x_\upsilon^*) = B_i(\upsilon, x_\upsilon^*) - \zeta_i(\upsilon, x_\upsilon^*), \quad \text{for } i \in N, \tag{8.67}$$
has to be given to agent $i$ at time $\upsilon$ when the state is $x_\upsilon^* \in X_\upsilon^*$.
8.6 Infinite-Horizon Cooperative Fishery Under Uncertainty

Consider an infinite-horizon version of the cooperative fishery in Sect. 8.5. At time $\tau$, the expected payoff functions of extractors 1 and 2 are, respectively,
$$E_\tau\left\{\int_\tau^\infty \left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}} u_1(s)\right] \exp[-r(s - \tau)]\, ds\right\} \quad \text{and} \quad E_\tau\left\{\int_\tau^\infty \left[u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}} u_2(s)\right] \exp[-r(s - \tau)]\, ds\right\}. \tag{8.68}$$
The fish resource stock $x(s) \in X \subset R$ follows the stochastic dynamics
$$dx(s) = [a x(s)^{1/2} - b x(s) - u_1(s) - u_2(s)]\, ds + \sigma x(s)\, dz(s), \quad x(\tau) = x_\tau. \tag{8.69}$$
Invoking Theorem 2.6 in Chap. 2, a set of strategies $[\phi_1^*(x), \phi_2^*(x)]$ that provides a feedback Nash equilibrium solution to the game in (8.68) and (8.69) can be characterized by
$$r\hat{V}^i(x) - \frac{1}{2}\sigma^2 x^2 \hat{V}^i_{xx}(x) = \max_{u_i}\left\{u_i^{1/2} - \frac{c_i}{x^{1/2}} u_i + \hat{V}^i_x(x)\left[a x^{1/2} - b x - u_i - \phi_j^*(x)\right]\right\}, \quad \text{for } i, j \in \{1, 2\} \text{ and } i \ne j. \tag{8.70}$$
Performing the indicated maximization in (8.70) yields
$$\phi_i^*(x) = \frac{x}{4[c_i + \hat{V}^i_x(x) x^{1/2}]^2}, \quad \text{for } i \in \{1, 2\}.$$
Substituting $\phi_1^*(x)$ and $\phi_2^*(x)$ above into (8.70) and solving (8.70), one obtains the value function of agent $i \in \{1, 2\}$ as
$$\hat{V}^i(x) = A_i x^{1/2} + C_i, \tag{8.71}$$
where, for $i, j \in \{1, 2\}$ and $i \ne j$, the constants $A_i$, $C_i$, $A_j$, and $C_j$ satisfy
$$\left[r + \frac{\sigma^2}{8} + \frac{b}{2}\right] A_i - \frac{1}{2[c_i + A_i/2]} + \frac{c_i}{4[c_i + A_i/2]^2} + \frac{A_i}{8[c_i + A_i/2]^2} + \frac{A_i}{8[c_j + A_j/2]^2} = 0, \quad \text{and} \quad C_i = \frac{a}{2r} A_i.$$
The game equilibrium strategies can be obtained as
$$\phi_1^*(x) = \frac{x}{4[c_1 + A_1/2]^2} \quad \text{and} \quad \phi_2^*(x) = \frac{x}{4[c_2 + A_2/2]^2}.$$
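The coefficient conditions behind (8.71) follow from matching powers of $x$ in (8.70). The block below is a sketch of that step; the shorthand $k_i = c_i + A_i/2$ is introduced here only for brevity and does not appear in the text.

```latex
\begin{align*}
\hat V^i(x) &= A_i x^{1/2} + C_i
  \;\Rightarrow\;
  \hat V^i_x = \tfrac{1}{2}A_i x^{-1/2},\quad
  \hat V^i_{xx} = -\tfrac{1}{4}A_i x^{-3/2},\quad
  \phi_i^*(x) = \frac{x}{4k_i^2},\quad k_i = c_i + \tfrac{1}{2}A_i,\\[4pt]
\text{LHS of (8.70)} &= \Bigl(r + \tfrac{\sigma^2}{8}\Bigr)A_i x^{1/2} + rC_i,\\[4pt]
\text{RHS of (8.70)} &= \Bigl[\frac{1}{2k_i} - \frac{c_i}{4k_i^2}
  - \frac{b}{2}A_i - \frac{A_i}{8k_i^2} - \frac{A_i}{8k_j^2}\Bigr]x^{1/2}
  + \frac{a}{2}A_i .
\end{align*}
% Matching the x^{1/2} terms gives the equation for A_i;
% matching the constant terms gives r C_i = (a/2) A_i.
```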
8.6.1 Cooperative Extraction

Consider the case when these two nations agree to act according to an agreed-upon optimality principle which entails (i) group optimality, and (ii) the distribution of the excess of the total expected cooperative payoff over the sum of expected individual noncooperative payoffs proportional to the agents' expected noncooperative payoffs. To maximize their joint expected payoff for group optimality, the nations have to solve the stochastic control problem of maximizing
$$E_\tau\left\{\int_\tau^\infty \left[u_1(s)^{1/2} - \frac{c_1}{x(s)^{1/2}} u_1(s) + u_2(s)^{1/2} - \frac{c_2}{x(s)^{1/2}} u_2(s)\right] \exp[-r(s - \tau)]\, ds\right\}, \tag{8.72}$$
subject to (8.69). Invoking Theorem A.6 in the Technical Appendixes yields the characterization of the solution of the problem in (8.69) and (8.72) as follows.

Corollary 8.3 A set of controls $\{\psi_i^*(x)$, for $i \in \{1, 2\}\}$ constitutes an optimal solution to the stochastic control problem in (8.69) and (8.72) if there exists a continuously twice differentiable function $W(x) : R^m \to R$ satisfying the following partial differential equation:
$$rW(x) - \frac{1}{2}\sigma^2 x^2 W_{xx}(x) = \max_{u_1, u_2}\left\{u_1^{1/2} - \frac{c_1}{x^{1/2}} u_1 + u_2^{1/2} - \frac{c_2}{x^{1/2}} u_2 + W_x(x)\left[a x^{1/2} - b x - u_1 - u_2\right]\right\}. \tag{8.73}$$
Performing the indicated maximization in (8.73), we obtain
$$\psi_1^*(x) = \frac{x}{4[c_1 + W_x(x) x^{1/2}]^2} \quad \text{and} \quad \psi_2^*(x) = \frac{x}{4[c_2 + W_x(x) x^{1/2}]^2}. \tag{8.74}$$
The maximized expected joint profit function can be derived as follows.
Proposition 8.3 The maximized expected joint profit function is
$$W(x) = A x^{1/2} + C, \tag{8.75}$$
where
$$\left[r + \frac{\sigma^2}{8} + \frac{b}{2}\right] A - \frac{1}{2[c_1 + A/2]} - \frac{1}{2[c_2 + A/2]} + \frac{c_1}{4[c_1 + A/2]^2} + \frac{c_2}{4[c_2 + A/2]^2} + \frac{A}{8[c_1 + A/2]^2} + \frac{A}{8[c_2 + A/2]^2} = 0, \quad \text{and} \quad C = \frac{a}{2r} A.$$

Proof Substituting the optimal strategies in (8.74), $W(x)$ in (8.75), and the relevant derivatives $W_x(x)$ and $W_{xx}(x)$ into (8.73) yields the results in Proposition 8.3.

The optimal cooperative controls can then be obtained as
$$\psi_1^*(x) = \frac{x}{4[c_1 + A/2]^2} \quad \text{and} \quad \psi_2^*(x) = \frac{x}{4[c_2 + A/2]^2}. \tag{8.76}$$
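The equation for $A$ in Proposition 8.3 has no closed-form solution, so in practice it would be solved numerically. The following is a minimal sketch, assuming positive cost parameters and using bisection; the function names and the parameter values (borrowed from Exercise 8.1 purely for illustration) are assumptions, not part of the text.

```python
def coop_residual(A, r, sigma, b, c1, c2):
    """Left-hand side of the equation for A in Proposition 8.3."""
    total = (r + sigma**2 / 8 + b / 2) * A
    for c in (c1, c2):
        k = c + A / 2
        total += -1 / (2 * k) + c / (4 * k**2) + A / (8 * k**2)
    return total


def solve_A(r, sigma, b, c1, c2, hi=1.0, tol=1e-12):
    """Find the positive root by bisection (assumes c1, c2 > 0)."""
    lo = 0.0  # the residual at A = 0 equals -(1/(4*c1) + 1/(4*c2)) < 0
    while coop_residual(hi, r, sigma, b, c1, c2) < 0:
        hi *= 2  # widen the bracket until the residual changes sign
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if coop_residual(mid, r, sigma, b, c1, c2) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def coop_controls(x, A, c1, c2):
    """Cooperative harvest rates from (8.76)."""
    return x / (4 * (c1 + A / 2) ** 2), x / (4 * (c2 + A / 2) ** 2)


# Illustrative parameter values only (hypothetical; taken from Exercise 8.1).
params = dict(r=0.05, sigma=0.05, b=1.0, c1=1.5, c2=1.0)
A = solve_A(**params)
C = 8.0 * A / (2 * params["r"])  # C = aA/(2r) with a = 8
u1, u2 = coop_controls(100.0, A, params["c1"], params["c2"])
```

Bisection is used here because the residual is negative at $A = 0$ and grows without bound in $A$, so a sign change is guaranteed; a Newton iteration would also work but needs a derivative.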
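Substituting these control strategies into (8.69) yields the dynamics of the state trajectory under cooperation
$$dx(s) = \left[a x(s)^{1/2} - b x(s) - \frac{x(s)}{4[c_1 + A/2]^2} - \frac{x(s)}{4[c_2 + A/2]^2}\right] ds + \sigma x(s)\, dz(s), \quad x(t_0) = x_0. \tag{8.77}$$
Solving (8.77) yields the optimal cooperative state trajectory as
$$x^*(s) = \Phi(t_0, s)^2 \left[x_0^{1/2} + \int_{t_0}^{s} \Phi^{-1}(t_0, t) H_1\, dt\right]^2, \quad \text{for } s \in [t_0, \infty), \tag{8.78}$$
where
$$\Phi(t_0, s) = \exp\left[\int_{t_0}^{s} \left(H_2 - \frac{\sigma^2}{8}\right) d\upsilon + \int_{t_0}^{s} \frac{\sigma}{2}\, dz(\upsilon)\right],$$
$$H_1 = \frac{1}{2} a, \quad \text{and} \quad H_2 = -\left[\frac{1}{2} b + \frac{1}{8[c_1 + A/2]^2} + \frac{1}{8[c_2 + A/2]^2} + \frac{\sigma^2}{8}\right].$$
The cooperative control for the game can be expressed as
$$\psi_1^*(x_t^*) = \frac{x_t^*}{4[c_1 + A/2]^2} \quad \text{and} \quad \psi_2^*(x_t^*) = \frac{x_t^*}{4[c_2 + A/2]^2}, \quad \text{for } x_t^* \in X_t^*. \tag{8.79}$$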
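A sample path of the cooperative stock dynamics in (8.77) can be approximated numerically. The sketch below uses the Euler-Maruyama scheme rather than the closed-form trajectory in (8.78); the function name, the step size, and the value assumed for $A$ are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_stock(x0, a, b, sigma, k, dt, n_steps, rng):
    """Euler-Maruyama path for dx = [a*sqrt(x) - (b + k)*x] ds + sigma*x dz.

    k stands for 1/(4[c1 + A/2]^2) + 1/(4[c2 + A/2]^2), the combined
    cooperative harvest rate per unit of stock in (8.77).
    """
    path = [x0]
    x = x0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        drift = a * math.sqrt(x) - (b + k) * x
        x = max(x + drift * dt + sigma * x * dz, 0.0)  # keep the stock non-negative
        path.append(x)
    return path


# Hypothetical inputs: a, b, sigma, x0 as in Exercise 8.1; A is an assumed value.
A, c1, c2 = 0.6, 1.5, 1.0
k = 1 / (4 * (c1 + A / 2) ** 2) + 1 / (4 * (c2 + A / 2) ** 2)
path = simulate_stock(100.0, 8.0, 1.0, 0.05, k, 0.01, 300, random.Random(0))

# With sigma = 0 the path settles at the deterministic rest point (a/(b + k))^2.
calm = simulate_stock(100.0, 8.0, 1.0, 0.0, k, 0.01, 3000, random.Random(0))
rest_point = (8.0 / (1.0 + k)) ** 2
```

Averaging many such paths gives a Monte Carlo estimate of expected quantities along the cooperative trajectory, which is one way to sanity-check a closed-form expression.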
8.6.2 Subgame Consistent Payoff Distribution

With the extractors using the cooperative strategies in (8.79) along the stochastic cooperative path, they agree to share the excess of the total expected cooperative payoff over the sum of individual noncooperative payoffs proportional to the agents' expected noncooperative payoffs. Therefore, the following imputation has to be satisfied.

Condition 8.4 An imputation
$$\xi^{(\upsilon)i}(\upsilon, x_\upsilon^*) = \frac{\hat{V}^i(x_\upsilon^*)}{\sum_{j=1}^{2} \hat{V}^j(x_\upsilon^*)}\, W(x_\upsilon^*) = \frac{[A_i (x_\upsilon^*)^{1/2} + C_i]}{\sum_{j=1}^{2} [A_j (x_\upsilon^*)^{1/2} + C_j]} \left[A (x_\upsilon^*)^{1/2} + C\right] \tag{8.80}$$
is assigned to extractor $i$, for $i \in \{1, 2\}$, if $x_\upsilon^* \in X_\upsilon^*$ occurs at time $\upsilon \in [\tau, \infty)$.

Applying Theorem 8.3, a subgame consistent solution for the cooperative game $\Gamma_c(\tau, x_\tau)$ can be obtained as
$$P(\tau, x_\tau) = \{u(s, x_s^*) \text{ and } B(s, x_s^*) \text{ for } s \in [\tau, \infty) \text{ and } \xi^{(\tau)}(\tau, x_\tau)\}$$
in which
(i) $u(s, x_s^*)$ for $s \in [\tau, \infty)$ is the set of group optimal strategies
$$\psi_1^*(x_s^*) = \frac{x_s^*}{4[c_1 + A/2]^2} \quad \text{and} \quad \psi_2^*(x_s^*) = \frac{x_s^*}{4[c_2 + A/2]^2};$$
and
(ii) $B(s, x_s^*) = \{B_1(s, x_s^*), B_2(s, x_s^*)\}$ for $s \in [\tau, \infty)$ with
$$B_i(s, x_s^*) = r\xi^{(s)i}(s, x_s^*) - \xi^{(s)i}_{x_s^*}(s, x_s^*)\left[a (x_s^*)^{1/2} - b x_s^* - \frac{x_s^*}{4[c_1 + A/2]^2} - \frac{x_s^*}{4[c_2 + A/2]^2}\right] - \frac{1}{2}\sigma^2 (x_s^*)^2\, \xi^{(s)i}_{x_s^* x_s^*}(s, x_s^*), \quad \text{for } i \in \{1, 2\},$$
where
$$\xi^{(s)i}_{x_s^*}(s, x_s^*) = \frac{[A_i (x_s^*)^{1/2} + C_i] A (x_s^*)^{-1/2} + [A (x_s^*)^{1/2} + C] A_i (x_s^*)^{-1/2}}{2\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]} - \frac{[A_i (x_s^*)^{1/2} + C_i][A (x_s^*)^{1/2} + C]}{\left(\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]\right)^2} \sum_{j=1}^{2} \frac{1}{2} A_j (x_s^*)^{-1/2};$$
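and
$$\begin{aligned}
\xi^{(s)i}_{x_s^* x_s^*}(s, x_s^*) = {} & -\frac{C_i A (x_s^*)^{-3/2} + C A_i (x_s^*)^{-3/2}}{4\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]} \\
& - \frac{[A_i (x_s^*)^{1/2} + C_i] A (x_s^*)^{-1/2} + [A (x_s^*)^{1/2} + C] A_i (x_s^*)^{-1/2}}{\left(2\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]\right)^2} \sum_{j=1}^{2} A_j (x_s^*)^{-1/2} \\
& + \frac{[A_i (x_s^*)^{1/2} + C_i][A (x_s^*)^{1/2} + C]}{\left(\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]\right)^2} \sum_{j=1}^{2} \frac{1}{4} A_j (x_s^*)^{-3/2} \\
& - \sum_{j=1}^{2} \frac{1}{2} A_j (x_s^*)^{-1/2} \left\{\frac{A_i A + \frac{1}{2}[A_i C + A C_i](x_s^*)^{-1/2}}{\left(\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]\right)^2} - \frac{[A_i (x_s^*)^{1/2} + C_i][A (x_s^*)^{1/2} + C]}{\left(\sum_{j=1}^{2} [A_j (x_s^*)^{1/2} + C_j]\right)^3} \sum_{j=1}^{2} A_j (x_s^*)^{-1/2}\right\}.
\end{aligned} \tag{8.81}$$
With extractors using the cooperative strategies in (8.79), the instantaneous receipt of agent $i$ at time instant $\upsilon \in [\tau, \infty)$ is
$$\zeta_i(\upsilon, x_\upsilon^*) = \frac{(x_\upsilon^*)^{1/2}}{2[c_i + A/2]} - \frac{c_i (x_\upsilon^*)^{1/2}}{4[c_i + A/2]^2}, \quad \text{for } i \in \{1, 2\}, \tag{8.82}$$
if $x_\upsilon^* \in X_\upsilon^*$ occurs. Under the cooperative agreement, the instantaneous payment that agent $i \in \{1, 2\}$ should receive under the agreed-upon optimality principle is $B_i(\upsilon, x_\upsilon^*)$ in (8.81). Hence an instantaneous transfer payment
$$\chi^i(\upsilon, x_\upsilon^*) = B_i(\upsilon, x_\upsilon^*) - \zeta_i(\upsilon, x_\upsilon^*), \tag{8.83}$$
has to be given to agent $i$ at time $\upsilon$, for $i \in \{1, 2\}$ and $x_\upsilon^* \in X_\upsilon^*$.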
8.7 Exercises

8.1 Consider the case of two nations harvesting fish in common waters. The growth rate of the fish biomass is subject to stochastic shocks and follows the differential equation
$$dx(s) = [8x(s)^{1/2} - x(s) - u_1(s) - u_2(s)]\, ds + 0.05x(s)\, dz(s), \quad x(0) = 100,$$
where $z(s)$ is a Wiener process, $x(s)$ is the fish stock, and $u_i(s)$ is the amount of fish harvested by nation $i$, for $i \in \{1, 2\}$. The horizon of the game is $[0, 3]$. The harvesting cost for nation $i \in \{1, 2\}$ depends on the quantity of the resource extracted $u_i(s)$ and the resource stock size $x(s)$. In particular, nation 1's extraction cost is $1.5u_1(s)x(s)^{-1/2}$ and nation 2's is $u_2(s)x(s)^{-1/2}$. The fish harvested by nation $i$ at time $s$ will generate a net benefit of the amount $[u_i(s)]^{1/2}$. At terminal time 3, nations 1 and 2 will receive termination bonuses $8x(3)^{1/2}$ and $6x(3)^{1/2}$, while the interest rate is 0.05. At time 0 the expected payoffs of nations 1 and 2 are, respectively,
$$E\left\{\int_0^3 \left[u_1(s)^{1/2} - \frac{2.5}{x(s)^{1/2}} u_1(s)\right] \exp(-0.05s)\, ds + \exp[-0.05(3)]\, 8x(3)^{1/2}\right\}, \quad \text{and}$$
$$E\left\{\int_0^3 \left[u_2(s)^{1/2} - \frac{3}{x(s)^{1/2}} u_2(s)\right] \exp(-0.05s)\, ds + \exp[-0.05(3)]\, 6x(3)^{1/2}\right\}.$$
Obtain a Nash equilibrium solution for this stochastic transnational market activity.

8.2 If these nations agree to cooperate and maximize their expected joint payoff, compute the optimal cooperative strategies and optimal stock path of the fish biomass.

8.3 Furthermore, if these nations agree to share the excess of their expected gain equally, obtain a subgame consistent solution.

8.4 Consider the case when the game horizon in Exercise 8.1 is extended to infinity. (i) Obtain a Nash equilibrium solution for this stochastic dynamic transnational market activity. (ii) If these nations agree to cooperate and maximize their expected joint payoff and share the excess of their expected gain equally, obtain a subgame consistent solution.
Chapter 9
Cost-Saving Joint Venture Under Uncertainty
In this chapter, we consider a cost-saving joint venture in the presence of stochastic elements. Section 9.1 formulates a dynamic cost-saving corporate joint venture in a stochastic environment and characterizes its subgame consistent solutions. An explicitly solvable illustration is given in Sect. 9.2. A characterization of the Shapley Value solution to a stochastic cost-saving joint venture is presented in Sect. 9.3 and a payoff distribution procedure leading to a subgame consistent solution is computed. Extensions to infinite-horizon ventures are formulated with explicit illustrations in the subsequent two sections.
9.1 Dynamic Corporate Joint Venture Under Uncertainty

To incorporate uncertainty in the corporate joint venture models in Chap. 5, we formulate the technology state dynamics of the $i$th firm as a set of stochastic differential equations
$$dx^i(s) = f^i[s, x^i(s), u_i(s)]\, ds + \sigma^i[s, x^i(s)]\, dz_i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N, \tag{9.1}$$
where $x^i(s) \in X^i \subset R^{m_i}$ denotes the technology state of firm $i$, ui ∈ Ui ⊂ comp R is the control vector of firm $i$, $\sigma^i[s, x^i(s)]$ is an $m_i \times \Theta_i$ matrix, $z_i(s)$ is a $\Theta_i$-dimensional Wiener process, and the initial state $x_0^i$ is given. Let $\Omega_i[s, x^i(s)] = \sigma^i[s, x^i(s)]\sigma^i[s, x^i(s)]^T$ denote the covariance matrix with its element in row $h$ and column $\zeta$ denoted by $\Omega_i^{h\zeta}[s, x^i(s)]$. For $i \ne j$, $x^i \cap x^j = \emptyset$, and $z_i(s)$ and $z_j(s)$ are independent Wiener processes. We also use $x^N(s)$ to denote the vector $[x^1(s), x^2(s), \ldots, x^n(s)]$ and $x_0^N$ the vector $[x_0^1, x_0^2, \ldots, x_0^n]$.

D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_9, © Springer Science+Business Media, LLC 2012
The expected profit of firm $i$ is
$$E_{t_0}\left\{\int_{t_0}^{T} \left[g^i[s, x^i(s)] - c_i^{\{i\}}[u_i(s)]\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^i[x^i(T)]\right\}, \quad \text{for } i \in [1, 2, \ldots, n] \equiv N, \tag{9.2}$$
where $\exp[-\int_{t_0}^{t} r(y)\, dy]$ is the discount factor and $q^i(x^i(T))$ the terminal payoff. In particular, $g^i[s, x^i]$ and $q^i(x^i)$ are positively related to $x^i$, reflecting the earning potential of the technology. Since the expected payoffs and state dynamics in a noncooperative equilibrium are independent across firms, the market outcome is represented by $n$ stochastic neoclassical theory-of-the-firm problems. Let $V^{(t_0)i}(t, x^i)$ and $\phi_i^*(t, x^i)$ denote the expected payoff and investment strategies of firm $i$, for $i \in N$, by which a firm's equilibrium is characterized (see Theorem A.5 in the Technical Appendixes). In particular, a set of investment strategies $\phi_i^*(t, x^i)$ for firm $i$ constitutes an optimal solution to the stochastic investment problem in (9.1) and (9.2) if there exists a continuously twice differentiable function $V^{(t_0)i}(t, x^i) : [t_0, T] \times R^{m_i} \to R$ satisfying the following Bellman equation:
$$-V_t^{(t_0)i}(t, x^i) - \frac{1}{2}\sum_{h,\zeta=1}^{m_i} \Omega_i^{h\zeta}(t, x^i)\, V^{(t_0)i}_{x^{i(h)} x^{i(\zeta)}}(t, x^i) = \max_{u_i}\left\{\left[g^i(t, x^i) - c_i^{\{i\}}(u_i)\right] \exp\left[-\int_{t_0}^{t} r(y)\, dy\right] + V_{x^i}^{(t_0)i}(t, x^i)\, f^i(t, x^i, u_i)\right\},$$
$$V^{(t_0)i}(T, x^i) = q^i(x^i) \exp\left[-\int_{t_0}^{T} r(y)\, dy\right].$$
Let $V^{(\tau)i}(t, x^i)$ denote the payoff function of firm $i$ in a game with the dynamics in (9.1) and payoff in (9.2), which starts at time $\tau$ for $\tau \in [t_0, T)$. Note that the equilibrium feedback strategies are Markovian in the sense that they depend on the current time and current state. Invoking Remark 2.2 of Chap. 2, one can obtain
$$\exp\left[\int_{t_0}^{\tau} r(y)\, dy\right] V^{(t_0)i}(t, x^i) = V^{(\tau)i}(t, x^i),$$
for $\tau \in [t_0, T]$ and $i \in N$. For the sake of clarity in exposition, we consider the case where $m_i = 1$, for $i \in N$.
9.1.1 Joint Venture and Expected Profit Maximization Consider a joint venture consisting of all these n companies. The participating firms can gain core skills and technology that would be impossible for them to obtain on
their own individually. Cost-saving opportunities are created under joint venture, for instance, savings in joint R&D, administration, marketing, customer services, purchasing, financing, and economies of scale and scope. The cost of control of firm $j$ under the joint venture becomes $c_j^N[u_j(s)]$. With absolute joint venture cost advantage we have
$$c_j^N(u_j) \le c_j^{\{j\}}(u_j), \quad \text{for } j \in N. \tag{9.3}$$
Moreover, marginal cost advantages lead to
$$\partial c_j^N(u_j)/\partial u_j \le \partial c_j^{\{j\}}(u_j)/\partial u_j, \quad \text{for } j \in N.$$
At time $t_0$, the joint venture would maximize the expected joint venture profit
$$E_{t_0}\left\{\int_{t_0}^{T} \sum_{j=1}^{n} \left[g^j[s, x^j(s)] - c_j^N[u_j(s)]\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \sum_{j=1}^{n} \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^j[x^j(T)]\right\}, \tag{9.4}$$
subject to the dynamics in (9.1). Invoking Fleming's techniques of stochastic optimal control in Theorem A.5 of the Technical Appendixes, the solution to the problem in (9.1) and (9.4) can be characterized as follows.

Corollary 9.1 A set of controls $\{\psi_i^*(t, x)$, for $i \in N$ and $t \in [t_0, T]\}$ provides an optimal solution to the control problem in (9.1) and (9.4) if there exists a continuously twice differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R^n \to R$ satisfying the following Bellman equation:
$$-W_t^{(t_0)}(t, x) - \frac{1}{2}\sum_{h,\zeta=1}^{n} \Omega^{h\zeta}(t, x)\, W^{(t_0)}_{x^h x^\zeta}(t, x) = \max_{u_1, u_2, \ldots, u_n}\left\{\sum_{j=1}^{n} \left[g^j(t, x^j) - c_j^N(u_j)\right] \exp\left[-\int_{t_0}^{t} r(y)\, dy\right] + \sum_{j=1}^{n} W_{x^j}^{(t_0)}(t, x)\, f^j(t, x^j, u_j)\right\}, \tag{9.5}$$
$$W^{(t_0)}(T, x) = \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j(x^j),$$
where $x = \{x^1, x^2, \ldots, x^n\}$.

Hence the firms will adopt the cooperative control $\{\psi_i^*(t, x)$, for $i \in N$ and $t \in [t_0, T]\}$ to obtain the maximized level of expected joint profit. Substituting this
set of controls into (9.1) yields the dynamics of technology advancement under cooperation as
$$dx^i(s) = f^i[s, x^i(s), \psi_i^*(s, x(s))]\, ds + \sigma^i[s, x^i(s)]\, dz_i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N. \tag{9.6}$$
Let $x^*(t) = \{x^{1*}(t), x^{2*}(t), \ldots, x^{n*}(t)\}$ denote the solution to (9.6). The optimal cooperative trajectory can be expressed as
$$x^{i*}(t) = x_0^i + \int_{t_0}^{t} f^i[s, x^{i*}(s), \psi_i^*(s, x^*(s))]\, ds + \int_{t_0}^{t} \sigma^i[s, x^{i*}(s)]\, dz_i(s), \quad \text{for } i \in N. \tag{9.7}$$
We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time $t$ generated by (9.6). The term $x_t^* \in X_t^*$ is used to denote an element in $X_t^*$. The cooperative investment strategies for the joint venture with the dynamics of (9.1) and the expected joint venture profit in (9.4) over the time interval $[t_0, T]$ can be expressed more precisely as
$$\{\psi_i^*(t, x^*(t)), \text{ for } i \in N \text{ and } t \in [t_0, T]\}. \tag{9.8}$$
Note that for group optimality to be achievable, the cooperative investment strategies $\{\psi_i^*(t, x^*(t))$, for $i \in N$ and $t \in [t_0, T]\}$ must be exercised throughout the time interval $[t_0, T]$. Along the cooperative investment path $\{x^*(t)\}_{t=t_0}^{T}$, the present value of the total expected joint venture profit over the interval $[t, T]$, for $t \in [t_0, T)$, can be expressed as
$$W^{(t_0)}(t, x_t^*) = E_{t_0}\left\{\int_{t}^{T} \sum_{j=1}^{n} \left[g^j[s, x^{j*}(s)] - c_j^N[\psi_j^*(s, x^*(s))]\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j=1}^{n} q^j(x^{j*}(T)) \,\Big|\, x^*(t) = x_t^* \in X_t^*\right\}. \tag{9.9}$$
Let $W^{(\tau)}(t, x_t^*)$ denote the total venture profit from the control problem with the dynamics in (9.1) and payoff in (9.4), which begins at time $\tau \in [t_0, T]$ with initial state $x_\tau^*$. Invoking Remark 8.1 of Chap. 8, one can readily obtain
$$\exp\left[\int_{t_0}^{\tau} r(y)\, dy\right] W^{(t_0)}(t, x_t^*) = W^{(\tau)}(t, x_t^*),$$
for $\tau \in [t_0, T]$ and $t \in [\tau, T)$.
9.1.2 Subgame Consistent Joint Venture

Since the sizes and earning potentials of the firms in a corporate joint venture may vary significantly, we consider the case when the venture agrees to share the excess of the expected total cooperative payoff over the sum of expected individual noncooperative payoffs proportionally to the firms' expected noncooperative payoffs. The imputation scheme has to fulfill the following condition.

Condition 9.1 An imputation
$$\xi^{(t_0)i}(t_0, x_0) = \frac{V^{(t_0)i}(t_0, x_0^i)}{\sum_{j=1}^{n} V^{(t_0)j}(t_0, x_0^j)}\, W^{(t_0)}(t_0, x_0)$$
is assigned to firm $i$, for $i \in N$, at the outset; and an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{n} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)}(\tau, x_\tau^*) \tag{9.10}$$
is assigned to firm $i$, for $i \in N$, at time $\tau \in (t_0, T]$.

The imputation in (9.10) satisfies
(i) $\xi^{(\tau)i}(\tau, x_\tau^*) \ge V^{(\tau)i}(\tau, x_\tau^{i*})$, for $i \in N$ and $\tau \in [t_0, T]$; and
(ii) $\sum_{j=1}^{n} \xi^{(\tau)j}(\tau, x_\tau^*) = W^{(\tau)}(\tau, x_\tau^*)$, for $\tau \in [t_0, T]$.
Hence the imputation vector $\xi^{(\tau)}(\tau, x_\tau^*)$ in (9.10) satisfies individual rationality and group optimality throughout the game horizon $[t_0, T]$. The optimality principle guiding the joint venture can then be stated as follows:

Optimality Principle PI (i) the maximization of the venture's expected payoffs and (ii) the sharing of the expected venture cooperative profit proportionally to individual firms' expected noncooperative payoffs.

None of the participating firms in the joint venture will have an incentive to exit the venture if the agreed-upon optimality principle is maintained at every instant $t \in [t_0, T]$ along the cooperative state trajectory. Hence a subgame consistent solution has to be sought. As in Chap. 4, we let the solution under this optimality principle be expressed as
$$P(x_0, T - t_0) = \{u(s, x_s^*) \text{ and } B^{t_0}(s, x_s^*) \text{ for } s \in [t_0, T];\ \xi^{(t_0)}(t_0, x_0)\}.$$
Invoking Theorem 8.2 of Chap. 8, a subgame consistent solution for the joint venture under the above optimality principle can be obtained as follows.
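Corollary 9.2 For the joint venture characterized by (9.3) and (9.4) under optimality Principle PI, the solution $P(x_0, T - t_0) = \{u(s, x_s^*)$ and $B(s, x_s^*)$ for $s \in [t_0, T]$ and $\xi^{(t_0)}(t_0, x_0)\}$ in which
(i) $u(s, x_s^*)$ for $s \in [t_0, T]$ is the set of group optimal strategies $\psi^*(s, x_s^*)$ for the game $\Gamma_c(x_0, T - t_0)$, and
(ii) $B(s, x_s^*) = \{B_1(s, x_s^*), B_2(s, x_s^*), \ldots, B_n(s, x_s^*)\}$ for $s \in [t_0, T]$ where
$$\begin{aligned}
B_i(s, x_s^*) = {} & -\left[\xi_t^{(s)i}(t, x_t^*)\big|_{t=s}\right] - \frac{1}{2}\sum_{h,\zeta=1}^{n} \Omega^{h\zeta}(s, x_s^*) \left[\xi^{(s)i}_{x_t^h x_t^\zeta}(t, x_t^*)\big|_{t=s}\right] - \sum_{h=1}^{n} \left[\xi^{(s)i}_{x_t^{h*}}(t, x_t^*)\big|_{t=s}\right] f^h[s, x_s^{h*}, \psi_h^*(s, x_s^*)] \\
= {} & -\frac{\partial}{\partial t}\left[\frac{V^{(s)i}(t, x_t^{i*})}{\sum_{j=1}^{n} V^{(s)j}(t, x_t^{j*})}\, W^{(s)}(t, x_t^*)\right]\Bigg|_{t=s} \\
& - \frac{1}{2}\sum_{h,\zeta=1}^{n} \Omega^{h\zeta}(s, x_s^*)\, \frac{\partial^2}{\partial x_t^{h*}\, \partial x_t^{\zeta*}}\left[\frac{V^{(s)i}(t, x_t^{i*})}{\sum_{j=1}^{n} V^{(s)j}(t, x_t^{j*})}\, W^{(s)}(t, x_t^*)\right]\Bigg|_{t=s} \\
& - \sum_{h=1}^{n} \frac{\partial}{\partial x_t^{h*}}\left[\frac{V^{(s)i}(t, x_t^{i*})}{\sum_{j=1}^{n} V^{(s)j}(t, x_t^{j*})}\, W^{(s)}(t, x_t^*)\right]\Bigg|_{t=s} f^h[s, x_s^{h*}, \psi_h^*(s, x_s^*)], \quad \text{for } i \in N \text{ and } x_s^* \in X_s^*,
\end{aligned} \tag{9.11}$$
and
$$\xi^{(s)i}(s, x_s^*) = \frac{V^{(s)i}(s, x_s^{i*})}{\sum_{j=1}^{n} V^{(s)j}(s, x_s^{j*})}\, W^{(s)}(s, x_s^*),$$
is subgame consistent.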
With firms using the cooperative investment strategies $\{\psi_i^*(\tau, x_\tau^*)$, for $\tau \in [t_0, T]$ and $i \in N\}$, the instantaneous receipt of firm $i$ at time instant $\tau$ is
$$\zeta_i(\tau, x_\tau^*) = g^i(\tau, x_\tau^{i*}) - c_i^N[\psi_i^*(\tau, x_\tau^*)], \quad \text{for } \tau \in [t_0, T] \text{ and } i \in N. \tag{9.12}$$
According to Corollary 9.2, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$ as stated in (9.11). Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*) \tag{9.13}$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in N$ and $\tau \in [t_0, T]$.
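The proportional sharing rule of Condition 9.1 and the transfer payment in (9.13) are simple to compute once the payoff values are known. The sketch below illustrates both with made-up numbers; the function names and the values of `V` and `W` are hypothetical and are not taken from the text.

```python
def proportional_imputation(V, W):
    """Split the cooperative payoff W in proportion to the payoffs V[i]."""
    total = sum(V)
    return [v / total * W for v in V]


def transfer_payments(B, zeta):
    """Instantaneous transfers chi_i = B_i - zeta_i as in (9.13)."""
    return [b_i - z_i for b_i, z_i in zip(B, zeta)]


# Hypothetical numbers: three firms' noncooperative payoffs and the venture payoff.
V = [40.0, 25.0, 15.0]
W = 100.0
xi = proportional_imputation(V, W)  # shares are 50.0, 31.25 and 18.75
```

With positive noncooperative payoffs, each share exceeds the firm's own payoff exactly when the venture payoff exceeds their sum, which is the individual rationality property (i); the shares always add up to `W`, which is property (ii).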
9.2 A Cost-Saving Joint Venture with Stochasticity

Consider the case when there are three companies involved in a joint venture. The planning period is $[t_0, T]$. Company $i$'s expected profit is
$$E_{t_0}\left\{\int_{t_0}^{T} \left[P_i [x^i(s)]^{1/2} - c_i^{\{i\}} u_i(s)\right] \exp[-r(s - t_0)]\, ds + \exp[-r(T - t_0)]\, q_i [x^i(T)]^{1/2}\right\}, \quad \text{for } i \in \{1, 2, 3\}, \tag{9.14}$$
where $P_i$, $c_i^{\{i\}}$, and $q_i$ are positive constants, $r$ is the discount rate, $x^i(s) \in R^+$ is the level of technology of company $i$ at time $s$, and $u_i(s) \in R^+$ is its physical investment in technological advancement. The term $P_i [x^i(s)]^{1/2}$ reflects the net operating revenue of company $i$ at technology level $x^i(s)$, and $c_i^{\{i\}} u_i$ is the cost of investment if firm $i$ operates on its own. The term $q_i [x^i(T)]^{1/2}$ gives the salvage value of company $i$'s technology at time $T$. The dynamics of the technology level of company $i$ follows the stochastic differential equation
$$dx^i(s) = \left[\alpha_i [u_i(s) x^i(s)]^{1/2} - \delta x^i(s)\right] ds + \sigma_i x^i(s)\, dz_i(s), \quad x^i(t_0) = x_0^i \in X_i, \quad \text{for } i \in \{1, 2, 3\}, \tag{9.15}$$
where $\alpha_i [u_i(s) x^i(s)]^{1/2}$ is the addition to the technology brought about by $u_i(s)$ amount of physical investment, $\delta$ is the rate of obsolescence, and $z_1(s)$, $z_2(s)$, and $z_3(s)$ are independent Wiener processes. In the case when each of these three firms acts independently, using Theorem A.5 in the Technical Appendixes, we obtain the corresponding partial differential equations as
$$-V_t^{(t_0)i}(t, x^i) - \frac{(\sigma_i x^i)^2}{2} V^{(t_0)i}_{x^i x^i}(t, x^i) = \max_{u_i}\left\{\left[P_i (x^i)^{1/2} - c_i^{\{i\}} u_i\right] \exp[-r(t - t_0)] + V_{x^i}^{(t_0)i}(t, x^i)\left[\alpha_i (u_i x^i)^{1/2} - \delta x^i\right]\right\},$$
$$V^{(t_0)i}(T, x^i) = \exp[-r(T - t_0)]\, q_i (x^i)^{1/2}, \quad \text{for } i \in \{1, 2, 3\}.$$
Performing the indicated maximization yields
$$u_i = \frac{\alpha_i^2}{4(c_i^{\{i\}})^2} \left[V_{x^i}^{(t_0)i}(t, x^i) \exp[r(t - t_0)]\right]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}.$$
Substituting $u_i$ into the above partial differential equations yields
$$-V_t^{(t_0)i}(t, x^i) - \frac{(\sigma_i x^i)^2}{2} V^{(t_0)i}_{x^i x^i}(t, x^i) = P_i (x^i)^{1/2} \exp[-r(t - t_0)] - \frac{\alpha_i^2}{4c_i^{\{i\}}} \left[V_{x^i}^{(t_0)i}(t, x^i)\right]^2 \exp[r(t - t_0)]\, x^i + \frac{\alpha_i^2}{2c_i^{\{i\}}} \left[V_{x^i}^{(t_0)i}(t, x^i)\right]^2 \exp[r(t - t_0)]\, x^i - \delta V_{x^i}^{(t_0)i}(t, x^i)\, x^i,$$
for $i \in \{1, 2, 3\}$. Solving the above system of partial differential equations yields
$$V^{(t_0)i}(t, x^i) = \left[A_i^{\{i\}}(t) (x^i)^{1/2} + C_i^{\{i\}}(t)\right] \exp[-r(t - t_0)], \quad \text{for } i \in \{1, 2, 3\}, \tag{9.16}$$
where
$$\dot{A}_i^{\{i\}}(t) = \left[r + \frac{\delta}{2} + \frac{\sigma_i^2}{8}\right] A_i^{\{i\}}(t) - P_i, \qquad \dot{C}_i^{\{i\}}(t) = r C_i^{\{i\}}(t) - \frac{\alpha_i^2}{16 c_i^{\{i\}}} \left[A_i^{\{i\}}(t)\right]^2,$$
$$A_i^{\{i\}}(T) = q_i, \quad \text{and} \quad C_i^{\{i\}}(T) = 0. \tag{9.17}$$
The first equation in the block-recursive system in (9.17) is a first-order linear differential equation in $A_i^{\{i\}}(t)$ that can be solved independently by standard techniques. Substituting the solution of $A_i^{\{i\}}(t)$ into the second equation of (9.17) yields a first-order linear differential equation in $C_i^{\{i\}}(t)$. The solution of $C_i^{\{i\}}(t)$ can be readily obtained by standard techniques. Moreover, one can easily derive, for $\tau \in [t_0, T]$,
$$V^{(\tau)i}(t, x^i) = \left[A_i^{\{i\}}(t) (x^i)^{1/2} + C_i^{\{i\}}(t)\right] \exp[-r(t - \tau)], \quad \text{for } i \in \{1, 2, 3\} \text{ and } \tau \in [t_0, T].$$
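The "standard techniques" for (9.17) can be made concrete. The linear equation for $A_i^{\{i\}}(t)$ with terminal condition $A_i^{\{i\}}(T) = q_i$ has the closed form $P_i/\kappa + (q_i - P_i/\kappa)\exp[-\kappa(T - t)]$ with $\kappa = r + \delta/2 + \sigma_i^2/8$, and $C_i^{\{i\}}(t)$ is then a discounted integral of $[A_i^{\{i\}}]^2$. The sketch below evaluates both; the function names, the use of the trapezoid rule, and all parameter values are assumptions made for illustration.

```python
import math


def A_noncoop(t, T, r, delta, sigma, P, q):
    """Closed-form solution of dA/dt = kappa*A - P with A(T) = q."""
    kappa = r + delta / 2 + sigma**2 / 8
    return P / kappa + (q - P / kappa) * math.exp(-kappa * (T - t))


def C_noncoop(t, T, r, delta, sigma, P, q, alpha, c, n=2000):
    """C(t) = int_t^T exp(-r(s - t)) * alpha^2/(16c) * A(s)^2 ds (trapezoid rule)."""
    if t >= T:
        return 0.0  # terminal condition C(T) = 0
    h = (T - t) / n
    beta = alpha**2 / (16 * c)

    def integrand(s):
        return math.exp(-r * (s - t)) * beta * A_noncoop(s, T, r, delta, sigma, P, q) ** 2

    inner = sum(integrand(t + j * h) for j in range(1, n))
    return h * (0.5 * integrand(t) + inner + 0.5 * integrand(T))


# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(T=3.0, r=0.05, delta=0.1, sigma=0.05, P=2.0, q=1.0)
A0 = A_noncoop(0.0, **params)
C0 = C_noncoop(0.0, alpha=1.0, c=1.0, **params)
```

Differentiating the integral expression for $C$ with respect to $t$ recovers $\dot{C} = rC - \frac{\alpha^2}{16c}A^2$, so the quadrature is a direct way to evaluate the second equation of (9.17) without a separate ODE solver.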
9.2.1 Expected Venture Profit and Cost Savings

Consider the case when all three firms agree to form a joint venture and share their expected joint profit proportionally to their expected noncooperative profits. Cost-saving opportunities are created under joint venture from joint R&D, administration, purchasing, financing, and economies of scale and scope. The cost of control of firm $j$ under the joint venture becomes $c_j^{\{1,2,3\}}[u_j(s)]$, with joint venture cost advantage
$$c_j^{\{1,2,3\}} \le c_j^{\{j\}}, \quad \text{for } j \in N. \tag{9.18}$$
The expected profit of the joint venture is the sum of the participating firms' expected profits
$$E_{t_0}\left\{\int_{t_0}^{T} \sum_{j=1}^{3} \left[P_j [x^j(s)]^{1/2} - c_j^{\{1,2,3\}} u_j(s)\right] \exp[-r(s - t_0)]\, ds + \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j [x^j(T)]^{1/2}\right\}. \tag{9.19}$$
The firms in the joint venture then act cooperatively to maximize (9.19) subject to the dynamics in (9.15). In particular, (9.15) and (9.19) become an optimization problem under a joint venture involving all three firms with technology spillover. Using Theorem A.5 in the Technical Appendixes, we obtain the equation
$$-W_t^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) - \sum_{h,\zeta=1}^{3} \frac{(\sigma_h x^h)(\sigma_\zeta x^\zeta)}{2} W^{(t_0)\{1,2,3\}}_{x^h x^\zeta}(t, x^1, x^2, x^3) = \max_{u_1, u_2, u_3}\left\{\sum_{i=1}^{3} \left[P_i (x^i)^{1/2} - c_i^{\{1,2,3\}} u_i\right] \exp[-r(t - t_0)] + \sum_{i=1}^{3} W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3)\left[\alpha_i (u_i x^i)^{1/2} - \delta x^i\right]\right\}, \tag{9.20}$$
$$W^{(t_0)\{1,2,3\}}(T, x^1, x^2, x^3) = \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j (x^j)^{1/2},$$
for $i, j, h \in \{1, 2, 3\}$ and $i \ne j \ne h$. Performing the indicated maximization yields
$$u_i = \frac{\alpha_i^2}{4(c_i^{\{1,2,3\}})^2} \left[W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \exp[r(t - t_0)]\right]^2 x^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{9.21}$$
Substituting (9.21) into (9.20) yields
$$\begin{aligned}
& -W_t^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) - \sum_{j=1}^{3} \frac{(\sigma_j x^j)^2}{2} W^{(t_0)\{1,2,3\}}_{x^j x^j}(t, x^1, x^2, x^3) \\
& \quad = \sum_{i=1}^{3} \left\{P_i (x^i)^{1/2} \exp[-r(t - t_0)] - \frac{\alpha_i^2 x^i}{4c_i^{\{1,2,3\}}} \left[W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3)\right]^2 \exp[r(t - t_0)]\right\} \\
& \qquad + \sum_{i=1}^{3} W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \left\{\frac{\alpha_i^2}{2c_i^{\{1,2,3\}}} W_{x^i}^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) \exp[r(t - t_0)]\, x^i - \delta x^i\right\},
\end{aligned} \tag{9.22}$$
and
$$W^{(t_0)\{1,2,3\}}(T, x^1, x^2, x^3) = \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j (x^j)^{1/2},$$
for $i, j, h \in \{1, 2, 3\}$ and $i \ne j \ne h$. Solving (9.22) yields
$$W^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3) = \left[A_1^{\{1,2,3\}}(t) (x^1)^{1/2} + A_2^{\{1,2,3\}}(t) (x^2)^{1/2} + A_3^{\{1,2,3\}}(t) (x^3)^{1/2} + C^{\{1,2,3\}}(t)\right] \exp[-r(t - t_0)], \tag{9.23}$$
where $A_1^{\{1,2,3\}}(t)$, $A_2^{\{1,2,3\}}(t)$, $A_3^{\{1,2,3\}}(t)$, and $C^{\{1,2,3\}}(t)$ satisfy
$$\dot{A}_i^{\{1,2,3\}}(t) = \left[r + \frac{\delta}{2} + \frac{\sigma_i^2}{8}\right] A_i^{\{1,2,3\}}(t) - \frac{b_i^{[i,j]}}{2} A_j^{\{1,2,3\}}(t) - \frac{b_i^{[i,h]}}{2} A_h^{\{1,2,3\}}(t) - P_i, \quad \text{for } i, j, h \in \{1, 2, 3\} \text{ and } i \ne j \ne h,$$
$$\dot{C}^{\{1,2,3\}}(t) = r C^{\{1,2,3\}}(t) - \sum_{i=1}^{3} \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \left[A_i^{\{1,2,3\}}(t)\right]^2, \tag{9.24}$$
$$A_i^{\{1,2,3\}}(T) = q_i, \text{ for } i \in \{1, 2, 3\}, \quad \text{and} \quad C^{\{1,2,3\}}(T) = 0.$$
The first three equations in the block-recursive system in (9.24) form a system of three linear differential equations that can be solved explicitly by standard techniques. Upon solving $A_i^{\{1,2,3\}}(t)$ for $i \in \{1, 2, 3\}$ and substituting them into the fourth equation of (9.24), one has a linear differential equation in $C^{\{1,2,3\}}(t)$. The investment strategies of the grand coalition joint venture can be derived as
$$\psi_i^{\{1,2,3\}}(t, x) = \frac{\alpha_i^2}{16(c_i^{\{1,2,3\}})^2} \left[A_i^{\{1,2,3\}}(t)\right]^2, \quad \text{for } i \in \{1, 2, 3\}. \tag{9.25}$$
The dynamics of technological progress of the joint venture over the time interval $s \in [t_0, T]$ can be expressed as
$$dx^i(s) = \left[\frac{\alpha_i^2}{4c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}}(s)\, x^i(s)^{1/2} - \delta x^i(s)\right] ds + \sigma_i x^i(s)\, dz_i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in \{1, 2, 3\}. \tag{9.26}$$
Applying the transformation $y^i(s) = x^i(s)^{1/2}$, for $i \in \{1, 2, 3\}$, the equation system in (9.26) can be expressed as
$$dy^i(s) = \left[\frac{\alpha_i^2}{8c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}}(s) - \frac{\delta}{2} y^i(s) - \frac{\sigma_i^2}{8} y^i(s)\right] ds + \frac{1}{2}\sigma_i y^i(s)\, dz_i(s), \quad y^i(t_0) = (x_0^i)^{1/2}, \quad \text{for } i \in \{1, 2, 3\}. \tag{9.27}$$
Equation (9.27) is a system of linear stochastic differential equations that can be solved by standard techniques. Solving (9.27) yields the joint venture's state trajectory. Let $\{y^{1*}(t), y^{2*}(t), y^{3*}(t)\}$ denote the solution to (9.27). Transforming $x^i = (y^i)^2$, we obtain the state trajectories of the joint venture over the time interval $[t_0, T]$ as
$$\{x^*(t)\}_{t=t_0}^{T} = \{x^{1*}(t), x^{2*}(t), x^{3*}(t)\}_{t=t_0}^{T} = \left\{[y^{1*}(t)]^2, [y^{2*}(t)]^2, [y^{3*}(t)]^2\right\}_{t=t_0}^{T}. \tag{9.28}$$
We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time $t$, and the term $x_t^* \in X_t^*$ is used to denote an element in $X_t^*$.

Remark 9.1 One can readily verify that
$$W^{(t_0)\{1,2,3\}}(t, x^{1*}, x^{2*}, x^{3*}) = W^{(t)\{1,2,3\}}(t, x^{1*}, x^{2*}, x^{3*}) \exp[-r(t - t_0)].$$
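The passage from (9.26) to (9.27) is an application of Itô's lemma to $y = x^{1/2}$. The block below sketches that step for a single firm; the shorthand $\mu(s) = \alpha_i^2 A_i^{\{1,2,3\}}(s) / (4c_i^{\{1,2,3\}})$ is introduced here only to keep the lines short.

```latex
\begin{align*}
dx &= \bigl[\mu(s)\,x^{1/2} - \delta x\bigr]\,ds + \sigma_i x\,dz_i,
  \qquad y = x^{1/2},\\[4pt]
dy &= \tfrac{1}{2}x^{-1/2}\,dx - \tfrac{1}{8}x^{-3/2}(\sigma_i x)^2\,ds\\
   &= \Bigl[\tfrac{1}{2}\mu(s) - \tfrac{\delta}{2}x^{1/2}
      - \tfrac{\sigma_i^2}{8}x^{1/2}\Bigr]ds + \tfrac{1}{2}\sigma_i x^{1/2}\,dz_i\\
   &= \Bigl[\tfrac{1}{2}\mu(s) - \tfrac{\delta}{2}y
      - \tfrac{\sigma_i^2}{8}y\Bigr]ds + \tfrac{1}{2}\sigma_i y\,dz_i .
\end{align*}
% The drift is affine in y and the diffusion is linear in y,
% so the transformed equation is a linear stochastic differential equation.
```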
9.2.2 Subgame Consistent Venture Profit Sharing

Since the firms agree to share their expected joint profit proportionally to their expected noncooperative profits, the imputation scheme has to fulfill the following condition.

Condition 9.2 In the game $\Gamma_c(x_0, T - t_0)$, an imputation
$$\xi^{(t_0)i}(t_0, x_0) = \frac{V^{(t_0)i}(t_0, x_0^i)}{\sum_{j=1}^{3} V^{(t_0)j}(t_0, x_0^j)}\, W^{(t_0)\{1,2,3\}}(t_0, x_0^1, x_0^2, x_0^3)$$
is assigned to firm $i$, for $i \in \{1, 2, 3\}$, and in the subgame $\Gamma_c(x_\tau^*, T - \tau)$, for $\tau \in (t_0, T]$, an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \tag{9.29}$$
is assigned to firm $i$, for $i \in \{1, 2, 3\}$.
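To formulate a payoff distribution procedure over time so that the agreed imputations in Condition 9.2 are satisfied, we invoke Corollary 9.2 to obtain
$$\begin{aligned}
B_i(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) = {} & -\frac{\partial}{\partial t}\left[\frac{V^{(\tau)i}(t, x_t^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(t, x_t^{j*})}\, W^{(\tau)\{1,2,3\}}(t, x_t^{1*}, x_t^{2*}, x_t^{3*})\right]\Bigg|_{t=\tau} \\
& - \frac{1}{2}\sum_{h,\zeta=1}^{3} \sigma_h x_\tau^{h*}\, \sigma_\zeta x_\tau^{\zeta*}\, \frac{\partial^2}{\partial x_\tau^{h*}\, \partial x_\tau^{\zeta*}}\left[\frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*})\right] \\
& - \sum_{h=1}^{3} \frac{\partial}{\partial x_\tau^{h*}}\left[\frac{V^{(\tau)i}(\tau, x_\tau^{i*})}{\sum_{j=1}^{3} V^{(\tau)j}(\tau, x_\tau^{j*})}\, W^{(\tau)\{1,2,3\}}(\tau, x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*})\right] \left[\frac{\alpha_h^2}{4c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}}(\tau)\, (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*}\right],
\end{aligned} \tag{9.30}$$
for $i \in \{1, 2, 3\}$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.

Finally, with firms using the cooperative investment strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ is
$$\zeta_i(\tau, x_\tau^*) = P_i (x_\tau^{i*})^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \left[A_i^{\{1,2,3\}}(\tau)\right]^2, \tag{9.31}$$
for $i \in \{1, 2, 3\}$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.

Under cooperation, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$ as stated in (9.30). Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*), \tag{9.32}$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in \{1, 2, 3\}$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.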
9.3 A Shapley Value Solution to a Joint Venture Under Uncertainty

Consider again the stochastic dynamic venture model in (9.1) and (9.2). Suppose firms are allowed to form different coalitions consisting of a subset of companies $K \subseteq N$, with $k$ firms in the subset $K$. The participating firms in a coalition can gain core skills and technology from each other. In particular, they can obtain cost reductions, with joint venture cost advantage
$$c_j^K[u_j(s)] \le c_j^L[u_j(s)], \quad \text{for } j \in L \subseteq K, \tag{9.33}$$
where $c_j^K[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $K$ and $c_j^L[u_j(s)]$ represents the costs of the controls of firm $j$ in the subset $L$. Moreover, marginal cost advantages lead to
$$\partial c_j^K[u_j(s)]/\partial u_j(s) \le \partial c_j^L[u_j(s)]/\partial u_j(s), \quad \text{for } j \in L \subseteq K.$$
At time $t_0$, the expected profit to the joint venture $K$ becomes
$$E_{t_0}\left\{\int_{t_0}^{T} \sum_{j \in K} \left[g^j[s, x^j(s)] - c_j^K[u_j(s)]\right] \exp\left[-\int_{t_0}^{s} r(y)\, dy\right] ds + \sum_{j \in K} \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] q^j(x^j(T))\right\}, \quad \text{for } K \subseteq N. \tag{9.34}$$
9.3.1 Expected Joint Venture Profits and Optimal Trajectory

To compute the expected profit of the joint venture $K$ we have to consider the stochastic control problem $[K; t_0, x_0^K]$, which maximizes the expected joint venture profit in (9.34) subject to the technology accumulation dynamics in (9.1). Invoking Theorem A.5 in the Technical Appendixes, the solution to the control problem $[K; t_0, x_0^K]$ can be characterized as follows.

Corollary 9.3 A set of controls $\{\psi_i^{K*}(t, x^K)$, for $i \in K$ and $t \in [t_0, T]\}$ provides an optimal solution to the stochastic control problem $[K; t_0, x_0^K]$ if there exists a continuously twice differentiable function $W^{(t_0)K}(t, x^K) : [t_0, T] \times R^k \to R$ satisfying the following partial differential equation:
$$-W_t^{(t_0)K}(t, x^K) - \frac{1}{2}\sum_{h,\zeta \in K} \Omega^{h\zeta}(t, x^K)\, W^{(t_0)K}_{x^h x^\zeta}(t, x^K) = \max_{u_K}\left\{\sum_{j \in K} \left[g^j(t, x^j) - c_j^K(u_j)\right] \exp\left[-\int_{t_0}^{t} r(y)\, dy\right] + \sum_{j \in K} W_{x^j}^{(t_0)K}(t, x^K)\, f^j(t, x^j, u_j)\right\}, \tag{9.35}$$
$$W^{(t_0)K}(T, x^K) = \exp\left[-\int_{t_0}^{T} r(y)\, dy\right] \sum_{j \in K} q^j(x^j).$$
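Following Corollary 9.3, one can characterize the maximized expected payoff $W^{(\tau)K}(t, x^K)$ of the optimal control problem $[K; \tau, x_\tau^K]$, which maximizes
$$E_\tau\Bigl\{ \int_{\tau}^{T} \sum_{j \in K} \bigl[ g^j[s, x^j(s)] - c_j^K[u_j(s)] \bigr] \exp\Bigl[-\int_{\tau}^{s} r(y)\,dy\Bigr] ds + \exp\Bigl[-\int_{\tau}^{T} r(y)\,dy\Bigr] \sum_{j \in K} q^j\bigl(x^j(T)\bigr) \Bigr\},$$
subject to
$$\dot{x}^j(s) = f^j[s, x^j(s), u_j(s)], \quad x^j(\tau) = x_\tau^j, \quad \text{for } j \in K.$$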
The superadditivity of the expected coalition payoff can be demonstrated.
Proposition 9.1 The expected coalition profit $W^{(\tau)K}(t, x^K)$ is superadditive, that is,
$$W^{(\tau)K}(\tau, x^K) \ge W^{(\tau)L}(\tau, x^L) + W^{(\tau)K \setminus L}(\tau, x^{K \setminus L}), \quad \text{for } L \subset K \subseteq N,$$
where $K \setminus L$ is the relative complement of $L$ in $K$.
Proof Follow the proof of Proposition 5.2 in the Appendix of Chap. 5.
Now consider the case of a grand coalition $N$ in which all the $n$ firms are in the coalition. Following Corollary 9.3, the solution to the stochastic control problem $[N; t_0, x_0^N]$ can be characterized as in Corollary 9.1. The cooperative state dynamics is (9.6) and the optimal stochastic trajectory is (9.7). The optimal cooperative strategies are in (9.8). Along the cooperative investment path $\{x^*(t)\}_{t=t_0}^{T}$ the expected total venture profit over the interval $[t, T]$, for $t \in [t_0, T)$, can be expressed as (9.9).
9.3.2 The PDP for Shapley Value
Consider the case where the participating firms agree to share their expected cooperative profits according to the Shapley Value (1953). The imputation has to satisfy the following condition.
Condition 9.3 In the game $\Gamma_c(x_0, T - t_0)$, an imputation
$$\xi^{(t_0)i}(t_0, x_0^N) = \sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \bigl[ W^{(t_0)K}(t_0, x_0^K) - W^{(t_0)K \setminus i}(t_0, x_0^{K \setminus i}) \bigr]$$
is assigned to firm $i$, for $i \in N$, and in the subgame $\Gamma_c(x_\tau^*, T - \tau)$, for $\tau \in (t_0, T]$, an imputation
$$\xi^{(\tau)i}(\tau, x_\tau^{N*}) = \sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \bigl[ W^{(\tau)K}(\tau, x_\tau^{K*}) - W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \bigr] \tag{9.36}$$
is assigned to firm $i$, for $i \in N$.
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 9.3 we invoke Theorem 8.1 in Chap. 8 and obtain the following.
Corollary 9.4 A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$ when $x^*(\tau) = x_\tau^* \in X_\tau^*$
$$\begin{aligned}
B_i(\tau, x_\tau^*) = -\sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \Bigl\{ & \bigl[ W_t^{(\tau)K}(t, x_t^{K*}) \big|_{t=\tau} \bigr] - \bigl[ W_t^{(\tau)K \setminus i}(t, x_t^{K \setminus i*}) \big|_{t=\tau} \bigr] \\
& + \sum_{h \in K} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K}(\tau, x_\tau^{K*}) \Bigr] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)] \\
& - \sum_{h \in K \setminus i} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \Bigr] f^h[\tau, x_\tau^{h*}, \psi_h^*(\tau, x_\tau^*)] \\
& + \frac{1}{2} \sum_{h,\zeta \in K} \Omega^{h\zeta}(\tau, x_\tau^*) \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{(\tau)K}(\tau, x_\tau^{K*}) \\
& - \frac{1}{2} \sum_{h,\zeta \in K \setminus i} \Omega^{h\zeta}(\tau, x_\tau^*) \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \Bigr\}, \quad \text{for } i \in N,
\end{aligned} \tag{9.37}$$
would lead to the realization of the Shapley Value imputations $\xi^{(\tau)i}(\tau, x_\tau^{N*})$ in Condition 9.3.
Invoking Theorem 8.2 in Chap. 8, a subgame consistent Shapley Value solution for the joint venture can be obtained as $P(x_0, T - t_0) = \{u(s, x_s^*)$ and $B(s, x_s^*)$ for $s \in [t_0, T]$ and $\xi^{(t_0)}(t_0, x_0)\}$, where $u(s, x_s^*) = \psi^{N*}(s, x^*(s))$ is the set of group optimal strategies in the grand coalition, $B(s, x_s^*)$ is the PDP given in (9.37), and $\xi^{(t_0)}(t_0, x_0)$ is the Shapley Value imputation in Condition 9.3.
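The weighting in Condition 9.3 can be illustrated numerically. The following is a minimal sketch, assuming the expected coalition profits $W^{(\tau)K}$ have already been computed at a given $(\tau, x_\tau^*)$; the function name `shapley` and the profit figures are hypothetical and serve only to show how the factor $(k-1)!(n-k)!/n!$ is applied to each marginal contribution.

```python
from itertools import combinations
from math import factorial

def shapley(values, players):
    """Shapley imputation from a table of coalition values.

    values maps frozenset(coalition) -> expected coalition profit W^K;
    the empty coalition is taken to be worth 0.
    """
    n = len(players)
    xi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        # Enumerate every coalition K that contains firm i.
        for size in range(n):
            for rest in combinations(others, size):
                K = frozenset(rest) | {i}
                k = len(K)
                weight = factorial(k - 1) * factorial(n - k) / factorial(n)
                total += weight * (values[K] - values.get(K - {i}, 0.0))
        xi[i] = total
    return xi

# Hypothetical expected coalition profits at some (tau, x_tau*).
W = {
    frozenset({1}): 10.0, frozenset({2}): 8.0, frozenset({3}): 6.0,
    frozenset({1, 2}): 21.0, frozenset({1, 3}): 18.0, frozenset({2, 3}): 16.0,
    frozenset({1, 2, 3}): 30.0,
}
xi = shapley(W, [1, 2, 3])
```

With these illustrative numbers the three imputations add up to the grand coalition value of 30, which is the group optimality property that the imputation in Condition 9.3 is meant to preserve along the cooperative path.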
Finally, with firms using the cooperative investment strategies $\{\psi_i^*(\tau, x_\tau^*)$, for $\tau \in [t_0, T]$ and $i \in N\}$, the instantaneous receipt of firm $i$ at time instant $\tau$ when $x^*(\tau) = x_\tau^* \in X_\tau^*$ is
$$\zeta_i(\tau, x_\tau^*) = g^i[\tau, x_\tau^{i*}] - c_i^N[\psi_i^*(\tau, x_\tau^*)], \quad \text{for } \tau \in [t_0, T] \text{ and } i \in N. \tag{9.38}$$
According to Corollary 9.4, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$ as stated in (9.37). Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*) \tag{9.39}$$
would be given or charged to firm $i$ at time $\tau$, for $i \in N$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.
9.4 A Stochastic Joint Venture with Shapley Value Profit Sharing Consider a similar venture as that in Sect. 9.2. When the firms act independently, their expected profits and state dynamics are, respectively, (9.14) and (9.15). The expected profits of firm i ∈ {1, 2, 3} are given in (9.16). However, the participating firms would like to share their expected cooperative profits according to the Shapley Value.
9.4.1 Expected Coalition Payoffs
Cost-saving opportunities are created under the joint venture. In particular, the cost savings in the joint venture are as follows:
$$\begin{aligned}
c_i^{\{i,j\}} &\le c_i^{\{i\}}, \quad \text{for } i, j \in \{1,2,3\} \text{ and } i \ne j, \\
c_i^{\{i,j,k\}} &\le c_i^{\{i,j\}}, \quad \text{for } i, j, k \in \{1,2,3\} \text{ and } i \ne j \ne k.
\end{aligned} \tag{9.40}$$
The firms in the joint venture maximize the sum of their expected profits
$$E_{t_0}\Bigl\{ \int_{t_0}^{T} \sum_{j=1}^{3} \bigl[ P_j [x^j(s)]^{1/2} - c_j^{\{1,2,3\}} u_j(s) \bigr] \exp[-r(s - t_0)]\, ds + \sum_{j=1}^{3} \exp[-r(T - t_0)]\, q_j [x^j(T)]^{1/2} \Bigr\}, \tag{9.41}$$
subject to (9.15).
Following the analysis in Sect. 9.2, one can obtain the maximized expected venture profit $W^{(t_0)\{1,2,3\}}(t, x^1, x^2, x^3)$ as in (9.23), and the investment strategies
$$\psi_i^{\{1,2,3\}}(t, x) = \frac{\alpha_i^2}{16 (c_i^{\{1,2,3\}})^2} \bigl[ A_i^{\{1,2,3\}}(t) \bigr]^2, \quad \text{for } i \in \{1,2,3\}. \tag{9.42}$$
The cooperative state dynamics of the joint venture over the time interval $s \in [t_0, T]$ is in (9.26). For the computation of the dynamic Shapley Value, we consider cases when two of the firms form a coalition $\{i, j\} \subset \{1,2,3\}$ to maximize the expected joint profit
$$\begin{aligned}
E_{t_0}\Bigl\{ & \int_{t_0}^{T} \bigl[ P_i [x^i(s)]^{1/2} - c_i^{\{i,j\}} u_i(s) + P_j [x^j(s)]^{1/2} - c_j^{\{i,j\}} u_j(s) \bigr] \exp[-r(s - t_0)]\, ds \\
& + \exp[-r(T - t_0)] \bigl[ q_i [x^i(T)]^{1/2} + q_j [x^j(T)]^{1/2} \bigr] \Bigr\},
\end{aligned} \tag{9.43}$$
subject to
$$dx^i(s) = \bigl[ \alpha_i [u_i(s) x^i(s)]^{1/2} - \delta x^i(s) \bigr] ds + \sigma_i x^i(s)\, dz_i(s), \quad x^i(t_0) = x_0^i \in X_i, \tag{9.44}$$
for $i, j \in \{1,2,3\}$ and $i \ne j$.
Following the analysis in Sect. 9.2, we obtain the following value functions:
$$W^{(t_0)\{i,j\}}(t, x^i, x^j) = \bigl[ A_i^{\{i,j\}}(t) (x^i)^{1/2} + A_j^{\{i,j\}}(t) (x^j)^{1/2} + C^{\{i,j\}}(t) \bigr] \exp[-r(t - t_0)], \tag{9.45}$$
for $i, j \in \{1,2,3\}$ and $i \ne j$, where $A_i^{\{i,j\}}(t)$, $A_j^{\{i,j\}}(t)$, and $C^{\{i,j\}}(t)$ satisfy
$$\dot{A}_i^{\{i,j\}}(t) = \Bigl[ r + \frac{\delta}{2} + \frac{\sigma_i^2}{8} \Bigr] A_i^{\{i,j\}}(t) - P_i, \quad \text{and} \quad A_i^{\{i,j\}}(T) = q_i,$$
for $i, j \in \{1,2,3\}$ and $i \ne j$;
$$\dot{C}^{\{i,j\}}(t) = r C^{\{i,j\}}(t) - \sum_{h \in \{i,j\}} \frac{\alpha_h^2}{16 c_h^{\{i,j\}}} \bigl[ A_h^{\{i,j\}}(t) \bigr]^2, \quad C^{\{i,j\}}(T) = 0.$$
Moreover, one can easily derive, for $\tau \in [t_0, T]$,
$$W^{(t_0)\{i,j\}}(t, x^i, x^j) = \exp[-r(\tau - t_0)]\, W^{(\tau)\{i,j\}}(t, x^i, x^j), \quad \text{for } i, j \in \{1,2,3\} \text{ and } i \ne j.$$
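The coefficient equations accompanying (9.45) are linear in $A_h$ and can be stepped backward from the terminal conditions. The sketch below is illustrative only: the parameter values, the dictionary keys, and the helper names (`growth_rate`, `A_closed_form`, `coalition_coefficients`) are assumptions, not part of the text. It integrates $A_h(t)$ and $C(t)$ with explicit Euler steps and compares $A_h(t_0)$ with the closed form $A(t) = P/k + (q - P/k)\exp[-k(T - t)]$, $k = r + \delta/2 + \sigma^2/8$, which solves $\dot{A} = kA - P$ with $A(T) = q$.

```python
import math

def growth_rate(r, delta, sigma):
    """k = r + delta/2 + sigma^2/8, the coefficient on A in its differential equation."""
    return r + delta / 2 + sigma ** 2 / 8

def A_closed_form(t, T, P, q, k):
    """Solution of dA/dt = k*A - P with terminal condition A(T) = q."""
    return P / k + (q - P / k) * math.exp(-k * (T - t))

def coalition_coefficients(firms, r, delta, t0, T, steps=20000):
    """Step the A_h and C equations backward from T to t0 (explicit Euler).

    firms: list of dicts with hypothetical keys P, q, alpha, sigma, c,
    where c is the firm's unit investment cost inside the coalition.
    """
    h = (T - t0) / steps
    A = [f["q"] for f in firms]   # terminal condition A_h(T) = q_h
    C = 0.0                       # terminal condition C(T) = 0
    for _ in range(steps):
        dC = r * C - sum(f["alpha"] ** 2 * a ** 2 / (16 * f["c"])
                         for f, a in zip(firms, A))
        dA = [growth_rate(r, delta, f["sigma"]) * a - f["P"]
              for f, a in zip(firms, A)]
        C -= h * dC
        A = [a - h * da for a, da in zip(A, dA)]
    return A, C

# Illustrative two-firm coalition; none of these numbers come from the text.
firms = [
    {"P": 5.0, "q": 2.0, "alpha": 1.0, "sigma": 0.2, "c": 1.0},
    {"P": 4.0, "q": 1.5, "alpha": 0.8, "sigma": 0.3, "c": 1.2},
]
A0, C0 = coalition_coefficients(firms, r=0.05, delta=0.1, t0=0.0, T=2.0)
# Investment rates alpha^2 A^2 / (16 c^2) at t0, the same form as in (9.42).
u0 = [f["alpha"] ** 2 * a ** 2 / (16 * f["c"] ** 2) for f, a in zip(firms, A0)]
```

Because $A_h(t)$ does not depend on the cost parameter, the cost advantage of a larger coalition enters the value function only through $C(t)$ and through the investment rate.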
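9.4.2 Subgame Consistent Shapley Value Solution
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value we invoke Corollary 9.4 and obtain the following. A PDP with a terminal payment $q^i(x_T^*)$ at time $T$ and an instantaneous payment at time $\tau \in [t_0, T]$
$$\begin{aligned}
B_i(\tau, x_\tau^*) = -\sum_{K \subseteq \{1,2,3\}} \frac{(k-1)!(3-k)!}{3!} \Bigl\{ & \bigl[ W_t^{(\tau)K}(t, x_t^{K*}) \big|_{t=\tau} \bigr] - \bigl[ W_t^{(\tau)K \setminus i}(t, x_t^{K \setminus i*}) \big|_{t=\tau} \bigr] \\
& + \sum_{h \in K} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K}(\tau, x_\tau^{K*}) \Bigr] \Bigl[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}}(\tau) (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*} \Bigr] \\
& - \sum_{h \in K \setminus i} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \Bigr] \Bigl[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}}(\tau) (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*} \Bigr] \\
& + \frac{1}{2} \sum_{h,\zeta \in K} \sigma_h x_\tau^{h*} \sigma_\zeta x_\tau^{\zeta*} \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{(\tau)K}(\tau, x_\tau^{K*}) \\
& - \frac{1}{2} \sum_{h,\zeta \in K \setminus i} \sigma_h x_\tau^{h*} \sigma_\zeta x_\tau^{\zeta*} \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{(\tau)K \setminus i}(\tau, x_\tau^{K \setminus i*}) \Bigr\},
\end{aligned} \tag{9.46}$$
for $i \in \{1,2,3\}$,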
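would lead to the realization of the Shapley Value imputations in Condition 9.3. Using (9.23) and (9.45), and differentiating the discount factor $\exp[-r(t - \tau)]$ with respect to $t$,
$$W_t^{(\tau)i}(t, x_t^{i*}) \big|_{t=\tau} = -r \bigl[ A_i^{\{i\}}(\tau) (x_\tau^{i*})^{1/2} + C_i^{\{i\}}(\tau) \bigr] + \bigl[ \dot{A}_i^{\{i\}}(\tau) (x_\tau^{i*})^{1/2} + \dot{C}_i^{\{i\}}(\tau) \bigr],$$
for $i \in \{1,2,3\}$;
$$\begin{aligned}
W_t^{(\tau)\{i,j\}}(t, x_t^{i*}, x_t^{j*}) \big|_{t=\tau} = {} & -r \bigl[ A_i^{\{i,j\}}(\tau) (x_\tau^{i*})^{1/2} + A_j^{\{i,j\}}(\tau) (x_\tau^{j*})^{1/2} + C^{\{i,j\}}(\tau) \bigr] \\
& + \bigl[ \dot{A}_i^{\{i,j\}}(\tau) (x_\tau^{i*})^{1/2} + \dot{A}_j^{\{i,j\}}(\tau) (x_\tau^{j*})^{1/2} + \dot{C}^{\{i,j\}}(\tau) \bigr],
\end{aligned}$$
for $i, j \in \{1,2,3\}$ and $i \ne j$;
$$\begin{aligned}
W_t^{(\tau)\{1,2,3\}}(t, x_t^{*}) \big|_{t=\tau} = {} & -r \bigl[ A_1^{\{1,2,3\}}(\tau) (x_\tau^{1*})^{1/2} + A_2^{\{1,2,3\}}(\tau) (x_\tau^{2*})^{1/2} + A_3^{\{1,2,3\}}(\tau) (x_\tau^{3*})^{1/2} + C^{\{1,2,3\}}(\tau) \bigr] \\
& + \bigl[ \dot{A}_1^{\{1,2,3\}}(\tau) (x_\tau^{1*})^{1/2} + \dot{A}_2^{\{1,2,3\}}(\tau) (x_\tau^{2*})^{1/2} + \dot{A}_3^{\{1,2,3\}}(\tau) (x_\tau^{3*})^{1/2} + \dot{C}^{\{1,2,3\}}(\tau) \bigr];
\end{aligned}$$
$$\frac{\partial}{\partial x_\tau^{h*}} W^{(\tau)K}(\tau, x_\tau^{K*}) = \frac{1}{2} A_h^K(\tau) (x_\tau^{h*})^{-1/2}, \quad \text{for } h \in K \subseteq \{1,2,3\},$$
$$\frac{\partial^2}{\partial (x_\tau^{h*})^2} W^{(\tau)K}(\tau, x_\tau^{K*}) = -\frac{1}{4} A_h^K(\tau) (x_\tau^{h*})^{-3/2}, \quad \text{and} \quad \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{(\tau)K}(\tau, x_\tau^{K*}) = 0, \quad \text{for } h \ne \zeta.$$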
Invoking Theorem 8.2 in Chap. 8, a subgame consistent solution for the joint venture can be obtained as $P(x_0, T - t_0) = \{u(s, x_s^*)$ and $B(s, x_s^*)$ for $s \in [t_0, T]$ and $\xi^{(t_0)}(t_0, x_0)\}$, where $u(s, x_s^*) = \psi^{N*}(s, x^*(s))$ is the set of group optimal strategies in the grand coalition, $B(s, x_s^*)$ is the PDP given in (9.46), and $\xi^{(t_0)}(t_0, x_0)$ is the Shapley Value imputation in Condition 9.3. Finally, with firms using the cooperative investment strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ is
$$\zeta_i(\tau, x_\tau^*) = P_i (x_\tau^{i*})^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \bigl[ A_i^{\{1,2,3\}}(\tau) \bigr]^2, \tag{9.47}$$
for $i \in \{1,2,3\}$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.
According to (9.46), the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$. Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*) \tag{9.48}$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in \{1,2,3\}$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.
9.5 Infinite-Horizon Analysis
Consider the case when the horizon of the analysis approaches infinity. The state dynamics of the $i$th firm is characterized by the set of vector-valued differential equations
$$dx^i(s) = f^i[x^i(s), u_i(s)]\, ds + \sigma^i[x^i(s)]\, dz_i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N, \tag{9.49}$$
where $\sigma^i[x^i(s)]$ is an $m_i \times \Theta_i$ matrix, $z_i(s)$ is a $\Theta_i$-dimensional Wiener process, and the initial state $x_0^i$ is given. Let $\Omega^i[x^i(s)] = \sigma^i[x^i(s)] \sigma^i[x^i(s)]^T$ denote the covariance matrix with its element in row $h$ and column $\zeta$ denoted by $\Omega_i^{h\zeta}[x^i(s)]$. The objective of firm $i$ to be maximized is
$$E_{t_0}\Bigl\{ \int_{t_0}^{\infty} \bigl[ g^i[x^i(s)] - c_i^{\{i\}}[u_i(s)] \bigr] \exp[-r(s - t_0)]\, ds \Bigr\}, \quad \text{for } i \in N. \tag{9.50}$$
Consider the alternative formulation of (9.49) and (9.50) as
$$\max_{u_i} E_t\Bigl\{ \int_{t}^{\infty} \bigl[ g^i[x^i(s)] - c_i^{\{i\}}[u_i(s)] \bigr] \exp[-r(s - t)]\, ds \Bigr\}, \quad \text{for } i \in N, \tag{9.51}$$
subject to
$$dx^i(s) = f^i[x^i(s), u_i(s)]\, ds + \sigma^i[x^i(s)]\, dz_i(s), \quad x^i(t) = x^i, \quad \text{for } i \in N. \tag{9.52}$$
The infinite-horizon problem in (9.51) and (9.52) is independent of the choice of $t$ and dependent only upon the state at the starting time. Invoking Theorem A.6 in the Technical Appendixes, a noncooperative equilibrium can be characterized as follows: a set of strategies $\{\phi_i^*(x)$, for $i \in N\}$ constitutes an equilibrium solution to the problem in (9.51) and (9.52) if there exist functionals $\hat{V}^i(x^i) : R^{m_i} \to R$, for $i \in N$, satisfying the following set of partial differential equations:
$$r \hat{V}^i(x^i) - \frac{1}{2} \sum_{h,\zeta=1}^{m_i} \Omega_i^{h\zeta}(x^i)\, \hat{V}^i_{x^{i(h)} x^{i(\zeta)}}(x^i) = \max_{u_i} \Bigl\{ g^i(x^i) - c_i^{\{i\}}(u_i) + \hat{V}^i_{x^i}(x^i)\, f^i(x^i, u_i) \Bigr\}. \tag{9.53}$$
Once again, for the sake of clarity in exposition, we consider the case where mi = 1, for i ∈ N .
9.5.1 Infinite-Horizon Dynamic Joint Venture
Consider the case when all these $n$ companies form a joint venture. Cost-saving opportunities are created under the joint venture. The cost of control of firm $j$ under the joint venture becomes $c_j^N[u_j(s)]$. With the joint venture cost advantage
$$c_j^N(u_j) \le c_j^{\{j\}}(u_j), \quad \text{for } j \in N, \tag{9.54}$$
the joint venture would maximize the expected joint venture profit
$$E_t\Bigl\{ \int_{t}^{\infty} \sum_{j=1}^{n} \bigl[ g^j[x^j(s)] - c_j^N[u_j(s)] \bigr] \exp[-r(s - t)]\, ds \Bigr\}, \tag{9.55}$$
subject to (9.52). An optimal solution of the control problem in (9.52) and (9.55) can be characterized using Theorem A.6 in the Technical Appendixes as follows.
Corollary 9.5 A set of control strategies $\{\psi_i^*(x)$, for $i \in N\}$ provides a solution to the control problem in (9.52) and (9.55) if there exists a continuously twice differentiable function $W(x) : R^n \to R$ satisfying the following partial differential equation:
$$rW(x) - \frac{1}{2} \sum_{h,\zeta=1}^{n} \Omega^{h\zeta}(x)\, W_{x^h x^\zeta}(x) = \max_{u_1, u_2, \ldots, u_n} \Bigl\{ \sum_{j=1}^{n} \bigl[ g^j(x^j) - c_j^N(u_j) \bigr] + \sum_{j=1}^{n} W_{x^j}(x)\, f^j(x^j, u_j) \Bigr\}, \tag{9.56}$$
where $x = \{x^1, x^2, \ldots, x^n\}$.
Hence the firms will adopt the cooperative controls $\{\psi_i^*(x)$, for $i \in N\}$ to obtain the maximized level of expected joint profit. Substituting this set of controls into (9.52) yields the dynamics of technology advancement under cooperation as
$$dx^i(s) = f^i[x^i(s), \psi_i^*(x(s))]\, ds + \sigma^i[x^i(s)]\, dz_i(s), \quad x^i(t_0) = x_0^i, \quad \text{for } i \in N. \tag{9.57}$$
Let $x^*(t) = \{x^{1*}(t), x^{2*}(t), \ldots, x^{n*}(t)\}$ denote the solution to (9.57). The optimal trajectory $\{x^*(t)\}_{t=t_0}^{\infty}$ can be expressed as
$$x^{i*}(t) = x_0^i + \int_{t_0}^{t} f^i[x^{i*}(s), \psi_i^*(x^*(s))]\, ds + \int_{t_0}^{t} \sigma^i[x^{i*}(s)]\, dz_i(s), \quad \text{for } i \in N. \tag{9.58}$$
We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time $t$ generated by (9.58). The term $x_t^* \in X_t^*$ is used to denote an element in $X_t^*$. Substituting the optimal cooperative strategies $\{\psi_i^*(x)$, for $i \in N\}$ into (9.55) yields the expected venture profit as
$$W(x_t^*) = E_t\Bigl\{ \int_{t}^{\infty} \sum_{j=1}^{n} \bigl[ g^j[x^{j*}(s)] - c_j^N[\psi_j^*(x^*(s))] \bigr] \exp[-r(s - t)]\, ds \Bigr\}. \tag{9.59}$$
9.5.2 Subgame Consistent Venture Profit Sharing Consider the case when the firms in the venture share the excess of the total expected cooperative payoff over the sum of the expected individual noncooperative payoffs proportionally to the firms’ expected noncooperative payoffs.
The imputation scheme has to fulfill the following condition.
Condition 9.4 An imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}\, W(x_\tau^*) \tag{9.60}$$
is assigned to firm $i$, for $i \in N$, at time $\tau \in [t_0, \infty)$ if $x^*(\tau) = x_\tau^* \in X_\tau^*$.
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy Condition 9.4 we invoke Theorem 8.3 in Chap. 8 and obtain the following.
Corollary 9.6 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
$$\begin{aligned}
B_i(\tau, x_\tau^*) = {} & r\, \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}\, W(x_\tau^*) \\
& - \frac{1}{2} \sum_{h,\zeta=1}^{n} \Omega^{h\zeta}(x_\tau^*) \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} \Bigl[ \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}\, W(x_\tau^*) \Bigr] \\
& - \sum_{h=1}^{n} \frac{\partial}{\partial x_\tau^{h*}} \Bigl[ \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}\, W(x_\tau^*) \Bigr] f^h[x_\tau^{h*}, \psi_h^*(x_\tau^*)],
\end{aligned} \tag{9.61}$$
for $i \in N$ and $x^*(\tau) = x_\tau^* \in X_\tau^*$, would lead to the realization of the solution imputations in Condition 9.4.
With (9.61) a subgame consistent solution can be obtained. Note that while the firms use the cooperative investment strategies $\{\psi_i^*(x_\tau^*)$, for $i \in N\}$, the instantaneous receipt of firm $i$ at time instant $\tau$ is
$$\zeta_i(\tau, x_\tau^*) = g^i[x_\tau^{i*}] - c_i^N[\psi_i^*(x_\tau^*)],$$
for $i \in N$, $x^*(\tau) = x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, \infty)$. According to Corollary 9.6, the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$, for $i \in N$, as stated in (9.61). Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*)$$
has to be given or charged to firm $i$ at time $\tau$, for $i \in N$, if $x^*(\tau) = x_\tau^* \in X_\tau^*$.
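The imputation in Condition 9.4 follows directly from the sharing rule stated at the start of this subsection: each firm keeps its expected noncooperative payoff and receives a share of the excess proportional to that payoff. A short check, written in the notation of (9.60):

```latex
\begin{align*}
\xi^{(\tau)i}(\tau, x_\tau^*)
  &= \hat{V}^i(x_\tau^{i*})
   + \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}
     \Bigl[ W(x_\tau^*) - \sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*}) \Bigr] \\
  &= \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{n} \hat{V}^j(x_\tau^{j*})}\, W(x_\tau^*).
\end{align*}
```

Summing the last expression over $i \in N$ gives $W(x_\tau^*)$, so the whole expected venture profit is distributed.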
9.5.3 Shapley Value Profit Sharing
Consider again the infinite-horizon dynamic venture model in (9.51) and (9.52). The member firms would maximize their expected joint profit and share their expected cooperative profits according to the Shapley Value. Suppose that firms are allowed to form different coalitions, each consisting of a subset of companies $K \subseteq N$ with $k$ firms. In particular, they can obtain cost reductions, with the absolute joint venture cost advantage
$$c_j^K[u_j(s)] \le c_j^L[u_j(s)], \quad \text{for } j \in L \subseteq K, \tag{9.62}$$
where $c_j^K[u_j(s)]$ represents the cost of the controls of firm $j$ in the subset $K$ and $c_j^L[u_j(s)]$ represents the cost of the controls of firm $j$ in the subset $L$. Moreover, marginal cost advantages lead to
$$\partial c_j^K[u_j(s)] / \partial u_j(s) \le \partial c_j^L[u_j(s)] / \partial u_j(s), \quad \text{for } j \in L \subseteq K.$$
The expected profit of the joint venture $K$ becomes
$$E_t\Bigl\{ \int_{t}^{\infty} \sum_{j \in K} \bigl[ g^j[x^j(s)] - c_j^K[u_j(s)] \bigr] \exp[-r(s - t)]\, ds \Bigr\}, \quad \text{for } K \subseteq N. \tag{9.63}$$
To compute the profit of the joint venture $K$ we have to consider the optimal control problem in (9.62) and (9.63). Invoking Theorem A.6 of the Technical Appendixes, the solution to the stochastic control problem can be characterized as follows.
Corollary 9.7 A set of controls $\{\psi_i^{K*}(x^K)$, for $i \in K$ and $t \in [t_0, \infty)\}$ provides an optimal solution to the stochastic control problem in (9.62) and (9.63) if there exists a continuously twice differentiable function $W^K(x^K) : R^k \to R$ satisfying the following equation:
$$rW^K(x^K) - \frac{1}{2} \sum_{h,\zeta \in K} \Omega^{h\zeta}(x^K)\, W^K_{x^h x^\zeta}(x^K) = \max_{u_K} \Bigl\{ \sum_{j \in K} \bigl[ g^j(x^j) - c_j^K(u_j) \bigr] + \sum_{j \in K} W^K_{x^j}(x^K)\, f^j(x^j, u_j) \Bigr\}. \tag{9.64}$$
Now consider the case of a grand coalition N in which all the n firms are in the coalition. Using the result in Corollary 9.5, the cooperative state trajectory can be obtained as in (9.58). To share the venture profit among participating firms according to the Shapley Value, the imputation has to satisfy the following.
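Condition 9.5 An imputation
$$\xi^{(\tau)i}(\tau, x_\tau^{N*}) = \sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \bigl[ W^K(x_\tau^{K*}) - W^{K \setminus i}(x_\tau^{K \setminus i*}) \bigr] \tag{9.65}$$
is assigned to firm $i$, for $i \in N$, at time $\tau$ when the state is $x_\tau^*$.
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 9.5 we invoke Theorem 8.3 in Chap. 8 and obtain the following.
Corollary 9.8 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$
$$\begin{aligned}
B_i(\tau, x_\tau^*) = -\sum_{K \subseteq N} \frac{(k-1)!(n-k)!}{n!} \Bigl\{ & \bigl[ rW^{K \setminus i}(x_\tau^{K \setminus i*}) - rW^K(x_\tau^{K*}) \bigr] \\
& + \sum_{h \in K} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^K(x_\tau^{K*}) \Bigr] f^h[x_\tau^{h*}, \psi_h^*(x_\tau^*)] \\
& - \sum_{h \in K \setminus i} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{K \setminus i}(x_\tau^{K \setminus i*}) \Bigr] f^h[x_\tau^{h*}, \psi_h^*(x_\tau^*)] \\
& + \frac{1}{2} \sum_{h,\zeta \in K} \Omega^{h\zeta}(x_\tau^*) \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^K(x_\tau^{K*}) \\
& - \frac{1}{2} \sum_{h,\zeta \in K \setminus i} \Omega^{h\zeta}(x_\tau^*) \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{K \setminus i}(x_\tau^{K \setminus i*}) \Bigr\}, \quad \text{for } i \in N,
\end{aligned} \tag{9.66}$$
would lead to the realization of the Shapley Value in Condition 9.5. A subgame consistent solution (as that in Theorem 8.3 of Chap. 8) can be constructed with the optimal cooperative strategies and the PDP in (9.66).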
9.6 An Infinite-Horizon Stochastic Joint Venture
Consider the infinite-horizon version of the three-company joint venture in Sect. 9.2. The planning period is $[t_0, \infty)$. Company $i$'s expected profit is
$$E_{t_0}\Bigl\{ \int_{t_0}^{\infty} \bigl[ P_i [x^i(s)]^{1/2} - c_i^{\{i\}} u_i(s) \bigr] \exp[-r(s - t_0)]\, ds \Bigr\}, \tag{9.67}$$
for $i \in \{1,2,3\}$.
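The evolution of the technology level of company $i$ follows the dynamics
$$dx^i(s) = \bigl[ \alpha_i [u_i(s) x^i(s)]^{1/2} - \delta x^i(s) \bigr] ds + \sigma_i x^i(s)\, dz_i(s), \quad x^i(t_0) = x_0^i \in X^i, \quad \text{for } i \in \{1,2,3\}, \tag{9.68}$$
in the case when each of these three firms acts independently. Using Theorem A.6 in the Technical Appendixes, we obtain the Bellman equation as
$$rW^i(x^i) - \frac{(\sigma_i x^i)^2}{2} W^i_{x^i x^i}(x^i) = \max_{u_i} \Bigl\{ P_i (x^i)^{1/2} - c_i^{\{i\}} u_i + W^i_{x^i}(x^i) \bigl[ \alpha_i (u_i x^i)^{1/2} - \delta x^i \bigr] \Bigr\}, \tag{9.69}$$
for $i \in \{1,2,3\}$. Performing the indicated maximization yields
$$u_i = \frac{\alpha_i^2}{4 (c_i^{\{i\}})^2} \bigl[ W^i_{x^i}(x^i) \bigr]^2 x^i, \quad \text{for } i \in \{1,2,3\}.$$
Substituting $u_i$ into (9.69) yields
$$rW^i(x^i) - \frac{(\sigma_i x^i)^2}{2} W^i_{x^i x^i}(x^i) = P_i (x^i)^{1/2} - \frac{\alpha_i^2}{4 c_i^{\{i\}}} \bigl[ W^i_{x^i}(x^i) \bigr]^2 x^i + \frac{\alpha_i^2}{2 c_i^{\{i\}}} \bigl[ W^i_{x^i}(x^i) \bigr]^2 x^i - \delta W^i_{x^i}(x^i)\, x^i,$$
for $i \in \{1,2,3\}$. Solving the above system of partial differential equations yields
$$W^i(x^i) = \bigl[ A_i^{\{i\}} (x^i)^{1/2} + C_i^{\{i\}} \bigr],$$
where
$$0 = \Bigl[ r + \frac{\delta}{2} + \frac{\sigma_i^2}{8} \Bigr] A_i^{\{i\}} - P_i, \quad rC_i^{\{i\}} = \frac{\alpha_i^2}{16 c_i^{\{i\}}} \bigl[ A_i^{\{i\}} \bigr]^2, \quad \text{for } i \in \{1,2,3\}. \tag{9.70}$$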
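The conditions in (9.70) give $A_i^{\{i\}}$ and $C_i^{\{i\}}$ in closed form, and the resulting value function can be checked against the Bellman equation (9.69). The sketch below does this numerically for one firm; the function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def stand_alone_coefficients(P, alpha, c, r, delta, sigma):
    """Solve the two conditions in (9.70) for A and C."""
    A = P / (r + delta / 2 + sigma ** 2 / 8)
    C = alpha ** 2 * A ** 2 / (16 * c * r)
    return A, C

def bellman_residual(x, P, alpha, c, r, delta, sigma):
    """Left side minus right side of (9.69) at state x, using W = A*sqrt(x) + C."""
    A, C = stand_alone_coefficients(P, alpha, c, r, delta, sigma)
    W = A * math.sqrt(x) + C
    Wx = 0.5 * A / math.sqrt(x)              # first derivative of W
    Wxx = -0.25 * A * x ** -1.5              # second derivative of W
    u = alpha ** 2 * Wx ** 2 * x / (4 * c ** 2)   # maximizing investment
    lhs = r * W - 0.5 * (sigma * x) ** 2 * Wxx
    rhs = P * math.sqrt(x) - c * u + Wx * (alpha * math.sqrt(u * x) - delta * x)
    return lhs - rhs

# Illustrative parameters only.
params = dict(P=5.0, alpha=1.0, c=2.0, r=0.05, delta=0.1, sigma=0.2)
residuals = [bellman_residual(x, **params) for x in (0.5, 4.0, 25.0)]
```

The residual vanishes up to rounding error at every state, and the maximizing investment $u_i$ turns out to be the constant $\alpha_i^2 [A_i^{\{i\}}]^2 / (16 (c_i^{\{i\}})^2)$, independent of $x^i$.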
9.6.1 Cost-Saving Joint Venture
Consider the case when all three firms agree to form a joint venture and share their expected joint profit proportionally to their expected noncooperative profits. The cost of control of firm $j$ under the joint venture becomes $c_j^{\{1,2,3\}} u_j(s)$, with the joint venture cost advantage
$$c_j^{\{1,2,3\}} \le c_j^{\{j\}}, \quad \text{for } j \in N. \tag{9.71}$$
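The expected profit of the joint venture is the sum of the participating firms' profits
$$E_t\Bigl\{ \int_{t}^{\infty} \sum_{j=1}^{3} \bigl[ P_j [x^j(s)]^{1/2} - c_j^{\{1,2,3\}} u_j(s) \bigr] \exp[-r(s - t)]\, ds \Bigr\}. \tag{9.72}$$
The firms in the joint venture then act cooperatively to maximize (9.72) subject to (9.71). Using Theorem A.6 in the Technical Appendixes, we obtain
$$\begin{aligned}
& rW^{\{1,2,3\}}(x^1, x^2, x^3) - \sum_{h,\zeta=1}^{3} \frac{(\sigma_h x^h)(\sigma_\zeta x^\zeta)}{2} W^{\{1,2,3\}}_{x^h x^\zeta}(x^1, x^2, x^3) \\
& \quad = \max_{u_1, u_2, u_3} \Bigl\{ \sum_{i=1}^{3} \bigl[ P_i (x^i)^{1/2} - c_i^{\{1,2,3\}} u_i \bigr] + \sum_{i=1}^{3} W^{\{1,2,3\}}_{x^i}(x^1, x^2, x^3) \bigl[ \alpha_i (u_i x^i)^{1/2} - \delta x^i \bigr] \Bigr\}.
\end{aligned} \tag{9.73}$$
Performing the indicated maximization yields
$$u_i = \frac{\alpha_i^2}{4 (c_i^{\{1,2,3\}})^2} \bigl[ W^{\{1,2,3\}}_{x^i}(x^1, x^2, x^3) \bigr]^2 x^i, \quad \text{for } i \in \{1,2,3\}. \tag{9.74}$$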
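Substituting (9.74) into (9.73) yields
$$\begin{aligned}
& rW^{\{1,2,3\}}(x^1, x^2, x^3) - \sum_{h,\zeta=1}^{3} \frac{(\sigma_h x^h)(\sigma_\zeta x^\zeta)}{2} W^{\{1,2,3\}}_{x^h x^\zeta}(x^1, x^2, x^3) \\
& \quad = \sum_{i=1}^{3} \Bigl[ P_i (x^i)^{1/2} - \frac{\alpha_i^2 x^i}{4 c_i^{\{1,2,3\}}} \bigl[ W^{\{1,2,3\}}_{x^i}(x^1, x^2, x^3) \bigr]^2 \Bigr] \\
& \qquad + \sum_{i=1}^{3} W^{\{1,2,3\}}_{x^i}(x^1, x^2, x^3) \Bigl[ \frac{\alpha_i^2}{2 c_i^{\{1,2,3\}}} W^{\{1,2,3\}}_{x^i}(x^1, x^2, x^3)\, x^i - \delta x^i \Bigr].
\end{aligned} \tag{9.75}$$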
Solving (9.75) yields
$$W^{\{1,2,3\}}(x^1, x^2, x^3) = \bigl[ A_1^{\{1,2,3\}} (x^1)^{1/2} + A_2^{\{1,2,3\}} (x^2)^{1/2} + A_3^{\{1,2,3\}} (x^3)^{1/2} + C^{\{1,2,3\}} \bigr], \tag{9.76}$$
where $A_1^{\{1,2,3\}}$, $A_2^{\{1,2,3\}}$, $A_3^{\{1,2,3\}}$, and $C^{\{1,2,3\}}$ satisfy
$$0 = \Bigl[ r + \frac{\delta}{2} + \frac{\sigma_i^2}{8} \Bigr] A_i^{\{1,2,3\}} - P_i, \quad \text{for } i, j, h \in \{1,2,3\} \text{ and } i \ne j \ne h,$$
$$rC^{\{1,2,3\}} = \sum_{i=1}^{3} \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \bigl[ A_i^{\{1,2,3\}} \bigr]^2. \tag{9.77}$$
The investment strategies of the grand coalition joint venture can be derived as
$$\psi_i^{\{1,2,3\}}(x) = \frac{\alpha_i^2}{16 (c_i^{\{1,2,3\}})^2} \bigl[ A_i^{\{1,2,3\}} \bigr]^2, \quad \text{for } i \in \{1,2,3\}. \tag{9.78}$$
The dynamics of the technological progress of the joint venture over the time interval $s \in [t_0, \infty)$ can be expressed as
$$dx^i(s) = \Bigl[ \frac{\alpha_i^2}{4 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}} [x^i(s)]^{1/2} - \delta x^i(s) \Bigr] ds + \sigma_i x^i(s)\, dz_i(s), \quad x^i(t_0) = x_0^i, \tag{9.79}$$
for $i \in \{1,2,3\}$. Taking the transformation $y^i(s) = x^i(s)^{1/2}$, for $i \in \{1,2,3\}$, equation system (9.79) can be expressed as
$$dy^i(s) = \Bigl[ \frac{\alpha_i^2}{8 c_i^{\{1,2,3\}}} A_i^{\{1,2,3\}} - \frac{\delta}{2} y^i(s) - \frac{\sigma_i^2}{8} y^i(s) \Bigr] ds + \frac{1}{2} \sigma_i y^i(s)\, dz_i(s), \quad y^i(t_0) = (x_0^i)^{1/2}, \tag{9.80}$$
for $i \in \{1,2,3\}$. Equation (9.80) is a system of linear stochastic differential equations that can be solved by standard techniques. Solving (9.80) yields the joint venture's state trajectory. Let $\{y^{1*}(t), y^{2*}(t), y^{3*}(t)\}$ denote the solution to (9.80). Transforming $x^i = (y^i)^2$, we obtain the state trajectories of the joint venture over the time interval $s \in [t_0, \infty)$ as
$$\{x^*(t)\}_{t=t_0}^{\infty} \equiv \{x^{1*}(t), x^{2*}(t), x^{3*}(t)\}_{t=t_0}^{\infty} = \bigl\{ [y^{1*}(t)]^2, [y^{2*}(t)]^2, [y^{3*}(t)]^2 \bigr\}_{t=t_0}^{\infty}. \tag{9.81}$$
We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time $t$, and the term $x_t^* \in X_t^*$ is used to denote an element in $X_t^*$.
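Because (9.80) is linear in $y^i$, its expected value obeys an ordinary differential equation with a closed-form solution, and individual paths are easy to simulate. The following sketch assumes an Euler-Maruyama discretization, which is one standard technique and not necessarily the one the authors have in mind. Here `a` stands for the constant drift term $\alpha_i^2 A_i^{\{1,2,3\}} / (8 c_i^{\{1,2,3\}})$ and `kappa` for $\delta/2 + \sigma_i^2/8$; both names and all numbers are hypothetical.

```python
import math
import random

def mean_y(t, t0, y0, a, kappa):
    """E[y(t)]: solves dm/dt = a - kappa*m with m(t0) = y0."""
    return a / kappa + (y0 - a / kappa) * math.exp(-kappa * (t - t0))

def simulate_y(t0, t1, y0, a, kappa, sigma, steps=1000, seed=0):
    """One Euler-Maruyama path of dy = (a - kappa*y) ds + 0.5*sigma*y dz."""
    rng = random.Random(seed)
    h = (t1 - t0) / steps
    y = y0
    path = [y]
    for _ in range(steps):
        dz = rng.gauss(0.0, math.sqrt(h))   # Wiener increment over a step of length h
        y += (a - kappa * y) * h + 0.5 * sigma * y * dz
        path.append(y)
    return path

# Illustrative values: delta = 0.1 and sigma = 0.4 give kappa = 0.05 + 0.02.
a, kappa, sigma, y0 = 0.8, 0.07, 0.4, 5.0
y_path = simulate_y(0.0, 5.0, y0, a, kappa, sigma)
x_path = [y ** 2 for y in y_path]           # technology level x = y^2, as in (9.81)
```

Setting `sigma=0` in `simulate_y` while keeping `kappa` fixed reproduces `mean_y` up to discretization error, which is a convenient sanity check on the drift term.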
9.6.2 Subgame Consistent Venture Profit Sharing
If the firms agree to share their expected joint profit proportionally to their expected noncooperative profits, the imputation scheme has to fulfill the following condition.
Condition 9.6 An imputation
$$\xi^{(\tau)i}(\tau, x_\tau^*) = \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{3} \hat{V}^j(x_\tau^{j*})}\, W^{\{1,2,3\}}(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \tag{9.82}$$
is assigned to firm $i$, for $i \in \{1,2,3\}$, at time $\tau$ when the state is $x_\tau^* \in X_\tau^*$.
To formulate a payoff distribution procedure over time so that the agreed imputations in Condition 9.6 are satisfied we invoke Corollary 9.6 and obtain the following.
Corollary 9.9 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$ when $x^*(\tau) = x_\tau^* \in X_\tau^*$:
$$\begin{aligned}
B_i(\tau, x_\tau^*) = {} & r\, \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{3} \hat{V}^j(x_\tau^{j*})}\, W^{\{1,2,3\}}(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \\
& - \frac{1}{2} \sum_{h,\zeta=1}^{3} \sigma_h x_\tau^{h*} \sigma_\zeta x_\tau^{\zeta*} \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} \Bigl[ \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{3} \hat{V}^j(x_\tau^{j*})}\, W^{\{1,2,3\}}(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \Bigr] \\
& - \sum_{h=1}^{3} \frac{\partial}{\partial x_\tau^{h*}} \Bigl[ \frac{\hat{V}^i(x_\tau^{i*})}{\sum_{j=1}^{3} \hat{V}^j(x_\tau^{j*})}\, W^{\{1,2,3\}}(x_\tau^{1*}, x_\tau^{2*}, x_\tau^{3*}) \Bigr] \Bigl[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*} \Bigr],
\end{aligned} \tag{9.83}$$
for $i \in \{1,2,3\}$, will lead to the realization of the imputation in Condition 9.6.
A subgame consistent solution can be readily obtained using (9.78) and (9.83). Using the cooperative strategies, the instantaneous receipt of firm $i$ at time instant $\tau$ given $x^*(\tau) = x_\tau^* \in X_\tau^*$ is
$$\zeta_i(\tau, x_\tau^*) = P_i (x_\tau^{i*})^{1/2} - \frac{\alpha_i^2}{16 c_i^{\{1,2,3\}}} \bigl[ A_i^{\{1,2,3\}} \bigr]^2, \tag{9.84}$$
for $i \in \{1,2,3\}$ along the cooperative path $\{x^*(t)\}_{t=t_0}^{\infty}$.
According to (9.83), the instantaneous payment that firm $i$ should receive under the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$, for $i \in \{1,2,3\}$. Hence an instantaneous transfer payment
$$\chi^i(\tau, x_\tau^*) = B_i(\tau, x_\tau^*) - \zeta_i(\tau, x_\tau^*) \tag{9.85}$$
has to be given or charged to firm $i$ at time $\tau$ given $x^*(\tau) = x_\tau^* \in X_\tau^*$, for $i \in \{1,2,3\}$ along the cooperative path $\{x^*(t)\}_{t=t_0}^{\infty}$.
9.6.3 Shapley Value Solution
Consider the case when the participating firms agree to share their expected cooperative profits according to the Shapley Value. For the computation of the dynamic Shapley Value, we consider cases when two of the firms form a coalition $\{i, j\} \subset \{1,2,3\}$. The cost savings in the joint venture are as follows:
$$\begin{aligned}
c_i^{\{i,j\}} &\le c_i^{\{i\}}, \quad \text{for } i, j \in \{1,2,3\} \text{ and } i \ne j, \\
c_i^{\{i,j,k\}} &\le c_i^{\{i,j\}}, \quad \text{for } i, j, k \in \{1,2,3\} \text{ and } i \ne j \ne k.
\end{aligned} \tag{9.86}$$
Coalition $\{i, j\}$ would maximize the expected joint profit
$$E_t\Bigl\{ \int_{t}^{\infty} \bigl[ P_i [x^i(s)]^{1/2} - c_i^{\{i,j\}} u_i(s) + P_j [x^j(s)]^{1/2} - c_j^{\{i,j\}} u_j(s) \bigr] \exp[-r(s - t)]\, ds \Bigr\}, \tag{9.87}$$
subject to (9.68). Following the above analysis, we obtain the following value functions:
$$W^{\{i,j\}}(x^i, x^j) = \bigl[ A_i^{\{i,j\}} (x^i)^{1/2} + A_j^{\{i,j\}} (x^j)^{1/2} + C^{\{i,j\}} \bigr], \tag{9.88}$$
for $i, j \in \{1,2,3\}$ and $i \ne j$, where $A_i^{\{i,j\}}$, $A_j^{\{i,j\}}$, and $C^{\{i,j\}}$ satisfy
$$0 = \Bigl[ r + \frac{\delta}{2} + \frac{\sigma_i^2}{8} \Bigr] A_i^{\{i,j\}} - P_i, \quad \text{for } i, j \in \{1,2,3\} \text{ and } i \ne j,$$
and
$$rC^{\{i,j\}} = \sum_{h \in \{i,j\}} \frac{\alpha_h^2}{16 c_h^{\{i,j\}}} \bigl[ A_h^{\{i,j\}} \bigr]^2.$$
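Taken together, (9.70), (9.76)-(9.77), and (9.88) give every coalition value $W^K$ in the same form, so the superadditivity required for the Shapley Value can be checked directly. The sketch below is an illustration under invented data: the parameter tables are hypothetical, and the unit cost is assumed to depend only on coalition size, which is one simple way to satisfy (9.86).

```python
from itertools import combinations
import math

R, DELTA = 0.05, 0.1                        # hypothetical discount and depreciation rates
P = {1: 5.0, 2: 4.0, 3: 3.0}                # hypothetical revenue coefficients P_h
ALPHA = {1: 1.0, 2: 0.8, 3: 1.2}
SIGMA = {1: 0.2, 2: 0.3, 3: 0.25}
COST_BY_SIZE = {1: 2.0, 2: 1.5, 3: 1.0}     # assumed c_h^K, falling with coalition size

def coalition_value(K, x):
    """W^K(x^K): sum of A_h * sqrt(x^h) over members, plus the constant C^K."""
    K = tuple(K)
    c = COST_BY_SIZE[len(K)]
    total = 0.0
    for h in K:
        A = P[h] / (R + DELTA / 2 + SIGMA[h] ** 2 / 8)    # same for every coalition
        total += A * math.sqrt(x[h])
        total += ALPHA[h] ** 2 * A ** 2 / (16 * c * R)    # member h's part of C^K
    return total

def is_superadditive(players, x):
    """Check W^K >= W^L + W^(K minus L) for every split of every coalition K."""
    for size in range(2, len(players) + 1):
        for K in combinations(players, size):
            for lsize in range(1, size):
                for L in combinations(K, lsize):
                    rest = tuple(p for p in K if p not in L)
                    if coalition_value(K, x) < coalition_value(L, x) + coalition_value(rest, x):
                        return False
    return True

x = {1: 30.0, 2: 20.0, 3: 25.0}             # hypothetical technology levels
grand = coalition_value((1, 2, 3), x)
```

Since $A_h^K$ is the same in every coalition, the state-dependent parts of $W^K$ cancel when coalitions are compared; in this example the gain from cooperation comes entirely from the constant $C^K$, which rises as the unit cost $c_h^K$ falls.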
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy the Shapley Value in Condition 9.5 we invoke Corollary 9.8 and obtain the following.
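Corollary 9.10 A PDP with an instantaneous payment at time $\tau \in [t_0, \infty)$:
$$\begin{aligned}
B_i(\tau, x_\tau^*) = -\sum_{K \subseteq \{1,2,3\}} \frac{(k-1)!(3-k)!}{3!} \Bigl\{ & \bigl[ rW^{K \setminus i}(x_\tau^{K \setminus i*}) - rW^K(x_\tau^{K*}) \bigr] \\
& + \sum_{h \in K} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^K(x_\tau^{K*}) \Bigr] \Bigl[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*} \Bigr] \\
& - \sum_{h \in K \setminus i} \Bigl[ \frac{\partial}{\partial x_\tau^{h*}} W^{K \setminus i}(x_\tau^{K \setminus i*}) \Bigr] \Bigl[ \frac{\alpha_h^2}{4 c_h^{\{1,2,3\}}} A_h^{\{1,2,3\}} (x_\tau^{h*})^{1/2} - \delta x_\tau^{h*} \Bigr] \\
& + \frac{1}{2} \sum_{h,\zeta \in K} \sigma_h x_\tau^{h*} \sigma_\zeta x_\tau^{\zeta*} \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^K(x_\tau^{K*}) \\
& - \frac{1}{2} \sum_{h,\zeta \in K \setminus i} \sigma_h x_\tau^{h*} \sigma_\zeta x_\tau^{\zeta*} \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^{K \setminus i}(x_\tau^{K \setminus i*}) \Bigr\},
\end{aligned} \tag{9.89}$$
for $i \in \{1,2,3\}$, would lead to the realization of the Shapley Value in Condition 9.5. In particular, $W^K(x_\tau^{K*})$ is given in (9.70), (9.76), and (9.88), and
$$\frac{\partial}{\partial x_\tau^{h*}} W^K(x_\tau^{K*}) = \frac{1}{2} A_h^K (x_\tau^{h*})^{-1/2}, \quad \text{for } h \in K \subseteq \{1,2,3\},$$
$$\frac{\partial^2}{\partial (x_\tau^{h*})^2} W^K(x_\tau^{K*}) = -\frac{1}{4} A_h^K (x_\tau^{h*})^{-3/2}, \quad \text{and} \quad \frac{\partial^2}{\partial x_\tau^{h*} \partial x_\tau^{\zeta*}} W^K(x_\tau^{K*}) = 0, \quad \text{for } h \ne \zeta.$$
A subgame consistent solution can be obtained using (9.78) and (9.89).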
9.7 Exercises
9.1 Consider the case when there are three companies involved in a joint venture. The planning period is $[0, 2]$. We use $x^i(s)$ to denote the level of technology of company $i$ at time $s \in [0, 2]$, and $u_i(s) \subset R^+$ is its physical investment in technological advancement. The increments of the levels of technology are subject to stochastic disturbances. The discount rate is 0.05. The salvage values of the firms' technologies are $4[x^1(2)]^{1/2}$, $3[x^2(2)]^{1/2}$, and $1.5[x^3(2)]^{1/2}$. If the companies act independently, the costs of the physical investment of these three firms are, respectively, $2u_1(s)$, $3u_2(s)$, and $1.5u_3(s)$.
The expected profits for companies 1, 2, and 3 are, respectively,
$$E\Bigl\{ \int_0^2 \bigl[ 10 [x^1(s)]^{1/2} - 2u_1(s) \bigr] \exp(-0.05s)\, ds + \exp[-0.05(4)]\, 4 [x^1(2)]^{1/2} \Bigr\},$$
$$E\Bigl\{ \int_0^2 \bigl[ 8 [x^2(s)]^{1/2} - 3u_2(s) \bigr] \exp(-0.05s)\, ds + \exp[-0.05(4)]\, 3 [x^2(2)]^{1/2} \Bigr\},$$
and
$$E\Bigl\{ \int_0^2 \bigl[ 12 [x^3(s)]^{1/2} - 1.5u_3(s) \bigr] \exp(-0.05s)\, ds + \exp[-0.05(4)]\, 1.5 [x^3(2)]^{1/2} \Bigr\}.$$
The evolution of the technology level of company $i \in \{1,2,3\}$ follows a system of stochastic dynamics
$$dx^1(s) = \bigl[ 4 [u_1(s) x^1(s)]^{1/2} - 0.1 x^1(s) \bigr] ds + 0.5 x^1(s)\, dz_1(s), \quad x^1(0) = 30,$$
$$dx^2(s) = \bigl[ 2 [u_2(s) x^2(s)]^{1/2} - 0.08 x^2(s) \bigr] ds + 0.8 x^2(s)\, dz_2(s), \quad x^2(0) = 20,$$
and
$$dx^3(s) = \bigl[ 3 [u_3(s) x^3(s)]^{1/2} - 0.05 x^3(s) \bigr] ds + 0.75 x^3(s)\, dz_3(s), \quad x^3(0) = 25,$$
where $z_1(s)$, $z_2(s)$, and $z_3(s)$ are independent Wiener processes. Compute a Nash equilibrium solution when these three firms act independently.
9.2 Consider the case when these three companies form a joint venture. The participating firms in a coalition can gain core skills and technology from each other. In particular, they can obtain cost reductions, with an absolute joint venture cost advantage. With this cost advantage, the cost of the investment of firm $j \in \{1,2,3\}$ under the joint venture becomes $c_j^{\{1,2,3\}} u_j(s)$, where $c_1^{\{1,2,3\}} = 1$, $c_2^{\{1,2,3\}} = 1.5$, and $c_3^{\{1,2,3\}} = 0.8$. If the joint venture firms agree to maximize their expected joint profit and share the excess gain equally, characterize a subgame consistent solution.
9.3 Consider the joint venture in Exercise 9.2. In particular, the firms would like to share the expected venture profit according to the Shapley Value. The costs under joint ventures in different coalitions $K \subseteq \{1,2,3\}$ are
$$c_1^{\{1,2,3\}} = 1, \quad c_2^{\{1,2,3\}} = 1.5, \quad \text{and} \quad c_3^{\{1,2,3\}} = 0.8;$$
$$c_1^{\{1,2\}} = 1.5 \quad \text{and} \quad c_2^{\{1,2\}} = 2.5;$$
$$c_1^{\{1,3\}} = 1.4 \quad \text{and} \quad c_3^{\{1,3\}} = 1;$$
$$c_2^{\{2,3\}} = 2 \quad \text{and} \quad c_3^{\{2,3\}} = 1.1; \quad \text{and}$$
$$c_1^{\{1\}} = 2, \quad c_2^{\{2\}} = 3, \quad c_3^{\{3\}} = 1.5.$$
Characterize a subgame consistent solution.
Chapter 10
Collaborative Environmental Management Under Uncertainty
In this chapter, we introduce stochastic elements in collaborative environmental management. Similar to the deterministic analysis in Chap. 6, the industrial sector is characterized by an international trading zone involving n nations or regions. Each government adopts its own abatement policy and tax scheme to reduce pollution. The governments have to promote business interests and at the same time have to handle the financing of the costs brought about by pollution. The industrial sectors remain competitive among themselves while the governments cooperate in pollution abatement. Industrial production creates two types of negative environmental externalities. First, pollutants emitted via industrial production cause short-term local impacts on areas neighboring the origin of production. Examples of these short-term local impacts include passing-by waste in waterways, wind-driven suspended particles in the air, unpleasant odor, noise, dust, and heat generated in the production processes. Second, the emitted pollutants add to the existing pollution stock in the environment and produce long-term impacts on extensive and far-away areas. Greenhouse gases, CFCs, and atmospheric particulates are examples of this form of negative environmental externality. This specification permits the proximity of the origin of industrial production to receive heavier environmental damages as production increases. Given these neighboring impacts, the individual government tax policy has to take into consideration the tax policies of other nations and these policies' intricate effects on outputs and environmental effects. In particular, while designing tax policies to curtail their outputs, governments have to consider the inducement to neighboring nations' output that can cause local negative environmental impacts to themselves.
To incorporate the widely observed uncertainty in nature's capability to replenish the environment, this chapter adopts stochastic pollution stock dynamics and formulates a cooperative stochastic differential game of transboundary industrial pollution. The number of solvable cooperative stochastic differential games so far remains low because of the difficulties in deriving tractable solutions (examples include Haurie et al. 1994; Yeung and Petrosyan 2004, 2005, 2006a). The stringent condition of subgame consistency is required for a dynamically stable cooperative solution in stochastic differential games.
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_10, © Springer Science+Business Media, LLC 2012
An analytical framework is constructed in Sect. 10.1 and its noncooperative outcome is characterized in Sect. 10.2. Cooperative arrangement and subgame consistent collaborative environmental schemes are presented in the next two sections. A stochastic industrial pollution model is presented and a subgame consistent collaboration management scheme is explicitly characterized in Sects. 10.5 and 10.6.
10.1 An Analytical Framework In this section we present an analytical framework to study transboundary industrial pollution management under uncertainty.
10.1.1 The Industrial Sector
Following Chap. 6 we consider a global economy that is comprised of $n$ nations. At time instant $s$ the demand system of the outputs of the nations is
$$P_i(s) = f^i[q_1(s), q_2(s), \ldots, q_n(s), s], \quad i \in N \equiv \{1, 2, \ldots, n\}, \tag{10.1}$$
where $P_i(s)$ is the price vector of the output vector of nation $i$ and $q_j(s)$ is the output of nation $j$. The demand system in (10.1) shows that the world economy is a form of generalized differentiated products oligopoly. Industrial profits of nation $i$ at time $s$ can be expressed as
$$f^i[q_1(s), q_2(s), \ldots, q_n(s), s]\, q_i(s) - c_i[q_i(s), v_i(s)], \quad \text{for } i \in N, \tag{10.2}$$
where $v_i(s)$ is the set of environmental policy instruments of government $i$. Policy instruments may include tools like taxes, subsidies, technology choices, and pollution legislation. The cost of producing $q_i(s)$ under policy $v_i(s)$ is $c_i[q_i(s), v_i(s)]$. Profit maximization by the industrial sectors yields
$$f^i[q_1(s), q_2(s), \ldots, q_n(s), s] + f^i_{q_i}[q_1(s), q_2(s), \ldots, q_n(s), s]\, q_i(s) - c^i_{q_i}[q_i(s), v_i(s)] = 0, \quad \text{for } i \in N. \tag{10.3}$$
Nation $i$'s instantaneous market equilibrium output can be expressed as
$$q_i^*(s) = \hat{q}^i[v_1(s), v_2(s), \ldots, v_n(s), s] \equiv \hat{q}^i[v(s), s], \quad \text{for } i \in N. \tag{10.4}$$
The fact that each nation's output decision depends on government environmental policies is reflected in (10.4).
10.1.2 Impacts of Pollution and Stochastic Accumulation Dynamics
Industrial production emits pollutants into the environment and the amount of pollution created by different nations’ outputs may be different. For an output of $q_i(s)$ produced by nation i, there will be an instantaneous damaging environmental impact of $\varepsilon_i^i[q_i(s)]$ on nation i itself and a damaging impact of $\varepsilon_j^i[q_i(s)]$ on its adjacent nation j for $j \in K^i$. On the other hand, nation i will receive instantaneous damaging environmental impacts from its adjacent nations measured as $\varepsilon_i^j[q_j(s)]$ for $j \in \bar K^i$. This type of externality is typical in spatial environmental impacts. Moreover, the pollutant will then add to the stock of existing pollution. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s) \in R^m$ denote the level of pollution at time s. The dynamics of the pollution stock is governed by the stochastic differential equation

$$dx(s) = \Big[\sum_{j=1}^n a_j\big[q_j(s), v_j(s)\big] - \sum_{j=1}^n b_j\big[u_j(s), x(s)\big] - \delta\big[x(s)\big]\, x(s)\Big]\, ds + \sigma\big[s, x(s)\big]\, dz(s), \quad x(t_0) = x_{t_0}, \tag{10.5}$$

where $\sigma[s, x(s)]$ is a scaling function, $z(s)$ is a Wiener process, $\Omega[s, x(s)] = \sigma[s, x(s)]\,\sigma[s, x(s)]^{T}$ is the covariance matrix with its element in row h and column ζ denoted by $\Omega^{h\zeta}[s, x(s)]$, $a_j[q_j(s), v_j(s)]$ is the amount of pollution created by the $q_j(s)$ amount of output produced under policy $v_j(s)$, $u_j(s)$ is the pollution abatement effort of nation j, $b_j[u_j(s), x(s)]$ is the amount of pollution removed by the $u_j(s)$ units of abatement effort of nation j, and $\delta[x(s)]$ is the natural rate of decay of the pollutants.
10.1.3 The Governments’ Objectives
The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. In particular, each government maximizes the net gains in the industrial sector plus tax revenue minus expenditures on pollution abatement and damages from pollution. A lump-sum income tax is levied on the industrial sector to balance the government budget. The last item turns out to be a net transfer between the government and the public (with no effect on industrial output). The instantaneous objective of government i at time s can be expressed as

$$f^i\big[q_1(s), \ldots, q_n(s), s\big]\, q_i(s) - c_i\big[q_i(s), v_i(s)\big] - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] - \varepsilon_i^i\big[q_i(s)\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[q_j(s)\big] - h_i\big[x(s)\big], \quad i \in N, \tag{10.6}$$

where $c_i^P[v_i(s)]$ is the cost of implementing the vector policy instrument $v_i(s)$, $c_i^a[u_i(s)]$ is the cost of employing a $u_i$ amount of pollution abatement effort, and $h_i[x(s)]$ is the value of damage to country i from an $x(s)$ amount of pollution. The governments’ planning horizon is $[t_0, T]$. It is possible that T may be very large. The discount rate is r. At time T, the terminal appraisal of pollution damage is $g^i[x(T)]$, where $\partial g^i/\partial x < 0$. Each one of the n governments seeks to maximize the integral of its expected instantaneous objective in (10.6) over the planning horizon subject to the pollution dynamics of (10.5) with controls on the level of abatement effort and output tax. Substituting $q_i(s)$, for $i \in N$, from (10.4) into (10.5) and (10.6), one obtains a stochastic differential game in which government $i \in N$ seeks to

$$\max_{v_i(s),\, u_i(s)} E_{t_0}\bigg\{ \int_{t_0}^{T} \Big[ f^i\big[\hat q^1[v(s), s], \hat q^2[v(s), s], \ldots, \hat q^n[v(s), s], s\big]\, \hat q^i[v(s), s] - c_i\big[\hat q^i[v(s), s], v_i(s)\big] - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] - \varepsilon_i^i\big[\hat q^i[v(s), s]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[v(s), s]\big] - h_i\big[x(s)\big] \Big]\, e^{-r(s - t_0)}\, ds + g^i\big[x(T)\big]\, e^{-r(T - t_0)} \bigg\}, \tag{10.7}$$

subject to

$$dx(s) = \Big[\sum_{j=1}^n a_j\big[\hat q^j[v(s), s], v_j(s)\big] - \sum_{j=1}^n b_j\big[u_j(s), x(s)\big] - \delta\big[x(s)\big]\, x(s)\Big]\, ds + \sigma\big[s, x(s)\big]\, dz(s), \quad x(t_0) = x_{t_0}. \tag{10.8}$$
Thus the economic interactions among nations in industrial production, pollution emission, and abatement are characterized as a stochastic differential game with the expected payoffs in (10.7) and the stochastic pollution dynamics in (10.8).
10.2 Noncooperative Outcomes In this section we discuss the solution to the stochastic differential game in (10.7) and (10.8). Since the payoffs of nations are measured in monetary terms, the game is a transferable payoff game. Under a noncooperative framework, a Nash equilibrium solution can be characterized as the following (see Theorem 2.5 in Chap. 2).
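Corollary 10.1 A set of feedback strategies $\{u_i^*(t) = \mu_i(t, x),\ v_i^*(t) = \phi_i(t, x),\ \text{for } i \in N\}$ provides a feedback Nash equilibrium solution to the game in (10.7) and (10.8) if there exist suitably smooth functions $V^{(t_0)i}(t, x) : [t_0, T] \times R^m \to R$, $i \in N$, satisfying the following partial differential equations:

$$\begin{aligned}
-V_t^{(t_0)i}(t, x) &- \frac{1}{2} \sum_{h, \zeta = 1}^{m} \Omega^{h\zeta}(t, x)\, V^{(t_0)i}_{x^h x^\zeta}(t, x) \\
= \max_{v_i,\, u_i} \bigg\{ & \Big[ f^i\big[\hat q^1[v_i, \phi_{\neq i}(t, x), t], \ldots, \hat q^n[v_i, \phi_{\neq i}(t, x), t], t\big]\, \hat q^i[v_i, \phi_{\neq i}(t, x), t] \\
& - c_i\big[\hat q^i[v_i, \phi_{\neq i}(t, x), t], v_i\big] - c_i^P[v_i] - c_i^a[u_i] - \varepsilon_i^i\big[\hat q^i[v_i, \phi_{\neq i}(t, x), t]\big] \\
& - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[v_i, \phi_{\neq i}(t, x), t]\big] - h_i(x) \Big]\, e^{-r(t - t_0)} \\
& + V_x^{(t_0)i}(t, x) \Big[ \sum_{j=1}^n a_j\big[\hat q^j[v_i, \phi_{\neq i}(t, x), t], v_j\big] - b_i(u_i, x) - \sum_{\substack{j=1 \\ j \neq i}}^{n} b_j\big[\mu_j(t, x), x\big] - \delta(x)\, x \Big] \bigg\},
\end{aligned} \tag{10.9}$$

$$V^{(t_0)i}(T, x) = g^i[x]\, e^{-r(T - t_0)}, \tag{10.10}$$

where $\phi_{\neq i}(t, x) = [\phi_1(t, x), \phi_2(t, x), \ldots, \phi_{i-1}(t, x), \phi_{i+1}(t, x), \ldots, \phi_n(t, x)]$ and, in the pollution term of (10.9), $v_j$ stands for $\phi_j(t, x)$ when $j \neq i$. In a prevailing Nash equilibrium the function $V^{(t_0)i}(t, x)$ is then the integral

$$\begin{aligned}
E_{t_0}\bigg\{ \int_t^T \Big[ & f^i\big[\hat q^1[\phi(s, x(s)), s], \ldots, \hat q^n[\phi(s, x(s)), s], s\big]\, \hat q^i[\phi(s, x(s)), s] - c_i\big[\hat q^i[\phi(s, x(s)), s], \phi_i(s, x(s))\big] \\
& - c_i^P\big[\phi_i(s, x(s))\big] - c_i^a\big[\mu_i(s, x(s))\big] - \varepsilon_i^i\big[\hat q^i[\phi(s, x(s)), s]\big] \\
& - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[\phi(s, x(s)), s]\big] - h_i\big[x(s)\big] \Big]\, e^{-r(s - t_0)}\, ds + g^i\big[x(T)\big]\, e^{-r(T - t_0)} \;\bigg|\; x(t) = x \in X \bigg\}, \quad \text{for } i \in N.
\end{aligned} \tag{10.11}$$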
The game equilibrium dynamics then becomes

$$dx(s) = \Big[\sum_{j=1}^n a_j\big[\hat q^j[\phi(s, x(s)), s], \phi_j(s, x(s))\big] - \sum_{j=1}^n b_j\big[\mu_j(s, x(s)), x(s)\big] - \delta\big[x(s)\big]\, x(s)\Big]\, ds + \sigma\big[s, x(s)\big]\, dz(s), \quad x(t_0) = x_{t_0}. \tag{10.12}$$

Remark 10.1 One can readily verify that $V^{(\tau)i}(t, x_t) = V^{(t_0)i}(t, x_t)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$, is the value function to nation i at time $t \in [\tau, T]$ when the state is $x(t) = x_t$ in the game in (10.7) and (10.8) that starts at time τ.
10.3 Cooperative Arrangement Now consider the case when all the nations agree to cooperate and adhere to an optimality principle. Since the nations are asymmetric and the number of nations may be large, a reasonable solution optimality principle for gain distribution is to share the expected gain from cooperation proportional to the nations’ relative sizes of expected noncooperative payoffs. Group optimality requires the nations to maximize their joint payoff. Hence the optimality principle entails (i) group optimality, and (ii) sharing the expected gain from cooperation proportional to the nations’ relative sizes of expected noncooperative payoffs. For the cooperative scheme to be sustainable the agreed-upon optimality has to be upheld throughout the game horizon.
10.3.1 Group Optimality and Cooperative State Trajectory
Consider the collaborative environmental scheme with the participating nations’ expected payoff structure in (10.7) and pollution dynamics in (10.8). To secure group optimality the participating nations seek to maximize their expected joint payoff by solving the following stochastic control problem:

$$\begin{aligned}
\max_{v_1, \ldots, v_n;\ u_1, \ldots, u_n} E_{t_0}\bigg\{ \int_{t_0}^{T} \sum_{i=1}^n \Big[ & f^i\big[\hat q^1[v(s), s], \ldots, \hat q^n[v(s), s], s\big]\, \hat q^i[v(s), s] - c_i\big[\hat q^i[v(s), s], v_i(s)\big] \\
& - c_i^P\big[v_i(s)\big] - c_i^a\big[u_i(s)\big] - \varepsilon_i^i\big[\hat q^i[v(s), s]\big] \\
& - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[v(s), s]\big] - h_i\big[x(s)\big] \Big]\, e^{-r(s - t_0)}\, ds + \sum_{i=1}^n g^i\big[x(T)\big]\, e^{-r(T - t_0)} \bigg\},
\end{aligned} \tag{10.13}$$
subject to (10.8). Invoking Theorem A.5 in the Technical Appendixes, a set of controls $\{[v_i^{**}(t), u_i^{**}(t)] = [\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N\}$ constitutes an optimal solution to the stochastic control problem in (10.13) and (10.8) if there exists a continuously twice differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R^m \to R$ satisfying the following partial differential equations:

$$\begin{aligned}
-W_t^{(t_0)}(t, x) &- \frac{1}{2} \sum_{h, \zeta = 1}^{m} \Omega^{h\zeta}(t, x)\, W^{(t_0)}_{x^h x^\zeta}(t, x) \\
= \max_{v_1, \ldots, v_n;\ u_1, \ldots, u_n} \bigg\{ & \sum_{i=1}^n \Big[ f^i\big[\hat q^1(v, t), \ldots, \hat q^n(v, t), t\big]\, \hat q^i(v, t) - c_i\big[\hat q^i(v, t), v_i\big] - c_i^P(v_i) - c_i^a(u_i) \\
& - \varepsilon_i^i\big[\hat q^i(v, t)\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j(v, t)\big] - h_i(x) \Big]\, e^{-r(t - t_0)} \\
& + W_x^{(t_0)}(t, x) \Big[ \sum_{j=1}^n a_j\big[\hat q^j(v, t), v_j\big] - \sum_{j=1}^n b_j(u_j, x) - \delta(x)\, x \Big] \bigg\},
\end{aligned} \tag{10.14}$$

and

$$W^{(t_0)}(T, x) = \sum_{i=1}^n g^i(x)\, e^{-r(T - t_0)}.$$
Hence the nations will adopt the cooperative control $\{[\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N \text{ and } t \in [t_0, T]\}$. The optimal trajectory under cooperation becomes

$$dx(s) = \Big[\sum_{j=1}^n a_j\big[\hat q^j[\psi(s, x(s)), s], \psi_j(s, x(s))\big] - \sum_{j=1}^n b_j\big[\varpi_j(s, x(s)), x(s)\big] - \delta\big[x(s)\big]\, x(s)\Big]\, ds + \sigma\big[s, x(s)\big]\, dz(s), \quad x(t_0) = x_{t_0}. \tag{10.15}$$
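The solution to (10.15) can be expressed as

$$x^*(t) = x_0 + \int_{t_0}^{t} \Big[\sum_{j=1}^n a_j\big[\hat q^j[\psi(s, x^*(s)), s], \psi_j(s, x^*(s))\big] - \sum_{j=1}^n b_j\big[\varpi_j(s, x^*(s)), x^*(s)\big] - \delta\big[x^*(s)\big]\, x^*(s)\Big]\, ds + \int_{t_0}^{t} \sigma\big[s, x^*(s)\big]\, dz(s). \tag{10.16}$$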
We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time t generated by (10.16). The term $x_t^*$ is used to denote an element in the set $X_t^*$. The cooperative control for the game $\Gamma_c(x_0, T - t_0)$ over the time interval $[t_0, T]$ can be expressed more precisely as

$$\big\{\psi_i\big(t, x^*(t)\big),\ \varpi_i\big(t, x^*(t)\big),\ \text{for } t \in [t_0, T] \text{ and } i \in N\big\}. \tag{10.17}$$

Note that for group optimality to be achievable, the cooperative controls in (10.17) must be exercised throughout the time interval $[t_0, T]$. The value function $W^{(t_0)}(t, x)$ reflects

$$\begin{aligned}
E_{t_0}\bigg\{ \int_t^T \sum_{i=1}^n \Big[ & f^i\big[\hat q^1[\psi(s, x^*(s)), s], \ldots, \hat q^n[\psi(s, x^*(s)), s], s\big]\, \hat q^i[\psi(s, x^*(s)), s] \\
& - c_i\big[\hat q^i[\psi(s, x^*(s)), s], \psi_i(s, x^*(s))\big] - c_i^P\big[\psi_i(s, x^*(s))\big] - c_i^a\big[\varpi_i(s, x^*(s))\big] \\
& - \varepsilon_i^i\big[\hat q^i[\psi(s, x^*(s)), s]\big] - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[\psi(s, x^*(s)), s]\big] - h_i\big[x^*(s)\big] \Big]\, e^{-r(s - t_0)}\, ds \\
& + \sum_{i=1}^n g^i\big[x^*(T)\big]\, e^{-r(T - t_0)} \bigg\},
\end{aligned} \tag{10.18}$$

where $\psi(s, x^*(s)) = \{\psi_1(s, x^*(s)), \psi_2(s, x^*(s)), \ldots, \psi_n(s, x^*(s))\}$.

Remark 10.2 One can readily verify that $W^{(\tau)}(t, x_t^*) = W^{(t_0)}(t, x_t^*)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$, is the value function at time $t \in [\tau, T]$ of the control problem in (10.8) and (10.13) that starts at time τ with $x(t) = x_t^* \in X_t^*$.
10.3.2 Imputation Scheme
The agreed-upon optimality principle must be maintained at every instant of time within the cooperative duration $[t_0, T]$ given any realizable state $x_\tau^* \in X_\tau^*$ generated by the cooperative trajectory in (10.16). For $\tau \in [t_0, T]$, let $\xi^{(\tau)i}(\tau, x_\tau^*)$ denote the solution imputation (payoff under cooperation) over the period $[\tau, T]$ to nation $i \in N$ given that the state is $x_\tau^* \in X_\tau^*$. Hence the imputation scheme $\{\xi^{(\tau)i}(\tau, x_\tau^*);\ \text{for } i \in N\}$ has to satisfy the following condition:

Condition 10.1

$$\xi^{(\tau)i}\big(\tau, x_\tau^*\big) = \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\; W^{(\tau)}\big(\tau, x_\tau^*\big), \tag{10.19}$$

for $i \in N$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.

The imputation scheme in Condition 10.1 satisfies individual rationality. Crucial to the analysis is the formulation of a payment distribution mechanism that would lead to the realization of Condition 10.1. This will be done in the next section.
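The sharing rule in (10.19) can be illustrated with a minimal Python sketch. The function name `proportional_imputation` and the payoff numbers are hypothetical and chosen only for illustration; the sketch assumes the noncooperative payoffs are positive and that the joint payoff is at least their sum, in which case each share is no smaller than the corresponding noncooperative payoff.

```python
from typing import List, Sequence


def proportional_imputation(V: Sequence[float], W: float) -> List[float]:
    """Split the joint payoff W in proportion to the noncooperative payoffs V[i],
    mirroring the form of (10.19) at one fixed (tau, x_tau*)."""
    total = sum(V)
    if total == 0:
        raise ValueError("noncooperative payoffs must not sum to zero")
    return [v / total * W for v in V]


# Hypothetical values of V^{(tau)i} and W^{(tau)} for three nations.
V = [40.0, 25.0, 15.0]
W = 100.0
xi = proportional_imputation(V, W)
```

With these illustrative numbers the shares add up to the joint payoff, and every nation receives at least its noncooperative payoff because W exceeds the sum of the V values.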
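10.4 Subgame Consistent Collaborative Environmental Management
To formulate a payoff distribution procedure over time so that the agreed imputations satisfy Condition 10.1 we invoke Theorem 8.1 in Chap. 8 and obtain the following.

Corollary 10.2 A distribution scheme with a terminal payment $-g^i[x_T^* - \bar x^i]$ at time T and an instantaneous payment at time $\tau \in [t_0, T]$

$$\begin{aligned}
B_i\big(\tau, x_\tau^*\big) = {} & -\frac{\partial}{\partial t}\bigg[ \frac{V^{(\tau)i}(t, x_t^*)}{\sum_{j=1}^n V^{(\tau)j}(t, x_t^*)}\, W^{(\tau)}\big(t, x_t^*\big) \bigg]\bigg|_{t = \tau} \\
& - \frac{\partial}{\partial x_\tau^*}\bigg[ \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\, W^{(\tau)}\big(\tau, x_\tau^*\big) \bigg] \\
& \quad \times \Big[ \sum_{j=1}^n a_j\big[\hat q^j[\psi(\tau, x_\tau^*), \tau], \psi_j(\tau, x_\tau^*)\big] - \sum_{j=1}^n b_j\big[\varpi_j(\tau, x_\tau^*), x_\tau^*\big] - \delta\big(x_\tau^*\big)\, x_\tau^* \Big] \\
& - \frac{1}{2} \sum_{h, \zeta = 1}^{m} \Omega^{h\zeta}\big(\tau, x_\tau^*\big)\, \frac{\partial^2}{\partial x_\tau^{h*}\, \partial x_\tau^{\zeta *}}\bigg[ \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\, W^{(\tau)}\big(\tau, x_\tau^*\big) \bigg], \quad \text{for } i \in N,
\end{aligned} \tag{10.20}$$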
will lead to the realization of the solution imputations $\xi^{(\tau)i}(\tau, x_\tau^*)$, for $i \in N$ and $\tau \in [t_0, T]$, satisfying Condition 10.1. Invoking Theorem 8.2 in Chap. 8, a subgame consistent solution is constructed with the group optimal strategies $[\psi(\tau, x_\tau^*), \varpi(\tau, x_\tau^*)]$, the imputation in Condition 10.1, and the distribution scheme $B(\tau, x_\tau^*)$ in (10.20). With nations using the cooperative environmental policy instruments $\psi(s, x^*(s))$ and pollution abatement efforts $\varpi(s, x^*(s))$, the instantaneous receipt of nation i at time instant τ given $x(\tau) = x_\tau^* \in X_\tau^*$ is

$$\begin{aligned}
\zeta_i\big(\tau, x_\tau^*\big) = {} & f^i\big[\hat q^1[\psi(\tau, x_\tau^*), \tau], \ldots, \hat q^n[\psi(\tau, x_\tau^*), \tau], \tau\big]\, \hat q^i[\psi(\tau, x_\tau^*), \tau] - c_i\big[\hat q^i[\psi(\tau, x_\tau^*), \tau], \psi_i(\tau, x_\tau^*)\big] \\
& - c_i^P\big[\psi_i(\tau, x_\tau^*)\big] - c_i^a\big[\varpi_i(\tau, x_\tau^*)\big] - \varepsilon_i^i\big[\hat q^i[\psi(\tau, x_\tau^*), \tau]\big] \\
& - \sum_{j \in \bar K^i} \varepsilon_i^j\big[\hat q^j[\psi(\tau, x_\tau^*), \tau]\big] - h_i\big(x_\tau^*\big),
\end{aligned} \tag{10.21}$$

for $\tau \in [t_0, T]$ and $i \in N$. According to Corollary 10.2, the instantaneous payment that nation i should receive under the solution to the agreed-upon optimality principle is $B_i(\tau, x_\tau^*)$, for $\tau \in [t_0, T]$ and $i \in N$, as stated in (10.20). Hence an instantaneous transfer payment

$$\chi^i\big(\tau, x_\tau^*\big) = B_i\big(\tau, x_\tau^*\big) - \zeta_i\big(\tau, x_\tau^*\big) \tag{10.22}$$

will be given to or charged to nation i at time τ, for $i \in N$, $x(\tau) = x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.
10.5 A Model of Stochastic Industrial Pollution Management This section presents a model of the collaborative industrial pollution management under uncertainty.
10.5.1 A Multinational Economy with Industrial Pollution We follow the multinational economy with n asymmetric nations or regions in Chap. 6. Industrial pollution is generated via the production process.
10.5.1.1 The Industrial Economy
The inverse demand function of the output of nation $i \in N$ at time instant s is

$$P_i(s) = \alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s), \tag{10.23}$$

where $P_i(s)$ is the price of the output of nation i, $q_j(s)$ is the output of nation j, and $\alpha^i$ and $\beta_j^i$ for $i \in N$ and $j \in N$ are positive constants. The output choice $q_j(s) \in [0, \bar q_j]$ is nonnegative and bounded by a maximum output constraint $\bar q_j$. The output price equals zero if the right-hand side of (10.23) becomes negative. Industrial profits of nation i at time s can be expressed as

$$\pi_i(s) = \Big[\alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s)\Big]\, q_i(s) - c_i\, q_i(s) - v_i(s)\, q_i(s), \quad \text{for } i \in N, \tag{10.24}$$

where $v_i(s) \geq 0$ is the tax rate imposed by government i on its industrial output at time s and $c_i$ is the unit cost of production. At each time instant s, the industrial sector of nation $i \in N$ seeks to maximize (10.24). The first-order condition for a Nash equilibrium for the n-nation economy yields

$$\sum_{j=1}^n \beta_j^i\, q_j(s) + \beta_i^i\, q_i(s) = \alpha^i - c_i - v_i(s), \quad \text{for } i \in N. \tag{10.25}$$

With output tax rates $v(s) = \{v_1(s), v_2(s), \ldots, v_n(s)\}$ being regarded as parameters, (10.25) becomes a system of equations linear in $q(s) = \{q_1(s), q_2(s), \ldots, q_n(s)\}$. Solving (10.25) yields an industry equilibrium

$$q_i(s) = \phi_i\big[v(s)\big] = \bar\alpha^i + \sum_{j \in N} \bar\beta_j^i\, v_j(s), \tag{10.26}$$
where α¯ i and β¯ji , for i ∈ N and j ∈ N , are constants involving the model parameters {β11 , β21 , . . . , βn1 ; β12 , β22 , . . . , βn2 ; . . . ; β1n , β2n , . . . , βnn }, {α 1 , α 2 , . . . , α n }, and {c1 , c2 , . . . , cn }. The industry equilibrium generated by this oligopoly model is computable and fully tractable.
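Because (10.25) is linear in the outputs, the industry equilibrium can be computed numerically for given tax rates. The following Python sketch is one way to do this under stated assumptions: the helper names `solve_linear` and `equilibrium_outputs` are hypothetical, the two-nation parameter values are illustrative only, and plain Gaussian elimination is used simply to keep the example free of external libraries.

```python
from typing import List, Sequence


def solve_linear(M: List[List[float]], rhs: Sequence[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


def equilibrium_outputs(alpha, beta, c, v) -> List[float]:
    """Solve the first-order conditions (10.25) for q, where beta[i][j] plays
    the role of beta_j^i. Output bounds [0, q_bar] are not enforced here."""
    n = len(alpha)
    M = [[beta[i][j] + (beta[i][i] if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    rhs = [alpha[i] - c[i] - v[i] for i in range(n)]
    return solve_linear(M, rhs)


# Illustrative (hypothetical) two-nation parameters.
alpha = [150.0, 130.0]
beta = [[2.0, 1.0], [0.5, 1.0]]
c = [0.5, 1.0]
q_no_tax = equilibrium_outputs(alpha, beta, c, [0.0, 0.0])
q_taxed = equilibrium_outputs(alpha, beta, c, [10.0, 0.0])
```

In this illustration, raising nation 1's tax lowers its own equilibrium output and raises nation 2's output, which is the cross-effect captured by the constants in (10.26).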
10.5.1.2 Local and Global Environmental Impacts Industrial production emits pollutants into the environment. The emitted pollutants cause short-term local impacts on neighboring areas of the origin of production in forms like passing-by waste in waterways, wind-driven suspended particles in the air, unpleasant odor, noise, dust, and heat. For an output of qi (s) produced by nation
i, there will be a short-term local environmental impact (cost) of $\varepsilon_i^i q_i(s)$ on nation i itself and a local impact of $\varepsilon_j^i q_i(s)$ on its neighbor nation j. In particular, $\varepsilon_j^i$ is a positive constant. Nation i will receive short-term local environmental impacts from its adjacent nations measured as $\varepsilon_i^j q_j(s)$ for $j \in \bar K^i$. Thus $\bar K^i$ is the subset of nations whose outputs produce local environmental impacts to nation i. Moreover, industrial production will also create long-term global environmental impacts by building up existing pollution stocks like greenhouse gases, CFC, and atmospheric particulates. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s) \in R^+$ denote the level of pollution at time s. The dynamics of the pollution stock is governed by the stochastic differential equation

$$dx(s) = \Big[\sum_{j=1}^n a_j\, q_j(s) - \sum_{j=1}^n b_j\, u_j(s)\, \big[x(s)\big]^{1/2} - \delta\, x(s)\Big]\, ds + \sigma\, x(s)\, dz(s), \quad x(t_0) = x_{t_0}, \tag{10.27}$$

where σ is a noise parameter and $z(s)$ is a Wiener process, $a_j$ is the amount added to the pollution stock by a unit of nation j’s output, $u_j(s)$ is the pollution abatement effort of nation j, $b_j u_j(s)[x(s)]^{1/2}$ is the amount of pollution removed by $u_j(s)$ units of abatement effort of nation j, and δ is the natural rate of decay of the pollutants. Short-term local impacts are closely related to the level of production activities and hence are characterized by a deterministic scheme. However, the accumulation of pollution stock like greenhouse gases often involves the interactions between the natural environment and the pollutants emitted and hence stochastic elements will appear. For instance, nature’s capability to replenish the environment, the rate of pollution degradation, and climate change are subject to certain degrees of uncertainty. Hence a stochastic dynamic is used to model the evolution of the pollution stock in (10.27). Finally, the damage (cost) of the pollution stock in the environment to nation i at time s is $h_i x(s)$.
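To get a feel for the stock dynamics in (10.27), a sample path can be simulated numerically. The sketch below uses the Euler-Maruyama scheme, which is a standard discretization and not a method taken from the text. It assumes, purely for illustration, that outputs and abatement efforts are held constant over time; the function name `simulate_pollution`, all parameter values, and the truncation at zero are assumptions of this sketch.

```python
import math
import random
from typing import List, Sequence


def simulate_pollution(x0: float, a: Sequence[float], q: Sequence[float],
                       b: Sequence[float], u: Sequence[float],
                       delta: float, sigma: float, T: float,
                       steps: int, seed: int = 0) -> List[float]:
    """Euler-Maruyama path of
    dx = [sum_j a_j q_j - sum_j b_j u_j sqrt(x) - delta x] ds + sigma x dz
    with constant (hypothetical) outputs q and abatement efforts u."""
    rng = random.Random(seed)
    dt = T / steps
    emission = sum(aj * qj for aj, qj in zip(a, q))
    abatement = sum(bj * uj for bj, uj in zip(b, u))
    x = x0
    path = [x]
    for _ in range(steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over dt
        drift = emission - abatement * math.sqrt(x) - delta * x
        # Truncation at zero is an assumption that keeps sqrt(x) defined.
        x = max(x + drift * dt + sigma * x * dz, 0.0)
        path.append(x)
    return path


# Illustrative run: two nations, no abatement, no noise.
path_det = simulate_pollution(100.0, [4.0, 3.0], [1.0, 1.0], [0.05, 0.02],
                              [0.0, 0.0], delta=0.04, sigma=0.0,
                              T=2.0, steps=2000)
# Same setting with noise and abatement switched on.
path_sto = simulate_pollution(100.0, [4.0, 3.0], [1.0, 1.0], [0.05, 0.02],
                              [1.0, 1.0], delta=0.04, sigma=0.2,
                              T=2.0, steps=2000, seed=42)
```

With the noise and abatement switched off, the drift is linear and the simulated endpoint can be compared with the closed form $E/\delta + (x_0 - E/\delta)e^{-\delta T}$, where E is the constant emission rate; this is a convenient sanity check on the discretization.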
10.5.1.3 The Governments’ Objectives
The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. The instantaneous objective of government i at time s can be expressed as

$$\Big[\alpha^i - \sum_{j=1}^n \beta_j^i\, q_j(s)\Big]\, q_i(s) - c_i\, q_i(s) - c_i^a\big[u_i(s)\big]^2 - \sum_{j \in \bar K^i} \varepsilon_i^j\, q_j(s) - h_i\, x(s), \quad i \in N, \tag{10.28}$$

where $c_i^a > 0$ and $h_i > 0$ are constants, $c_i^a[u_i(s)]^2$ is the cost of employing a $u_i$ amount of pollution abatement effort, and $h_i x(s)$ is the value of damage to country i from an $x(s)$ amount of pollution. The governments’ planning horizon is $[t_0, T]$. It is possible that T may be very large. At time T, the terminal appraisal associated with the state of pollution is $g^i[\bar x^i - x(T)]$, where $g^i \geq 0$ and $\bar x^i \geq 0$. The discount rate is r. Each one of the n governments seeks to maximize the expected value of the integral of its instantaneous objective in (10.28) over the planning horizon subject to the pollution dynamics in (10.27) with controls on the level of the abatement effort and output tax. By substituting $q_i(s)$, for $i \in N$, from (10.26) into (10.27) and (10.28), one obtains a stochastic differential game in which government $i \in N$ seeks to

$$\begin{aligned}
\max_{v_i(s),\, u_i(s)} E_{t_0}\bigg\{ \int_{t_0}^{T} \bigg[ & \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big]\Big) \Big(\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i\, v_h(s)\Big) \\
& - c_i \Big(\bar\alpha^i + \sum_{j \in N} \bar\beta_j^i\, v_j(s)\Big) - c_i^a\big[u_i(s)\big]^2 - \sum_{j \in \bar K^i} \varepsilon_i^j \Big(\bar\alpha^j + \sum_{\ell \in N} \bar\beta_\ell^j\, v_\ell(s)\Big) \\
& - h_i\, x(s) \bigg]\, e^{-r(s - t_0)}\, ds - g^i\big[x(T) - \bar x^i\big]\, e^{-r(T - t_0)} \bigg\},
\end{aligned} \tag{10.29}$$

subject to

$$dx(s) = \Big[\sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big) - \sum_{j=1}^n b_j\, u_j(s)\, \big[x(s)\big]^{1/2} - \delta\, x(s)\Big]\, ds + \sigma\, x(s)\, dz(s), \quad x(t_0) = x_{t_0}. \tag{10.30}$$
In the game in (10.29) and (10.30) one can readily observe that government i’s tax policy vi (s) is not only explicitly reflected in its own output, but also on the outputs of other nations. As mentioned before, this modeling formulation allows some intriguing scenarios to arise. For instance, an increase of vi (s) may just cause a minor drop in nation i’s industrial profit, but may cause significant increases in its neighbors’ outputs that produce large, local, negative environmental impacts to nation i. This results in the nations’ reluctance to increase or impose taxes on industrial outputs.
10.5.2 Noncooperative Outcomes In this section we discuss the solution to the noncooperative game in (10.29) and (10.30). Under a noncooperative framework, a Nash equilibrium solution can be characterized with Theorem 2.5 in Chap. 2 as follows.
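Corollary 10.3 A set of strategies $\{u_i^*(t) = \mu_i(t, x),\ v_i^*(t) = \phi_i(t, x),\ \text{for } i \in N\}$ provides a Nash equilibrium solution to the stochastic differential game in (10.29) and (10.30) if there exist suitably smooth functions $V^{(t_0)i}(t, x) : [t_0, T] \times R \to R$, $i \in N$, satisfying the following partial differential equations:

$$\begin{aligned}
-V_t^{(t_0)i}(t, x) &- \frac{\sigma^2 x^2}{2}\, V_{xx}^{(t_0)i}(t, x) \\
= \max_{v_i,\, u_i} \bigg\{ & \bigg[ \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{\substack{h \in N \\ h \neq i}} \bar\beta_h^j\, \phi_h(t, x) + \bar\beta_i^j\, v_i\Big]\Big) \Big(\bar\alpha^i + \sum_{\substack{h \in N \\ h \neq i}} \bar\beta_h^i\, \phi_h(t, x) + \bar\beta_i^i\, v_i\Big) \\
& - c_i \Big(\bar\alpha^i + \sum_{\substack{j \in N \\ j \neq i}} \bar\beta_j^i\, \phi_j(t, x) + \bar\beta_i^i\, v_i\Big) - c_i^a [u_i]^2 \\
& - \sum_{j \in \bar K^i} \varepsilon_i^j \Big(\bar\alpha^j + \sum_{\substack{\ell \in N \\ \ell \neq i}} \bar\beta_\ell^j\, \phi_\ell(t, x) + \bar\beta_i^j\, v_i\Big) - h_i\, x \bigg]\, e^{-r(t - t_0)} \\
& + V_x^{(t_0)i}(t, x) \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{\substack{h \in N \\ h \neq i}} \bar\beta_h^j\, \phi_h(t, x) + \bar\beta_i^j\, v_i\Big) - b_i\, u_i\, x^{1/2} - \sum_{\substack{j=1 \\ j \neq i}}^{n} b_j\, \mu_j(t, x)\, x^{1/2} - \delta x \bigg] \bigg\},
\end{aligned} \tag{10.31}$$

$$V^{(t_0)i}(T, x) = -g^i\big[x - \bar x^i\big]\, e^{-r(T - t_0)}. \tag{10.32}$$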
Performing the indicated maximization in (10.31) yields

$$\mu_i(t, x) = -\frac{b_i}{2 c_i^a}\, V_x^{(t_0)i}(t, x)\, e^{r(t - t_0)}\, x^{1/2}, \tag{10.33}$$

$$\begin{aligned}
& \Big(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, \phi_h(t, x)\Big]\Big)\, \bar\beta_i^i - \Big(\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i\, \phi_h(t, x)\Big) \sum_{j=1}^n \beta_j^i\, \bar\beta_i^j \\
& \quad - c_i\, \bar\beta_i^i - \sum_{j \in \bar K^i} \varepsilon_i^j\, \bar\beta_i^j + V_x^{(t_0)i}(t, x) \sum_{j=1}^n a_j\, \bar\beta_i^j\, e^{r(t - t_0)} = 0,
\end{aligned} \tag{10.34}$$

for $t \in [t_0, T]$ and $i \in N$. The system in (10.34) forms a set of equations that are linear in $\{\phi_1(t, x), \phi_2(t, x), \ldots, \phi_n(t, x)\}$, with $\{V_x^{(t_0)1}(t, x)\, e^{r(t - t_0)}, V_x^{(t_0)2}(t, x)\, e^{r(t - t_0)}, \ldots, V_x^{(t_0)n}(t, x)\, e^{r(t - t_0)}\}$ being taken as a set of parameters. Solving (10.34) yields

$$\phi_i(t, x) = \hat\alpha^i + \sum_{j \in N} \hat\beta_j^i\, V_x^{(t_0)j}(t, x)\, e^{r(t - t_0)}, \quad i \in N, \tag{10.35}$$
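where $\hat\alpha^i$ and $\hat\beta_j^i$, for $i \in N$ and $j \in N$, are constants involving the constant coefficients in (10.34). Substituting the results in (10.33) and (10.35) into (10.31) and (10.32) we obtain the following proposition.

Proposition 10.1 The system in (10.31) and (10.32) admits a solution

$$V^{(t_0)i}(t, x) = \big[A_i(t)\, x + C_i(t)\big]\, e^{-r(t - t_0)}, \quad \text{for } i \in N, \tag{10.36}$$

where $\{A_1(t), A_2(t), \ldots, A_n(t)\}$ satisfies the following set of constant coefficient quadratic ordinary differential equations:

$$\dot A_i(t) = (r + \delta)\, A_i(t) - \frac{b_i^2}{4 c_i^a}\big[A_i(t)\big]^2 - A_i(t) \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{b_j^2}{2 c_j^a}\, A_j(t) + h_i, \qquad A_i(T) = -g^i; \quad \text{for } i \in N, \tag{10.37}$$

and $\{C_i(t);\ i \in N\}$ is given by

$$C_i(t) = e^{r(t - t_0)} \bigg[ \int_{t_0}^{t} F_i(y)\, e^{-r(y - t_0)}\, dy + C_i^0 \bigg], \tag{10.38}$$

where

$$C_i^0 = g^i\, \bar x^i\, e^{-r(T - t_0)} - \int_{t_0}^{T} F_i(y)\, e^{-r(y - t_0)}\, dy,$$

$$\begin{aligned}
F_i(t) = {} & -\bigg(\alpha^i - \sum_{j=1}^n \beta_j^i \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\Big]\bigg) \bigg(\bar\alpha^i + \sum_{h \in N} \bar\beta_h^i \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\bigg) \\
& + c_i \bigg(\bar\alpha^i + \sum_{j \in N} \bar\beta_j^i \Big(\hat\alpha^j + \sum_{k \in N} \hat\beta_k^j\, A_k(t)\Big)\bigg) + \sum_{j \in \bar K^i} \varepsilon_i^j \bigg(\bar\alpha^j + \sum_{\ell \in N} \bar\beta_\ell^j \Big(\hat\alpha^\ell + \sum_{k \in N} \hat\beta_k^\ell\, A_k(t)\Big)\bigg) \\
& - A_i(t) \sum_{j=1}^n a_j \bigg(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \Big(\hat\alpha^h + \sum_{k \in N} \hat\beta_k^h\, A_k(t)\Big)\bigg).
\end{aligned}$$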
Proof Follow the proof of Proposition 6.2 in the Appendixes of Chap. 6.
The corresponding feedback Nash equilibrium strategies of the game in (10.29) and (10.30) can be obtained as

$$\mu_i(t, x) = -\frac{b_i}{2 c_i^a}\, A_i(t)\, x^{1/2} \quad \text{and} \quad \phi_i(t, x) = \hat\alpha^i + \sum_{j \in N} \hat\beta_j^i\, A_j(t), \tag{10.39}$$

for $i \in N$ and $t \in [t_0, T]$. A remark that will be utilized in the subsequent analysis is given below.

Remark 10.3 Let $V^{(\tau)i}(t, x_t)$ denote the value function of nation i in a game with the payoffs in (10.29) and dynamics in (10.30) that starts at time τ. One can readily verify that $V^{(\tau)i}(t, x_t) = V^{(t_0)i}(t, x_t)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$.
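The coefficients $A_i(t)$ are pinned down by the terminal-value problem (10.37), which can be integrated numerically backward from time T. The Python sketch below uses a classical fourth-order Runge-Kutta step; this numerical scheme, the function names `riccati_rhs` and `solve_A_backward`, and every parameter value are assumptions made for illustration and are not taken from the text.

```python
from typing import List, Sequence


def riccati_rhs(A: Sequence[float], r: float, delta: float,
                b: Sequence[float], ca: Sequence[float],
                h: Sequence[float]) -> List[float]:
    """Right-hand side of (10.37), evaluated for every nation i."""
    n = len(A)
    out = []
    for i in range(n):
        cross = sum(b[j] ** 2 / (2.0 * ca[j]) * A[j]
                    for j in range(n) if j != i)
        out.append((r + delta) * A[i]
                   - b[i] ** 2 / (4.0 * ca[i]) * A[i] ** 2
                   - A[i] * cross
                   + h[i])
    return out


def solve_A_backward(r, delta, b, ca, h, g, t0, T, steps=2000) -> List[float]:
    """Integrate (10.37) from A_i(T) = -g_i back to t0 and return A_i(t0)."""
    dt = (T - t0) / steps
    A = [-gi for gi in g]  # terminal condition
    for _ in range(steps):
        # RK4 with a negative time step (moving from t to t - dt).
        k1 = riccati_rhs(A, r, delta, b, ca, h)
        k2 = riccati_rhs([a - 0.5 * dt * k for a, k in zip(A, k1)],
                         r, delta, b, ca, h)
        k3 = riccati_rhs([a - 0.5 * dt * k for a, k in zip(A, k2)],
                         r, delta, b, ca, h)
        k4 = riccati_rhs([a - dt * k for a, k in zip(A, k3)],
                         r, delta, b, ca, h)
        A = [a - dt / 6.0 * (p + 2.0 * q + 2.0 * s + w)
             for a, p, q, s, w in zip(A, k1, k2, k3, k4)]
    return A


# Hypothetical single-nation check over a long horizon.
A_single = solve_A_backward(0.05, 0.05, [1.0], [1.0], [1.0], [0.0], 0.0, 20.0)
# Hypothetical symmetric two-nation case.
A_pair = solve_A_backward(0.05, 0.05, [1.0, 1.0], [1.0, 1.0],
                          [1.0, 1.0], [0.5, 0.5], 0.0, 2.0)
```

In the single-nation case with a long horizon, the backward solution settles near the negative root of the stationary quadratic $(r+\delta)A - \frac{b^2}{4c^a}A^2 + h = 0$, which gives a quick plausibility check. Given the resulting $A_i(t)$, the feedback strategies in (10.39) follow directly.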
10.6 Collaborative Scheme in Stochastic Industrial Pollution Management Now consider the case when all the nations agree to cooperate and adhere to an optimality principle that entails (i) group optimality and (ii) sharing the expected gain from cooperation proportional to the nations’ relative sizes of the expected noncooperative payoffs.
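10.6.1 Cooperative Optimization and State Trajectory
To secure group optimality the participating nations seek to maximize their expected joint payoff by solving the following stochastic control problem:

$$\begin{aligned}
\max_{v_1, \ldots, v_n;\ u_1, \ldots, u_n} E_{t_0}\bigg\{ \int_{t_0}^{T} \sum_{\ell = 1}^n \bigg[ & \Big(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h(s)\Big]\Big) \Big(\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell\, v_h(s)\Big) \\
& - c_\ell \Big(\bar\alpha^\ell + \sum_{j \in N} \bar\beta_j^\ell\, v_j(s)\Big) - c_\ell^a\big[u_\ell(s)\big]^2 - \sum_{j \in \bar K^\ell} \varepsilon_\ell^j \Big(\bar\alpha^j + \sum_{k \in N} \bar\beta_k^j\, v_k(s)\Big) \\
& - h_\ell\, x(s) \bigg]\, e^{-r(s - t_0)}\, ds - \sum_{\ell = 1}^n g^\ell\big[x(T) - \bar x^\ell\big]\, e^{-r(T - t_0)} \bigg\},
\end{aligned} \tag{10.40}$$

subject to (10.30).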
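Invoking Theorem A.5 in the Technical Appendixes, a set of controls $\{[v_i^{**}(t), u_i^{**}(t)] = [\psi_i(t, x), \varpi_i(t, x)],\ \text{for } i \in N\}$ constitutes an optimal solution to the stochastic control problem in (10.30) and (10.40) if there exists a continuously twice differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R \to R$ satisfying the following partial differential equations:

$$\begin{aligned}
-W_t^{(t_0)}(t, x) &- \frac{\sigma^2 x^2}{2}\, W_{xx}^{(t_0)}(t, x) \\
= \max_{v_1, \ldots, v_n;\ u_1, \ldots, u_n} \bigg\{ & \sum_{\ell = 1}^n \bigg[ \Big(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h\Big]\Big) \Big(\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell\, v_h\Big) \\
& - c_\ell \Big(\bar\alpha^\ell + \sum_{j \in N} \bar\beta_j^\ell\, v_j\Big) - c_\ell^a [u_\ell]^2 - \sum_{j \in \bar K^\ell} \varepsilon_\ell^j \Big(\bar\alpha^j + \sum_{k \in N} \bar\beta_k^j\, v_k\Big) - h_\ell\, x \bigg]\, e^{-r(t - t_0)} \\
& + W_x^{(t_0)}(t, x) \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, v_h\Big) - \sum_{j=1}^n b_j\, u_j\, x^{1/2} - \delta x \bigg] \bigg\},
\end{aligned} \tag{10.41}$$

$$W^{(t_0)}(T, x) = -\sum_{i=1}^n g^i\big[x - \bar x^i\big]\, e^{-r(T - t_0)}. \tag{10.42}$$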
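Performing the indicated maximization in (10.41) yields the optimal controls under cooperation as

$$\varpi_i(t, x) = -\frac{b_i}{2 c_i^a}\, W_x^{(t_0)}(t, x)\, e^{r(t - t_0)}\, x^{1/2}, \quad \text{for } i \in N; \tag{10.43}$$

$$\begin{aligned}
& \sum_{\ell = 1}^n \Big(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, \psi_h(t, x)\Big]\Big)\, \bar\beta_i^\ell - \sum_{\ell = 1}^n \Big(\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell\, \psi_h(t, x)\Big) \sum_{j=1}^n \beta_j^\ell\, \bar\beta_i^j \\
& \quad - \sum_{\ell = 1}^n \Big( c_\ell\, \bar\beta_i^\ell + \sum_{j \in \bar K^\ell} \varepsilon_\ell^j\, \bar\beta_i^j \Big) + W_x^{(t_0)}(t, x) \sum_{j=1}^n a_j\, \bar\beta_i^j\, e^{r(t - t_0)} = 0, \quad \text{for } i \in N.
\end{aligned} \tag{10.44}$$

The system in (10.44) can be viewed as a set of equations that are linear in $\{\psi_1(t, x), \psi_2(t, x), \ldots, \psi_n(t, x)\}$, with $W_x^{(t_0)}(t, x)\, e^{r(t - t_0)}$ being taken as a parameter. Solving (10.44) yields

$$\psi_i(t, x) = \hat{\hat\alpha}^i + \hat{\hat\beta}^i\, W_x^{(t_0)}(t, x)\, e^{r(t - t_0)}, \tag{10.45}$$

where $\hat{\hat\alpha}^i$ and $\hat{\hat\beta}^i$, for $i \in N$, are constants involving the model parameters.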
Proposition 10.2 The system in (10.41) and (10.42) admits a solution

$$W^{(t_0)}(t, x) = \big[A^*(t)\, x + C^*(t)\big]\, e^{-r(t - t_0)}, \tag{10.46}$$
with A∗ (t) = AP∗ + Φ ∗ (t) C¯ ∗ −
∗
t
C (t) = e
r(t−t0 )
∗
−1
t n b2 j t0 j =1
F (y)e
−r(y−t0 )
Φ ∗ (y) dy
2cja dy
t0
+ C∗0
,
and
,
where
∗
t
Φ (t) = exp
AP 2cja ∗ j =1
t0
C¯ ∗ =
T
n b2 j
t0 j =1
= (r + δ) − (r + δ) + 4 2
∗
F (t) = −
n
α −
n
j =1
α¯ +
βj
× α¯ +
2cja
Φ ∗ (y) dy,
n n b2 j
j
j =1
=1
+ (r + δ) dy ,
−Φ ∗ (T ) + (AP∗ + nj=1 g j )
AP∗
n b2 j
2cja
1/2 hj
+ βˆˆ h A∗ (t)
ˆ ˆ ˆh h ∗ ˆj j ∗ ˆ ˆ ¯ ¯ βh αˆ + β A (t) − c α¯ + βj αˆ + β A (t)
j ∈K¯
j
ε α¯ j +
j β¯k αˆˆ k + βˆˆ kj A∗ (t)
n
aj α¯ + j
j =1
=
n j =1
j ∈N
k∈N
− A∗x (t)
C∗0
h∈N
h∈N
−
, ca j =1 j
j =1
j β¯h αˆˆ h
n b2 j
j j −r(T −t0 )
g x¯ e
j β¯h αˆˆ h + βˆˆ h A∗ (t)
,
and
h∈N
−
T
F ∗ (y)e−r(y−t0 ) dy.
t0
Proof Follow the Proof of Proposition 6.3 in the Appendixes of Chap. 6.
Using (10.43), (10.45), and (10.46), the control strategy under cooperation can be obtained as

$$\psi_i(t, x) = \hat{\hat\alpha}^i + \hat{\hat\beta}^i\, A^*(t) \quad \text{and} \quad \varpi_i(t, x) = -\frac{b_i}{2 c_i^a}\, A^*(t)\, x^{1/2}, \tag{10.47}$$

for $t \in [t_0, T]$ and $i = 1, 2, \ldots, n$. Substituting the control strategy from (10.47) into (10.30) yields the dynamics of pollution accumulation under cooperation. Solving the stochastic cooperative pollution dynamics yields the cooperative state trajectory

$$\begin{aligned}
x^*(t) = {} & \exp\bigg\{ \int_{t_0}^{t} \Big[ \sum_{j=1}^n \frac{b_j^2}{2 c_j^a}\, A^*(s) - \delta - \frac{\sigma^2}{2} \Big]\, ds + \int_{t_0}^{t} \sigma\, dz(s) \bigg\} \\
& \times \bigg\{ x_{t_0} + \int_{t_0}^{t} \sum_{j=1}^n a_j \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \big(\hat{\hat\alpha}^h + \hat{\hat\beta}^h\, A^*(s)\big)\Big] \\
& \qquad \times \exp\bigg\{ \int_{t_0}^{s} \Big[ \frac{\sigma^2}{2} + \delta - \sum_{j=1}^n \frac{b_j^2}{2 c_j^a}\, A^*(\tau) \Big]\, d\tau - \int_{t_0}^{s} \sigma\, dz(\tau) \bigg\}\, ds \bigg\},
\end{aligned} \tag{10.48}$$

for $t \in [t_0, T]$. We use $X_t^*$ to denote the set of realizable values of $x^*(t)$ at time t generated by (10.48). The term $x_t^*$ is used to denote an element in the set $X_t^*$. A remark that will be utilized in the subsequent analysis is given below.

Remark 10.4 Let $W^{(\tau)}(t, x_t)$ denote the value function of the stochastic control problem with the objective in (10.40) and dynamics in (10.30), which starts at time τ. One can readily verify that $W^{(\tau)}(t, x_t^*) = W^{(t_0)}(t, x_t^*)\, e^{r(\tau - t_0)}$, for $\tau \in [t_0, T]$.
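10.6.2 Subgame Consistent Solution and Benefit Distribution
According to the agreed-upon optimality principle, the imputation scheme $\{\xi^{(\tau)i}(\tau, x_\tau^*);\ \text{for } i \in N\}$ has to satisfy the following condition:

Condition 10.2

$$\xi^{(\tau)i}\big(\tau, x_\tau^*\big) = \frac{V^{(\tau)i}(\tau, x_\tau^*)}{\sum_{j=1}^n V^{(\tau)j}(\tau, x_\tau^*)}\; W^{(\tau)}\big(\tau, x_\tau^*\big), \tag{10.49}$$

for $i \in N$, $x_\tau^* \in X_\tau^*$, and $\tau \in [t_0, T]$.

Invoking Theorem 8.2 in Chap. 8, a subgame consistent solution can then be expressed as

$$P(x_0, T - t_0) = \Big\{ \big[\psi\big(s, x^*(s)\big)\big]_{s = t_0}^{T},\ \big[\varpi\big(s, x^*(s)\big)\big]_{s = t_0}^{T},\ \big[B\big(s, x_s^*\big)\big]_{s = t_0}^{T},\ \xi^{(t_0)}(t_0, x_{t_0}) \Big\},$$

where $\psi_i(s, x^*(s)) = \hat{\hat\alpha}^i + \hat{\hat\beta}^i A^*(s)$ for $i \in N$ is a set of environmental policy instruments in the collaborative environmental scheme,

$$\varpi_i\big(s, x^*(s)\big) = -\frac{b_i}{2 c_i^a}\, A^*(s)\, \big[x^*(s)\big]^{1/2}$$

is the set of pollution abatement efforts in the collaborative environmental scheme, $\xi^{(t_0)}(t_0, x_{t_0})$ is an imputation scheme satisfying Condition 10.2, and

$$\begin{aligned}
B_i\big(s, x_s^*\big) = {} & -\frac{\partial}{\partial t}\bigg[ \frac{V^{(s)i}(t, x_t^*)}{\sum_{j=1}^n V^{(s)j}(t, x_t^*)}\, W^{(s)}\big(t, x_t^*\big) \bigg]\bigg|_{t = s} \\
& - \frac{\partial}{\partial x_s^*}\bigg[ \frac{V^{(s)i}(s, x_s^*)}{\sum_{j=1}^n V^{(s)j}(s, x_s^*)}\, W^{(s)}\big(s, x_s^*\big) \bigg] \\
& \quad \times \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j\, \psi_h\big(s, x_s^*\big)\Big) - \sum_{j=1}^n b_j\, \varpi_j\big(s, x_s^*\big)\, \big(x_s^*\big)^{1/2} - \delta\, x_s^* \bigg] \\
& - \frac{\sigma^2 \big(x_s^*\big)^2}{2}\, \frac{\partial^2}{\partial (x_s^*)^2}\bigg[ \frac{V^{(s)i}(s, x_s^*)}{\sum_{j=1}^n V^{(s)j}(s, x_s^*)}\, W^{(s)}\big(s, x_s^*\big) \bigg],
\end{aligned}$$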
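for $i \in N$ and $x_s^* \in X_s^*$. Invoking Propositions 10.1, 10.2, and (10.47), one can express $B_i(s, x_s^*)$ as

$$\begin{aligned}
B_i\big(s, x_s^*\big) = {} & \frac{-\big[A_i(s)\, x_s^* + C_i(s)\big]}{\sum_{j=1}^n \big[A_j(s)\, x_s^* + C_j(s)\big]} \Big\{ \dot A^*(s)\, x_s^* + \dot C^*(s) - r\big[A^*(s)\, x_s^* + C^*(s)\big] \Big\} \\
& - \frac{\big[A^*(s)\, x_s^* + C^*(s)\big]}{\sum_{j=1}^n \big[A_j(s)\, x_s^* + C_j(s)\big]} \Big\{ \dot A_i(s)\, x_s^* + \dot C_i(s) - r\big[A_i(s)\, x_s^* + C_i(s)\big] \Big\} \\
& + \frac{\big[A_i(s)\, x_s^* + C_i(s)\big]\big[A^*(s)\, x_s^* + C^*(s)\big]}{\big(\sum_{j=1}^n \big[A_j(s)\, x_s^* + C_j(s)\big]\big)^2} \sum_{j=1}^n \Big\{ \dot A_j(s)\, x_s^* + \dot C_j(s) - r\big[A_j(s)\, x_s^* + C_j(s)\big] \Big\} \\
& + \frac{\big[A_i(s)\, x_s^* + C_i(s)\big]\big[A^*(s)\, x_s^* + C^*(s)\big]}{\big(\sum_{j=1}^n \big[A_j(s)\, x_s^* + C_j(s)\big]\big)^2} \sum_{j=1}^n A_j(s) \\
& \quad \times \bigg[ \sum_{j=1}^n a_j \Big(\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \big(\hat{\hat\alpha}^h + \hat{\hat\beta}^h\, A^*(s)\big)\Big) + \sum_{j=1}^n \frac{b_j^2}{2 c_j^a}\, A^*(s)\, x_s^* - \delta\, x_s^* \bigg],
\end{aligned} \tag{10.50}$$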
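for $i \in N$ and $x_s^* \in X_s^*$. When all nations are adopting the cooperative strategies, the rate of instantaneous payment that nation $\ell \in N$ will realize at time t with the state being $x_t^*$ can be expressed as

$$\begin{aligned}
\zeta_\ell\big(t, x_t^*\big) = {} & \bigg(\alpha^\ell - \sum_{j=1}^n \beta_j^\ell \Big[\bar\alpha^j + \sum_{h \in N} \bar\beta_h^j \big(\hat{\hat\alpha}^h + \hat{\hat\beta}^h\, A^*(t)\big)\Big]\bigg) \bigg(\bar\alpha^\ell + \sum_{h \in N} \bar\beta_h^\ell \big(\hat{\hat\alpha}^h + \hat{\hat\beta}^h\, A^*(t)\big)\bigg) \\
& - c_\ell \bigg(\bar\alpha^\ell + \sum_{j \in N} \bar\beta_j^\ell \big(\hat{\hat\alpha}^j + \hat{\hat\beta}^j\, A^*(t)\big)\bigg) - c_\ell^a \Big[\frac{b_\ell}{2 c_\ell^a}\, A^*(t)\Big]^2 x_t^* \\
& - \sum_{j \in \bar K^\ell} \varepsilon_\ell^j \bigg(\bar\alpha^j + \sum_{k \in N} \bar\beta_k^j \big(\hat{\hat\alpha}^k + \hat{\hat\beta}^k\, A^*(t)\big)\bigg) - h_\ell\, x_t^*.
\end{aligned} \tag{10.51}$$

According to (10.50), under the cooperative scheme nation ℓ is to receive an instantaneous payment equal to $B_\ell(t, x_t^*)$ at time t when $x^*(t) = x_t^* \in X_t^*$. Hence a side payment of the value $B_\ell(t, x_t^*) - \zeta_\ell(t, x_t^*)$ will be offered to nation ℓ. The model analyzed in Sects. 10.5 and 10.6 comes mainly from Yeung and Petrosyan (2008). Another version of stochastic differential games on cooperative pollution management can be found in Yeung (2007).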
10.7 Exercises

10.1 Consider a two-nation international economy in which transboundary pollution is generated by the production process. The planning horizon is [0, 2]. The inverse demand functions for the outputs of nations 1 and 2 at time instant $s \in [0, 2]$ are, respectively,
\[ P_1(s) = 150 - 2q_1(s) - q_2(s) \quad\text{and}\quad P_2(s) = 130 - 0.5q_1(s) - q_2(s), \]
where $P_i(s)$ is the price of the output and $q_i(s)$ is the output of nation $i$. The unit costs of production in these nations are $c_1 = 0.5$ and $c_2 = 1$. The instantaneous industrial profits of nations 1 and 2 at time $s$ can be expressed as
\[ \pi_1(s) = [150 - 2q_1(s) - q_2(s)]\,q_1(s) - 0.5q_1(s) - v_1(s)q_1(s) \]
and
\[ \pi_2(s) = [130 - 0.5q_1(s) - q_2(s)]\,q_2(s) - q_2(s) - v_2(s)q_2(s), \]
where $v_i(s) \ge 0$ is the tax rate imposed by government $i$ on its industrial output at time $s$. At each time instant $s$, the industrial sector of nation $i \in \{1, 2\}$ seeks to maximize its instantaneous profit. Derive the market equilibrium at time instant $s$.

10.2 Industrial production emits pollutants into the environment. The emitted pollutants cause short-term local impacts on areas neighboring the origin of production, in forms such as waste passing through waterways, wind-driven suspended particles in the air, unpleasant odor, noise, dust, and heat. For an output of $q_1(s)$ produced by nation 1, there will be a short-term local environmental impact (cost) of $0.2q_1(s)$ on nation 1 itself and a local impact of $0.15q_1(s)$ on its neighbor, nation 2. On the other hand, for an output of $q_2(s)$ produced by nation 2, there will be a short-term local environmental impact (cost) of $0.35q_2(s)$ on nation 2 itself and a local impact of $0.3q_2(s)$ on its neighbor, nation 1. Moreover, industrial production will also create long-term global environmental impacts by building up existing pollution stocks like greenhouse gases, CFC, and atmospheric particulates. Each government adopts its own pollution abatement policy to reduce the pollution stock. Let $x(s)$ denote the level of pollution and $u_j(s)$ the pollution abatement effort of nation $j$ at time $s$. The dynamics of the pollution stock are governed by the stochastic differential equation
\[ dx(s) = \big[ 4q_1(s) + 3q_2(s) - 0.05u_1(s)\,x(s)^{1/2} - 0.02u_2(s)\,x(s)^{1/2} - 0.04x(s) \big]\,ds + 0.2\,dz(s), \qquad x(0) = 100, \]
where $z(s)$ is a Wiener process. The governments have to promote business interests and at the same time handle the financing of the costs brought about by pollution. The damages to countries 1 and 2 from an $x(s)$ amount of pollution are $0.25x(s)$ and $0.2x(s)$. In particular, each government maximizes the net gains in the industrial sector minus the sum of expenditures on pollution abatement and damages from pollution.
The instantaneous objectives of governments 1 and 2 at time $s$ are, respectively,
\[ [150 - 2q_1(s) - q_2(s)]\,q_1(s) - 0.5q_1(s) - 0.2q_1(s) - 0.3q_2(s) - 0.25x(s) \]
and
\[ [130 - 0.5q_1(s) - q_2(s)]\,q_2(s) - q_2(s) - 0.15q_1(s) - 0.35q_2(s) - 0.2x(s). \]
The governments' planning horizon is [0, 2]. At terminal time 2, the terminal appraisal associated with the state of pollution is $3[60 - x(2)]$ for nation 1 and $2[40 - x(2)]$ for nation 2. The discount rate is 0.05. Each government seeks to maximize the integral of its instantaneous objective over the planning horizon subject to the pollution stock dynamics.
Construct a stochastic differential game of noncooperative pollution management by these two nations. Obtain a Nash equilibrium solution for the game.

10.3 Consider the case when both nations want to cooperate and agree to act so that an international optimum can be achieved. Obtain the optimal cooperative levels of outputs and abatement efforts.

10.4 These cooperating nations adopt an optimality principle that distributes the expected gain from cooperation in proportion to the relative sizes of the nations' expected noncooperative payoffs. Characterize a subgame consistent solution.
Chapter 11
Subgame Consistent Dormant Firm Cartel
In this chapter, we introduce uncertainty into the dormant-firm cartel discussed in Chap. 7. Section 11.1 presents a stochastic dynamic oligopoly in which there are cost differentials among firms. The optimal cartel output trajectory, subgame consistent imputation schemes, and profit sharing arrangement are derived in Sect. 11.2. An illustration is given in Sect. 11.3. The case when the planning horizon becomes infinite is analyzed in Sect. 11.4; an illustration with an explicit subgame consistent solution in a stochastic framework is given in Sect. 11.5.
11.1 A Stochastic Dynamic Oligopoly
In this section, we extend the dynamic model of oligopoly in Chap. 7 to a stochastic environment.
11.1.1 Basic Settings
Consider an oligopoly in which n firms are allowed to extract a renewable resource within the duration [t0 , T ]. Among the n firms, n1 of them have absolute and marginal cost disadvantages over the other n2 = n − n1 firms. For notational convenience, the firms with cost advantages are numbered from 1 to n1 and the firms with cost disadvantages are numbered from n1 + 1 to n. The subset of firms with cost advantages is denoted by N1 and that of firms with cost disadvantages is denoted by N2 . The firms with cost advantages are identical and so are the firms with cost disadvantages.
D.W.K. Yeung, L.A. Petrosyan, Subgame Consistent Economic Optimization, Static & Dynamic Game Theory: Foundations & Applications, DOI 10.1007/978-0-8176-8262-0_11, © Springer Science+Business Media, LLC 2012
The dynamics of the resource is characterized by the stochastic differential equations
\[ dx(s) = f\Big[s, x(s), \sum_{j^1 \in N_1} u_{j^1}(s) + \sum_{j^2 \in N_2} u_{j^2}(s)\Big]\,ds + \sigma[s, x(s)]\,dz(s), \qquad x(t_0) = x_0 \in X, \tag{11.1} \]
where $u_j \in U_j$ is the (nonnegative) amount of the resource extracted by firm $j$, for $j \in N$, $x(s)$ is the resource stock, $\sigma[s, x(s)]$ is an $m \times \Theta$ matrix, $z(s)$ is a $\Theta$-dimensional Wiener process, and the initial state $x_0$ is given. Let $\Omega[s, x(s)] = \sigma[s, x(s)]\,\sigma[s, x(s)]^{\top}$ denote the covariance matrix, with its element in row $h$ and column $\zeta$ denoted by $\Omega^{h\zeta}[s, x(s)]$.
The extraction cost depends on the quantity of the resource extracted $u_j(s)$ and the resource stock size $x(s)$. In particular, the extraction cost for the $n_1$ firms with cost advantages is $c^{j^1}[u_{j^1}(s), x(s)]$, for $j^1 \in N_1$, and the extraction cost for the $n_2$ firms with cost disadvantages is $c^{j^2}[u_{j^2}(s), x(s)]$, for $j^2 \in N_2$. This formulation of unit cost follows from two assumptions: (i) the cost of extraction is positively related to extraction effort and (ii) the amount of resource extracted, seen as the output of a production function of two inputs (effort and stock level), is increasing in both inputs (see Clark 1976). In particular, firm $j^1 \in N_1$ has a cost advantage so that
\[ \partial c^{j^1}[u_{j^1}(s), x(s)]/\partial u_{j^1}(s) < \partial c^{j^2}[u_{j^2}(s), x(s)]/\partial u_{j^2}(s), \]
for all levels of $u_{j^1} \in U_{j^1}$ and $u_{j^2} \in U_{j^2}$ at any $x \in X$.
The market price of the resource depends on the total amount extracted and supplied to the market. The price-output relationship at time $s$ is given by the following downward-sloping inverse demand curve $P(s) = g[Q(s)]$, where $Q(s) = \sum_{j^1 \in N_1} u_{j^1}(s) + \sum_{j^2 \in N_2} u_{j^2}(s)$ is the total amount of the resource extracted and marketed at time $s$. At time $T$, firm $j^1 \in N_1$ will receive a termination bonus $q^{j^1}[x(T)]$ and firm $j^2 \in N_2$ will receive a termination bonus $q^{j^2}[x(T)]$. There exists a discount rate $r$, and the profits received at time $t$ have to be discounted by the factor $\exp[-r(t - t_0)]$.
At time $t_0$, firm $j^1 \in N_1$, which has cost advantages, seeks to maximize its expected profit
\[ E_{t_0}\bigg\{ \int_{t_0}^{T} \bigg[ g\Big(\sum_{h \in N_1} u_h(s) + \sum_{\ell \in N_2} u_\ell(s)\Big) u_{j^1}(s) - c^{j^1}[u_{j^1}(s), x(s)] \bigg] \exp[-r(s - t_0)]\,ds + \exp[-r(T - t_0)]\,q^{j^1}[x(T)] \bigg\}, \tag{11.2} \]
subject to (11.1).
At time t0 , firm j 2 ∈ N2 , which has cost disadvantages, seeks to maximize the expected profit
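\[ E_{t_0}\bigg\{ \int_{t_0}^{T} \bigg[ g\Big(\sum_{h \in N_1} u_h(s) + \sum_{\ell \in N_2} u_\ell(s)\Big) u_{j^2}(s) - c^{j^2}[u_{j^2}(s), x(s)] \bigg] \exp[-r(s - t_0)]\,ds + \exp[-r(T - t_0)]\,q^{j^2}[x(T)] \bigg\}, \tag{11.3} \]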
subject to (11.1).
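To make the state dynamics in (11.1) concrete, the following is a minimal simulation sketch, not part of the original model specification. It assumes a scalar stock (m = Θ = 1) and a drift of the form f = growth(x) minus total extraction; the logistic growth function, the linear feedback extraction rules, the noise coefficient, and all numerical values are hypothetical choices made only for illustration. The path is generated with a standard Euler-Maruyama discretization.

```python
import math
import random


def simulate_stock(x0, t0, T, extraction, growth, sigma, steps=200, seed=0):
    """Euler-Maruyama path for dx = [growth(x) - sum_j u_j] ds + sigma(x) dz.

    All functional forms passed in are illustrative assumptions, not the book's.
    """
    rng = random.Random(seed)
    dt = (T - t0) / steps
    x = x0
    path = [x]
    for k in range(steps):
        s = t0 + k * dt
        total_u = sum(extraction(s, x))      # total extraction by all n firms
        dz = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment with variance dt
        x += (growth(x) - total_u) * dt + sigma(x) * dz
        x = max(x, 0.0)                      # crude truncation: stock stays nonnegative
        path.append(x)
    return path


# Hypothetical setup: 2 low-cost firms, 3 high-cost firms, extraction linear in x.
n1, n2 = 2, 3
path = simulate_stock(
    x0=50.0, t0=0.0, T=4.0,
    extraction=lambda s, x: [0.05 * x] * n1 + [0.02 * x] * n2,
    growth=lambda x: 0.3 * x * (1.0 - x / 100.0),
    sigma=lambda x: 0.1 * x,
)
print(f"terminal stock x(T) = {path[-1]:.3f}")
```

Averaging discounted profits over many such paths (with different seeds) would give a Monte Carlo estimate of the expectations in (11.2) and (11.3) for a given set of extraction rules.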
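11.1.2 Market Outcome
We use $\Gamma(x_0, T - t_0)$ to denote the game in (11.1)–(11.3) and $\Gamma(x_\tau, T - \tau)$ to denote an alternative game with the state dynamics in (11.1) and payoff structures in (11.2) and (11.3), which starts at time $\tau \in [t_0, T]$ with the initial state $x_\tau \in X$. Invoking Theorem 2.5 in Chap. 2, a noncooperative Nash equilibrium solution of the game $\Gamma(x_\tau, T - \tau)$ can be characterized as follows.
Corollary 11.1 A set of feedback strategies $\{\phi^*_{j^1}(t, x)$ for $j^1 \in N_1$ and $\phi^*_{j^2}(t, x)$ for $j^2 \in N_2\}$ provides a Nash equilibrium solution to the game $\Gamma(x_\tau, T - \tau)$ if there exist continuously twice differentiable functions $V^{(\tau)j^1}(t, x) : [\tau, T] \times R \to R$ for $j^1 \in N_1$ and $V^{(\tau)j^2}(t, x) : [\tau, T] \times R \to R$ for $j^2 \in N_2$, satisfying the following partial differential equations:
\[
\begin{aligned}
&-V_t^{(\tau)j^1}(t, x) - \frac{1}{2}\sum_{h,\zeta=1}^{m} \Omega^{h\zeta}(t, x)\,V_{x^h x^\zeta}^{(\tau)j^1}(t, x) \\
&\quad = \max_{u_{j^1}} \bigg\{ \bigg[ g\Big(\sum_{\substack{h \in N_1 \\ h \neq j^1}} \phi_h^*(t, x) + u_{j^1} + \sum_{\ell \in N_2} \phi_\ell^*(t, x)\Big) u_{j^1} - c^{j^1}(u_{j^1}, x) \bigg] \exp[-r(t - \tau)] \\
&\qquad\qquad + V_x^{(\tau)j^1}(t, x)\, f\Big[t, x, \sum_{\substack{h \in N_1 \\ h \neq j^1}} \phi_h^*(t, x) + u_{j^1} + \sum_{\ell \in N_2} \phi_\ell^*(t, x)\Big] \bigg\}, \quad\text{and} \\
&V^{(\tau)j^1}(T, x) = \exp[-r(T - \tau)]\,q^{j^1}(x), \qquad \text{for } j^1 \in N_1; \\[6pt]
&-V_t^{(\tau)j^2}(t, x) - \frac{1}{2}\sum_{h,\zeta=1}^{m} \Omega^{h\zeta}(t, x)\,V_{x^h x^\zeta}^{(\tau)j^2}(t, x) \\
&\quad = \max_{u_{j^2}} \bigg\{ \bigg[ g\Big(\sum_{h \in N_1} \phi_h^*(t, x) + \sum_{\substack{\ell \in N_2 \\ \ell \neq j^2}} \phi_\ell^*(t, x) + u_{j^2}\Big) u_{j^2} - c^{j^2}(u_{j^2}, x) \bigg] \exp[-r(t - \tau)] \\
&\qquad\qquad + V_x^{(\tau)j^2}(t, x)\, f\Big[t, x, \sum_{h \in N_1} \phi_h^*(t, x) + \sum_{\substack{\ell \in N_2 \\ \ell \neq j^2}} \phi_\ell^*(t, x) + u_{j^2}\Big] \bigg\}, \quad\text{and} \\
&V^{(\tau)j^2}(T, x) = \exp[-r(T - \tau)]\,q^{j^2}(x), \qquad \text{for } j^2 \in N_2.
\end{aligned}
\tag{11.4}
\]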
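Conditions satisfying the indicated maximization in (11.4) yield
\[
\begin{aligned}
&\bigg[ g\Big(\sum_{\substack{h \in N_1 \\ h \neq j^1}} \phi_h^*(t, x) + u_{j^1} + \sum_{\ell \in N_2} \phi_\ell^*(t, x)\Big) + g'\Big(\sum_{\substack{h \in N_1 \\ h \neq j^1}} \phi_h^*(t, x) + u_{j^1} + \sum_{\ell \in N_2} \phi_\ell^*(t, x)\Big) u_{j^1} \\
&\qquad - \frac{\partial}{\partial u_{j^1}} c^{j^1}(u_{j^1}, x) \bigg] \exp[-r(t - \tau)] \\
&\qquad + V_x^{(\tau)j^1}(t, x)\,\frac{\partial}{\partial u_{j^1}} f\Big[t, x, \sum_{\substack{h \in N_1 \\ h \neq j^1}} \phi_h^*(t, x) + u_{j^1} + \sum_{\ell \in N_2} \phi_\ell^*(t, x)\Big] = 0, \qquad \text{for } j^1 \in N_1; \\[6pt]
&\bigg[ g\Big(\sum_{h \in N_1} \phi_h^*(t, x) + \sum_{\substack{\ell \in N_2 \\ \ell \neq j^2}} \phi_\ell^*(t, x) + u_{j^2}\Big) + g'\Big(\sum_{h \in N_1} \phi_h^*(t, x) + \sum_{\substack{\ell \in N_2 \\ \ell \neq j^2}} \phi_\ell^*(t, x) + u_{j^2}\Big) u_{j^2} \\
&\qquad - \frac{\partial}{\partial u_{j^2}} c^{j^2}(u_{j^2}, x) \bigg] \exp[-r(t - \tau)] \\
&\qquad + V_x^{(\tau)j^2}(t, x)\,\frac{\partial}{\partial u_{j^2}} f\Big[t, x, \sum_{h \in N_1} \phi_h^*(t, x) + \sum_{\substack{\ell \in N_2 \\ \ell \neq j^2}} \phi_\ell^*(t, x) + u_{j^2}\Big] = 0, \qquad \text{for } j^2 \in N_2.
\end{aligned}
\tag{11.5}
\]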
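As a rough illustration of how first-order conditions of the type in (11.5) pin down extraction rates, the sketch below solves them in closed form under strong simplifying assumptions that are not made in the text: linear inverse demand g(Q) = a − bQ, constant marginal costs c1 < c2 that do not depend on the stock, a drift in which extraction enters one-for-one (so ∂f/∂u = −1), and given shadow values `lam1` and `lam2` standing in for the derivatives of the value functions with respect to x. The function name and all parameters are hypothetical. With identical firms within each group, the conditions reduce to a 2×2 linear system in the per-firm extraction rates.

```python
import math


def interior_equilibrium(a, b, n1, n2, c1, c2, lam1=0.0, lam2=0.0, r=0.0, elapsed=0.0):
    """Per-firm extraction (u1, u2) solving the symmetric interior conditions.

    Assumed forms: g(Q) = a - b*Q, cost c_i * u, and df/du = -1, so each
    condition reads  a - b*Q - b*u_i - c_i - lam_i * exp(r * elapsed) = 0,
    where elapsed = t - tau and Q = n1*u1 + n2*u2.
    """
    undiscount = math.exp(r * elapsed)   # inverse of exp(-r (t - tau))
    k1 = c1 + lam1 * undiscount          # full marginal cost, low-cost type
    k2 = c2 + lam2 * undiscount          # full marginal cost, high-cost type
    n = n1 + n2
    # Cramer's rule on:  b(n1+1) u1 + b n2 u2 = a - k1,  b n1 u1 + b(n2+1) u2 = a - k2
    u1 = ((n2 + 1) * (a - k1) - n2 * (a - k2)) / (b * (n + 1))
    u2 = ((n1 + 1) * (a - k2) - n1 * (a - k1)) / (b * (n + 1))
    return u1, u2


u1, u2 = interior_equilibrium(a=10.0, b=1.0, n1=1, n2=1, c1=1.0, c2=2.0)
print(u1, u2)
```

With one firm of each type and zero shadow values, the formulas collapse to the familiar asymmetric Cournot duopoly quantities (a − 2c1 + c2)/(3b) and (a − 2c2 + c1)/(3b). The result is meaningful only when both rates come out nonnegative; a negative value signals a corner solution that this sketch does not handle.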
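The expected profits of firm $j^1 \in N_1$, which has cost advantages, can be expressed as
\[
\begin{aligned}
V^{(\tau)j^1}(\tau, x_\tau) = E_\tau\bigg\{ &\int_\tau^T \bigg[ g\Big(\sum_{h \in N_1} \phi_h^*(s, x(s)) + \sum_{\ell \in N_2} \phi_\ell^*(s, x(s))\Big)\,\phi_{j^1}^*(s, x(s)) \\
&\qquad - c^{j^1}\big[\phi_{j^1}^*(s, x(s)), x(s)\big] \bigg] \exp[-r(s - \tau)]\,ds + \exp[-r(T - \tau)]\,q^{j^1}[x(T)] \bigg\},
\end{aligned}
\]
for $j^1 \in N_1$. The expected profits of firm $j^2 \in N_2$, which has cost disadvantages, can be expressed as
\[
\begin{aligned}
V^{(\tau)j^2}(\tau, x_\tau) = E_\tau\bigg\{ &\int_\tau^T \bigg[ g\Big(\sum_{h \in N_1} \phi_h^*(s, x(s)) + \sum_{\ell \in N_2} \phi_\ell^*(s, x(s))\Big)\,\phi_{j^2}^*(s, x(s)) \\
&\qquad - c^{j^2}\big[\phi_{j^2}^*(s, x(s)), x(s)\big] \bigg] \exp[-r(s - \tau)]\,ds + \exp[-r(T - \tau)]\,q^{j^2}[x(T)] \bigg\},
\end{aligned}
\]
for $j^2 \in N_2$, where
\[ dx(s) = f\Big[s, x(s), \sum_{j^1 \in N_1} \phi_{j^1}^*(s, x(s)) + \sum_{j^2 \in N_2} \phi_{j^2}^*(s, x(s))\Big]\,ds + \sigma[s, x(s)]\,dz(s), \qquad x(\tau) = x_\tau \in X. \]
The dynamic oligopoly model presented above is an extension of the dormant-firm duopoly model in Yeung (2005).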
11.2 Subgame Consistent Cartel
Assume that the firms in the oligopoly agree to form a cartel to restrain output and enhance their expected profits.
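11.2.1 Pareto Optimal Output Path
To achieve a group optimum, these firms are required to solve the following expected joint profit maximization problem:
\[
\begin{aligned}
\max_{u_1, u_2, \ldots, u_n} E_{t_0}\bigg\{ &\int_{t_0}^{T} \bigg[ g\Big(\sum_{h \in N_1} u_h(s) + \sum_{\ell \in N_2} u_\ell(s)\Big)\Big(\sum_{h \in N_1} u_h(s) + \sum_{\ell \in N_2} u_\ell(s)\Big) \\
&\qquad - \Big(\sum_{h \in N_1} c^h[u_h(s), x(s)] + \sum_{\ell \in N_2} c^\ell[u_\ell(s), x(s)]\Big) \bigg] \exp[-r(s - t_0)]\,ds \\
&+ \exp[-r(T - t_0)]\Big(\sum_{h \in N_1} q^h[x(T)] + \sum_{\ell \in N_2} q^\ell[x(T)]\Big) \bigg\},
\end{aligned}
\tag{11.6}
\]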
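subject to (11.1). An optimal solution of the stochastic control problem in (11.1) and (11.6) can be characterized using Theorem A.5 in the Technical Appendixes as follows.
Corollary 11.2 A set of control strategies $\{\psi^*_{j^1}(t, x)$ for $j^1 \in N_1$ and $\psi^*_{j^2}(t, x)$ for $j^2 \in N_2\}$ provides a solution to the control problem in (11.1) and (11.6) if there exists a continuously twice differentiable function $W^{(t_0)}(t, x) : [t_0, T] \times R^m \to R$ satisfying the following partial differential equation:
\[
\begin{aligned}
&-W_t^{(t_0)}(t, x) - \frac{1}{2}\sum_{h,\zeta=1}^{m} \Omega^{h\zeta}(t, x)\,W_{x^h x^\zeta}^{(t_0)}(t, x) \\
&\quad = \max_{u_1, u_2, \ldots, u_n} \bigg\{ \bigg[ g\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big)\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) - \Big(\sum_{h \in N_1} c^h[u_h, x] + \sum_{\ell \in N_2} c^\ell[u_\ell, x]\Big) \bigg] \exp[-r(t - t_0)] \\
&\qquad\qquad + W_x^{(t_0)}(t, x)\, f\Big[t, x, \sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big] \bigg\}, \quad\text{and} \\
&W^{(t_0)}(T, x) = \exp[-r(T - t_0)]\Big[\sum_{h \in N_1} q^h(x) + \sum_{\ell \in N_2} q^\ell(x)\Big].
\end{aligned}
\tag{11.7}
\]
Conditions satisfying the indicated maximization in (11.7) include
\[
\begin{aligned}
&\bigg[ g\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) + g'\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big)\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) - \frac{\partial}{\partial u_{j^1}} c^{j^1}(u_{j^1}, x) \bigg] \exp[-r(t - t_0)] \\
&\qquad + W_x^{(t_0)}(t, x)\,\frac{\partial}{\partial u_{j^1}} f\Big[t, x, \sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big] \le 0, \qquad u_{j^1} \ge 0,
\end{aligned}
\]
and if $u_{j^1} > 0$, the equality sign must hold, for $j^1 \in N_1$;
\[
\begin{aligned}
&\bigg[ g\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) + g'\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big)\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) - \frac{\partial}{\partial u_{j^2}} c^{j^2}(u_{j^2}, x) \bigg] \exp[-r(t - t_0)] \\
&\qquad + W_x^{(t_0)}(t, x)\,\frac{\partial}{\partial u_{j^2}} f\Big[t, x, \sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big] \le 0, \qquad u_{j^2} \ge 0.
\end{aligned}
\tag{11.8}
\]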
If $u_{j^2} > 0$, the equality sign must hold, for $j^2 \in N_2$. Since $\frac{\partial}{\partial u_{j^1}} c^{j^1}(u_{j^1}, x) < \frac{\partial}{\partial u_{j^2}} c^{j^2}(u_{j^2}, x)$,
0, the equality sign must hold, for j 1 ∈ N1 ;
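\[
\begin{aligned}
&g\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) + g'\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big)\Big(\sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big) - \frac{\partial}{\partial u_{j^2}} c^{j^2}(u_{j^2}, x) \\
&\qquad + W_x(x)\,\frac{\partial}{\partial u_{j^2}} f\Big[x, \sum_{h \in N_1} u_h + \sum_{\ell \in N_2} u_\ell\Big] \le 0, \qquad u_{j^2} \ge 0.
\end{aligned}
\tag{11.39}
\]
If $u_{j^2} > 0$, the equality sign must hold, for $j^2 \in N_2$. Since $\frac{\partial}{\partial u_{j^1}} c^{j^1}(u_{j^1}, x)$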