The Sally SERIES
Pure and Applied UNDERGRADUATE TEXTS • 37

An Introduction to Game-Theoretic Modelling
Third Edition

Mike Mesterton-Gibbons
EDITORIAL COMMITTEE
Gerald B. Folland (Chair)
Steven J. Miller
Jamie Pommersheim
Serge Tabachnikov
2010 Mathematics Subject Classification. Primary 91-01, 91A10, 91A12, 91A22, 91A40, 91A80; Secondary 92-02, 92D50.
For additional information and updates on this book, visit www.ams.org/bookpages/amstext-37
Library of Congress Cataloging-in-Publication Data
Names: Mesterton-Gibbons, Mike, author.
Title: An introduction to game-theoretic modelling / Mike Mesterton-Gibbons.
Other titles: Introduction to game theoretic modelling
Description: Third edition. | Providence, Rhode Island : American Mathematical Society, [2019] | Series: Pure and applied undergraduate texts ; volume 37 | Includes bibliographical references and index.
Identifiers: LCCN 2019000060 | ISBN 9781470450298 (alk. paper)
Subjects: LCSH: Game theory. | Mathematical models. | AMS: Game theory, economics, social and behavioral sciences – Instructional exposition (textbooks, tutorial papers, etc.). msc | Game theory, economics, social and behavioral sciences – Game theory – Noncooperative games. msc | Game theory, economics, social and behavioral sciences – Game theory – Cooperative games. msc | Game theory, economics, social and behavioral sciences – Game theory – Evolutionary games. msc | Game theory, economics, social and behavioral sciences – Game theory – Game-theoretic models. msc | Game theory, economics, social and behavioral sciences – Game theory – Applications of game theory. msc | Biology and other natural sciences – Research exposition (monographs, survey articles). msc | Biology and other natural sciences – Genetics and population dynamics – Animal behavior. msc
Classification: LCC QA269 .M464 2019 | DDC 519.3–dc23
LC record available at https://lccn.loc.gov/2019000060
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given.

Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2019 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights except those granted to the United States Government.
Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at https://www.ams.org/

10 9 8 7 6 5 4 3 2 1    24 23 22 21 20 19
To the memory of my father-in-law Bob, who took pleasure in analyzing games
Contents

Preface  xi
Acknowledgments  xiii

Agenda  1

Chapter 1. Community Games  9
  1.1. Crossroads: a motorist's dilemma  9
  1.2. Optimal reaction sets and Nash equilibria  12
  1.3. Four Ways: a motorist's trilemma  20
  1.4. Store Wars: a continuous game of prices  26
  1.5. Store Wars II: a three-player game  34
  1.6. Contests as games. The paradox of power  41
  1.7. A peek at the extensive form  47
  1.8. Max-min strategies  50
  1.9. Commentary  51
  Exercises 1  53

Chapter 2. Population Games  57
  2.1. Crossroads as a population game  57
  2.2. Evolutionarily stable strategies  63
  2.3. Crossroads as a continuous population game  66
  2.4. Hawk-Dove games  70
  2.5. More on evolutionarily stable strategies  76
  2.6. A continuous Hawk-Dove game  82
  2.7. Multiple deviation. Population dynamics  87
  2.8. Discrete population games. Multiple ESSs  90
  2.9. Continuously stable strategies  96
  2.10. State-dependent dynamic games  100
  2.11. Commentary  108
  Exercises 2  111

Chapter 3. Cooperative Games in Strategic Form  115
  3.1. Unimprovability or group rationality  116
  3.2. Necessary conditions for unimprovability  121
  3.3. The Nash bargaining solution  125
  3.4. Independent versus correlated strategies  128
  3.5. Commentary  132
  Exercises 3  132

Chapter 4. Cooperative Games in Nonstrategic Form  135
  4.1. Characteristic functions and reasonable sets  136
  4.2. Core-related concepts  141
  4.3. A four-person car pool  144
  4.4. Log hauling: a coreless game  147
  4.5. Antique dealing. The nucleolus  150
  4.6. Team long-jumping. An improper game  158
  4.7. The Shapley value  160
  4.8. Simple games. The Shapley–Shubik index  163
  4.9. Coalition formation: a nonstrategic model  165
  4.10. Commentary  170
  Exercises 4  171

Chapter 5. Cooperation and the Prisoner's Dilemma  175
  5.1. A game of foraging among oviposition sites  177
  5.2. Tit for tat: champion reciprocal strategy  180
  5.3. Other reciprocal strategies  183
  5.4. Dynamic versus static interaction  193
  5.5. Stability of a nice population: static case  197
  5.6. Stability of a nice population: dynamic case  198
  5.7. Mutualism: common ends or enemies  201
  5.8. Much ado about scorekeeping  205
  5.9. The comedy of errors  207
  5.10. Commentary  209
  Exercises 5  212

Chapter 6. Continuous Population Games  215
  6.1. Sex allocation. What is the evolutionarily stable sex ratio?  216
  6.2. Damselfly duels: a war of attrition  217
  6.3. Games among kin versus games between kin  224
  6.4. Information and strategy: a mating game  228
  6.5. Competition over divisible resources: a Hawk-Dove game  232
  6.6. Partitioning territory: a landmark game  237
  6.7. The influence of contests on clutch size  248
  6.8. Models of vaccination behavior  256
  6.9. Stomatopod strife: a threat game  264
  6.10. Commentary  274
  Exercises 6  276

Chapter 7. Discrete Population Games  281
  7.1. Roving ravens: a recruitment game  281
  7.2. Cooperative wildlife management  288
  7.3. An iterated Hawk-Dove game  296
  7.4. Commentary  314
  Exercises 7  316

Chapter 8. Triadic Population Games  317
  8.1. Winner and loser effects  318
  8.2. Victory displays  330
  8.3. Coalition formation: a strategic model  341
  8.4. Commentary  355
  Exercises 8  357

Chapter 9. Appraisal  359

Appendix A. Bimatrix Games  363

Appendix B. Answers or Hints for Selected Exercises  365

Bibliography  373

Index  387
Preface
This is the third edition of an introduction to game theory and its applications from the perspective of an applied mathematician. It covers a range of concepts that have proven useful, and are likely to continue proving useful, for mathematical modelling in the life and social sciences. Its approach is heuristic, but systematic, and it deals in a unified manner with the central ideas of both classical and evolutionary game theory. In many ways, it is a sequel to my earlier work, A Concrete Approach to Mathematical Modelling [210],1 in which games were not discussed. The mathematical prerequisites for a complete understanding of the entire book are correspondingly modest: the standard calculus sequence, a rudimentary knowledge of matrix algebra and probability, a passing acquaintance with differential equations, some facility with a mathematical software package, such as Maple, Mathematica (which I used to draw all of the figures in this book) or MATLAB—nowadays, almost invariably implied by the first three prerequisites—and that intangible quantity, a degree of mathematical maturity.

But I have written the book to be read at more than one level. On the one hand, aided by over 90 figures (half of them new to this edition), those with a limited mathematical background can read past much of the mathematics and still understand not only the central concepts of game theory, but also the various assumptions that define the different models and the conclusions that can be drawn from those assumptions. On the other hand, for those with a strong enough background, the underlying mathematics is all there to be appreciated. So I hope that my book will have broad appeal, not only to mathematicians, but also to biologists and social scientists.2

Among other things, mathematical maturity implies an understanding that abstraction is the essence of modelling. Like any model, a game-theoretic model is a deliberate simplification of reality; however, a well designed game is also a useful simplification of reality. But the truth of this statement is more readily appreciated
1 Bold numbers in square brackets are references listed in the Bibliography, pp. 373–386.
2 Books that cover the prerequisite material well include Bodine et al. [31] and Neuhauser and Roper [244] from a biological perspective, and Kropko [164] from that of a social scientist.
after evidence in its favor has been presented. That evidence appears in Chapters 1–8. Further discussion of this point is deferred to Chapter 9.

Reactions to the first and second editions of this book have been both positive and pleasing. I haven't fixed what isn't broken. But I have paid attention to feedback, and I have extensively rewritten and restructured the book accordingly; in particular, Chapter 2 is almost entirely new.3 As in the previous editions, I aim to take my readers all the way from introductory material to the research frontier. I have therefore included a number of game-theoretic models that were not in the second edition because they are based on more recent work; this new material reflects my interest in the field of behavioral ecology,4 where game theory holds great promise. Again with the research frontier in mind, I have brought the end-of-chapter commentaries and the bibliography right up to date. By design, the references are still selective: it would be against the spirit of an introductory text to cite all potentially relevant published literature. Yet I have tried to make the vast majority of it easily traceable through judicious references to more recent work.

There exist several excellent texts on game theory, but their value is greatest to the mathematical purist. Practices that are de rigueur to a purist are often merely distracting to a modeller—for example, lingering over the elegant theory of zero-sum games (nonzero-sum conflicts are much more common in practice), or proving the existence of a Nash equilibrium in bimatrix games (for which the problem in practice may be to distinguish among a superabundance of such equilibria); or, more fundamentally, beginning with the most general possible formulation of a game and only later proceeding to specific examples (the essence of modelling is rather the opposite). Such practices are therefore honored in the breach. Instead, and as described more fully in the agenda that follows, the emphasis is on concrete examples, and the direction of pedagogy throughout the book is predominantly from specific to general. This downplaying of rigor and generality facilitates the appropriate emphasis on the subtle process of capturing ideas and giving them mathematical substance—that is, the process of modelling. In that regard, I continue to hope that my book not only helps to make game-theoretic models accessible, but also conveys both their power and their scope in a wide variety of applications.
3 I have also corrected all known errors (and tried hard to avoid introducing any new ones), and I have added several new exercises. Brief answers or hints for only about a third of the exercises appear at the end of the book; however, a manual containing complete solutions to almost all of the exercises—as well as 13 additional problems and 16 additional figures—is available to any instructor. Please send mail to [email protected] for more information.
4 See p. 217.
Acknowledgments
I am very grateful to my daughter Alex for her delightfully arresting cover art, whimsically capturing the idea of strategic behavior among animals; to Mark Briffa, Jonathan Green, and the anonymous reviewers for their very helpful comments on the manuscript; to my former game theory students, too numerous to mention individually, for their feedback over the years; to the American Mathematical Society staff for their professionalism, especially to Eko Hironaka and Marcia Almeida for their encouragement and guidance, and to Jennifer Wright Sharp for expert editorial production; and to the Simons Foundation for supporting my work on game-theoretic modelling. Above all else, I am grateful to the Mesterton-Gibbons family—Karen, Alex, Sarah and Sam—for their constant moral support. They are the reason that I have a wonderful life.
Agenda
This is a book about mathematical modelling of strategic behavior, defined as behavior that arises whenever the outcome of an individual's actions depends on actions to be taken by other individuals. The individuals may be either human or nonhuman beings, the actions either premeditated or instinctive. Thus models of strategic behavior are applicable in both the social and the natural sciences. Examples of humans interacting strategically include store managers setting prices (the number of customers who buy at one store's price will depend on the price at other stores in the neighborhood) and drivers negotiating a 4-way junction (whether it is advantageous for a driver to assume right of way depends on whether the driver facing her concedes right of way).1 Examples of nonhumans interacting strategically include spiders disputing a territory (the risk of injury to one animal from being an aggressor will depend on whether the other is prepared to fight), insects foraging for sites at which to lay eggs (the number of one insect's eggs that mature into adults at a given site, where food for growth is limited, will depend on the number of eggs laid there by other insects), and mammals in a trio deciding between sharing a resource three ways or attempting to exclude one or both of their partners from it (whether it is advantageous to go it alone depends on whether the other two form an opposing coalition, as well as on the animals' strengths, and on the extent to which those strengths are amplified by coalition membership), and on a host of other factors, including chance.2 These and other strategic interactions will be modelled in detail, beginning in Chapter 1.3

1 We assume throughout that a human of indeterminate gender is female in odd-numbered chapters or appendices and male in even-numbered ones. This convention is simply the reverse of that which I adopted in A Concrete Approach to Mathematical Modelling [210] and obviates the continual use of "his or her" and "he or she" in place of epicene pronouns. For nonhumans, however, I adhere to the biologist's convention of using neuter pronouns—e.g., for the damselflies in §6.2 and the wasps in §6.7, even though their gender is known (male and female, respectively).
2 As whimsically suggested by the picture on the front cover, where the two smallest meerkats of the trio roll a backgammon die together, in coalition, while the largest meerkat rolls a die by itself.
3 Specifically, store managers setting prices will be modelled in §1.4 and §1.5; drivers negotiating a 4-way junction in §1.1, §1.3, §2.1, and §2.3; spiders disputing a territory in §7.3; insects foraging for oviposition sites in §5.1 and §5.7; and coalitions of two versus one in §4.9 and §8.3.
Table 0.1. Student achievement (number of satisfactory solutions) in mathematics as a function of effort
            Very hard (E = 5)   Quite hard (E = 3)   Hardly at all (E = 1)
Student 1          10                   8                      6
Student 2           8                   7                      5
Student 3           7                   5                      3
Student 4           7                   4                      3
Student 5           5                   4                      2
Student 6           4                   2                      1
To fix ideas, however, it will be helpful first to consider an example that, although somewhat fanciful, will serve to delineate the important distinction between strategic and nonstrategic decision making. Let us therefore suppose that the enrollment for some mathematics course is a mere six humans, and that grades for this course are based exclusively on answers to ten questions. Answers are judged to be either satisfactory or unsatisfactory, and the number of satisfactory solutions determines the final letter grade for the course—A, B, C, D, or F. In the usual way, A corresponds to 4 units of merit, and B, C, D, and F correspond to 3, 2, 1, and 0 units of merit, respectively. The students vary in motivation and intellectual ability, and all are capable of working very hard, or only quite hard, or hardly at all; but there is nothing random about student achievement as a function of effort, which (in these fanciful circumstances) is precisely defined as in Table 0.1. Thus, for example, Student 5 will produce five satisfactory solutions if she works very hard, but only four if she works quite hard; whereas Student 4 will produce seven satisfactory solutions if he works very hard, but only three if he works hardly at all.

The students have complete control over how much effort they apply, and so we refer to effort as a decision variable. Furthermore, for the sake of definiteness, we assume that working very hard corresponds to 5 units of effort, quite hard to 3, and hardly at all to 1. Thus, if we denote effort by E and merit by M, then working very hard corresponds to E = 5; obtaining the letter grade A corresponds to M = 4; and similarly for the other values of E and M.

Let us now suppose that academic standards are absolute, i.e., the number of satisfactory solutions required for each letter grade is prescribed in advance. Then no strategic behavior is possible. This doesn't eliminate scope for decision making—quite the contrary. If, for example, 9 or 10 satisfactory solutions were required for an A, 7 or 8 for a B, 5 or 6 for a C, and 3 or 4 for a D, then Student 3 would earn 3 units of merit for E = 5, 2 for E = 3, and only 1 for E = 1. If she wished to maximize merit per unit of effort, or M/E, then she would still have to solve a simple optimization problem, namely, to determine the maximum of 3 ÷ 5 = 0.6, 2 ÷ 3 = 0.67, and 1 ÷ 1 = 1. The answer, of course, is 1, corresponding to E = 1: to maximize M/E, Student 3 should hardly work at all. Nevertheless, such a decision would not be strategic because its outcome would depend solely on the individual concerned; it would not depend in any way on the behavior of other students.

The story is very different, however, if academic standards are relative, i.e., if letter grades depend on collective student achievement. To illustrate, let s denote
Table 0.2. The grading scheme
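To make the absolute-standards calculation above concrete, here is a minimal Python sketch (an illustration, not part of the text) that reproduces it from the data of Table 0.1: it tabulates M/E for Student 3 under the grade thresholds just quoted and confirms that E = 1 maximizes merit per unit of effort.

```python
# A minimal sketch (not from the text) of the nonstrategic optimization
# described above, using the data of Table 0.1 and the absolute grading
# scheme: 9-10 solutions -> A, 7-8 -> B, 5-6 -> C, 3-4 -> D, fewer -> F.

solutions = {  # Table 0.1: student -> {effort E: satisfactory solutions}
    1: {5: 10, 3: 8, 1: 6},
    2: {5: 8, 3: 7, 1: 5},
    3: {5: 7, 3: 5, 1: 3},
    4: {5: 7, 3: 4, 1: 3},
    5: {5: 5, 3: 4, 1: 2},
    6: {5: 4, 3: 2, 1: 1},
}

def merit(s):
    """Units of merit M earned by s satisfactory solutions."""
    if s >= 9: return 4  # A
    if s >= 7: return 3  # B
    if s >= 5: return 2  # C
    if s >= 3: return 1  # D
    return 0             # F

student = 3
for E in (5, 3, 1):
    M = merit(solutions[student][E])
    print(f"E = {E}: M = {M}, M/E = {M/E:.2f}")
best = max((5, 3, 1), key=lambda E: merit(solutions[student][E]) / E)
print(f"M/E is maximized at E = {best}")  # E = 1: hardly work at all
```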
5 In either case, a gain to one player means a loss to the other; the games are equivalent because subtracting the same constant from (or adding the same constant to) every payoff in a player’s matrix has no strategic effect—it cannot affect the conditions for Nash equilibrium, which we derive in §1.2.
1.2. Optimal reaction sets and Nash equilibria
Then whether San chooses W or G is quite irrelevant because
(1.7) −δ − τ2/2 > −τ2,
and so it follows immediately from Table 1.1 that Nan's best strategy is to hit the gas: every element in the first row of her payoff matrix is greater than the corresponding element in the second row of her payoff matrix. We say that strategy G dominates strategy W for Nan, and that G is a dominant strategy for Nan. More generally, if A is Player 1's payoff matrix (defined at the end of §1.1), pure strategy i is said to dominate pure strategy k for Player 1 if aiq ≥ akq for all q = 1, . . . , m2 and aiq > akq for some q; if i dominates k for all k ≠ i, then i is called a dominant strategy. Dominance is strong if the above inequalities are all strictly satisfied, and otherwise (i.e., if even one inequality is not strictly satisfied) weak. Thus, in particular, (1.7) implies that G is strongly dominant for Player 1. Similarly, if B is Player 2's payoff matrix, then pure strategy j dominates pure strategy l for Player 2 if bpj ≥ bpl for all p = 1, . . . , m1 with bpj > bpl for some p; j is a dominant strategy if j dominates l for all l ≠ j. Again, dominance is weak unless all inequalities are strictly satisfied. Note that we speak of a strongly dominant strategy, rather than the strongly dominant strategy, because both players may have one. Clearly, if one strategy (weakly or strongly) dominates another, then the other is (weakly or strongly) dominated by the one.

In practice, if (1.6) is used to define a slow San, then we could interpret our model as yielding the advice: "If you think the driver across the road is a slowpoke, then put down your foot and go." Furthermore, if (1.6) is used to define a slow San, then it might well be appropriate to define an intermediate San by
(1.8) 2δ > τ2 > 2ε
and a fast San by
(1.9) 2δ > 2ε > τ2.
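The dominance comparisons just defined are mechanical and easy to automate. The following Python sketch (an illustration under hypothetical parameter values, not part of the text) tests whether one of Player 1's pure strategies dominates another, and whether the dominance is strong or weak; applied to Nan's payoff matrix with a slow San, it recovers (1.7).

```python
# Illustrative check of dominance between two of Player 1's pure strategies.
# A[i][q] is Player 1's payoff when she plays i and Player 2 plays q.

def dominance(A, i, k):
    """Return 'strong', 'weak', or None according to whether pure strategy
    i dominates pure strategy k for Player 1 (strongly, weakly, not at all)."""
    ge_all = all(a >= b for a, b in zip(A[i], A[k]))
    gt_some = any(a > b for a, b in zip(A[i], A[k]))
    if not (ge_all and gt_some):
        return None
    return "strong" if all(a > b for a, b in zip(A[i], A[k])) else "weak"

# Nan's payoff matrix in Crossroads (rows G, W), with hypothetical values
# delta = 2, eps = 0.5, tau2 = 5, so that tau2 > 2*delta (San is slow).
delta, eps, tau2 = 2.0, 0.5, 5.0
A = [[-delta - tau2/2, 0.0],           # row G
     [-tau2, -eps - tau2/2]]           # row W
print(dominance(A, 0, 1))  # 'strong': G strongly dominates W, as (1.7) asserts
```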
Nan would be slow for τ1 > 2δ > 2ε, intermediate for 2δ > τ1 > 2ε, and fast for 2δ > 2ε > τ1. G is a dominant strategy for San if Nan is slow, because then −δ − τ1/2 > −τ1.6 So if both drivers are slow, then we can regard the strategy combination GG as the solution of our game: it contains a dominant strategy for each player. In terms of Table 0.3, dominant strategy is an adequate solution concept—in this very special case.

Suppose, however, that either (1.8) or (1.9) holds. Then (1.7) is false, and what's best for Nan is no longer independent of what San chooses because the second element in row 1 of Nan's payoff matrix is still greater than the second element in row 2, whereas the first element in row 1 is now smaller than the first in row 2. No pure strategy is obviously better for Nan. Then what should Nan choose? If sometimes G is better and sometimes W is better (depending on what San chooses), then shouldn't Nan's choice be in some sense a mixture of strategies

6 We do not consider that, say, τ2 and 2δ might actually be equal. In general, we ignore as fanciful the possibility that strict equality between parameters aggregating behavior—population parameters—can ever be meaningful unless justified by a prior assumption of symmetry. This assumption is discussed at length elsewhere under the guise of "genericity" [292, p. 30] or the "generic payoffs assumption" [49, pp. 21–23].
G and W ? One way to mix strategies would be to play G with probability u, and hence W with probability 1 − u. Accordingly, let N denote Nan’s choice of pure strategy. Then N is a random variable, with sample space {G, W }; Prob(N = G) = u and Prob(N = W ) = 1 − u. If Nan plays G with probability u, then we will say that she selects mixed strategy u, where u ∈ [0, 1].7 Similarly, if San plays G with probability v and hence W with probability 1 − v, that is, if Prob(S = G) = v and Prob(S = W ) = 1 − v, where the random variable S, again with sample space {G, W }, is San’s pure strategy, then we shall say that San selects mixed strategy v. Thus, for either player, the set of all feasible strategies or strategy set is the unit interval Δ = [0, 1]. When Player 1 selects strategy u and Player 2 selects strategy v, we refer to the row vector (u, v) as their mixed strategy combination; and we refer to the set of all feasible strategy combinations, which we denote by D, as the decision set (because this phrase is less cumbersome than “strategy combination set”). In this particular case, (1.10)
D = {(u, v) | 0 ≤ u, v ≤ 1} = {(u, v) | u, v ∈ Δ} = Δ × Δ,
the unit square of the Cartesian coordinate plane.8 Note that (1, 1), (1, 0), (0, 1), and (0, 0) are equivalent to GG, GW, WG, and WW, respectively, in §1.1.

But how could Nan and San arrange all this? Let's suppose that the spinning arrow depicted in Figure 1.2 is mounted on Nan's dashboard. When confronted by San, Nan gives the arrow a quick spin. If it comes to rest in the shaded sector of the disk, then she plays G; if it comes to rest in the unshaded sector, then she plays W. Thus selecting strategy u means having a disk with shaded sectoral angle 2πu, and changing one's strategy means changing the disk. What about the time required to spin the arrow—does it matter? Not if San also has a spinning arrow mounted on her dashboard and takes about as long to spin it (of course, San's shaded sector would subtend angle 2πv at the center). And if you think it's a bit far-fetched that motorists would drive around with spinning arrows on their dashboards, then you can think of Nan's spinning arrow as merely the analogue of a mental process through which she decides whether to go or wait at random—but in such a way that she goes, on average, fraction u of the time. Similarly for San. Note, incidentally, the important point that strategies are selected prior to interaction: the players arrive at the junction with their disks already shaded.

Let F1 denote the payoff to Nan. Then F1 is a random variable with sample space
(1.11) {−δ − τ2/2, 0, −τ2, −ε − τ2/2}.
If strategies are chosen independently, then
Prob(F1 = −δ − τ2/2) = Prob(N = G and S = G) = Prob(N = G) · Prob(S = G) = uv.

7 Here "∈" means "belongs to" (and in other contexts, e.g., (1.14), "belonging to"), and [a, b] denotes the set of all real numbers between a and b, inclusive.
8 In (1.10), the vertical bar means "such that" and "×" denotes Cartesian product. In general, {x | P} denotes the set of all x such that P is satisfied; for example, [a, b] = {x | a ≤ x ≤ b}. The Cartesian product of sets U and V is the set U × V = {(u, v) | u ∈ U, v ∈ V}, and more generally, the Cartesian product of A1, . . . , An is A1 × · · · × An = {(a1, . . . , an) | ai ∈ Ai, 1 ≤ i ≤ n}. Although the decision set is usually a Cartesian product, it does not have to be; for an exception, see §1.4, in particular (1.44).
Figure 1.2. Nan’s spinning arrow and disk
Similarly, Prob(F1 = 0) = u(1 − v), Prob(F1 = −τ2 ) = (1 − u)v, and
Prob(F1 = −ε − τ2/2) = (1 − u)(1 − v).
Let E denote expected value, and let f1(u, v) denote the expected value of Nan's payoff from strategy combination (u, v). We will refer to the expected value of a payoff as a reward. Thus Nan's reward is
f1(u, v) = E[F1] = −(δ + τ2/2) · Prob(F1 = −δ − τ2/2) + 0 · Prob(F1 = 0) − τ2 · Prob(F1 = −τ2) − (ε + τ2/2) · Prob(F1 = −ε − τ2/2)
or
(1.12a) f1(u, v) = ({ε + τ2/2} − {δ + ε}v)u + (ε − τ2/2)v − ε − τ2/2
after simplification. San's reward from the same combination is
(1.12b) f2(u, v) = ({ε + τ1/2} − {δ + ε}u)v + (ε − τ1/2)u − ε − τ1/2
after a similar calculation—or simply exchanging Nan's strategy and San's transit time in (1.12a) for San's strategy and Nan's transit time. Note that (1.12) can be written more compactly as
(1.13) f1(u, v) = (u, 1 − u)A(v, 1 − v)^T, f2(u, v) = (u, 1 − u)B(v, 1 − v)^T,
where A and B are the matrices in Tables 1.1 and 1.2, respectively.9 Equations (1.13) define a vector-valued function f = (f1, f2) from the decision set D into the f1-f2 plane. In terms of Table 0.3, this reward function is the third key ingredient of our game of Crossroads.

Both Nan and San would like their reward to be as large as possible. Unfortunately, Nan does not know what San will do (and San does not know what Nan will do) because this is a noncooperative game. Therefore, Nan should reason as follows: "I do not know which v San will pick—but for every v, I will pick u to make f1(u, v) as large as possible." In this way, Nan obtains a set of points in the unit square. Each of these points corresponds to a strategy combination (u, v) that is optimal for Nan, in the sense that for each v (over which Nan has no control) a corresponding u is a strategy that makes Nan's reward as large as possible. We will refer to this set of strategy combinations as Nan's optimal reaction set and denote it by R1.10 In mathematical terms, we have
(1.14a) R1 = {(u, v) ∈ D | f1(u, v) = max_{ū∈Δ} f1(ū, v)} = {(u, v) ∈ D | u = B1(v)},
where D is the decision set defined by (1.10) and B1(v) denotes a best reply to v for Player 1—a best reply, rather than the best reply, because there may be more than one. Note that R1 is obtained in practice by holding v constant and maximizing f1(u, v) as a function of a single variable u. For given v, a maximizing u is denoted by ū and will in general depend upon v.

Likewise, San should reason as follows: "I do not know which u Nan will pick, but for every u I will pick v to make f2(u, v) as large as possible." In this way, San obtains a set of points in the unit square. Each of these points corresponds to a strategy combination that is optimal for San in the sense that for each u (over which San has no control) a corresponding v is a strategy that makes San's reward as large as possible. We will refer to this set of strategy combinations as San's optimal reaction set, and denote it by R2. That is,
(1.14b) R2 = {(u, v) ∈ D | f2(u, v) = max_{v̄∈[0,1]} f2(u, v̄)} = {(u, v) ∈ D | v = B2(u)},
where B2(u) denotes a best reply to u for Player 2. Again, in practice, R2 is obtained by holding u constant and maximizing f2(u, v) as a function of a single variable v, and for given u, a maximizing v is denoted by v̄ and will in general depend upon u. Note the important point that each player can determine her optimal reaction set without any knowledge of the other player's reward.

Suppose, for example, that δ < min(τ1, τ2)/2; i.e., both drivers are slow. Then, because 0 ≤ v ≤ 1 implies that the coefficient of u in
(1.15) f1(u, v) = ({1 − v}ε + τ2/2 − δv)u + (ε − τ2/2)v − ε − τ2/2
is always positive, f1(u, v) is maximized for 0 ≤ u ≤ 1 by choosing ū = 1. Therefore, Nan's optimal reaction set is
(1.16) R1 = {(u, v) | u = 1, 0 ≤ v ≤ 1},
the edge of the unit square that runs between (1, 0) and (1, 1); see Figure 1.3(a) where R1 is represented by a thick solid line. Similarly, because 0 ≤ u ≤ 1 implies

9 In turn, (1.13) is a special case of (A.2).
10 "Optimal" for Nan is also "rational" for Nan. Correspondingly, a more common term for R1, in both classical [313] and evolutionary [347] game theory, is rational reaction set, and I followed the prevailing custom by using it in the second edition of this book. It can be argued, however, that even though rationality "does not necessarily imply the use of reason, when the term is used as part of the study of animal economics" [194, p. 167], "rational" is an unnecessary qualifier in the evolutionary context of Chapters 2 and 5–8 and, hence, is best avoided.
Figure 1.3. Optimal reaction sets when τ1 > 2δ, τ2 > 2δ
that the coefficient of v in
(1.17) f2(u, v) = ({1 − u}ε + τ1/2 − δu)v + (ε − τ1/2)u − ε − τ1/2
is always positive, f2(u, v) is maximized for 0 ≤ v ≤ 1 by choosing v̄ = 1. Therefore, San's optimal reaction set is
(1.18) R2 = {(u, v) | 0 ≤ u ≤ 1, v = 1},
the edge of the unit square that runs between (0, 1) and (1, 1); see Figure 1.3(b) where R2 is represented by a thin solid line. Of course, all that Figure 1.3 tells us is that the best strategy against a slow driver is G, which we knew long before we began to talk about mixed strategies. But something new will shortly emerge. Notice that the reaction sets R1 and R2 have a nonempty intersection R1 ∩ R2 = {(1, 1)}. The strategy combination (1, 1) that lies in both sets has the following property: if either player selects strategy 1, then the other cannot obtain a greater reward by selecting a strategy other than 1. In other words, neither player can increase her reward by a unilateral departure from the strategy combination (1, 1). By virtue of having this property, (1, 1) is said to be a Nash-equilibrium strategy combination. More generally, (u∗, v∗) ∈ D is said to be a Nash-equilibrium strategy combination, or simply a Nash equilibrium, of a noncooperative, two-player game whenever
(u∗ , v ∗ ) ∈ R1 ∩ R2
because if either player sticks rigidly to her Nash-equilibrium strategy (u∗ for Player 1, v ∗ for Player 2), then the other player cannot increase her reward by selecting a strategy other than her Nash-equilibrium strategy. That is, (1.20a)
f1 (u∗ , v ∗ ) ≥ f1 (u, v ∗ )
for all (u, v ∗ ) ∈ D and (1.20b)
f2 (u∗ , v ∗ ) ≥ f2 (u∗ , v)
for all (u∗, v) ∈ D, so that u∗ is a best reply to v∗ and v∗ is a best reply to u∗. If (1.20a) is satisfied with strict inequality for all u ≠ u∗ and (1.20b) is satisfied with strict inequality for all v ≠ v∗ (that is, if not only are u∗ and v∗ mutual best replies, but also each is uniquely the best reply to the other), then the Nash equilibrium is said to be strong; otherwise it is said to be weak.
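Conditions (1.20) can be checked numerically. The sketch below (illustrative only; the payoff matrices follow Tables 1.1 and 1.2 as implied by (1.12)–(1.13), and the parameter values are hypothetical, with both drivers slow) evaluates the rewards in the matrix form (1.13) and confirms that the strategy combination (1, 1) is a Nash equilibrium.

```python
# Illustrative numerical check of (1.20) for two slow drivers, using the
# matrix form (1.13): f1 = (u, 1-u) A (v, 1-v)^T and f2 = (u, 1-u) B (v, 1-v)^T.
# The parameter values are hypothetical, chosen so that tau1, tau2 > 2*delta.
import numpy as np

delta, eps, tau1, tau2 = 1.0, 0.2, 3.0, 4.0
A = np.array([[-delta - tau2/2, 0.0],
              [-tau2, -eps - tau2/2]])   # Nan's payoffs, as in (1.12a)
B = np.array([[-delta - tau1/2, -tau1],
              [0.0, -eps - tau1/2]])     # San's payoffs, as in (1.12b)

def f1(u, v):
    return np.array([u, 1 - u]) @ A @ np.array([v, 1 - v])

def f2(u, v):
    return np.array([u, 1 - u]) @ B @ np.array([v, 1 - v])

grid = np.linspace(0.0, 1.0, 101)
u_star, v_star = 1.0, 1.0   # candidate Nash equilibrium (1, 1)
ok1 = all(f1(u_star, v_star) >= f1(u, v_star) for u in grid)   # (1.20a)
ok2 = all(f2(u_star, v_star) >= f2(u_star, v) for v in grid)   # (1.20b)
print(ok1 and ok2)   # True: no unilateral deviation pays
```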
Figure 1.4. R1 and R2 for (a) τ1 < 2δ < τ2 , (b) τ2 < 2δ < τ1
The analogue of (1.20) for the discrete bimatrix game with matrices A, B (p. 12) is that the strategy combination (i, j) is a Nash equilibrium if
(1.21a) aij ≥ akj
for all k = 1, . . . , m1 and
(1.21b) bij ≥ bil
for all l = 1, . . . , m2. Again, the Nash equilibrium is strong if (1.21a) and (1.21b) hold with strict inequality for all k ≠ i and all l ≠ j, respectively; otherwise, it is weak. In terms of Table 0.3, the concept of Nash equilibrium is our solution concept for noncooperative community games—their fourth and last key ingredient.

If we were interested solely in finding the best pair of strategies for two slow drivers in Crossroads, however, then introducing the concept of Nash equilibrium would be like using a sledgehammer to burst a soap bubble. It is obvious from Figure 1.3 that (1, 1) is the only pair of strategies that two rational players would select, because (u, v) will be selected only if u lies in R1 and v in R2. But things get a bit more complicated when either driver is either fast or intermediate. To determine R1 and R2 in these circumstances, it will be convenient first to define parameters θ1 and θ2 by
(1.22) (δ + ε)θk = ε + τk/2, k = 1, 2.
Then from (1.12),
(1.23a) f1(u, v) = (δ + ε)(θ2 − v)u + (ε − τ2/2)v − ε − τ2/2,
(1.23b) f2(u, v) = (δ + ε)(θ1 − u)v + (ε − τ1/2)u − ε − τ1/2.
If τ1/2 < δ < τ2/2 (San slow, Nan fast or intermediate), then θ1 < 1 < θ2. So the u that maximizes f1(u, v) for 0 ≤ u ≤ 1 is still ū = 1 because ∂f1/∂u > 0 for all u ∈ [0, 1]. But the v that maximizes f2(u, v)—that is, Player 2's best reply to u—is now
(1.24) v̄ = B2(u) = { 1 if 0 ≤ u < θ1;  any v ∈ [0, 1] if u = θ1;  0 if θ1 < u ≤ 1 },
because ∂f2/∂v is positive, zero, or negative according to whether u < θ1, u = θ1, or u > θ1. Thus R1 is the same as before; whereas (u, v) ∈ R2 is equivalent to v = B2(u). So R2 consists of three straight-line segments as shown in Figure 1.4(a).11 We see that, if San has no knowledge of Nan's reward function f1, then any v ∈ [0, 1] could be optimal for San because, for all she knows, Nan could select the strategy u = θ1, to which any v ∈ [0, 1] is a best reply. If, on the other hand, San knows Nan's reward function, then the only optimal choice for San is v = 0 because only v = 0 is a best reply to u = 1. Of course, u = 1 is also a best reply to v = 0 because it's the best reply to anything. Thus (1, 0), the only point in the intersection of R1 and R2, is a Nash equilibrium. In terms of pure strategies, the Nash equilibrium is GW: G is a best reply to W (regardless), and W is a slow driver's best reply to an intermediate or fast driver's G.

Similarly, if τ2/2 < δ < τ1/2 or θ2 < 1 < θ1 (Nan slow, San fast or intermediate) then, either from symmetry or proceeding as above (Exercise 1.3), we find that v̄ = 1 maximizes f2(u, v) because ∂f2/∂v is strictly positive; whereas the u that maximizes f1(u, v) for 0 ≤ u ≤ 1 (that is, Player 1's best reply to v) is
(1.25) ū = B1(v) = { 1 if 0 ≤ v < θ2;  any u ∈ [0, 1] if v = θ2;  0 if θ2 < v ≤ 1 },
because ∂f1/∂u is positive, zero, or negative according to whether v < θ2, v = θ2, or v > θ2. Thus R1 and R2 are as shown in Figure 1.4(b). If Nan has no knowledge of San's reward function f2, then any u ∈ [0, 1] could be optimal for Nan because, for all she knows, San could select the strategy v = θ2, to which any u ∈ [0, 1] is a best reply. If, on the other hand, Nan knows San's reward function, then the only optimal choice for Nan is u = 0 because only u = 0 is a best reply to v = 1. Because v = 1 is also a best reply to u = 0, then (0, 1), the only point in the intersection of R1 and R2, is a Nash equilibrium. In terms of pure strategies, this time the Nash equilibrium is WG, but the interpretation is otherwise the same: G is a best reply to W, and W is a slow driver's best reply to an intermediate or fast driver's G. We see that the concept of Nash equilibrium depends crucially on each player knowing the other player's reward (whereas the concept of optimal reaction set does not). If such is the case, then it is customary to say that the players have complete information.
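The best replies (1.24) and (1.25) translate directly into code. This sketch (illustrative; the parameter values are hypothetical, with τ1 < 2δ < τ2, and the set-valued case is returned as a string flag) composes B1 and B2 to exhibit (1, 0) as a pair of mutual best replies.

```python
# Illustrative implementation of the best-reply correspondences (1.24)-(1.25).
# Hypothetical parameter values with tau1 < 2*delta < tau2 (San slow, Nan not),
# so that theta1 < 1 < theta2.
delta, eps, tau1, tau2 = 2.0, 0.5, 3.0, 5.0
theta1 = (eps + tau1/2) / (delta + eps)   # (1.22), here 0.8
theta2 = (eps + tau2/2) / (delta + eps)   # (1.22), here 1.2

def B2(u):
    """San's best reply to u, from (1.24)."""
    if u < theta1:
        return 1.0
    if u > theta1:
        return 0.0
    return "any v in [0, 1]"

def B1(v):
    """Nan's best reply to v, from (1.25)."""
    if v < theta2:
        return 1.0
    if v > theta2:
        return 0.0
    return "any u in [0, 1]"

# Because theta2 > 1, B1(v) = 1 for every v in [0, 1]; and San's best reply
# to u = 1 is v = 0. So u = 1 and v = 0 are mutual best replies: the Nash
# equilibrium is (1, 0), i.e., GW in pure strategies.
print(B1(0.0), B2(1.0))   # 1.0 0.0
```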
Provided each player has knowledge of the other's reward, and despite a lack of explicit cooperation between the players, a Nash equilibrium is self-enforcing in the sense that neither player has a unilateral incentive to depart from it. Consider, however, the case in which neither driver is slow, so that max(τ1, τ2) < 2δ, or max(θ1, θ2) < 1. The u that maximizes f1(u, v) for 0 ≤ u ≤ 1 (that is, Player 1's best reply to v) is now still B1(v) defined by (1.25), and the v that maximizes f2(u, v) (that is, Player 2's best reply to u) is again B2(u) defined by (1.24). That is, (u, v) ∈ R1 is equivalent to u = B1(v) and (u, v) ∈ R2 is equivalent to v = B2(u),

11 Note that B2 defined by (1.24) is not a function because B2(u) takes more than one value where u = θ1. In general, a function is a rule that assigns a unique element of a set V (called the range) to each element of a set U (called the domain), and a rule that assigns a subset of V to each element of U is called a multivalued function, or correspondence. Thus (1.24) defines only a correspondence with U = V = [0, 1]. Often, however, a best-reply correspondence is also a function; see, e.g., (2.30).
Table 1.3. Nash-equilibrium rewards in Crossroads
 (u, v)         f1(u, v)           f2(u, v)
 (1, 0)            0                 −τ1
 (θ1, θ2)    −(δ + τ2/2)θ2     −(δ + τ1/2)θ1
 (0, 1)          −τ2                  0

Figure 1.5. Optimal reaction sets when max(τ1, τ2) < 2δ
so that R1 and R2 are as shown in Figure 1.5. We observe at once that R1 ∩ R2 = {(1, 0), (θ1, θ2), (0, 1)}: there are three Nash equilibria. Then which do we regard as the solution? The rewards associated with the three Nash equilibria are given in Table 1.3. You can readily show that
(1.26) −(δ + τk/2)θk − (−τk) = (2δ − τk)(τk − 2ε)/(4(δ + ε)), k = 1, 2.
Thus (1, 0) is always the best Nash equilibrium for Nan, and (θ1, θ2) is second or third best, according to whether 2δ > τ2 > 2ε (intermediate San) or 2δ > 2ε > τ2 (fast San). Likewise, (0, 1) is always the best Nash equilibrium for San, and (θ1, θ2) is second or third best according to whether 2δ > τ1 > 2ε (intermediate Nan) or 2δ > 2ε > τ1 (fast Nan). Even though θ1 is a best reply to θ2 and θ2 is a best reply to θ1, there is no reason to expect the players to select these strategies because for each player there is another combination of mutual best replies that yields a higher reward. But if Nan selects her best Nash-equilibrium strategy u = 1, and if San selects her best Nash-equilibrium strategy v = 1, then the resulting strategy combination (1, 1) belongs to neither player's optimal reaction set! Then which—if any—of the Nash equilibria should we regard as the solution of the game? We will return to this matter in Chapter 2.
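Table 1.3 and identity (1.26) are easy to verify numerically. The sketch below (illustrative; hypothetical parameter values with both drivers intermediate) tabulates the three equilibrium rewards and checks (1.26) for k = 2.

```python
# Illustrative check of Table 1.3 and identity (1.26), with hypothetical
# parameter values satisfying 2*delta > tau_k > 2*eps (both drivers
# intermediate).
delta, eps, tau1, tau2 = 3.0, 0.5, 4.0, 5.0
theta1 = (eps + tau1/2) / (delta + eps)   # (1.22)
theta2 = (eps + tau2/2) / (delta + eps)

# The three Nash-equilibrium rewards of Table 1.3:
equilibria = [((1.0, 0.0), 0.0, -tau1),
              ((theta1, theta2),
               -(delta + tau2/2)*theta2, -(delta + tau1/2)*theta1),
              ((0.0, 1.0), -tau2, 0.0)]
for (u, v), r1, r2 in equilibria:
    print(f"(u, v) = ({u:.3g}, {v:.3g}): f1 = {r1:.4f}, f2 = {r2:.4f}")

# Identity (1.26) for k = 2: the amount by which Nan prefers (theta1, theta2)
# to (0, 1).
lhs = -(delta + tau2/2)*theta2 - (-tau2)
rhs = (2*delta - tau2)*(tau2 - 2*eps) / (4*(delta + eps))
print(abs(lhs - rhs) < 1e-12)   # True
```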
1.3. Four Ways: a motorist's trilemma

Nan and San's dilemma becomes even more intriguing if we allow a third pure strategy, denoted by C, in which each player's action is contingent upon that of the other.12 A player who adopts C will select G if the other player selects W, but she will select W if the other player selects G. Let us suppose that, if Nan is

12 Such a strategy is sometimes called a conditional strategy, although use of this term should perhaps be discouraged, essentially on the grounds of redundancy; see, e.g., [139, p. 77].
Table 1.4. Nan's payoff matrix in Four Ways

         G           W           C
G    −δ − τ/2        0           0
W      −τ        −ε − τ/2       −τ
C      −τ            0       −δ − τ/2
Table 1.5. San's payoff matrix in Four Ways

         G           W           C
G    −δ − τ/2       −τ          −τ
W       0        −ε − τ/2        0
C       0           −τ       −δ − τ/2
a C-strategist, then the first thing she does when she arrives at the junction is to wave San on; but if San replies by waving Nan on, then immediately Nan puts down her foot and drives away. If, on the other hand, San replies by hitting the gas, then Nan waits until San has traversed the junction. But what happens if San is also a C-strategist? As soon as they reach the junction, Nan and San both wave at one another. Nan interprets San's wave to mean that San wants to wait, so Nan drives forward; San interprets Nan's wave to mean that Nan wants to wait, so San also drives forward, and the result is the same as if both had selected strategy G. Thus if a G-strategist can be described as aggressive and a W-strategist as cooperative, then a C-strategist could perhaps be described as an impatient cooperator. For the sake of simplicity, let us assume that the game is symmetric, i.e., τ1 = τ2, and denote the common value of these two parameters by τ. Then Nan and San's payoff matrices, A and B, respectively, are as shown in Tables 1.4 and 1.5 (assuming the time for which two C-strategists wave at each other to be negligibly small, that is, very small compared to ε). As always, the rows correspond to strategies of Player 1 (Nan), and the columns correspond to strategies of Player 2 (San). Thus, the entry in row i and column j is the payoff to the player whose payoffs are stored in the matrix if Player 1 selects strategy i and Player 2 selects strategy j. Because the game is symmetric, B is just the transpose of A. To distinguish this game from Crossroads, we will refer to it as Four Ways. If the drivers are so slow that τ > 2δ or σ > 1, where
(1.27) σ = τ/2δ,
then their best strategy is to hit the gas because G dominates C and strictly dominates W for Nan from Table 1.4; similarly for San from Table 1.5. Thus G is a (weakly) dominant strategy for both players: neither has an incentive to depart from it, which makes strategy combination GG a Nash equilibrium. Furthermore, GG is the only Nash equilibrium when σ > 1 (see Exercise 1.2), and so we do
not hesitate to regard it as the solution of the game: when there is only one Nash equilibrium, there is no indeterminacy to resolve.13

The game becomes interesting, however, when τ < 2δ or σ < 1, which we assume for the rest of this section. As in Crossroads, no pure strategy is now dominant. We therefore consider mixed strategies. If Nan selects pure strategy G with probability u1 and pure strategy W with probability u2, then we shall say that Nan selects strategy u, where u = (u1, u2) is a two-dimensional row vector. Thus Nan selects pure strategy C with probability 1 − u1 − u2, where
(1.28a) 0 ≤ u1 ≤ 1, 0 ≤ u2 ≤ 1, 0 ≤ u1 + u2 ≤ 1.
So Nan's strategies correspond to points of a closed triangle in two-dimensional space. Similarly, if San selects G with probability v1 and W with probability v2, then we shall say that San selects strategy v, where v = (v1, v2). Because San selects C with probability 1 − v1 − v2, we have
(1.28b) 0 ≤ v1 ≤ 1, 0 ≤ v2 ≤ 1, 0 ≤ v1 + v2 ≤ 1.
Subsequently, we shall use Δ to denote the closed triangle in two-dimensional space defined either as the set of all points u satisfying (1.28a) or as the set of all points v satisfying (1.28b): Δ is the same strategy set, regardless of whether we use u or v to label a point in it. If Nan selects u ∈ Δ and San selects v ∈ Δ, then they jointly select strategy combination (u, v), where (u, v) = (u1, u2, v1, v2) is a four-dimensional vector.14 The sample space of N, Nan's choice of pure strategy, is now {G, W, C} instead of {G, W}; Prob(N = G) = u1, Prob(N = W) = u2, and Prob(N = C) = 1 − u1 − u2. San's choice of pure strategy, S, has the same sample space, but with Prob(S = G) = v1, Prob(S = W) = v2, and Prob(S = C) = 1 − v1 − v2. The payoff to Nan, F1, now has sample space {−δ − τ/2, 0, −τ, −ε − τ/2}, and if strategies are still chosen independently, then
Prob(F1 = −δ − τ/2) = Prob(N = G, S = G or N = C, S = C) = Prob(N = G, S = G) + Prob(N = C, S = C) = Prob(N = G) · Prob(S = G) + Prob(N = C) · Prob(S = C) = u1v1 + (1 − u1 − u2)(1 − v1 − v2).
Similarly,
Prob(F1 = 0) = u1v2 + u1(1 − v1 − v2) + (1 − u1 − u2)v2,
Prob(F1 = −τ) = u2v1 + u2(1 − v1 − v2) + (1 − u1 − u2)v1,

13 Even if there were more than one Nash equilibrium, there would be no indeterminacy if all combinations of Nash-equilibrium strategies yielded the same payoffs. This equivalence holds in general only for zero-sum games; see, for example, Owen [257] or Wang [354]. For an example of a zero-sum game, see Exercise 1.30.
14 If u = (u1, u2) ∈ Δ and v = (v1, v2) ∈ Δ are both two-dimensional row vectors, then strictly (u, v) ∈ D = Δ × Δ is a two-dimensional row vector of two-dimensional row vectors. But we prefer to think of strategy combination (u, v) as a single four-dimensional row vector (u1, u2, v1, v2), whose first two components are Player 1's strategy and whose last two are Player 2's, because we can then write Player i's reward as fi(u, v) = fi(u1, u2, v1, v2) instead of fi(u, v) = fi((u1, u2), (v1, v2)). Preferring to avoid such a cumbersome notation, we ignore whatever claims it may have to being more technically correct, whenever it is convenient to do so, e.g., in §2.4 (p. 76) and in §7.3 (p. 308).
and Prob(F1 = −ε − τ/2) = u2v2. Thus Nan's reward from the mixed strategy combination (u, v) is
f1(u, v) = E[F1] = −(δ + τ/2) · Prob(F1 = −δ − τ/2) + 0 · Prob(F1 = 0) − τ · Prob(F1 = −τ) − (ε + τ/2) · Prob(F1 = −ε − τ/2)
or, after simplification,
(1.29a) f1(u, v) = −(2δv1 + {δ + τ/2}{v2 − 1})u1 − ({δ − τ/2}{v1 − 1} + {δ + ε}v2)u2 + (δ − τ/2)v1 + (δ + τ/2)(v2 − 1).
Similarly, San's reward from the strategy combination (u, v) is
(1.29b) f2(u, v) = −(2δu1 + {δ + τ/2}{u2 − 1})v1 − ({δ − τ/2}{u1 − 1} + {δ + ε}u2)v2 + (δ − τ/2)u1 + (δ + τ/2)(u2 − 1).
By virtue of symmetry,
(1.30) f2(u, v) = f1(v, u)
for all u and v satisfying (1.28). Note that (1.29) can be written more compactly as
(1.31) f1(u, v) = (u1, u2, 1 − u1 − u2)A(v1, v2, 1 − v1 − v2)^T, f2(u, v) = (u1, u2, 1 − u1 − u2)B(v1, v2, 1 − v1 − v2)^T,
where A and B = A^T are defined by Tables 1.4 and 1.5.15 Although u and v are now vectors, as opposed to scalars, everything we have said about optimal reaction sets and Nash equilibria with respect to Crossroads remains true for Four Ways, provided only that we replace Δ = [0, 1] in (1.10) and (1.14) by Δ defined as the set of points satisfying (1.28). Thus the players' optimal reaction sets in Four Ways are still defined by
(1.32a) R1 = {(u, v) ∈ D | f1(u, v) = max_{ū∈Δ} f1(ū, v)},
(1.32b) R2 = {(u, v) ∈ D | f2(u, v) = max_{v̄∈Δ} f2(u, v̄)},
and the set of all Nash equilibria is still R1 ∩ R2. On the other hand, because the optimal reaction sets now lie in a four-dimensional space, as opposed to a two-dimensional space, we cannot locate the Nash equilibria by drawing diagrams equivalent to Figures 1.3–1.5. Instead, we proceed as follows. We first define dimensionless parameters
(1.33a) γ = ε/δ, α = (σ + γ)(σ + 1)/(1 + 2γ + σ²), β = (1 − σ)²/(1 + 2γ + σ²), ω = 2σ/(1 + σ),
and
(1.33b) θ = (ε + τ/2)/(ε + δ) = (σ + γ)/(1 + γ),

15 Moreover, (1.31) is another special case of (A.2), just like (1.13).

Figure 1.6. Subsets of the strategy set Δ defined by (1.28)
where σ is defined by (1.27). In view of (1.1), α, β, γ, σ, θ, and ω all lie between 0 and 1. If the coefficients of u1 and u2 in (1.29a) are both negative, then f1(u, v) is maximized by selecting u1 = 0 and u2 = 0, or u = (0, 0); moreover, (0, 0) is the only maximizing strategy for Player 1. If these coefficients are merely nonpositive, then there will be more than one maximizing strategy; nevertheless, u = (0, 0) will continue to be one of them. But the coefficient of u1 in (1.29a) is nonpositive when the point (v1, v2) lies on or above the line in two-dimensional space that joins the point (σ/ω, 0) to the point (0, 1); whereas the coefficient of u2 in (1.29a) is nonpositive when (v1, v2) lies on or above the line that joins (1, 0) to (0, 1 − θ). Thus, the coefficients of u1 and u2 in (1.29a) are both nonpositive when the point (v1, v2) lies in that part of Δ which corresponds to (the interior or boundary of) the triangle marked C in Figure 1.6. Let us denote by v^C = (v1^C, v2^C) any strategy for San that corresponds to a point in C. Then what we have shown is that all four-dimensional vectors of the form (0, 0, v1^C, v2^C) must lie in R1. Extending our notation in an obvious way, let v^A = (v1^A, v2^A) denote any strategy for San that corresponds to a point in A, let v^AC = (v1^AC, v2^AC) denote any strategy for San that corresponds to a point lying in both A and C, and so on. Then, by considering the various cases in which the coefficient of u1 or the coefficient of u2 or both in (1.29a) are nonpositive, nonnegative, or zero, it can be shown that all strategy combinations in Table 1.6 must lie in Nan's optimal reaction set, R1; see Exercise 1.6. Furthermore, if we repeat the analysis for f2 and San (as opposed to f1 and Nan), and if we let u^A = (u1^A, u2^A) denote any strategy for Nan that corresponds to a point in A, let u^AC = (u1^AC, u2^AC) denote any strategy for Nan that corresponds to a point in both A and C, and so on, then we find that all strategy combinations in Table 1.7 must lie in San's optimal reaction set R2. Indeed, in view of symmetry condition (1.30), it is hardly necessary to repeat the analysis.
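The geometry of Figure 1.6 can be probed numerically. The following sketch (illustrative; parameter values are hypothetical, with σ < 1) evaluates the coefficients of u1 and u2 in (1.29a) as functions of San's strategy and confirms that both vanish at v = (α, β), so that every u ∈ Δ is then a best reply for Nan—the last row of Table 1.6.

```python
# Illustrative sketch: the coefficients of u1 and u2 in (1.29a) as functions
# of San's strategy (v1, v2). At v = (alpha, beta) both coefficients vanish,
# so every u in the triangle is then a best reply for Nan (the last row of
# Table 1.6). Parameter values are hypothetical, with sigma < 1.
delta, eps, tau = 2.0, 0.5, 3.0
sigma, gamma = tau/(2*delta), eps/delta                       # (1.27), (1.33a)
alpha = (sigma + gamma)*(sigma + 1)/(1 + 2*gamma + sigma**2)
beta = (1 - sigma)**2/(1 + 2*gamma + sigma**2)

def coeffs(v1, v2):
    """Coefficients of u1 and u2 in Nan's reward (1.29a)."""
    c1 = -(2*delta*v1 + (delta + tau/2)*(v2 - 1))
    c2 = -((delta - tau/2)*(v1 - 1) + (delta + eps)*v2)
    return c1, c2

print(coeffs(0.0, 0.0))     # c1 > c2 > 0: against (0, 0), pure G is best
print(coeffs(alpha, beta))  # approximately (0.0, 0.0): Nan is indifferent
```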
Table 1.6. R1 for Four Ways

 u1    u2    v1       v2       constraints
 1     0     v1^A     v2^A
 0     1     v1^B     v2^B
 0     0     v1^C     v2^C
 u1    0     v1^AC    v2^AC    0 ≤ u1 ≤ 1
 0     u2    v1^BC    v2^BC    0 ≤ u2 ≤ 1
 u1    u2    v1^AB    v2^AB    u ∈ Δ, u1 + u2 = 1
 u1    u2    α        β        u ∈ Δ
Table 1.7. R2 for Four Ways

 u1       u2       v1    v2    constraints
 u1^A     u2^A     1     0
 u1^B     u2^B     0     1
 u1^C     u2^C     0     0
 u1^AC    u2^AC    v1    0     0 ≤ v1 ≤ 1
 u1^BC    u2^BC    0     v2    0 ≤ v2 ≤ 1
 u1^AB    u2^AB    v1    v2    v ∈ Δ, v1 + v2 = 1
 α        β        v1    v2    v ∈ Δ
Table 1.8. Nash equilibria for Four Ways

 u1    u2    v1    v2    constraints
 1     0     0     1
 0     1     1     0
 1     0     0     0
 0     0     1     0
 1     0     0     v2    0 < v2 < 1
 0     u2    1     0     0 < u2 < 1
 0     1     v1    0     ω ≤ v1 < 1
 u1    0     0     1     ω ≤ u1 < 1
 α     β     α     β
A strategy combination is a Nash equilibrium if, and only if, it appears both in Table 1.6 and in Table 1.7. Therefore, to find all Nash equilibria, we must match strategy combinations from Table 1.6 with strategy combinations from Table 1.7 in every possible way. For example, consider the first row of Table 1.6. It does not match the first, fourth, or sixth row of Table 1.7 because (1, 0) does not lie in A. It does not match the last row of Table 1.7, even for (v1 , v2 ) ∈ A, because α < 1 (or β > 0). Because (1, 0) lies in B and (0, 1) lies in A, however, we can match the first row of Table 1.6 with the second row of Table 1.7, and so (1, 0, 0, 1) is a Nash equilibrium. Likewise, because (1, 0) lies in C and (0, 0) in A, we can match
the first row of Table 1.6 with the third row of Table 1.7, so that (1, 0, 0, 0) is a Nash equilibrium too. Finally, we can match the first row of Table 1.6 with the fifth row of Table 1.7 to deduce that (1, 0, 0, v2) is a Nash equilibrium not only for v2 = 1 and v2 = 0, but also for 0 < v2 < 1, because then (0, v2) lies in A. The Nash equilibria we have found in this way are recorded in rows 1, 3, and 5 of Table 1.8. Repeating the analysis for the remaining six rows of Table 1.6, we obtain (see Exercise 1.7) an exhaustive list of Nash-equilibrium strategy combinations. They are recorded in Table 1.8. The first four rows of this table correspond to equilibria in pure strategies: rows 1 and 2 correspond to equilibria in which one player selects G and the other W; rows 3 and 4, to equilibria in which one player selects G and the other C. The remaining five rows correspond to equilibria in mixed strategies.16 We see that, although rows 1–4 and 9 of the table correspond to isolated equilibria, there are infinitely many equilibria of the other types. If you thought that having three equilibria to choose from in Crossroads was bad enough, then I wonder what you are thinking now. Which, if any, of all these infinitely many equilibria do we regard as the solution of Four Ways? We will return to this question in Chapter 2.
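Any row of Table 1.8 can be verified directly from the definition of Nash equilibrium by a grid search over the strategy triangle. The sketch below (illustrative; hypothetical parameter values with σ < 1) checks the symmetric mixed equilibrium (α, β, α, β) of the last row, using the rewards in the form (1.31). Because both coefficients in (1.29a) vanish at v = (α, β), each player's reward is constant along unilateral deviations, so this equilibrium is weak rather than strong.

```python
# Illustrative grid check that (alpha, beta, alpha, beta) -- the last row of
# Table 1.8 -- is a Nash equilibrium of Four Ways, using rewards in the form
# (1.31). Parameter values are hypothetical, with sigma < 1.
import numpy as np

delta, eps, tau = 2.0, 0.5, 3.0
sigma, gamma = tau/(2*delta), eps/delta
alpha = (sigma + gamma)*(sigma + 1)/(1 + 2*gamma + sigma**2)
beta = (1 - sigma)**2/(1 + 2*gamma + sigma**2)

A = np.array([[-delta - tau/2, 0.0, 0.0],
              [-tau, -eps - tau/2, -tau],
              [-tau, 0.0, -delta - tau/2]])   # Table 1.4
B = A.T                                        # Table 1.5

def f(u, v, M):
    """Reward (1.31): probability vectors over {G, W, C} around matrix M."""
    pu = np.array([u[0], u[1], 1 - u[0] - u[1]])
    pv = np.array([v[0], v[1], 1 - v[0] - v[1]])
    return pu @ M @ pv

star = (alpha, beta)
pts = [(x, y) for x in np.linspace(0, 1, 51)
       for y in np.linspace(0, 1, 51) if x + y <= 1]   # the triangle (1.28)
tol = 1e-9
ok1 = all(f(star, star, A) >= f(u, star, A) - tol for u in pts)
ok2 = all(f(star, star, B) >= f(star, v, B) - tol for v in pts)
print(ok1 and ok2)   # True: a (weak) Nash equilibrium
```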
1.4. Store Wars: a continuous game of prices

Although it is always reasonable to suppose that decision makers have only a finite number of pure strategies, when the number is large it is often convenient to imagine instead that the strategies form a continuum. Suppose, for example, that the price of some item could reasonably lie anywhere between $5 and $10. Then if a cent is the smallest unit of currency and if selecting a strategy corresponds to setting the price of the item, then the decision maker has a finite total of 501 pure strategies. Because this number is large, however, it may be preferable to suppose that the price in dollars can take any value between 5 and 10 (and round to two decimal places). Then rewards are calculated directly, i.e., without the intermediate step of calculating payoff matrices, and the game is said to be continuous, in order to distinguish it from matrix games like Crossroads and Four Ways.17 The definition of Nash equilibrium is not in the least affected, but whereas matrix games are guaranteed to have at least one Nash equilibrium, continuous games may have none at all.18 These ideas are illustrated by the following example.

A district or subdivision of an area of 50 square miles consists of two rectangles of land, as shaded in Figure 1.7: the smaller rectangle measures 15 square miles; the larger rectangle, 35. If we take the southwest corner of the subdivision to be the origin of a Cartesian coordinate system Oxy, with x increasing to the east and y to the north, then the subdivision contains all points (x, y) such that either 0 ≤ x ≤ 7, 0 ≤ y ≤ 5 or 7 ≤ x ≤ 10, 5 ≤ y ≤ 10. All roads through the subdivision

16 If one has no wish to distinguish pure strategies from mixed ones, then Table 1.8 can be reduced to just five rows by weakening the inequalities in rows 5 and 6.
17 To be sure, these games have a continuum of mixed strategies; however, we reserve the phrase "continuous game" for a game that has a continuum of strategies but is not also a bimatrix game—or, to be more precise, is not also the mixed extension of a discrete bimatrix game; see Appendix A, p. 364. Thus, on the one hand, there are games with a continuous reward function that we do not regard as continuous games, because they are also bimatrix games, and, on the other hand, there exist games that we think of as continuous games, even though their reward functions have isolated discontinuities on the decision set D—e.g., across a line in the unit square, as in Exercise 1.28.
18 For a proof that matrix games have at least one Nash equilibrium, see, for example, [354].
Figure 1.7. Battleground for Store Wars
run either from east to west or from north to south. There are two stores, one at (0, 0), the other at (7, 5). Each sells a product for which the daily demand is uniformly distributed over the 50 square miles, in the sense that customers are equally likely to live anywhere in the subdivision; the product might, for example, be bags of ice. If buyers select a store solely by weighing the price of the product against the cost of getting there (bags of ice at the first store are identical to those at the second, etc.), and if each store wishes to maximize revenue from the product in question, then how should prices be set? Because the best price for each store depends on the other store’s price, their decisions are interdependent; and if they do not communicate with one another before setting prices, then we have all the necessary ingredients for a noncooperative game. We call this game Store Wars.19 Let Player 1 be Nan, who is manager of the store at (0, 0), and let Player 2 be San, who is manager of the store at (7, 5).20 Let p1 be Nan’s price for the product, let p2 be San’s price, and let c be the cost per mile of travel to the store, assumed the same for all customers. Thus the round-trip cost of travel from Nan’s store to San’s store would be 24c—no matter how you went, because all roads through the subdivision run from east to west or from north to south. Clearly, if Nan’s price were to exceed this round-trip travel cost plus San’s price for the item in question, then Nan could never expect anyone to buy from her. Accordingly, we can safely assume that (1.34a)
p1 ≤ p2 + 24c.
Similarly, because nobody in the larger rectangle can be expected to buy from San if her price exceeds Nan’s by the round-trip travel cost between the stores, and assuming that San would like to attract at least some customers from the larger rectangle, we have (1.34b)
p2 ≤ p1 + 24c.
19 Store Wars was suggested by the Hotelling model described in Phlips [271, pp. 42–45]. Phlips assumes that prospective customers are uniformly distributed along a line, whereas Store Wars assumes, in effect, that they are nonuniformly distributed along a line. 20 If Nan were to live near San’s store and San were to live near Nan’s store, then we could easily explain why they keep meeting each other in Crossroads!
28
1. Community Games
Furthermore, there are upper and lower limits to the price that a store can charge for a product, and it will be convenient to write these as (1.34c)
p1 ≤ 4cα,
p2 ≤ 4cα,
(1.34d)
p1 ≥ 4cβ,
p2 ≥ 4cβ,
where α and β are dimensionless parameters.21 For the sake of simplicity, however, we assume throughout that β = 0; provided that β is sufficiently small, this assumption will not affect the principal results of our analysis.22 Also, we assume that α > 6, as in Figure 1.8. Note that (1.34) can be written more compactly as |p1 − p2 | ≤ 24c: the difference in prices does not exceed the cost of round-trip travel between the stores. Now, let (X, Y ) be the residential coordinates of the next customer for the product in question. Because all roads run either north and south or east and west, her distance from Nan’s store is |X| + |Y | and her distance from San’s store is |7 − X| + |5 − Y |. Thus, assuming that she selects a store solely by weighing the price of the product against the cost of travel from her residence (she doesn’t, for example, buy the product on her way home from work), this customer will buy from Nan if (1.35a)
p1 + 2c(|X| + |Y |) < p2 + 2c(|7 − X| + |5 − Y |);
whereas she will buy from San if (1.35b)
p1 + 2c(|X| + |Y |) > p2 + 2c(|7 − X| + |5 − Y |).
But X ≥ 0, Y ≥ 0; thus |X| + |Y | is the same thing as X + Y . Furthermore, the shape of the subdivision precludes either X > 7, Y < 5 or X < 7, Y > 5; therefore, |7 − X| + |5 − Y | is the same thing as |12 − X − Y |. So, if we had X + Y > 12 in (1.35a), then it would now imply p1 + 24c < p2 , which violates (1.34b). Accordingly, we can both assume that X + Y ≤ 12 in (1.35a) and rewrite it as p1 + 2c(X + Y ) < p2 + 2c(12 − X − Y ). Hence the next customer will buy from Nan if p2 − p1 + 6. (1.36a) X +Y < 4c Similarly, if X + Y ≤ 12, then the next customer will buy from San if p2 − p1 (1.36b) X +Y > + 6. 4c If, on the other hand, X + Y > 12, then the customer will certainly buy from San (and (1.35b) reduces to p1 + 24c > p2 ). So the next customer will buy from San either if X + Y > 12 or if X + Y ≤ 12 and (1.36b) is satisfied. But X + Y > 12 implies (1.36b) because the right-hand side of (1.36b) is less than or equal to 12, by virtue of (1.34b). Thus, in any event, the next customer will buy from San if 21 Note that 4c is the cost of driving around a square-mile block. Because c is a cost per unit length, we must multiply it by a distance (here 4) to make the right-hand side of each inequality a quantity with the dimensions of price. 22 In terms of the economist’s inverse demand curve, with quantity measured along the horizontal axis and price along the vertical axis, 4αc is the price at which the demand curve meets the vertical axis, whereas 4βc is simply the cost price of the item. Strictly, however, we ignore questions of supply and demand, or, if you prefer, we assume that demand is infinitely elastic at 4αc but infinitely inelastic at greater or lower prices.
1.4. Store Wars: a continuous game of prices
29
(1.36b) is satisfied. Of course, San’s monopoly over the smaller rectangle was built into the model when we assumed (1.34b). Because the next customer could live anywhere in the subdivision, X and Y are (continuous) random variables; hence so is X + Y . Let G denote its cumulative distribution function, i.e., define G(s) = Prob(X + Y ≤ s),
(1.37)
0 ≤ s ≤ 20,
and let F1 denote Nan’s payoff from the next customer. Then F1 is also a random variable, which in view of (1.36) is defined by
−p1 + 6, p1 if X + Y < p24c (1.38) F1 = −p1 + 6. 0 if X + Y > p24c Because F1 is a random variable, it cannot itself be maximized; instead we can maximize its expected value, which we shall denote by f1 , and define to be Nan’s reward. It will be convenient to make prices dimensionless, by scaling them with respect to 4c. Let us therefore define u and v by p2 p1 , v = , (1.39) u = 4c 4c where u is Nan’s strategy and v is San’s. Then, from (1.37)–(1.39), (1.40)
f1 (u, v) = E[F1 ] = p1 · Prob(X + Y < v − u + 6) + 0 · Prob(X + Y > v − u + 6) = 4cuG(v − u + 6);
of course, Prob(X + Y = v − u + 6) = 0, because X + Y is a continuous random variable.23 Similarly, San’s payoff is the random variable
0 if X + Y < v − u + 6, (1.41) F2 = 4cv if X + Y > v − u + 6, and her reward is (1.42)
f2 (u, v) = E[F2 ] = 4cv · Prob(X + Y > v − u + 6) = 4cv {1 − G(v − u + 6)} .
Note that, in view of (1.39), (1.34a) requires u ≤ v + 6, whereas (1.34b) requires v ≤ u+6. Thus, in view of (1.34c), the set of all feasible strategy combinations—the decision set—is (1.43) D = (u, v) | 0 ≤ u, v ≤ α, |u − v| ≤ 6 . It will be convenient to define three subsets of D by (1.44a) DA = (u, v) | u ≤ α, v ≥ 0, 1 ≤ u − v ≤ 6 , (1.44b) DB = (u, v) | 0 ≤ u, v ≤ α, |u − v| ≤ 1 , DC = (u, v) | u ≥ 0, v ≤ α, 1 ≤ v − u ≤ 6 , (1.44c) so that D = DA ∪ DB ∪ DC . For α = 10, D is depicted in Figure 1.8. The lighter shaded region is DB ; the darker region lies outside D. 23
See, for example, [210, pp. 523–524].
30
1. Community Games
Figure 1.8. The decision set D and a set of points containing R1 for α = 10; the dark shaded triangles lie outside D.
Figure 1.9. Calculation of G defined by (1.37); see text for discussion.
If we assume that customers are uniformly distributed throughout the subdivision, then G(s) is readily calculated with the help of Figure 1.9, because Prob(X + Y ≤ s) is just the fraction of the total area of the subdivision that lies below the line x + y = s. Suppose, for example, that 0 ≤ s ≤ 5. Then the dark shaded area in Figure 1.9(a) is 12 s2 ; so the fraction of total area below the line 1 2 x + y = s is 100 s (because the populated area is 50 square miles). Or suppose that 5 ≤ s ≤ 7. Then, similarly, the fraction of total area shaded dark in Figure 1.9(b) 1 s − 14 . Continuing in this manner, we readily find that is 10 ⎧ ⎪ ⎨ (1.45)
G(s) =
⎪ ⎩
7 10
−
1 2 100 s 2s−5 20 1 100 (12
− s)2
if 0 ≤ s ≤ 5, if 5 ≤ s ≤ 7, if 7 ≤ s ≤ 12.
1.4. Store Wars: a continuous game of prices
31
Because San has a monopoly over the upper rectangle in Figure 1.9, G(s) is not needed for s ≥ 12.24 We can now obtain the optimal reaction sets. We have R1 = (u, v) ∈ D | f1 (u, v) = max f1 (u, v) u : (u,v)∈D (1.46a) = {(u, v) ∈ D | u = B1 (v)}, where “:” means “such that”, and R2 = (u, v) ∈ D | f2 (u, v) = max f2 (u, v) v : (u,v)∈D (1.46b) = {(u, v) ∈ D | v = B2 (u)}, where D is defined by (1.43) and Bi denotes Player i’s best reply to the other player. First we find R1 . From (1.40), (1.42), and (1.45), ⎧ ⎪ (v − u + 6)2 if (u, v) ∈ DA , cu ⎨ (1.47a) f1 (u, v) = 5(2v − 2u + 7) if (u, v) ∈ DB , 25 ⎪ ⎩ 2 70 − (u − v + 6) if (u, v) ∈ DC and (1.47b)
⎧ ⎪(u − v + 4)(v − u + 16) if (u, v) ∈ DA , cv ⎨ f2 (u, v) = 5(2u − 2v + 13) if (u, v) ∈ DB , 25 ⎪ ⎩ 2 if (u, v) ∈ DC . 30 + (u − v + 6)
From (1.47), if (u, v) lies inside DA , implying v + 1 < u < min(v + 6, α) and hence v + 1 < u < v + 6), then (1.48)
∂f1 /∂u =
1 25 c(v
− u + 6)(v − 3u + 6)
is positive for u < and negative for u > 13 (v + 6). Thus if v + 1 ≥ 13 (v + 6), then f1 has its maximum on DA at u = v+1. If, on the other hand, 13 (v+6) ≥ v+1, then f1 has its maximum on DA at u = 13 (v + 6) because 13 (v + 6) ≤ min(v + 6, α).25 In other words, the maximum of f1 over the region DA occurs at u = uA (v), where
1 if 0 ≤ v ≤ 32 , 3v + 2 (1.49a) uA (v) = v+1 if 32 ≤ v ≤ α − 1. 1 3 (v + 6)
The curve u = uA is represented in Figure 1.8 by dotted lines. For any v ∈ [0, α−1], u = uA (v) is Nan’s best reply to v with (u, v) ∈ DA . If (u, v) ∈ DB or max(0, v − 1) ≤ u ≤ min(v + 1, α), then ∂f1 /∂u is positive for u < 14 (2v + 7) but negative for u > 14 (2v + 7). So the maximum of f1 over DB occurs at u = uB (v), where ⎧ ⎪ if 0 ≤ v ≤ 32 , ⎨ v+1 1 7 (1.49b) uB (v) = if 32 ≤ v ≤ 11 2v + 4 2 , ⎪ ⎩ ≤ v ≤ α. v−1 if 11 2 24 It will shortly transpire that, in effect, 0 ≤ s ≤ 5 corresponds to DA , 5 ≤ s ≤ 7 to DB and 7 ≤ s ≤ 12 to DC in Figure 1.8. 25 This inequality clearly holds if v + 6 ≤ α or v ≤ α − 6; whereas if α − 1 ≥ v > α − 6 (> 0, as assumed on p. 28), then the inequality reduces to v + 6 ≤ 3α, which again must hold because v ≤ α and α exceeds 6 (hence also 3).
32
1. Community Games
The curve u = uB is represented in Figure 1.8 by dashed lines. For any v ∈ [0, α], u = uB (v) is Nan’s best reply to v with (u, v) ∈ DB . Similarly (see Exercise 1.8), the maximum of f1 over DC occurs at u = uC (v), where
v−1 if 1 ≤ v ≤ 11 2 , (1.49c) uC (v) = 1 11 2 (v − 6) + 210 − 4 if 2 ≤ v ≤ α, 3 2v + and it is assumed (on p. 28) that α > 6. The curve u = uC is shown solid in Figure 1.8; although it appears to consist of two straight line segments, for v ≥ 11 2 it has a slight downward curvature. For any v ∈ [1, α], u = uC (v) is Nan’s best reply to v with (u, v) ∈ DC . Now, for any v ∈ [0, α], a comparison of the conditional best replies in (1.49) yields Nan’s unconditional best reply to v with (u, v) ∈ DA ∪ DB ∪ DC = D. From (1.46a) and Figure 1.8, we obtain
(1.50a)
B1 (v) =
⎧ ⎪ ⎨ ⎪ ⎩1 3 2v +
1 3v + 2 1 v + 74 2 (v − 6)2
if 0 ≤ v ≤ 32 , if 32 ≤ v ≤ 11 2 , 11 + 210 − 4 if 2 ≤ v ≤ α.
Equivalently, Nan’s optimal reaction set is (1.50b)
R1 =
(uA (v), v) | 0 ≤ v ≤
∪ (uB (v), v) | ∪ (uC (v), v) |
3 2 ≤v ≤ 11 2 ≤v ≤
3 2 11 2
α .
To verify (1.50), suppose, for example, that 0 ≤ v ≤ 32 . Then f1 is larger along u = uA (v) than elsewhere in DA , including the boundary with DB ; but because this boundary is where f1 is maximized on DB (for 0 ≤ v ≤ 32 ), f1 must be larger along u = uA (v) than elsewhere in both DA and DB , including its boundary with DC (when 1 ≤ v ≤ 32 ). But because this boundary is where f1 is maximized on DC (for 1 ≤ v ≤ 32 ), f1 must be larger along u = uA (v) than anywhere else in DA ∪ DB ∪ DC = D (for 0 ≤ v ≤ 32 ). Similarly for 32 ≤ v ≤ α. The result for α = 10 is sketched in Figure 1.11(a) as a thick solid curve. Note in particular that Nan’s optimal reaction to v = 0 would be u = 2. Thus, even if San were to give away the product (p2 = 0), Nan should still charge p1 = 8c for it because she would still attract customers who reside south or west of the line x + y = 4. Although R1 is connected —it’s all in one piece—connectedness is not a general property of optimal reaction sets. To see this, note that if α = 10 and if the maxima of f2 over subsets DA , DB , and DC of D occur where v = vA (u), v = vB (u), and v = vC (u), respectively, then from Exercise 1.9 we have
(1.51a)
vA (u) =
u−1 if 1 ≤ u ≤ 17 2 , 1 2 + 300 − 4 if 17 ≤ u ≤ 10 2u + (u − 6) 3 2
1.4. Store Wars: a continuous game of prices
33
Figure 1.10. The decision set D and a set of points containing R2 for α = 10; the dark shaded triangles lie outside D.
√ Figure 1.11. R1 and R2 for (a) α = 10 and (b) α = 2 10
because max(0, u − 6) ≤ v ≤ u − 1 for (u, v) ∈ DA ;
(1.51b)
⎧ ⎪ ⎨ u+1 1 13 vB (u) = 2u + 4 ⎪ ⎩ u−1
if 0 ≤ u ≤ 92 , if 92 ≤ u ≤ 17 2 , 17 if 2 ≤ u ≤ 10
34
1. Community Games
because max(0, u − 1) ≤ v ≤ min(u + 1, 10) for (u, v) ∈ DB ; and ⎧ ⎪ if 0 ≤ u ≤ 4, ⎨u + 6 (1.51c) vC (u) = 10 if 4 ≤ u ≤ 92 , ⎪ ⎩ u+1 if 92 ≤ u ≤ 9 because u + 1 ≤ v ≤ min(u + 6, 10) for (u, v) ∈ DC . The graphs of v = vA (u), v = vB (u), and v = vC (u) are depicted in Figure 1.10. Note that v = vC (u) is not strictly a function but rather a multivalued function or correspondence; it is double valued at u = 92 because v = 11 2 and v = 10 both maximize f2 (9/2, v). By analogy with (1.50), it follows that San’s best reply to u for (u, v) ∈ D is ⎧ ⎪ min(u + 6, 10) if 0 ≤ u ≤ 92 , ⎨ 1 B2 (u) = (1.52a) u + 13 if 92 ≤ u ≤ 17 4 2 , 2 ⎪ ⎩1 2 + 300 − 4 if 17 ≤ u ≤ 10. 2u + (u − 6) 3 2 Equivalently, San’s optimal reaction set is (1.52b) R2 = (u, vC (u)) | 0 ≤ u ≤ 92 ∪ (u, vB (u)) | 92 ≤ u ≤ 17 2 ∪ (u, vA (u)) | 17 2 ≤ u ≤ 10 . R2 is sketched in Figure 1.11(a) as a thin solid curve. Note that the maximum of 2 1 f2 92 , v = 25 c 30 + v − 21 2 for 11 ≤ v ≤ 10, namely, 121c 2 10 , occurs at both ends of the interval, and because 9 f2 2 , v is less than 121c at every point, R2 is disconnected along 10 intermediate 9 and , 10 but no points that lie in between. u = 92 ; it contains both 92 , 11 2 2 and R still intersect one another at the (only) Nash equilibrium Nevertheless, R 1 2 . If this is accepted as the solution of the noncooperative game, (u∗ , v ∗ ) = 92 , 11 2 then Nan’s price is p1 = 18c and San’s price is p2 = 22c from (1.39). This result is strongly dependent on the value we chose for α. Indeed α = 10 has a critical property: it is the largest value of α for which a Nash equilibrium exists. As α increases beyond 10, the left endpoint of the right-hand segment of R2 moves away from DB ∩ DC into the interior of DB , so that R1 ∩ R2 = ∅, the empty set. As α moves below 10, on the other hand, the same endpoint √ moves into the interior of DC , and there is a second critical value, namely, α = 2 10, at which R2 becomes connected; for this value of α, R1 and R2 are sketched in Figure 1.11(b). These results are best left as exercises, however; see Exercises 1.10–1.12.26
1.5. Store Wars II: a three-player game We could easily turn Store Wars into a three-player continuous noncooperative game by placing a third store, say Van’s, somewhere else in the subdivision, perhaps at the northeast corner; however, we prefer to devise an example of a three-player game by supposing instead that the interior of some circular island is uninhabitable (perhaps because of a volcano), so that all prospective customers for a certain product must 26 And for further illustration of the point that connectedness is not a general property of optimal reaction sets, see Figure 6.2.
1.5. Store Wars II: a three-player game
35
N θ
NV
V
π/3
NS
π
2π/3
VS S Figure 1.12. Map of battleground for Store Wars II
reside on the island’s circumference. To be specific, let us suppose that Nan’s store is at the most northerly point of the island and that Van’s store is east of Nan’s and one third of the way from Nan’s store to the most southerly point of the island, which is also the location of the third store, San’s; Nan is Player 1, Van is Player 2, and San is now Player 3. Let the radius of the island be a miles, and let aθ denote distance along the circumference, measured clockwise from the most northerly point. Then 0 ≤ θ < 2π, and the location of a customer’s residence is determined by her θ-coordinate, with Nan’s store at θ = 0, Van’s at θ = π/3, and San’s at θ = π; see Figure 1.12. We call this game Store Wars II. We will suppose that customers are uniformly distributed along the circumference. Thus if Θ denotes the θ-coordinate of a randomly chosen customer, then (1.53)
Prob(0 ≤ θ1 < Θ < θ2 < 2π) =
1 2π (θ2
− θ1 ).
For example, if N V denotes the event that Θ lies between 0 and π/3, V S the event that Θ lies between π/3 and π, and N S the event that Θ lies between π and 2π (see Figure 1.12), then (1.53) implies (1.54)
Prob(N V ) = 16 ,
Prob(V S) = 13 ,
Prob(N S) = 12 .
Let pi denote Player i’s price for the product in question, for i = 1, 2, 3. Then we shall assume, as in §1.4, that the difference in prices between adjacent stores does not exceed the round-trip cost of travel between them. Thus if travel costs c dollars per mile, then (1.55)
|p1 − p2 | ≤ 23 πac,
|p2 − p3 | ≤ 43 πac,
and |p1 − p3 | ≤ 2πac, which (1.55) implies.27 As in §1.4, there are lower and upper bounds on the prices: (1.56)
8πacβ ≤ pi ≤ 8πacα,
i = 1, 2, 3.
But again as in §1.4, we shall assume throughout that β = 0. 27
By the triangle inequality |δ1 + δ2 | ≤ |δ1 | + |δ2 | with δ1 = p1 − p2 , δ2 = p2 − p3 .
36
1. Community Games
Now, let Θ be the residential coordinate of the next customer (hence 0 ≤ Θ < 2π); and suppose, as in §1.4, that this customer selects a store solely by weighing the price of the product against the cost of travel from her residence. Then, in view of (1.55), she will always buy from one of the two stores between which she lives. For example, the customer will buy from Nan if she resides in the dark shaded sector denoted by N V in Figure 1.12 and the total cost of buying from Nan is less than the total cost of buying from Van, i.e., if 0 < Θ < π/3 and p1 +2acΘ < p2 +2ac(π/3−Θ) or, equivalently, 0 < Θ < π/6 + (p2 − p1 )/4ac.28 The customer will also buy from Nan, however, if she resides in the unshaded sector denoted by N S in Figure 1.12 and the total cost of buying from Nan is less than the total cost of buying from San, i.e., if π < Θ < 2π and p1 + 2ac(2π − Θ) < p3 + 2ac(Θ − π) or, equivalently, 3π/2 + (p1 − p3 )/4ac < Θ < 2π. As usual, we need not worry about the event that, for example, p1 + 2acΘ equals p2 + 2ac(π/3 − Θ) precisely, because the event is associated with probability zero. Thus the next customer will buy from Nan if +
p1 −p3 4ac
From (1.53), the probability of this event is p2 −p1 1 π 1 + 2π 2π − 3π (1.58a) 2π 6 + 4ac 2 +
p1 −p3 4ac
(1.57)
0 0 and, hence, has its maximum with respect to u as u → 0; whereas when 0 < v < λb/κ, f1 increases from 0 as u → 0 to a maximum of 2 √ κv u(v), v) = b− (1.81) f1 (ˆ
λ
at u = u ˆ(v) before decreasing towards 0 again as u → b/κ−v/λ. Moreover, because (1.80) implies (1.82)
u ˆ (v) =
1 2
b 1 − , λκv λ
u ˆ (v) = −
1 4v
b < 0, λκv
λb b λb we see that u ˆ(v) increases from u ˆ(0) = 0 to u ˆ( 4κ ) = 4κ on [0, 4κ ] before decreasing λb λb λb b on [ 4κ , κ ] to u ˆ( κ ) = 0, implying that u ˆ(v) has maximum 4κ on [0, λb κ ], as indicated by the vertical dashed line in Figure 1.14. So u ˆ(v) is certainly the best reply to v b ≤ b1 /κ or 4b ≤ b1 , ensuring u ˆ(v) ∈ S1 for all v ∈ S2 by (1.78); that is, when 4κ
(1.83)
B1 (v) = u ˆ(v)
for all v ∈ S2 ,
as sketched in Figure 1.14. b A similar analysis shows that when u ≥ λk , f2 is strictly decreasing with respect to v for all v > 0, and hence has its maximum with respect to v where b , f2 has its maximum with respect to v on v = 0; whereas when 0 < u < λk b [0, κ − λu] at v = vˆ(u) defined by (1.80), the maximum being √ √ 2 (1.84) f2 (u, vˆ(u)) = b − λκv . b b b Moreover, vˆ(u) has its maximum vˆ( 4κλ ) = 4κ on [0, λκ ], as indicated by the horizontal dashed line in Figure 1.14. So (1.83) has the companion result that
(1.85)
B2 (u) = vˆ(u) for all u ∈ S1
all of the prize b, guaranteeing f1 (, 0) > f1 (0, 0) = 12 b; likewise for P2 . In this section, we have opted to exclude (0, 0) from D by simply assuming that there is positive effort on both sides. We return to this matter in §1.7.
44
1. Community Games
v λb/κ b2 /κ
R1 b/4κ
R2
λb/4κ 0 b/4κ
0
b1 /κ
b/4λκ
u
Figure 1.14. The optimal reaction sets R1 (thick curve), R2 (thin curve) and Nash equilibrium (dot) for 14 b < b2 < b1 and 14 < λ < 1. The decision set D = S1 × S2 is shaded.
b when 4κ ≤ b2 /κ or 4b ≤ b2 , again as sketched in Figure 1.14. The Nash equilibrium occurs where R1 and R2 intersect at (u∗ , v ∗ ) ∈ D, that is, where u ˆ(v ∗ ) = u∗ and vˆ(u∗ ) = v ∗ or
u∗ = v ∗ =
(1.86)
λb (1 + λ)2 κ
(see Exercise 1.24). By (1.79), this equilibrium yields the reward (1.87a)
w1 = f1 (u∗ , v ∗ ) =
λ 1+λ
2 b
to Player 1 and the reward (1.87b)
w2 = f2 (u∗ , v ∗ ) =
b f (u∗ , v ∗ ) = 1 2 2 (1 + λ) λ
to Player 2. This result illustrates the so-called “paradox of power” [143] for λ ≤ 1. If λ = 1, so that neither nation has greater fighting skills than the other, then both nations obtain the same reward, even though Nation 2 is far poorer than Nation 1 when b2 /b1 is small compared to 1. If λ < 1, so that Nation 2 has greater fighting skills than Nation 1, then Nation 2’s reward will be greater than Nation 1’s, no matter how much poorer Nation 2 might be. Note, however, that λ is assumed to be independent of b1 and b2 , which may not be realistic. On the other hand, Figure 1.14 is by no means the whole story, because we have also assumed 14 b ≤ b2 (and hence, by (1.77), 14 b ≤ b1 ): the disputed resource is worth less than four times the resources that the poorer nation controls exclusively. Let us now define (1.88)
βi =
bi b
1.6. Contests as games. The paradox of power
45
v λb/κ
b/4κ b2 /κ
R2
λb/4κ
R1 0 0
u−
b/4κ
b1 /κ
b/4λκ
u
a v λb/4κ b/4κ b2 /κ
R1 R2 0 0 u−
b/λκ b1 /κ
b/4λκ b/4κ u+ b
u
Figure 1.15. The optimal reaction sets R1 (thick curve), R2 (thin curve) and Nash equilibrium (dot) for b2 < 14 b < b1 when (a) 14 < λ < 1 and (b) 1 < λ < 4. The decision set D = S1 × S2 is shaded.
for i = 1, 2, so that 14 b ≤ b2 becomes β2 ≥ 14 , and assume instead that β2 < 14 , or 1 ˆ(u) > b2 /κ for u ∈ (u− , u+ ), where we define 4 b > b2 . Then v (1.89)
u± =
1 2λκ
b − 2b2 ±
b(b − 4b2 ) .
Because u ∈ (u− , u+ ) implies ∂f2 /∂v > 0 for all v ∈ (0, b2 /κ), the best reply to all u ∈ (u− , u+ ) becomes v = b2 /κ. So in place of (1.85) we obtain
(1.90a)
B2 (u) =
vˆ(u) b2 /κ
if 0 < u < u− , if u− ≤ u ≤ b1 /κ
46
1. Community Games
1
2
Figure 1.16. Equilibrium rewards when λ = 1 in (1.91), as a proportion of b, for Nation 1 (thick curve) and Nation 2 (thin curve), together with the total reward (dashed).
for u+ ≥ b1 /κ, as sketched in Figure 1.15(a); whereas ⎧ ⎪ ⎨ vˆ(u) if 0 < u < u− , B2 (u) = (1.90b) b2 /κ if u− ≤ u ≤ u+ , ⎪ ⎩ vˆ(u) if u+ < u ≤ b1 /κ for u+ < b1 /κ
λκ , as sketched in Figure 1.15(b). Now R1 and R2 intersect where ∗ ˆ(v ∗ ) = u ˆ(b2 /κ), so that v = v = b2 /κ and u = u∗ = u β2 β b b (1.91) u∗ = − 2 , v∗ = 2
λ
λ
κ
κ
by (1.80), where β2 = b2 /κ from (1.88). The equilibrium rewards are w1 = 2 f1 (u∗ , v ∗ ) = 1−{β2 /λ}1/2 b to Player 1 and w2 = f2 (u∗ , v ∗ ) = {λβ2 }−1/2 −1 β2 b to Player 2, by (1.79). √ 2 In particular, in the case where λ = 1, these rewards become w1 /b = 1− β2 √ and w2 /b = β2 − β2 as proportions of the prize. They are plotted in Figure 1.16, where they are clearly no longer equal, except in the limit as β2 → 14 . Again as √ proportions of the prize, the costs of effort at equilibrium are κu∗ /b = β2 − β2 √ for Nation 1 and κv ∗ /b = β2 for Nation 2, with total β2 ; all increase with β2 , i.e., with the wealth of the poorer nation. Moreover, the total reward as a proportion √ of the prize, namely, (w1 + w2 )/b = 1 − β2 decreases from 1 in the limit as β2 → 0 to 12 in the limit as β2 → 14 (Figure 1.16, dashed curve). Correspondingly, raising √ the poorer nation’s wealth increases the proportion 1 − (w1 + w2 )/b = β2 of the total prize that is effectively wasted by the contest, in the sense that the maximum possible reward (obtained in the limit as β2 → 0 in Figure 1.16) is enjoyed by
1.7. A peek at the extensive form
47
neither nation; however, an increase of such wasted effort increases the proportion of the prize that goes to Nation 2 (while reducing that which goes to Nation 1).
1.7. A peek at the extensive form Noncooperative games are sometimes represented in the form of a “decision tree”33 (as opposed to payoff matrices or reward functions) and are then said to be in extensive form. A discrete version of the contest we studied in §1.6 allows us to illustrate the basics of this alternative representation. For the sake of simplicity, we assume that λ = 1 and b1 ≥ b2 > 14 b. So neither side has an advantage in military skill, and the value of the bonanza does not greatly exceed the wealth of either nation. Then, for u, v not both zero, (1.92)
f1 (u, v) =
ub − κu = f2 (v, u), u+v
b by (1.79), u∗ = v ∗ = 4κ at the Nash equilibrium, by (1.86), and the equilibrium 1 reward is 4 b for either player, by (1.87). In §1.6 we purposely assumed that both nations will use only positive effort by restricting the strategy set for Pi to the half-open interval (0, bi /κ], so that (0, 0) lay outside D and pi (0, 0) did not need to be defined. Now we ask what it would mean for both nations to apply zero effort—that is, zero military effort. It seems reasonable to presume that both nations would go about their business of retrieving the treasure in the absence of military interference, and if neither side had an advantage in the requisite technical skills, then they would each end up with half of the bonanza. To be sure, retrieval would require effort. But it would be civilian effort, which is irrelevant to the contest that we studied in §1.6. So it is reasonable to add zero to each player’s strategy set and define
(1.93)
p1 (0, 0) =
1 2
= p2 (0, 0).
The associated reward to Pi is fi (0, 0) = 12 b, by (1.74)–(1.75) with V = b and wi = 0. This reward exceeds the reward 14 b at the Nash equilibrium. Therefore, each nation should consider zero effort. If each nation considers strategy 0, however, then each nation should also consider the optimal response to 0. From (1.92), f1 (u, 0) = b − κu, which is maximized by making u as small as practicable, but not zero (which would yield the smaller reward 12 b). Let denote this minimal positive effort; in practice, might correspond to a single patrol boat, sufficient for either nation to keep the other nation’s civilians away from the salvage area while allowing its own to retrieve the treasure. Then each nation should consider strategy as being the optimal response to a strategy the other side is considering. Each nation should also consider the b , which yields the Nash equilibrium of the contest in §1.6. So a suitable strategy 4κ b strategy set for a discrete version of the contest is 0, , 4κ . Figure 1.17 shows the extensive form of the discrete game with strategy set b 0, , 4κ . The decision tree is drawn in the conventional way, upside-down, with the “root” at the top and the “leaves” at the bottom. The root corresponds to the 33 No prior knowledge of trees is necessary for their limited use in this book, although an elementary account of this topic appears in [287, Chapter 11].
48
1. Community Games
P1 P2 0
ε
0
ε
ε
0
(b/2, b/2) (0, b−) (0, 3b/4) (b−, 0)
P2 0
(c, c)
ε
(a, A) (3b/4, 0) (A, a) (b/4, b/4)
Figure 1.17. The extensive form for the discrete version of the sunkentreasure contest. Branches selected through backward induction are indicated by increased thickness.
player moving first; when moves are simultaneous, as here, the choice is arbitrary, and we have chosen Player 1. The leaves correspond to possible endpoints of the game. Every other vertex, including the root, is said to be an internal vertex and represents a player’s decision point, the branches emanating from that vertex correspond to the player’s choices, and all such vertices at the same level belong to the same player.34 Specifically, in Figure 1.17, P1 moves first and can choose 0, , b b or u∗ = 4κ ; in all three cases, P2 can choose 0, , or v ∗ = 4κ . Every leaf is labelled with the ordered pair of payoffs corresponding to that endpoint of the game. For example, if both players choose , then the game ends at the central leaf with payoff (1.94)
c = f1 (, ) =
1 2b
− κ = f2 (, )
to each player; if P1 chooses 0 and P2 chooses , then the game ends at the second leaf from the left, with payoffs 0 to P1 and (1.95)
b− = f1 (0, ) = b − κ = f2 (0, )
to P2 ; if P1 chooses u∗ and P2 chooses , then the game ends at the second leaf from the right, with payoffs A = f1 (u∗ , ) to P1 and a = f2 (u∗ , ) to P2 , where 3 3 v∗ A = +v (1.96) a = +v ∗ ∗ 4 b − κ , 4 b − κ ; and similarly for all other possible choices. The game is now solved by backward induction, that is, by working backwards in time, in this case moving back up the tree from the leaves to the root. If the game reaches P2 ’s left-hand vertex, then its three possible payoffs are 12 b from 0, b− from and 34 b from v ∗ . For sufficiently small (specifically, < v ∗ ), we have b− > 34 b > 12 b. So P2 will choose if the game reaches this vertex, as indicated by the thickness of the corresponding branch. Likewise, because A > c > 0 and 1 ∗ 4 b > a > 0, P2 will choose v if the game reaches either its central or its righthand vertex. Because Nation 1 now knows that P2 will choose or v ∗ according to whether P1 chooses 0 or either or u∗ , Nation 1 now also knows that its payoffs 34 Note that any internal vertex apart from the root is also the root of a (proper) subtree of the original tree. So it corresponds to the player moving first in a subgame of the original game whose extensive form is that subtree. All three subgames in Figure 1.17 are trivial, since they do not involve P1 ; however, G in Figure 1.18 is not trivial, since the whole of Figure 1.17 can be substituted for it.
1.7. A peek at the extensive form
49
P1 P2 Accept (b/2, b/2)
Offer 50%
Don't offer
G (b/4, b/4)
G (b/4, b/4)
Figure 1.18. The extensive form for the expanded sunken-treasure contest with external enforcement of binational agreements. The subgame G is either the continuous game of §1.6 or the discrete game of Figure 1.17.
from choosing 0, , or u∗ are 0, a, or 14 b, respectively. The largest of the three is the last. So P1 will choose u∗ and P2 will choose v ∗ , thus arriving at the Nash equilibrium in Figure 1.14. We did not need the extensive form to obtain this result. It can be obtained just as readily from the payoff matrix (see Exercise 1.26), and it is anyhow only what we already knew from §1.6. But the point of this section was to introduce the extensive form and show how it is used, as opposed to obtaining a new result. Moreover, there is more to be said. The outcome of the game, with reward 1 b to each nation, is a most unfortunate one. Both nations would much prefer 4 to desist from military effort, and thereby double their reward. So why does this outcome not arise? It would indeed arise if both sides could commit to zero effort and be believed, or if there were effective external enforcement. In this regard, let G denote either the contest in §1.6 or the contest in Figure 1.17 (it does not matter which), and consider an expanded game with the extensive form of Figure 1.18, where G is embedded as the subgame G. In this expanded game, before G even arises, Nation 1 can offer to desist from military effort and split the treasure equally with Nation 2 if Nation 2 will likewise desist; and if Nation 2 accepts this offer, then the agreement is externally enforced. If either Nation 1 makes no such offer or Nation 2 rejects the offer, however, then their interaction reverts to G, as indicated in Figure 1.18. This is a very different game, in which making an offer or accepting it are not the same thing as choosing zero military effort (although they do imply it). As before, this game can be solved by backward induction, starting at the lowest level. We know that if the game reaches subgame G, then the outcome will be a reward to each player of 14 b, as indicated in Figure 1.18. If the game reaches P2 ’s vertex, then its two possible payoffs are 12 b from Accept or 14 b from Reject; because the former is larger, P2 will choose Accept if the game reaches this vertex, as indicated by the thickness of its branch. So Nation 1 knows that its payoffs from choosing Offer or Don’t Offer are 12 b or 14 b, respectively. Again the former is larger. So P1 will choose Offer, P2 will choose Accept, and they will thus reach the outcome they both prefer. Technically, it corresponds to a Nash equilibrium of the game represented by Figure 1.18, because either player would fare worse by choosing the thin branch at its vertex. But the equilibrium is “self-enforcing” only
50
1. Community Games
because we have presumed the existence of an external enforcer for a binational agreement—which in practice is not so likely. Thus, the sunken-treasure contest encapsulates the difficulty of achieving cooperation. The essence of this problem is also captured by an even simpler game, known as the prisoner’s dilemma, which we do not study formally until Chapter 5. We preview it in Exercise 1.23, which verifies that a reduced version of the discrete sunken-treasure game is a special case of it.
1.8. Max-min strategies Our studies of bimatrix games have revealed a difficulty with Nash equilibrium as a solution concept for noncooperative games: a game may have more than one equilibrium. In our studies of continuous games, however, we have seen that a Nash equilibrium can be unique (and often is). In this case, is it not reasonable to regard it as the game’s solution? Not necessarily. A further difficulty with our concept of Nash equilibrium is that we have had to assume complete information: every player must know every other player’s reward function. But suppose that each player knows only her own reward function. What then is her best strategy? A possible answer involves the concept of max-min strategy, which we describe in this section. Let us first define the minimizing functions, m1 , m2 , . . . , mn by min fk w . (1.97) mk w k = w: wk =wk
For each k ∈ N , mk (wk ) yields the minimum value of Player k’s reward with respect to the variables controlled by the other players, i.e., w\wk . In particular, for a two-player game we have (1.98)
m1 (u) = min f1 (u, v), v
m2 (v) = min f2 (u, v). u
Suppose, for example, that San is an intermediate or slow driver in Crossroads, for which Nan’s reward function is (1.99) f1 (u, v) = + 12 τ2 − {δ + }v u + − 12 τ2 v − − 12 τ2 , from (1.12a). Then because < 12 τ2 , the coefficient of v in f1 , namely, − τ2 /2 − (δ + )u, is always negative, and so f1 is minimized with respect to v by v = 1. Thus, from (1.98), Nan’s minimizing function is given by (1.100)
m1 (u) = f1 (u, 1) = (τ2 /2 − δ)u − τ2 .
For k ∈ N , we now define a max-min strategy for Player k to be a wk that maximizes mk (wk ), that is, maximizes the minimum reward that the other players can impose on Player k; we denote this max-min strategy by w ˜ k , and we refer to w ˜ as a joint max-min strategy combination. In Crossroads with < τ2 /2, for example, it is clear from (1.100) that u ˜ = 0 (always wait) is the unique max-min strategy ˜ = 1 (always go) is the unique max-min strategy for for Nan if τ2 /2 < δ, whereas u Nan if τ2 /2 > δ. Thus (provided τ2 /2 = δ) Nan’s max-min strategy when San is intermediate or slow is always a pure strategy: W if San is intermediate, G if San is slow. But a max-min strategy for a matrix game need not be a pure strategy. For
1.9. Commentary
51
example, in Crossroads with > τ2 /2 (hence δ > τ2 /2), it follows from (1.98)–(1.99) that
( + 12 τ2 )(u − 1) if (δ + )u ≤ − 12 τ2 , (1.101) m1 (u) = if (δ + )u > − 12 τ2 , ( 21 τ2 − δ)u − τ2 so that Nan’s unique max-min strategy when San is fast is the mixed strategy u ˜ = ( − τ2 /2)/(δ + ).35 The concept of max-min strategy rests on the idea that, no matter which wk is chosen, the other players will do their worst by making the reward fk as small as possible (equal to mk (wk ), in fact), and it responds by selecting the best of these ˜ k ). A max-min strategy is a fail-safe strategy. It is worst rewards, namely, mk (w absolutely fail-safe when it is a pure strategy, and it is fail-safe on average when it is a mixed strategy in a bimatrix game or when the rewards in a continuous game are expected values. But a max-min strategy is also in general a very pessimistic strategy, because if the other players do not know Player k’s reward function, then how could they minimize it—except, perhaps, by chance? Not surprisingly, w ˜ rarely belongs to every player’s optimal reaction set and frequently belongs to no player’s optimal reaction set; see Exercise 1.18. Indeed there is no guarantee that w ˜ is even feasible, i.e., that w ˜ ∈ D; see Exercise 1.22. But there is one important exception: if a two-player game is zero-sum, i.e., if f1 (u, v) + f2 (u, v) = 0 for all (u, v) ∈ D, then w ˜ ∈ R1 ∩ R2 ; see Exercise 1.17. In this very special case, there is no need to argue over the merits of max-min strategies versus Nash-equilibrium strategies because the two coincide. Such happy circumstances are rare, however, in game-theoretic modelling.
1.9. Commentary In this chapter we studied games among specific actors, termed community games [220]. We introduced the concepts of pure strategy, payoff matrix (§1.1), mixed strategy, optimal reaction set, and Nash equilibrium (§1.2, §1.5), and we used these concepts to analyze bimatrix games with two (§1.1, §1.2) or three (§1.3) pure strategies, as well as two-player (§1.4, §1.6) and three-player (§1.5) continuous games. We discovered that neither existence nor uniqueness of Nash equilibrium is assured in general; however, existence is assured for bimatrix games if we allow mixed strategies. A proof that every bimatrix game has at least one Nash-equilibrium strategy combination, based on Nash’s [243] application of the Brouwer fixed-point theorem to n-player games, appears in Owen [257, pp. 127–128]. Existence theorems do not tell us how to compute Nash equilibria, however, and we saw that this task can be far from trivial—even for two-player games. In §1.6, which is based on [8] and [162], we introduced the class of games known as contests. Despite a recent surge in activity on this topic, the monograph by Konrad [162] remains an excellent survey of the field as a whole and, in particular, cites all relevant earlier literature in economics; more recent work is cited in [349]. 35 A max-min strategy need not be unique; for example, any strategy would be a max-min strategy for Nan with < τ2 /2 and δ = τ2 /2 because m1 would then be independent of u, from (1.100). But see Footnote 6 on p. 13.
52
1. Community Games
But [162] contains very few pointers to relevant biological literature, which instead is traceable from [160] and [304]. In §1.7 we touched on the extensive form. In practice, its use tends to be largely limited to discrete games with few moves between few players; beyond that, it rapidly becomes too unwieldy for the printed page.36 Fortunately, however, we shall find in subsequent chapters that we can get by very nicely without it, that is, by using only payoff matrices and reward functions.37 Although our treatment of Nash equilibrium, which is the central concept in noncooperative game theory, has been predicated on complete information, the concept extends to so-called Bayesian Nash equilibrium in games of incomplete information, in which players do not know their opponents’ rewards but are able to quantify their feelings about them. Nevertheless, the literature on such games, which derives from the work of Harsanyi [132–134], is largely against the spirit and beyond the scope of our agenda, and so we confine our treatment of incomplete information to the concept of max-min strategy (§1.8). Except in §1.8, we assume throughout that players’ rewards and decision sets are common knowledge. Relaxing this assumption makes game-theoretic analysis considerably more difficult. Here two remarks are in order. First, the assumption of common knowledge may simply be quite reasonable. Second, the assumption is relaxed in a nowextensive literature on games of partial information, which distinguishes between incomplete information, i.e., partial knowledge of a conflict’s structure, and imperfect information, i.e., partial knowledge of a conflict’s history. Clearly, you cannot possibly know a conflict’s history if you don’t even know its structure, although you can know its structure without knowing its history. Thus incomplete information is always imperfect, although imperfect information can still be complete. The distinction between incomplete and imperfect information is less important in practice than it is in theory, however; as a practical matter, if there are things we don’t know, then—regardless of whether our ignorance is an imperfection or an incompleteness—we either exclude them from our models, or else we call them random variables and we assign them distributions. We have done so already in §1.4 and §1.5, and we will do so again repeatedly, especially in Chapters 2 and 6–8, always assuming a common distribution for any individual attributes that are not public information. By contrast, games of incomplete information allow distributions to be subjective: they are assigned, as it were, not by the modeller but by the players. Moreover, these distributions can be updated as an interaction progresses and new information is acquired—an inherently dynamical process. Thus games of incomplete information possess considerably more flexibility than games of complete information, at least in principle. But this is an introductory text, and we cannot cover everything. Instead we refer to the literature; see, e.g., [269,271,343], and references therein.
36
For an illustration of this point, see Figure 5.1 and Footnote 4 on p. 177. We therefore study noncooperative games primarily in their so-called normal form, as opposed to their extensive form. (In atypical circumstances, there can exist subtle differences between these two forms of what otherwise appears to be the same game [74, pp. 2–8], but they need not concern us in this book.) Yet despite having little use for the extensive form, we may still find trees useful—even in a continuous game—for a different purpose, namely, keeping track of random events in calculating a reward. For an illustration of this point, see Figure 6.13. 37
Exercises 1
53
Exercises 1 1. Suppose that Crossroads is symmetric with τ1 = τ = τ2 and that neither driver is especially fast or slow, i.e., 2δ > τ > 2. Show directly from the payoff matrices in Tables 1.1 and 1.2 that the pure strategy combinations W W and GG cannot be Nash equilibria. Thus W W cannot be chosen by two rational players, because if either selected W , then the other would have an incentive to deviate from W ; and similarly for GG. 2. Show that a pair of strongly dominant strategies in a discrete bimatrix game is always the unique Nash equilibrium, whereas uniqueness need not hold for a pair of merely dominant strategies. Thus GG is the unique Nash equilibrium of Crossroads with two slow drivers on p. 13, whereas GG is not the unique Nash equilibrium of Four Ways with slow drivers on p. 21 solely by virtue of comprising a pair of dominant strategies. Nevertheless, show that GG is the unique Nash equilibrium in this case also. 3. Verify (1.25), and hence that Figures 1.4(b) and 1.5 are correct. Also verify that Table 1.3 and (1.26) are correct. 4. Show that (1, 0) and (0, 1) are both strong Nash-equilibrium strategy combinations in Crossroads when neither driver is slow. 5. Show that (1.29a) and (1.29b) are special cases of (A.2). 6. Verify Tables 1.6 and 1.7.38 7. Verify Table 1.8. 8. Obtain (1.47) and (1.49). 9. Obtain (1.51). 10. Show that Store Wars has no Nash equilibrium if α = 11. √ 11. Find R2 for Store Wars when α = 2 10. 12. According to Figure 1.11(a), when α = 10 there is a value of p1 for which San’s optimal reaction to an increase in p1 should be to lower her price. Does this make sense? Interpret. 13. Verify (1.58). 14. Verify Table 1.10.
3 11 , 60 is a Nash equilibrium of Store Wars II 15. (a) Verify that (u∗ , v ∗ , z ∗ ) = 16 , 20 by applying (1.70). 3 11 (b) Verify that (u∗ , v ∗ , z ∗ ) = 16 , 20 , 60 is the unique Nash equilibrium of Store Wars II by showing that it is the only strategy combination that matches a row from each of Tables 1.9–1.11. 16. (a) For any noncooperative, two-player game, show that (u∗ , v ∗ ) ∈ D is a Nash equilibrium if and only if max f1 (u, v ∗ ) + f2 (u∗ , v) = f1 (u∗ , v ∗ ) + f2 (u∗ , v ∗ ), (u,v)∈D
38 The line joining (ω, 0) to (α, β) in Figure 1.6 corresponds to equal coefficients; the point (α, β) to vanishing coefficients.
54
1. Community Games
where fi denotes Player i’s reward function. (b) Likewise, show that (u∗ , v ∗ , z ∗ ) ∈ D is a Nash equilibrium of a noncooperative three-player game if and only if max f1 (u, v ∗ , z ∗ ) + f2 (u∗ , v, z ∗ ) + f3 (u∗ , v ∗ , z) (u,v,z)∈D
equals f1 (u∗ , v ∗ , z ∗ ) + f2 (u∗ , v ∗ , z ∗ ) + f3 (u∗ , v ∗ , z ∗ ), where again fi denotes Player i’s reward function. 3 11 , 60 is the unique Nash (c) Use this result to show that (u∗ , v ∗ , z ∗ ) = 16 , 20 equilibrium for Store Wars II. 17. Show that for a two-player, noncooperative, zero-sum game, a max-min strategy is always a Nash-equilibrium strategy, and vice versa. 18. Use Crossroads to establish that (a) A two-player game need not be zero-sum for a joint max-min strategy combination to be a Nash equilibrium. (b) A joint max-min strategy combination may lie in no player’s optimal reaction set. (c) Even if a joint max-min strategy combination does not lie in any player’s optimal reaction set, it may be equivalent to a Nash equilibrium in terms of associated rewards. 19. Find all max-min strategies for Store Wars II in §1.5. 20. Find all three reward functions for a modified version of Store Wars II in which San’s store is located at θ = 3π/2 (instead of θ = π) but everything else remains 17 7 11 , 40 , 60 is a Nash equilibrium. the same, and hence show that (u∗ , v ∗ , z ∗ ) = 120 Is it unique? 21. Find all max-min strategies for Four Ways. 22. Find all max-min strategies for Store Wars in §1.4.
S 23. (a) The symmetric, two-player bimatrix game with payoff matrices A = R T P and B = AT satisfying T > R > P > S and 2R > S + T is known in the literature of the social and biological sciences as the prisoner’s dilemma. Sketch the optimal reaction sets and find all Nash equilibria. (b) Show that, if the strategy setfor the sunken-treasure contest in discrete b b to 0, 4κ , then the resulting game is a §1.7 is reduced from 0, , 4κ special case of the prisoner’s dilemma. 24. Use (1.80) to verify (1.86). 25. Figures 1.14 and 1.15 assume that 14 b < b1 , whereas here we assume instead that 14 b > b1 ≥ b2 . (a) Sketch the optimal reaction sets for 14 b > b1 ≥ b2 when 14 < λ < 1. Find both the Nash equilibrium and the equilibrium rewards. Is there still a paradox of power for λ = 1? (b) Sketch the optimal reaction sets for 14 b > b1 ≥ b2 when λ < 14 . Verify that the Nash equilibrium is still given by (1.86), despite R1 and R2 both having boundary segments. 26. (a) Verify that A > c > 0 and 14 b > a > 0 in the sunken-treasure contest of §1.7 (where a, A and c are defined by (1.94) and (1.96)).
Exercises 1
55
(b) Write down the payoff matrices for the game. b b is the unique Nash , 4κ (c) Verify by inspection of the payoff matrix that 4κ equilibrium. 27. Here we modify Crossroads (§1.1 and §1.2) to allow a motorist’s behavior to depend on her direction of travel, as suggested by [326]. So we introduce an asymmetry of role—by contrast, the asymmetry in Crossroads is purely a payoff asymmetry. Suppose that a town center lies to the north and west of the crossroad in Figure 1.1, whereas the suburbs lie to the south and east; thus a northbound driver is heading (west) into town, but a southbound driver is heading out. Then a strategy is a two-dimensional vector whose first component is the probability of selecting pure strategy G if heading into town, and whose second component is the probability of selecting pure strategy G if heading out. Let τ2 be the junction transit time of any southbound driver and τ1 that of any northbound driver; thus Player 1’s transit time is τ1 when she heads into town but τ2 when she heads out, and similarly for Player 2. (The only asymmetry between players is directional; the junction may be slower in one direction and faster in the other because, e.g., the road is on a hill.) If no drivers are slow (i.e., 2δ > max(τ1 , τ2 )) and the possible directions of travel are equally likely: (a) Obtain the reward functions, verifying that the game is symmetric (i.e., satisfies (1.30)). (b) Calculate the optimal reaction sets, and hence find all Nash-equilibrium strategy combinations. 28. Two individuals divide a cake by submitting sealed bids for a specific proportion of the cake to a neutral referee. If the sum of the two proportions does not exceed one, then the cake is distributed according to the bids; but if the sum exceeds one, then neither player receives anything. Model this strategic interaction as a continuous community game. (a) Find each player’s reward function. (b) Find the optimal reaction sets R1 and R2 . (c) Find all Nash equilibria. In every case, determine whether the equilibrium is weak or strong, and why. 29. A matrix game with three pure strategies has payoff matrices ⎡ ⎤ −λ ρ −ρ A = ⎣ −ρ −λ ρ ⎦ , B = AT , ρ −ρ −λ where ρ > 0 and |λ| is much smaller than ρ. Find all Nash-equilibrium strategy combinations. 30. The zero-sum game called Chump is played between two camels, a dromedary (Player 1) and a bactrian (Player 2). Player k must simultaneously flash Fk humps and guess that her opponent will flash Gk . Possible pure strategies (Fk , Gk ) satisfy 0 ≤ F1 , G2 ≤ 1, and 0 ≤ F2 , G1 ≤ 2. Thus Player 1 has six pure strategies, namely, (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), and (0, 0). Player 2 likewise has six pure strategies, namely, (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), and
56
1. Community Games
(0, 0). If both players are right or wrong, then the game is a draw. But if one is wrong and the other is right, then the first pays F1 + F2 dollars to the second. (a) Write down the payoff matrices A and B for Chump. (b) With rewards defined according to (A.2), a strategy for each player be2 4 1 4 ∗ = , , , , 0 and comes a five-dimensional vector. Verify that, if u 21 7 7 21 4 2 1 , 21 , 7 , 0 , then (u∗ , v ∗ ) is a Nash equilibrium. v ∗ = 47 , 21 31. Bonnie and Clyde play a game called Pokeroo with an infinite deck of numbers. Each bets $1 and draws a number at random from [0, 1]. Bonnie goes first, drawing the random number X. She can either “fold” (losing $1) or “raise” the ante by $b; in which case Clyde, who draws Y , can either fold (Bonnie wins his $1) or match Bonnie’s raised stake of 1 + b dollars. If neither folds, then the player with the higher number wins (netting ${1 + b} from the other). Both players use a threshold strategy, u ∈ [0, 1] for Bonnie, v ∈ [0, 1] for Clyde; that is, Bonnie folds if X ≤ u and otherwise raises, whereas Clyde folds if Y ≤ v and otherwise matches Bonnie’s stake. (a) Calculate the reward and optimal reaction set for Player 1, i.e., Bonnie. (b) Calculate the reward and optimal reaction set for Player 2, i.e., Clyde. Show that there is a unique Nash equilibrium, and verify that it is a maxmin strategy combination. Who would you rather be in this game, Bonnie or Clyde? Why?
Chapter 2
Population Games
As we saw in Chapter 1, a noncooperative game can have many Nash equilibria. If one of them is to be regarded as the solution, then we need criteria for distinguishing it from the others.1 One such criterion is provided by Maynard Smith’s concept of evolutionarily stable strategy or ESS [188], which we formally introduce in §2.2. In order to make use of ESSs, however, we must first adopt a new perspective and consider games in which an arbitrary focal individual or protagonist, the ustrategist, interacts with individuals drawn at random from a large population of v-strategists. We call such games population games to distinguish them from community games like Store Wars (§1.4) or Store Wars II (§1.5). For our first example of a population game, we need only revisit Crossroads; however, we must now consider the symmetric version of it, as opposed to the asymmetric community game between a particular Nan and a particular San that we considered in §1.2.
2.1. Crossroads as a population game Consider once more the game of Crossroads. Our northbound driver—called Ned in this chapter—and our southbound driver—now called Sed—are no longer specific individuals; rather, either one could be any member of the population. Thus we have no basis for supposing that Ned and Sed have different transit times, and instead we assume that each takes time τ to cross the junction unimpeded. In other words, and of necessity, the game is now symmetric, as already noted above. Setting τ1 = τ2 = τ in (1.12) and hence θ1 = θ2 = θ in (1.23) and Table 1.3, the players’ rewards become (2.1a)
f1 (u, v) = (δ + ) {(θ − v)u − θ(1 + v)} + 2v
and (2.1b)
f2 (u, v) = (δ + ) {(θ − u)v − θ(1 + u)} + 2u,
1 Three such criteria were compared in the second edition of this book. But the first two criteria, namely, by Harsanyi and Selten [135] and by Kalai and Samet [155], have proven to be of limited use in applications, and so here we restrict our attention to the third criterion, namely, by Maynard Smith.
57
58
2. Population Games
where (2.2)
θ =
+ τ /2 +δ
as in (1.33b). For the sake of simplicity, let us now assume that drivers are not slow, as in most of §1.3. Then δ > 12 τ or θ < 1, and it follows from §1.2 that the game has three Nash-equilibrium strategy combinations, namely, (u∗ , v ∗ ) = (1, 0), (u∗ , v ∗ ) = (0, 1), and (u∗ , v ∗ ) = (θ, θ). How might Ned and Sed converge on any of these equilibria? Suppose that the following thoughts occur to Ned, who is rational, and who has three Nash-equilibrium strategies, namely, u∗ = 0, u∗ = θ, and u∗ = 1. From Table 1.3, u = 1 is best for Ned if—and only if—Sed selects Nash-equilibrium strategy v ∗ = 0. But why should Sed select v ∗ = 0? No reason at all. On the other hand, if Sed is to select from only three Nash-equilibrium strategies, namely, v ∗ = 0, v ∗ = θ, and v ∗ = 1, and if Ned doesn’t know which strategy Sed will pick, then Ned might as well hope that Sed will choose v ∗ = 0. So let Ned select u∗ = 1. Now suppose that Ned’s thoughts have also occurred to Sed, who is also rational. Then, clearly, Sed will select strategy v ∗ = 1, which is his best strategy if—and only if—Ned selects u∗ = 0. But, of course, Ned does not select u∗ = 0, and Sed does not select v ∗ = 0; rather, in this way they select the strategy combination (1, 1). The associated reward is f1 (1, 1) = − 12 τ − δ = f2 (1, 1). We don’t have only one Ned, however, nor only one Sed; rather, we have a huge population of Neds and Seds, all confronting one another across 4-way junctions, all day long, day in, day out, all over the land. All of these Neds and Seds are rational. Why shouldn’t the thoughts that have just occurred to our Ned and Sed occur to all of them? No reason at all. So before very long we have a huge population of Neds and Seds, all of whom are playing strategy 1. Indeed there is no longer any reason to distinguish between players by calling one player Ned and the other one Sed, and so we shall refer to them all as Ed. Suddenly, one day, it occurs to an Ed that, if everyone but he is playing strategy 1, then there is no longer any uncertainty about the strategy his opponent will choose. Because nobody but this particular Ed has had this brainwave, the next Ed he meets is bound to select v ∗ = 1. Now, the problem of Nash-equilibrium selection arises because players in a noncooperative game do not know for sure which Nash-equilibrium strategy their opponent will select. All of a sudden, however, the requisite information is available to an Ed: on the day when he has his brainwave, his reward function reduces to f1 (u, v) = f1 (u, 1) = (τ /2 − δ)u − τ . Because τ < 2δ, f1 (u, 1) is maximized by selecting u = 0. Does this mean that the Ed should begin to play u = 0? After all, isn’t (0, 1) a Nash equilibrium? It is true that (0, 1) is a Nash equilibrium, but playing u = 0 is a rational long-term strategy only if Ed is sure that his next opponent will adhere to v = 1. But if one Ed has had this brainwave (and there are so many Eds on the road that, sooner or later, another of them is bound to have the brainwave), do you think he can keep it to himself? Not likely. You know how word gets around. Before very long, all the Eds in the world will have figured out that if everyone else is selecting strategy 1, then they would do better to select strategy 0—0 is a better reply than
2.1. Crossroads as a population game
59
1 to strategy 1 because (from a Ned’s point of view) f1 (0, 1) − f1 (1, 1) = δ − τ /2 > 0 or, which is exactly the same thing (but from a Sed’s point of view), f2 (1, 0) − f2 (1, 1) = δ − τ /2 > 0. But then everyone will be playing strategy 0. Thus the next strategy combination selected will be, not (0, 1), but rather (0, 0), and the associated reward will be f1 (0, 0) = −τ /2 − = f2 (0, 0). Because < δ, we have to admit that the reward associated with (0, 0) is greater than that associated with (1, 1). But the strategy combination (1, 1) did not persist, because an Ed had a brainwave, word got around, and before very long the whole world had evolved to (0, 0). Of course, we should never have expected the world to remain at (1, 1), because (1, 1) is not a Nash equilibrium; indeed it doesn’t even lie in an Ed’s optimal reaction set. Likewise, (0, 0) lies in no Ed’s optimal reaction set, and so we don’t expect the world to remain at (0, 0). It is just as inevitable that some Ed somewhere will try something else as it was when the world stood at (1, 1). This Ed already knows, however, that neither u∗ = 1 nor u∗ = 0 is a decent long-term strategy, but u∗ = 0, u∗ = θ and u∗ = 1 are his only Nash-equilibrium strategies. Therefore, out of sheer desperation, an Ed will select u∗ = θ. Because else is playing u∗ = 0, our Ed’s reward will be f1(θ, 0) = everyone 1 −(1 − θ) + 2 τ . In the usual way, because f1 (θ, 0) − f1 (0, 0) = θ + 12 τ > 0, it won’t be long before word gets around that θ is a better reply to 0 than 0 is, and it won’t be much longer before all the Eds in the world are playing it. The world is now at (θ, θ). Thus every Ed in the world who is contemplating strategy u can safely assume that his opponent will select strategy θ; in which case, his reward function reduces to f1 (u, θ) = − δ + 12 τ θ = f2 (θ, u), which is independent of u; i.e., f1 (u, θ) = f1 (θ, θ) = f2 (θ, θ) = f2 (θ, u) for all u. No strategy can yield a higher reward against an opponent who selects θ than u = θ itself yields. Suppose, however, that an Ed decides, in the usual way, to start playing a different Nash-equilibrium strategy from θ, say u∗ = 1. Because f1 (1, θ) = f1 (θ, θ), this Ed does no better against an opponent who selects θ than by selecting θ himself. On the other hand, this Ed does no worse; and he may therefore be tempted to continue selecting u∗ = 1. What happens now? Will 1 become fashionable? Not likely! If 1 begins to catch on, then sooner or later this Ed will meet another Ed who is also using strategy 1, and Ed’s reward from this encounter will be f1 (1, 1). If either Ed had stuck to using θ, then his reward would have been f1 (θ, 1). But f1 (θ, 1) − f1 (1, 1) = (δ + )(1 − θ)2 , which is positive. Thus, although a player who switches from θ to 1 will do precisely as well against an opponent who still uses θ, he will fare worse against an opponent who also has switched from θ to 1; therefore, switching from θ to 1 is a bad idea. Similarly, because f1 (θ, 0) > f1 (0, 0), it is a bad idea to switch from θ to the other Nash-equilibrium strategy, namely, 0. More generally, because (2.3)
f1 (θ, u) − f1 (u, u)
=
(δ + ) (θ − u)2
60
2. Population Games
is positive unless u = θ, it would be irrational for an individual to switch from θ to any other strategy. In other words, once the world has arrived at the strategy combination (θ, θ), the world will stay at (θ, θ). It thus appears that the Nash equilibrium (θ, θ) has a measure of long-term stability, which the other two Nash equilibria do not possess. We shall refer to a strategy that is stable in this sense as uninvadable. Thus θ is an uninvadable strategy for Crossroads, whereas 0 and 1 are invadable. The concept of uninvadability yields a criterion for distinguishing among Nash equilibria: eliminate strategies that are invadable. In the game we have just considered, however, there is complete symmetry between any two players. The model cannot distinguish between them; or, if you prefer, there are no grounds whatsoever for calling one player Ned and the other one Sed. Therefore, any Nash equilibrium (u∗ , v ∗ ) that the whole world adopts must also show symmetry between strategies, i.e., u∗ = v ∗ . In other words, we may eliminate the Nash equilibria (0, 1) and (1, 0) purely on the grounds that symmetry between players requires symmetry between strategies, because a strategy cannot be uninvadable unless first of all it is universally adoptable. But if symmetry suffices to eliminate the Nash equilibria (0, 1) and (1, 0), then who needs uninvadability? Here three remarks are in order. First, the symmetry argument provides only a necessary condition for uninvadability; a sufficient condition is provided by (2.3) being positive for u = θ. Second, a game may have several symmetric Nash equilibria, only some of which are uninvadable. Then elimination of invadable strategies selects among equilibria where symmetry alone no longer suffices.2 For an example of such a game, see Exercise 2.17. Third, with the adoption of a population perspective, (2.1) becomes redundant. Because of the inherent symmetry between the players’ rewards, i.e., because f2 (u, v) = f1 (v, u), the reward to a player selecting strategy p against a player selecting strategy q is always (2.4)
f (p, q) = (δ + ) {(θ − q)p − θ(1 + q)} + 2q
by (2.1): we need no subscript 1 or 2, because the expression applies equally to either of the two players whose strategy combination is (u, v). If the player in question is Player 2, then p = v, q = u and the player’s reward is f (v, u) = f2 (u, v) by (2.1b). If, on the other hand, the player in question is Player 1, then p = u, q = v and the player’s reward is f (u, v) = f1 (u, v) by (2.1a). Thus an expression for f (u, v) yields either player’s reward: in a population game, rewards require no subscripts, and so we do not use them from now on. In effect, because of symmetry, we need to analyze the game only from the perspective of a single arbitrary focal player—Ed, say—whom we may think of as Player 1, but there is no longer any especially good reason to denote his reward by f1 , and so we denote it by f instead. Correspondingly, there is no longer a good reason to use R1 for Ed’s optimal reaction
2 Other criteria for distinguishing among Nash equilibria in games with more than one include Harsanyi and Selten’s tracing procedure, Kalai and Samet’s concept of persistence, Selten’s concept of perfectness (insensitivity to arbitrary small random errors in choosing a strategy) and Myerson’s concept of properness (modified perfectness that assigns lower probability to a more costly error than to a less costly error); see [340] and [163, Chapter 5]. Harsanyi and Selten’s criterion, augmented where necessary by their “logarithmic tracing procedure” [135], is the only criterion to guarantee uniqueness.
2.1. Crossroads as a population game
1
v
61
1
v
θ
R
R 0 0
1 (a) θ > 1
u
0 0
θ
1
u
(b) θ < 1
Figure 2.1. A focal individual’s optimal reaction set for Crossroads or the classic Hawk-Dove game at (a) low and (b) high cost. In Crossroads, the average savings in transit delays for each of two aggressive drivers is 12 τ ; and according to whether it is greater or less than the cost of aggression δ, that is, whether θ > 1 or θ < 1, we regard the cost of aggression as low or high. The ESS, v ∗ , is where R intersects u = v, so v ∗ = min(1, θ). Similarly for the Hawk-Dove game (§2.4), with θ reinterpreted as the ratio of value to cost.
set, and so we denote it henceforth by R. That is, with B standing for best reply, R = (u, v) ∈ D | f (u, v) = max f (u, v) u (2.5) = (u, v) ∈ D | u = B(v) is a focal individual’s optimal reaction set—for any population game. But in the particular case of Crossroads among fast or intermediate drivers, we obtain the solid curves in Figure 2.1. Figure 2.1(b) shows the case we have been considering, and 2.1(a) adds the case of slow drivers to complete the picture. A v-strategist’s optimal reaction set (u, v) ∈ D | f (v, u) = max f (v, u) v (2.6) = (u, v) ∈ D | v = B(u) is now easily obtained—again quite generally—by interchanging u and v in (2.5); however, we will discover that (2.6) is largely irrelevant, and so we do not even give this set a name. On the one hand, given R, reflection in the line u = v immediately yields a v-strategist’s optimal reaction set—for Crossroads it is shown dashed in Figure 2.1. On the other hand, we are now interested only in where it intersects R at symmetric equilibria of the form (v ∗ , v ∗ ), which lie on the line of symmetry. Thus it suffices to know where R intersects u = v. For further illustration of this point, see Figures 2.3 and 2.4 in §2.3.3 Now, we have seen that a strategy v ∗ is uninvadable if, in a large population of players who almost all select it, v ∗ yields a greater reward than any deviant or “mutant” strategy, say u, that might instead be selected by the diminutive remainder of the population. One discerns here an echo of Kant’s categorical imperative—to behave in such a way that, if everyone did so, then each would benefit. Yet although 3 Likewise, the decision set D becomes largely irrelevant for the remainder of this chapter. On the one hand, it is always the Cartesian product of the strategy set with itself. On the other hand, our focus in this chapter is on strategies rather than strategy combinations. By rarely talking about the decision set, we free up D for alternative use, and in §2.4 we will use it to stand for Dove instead.
62
2. Population Games
it is broadly true that the player who selects an uninvadable strategy behaves in such a way that, if virtually everyone did so, then anyone who failed to do so would fail to benefit, it cannot be too strongly emphasized that uninvadability as we have defined it can crucially depend on the assumption that any deviant strategy is uniformly adopted by the diminutive remainder in the previous paragraph—or, which amounts to the same thing, that if there are several deviant strategies, then they are adopted at different times (as in the narrative above). We explore this issue further in §2.7. Is it reasonable to assume that at most one deviant strategy is adopted at any given time? Although the answer to this question depends explicitly upon the dynamics of interaction between the players, which our model fails to capture explicitly (in its current state of development), the requisite dynamics is at least quite plausible: a lone player selects a deviant strategy, discovers that it rewards him less than the orthodox strategy, and reverts to orthodoxy before another player has a chance to deviate. Moreover, the lower the frequency of deviation, the more reasonable the assumption. The frequency of deviation is widely thought to be sufficiently low when conflict arises in the context of evolutionary biology, because strategies can be identified with inherited behavior and deviations with mutations. Thus the dynamic of gossip and rumor—or whatever it was that made word get around in Crossroads—is replaced by the dynamic of genetic transmission. In repeated plays of Crossroads, the composition of the population (in terms of strategies) changes because successful strategists are imitated by other drivers; whereas, in the course of biological evolution, the composition of a population (again in terms of strategies) changes because successful strategists leave more offspring (who are assumed to inherit genes for the successful strategy). In either case, however, the frequency of a successful strategy increases because “success breeds success”—whether metaphorically, as in the case of Crossroads, or literally, as in the context of biological evolution—and so the difference between the two dynamics, in terms of their effects on the composition of a population, is largely a matter of time scales. Indeed the concept of uninvadable strategy was first defined in the context of evolutionary biology by Maynard Smith and Price [185, 192] who, because of the context, named the concept evolutionarily stable strategy, or ESS. It has since been developed extensively by Maynard Smith [188] and others.4 Henceforward, we will find it convenient to refer to uninvadable strategies as evolutionarily stable strategies or ESSs, regardless of whether the context is biological or sociological. In particular, from Figure 2.1, choosing pure strategy G with probability v ∗ = min(1, θ) is the unique ESS of Crossroads. From (2.4) with p = q = v ∗ , the reward to each individual in a population at this ESS is (2.7)
f (v ∗ , v ∗ ) = −(δ + 12 τ ) min(1, θ).
But this is not the largest possible reward to each individual from adopting a population strategy, because f (v, v) = 2v −(δ +)(θ +v 2 ) is maximized by vˆ = δ+ ∗ (< v ), and (2.8) 4
f (ˆ v , vˆ) = f (v ∗ , v ∗ ) + See §2.11 for more recent references.
2 1 δ+ {min(δ, τ /2)}
2.2. Evolutionarily stable strategies
63
is always greater than f (v ∗ , v ∗ ). Then, given that we all like delays to be shorter, why doesn’t the population adopt vˆ? The answer is simple: the (less aggressive) “cooperative” strategy vˆ is not a best reply to itself, and so it is not strategically stable in the way that v ∗ is. We will return to this point in §3.3.
2.2. Evolutionarily stable strategies We have loosely defined strategy v ∗ to be an ESS if, when it is adopted by a population, it does not pay an individual to switch from v ∗ to any other strategy. To make this definition more precise, we now consider a large population in which almost every individual plays the orthodox or “vogue” strategy v while a small subpopulation experiments with a deviant or mutant or “unfashionable” strategy u. We analyze strategic interaction within this population from the perspective of a focal u-strategist. It will often be convenient to think of this u-strategist as Player 1 and a v-strategist as Player 2—even though Player 2 is not a specific individual, but rather a representative of extremely common behavior. For the sake of definiteness, let us agree that proportion 1 − of the population is going vogue while proportion is being unfashionable, where is a small positive number. Then, in effect, the vogue population strategy has been perturbed from v to yield a mix of strategies with average value (2.9)
v = u + (1 − )v,
which does not differ greatly from v. Will more and more individuals copy u, so that its frequency increases, or will even the few unorthodox experimenters return to going vogue? The answer depends on the relative success of the strategies u and v. Let f (u, v) denote the reward to u in a v population, assumed so large that the frequency of u is effectively zero; correspondingly, f (v, v) is the payoff to every individual in an absolutely uniform population of v-strategists. By extension, the reward to strategy u in a population playing v is denoted by f (u, v), and the reward to v in the same population is denoted by f (v, v). We assume that5 (2.10)
f (u, v) = f (u, u + (1 − )v) = f (u, u) + (1 − )f (u, v),
which we interpret as the reward to being unfashionable when surrounded by u with (small) probability and by v with (large) probability 1 − . Correspondingly, (2.11)
f (v, v) = f (v, u + (1 − )v) = f (v, u) + (1 − )f (v, v)
5 Note how careful we were to say that the reward to strategy u in a population playing v is denoted by f (u, v), because f (u, v) is essentially undefined until we have assumed (2.10). In the first instance, f is a function from the set of all pairs of strategies to the real numbers; the function is well defined only if the second argument is a strategy that all players other than the focal individual adopt. But v is not a strategy—it is a strategy mix. So, in the first instance, f (u, v) is undefined—you can’t say that all the other players adopt v, because v represents a population in which some players adopt u and others adopt v. It is therefore necessary to extend the definition of f from the domain consisting of all pairs of strategies to a domain consisting of all pairs whose first component is a strategy and whose second component is a strategy mix. That is what (2.10) achieves. Alternatively, we may consider a function with three arguments, say Φ, such that Φ(u, v, 0) = f (u, v). In the first instance, Φ(u, v, ) is defined only for = 0. We extend its definition to > 0 by assuming Φ(u, v, ) = Φ(u, u, 0) + (1 − )Φ(u, v, 0). Then we regard f (u, v) as just a different representation for Φ(u, v, ), but a much more intuitive one, because it instantly evokes the idea of a u-strategist in a population with a very small proportion of u and a very large proportion 1 − of v.
64
2. Population Games
is the reward to keeping vogue when surrounded by u with probability and by v with probability 1 − . It does not pay to switch from v to u when f (v, v) > f (u, v) or (2.12)
f (v, u + (1 − )v) > f (u, u + (1 − )v).
For v ∗ to be an ESS, it must not pay to switch from v ∗ to any u = v ∗ , for all sufficiently small > 0. Hence, setting v = v ∗ in (2.12) and taking the limit as → 0, a necessary condition for v ∗ to be an ESS is that f (v ∗ , v ∗ ) ≥ f (u, v ∗ )
(2.13)
for all u. Comparing with (1.20) and recalling that f1 (u, v) = f (u, v) and f2 (u, v) = f (v, u) by symmetry, we see that a necessary condition for v ∗ to be an ESS is that (v ∗ , v ∗ ) is a Nash equilibrium. Thus candidates for evolutionary stability correspond to symmetric Nash equilibria—but an ESS is a strategy, whereas a Nash equilibrium is a strategy combination. A sufficient condition for v ∗ to be an ESS is that (2.13) holds with strict inequality, that is, f (v ∗ , v ∗ ) > f (u, v ∗ )
(2.14)
for all u = v ∗ . This result holds because it follows from (2.10)–(2.12) that (2.15) f (v ∗ , u + (1 − )v ∗ ) − f (u, u + (1 − )v ∗ ) = (1 − ){f (v ∗ , v ∗ ) − f (u, v ∗ )} + {f (v ∗ , u) − f (u, u)}, which is positive for all u = v ∗ for all sufficiently small > 0 if (2.14) holds—if, of course, f is bounded, which we assume. If this sufficiency condition holds, then we say that v ∗ is a strong ESS. So v ∗ is a strong ESS if and only if (v ∗ , v ∗ ) a strong, symmetric Nash equilibrium. Alternatively, from (2.14), v ∗ is a strong ESS if it is uniquely the best reply to itself. For example, Crossroads has a strong ESS if θ > 1 because v ∗ = 1 is then uniquely the best reply to itself, by Figure 2.1(a). It is possible, however, that (2.13) holds for all u but (2.14) fails for some u = v ∗ . Any such u is an alternative best reply to v ∗ . For all such u it follows from (2.15) that (2.16)
f (v ∗ , u + (1 − )v ∗ ) − f (u, u + (1 − )v ∗ ) = {f (v ∗ , u) − f (u, u)},
which must be positive for all such u for v ∗ to be an ESS. In sum, v ∗ is an ESS if both (2.17a)
f (v ∗ , v ∗ ) ≥ f (u, v ∗ )
and, for all u = v ∗ , either (2.17b)
f (v ∗ , v ∗ ) > f (u, v ∗ )
or (2.17c)
f (v ∗ , u) > f (u, u).
for all u
2.2. Evolutionarily stable strategies
65
Equivalently, v ∗ is an ESS if for all u = v ∗ , either (2.18a)
f (v ∗ , v ∗ ) > f (u, v ∗ )
or (2.18b)
f (v ∗ , v ∗ ) = f (u, v ∗ )
and (2.18c)
f (v ∗ , u) > f (u, u).
However we look at it, the underlying intuition is that either mutant u-strategists cannot even enter a population of v ∗ -strategists (because the ESS is strong); or mutant u-strategists can enter—by virtue of playing an alternative best reply to v ∗ —but they cannot proliferate. Naturally, if v ∗ is an ESS but not a strong ESS, then we say that v ∗ is a weak ESS. For example, Crossroads has a weak ESS if θ < 1 because then there are infinitely many alternative best replies to v ∗ = θ, by Figure 2.1(b). Here three remarks are in order. First, in principle (2.12) and either (2.17) or (2.18) yield equivalent conditions, although in practice (2.12) is rarely used. Second, these conditions are quite general: strategies may be vectors of any dimension. To illustrate, we revisit Four Ways. Although we viewed this game as a community game in §1.3, where we discovered the profusion of Nash equilibria in Table 1.8, we also paved the way for our current populational perspective by assuming the game to be symmetric. Now, no matter how many Nash equilibria a game may have when viewed as a community game, only its symmetric Nash equilibria can yield an ESS when the same game is viewed as a population game instead. Recall that strategies in Four Ways are two-dimensional vectors, u = (u1 , u2 ) for the focal individual and v = (v1 , v2 ) for the rest of the population (where the two components are the probabilities of selecting pure strategies G and W , respectively). Symmetry between strategies—that is, u = v—requires both u1 = v1 and u2 = v2 . Thus, from Table 1.8, the only symmetric Nash equilibrium is(u, v) = (α, β, α, β), where α and β are defined by (1.33). Let us define ζ = α, β . Then v ∗ = ζ is the only candidate for ESS. From (1.30), the reward to a player who selects u = (u1 , u2 ) against an opponent who selects v = (v1 , v2 ) is (2.19)
f (u, v) = − (2δv1 + (δ + τ /2)(v2 − 1)) u1 − ((δ − τ /2)(v1 − 1) + (δ + )v2 ) u2 + (δ − τ /2)v1 + (δ + τ /2)(v2 − 1).
We readily find (see Exercise 2.4) that (2.20a)
f (ζ, ζ) = f (u, ζ)
and that 2
2
2
(2.20b) f (ζ, u) − f (u, u) = δ {u1 + u2 − α − β} + δ {u1 − α} + δ {u2 − β} , which is greater than zero for all u = ζ. So v ∗ = ζ is a weak ESS, because (2.18b) and (2.18c) both hold. Third, we are not obliged to allow mixed strategies in population games, and for some purposes it is preferable to dispense with them. For games restricted to
66
2. Population Games
finitely many pure strategies—that is, for discrete population games—it is convenient to frame a separate definition of ESS in terms of a payoff or reward matrix.6 Accordingly, let aij be the payoff to strategy i against strategy j in a symmetric game restricted to m pure strategies. Then, by analogy with (2.17), strategy k is an ESS if both (2.21a)
akk ≥ ajk
for all
j = 1, . . . , m
and, for all j = k, either (2.21b)
akk > ajk
or (2.21c)
akj > ajj .
If (2.21b) holds for all j = k (that is, if the diagonal element akk is the largest element in its column of the m × m payoff matrix A), then strategy k is a strong ESS (and otherwise it is a weak ESS). A key result is that if strategy k is a strong ESS of the discrete game, then it must remain a strong ESS when the game is extended to allow for mixed strategies. For suppose that a mutant selects strategy j with probability pj for j = 1, . . . , m, there would be no deviation from strategy k). Then the payoff where pk < 1 (or against k will be m j=1 ajk pj . But akk > ajk for all j = k, implying akk pj ≥ ajk pj for all j = 1, . . . , m, with strict inequality (and m pj > 0) for at least one value mof j = k a p > a p , implying a > (because pk < 1). Hence m kk j=1 kk j j=1 jk j j=1 ajk pj m because j=1 pj = 1. Hence the reward to k against itself is strictly greater than the reward to any other strategy against k, making k a strong ESS. For example, because a11 is the largest element in the first column of the payoff matrix A in Table 1.1 when τ2 = τ and δ < 12 τ (slow drivers), G is a strong ESS of the discrete population game of Crossroads. It is therefore also a strong ESS of the extended game considered in §2.1 when θ > 1, as established by Figure 2.1.
2.3. Crossroads as a continuous population game The population games that we discussed in §2.1 and §2.2 were all matrix games. But a population game can instead be continuous, without in any way affecting the definition of evolutionarily stable strategy. To illustrate, in this section we generalize the symmetric version of Crossroads studied in §2.1. We call this new game Crossroads II. Now, in the (symmetric) game of Crossroads, delays due to mutual dithering, mutual impetuosity, and waiting for an opponent to traverse the junction are represented by the parameters , δ, and τ , respectively. Drivers do not distinguish 6 In the game-theoretic literature, the terms payoff matrix and reward matrix tend to be used interchangeably, the former being more prevalent, and we have seen as early as §1.1 that entries in a payoff matrix are typically expected values. In our treatment of population games, however, we will tend to use reward matrix and payoff matrix, respectively, to distinguish between the case in which aij is the reward to an i-strategist from interacting with a population of j-strategists and the case in which aij is the reward or payoff to an i-strategist from a pairwise interaction with a particular j-strategist, albeit one drawn randomly from the population. Thus, in terms of Maynard Smith [188, p. 23], we use payoff and reward matrix to distinguish between pairwise contests and playing the field, respectively. Note, however, that (2.21) defines an ESS in either case.
2.3. Crossroads as a continuous population game
1
67
y
I
II
III
IV
v
0
0
u
1
x
Figure 2.2. The sample space for the joint distribution of latenesses
between these three possible sources of delay in their perceptions of the cost of delay. In the new game of Crossroads II, however, drivers perceive delays due to mutual dithering or impetuosity as inherently more wasteful of time—by virtue of being avoidable—than delays due to opponent traversal (which are inevitable, or there would be no conflict). Specifically, drivers perceive traversal delays as very costly only if they perceive themselves to be very late; and the earlier they perceive themselves, the lower they perceive the costs of such delays. In Crossroads II, lateness is measured by an index between 0 and 1, with 0 corresponding to the lowest possible perception and 1 to the highest. If X denotes the u-strategist’s lateness, then 1−X may be interpreted as his earliness. Similarly, if Y denotes the v-strategist’s lateness, then 1 − Y may be interpreted as his earliness. Both earliness and lateness are always numbers between 0 and 1. We assume, in fact, that X and Y are continuous random variables, independently distributed between 0 and 1 with probability density function g. So the point (X, Y ) is also a continuous random variable, distributed over the unit square in Figure 2.2 with joint probability density g(x)g(y) per unit area. To be quite precise, in Crossroads II, drivers discount traversal delays by a fraction η of their earliness, where 0 ≤ η < 1. Thus a traversal delay of τ is perceived as τ {1 − η(1 − X)} by Ed, the focal and as τ {1 − η(1 − Y )} by player, any other player. The payoff to Ed is now − δ + 12 τ 1 − η(1 − X) in the event of GG, −τ {1 − η(1 − X)} in the event of W G, and so on. But strategies are no longer probabilities of going; rather, they are critical thresholds of lateness above which a player goes, and below which that player waits. In other words, strategy u means go if X > u, but otherwise wait; and strategy v means go if Y > v, but otherwise wait. Thus W G is the event that X ≤ u, Y > v; GG is the event that X > u, Y > v; W W is the event that X ≤ u, Y ≤ v; and GW is the event that X > u, Y ≤ v. These four events correspond to subregions I, II, III, and IV, respectively, of the unit square in the x-y plane; see Figure 2.2. As in §2.1, Ed’s reward f is the expected value of his payoff, which we denote by F . It depends on X and Y , as follows: ⎧ ⎪ ⎪ −τ {1 − η(1 − X)} if (X, Y ) ∈ I, ⎪ ⎨− δ + 1 τ {1 − η(1 − X)} if (X, Y ) ∈ II, 2 F (X, Y ) = 1 ⎪ if (X, Y ) ∈ III, − + ⎪ 2 τ {1 − η(1 − X)} ⎪ ⎩ 0 if (X, Y ) ∈ IV.
68
2. Population Games
11 So Ed’s reward is f (u, v) = E[F ] = 0 0 F (x, y) g(x) g(y) dx dy, where E denotes expected value (not Ed). We will find it a great convenience, however, especially in §2.6 and Chapter 6, to write (2.22)
dA = g(x) g(y) dx dy, 11 so that Ed’s reward becomes f (u, v) = 0 0 F (x, y)dA, which is significantly more compact. Think of (2.22) as merely a notational ruse for avoiding needless clutter: when you see dA on the right-hand side of an integral, instantly replace it by g(x) g(y) dx dy in your mind. Let us now assume, for the sake of simplicity, that lateness is uniformly distributed between 0 and 1, so that g is defined by (2.23)
g(ξ) = 1,
0 ≤ ξ ≤ 1.
Then we can calculate the reward as the sum of four contributions, one for each subregion of the unit square in Figure 2.2. Integration over I, where 0 ≤ xsubregion =u y=1 x ≤ u and v ≤ y ≤ 1, yields the contribution fI (u, v) = x = 0 y = v F (x, y) dA = 1 u −τ 0 1 − η(1 − x) dx v dy, or (2.24) fI (u, v) = −τ u(1 − v) 1 − η 1 − 12 u after simplification. Similarly, integration over subregion II, where u ≤ x ≤ 1 and v ≤ y ≤ 1, yields the contribution (2.25) fII (u, v) = −(1 − u)(1 − v) δ + 12 τ 1 − η2 {1 − u} ; integration over subregion III, where 0 ≤ x ≤ u and 0 ≤ y ≤ v, yields (2.26) fIII (u, v) = −uv + 12 τ 1 − η2 {1 − u} ; and integration over subregion IV yields fIV(u, v) = 0. We obtain f (u, v) = 1 fI (u, 1v) + fII (u, v) + f1III (u, 1v) + f2IV (u, v) = + 2 τ − (δ + )(1 − v) (1 − u) + − 2 τ (1 − v) − − 2 τ − 4 ητ u − 2u − 1 + v or (2.27) f (u, v) = (δ + ) (1 − v − θ)u − 12 λ u2 − 2u − 1 + v − {(1 + θ)δ − (1 − θ)}(1 − v) after simplification, where θ is defined by (2.2) and (2.28)
λ =
ητ . 2( + δ)
The expression for f agrees with (2.1) when η = 0.7 In the case where drivers are either fast or intermediate according to definitions (1.8)–(1.9), i.e., where δ > 12 τ and so λ < θ < 1, it follows from (2.29)
∂f = (δ + ){1 − v − θ − λ(u − 1)} ∂u
7 Note, however, that “agrees with” does not mean “is identical to” (for η = 0), because u in Crossroads has a different meaning from u in Crossroads II. In the first case, u is the probability of G for Ed; in the second case, u is a critical lateness above which Ed goes, so that the probability of G for Ed is 1 − u, by (2.23). Similarly for v.
2.3. Crossroads as a continuous population game
1
69
v
ω1 1-θ
0 0
v*
1
u
Figure 2.3. Typical optimal reaction sets for Crossroads II when drivers are not slow. The quantities ω1 and v ∗ are defined by ω1 = 1 − θ + λ and (2.31). The figure is drawn for δ = 3, = 2, τ = 2, and η = 1 (but would have the same topology for any δ > τ /2).
that f (u, v) is maximized for 0 ∈ [0, 1] ⎧ ⎪ 1 ⎨ 1−θ+λ−v (2.30) B(v) = λ ⎪ ⎩ 0
by u = B(v), where if 0 ≤ v ≤ 1 − θ, if 1 − θ < v < 1 − θ + λ, if 1 − θ + λ ≤ v ≤ 1.
Because (u, v) ∈ R is equivalent to u = B(v), where R is defined by (2.5), the ustrategist’s optimal reaction set consists of three straight line segments in the unit square of the u-v plane, as indicated by the solid curve in Figure 2.3. The dashed curve is the v-strategist’s optimal reaction set, defined by (2.6), and is obtained by reflecting R in the line u = v (which is shown dotted). These sets intersect at (0, 1), at (1, 0) and at (v ∗ , v ∗ ), where (2.31)
v∗ = 1 −
θ . 1+λ
Thus, as in Crossroads, there are three Nash equilibria, only one of which is symmetric. For any positive η, however, B is now a function,8 so that, unlike in Crossroads, every strategy has a unique best reply. In particular, because v ∗ is uniquely the best reply to itself, f (v ∗ , v ∗ ) ≥ f (u, v ∗ ) must hold with f (v ∗ , v ∗ ) > f (u, v ∗ ) for all u = v ∗ , and so v ∗ is a strong ESS, as we can verify by observing that f (v ∗ , v ∗ ) − f (u, v ∗ ) = 14 ητ (u − v ∗ )2 is strictly positive for all u = v ∗ . So we can regard the weak ESS in Crossroads as rather atypical, because it arises only in the limit as η → 0, and it seems unlikely (at least to me) that real-world drivers do not discount traversal delays at all. Of course, in the limit as η → 0, the game is indistinguishable from Crossroads itself. One good reason for drivers to discount traversal delays is that everybody benefits. In a population at the ESS, the reward to each driver is f (v ∗ , v ∗ ), and so the average delay is −f (v ∗ , v ∗ ). It is straightforward to show that this quantity decreases with respect to η (Exercise 2.7). So a population of drivers who discount traversal delays experiences shorter average delays than a population of drivers who do not discount; and the more they discount, the less they wait. 8
As opposed to a multi-valued function or correspondence. See Footnote 11 on p. 19.
70
2. Population Games
Figure 2.4. Typical optimal reaction sets for Crossroads II when (a) δ < (1 − η)τ /2 or λ < θ − 1 (implying θ > 1); (b) (1 − η)τ /2 < δ < ητ /2 − or λ > max{1, θ − 1}, a case that arises only if 1/2 < η < 1, < (2η − 1)τ /2; (c) max{(1 − η)τ /2, ητ /2 − } < δ < τ /2 or θ − 1 < λ < 1; and (d) δ > τ /2 or λ < θ < 1. Here ω1 = 1 − θ + λ, ω2 = ω1 /λ and v ∗ is defined by (2.31); the figure is drawn for = 0, τ = 5, η = 0.75 with (a) δ = 0.5, (b) δ = 1, (c) δ = 2.25, and (d) δ = 3. In the first three cases, drivers are slow.
The above results are readily generalized to allow for slow drivers, or δ < τ /2: in place of (2.31), the ESS becomes
0 if δ ≤ 12 (1 − η)τ, ∗ (2.32) v = θ if δ > 12 (1 − η)τ, 1 − 1+λ as illustrated by Figure 2.4. The game is also readily generalized to allow for an arbitrary nonuniform distribution of lateness, i.e., for any g such that g(ξ) ≥ 0 and 1 g(ξ) dξ = 1 in place of (2.23); for details, see Exercise 2.7. 0
2.4. Hawk-Dove games As noted in §2.1, Maynard Smith introduced his concept of ESS in the context of evolutionary biology. We now discuss a suite of models that he developed to explore the behavior of two animals in conflict over an indivisible territory—or any
2.4. Hawk-Dove games
71
other indivisible resource, but for the sake of definiteness we suppose it to be a territory. In this section, it will be convenient to denote the focal u-strategist by P1 (for Player 1) and the v-strategist by P2 . To begin with, we assume that P1 and P2 have only two pure strategies, which, for the sake of tradition, we label “Hawk” and “Dove” (but the animals belong to the same species). To play Hawk, or H, one must “escalate,” i.e., act fierce; and if that doesn’t scare away the opponent, then one must fight until injury determines a victor. To play Dove, or D, one must first “display,” i.e., merely look fierce, and hope that the opponent is scared away; but if it starts to act fierce, then one must retreat and search for real estate elsewhere. Later, however, we increase the number of possible pure strategies from two to three or four. Let V be the reproductive value of the territory, let Ci be the reproductive cost of being injured in a fight, and let Ce denote any other cost of escalation (which might be primarily energetic); Ce is paid by both animals who fight, whereas Ci is paid only by the loser. By reproductive value we mean the incremental number of offspring—as opposed to the absolute number—that the territory would yield to the animal (or rather, since the increment is a random variable, its expected value). Thus, if an animal who averages five little ones per breeding season from the kind of territory that nobody scraps over can raise this number to eight by acquiring the territory that is in dispute, then V = 8 − 5 = 3. Similarly, by Ci or Ce we mean the amount by which number of expected offspring would be reduced by virtue of injury or escalation, respectively. Thus reproductive cost and value are both in terms of expected future reproductive success, for which fitness is a frequent and often preferred synonym—and is anyhow the term we shall use. Let us analyze this territorial conflict from P1 ’s point of view. When strategy combination DH is selected by the players, P1 plays Dove and P2 plays Hawk. Then P1 retreats as soon as P2 acts fierce, and the payoff to P1 in terms of fitness is zero. When HD is selected, P2 retreats as soon as P1 acts fierce, and the payoff to P1 is V . When DD is selected, both P1 and P1 play Dove. There follows a staring match, during which both animals look fierce but refrain from acting fierce. Let F denote the animal who (given DD) first gets tired of staring and retreats; then F is a random variable whose sample space is {P1 , P2 }. If neither animal is especially nervous, then it seems reasonable to suppose that each is as likely as the other to retreat first, and so the expected value of P1 ’s payoff (given DD) is (2.33)
0 · Prob(F = P1 ) + V · Prob(F = P2 ) = 0 ·
1 2
+V ·
1 2
= 12 V.
In effect, when Doves interact, the resource is randomly allocated. In principle, we could add a cost to a contest between a pair of Doves, as Maynard Smith once considered [186, p. 43], but it complicates the algebra without telling us anything more useful about the logic of animal contests, and he eventually stopped doing it [188]. We assume instead that costs of staring—beyond using energy at basal metabolic rate—are negligible compared to Ce or Ci .9 The larger 9 The costs of using energy at basal metabolic rate are entirely ignorable, on the grounds that they are paid regardless of what the animals are doing, and subtracting the same constant from every payoff has no strategic effect; in particular, it cannot affect conditions for Nash equilibrium or evolutionary stability, since the same number would be subtracted from both sides of each inequality.
72
2. Population Games
Table 2.1. Payoff matrix for Hawk-Dove I
P1
H D
P2 H 1 (V − C) 2 0
D V 1 2V
point here is that judicious approximation is the essence of modelling: effects that are small in a real population are typically absent from a model [211]. We have assumed above that the territory or other resource is indivisible. If instead the resource is divisible, then an alternative interpretation of payoff 12 V to strategy combination DD is that two Doves share the resource equally.10 However, although divisibility makes equal sharing possible, it does not guarantee it—a divisible resource can still be randomly allocated by two Doves, and in §6.5 we discuss the conditions that favor either alternative. But here the resource is indivisible, and so random allocation is the only option. When HH is selected, both animals play Hawk. There follows a major skirmish that one player must eventually win. Let random variable W , again with sample space {P1 , P2 }, denote the winner (given HH); and suppose that neither P1 nor P2 is an especially good fighter. Then it is reasonable to suppose that each is as likely as the other to win, or Prob(W = P1 ) = 12 = Prob(W = P2 ). If P1 wins, then its payoff is V − Ce ; but if P2 wins, then the payoff to P1 is −Ce − Ci . Thus the expected value of P1 ’s payoff (given HH) is (V − Ce ) · Prob(W = P1 ) + (−Ce − Ci ) · Prob(W = P2 ) = 12 (V − C), where (2.34)
C = 2Ce + Ci ,
and it follows that P1 ’s payoff matrix is as shown in Table 2.1. Note that although C is a convenient mathematical notation, as the average cost of a pair of fights it lacks intuitive biological appeal—unless Ce is so small compared to Ci that we can set Ce = 0 and interpret C itself as the cost of losing a fight, which is both how A was initially framed by Maynard Smith [188, p. 12] and how it is usually interpreted. Before proceeding, it is convenient to define a dimensionless ratio of value to cost, (2.35)
θ = V /C.
Although we are assigning double duty to a parameter already in use for Crossroads, its intended meaning will always be obvious from context. Now, from Table 2.1, we can write the payoff matrix as 1 θ − 1 2θ (V − C) V 1 C . = (2.36) A = AI = 2 1 2 0 θ 0 2V The payoff matrix for P2 follows at once as the transpose of AI —but as noted in §2.2, we do not need it, because this is a population game. We call it Hawk-Dove I 10
As assumed in the initial development of the Hawk-Dove game [188, p. 13].
2.4. Hawk-Dove games
73
(hence, the subscript on A). It is often called the “classic” Hawk-Dove game (e.g., by Kokko [160, p. 9]). We define mixed strategies as probabilities of being aggressive. Let P1 select pure strategy H with probability u and, hence, D with probability 1 − u; whereas P2 selects H with probability v and, hence, D with probability 1 − v. Then, by analogy with (1.13), the reward to a u-strategist in a population of v-strategists is (2.37)
f (u, v) = (u, 1 − u)AI (v, 1 − v)T =
1 2 {θ(1
− v) + (θ − v)u}C.
It follows readily that the optimal reaction set is as shown in Figure 2.1, and hence (2.38)
v ∗ = min(1, θ)
is uniquely the ESS (Exercise 2.2); from (2.17) or (2.18), this ESS is strong if θ > 1 but weak if θ ≤ 1.11 Note that θ is small when C is large compared to V ; moreover, the probability of an actual fight (i.e., that both animals are aggressive) is then θ 2 , which is even smaller. So a possible explanation for the rarity in nature of protracted fights is that the cost of injury is much too high. Maynard Smith [188, pp. 17–18] expanded his Hawk-Dove game by adding a third pure strategy, Retaliator, which he denoted by R, but we still have a different use for R, and so we use F B (for fight back) instead. An F B-strategist always begins by displaying, but then escalates and prepares for battle if its opponent escalates. So two F B-strategists behave like two Doves, and a Hawk and an F B-strategist behave like two Hawks; that is, a33 = a22 , and a31 = a13 = a11 . In confrontations between D and F B, however, the F B-strategist will sometimes intuit that its opponent is a really a Dove and exploit it by escalating. For the sake of definiteness, suppose that the F B-strategist recognizes a Dove and escalates with probability λ, where λ > 0 is small; whereas, with probability 1 − λ, the F B-strategist cannot tell that the Dove is not a Retaliator, and hence merely displays. Thus, the probability that the F B-strategist secures the disputed resource is 1 · λ + 12 · (1 − λ) = 12 (1 + λ), implying a32 = 12 (1 + λ)V ; the probability that the Dove secures it is 12 (1 − λ), implying a23 = 12 (1 − λ)V ; and (2.36) expands to become ⎡ ⎤ θ−1 2θ θ−1 θ (1 − λ)θ⎦ (2.39) AII = 12 C ⎣ 0 θ − 1 (1 + λ)θ θ with θ still defined by (2.35). Let us call this game Hawk-Dove II. Inspection of AII reveals two properties. First, H is no longer an ESS for θ ≥ 1. It certainly cannot be a strong ESS, because F B is an alternative best reply to H (a31 = a11 ); but H is also not even a weak ESS, because F B can invade (a13 < a33 ). Second, for any value of θ, F B is a strong ESS of the discrete population game, because a33 > max(a13 , a23 ), and so (2.21b) is satisfied with k = 3 for all j = 3. So, by the remarks at the end of §2.2, Retaliator remains a strong ESS when the game is extended to allow for mixed strategies. That is, F B is also a strong ESS of the population game with two-dimensional strategy set Δ defined by (1.28) and 11 We are inclined to exclude the case θ = 1 on the grounds that it is just too fanciful to suppose that C would exactly equal V (see Footnote 6 on p. 13), but no harm is done by including it.
74
2. Population Games
Table 2.2. Strategy set for Owners and Intruders
Symbol HH B X DD
Name Obligate Hawk Bourgeois anti-Bourgeois Obligate Dove
Definition Always escalate Escalate if owner, do not escalate if intruder Do not escalate if owner, escalate if intruder Never escalate
reward function f (u, v) = f (u1 , u2 , v1 , v2 ) (2.40)
= (u1 , u2 , 1 − u1 − u2 )AII (v1 , v2 , 1 − v1 − v2 )T = 12 θ − v1 − (1 − v1 )(u1 + λθu2 ) + θ(v2 u1 − v1 u2 ) +
θλv2 (1 − u1 ) + v2 u1 + v1 u2 C
by analogy with (1.31). More precisely, v ∗ = (0, 0)—to which F B corresponds in the extended game—is a strong ESS; however, it is not the only ESS for θ < 1, as we will demonstrate in §2.5. Maynard Smith [188,191] also expanded Hawk-Dove I to include an asymmetry of role. Suppose that each of P1 or P2 is always either an owner or an intruder and that these two roles are mutually exclusive and recognized as such. Then we increase the number of pure strategies from two to four by redefining strategies as in Table 2.2. Our notation is chosen to acknowledge a distinction between strategies and tactics. In general, a tactic is an action that a player can take during a game, and a strategy is a rule for deploying tactics. Because there is no asymmetry of role in Hawk-Dove I, there is also is no difference between strategy and tactic; H, for example, is indeed a strategy. Here, by contrast, H is merely a possible tactic, deployed by Bourgeois (strategy B) only as owner and by anti-Bourgeois (strategy X) only as intruder, whereas HH is the strategy of deploying H in either role. Both H and HH are all Hawk, but are not quite the same thing. We assume, at least to begin with, that our animals are equally likely to occupy either of the two roles. So the payoff to B against X is 12 · 12 (V −C)+ 12 · 12 V = 12 V − 14 C because the B-strategist is equally likely to be an escalating Bourgeois owner against an escalating anti-Bourgeois intruder or a displaying Bourgeois intruder against a displaying anti-Bourgeois owner. In effect, this is equally likely to be a Hawk against a Hawk or a Dove against a Dove. Proceeding likewise for the other 15 cases, we obtain Table 2.3 (Exercise 2.8); and with θ = V /C, as in (2.35), the payoff matrix becomes ⎡ ⎤ 2(θ − 1) 3θ − 1 3θ − 1 4θ ⎢ θ−1 2θ 2θ − 1 3θ ⎥ ⎥. (2.41) A = 14 C ⎢ ⎣ θ−1 2θ − 1 2θ 3θ ⎦ 0 θ θ 2θ Let us call this game Owners and Intruders. We assume θ = 1.12
12
See Footnote 6 on p. 13.
2.4. Hawk-Dove games
75
Table 2.3. Payoff matrix for Owners and Intruders
P1
HH B X DD
HH 1 (V − C) 2 1 (V − C) 4 1 (V − C) 4 0
P2 B 3 V − 14 C 4 1 V 2 1 1 V − C 2 4 1 V 4
3 V 4 1 V 2
X − 14 C − 14 C 1 V 2 1 V 4
DD V 3 V 4 3 V 4 1 V 2
By inspection, for θ > 1, HH is a strong ESS of this game; whereas for θ < 1, B and X are both strong ESSs (but HH is not). By the remarks at the end of §2.2, these strong ESSs remain strong ESSs when the game is extended to allow for mixed strategies. So they remain strong ESSs of a population game whose strategy set is the tetrahedron T = {(u1 , u2 , u3 )|u1 , u2 , u3 ≥ 0, u1 + u2 + u3 ≤ 1}. However, this is not the only way to embed the strategies in Table 2.2 in an extended game, and we choose instead to embed HH, B, X, and DD in the strategy set S = {(u1 , u2 )|0 ≤ u1 , u2 ≤ 1} = [0, 1] × [0, 1]
(2.42) by defining (2.43)
u = (u1 , u2 )
to mean escalate or display with probabilities u1 and 1 − u1 , respectively, if in the role of owner, but escalate or display with probabilities u2 and 1 − u2 if in the role of intruder. Now HH corresponds to (1, 1), B to (1, 0), X to (0, 1), and DD to (0, 0). We prefer S to T because it is more plausible biologically and more tractable mathematically. Note that the strategy defined by (2.43) is not a mixture of the pure strategies HH, B, X, and DD, but rather a mixture of the associated tactics of escalating and displaying. Game theorists distinguish mixed strategies—which randomize over pure strategies at the start of a game—from strategies that randomize over tactics during a game by giving the latter a different name. That name used to be behavior strategies [101, 174], but nowadays they are more commonly known as stochastic [250, 254] or randomizing strategies [148]. So technically u is a stochastic strategy, not a mixed one; however, this is not a distinction on which we need to dwell.13 Because our focal individual is equally likely to be an owner against an intruder or an intruder against an owner, it is equally likely to be a u1 -strategist against a v2 strategist or a u2 -strategist against a v1 -strategist in a regular game of Hawk-Dove I. So our new game of Owners and Intruders has reward function (2.44a)
f (u, v) = f (u1 , u2 , v1 , v2 ) =
13
1 2 (u1 , 1
− u1 )AI (v2 , 1 − v2 )T + 12 (u2 , 1 − u2 )AI (v1 , 1 − v1 )T ,
For further discussion, see [306, pp. 44–45].
76
2. Population Games
where AI is the payoff matrix for Hawk-Dove I, implying (2.44b)
1 4 {(θ
f (u, v) =
− v2 )u1 + (θ − v1 )u2 + (2 − v1 − v2 )θ}C
14
by (2.36) above. A question now immediately arises: are any new ESSs induced by extending Owners and Intruders to include stochastic strategies? We defer our answer to the following section (p. 80).
2.5. More on evolutionarily stable strategies We have seen that, no matter how many Nash equilibria a game may have when viewed as a community game, only its symmetric Nash equilibria can yield an ESS when the game is viewed as a population game. In the case of Crossroads and Four Ways, we were easily able to determine the only candidates for ESS because in Chapter 1 we had already determined all of their Nash equilibria, and so it was necessary only to identify the symmetric ones. But what happens if we don’t already know R1 ∩ R2 ? Must we still calculate R, in order to identify candidates for ESS? The answer turns out to be no. The simplest case, exemplified by Hawk-Dove I, is where strategies are scalars belonging to an interval, assumed without loss of generality to be [0, 1]. By (2.5), (v ∗ , v ∗ ) ∈ R requires the reward f (u, v) to satisfy f (v ∗ , v ∗ ) = maxu f (u, v ∗ ) and hence f (v ∗ + h, v ∗ ) ≤ f (v ∗ , v ∗ ) for all v ∗ + h ∈ [0, 1], implying f (v ∗ + h, v ∗ ) − f (v ∗ , v ∗ ) ≤ 0 h
(2.45) if h > 0 but
f (v ∗ + h, v ∗ ) − f (v ∗ , v ∗ ) ≥ 0 h
(2.46)
if h < 0. If v ∗ is an interior ESS, that is, if v ∗ ∈ (0, 1), then v ∗ + h ∈ (0, 1) for all sufficiently small positive |h|, and so (2.45) and (2.46) both hold for all such h. ∂f ≤ 0, whereas the limit of (2.46) The limit of (2.45) as h → 0 now yields ∂u u=v=v ∗ ∂f as h → 0 yields ∂u u=v=v∗ ≥ 0, and these inequalities are both satisfied only if ∂f (2.47) = 0, ∗ ∂u
u=v=v
∗
a necessary condition for v to be an interior ESS—provided, of course, that f is sufficiently differentiable, which we assume.15 This condition instantly identifies the only candidate for interior ESS of Hawk-Dove I, because from (2.37) with θ ∂f = 12 (θ − v ∗ )C. If, on the other hand, v ∗ is defined by (2.35) we obtain ∂u u=v=v ∗ 14
Note that ⎡
f (1, 1, 1, 1) ⎢f (1, 0, 1, 1) A = ⎣ f (0, 1, 1, 1) f (0, 0, 1, 1)
f (1, 1, 1, 0) f (1, 0, 1, 0) f (0, 1, 1, 0) f (0, 0, 1, 0)
f (1, 1, 0, 1) f (1, 0, 0, 1) f (0, 1, 0, 1) f (0, 0, 0, 1)
⎤ f (1, 1, 0, 0) f (1, 0, 0, 0)⎥ f (0, 1, 0, 0)⎦ f (0, 0, 0, 0)
yields payoff matrix (2.41) for the embedded discrete population game. 15 Hence the necessary conditions developed in this section do not invariably apply. For example, (2.47) cannot be used to analyze the cake-cutting game in Exercise 2.22.
2.5. More on evolutionarily stable strategies
77
a boundary ESS (v ∗ = 0 or v ∗ = 1), then (2.47) need not hold. Rather, for v ∗ = 0 to be an ESS, (0, 0) ∈ R requires only that ∂f ≤ 0 (2.48) ∂u
u=v=0
from the limit of (2.45) as h → 0 with v ∗ = 0; and (2.46) has no effect, because v∗ + h ∈ / [0, 1] for any h < 0. Likewise, for v ∗ = 1 to be an ESS, (1, 1) ∈ R requires only that ∂f (2.49) ≥ 0 ∂u
u=v=1
from the limit of (2.46) as h → 0 with v ∗ = 1, and this time (2.45) has no effect, / [0, 1] for any h > 0. For v ∗ = 0 cannot be an ESS of because v ∗ + h ∈ example, ∂f 1 Hawk-Dove I because (2.37) implies ∂u u=v=0 = 2 V , which is positive, violating ∂f = 12 (V − C) = 12 (θ − 1)C, (2.49) implies that (2.48). Likewise, because ∂u u=v=1
v ∗ = 1 can be an ESS of Hawk-Dove I only if θ ≥ 1. Similar considerations apply when strategies are two-dimensional vectors belonging, e.g., to a triangle as in Four Ways (§1.3) or to a square as in Owners and Intruders (§2.4). Let the reward to an individual using strategy u = (u1 , u2 ) in a population of individuals using strategy v = (v1 , v2 ) be denoted by either f (u, v) or f (u1 , u2 , v1 , v2 ), whichever is more convenient for the purpose at hand.16 Then because either component of u can be held constant while only the other component is allowed to vary, (v ∗ , v ∗ ) ∈ R requires not only f (v ∗ , v ∗ ) = max f (u1 , v2∗ , v1∗ , v2∗ ) u1
and hence
∂f = 0 ∂u1 u=v=v∗
(2.50a) but also
f (v ∗ , v ∗ ) = max f (v1∗ , u2 , v1∗ , v2∗ ) u2
and hence
∂f = 0 ∂u2 u=v=v∗
(2.50b)
at any point v ∗ interior to the strategy set. For example, with use of (2.19), (2.50) implies that v ∗ = ζ is the only candidate for interior ESS of Four Ways.17 To address the possibility of a boundary ESS, necessary conditions (2.50) must be appropriately modifed in an ad hoc fashion that depends on the shape of the boundary. A possible shape is that of an isosceles right-angled triangle—as, for example, in Four Ways, where the strategy set is Δ defined by (1.28). Then v ∗ = (0, 0) can be an ESS only if ∂f ∂f (2.51) ≤ 0, ≤ 0 ∗ ∗ ∂u1
16
u=v=v
∂u2
u=v=v
See Footnote 14 on p. 22. The notation introduced on p. 40 can be used to extend (2.50) to m-dimensional strategy sets where m > 2. Because any m − 1 components of u can be held constant while only, say, component i is allowed to vary, (v ∗ , v ∗ ) ∈ R requires f (v ∗ , v ∗ ) = maxui f (u||ui , v ∗ ) and hence ∂f /∂ui u=v∗ = 0, i = 1, . . . , m for an interior ESS. As before, these conditions can be modified in an ad hoc fashion to address the possibility of a boundary ESS. For an example with m = 3, see [231, pp. 274–275]. 17
78
2. Population Games
because (v ∗ , v ∗ ) ∈ R with v ∗ = (0, 0) requires f (v ∗ , v ∗ ) = max f (u1 , 0, 0, 0), u1
f (v ∗ , v ∗ ) = max f (0, u2 , 0, 0). u2
≤ 0, which is clearly violated. For Four Ways, the first of (2.51) reduces to δ + Likewise, for 0 < ω < 1, a boundary ESS of the form v ∗ = (ω, 0) is possible only if ∂f ∂f = 0, ≤ 0, (2.52) ∗ ∗ 1 2τ
∂u1
∂u2
u=v=v
u=v=v
∗
and a boundary ESS of the form v = (0, ω) is possible only if ∂f ∂f ≤ 0, = 0. (2.53) ∗ ∗ ∂u1
∂u2
u=v=v
u=v=v
For Four Ways, the second of (2.52) and the first of (2.53) reduce to (δ− 21 τ )(1−ω) ≤ 0 and (δ + 12 τ )(1 − ω) ≤ 0, respectively, both of which are violated (because δ > 12 τ by assumption); hence neither (2.52) nor (2.53) holds. Whether there exists a corner ESS of the form v ∗ = (1, 0) or v ∗ = (0, 1) is best determined by evaluating f (v ∗ , v ∗ ) − f (u, v ∗ ) directly. For example, v ∗ = (1, 0) cannot be an ESS of Four Ways because in (2.19) it yields f (v ∗ , v ∗ ) − f (u, v ∗ ) = −(1 − u1 )(δ − 12 τ ), which is negative for all u1 = 1, and likewise, v ∗ = (0, 1) cannot be an ESS of Four Ways because it yields f (v ∗ , v ∗ ) − f (u, v ∗ ) = −(1 − u2 )( + 12 τ ), which is negative for all u2 = 1. The case of a boundary ESS on the hypotenuse of Δ, say v ∗ = (ω, 1 − ω) with 0 < ω < 1, is more subtle. It at least requires ∂f ∂f ≥ 0, ≥ 0 (2.54) ∗ ∗ ∂u1
u=v=v
∂u2
u=v=v
because (v ∗ , v ∗ ) ∈ R requires f (v ∗ , v ∗ ) = max f (u1 , 1 − ω, ω, 1 − ω) = max f (ω, u2 , ω, 1 − ω). u1
But it also requires (2.55)
u2
∂f ∂f = ∂u1 u=v=v∗ ∂u2 u=v=v∗
because f (v ∗ , v ∗ ) = maxu f (u, v ∗ ) requires f to achieve a maximum along the ∂ hypotenuse, requiring ∂f (u, v ∗ )/∂s = 0 for u = v ∗ where ∂s = ˆs ·∇ is the derivative √ in the direction of ˆs = {i − j}/ 2. Here i or j is a unit vector in the direction of increasing u1 or u2 , respectively, and ∇ = i ∂/∂u1 + j ∂/∂u2 is the gradient operator. For Four Ways, (2.55) reduces to ω = θ where θ is defined by (2.2). Hence v ∗ = (θ, 1 − θ) is the only candidate for such an ESS on the hypotenuse of Δ. Nevertheless, it is not in fact an ESS because by (2.19) we find that v ∗ = (θ, 1 − θ) implies (2.56)
f (v ∗ , v ∗ ) − f (u, v ∗ ) = −(1 − u1 − u2 )(δ − 12 τ )θ,
which is negative for any u in the interior of Δ, violating (2.17a). Thus Four Ways does not have a boundary ESS (when drivers are not slow): the only ESS is the interior ESS, v ∗ = ζ. This is not new information—we knew it in §2.2. But we have now recovered it from (2.50)–(2.55) without explicit knowledge of the optimal reaction set. The same set of necessary conditions will also identify all ESSs of any other population
2.5. More on evolutionarily stable strategies
79
game whose strategy set is the triangle Δ defined by (1.28). In particular, we can use these conditions to determine whether Hawk-Dove II has any ESSs other than the strong ESS v ∗ = (0, 0) corresponding to F B or Retaliator, which we found on p. 74. Let us first determine whether Hawk-Dove II has an interior ESS. From (2.40) and (2.50), ∂f (2.57a) = 12 {v1∗ − 1 + (1 − λθ + θ)v2∗ }C ∗ ∂u1
and
u=v=v
∂f = ∂u2 u=v=v∗
(2.57b)
1 2 {(1
− θ)v1∗ − λθ(1 − v1∗ )}C
must both equal zero for v ∗ to be an interior ESS. On solving these two equations, we find that the only candidate is ˜ ˜ β), v ∗ = (α,
(2.58) where (2.59)
α ˜ =
λθ , 1 − θ + λθ
β˜ =
1−θ (1 − λθ + θ)(1 − θ + λθ)
(with θ = V /C). For both v1∗ = α ˜ > 0 and v2∗ = β˜ > 0, we require θ < 1. Then ∗ ∗ ˜ 1 − v1 − v2 = (1 − λ)θ β > 0, confirming that v ∗ is an interior point of Δ. It is also readily checked that v ∗ satisfies (2.17a) by virtue of satisfying f (v ∗ , v ∗ ) = f (u, v ∗ ) for all u ∈ Δ. Thus v ∗ is a candidate for weak ESS. Nevertheless, v ∗ is not actually an ESS, even a weak one, because it fails to satisfy (2.17c) for all u = v ∗ . By straightforward algebra, (2.60)
f (v ∗ , u) − f (u, u) = − 12 (v1∗ − u1 )(v1∗ − u1 + 2{v2∗ − u2 })C,
which is easily made negative for u = v ∗ . For example, if u2 = v2∗ but u1 = v1∗ , then u = (u1 , v2∗ ) = v ∗ , yet (2.60) implies f (v ∗ , u) − f (u, u) = − 12 (u1 − v1∗ )2 C < 0. So any ESS must lie on the boundary of Δ. In the particular case of Hawk-Dove II, (2.57) reveals that v ∗ = (0, 0) satisfies (2.51) with strict inequality, which merely confirms what we already know: Retaliator is a strong ESS. We also know that neither v ∗ = (1, 0) nor v ∗ = (0, 1) is an ESS: neither Hawk nor Dove is an ESS, because neither is an ESS of the discrete game with matrix (2.39). Moreover, for 0 < ω < 1, v ∗ = (ω, 0) fails to ∂f |u=v=v∗ = 12 (ω − 1) = 0 by (2.57a), and v ∗ = (0, ω) fails satisfy (2.50a) because ∂u 1 ∂f to satisfy (2.50b) because ∂u |u=v=v∗ = − 12 λθC = 0 by (2.57b). So if there is a 2 boundary ESS besides Retaliator, then it is obliged to lie on the hypotenuse of Δ, that is, to have the form v ∗ = (ω, 1 − ω), play Hawk with probability ω and Dove with probability 1 − ω. By (2.57) with v1∗ = ω, v2∗ = 1 − ω and (2.55), we require 1 1 2 (1 − ω)(1 − λ)θC = 2 {(1 − θ + λθ)ω − λθ}C or (2.61)
ω = θ
with θ < 1, so that (2.62)
v ∗ = (θ, 1 − θ)
80
2. Population Games
is the only candidate for a boundary ESS. From (2.40) we obtain (2.63)
f (v ∗ , v ∗ ) − f (u, v ∗ ) =
1 2 (1
− u1 − u2 )(1 − λ)(1 − θ)ωC,
which is positive—ensuring that (2.17b) holds—for all u ∈ Δ except for u that lie on the hypotenuse where u1 + u2 = 1. Only for these u do we have to check that (2.17c) holds. Again from (2.40), we do indeed find that (2.64) f (v ∗ , u) − f (u, u) = f (θ, 1 − θ, u1 , 1 − u1 ) − f (u1 , 1 − u1 , u1 , 1 − u1 ) =
1 2 (u1
− θ)2 C
is positive for all u1 = θ, confirming that a mixed strategy of playing Hawk with probability θ, Dove with probability 1 − θ, and Retaliator with probability 0 is a weak ESS of Hawk-Dove II. Another possible shape for the boundary of the strategy set is that of a square, as in Owners and Intruders, where the strategy set is the square defined by (2.42) with corners at (0, 0), (1, 0), (0, 1), and (1, 1). Then (2.50)–(2.53) are still the necessary conditions for interior ESSs and boundary ESSs of the form v ∗ = (ω, 0) or v ∗ = (0, ω) when 0 ≤ ω < 1, but they require modification for other types of boundary ESS. In particular, for 0 < ω < 1 the weak inequalities in (2.52) and (2.53) must be reversed for boundary ESSs of the form (ω, 1) and (1, ω), respectively, but the equalities still hold. As before, whether there exists a corner ESS is best determined by evaluating f (v ∗ , v ∗ ) − f (v ∗ , v ∗ ) directly. Because ∂f ∂f ∗ 1 (2.65) = {θ − v }C, = 14 {θ − v1∗ }C 2 4 ∗ ∗ ∂u1
∂u2
u=v=v
u=v=v
by (2.44b), in the particular case of Owners and Intruders the only candidate for interior ESS is (2.66)
v ∗ = (θ, θ)
with θ < 1. Although here f (u, v ∗ ) = f (v ∗ , v ∗ ) for all u ∈ Δ, so that (2.17a) is satisfied, (2.17c) fails to hold because (2.67)
f (v ∗ , u) − f (u, u) =
1 2 (u1
− θ)(u2 − θ)C
is negative for some u = v ∗ , specifically, for 0 ≤ u1 < θ < u2 ≤ 1 or 0 ≤ u2 < θ < u1 ≤ 1. So again there is no interior ESS. Although (2.65) implies that (2.51) cannot be satisfied with v ∗ = (0, 0), we learn nothing new, since we already know that DD is not an ESS from (2.41) in §2.4, where the corner ESSs have already been determined. From (2.65), there cannot be a boundary ESS of the form v ∗ = (ω, 0) or v ∗ = (0, ω) with 0 < ω < 1 because neither (2.52) nor (2.53) then holds. Reversing the weak inequalities in (2.52) and (2.53), we likewise find that there cannot be a boundary ESS of the form v ∗ = (ω, 1) or v ∗ = (1, ω) with 0 < ω < 1 because that would require both θ − 1 = 0 and θ − ω ≥ 0, whereas θ = 1 by assumption.18 We conclude that the 18 See p. 74. Even if we allowed θ = 1, however, there would still be no such ESS. With θ = 1, v ∗ = (ω, 1) implies f (v ∗ , v ∗ ) − f (u, v ∗ ) = 14 (1 − u2 )(1 − ω)C, which is never negative, but fails to be positive for some u = v ∗ , so that v ∗ is not a strong ESS. Specifically, f (v∗ , v ∗ ) − f (u, v ∗ ) fails to be positive for u = (u1 , 1) with u1 = ω. For all such u, we require f (v ∗ , u) > f (u, u) for a weak ESS, but instead find f (v ∗ , u) − f (u, u) = 0. Likewise for v ∗ = (1, ω), in which case, f (v ∗ , v ∗ ) − f (u, v ∗ ) = ∗ 1 4 (1 − u1 )(1 − ω)C for all u and f (v , u) − f (u, u) = 0 for u = (1, u2 ).
2.5. More on evolutionarily stable strategies
81
only ESSs of Owners and Intruders are H for θ > 1 and B or X for θ < 1. In particular, when V < C, B and X are the only ESSs. Bourgeois and anti-Bourgeois are both examples of a so-called conventional strategy, where a convention is defined as a rule based on arbitrary cues that allows quick resolution of potentially protracted disputes. In this case, if the entire population plays B or X, then no one will ever fight. Both strategies prescribe escalation in a favored role and nonescalation in a disfavored role. The cue for escalation is arbitrary in the sense that it does not matter whether ownership or nonownership is favored (although it does matter that the cue is the same for all individuals); either B or X will suffice for keeping the peace (as long as every individual adopts the same conventional strategy). Moreover, in terms of the model, there is no reason why either strategy is more likely to arise than the other. Yet B respects ownership, whereas X does not; and the convention of respecting ownership seems to be extremely common in nature, whereas its antithesis—the convention of ceding ownership to any challenger—seems to be extremely rare [161]. Why? Why can our model not predict that X hardly ever happens? We return to this question in §7.3. The conditions we have developed in this section are all first-order necessary conditions for evolutionary stability. With appropriate modification, they extend in a natural way to games in which strategies are m-dimensional vectors, so that v = (v1 , v2 , . . . , vm ) represents a population strategy, u = (u1 , u2 , . . . , um ) represents ∗ a potential mutant strategy, and v ∗ = (v1∗ , v2∗ , . . . , vm ) represents a candidate for ESS. In particular, (2.50) generalizes to the necessary condition ∂f (2.68) = 0, i = 1, . . . , m, ∗ ∂ui
u=v=v
∗
for any v interior to the strategy set to be a candidate for ESS.19 Here three remarks are in order. First, not all necessary conditions for an ESS are first-order. In particular, because we must exclude the possibilility that (2.68) identifies an interior minimum of f (u, v ∗ ), we must also require the second-order necessary condition ∂2f (2.69) ≤ 0, i = 1, . . . , m, 2 ∗ ∂ui
u=v=v
to be satisfied. Second, a candidate for ESS is not the same as an ESS. Even if m = 1 and the inequality in (2.69) is strengthened from nonpositive to negative, it guarantees only that v ∗ is a (strong) “local ESS”, that is, (2.17) is satisfied for all u = v ∗ in the vicinity of v ∗ , as opposed to all u = v ∗ in the strategy set.20 ∂f An example appears in §6.7 (p. 253) where (2.68) is rewritten as ∂u u=v=v∗ = 0 because a subscripted 1 serves no purpose when m = 1. In this example, in the first instance, there are three candidates for an interior ESS, because three interior 19
This condition is essentially derived in Footnote 17 on p. 77. For m > 1, if the inequality in (2.69) is strengthened from nonpositive to negative, then even a local ESS is no longer guaranteed, because v∗ could be a saddle point. There are two ways to deal with this issue. The first is to show that H, the Hessian of f , is negative-definite at u = v = v∗ , where H is the symmetric matrix with hij = ∂ 2 f /∂ui ∂uj in row i and column j for i, j = 1, . . . , m [175]. For an example with m = 3, see [231, p. 274]. The second approach is to assume that each component of strategy is independent of every other, so that a potential mutant strategy u differs from an ESS v∗ only in one component of the vector; strengthening (2.69) then does guarantee at least a local ESS. For an example, again with m = 3, see [222]. 20
82
2. Population Games
values of v ∗ satisfy (2.68). One of the three corresponds to a minimum of f (u, v ∗ ), and hence fails to satisfy (2.69). The other two satisfy (2.69) with strict inequality, and hence are both local ESSs; however, only one of them is a “global ESS”. That is, only one of them is an ESS because (2.17) implies globality. Third, and in the same regard, once v ∗ has been identified as a candidate for ESS, we must verify (2.17) to confirm that v ∗ is indeed an ESS. Various means can be used. Sometimes (2.17) can be verified directly, i.e., algebraically; examples appear in §2.2 (p. 65), the following section, §6.2, and §6.5. Often the calculus is used. In particular, for an interior ESS candidate, it suffices (but is not necessary) to show that ∂ 2 f /∂ui2 is negative everywhere (as opposed to only at u = v = v ∗ ) for i = 1, . . . , m; sometimes this condition can be verified analytically. Examples include §6.4, where it is clear from (6.47) that it holds with m = 2, and §6.7, where it holds for m = 1 (p. 252). If all else fails, sufficiency can be established by maximizing f (u, v ∗ ) computationally.21 In particular, for m = 1, we would in effect plot f (u, v ∗ ) against u to verify that u = v ∗ yields a global maximum, as illustrated by Figure 6.15 in §6.7.22 The methods of this section enable us to determine ESSs without first knowing the optimal reaction set. When R can readily be calculated, however, it may still yield the most insightful way to show that a population strategy is the best reply to itself; and so we will continue to make use of R, especially in Chapter 6.
2.6. A continuous Hawk-Dove game The Hawk-Dove game we introduced in §2.4 ignores any difference in fighting ability (or strength, for short) between two contestants. It assumes that if P1 and P2 are both aggressive, then they are equally likely to win the ensuing fight, even if one is far stronger than the other. However, an equally plausible alternative assumption, which we adopt in this section, is that the stronger animal is certain to win. But then why would two animals ever fight? An obvious answer—assumed in this section—is that each animal knows only its own strength, not that of the other. Of course, it may be that the stronger animal neither has an equal chance of victory nor is certain to win but that its probability of victory increases with its advantage in strength over the other contestant; however, it suits our purpose better here to construct a simpler model. Our goal is to develop the simplest extension of HawkDove I that allows for continuous variation of strength. We leave determination of R to Exercise 2.9, and instead avail ourselves of the new necessary conditions we developed in §2.5. Accordingly, let P1 and P2 have strengths X and Y , respectively, where P1 knows X but not Y , and P2 knows Y but not X. We assume that X and Y are continuous random variables whose values are both drawn from a given distribution over the interval [0, 1], so that the strengths of the weakest and strongest possible animal are 0 and 1, respectively. Let the distribution have (continuous) probability density function g and cumulative distribution function G. Then Prob(X ≤ x) = G(x) and Prob(Y ≤ y) = G(y), where " s g(ξ) dξ (2.70) G(s) = 0 21 22
For example, by using the Mathematica command NMaximize. For a further illustration of this point, see [222, Appendix B].
2.6. A continuous Hawk-Dove game
83
Table 2.4. Outcomes for the continuous Hawk-Dove game. If the resource is not divisible, then we reinterpret “P1 and P2 share” to mean that the resource is randomly allocated (p. 72), each contestant winning with probability 12 .
X ≤ u, Y > v X ≤ u, Y ≤ v X > u, Y ≤ v X > u, Y > v, X > Y X > u, Y > v, Y > X
I DH II DD III HD IV HH V HH
P2 wins without fight P1 and P2 share P1 wins without fight P1 wins after fight P2 wins after fight
with G(0) = 0 and G(1) = 1; correspondingly, G (s) = g(s), where the prime denotes differentiation.23 Of necessity, g(s) ≥ 0 for all s ∈ [0, 1]. For the sake of simplicity, however, we also assume that g(s) > 0 for all s ∈ (0, 1), so that G is strictly increasing.24 The greater a contestant’s strength, the greater the chance that it is stronger than its opponent, hence the greater its chance of winning a fight if there is one. Correspondingly, strategies are thresholds for aggression, above which an individual behaves like a Hawk, below which it behaves like a Dove. Specifically, P1 ’s strategy is to be aggressive if X > u but nonaggressive if X ≤ u; whereas P2 ’s strategy is to be aggressive if Y > v but nonaggressive if Y ≤ v. Because P1 wins if X > Y but P2 wins if Y > X, there are five possible outcomes, as indicated in Table 2.4 and Figure 2.5. So the payoff to the focal u-strategist P1 is25 ⎧ ⎪ 0 if (X, Y ) ∈ I, ⎪ ⎪ ⎪ 1 ⎪ ⎪ V if (X, Y ) ∈ II, ⎨2 (2.71) F (X, Y ) = V if (X, Y ) ∈ III, ⎪ ⎪ ⎪ if (X, Y ) ∈ IV, V − Ce ⎪ ⎪ ⎪ ⎩−C − C if (X, Y ) ∈ V, i e where the corresponding payoff to nonfocal v-strategist P2 is readily found by symmetry, that is, by interchanging the payoffs for regions I and IV with those for regions III and V, respectively. P1 ’s reward is the expected value of its payoff, namely, "" "" "1 "1 1 F (x, y)dA = 0 dA + f (u, v) = E[F ] = 2 V dA 0
"" (2.72)
0
V dA +
+ " "III
1 2V
= II
I
""
{V − Ce } dA −
"IV " dA + III∪IV
""
"" V dA − IV∪V
II
{Ci + Ce } dA V
""
Ce dA −
Ci dA, V
23 With the exception of §2.10, throughout the book we adopt the convention that a prime denotes differentiation with respect to argument for a univariate function. 24 Later we will specialize to the uniform distribution defined by (2.23), which clearly satisfies these constraints, but for now we keep things more general. 25 Note that our notation for F suppresses its dependence on u and v (through the definitions of regions I–V). Strictly speaking, it would be more correct to use F (X, Y, u, v) in place of F (X, Y ), but it would also be unnecessarily cumbersome, and in mathematics we value elegance and unclutteredness.
84
2. Population Games
Figure 2.5. The sample space of pairs of strengths
where the notational shorthand dA is defined by (2.22). Because g is the derivative of G and G(0) = 0 with G(1) = 1, (2.72) simplifies to "u (2.73a)
f (u, v) =
1 2V
"v g(x) dx
g(y) dy + V
0
− Ce
g(x)
"1
g(y) dy dx 0
"1 g(y) dy dx − Ci
g(x) u
"x
u
0
"1
=
"1
v
"1 g(x)
u
+ Ci ) 1 − G(u)2 + 12 V G(v) + Ci G(u) − Ci
g(y) dy dx x
1 2 (V
− Ce {1 − G(u)}{1 − G(v)} for u > v. But if u < v, then we instead obtain "u (2.73b)
f (u, v) =
1 2V
"v g(x) dx
0
g(y) dy 0
"1 + V
"v g(x)
u
=
1 2V
−
g(y) dy dx +
g(x)
"1
u
v
g(y) dy dx v
"1 g(y) dy dx − Ci
g(x)
2
"x
v
0
"1 − Ce
"1
"y g(y)
v
g(x) dx dy u
1 − G(u)G(v) + G(v)
1 2 Ci {1
− G(v)}{1 − 2G(u) + G(v)} − Ce {1 − G(u)}{1 − G(v)}.
Either (2.73a) or (2.73b) is valid if u = v: the reward function is continuous, because the right-handed limit of the first expression as u → v equals the left-handed limit
2.6. A continuous Hawk-Dove game
85
of the second. We now obtain ⎧ ⎪ ⎨ Ci {1 − G(u)} + Ce {1 − G(v)} ∂f = + 12 V {G(v) − 2G(u)} g(u) ⎪ ∂u ⎩ (Ci + Ce ){1 − G(v)} − 12 V G(v) g(u)
if u > v, if u < v,
because G (u) = g(u). Again, the right-handed limit of the first expression as u → v equals the left-handed limit of the second: the reward function is continuously differentiable, with ∂f (2.74) = Cl {1 − G(v)} − 12 V G(v) g(v), ∂u
u=v
where (2.75)
Cl = Ci + Ce
is the total cost of losing. It will now be convenient to define (2.76)
α =
2Cl 2Cl + V
(so that 0 < α < 1). Because G(0) = 0 and G(1) = 1, (2.74) implies ∂f ∂f = Cl g(0), = − 12 V g(1), (2.77) ∂u
and (2.78)
∂u
u=v=0
u=v=1
∂f = 12 V + Cl {α − G(v ∗ )}g(v ∗ ). ∂u u=v=v∗
Because G is strictly increasing, the equation G(v ∗ ) = α has the unique solution v ∗ = G−1 (α),
(2.79)
where G−1 denotes the inverse of G. Thus, by (2.78) and (2.47), the only candidate for an interior ESS is (2.79). In particular, for a uniform distribution of strength (that is, for g(ξ) = 1 as in (2.23) and hence G(v ∗ ) = v ∗ by (2.70)), the only candidate for an interior ESS is v ∗ = α. Moreover, because g(0) = 1 = g(1) by (2.23), it follows from (2.77) that neither (2.48) nor (2.49) can hold, and so neither v ∗ = 0 nor v ∗ = 1 is a boundary ESS. Thus v ∗ = α is the only candidate for ESS. To see whether it is actually an ESS, we note that ⎧1 2 ⎪ 2 V {1 − u + uv} ⎪ ⎪ ⎨ − 1 (1 − u){C (1 − u) + 2C (1 − v)} if u ≥ v, i e 2 (2.80) f (u, v) = 1 2 ⎪ V {1 − uv + v } ⎪ ⎪2 ⎩ − 12 (1 − v){Ci (1 − 2u + v) + 2Ce (1 − u)} if u ≤ v, by (2.73). Thus, from either of these two expressions, (2.81)
f (α, α) =
1 2V
− 12 (2Ce + Ci )(1 − α)2 =
1 2V
− 12 C(1 − α)2 ,
where C is defined by (2.34). Setting v = α in the first of (2.80), for u > α we have (2.82)
f (α, α) − f (u, α) = =
1 2 (u 1 2 (V
− α){(V + Ci )u − 2Cl + Cα} + Ci )(u − α)2 > 0
86
2. Population Games
α α α
α
0 for all mutant strategies sj if (2.14) holds for all u = v ∗ (and hence in particular for all u = sj = v ∗ , for j = 1, . . . , m). So, for a strong ESS, it makes no difference whether there are finitely many deviant strategies or only one. Nevertheless, it can make a difference if the orthodox strategy v ∗ is a weak ESS. To see why, let us agree to use the simplified notation W (u) for the reward to using strategy u in a population containing mutant strategies s1 , . . . , sm at (small) 27
In terms of §6.2 (p. 221), the animals use self-assessment.
88
2. Population Games
frequencies x1 , . . . , xm and v ∗ at frequency x. Thus W (u) = f (u, ξsT + (1 − x)v ∗ ) or m # xk f (u, sk ) + (1 − x)f (u, v ∗ ) (2.89) W (u) = k=1
by (2.86). Moreover, let us assume that every mutant strategy is an alternative best reply to v ∗ (which does not hold for every weak ESS as we saw in §2.6), so that (2.18b) and (2.18c) are known to be satisfied with u = sj for j = 1, . . . , m. With u = v ∗ and then u = sj , it follows from (2.89) that the difference in fitness between an ESS-strategist and a mutant sj -strategist is (2.90)
W (v ∗ ) − W (sj ) =
m #
xk {f (v ∗ , sk ) − f (sj , sk )}
k=1
for any j = 1, . . . , m, because f (sj , v ∗ ) = f (v ∗ , v ∗ ) by (2.18b). If m = 1, then the above difference in fitness must be positive by (2.18c), because m = 1 implies j = k. If m > 1, however, then the difference may be negative. Suppose, for example, that our game is Crossroads, and that neither driver is slow. Then f (u, v) = (δ + ) {(θ − v)u − θ(1 + v)} + 2v by (2.1a), and we know from §2.1 that v ∗ = θ (< 1) is a weak ESS. Let m = 2, and let the two mutant strategies be s1 = 0 (the pure strategy Wait) and s2 = 1 (the pure strategy Go). Then because f (0, θ) = f (1, θ) = f (θ, θ) = θ{(1 − θ) − (1 + θ)δ}, each mutant strategy is an alternative best reply to θ and (2.91a)
W (θ) − W (0) = θ(δ + ) (θx1 − {1 − θ}x2 ) ,
(2.91b)
W (θ) − W (1) = (1 − θ)(δ + ) (−θx1 + {1 − θ}x2 )
by (2.90), so that (2.92)
W (θ) = (1 − θ)W (0) + θW (1).
Thus W (θ) must lie between W (0) and W (1), and so cannot exceed both of them. Yet (2.18) is clearly satisfied by v ∗ = θ for any 0 ≤ u ≤ 1. In sum, strong ESSs are resistant to multiple deviation (provided of course that the total probability of deviation, namely x, is still small), whereas weak ESSs are not. We have just established that the strategy v ∗ = θ, despite being evolutionarily stable in the sense of Maynard Smith (satisfying the ESS conditions on p. 64) is not uninvadable for fast driving in Crossroads if the deviant strategies s1 = 0 and s2 = 1 occur simultaneously. What happens in these circumstances depends explicitly upon the dynamics of interaction among the population of drivers. Here we present a simple model of those dynamics.28 Let the proportions of the population of drivers playing deviant strategies s1 = 0 and s2 = 1 at time t (≥ 0) be x1 (t) and x2 (t), respectively, where x1 and x2 are differentiable functions; we no longer assume that these proportions are small. Then the proportion of drivers playing strategy θ is 1 − x(t), where x(t) = x1 (t) + x2 (t). So the average reward of the population at time t is (2.93) 28
W = (1 − x)W (θ) + x1 W (0) + x2 W (1),
This requires some familiarity with phase-plane analysis; see Footnote 31 on p. 93.
2.7. Multiple deviation. Population dynamics
89
x2 1 (1-θ, θ)
θ L 0
0
1-θ
1
x1
Figure 2.7. Triangle Δ that contains (x1 (t), x2 (t)) for all t ≥ 0.
where W (0), W (θ), and W (1) are defined by (2.89), and we avoid needless clutter by using notation that suppresses their dependence on t. For all t ≥ 0, the point (x1 (t), x2 (t)) must belong to the triangle (2.94) Δ = (x1 , x2 ) | x1 , x2 ≥ 0, x1 + x2 ≤ 1 ; see Figure 2.7. We assume that x1 (0) and x2 (0) are both positive; otherwise, there is no danger of strategy θ being invaded. It seems reasonable to assume that the fraction of drivers using a deviant strategy would increase if the reward from that strategy were greater than average. Thus dx1 /dt > 0 if W (0) > W , and dx2 /dt > 0 if W (1) > W . Dynamics consistent with these assumptions are defined by the pair of differential equations 1 dx2 1 dx1 = κ{W (0) − W }, = κ{W (1) − W }, (2.95) x1 dt x2 dt where κ (> 0) is a constant of proportionality. From (2.92) and (2.93) we obtain (2.96a)
W (0) − W = {x2 + θ(1 − x)}{W (0) − W (1)}, W (1) − W = {1 − x2 − θ(1 − x)}{W (1) − W (0)},
whereas from (2.91) we obtain (2.96b)
W (0) − W (1) = (δ + ){(1 − θ)x2 − θx1 }.
On substituting (2.96) into (2.95), we obtain a pair of nonlinear ordinary differential equations for the motion of the point (x1 (t), x2 (t)) on the triangle Δ: dx1 = (δ + )κx1 {θ(1 − x1 ) + (1 − θ)x2 }{(1 − θ)x2 − θx1 }, dt (2.97) dx2 = (δ + )κx2 {θx1 + (1 − θ)(1 − x2 )}{θx1 − (1 − θ)x2 }. dt From these equations we can deduce the values of x1 (∞) and x2 (∞). Then the long-term fraction of drivers adopting strategy θ is 1 − x1 (∞) − x2 (∞). We find from inspection of (2.97) that dx1 /dt > 0 and dx2 /dt < 0 above the line segment in Figure 2.7 defined by (2.98)
L = {(x1 , x2 ) ∈ Δ | θx1 − (1 − θ)x2 = 0} ,
whereas below L, dx1 /dt < 0 and dx2 /dt > 0. Therefore, any trajectory—or solution curve—that begins in the interior of Δ must end on L as t → ∞, and every point on L is said to be a metastable equilibrium of the dynamical system described
90
2. Population Games
by (2.97). To check that the trajectory cannot leave Δ across the boundary where x1 + x2 = 1, we need only observe that along that boundary we have x2 {θx1 + (1 − θ)(1 − x2 )} dx2 /dt dx2 = − = −1. = dx1 dx1 /dt x1 {θ(1 − x1 ) + (1 − θ)x2 } Thus trajectories that begin where x1 + x2 = 1 must end precisely at (1 − θ, θ) as t → ∞, and trajectories that begin in the interior must remain in the interior because trajectories cannot cross. The long-term fraction of drivers using strategy θ, namely x(∞), is indeterminate; it depends upon x1 (0) and x2 (0). Nevertheless, drivers not using θ must use strategies 0 and 1 in the ratio x1 (∞)/x2 (∞) = 1−θ θ . Without the ability to recognize individual drivers, the subpopulation in which fraction θ always plays pure strategy G and fraction 1 − θ always plays pure strategy W is indistinguishable from the subpopulation in which every driver plays mixed strategy θ (G with probability θ, W with probability 1 − θ). Thus, although in theory strategy θ is susceptible to simultaneous invasions by deviant strategies 0 and 1, whether in practice θ is invadable is, at the very least, a moot point. Note, however, that this conclusion is strongly dependent upon equations (2.97), which yield no more than a phenomenological description of the dynamics of interaction.
2.8. Discrete population games. Multiple ESSs The population games we analyzed in §§2.1–2.3 have unique evolutionarily stable strategies. But a game may have more than one ESS, as illustrated by Hawk-Dove II (§2.5). Then which ESS will the population adopt? To answer this question, we again require explicit dynamics. Having defined discrete population games towards the end of §2.2 (p. 66), let us now set m = 2 and consider the special case in which payoffs satisfy (2.99)
a22 > a12 > a11 > a21 .
An example of this game appears, e.g., in §5.2.29 Because k = 1 and k = 2 both satisfy (2.21b), both strategies are strong ESSs. Then which will emerge as the winning strategy in a large population of players, some of whom adopt strategy 1, the remainder of whom adopt strategy 2? The answer depends on initial conditions. By analogy with the previous section, let us suppose that the specific growth rate of the fraction of population adopting strategy k is proportional to the difference between the average payoff to strategy k, denoted by Wk , and the average reward to the entire population, denoted by W . Let x1 = x1 (t) and x2 = x2 (t) be the proportions adopting strategies 1 and 2, respectively, at time t. If the integervalued random variable J(t) denotes the strategy adopted by a player’s opponent at time t, and if the population interacts at random, then with negligible error we have (2.100)
Prob(J = j) = xj ,
j = 1, 2,
29 For this particular game, the two pure strategies are the only evolutionarily stable strategies— pure or mixed (Exercise 2.12). Thus, with regard to population dynamics, no generality whatsoever is lost by restricting this game to pure strategies.
2.8. Discrete population games. Multiple ESSs
91
whence the expected value of the payoff to strategy k is (2.101)
Wk =
2 #
akj · Prob(J = j) = ak1 x1 + ak2 x2 ,
j=1
for k = 1, 2. Similarly, the average reward to the population is (2.102)
W = x1 W1 + x2 W2 = a11 x21 + (a12 + a21 )x1 x2 + a22 x22 ,
and so 1 dx1 1 dx2 = κ{W1 − W }, = κ{W2 − W }, x1 dt x2 dt where κ is the constant of proportionality. On using x1 + x2 = 1, either of these equations reduces to (2.103)
(2.104)
dx2 = κx2 (1 − x2 )(W2 − W1 ) dt = κx2 (1 − x2 )(a22 − a12 + a11 − a21 )(x2 − γ),
where (2.105)
γ =
a11 − a21 ; a22 − a12 + a11 − a21
see Exercise 2.13. Thus dx2 /dt is positive or negative according to whether x2 > γ or x2 < γ. If x2 (0) < γ, then (2.104) implies that x2 (t) → 0 as t → ∞, so that strategy 1 wins over the population; whereas, if x2 (0) > γ, then (2.104) implies that x2 (t) → 1 as t → ∞, so that strategy 2 wins instead. More generally, in a game restricted to m pure strategies, let xk = xk (t) be the proportion adopting strategy k at time t, let (2.106)
Wk =
m #
akj xj
j=1
be the reward to strategy k, and let (2.107)
W =
m #
xk W k
k=1
be the average reward to the entire population. Then the long-term dynamics can be described by the differential equations (2.108)
1 dxk = κ{Wk − W }, xk dt
k = 1, . . . , m,
which were introduced by Taylor and Jonker [334] and are now generally known as the replicator equations [145, p. 67]. Note that because (2.109)
x1 (0) + x2 (0) + · · · + xm (0) = 1,
this equation must hold at all later times as well, i.e., (2.110)
x1 (t) + x2 (t) + · · · + xm (t) = 1,
0≤t W (n) or Wk (n) < W (n), in such a way that the proportions playing strategy k at iterations n + 1 and n are in the ratio Wk (n)/W (n). So (2.115a)
xk (n + 1) =
xk (n)Wk (n) , W (n)
1 ≤ k ≤ m,
0 ≤ n < ∞,
or, equivalently (Exercise 2.14), (2.115b)
xk (n + 1) − xk (n) {Wk (n) − Wj (n)}xj (n) . = xk (n) W (n)
We will use (2.115) to describe dynamics in Chapter 5. In the context of discrete population games, we will find it convenient to say that strategy i infiltrates strategy k if xi (0) is a very small positive number but xk (0) is close to 1. Up to m − 1 strategies can infiltrate strategy k, because the sum of m − 1 very small positive numbers is still a very small positive number; we will distinguish two cases by saying that strategy k is subject to pure infiltration by strategy i if strategy i is the only infiltrator (i.e., if xj (0) = 0 for j = k and
2.8. Discrete population games. Multiple ESSs
95
j = i), but that strategy k is subject to mixed infiltration if there is more than one infiltrator. Now, regardless of whether we use (2.108) or (2.115) to describe the subsequent dynamics, the vector x(∞) = (x1 (∞), x2 (∞), . . . , xm (∞)) yields the ultimate composition of the population; in other words, the population evolves to x(∞). If xi (∞) > xi (0), so that xk (∞) < 1, then we shall say that strategy i invades strategy k; if also xi (∞) = 1 (hence xk (∞) = 0), then we shall say that strategy i eliminates strategy k. (If, on the other hand, xk (∞) = 1, then we shall say that strategy k eliminates the infiltrators.) But note that strategy i can invade strategy k without eliminating it. We will find this terminology especially useful in Chapter 5. Now, by analogy with §2.1, if strategy k is an ESS, then it is stable against pure infiltration; i.e., if strategy i is the only infiltrator, then strategy k will always eliminate it (see Exercise 2.16). Furthermore, if strategy k is a strong ESS, then it is stable against mixed infiltration; i.e., strategy k will eliminate every infiltrator (again, see Exercise 2.16). But if strategy k is not a strong ESS, or if strategy k is only a Nash-equilibrium strategy, then—as we shall illustrate in §5.3—the outcome of mixed infiltration will depend on specific details of the dynamics, whether (2.108) or (2.115). An important possibility is that x(∞) = ξ, where at least two components of the vector ξ are nonzero. In that case, if x(0) ≈ ξ implies x(∞) = ξ, then we shall refer to ξ as an evolutionarily stable state, comprising a mixture of pure strategies. If, however, x(0) ≈ ξ implies only that x(∞) is close to ξ, then ξ is merely metastable; and, as §5.3 will illustrate, one or more of the strategies in such a metastable mixture may ultimately be eliminated through persistent infiltration by other strategies. Because evolutionarily stable strategy and evolutionarily stable state are rather cumbersome phrases, there is a natural tendency to abbreviate them, and it is rather unfortunate that they both have the same initials. A frequent workaround is to use “monomorphic ESS” or “monomorphism” for evolutionarily stable strategy, and “polymorphic ESS” or “polymorphism” for evolutionarily stable state; thus ESS has two possible meanings, but the morphic qualifier eliminates any ambiguity. Specifically, in a population at a monomorphic ESS, v ∗ , every individual is playing the same strategy, which may (as with v ∗ = (θ, 1 − θ) in Hawk-Dove II, p. 79) or may not (as with v ∗ = (0, 0) or Retaliator in Hawk-Dove II, p. 73) be mixed; whereas, in a population at a polymorphic ESS, ξ, every individual is playing a pure strategy, but different pure strategies are used, in proportions determined by ξ.32 Whenever we use ESS all by itself, however, we shall always mean evolutionarily stable strategy. Before proceeding to the next section, we digress to note that equations (2.115a) are a special case of first-order, nonlinear recurrence equations of the form (2.116a)
xj (n + 1) = Gj (x1 (n), . . . , xm (n)),
j = 1, . . . , m,
where G1 , . . . , Gm are functions of x1 , . . . , xm ; or, more succinctly, (2.116b)
x(n + 1) = G(x(n)),
32 Note that polymorphism, as we have defined it, is a discrete-game concept, and discrete population games expressly exclude mixed strategies (p. 65). So a mixed-strategy ESS is never a polymorphism; on the contrary, it is always a monomorphism.
96
2. Population Games
where the vector x and the vector-valued function G are defined by (2.117)
x(n) = (x1 (n), . . . , xm (n)),
G = (G1 , . . . , Gm ).
In the particular examples that appear in this book, the dynamics of (2.116) will turn out to be rather simple: as n increases, the vector x(n) will progress towards an equilibrium vector x(∞) = x∗ satisfying x∗ = G(x∗ ). Moreover, for given x(0), it will be easy to generate the sequence x(1), x(2), x(3), . . . by computer, recursively from (2.116). Nevertheless, one should still be aware that the dynamics of equations of type (2.116) are potentially very complicated, and a considerable variety of periodic and even chaotic behavior is possible; see, e.g., Chapter 15 of [142].
2.9. Continuously stable strategies We already know from §2.8 that, whenever a discrete population game has more than one ESS, initial conditions determine which ESS the population adopts. In continuous population games, however, another issue arises. To see why, let us suppose that drivers in Crossroads II (§2.3) are not slow, and that a population of drivers is at the strong ESS defined by (2.31), i.e., the universally adopted critical lateness above which a driver initially goes is (2.118)
v∗ = 1 −
δ + 12 (η − 1)τ θ = 1+λ δ + 12 ητ +
with λ < θ < 1, where δ, (< δ), η (< 1), and τ (< 2δ) denote impetuosity, ditheriness, discount factor, and junction transit time, respectively. Let us further suppose that there is a small but sudden shift in the value of any of these four parameters. For example, perhaps η rises slightly because everyone listens on their car radio to advice from a lifestyle guru. Then the population is no longer at its ESS; instead it remains at v defined by using the old parameter values in (2.118), whereas the ESS v ∗ requires the new values. What happens now? After this small perturbation, will the population of drivers converge on the new ESS? We first obtain a general answer for a two-player continuous population game, and then we apply it to Crossroads II. Because (v ∗ , v ∗ ) ∈ R, f (v ∗ , v ∗ ) = max f (u, v ∗ ) and hence ∂f (2.119) = 0, ∗ ∂u
u=v=v
as long as f is sufficiently differentiable—which we assume, and which f defined by (2.27) clearly is.33 We regard the current population strategy v as a small perturbation to v ∗ , and any fresh mutant u as an even smaller perturbation to v. We can therefore write (2.120)
u = v + h,
v = v ∗ + k,
33 By contrast, in the cake-cutting game (Exercise 2.22), f is not differentiable, and so the concept of continuous stability could not apply. However, it would be irrelevant in any case, because the cakecutting game has no parameters.
2.9. Continuously stable strategies
97
where h and k are both infinitesimally small, but with h much smaller than k. Let us recall Taylor’s theorem, which states that ∂g ∂g (2.121) g(a + ξ1 , b + ξ2 ) = g(a, b) + ξ1 u=a + ξ2 u=a + O(ξ 2 ), ∂u
v=b
∂v
v=b
where ξ = max(|ξ1 |, |ξ2 |) and g is any sufficiently differentiable function of u and v, ∂g implying in particular that ∂u is another function to which Taylor’s theorem can in turn be applied, and where O(χ) is “order notation”—specifically, “big oh”— denoting terms so small that you can divide them by χ and the result still remains ∗ bounded as χ → 0. From (2.121) with g = ∂f ∂u , a = b = v , and ξ1 = ξ2 = k, we obtain ∂f ∂f ∂2f ∂2f ∗ ∗ = + k + k u=v= u=v∗ + O(k2 ) u=v 2 u=v ∗ ∂u
v +k
∂u
v=v ∗
∂u
∂u∂v
v=v ∗
= k
v=v ∗ 2
∂ f ∂2f + + O(k2 ) ∂u2 ∂u∂v u=v=v∗
by (2.119). So, from (2.120) and (2.121) with g = f , a = b = v ∗ + k, ξ1 = h, and ξ2 = 0, we find that playing mutant strategy u = v + h against the current population strategy v = v ∗ + k yields reward f (u, v) = f (v + h, v ∗ + k) = f (v ∗ + k + h, v ∗ + k)
∂f = f (v ∗ + k, v ∗ + k) + h + O(h2 ) ∂u u=v=v∗ +k 2 ∂ f ∂2f + = f (v, v) + hk 2 ∗
(2.122)
∂u
∂u∂v
u=v=v
plus terms of the form hO(k ) + O(h ), which are negligible compared to those above because h is so small compared to k. For mutant strategies to move the population from its current strategy v = v ∗ + k to the ESS v ∗ , those with h > 0 must be favored when k < 0 (to increase a value that is too low) and those with h < 0 must be favored when k > 0 (to decrease a value that is too high); in other words, we require f (u, v) to exceed f (v, v) when hk < 0. So a non-v ∗ population converges to a v ∗ population when 2 ∂ f ∂2f (2.123) + < 0. 2 ∗ 2
2
∂u
∂u∂v
u=v=v
If (2.123) holds, then the ESS v ∗ is said to be either continuously stable [91] or convergence stable [65].34 Thus, in particular, when δ > 12 τ , Crossroads II has a continuously stable strong ESS defined by (2.118) because (2.27) implies that 2 ∂ f ∂2f (2.124) + = −(δ + )(1 + λ) < 0, 2 ∗ ∂u
∂u∂v
u=v=v
and so (2.123) is satisfied. The significance of continuous stability becomes more apparent in terms of the invasion fitness (2.125)
i(u, v) = f (u, v) − f (v, v),
34 Or even m-stable for “mutant-stable” [333]. But even Taylor [333, p. 141] was not happy with this terminology, and anyhow it didn’t catch on.
98
2. Population Games
that is, the relative fitness advantage of a mutant u-strategist in a population of v-strategists. If v is a strong ESS, then the invasion fitness of every mutant is negative by (2.14), so that no mutant can invade (which is why the dashed line in Figure 2.9 must lie in the shaded region). For other v, however, u can invade if, and only if, i(u, v) > 0. Correspondingly, the zero contour of the function i partitions the decision set into four distinct regions with different signs for i(u, v). In the case of Crossroads II, it follows from (2.27) that (2.126)
i(u, v) =
1 2 (δ
+ )(u − v){2(1 − θ + λ) − λu − (2 + λ)v},
and so the zero contour consists of the line u = v from (0, 0) to (1, 1) and the line λu+(2+λ)v = 2(1−θ +λ) from (0, ω2 ) to (1, ω3 ) in Figure 2.9. If v < v ∗ (or k < 0), then i(u, v) > 0 only for u > v (or h > 0), represented by moving from (v, v) along the dotted line in Figure 2.9(b) to (u, v); because the mutant strategy has higher fitness in the unshaded region, its frequency will keep increasing until it reaches fixation, represented by moving from (u, v) to (u, u) = (v + h, v + h) along the dashed line. Because h is so small, however, the population has effectively moved upward along the line u = v from (v, v) to (u, u), as indicated by the arrow: u has become the new v. Any further mutation repeats this process, until eventually (v, v) converges to (v ∗ , v ∗ ), as indicated in Figure 2.9(a). If instead v > v ∗ (or k > 0), then i(u, v) > 0 only for u < v (or h < 0) but the process is the same, except that convergence is downward, as shown in Figures 2.9(c) and 2.9(a). The more fitness changes with u, the more rapidly v approaches the ESS. So it is reasonable to assume that dv dt is proportional to ∂i/∂u|u=v (or ∂f /∂u|u=v , which is the same thing), where t denotes time and the constant of proportionality is set to 1 without loss of generality.35 Thus, after perturbation from the old ESS, v evolves to v ∗ according to dv ∂i (2.127) = = (δ + )(1 + λ)(v ∗ − v), dt
∂u
u=v
dv ∗ ∗ by (2.126) and (2.118), with dv dt > 0 if v < v and dt < 0 if v > v . Note that R lies entirely within the unshaded region of Figure 2.9: v can be invaded by the optimal reply to v unless v is an ESS.
The derivation of (2.123) assumed that v ∗ ∈ (0, 1) or that v ∗ is an interior ESS—which holds for Crossroad II if δ > 12 τ , as we assumed. Indeed drivers need not be quite that fast: v ∗ continues to be an interior ESS as long as δ > 12 (1 − η)τ . But the result fails to hold when δ < 12 (1 − η)τ , because then θ > 1 + λ and ∂f /∂u is invariably negative, by (2.27). So the unique best reply to any v is u = 0, making v ∗ = 0 a strong ESS—but a boundary ESS, as opposed to an interior one. What happens now to the concept of continuous stability? For Crossroads II, the only possible boundary ESS is v ∗ = 0; however, to make our analysis apply more broadly to any population game with strategy set [0, 1], we denote a boundary ESS by b (where b is either 0 or 1). Again we regard the current population strategy v as a small perturbation to the ESS b, and any fresh mutant u as an even smaller perturbation to v. We can therefore write (2.128) 35
See Footnote 30 on p. 93.
u = v + h,
v = b + k,
2.9. Continuously stable strategies
99
Figure 2.9. (a) The zero contour of invasion fitness i (thin solid lines), the optimal reaction set (thick solid) and the line v = v ∗ (dashed) for Crossroads II when drivers are not slow, with v ∗ defined by (2.118); i < 0 in the shaded region, i > 0 in the unshaded region, ω1 = 1 − θ + λ, ω2 = 1 − {2θ − λ}/{2 + λ}, ω3 = 1−2θ/{2+λ}, and ω4 = 11−θ. The figure is drawn for δ = 3, = 2, τ = 2, and η = 1 (but would have the same topology for any δ > τ /2). (b) Convergence to v ∗ from v < v ∗ . (c) Convergence to v ∗ from v > v ∗ .
where h and k are both infinitesimally small, but in such a way that h is much smaller than k. From Taylor’s theorem applied to f , we find that playing mutant strategy u = v + h against the current population strategy v = b + k yields reward (2.129) f (u, v) = f (v + h, b + k) = f (b + k + h, b + k)
= f (b + k, b + k) + h ∂f ∂u u=v=b+k
+ O(h2 ).
We assume that k and h + k are positive if b = 0 but negative if b = 1, to ensure ∂f (as opposed to f ) we obtain that u ∈ [0, 1]. From Taylor’s theorem applied to ∂u 2 ∂f ∂f ∂ f ∂2f (2.130) u=v= = u=b + k 2 u=b + k u=b + O(k2 ). ∂u
b+k
∂u
v=b
∂u
v=b
∂u∂v
v=b
Now, if b = 0 is an ESS because f (u, 0) achieves a maximum at u = 0 with ∂f < 0 or if b = 1 is an ESS because f (u, 1) achieves a maximum at u = 1 ∂u u=v=0 ∂f ∂f with ∂u u=v=1 > 0, then (2.130) reduces to ∂f ∂u u=v=b+k = ∂u u=v=b + O(k). Thus (2.129) reduces to (2.131) f (u, v) = f (b + k, b + k) + h ∂f ∂u u=v=b
2
plus terms of the form hO(k)+O(h ), which are negligible compared to those above because h and k are so small. From (2.128) and (2.131) we see that f (u, v) > f (v, v) for h < 0 if b = 0 but for h > 0 if b = 1. Mutant strategies that move the population from its current strategy v towards the boundary—and hence the ESS—are favored in either case; whereas mutant strategies that would move the population away from the boundary are disfavored, and hence disappear. So a non-b population
100
2. Population Games
always converges to a b population.36 In sum, continuous stability is a nonissue for a boundary ESS but a very desirable property for an interior ESS. And when a continuous population game has more than one interior ESS, continuous stability yields a criterion for distinguishing among them.37
2.10. State-dependent dynamic games We have allowed individuals to condition their behavior on role (e.g., owner or intruder, §2.4) or on state (e.g., lateness, §2.3), but not on time. Here we relax that restriction to touch on a discrete-time approach to dynamic games developed by Houston and McNamara [148], whose notation we largely adopt. This approach in turn is based on state-dependent dynamic games against nature,38 in which a focal individual is the only decision maker, and which we consider first. We start with some general terms and notation. Accordingly, let n-dimensional vector x = (x1 , . . . , xn ) represent an animal’s state at time t; frequently n = 1, in which case, x and x1 coincide. For example, an animal’s state could be its energy reserves, as in the example that follows (p. 101). Correspondingly, let s-dimensional vector u = (u1 (x, t), . . . , us (x, t)) denote a focal individual’s strategy, which specifies the action to be taken when in state x at time t; again, often s = 1, in which case, u and u1 coincide. Thus a strategy specifies a sequence {u(x, 0), u(x, 1), u(x, 2), u(x, 3), . . .} of state-dependent actions for t = 0 onwards, and in practice no difficulty arises from using the same symbol u for both the action u(x, t) that is specified to be taken when in state x at time t and the complete tabulation of such actions, that is, the strategy itself.39 Let V (x, t) denote the fitness—expected future reproductive success—of an animal in state x at time t, predicated on the assumption that the animal behaves optimally at time t and at all later times. Finally, let H(x, t, u) denote the (expected) reward to a u-strategist in state x at time t (not assuming future optimal behavior). There are three ways in which the strategy u can influence this fitness. First, from the action u taken at time t, there may be a direct contribution to fitness through birth of offspring,40 which we denote by B(x, t, u). Second, the action taken can influence the probability of survival to time t + 1, which we denote by S(x, t, u). Third, the action taken can influence the state in which the animal finds itself at time t + 1. We denote this new state—which in general is a random variable—by X , using the prime to denote a later time.41 It now follows that (2.132) H(x, t, u) = B(x, t, u) + S(x, t, u)Eu V (X , t + 1) , where Eu denotes expected value over all possible states resulting from action u being taken at time t, conditional on survival to time t + 1. Let u∗ (x, t) denote the optimal action when in state x at time t, i.e., the action specified for state x ∂f ∂f Assuming that ∂u = 0; if ∂u = 0, then we proceed as for an interior ESS. u=v=b u=v=b For an illustration of this point, see §8.3 (p. 352). 38 See Footnote 9 on p. 5. 39 Put differently, no difficulty arises from using the same symbol for both the image of u : n+1 →
s and the function itself, where denotes the real numbers. 40 More precisely, offspring that are not still dependent on their parents at time t + 1 [148, p. 13]. 41 In §2.10 only; elsewhere in the book, a prime denotes differentiation with respect to argument. 36 37
2.10. State-dependent dynamic games
101
Table 2.5. Possible future states
Definition of state x0 = x − 1 x1 = x x2 = min(x + 1, L) x3 = min(x + 2, L)
Δx 0 1 2 3
Value V (x0 , t + 1) V (x1 , t + 1) V (x2 , t + 1) V (x3 , t + 1)
Probability wi (1 − pi ) w i p i ρ1 w i p i ρ2 w i p i ρ3
and time t by the optimal strategy u∗ . Then it follows from careful scrutiny of our definitions that (2.133)
V (x, t) = H(x, t, u∗ (x, t)) = max H(x, t; u). u
As a practical matter, we seek the optimal strategy over a finite interval [0, T ] by specifying some final reward V (x, T ) = ψ(x) as a terminal boundary condition, and we then work backwards from t = T to all earlier times. An example will clarify how this procedure works. Accordingly, consider an animal foraging for food during the course of a single day [148, p. 28]. This animal forages from t = 0 (dawn) until t = T (dusk), and its reserves are represented by an integer-valued scalar variable, x. If x reaches zero, then the animal dies of starvation; moreover, there is an upper limit L to how much energy it can store. Thus 0 ≤ x ≤ L. The animal cannot forage at night, and its fitness is zero unless it survives the night. So we may use its probability of survival as a proxy for fitness and set B = 0 in (2.132). At times t = 0, 1, . . . , T − 1, the animal must choose whether to forage—until time t + 1—in a poorer food habitat where predation is lower or a richer food habitat where predation is higher. Specifically, u(x, t) = 1 means choosing the poorer habitat in state x at time t, and u(x, t) = 2 means choosing the richer habitat in state x at time t. For i = 1, 2, if the animal chooses u = i, then it is killed by a predator with probability zi and hence survives with probability wi = 1 − zi , in which case, it finds a food item with probability pi . The energetic value of a food item, J, is a discrete random variable with mean 2, taking integer values between 1 and 3 with probability (2.134)
ρj = Prob(J = j) =
1 2
− 14 |2 − j|2
in either habitat. The animal uses one unit of energy per period to run its metabolism. Hence an animal with reserves x at time t that survives until time t + 1 can then be in one of four states, defined in Table 2.5, where Δx denotes 3foraging success for the period; note that the last column sums to wi − wi pi {1 − j=1 ρj } = wi , as opposed to 1, because future success is conditioned upon survival. Because B = 0 and S(x, t, i) = wi , it follows from (2.132) and (2.134) that (2.135) H(x, t, i) = wi (1 − pi )V (x0 , t + 1) + 14 pi {V (x1 , t + 1) + 2V (x2 , t + 1) + V (x3 , t + 1)} and from (2.133) that (2.136)
V (x, t) = max H(x, t, 1), H(x, t, 2) .
x 20 .. . 13 12 11 10 .. . 1 0
t
1. .. 1 1 2 2. .. 2 D
0
1
1 1 2 2
2 D
1
1 1 2 2
2 D
2 D
1 1 2 2
1
2 D
1 1 2 2
1
2 D
1 1 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
1. .. 1 2 2 2. .. 2 D
10
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
Table 2.6. Optimal foraging over last 30 periods
2 D
1 2 2 2
1
1. .. 1 2 2 2. .. 2 D
20
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 2 2 2
1
2 D
1 1 2 2
1
2 D
1 1 1 2
1
102 2. Population Games
2.10. State-dependent dynamic games
103
Table 2.7. Optimal fitness over last 8 periods to 4 significant figures. Column k shows V (x, T − 9 + k), 1 ≤ k ≤ 9. 1 1 1 0.9976 0.9925 0.9784 0.9542 0.9152 0.8617 0.7894 0.6974 0.5898 0.474 0.3595 0.2559 0.1699 0.1045 0.0592 0.03046 0.01315 0
1 1 1 1 0.9951 0.9862 0.9627 0.9257 0.871 0.7941 0.6952 0.5781 0.4534 0.3326 0.2263 0.142 0.08133 0.04211 0.01946 0.007548 0
1 1 1 1 1 0.9903 0.9749 0.9365 0.8823 0.8008 0.692 0.5647 0.4288 0.3011 0.1935 0.1123 0.0584 0.02666 0.01041 0.00335 0
1 1 1 1 1 1 0.9806 0.9547 0.8941 0.8091 0.6902 0.5465 0.3999 0.2642 0.1565 0.08188 0.03658 0.01368 0.004044 0.0007944 0
1 1 1 1 1 1 1 0.9612 0.919 0.8173 0.6869 0.5277 0.3606 0.2213 0.1158 0.05045 0.01799 0.004377 0.0004863 0 0
1 1 1 1 1 1 1 1 0.9224 0.8499 0.6784 0.4969 0.3188 0.164 0.07204 0.02292 0.003275 0 0 0 0
1 1 1 1 1 1 1 1 1 0.8292 0.7101 0.4499 0.2426 0.1103 0.02205 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 0.594 0.4455 0.1485 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Because the animal starves if x = 0, boundary conditions are (2.137)
V (0, t) = 0,
0 ≤ t ≤ T,
and (2.138)
V (x, T ) = ψ(x),
0 ≤ x ≤ L.
We now use (2.135) and (2.136) to determine u∗ (x, T − 1) and V (x, T − 1) from (2.138), which enables us to use (2.135) and (2.136) again to determine u∗ (x, T − 2) and V (x, T − 2) from V (x, T − 1), and so on, for as far back in time as we please. We thus obtain the optimal strategy—and, simultaneously, the optimal fitness— recursively, by a method of backward induction that has come to be known as stochastic dynamic programming. For the sake of definiteness, we follow Houston and McNamara [148, p. 30] in supposing that z1 = 0, implying w1 = 1; z2 = 0.01, implying w2 = 0.99; p1 = 0.5; p2 = 0.6; L = 20; and the animal needs reserves of at least 10 units to survive the night. Then
1 if x ≥ 10, (2.139) ψ(x) = 0 if x < 10, and it follows from (2.135)–(2.138) that u∗ (x, T − 1), u∗ (x, T − 2), . . . , u∗ (x, T − 30) are given by the matrix in Table 2.6, starting with the rightmost column and moving to the left; “D” denotes that our animal is dead if x = 0. Correspondingly, V (x, t) is partially given by the matrix in Table 2.7, specifically, for t = T − 8, . . . , T .42 Table 2.6 reveals that the optimal strategy is of the form
2 if x ≤ θt , ∗ (2.140) u (x, t) = 1 if x > θt , 42 Note that the complete strategy matrix would have one column fewer than the complete fitness matrix, because there is no action to be specified at t = T .
104
2. Population Games
Table 2.8. How the proportions of the population in various states contribute to the proportions in the following period.
Proportion P (x0 , t + 1) P (x1 , t + 1) P (x2 , t + 1) P (x3 , t + 1)
Increase wi∗ (1 − pi∗ )P (x, t) ρ1 wi∗ pi∗ P (x, t) ρ2 wi∗ pi∗ P (x, t) ρ3 wi∗ pi∗ P (x, t)
where θt is the critical threshold that reserves must exceed at time t for u∗ to specify that our animal should forage in the poorer habitat until time t + 1. Indeed, with T = 30, we can represent u∗ (x, t) as the threshold sequence θ = {θ0 , θ1 , . . . , θ29 } defined by θt = 10 if t = 29, θt = 11 if 0 ≤ t ≤ 5 or t = 28, and θt = 12 if 6 ≤ t ≤ 27, which for present purposes is most compactly written simply as θ = {11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, (2.141) 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 11, 10}. That the optimal strategy turns out to be a threshold strategy does not depend on ψ(x), as demonstrated by Exercise 2.19, in which two other forms are considered. Indeed the further back we go in time, the less the final reward matters. In this regard, Houston and McNamara [148, p. 43] distinguish between weak and strong versions of “backward convergence” of the optimal strategy. Weak backward convergence means that for large T − t, the optimal strategy is independent of ψ(x) but still depends on t. Strong backward convergence means that for large T − t, the optimal strategy is independent of ψ(x) and t. Experience has shown that weak or strong backward convergence is usual according to whether or not environmental conditions depend explicitly on time; and in particular, the three final rewards considered here and in Exercise 2.19 all yield the same optimal threshold for T − t ≥ 64 [148, p. 43]. We would now like to generalize the above ideas from games against nature to more fully strategic interaction. It therefore becomes necessary to know how to follow a candidate for evolutionarily stable strategy forward in time to generate resulting expected behavior. To illustrate in the context of our example, we ask: If the proportion of the initial population having reserves x at time t is known to be P (x, t), and if the population adopts the strategy u∗ , what will be the corresponding proportion at later times t+1, t+2, and so on? Let i∗ = u∗ (x, t), and let x0 , . . . , x3 be defined as in Table 2.5. Then, between times t and t + 1, P (xk , t + 1) will change from zero as indicated in Table 2.8, for k = 0, . . . , 3. Correspondingly, proportions zi∗ P (x, t) and P (0, t + 1) will succumb to predation and starvation, respectively. In this way, if P (x, 0) is known, then P (x, t) can be found by recursion for all 0 < t ≤ T .43 Suppose, for example, following Houston and McNamara [148, p. 39] that initially the reserves of every animal equal 8, so that
1 if x = 8, (2.142) P (x, 0) = 0 if x = 8. 43
Although, as we shall see, we need P (x, t) only for 0 < t ≤ T − 1.
2.10. State-dependent dynamic games
105
Table 2.9. Proportions of initial cohort surviving in each state for 0 ≤ t ≤ 6 in the game against nature when the optimal strategy is followed: L = 0 for the bottom row, L = 20 for the top row, and t increases from left to right. 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0.1485 0.297 0.1485 0.396 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.02205 0.08821 0.1323 0.2058 0.2573 0.1176 0.1568 0 0 0 0 0 0
0 0 0 0 0 0 0.002757 0.01861 0.0486 0.09399 0.1539 0.1768 0.1779 0.1659 0.06986 0.0621 0 0 0 0 0
0 0 0 0 0.0003446 0.003016 0.01107 0.02981 0.06615 0.1102 0.139 0.1647 0.1561 0.1251 0.09452 0.03689 0.02459 0 0 0 0
0 0 0.00004307 0.0004631 0.002181 0.007044 0.01861 0.04217 0.07656 0.1152 0.1364 0.1444 0.1396 0.1139 0.07817 0.05021 0.01826 0.009738 0 0 0
5.384 × 10−6 0.00006865 0.0003938 0.001505 0.004592 0.0119 0.02596 0.05082 0.08512 0.1173 0.1295 0.1338 0.1234 0.1029 0.07435 0.04528 0.02549 0.008677 0.003856 0 0
Then with ψ defined by (2.139) and hence u∗ by (2.141), we obtain the results in Table 2.9 for t = 0, . . . , 6. This matrix merely shows proportions of the initial cohort that actually survive. The quantity that will be of strategic interest to us is the proportion of those survivors foraging in habitat i at time t given that the optimal strategy u∗ is followed, namely, $# L # (2.143) qi (t) = P (x, t) P (x, t). x:u∗ (x,t)=i
x=1
∗
For example, because u (x, 2) = 2 for x ≤ 11 and u∗ (x, 2) = 1 for x ≥ 12 by (2.141), it follows from (2.143) that 0.1568 + 0.1176 + 0.2573 + 0.2058 + 0.1323 + 0.08821 q2 (2) = 0.1568 + 0.1176 + 0.2573 + 0.2058 + 0.1323 + 0.08821 + 0.02205 = 0.9775 and, of course, q1 (2) = 1 − 0.9775 = 0.0225. In this and many other examples, expected behavior exhibits forward convergence. Again, Houston and McNamara [148, p. 44] distinguish two versions: weak forward convergence means that expected behavior of animals alive at time t depends on t but not the initial state for large t, whereas strong forward convergence means that expected behavior is also independent of t for large t. They also note that increased stochasticity decreases the computational time for either forward or backward convergence to be achieved, and discuss why models should include adequate stochasticity [148, pp. 35, 44]. Let us now introduce an element of strategic interaction. Perhaps the simplest thing we can do in that regard is to make the probability of predation in the richer habitat decrease with the proportion of animals foraging there. Accordingly, and for the sake of definiteness, let us replace z2 = 0.01 by (2.144a)
z2 = 0.01{0.1}q2 (t)
106
2. Population Games
Table 2.10. Proportions of initial cohort surviving in each state for 0 ≤ t ≤ 6 with u∗ = θ 0 and (2.144a) replacing z = 0.01 in Table 2.9 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1499 2997 1499 3996
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02246 08982 1347 2096 262 1198 1597
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
002807 01907 04991 09647 1581 1817 1828 1705 07178 0638
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0003509 003086 01136 03072 06837 114 1441 1707 1618 1296 09798 03824 02549
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
00004386 0004734 002235 007241 01919 04369 07965 12 1425 151 146 1192 08175 05251 01909 01018
5.482 × 10−6 0 00007014 0 0004032 0 001545 0 004725 0 01228 0 0269 0 05294 0 0891 0 1231 0 1364 0 1411 0 1301 0 1085 0 07843 0 04776 0 02688 0 009152 0 004068 0 0
and hence w2 by (2.144b)
w2 = 1 − 0.01{0.1}q2 (t)
but leave everything else the same. Thus if all animals foraged in the richer habitat, then the concomitant dilution effect would reduce predation to a mere 10% of its value when a predator has only a single target. What will be the ESS? How can we compute it? An ESS is a best reply to itself. So a possible approach is to guess an initial approximation to the ESS, follow expected behavior forward to obtain q2 , compute the optimal response, and then keep iterating until—we hope—our algorithm converges. An obvious candidate for the initial approximation, which we denote by θ 0 , is (2.141). We have already followed expected behavior forward to obtain Table 2.9; but we used z2 = 0.01, or w2 = 0.99. Iterating forward with (2.144) instead yields Table 2.10, from which the first few terms of the sequence {q2 (t)} can be calculated directly (Exercise 2.20). The full sequence is plotted in the first panel of Figure 2.10(a); and the best reply to it—by the method that produced Table 2.6 and (2.141)—is θ 1 = {18, 18, 18, 17, 17, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 16, 16, 15, 15, 15, 15, 14, 14, 14, 14, 14, 13, 12, 11, 10}. Iterating forward again with θ 1 in place of θ 0 yields the sequence {q2 (t)} in the second panel of Figure 2.10(a), and the new best reply is θ 2 = θ ∗ , where (2.145)
θ ∗ = {17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 15, 15, 15, 14, 14, 14, 13, 12, 11, 10}
turns out to be the ESS. The strategy sequence has in effect converged, but we do not know it until we have iterated forward one more time to obtain the sequence {q2 (t)} in the third panel of Figure 2.10(a), to which (2.145) is the (unique) best
2.10. State-dependent dynamic games
Figure 2.10. Convergence of q2 (t) as θ converges to the ESS from (a) θ 0 = (2.141) and (b) θ 0 = Θ5 , where Θx is defined by (2.146). Each panel shows the result of iterating forward with the strategy specified in the caption for that panel. For (a), the last two panels are identical; for (b), a further iteration with θ = θ ∗ would reproduce the final plot.
107
108
2. Population Games
reply; {q2 (t)} has now converged as well, as confirmed by iterating one more time to return {q2 (t)} again in the last panel of Figure 2.10(a). We do not have to use (2.141) for θ 0 ; Figure 2.10(b) shows convergence from 0 θ = Θ5 via θ 1 = {19, 10, 10, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 12, 11, 10} and
θ 2 = {18, 18, 17, 18, 17, 18, 17, 17, 17, 17, 17, 17, 17, 16, 16,
16, 16, 16, 15, 15, 15, 15, 14, 14, 14, 14, 13, 12, 11, 10} instead, where we define the constant-threshold strategy Θx by (2.146)
Θxt = x,
0 ≤ t ≤ T − 1,
for 0 ≤ x ≤ 20. Note that convergence requires an extra iteration. So far, so good: the algorithm has converged to an ESS. But we have purposely glossed over a host of largely computational issues that tend to arise with this approach to dynamic games.44 For example, the algorithm often converges, but does not always converge; when it fails to converge, typically it oscillates, so that θ n+2 = θ n for large n [146, p. 66]. Moreover, although the ESS is often unique, it does not have to be, and so it is necessary to check for the possibility of an alternative ESS by testing for convergence from different initial approximations. For detailed discussion of these and other issues concerning this method—including, in particular, how introducing errors in decision making that are less probable when they are costlier can facilitate convergence to an ESS—see Chapter 7 of Houston and McNamara [148].
2.11. Commentary This chapter focused on Maynard Smith’s [188] concept of evolutionarily stable strategy or ESS. After recasting the community game of Crossroads as a population game in §2.1, we formally introduced ESSs in §2.2. In §2.3 we transformed our new version of Crossroads into a continuous population game to demonstrate that the concept of evolutionary stability is by no means restricted to bimatrix games. In §2.4 we introduced three variants of Maynard Smith’s Hawk-Dove game, called Hawk-Dove I, Hawk-Dove II [188, p. 18], and Owners and Intruders [188, p. 96]; 44 A particular issue arising in the simple foraging game we have just considered is whether small differences between different ESSs have biological significance or are merely artefacts of discretization. Let us relabel the strategy defined by (2.145) as θ a . Then we have established that the algorithm converges to θ a for both θ 0 in (2.141) and θ 0 = Θ5 defined by (2.146). More generally, the algorithm converges to θ a from Θx for 0 ≤ x ≤ 5, 11 ≤ x ≤ 13, and 16 ≤ x ≤ 18. For 6 ≤ x ≤ 10 and x = 14 or b x = 15, however, the algorithm instead converges to θ b , where θ19 = 15 but otherwise θtb = θta ; and for c = 17 but otherwise θtc = θta . Furthermore, x = 19 or x = 20, the algorithm converges to θ c , where θ16 use of random sequences of 30 integers between 0 and 20 for θ 0 reveals three further ESSs, namely, θ d , f d e θ e , and θ f , where θ20 = 16 but otherwise θtd = θtb ; θ16 = 17 but otherwise θte = θtb ; and θ16 = 17 f d but otherwise θt = θt . Extensive numerical experimentation has shown that the algorithm converges most often to θta , then θtc , then θtb , and only infrequently converges to θtd , θte or θtf ; in particular, only a single θ 0 leading to θte has been found, only two for θtf , and only three for θtd . All of these ESSs are close to θta , with θtb and θtc differing only for one time, θtd and θte only for two times, and θtf only for three. Collectively, these results strongly suggest that the six ESSs are slightly different discrete manifestations of a unique underlying ESS at which t and x both vary continuously—put differently, that the ESS is in practice unique, despite being nonunique in principle.
the discussion of Hawk-Dove I is based on [223, p. 408]. In §2.5 we established necessary and sufficient conditions for ESSs. In §2.6 we applied these conditions, after first extending Hawk-Dove I to allow for continuous variation of strength while assuming that costs are constant; more complex versions of the continuous Hawk-Dove game can either allow for variable costs [232, 233] or not assume that the stronger contestant is guaranteed to win [205], or both [216]. Having adapted our definition of evolutionary stability for games restricted to pure strategies—discrete population games—in §2.2, we discussed population dynamics in terms of strategy composition in §§2.7–2.8. In §2.9 we introduced the concept of a continuously or convergence stable strategy, limiting our coverage to one-dimensional strategy sets, but the idea generalizes in a natural way to higher dimensions [75]; see Exercise 2.23. Finally, in §2.10 we broached the topic of state-dependent dynamic games.

How much is lost in a restriction to pure strategies? If one takes the view that mixed strategies are merely proxies for unmodelled aspects of an interaction, then the answer in theory is nothing. An example will help to illustrate. Recall once more the game of Four Ways, and suppose that drivers do not have spinning arrows on their dashboards; rather, they select pure strategy G when they feel agitated, pure strategy W when they feel calm, and otherwise strategy C. Nevertheless, the probabilities of being in these various states of agitation correspond to the probabilities of selecting G, W, or C in the ESS of §2.1. Thus, when behavior is averaged over time, drivers appear to adopt a mixed strategy; yet in fact they adopt a state-dependent pure strategy. Such a pure strategy can also be time-dependent; perhaps, for example, drivers select G when they feel agitated between Monday and Thursday, but C when they feel agitated on a Friday. More generally, a player's state can in principle be defined to incorporate almost any relevant physiological or environmental variable; if decisions can depend on both state and time in a sufficiently general way, then it is arguable that mixed strategies are quite unnecessary. Here two remarks are in order. First, although the decision rule in such games is deterministic—in the sense that the functional relationship between state, time, and action taken is deterministic—the state remains a random variable; thus the "internal" uncertainty that mixed strategies bring to a player's decisions is replaced by the "external" uncertainty of a player's state. Second, we must ultimately allow for mistakes by the players. But this uncertainty can also be externalized by considering purely deterministic (but conditional) decision rules of the form, "If you think that the state is that, then do this" (and then computing rewards as expected values over an appropriate probability distribution). All things considered, dynamic games with state-dependent pure strategies may be far more appealing as behavioral models than static games with mixed strategies.
The framework developed by Houston and McNamara [148] is remarkably versatile,45 and they and their colleagues have used it to study a variety of animal-behavior issues, including fighting for food [147], the trade-off in birds between foraging to avoid starvation and singing to attract a mate [146], and the ecological factors that would favor sentinel behavior in animals as diverse as scrub jays and

45 As we saw in §2.10, it relies heavily on stochastic dynamic programming, to which Clark and Mangel [66] is a gentler introduction (although with little to say about dynamic games per se).
meerkats, but not in all social mongooses [21]. Analysis of such games is often extremely difficult, however—and so even if, in principle, one regards mixed strategies as merely a stopgap, one is often in practice still grateful to have them.

The other dynamics discussed in this chapter are governed by the replicator equations (§2.8). They determine which strategies persist among those initially present but do not allow for mutation to other strategies. In the context of discrete games, mutation is incorporated by generalizing these equations to the replicator-mutator equations [251, 258]. In the context of continuous games, mutation is incorporated into dynamics through a body of work that has come to be known as adaptive dynamics, or AD for short [83, 144, 355]. AD assumes any mutant strategy u to be a small, rare deviation from the current population strategy v that, if successful, takes over the population before any further mutation arises, with the direction of evolution determined by the gradient of the invasion fitness i: dv/dt = ∂i/∂u |_{u=v}, where u and v are in general vectors and ∂/∂u denotes the gradient with respect to u. We have already seen the one-dimensional version of this equation in (2.127). For us, however, in §2.9 and elsewhere, the question of whether a population strategy v∗ is continuously stable does not even arise unless we already know that v∗ is an ESS. In AD, by contrast, strategies that are convergence stable46 without being even local ESSs—because (2.69) fails to hold—are of special interest, by virtue of being associated with the splitting of a monomorphic population into two distinct types and hence potentially the onset of speciation [195, p. 415]. For further details, see [43] or Chapter 13 of [49], and references therein.

There now exists a large theoretical literature on ESSs and related concepts. More advanced texts citing relevant earlier work include [47, 49, 74, 145, 251, 311, 347] from the perspective of biology and [292, 293, 342, 358] from that of economics. Much of this literature deals with bimatrix games, but the scope of Maynard Smith's concept is far more general—as we saw in §2.3 and §2.6. Much of the literature assumes an infinite population, but some of it—e.g., [189, 294] and Chapter 7 of [251]—considers a finite one. Infinite- versus finite-population effects will be studied in §§5.4–5.6, in the context of a discrete population game. Further examples of both discrete and continuous population games appear in Chapters 5–8. All are games within species. But the scope of evolutionary game theory extends to games played among species [51, 76], although we do not discuss them in this book.
46 This is the preferred term in AD, where the term "continuously stable strategy" or CSS is used for a local ESS (p. 81) that is convergence stable. Because large deviations are not considered in AD, however, such a strategy need not be a global ESS, i.e., need not satisfy (2.17). See Exercise 6.4 for an example.
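The AD gradient equation above is easy to simulate. The Python sketch below is purely illustrative: the invasion fitness i(u, v) = u(2 − u − v) is a made-up example (not drawn from the text), chosen only because its singular strategy v∗ = 2/3 happens to be convergence stable, so Euler integration of dv/dt = ∂i/∂u |_{u=v} should approach it.

    # Hypothetical invasion fitness of a rare u-mutant in a v-population.
    def i(u, v):
        return u*(2 - u - v)

    def selection_gradient(v, h=1e-6):
        # Central difference for di/du evaluated at u = v.
        return (i(v + h, v) - i(v - h, v))/(2*h)

    v, dt = 0.1, 0.01
    for _ in range(5000):          # Euler steps of dv/dt = di/du at u = v
        v += dt*selection_gradient(v)
    print(v)                       # tends to the singular strategy 2/3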
Exercises 2

1. Show that u = 1 is an uninvadable strategy of the symmetric version of Crossroads when θ > 1: if both drivers are slow, then both should select G. Would it be better to replace "both drivers are slow" by "the junction is fast"?

2. Verify (2.37), and hence that Figure 2.1 yields the optimal reaction set for Hawk-Dove I in §2.4. Also, verify that the ESS is strong if θ > 1 but weak if θ ≤ 1. How is Hawk-Dove I related to Crossroads mathematically?

3. (a) Show that any uninvadable strategy is a Nash-equilibrium strategy.
(b) Show that if v∗ is a strongly uninvadable strategy, then (v∗, v∗) must be a strong Nash equilibrium.
(c) If (u∗, v∗) is a strong Nash equilibrium, are u∗ and v∗ strongly uninvadable?

4. Verify (2.20).

5. Show that if δ > τ/2 or θ < 1 in Four Ways, then the ESS mixes all three pure strategies with positive probability,47 but that C should not be selected with greater probability than 1/3.

6. Find all evolutionarily stable strategies in the prisoner's dilemma game, defined by Exercise 1.23.

7. (a) Find the optimal reaction set for Crossroads II with slow drivers, thus verifying Figure 2.4 and (2.32).
(b) Show that the average delay in Crossroads II decreases with respect to η.
(c) Generalize Crossroads II to allow for an arbitrary nonuniform distribution of lateness, i.e., for any g such that g(ξ) ≥ 0 and ∫₀¹ g(ξ) dξ = 1 in place of (2.23). Assume δ > τ/2.

8. (a) Verify Table 2.3 for Owners and Intruders.
(b) This payoff matrix is for a population that is intrusive, in the sense defined in §7.3: Dove intruders concede if an owner attacks, but otherwise display. By contrast, in an unintrusive population, a Dove intruder concedes as soon as it sees an owner—no attack is necessary. Show that unintrusiveness replaces Table 2.3 by Table 7.6.
(c) Modify the analysis of Owners and Intruders in §2.5 (p. 80) to show that X is now the only ESS for θ < 1.

9. Obtain the optimal reaction set R for the continuous Hawk-Dove game of §2.6 in the case where strength is uniformly distributed between 0 and 1, and thus verify Figure 2.6.

10. Verify that the reward to each individual in a population at the ESS of the continuous Hawk-Dove game (§2.6) exceeds the reward at the ESS for the corresponding discrete Hawk-Dove game (Hawk-Dove I, §2.4).
47 In a matrix game, such an ESS is guaranteed to be unique. See Corollary 6.10 of Broom and Rychtář [49, p. 98].
11. In §2.1 we showed that strategy θ in Crossroads could not be invaded by strategy 0 or strategy 1 in isolation. Reconcile this conclusion with dynamical model (2.97) in §2.7.

12. Show that if (2.99) is satisfied, then the only evolutionarily stable strategies—pure or mixed—of the symmetric game with payoff matrix A = (a11 a12; a21 a22) are the two pure strategies.

13. Verify (2.104).

14. (a) Establish (2.110).
(b) Show that (2.109) and (2.115a) imply x1(n) + · · · + xm(n) = 1, 0 ≤ n < ∞.
(c) Hence, verify (2.115b).

15. (a) Verify (2.111).
(b) Verify that the isoclines in Figure 2.8 cross at (2.112).
(c) Let (x̃1, x̃2) be an equilibrium of (2.111). Using Footnote 31 on p. 93, obtain the characteristic equation for the Jacobian matrix J(x̃1, x̃2), and hence show that the equilibria (0, 1), (0, 0), (θ, 1 − θ), and (α1, α2) are, respectively, an unstable node, two stable nodes, and a saddle point.

16. In the context of discrete population games (§2.8):
(a) Show that any ESS is stable against pure infiltration.
(b) Show that a strong ESS is stable against mixed infiltration.
(c) Show that a dominant strategy is also a strong ESS.

17. For the modified version of Crossroads in Exercise 1.27, it is known that (e1, e1), (e2, e2), and (θ, θ) are all symmetric Nash-equilibrium strategy combinations, where e1 = (1, 0), e2 = (0, 1), and θ = (θ1, θ2) with θ1, θ2 defined by (1.22). Show that the pure strategies e1 and e2 are both strongly evolutionarily stable, but that the mixed strategy θ is not an ESS.48

18. Is v∗ = α a continuously stable interior ESS of the continuous Hawk-Dove game in §2.6 for a uniform distribution of strength?

19. In Table 2.6 of §2.10 we found the optimal strategy for a foraging animal in a game against nature with (2.138) as the final reward. Two other forms for ψ(x) were considered by Houston and McNamara [148]: (a) ψ(x) = 1 if x ≥ 1 but ψ(0) = 0; (b) ψ(x) = x, where fitness is proportional to energy reserves at dusk—the constant of proportionality has no effect on optimal strategy and is set to 1 without loss of generality. In both cases, show that the optimal strategy has the form (2.140).

20. (a) Obtain the first seven terms of the sequence {q2(t)} plotted in the first panel of Figure 2.10(a) directly from Table 2.10.
(b) Obtain the analogue of Figure 2.10 with θ^0 = Θ_19 in place of (2.141) in the left-hand column and θ^0 = {18, 11, 18, 2, 10, 10, 5, 4, 5, 19, 20, 12, 3, 17, 10, 20, 12, 11, 6, 5, 12, 17, 9, 10, 13, 10, 14, 8, 2, 13} in place of θ^0 = Θ_5 in the right-hand one.

48 For a discussion of how either ESS might establish itself (regardless of whether τ1 = τ2), see [326]. That only pure strategies may be ESSs is a consequence of a theorem due to Selten [300].
21. (a) Show that v∗ = (1/3, 1/3), in which each pure strategy is played with the same probability, is an ESS of the game defined by Exercise 1.29 if λ > 0, but that there is no ESS if λ < 0.
(b) In each case, by considering equations analogous to (2.111) and proceeding as in §2.8, describe the evolution of a population consisting of a mixture of all three pure strategies.
(c) Using Footnote 31 on p. 93, obtain the characteristic equation for the Jacobian matrix J(x̃1, x̃2), and hence show that the equilibrium (1/3, 1/3) is a stable or unstable focus according to whether λ is positive or negative.

22. Revisit the cake-cutting game, which you considered as a community game in Exercise 1.28, and find all evolutionarily stable strategies of the associated population game.

23. For m-dimensional vector strategies let v = (v1, v2, . . . , vm), u = (u1, u2, . . . , um), and v∗ = (v1∗, v2∗, . . . , vm∗) be, respectively, a population strategy, a potential mutant strategy, and an interior ESS. Then the analogue of (2.123) is that a non-v∗ population converges to a v∗ population if the matrix C with

c_ij = ∂²f/(∂ui ∂uj) + ∂²f/(∂ui ∂vj)

in row i and column j, 1 ≤ i, j ≤ m, is negative definite at u = v = v∗ [75]. Use this criterion with m = 2 to verify that ζ in Four Ways (p. 65) is a continuously stable ESS.
Chapter 3
Cooperative Games in Strategic Form
We now return to community games. Consider such a game between two players, and let each choose a single scalar variable—u for the first player, v for the second. Then to every feasible strategy combination (u, v), i.e., to every (u, v) in the decision set D, there corresponds a vector of rewards f = (f1, f2); the equations f1 = f1(u, v), f2 = f2(u, v) define a vector-valued function f from the decision set D, which is a subset of the u-v plane, into the f1-f2 plane. In Crossroads, for example, f1 and f2 are defined by (1.12). The image of D under f, i.e., the subset of the f1-f2 plane onto which f maps D, is known as the reward set; it contains all reward vectors that are achievable by some combination of strategies in D. A standard notation for this image is f(D), but we will often find it more convenient to denote the reward set by F (that is, F = f(D)). Note that f is not in general invertible: the equations f1 = f1(u, v), f2 = f2(u, v) do not in general define a (single-valued) function from F onto D. For example, we saw in Exercise 1.18 that when both drivers in Crossroads are fast, the joint max-min strategy combination (ũ, ṽ) and the Nash-equilibrium strategy combination (θ1, θ2) are mapped to the same point in F by f; however, (ũ, ṽ) ≠ (θ1, θ2).

The concept of a reward set is readily generalized to games among n players, the kth of whom selects an sk-dimensional strategy vector; f is then a vector-valued function from a space of dimension s1 + s2 + · · · + sn into a space of dimension n. But the concept is most useful for n = 2. For n = 3 the reward set is difficult to sketch, and for n > 3 it is difficult even to visualize.

For games between specific individuals who can make binding agreements with one another if it benefits them to do so, Nash's concept of noncooperative equilibrium is superseded by various cooperative solution concepts (the most enduring of which is still due to Nash). In understanding why, we will find that a picture of the reward set is worth several thousand words. Accordingly, in this chapter, an analysis of the reward set for the game of Crossroads will serve as our springboard
to cooperative games, which we study in strategic form here and in nonstrategic form in Chapter 4. For the sake of simplicity, we assume throughout the present chapter that Nan and San are both fast drivers. Therefore (p. 13),

(3.1)    2δ > 2ε > max(τ1, τ2).
3.1. Unimprovability or group rationality

For Crossroads, the reward vector f is defined by (1.12), i.e.,

(3.2a)    f1(u, v) = (ε + τ2/2 − {δ + ε}v)u + (ε − τ2/2)v − ε − τ2/2,
(3.2b)    f2(u, v) = (ε + τ1/2 − {δ + ε}u)v + (ε − τ1/2)u − ε − τ1/2.

Suppose, for example, that the game is symmetric with

(3.3)    δ = 3,    ε = 2,    τ1 = 2 = τ2.

Then

(3.4)    f1 = (3 − 5v)u + v − 3,    f2 = u − 3 + (3 − 5u)v.
The reward set F is the image of the decision set D = {(u, v) | 0 ≤ u, v ≤ 1} under the mapping defined by (3.4). To obtain F, imagine that D is covered by an infinity of line segments parallel to the v-axis. On each of these line segments u is constant, but v increases from 0 to 1. Let us first obtain the image under (3.4) of one of these line segments, then allow u to vary between 0 and 1. The images of all the line segments together (with some duplication) will constitute F. Consider, therefore, the line segment

(3.5)    L(ξ) = {(ξ, v) | 0 ≤ v ≤ 1}

on which u = ξ is fixed (and 0 ≤ ξ ≤ 1). The image f(L(ξ)) of L(ξ) under the mapping f is the set of all points (f1, f2) such that

(3.6)    f1 = 3(ξ − 1) + (1 − 5ξ)v,    f2 = ξ − 3 + (3 − 5ξ)v

for some v ∈ [0, 1]. Eliminating v, we see that f(L(ξ)) is part of the line in the f1-f2 plane with equation

(3.7)    (3 − 5ξ)f1 + (5ξ − 1)f2 + 10ξ² − 8ξ + 6 = 0
(Exercise 3.1). But f (L(ξ)) is not the whole of this line, because 0 ≤ v ≤ 1; rather, it is that part which extends in the f1 -f2 plane from the point (3ξ − 3, ξ − 3), corresponding to v = 0, to the point (−2 − 2ξ, −4ξ), corresponding to v = 1. The line segment f (L(ξ)) is sketched in Figure 3.1 as a dotted line for values of ξ at increments of 0.1 between 0 and 1. For example, f (L(0)), which is the image of side u = 0 of the unit square, is that part of the line 3f1 − f2 + 6 = 0 which stretches from (−3, −3) to (−2, 0). It is marked in Figure 3.1 as the upper prong of the smaller V-shape, the lower prong of which is the image of side v = 0 of the unit square and which satisfies f1 − 3f2 − 6 = 0. Again, f (L(1)), which is the image of side u = 1 of the unit square, is that part of the line f1 − 2f2 − 4 = 0 which stretches from (−4, −4) to (0, −2). In Figure 3.1 it is the lower prong of the larger V-shape, the upper prong of which is the image of side v = 1 of the unit square and satisfies 2f1 − f2 + 4 = 0, and so on (Exercise 3.1).
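These images are easy to reproduce numerically. A brief Python sketch (illustrative only) evaluates the endpoints of each segment f(L(ξ)) under the mapping (3.4); plotting the segments for a fine grid of ξ fills in the curvilinear triangle of Figure 3.1.

    import numpy as np

    # Reward mapping (3.4) for the symmetric game (3.3)
    def f(u, v):
        return (3 - 5*v)*u + v - 3, u - 3 + (3 - 5*u)*v

    for xi in np.linspace(0, 1, 11):
        # f(L(xi)) is a straight segment, so its two endpoints determine it:
        # (3*xi - 3, xi - 3) at v = 0 and (-2 - 2*xi, -4*xi) at v = 1.
        print(round(xi, 1), f(xi, 0.0), f(xi, 1.0))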
Figure 3.1. The reward set for Crossroads, where (0, −τ1 ) and (−τ2 , 0) are globally unimprovable. The dashed line on which these points both lie has equation τ1 f1 + τ2 f2 + τ1 τ2 = 0.
A glance at Figure 3.1 now reveals that the reward set is a curvilinear triangle, which is reminiscent of the open mouth of a fledgling (and the picture is similar for all other values of δ, ε, τ1, and τ2 such that (3.1) is satisfied). The straight edges of the triangle have equations 2f1 − f2 + 4 = 0 = f1 − 2f2 − 4. The curved edge of the triangle is the curve to which every line segment in the family

(3.8)    {f(L(ξ)) | 0 ≤ ξ ≤ 4/5}

is a tangent somewhere; we call this curve the envelope of the family. The envelope has two straight segments. The first runs between (−11/5, −3/5) and (−2, 0); it corresponds to limiting member L(0) of family (3.8), and hence has equation 3f1 − f2 + 6 = 0. The second straight segment of the envelope runs between (−3/5, −11/5) and (0, −2); it corresponds to limiting member L(4/5) of family (3.8), and so has equation f1 − 3f2 − 6 = 0.

To find the equation of the curved part of the envelope between (−11/5, −3/5) and (−3/5, −11/5), suppose that the line segment f(L(ξ)) has equation ψ(f1, f2, ξ) = 0, so that ψ is defined by the left-hand side of (3.7); and that f(L(ξ)) touches the envelope at the point with coordinates (F1(ξ), F2(ξ)), so that the parametric equations of the envelope are f1 = F1(ξ), f2 = F2(ξ), 0 ≤ ξ ≤ 4/5. Then the vector normal to f(L(ξ)) has direction (∂ψ/∂f1, ∂ψ/∂f2), and the tangent vector to the envelope has direction (F1′(ξ), F2′(ξ)). Because these two vectors are perpendicular at the point (F1(ξ), F2(ξ)), we have ∂ψ/∂f1 · F1′(ξ) + ∂ψ/∂f2 · F2′(ξ) = 0. But (F1(ξ), F2(ξ)) must lie on f(L(ξ)), i.e., ψ(F1(ξ), F2(ξ), ξ) = 0. Differentiating with
respect to ξ, we obtain ∂ψ/∂f1 · F1′(ξ) + ∂ψ/∂f2 · F2′(ξ) + ∂ψ/∂ξ = 0; therefore, ∂ψ/∂ξ = 0 at (F1(ξ), F2(ξ)). So

(3.9)    ψ(f1, f2, ξ) = 0 = ∂ψ/∂ξ (f1, f2, ξ)

must hold at all points (f1, f2) on the curved part of the envelope. By eliminating ξ between these equations, we obtain the equation

(3.10)    25(f1 − f2)² − 40(f1 + f2) − 176 = 0

(Exercise 3.2). Thus the curved edge of F has equation

(3.11a)    3f1 − f2 + 6 = 0    if −3/5 ≤ f2 ≤ 0,
(3.11b)    25(f1 − f2)² − 40(f1 + f2) = 176    if −12/5 ≤ f1, f2 ≤ −3/5,
(3.11c)    f1 − 3f2 − 6 = 0    if −3/5 ≤ f1 ≤ 0.
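The elimination of ξ behind (3.10) can be checked symbolically; a minimal sympy sketch (illustrative only, with the factor −40 clearing denominators and matching the sign convention of (3.10)):

    import sympy as sp

    f1, f2, xi = sp.symbols('f1 f2 xi')
    # psi is the left-hand side of (3.7)
    psi = (3 - 5*xi)*f1 + (5*xi - 1)*f2 + 10*xi**2 - 8*xi + 6
    # Envelope condition (3.9): psi = 0 and d(psi)/d(xi) = 0; eliminate xi.
    xi_star = sp.solve(sp.diff(psi, xi), xi)[0]
    env = sp.expand(-40*psi.subs(xi, xi_star))
    print(env)  # 25*f1**2 - 50*f1*f2 + 25*f2**2 - 40*f1 - 40*f2 - 176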
Note that (3.9) would yield the parametric equations of the envelope of (3.8) even if ψ = 0 were not a straight line.

Now, if Nan and San agree to cooperate, then which point of F will be most agreeable to them? Suppose that they somehow pick a tentative point. Then, because Nan wants f1 to be as large as possible and San wants f2 to be as large as possible, points just above or to the right of that point will always be at least as agreeable to them, no matter where in the reward set the tentative point lies. Therefore, Nan and San will revise their tentative point. Proceeding in this manner, the two players will quickly eliminate all points either in the interior of the reward set or on the southern or western edge, because all such points can be improved upon by moving infinitesimally upwards or to the right. Thus, agreeable points must lie on (3.11). But even part of this curved edge can be eliminated. Implicit differentiation of (3.10) yields

(3.12)    df2/df1 = (5(f1 − f2) − 4)/(5(f1 − f2) + 4),
whence the tangent to the curve (3.11) is parallel to the f1-axis at (−8/5, −12/5) and to the f2-axis at (−12/5, −8/5); see Exercise 3.3. Between (−8/5, −12/5) and (−3/5, −11/5), the slope increases from 0 to 1/3; thereafter it is constant. Hence both f1 and f2 are greater at every point between (−8/5, −12/5) and (0, −2) than they are at (−8/5, −12/5). All such points, with the exception of (0, −2), are improvable, and so Nan and San would eliminate them. Likewise, f1 and f2 are greater at every point between (−12/5, −8/5) and (−2, 0) than they are at (−12/5, −8/5); all such points, with the exception of (−2, 0), are improvable and would be eliminated. Thus the only points remaining on (3.11) are (−2, 0), (0, −2), and points that lie between (−8/5, −12/5) and (−12/5, −8/5). If one of these points has been reached, then it is impossible to move to a neighboring point in the reward set—i.e., a point in the reward set that is arbitrarily close but nevertheless distinct—without making at least one player worse off. Such points are therefore said to be locally unimprovable, or locally Pareto optimal (after the sociologist Pareto) or locally group rational. But among these points, (−2, 0) and (0, −2) possess an even stronger measure of unimprovability: it is impossible to move from either one to any point in the reward
set (regardless of distance) without making at least one player worse off. Such points are said to be globally unimprovable; clearly, global unimprovability implies local unimprovability. Locally unimprovable points of F are marked in Figure 3.1 by a solid curve from (−8/5, −12/5) to (−12/5, −8/5). Note, however, that the endpoints are excluded. More formally, the strategy combination (u, v) is locally unimprovable if there exists no neighboring point (ū, v̄) in D such that

(3.13)    either    f1(ū, v̄) > f1(u, v)    and    f2(ū, v̄) ≥ f2(u, v)
          or        f1(ū, v̄) ≥ f1(u, v)    and    f2(ū, v̄) > f2(u, v),
and (u, v) is globally unimprovable if there exists no (ū, v̄) anywhere in D such that (3.13) is satisfied. Note that if (u, v) and (ū, v̄) are neighboring points in D, then (f1(u, v), f2(u, v)) and (f1(ū, v̄), f2(ū, v̄)) are neighboring points in F, because f1 and f2 are continuous functions. Clearly, provided the players have agreed to cooperate, any strategy combination that is not locally unimprovable would be irrational, because the players could agree to a neighboring combination that would yield no less a reward for all of them and a somewhat greater reward for at least one of them. It is not so clear that a combination would necessarily be irrational if it were not globally unimprovable, however, because the only globally unimprovable points in Figure 3.1 are (0, −2) and (−2, 0). Although these points represent best possible outcomes for Nan or San as individuals, it is difficult to see how both as a group could agree to either. Therefore, we shall proceed on the assumption that the best cooperative strategy combination in a cooperative game must be at least locally unimprovable, but not necessarily globally unimprovable; if only a single combination were locally unimprovable, then we would not hesitate to regard it as the solution of the game. These special circumstances almost never arise, however; rather, many strategy combinations are locally unimprovable. Accordingly, we denote the set of all locally unimprovable strategy combinations by P (for Pareto), and the set of all globally unimprovable strategy combinations by PG. Of course, PG ⊆ P (that is, PG is a subset of P): a strategy cannot be globally unimprovable unless first of all it is locally unimprovable.1

To obtain P for the game defined by (3.3), we substitute (3.4) into (3.11b); after simplification, we obtain (Exercise 3.3)

(3.14)    {10(u + v) − 8}² = 0,

implying u + v = 4/5. But in Figure 3.1 the thick solid curve corresponds to values of u between 1/5 and 3/5. Thus P consists of the line segment in the u-v plane that joins (1/5, 3/5) to (3/5, 1/5), together with (0, 1) and (1, 0):

(3.15)    P = {(0, 1)} ∪ {(u, v) | u + v = 4/5, 1/5 < u < 3/5} ∪ {(1, 0)}.

Also,

(3.16)    PG = {(0, 1)} ∪ {(1, 0)} = {(0, 1), (1, 0)}.
1 Note that PG ⊆ P allows for the possibility that PG = P , as in Figure 3.3 below. (In general, for sets A and B, A ⊂ B would mean that A is both contained in B and not equal to B, whereas A ⊆ B allows for either A ⊂ B or A = B.)
Figure 3.2. The bargaining set for Crossroads with δ = 3, ε = 2, and τ1 = 2 = τ2
The set P is sketched in Figure 3.2. Note that P is disconnected—it consists of three separate pieces—and excludes (1/5, 3/5) and (3/5, 1/5), as indicated by the open circles. Furthermore, although every locally unimprovable reward vector lies on the boundary of F, all but two locally unimprovable strategy combinations lie in the interior of D.

If you are familiar with the implicit function theorem, then you could instead have obtained (3.14) from the vanishing of the Jacobian determinant of the reward-function mapping defined by (3.4), i.e., from the equation

(3.17)    |J(u, v)| = | ∂f1/∂u  ∂f1/∂v |
                      | ∂f2/∂u  ∂f2/∂v | = 0;

see Exercise 3.4. On the one hand, the mapping f is locally invertible wherever |J(u, v)| ≠ 0, and, on the other hand, if you think of the unit square in the u-v plane as a sheet of rubber that must be stretched or shrunk by the mapping f until it corresponds to the reward set, then the envelope lies where the sheet gets folded back on itself by f to form two adjacent sheets in F, one above the other. The mapping is not locally invertible here because, for points in F that are arbitrarily close to the envelope, it is impossible to say whether the inverse image in the unit square should correspond to the upper of the two sheets or to the lower one.

If P contains more than a single strategy combination, then which element of P is the solution of the game? We will consider this question in §3.3. It is already clear from Figure 3.1 that the question is worth answering, however, because both Nan and San prefer any strategy combination in P to the Nash equilibrium (3/5, 3/5), which gives each a reward of only −12/5. On the other hand, P may be too large a set from which to select a solution, and in this regard it will be convenient to introduce some new terminology. Let f̃1 denote Player 1's max-min reward, i.e., define f̃1 = m1(ũ), where the minimizing function m1 is defined by (1.98) and ũ is a max-min strategy for Player 1. If f1(u, v) ≥ f̃1, i.e., if (u, v) gives Player 1 at least her max-min reward, then we say that (u, v) is individually rational for her. Likewise, we say that (u, v) is individually rational for Player 2 if f2(u, v) ≥ f̃2, where f̃2 = m2(ṽ), m2 is defined by (1.98), and ṽ is a max-min strategy for Player 2. Thus

(3.18)    D∗ = {(u, v) ∈ D | f1(u, v) ≥ f̃1, f2(u, v) ≥ f̃2}
is the set of all strategy combinations that are individually rational for both players, and

(3.19)    P∗ = {(u, v) ∈ P | f1(u, v) ≥ f̃1, f2(u, v) ≥ f̃2} = P ∩ D∗

is the set of all locally unimprovable strategy combinations that are individually rational for both players. We call P∗ the bargaining set. Now, it would make no sense for Player 1 to accept the reward f1(u, v) if f1(u, v) < f̃1, because she could obtain at least f̃1 without any cooperation at all; likewise, for Player 2. Accordingly, we seek cooperative solutions that lie not only in P, but also in P∗. Note, however, that P∗ often coincides with P, as illustrated by Figure 3.1, and later by Figure 3.3.

The above definitions of unimprovability, individual rationality, and the bargaining set are largely adequate for our purposes, and they would hold even if Player 1 had a vector u of strategies and Player 2 a vector v of strategies (as in §1.4). Nevertheless, we note that the definitions readily generalize to games among an arbitrary number of players, say n, as follows. Let N = {1, 2, . . . , n} denote the set of players, as in (1.67), and for each k ∈ N, let Player k have an sk-dimensional vector of strategies, which we denote by wk; thus, w1 = u, w2 = v in (3.13). For each k ∈ N, let Player k's reward be fk = fk(w), where w denotes the strategy combination (w1, w2, . . . , wn), and let D be the decision set, i.e., the set of all feasible w. Then the strategy combination w is locally unimprovable if there exists no other neighboring point w̄ = (w̄1, w̄2, . . . , w̄n) in D such that

(3.20)    fk(w̄) ≥ fk(w)    for all k ∈ N,
and       fi(w̄) > fi(w)    for some i ∈ N,

and w is globally unimprovable if there exists no point w̄ anywhere in D such that (3.20) is satisfied. Correspondingly, for k ∈ N, let f̃k denote Player k's max-min reward; i.e., define f̃k = mk(w̃k), where the minimizing function mk is defined by (1.97) and w̃k is a max-min strategy for Player k. If fk(w) ≥ f̃k, i.e., if w gives Player k at least her max-min reward, then we say that w is individually rational for Player k. Now (3.18) generalizes to the set of strategy combinations that are individually rational for all players as

(3.21)    D∗ = {w ∈ D | fk(w) ≥ f̃k for all k ∈ N},

and (3.19) generalizes to the set of locally unimprovable strategy combinations that are individually rational for all players as

(3.22)    P∗ = {w ∈ P | fk(w) ≥ f̃k for all k ∈ N} = P ∩ D∗.
3.2. Necessary conditions for unimprovability

To determine whether a strategy combination w ∈ D is locally improvable, we compare the rewards at w with those at a neighboring point, say w + λh, where the vector h = (h1, h2, . . . , hn) yields the direction of movement from w, and λ is a small positive number; h must not be the zero vector. If there exists λ > 0, no matter how small, such that fk(w + λh) ≥ fk(w) for all k ∈ N and fi(w + λh) > fi(w) for some i ∈ N, then w is not (either locally or globally) unimprovable, because w + λh is an
improved strategy combination—provided, of course, that w + λh ∈ D. Accordingly, we define h to be an admissible direction at w if w + λh ∈ D for sufficiently small λ > 0. Any h is admissible for w in the interior of D, whereas only vectors that point into D (or at least not out of D) are admissible for w on the boundary of D. Suppose, for example, that u = w1, v = w2 and that D is the unit square, {(u, v) | 0 ≤ u, v ≤ 1}. Then h = (h1, h2) must satisfy h1 ≥ 0 to be admissible on the side u = 0; however, h2 is unrestricted on u = 0—except at the points (0, 0), where we require h2 ≥ 0, and (0, 1), where we require h2 ≤ 0. Similar considerations apply to the other three sides of the square.

We can now be more precise and say that if, at w, there exists an admissible direction h and a number λ > 0, no matter how small, such that fk(w + λh) − fk(w) ≥ 0 for all k ∈ N and fi(w + λh) − fi(w) > 0 for some i ∈ N, then w ∉ P (and hence w ∉ P∗). In principle, by applying this test to each w ∈ D in turn, we could systematically eliminate all locally improvable strategy combinations. The test is not practicable, however, because D contains infinitely many points. To devise a practicable test for unimprovability, it is necessary to make assumptions about the nature of the functions fk, principally, that fk is (at least once) differentiable for all k ∈ N. From little more than this assumption, it is possible to derive powerful necessary and sufficient conditions for unimprovability; see, e.g., [348]. To follow this approach in its full generality would greatly distract us from our purpose, however, and so we assume instead that each player controls only a single scalar variable. Then, because Player k's strategy is no longer a vector, we prefer to denote it by wk; the joint strategy combination w becomes an n-dimensional row vector, namely, w = (w1, w2, . . . , wn). Furthermore, we shall restrict our attention to necessary conditions for unimprovability, which, as we shall see, eliminate most—but not all—improvable strategy combinations.

Let us now recall that if fk is differentiable with respect to wk for all k ∈ N, then from Taylor's theorem for functions of several variables we have

(3.23)    fk(w + λh) − fk(w) = λ (∂fk/∂w) h^T + o(λ),

where ∂fk/∂w denotes the gradient vector (∂fk/∂w1, ∂fk/∂w2, . . . , ∂fk/∂wn), h^T is the transpose of h, and λ > 0; and o(λ) is order notation—specifically, "little oh"—denoting terms so small that you can divide them by λ and the result will still tend to zero as λ → 0. The first term on the right-hand side of (3.23) dominates o(λ) for sufficiently small λ if (∂fk/∂w) h^T ≠ 0. Therefore, if

(3.24)    (∂fk/∂w) h^T > 0,    k = 1, 2, . . . , n,

for any admissible h, then w is not (either locally or globally) unimprovable, because then (3.23) and (3.24) imply fk(w + λh) − fk(w) > 0 for all k ∈ N for sufficiently small λ (> 0), so that w + λh is an improved strategy combination. Strategy combinations that do not satisfy (3.24) for any admissible h are candidates for unimprovability; however, we cannot be sure that they are unimprovable, even locally. Accordingly, let us denote by Pnec the set of all w ∈ D that do not satisfy (3.24) for any admissible h, and by P∗nec the set of all w ∈ Pnec that are individually rational for all players. Then P ⊆ Pnec and P∗ ⊆ P∗nec; but P ≠ Pnec and P∗ ≠ P∗nec, at least in general.
Suppose, for example, that n = 2, and set u = w1, v = w2. Then (3.24) requires us to eliminate (u, v) ∈ D if both

(3.25)    (∂f1/∂u)h1 + (∂f1/∂v)h2 > 0    and    (∂f2/∂u)h1 + (∂f2/∂v)h2 > 0

for any admissible direction h = (h1, h2). To be quite specific, let us consider the version of Crossroads defined by (3.3). Then, from (3.4) and (3.25), (u, v) is improvable if

(3.26)    (3 − 5v)h1 + (1 − 5u)h2 > 0    and    (1 − 5v)h1 + (3 − 5u)h2 > 0
for any admissible h = (h1, h2). Because h = (1, 0) is an admissible direction everywhere in the unit square except on the side u = 1, we must exclude points where 3 − 5v > 0 and 1 − 5v > 0, i.e., points where v < 1/5. Similarly, because h = (−1, 0) is an admissible direction everywhere in the square except on the side u = 0, we must exclude points where 3 − 5v < 0 and 1 − 5v < 0, i.e., points where v > 3/5. Continuing in this manner, we find that choosing h = (0, 1) excludes points where u < 1/5, and that choosing h = (0, −1) excludes points where u > 3/5. Now the only points remaining are (0, 1), (1, 0) and those which satisfy 1/5 ≤ u, v ≤ 3/5. We already know that (0, 1) and (1, 0) are globally unimprovable, from §3.1. So we concentrate on the region {(u, v) | 1/5 ≤ u, v ≤ 3/5}. Because this square lies totally within the interior of D, any h is admissible. But the more judiciously h is chosen, the more efficiently improvable points can be eliminated. An especially judicious choice is h = ±(1, 1). From (3.26), h = (1, 1) rules out points where 4 − 5(u + v) > 0, and h = (−1, −1) rules out points where 4 − 5(u + v) < 0; therefore, all locally unimprovable points must satisfy u + v = 4/5. We conclude that

(3.27)    Pnec = {(0, 1)} ∪ {(u, v) | u + v = 4/5, 1/5 ≤ u ≤ 3/5} ∪ {(1, 0)}.

Note, however, that (3.26) cannot eliminate either (1/5, 3/5) or (3/5, 1/5) as a candidate, even though we already know from §3.1 that neither point is unimprovable. Thus, P ⊆ Pnec but P ≠ Pnec (or P ⊂ Pnec).

How did we know that h = ±(1, 1) would work so well? The guesswork involved in choosing an admissible direction that yields useful information can be eliminated by having recourse to the so-called Theorem of the Alternative. A special case of this theorem, which will suffice for our purposes, is the following: if J is an n × n matrix and K is an m × n matrix, then either there exists a 1 × n (row) vector h such that

(3.28)    Jh^T > 0^n    and    Kh^T ≥ 0^m

or there exist a 1 × n vector η with nonnegative components, at least one of which is strictly positive, and a 1 × m vector μ with nonnegative components such that

(3.29)    ηJ + μK = 0_n

(but never both). Here 0^n stands for the n × 1 zero vector, 0_n stands for the 1 × n zero vector, and a vector inequality w > 0 (or w ≥ 0) means that every component of the vector w must be positive (or nonnegative); thus, in the statement of the theorem, η ≥ 0_n, η ≠ 0_n, μ ≥ 0_m. For a proof of the theorem, see Mangasarian [176].
Now, under our scalar-strategy assumption, constraints on the admissibility of h at a given w can always be written in the form Kh^T ≥ 0^m for suitable m and K, where K depends on w. Moreover, the n inequalities in (3.24) are equivalent to Jh^T > 0^n, where

(3.30)    J(w) = [ ∂f1/∂w1  ∂f1/∂w2  · · ·  ∂f1/∂wn ]
                 [ ∂f2/∂w1  ∂f2/∂w2  · · ·  ∂f2/∂wn ]
                 [   · · ·    · · ·  · · ·    · · · ]
                 [ ∂fn/∂w1  ∂fn/∂w2  · · ·  ∂fn/∂wn ]

is the Jacobian matrix for the particular w whose unimprovability is being tested. But if w is unimprovable, then (3.24) precludes alternative (3.28); therefore, we must instead have (3.29). The upshot is that finding Pnec means calculating the set of all w for which there exist μ ≥ 0_m and nonnegative η ≠ 0_n such that (3.29) holds. For w in the interior of D where any h is admissible, we set K equal to the m × n zero matrix (for any value of m), and because η must not be the zero vector, we deduce from (3.29) that ηJ(w) = 0_n, η ≠ 0_n. Thus it follows immediately from the theory of linear algebra that if w is an interior unimprovable point, then

(3.31)    |J(w)| = 0,

i.e., the determinant of the Jacobian matrix must vanish at w.2 This result does not imply that every interior point w satisfying (3.31) is a candidate for unimprovability, however, because more than just |J| = 0 is implied by ηJ = 0_n, η ≠ 0_n. When n = 2, for example, ηJ = 0_n becomes

(3.32)    η1 ∂f1/∂u + η2 ∂f2/∂u = 0,    η1 ∂f1/∂v + η2 ∂f2/∂v = 0,

where η = (η1, η2). With (η1, η2) ≥ (0, 0) and (η1, η2) ≠ (0, 0) there are three possibilities for η, namely, that η1 = 0 and η2 > 0, that η1 > 0 and η2 = 0, or that η1 > 0 and η2 > 0. From (3.32), the first case requires ∂f2/∂u = 0 = ∂f2/∂v, the second case requires ∂f1/∂u = 0 = ∂f1/∂v, and the third requires ∂f1/∂u · ∂f2/∂u < 0 and ∂f1/∂v · ∂f2/∂v < 0. For the version of Crossroads defined by (3.3), you can easily verify that these restrictions on interior unimprovable points correspond to (3.27).

We also apply (3.29) to boundary points. Consider, for example, (1, 0). The restrictions on h at that point are h1 ≤ 0, h2 ≥ 0, implying m = 2, K = (−1 0; 0 1), and J = J((1, 0)) = (3 −4; 1 −2). So ηJ + μK = 0_n = 0_2 becomes (η1 η2)(3 −4; 1 −2) + (μ1 μ2)(−1 0; 0 1) = (0 0), or 3η1 + η2 − μ1 = 0, −4η1 − 2η2 + μ2 = 0; and these equalities are easily satisfied with η ≥ 0_2, η ≠ 0_2, μ ≥ 0_2. Therefore (1, 0) is a candidate for unimprovability. On the other hand, for the point (0, 0), where the restrictions on h are h ≥ 0_2, so that K is the 2 × 2 identity matrix, a similar analysis yields the equalities 3η1 + η2 + μ1 = 0 and η1 + 3η2 + μ2 = 0. Now μ ≥ 0_2 implies 3η1 + η2 ≤ 0, η1 + 3η2 ≤ 0, which contradicts η ≥ 0_2, η ≠ 0_2. Thus (0, 0)
2 Note that (3.17) is a special case of this result.
is improvable. Again, at points other than (1, 0) and (1, 1) on the side of D where u = 1, the only restriction on h is h1 ≤ 0, so that m = 1 and K = (−1 0). Now ηJ + μK = 0_2 yields (3 − 5v)η1 + (1 − 5v)η2 − μ1 = 0, 4η1 + 2η2 = 0, the second of which contradicts η ≥ 0_2, η ≠ 0_2, regardless of the value of v. Continuing in this manner, the remaining improvable points on the boundary of D are readily eliminated; see Exercise 3.6.
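The directional test (3.26) is also easy to automate. The Python sketch below (illustrative only) samples admissible directions at a few points of the unit square and flags those that cannot be eliminated. Sampling directions cannot prove unimprovability, of course; it can only fail to disprove it, which is exactly what "candidate" means here.

    import numpy as np

    # First-order improvability test based on (3.26) for the game (3.3):
    # (u, v) is improvable if some admissible direction h increases both
    # rewards to first order.
    def improvable(u, v, n_dirs=720):
        g1 = np.array([3 - 5*v, 1 - 5*u])   # gradient of f1 at (u, v)
        g2 = np.array([1 - 5*v, 3 - 5*u])   # gradient of f2 at (u, v)
        for t in np.linspace(0.0, 2*np.pi, n_dirs, endpoint=False):
            h = np.array([np.cos(t), np.sin(t)])
            # admissibility on the boundary of the unit square
            if (u == 0 and h[0] < 0) or (u == 1 and h[0] > 0):
                continue
            if (v == 0 and h[1] < 0) or (v == 1 and h[1] > 0):
                continue
            if g1 @ h > 1e-9 and g2 @ h > 1e-9:
                return True
        return False

    for p in [(0, 1), (1, 0), (0.4, 0.4), (0.2, 0.6), (0.5, 0.5), (0, 0)]:
        print(p, "improvable" if improvable(*p) else "candidate for Pnec")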
3.3. The Nash bargaining solution

Despite the Theorem of the Alternative, eliminating improvable points is rarely a straightforward exercise, even if n = 2. On the other hand, bargaining points—i.e., strategy combinations in the bargaining set—are almost never unique. So, even after the strenuous labor of calculating P∗, we are still faced with the problem of deciding which w in P∗, say w = ŵ, should be the solution of the game. If it were somehow possible to determine ŵ without first calculating P∗, then clearly we could save ourselves a great deal of trouble. One such approach to determining ŵ is provided by Nash's [242] bargaining solution, which we now describe. Because Player k benefits from cooperation with the other players only if she obtains a reward in excess of the max-min reward that she can guarantee for herself, we will define

(3.33)    xk = fk(w) − f̃k

to be Player k's benefit of cooperation from the joint strategy combination w ∈ D; for w ∈ P∗, xk ≥ 0. Because, for each k ∈ N, Player k wants xk to be as large as possible (and certainly nonnegative), an agreeable choice of the vector

(3.34)    x = (x1, x2, . . . , xn)

should lie as far as possible from the origin (and in the nonnegative orthant) of the n-dimensional space of cooperation benefits. But how should one measure distance d from the origin? Should one use

(3.35)    d = x1 + x2 + · · · + xn

or

(3.36)    d = x1 · x2 · · · xn ?

Or yet another formula expressing the idea that all players prefer a larger benefit to a smaller one, i.e., ∂d/∂xi > 0 for x > 0_n for all i ∈ N? Clearly, there is no limit to the number of such formulae! Now, each player measures her cooperation benefit according to her own scale of merit, i.e., subjectively. Suppose, however, that there exists some objective scale of merit, against which a supreme arbitrator could assess all players' subjective valuations, and let one unit of objective merit equal γk units of Player k's subjective merit, for all k ∈ N. Then the objective distances corresponding to (3.35) and (3.36), which the supreme arbitrator could perhaps supply, are

(3.37)    d = x1/γ1 + x2/γ2 + · · · + xn/γn
and

(3.38)    d = (x1/γ1) · (x2/γ2) · · · (xn/γn) = (x1 · x2 · · · xn)/(γ1 · γ2 · · · γn),

respectively. But who is this supreme arbitrator? Who is this person who is capable of assigning values to the numbers γ1, γ2, . . . , γn—or, as game theorists prefer to say, of making interpersonal comparisons of utility? Suppose there is no such person. If distance is measured according to (3.35), then this is a most unfortunate circumstance, because we cannot maximize the corresponding objective distance (3.37) until we know the values of γ1, γ2, . . . , γn. If distance is measured according to (3.36), however, then it matters not a whit—maximizing (3.36) and maximizing (3.38) are one and the same thing, for any values of γ1, γ2, . . . , γn. Thus formula (3.36) has a very desirable property that (3.35) and other formulae do not possess. We will say that ŵ ∈ P∗ is a Nash bargaining solution if ŵ maximizes

(3.39)    d(w) = {f1(w) − f̃1} · {f2(w) − f̃2} · · · {fn(w) − f̃n}
over P∗; and if ŵ is unique—i.e., if no other w ∈ P∗ satisfies d(w) = d(ŵ)—then we will regard ŵ as the solution of our cooperative game. In practice, we can often obtain ŵ by maximizing d(w), not over P∗ but rather over the whole of D, and then checking that ŵ is individually rational for all players, i.e., ŵ ∈ D∗. It is clear that ŵ so found must be globally (and hence also locally) unimprovable; for if ŵ were improvable, then the improved strategy combination would yield a larger value of d in (3.39) by increasing the value of at least one factor without reducing the value of any factor. If, of course, the w that maximizes d(w) over D does not belong to D∗, then we must maximize d(w) over D∗ instead. Either way, we can find ŵ that maximizes d(w) over P∗ without actually calculating P∗. Both cases are illustrated by the following two examples.

First, let us calculate the Nash bargaining solution for Store Wars II in §1.5. From Exercise 1.19, the max-min rewards for Nan, Van, and San are, respectively, f̃1 = acπ/9, f̃2 = acπ/16, and f̃3 = 25acπ/144. Therefore, from (1.60), we have

(3.40a)    f1(u, v, z) − f̃1 = 8acπ {u(1/3 + v − 2u + z) − 1/72},
(3.40b)    f2(u, v, z) − f̃2 = 8acπ {v(1/4 + u − 2v + z) − 1/128},
(3.40c)    f3(u, v, z) − f̃3 = 8acπ {z(5/12 + u − 2z + v) − 25/1152},

and on using (3.39) with w = (u, v, z), we obtain

(3.41)    d(u, v, z) = ((acπ)³/20736) x̂1(u, v, z) x̂2(u, v, z) x̂3(u, v, z)

with x̂1 defined by x̂1(u, v, z) = 24u(1 + 3v − 6u + 3z) − 1, x̂2 defined by x̂2(u, v, z) = 32v(1 + 4u − 8v + 4z) − 1, and x̂3 defined by x̂3(u, v, z) = 96z(5 + 12u − 24z + 12v) − 25. To obtain (û, v̂, ẑ), we maximize d over the decision set D defined by (1.61). For the sake of definiteness, let us suppose that the value of α, which determines the maximum price in (1.56), is α = 10. Then D consists of all (u, v, z) such that
(3.42)    |u − v| ≤ 1/12,    |v − z| ≤ 1/6,    0 ≤ u, v, z ≤ 10.
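Any constrained-optimization package can handle this maximization. The sketch below uses Python's scipy as an assumed stand-in for the packages mentioned later in this section; it should land near the maximizer reported below, (û, v̂, ẑ) ≈ (9.98, 9.95, 10).

    import numpy as np
    from scipy.optimize import minimize

    # Cooperation benefits (3.40), up to the common factor 8*a*c*pi, which is
    # positive and therefore does not move the location of the maximum.
    def benefits(w):
        u, v, z = w
        return np.array([u*(1/3 + v - 2*u + z) - 1/72,
                         v*(1/4 + u - 2*v + z) - 1/128,
                         z*(5/12 + u - 2*z + v) - 25/1152])

    # Maximize the Nash product (3.39) by minimizing the negative of its log,
    # which is numerically better behaved than the raw product.
    def objective(w):
        x = benefits(w)
        return np.inf if np.any(x <= 0) else -np.sum(np.log(x))

    cons = [{'type': 'ineq', 'fun': lambda w: 1/12 - (w[0] - w[1])},
            {'type': 'ineq', 'fun': lambda w: 1/12 - (w[1] - w[0])},
            {'type': 'ineq', 'fun': lambda w: 1/6 - (w[1] - w[2])},
            {'type': 'ineq', 'fun': lambda w: 1/6 - (w[2] - w[1])}]
    res = minimize(objective, x0=[9.9, 9.9, 9.9], bounds=[(0, 10)]*3,
                   constraints=cons, method='SLSQP')
    print(res.x)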
Note that if u, v, and z were not constrained by a price ceiling, i.e., if α → ∞ in (1.56), then d would be unbounded on D (because, e.g., d would increase without bound as t → ∞ along the line defined by u = v = z = t if it weren’t for the
constraints u ≤ 10, v ≤ 10, z ≤ 10). Thus collusion among storekeepers is bad for the consumer. It enables one of the players—from Figure 1.12, clearly San—to set her price at the ceiling 8πacα, with the others not far behind; whereas competition in §1.5 kept prices in rein. Maximizing (3.41) subject to (3.42) is a problem in constrained nonlinear programming, a discussion of which would take us far beyond our brief.3 Instead, we simply observe that packages for solving such problems are now widely available even on desktop computers. By using such a package,4 we discover that the maximum of (3.41) on D occurs where u = 9.98, v = 9.95, and z = 10: for all practical purposes, the prices are at their ceiling. The corresponding rewards are f1 = 25.8acπ (> f̃1), f2 = 26.3acπ (> f̃2), and f3 = 27.7acπ (> f̃3), confirming that (9.98, 9.95, 10) ∈ D∗ (which, however, we have not had to calculate). If we compare the Nash bargaining rewards to the Nash equilibrium rewards f1 ≈ 0.44acπ, f2 ≈ 0.36acπ, and f3 ≈ 0.54acπ obtained in §1.6, then we see how much the players benefit by collusion. Note that Van's reward is greater than Nan's under cooperation, whereas Nan's reward is greater than Van's under competition; in both cases, however, Van's price is lower than Nan's.

The ŵ that maximizes d need not be unique, as our second example—Crossroads with fast drivers—will illustrate. From Exercise 1.18, when 2ε > max(τ1, τ2), the max-min rewards are f̃1 = m1(ũ) = −(δ + τ2/2)θ2 and f̃2 = m2(ṽ) = −(δ + τ1/2)θ1, where θk = (ε + τk/2)/(ε + δ) is defined by (1.22) and m1, m2 are defined by (1.98). Let ξ1 and ξ2 be defined by

(3.43)    (δ + ε)ξ1 = ε − τ2/2,    (δ + ε)ξ2 = ε − τ1/2.
Then, from (3.2), we have

(3.44a)    f1(u, v) − f̃1 = (δ + ε)(ξ1 − u)(v − θ2),
(3.44b)    f2(u, v) − f̃2 = (δ + ε)(ξ2 − v)(u − θ1),

and (3.39) yields

(3.45)    d(u, v) = (δ + ε)²(u − θ1)(ξ1 − u)(v − θ2)(ξ2 − v).
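The factorization (3.44a) admits a quick symbolic check; the sympy sketch below is illustrative only, and the symbol names are mine:

    import sympy as sp

    u, v, delta, eps, tau2 = sp.symbols('u v delta epsilon tau2', positive=True)
    f1 = (eps + tau2/2 - (delta + eps)*v)*u + (eps - tau2/2)*v - eps - tau2/2  # (3.2a)
    theta2 = (eps + tau2/2)/(eps + delta)      # from (1.22)
    xi1 = (eps - tau2/2)/(delta + eps)         # from (3.43)
    f1_tilde = -(delta + tau2/2)*theta2        # Player 1's max-min reward
    lhs = f1 - f1_tilde
    rhs = (delta + eps)*(xi1 - u)*(v - theta2) # (3.44a)
    print(sp.simplify(lhs - rhs))              # prints 0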
In the special case of (3.3), we have

(3.46)    d(u, v) = (1/25)(5u − 3)(1 − 5u)(5v − 3)(1 − 5v).
The maximum of (3.46) over D = {(u, v) | 0 ≤ u, v ≤ 1} occurs at (u, v) = (1, 1), but (1, 1) ∉ D∗. Accordingly, we must maximize d over D∗ instead; from Exercise 3.8, we find that the maximum 24/25 occurs at both (1, 0) and (0, 1). It appears that there are two Nash bargaining solutions, each of which is the best possible outcome for one of the players. As we have remarked already in §3.1, it is difficult to see how they could agree to either.

Here two remarks are in order. First, the game of Crossroads defined by (3.3) is symmetric (τ1 = τ2). There is no basis for distinguishing between the players, and so any cooperative solution should also be symmetric, i.e., satisfy u = v. But P∗ in Figure 3.2 contains only a single symmetric strategy combination, namely,

3 For a discussion of constrained nonlinear programming see, e.g., [175, Chapters 10, 12, and 13] or [153, Chapters 9 and 10].
4 For example, the Mathematica command NMaximize readily yields the solution.
(u, v) = (2/5, 2/5). We therefore propose that the solution of this cooperative game should be neither (1, 0) nor (0, 1), but rather (2/5, 2/5), the center of the bargaining set in Figure 3.2.5 Second, the function d defined by (3.46) achieves its maximum twice on D∗ only because the game is symmetric. In general, if Crossroads is asymmetric, i.e., if τ1 ≠ τ2, then the Nash bargaining solution is again unique. Suppose, for example, that

(3.47)    δ = 5,    ε = 3,    τ1 = 4,    τ2 = 2.
Then (Exercise 3.8) the unique Nash bargaining solution is (û, v̂) = (0, 1), i.e., Nan always waits and San always goes. This solution is intuitively attractive because, although both drivers are fast, San is considerably faster, and so it makes good sense for Nan to let her whip across the junction before she dawdles into gear herself. Moreover, even when Crossroads between fast drivers is symmetric with τ1 = τ2 = τ in (3.1)–(3.2), the game has a unique Nash bargaining solution (û, v̂) = (ε/(δ + ε))(1, 1) at the center of the bargaining set if ε and τ/2 are sufficiently close together that

(3.48)    εδ/√(δ² + ε²) < τ/2 < ε,

which holds, e.g., if δ = 1.9, ε = 1.025, and τ = 2 (Exercise 3.8). More fundamentally, there is no unique Nash bargaining solution for the game defined by (3.3) because F is not convex, i.e., it isn't possible to join any two points in F by a straight line segment that never leaves F (consider, for example, (1, 0) and (0, 1)). If instead F is convex, then Nash's bargaining solution is always unique; see, for example, Exercise 3.10. More generally, Nash's bargaining solution is unique whenever a convex subset of F contains its unimprovable boundary, regardless of whether F itself is convex—as illustrated by Figure 3.3, where F is not convex.
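The two-maxima claim is easy to confirm numerically. A short Python sketch (illustrative only) scans a grid of the unit square, discards strategy combinations that are not individually rational, and maximizes (3.46) over the rest:

    import numpy as np

    # Rewards (3.4) and Nash product (3.46) for the symmetric game (3.3)
    def f1(u, v): return (3 - 5*v)*u + v - 3
    def f2(u, v): return u - 3 + (3 - 5*u)*v
    def d(u, v):  return (5*u - 3)*(1 - 5*u)*(5*v - 3)*(1 - 5*v)/25

    us = np.linspace(0, 1, 501)
    U, V = np.meshgrid(us, us)
    # individual rationality: each player gets at least the max-min reward -12/5
    feasible = (f1(U, V) >= -12/5) & (f2(U, V) >= -12/5)
    D = np.where(feasible, d(U, V), -np.inf)
    i, j = np.unravel_index(np.argmax(D), D.shape)
    print(U[i, j], V[i, j], D[i, j])   # one of (1, 0) or (0, 1), with d = 24/25

The grid also shows d = 1/25 at the symmetric point (2/5, 2/5), a local maximum on D∗ but far below 24/25, which is exactly the tension discussed above.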
3.4. Independent versus correlated strategies

The difference between what is rational for an individual and what is rational for a group is exemplified by comparing the noncooperative Nash-equilibrium solution for the game of Crossroads defined by (3.3) with the ad hoc cooperative solution derived at the end of the previous section. Nan and San's Nash bargaining rewards are −11/5, whereas their Nash-equilibrium rewards are only −12/5. Each prefers −11/5 to −12/5, and so it is in their mutual interest to select their cooperative strategies u = 2/5 = v over their Nash-equilibrium strategies u∗ = 3/5 = v∗; in other words, to be less aggressive and assume right of way less often. If, however, Nan were to select u = 2/5 without first making her intentions clear to San, then from Figure 1.5 (with θ1 = 3/5 = θ2) the rational thing for San to do would be to select v = 1 (always go), because (2/5, 1) ∈ R2. It would be irrational for San to select v = 2/5, because (2/5, 2/5) ∉ R2. Likewise, if Nan knew that San would play v = 2/5, then the rational thing for Nan to do, from a selfish point of view, would be to welch on San and play u = 1, because (1, 2/5) ∈ R1. Thus cooperative solutions are rational only if there is a gentlewoman's agreement among the players—however enforced, whether voluntarily or by compulsion—to abide by their bargaining strategies. Indeed in
5 Which corresponds to a local maximum on D∗ of d defined by (3.46).
Figure 3.3. The reward set for Store Wars in §1.4 with α = 10 and c = 1. The hexagonal decision set D in Figure 1.8 can be covered by a family of straight lines parallel to its axis of symmetry, u = v; and the images of these straight lines under the mapping f1 = f1(u, v), f2 = f2(u, v)—which are also straight lines—trace out the reward set F, as indicated in the diagram. The corner in the interior of the set of unimprovable points corresponds to the Nash bargaining solution (§3.3). Note that the max-min rewards are f̃1 = 32/25 and f̃2 = 12. For further details, see Exercise 3.11.
theory, it is merely the existence or absence of such an agreement that determines whether a game is cooperative or noncooperative. In practice, however, what often determines whether a game is cooperative or noncooperative is whether or not the game is among specific individuals who meet repeatedly in similar circumstances, recognize one another, and have the ability to communicate; for rational beings will avail themselves of any device for maximizing rewards—including, if they have the means, cooperation. Thus, provided the players can trust one another, the potential exists for even greater benefits from cooperation than are possible when strategies are chosen independently. Consider, for example, Crossroads. As in §1.2, we can imagine that each driver has a spinning arrow on her dashboard, which determines whether to go or wait in any particular confrontation. Now, if Nan’s arrow always came to rest
over the shaded sector of her disk when San’s arrow came to rest over the unshaded part of her disk, and vice versa, then the two players would never waste any time wondering who should go after W W or who should back down after GG, because the only pure strategy combinations ever selected would be W G or GW . But, of course, W G and GW are not the only pure strategy combinations selected, because the players spin their arrows independently, and all angles of rest between 0 and 2π are equally likely. If the players have already agreed to cooperate, however, then they can further reduce delays by agreeing to correlate strategies, as follows. Naturally, there is potential for such collaboration only when Crossroads is played between two specific individuals—a particular Nan and a particular San. Let us imagine that, after agreeing to cooperate, this Nan and San discard their individual spinning arrows and replace them by a large arrow and disk at the junction itself (which can be started by remote control). This disk is divided into four sectors, which subtend angles 2πω11 , 2πω12 , 2πω21 , and 2πω22 , respectively. If the arrow comes to rest in the first sector, then Nan and San will both go (GG); if in the second sector, then Nan will go but San will wait (GW ); if in the third sector, then San will go but Nan will wait (W G); and if in the fourth sector, then Nan and San will both wait (W W ). The device is equivalent to selecting pure strategy combination (i, j) with probability ωij , for i, j = 1, 2. Thus (3.49)
ω11 + ω12 + ω21 + ω22 = 1,
and Nan’s reward f1 , the expected value of her payoff F1 , is (3.50) f1 = − δ + 12 τ2 · Prob(GG) + 0 · Prob(GW ) − τ2 · Prob(W G) − + 12 τ2 · Prob(W W ) = − δ + 12 τ2 ω11 − τ2 ω21 − + 12 τ2 ω22 from Table 1.1. Similarly, San’s reward is (3.51) f2 = − δ + 12 τ1 ω11 − τ1 ω12 − + 12 τ1 ω22 . We will refer to the vector ω = (ω11 , ω12 , ω21 , ω22 ) as a correlated strategy. When Nan and San had two separate arrows and chose independent strategies u and v, we had ω11 = uv, ω12 = u(1 − v), ω21 = (1 − u)v, and ω22 = (1 − u)(1 − v), from §1.2 (p. 15). Thus maximization of f1 and f2 when strategies are selected independently is equivalent to maximization of f1 and f2 under the constraints (3.52)
ω11 + ω12 = u,
ω11 + ω21 = v
and, of course, (3.49); whereas maximization of f1 and f2 when strategies are correlated is subject only to (3.49). Because there are (two) fewer constraints, we can expect maximum rewards to be larger. Now, in designing their disk, Nan and San do not hesitate to set ω11 = 0 = ω22 , which increases both f1 and f2 ; as we have remarked already, it is obvious that nothing can be gained under correlated strategies by selecting GG or W W . So, from (3.49)–(3.51), Nan and San reduce their task to selecting ω12 and ω21 such that ω12 + ω21 = 1, with rewards f1 = −τ2 ω21 and f2 = −τ1 ω12 . Thus, regardless
3.4. Independent versus correlated strategies
131
of which ω12 (and hence ω21 ) the players choose, the corresponding rewards will satisfy (3.53)
τ1 f1 + τ2 f2 + τ1 τ2 = 0,
f1 , f2 ≤ 0.
Their reward pair will therefore lie in the the f1 -f2 plane on the line segment that joins (−τ2 , 0) and (0, −τ1 ); every point on this line segment is achievable by some choice of correlated strategies. Furthermore, from Figure 3.1 (or the equivalent diagram for other values of the parameters δ, , τ1 and τ2 ), this line segment lies on or above the unimprovable boundary of F at every point. In particular, in the case where δ = 3, = 2, and τ1 = 2 = τ2 , it lies above the point that corresponds to the ad hoc cooperative solution we found at the end of §3.3. It thus appears that if you are going to cooperate, then you might as well correlate. There remains, however, the question of which reward pair on this line segment yields the solution of the game. Intuition suggests that we should select the “fair” solution (3.54)
f1 = −
τ1 τ2 = f2 , τ1 + τ2
achieved by the correlated strategy ω for which ω11 = 0 = ω22 and (3.55)
ω12 =
τ2 , τ1 + τ2
ω21 =
τ1 τ1 + τ2
(but see Exercise 3.10). On the one hand, if we accept this solution, then tacitly we have made an interpersonal comparison of utilities: we have assumed that a minute of San’s time is worth a minute of Nan’s, so that if they can save a minute together, then each should reap 30 seconds of the benefits. But who is to say whether Nan’s time and San’s time are equally valuable? What if Nan is a brain surgeon and San a cashier—couldn’t one argue that San’s time is more valuable than Nan’s? Consider that they meet in the morning on their way to work. If San is five minutes late for work, then she may lose her job; whereas if Nan is five minutes late for work—well, do you really expect a brain surgeon to be on time? On the other hand, there are many circumstances in which interpersonal comparisons of utility are quite acceptable—for example, Nan and San may both be brain surgeons or both cashiers. If we agree to correlate strategies, however, and if we agree to make interpersonal comparisons of utility (my time is worth as much as your time, my dollar is just as good as your dollar, etc.), then solving a game loses much of its strategic interest. It becomes instead a matter of seeking a fair distribution among the players of some benefit of cooperation (e.g., time saved, money saved), of which there exists a definite total amount. Or, if you prefer, there’s a finite pie to be distributed among the players, and we need to establish how big a slice is a player’s just desert (dessert?). In any event, by invoking interpersonal comparisons of utility, in effect, we are already talking about characteristic function games, and these are the subject of the following chapter.
132
3. Cooperative Games in Strategic Form
3.5. Commentary In this chapter we introduced the most important concepts of cooperative games in strategic form, namely, unimprovability or Pareto optimality (§§3.1–3.2), Nash bargaining solutions (§3.3), and independent versus correlated strategies (§3.4). In §3.3 we introduced the Nash bargaining solution in the context of independent strategies. Usually, however, the solution is applied in the context of correlated strategies. The reward set is then always convex (so that local unimprovability and global unimprovability are equivalent), and, as remarked at the end of §3.3, a unique Nash bargaining solution is guaranteed. For a proof of this fact and further properties of Nash’s bargaining solution, see, e.g., Owen [257]. Nash’s bargaining solution traditionally lies in the province of economic game theory, broadly conceived to include, for example, the management of fisheries [327] and global biodiversity [105]. Since the turn of the century, however, it has also featured in purely biological applications of game theory, specifically, to strategic interactions in both animals [3,288] and plants [4,121]. Note, finally, that although Nash’s bargaining solution appears to be the most enduring solution concept for cooperative games in strategic form, it is by no means the only one; others are described by, e.g., Shubik [307, pp. 196–200].
Exercises 3 1. (a) Verify (3.7). (b) Show that the image of the line segment {(u, 0)|0 ≤ u ≤ 1} under the mapping f defined by (3.4) is that part of the line f1 − 3f2 − 6 = 0 which extends from (−3, −3) to (0, −2). Note that this image coincident is largely with f (L(4/5)), but f (L(4/5)) does not extend from − 35 , − 11 5 to (0, −2). (c) Show that the image of the line segment {(u, 1)|0 ≤ u ≤ 1} under the mapping f defined by (3.4) is that part of the line 2f1 − f2 + 4 = 0 which extends from (−4, −4) to (−2, 0). 2. (a) Show that (3.10) is the envelope of the family of lines ψ = 0, where ψ is defined by the left-hand side of (3.7). (b) Show this envelope meets (touches) the line (3.11a) at the point 11 that − 5 , − 35 , and crosses the f1-axis where f1 > −2, and that it meets 3 11 the line (3.11c) at the point − 5 , − 5 , and crosses the f2 -axis where f2 > −2. Why are the two small curvilinear triangles enclosed by the envelope, the lines and the axes not part of the reward set? 3. (a) Deduce from (3.12) thatthe tangent to the curved edge of F is parallel to 12 8 , and to the f the f1 -axis at − 85 , − 12 -axis at − 2 5 5 , −5 . (b) Which strategy combinations correspond to these points? (c) Verify (3.14)–(3.16).
Exercises 3
133
4. Verify that (3.17) yields u + v = 45 . 5. Using the method of §3.1, find the bargaining set for Crossroads when δ = 5, = 3, τ1 = 4, and τ2 = 2. Sketch the reward set, marking in particular all locally or globally unimprovable points. 6. Use the Theorem of the Alternative (§3.2) to show that no points other than (1, 0) and (0, 1) on the boundary of the unit square are unimprovable in Crossroads for δ = 3, = 2, and τ1 = 2 = τ2 . ∗ 7. Use the Theorem of the Alternative to find Pnec for Crossroads when δ = 5, = 3, τ1 = 4, and τ2 = 2. Verify that your results agree with those you obtained in Exercise 3.5.
1
v
θ2 α2 ξ2 0 0
ξ1 α1 θ1
1
u
Figure 3.4
8. (a) For the original (asymmetric) game of Crossroads between fast drivers in Chapter 1, show that D∗ is the unshaded region in Figure 3.4, where ξ1 and ξ2 are defined by (3.43), θ1 and θ2 are defined by (1.22), and αi = 12 (θi +ξi ) for i = 1, 2; i.e., the point marked by a dot has coordinates (α1 , α2 ) =
δ+ (1, 1)
+
τ1 −τ2 4{δ+} (1, −1).
(b) Describe the Nash bargaining solution. (c) Show that if the game is symmetric (τ1 = τ2 = τ ) and (3.48) holds, then the unique Nash bargaining solution is the center of the bargaining set, (1, 1). i.e., (ˆ u, vˆ) = δ+ 9. For the game of the prisoner’s dilemma defined by Exercise 1.23: (a) show that PG = P , and find P ∗ . (b) find the Nash bargaining solution. 10. Find the Nash bargaining solution under correlated strategies of the original (asymmetric) game of Crossroads in Chapter 1. Is f1 = f2 at this solution?6 6 From (3.50) and (3.51), the vector f = (f1 , f2 ) from the origin to the point with coordinates (f1 , f2 ) is a convex linear combination of the vectors (−δ−τ2 /2, −δ−τ1 /2), (0, −τ1 ), (−τ2 , 0), and (−− τ2 /2, − − τ1 /2), i.e., a linear combination with nonnegative coefficients that sum to 1, by (3.49). The point (f1 , f2 ) must therefore lie somewhere in the triangle with vertices (−δ − τ2 /2, −δ − τ1 /2), (0, −τ1 ), ˆ corresponds to maximizing the area of a rectangle, one corner of which is and (−τ2 , 0). Thus finding ω constrained to lie on the line with equation (3.53).
134
3. Cooperative Games in Strategic Form
11. Find parametric equations for the boundary of the reward set for Store Wars when α = 10 and c = 1.7 This reward set is sketched in Figure 3.3. 12. Use your results from Exercise 3.11 to find the bargaining set and the Nash bargaining solution for Store Wars when α = 10.
7 Any line segment L(ξ) with parametric equations u = ξ + t, v = t, 0 ≤ t ≤ α − ξ, is parallel to u = v in Figure 1.8. For 1 ≤ ξ ≤ 6 such line segments cover DA , and for 0 ≤ ξ ≤ 1 they cover the part of DB that lies below u = v. The remainder of D is covered by line segments of the form u = t, v = ξ + t, 0 ≤ t ≤ α − ξ; for 1 ≤ ξ ≤ 6 they cover DC , and for 0 ≤ ξ ≤ 1 they cover the rest of DB .
Chapter 4
Cooperative Games in Nonstrategic Form
A cooperative game is in nonstrategic form when the focus is on a benefit distribution or a coalition structure and strategies are merely implicit. We begin with characteristic function games, which focus on a benefit distribution and do not ask which coalitions form, because the answer is assumed. Later, in §4.9, we do ask which coalitions form, but we still do not ask how, because the question cannot be asked without explicit consideration of strategies. We broach the question of how coalitions form in §8.3 instead. A characteristic function game or CFG is a purely cooperative game among n players who seek a fair distribution for a benefit that is freely transferable. It is assumed that all players would like as much as possible of the benefit and that one unit of the benefit is worth the same to all players; thus, in terms of §3.4, CFGs imply interpersonal utility comparisons. Usually, but not necessarily (see, e.g., §4.5), the benefit to be shared is money. In this chapter, we introduce two solution concepts for CFGs, the nucleolus and the Shapley value. The fairness of a distribution is assumed to depend on the bargaining strengths of the various coalitions that could possibly form among some, but not all, of the players. Nevertheless, and at first sight paradoxically, the fundamental assumption of a CFG is that all players are cooperating. A coalition of all n players is known as the grand coalition. So the fundamental assumption of a CFG is that the grand coalition has formed—perhaps voluntarily, but the grand coalition may also be enforced by the action of some external agent or circumstance. Thus coalitions of fewer than n players can use as bargaining leverage the strength they would have had without the others if the others weren’t there, but the grand coalition can never actually dissolve (or the theory dissolves with it). It will be convenient in this chapter to regard each player as a fictitious coalition of one person; however, characteristic function games are interesting only when
135
136
4. Cooperative Games in Nonstrategic Form
there could exist true coalitions of less than all the players. Therefore, we shall assume throughout that n ≥ 3.1 As usual, we introduce solution concepts by means of examples.
4.1. Characteristic functions and reasonable sets Jed, Ned, and Ted are neighbors. Their houses are marked J, N, and T, respectively, in Figure 4.1. They work in the same office at the same times on the same days, and in order to save money they would like to form a car pool. They must first agree on how to share the costs of this cooperative venture, however; or, which is the view we prefer to adopt, on how to share the car pool’s benefits. We explore the matter here. Later, in §4.4, we shall consider adding a fourth neighbor, Zed, to the car pool.
J 1 mile
Z
T
N
F d miles Figure 4.1. Jed, Ned, Ted, and Zed’s stomping ground
Let’s suppose that Jed, Ned, and Ted drive identical cars, and that the cost of driving to work, including depreciation, is k dollars per mile. Because depreciation is included, it doesn’t matter in principle whose car is used (though in practice they might take turns). Let the distance to work from point F in the diagram, where the road through their neighborhood crosses a freeway, be d miles. Then, assuming each player selects the shortest route, Jed lives 4 + d miles from work, whereas Ned and Ted are both 3 + d miles away. Let Jed be Player 1, Ned Player 2, and Ted Player 3; and let c({i}) denote Player i’s cost in dollars of driving to work alone. Then c {1} = (4 + d)k and c {2} = (3 +d)k = c {3} . Round-trip travel costs are just twice these amounts. Let c {1, 2, 3} denote the cost in dollars if all three players drive to work in a single car, assuming that the shortest route is adopted. Then c {1, 2, 3} = (7 + d)k, regardless of whose car is used. If Jed and Ned were to form a car pool without Ted, then the cost would be c {1, 2} = (4 + d)k, again assuming the shortest route—and, of course, that they would use of car Jed’s car. Similarly, the costs pools that excluded Ned or Jed would be c {1, 3} = (6 + d)k = c {2, 3} ; it would not matter whose car Ned and Ted used, although Jed and Ted would have to use Jed’s. We have now calculated the travel costs associated with each of the seven 1 For n = 2, both the nucleolus and the Shapley value give the players equal shares of the benefit to be shared; see Exercises 4.6 and 4.16.
4.1. Characteristic functions and reasonable sets
137
coalitions that three players could form, namely, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}. More generally, among n players, 2n − 1 coalitions are possible. Let ν(S) denote the benefit of cooperation associated with the coalition S. Then # ν(S) = c({i}) − c(S), (4.1) i∈S
that is, ν(S) is the difference between the sum of the costs that the individual members of S would have to bear if they did not cooperate and the cost when they car pool together. For example, the benefit associated with the grand coalition {1, 2, 3} among all three players is saving (4.2)
ν({1, 2, 3}) = c({1}) + c({2}) + c({3}) − c({1, 2, 3}) = (3 + 2d)k
dollars. A car pool without Ted would save Jed and Ned (4.3)
ν({1, 2}) = c({1}) + c({2}) − c({1, 2}) = (3 + d)k
dollars, whereas the benefits of cooperations that excluded Ned or Jed would be (4.4)
ν({1, 3}) = c({1}) + c({3}) − c({1, 3}) = (1 + d)k
and (4.5)
ν({2, 3}) = c({2}) + c({3}) − c({2, 3}) = dk,
respectively. Of course, the benefit of cooperation associated with not cooperating is precisely zero; that is, from (4.1) with S = {i}, (4.6)
ν({i}) = 0
for any value of i. For the sake of definiteness, let us imagine that the car pool {1, 2, 3} will always use Jed’s car, so that it is Jed who actually foots the bills. How much should Ned and Ted pay Jed for each one-way trip? Let us suppose that Jed receives fraction x1 of the grand car pool’s benefit (3 + 2d)k, and that Ned and Ted receive fractions x2 and x3 , respectively, where (4.7)
0 ≤ x1 , x2 , x3 ≤ 1,
x1 + x2 + x3 = 1.
Then Ned or Ted should pay Jed the amount (4.8)
c({i}) − (3 + 2d)kxi
dollars per trip, where i = 2 for Ned and i = 3 for Ted; (4.8) with i = 1 is the part of the bill that Jed must pay himself. Our task is therefore to determine the fractions x1 , x2 , and x3 . Now (4.2)–(4.6) define a function ν from the set of all coalitions among three players into the nonnegative real numbers. We call ν the characteristic function. For set-theoretic purposes, it is convenient to suppose that the set of all coalitions of three players contains, in addition to {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}, the empty coalition ∅, which, because it contains no players, cannot benefit from cooperation. For completeness, therefore, we append (4.9)
ν(∅) = 0.
138
4. Cooperative Games in Nonstrategic Form
Note that (4.9) is not a gratuitous appendage; if in any doubt, compute the righthand side of (4.15) below with T = {i}. The number ν(S), the benefit that the players in S can obtain if they cooperate with each other but not with the players outside S, is a measure of the bargaining strength of the coalition S. It is convenient, however, to express this strength as a fraction of the strength of the grand coalition {1, 2, 3}, i.e., as ν(S) = ν(S)/ν({1, 2, 3}), where S is any coalition. Thus ν({i}) = 0, i = 1, 2, 3, (4.10)
ν({1, 2}) =
3+d 3+2d ,
ν({1, 3}) =
1+d 3+2d ,
ν({2, 3}) =
d 3+2d ,
and ν({1, 2, 3}) = 1. More generally, for a game among n players, we define the normalized characteristic function, ν, by (4.11)
ν(S) =
ν(S) , ν(N )
where N denotes the set of all players or grand coalition, i.e., N = {1, 2, . . . , n}, as in (1.67); and S is any of the 2n coalitions (including ∅) or, which is the same thing, S is any of the 2n subsets of N . Note that (4.11) does not make sense unless ν(N ) > 0, that is, unless the benefits of cooperation are positive, and we shall assume throughout that this condition is satisfied. A CFG such that ν(N ) > 0 is said to be essential, whereas a CFG such that ν(N ) = 0 is inessential. Thus we restrict our attention to the essential variety. Now, with regard to the car pool, our task is to determine a three-dimensional vector x = (x1 , x2 , x3 ) satisfying (4.7) that stipulates how the benefits of cooperation are to be distributed among the grand coalition {1, 2, 3}. We will refer to x as i’s allocation at an imputation, and to the ith component of x, namely xi , as Player x. Furthermore, for each coalition S, we will refer to the sum i∈S xi of allocations to players in S as the coalition’s allocation at x (or as the amount that x allocates to S). More generally, an imputation of a CFG among n players is an n-dimensional vector x = (x1 , x2 , . . . , xn ) such that (4.12a)
xi ≥ 0,
(4.12b)
x1 + x2 + · · · + xn = 1.
i = 1, 2, . . . , n,
The ith component of x, namely xi , is Player i’s allocation at x. We will denote the set of all imputations by X. Thus x ∈ X if and only if (4.12) is satisfied. Obviously, if xi ≥ 0 for all i ∈ N , then (4.12b) implies xi ≤ 1 for all i ∈ N . If x satisfies (4.12b), then x is said to be unimprovable or group rational, because it is impossible to increase one player’s allocation without decreasing that of another. If x satisfies (4.12a), then x is said to be individually rational, because each player is at least as well off in the grand coalition as he would have been all by himself. Thus, imputations are vectors that are both individually and group rational. For n = 3, because x3 ≥ 0 implies 0 ≤ x1 +x2 ≤ 1, X can be represented in two dimensions as a right-angled isosceles triangle. From Figure 4.2, where X is shaded, we see that x1 increases to the east and x2 to the north, whereas x3 increases to the southwest. We should bear in mind, however, that this representation is a
4.1. Characteristic functions and reasonable sets
139
Figure 4.2. A representation of the set of imputations for a three-player CFG
distortion, albeit a convenient one, of the true picture, which is that X is a twodimensional equilateral triangle embedded in three-dimensional space; specifically, X is Δ2 , the standard 2-simplex.2 Which of the infinitely many imputations in X can be regarded as a fair distribution of the benefits of cooperation? We can attempt to answer this question by considering the order in which the grand coalition of all n players might actually form. Suppose, for example, that n ≥ 4 and that Player a is first to join, Player b is second, Player c is third, and Player i is fourth, where a, b, c, and i are any positive integers between 1 and n. Let T = {a, b, c, i}. Then, because the players have joined in the given order, it is reasonable to say that Player a has contributed ν({a}) − ν(∅) = 0 to the grand coalition, that Player b has added ν({a, b}) − ν({a}) = ν({a, b}), that Player c has added ν({a, b, c}) − ν({a, b}), and that Player i has added ν(T ) − ν({a, b, c}); in which case, ν(T ) − ν({a, b, c}) is Player i’s fair allocation. But {a, b, c} = T − {i}, where we define the difference of two sets to consist of all elements in the first but not the second, i.e., (4.13)
A − B = {x | x ∈ A, x ∈ / B} .
Thus, ν(T ) − ν(T − {i}) is Player i’s fair allocation. More generally, if Player i is the jth player to join and the j−1 players who have joined already are T − {i}, then ν(T ) − ν(T − {i}) is a fair allocation for Player i. Unfortunately, we do not know the identity of T : there are 2n−1 coalitions containing i (see Exercise 4.15), and T could be any one of them. What we can be sure about, however, is that if we calculate ν(T ) − ν(T − {i}) for every coalition T containing i, i.e., for every coalition in the set Π i defined by (4.14)
Π i = {S | S ⊆ N and i ∈ S} ,
2 See Footnote 2 on p. 364. More generally, for a CFG among n players, X is Δn−1 , the standard (n − 1)-simplex.
140
4. Cooperative Games in Nonstrategic Form
Figure 4.3. The reasonable set (light and dark shading) and the core (dark shading) for the three-person car pool with d = 9
then a fair allocation for Player i should not exceed the maximum of those 2n−1 numbers. Therefore, the set of imputations such that (4.15)
xi ≤ maxi {ν(T ) − ν(T − {i})} T ∈Π
for all i = 1, 2, . . . , n is called the reasonable set. Returning now to the car pool, let us suppose, for the sake of definiteness, that the office where Jed, Ned, and Ted all work is nine miles from the point marked F in Figure 4.1. Then d = 9 in (4.10). For Jed, the maximum in (4.15) is taken over all coalitions T that contain {1}, i.e., over all T ∈ Π 1 = {1}, {1, 2}, {1, 3}, {1, 2, 3} . Thus if x belongs to the reasonable set, then x1 must not exceed the maximum of ν({1, 2, 3})−ν({2, 3}), ν({1, 2})−ν({2}), 4ν({1, 3})−ν({3}), and ν({1})−ν(∅), that −0, 0−0 = then is, x1 ≤ max 1− 37 , 47 −0, 10 21 4 on taking i = 2and i =103 7 . 3Similarly, 4 3 10 3 in (4.15), we must also have x2 ≤ max 11 = , , and x ≤ max , , 3 21 7 7 7 7 21 7 = 21 (Exercise 4.1). The reasonable set consists of points that satisfy these inequalities and (4.7). Because x3 = 1 − x1 − x2 , the inequality x3 ≤ 10 21 is equivalent to x1 + x2 ≥ 11 , and so the reasonable set is the hexagon represented by the shaded 21 regions (both light and dark) in Figure 4.3. We see that, although (4.15) excludes imputations near the corners of X—it would not be reasonable to give most of the benefits of car pooling to a single player—the concept of a reasonable set provides minimal constraints on what we may consider a fair imputation. Nevertheless, the thinking that led to (4.15), when suitably refined, is capable of yielding an attractive solution concept, and we shall give it our attention in §4.7. Before proceeding, a word about terminology. For most purposes, it is preferable to regard imputations of a three-player game as vectors in three-dimensional space, but X is two-dimensional, and for graphical purposes it is preferable to regard the imputations as points in two dimensions. Thus the corner (1, 0) of the triangle in Figure 4.3 represents the (unreasonable) imputation (1, 0, 0), which would give all the benefits of cooperation to Jed; the corner (0, 1) represents the imputation (0, 1, 0), which would give all the benefits of cooperation to Ned; and the corner
4.2. Core-related concepts
141
(0, 0) represents the imputation (0, 0, 1), which would give all the benefits of cooperation to Ted. More generally, from (4.7), the point (x1 , x2 ) of the triangle in Figure 4.3 represents the imputation (x1 , x2 , 1 − x1 − x2 ). Thus although, in principle, the reasonable set for our three-person car pool is a hexagon in the plane x1 + x2 + x3 = 1 whose projection onto the plane x3 = 0 is shaded in Figure 4.3, to observe this distinction in practice is often merely a nuisance, and so we shall not hesitate to refer to either set as the reasonable set when it is convenient to do so.
4.2. Core-related concepts The concept of a reasonable set has enabled us to exclude the most unreasonable points from the set of imputations for the car pool. But infinitely many imputations still remain. If we are serious about helping Jed, Ned, and Ted reach agreement over their car pool, then we had better find a way to exclude more of X. A concept that is useful in this regard is that of excess. For each imputation x ∈ X in an n-player CFG, the excess of the coalition S at x, denoted by e(S, x), is the difference between the fraction of the benefits of cooperation that S can obtain for itself (even if it does not cooperate with players outside S) and the fraction of the benefits of cooperation that x allocates to S: # xi . (4.16) e(S, x) = ν(S) − i∈S
For example, for any CFG, it follows from (4.6), (4.9), (4.11), and (4.16) that (4.17a) (4.17b)
e({i}, x) = −xi ,
i = 1, 2, . . . , n,
e(∅, x) = 0 = e(N, x),
and for the car pool of §4.1 with d = 9 we have (4.18a)
e({1, 2}, x) =
(4.18b)
e({1, 3}, x) =
(4.18c)
e({2, 3}, x) =
4 7 − x1 − x2 , 10 21 − x1 − x3 3 7 − x2 − x3
= x2 − = x1 −
11 21 , 4 7.
If e(S, x) > 0, then the players in S will regard the imputation x as unfair, because they would receive greater benefits if they did not have to form the grand coalition. It therefore seems sensible, if possible, to exclude imputations for which a coalition exists such that e(S, x) > 0. The imputations that remain, if any, are said to form the core of the game. Here, and in the following section, we will assume that the core exists, and for reasons about to emerge, we will denote it by C + (0). In set-theoretic notation, (4.19)
C + (0) = {x ∈ X | e(S, x) ≤ 0 for all coalitions S} .
For example, from (4.18), the core of the car-pool game with d = 9 is the dark quadrilateral in Figure 4.3.3 If the core exists, then it must be a subset of the reasonable set (Exercise 4.23), possibly the whole of it (Exercise 4.9). But a game may have no core (or, if you prefer, C + (0) = ∅), and we shall consider this possibility 3 The core of a three-player CFG need not be a quadrilateral; it can be a point, a line segment, a triangle, a pentagon (as in Exercise 4.9) or a hexagon.
142
4. Cooperative Games in Nonstrategic Form
x2 4/7 3/7 2/7 1/7 0 0
1/7
2/7
3/7
4/7
x1
Figure 4.4. Rational -core boundaries for = 0 (outer quadrilateral), = −1/21, = −2/21, = −1/7, and = −11/63 (dot) when d = 9 in the car-pool game of §4.1
in §4.4. We note in passing that a sufficient condition4 for the core to exist is that the game be convex, i.e., that (4.20)
ν(S ∪ T ) ≥ ν(S) + ν(T ) − ν(S ∩ T )
for all coalitions S and T . But convexity is by no means a necessary condition, and often fails to hold; for example, the car-pooling game defined by (4.10) is never convex but always has a core (Exercise 4.8). By excluding imputations that lie inside the reasonable set but outside the core (the lighter shaded region in Figure 4.3), we move nearer to a car pool agreement. But infinitely many imputations still remain. Then which of them represents the fairest distribution of the benefits of cooperation? From Jed’s point of view the best points in the dark quadrilateral of Figure 4.3 lie on the boundary where x1 = 47 , from Ned’s point of view they lie on the boundary where x2 = 11 21 , and from Ted’s point of view they lie on the boundary where x1 + x2 = 47 . It thus appears that the fairest compromise would be somehow to locate the “center” of the core. But how do we find an imputation that we can reasonably interpret as the center? One approach would be to move the walls of the boundary inward, all at the same speed, until they coalesce in a point. Suppose, for example, that we move the walls inward at the rate of one unit per second. 1 of a second, we will have shrunk the boundary to the inner of the Then, after 21 2 two quadrilaterals in Figure 4.4. If we continue at the same rate, then after 21 of a second we will have shrunk the boundary to the outer of the two triangles, 11 and after 17 of a second to the inner triangle. Ultimately, after 63 of a second, the 25 22 boundary will collapse to the point 63 , 63 , which is marked dot in Figure 25by22a 16 ∗ 4.4. We might therefore propose that the imputation x = 63 , 63 , 63 is the fair solution of the game, in which case, from (4.8), Jed and Ned should each pay 14 3 k 4 See [302]. A convex CFG is always proper in the sense of §4.6, that is, convexity implies superadditivity; see (4.86).
4.2. Core-related concepts
143
dollars of the cost per trip, whereas Ted should pay 20 3 k dollars. Ted should pay most because Jed must go out of his way to drive him home. With a view to later developments, it will be convenient to denote by Σ 0 the set of all coalitions that are neither the empty coalition nor the grand coalition, Σ 0 = {S | S ⊆ N, S = ∅, S = N } .
(4.21)
We can now generalize the ideas that yielded the above fair solution to an arbitrary n-person CFG by defining the rational -core, denoted by C + (), as the set of all imputations at which no coalition other than ∅ or N has a greater excess than . That is, (4.22) C + () = x ∈ X | e(S, x) ≤ for all S ∈ Σ 0 . Thus the core (if it exists) is the rational 0-core; and for < 0, C + () is the set to which the core has shrunk after one second, when its walls are moved inward at || units per second (again, if it exists). At imputation x, however, e(S, x) ≤ for all S ∈ Σ 0 if and only if the maximum of e(S, x)—taken over all coalitions in Σ 0 —is less than or equal to . Accordingly, and again with a view to later developments, let us define a function φ0 from X to the real numbers by φ0 (x) = max0 e(S, x).
(4.23)
S∈Σ
We can now define the rational -core more succinctly as C + () = {x ∈ X | φ0 (x) ≤ } .
(4.24)
Reducing shrinks C + (), but if is too small, then there are no imputations such that e(S, x) ≤ for all S ∈ Σ 0 . So there is a least value of for which C + () exists. If we denote this value by 1 , then 1 = min φ0 (x)
(4.25)
x∈X
because reducing causes more and more imputations to violate the condition φ0 (x) ≤ , until all that remain finally are the imputations for which φ0 attains its minimum on X. We shall refer to C + (1 ), the set over which φ0 attains its minimum, as the least rational core. With a view to later developments, however, it will be convenient to have an alternative notation for the least rational core, namely X 1 . Thus X 1 = C + (1 ) = {x ∈ X | φ0 (x) = 1 } . Returning to the car pool for illustration, we find that x belongs to C + () if, on using (4.18), (4.26a)
4 7
and, on using (4.17a), (4.26b)
− x1 − x2 ≤ , x2 −
11 21
≤ , x1 −
4 7
≤
5
−x1 ≤ , −x2 ≤ , −x3 ≤ ;
or, which is the same thing, if (4.27) 5
− ≤ x1 ≤ 47 + , − ≤ x2 ≤ 11 21 + , 4 − ≤ x + x ≤ 1 + . 1 2 7
Note that if > 0, then (4.26b) is superseded by condition (4.12a) that allocations are nonnegative.
144
4. Cooperative Games in Nonstrategic Form
The boundary of the region corresponding to (4.27) is sketched in Figure 4.4 for 1 2 (inner quadrilateral), = − 21 (outer triangle), = 0 (outer quadrilateral), = − 21 1 and = − 7 (inner triangle). These triangles, and the open curves consisting of the longest three sides of the quadrilaterals, are φ0 = contours of the function defined by (4.23). The dot corresponds to the least rational core, which is 25 22 16 . (4.28) X 1 = {x∗ } = 63 , 63 , 63 Note that the value 1 = − 11 63 can be found analytically, but we postpone this matter until §4.3. Now e(S, x) is a measure of coalition S’s dissatisfaction with the imputation x. If e(S, x) > 0, then S’s allocation from x would be less than the benefit it could obtain for itself, and the players in S would rather dissolve the grand coalition than accept x (but they cannot dissolve it, because a fundamental assumption of CFG analysis is that the grand coalition has formed). If e(S, x) = 0, then the players in S would be indifferent between maintaining or dissolving the grand coalition. On the other hand, if e(S, x) < 0, then the players in S would prefer to remain in the grand coalition (even if it were possible to dissolve it). Regardless of whether e(S, x) is positive or negative, however, the lower the value of e(S, x), the lower the dissatisfaction of the players in S with imputation x (or, if you prefer, the higher their satisfaction). If x ∈ C + (), then no coalition’s dissatisfaction exceeds (or no coalition’s satisfaction is less than −), and the lower the value of , the lower the value of the maximum dissatisfaction among all coalitions that could possibly form. Thus if X 1 = C + (1 ) contains a single imputation, say x∗ , then x∗ minimizes maximum dissatisfaction; and to the extent that minimizing maximum dissatisfaction is fair—which we assume henceforward until §4.7—x∗ is the fair solution of the game. Although X 1 = C + (1 ) always exists, it may contain infinitely many imputations. Then which should be regarded as the fair solution? We will address this matter in §4.5. Meanwhile, the least rational core will be an adequate solution concept, because in §§4.3–4.4 we consider only games for which X 1 contains but a single imputation, x∗ .
4.3. A four-person car pool The purpose of the present section is twofold: to present an example of a four-player game, and to show how to calculate X 1 . Accordingly, let Jed, Ned, and Ted have a fourth neighbor, Zed, whose house is marked Z in Figure 4.1. He works in Jed, Ned, and Ted’s office at the same times on the same days, owns the same kind of small car, and in order to save money would like to join their car pool. Now, in practice it might happen that the existing car pool would bargain with Zed as a unit, so that the bargaining would reduce to a two-player game; but we wish to consider a four-person game. Therefore, let us assume, at least until the end of the section, that Zed is a good friend of all the others, and that they work out the costs of the four-person car pool from scratch. As stated in the previous section, we also assume that a fair distribution of the benefits of cooperation is one that minimizes maximum dissatisfaction. Thus our task is to calculate the least rational core of a four-player game.
4.3. A four-person car pool
145
With n = 4, and hence N = {1, 2, 3, 4}, there are 15 coalitions excluding ∅. Let us assume that Zed is Player 4, and that Jed, Ned, and Ted are Players 1, 2, and 3, as before. Now, it is clear from §4.1 that the value of k has no effect on the outcome; therefore, we may as well express the costs in units of k dollars. Then from Figure 4.1 we readily find that the costs of the car pools that could possibly form (predicated on travel by the shortest route) are as follows:
(4.29)
c({1}) = 4 + d, c({2}) = 3 + d, c({3}) = 3 + d, c({4}) = 3 + d, c({1, 2}) = 4 + d, c({1, 3}) = 6 + d, c({1, 4}) = 4 + d, c({2, 3}) = 6 + d, c({2, 4}) = 4 + d, c({3, 4}) = 5 + d, c({1, 2, 3}) = 7 + d, c({1, 2, 4}) = 5 + d, c({1, 3, 4}) = 6 + d, c({2, 3, 4}) = 6 + d, c({1, 2, 3, 4}) = 7 + d.
From (4.1) we deduce the associated benefits of cooperation:
(4.30)
ν({i}) = 0, i = 1, 2, 3, 4, ν({1, 2}) = 3 + d, ν({1, 3}) = 1 + d, ν({1, 4}) = 3 + d, ν({2, 3}) = d, ν({2, 4}) = 2 + d, ν({3, 4}) = 1 + d, ν({1, 2, 3}) = 3 + 2d, ν({1, 2, 4}) = 5 + 2d, ν({1, 3, 4}) = 4 + 2d, ν({2, 3, 4}) = 3 + 2d, ν({1, 2, 3, 4}) = 6 + 3d.
For the sake of definiteness, let us suppose that d = 2 (you may consider this value of d to be unrealistic, but higher values are considered in Exercises 4.4 and 4.5). Then, from (4.11) and (4.30), and introducing the shorthand νij··· = ν({i, j, . . .}),
(4.31)
which is especially convenient for n ≥ 4, the normalized characteristic function is defined by (4.32)
νi = 0, i = 1, 2, 3, 4, 5 ν12 = ν13 = ν14 = 12 , ν23 = 16 , ν24 = 13 , ν34 = 14 , 7 7 ν123 = 12 , ν124 = 34 , ν134 = 23 , ν234 = 12 , ν1234 = 1. 5 12 ,
1 4,
Applying the condition that e(S, x) ≤ to each S ∈ Σ 0 in turn, that is, to each coalition in turn among {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, and {2, 3, 4}, we discover from (4.22) that C + () consists of all imputations x = (x1 , x2 , x3 , x4 ) satisfying (4.33a) (4.33b) (4.33c)
−x1 ≤ , −x2 ≤ , −x3 ≤ , −x4 ≤ ; 5 1 5 12 − x1 − x2 ≤ , 4 − x1 − x3 ≤ , 12 − x1 − x4 ≤ 1 1 1 6 − x2 − x3 ≤ , 3 − x2 − x4 ≤ , 4 − x3 − x4 ≤ 7 3 12 − x1 − x2 − x3 ≤ , 4 − x1 − x2 − x4 ≤ , 2 7 3 − x1 − x3 − x4 ≤ , 12 − x2 − x3 − x4 ≤ ;
and (4.34)
x1 + x2 + x3 + x4 = 1.
, ;
146
4. Cooperative Games in Nonstrategic Form
Of course, if > 0, then (4.33a) will be superseded by condition (4.12a) that allocations are nonnegative. On using (4.34) to eliminate x4 from (4.33), we find that C + () corresponds to the three-dimensional set of vectors x ∈ X such that − ≤ x1 ≤
(4.35a)
5 12
(4.35b)
5 12
+ , − ≤ x2 ≤
− ≤ x1 + x2 ≤ 1 6 7 12
(4.35c)
3 4
+ ,
1 4
1 3
+ , − ≤ x3 ≤
− ≤ x1 + x3 ≤
− ≤ x2 + x3 ≤
7 12
2 3
1 4
+ ;
+ ,
+ ;
− ≤ x1 + x2 + x3 ≤ 1 + .
It is readily verified that (4.33) and (4.35) are equivalent; for example, the second of (4.35a) combines the second of (4.33a) with the third of (4.33c), whereas the third of (4.35b) combines the third and fourth of (4.33b). 5 We can now determine 1 . From the first of (4.35a) we have − ≤ 12 + , or 5 ≥ − 24 . Similarly, from the second and third of (4.35a), we have ≥ − 16 and 5 ≥ − 18 ; from (4.35b) we have ≥ − 16 and ≥ − 24 (twice); and from (4.35c) we 5 5 again have ≥ − 24 . We have now obtained three lower bounds on , namely, − 24 , 1 1 1 − 6 , and − 8 , and the greatest of these lower bounds is − 8 . We have therefore established that 1 ≥ − 18 . Can we improve this lower bound? Adding the first two inequalities of (4.35a) yields −2 ≤ x1 + x2 ≤
(4.36)
3 4
+ 2,
which in conjunction with the first of (4.35b) yields −≤
5 12
(4.37) or ≥ − 19 and 1 ≥ − 19 , which
3 4
+ 2,
−2 ≤
3 4
+
− 14 ,
≥ the second of which is superseded by the first. Hence supersedes 1 ≥ − 18 . Can we further improve this lower bound? Adding all three of the inequalities in (4.35a) yields −3 ≤ x1 + x2 + x3 ≤ 1 + 3,
(4.38)
which in conjunction with (4.35c) yields 7 12
(4.39) 5 − 48
− ≤ 1 + 3,
−3 ≤ 1 +
or ≥ and ≥ the second of which is again superseded by the first. 5 Thus 1 ≥ − 48 , an improved lower bound. Continuing in this manner, we obtain 5 four more lower bounds on , but none of them supersedes ≥ − 48 . No further restrictions on are implied by (4.35); can be as small as it pleases, as long as it 5 . We conclude that satisfies ≥ − 48 − 14 ,
5 1 = − 48 .
(4.40)
5 With = − 48 , the first inequality of (4.35c) and the second of (4.38) together 11 imply that 16 ≤ x1 + x2 + x3 ≤ 11 16 and hence
(4.41)
x1 + x2 + x3 =
11 16 .
We can now eliminate x3 from (4.35); after simplification, the least rational core consists of imputations satisfying (4.42)
5 24
≤ x1 ≤
5 1 16 , 8
≤ x2 ≤
11 13 48 , 24
≤ x1 + x2 ≤
7 12 .
4.4. Log hauling: A coreless game
147
5 These inequalities are satisfied only if x1 = 16 and x2 = 11 48 . Hence, from (4.41) and (4.34), the least rational core is 5 5 11 7 5 = . (4.43a) X 1 = C + − 48 16 , 48 , 48 , 16
In other words, the fairest distribution of the benefits of cooperation (according to our agreed criterion) is given by the imputation 5 11 7 5 , 48 , 48 , 16 . (4.43b) x∗ = 16 By analogy with (4.8), it follows from (4.30) that Player i should contribute c({i}) − (6 + 3d)xi = c({i}) − 12xi
(4.44)
to the cost per trip (which, you will recall, is measured in units of k dollars). Thus Jed and Ned should each cover 2.25 units of the total cost of 9 units; whereas Ted should contribute 3.25 units, and Zed only 1.25. Again, Ted is penalized for being the outlier. Now, it could be argued that this “fair” solution is more than fair to the newcomer Zed, and hence less than fair to the other three. If Jed, Ned, and Ted were to bargain as a unit, then we would have, in effect, a two-player CFG with costs c(1) = 9,
(4.45)
c(2) = 5,
c(1 ∪ 2) = 9
and hence benefits (4.46)
ν(1) = 0 = ν(2),
ν(1 ∪ 2) = 5,
where 1 = {1, 2, 3} and 2 = {4}. From Exercise 4.6 the fair solution of this modified CFG is to share the benefits of cooperation equally between 1 and 2. Thus Zed would have to contribute 2.5 units—twice as much as previously—whereas Jed, Ned, and Ted would have reduced their total costs from 7.75 units to 6.5. How would they distribute the extra benefit of 1.25 units among themselves? No two of them could secure part of this benefit by acting without the other; the benefit would accrue only if all three of them agreed to gang up on Zed. Thus Jed, Ned, and Ted would be led to a CFG with normalized characteristic function defined by (4.47)
ν(S) = 0 if S = {1, 2, 3},
ν({1, 2, 3}) = 1.
Again (Exercise 4.6), the fair solution of this game is to share the extra benefit 5 of a equally. Thus Jed, Ned, and Ted would each reduce their cost per trip by 12 11 unit, so that Jed and Ned would cover 6 units of the total cost of 9 units, whereas 5 Ted would contribute 17 6 units (and Zed 2 ). Whether this or the previous solution ultimately emerged would depend on details of how the car pool formed. Note, finally, that the method we used to determine C + (1 ) for a four-player game applies even more readily to a three-player game. It can be used in particular to deduce (4.28) from (4.27); see Exercise 4.2.
4.4. Log hauling: a coreless game Many years ago when they were young and sprightly, Zan and Zed felled enough trees to make 150 8-foot logs, which they neatly stacked in their yard. They had intended to turn the logs into a cabin; but years went by, and they never found the time. Now Zan and Zed are too old to lift the logs, and they no longer need a
148
4. Cooperative Games in Nonstrategic Form
cabin. They have therefore decided to sell the logs for a dollar apiece to the first person or persons who will haul every single one away, and they have advertised the logs accordingly. In response to their advertisement, three people and their pickup trucks arrive simultaneously at Zan and Zed’s at the appointed time. These eager beavers, who would all like to haul as many of the bargain logs as possible, are our old friends Jed (Player 1), Ned (Player 2), and Ted (Player 3). Jed’s truck can haul 45 logs, Ned’s can haul 60, and Ted’s can haul 75. If there were 180 logs, then all three could leave with a full load. Unfortunately, however, there are only 150 logs; and because our friends all arrived simultaneously, no one can claim to have arrived first. Moreover, because the logs are too heavy for one person to lift, and because nobody gets anything unless everything goes, the players cannot resort to a scramble; they must cooperate. Then how do they divvy up the logs? We must solve another CFG. Let ν(S), the benefit of cooperation that accrues to coalition S, be the number of logs that the players in S can haul without the help of players outside S. Then
(4.48)
ν({i}) = 0, i = 1, 2, 3 ν({1, 2}) = 105, ν({1, 3}) = 120, ν({2, 3}) = 135 ν({1, 2, 3}) = 150.
The characteristic function is therefore defined by (4.49)
ν({1, 2}) =
7 10 ,
ν({1, 3}) = 45 , ν({2, 3}) =
9 10
(and, of course, ν({i}) = 0, i = 1, 2, 3, ν(N ) = 1). It follows that the imputation x = (x1 , x2 , x3 ) belongs to C + () if and only if (4.50)
1 + , − ≤ x2 ≤ 15 + , − ≤ x1 ≤ 10 7 10 − ≤ x1 + x2 ≤ 1 +
(Exercise 4.7). The sum of the second and fourth inequalities is consistent with 7 3 2 2 − ≤ 10 + 2, or ≥ 15 . Thus 1 = 15 . We conclude the fifth if and only if 10 immediately that the game has no core. Whenever a game is coreless or, which is the same thing, whenever 1 > 0 (if 1 = 0, then the core contains a single imputation), the maximum dissatisfaction must be positive; no matter which imputation is selected, at least one coalition (and in this particular case, every coalition) will have a lower allocation than if the grand coalition did not form. Thus Jed, Ned, and Ted all wish that one of the other two had not arrived or had arrived late; but they did arrive, and the three of them must reach an agreement. Characteristic function games assume that the grand coalition has formed, but have nothing to say about how it formed. If the game has a core, then it is probable that the grand coalition formed voluntarily; the more players the merrier (up to a point), because everyone benefits from the cooperation. If, on the other hand, the game has no core, then the players were probably coerced into forming the grand coalition; the fewer players the better (down to a point), but no one can be barred from playing. To put it another way, games with cores are about sharing a surplus of benefits, whereas coreless games are about rationing a shortage of benefits; but in either case the existence of the grand coalition is assumed, and
4.4. Log hauling: A coreless game
149
the least rational core, if it contains a single imputation, offers a solution that is fair in the sense of minimizing maximum dissatisfaction. 2 7 , (4.50) can be satisfied only if x1 = 30 and x2 = 13 (Exercise 4.7), With 1 = 15 and so the least rational core is 2 7 1 13 = . (4.51) X 1 = C + 15 30 , 3 , 30 Our fair solution gives 35 logs to Jed, 50 to Ned and 65 to Ted. Hence it distributes the 30-log shortage equally among the players.6 Another way to reach the same conclusion is to say that 15 logs are attributable to Jed regardless of any fairness considerations, because 15 logs would be left over even if Ned and Ted had both filled their trucks. Similarly, 30 logs are attributable to Jed, and 45 logs are attributable to Ted. So only 60 of the 150 logs are not attributable without invoking considerations of fairness. If we divide these 60 logs equally among the players, then we recover the least rational core. The preceding argument readily generalizes. Consider the three-player game defined by (4.52a)
ν({2, 3}) = a,
ν({1, 3}) = b,
ν({1, 2}) = c
with m = max(a, b, c) ≤ 1.
(4.52b)
Because Players 2 and 3 can obtain only a of total savings by themselves, let 1 − a be attributed to Player 1; and likewise, let 1 − b and 1 − c be attributed to Players 2 and 3, respectively. Then nonattributable savings are 1 − (1 − a) − (1 − b) − (1 − c) = a + b + c − 2, which is positive whenever no core exists (Exercise 4.8). If players receive the savings thus attributed to them plus equal shares of nonattributable savings, then the game’s solution is the imputation xT defined by (4.53)
xT =
1 3 (1
− 2a + b + c, 1 + a − 2b + c, 1 + a + b − 2c).
This solution is of considerable historical interest, because it was used by engineers and economists of the Tennessee Valley Authority in the 1930s—long before game theory emerged as a formal field of study—to allocate fairly the joint costs of dam systems among participating users [325]. The solution foreshadowed later developments in game theory, because it agrees with the nucleolus—which we are about to introduce—whenever (4.54)
min(a + b + c, 2a + 2b + 2c − 3m) ≥ 1
(Exercise 4.24). Note that (4.54) is satisfied by (4.49). Whether the above is the conclusion that Jed, Ned, and Ted would actually reach, however, is an open question. For example, they might instead consider it fairer for each to have 50 logs in the first instance, and then to transfer five of Jed’s logs to the other two, because Jed can haul only 45. But how could those five logs then be fairly divided between Ned and Ted—other than by sawing one in half? 6 There remains the problem of who should put the logs on the trucks (assume that they all have family to help at home). Because it takes two to lift a log, and because the player’s allocations are in the ratio 7:10:13, it seems fairest for Jed to lift 70 logs, Ned to lift 100 and Ted 130. For example, Ted could lift all 35 of Jed’s logs, 12 with Jed and 23 with Ned; Ned could lift all 50 of his own logs, 20 with Jed and 30 with Ted; and Ted could lift all 65 of his own logs, 27 with Ned and 38 with Jed.
150
4. Cooperative Games in Nonstrategic Form
Figure 4.5. Reasonable set (hexagon) and rational -cores for = 0 (outer quadrilateral), = −1/20 (inner quadrilateral), and = −1/10 (line segment) for d = 1 in the three-person car pool. The quadrilaterals are φ0 = contours, where φ0 is defined by (4.23); the dot marks the nucleolus; and the difference between the reasonable set and the core is shaded.
4.5. Antique dealing. The nucleolus The boundary of the core does not always collapse to a single point when we move its walls inward at a uniform rate; that is, the least rational core may contain more than one imputation. Then what is the center of the core in such cases? An answer is provided by the nucleolus, which we introduce in this section.7 Let us begin by returning to the three-person car pool of §4.1. If d = 1 then, from (4.10), the characteristic function is defined by (4.55)
ν({1, 2}) =
4 5,
ν({1, 3}) =
2 5,
ν({2, 3}) =
1 5
(and, of course, ν(N ) = 1, ν({i}) = 0, i = 1, 2, 3). It is questionable whether a car pool would really form for such a low value of d, but the value is convenient for our present purpose. From Exercise 4.2, the rational -core is (4.56) C + () = x ∈ X − ≤ x1 ≤ 45 + , − ≤ x2 ≤ 35 + , 4 5 − ≤ x1 + x2 ≤ 1 + , 1 where X is defined by (4.7); for example, C + (0) and C + − 20 are the two quadrilaterals in Figure 4.5.8 By the method of §4.3, and because (4.56) implies 45 − ≤ 1+, 1 it is readily found that 1 = − 10 , and so the least rational core is 1 2 7 1 9 . , 5 ≤ x2 ≤ 12 , x1 + x2 = 10 (4.57) X = x ∈ X | 5 ≤ x1 ≤ 10 7 The concept of nucleolus is due to Schmeidler [297]. Strictly, the concept we introduce is Maschler, Peleg, and Shapley’s lexicographic center—which, however, coincides with the nucleolus. See [183, pp. 331–336].
8 1 onto Strictly speaking, of course, these quadrilaterals are the projections of C + (0) and C + − 20 the x1 -x2 plane. See the remarks at the end of §4.1.
4.5. Antique dealing. The nucleolus
151
It no longer contains a single imputation; rather, it contains all imputations 2 1 cor9 and + x = between (x , x ) = responding to points of the line x 1 2 1 2 10 5, 2 7 1 (x1 , x2 ) = 10 , 5 in Figure 4.5. Which of all these imputations is the fair solution of the game? To answer this question, let us continue to assume that fairness means minimizing maximum dissatisfaction. Now, by construction, for x ∈ X 1 the maximum 1 dissatisfaction of any coalition is 1 = − 10 . That is, every coalition obtains at least 1 more of the benefits of cooperation than it could obtain if the grand coalition 10 had not formed, which we can verify by using (4.16) and (4.55) to compute the excesses at x ∈ X 1 : e({1}, x) = −x1 , (4.58)
e({1, 3}, x) = x2 −
3 5,
e({2}, x) = −x2 , e({2, 3}, x) = x1 − 45 ,
1 = e({3}, x). e({1, 2}, x) = − 10 7 1 Because 25 ≤ x1 ≤ 10 and 15 ≤ x2 ≤ 12 , none of these excesses exceeds − 10 , which is the maximum dissatisfaction. But it is also the minimum possible maximum dissatisfaction. Because e(S, x) is independent of x for the two most dissatisfied coalitions, namely, {1, 2} and {3}, we can vary x within X 1 without affecting their excesses. For the other four coalitions, however, e(S, x) remains a function of x; and so, by varying x, we can seek to minimize their maximum excess, i.e., minimize the maximum of e({1, 3}, x), e({2, 3}, x), e({1}, x), and e({2}, x). In other words, coalitions {1, 2} and {3}, being indifferent among all imputations in X 1 , are already as satisfied as it is possible to make them; so we exclude them from further consideration, and concentrate instead on reducing the dissatisfaction of the remaining coalitions 1 . What this means in practice is that Ted has been allocated a tenth of below − 10 the benefits, and Jed and Ned have together been allocated nine tenths, but it is not yet clear how to divvy it up between them.
With a view to generalization, let Σ 1 denote the set of coalitions whose excess can be reduced below 1 by an imputation in X 1 . That is, define (4.59) Σ 1 = S ∈ Σ 0 | e(S, x) < 1 for some x ∈ X 1 , (so that Σ 1 = {{1}, {2}, {1, 3}, {2, 3}} for our three-person car pool with d = 1). Still with a view to generalization, for x ∈ X 1 let (4.60)
φ1 (x) = max1 e(S, x) S∈Σ
be the maximum excess at imputation x of the remaining coalitions; and let 2 be its minimum, i.e., (4.61)
2 = min1 φ1 (x). x∈X
Furthermore, let X 2 consist of all imputations at which φ1 achieves its minimum, i.e., define (4.62) X 2 = x ∈ X 1 | φ1 (x) = 2 .
152
4. Cooperative Games in Nonstrategic Form
0
ϕ1 0.4
0.5
0.6
0.7
x1
-0.1 -0.2 -0.3 -0.4 -0.5 -0.6 -0.7 Figure 4.6. Graph of φ1 defined by (4.63) 9 Then, in the case of the car pool, because x1 + x2 = 10 for x ∈ X 1 , φ1 can be regarded as function of x1 alone. In fact, from (4.58)–(4.60), we have φ1 (x) = max x2 − 35 , x1 − 45 , −x1 , −x2 3 (4.63) 9 = max 10 − x1 , x1 − 45 , −x1 , x1 − 10 7 ≤ x1 ≤ 10 . The graph of φ1 is the solid vee in Figure 4.6. Note that φ1 assumes its minimum 1 of − 14 where x1 = 11 20 . Thus 2 = − 4 and 11 7 1 (4.64a) X2 = . 20 , 20 , 10
for
2 5
From (4.58), the coalitions for which − 14 is the actual excess at x∗ are {1, 3} and 7 {2, 3}; for {1} and {2}, of course, the excesses at x∗ are − 11 20 and − 20 , respectively. 2 Because X contains a single imputation 7 1 (4.64b) x∗ = 11 20 , 20 , 10 , no further variation of x is possible. We have reduced unfairness as much as possible. Thus x∗ is the solution of the CFG. From (4.8), Jed, Ned, and Ted’s fair (fare?) shares of the cost per trip are 2.25k, 2.25k, and 3.5k, respectively. The ideas we have just developed to reach this solution are readily generalized to games with more than three players, for which the least rational core need be neither a point nor a line segment.9 Consider, for example, the following four-player variation of Exercise 4.9. Jed, Ned, Ted, and Zed are antique dealers, who conduct their businesses in separate but adjoining rooms of a common premises. Their 9 Even if the least rational core is a line segment, X 2 need not be its midpoint. Although the walls of the least rational core are moved inward at the same speed, the components of the velocities of inward movement along the line segment corresponding to the least core need not be equal. For an example, see Exercise 4.11.
4.5. Antique dealing. The nucleolus
153
Figure 4.7. Antique dealers’ business hours
advertised office hours are shown in Figure 4.7; for example, Ted’s advertised hours are from 10:00 a.m. until 4:00 p.m. Because the dealers have other jobs and the store is never so busy that one guy couldn’t take care of everyone’s customers, it is in the dealers’ interests to pool their time in minding the store: there is no need for two people between 10:00 a.m. and noon or between 4:00 p.m. and 5:00 p.m., for three people between noon and 2:00 p.m., or for four people between 2:00 p.m. and 4:00 p.m. Cooperation will enable dealers to leave earlier than their advertised closing hours, or arrive later than their advertised opening hours, or both. But how much earlier or later? What are fair allocations of store-minding duty? Let Jed, Ned, Ted, and Zed be Players 1 to 4, respectively, and let the benefit of cooperation to coalition S be the number of dealer-hours that S saves by pooling time. Then ν(N ) = 13, where N = {1, 2, 3, 4}, because only 8 of the 21 dealer hours currently advertised are actually necessary; ν(i) = 0 for all i ∈ N because no time is saved if no one cooperates; ν({1, 2}) = 4, because either Jed or Ned can mind the other’s business between noon and 4:00; and so on. Thus (4.65)
νi = 0, i = 1, 2, 3, 4, 4 4 3 6 2 2 ν12 = 13 , ν13 = 13 , ν14 = 13 , ν23 = 13 , ν24 = 13 , ν34 = 13 , 10 7 7 8 ν123 = 13 , ν124 = 13 , ν134 = 13 , ν234 = 13 , ν1234 = 1,
by (4.11) and (4.31), and it follows readily that the rational -core contains all vectors x = (x1 , x2 , x3 , x4 ) ∈ X such that − ≤ x1 ≤
(4.66a) (4.66b) (4.66c)
4 13
5 13
+ , − ≤ x2 ≤
− ≤ x1 + x2 ≤ 6 13
11 13
+ ,
4 13
6 13
− ≤ x1 + x3 ≤
− ≤ x2 + x3 ≤ 10 13
+ , − ≤ x3 ≤ 10 13
11 13
6 13
+ ;
+ ,
+ ;
− ≤ x1 + x2 + x3 ≤ 1 +
(Exercise 4.10). By the method of §4.3, we find that (4.67)
3 1 = − 26
(note, in particular, that (4.66c) implies 2 ≥ −3/13), and so it follows that (4.68)
x1 + x2 + x3 =
23 26
154
4. Cooperative Games in Nonstrategic Form
x2 1/4
0 0
x1
1/4
Figure 4.8. Least rational core of the antique-minding game. For greater detail, see Figure 4.9.
for x ∈ X 1 , by (4.66c). Denoting imputations that satisfy (4.68) and hence x4 = ˆ and using (4.68) to eliminate x3 , we find that by X, ˆ | 3 ≤ x1 ≤ 7 , x1 + x2 ≥ 7 , x2 ≤ 9 (4.69) X1 = x ∈ X 13 26 13 26
3 26
after simplification (Exercise 4.10). Thus the least rational core corresponds to the quadrilateral shaded in Figure 4.8. By construction, for x ∈ X 1 the dissatisfaction of any coalition in Σ 0 is at most 3 3 more of the benefits of cooperation − 26 ; that is, every coalition obtains at least 26 than it could obtain if the grand coalition had not formed. Two coalitions, namely, 3 , because e({1, 2, 3}) = 10 {1, 2, 3} and {4}, obtain an excess of precisely − 26 13 − 3 3 x1 − x2 − x3 = − 26 and e({4}) = −x4 = − 26 for all x ∈ X 1 , on using (4.16), (4.65), (4.68) and (4.34). These are the coalitions that cannot be allocated less dissatisfaction (or, if you prefer, greater satisfaction) than they receive from any imputation in the least rational core.10 On the other hand, the excesses in X 1 of {1}, {2}, {3}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 4}, {1, 3, 4}, and 3 at least somewhere. So {2, 3, 4} still vary with x, and are strictly less than − 26 we exclude {1, 2, 3} and {4} from further consideration and focus on eliminating unfairness among the twelve remaining coalitions, which constitute Σ 1 defined by (4.59). With the help of (4.16), (4.65), (4.68), and (4.34), and using eij··· as a convenient and obvious shorthand for e({i, j, . . .}, x), the excesses of those twelve coalitions at x ∈ X 1 are readily found to be (4.70a)
(4.70a) e1 = −x1, e2 = −x2, e3 = x1 + x2 − 23/26,
(4.70b) e12 = 4/13 − x1 − x2, e13 = x2 − 15/26, e14 = 3/26 − x1,
(4.70c) e23 = x1 − 11/26, e24 = 1/26 − x2, e34 = x1 + x2 − 11/13,
(4.70d) e124 = 11/26 − x1 − x2, e134 = x2 − 6/13, e234 = x1 − 5/13.
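The substitutions behind (4.70) are easy to garble by hand, so here is a quick symbolic spot-check. It is my own sketch, not part of the text, and it assumes the sympy library is available; the coalition values are copied from (4.65) and (4.68).

```python
# A symbolic spot-check (mine, not the book's) of (4.70): substitute
# x3 = 23/26 - x1 - x2 and x4 = 3/26 into e(S, x) = nu(S) - sum_{i in S} xi.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x3 = sp.Rational(23, 26) - x1 - x2           # from (4.68)
x4 = sp.Rational(3, 26)                      # from (4.68)
x = {1: x1, 2: x2, 3: x3, 4: x4}
nu = {(1, 4): sp.Rational(3, 13), (2, 3): sp.Rational(6, 13),
      (1, 3, 4): sp.Rational(7, 13), (2, 3, 4): sp.Rational(8, 13)}

for S, vS in nu.items():
    print(S, sp.expand(vS - sum(x[i] for i in S)))
# -> (1,4): 3/26 - x1; (2,3): x1 - 11/26; (1,3,4): x2 - 6/13; (2,3,4): x1 - 5/13
```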
It is straightforward to show that none of e1, e2, e3, e12, e13, e24, or e34 can exceed −3/13 on X1, whereas none of the remaining five excesses can be less than −5/26. Moreover, e234 > e23. Therefore, the maximum in definition (4.60) of φ1 is effectively taken, not over the whole of Σ1, but rather over the four coalitions {1, 4}, {1, 2, 4}, {1, 3, 4}, and {2, 3, 4} with excesses e14, e124, e134, and e234, respectively.

10 What this means in practice, of course, is that Zed has been allocated 3/26 of the benefits of cooperation, whereas Jed, Ned, and Ted have together been allocated 23/26, but it is not yet clear how to divvy it up between them.
Figure 4.9. Contour map of the function φ1 defined by (4.71) over the region shaded in Figure 4.8. Subdomains in (4.71b) are distinguished by increasingly darker shading, starting with no shading for the first subdomain and moving anticlockwise.
Thus

(4.71a) φ1(x) = max{3/26 − x1, 11/26 − x1 − x2, x2 − 6/13, x1 − 5/13}

or, equivalently,

(4.71b) φ1(x) =
    3/26 − x1          if x1 ≤ 1/4, x2 ≥ 4/13, x1 + x2 ≤ 15/26,
    11/26 − x1 − x2    if x2 ≤ 4/13, 2x1 + x2 ≤ 21/26,
    x1 − 5/13          if x1 ≥ 1/4, 2x1 + x2 ≥ 21/26, x2 ≤ x1 + 1/13,
    x2 − 6/13          if x1 + x2 ≥ 15/26, x2 ≥ x1 + 1/13

for x ∈ X1. Because φ1 defined by (4.71a) is now in effect a function of two variables, its graph is three dimensional, and so we can no longer plot it in two dimensions as we did in Figure 4.6. Nevertheless, we can use (4.71b) to sketch its contour map, which is shown in Figure 4.9; see Exercise 4.10. We see that the graph of φ1 is shaped like an asymmetric boat or inverted roof, which slopes downward from the outermost contour at height φ1 = −3/26 (corresponding to minimum dissatisfaction for coalitions {1, 2, 3} and {4}) to the innermost contour at height φ1 = −7/52. It follows that

(4.72) ε2 = min_{x∈X1} φ1(x) = −7/52
and

(4.73) X2 = {x ∈ X1 | φ1(x) = ε2} = {x ∈ X̂ | x1 = 1/4, 4/13 ≤ x2 ≤ 17/52}

with x3 = 23/26 − x1 − x2 and x4 = 3/26 by (4.68). We can think of X2—which, in effect, has been obtained after 1 second by moving the walls of X1 inward at a rate of one unit every 52 seconds—as the second-order least rational core. By construction, for x ∈ X2 the dissatisfaction of any coalition in Σ1 is at most −7/52. Two coalitions, namely, {1, 4} and {2, 3, 4}, obtain an excess of precisely −7/52, because e14 = −7/52 = e234 for all x ∈ X2 from (4.73) and (4.70). These coalitions cannot be allocated less dissatisfaction than they receive from any imputation in X2. On the other hand, the excesses in X2 of {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 4}, and {1, 3, 4} still vary with x, and are strictly less than −7/52 at least somewhere. So we exclude {1, 4} and {2, 3, 4} from further consideration and concentrate our attention on eliminating unfairness among the ten remaining coalitions, which constitute

(4.74) Σ2 = {S ∈ Σ1 | e(S, x) < ε2 for some x ∈ X2}.
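As a sanity check on (4.72) and (4.73), the minimization of φ1 over the quadrilateral X1 can also be done by brute force. The sketch below is mine, not the book's; the grid size and variable names are arbitrary choices.

```python
# A brute-force check (mine) of (4.72): evaluate phi1 on a fine grid over the
# quadrilateral X1 of (4.69) and record where the minimum occurs.
def phi1(x1, x2):
    return max(3/26 - x1, 11/26 - x1 - x2, x2 - 6/13, x1 - 5/13)

best = (1.0, 0.0, 0.0)
n = 400
for i in range(n + 1):
    x1 = 3/13 + (7/26 - 3/13) * i / n                  # 3/13 <= x1 <= 7/26
    lo = 7/13 - x1                                     # x1 + x2 >= 7/13
    for j in range(n + 1):
        x2 = lo + (9/26 - lo) * j / n                  # x2 <= 9/26
        best = min(best, (phi1(x1, x2), x1, x2))
print(best[0], -7/52)   # both are about -0.134615; the minimum lies on x1 = 1/4
```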
By now it should be clear how we proceed. By analogy with (4.60)–(4.62), for x ∈ X2 let

(4.75) φ2(x) = max_{S∈Σ2} e(S, x)

be the maximum excess at imputation x of the remaining coalitions, and let

(4.76) ε3 = min_{x∈X2} φ2(x)

be its minimum. Furthermore, let X3 consist of all imputations at which φ2 achieves its minimum, i.e., define

(4.77) X3 = {x ∈ X2 | φ2(x) = ε3}.

We can think of X3 as the third-order least rational core. Because x1 = 1/4 for x ∈ X2, φ2 can be regarded as a function of x2 alone. Moreover, none of e1, e2, e3, e12, e13, e23, e24, or e34 can exceed −9/52 on X2, and neither of the other two excesses can be less than −2/13. So the maximum in definition (4.75) of φ2 is effectively taken, not over the whole of Σ2, but only over coalitions {1, 2, 4} and {1, 3, 4}. That is,

(4.78) φ2(x) = max{9/52 − x2, x2 − 6/13}

for 4/13 ≤ x2 ≤ 17/52. The graph of φ2 is similar to Figure 4.6, and its minimum occurs where 9/52 − x2 = x2 − 6/13, or x2 = 33/104. Thus ε3 = −15/104, and

(4.79) X3 = {(1/4, 33/104, 33/104, 3/26)}

contains the single imputation x∗ = (1/4, 33/104, 33/104, 3/26). No further variation of x is possible: we have reduced unfairness as much as is possible. Thus x∗ is the solution of the CFG; and Jed, Ned, Ted, and Zed's fair shares of the time saved are 13/4 hours for Jed, 33/8 hours each for Ned and Ted, and 3/2 hours for Zed. Ned must still arrive at 9:00, but now leaves at 11:52:30 when Ted arrives; Ted now leaves at 1:45 when Jed arrives; and Jed now leaves at 3:30 when Zed arrives for the last turn of duty.
Ned and Ted benefit most because their advertised hours have the greatest overlap with those of the others.11 Our procedure for successively eliminating unfair imputations, until only a single imputation remains, can be generalized to apply to arbitrary n-player characteristic function games. We construct by recursion a nested sequence (4.80)
X = X0 ⊃ X1 ⊃ X2 ⊃ · · · ⊃ Xκ = X∗
of sets of imputations, a nested sequence (4.81)
Σ0 ⊃ Σ1 ⊃ Σ2 ⊃ · · · ⊃ Σκ
of sets of coalitions,12 a decreasing sequence

(4.82) ε1 > ε2 > · · · > εκ

of real numbers, and a sequence

(4.83)
φ0 , φ1 , · · · , φκ−1
of functions such that the domain of φk is Xk. The recursion is defined for k ≥ 1 by

(4.84a) φk(x) = max_{S∈Σk} e(S, x),
(4.84b) εk = min_{x∈Xk−1} φk−1(x),
(4.84c) Xk = {x ∈ Xk−1 | φk−1(x) = εk},
(4.84d) Σk = {S ∈ Σk−1 | e(S, x) < εk for some x ∈ Xk},
where X0 = X and Σ0 consists of all coalitions except ∅ and N. If we perform this recursion for increasing values of the positive integer k, then eventually we reach a value of k for which the set Xk contains a single imputation; for a proof of this result see, e.g., Wang [354, p. 146]. If we denote the final value of k by κ, then 1 ≤ κ ≤ n − 1 (because the dimension of Xk is reduced by at least one at every step of the recursion); and if Xκ = {x∗}, then according to our criterion of fairness, the imputation x∗ is the fair solution of the game. The set Xκ = {x∗} is called the nucleolus. If κ = 1, of course, then the nucleolus coincides with the least rational core, as in (4.28) and (4.51), where n = 3, or in (4.43), where n = 4. More commonly, however, we have κ > 1, as in (4.64), where κ = 2 and n = 3; or in (4.79), where κ = 3 and n = 4. Technically, the task of calculating the nucleolus is an exercise in repeated linear programming,13 which—at least for games of moderate size—can safely be delegated to a computer. For a discussion of this and other matters related to the nucleolus, see, e.g., Owen [257] or Wang [354]. A final remark is in order. Technically, x∗ is not the nucleolus; rather, it is the (only) imputation that the nucleolus contains. To observe the distinction is
often a nuisance, however, and we shall not hesitate to breach linguistic etiquette by referring to both x∗ and {x∗} as the nucleolus whenever it is convenient to do so (as, e.g., in §4.7).

11 Whether this is the conclusion that Jed, Ned, Ted, and Zed would actually reach, however, is an open question. They might even consider it fairer to give each dealer 13/4 hours to begin with, so that Zed can just stay at home, and transfer 7 minutes and 30 seconds of the 15 minutes that Zed doesn't need to each of Ned and Ted.
12 In (4.80) and (4.81) we use the proper-superset symbol. In general, for sets A and B, B ⊃ A means that B both contains A and is not equal to A (and is equivalent to A ⊂ B in Footnote 1 on p. 119), whereas B ⊇ A would allow for either B ⊃ A or B = A (and is equivalent to A ⊆ B).
13 For a modern approach to this topic, see Castillo et al. [60].
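Readers who want to verify the antique-dealing solution by machine can compare sorted excess vectors, which is precisely the lexicographic criterion that the recursion (4.84) implements. The following sketch is my own, not the author's; it uses exact rational arithmetic to avoid rounding doubts.

```python
# A minimal sketch (mine): the nucleolus lexicographically minimizes the
# vector of coalition excesses sorted in decreasing order.
from fractions import Fraction as F
from itertools import combinations

N = (1, 2, 3, 4)
v = {N: F(1)}                                 # characteristic function (4.65)
for S, num in {(1,2): 4, (1,3): 4, (1,4): 3, (2,3): 6, (2,4): 2, (3,4): 2,
               (1,2,3): 10, (1,2,4): 7, (1,3,4): 7, (2,3,4): 8}.items():
    v[S] = F(num, 13)
for i in N:
    v[(i,)] = F(0)

coalitions = [S for r in range(1, 4) for S in combinations(N, r)]  # Sigma0

def sorted_excesses(x):
    """Excesses e(S,x) = nu(S) - sum_{i in S} x_i, most dissatisfied first."""
    return sorted((v[S] - sum(x[i - 1] for i in S) for S in coalitions),
                  reverse=True)

x_star = (F(1, 4), F(33, 104), F(33, 104), F(3, 26))   # claimed nucleolus (4.79)
x_egal = (F(1, 4),) * 4                                # egalitarian imputation

# The nucleolus's sorted excess vector is lexicographically no larger than
# that of any other imputation; here we compare it with just one rival.
assert sorted_excesses(x_star) < sorted_excesses(x_egal)
print(sorted_excesses(x_star)[:4])   # -> [-3/26, -3/26, -7/52, -7/52]
```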
4.6. Team long-jumping. An improper game

To encourage team spirit among schoolboys who compete as individuals, a high school has instituted a long-jumping competition for teams of two or three. The rules of the competition stipulate that each team can make up to 12 jumps, of which six count officially towards the final outcome—the best three jumps of each individual in a two-man team, or the best two jumps of each individual in a three-man team. Furthermore, a local business has agreed to pay a dollar per foot for each official foot jumped in excess of 15 feet. So unless—which is unlikely—nobody jumps more than 15 feet, winning the competition is the same thing as maximizing dollar uptake. But the existence of prize money brings the added problem of distributing the money fairly among the team. We will use the resultant CFG to clarify the difference between what we have called the rational ε-core and what is known in the literature as the ε-core.

Let's suppose that young Jed (Player 1), young Ned (Player 2), and young Ted (Player 3) competed as a team in this year's competition. They agreed to take four jumps each, and their best three jumps are recorded (in feet) in Table 4.1; obviously, Jed had a bad day. The dollar equivalents of these jumps are recorded in Table 4.2. We see at once that Jed, Ned, and Ted have earned themselves the grand sum of 0 + 0 + 5 + 3 + 4 + 4 = 16 dollars. How should they divvy it up? Let the benefit of cooperation, ν(S), to coalition S be the prize money that the players in S would have earned if they had competed as a team. Then, of course, ν({i}) = 0, i = 1, 2, 3 by (4.6), with one-man teams anyhow not allowed; and, from Table 4.2, we have ν({1, 2}) = 11 = ν({1, 3}), ν({2, 3}) = 22, and ν(N) = 16. So, from (4.11), the characteristic function is defined by
(4.85) ν({1, 2}) = 11/16 = ν({1, 3}), ν({2, 3}) = 11/8
(and, of course, ν({i}) = 0, i = 1, 2, 3, ν(N) = 1). We see at once that there is a coalition of fewer than all the players who could obtain more than the grand coalition. Thus, if it were possible for Ned and Ted to dissolve the grand coalition and become a two-man team, then it would certainly be in their interests to do so. Unfortunately, however, they have already declared themselves a three-man team. The rules of competition enforce the grand coalition.

Table 4.1. Team's best jumps (in feet)

              J    N    T
First jump   15   20   19
Second jump  15   18   19
Third jump   15   18   18

Table 4.2. Monetary values (dollars)

              J    N    T
First jump    0    5    4
Second jump   0    3    4
Third jump    0    3    3
We now digress from our long-jump competition to introduce some more general terminology. If (4.86)
ν(S ∪ T ) ≥ ν(S) + ν(T )
for any two coalitions S, T that are disjoint (S ∩ T = ∅), then the characteristic function ν is said to be superadditive, and the associated game is said to be proper. If, on the other hand, there exist S and T such that (4.86) is violated, then the game is said to be improper. Thus all of the games we studied in §§4.1–4.5 were proper games, whereas the three-player game defined by (4.85) is an improper game. Now, whenever a game is improper, one can make a case for relaxing the condition that allocations should be nonnegative. Could not Ned and Ted demand that Jed reimburse them for the potential earnings that they have lost through joining forces with Jed? Accordingly, let the vector x = (x1 , x2 , . . . , xn ) be called a pre-imputation if it satisfies (4.12b), but not necessarily (4.12a); that is, if (4.87)
x1 + x2 + · · · + xn = 1,
but there may exist some i ∈ N for which xi < 0. Thus pre-imputations are vectors that are group rational but not necessarily individually rational for every player. We will denote the set of all pre-imputations by X̄, and clearly X̄ ⊃ X.14 With the concept of pre-imputation, we can define the core as

(4.88) C(0) = {x ∈ X̄ | e(S, x) ≤ 0 for all coalitions S}.

Because e({i}, x) ≤ 0 implies xi ≥ 0, this definition is equivalent to (4.19), i.e., C+(0) = C(0). We now define the ε-core by

(4.89) C(ε) = {x ∈ X̄ | e(S, x) ≤ ε for all S ∈ Σ0}.

Thus the rational ε-core is

(4.90) C+(ε) = C(ε) ∩ X,

and by analogy with (4.23) if we first define a function φ̄0 from X̄ to the real numbers by

(4.91) φ̄0(x) = max_{S∈Σ0} e(S, x),

then we can define the ε-core more succinctly as

(4.92) C(ε) = {x ∈ X̄ | φ̄0(x) ≤ ε}

by analogy with (4.24). Obviously, C+(ε) ⊆ C(ε). The contours of φ̄0 coincide with those of φ0 on X; but whereas contours of φ0 that meet the boundary of X must stay on the boundary, contours of φ̄0 may continue across. For example, in Figure 4.4, the contour φ̄0 = −1/21 would be a triangle with a vertex on the line x1 + x2 = 1, and the contour φ̄0 = 0 would be a triangle with a vertex in the region above the line x1 + x2 = 1 (which is inside X̄ but outside X because x3 < 0). By analogy with results for the rational ε-core, there is a least value of ε, namely,

(4.93) ε̄1 = min_{x∈X̄} φ̄0(x),
14 More precisely, X̄ is an (n − 1)-dimensional hyperplane, which contains the simplex X.
for which the ε-core exists. We refer to C(ε̄1) as the least core. Because the minimum over a larger set cannot exceed the minimum over a smaller set, a comparison of (4.93) with (4.25) reveals that

(4.94) ε̄1 ≤ ε1.

Furthermore, if the game has a core, then ε̄1 = ε1 (and C(ε̄1) = X1, i.e., the least core coincides with the least rational core).

If the game is both coreless and improper, however, then the possibility15 arises that ε̄1 < ε1. We can illustrate this circumstance by returning now to our long-jumping game, which is both improper and coreless, because an improper three-player CFG is always coreless (Exercise 4.12). From (4.85) and (4.89), we have

(4.95) C(ε) = {x ∈ X̄ | −ε ≤ x1 ≤ −3/8 + ε, −ε ≤ x2 ≤ 5/16 + ε, 11/16 − ε ≤ x1 + x2 ≤ 1 + ε}.

By the method of §4.3, and noting in particular that (4.95) implies 11/16 − ε ≤ −3/8 + 5/16 + 2ε, we readily find that the least value of ε for which C(ε) ≠ ∅ is ε̄1 = 1/4, and that C(1/4) contains only the pre-imputation (−1/8, 9/16, 9/16). Thus, if the least core were a fair solution, then not only should Ned and Ted take all the prize money and divide it between themselves equally, but also Jed should pay them a dollar apiece to atone for jumping so badly. On the other hand, the least value of ε for which C+(ε) ≠ ∅ is ε1 = 3/8, so that

(4.96) X1 = {x ∈ X | x1 = 0, 5/16 ≤ x2 ≤ 11/16}
and the nucleolus is

(4.97) X2 = {x∗} = {(0, 1/2, 1/2)}

(Exercise 4.13). The fair solution is for Ned and Ted to take all the prize money and divide it equally.

15 But not the inevitability; see Exercise 4.14.
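Computationally, ε̄1 and the least core of (4.95) drop out of a single linear program. Here is a minimal sketch (mine, assuming SciPy is available); the unrestricted variable bounds implement the pre-imputation condition (4.87).

```python
# A small sketch (mine) computing the least-core value of the long-jumping
# game (4.85): minimize eps subject to nu(S) - sum_{i in S} xi <= eps for all
# S in Sigma0 and x1 + x2 + x3 = 1, with the xi unrestricted in sign.
from scipy.optimize import linprog

coalitions = {(1,): 0, (2,): 0, (3,): 0,
              (1, 2): 11/16, (1, 3): 11/16, (2, 3): 11/8}

c = [0, 0, 0, 1]                     # variables (x1, x2, x3, eps); minimize eps
A_ub, b_ub = [], []
for S, vS in coalitions.items():
    row = [-1 if i + 1 in S else 0 for i in range(3)] + [-1]
    A_ub.append(row)                 # -sum_{i in S} xi - eps <= -nu(S)
    b_ub.append(-vS)
A_eq, b_eq = [[1, 1, 1, 0]], [1]     # group rationality (4.87)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(None, None)] * 4)
print(res.x)   # -> approximately (-1/8, 9/16, 9/16, 1/4): least core and eps
```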
4.7. The Shapley value

An alternative solution concept for cooperative games emerges when we revert from the core to the reasonable set and try to imagine the order in which the grand coalition might actually form. From §4.1, if Player i is the jth player to join the grand coalition of an n-player characteristic function game, and if T − {i} denotes the j − 1 players who joined prior to i, then ν(T) − ν(T − {i}) is a fair allocation for i. As we remarked in §4.1, however, we do not know the identity of T: half of all possible coalitions (including ∅) contain i (Exercise 4.15), and T could be any of them. So all we know about T is that T ∈ Πi, where Πi is defined by (4.14). If we are going to reduce the reasonable set to a single fair imputation, then mustn't we know more about T than just that?

It is possible, however, that there is no more about T that can be known, at least in advance—and any solution concept, if it is going to be at all useful, must certainly be known in advance of its application. Suppose, for example, that the n players have decided to meet in the town hall at 8:00 p.m. to bargain over fair
allocations, and that their order of arrival is regarded as the order in which the grand coalition formed. Then who can say what the order will be in advance? Although all will aim to be there at 8:00, some will be unexpectedly early and others unavoidably late, in ways that cannot be predicted with certainty. In other words, the order of arrival is a random variable. Therefore, if Player i is jth to arrive, then j is a random variable, and the j-person coalition of which he becomes the last member is also a random variable. Let us denote it by Yi. Then, as already agreed, ν(Yi) − ν(Yi − {i}) is a fair allocation for i; but because Yi is a random variable, which can take many values, so also is ν(Yi) − ν(Yi − {i}). From all these values, how do we obtain a single number that we can regard as a fair allocation for Player i? Perhaps the best thing to do is to take the expected value, denoted by E. Then, denoting Player i's fair allocation by xSi, we have

(4.98) xSi = E[ν(Yi) − ν(Yi − {i})] = Σ_{T∈Πi} {ν(T) − ν(T − {i})} · Prob(Yi = T),
where Prob(Yi = T) is the probability that the first j players to arrive are T, and the summation is over all coalitions containing i. If we assume that the n! possible orders of arrival are all equally likely, then the fair imputation

(4.99) xS = (xS1, xS2, . . . , xSn)

is known as the Shapley value. Suppose, for example, that n = 4, so that the 24 possible orders of arrival are as shown in Table 4.3, and that i = 1. Then, because Y1 = {1} in the first two columns of the table and all orders of arrival are equally likely, Prob(Y1 = {1}) = 6/24 = 1/4. Likewise, because Y1 = {1, 2, 3, 4} in the last two columns of the table, we have Prob(Y1 = {1, 2, 3, 4}) = 6/24 = 1/4 also. The first two entries in column 3 yield Prob(Y1 = {1, 2}) = 2/24 = 1/12, and, similarly, Prob(Y1 = T) = 1/12 for all two- and three-player coalitions in Π1 = {{1}, {1, 2}, {1, 3}, {1, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 2, 3, 4}}. So, from (4.98) and using ν({i}) = 0 = ν(∅), we have

(4.100) xS1 = (1/4) · 0 + (1/12)[ν({1, 2}) + ν({1, 3}) + ν({1, 4})]
            + (1/12)[ν({1, 2, 3}) − ν({2, 3}) + ν({1, 2, 4}) − ν({2, 4}) + ν({1, 3, 4}) − ν({3, 4})]
            + (1/4)[1 − ν({2, 3, 4})].

The other three components of the Shapley value for a four-player CFG now follow easily (Exercise 4.17), and similar expressions for the Shapley allocations in a two-player or three-player CFG are also readily found (Exercise 4.16). Note that xS always belongs to the reasonable set because (4.98) satisfies (4.15), by Exercise 4.18.

Table 4.3. Possible orders of formation of grand coalition of four players
1234   1342   2134   3142   2314   4213   2341   3421
1243   1423   2143   4123   3214   3412   2431   4231
1324   1432   3124   4132   2413   4312   3241   4321
Table 4.4. Three-player Shapley values and nucleoli compared to egalitarian imputation
Game              xS                    x∗                    xE
Car pool: d = 9   (1/126)(46, 43, 37)   (1/126)(50, 44, 32)   (1/126)(42, 42, 42)
Car pool: d = 1   (1/60)(28, 22, 10)    (1/60)(33, 21, 6)     (1/60)(20, 20, 20)
Log hauling       (1/60)(17, 20, 23)    (1/60)(14, 20, 26)    (1/60)(20, 20, 20)
Long-jumping      (1/96)(10, 43, 43)    (1/96)(0, 48, 48)     (1/96)(32, 32, 32)
Table 4.5. Four-player Shapley values and nucleoli compared to egalitarian imputation
Game              xS                        x∗                        xEi
Car pool: d = 2   (1/144)(43, 35, 25, 41)   (1/144)(45, 33, 21, 45)   36/144
Antique minding   (1/312)(80, 92, 92, 48)   (1/312)(78, 99, 99, 36)   78/312
We are now in a position to calculate xS for all games considered so far. Shapley values for three-player games are presented in Table 4.4, and those for four-player games in Table 4.5, the results being taken from Exercises 4.16–4.17. For each game, the Shapley value is compared with both the nucleolus and the egalitarian imputation

(4.101) xE = (1/n, 1/n, . . . , 1/n),

which distributes the benefits of cooperation uniformly among the players. We see at a glance that the Shapley value is, on the whole, a much more egalitarian imputation than the nucleolus, i.e., most allocations are at least as close to 1/n under the Shapley value as under the nucleolus. Exceptions are provided by Player 2's allocation in the three-person car pool with d = 1 and Player 1's allocation in the antique-minding game, but in each of these cases the Shapley and nucleolus allocations are both close to egalitarian. Note, in particular, that the Shapley value would pay young Jed $1.67 for his short long jumps in §4.6, whereas the nucleolus would not pay him anything. Intuitively, the Shapley value is more egalitarian than the nucleolus because the nucleolus gives priority to the most dissatisfied coalitions, whereas the Shapley value grants all coalitions equal status. The nucleolus derives from core-minded thinking, whereas the Shapley value derives from reasonable-set thinking, and so it is hardly surprising that the two solutions almost never coincide (except, of course, for two-player games—see Exercise 4.16). If the Shapley value belongs to the core (as in the car-pool games), then it does so more by accident than by design.16 That the core need not contain the Shapley value is immediate, because the Shapley value always exists, whereas the core does not. But even a nonempty core may not contain the Shapley value—witness the three-person car pool, where the Shapley value lies outside the core if d < 1/2 in (4.10); see Exercise 4.16. By contrast, if the core exists, then it always contains the nucleolus, and in any event, the nucleolus always belongs to the least rational core (though not necessarily the least core—witness the long-jumping game). On the other hand, we should not make too much of the
difference between core-minded thinking and reasonable-set thinking, because the nucleolus always belongs to the reasonable set—and so does the core, when the core exists (Exercise 4.23).

16 Except that the core of a convex game must contain its Shapley value; see [302].

Although the formulae derived in Exercises 4.16–4.17 are adequate for our examples, an explicit expression for the Shapley value of an n-player game is easily obtained. First, we define #(S) to be the number of elements in set S, so that, e.g., #(N) = n, #(Πi) = 2^{n−1}, #(Σ0) = 2^n − 2, #(Xκ) = 1, #(∅) = 0, etc. In terms of this notation, if Player i is the jth player to join the grand coalition, and if T denotes the first j players to join the grand coalition, then j = #(T); #(T) − 1 players have joined the grand coalition before Player i, and n − j = n − #(T) will join after him. Moreover, a permutation of the order of either the first #(T) − 1 players or the last n − #(T) players cannot alter the fact that T is the coalition that Player i completes. There are (#(T) − 1)! permutations of the first kind, and (n − #(T))! permutations of the second kind. Accordingly, (#(T) − 1)! · 1 · (n − #(T))! is the number of orders in which the players can join the grand coalition in such a way that Player i completes coalition T. But there are precisely n! orders of any kind in which the players can join the grand coalition. So, if all orders are equally likely, then

(4.102) Prob(Yi = T) = (#(T) − 1)! (n − #(T))! / n!

and, from (4.98), Player i's allocation under the Shapley value is

(4.103) xSi = (1/n!) Σ_{T∈Πi} (#(T) − 1)! (n − #(T))! {ν(T) − ν(T − {i})}
with Πi defined by (4.14). Despite the ease with which we obtained this result, however, the Shapley value of an n-player game is not in general an easy vector to calculate, because each component involves a summation over 2^{n−1} coalitions. Which is the fairer imputation—the nucleolus or the Shapley value? It remains an open question.17

17 For an interesting empirical analysis of opinions on these and other solution concepts for three-player games in three different nations, see [308]. Also, see the discussion of this question in [182, p. 4].
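Definition (4.98) also translates directly into code: average the marginal contributions over all n! orders of arrival. The sketch below is mine (the function name and data layout are arbitrary); it reproduces the long-jumping entry of Table 4.4.

```python
# A direct sketch (mine, not the book's) of the Shapley value via (4.98).
from fractions import Fraction as F
from itertools import permutations

def shapley(n, v):
    """v maps frozensets to nu-values; returns the vector xS of (4.99)."""
    x = [F(0)] * n
    orders = list(permutations(range(1, n + 1)))
    for order in orders:
        T = frozenset()
        for i in order:
            x[i - 1] += v[T | {i}] - v[T]    # marginal contribution of i
            T = T | {i}
    return [xi / len(orders) for xi in x]

# Long-jumping game (4.85), normalized characteristic function:
v = {frozenset(): F(0), frozenset({1}): F(0), frozenset({2}): F(0),
     frozenset({3}): F(0), frozenset({1, 2}): F(11, 16),
     frozenset({1, 3}): F(11, 16), frozenset({2, 3}): F(11, 8),
     frozenset({1, 2, 3}): F(1)}
print(shapley(3, v))   # -> [5/48, 43/96, 43/96], i.e., (1/96)(10, 43, 43)
```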
4.8. Simple games. The Shapley–Shubik index

If 0 and 1 are the only values assigned by a characteristic function ν—as, for example, in (4.47)—then the CFG defined by ν is said to be simple. In other words, a CFG is simple if either ν(S) = 0 or ν(S) = 1 for every coalition S (not just ∅ and N). Simple games arise most readily in the context of voting. Suppose, for example, that the mathematics department of a small liberal arts college is recruiting for a new professor. Candidates are interviewed one at a time, beginning with the one whose credentials on paper look most impressive. Immediately after each interview, the existing faculty meet to vote on the candidate, and if the vote is sufficiently favorable, then the candidate is offered a position—and no more candidates are interviewed unless, and until, he turns it down.
Now, in this department there has always been tension between pure and applied mathematicians, with a temptation for one group to exploit the other in the conflict over hiring. To guard against this temptation, it is enshrined in the department's constitution that a candidate is nominated if, and only if, he secures the vote of at least 50% of both the pure and the applied mathematicians on the existing faculty. Thus every new professor, whether pure or applied, has at least some measure of broad support throughout the existing faculty.

In this recruitment process, the purpose of cooperation among the faculty is to ensure that a candidate is hired. Every coalition of faculty is either strong enough to ensure the recruitment of its particular candidate, in which case it reaps the entire benefit of cooperation, or it is too weak to ensure that its candidate is hired, in which case it reaps no benefit at all. We can therefore partition the set of all coalitions into the set of winning coalitions—denoted by W—and the set of losing coalitions, which comprises all others. It is then natural to define the characteristic function ν by

(4.104) ν(S) = 0 if S ∉ W, 1 if S ∈ W.

Thus the recruitment process yields a simple game. Indeed the characteristic function of any simple game has the form of (4.104), where W is the set of coalitions to which ν assigns the value 1.

To see that the Shapley value of a simple game can be interpreted as an index of power, let us denote by Pi the set of coalitions in which Player i's vote is crucial for victory, i.e., the set of winning coalitions that would become losing if Player i were removed. In symbols:

(4.105) Pi = {S ∈ Πi | S ∈ W, S − {i} ∉ W}.

If S ∈ W and S − {i} ∈ W, then ν(S) = 1 = ν(S − {i}); whereas if S ∉ W and S − {i} ∉ W, then ν(S) = 0 = ν(S − {i}). Thus ν(S) − ν(S − {i}) = 0 unless S ∈ W and S − {i} ∉ W, or S ∈ Pi; in which case, ν(S) − ν(S − {i}) = 1 − 0 = 1. It follows from (4.98) that

(4.106) xSi = Σ_{T∈Pi} Prob(Yi = T),

where Yi is the coalition that Player i was last to join. The right-hand side of (4.106) is the probability that Player i's vote is crucial (to some coalition), and it is therefore a measure of Player i's voting power. In particular, if all possible orders of coalition formation are equally likely, then from (4.102) we have

(4.107) xSi = (1/n!) Σ_{T∈Pi} (#(T) − 1)! (n − #(T))!.
We refer to (4.107) as the Shapley–Shubik index. Suppose, for example, that there are five members of faculty, three of whom are pure mathematicians—say Players 1 to 3—and two of whom are applied (Players 4 and 5). Then at least two pure mathematicians and one applied mathematician must vote for a candidate to secure his nomination. So the set of winning coalitions W contains {1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}, {2, 3, 5}, {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5}, {1, 3, 4, 5}, {2, 3, 4, 5}, and N = {1, 2, 3, 4, 5}. Thus

P1 = {{1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {1, 2, 4, 5}, {1, 3, 4, 5}}

and

P4 = {{1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}}

from (4.105), with similar expressions for P2, P3, and P5; see Exercise 4.20. It follows from (4.107) that the Shapley–Shubik index is xS = (7/30, 7/30, 7/30, 3/20, 3/20). A pure mathematician has 7/30 of the voting power, whereas an applied mathematician has only 3/20. Collectively, applied mathematicians are 40% of the faculty but have only 30% of the voting power. Of course, this seems unfair, and so we had better hope—for their sake—that an applied mathematician is hired. Or had we? See Exercise 4.21.
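Equivalently, the index can be computed by testing every coalition for pivotal members and applying the weights of (4.102). A short sketch of my own, with the hiring rule hard-coded as stated above:

```python
# A sketch (mine) of the Shapley-Shubik index (4.107) for the hiring game:
# Players 1-3 are pure, Players 4-5 applied; a coalition wins if it contains
# at least two pure and at least one applied mathematician.
from fractions import Fraction as F
from math import factorial
from itertools import combinations

n = 5
def wins(S):
    return sum(1 for i in S if i <= 3) >= 2 and any(i >= 4 for i in S)

index = [F(0)] * n
for r in range(1, n + 1):
    for S in combinations(range(1, n + 1), r):
        for i in S:
            if wins(S) and not wins(tuple(j for j in S if j != i)):
                # weight (#(T)-1)! (n-#(T))! / n! from (4.102)
                index[i - 1] += F(factorial(r - 1) * factorial(n - r),
                                  factorial(n))
print(index)   # -> [7/30, 7/30, 7/30, 3/20, 3/20]
```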
4.9. Coalition formation: a nonstrategic model

Despite the centrality of coalitions to characteristic function games, CFGs have nothing to say about the question of which coalitions are expected to form within a group, not only because the answer is assumed to be the grand coalition, but also because CFGs ignore personal attributes—the outcome depends only on other factors, such as where players live (§§4.1–4.3), their truck sizes (§4.4), or their posted business hours (§4.5). It is useful to have a term for whatever individual attribute or combination of attributes most contributes to a favorable outcome in group interaction, and for this purpose we will use resource holding potential [191, p. 159], or RHP, for short.18 Given that individuals vary in RHP, we would expect RHP to influence coalition structure. For example, RHP in §4.6 would correspond to long-jumping ability, and to the extent that the results in Table 4.1 reflect this RHP, we would expect a coalition of Ned and Ted to form—if the grand coalition were not being enforced instead.

The question of how RHP within a triad can influence its coalition structure was broached by sociologists more than half a century ago. In this section, we first describe their theory, which is somewhat limited in scope. We then proceed to discuss some of the factors that they did not consider, with a view to revisiting coalition formation in §8.3 for a fuller exploration of the influence of RHP.

Accordingly, let X, Y, and Z be the RHPs, in nonincreasing order, of three individuals A, B, and C, respectively, and let S1 > S2 denote that RHP S1 is higher than RHP S2. Then given that X ≥ Y ≥ Z, four possible orderings within the triad are X = Y = Z, X = Y > Z, X > Y = Z, and X > Y > Z. In the last two cases, X may be either greater than, less than, or equal to Y + Z, so that in all there are eight types of triads.

18 For a discussion of RHP in the context of coalition formation, see [220, p. 201]. The term is widely used in the field of behavioral ecology (p. 217), not only in the context of coalition formation but also in that of contest behavior [45, p. 2]. Its appeal is twofold. First, it is generic; it can stand for a range of individual attributes, including initial reserves (§§6.2–6.3), fighting ability or strength (§2.6, §6.6, §6.9, §8.1, and §8.3) and various skills. Second, it is short, e.g., five letters shorter than strength. Despite that, I indulge a personal preference for using strength in place of RHP in Chapters 2, 6, and 8. Among other things, although RHP is five letters shorter than strength, "stronger" or "weaker" are nine or ten characters shorter than "having higher RHP" or "having lower RHP"—and so on balance, it seems to me, with RHP you can more than lose on the roundabouts what you gain on the swings!
Table 4.6. Coalition structure probability vectors predicted by Caplow’s theory. BC denotes a coalition of the (two weakest) individuals B and C, whose RHPs are Y and Z, respectively, against the (strongest) individual A whose RHP is X, and similarly for AC and AB; I denotes three unallied individuals, i.e., no true coalition; and G denotes the grand coalition of all three individuals. Note that in Cases 5 and 8, A is a dictator (Exercise 4.22). The final column indicates the labelling that was used by Caplow [58, p. 490], which is different from ours.
case   ordering      sign of      probabilities (ρ)              Caplow
                     X − Y − Z    I    BC    AC    AB    G       type
1      X = Y = Z     −            0    1/3   1/3   1/3   0       1
2      X = Y > Z     −            0    1/2   1/2   0     0       3
3      X > Y = Z     −            0    1     0     0     0       2
4      X > Y = Z     0            0    1/2   1/2   0     0       8
5      X > Y = Z     +            1    0     0     0     0       4
6      X > Y > Z     −            0    1/2   1/2   0     0       5
7      X > Y > Z     0            0    1/2   1/2   0     0       7
8      X > Y > Z     +            1    0     0     0     0       6
Moreover, there are five possible coalition structures, namely, B and C versus A, denoted by BC; A and C versus B, denoted by AC; A and B versus C, denoted by AB; three unallied individuals, denoted by I; and a grand coalition of all three, denoted by G. To predict which of these structures emerges in the long run, Caplow [57, 58] assumed that a stronger party (one with higher RHP) can control, or dominate, a weaker party (one with lower RHP) and seeks to do so; that all individuals prefer more control to less, and are indifferent between internal and external control (see below); that RHPs are additive; and that I is the initial configuration.

Consider, for example, case 6 of Table 4.6, where X > Y > Z and X < Y + Z. Individuals can be dominated either internally (i.e., within a coalition) or externally (i.e., as the target of a coalition that is stronger than the individual). In case 6 no individual can externally dominate the other two (X > Y + Z, Y > Z + X, and Z > X + Y are all false), and no individual can be internally dominated within I; hence, an alternative to I is preferred by each individual. In G, A would internally dominate both B and C (X > Y, X > Z); however, to prevent external domination by B and C acting jointly (Y + Z > X), A prefers either AB or AC to G, but is otherwise indifferent. Likewise, C is indifferent between AC and BC; either is preferable to G, because externally dominating one individual (B in the case of AC, because X + Z > Y; A in the case of BC, because Y + Z > X) while being internally controlled by the other is better than being internally controlled by both. But B has a distinct preference: BC gives B both internal control over C (because Y > Z) and external control over A (because Y + Z > X), whereas either AB or G leaves B dominated by A, despite control over C. Thus AB will not emerge, because A's preference for it is not reciprocated by B; however, C's preference for AC or BC is reciprocated by, respectively, A or B, so that either AC or BC is to be expected. Caplow effectively assumes that both are equally likely. Hence, if ρ
denotes a vector of probabilities for coalition structures I, BC, AC, AB, and G, respectively, then Caplow's predicted outcome for case 6 is ρ = (0, 1/2, 1/2, 0, 0). The other seven cases are decided similarly. A conclusion that Caplow drew from his theory is that surprisingly often weakness is strength.

Vinacke and Arkoff [346] devised a laboratory experiment to test Caplow's theory. Generally the results supported the theory, except in case 6 discussed above, where ρ = (1/45, 59/90, 2/9, 1/10, 0) was observed. Gamson [103, p. 379] interpreted this outcome as support for his own theory, which predicts ρ = (0, 1, 0, 0, 0) in case 6, and in every other case agrees with Caplow. Gamson's theory assumes that participants expect others to demand a payoff share proportional to resources contributed, and that participants maximize payoff by maximizing share, which favors the cheapest winning coalition if the total payoff is held constant. In case 6, because X > Y implies both Y + Z < X + Z and Z/(Y + Z) > Z/(X + Z), BC both is cheaper than AC and yields a higher payoff to C.

Chertkoff [63] argued that a revised version of Caplow's theory is superior to Gamson's, and Walker [353] later refined the theory in accordance with Chertkoff's key observation, essentially just that "it takes two to tango": formation of a coalition requires reciprocation. For reasons given above, in case 6, if A (being indifferent) offers a coalition to each of B and C with probability 1/2 and C likewise offers a coalition to each of A and B with probability 1/2, while B offers a coalition to A and C with probabilities 0 and 1, respectively, then the probabilities of coalition structures AB, AC, and BC are, respectively, 1/2 · 0 = 0, 1/2 · 1/2 = 1/4, and 1 · 1/2 = 1/2. With probability 1/4, however, A makes an offer to B, B to C, and C to A, so that no coalition forms. Then each individual "must decide between continuing the present negotiations or switching to the other player" [353, p. 410]; note that for B, AB is better than I, because it prevents external domination by C through AC. Because the situation is symmetrical, Walker assumes that BC, AC, and AB in this cyclic case are equally likely to form. Thus the total probability of BC is 1/2 + (1/3) · (1/4) = 7/12, and similarly for the other cases, yielding ρ = (0, 7/12, 1/3, 1/12, 0). This prediction is remarkably close to the frequencies observed in the lab [346]. Yet the theory is limited in scope, in large measure because it is purely ordinal and thus ignores the magnitudes of dominance benefits and RHP differences; for example, only the sign, but not the magnitude, of X − Y − Z in the third column of Table 4.6 is allowed to have an effect.

But those magnitudes make a difference. To see what a difference they make, it will be convenient to relabel A, B, and C as Players 1, 2, and 3, respectively, and suppose that X > Y > Z. Then, in the initial configuration I, Player 1 dominates Players 2 and 3 (externally), and Player 3 is dominated by Players 1 and 2 (again externally). Thus the triad forms what is known as a perfectly linear dominance hierarchy, with no individuals having equal or indeterminate rank: for i < j, Player i dominates Player j, with Player i dominating 3 − i others, 1 ≤ i ≤ 3. It is convenient to say that Players 1, 2, and 3 are the alpha, beta, and gamma individuals, respectively. Suppose that the benefits of occupying these roles are 1, b, and 0, respectively, where b < 1. Initially these roles are taken by Players 1, 2, and 3, respectively, but that could change through coalition formation. Assume that there is zero cost to bargaining, and that the cost c of fighting as one of a pair against the remaining
individual is independent of RHP. Also assume that Player 1 will always accept an offer from Player 2 to exclude Player 3, or—if Player 2 is unwilling to make a pact—from Player 3 to exclude Player 2. This assumption enables us to focus directly on the following question: when is it strategically stable for Players 2 and 3 to unite against Player 1?

First suppose that relative ranks within coalitions are fixed, with Player i outranking Player j for all i < j. Let Player 2 offer the coalition {2, 3} to Player 3; let p^{23}_1 denote the probability that this coalition defeats Player 1 in a contest over dominance, and let p^1_{23} = 1 − p^{23}_1 denote the complementary probability that the coalition is defeated. It is reasonable to assume that either party's probability of victory is determined by its RHP, relative to that of the opposing party; that is, the coalition {2, 3} is (Y + Z)/X times as likely to defeat Player 1 as Player 1 is to defeat the coalition, or

(4.108a) p^{23}_1 = (Y + Z)/(X + Y + Z), p^1_{23} = X/(X + Y + Z).
In terms of §1.6, (4.108a) defines a contest success function, with RHP in place of effort. If Player 3 accepts the offer from Player 2, then Players 1, 2, and 3 become the gamma, alpha, and beta individuals, respectively, with probability p^{23}_1; whereas Player 1 remains the alpha, Player 2 the beta, and Player 3 the gamma with probability p^1_{23}. Thus Player 3's reward from coalition {2, 3} is p^{23}_1 (b − c) + p^1_{23} (0 − c) = p^{23}_1 b − c. If Player 3 were instead to accept the offer of coalition {1, 3} from Player 1 (whose offer of {1, 2} to Player 2 is implicitly rejected by Player 2's offer of {2, 3} to Player 3), then Player 1 remains the alpha, Player 3 becomes the beta, and Player 2 the gamma with probability p^{13}_2; whereas Player 2 becomes the alpha, Player 1 the beta, and Player 3 the gamma with probability p^2_{13} = 1 − p^{13}_2. Here

(4.108b) p^{13}_2 = (X + Z)/(X + Y + Z), p^2_{13} = Y/(X + Y + Z)
by analogy with (4.108a). Thus the reward to Player 3 from coalition {1, 3} would be p^{13}_2 (b − c) + p^2_{13} (0 − c) = p^{13}_2 b − c. Player 3 should agree to coalition {2, 3} only if the reward to it exceeds the reward from coalition {1, 3}, that is, if p^{23}_1 b − c > p^{13}_2 b − c. But this condition is equivalent to p^{23}_1 > p^{13}_2, and so it never holds because (4.108) reduces it to Y > X.

Now suppose, by contrast, that relative ranks within coalitions are not fixed, and that Player 2 offers to share coalitional benefits evenly with Player 3; but continue to assume, for the sake of simplicity, that Player 1 makes no such offer. Then the rewards to Player 3 from coalitions {2, 3} and {1, 3} become

p^{23}_1 ((1/2){1 + b} − c) + p^1_{23} ((1/2) b − c) = (1/2)(p^{23}_1 + b) − c

and

p^{13}_2 (b − c) + p^2_{13} (0 − c) = p^{13}_2 b − c,

respectively, from which Player 3 should form a coalition with Player 2 if (1/2)(p^{23}_1 + b) − c > p^{13}_2 b − c or b < (Y + Z)/(X − Y + Z), on using (4.108). Similarly, Player 2 should form
Figure 4.10. Contour maps of (a) (s + 1)/(S − 1 + s) and (b) (s + 1)/(Ss) for S ≥ s ≥ 1 in the s-S plane, where S and s are the highest and second highest RHPs, relative to the lowest. Difference in height between contours is fixed at 0.1; φ = (1/2)(1 + √5) ≈ 1.618 in (b); and the unshaded regions indicate where coalition {2, 3} is guaranteed to be strategically stable.
a coalition with Player 3 if b < (Y + Z)/(X + Y − Z). The second of these inequalities implies the first (because X > Y > Z). Hence {2, 3} is strategically stable if

(4.109) b < (Y + Z)/(X + Y − Z).

Now let S = X/Z and s = Y/Z denote the highest and second highest RHPs relative to the lowest, so that S ≥ s > 1, and S is a measure of variance in RHP. Then (4.109) becomes b < (s + 1)/(S − 1 + s). A contour map of (s + 1)/(S − 1 + s) for S ≥ s ≥ 1 in the s-S plane is shown in Figure 4.10(a), from which it is clear that (s + 1)/(S − 1 + s) > 1 if S < 2. Moreover, if Player 2's offer to Player 3 is not 1/2 but instead the proportion Z/(Y + Z), then (4.109) is replaced by

(4.110) b < (s + 1)/(Ss),

which is positive for S > s > 1.
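The two stability bounds are simple enough to tabulate directly. A small sketch of my own follows; the sample points are arbitrary.

```python
# A quick sketch (mine) of the stability bounds (4.109) and (4.110) in the
# scaled variables S = X/Z, s = Y/Z: coalition {2,3} is stable whenever the
# beta benefit b lies below the relevant threshold.
def bound_even_split(S, s):
    return (s + 1) / (S - 1 + s)      # (4.109): Player 2 offers 1/2

def bound_proportional(S, s):
    return (s + 1) / (S * s)          # (4.110): Player 2 offers Z/(Y+Z)

for (S, s) in [(1.5, 1.2), (2.0, 1.5), (3.0, 1.6), (3.0, 2.5)]:
    print(S, s, round(bound_even_split(S, s), 3),
          round(bound_proportional(S, s), 3))
# Since b < 1 by assumption, any threshold >= 1 guarantees stability; note
# that the even-split bound exceeds 1 whenever S < 2, as Figure 4.10(a) shows.
```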
4.10. Commentary

In this chapter we studied characteristic function games (§4.1) and a variety of CFG solution concepts, namely, imputations and the reasonable set (§4.1); excess, the core, the rational ε-core and least rational core (§4.2); the nucleolus (§4.5); superadditivity, proper games, pre-imputations and the ε-core (§4.6); and the Shapley value (§4.7). We applied these ideas to a number of bargaining problems (e.g., sharing the costs of a car pool among three (§§4.1–4.2) or four (§4.3) individuals, and sharing duties in an antique dealership (§4.5)), and we showed that a realistic CFG can be either coreless (§4.4) or improper (§4.6). For all of these games we compared the nucleolus and Shapley value with the egalitarian imputation (§4.7). For applications of CFGs to water resources, see [204, pp. 127–130]. For a comparison of the nucleolus and Shapley value in the assessment of airport landing fees, see [257, §XI.4]. For applications of the Shapley value to epidemiology and biodiversity, see [168] and [123], respectively.

In §4.8 we briefly considered simple games. We adapted the Shapley value for use as an index of power, and we applied it to power sharing in an academic department. This Shapley–Shubik index is not the only index of power, however, a popular alternative being the Banzhaf–Coleman index; see [257, Chapter X], where both indices are applied to voting in US presidential elections.

In §4.9, which is based on [220], we broached the topic of coalition formation, which CFGs do not address. What limits the applicability of cooperative game theory to coalition formation is not so much its concern with a distribution of benefits to individuals—which, as we saw in §4.8, in simple games can be interpreted as an index of power—but rather that this index of power is associated with a particular coalition structure, namely, the grand coalition. If instead an index of power can be associated with each possible coalition structure, then a comparison of these indices can be used to identify transitions between coalition structures. In this regard, Shenoy [303, pp. 182–191] has devised a Caplow power index, which is the vector of proportions of all dominances a player achieves in a given coalition structure, and has used it to recover Caplow's predictions in Table 4.6. Shenoy's approach, discussed elsewhere [220, p. 197], is quite general, in that it identifies the undominated coalition structures associated with any power index. Nevertheless, as discussed in §4.9, there are aspects of coalition formation that this theory cannot address. Accordingly, we revisit the topic of coalition formation in §8.3.

For more on simple games, see [332]. For more on cooperative games in general, see [27]. For fuzzy cooperative games, see [180].
Exercises 4

1. Show from (4.15) that the reasonable set is the shaded hexagon in Figure 4.3 for the car pool in §4.1 with d = 9, and the partially shaded hexagon in Figure 4.5 for the car pool with d = 1.

2. Find the least rational core for the three-person car pool when
(a) d = 9, i.e., deduce (4.28) from (4.27).
(b) d = 6.
(c) d = 1.

3. Find the least rational core for the four-person car pool when d = 1.

4. Find the least rational core for the four-person car pool when d = 8.

5. Find the least rational core for the four-person car pool when d = 18.

6. (a) For a two-player CFG, verify that the set of imputations X is a one-dimensional line segment embedded in two-dimensional space. Sketch a one-dimensional representation of X, and verify that C+(0) = X. What is the characteristic function? What is the least rational core?
(b) Find the least rational core for the CFG defined by (4.47).

7. Calculate the rational ε-core of the log-hauling game and find its least rational core, i.e., verify (4.50)–(4.51).

8. (a) Show that the game defined by (4.52a) is convex only if max(b + c, c + a, a + b) ≤ 1, and hence that the game defined by (4.10) is never convex.
(b) Show that the game defined by (4.52a) has a core if and only if a + b + c ≤ 2, and hence that the game defined by (4.10) always has a core, whereas that in §4.4 is coreless.

9. Jed, Ned, and Ted are antique dealers who conduct their businesses in separate but adjoining rooms of a common premises. Jed's advertised hours are from 12:00 noon until 4:00 p.m., Ned's hours are from 9:00 a.m. until 3:00 p.m., and Ted's from 1:00 p.m. until 5:00 p.m. Because the dealers have other jobs and the store is never so busy that one individual could not take care of everyone's customers, it is in the dealers' interests to pool their time in minding the store: there is no need for two people between noon and 1:00 p.m. or between 3:00 p.m. and 4:00 p.m., or for three people between 1:00 p.m. and 3:00 p.m. Thus Jed can arrive later than noon or leave earlier than 4:00 p.m., Ted can arrive later than 1:00 p.m., and Ned can leave earlier than 3:00 p.m. But how much later or earlier? What is a fair allocation of store-minding duty for each of the dealers?

10. Verify (4.65)–(4.71), and hence use (4.71b) to verify that Figure 4.9 is the contour map of the function φ1 defined by (4.71a).

11. Consider the four-player CFG whose characteristic function is defined by ν(S) = 1/2 if #(S) = 2 or #(S) = 3, except for ν({1, 3}) = 1/4 and ν({2, 4}) = 0; of course, ν(∅) = 0, ν(N) = 1, and ν({i}) = 0 for i ∈ N. Show that the least rational core
is the line segment X1 = {x | x = (1/8)(1 + 3t, 3 − 3t, 1 + 3t, 3 − 3t), 0 ≤ t ≤ 1}, but that its midpoint (5/16, 3/16, 5/16, 3/16) is not the nucleolus.20

20 This exercise is from [183, p. 335].
12. Show that every improper three-player CFG is coreless.

13. Verify (4.96)–(4.97).

14. How should young Jed, young Ned, and young Ted have split the proceeds from the long-jump competition described in §4.6 if Jed's best two jumps had been 18 and 16 feet, instead of 15 (but Ned and Ted's jumps had been the same)?

15. Show that, in any n-player CFG, each player belongs to precisely 2^{n−1} coalitions.

16. (a) Find the Shapley value of an arbitrary two-player CFG.
(b) Find the Shapley value of an arbitrary three-player CFG.
(c) Hence verify Table 4.4, and show that the Shapley value of the three-person car pool lies outside the core if d < 1/2 in (4.10).

17. (a) Verify (4.100), and hence obtain a complete expression for the Shapley value of an arbitrary four-player CFG.
(b) Verify Table 4.5.

18. Prove that (4.98) satisfies (4.15).

19. A characteristic function game is said to be symmetric if every coalition's bargaining strength depends only on the number of players in it, i.e., if there exists a function f from the set of real numbers to itself such that ν(T) = f(#(T)) for all coalitions T. Prove that if a symmetric CFG has a core, then its core must contain the egalitarian imputation xE defined by (4.101).

20. (a) Find P1, P2, P3, P4, and P5 for the voting game in §4.8.
(b) Hence obtain its Shapley–Shubik index.

21. In the example in §4.8 of a mathematics department, there are currently three pure and two applied mathematicians. Which would increase an applied mathematician's voting power more (in terms of hiring the seventh member of the faculty)—hiring an applied mathematician or hiring a pure mathematician? Elucidate.

22. In any simple game, Player i is a dummy if Pi = ∅ and a dictator if Pi = Πi.
(a) Can there be more than one dummy? Either prove that this is impossible, or produce an example of a simple game with more than one dummy.
(b) Prove that there cannot be more than one dictator. Produce an example of a game with a dictator.

23. Prove that the core of a characteristic function game (if it exists) must be a subset of the reasonable set.

24. Consider the three-player characteristic function game defined by (4.52a)–(4.52b). Let x∗ denote its nucleolus, and let xT and xE be defined, respectively, by (4.53) and (4.101) with n = 3.
(a) Show that (4.52b) implies superadditivity.
(b) Show that m ≤ 1/3 implies x∗ = xE.
(c) Show that (4.54) implies x∗ = xT.
(d) Show that 4 is the least value of d for which x∗ = xT in the three-player car pool defined by (4.10).

25. Verify Table 4.6.
Chapter 5
Cooperation and the Prisoner’s Dilemma
To team, or not to team? That is the question. Chapters 3 and 4 have already shown us that circumstances abound in which players do better by cooperating than by competing. Indeed if ν denotes a normalized characteristic function, then Player i has an incentive to cooperate with coalition S − {i} whenever ν(S) > 0. But an incentive to cooperate does not imply cooperation, for we have also seen in §3.4 that if one of two players is committed to cooperation, then it may be rational for the other to cheat and play noncooperatively. How, in such circumstances, is cooperation achieved?

To reduce this question to its barest essentials, we focus on the symmetric, two-player, noncooperative game with two pure strategies C (strategy 1) and D (strategy 2) whose payoff matrix

(5.1) A = ( R  S
            T  P )

satisfies

(5.2) R > (1/2) max(2P, S + T).

The combined payoffs associated with the four possible strategy combinations CC, CD, DC, and DD are, respectively, 2R, S + T, T + S, and 2P. From (5.2), the best combined payoff 2R can be achieved only if both individuals select C. We therefore say that C is a cooperative strategy and that D (for defect) is a noncooperative strategy. We will refer to this game as the cooperator's dilemma [217].1 If it is also true that

(5.3) T > R, P > S,

1 Not every problem of cooperation can be reduced to its barest essentials in this way. CD and DC will yield the best combined payoff if cooperation requires players to take complementary actions [78], but the cooperator's dilemma has broad applicability.
Table 5.1. Payoff matrix for the classic prisoner’s dilemma
                 remain silent   implicate
remain silent    Short           Long
implicate        Very short      Medium
then the cooperator's dilemma reduces to the prisoner's dilemma, with which we are familiar from the exercises.2 We discovered in Exercise 1.23 that DD is the only Nash-equilibrium strategy combination (pure or mixed), in Exercise 2.6 that D is a strong ESS, and in Exercise 3.9 that CC is the Nash bargaining solution. We can consolidate these findings by saying that, because T > R and P > S, D is the unique best reply to both C and D; in other words, D is a (strongly) dominant strategy. Thus, by symmetry, it is rational for each player to select strategy D, whence each obtains payoff P. If each were to cooperate by selecting C, however, then each would obtain a higher payoff, namely, R. We have discovered the paradox of the prisoner's dilemma: although mutual cooperation would yield a higher reward, mutual defection is rational (but only because there exists no mechanism for enforcing cooperation).

To see why this game is called the prisoner's dilemma, imagine that two prisoners in solitary confinement are suspected of some heinous crime—for which, however, there is no hard evidence. A confession is therefore needed, and the police attempt to persuade each prisoner to implicate the other. Each prisoner is told that her sentence will depend on whether she remains silent or implicates the other prisoner, according to the payoff matrix in Table 5.1. Let us suppose that long, medium, short, and very short sentences are, respectively, d years, c years, b years, and a years, so that d > c > b > a, and that a long sentence is at least twice as long as a short sentence, or d > 2b. Then because the prisoners would like to spend as little time in jail as possible, we have R = −b, S = −d, T = −a, and P = −c, so that (5.3) is satisfied, and we also have b < (1/2)(a + d), so that (5.2) is also satisfied. Although the prisoners would both prefer a short sentence, they will settle for a medium sentence, because neither can be sure that the other will cooperate: if one player were to remain silent, in the hope that the other would also remain silent, then the second player could cheat—by implicating the first player—and thus obtain the shortest sentence of all. Therefore, unable to guarantee the other's silence, each player will implicate the other.

In this chapter, the prisoner's dilemma and related games will enable us to investigate rationales for cooperation. For the sake of simplicity, we will consider only discrete population games—games restricted to pure strategies. We begin with a concrete example of the prisoner's dilemma. We then study ways of escape from its paradox.
2 The game descends from Flood, Dresher, and Tucker [324]. In defining the prisoner’s dilemma, some authors—e.g., Boyd and Richerson [40]—require in addition to (5.2) that 2P < S + T . This additional requirement is satisfied, e.g., in Exercise 1.23 and §5.1, but not necessarily in §5.7.
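The inequalities (5.2) and (5.3) are easy to check for concrete sentence lengths. In the sketch below, which is mine and uses hypothetical numbers a, b, c, d, the assertions encode the two conditions.

```python
# A tiny sketch (sentence lengths hypothetical) checking that jail terms
# d > c > b > a with d > 2b produce a prisoner's dilemma.
a, b, c, d = 1, 3, 5, 8                      # very short, short, medium, long
R, S, T, P = -b, -d, -a, -c                  # payoffs as negated sentences

assert T > R and P > S                       # (5.3)
assert R > max(2 * P, S + T) / 2             # (5.2)
# D dominates: against C, T > R; against D, P > S. So each implicates, and
# both serve the medium sentence c even though mutual silence would give b.
print("payoff matrix:", [[R, S], [T, P]])
```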
5.1. A game of foraging among oviposition sites

To exemplify the prisoner's dilemma, let the players be a pair of female insects foraging over a patch of N oviposition sites, i.e., sites at which to lay eggs. They forage randomly, and their searches are independent, so that all sites are equally likely to be visited next. Each site has the potential to support one or two eggs, and each insect begins to forage with a plentiful supply (N eggs or more). If one egg is laid at a site, then the probability that it will survive to maturity is r1. If two eggs are laid, then the corresponding survival probability for each is r2 (and if three or more were laid, then the survival probability for each would be zero). Thus the expected number of offspring from a site is r1 or 2r2, according to whether one or two eggs are laid, and we assume that r1 > 2r2; that is, an offspring that monopolizes its site is more than twice as likely to survive as either of two offspring that share a site and thus compete for the same resource.3

We assume that the insects arrive at the patch simultaneously, and we measure time discretely from the moment of their arrival. Let the duration of the game be n units of (discrete) time, or periods. In each period, let λ be the probability that an insect survives and remains on the patch, hence 1 − λ the probability that it leaves the patch or dies; and if an insect has survived on the patch, let ε be the probability that it finds a site, and 1 − ε the probability that it finds no site (hence zero the probability that it finds more than one site). We will assume that an insect oviposits only during periods it survives on the patch, and that it never oviposits more than once per visit to a site. Henceforward, we will use "survival" to mean surviving and remaining on the patch.

In each period, if an insect finds a site where fewer than two eggs have been laid, then it can behave either cooperatively or noncooperatively. A cooperative insect will oviposit only if a site is empty, but a noncooperative insect will lay a second egg at sites where an egg has been laid by the other insect; we assume that insects recognize their eggs. The rationality behind such noncooperation is provided by the inequality r1 > 2r2: a second egg at the other insect's site yields a payoff of r2, which is positive, whereas a second egg at the insect's own site yields a payoff of 2r2 − r1, which is negative. If an insect behaves cooperatively during every period, then we shall say that it plays strategy C; whereas, if the insect behaves noncooperatively during every period, then we shall say that it plays strategy D. No other strategies for oviposition will be considered. (In particular, as stated at the outset, we do not consider mixed strategies.)

For the sake of simplicity, we now further assume that n = 1, so that the length of the game is a single period. The corresponding event tree4 is shown in

3 Our many assumptions in this section deliberately simplify what would happen in nature, as any model must; see the remarks in the preface (p. xi) and the discussion at the beginning of Chapter 9.
4 Despite resembling Figure 1.17 in some respects, this event tree is not the extensive form of the game: it has no decision points for either player, and contains no payoffs to Player 2. To convert this tree into an extensive form, it would be necessary to regard Nature as a third player with random moves at the internal vertices K1, . . . , K4 and to expand what would then be a subgame at L5 into an explicit subtree with its root at L5. This would then become a decision point for Player 1, with edges for C and D each leading to a corresponding decision point for Player 2, each with two edges leading to leaves labelled by the entries of the payoff matrix (5.1) and its transpose. At the other leaves it would be necessary to add payoffs to Player 2. Moreover, the unshaded subtree with root K2 would have to be appended at both L1 and L2 to include payoffs to Player 2 (which are not all zero, unlike those of Player 1).
178
5. Cooperation and the Prisoner’s Dilemma
O λ 1 −λ
L1 0
K1 ε 1 −ε
L2 0
L3 r1
Player 1
K2 1 −λ
L4 r1
λ
Player 2 K3
1 −ε
L5 a
ε K4 1/N
1 − 1/N L6 r1
Figure 5.1. The event tree for the single-period foraging game, showing payoffs to Player 1. Strictly, this tree is incomplete, since L1 and L2 are internal vertices, not leaves. To complete the tree, we would have to append the unshaded subtree with root K2 at both L1 and L2 . Because the payoff to Player 1 would still be zero, however, there is nothing to be gained from doing so.
Figure 5.1 in the standard way, upside down with the root on top. Each branch of this tree corresponds to a conditional event, i.e., an event that can happen only if the preceding (conditional) event has happened; and the number on the branch is the corresponding conditional probability, i.e., the probability that the event happens, given that the event corresponding to the previous branch has already happened. Each path through the tree from the root to a leaf represents a sequence of conditional events; and the number in a rectangle at the end of this path is the corresponding payoff to Player 1. To obtain the probability of this payoff, we simply multiply together all conditional probabilities along the path. Note that branches in the shaded region represent events that can happen to Player 1, whereas the remaining branches represent events in the life of Player 2. Let us now begin at the root. The first right-hand branch, from O to the vertex at K1 , corresponds to the event that Player 1 survives the period, which happens with probability λ. The next right-hand branch, from K1 to the vertex at K2 , corresponds to the event that Player 1 locates a site—conditional, of course, upon survival. Conditional upon surviving and locating a site, Player 1 is assured of a nonzero payoff; however, the size of that payoff depends on Player 2. The left-hand branch from K2 to the leaf at L3 corresponds to the event that Player 2 does not survive the period; the payoff to Player 1 is then r1 . The right-hand branch from K2 to the vertex at K3 plus the left-hand branch from K3 to the leaf at L4 represent the event that Player 2 survives the period but fails to find a site (conditional, of course, upon Player 1 surviving and locating a site), in which case the payoff to Player 1 is still r1 . Continuing in this manner, the right-hand path from K3 through K4 to the leaf at L6 corresponds to the event that Player 2 locates
5.1. A game of foraging among oviposition sites
179
one of the N − 1 sites that Player 1 did not locate—with (conditional) probability (N − 1)/N , because all sites are equally likely to be found. Again, the payoff to Player 1 is r1 . If both players locate the same site, however—which corresponds to the path from O to the leaf at L5 , then the payoff to Player 1 depends upon the players’ strategies. Let us denote it by a(u1 , v1 ), where u1 is Player 1’s strategy, and v1 is Player 2’s. Finally, the left-hand branches from O and K1 represent the events that Player 1 either does not survive or survives without finding a site; in either case, the payoff is zero. (Note that, strictly, the tree in Figure 5.1 is incomplete: L1 and L2 are internal vertices, not leaves, and to complete the tree we would have to append at each vertex the subtree with root K2 . Because the payoff to Player 1 would still be zero, however, there is nothing to be gained from doing so.) We can now compute the expected value of Player 1’s payoff in terms of a. All we need do is to multiply each payoff by the product of all conditional probabilities on branches leading to that payoff, then add. Thus Player 1’s reward is (5.4) 0 · (1 − λ) + 0 · λ · (1 − ) + r1 · λ(1 − λ) + r1 · λ2 (1 − ) 2 2 + a · λN + r1 · λ2 2 1 − N1 = λ 1 −
λ N
r1 +
λa N
.
If our insects locate the same site, and if both are C-strategists—that is, if u1 = C = v1 —then the first to arrive will oviposit, whereas the second will not; in other words, the first to arrive will obtain r1 , whereas the second to arrive will obtain zero. Let us assume that, if they do locate the same site, then they are equally likely to locate it first. Then Player 1’s (conditional) expected payoff is (5.5a)
a(C, C) =
1 2
· r1 +
1 2
·0 =
1 2 r1 .
If both are D-strategists—that is, if u1 = D = v1 —then both will oviposit, regardless of who is first to arrive, and each will obtain payoff r2 . Thus (5.5b)
a(D, D) = r2 .
If, on the other hand, Player 1 is a C-strategist but Player 2 a D-strategist, then Player 1 will oviposit only if it is the first to arrive, in which case it will obtain only r2 , because Player 2 will oviposit behind it. Thus (5.5c)
a(C, D) =
1 2
· r2 +
1 2
·0 =
1 2 r2 .
Finally, if Player 1 is a D-strategist but Player 2 is a C-strategist, then Player 1 will oviposit even if Player 2 arrives first, in which case Player 1 will obtain r2 ; whereas if Player 1 arrives first, then Player 2 will fail to oviposit, and so Player 1’s payoff will be r1 . Accordingly, (5.5d)
a(D, C) =
1 2
· r1 +
1 2
· r2 =
1 2 (r1
+ r2 ).
Substitution from (5.5) into (5.4) yields the payoff matrix % & % & λ r1 λ λ r2 r1 · 2 + 1 − λ R S 1 N · 2 + 1 − N r N N (5.6) = λ λ r1 +r2 . λ λ λ + 1 − N r1 N · r2 + 1 − N r1 T P N · 2 Because r1 > 2r2 , we have T > R > P > S and 2R > S + T , so that (5.2) and (5.3) are both satisfied. In other words, the game is a prisoner’s dilemma.
180
5. Cooperation and the Prisoner’s Dilemma
5.2. Tit for tat: champion reciprocal strategy We now proceed to investigate possible escapes from the paradox of the prisoner’s dilemma. An important idea in this regard is that of reciprocity: one good turn deserves another—and one bad turn deserves another, too. To be more precise, reciprocity in the sense of reciprocal altruism [11, 337, 338] means that one good turn now deserves another later, and similarly for bad turns. Thus, reciprocity is an inherently dynamic concept: it is impossible to reciprocate if the game is played only once, and so we shall assume that it is played repeatedly. Indeed it is convenient to define a brand new game, of which a single play consists of all the plays of the prisoner’s dilemma that an individual makes within some specified interval of time. We call this brand new game the iterated prisoner’s dilemma (IPD), and whenever a prisoner’s dilemma is embedded in an IPD, we shall refer to each play of the prisoner’s dilemma as a move of the IPD. An individual who cooperates on the first move of an IPD might well cooperate at all subsequent times. But whereas her strategy in the prisoner’s dilemma would be C, as defined above, her strategy in the IPD would be (5.7)
ALLC = (C, C, C, C, . . . , C),
for “at all times Cooperate”. More generally, in the IPD, pure strategy u—and we consider only pure strategies—consists of a sequence u1 , u2 , . . . , un of prisoner’s dilemma strategies, where prisoner’s dilemma strategy uk is used on move k of the IPD and n is the number of moves, or, regarding the sequence as a vector, (5.8)
u = (u1 , u2 , . . . , un ).
Although C and D are the only values that uk can take, there would be 2n values for u to take even if strategies had to be unconditional; (5.7) is one such value, and (5.9)
ALLD = (D, D, D, D, . . . , D),
for “at all times Defect”, is another. But IPD strategies need not be unconditional: they can be contingent upon opponents’ strategies—a prerequisite for reciprocity. The paragon of such a strategy is “tit for tat”, or T F T , which cooperates on the first move and subsequently plays whatever prisoner’s dilemma strategy an opponent used on the previous move. Thus, if the prisoner’s dilemma strategies adopted by a player’s opponent are denoted by v1 , v2 , . . . , then (5.10)
T F T = (C, v1 , v2 , . . . , vn−1 )
when n is finite. We shall now suppose, however, that n may be infinite (though with vanishingly small probability). Let the number of moves in an IPD be denoted by the (integer-valued) random variable M , so that Prob(M ≥ 1) = 1. Conditional upon there being a kth move in the IPD, let φ(uk , vk ) be that move’s payoff to prisoner’s dilemma strategy uk against prisoner’s dilemma strategy vk ; thus, from (5.1), φ(C, C) = R, φ(C, D) = S, φ(D, C) = T and φ(D, D) = P . Then the actual payoff to prisoner’s dilemma strategy uk against prisoner’s dilemma strategy vk from move k of the IPD is the random variable
φ(uk , vk ) if M ≥ k, (5.11) Fk (uk , vk ) = 0 if M < k,
5.2. Tit for tat: champion reciprocal strategy
181
and the reward from move k of the game is (5.12)
E [Fk (uk , vk )] = φ(uk , vk ) · Prob(M ≥ k) + 0 · Prob(M < k) = φ(uk , vk )Prob(M ≥ k),
where E denotes expected value. Thus, if f (u, v) is the reward to strategy u against strategy v from (all moves of) the IPD, then (5.13)
f (u, v) =
∞ #
E [Fk (uk , vk )] =
k=1
∞ #
φ(uk , vk ) Prob(M ≥ k).
k=1
Let us now suppose that there are only three (pure) strategies, namely, ALLC, ALLD, and T F T , where ALLC is strategy 1, ALLD is strategy 2, and T F T is strategy 3. Then a 3 × 3 payoff matrix A for the IPD can be computed directly from (5.13), with a12 = f (ALLC, ALLD), a23 = f (ALLD, T F T ), and so on. For example, (5.14)
a23 = φ(D, C) Prob(M ≥ 1) +
∞ #
φ(D, D) Prob(M ≥ k)
k=2
= T +P
∞ #
Prob(M ≥ k).
k=2
But (Exercise 5.1) (5.15)
E[M ] =
∞ #
k · Prob(M = k) =
k=1
∞ #
Prob(M ≥ k),
k=1
assuming, of course, that both series converge. Hence, if we define (5.16)
μ = E[M ],
so that μ ≥ 1, then (5.17)
∞ #
Prob(M ≥ k) = μ − 1.
k=2
Thus, from (5.14), a23 = T + P (μ − 1). Similarly, a32 = S + P (μ − 1). Furthermore, it is clear that the expected payoff to ALLC against ALLD is just φ(C, D) times the expected number of moves, or a12 = Sμ, and it is also clear that a11 = a13 = a31 = a33 = Rμ, because T F T always cooperates with ALLC. So the payoff matrix is ⎡ ⎤ Rμ Sμ Rμ Pμ T + P (μ − 1)⎦ . (5.18) A = ⎣T μ Rμ S + P (μ − 1) Rμ We see on inspection that each entry of the third row is at least as great as the corresponding entry of the first row, and in one case it is strictly greater, provided only that μ > 1; in other words, strategy 1 is (weakly) dominated by strategy 3. Let us therefore remove strategy 1 from the game (although see Exercise 5.24), and consider instead the IPD with only two (pure) strategies, ALLD and T F T , with
182
5. Cooperation and the Prisoner’s Dilemma
ALLD redefined as strategy 1 and T F T as strategy 2. The payoff matrix for this reduced IPD is Pμ T + P (μ − 1) (5.19) A = . S + P (μ − 1) Rμ Let us define μc =
(5.20)
T −P . R−P
If μ < μc , then each entry of the first row in (5.19) exceeds the corresponding entry of the second row, so that ALLD is a dominant strategy, and hence also a strong ESS (Exercise 2.16). If μ > μc , however, then both a11 > a21 and a22 > a12 ; whence, from (2.21), both ALLD and T F T are evolutionarily stable strategies. Thus, initial conditions will determine which strategy emerges as the winning strategy in a large population, some of whom adopt ALLD, the remainder of whom adopt T F T . According to the dynamics of §2.8, for example, T F T will emerge as victorious if its initial frequency exceeds the critical value (5.21)
γ =
P −S P − S + (R − P )(μ − μc )
(on substituting from (5.19) into (2.105)). The greater the amount by which μ exceeds μc , the easier it is to satisfy (5.21). Thus, maintenance of cooperation via reciprocity—specifically, via TFT—requires two things: first, that the average number of interactions be sufficiently high, or equivalently that the probability of further interaction be sufficiently high; and second, that the initial proportion of TFT strategists be sufficiently large. Until now, we have allowed M to be any (integer-valued) random variable with μ > 1. Further progress is difficult, however, unless we specify the distribution of M . Accordingly, we assume henceforth that there is constant probability w of further interaction, which implies that M has a geometric distribution defined by (5.22)
Prob(M ≥ k) = wk−1 , k ≥ 1
(Exercise 5.2). Then (5.23)
μ=
1 , 1−w
and the condition for T F T to be evolutionarily stable, μ > μc , becomes (5.24)
w>
T −R T −P
by (5.20). Axelrod [10, pp. 40–54] used the payoff matrix (5.1) with (5.25)
P = 1,
R = 3,
S = 0,
T = 5
for a computer tournament. The game that was played in the second round of this tournament, which had 63 contestants, was the IPD. The strategies were T F T , RN DM (defined in Exercise 5.10), and 61 other strategies submitted by contestants in six countries and from a variety of academic disciplines. The result? T F T was easily the most successful strategy—a champion reciprocal strategy. Axelrod chose w so that the median number of moves in a game would be 200, i.e., so that Prob(M ≤ 200) = 12 , or Prob(M ≥ 201) = 12 . So, from (5.22),
5.3. Other reciprocal strategies
183
Axelrod’s value of w is given by w200 = 12 . We will refer to the IPD defined by (5.25) and w = (0.5)0.005 ≈ 0.99654 as Axelrod’s prototype, and from time to time we will use his payoffs for illustration. Note, however, that they are purely arbitrary.
5.3. Other reciprocal strategies Tit for tat is a nice, forgiving, and provocable strategy based on reciprocity. It is nice because it always begins by cooperating. It is forgiving because if an opponent— after numerous defections—suddenly begins to cooperate, then T F T will cooperate on the following move. And T F T is provocable because it always responds to a defection with a defection. By contrast, ALLD is a nasty strategy, that is, a strategy that always defects on the first move. For that matter, because ALLD always defects, it is the meanest strategy imaginable. T F T is not the only example of a nice, forgiving, and provocable strategy based on reciprocity. A more forgiving nice strategy is T F 2T , or tit for two tats, which always cooperates on the first two moves but then plays T F T , that is, (5.26)
T F 2T = (C, C, v2 , v3 , . . .),
where v denotes the other player’s strategy. In a sense, T F 2T is one degree more forgiving than T F T . A homogeneous T F T population is indistinguishable from a homogeneous T F 2T population—or any mixture of the two strategies—because both populations always cooperate. But the strategies are distinguishable in the presence of any nasty strategies because T F 2T will forgive an initial defection, whereas T F T will punish it on the second move. In particular, if T F T were to play against ST CO = (D, C, C, . . .), for “slow to cooperate”, then mutual cooperation would be established only at the third move; whereas if T F 2T were to play against ST CO, then mutual cooperation would be established at the second move. The question therefore arises, is T F T too mean? See Exercise 5.3. In a sense, ST CO is the least exploitative of all the nasty strategies, and ALLD is the most exploitative. In between lies ST F T , for “suspicious tit for tat”, which always defects on the first move, but thereafter plays T F T ; i.e., (5.27)
ST F T = (D, v1 , v2 , . . .).
A homogeneous ALLD population is indistinguishable from a homogeneous ST F T population (or any mixture of the two strategies) because both populations always defect; but the strategies are distinguishable in the presence of any nice strategies, because ST F T will reciprocate an initial cooperation, whereas ALLD will exploit it. To see how T F T fares against these nasty strategies, let us compute the payoff matrix for the IPD with ALLD (strategy 1), T F T (strategy 2), and ST F T (strategy 3). Of course, a11 , a12 , a21 , and a22 are still defined by (5.19); moreover, because (5.9) and (5.27) imply that ST F T and ALLD always suffer mutual defection, we have a13 = a31 = a33 = P μ. If, on the other hand, u = T F T and v = ST F T , then from (5.10) and (5.27) we have φ(u1 , v1 ) = φ(C, D) = S, φ(u2 , v2 ) = φ(D, C) = T , φ(u3 , v3 ) = φ(C, D) = S, φ(u4 , v4 ) = φ(D, C) = T , and so on; that is, for all j ≥ 1, (5.28)
φ(u2j−1 , v2j−1 ) = S,
φ(u2j , v2j ) = T.
184
5. Cooperation and the Prisoner’s Dilemma
T F T and ST F T are caught in an endless war of reprisal. From (5.13), (5.22), and (5.28), we obtain ∞ ∞ ∞ # # # a23 = f (T F T, ST F T ) = φ(uk , vk )wk−1 = Sw2j−2 + T w2j−1 j=1
k=1
= (S + wT )
∞ #
(w2 )j−1 =
j=1
j=1
S + wT = λ0 μ(S + wT ), 1 − w2
where (5.29)
λ0 =
1 , 1+w
μ is defined by (5.23), and we have set β = w2 in the standard formula ∞ # 1 (5.30) β j−1 = , 0 ≤ β < 1, j=1
1−β
for the sum of a geometric series. Similarly (Exercise 5.4), a32 = λ0 μ(T + wS). Hence the payoff matrix is ⎡ ⎤ Pμ T + P wμ Pμ Rμ λ0 (S + wT )μ⎦ . (5.31) A = ⎣S + P wμ Pμ Pμ λ0 (T + wS)μ Applying (2.21), we discover that neither ALLD nor ST F T is an ESS. ALLD is a Nash-equilibrium strategy, because a11 ≥ aj1 for j = 2, 3; but it is not an ESS, because neither a11 > a31 nor a13 > a33 . ST F T is a Nash-equilibrium strategy if −S a33 ≥ a23 or w ≤ TP −P , but even then it is not an ESS, because neither a33 > a13 nor a31 > a11 . On the other hand, (5.24) implies a22 > a12 ; and if also a22 > a32 , −R , then T F T is evolutionarily stable. In other words, T F T is the that is, if w > TR−S sole ESS of this IPD if T −R T −R , (5.32) w > max . T −P R−S
When (5.32) is satisfied, a T F T population can be invaded by neither a small army of ST F T -strategists nor a small army of ALLD-strategists—nor any combination of the two, because T F T is a strong ESS of the game defined by (5.31).5 Of course, (5.32) does not imply that T F T can invade an ALLD population, because ALLD is a Nash-equilibrium strategy (and an ESS when ST F T is absent). −S . In particular, if But ST F T has no such resistance to T F T if w > TP −P S + T ≥ P + R,
(5.33) T −R R−S
−S then w > implies (5.32), which in turn implies w > TP −P ; thus, not only is a T F T population immune to invasion by ST F T (in the IPD defined by (5.31)), but also a single T F T -strategist is enough to conquer an entire ST F T population. Note that (5.33) is satisfied with strict inequality by Axelrod’s payoffs (5.25) and with equality by the foraging game of §5.1. Nevertheless, not every prisoner’s dilemma satisfies (5.33); for an exception, see §5.7. 5 Axelrod [10] has shown that when w is greater than or equal to the right-hand side of (5.32), T F T is a Nash-equlibrium strategy (but not an ESS) against any deviant strategy (not just ALLD or ST F T ).
5.3. Other reciprocal strategies
185
We have seen that if the probability of further interaction is high enough to satisfy (5.32), then T F T is uninvadable by the pair of exploitative strategies, ST F T and ALLD. But could a more forgiving strategy do just as well? To answer this question, we compute the payoff matrix for the IPD with ALLD (strategy 1), ST F T (strategy 2), and T F 2T (strategy 3). Because ST F T and ALLD always suffer mutual defection, we now have a11 = a12 = a21 = a22 = P μ, and because T F 2T always cooperates with itself, we have a33 = Rμ. From (5.9) and (5.26), if u = ALLD and v = T F 2T , then φ(u1 , v1 ) = φ(D, C) = T, φ(u2 , v2 ) = φ(D, C) = for k ≥ 3. Thus a13 = f (ALLD, ) = T , and φ(uk , vk ) = φ(D, D) = P ∞T F 2T ∞ ∞ k−1 k−1 2 j−1 φ(u , v )w = T + wT + P w = T + wT + w P w = k k k=1 k=3 j=1 (1 + w)T + w2 P μ, on using (5.13), a change of summation index (j = k − 2) and (5.30) with β = w, μ being defined by (5.23). Continuing in this manner, we find (Exercise 5.4) that the payoff matrix is ⎤ ⎡ Pμ Pμ (1 + w)T + w2 P μ ⎦. Pμ Pμ T + Rwμ (5.34) A = ⎣ 2 Rμ (1 + w)S + w P μ S + Rwμ We see on inspection that a33 − a23 = R − T < 0, so that T F 2T is invadable by ST F T ; whereas T F T is not invadable. Even if ST F T were absent, we see from (5.34) that T F 2T can withstand ALLD only if Rμ > (1 + w)T + w2 P μ or (5.35)
w >
T −R ; T −P
−R whereas (5.24) implies that T F T can withstand ALLD if only w > TT −P , which is more readily satisfied. With Axelrod’s payoffs (5.25), for example, this means that T F T , but not T F 2T , can withstand ALLD whenever 0.5 < w < 0.7071. The conclusion is clear: T F 2T is too forgiving to persist as an orthodox strategy if infiltrated by ST F T , and it is less resistant to invasion (in the sense of requiring higher w) than T F T when infiltrated by ALLD. Although ST F T can invade T F 2T , it does not necessarily eliminate it. To see why, let us use (2.115) to model the long-term dynamics of a population’s strategy mix, so that if xk (n) is the proportion adopting strategy k in generation n, then
(5.36)
x1 (n) + x2 (n) + x3 (n) = 1,
and proportions evolve according to (5.37a)
W (n){x1 (n + 1) − x1 (n)} = x1 (n)x2 (n){W1 (n) − W2 (n)} + x1 (n)x3 (n){W1 (n) − W3 (n)},
(5.37b)
W (n){x2 (n + 1) − x2 (n)} = x2 (n)x1 (n){W2 (n) − W1 (n)} + x2 (n)x3 (n){W2 (n) − W3 (n)},
(5.37c)
W (n){x3 (n + 1) − x3 (n)} = x3 (n)x1 (n){W3 (n) − W1 (n)} + x3 (n)x2 (n){W3 (n) − W2 (n)},
where (5.38)
Wk (n) = ak1 x1 (n) + ak2 x2 (n) + ak3 x3 (n)
186
5. Cooperation and the Prisoner’s Dilemma
is the reward to strategy k in generation n and (5.39)
W (n) = x1 (n)W1 (n) + x2 (n)W2 (n) + x3 (n)W3 (n)
is the average reward to the entire population (and is clearly positive). In view of (5.36), we can determine how the population evolves by following the point with coordinates (x1 (n), x2 (n)) in the triangle defined by the inequalities (5.40)
0 ≤ x1 ≤ 1,
0 ≤ x2 ≤ 1,
0 ≤ x1 + x2 ≤ 1.
As in Figure 4.2, where (x1 , x2 , x3 ) denotes an imputation, the upper boundary x1 + x2 = 1 of triangle (5.40) corresponds to x3 = 0 and x3 increases towards the southwest, with x3 = 1 at the point (0, 0). From (5.34) and (5.38), we obtain W1 −W2 = μwx3 {T −R−w(T −P )}, W1 −W3 = μx1 (1 − w2 )(P − S)+μx2 {P −S −w(R−S)}+μx3 {T −R−w2 (T −P )}, and W2 −W3 = μx1 (1−w2 )(P −S)+μx2 {P −S −w(R−S)}+μx3 (T −R)(1−w). For the sake of simplicity, let us now choose Axelrod’s payoffs (5.25). Then, on substituting into (5.37) and using (5.36), we have (5.41)
W (n){xk (n + 1) − xk (n)} = Φk (x1 (n), x2 (n))
for k = 1, 2, 3, where we define (5.42a)
Φ1 (x1 , x2 ) = μx1 (1 − x1 − x2 )(a + cx1 − dx2 ),
(5.42b)
Φ2 (x1 , x2 ) = μx2 (1 − x1 − x2 )(b + cx1 − dx2 ),
(5.42c)
Φ3 (x1 , x2 ) = μ(1 − x1 − x2 )(x1 + x2 )(dx2 − cx1 ) − μ(1 − x1 − x2 )(ax1 + bx2 ),
and (5.43)
a = 2(1 − 2w2 ),
b = 2(1 − w),
c = 3w2 − 1,
d = 1 + w.
Let us √at least assume that T F 2T can withstand ALLD. Then, from (5.35), w > 1/ 2 and a < 0 < c, d > b > 0. It is convenient here to define parallel lines L1 , L2 by (5.44)
L1 : cx1 − dx2 + a = 0,
L2 : cx1 − dx2 + b = 0.
These lines are sketched in Figure 5.2(a), together with both branches of the hyperbola H defined by (5.45)
H: (cx1 − dx2 )(x1 + x2 ) + ax1 + bx2 = 0.
This curve partitions triangle (5.40) into the shaded region of Figure 5.2(a), where Φ3 < 0, and the unshaded region where Φ3 > 0 (Exercise 5.5). Because Φ1 is negative to the left of L1 and positive to the right of L1 , from (5.41) and (5.42) we have x1 (n + 1) < x1 (n) if the point (x1 (n), x2 (n)) lies to the left of L1 but x1 (n+1) > x1 (n) if (x1 (n), x2 (n)) lies to the right of L1 . Similarly, x2 (n+1) < x2 (n) when (x1 (n), x2 (n)) lies above L2 , whereas x2 (n + 1) > x2 (n) when (x1 (n), x2 (n)) lies below L2 ; and x3 (n + 1) < x3 (n) or x3 (n + 1) > x3 (n) according to whether
5.3. Other reciprocal strategies
187
(x1 (n), x2 (n)) lies in the shaded or unshaded part of the triangle (5.40). Therefore, if (x1 (0), x2 (0)) lies either to the left of H or between its two branches, then (x1 (n), x2 (n)) converges to (0, b/d) as n → ∞; whereas, if (x1 (0), x2 (0)) lies sufficiently far to the right of H, then (x1 (n), x2 (n)) converges to the line x1 + x2 = 1 as n → ∞. See Figure 5.2(b) for an illustration. We say that (ξ1 , ξ2 ) is an equilibrium point if (x1 (n0 ), x2 (n0 )) = (ξ1 , ξ2 ) implies (x1 (n), x2 (n)) = (ξ1 , ξ2 ) for all n ≥ n0 . Hence, from (5.41), (ξ1 , ξ2 ) is an equilibrium point if, and only if, Φ1 (ξ1 , ξ2 ) = 0 = Φ2 (ξ1 , ξ2 ). Thus (0, b/d), where H meets L2 on the x2 -axis, is an equilibrium point (marked by a dot in Figure 5.2). There is also an equilibrium point, namely, (−a/c, 0) on the x1 -axis. All other equilibrium points that joins (0, 1) lie on the line x1 + x2 = 1. Indeed every point on the line segment d−b a+c , , a−b+c+d to (1, 0) is an equilibrium point. But those between (0, 1) and a−b+c+d where H intersects x1 + x2 = 1, are unstable because the slightest increase in x3 from 0 will displace (x1 (n), x2 (n)) into the unshaded region, where it begins a relentless march towards (0, b/d). Similarly, (−a/c, 0) is unstable because the slightest leftward displacement of (x1 (0), x2 (0)) will again send (x1 (n), x2 (n)) to (b/d, 0), and the slightest rightward displacement will send it towards the line x1 + x2 = 1. But (b/d, 0) is a locally stable equilibrium point, because it attracts (x1 (n), x2 (n)) from any point (x1 (0), x2 (0)) in its vicinity—indeed, from any point to the left of H between its two branches. By contrast, equilibria on x1 + x2 = 1 between or d−b a+c , a−b+c+d a−b+c+d and (1, 0) are merely metastable, because a slight displacement of (x1 (0), x2 (0)) from (ξ1 , ξ2 ) will neither return (x1 (n), x2 (n)) to (ξ1 , ξ2 ), nor send
1
x2
1
x2
L2 H
b/d
L1
b/d
H 0 0
−a/c a
1
x1
0 0
1 b
Figure 5.2. Convergence to equilibrium for the IPD with strategies ALLD, ST F T , and T F 2T in proportions x1 , x2 , and 1 − x1 − x2 , respectively. The figure is drawn for w = 4/5. (a) The domain of attraction of the equilibrium at (0, b/d) contains all of the unshaded region and the shaded region near the origin. (b) A sample of trajectories, with open circles for initial points. For example, (x1 (n), x2 (n)) converges to the locally stable equilibrium point (0, 2/9) from (x1 (0), x2 (0)) = (0.8, 0.07) but to the metastable equilibrium point (0.828, 0.172) from (x1 (0), x2 (0)) = (0.8, 0.05); both initial points lie well within the shaded region to the right of (a). Increased line thickness distinguishes the metastable equilibria on x1 + x2 = 1 from the unstable ones.
x1
188
5. Cooperation and the Prisoner’s Dilemma
it far away; rather, (x1 (n), x2 (n)) will shift to a neighboring equilibrium on the line x1 + x2 = 1. All of these statements are readily verified by considering the signs of Φ1 , Φ2 , and Φ3 in the various regions of the triangle (5.40). We can now confirm that, even if ST F T invades T F 2T , it does not necessarily eliminate it. In Figure 5.2—where (0, 0), (1, 0), and (0, 1) correspond to homogeneous populations of T F 2T , ALLD, and ST F T , respectively—infiltration of T F 2T by ST F T and ALLD corresponds to displacing (x1 (0), x2 (0)) slightly from the origin, which will send (x1 (n), x2 (n)) to (0, b/d) as n → ∞. Thus T F 2T is not eliminated. Rather, ALLD is eliminated, the proportion of ST F T increases to b/d, and the proportion of T F 2T decreases from 1 to 1 − b/d = (3w − 1)/(1 + w). It can be shown more generally (i.e., when payoffs other than Axelrod’s are used) that, provided w > (P − S)/(R − S), the final composition of the population will be a mixture of ST F T and T F 2T in which the proportion of T F 2T is (5.46)
x3 (∞) = 1 − x2 (∞) =
w(R − S) − (P − S) ; (1 − w)(T − R) + w(R − S) − (P − S)
see Exercise 5.6. Let us now restore T F T . In (5.31) it was strategy 2, whereas ALLD was strategy 1, and ST F T was strategy 3. In (5.34), where T F T was absent, we promoted ST F T to strategy 2 and introduced T F 2T as strategy 3. With all four strategies together, it is convenient to demote ALLD from 1 to 4 and relabel as follows: (5.47)
TFT: T F 2T :
strategy 1, strategy 2,
ST F T : strategy 3, ALLD: strategy 4.
The advantage is that nice and nasty strategies are now adjacent and, from Exercise 5.7, the payoff matrix is ⎤ ⎡ Rμ
Rμ ⎢ (5.48) A = ⎣ λ0 μ(T + wS) T + P μw
Rμ Rμ T + Rwμ T (1 + w) + w2 P μ
λ0 μ(S + wT ) S + Rwμ Pμ Pμ
S + P μw S(1 + w) + w2 P μ⎥ ⎦ Pμ Pμ
with μ defined by (5.23) and λ0 by (5.29). On applying (2.21), we see that ALLD is a Nash-equilibrium strategy for any w, and that T F T is a Nash-equilibrium strategy if (5.32) is satisfied. But neither strategy is evolutionarily stable: ALLD is incapable of eliminating the other nasty strategy, ST F T , and T F T is incapable of eliminating the other nice strategy, T F 2T . Indeed no strategy—that is, no pure strategy, because we have expressly forbidden mixed strategies—can resist invasion by an arbitrary mixture of infiltrators, even if w is sufficiently large.6 Broadly speaking, the reason for this vulnerability is that with any Nash-equilibrium strategy we can associate a second strategy that, although in principle quite distinct, is distinguishable in practice only in the presence of a third strategy; and if the second strategy does better than the first against the third, then the frequency of the second strategy can increase. To clarify 6 A formal proof is given by Boyd and Lorberbaum [39], who show that no (pure) strategy in the T −R P −S
1 IPD can resist invasion if w > min T −P , R−S , hence if w > 3 with Axelrod’s payoffs. But the result may have little practical significance; see §§5.8–5.9.
5.3. Other reciprocal strategies
189
this point, let us first consider the IPD with strategies T F T , T F 2T , and ST F T , whose payoff matrix is ⎤ ⎡ Rμ Rμ λ0 μ(S + wT ) ⎦, Rμ Rμ S + Rwμ (5.49) A=⎣ Pμ λ0 μ(T + wS) T + Rwμ obtained from (5.48) by deleting the final row and column, and let us assume −R that (5.32) is satisfied. Then T F T is a Nash-equilibrium strategy, and w > TR−S implies a23 > a13 , so that T F 2T is a second strategy, distinct from T F T , that does better against a third strategy, namely, ST F T (without which T F 2T would be indistinguishable from T F T ). The long-term dynamics can again be described by (5.41), but because the strategies are different now, we have (5.50a)
Φ1 (x1 , x2 ) = μx1 (1 − x1 − x2 )(a − cx1 − dx2 ),
(5.50b)
Φ2 (x1 , x2 ) = μx2 (1 − x1 − x2 )(b − cx1 − dx2 ),
(5.50c)
Φ3 (x1 , x2 ) = μ(1 − x1 − x2 )(x1 + x2 )(cx1 + dx2 ) − μ(1 − x1 − x2 )(ax1 + bx2 ),
in place of (5.42) and (5.43), where this time (5.51)
a = λ0 {(T − P )w − (P − S)}, b = w(R − S) − P + S, c = S + T − P − R, d = c + (2R − S − T )w,
and (5.32) implies d > b > a > c (Exercise 5.8). We can again deduce the long-term dynamics by following the point with coordinates (x1 , x2 ) in the triangle (5.40). Let us first suppose that (5.33) is satisfied, i.e., S + T ≥ P + R, so that d > b > a > c ≥ 0. If we redefine the parallel lines L1 and L2 by (5.52)
L1 : cx1 + dx2 = a,
L2 : cx1 + dx2 = b
and the hyperbola H by (5.53)
H: (x1 + x2 )(cx1 + dx2 ) = ax1 + bx2 ,
then in Figure 5.3(a) we have Φ1 > 0 below L1 , Φ2 > 0 below L2 , and Φ3 > 0 above the upper branch of H, which intersects L2 at the point 0, db but lies entirely above L1 . (The lower branch of H passes through the origin, but otherwise lies outside the triangle.) The point 0, db , where x3 = d−b d > 0, is again the only locally stable equilibrium. On x1 + x2 = 1 there exists, however, a whole line segment of metastable equilibria—and (x1 , x2 ) may be attracted to one of these, rather than to 0, db . To see why, let us first suppose that (x1 (0), x2 (0)) lies in the shaded triangle with a vertex at (1, 0) in Figure 5.3(a). Then because Φ1 > 0, Φ2 > 0, and Φ3 < 0, (5.41) implies that the point (x1 (n), x2 (n)) will always move rightwards, upwards, and towards the line x1 + x2 = 1. Thus x3 (∞) = 0, and x1 (∞) + x2 (∞) = 1; furthermore, x2 (∞) > x2 (0), so that T F 2T invades. On the other hand, ST F T is eliminated, and x1 (∞) > x1 (0), so that the frequency of T F T is higher than initially. Indeed because L1 intersects the line x1 + x2 = 1 where x1 = d−a d−c , we see
190
5. Cooperation and the Prisoner’s Dilemma
1
x2
1
x2
b/d L2 H
a/d
L1
0 0
θ1 a
θ2
1
x1
0 0
1 b
Figure 5.3. Convergence to equilibrium for the IPD with strategies T F T , T F 2T , and ST F T in proportions x1 , x2 , and 1 − x1 − x2 , respectively. The figure is drawn for T = 5, R = 3.5, P = 1, S = 0, and w = 0.725, so that (5.54) reduces to x1 (0) > 0.585. (a) The point (x1 , x2 ) is attracted to the line x1 + x2 = 1 in the lighter shaded region, but to the equilibrium at (0, b/d) in the darker shaded region. Here θ1 = (d − b)/(a − b − c + d) and θ2 = (d−a)/(d−c). (b) A sample of trajectories, with open circles for initial points. For example, (x1 (n), x2 (n)) converges to the locally stable equilibrium point (0, 0.788) from (x1 (0), x2 (0)) = (0.3, 0.3) but to the metastable equilibrium point (0.475, 0.525) from (x1 (0), x2 (0)) = (0.45, 0.3).
Figure 5.4. Domains of attraction for the IPD with strategies T F T , T F 2T , and ST F T in proportions x1 , x2 , and 1 − x1 − x2 when P + R > S + T . In the lighter shaded region, (x1 , x2 ) is attracted to the line x1 + x2 = 1. In the darker shaded region, (x1 , x2 ) is attracted to the equilibrium point marked by a dot. Both figures are drawn for T = 5, R = 3.5, P = 2, and S = 0 with θ1 = (d − b)/(a − b − c + d) and θ2 = (d − a)/(d − c) as in Figure 5.3. (a) Here w = 0.7, implying w > (P − S)/(T − P ) and d > b > a > 0 > c. (b) Here w = 0.525, implying (P − S)/(R − S) > w > (P + R − S − T )/(2R − S − T ) and d > 0 > b > a > c.
x1
5.3. Other reciprocal strategies
191
that T F T is bound to increase in frequency whenever7 (5.54)
x1 (0) >
(T − R)(1 − w) + (2R − S − T )w2 . (2R − S − T )w(1 + w)
If (x1 (0), x2 (0)) lies in the unshaded region of Figure 5.3(a), (x1 (n), x2 (n)) then may converge either to the line x1 + x2 = 1 or to the point 0, db , as illustrated by Figure 5.3(b). If, however, (x x2 (0)) lies in the darker shaded region of Figure 1 (0), 5.3(a), then convergence to 0, db and elimination of T F T are assured. −S > A similar analysis applies to the case where P + R > S + T . Then TP −P P −S T −R T −R T −R R−S > T −P > R−S (Exercise 5.9) so that (5.32) implies w > T −P . Nevertheless, −S P −S +R−S−T w can be either larger or smaller than each of TP −P , R−S , and P2R−S−T , and the P −S third of these expressions, though smaller than R−S , can be larger or smaller than T −R cases are possible, and Figure 5.4 shows two of them. In Figure T −P . So various b −S ; whereas in 5.4(a), where 0, d is the only stable equilibrium, we have w > TP −P P −S +R−S−T Figure 5.4(b), where (0, 0) is the only stable equilibrium, R−S > w > P2R−S−T . The smaller the value of w, the larger the darker shaded region in which (x1 , x2 ) is guaranteed to be attracted to x1 = 0. Nevertheless, the lighter shaded region, where T F T is bound to increase in frequency, is still finite; and if the whole population were using T F T just before infiltration by ST F T and T F 2T at time n = 0, then (x1 (0), x2 (0)) would have to lie close to (1, 0) and hence within the lighter shaded region. Of course, a fresh infiltration of ST F T would displace (x1 , x2 ) from the line x1 + x2 = 1, slightly towards the southwest. The population would subsequently evolve to a new metastable equilibrium, still on the line x1 + x2 = 1 but nearer to
Table 5.2. Numerical solution of (2.115) for A defined by (5.25) and (5.48) with T F T , T F 2T , ST F T , and ALLD as strategies 1, 2, 3, and 4, respectively, in initial proportions x1 (0) = 0.7 and x2 (0) = x3 (0) = x4 (0) = 0.1
n 0 1 2 3 4 5
x1 (n) 0.700 0.760 0.792 0.809 0.820 0.827
x2 (n) 0.100 0.111 0.117 0.121 0.124 0.126
x3 (n) 0.100 0.089 0.076 0.065 0.054 0.046
x4 (n) 0.100 0.040 0.015 0.006 0.002 0.001
n 10 15 20 25 30 40
x1 (n) 0.847 0.855 0.859 0.860 0.861 0.862
x2 (n) 0.133 0.136 0.137 0.138 0.138 0.138
x3 (n) 0.020 0.009 0.004 0.002 0.001 0.000
x4 (n) 0.000 0.000 0.000 0.000 0.000 0.000
7 Note that (5.54) approaches x1 (0) > 12 as w approaches 1. Bendor and Swistak [22, p. 3600] have shown that the minimal stabilizing frequency above which T F T can resist all possible invasions must exceed 12 , and it approaches 12 as w approaches 1. Although it is possible for the right-hand side of (5.54) to be less than 12 , [22] is not contradicted because, e.g., ALLD is not a potential mutant in our model.
192
5. Cooperation and the Prisoner’s Dilemma
(0, 1). In the course of time, repeated infiltrations by ST F T could nudge (x1 , x2 ) all the way along the line x1 + x2 = 1, slowly decreasing the fraction of T F T , so that (x1 , x2 ) would eventually enter the darker shaded region, to be attracted by x1 = 0. At that stage, T F T would have been eliminated (and also T F 2T in Figure 5.4(b)). But whether this would actually happen is far beyond the scope of our dynamic model, namely, (5.41). The dynamics are more difficult, both to analyze and to visualize, when ALLD is also present. But it is easy to solve (2.115) with A defined by (5.48) numerically, and Table 5.2 shows a sample calculation for Axelrod’s prototype when x1 (0) = 0.7 and the other three strategies are equally represented initially. Notice that the two nasty strategies are summarily dispatched; and because the two nice strategies are incapable of eliminating one another, the final composition of the population is 86% T F T and 14% T F 2T . Two remarks are now in order. First, as we have said before, insofar as no strategy can resist invasion by every conceivable combination of deviant strategies, the IPD has no ESS. But this does not mean that T F T cannot persist as an orthodox IPD strategy, for the simple reason that strategies capable of displacing it may be either absent or too rare. Whether a strategy is evolutionarily stable will always depend upon the scope of the strategy set. By suitably enlarging it, most of the strategies that have ever been claimed as ESSs for one game or another could almost certainly be destabilized. For example, if in Crossroads we allowed a third strategy, Z, which stands for “instantly convert your car into a motor cycle, so that you can proceed at once regardless of what the other motorist does, and zoom”, then clearly Z would be a dominant strategy, and it would replace the ESS in §2.1 as the solution of the game. But not every motorist drives Jane Bond’s car.8 Likewise, ST F T might not emerge as a deviant strategy sufficiently often to sustain the process for eliminating T F T that we described in the last paragraph but one. Second, Axelrod [10, pp. 48–52] has marshalled impressive empirical support for the persistence of T F T as an orthodox strategy by conducting an ecological simulation of the IPD in a population of computer programs submitted to his tournament; see §5.2. At first, all programs were represented; subsequently, he allowed the composition of the population to evolve, essentially according to the dynamics of §2.8, and found that the frequency of T F T was always higher than that of any other strategy and increased steadily.9 Axelrod’s results have since been rationalized by Bendor and Swistak [22, 23], who show that there is a sense in which T F T and other nice, retaliatory strategies are maximally robust. In the light of these remarks, we should interpret (5.32) as a necessary condition for T F T to be uninvadable. If this condition is satisfied, then T F T may persist as an orthodox strategy, simply because a mixture of deviant strategies that could invade it may not be represented. If (5.32) is not satisfied, however, then there is little hope that T F T will persist, because deviant strategies that can eliminate it are so simple as virtually to be guaranteed to arise. 8
Jane Bond is female Agent 007. T F T was also the champion of the computer tournament itself, but this was a round-robin tournament in which every strategy played every other strategy once. 9
5.4. Dynamic versus static interaction
193
Nevertheless, there are circumstances in which (5.32) is inappropriate even as a necessary condition, because—as we shall demonstrate in §5.5—it is based on two tacit assumptions; namely, that the population is very large, and that opponents, though randomly selected at the beginning of the game, are retained for its duration. But even if the players are all drawn at random from a large population, the subpopulation with which they interact may not be large, and opponents may be drawn at random from this subpopulation throughout the game. Let us therefore relax both assumptions.
5.4. Dynamic versus static interaction Given T F T ’s success in Axelrod’s computer tournament and the remarks at the end of the previous section, we would like to obtain some theoretical insights into the resilience of this strategy in a finite population. In this section, we shall consider two different modes of interaction: first, a mode in which players select their opponents at random but retain them for the duration of the game, and second, a mode in which opponents are drawn randomly throughout the game. Consider, therefore, a population of N +1 individuals, of whom Nk play strategy k in the IPD. Thus, with m strategies, we have N1 + N2 + · · · + Nm = N + 1.
(5.55)
Because opponents are not necessarily fixed, it no longer makes sense to construct a payoff matrix for the entire game; instead, we construct a payoff matrix for each move and use it to derive expressions for the expected payoffs to the various strategies against all possible opponents. Let φ(k) denote this m × m matrix, i.e., for 1 ≤ j, l ≤ m, let φjl (k) be a j-strategist’s payoff from interaction k if the interaction is with an l-strategist, and, for j = 1, 2, . . . , m, let Wj denote a jstrategist’s expected payoff from the game. Then, in place of (5.12), the reward from move k of the game—conditional upon encountering an l-strategist—is φjl (k) Prob(M ≥ k),
(5.56)
where M as usual is the number of interactions. In the case where opponents are drawn at random throughout, let πjl denote the probability—assumed the same on every move—that a j-strategist’s next interaction is with an l-strategist. Then the (unconditional) reward from move k of the game is m #
(5.57)
πjl φjl (k) Prob(M ≥ k),
l=1
because the opponent uses strategy l at move k with probability πjl , and the reward from the entire game is '
m ∞ # # (5.58) Wj = πjl φjl (k) Prob(M ≥ k), 1 ≤ j ≤ m. k=1
l=1
On the other hand, in the case where players interact with fixed opponents, for some value of l a j-strategist’s opponent is known to be an l-strategist from the
194
5. Cooperation and the Prisoner’s Dilemma
first move onwards; whence (5.56) implies that the reward from the entire game, assessed at the first move, is ∞ #
(5.59)
φjl (k) Prob(M ≥ k),
1 ≤ j, l ≤ m.
k=1
Before the first move the identity of l is unknown, however, and so to obtain the reward, we must average this expression over the distribution of possible opponents, which yields '
∞ m # # (5.60) Wj = πjl φjl (k) Prob(M ≥ k) , 1 ≤ j ≤ m. l=1
k=1
Clearly, the same expression for Wj results in either case. Let us now define xj = Nj /N, j = 1, . . . , m, (5.61) α = 1/N, so that (5.55) implies x1 + x2 + · · · + xm = 1 + α.
(5.62)
Then, because the probability that a j-strategist interacts with an l-strategist is Nl /N if l = j but (Nj − 1)/N if l = j, the probabilities of interaction are πjl = xl − αδjl ,
(5.63)
1 ≤ j, l ≤ m,
where we define δjl = 0 if j = l but δjj = 1; and, from (5.22), (5.64)
Wj =
∞ # k=1
wk−1
m #
(xl − αδjl )φjl (k),
1 ≤ j, l ≤ m.
l=1
Further progress requires an explicit expression for the m × m matrix φ(k). Let us therefore choose m = 4 in (5.64) and define the strategies as in (5.47). Then, because T F T or T F 2T always cooperate with one another, the payoff to T F T or T F 2T against T F T or T F 2T at interaction k is independent of k: (5.65a)
φ11 (k) = φ12 (k) = φ21 (k) = φ22 (k) = R,
1 ≤ k < ∞.
Similarly, because ST F T or ALLD always defect against one another, (5.65b)
φ33 (k) = φ34 (k) = φ43 (k) = φ33 (k) = P,
1 ≤ k < ∞.
For all other values of j and l, however, φjl (k) depends on k. Let σk denote the probability that an individual’s opponent at move k has been encountered before, let γk denote the probability that the opponent has previously been encountered more than once, and let k denote the probability that the opponent has previously been encountered an even number of times (hence 1 − k the probability that the opponent has been encountered an odd number of times). Then, because T F T cooperates with ST F T on odd-numbered encounters (when the opponent defects) but defects against ST F T on even-numbered ones (when the opponent cooperates), we have (5.66a)
φ13 (k) = k S + (1 − k )T,
1 ≤ k < ∞.
5.4. Dynamic versus static interaction
195
Similarly, φ31 (k) = k T + (1 − k )S,
(5.66b)
1 ≤ k < ∞.
Because T F T cooperates with ALLD on a first encounter but thereafter defects, we have φ14 (k) = σk P + (1 − σk )S,
1 ≤ k < ∞.
(5.66d)
φ23 (k) = σk R + (1 − σk )S,
1 ≤ k < ∞,
(5.66e)
φ32 (k) = σk R + (1 − σk )T,
1 ≤ k < ∞,
(5.66f)
φ41 (k) = σk P + (1 − σk )T,
1 ≤ k < ∞.
(5.66c) Similarly,
Again, because T F 2T cooperates against ALLD on a first or second encounter but thereafter defects, we have (5.66g)
φ24 (k) = γk P + (1 − γk )S,
1 ≤ k < ∞,
(5.66h)
φ42 (k) = γk P + (1 − γk )T,
1 ≤ k < ∞.
The matrix φ(k) is now defined, and from (5.62)–(5.66) we obtain (5.67a)
W1 =
∞ #
wk−1 {(x1 − α + x2 )R + x3 (k S + {1 − k }T )
k=1
+ x4 (σk P + {1 − σk }S)} , (5.67b)
W2 =
∞ #
wk−1 {(x1 + x2 − α)R + x3 (σk R + {1 − σk }S)
k=1
+ x4 (γk P + {1 − γk }S)} , (5.67c)
W3 =
∞ #
wk−1 {x1 (k T + {1 − k }S)
k=1
+ x2 (σk R + {1 − σk }T ) + (x3 − α + x4 )P } , (5.67d)
W4 =
∞ #
wk−1 {x1 (σk P + {1 − σk }T )
k=1
+ x2 (γk P + {1 − γk }T ) + (x3 + x4 − α)P } . If players keep the same opponent throughout the game, then (5.68a) (5.68b) (5.68c)
σ1 = 0,
σk = 1 if k ≥ 2,
γ1 = 0 = γ2 , γk = 1 if k ≥ 3, 1 k = 2 1 − (−1)k , k ≥ 1.
We shall say in this case that the interaction is static. Explicit expressions for W1 , . . . , W4 follow from (5.66)–(5.68). On using (5.30) and the related formula (5.69)
∞ # j=1
jβ j−1 =
1 , (1 − β)2
0 ≤ β < 1,
196
5. Cooperation and the Prisoner’s Dilemma
for the derivative of a geometric series, we obtain (see Exercise 5.11) (5.70a)
W1 = μ(x1 + x2 − α)R + λ0 μx3 (S + wT ) + x4 (S + μwP ),
(5.70b)
W2 = μ(x1 + x2 − α)R + x3 (S + μwR) + x4 {S(1 + w) + μw2 P },
(5.70c)
W3 = λ0 μx1 (T + wS) + x2 (T + μwR) + μ(x3 + x4 − α)P,
(5.70d)
W4 = x1 (T + μwP ) + x2 {T (1 + w) + μw2 P } + μ(x3 + x4 − α)P,
where μ is defined by (5.23) and λ0 by (5.29). Note that the difference in payoffs between the two nice strategies T F T and T F 2T is (5.71)
W1 − W2 = λ0 μx3 w{T − R − w(R − S)} + x4 w(P − S),
which vanishes if x3 = 0 = x4 . Because the coefficient of x4 is positive when x4 = 0, T F T always does better than T F 2T against ALLD. Similarly, T F 2T does better −R , because the coefficient of x3 is then than T F T against ST F T when w > TR−S negative. If, on the other hand, opponents are drawn at random throughout the game, then we shall call the interaction dynamic. Because the probability per move of meeting any specific opponent is α = N1 , the probability that the opponent at move k has not been encountered during the previous k − 1 moves is (1 − α)k−1 , whence σk = 1 − (1 − α)k−1 , k ≥ 1.
(5.72a)
Similarly, the probability of precisely one encounter with the opponent during the previous k − 1 moves is (k − 1)α(1 − α)k−2 , whence (5.72b)
γk = 1 − (1 − α)k−1 − (k − 1)α(1 − α)k−2 , k ≥ 1.
Because zero is an even number, 1 = 1. Moreover, the number of previous encounters with an opponent is even at move k if either it was even at move k − 1 and the opponent was then different, or it was odd at move k − 1 and the opponent was then the same. Thus k = (1 − α)k−1 + α(1 − k−1 ) for k ≥ 2. The solution of this recurrence equation subject to 1 = 1 (see Exercise 5.11) is (5.72c) k = 12 1 + (1 − 2α)k−1 , k ≥ 1. Explicit expressions for W1 , . . . , W4 now follow from (5.66), (5.67), and (5.72). On using (5.30) and (5.69), we obtain (5.73a)
W1 = μ(x1 + x2 − α)R + 12 x3 {μ(S + T ) + λ2 (S − T )} + x4 {μP + λ1 (S − P )}, λ2 (S−P ) , = μ(1 − x4 )R + λ1 x3 (S − R) + x4 μP + 1 λ2
(5.73b)
W2
(5.73c)
W3 =
1 2 x1 {μ(S
+ T ) + λ2 (T − S)}
+ x2 {μR + λ1 (T − R)} + μ(x3 + x4 − α)P, (5.73d)
W4 = x1 {μP + λ1 (T − P )} λ2 (T −P ) + μ(x3 + x4 − α)P, + x2 μP + 1 λ2
where we define (5.74)
λ1 =
1 , 1 − w + αw
λ2 =
1 1 − w + 2αw
5.5. Stability of a nice population: static case
197
so that μ > λ1 > λ2 > λ0 . Note that if either (2.108) or (2.115) with m = 4 is used for long-term dynamics, then either (2.106) or (2.113) must be replaced by (5.73) under dynamic interaction and by (5.70) under static interaction. Note also that (5.73) reduces to (5.70) when N = 1: we cannot distinguish between static and dynamic interaction when there are only two individuals (Exercise 5.12). We apply this model in the following two sections.
5.5. Stability of a nice population: static case In this section, we consider the stability of a nice population under static interaction. Given our remarks at the end of §5.3, our goal will be to seek necessary conditions for population stability—and if a strategy is to be stable against all possible deviation, then it must at least be stable against pure infiltration by a single nasty player. We begin by considering the stability of T F T against ALLD. Thus N2 = N3 = 0 (hence x2 = 0 = x3 ), and (5.70) implies W1 = μ(x1 − α)R + x4 (S + μwP ),
(5.75)
W4 = x1 (T + μwP ) + μ(x4 − α)P.
For a T F T population to be stable against pure infiltration by ALLD, it must at least be true that W1 exceeds W4 when N1 = N and N4 = 1. Accordingly, we set x1 = 1 and x4 = α in (5.75) to obtain (5.76)
W1 − W4 = μ(R − P ) − (T − P ) − α{μ(R − P ) + P − S}.
Straightforward algebraic manipulation now shows that this expression is positive when (5.77)
R − P + (1 − w)(P − S) + N {T − R − w(T − P )} < 0,
which requires both (5.78)
w >
T −R T −P
for a negative coefficient of N in (5.77) and (5.79)
N >
R − P + (1 − w)(P − S) . w(T − P ) − (T − R)
Next we consider the stability of T F T against ST F T . Now N2 = 0 = N4 (hence x2 = 0 = x4 ), and (5.70) implies (5.80)
W1 = μ(x1 − α)R + λ0 μx3 (S + wT ), W3 = λ0 μx1 (T + wS) + μ(x3 − α)P,
where λ0 is defined by (5.29). It follows from (5.80) that (5.81)
W1 − W3 = μ{R + αP − λ0 (1 + α)(T + wS)} + μ(S + T − P − R)x3 .
For a T F T population to be stable against pure infiltration by ST F T , it must at least be true that W1 exceeds W3 when N1 = N and N3 = 1. Accordingly, we set x1 = 1 and x3 = α in (5.81). Then (5.82) W1 − W3 = λ0 μ {(2R − S − T )(1 − α) − (1 − w)(R − S + α{T − R})}
198
5. Cooperation and the Prisoner’s Dilemma
is positive for (5.83)
w >
T −R , R−S
N >
(1 − w)(T − R) + 2R − S − T w(R − S) − (T − R)
by (5.61). For sufficiently large N , (5.78), (5.79), and (5.83) reduce to (5.32). Thus §5.2 corresponds to static interaction in a large population. A similar analysis can be applied to the other nice strategy, T F 2T . For T F 2T to be stable against pure infiltration by ALLD, we require (see Exercise 5.15) (5.84)
w >
T −R , T −P
N >
R − P + (1 − w2 )(P − S) , w2 (T − P ) − (T − R)
which agrees with (5.35) when N is large enough. But no value of w is large enough to make T F 2T stable against ST F T . To see why, we set x1 = 0 = x4 in (5.70) to obtain W2 = μ(x2 − α)R + x3 (S + μwR) and W3 = x2 (T + μwR) + μ(x3 − α)P . For x2 = 1 and x3 = α, W2 − W3 = R − T − α(R − S) is always negative. Thus the frequency of ST F T will always increase; T F 2T is too nice a strategy to resist invasion. On the other hand, T F 2T need not be eliminated: for x2 = α and −S x3 = 1, W2 − W3 = μ(R − P ) − R + S − α{T − R + μ(R − P )} > 0 if w > P R−S and N > {(1 − w)(T − R) + R − P }/{w(R − S) − (P − S)}. Assuming both conditions hold, let us define xes by N {(1 − w)(T − R) + w(R − S) − P + S}xes = {w(R − S) − P + S}N − (1 − w)(R − S). Then, because x2 + x3 = 1 + α, W2 − W3 is negative if x2 > xes but positive if x2 < xes , so that the population will reach equilibrium at a mixture of T F 2T and ST F T in which the proportion of T F 2T is xes /(1 + α). In the limit as N → ∞, of course, xes approaches (5.46). If R − S > T − P , then there is a range of values of w, specifically, (5.85)
T −R < w < T −P
T −R , T −P
for which T F T is stable against pure infiltration by ST F T or ALLD, whereas T F 2T is not. Of course, T −P may exceed R−S, but if also (T −R)(T +R−S−P ) < 2(R − S)2 , then there is still a range, namely, (5.86)
T −R < w < R−S
2(T − R) , T +R−S−P
for which T F T is stable against pure infiltration, whereas T F 2T is not; (5.86) is satisfied, for example, by Axelrod’s prototype. We more or less knew all this in §5.3, and it could be argued that our finiteN analysis has done little more than recover the results of that section. Things are very different, however, when opponents are drawn at random throughout the game—or, as we have chosen to say, when the interaction is dynamic.
5.6. Stability of a nice population: dynamic case We now consider the stability of a nice population under dynamic interaction. Some mathematical preliminaries will facilitate our analysis. Accordingly, let us define
5.6. Stability of a nice population: dynamic case
199
quadratic polynomials ζ1 , ζ2 , ζ3 and Δ1 , Δ2 , Δ3 by (5.87)
ζj (N ) = (1 − w)(T − R)N 2 + (R − S − waj )N + w(aj − R + S),
(5.88)
Δj (w) = (R − S − waj )2 − 4w(1 − w)(T − R)(aj − R + S)
for j = 1, 2, 3, and (5.89) a1 = 2R − P − S,
a2 = 3R − 2S − T,
a3 =
1 2 {5R
− 3S − T − P },
so that (5.2) and (5.3) imply aj > R − S,
(5.90)
j = 1, 2, 3.
Then because Δj (0) > 0, Δj ((R − S)/aj ) < 0, and Δj (1) > 0, the smaller and larger roots of Δj (w) = 0 satisfy 0 < w < (R − S)/aj and (R − S)/aj < w < 1, respectively. Let ξj denote the larger root, (5.91)
ξj =
2(aj − R + S)
(T − R)(T − S) + aj (R − S) + 2(T − R)(aj − R + S) a2j + 4(T − R)(aj − R + S)
for j = 1, 2, 3. Whenever ξj < w < 1, Δj (w) is positive; and so the equation ζj (N ) = 0 has two real roots, both of which are positive, by (5.90). Let the greater of these roots be denoted by Uj (w). Then, for j = 1, 2, 3, we have established that ζj (N ) < 0 if ξj < w < 1
(5.92a) and (5.92b)
w(aj − R + S) < N < Uj (w), (1 − w)(T − R)Uj (w)
where (5.93)
Uj (w) =
waj − R + S + Δj (w) . 2(1 − w)(T − R)
The lower bound in (5.92b) is the smaller root of ζj (N ) = 0. After these preliminaries, we can now proceed. First we consider the stability of T F T against pure infiltration by ALLD: N2 = 0 = N3 , so that (5.73) implies W1 = μ(x1 − α)R + x4 {μP + λ1 (S − P )} and W4 = x1 {μP + λ1 (T − P )} + μ(x4 − α)P . Proceeding as in §5.5, we require for stability that W1 > W4 when N1 = N and N4 = 1. On setting x1 = 1 and x4 = α in our expressions for W1 and W4 , the above condition reduces (Exercise 5.17) to (5.94)
ζ1 (N ) < 0.
Hence necessary conditions for the stability of T F T are (5.95a)
w > ξ1
and (5.95b)
w(R − P ) < N < U1 (w), (1 − w)(T − R)U1 (w)
where ξ1 is defined by (5.91) and U1 by (5.93). For example, in Axelrod’s prototype we have ξ1 = 0.869 and U1 (w) = 285.5; the lower bound on N in (5.95b) is then N > 1.009 and is satisfied if only N ≥ 2. Thus we require 3 ≤ N + 1 ≤ 286.
200
5. Cooperation and the Prisoner’s Dilemma
Generally, the lower bound in (5.92b) is rather trivially satisfied by N ≥ 2 unless w is very close to ξj , in which case, the range of values admitted by (5.92b) is too small to be of interest. For example, with Axelrod’s payoffs P = 1, R = 3, S = 0, and T = 5 but w = 0.87, (5.95b) requires 2.38 < N < 2.81, which no integer value of N satisfies. Thus our interest lies principally in the upper bound, Uj . Next we consider the stability of T F T against pure infiltration by ST F T . With x2 = 0 = x4 , (5.73) yields W1 = μ(x1 − α)R + 12 x3 {μ(T + S) − λ2 (T − S)} and W3 = 12 x1 {μ(T + S) + λ2 (T − S)} + μ(x3 − α)P . We require for stability that W1 > W3 when N1 = N and N3 = 1, in other words (see Exercise 5.17) that (5.96)
ζ2 (N ) < 0.
Thus T F T is stable against pure infiltration by ST F T if (5.97a)
w > ξ2
and (5.97b)
w(2R − S − T ) < N < U2 (w), (1 − w)(T − R)U2 (w)
where ξ2 is defined by (5.91) and U2 by (5.93). For example, in Axelrod’s prototype we have ξ2 = 0.930 and U2 (w) = 141.5. The lower bound on N in (5.97b) is then N > 1.018, and so resistance to ST F T requires 3 ≤ N + 1 ≤ 142. A similar analysis applies to the other nice strategy, T F 2T . In a population of T F 2T and ALLD, N1 = 0 = N3 , W2 = μ(x2 − α)R + x4 {μP − λ21 (P − S)/λ2 }, and W4 = x2 {μP + λ21 (T − P )/λ2 } + μ(x4 − α)P . For T F 2T to be stable, we require W2 > W4 for x2 = 1, x4 = α, or (1 − w)2 (T − R)N 3 + (1 − w){2(T − R)w + (1 − w)(R − S)}N 2 + w{2(1 − w)(R − S) − w(R − P )}N + w2 (R − P ) < 0, 2(R−S) . As we would expect, the condition which requires in particular that w > 3R−2S−P is more difficult to satisfy than (5.94)—yielding, for example, 3 ≤ N + 1 ≤ 16 in Axelrod’s prototype. If T F 2T is still the orthodox strategy but ST F T replaces ALLD as the deviant strategy, so that N2 = 0 = N4 , then (5.73) implies W2 = μR − λ1 x3 (R − S), W3 = x2 {μR + λ1 (T − R)} + μ(x3 − α)P ; and x2 = 1, x3 = α yields W2 − W3 = −λ1 {T −R+α(R−S)} < 0. Thus ST F T increases in number; regardless of whether interaction is static or dynamic, T F 2T is too nice a strategy to resist invasion by ST F T . On the other hand, T F 2T need not be eliminated: for x2 = α and x3 = 1, W2 − W3 = λ0 (R − P ) − λ1 (R − S) − α{λ0 (R − P ) + λ1 (T − R)} is positive if
(5.98) ψ(N ) = (1 − w)(P − S)N 2 + {T − P − w(T + R − 2P )}N + w(R − P ) T −P . Routine manipulations is negative, requiring in particular that w > T +R−2P (Exercise 5.18) now establish that ψ(N ) < 0 whenever
(5.99a)
w > ξ4
and (5.99b)
w(R − P ) < N < U4 (w), (1 − w)(P − S)U4 (w)
5.7. Mutualism: common ends or enemies
201
where (5.100) ξ4
(T − P )(T + R − 2P ) + 2(R − P )(P − S) + 2(R − P ) (P − S)(T − S) = (T + R − 2P )2 + 4(P − S)(R − P )
T −P satisfies T +R−2P < ξ4 < 1 and U4 (w) is the larger root of ψ(N ) = 0 (whose existence is guaranteed by w > ξ4 ). Assuming that (5.99) is satisfied—which, for example, requires 3 ≤ N + 1 ≤ 571 in Axelrod’s prototype—let us define −S)−R+S−N (1−w)(P −S) xed = w(2R−P w(R−P )+N (1−w)(S+T −P −R) . Then, because x2 + x3 = 1 + α, W2 − W3 < 0 if x2 > xed but W2 − W3 > 0 if x2 < xed , so that the population will reach equilibrium at a mix of T F 2T and ST F T with proportion xed /(1 + α) of T F 2T . Three conclusions can be drawn from this analysis. First, we have shown that T F T cannot be stable under dynamic interaction unless
(5.101a)
w > max(ξ1 , ξ2 )
and (5.101b)
N < min U1 (w), U2 (w) ,
where ξ1 , ξ2 , U1 , and U2 are defined by (5.91) and (5.93). Routine algebra (Exercise −R −R , ξ2 > TR−S . So, comparing (5.101) 5.19) shows that (5.2) and (5.3) imply ξ1 > TT −P with (5.32), we see that the probability of further encounters must be higher under dynamic interaction than under static interaction for T F T to be a stable orthodox strategy. Second, we see that (5.94), (5.96), and the corresponding inequalities for T 2F T are all false as N → ∞. So, when interaction is dynamic, reciprocity cannot maintain cooperation if the population is too large, essentially because nasty strategies can then profit from too many first encounters and too few second or higher encounters—in other words, too few opportunities for punishment. And third, T F 2T is again too forgiving to be a stable orthodox strategy, but it can coexist with ST F T if the population is small enough. Subsequently, it will be convenient to refer to animals or other organisms—e.g., bacteria [11, p. 1392]—as highly mobile if their pattern of interaction for an IPD is dynamic and as sessile if instead their interaction pattern is static. These labels need not apply to the organism’s entire life history. For example, impala live in very loose social groups and on the whole are better described as highly mobile; but pairs are sessile during an IPD consisting of repeated bouts of allogrooming (or social grooming) to remove parasites from areas that partners cannot reach [84, pp. 91–94].
5.7. Mutualism: common ends or enemies We have seen that it is possible to sustain cooperation via reciprocity in the iterated prisoner’s dilemma—provided, of course, that a player’s probability of further interaction is high enough. But how is cooperation initiated? Recall that in the simplest IPD we considered, namely, the game in which ALLD and T F T are the only strategies, both strategies are evolutionarily stable if μ > μc , where μc is defined by (5.20), and that T F T will become the orthodox strategy only if x2 (0) (the initial fraction of T F T -strategists in the game with matrix (5.19)), exceeds
202
5. Cooperation and the Prisoner’s Dilemma
the value γ defined by (5.21). But what if x2 (0) < γ? How is cooperation initiated then? One possibility is that sometimes the game with payoff matrix (5.1) is a prisoner’s dilemma and sometimes C is a dominant strategy, according to whether the environment in which the game is played is lenient or harsh. If so, then cooperation could be initiated by unconditional self-interest during a harsh phase and sustained throughout a lenient phase by reciprocity. Then, as it were, the environment would play the role of a common enemy—the ultimate enforcer of cooperation. How many times have you heard it said that the only way to accomplish wholesale cooperation among humans would be to invite aliens from outer space to invade our planet? To illustrate the effect of a common enemy, we now turn to another kind of game—wild animals—for a further example of the prisoner’s dilemma. Suppose that an army of rangers in jeeps is employed to protect an endangered species. These rangers patrol at random and are advised to confront poachers—who are quite ruthless—only in pairs, the accepted custom being that any poacher is intercepted by the nearest two rangers. Let us suppose that a poacher has been spotted, and that the nearest two rangers have been identified. If one of these rangers confronts the poacher, then we shall say that she selects strategy C, or cooperates; whereas if she desists from confrontation, then we shall say that she selects strategy D, or defects. The poacher, however, will not be granted the status of a player; rather, she is the rangers’ common enemy. This does not mean that poachers have no strategic possibilities, but because our interest is the behavior of the rangers, we disregard the poachers’ strategies (and instead incorporate their behavior into the parameters of our model, namely, , q, and Q defined below). There are two kinds of encounter between ranger and poacher—deliberate or accidental. The first arises when the ranger elects to confront the poacher, in which case an encounter between ranger and poacher is certain. The second arises when the ranger desists from confrontation, but is nevertheless unlucky enough to find herself on the poacher’s path, in which case we suppose that an encounter between ranger and poacher occurs with probability (< 1), regardless of whether the other ranger cooperates or defects (think of a defector as stationary, and a cooperator or poacher as always moving). When a poacher encounters a ranger, let the probability that the poacher inflicts injury on the ranger be Q or q, according to whether the ranger is isolated or reinforced by her colleague; and assume that if a poacher encounters a defecting ranger accidentally, then the encounter will be isolated only if the other ranger is also defecting (because if she is cooperating, then her jeep will be right on the poacher’s heels when the poacher meets the other ranger). Then, if the payoff to a ranger is the probability of avoiding injury, the interaction between rangers has payoff matrix (5.102)
A =
R T
S P
=
1−q 1 − q
1−Q . 1 − Q
It is straightforward to show that (5.2) and (5.3) are both satisfied—that is, the game with this matrix is a prisoner’s dilemma—whenever (5.103)
q < Q,
5.7. Mutualism: common ends or enemies
203
implying in particular that Q > q. Note that if there are N + 1 rangers, then the iterated game is an example of the IPD analyzed in §5.6, and that (5.102) and (5.103) imply P + R > S + T , so that (5.33) is violated (Exercise 5.23). Now, in constructing our payoff matrix, we have assumed that the only difference between cooperating and defecting lies in the probability of accidental discovery by a poacher—the rows of (5.102) would be identical if were equal to 1. We are therefore assuming that, conditional upon an encounter, a lone ranger who attacks a poacher has no advantage over a lone ranger who is surprised by a poacher on the run. If there is any truth at all to the old adage that attack is the best form of defense, however, then this assumption is false; rather, if the probability of injury while confronting alone is Q, then the probability of injury while defecting alone should be, not Q, but Z, where Z > Q. Similarly, if the probability of injury while confronting in pairs is q, then the probability of injury to a defector whose opponent is cooperating should be, not q, but z, where z > q—partly because the defector does not have the advantage of attack, but also because the cooperator may not be so hot on the heels of the poacher as to guarantee effective reinforcement (especially if she is mad at her colleague for defecting). We also expect Z > z. To allow (5.102) as a special case of the analysis that follows, however, we will weaken the inequalities Z > Q, z > q to Z ≥ Q, z ≥ q. Furthermore, we will assume that poachers are becoming more and more ruthless as time goes by in inflicting injuries on rangers they surprise, so that, if δ ≥ 0 is a measure of ruthlessness, then z and Z are increasing functions of δ. Let δ = 0 correspond to (5.102). Then the payoff matrix for the game at ruthlessness δ becomes R S 1−q 1−Q , (5.104) A(δ) = = 1 − z(δ) 1 − Z(δ) T (δ) P (δ) where (5.105)
z(0) = q,
Z(0) = Q
and (5.106)
z(δ) < Z(δ),
z (δ) > 0,
Z (δ) > 0,
0 < δ < ∞.
Confrontation is clearly the only rational strategy if poachers are infinitely ruthless against defectors. Therefore, C must be dominant as δ → ∞ or (5.107)
z(∞) > q,
Z(∞) > Q.
Together, (5.105)–(5.107) imply that unique δ1 , δ2 exist such that (5.108)
z(δ1 ) = q,
Z(δ2 ) = Q.
It need not be true that (5.109)
δ1 < δ2 .
Nevertheless, we shall assume so; see Figure 5.5(a). From inspection of (5.104), the rangers’ game at ruthlessness δ is a prisoner’s dilemma only if 0 ≤ δ ≤ δ1 . For δ1 < δ < δ2 , we have R > T (δ) and S < P (δ), so that both C and D are evolutionarily stable, and the final composition of the ranger population is determined by the initial fraction of cooperators; if most of the rangers are cooperating initially, then all will be cooperating eventually, and vice versa. For δ2 < δ < ∞, however, we have R > T (δ) and S > P (δ), so that C is a
204
5. Cooperation and the Prisoner’s Dilemma
T
εZ(δ)
Q
R P
εz(δ)
εQ q
t(δ)
p(δ)
S
εq 0
δ1
δ
δ2
δ0
δ1
(a)
δ
δ2 (b)
Figure 5.5. Variation with δ of defector’s payoffs in games against a common enemy
dominant strategy: what is best for the group is then also best for the individual. In short, if the poachers are sufficiently ruthless (but not infinitely ruthless), then cooperation is the only rational strategy; whereas if the poachers are not ruthless enough and if the initial proportion of cooperators is too small, then the rangers remain locked in a prisoner’s dilemma. If we think of the poachers as a harsh environment for the rangers, then what we have shown is that cooperation can emerge under adverse conditions purely as a response to changes in environmental parameters and without any need for reciprocity. More generally, we can conceive of cooperator’s dilemmas with payoff matrix R S (5.110a) A(δ) = t(δ) p(δ) satisfying (5.110b) (5.110c)
t(δ0 ) = T > R >
1 2 (S
+ T ),
t(∞) < R,
p(δ0 ) = P > S,
R > S
p(∞) < S,
and (5.110d)
R > p(δ),
t(δ) > p(δ),
p (δ) < 0,
t (δ) < 0
for 0 ≤ δ0 < δ < ∞, so that there exist unique δ1 and δ2 satisfying (5.110e)
t(δ1 ) = R,
p(δ2 ) = S;
see Figure 5.5(b). Here δ is a parameter that measures adversity, and δ0 is a base value, at which adversity is so slight that the players are still locked in a prisoner’s dilemma. Obviously, the ranger game is a special case of (5.110), in which R = 1−q, S = 1 − Q, t(δ) = 1 − z(δ), p(δ) = 1 − Z(δ), and δ0 = 0, but other such games can be constructed.10 Indeed it can be shown that if the oviposition game in §5.1 is extended to n periods, if the insects are unable to recognize their eggs, and if half the number of sites exceeds the probability per period of finding a site times 10
See, e.g., [59, 245], especially in the context of [217, p. 274].
5.8. Much ado about scorekeeping
205
the average number of periods an insect survives on the patch (a perfectly natural assumption if we wish sites to be abundant when the insects start to forage), then the game is an example of (5.110) in the limit as n → ∞, with δ equal to the ratio of survival probabilities, r1 /r2 , and 52 = δ0 < δ1 < δ2 < 11 4 [201]. What this means is that a very modest increase in environmental adversity (here, a decrease in the ratio of the survival probability of a paired egg to that of a solitary egg from 2 4 5 = 0.4 to 11 ≈ 0.36) can be all it takes to convert a game in which only defection pays (δ < δ1 ) to one in which only cooperation pays (δ > δ2 ). Note, finally, that there is no difference in principle between cooperation against a common enemy and cooperation towards a common end (for much the same reason that minimizing a function is the same as maximizing its negative). In either case, cooperation is an incidental consequence of ordinary selfish behavior, and so this category of cooperation is known to biologists as byproduct mutualism [50, 217], or simply mutualism [67, 68]. What distinguishes reciprocity from mutualism is the presence or absence of scorekeeping. For reciprocators, benefits are conferred or costs extracted by specific individuals, and it is necessary to keep tabs on their past behavior. For mutualists, by contrast, benefits are conferred or costs extracted by the common environment—both players and nonplayers—with which all interact; even if scorekeeping is possible it is unnecessary, because the risk is too high that anyone who tries to exploit others for short-term gain will only penalize herself. In other words, although there is no direct feedback between individuals—only indirect feedback from the environment—benefits exceed costs over the time scale on which rewards are measured, and so there is no incentive to cheat. For further examples of cooperation via mutualism, see §7.1 and §7.2.
5.8. Much ado about scorekeeping Many instances of cooperation among animals have been interpreted as either mutualism or reciprocity. Examples of the first kind include cooperative hunting in African wild dogs [68] and sentinel duty in meerkats [21, 69] and other species [42, p. 577]. Examples of the second kind include predator inspection by guppies or sticklebacks [84, pp. 63–70], social grooming by impala [84, pp. 92–94], and blood sharing by bats [84, pp. 113–114]. But the evidence is inconclusive, and so debate has been spirited. A key issue is whether putative reciprocators are sessile (have fixed partners) for the purposes of an IPD, because even if (5.101) holds, cooperation among highly mobile organisms is arguably more likely to be mutualism than reciprocity. To see why and also to demonstrate that mutualistic ALLC and reciprocal T F T may be distinguishable even in the absence of noncooperative ALLD, we embroil those three strategies in an iterated cooperator’s dilemma or ICD under dynamic interaction in this section and under static interaction in §5.9. As in §5.2, ALLC, ALLD, and T F T are strategies 1, 2, and 3, respectively. Now, the entire analysis of §5.6 was predicated on the ability of animals to distinguish one another: without it, reciprocity cannot be stable under dynamic interaction (Exercise 5.22). But there may be costs associated with recognizing previous partners and remembering whether they cooperated or defected; in other words, with keeping score. We include these costs here by supposing that T F T strategists store a memory of each partner at cost c0 , and that, at move k, they
206
5. Cooperation and the Prisoner’s Dilemma
compare the current partner with each of the k − 1 previous partners, at a cost of c 1 per comparison. Thus, on using (5.23) and (5.69), the total scorekeeping cost is ∞ k−1 = μc, where k=1 {c0 + (k − 1)c1 }w (5.111)
c = c0 + μc1 w
is the average cost of keeping score per move. Next we calculate a reward matrix A for the entire ICD, in which aij is the reward to an individual playing strategy i against N individuals playing strategy j. For example, a T F T -strategist among N ALLD-strategists obtains P −c at move k if it has previously met its partner, i.e., with probability σk , where σk is defined by new, (5.72a); whereas the T F T -strategist obtains S − c at move k if its partner isk−1 {(P − c)σ + (S − c)(1 − σ )}w = i.e., with probability 1 − σk . Thus a32 = ∞ k k k=1 1 and λ1 is defined by (5.74). We find in this μ(P − c) + λ1 (S − P ), where μ = 1−w way that ⎡ ⎤ μR μS μR μP μP + λ1 (T − P )⎦ , (5.112) A = ⎣ μT μ(R − c) μ(R − c) μ(P − c) + λ1 (S − P ) agreeing with (5.18) when c = 0 and N = 1 (hence also α = 1 and λ1 = 1, by (5.61) and (5.74)). Now, if c were zero, then ALLC would be (weakly) dominated by T F T (because μ > λ1 ), and we could remove ALLC from the game as in §5.2; moreover, T F T T −R would be an ESS for w > T −R+α(R−P ) , agreeing with (5.24) for N = 1. Under dynamic interaction, however, there must surely be significant costs associated with keeping score, and if c is positive, no matter how small, then T F T is no longer even a Nash-equilibrium strategy (because a13 > a33 ). Rather, ALLC is an ESS if R > T , and otherwise only ALLD is an ESS. Thus our model predicts that only mutualism can sustain cooperation among highly mobile organisms, although reciprocity could work for sessile organisms if the cost of scorekeeping were zero. But mustn’t there always be some cost associated with scorekeeping, because of the necessity to recall opponents’ previous moves? Can T F T never resist ALLC? We will return to this point in §5.9. Meanwhile, we conclude this short section by noting that, for discrete games in a finite population, there is a subtle difference between the standard conditions for strategy j to be an ESS—namely, ajj > aij for all i = j, where aij is the reward to an i-strategist among N other individuals, all of whom are j-strategists—and the stability conditions applied in §5.6. By comparing ajj to aij , as here, we imagine that a potential mutant who is currently a j-strategist compares its current reward to the one it would instead obtain if it switched to strategy i (and actually switches if that reward is greater). By contrast, in §5.6 we imagined that an individual who is already an i-strategist compares its reward to the one it would obtain if instead it were one of the other N individuals (and switches back if that reward is greater). In other words, in applying the standard conditions, we ask whether a mutant can enter a finite population; whereas in §5.6 we asked whether an infiltrator can be expelled. An answer to the second question yields more stringent conditions for cooperation than an answer to the first; but in practice the difference is often neg-
5.9. The comedy of errors
207
ligible, the standard conditions are usually easier to apply, and we will use them exclusively in Chapter 7.
5.9. The comedy of errors Everything so far assumes that players do not make errors, for example, in executing their strategies. But if they did, what would be their effect? There is no simple answer: errors can destabilize T F T , but they can also favor reciprocity by making an ESS out of a different reciprocal strategy that is otherwise only a Nash-equilibrium strategy. To broach the issues involved, we revisit the game of the previous section, but under static interaction, so that scorekeeping costs are small. For the sake of simplicity, we assume initially that they are precisely zero. Then, in the absence of errors, the payoff matrix A would reduce to (5.18). But we assume instead that each player makes a mistake—that is, defects when it intended to cooperate or vice versa—with the same small probability on every move. Thus, at their initial encounter, two T F T -strategists achieve their intended outcome CC of mutual cooperation with probability (1−)2 . With probability 2(1−), one player makes a mistake, so that the actual outcome is CD or DC, and with probability 2 they both make an error, so that the outcome is DD. If is sufficiently small, however, then we can safely neglect terms that are second-order in , which we denote collectively by O(2 ), where O is “big oh” (defined on p. 97). So, with negligible error, the probabilities of outcomes CC, CD, DC, and DD for the initial encounter between two T F T -strategists are, respectively, 1 − 2, , , and 0. If we label CC, CD, DC, and DD as outcomes 1, 2, 3, and 4, respectively, then their probabilities at subsequent encounters can be found with the help of an update matrix U , in which uij is defined to be the probability of outcome j at move k + 1, conditional upon outcome i at move k. For T F T versus T F T , it is readily shown that
(5.113)
⎡ ⎤ (1 − )2 (1 − ) (1 − ) 2 ⎢(1 − ) 2 (1 − )2 (1 − )⎥ ⎥ U = ⎢ ⎣(1 − ) (1 − )2 2 (1 − )⎦ 2 2 (1 − ) (1 − ) (1 − ) ⎡ ⎤ 1 − 2 0 ⎢ 0 1 − 2 ⎥ ⎥ + O(2 ). = ⎢ ⎣ 1 − 2 0 ⎦ 0 1 − 2
For example, u32 = (1−)2 = 1−2+O(2 ) because DC is followed by the intended outcome CD only if neither player errs; if Player 1 or Player 2, respectively, errs, then the outcome instead is DD or CC. On defining Δ± (k) =
(5.114) we obtain (5.115)
⎡
Uk
1 − 2k ⎢ k = ⎢ ⎣ k 0
1 2 (1
− 2k){1 ± (−1)k },
k Δ+ (k) Δ− (k) k
k Δ− (k) Δ+ (k) k
⎤ 0 k ⎥ ⎥ + O(2 ) k ⎦ 1 − 2k
208
5. Cooperation and the Prisoner’s Dilemma
(Exercise 5.25). But if xi (k) denotes the probability of outcome i at move k for i = 1, . . . , 4, then x(k) = x(0) U k . So, given that x(0) = (1, 0, 0, 0), the probabilities of CC, CD or DC, and DD at move k are x1 (k) = 1 − 2k, x2 (k) = k = x3 (k), and x4 (k) = 0, respectively, on neglecting terms of order 2 ; and so ∞ # a33 = (5.116) {Rx1 (k) + Sx2 (k) + T x3 (k) + P x4 (k)}wk−1 =
k=1 ∞ #
{R − (2R − S − T )k}wk−1
k=1
= Rμ − (2R − S − T )μ2 to first order in , on using (5.30) and (5.69). A similar calculation for ALLC versus T F T (Exercise 5.26) shows that, again to first order in , x(1) = (1 − 2, , , 0) and x(k) = (1 − 3, 2, , 0) for k ≥ 2. So ∞ # a13 = (1 − 2)R + S + T + {(1 − 3)R + 2S + T }wk−1 k=2
or (5.117)
a13 = Rμ − {2R − S − T + w(R − S)}μ
after simplification. The corresponding calculation for ALLD versus T F T (Exercise 5.26) yields, to the same order in , (5.118)
a23 = T + μwP − {2T − R − P + (3P − 2T − S)wμ}.
From (5.116)–(5.118), although a33 = a13 in the absence of errors, for > 0 (no −R . The terms of order in (5.118) matter how small) we have a33 > a13 if w < TR−S do not affect the lower bound on w, which is still (5.24). So T F T is an ESS if T −R T −R 11 This condition is always satisfied for some range of values of T −P < w < R−S . w when S + T > P + R, e.g., for 12 < w < 23 with Axelrod’s payoffs (5.25). The upper bound on w arises because the probability x1 (k) of mutual cooperation at move k continually decreases with k for T F T against itself but remains constant for ALLC against T F T if k ≥ 2, so that an ALLC mutant has an advantage over T F T if w is sufficiently large. Returning now to the question of whether T F T can ever resist ALLC, note that a small scorekeeping cost c does not affect the evolutionary stability of T F T as long as c < {T − R − w(R − S)}μ2 w. It is therefore possible for errors to offset the negative effect of the cost of scorekeeping on the stability of T F T . The important point, however, is not that T F T may be stable after all, but rather the more general point that one small effect can easily be offset by another. Unfortunately for T F T , the condition that makes it immune to ALLC requires −R −R , which, being incompatible with w > TR−S , renders it vulnerable to w < TR−S ST F T (p. 184). But T F T is merely one of many reciprocal strategies that are 11 Of course, if ALLC and ALLD correspond to absolutely fixed behavior, then it is unrealistic to suppose that either would ever behave like the other, even by mistake. So suppose instead that only T F T makes errors.Then the only possible states between ALLC and T F T are CC and CD, and ∞ (5.117) becomes a13 = k=1 {(1 − )R + S} = Rμ − (R − S)μ. The condition for a33 to exceed this quantity turns out to be exactly the same as before, i.e., w < (T − R)/(R − S). Similarly, (5.118) becomes a23 = P μ + (T − P )μ = P μ + O() as before. So our result is unaffected.
5.10. Commentary
209
capable of sustaining cooperation. Another such strategy, namely, contrite tit for tat or CT F T , which cooperates unconditionally after defecting by mistake, turns out with errors to be uniquely the best12 reply to itself—and therefore a strong −S T −R , Extrapolating from T F T to CT F T , a ESS—if w > max P R−S R−S [38, 326]. small scorekeeping cost need not prevent reciprocity from sustaining cooperation among sessile organisms, because its effect could be offset by, e.g., that of errors in executing strategies.
5.10. Commentary In this chapter we explored conditions for the evolution of cooperation in terms of the cooperator’s dilemma. We began with an illustration of the prisoner’s dilemma, the foraging game in §5.1. Then, in §5.2 and §5.3, we analyzed the iterated prisoner’s dilemma or IPD as a discrete population game, showing how the reciprocal strategy tit for tat or T F T might sustain cooperation. In §5.4, we constructed a model of the IPD in a finite population, and in the following two sections we used this model to develop necessary conditions for T F T to be a stable population strategy under both static (§5.5) and dynamic (§5.6) interaction patterns.13 In §5.7, we showed how cooperation could emerge without reciprocity, as mutualism. Finally, in §5.8 and §5.9, we explored implications of scorekeeping costs and execution errors. The IPD has generated a substantial theoretical literature on reciprocity; see Sigmund [310, 312] and references therein. Earlier work (e.g., [96], in which different players have different probabilities of further interaction) typically considered only pure strategies. Later work (e.g., [250]) allowed for stochastic strategies, which we have not discussed, but our neglect of them in §§5.4–5.6 is somewhat bolstered by Vickery’s contention that only pure strategies can be ESSs in a finite population [344, 345]. In §5.6, dynamic interaction implies pairwise encounters, but it is possible to assume instead that each individual interacts with every other individual in the population. The rationality of cooperation has been studied in this context by Schelling [295] and by Boyd and Richerson [40], whose main conclusion—that conditions for the evolution of reciprocity become extremely restrictive as group size increases—aligns with our conclusions in §5.6 (p. 201). The idea that reciprocity may be found only in sessile organisms (§5.8, §5.9) implicitly recognizes that a population’s spatial structure is important: sessile individuals interact only with their neighbors. Some studies that incorporate this spatial effect explicitly exclude reciprocation; for example, Nowak et al. [253] consider binary choices between cooperation and defection, whereas Killingback et al. [159] consider unilateral investments in cooperation (with zero investment equivalent to defection). In the first case, the game is a spatial version of the ordinary prisoner’s dilemma; in the second case, spatial extension is added to a continuous 12 Indeed when Wu and Axelrod [365] repeated Axelrod’s ecological simulation (p. 192) with errors and a handful of additional strategies, including CT F T , the honor of most successful strategy clearly devolved from T F T to CT F T . 13 Sections 5.4–5.6 are based on [203] and are further developed in [213], where it is argued that the conditions for stable reciprocity may be hard to satisfy. The later paper considers a possible continuum of mobility between sessile and highly mobile; there is no spatial structure, and conditions become steadily less favorable to reciprocity as mobility increases; whereas Ferriere and Michod [97] find that spatial structure can make conditions most favorable to reciprocity at intermediate mobility.
210
5. Cooperation and the Prisoner’s Dilemma
prisoner’s dilemma or CPD. Either way, any cooperation is due to benefits of clustering, not reciprocity. In other studies, however, the relevant game is a spatial IPD in either one [97] or two [44, 120] dimensions, or it is an iterated CPD without spatial structure, which Roberts and Sherratt [285] and Wahl and Nowak [351] use to predict that successful strategies should gradually increase investment in the welfare of others, raising the stakes against strategies that are initially reluctant to cooperate. But further discussion of these or related studies would distract us too far from our goals, not least because spatial games are apt to rely so heavily on stochastic numerical simulations or cellular automata [363], whereas we have chosen to emphasize analytical models. Note, however, that analytical progress in spatial games is also possible, to varying degrees [97, 150, 275]. Whither T F T ? Many studies of the IPD, e.g., [173, 255, 367], focus on greater strategic sophistication. Much of this literature implicitly assumes that reciprocal strategies—though not T F T —solve the problem of cooperation in nature. But Crowley and Sargent [80] identify situations where T F T might prevail over more sophisticated cooperative strategies. Furthermore, many studies do not account for scorekeeping costs—or temporal discounting, i.e., the tendency to value instant gratification more highly than delayed gratification. Yet Crowley et al. [79, p. 61] find that even low memory costs can greatly disfavor reciprocity; and Stephens et al. [317, 319, 320] find that temporal discounting is significant and greatly disfavors reciprocity, because its effect is equivalent to reducing the value of w in an IPD.14 So one shouldn’t forget that the implicit assumption may be false: reciprocity need not always, or even often, solve the problem of cooperation—and when it does, the strategies may not need to be sophisticated.15 We saw in §5.9 that execution errors can rob T F T of its robustness in the IPD, whereas contrition can restore it. This, however, is by no means the end of the story because, e.g., CT F T is still susceptible to errors in perception of an opponent’s move [33]. More generally, execution errors and scorekeeping costs are merely two among a panoply of possible further effects that can significantly modify the outcome of the game. For example, another potential restorative is generosity, i.e., a tendency to forgive others their defections. But the same two authors who showed how successful generous strategies can be in prevailing over noncooperative extortion strategies [322] soon afterwards were demonstrating the vulnerability of such generosity to even comparatively minor environmental changes [323], in ways foreshadowed by [169]. This result in itself is not so surprising—we have already seen (in §5.7 (p. 205)) that a decrease of less than 10% in a key environmental parameter can be all it takes to convert a game in which only defection pays to one in which only cooperation pays. But it is also just the tip of an iceberg; for example, we have not even touched on how interindividual variation within a population—in fighting ability [232] or any other factor [17, 198]—can influence the propensity to cooperate. Indeed so many factors can play a role and so many assumptions differ between models that no clear picture of their overall effect has yet emerged, and further discussion would take us much too far afield for an introductory text. 
But not every kind of discounting is bad for cooperation; see §2.3 (p. 69). Press and Dyson [273] identify other circumstances where simpler strategies in the IPD surpass more sophisticated ones; see [321, p. 10134]. 14 15
5.10. Commentary
211
On the empirical side, the prisoner’s dilemma and IPD have spurred experiments with both humans and other animals. Those with humans date from the 1950s. Some underscore the difficulties of achieving mutual cooperation in an IPD; for a review, see Chapter 7 of [70]. Others (critiqued by Romp [286, pp. 229–240]) indicate that humans are more cooperative in an ordinary prisoner’s dilemma than game theory predicts. Why? We offer an answer later (p. 211). Experiments with other animals did not begin until the 1980s. They include experiments with rats [99], starlings [278], and blue jays [67]. On the whole, these experiments reaffirm the difficulties of sustaining cooperation through reciprocity; in particular, Clements and Stephens [67] found that cooperation neither developed nor persisted in an IPD, but mutualism persisted in the corresponding repeated game with the signs of T − R and P − S in (5.1) reversed. These results remind us that not all cooperation is reciprocity, although mutualism (§5.7) is not the only alternative—merely the most parsimonious [67]. Other categories of cooperation include group-selected behavior [274, 362], in which an animal’s success is determined (for better or worse) by that of its group; and kinselected behavior [124, 361], whose surface we will scratch in §6.3. Categories are not mutually exclusive; e.g., a group may consist of kin (as will be implicit in §6.3). But there are at least some cases of cooperation where kin- and group-selected behavior can be ruled out a priori [218, p. 556]. And so, for the sake of simplicity, we have chosen to focus in this chapter on mutualism versus reciprocity among unrelated individuals. For a broader overview of the literature on cooperation, see [84, 252, 331, 360] and references therein. Opinions vary on the extent to which mutualism is worthy of our attention. For Rand and Nowak [276], it is not even interesting enough to count as a category of cooperation. For West et al. [360], however, that “mutually beneficial cooperation is less interesting” than other types of cooperation is one of 16 common misconceptions about the evolution of cooperation, and the mechanisms that sustain mutualism “can often be much more complicated, from both a theoretical and empirical perspective” [360, p. 242] than those underpinning other forms of cooperation. Put differently, even if cooperative behavior is mutualistic, it can still be puzzling, and a game-theoretic model can help to reveal the hidden feedbacks that sustain it.16 In any event, firm evidence of reciprocity in nonhuman animal societies is rare, and many examples of cooperation between nonkin probably represent cases of mutualism [68, p. 51]. In sum, despite the plenitude of categories into which cooperation has been subdivided by various authors, this chapter demonstrates that there is really only one escape from the prisoner’s dilemma, and that is to discover that the game being played—or, in experiments, the game that subjects perceive themselves as playing—is not really the prisoner’s dilemma after all. In other words, if A is the payoff matrix, strategy 1 is cooperative, strategy 2 is noncooperative, and there 16 For example, Australian fiddler crabs have been observed helping neighbors defend their territories against invaders [13]. Helping can be costly. To join the fight, helpers must leave their burrows, thus risking usurpation in their absence, and fighting drains energy which may result in claw loss. So why do helpers help? 
Reciprocity can be excluded, because helpers are always larger than the neighbors they help. But a detailed model [231] shows that the costs of renegotiating boundaries with a new and potentially stronger neighbor can outweigh those of helping and thus sustain this initially puzzling behavior mutualistically.
212
5. Cooperation and the Prisoner’s Dilemma
appears to exist a time scale over which a21 > a11 , then, when all relevant factors are accounted for, either that appearance must turn out to have been an illusion or else there must exist a longer time scale over which a11 > a21 , and which is also the relevant time scale for tallying benefits and costs [218]. Nevertheless, the details may differ significantly in different cases, and game-theoretic models help us to unravel them.
Exercises 5 1. Establish (5.15). 2. (a) Show that if there is constant probability w of further interaction in the IPD, then the number of moves has distribution (5.22). (b) Verify (5.23). 3. Under the conditions of §5.2: (a) Show that T F 2T can be invaded by ST CO, whereas T F T cannot be −R . invaded by ST CO if w > TR−S (b) Determine the final composition of the population when ST CO invades T F 2T . 4. Verify (5.31) and (5.34). 5. Verify that H defined by (5.45) is a hyperbola, and that Φ3 is negative in the shaded region of Figure 5.3. 6. Verify (5.46). 7. Verify (5.48). 8. Verify (5.50) and (5.51), and verify that (5.32) implies d > b > a > c. 9. Show that P + R > S + T in a prisoner’s dilemma implies −S −S T −R T −R >P (a) TP −P R−S > T −P > R−S , P −S P +R−S−T (b) R−S > 2R−S−T . 10. Strategy RN DM means cooperating or defecting with probability 12 . Extend the IPD of §5.3 to five strategies, with RN DM as strategy 5; in other words, define the fifth row and column of the 5 × 5 move-k payoff matrix φ(k). 11. (a) Obtain (5.70). (b) Obtain (5.72c). 12. Verify that (5.73) reduces to (5.70) when N = 1. 13. With regard to the IPD, we showed in §5.3 that T F 2T does better than T F T −R , assuming of course that no other strategies against ST F T whenever w > TR−S are present; see the remarks after (5.49). But T F 2T ’s first two payoffs against ST F T are S and R, whereas T F T ’s first two payoffs are S and T . Thus, because S + wT > S + wR, T F 2T ’s advantage over T F T (against ST F T ) does not emerge until the third encounter. Under dynamic interaction, however, we would expect third encounters with the same individual to be rather infrequent. Does this mean that T F 2T loses this advantage over T F T when opponents are drawn at random throughout the game? Why or why not? Use (5.73).
Exercises 5
213
14. Show that T F T is stable against RN DM (defined in Exercise 5.10) for sufficiently large N if (5.32) is satisfied. 15. Obtain (5.84). 16. Find necessary conditions for T F 2T to be stable against infiltration by RN DM (defined in Exercise 5.10). 17. Verify (5.94) and (5.96). 18. Verify that (5.99) implies ψ(N ) < 0, where ψ is defined by (5.98). 19. Show that (5.2)–(5.3) imply ξ1 > defined by (5.91).
T −R T −P
and ξ2 >
T −R R−S ,
where ξ1 and ξ2 are
20. Let T F T be the orthodox strategy and RN DM (defined in Exercise 5.10) the only deviant strategy in an iterated prisoner’s dilemma. With RN DM as strategy 5 (and N2 = N3 = N4 = 0), find expressions for the expected payoffs W1 and W5 under dynamic interaction. Hence, find conditions for T F T to be stable against RN DM when opponents are drawn at random. 21. Let T F 2T (strategy 2) be the orthodox strategy, and let RN DM (defined in Exercise 5.10) be the only deviant strategy in an iterated prisoner’s dilemma. With RN DM as strategy 5 (and N1 = N3 = N4 = 0), find expressions for the expected payoffs W2 and W5 under dynamic interaction. Hence, find conditions for T F 2T to be stable against RN DM when opponents are drawn at random. 22. The entire analysis of §5.6 is predicated on perfect recognition and recall; that is, we assume that players always recognize opponents they have met before, and always remember whether they cooperated or defected. Show that if recognition were absent, then dynamic interaction would make T F T unstable at all values of w and N , because ALLD would always invade. 23. Verify that, subject to (5.103), (5.102) satisfies both (5.2) and (5.3), but (5.33) does not hold. 24. Consider the IPD in which T F T is subject to mixed infiltration by ALLC and ALLD. Show that T F T —though not an ESS of the game with matrix (5.18)— will nevertheless increase in frequency under static interaction if both (5.78) and (5.79) are satisfied, provided only that its initial frequency be sufficiently large. Repeat your analysis for dynamic interaction, and obtain the corresponding result. 25. (a) Verify (5.113). (b) Show that x(k) = x(0) U k , where U is any update matrix and xi (k) is the probability of outcome i at move k. (c) Verify (5.115). (d) What are the update matrices for ALLC and ALLD against T F T in §5.9? Verify that U k+1 = U k for k ≥ 2. 26. Verify (5.116)–(5.118).
Chapter 6
Continuous Population Games
Here and in the following two chapters we study further examples of both continuous and discrete population games. This chapter focuses on continuous games. Our goal is twofold: to demonstrate that games are valuable tools in the study of animal (including human) behavior and to capture the breadth and diversity of possible models. All but one of the continuous games are separable, i.e., if their strategies are m-dimensional vectors, then their reward functions can be written as (6.1)
f (u, v) =
m #
fi (ui , v),
i=1
where v = (v1 , v2 , . . . , vm ) is the strategy adopted by the population and u = (u1 , u2 , . . . , um ) is a potential mutant strategy. Separability reduces the problem of maximizing f (u, v) for given v to m one-dimensional problems, thus greatly simplifying calculation of the optimal reaction set (6.2) R = (u, v) | f (u, v) = max f (u, v) , u
although the complexity of the analysis still usually increases with m; for example, in §6.9, where m = 4, the analysis is appreciably more complicated than in §6.4, where m = 2. Note, however, that separability is by no means a prerequisite for analysis (as illustrated later in §8.1 for m = 3). A tractable analytical model requires a judicious choice of strategy set. In this regard, vectors of five types of variables have proven especially practicable as strategies in continuous games. The first type of variable is a continuous approximation to an essentially discrete variable, such as price in Store Wars (§1.4), the second type is an amount or intensity, such as effort in the sunken-treasure contest (§1.6), the third type is a threshold, such as the aggression threshold in the continuous Hawk-Dove game (§2.6), the fourth type is a proportion, and the last type is a probability, such as the probability of choosing G in Crossroads (§1.1). All five types of variable are used in this chapter, specifically, the first in §6.7, the second in §6.4, the third in §6.6 and §6.9, the fourth in §6.1–§6.3, and the fifth in §6.5 and 215
216
6. Continuous Population Games
§6.8. It is convenient to adopt notation that distinguishes this last type from the other four. Accordingly, in §6.5 and §6.8 we use q for an orthodox probability and p for a deviant one—think of q for “quiescent” and p for “perturbed” replacing the usual v for “vogue” and u for “unfashionable” (§2.2, p. 63). We now proceed with our examples of continuous games.
6.1. Sex allocation. What is the evolutionarily stable sex ratio? We begin with a simple model of sex allocation in animals.1 Suppose that an animal can determine the sex of its offspring. Then what proportion of its offspring should be male and what proportion female? In answering this question, we will make matters simple by supposing that the animal is the female of its species, that there is a fixed amount of resources to invest in its brood, and that sons and daughters are equally costly to raise, so that after mating she produces the same number of children, say C, regardless of how many are sons or daughters. According to Darwin, our animal will behave so as to transmit as many of its genes as possible to posterity. We can capture this idea most simply by supposing that our animal will behave so as to maximize the expected number of genes it transmits to the second generation, i.e., to its grandchildren. Sex ratio clearly cannot affect the size of the first generation, for if our animal’s objective were simply to maximize number of children, then what would it matter if they were male or female? Let our animal’s strategy be the proportion of its offspring that is male, which we denote by u. Then it always has uC sons and (1 − u)C daughters after mating. In principle, uC and (1 − u)C should both be integers; in practice, they may not be integers. But that will scarcely matter if we first assume that our animal is the kind that lays thousands of eggs—a fish, perhaps. We will imagine that it plays a game against every other female in its population, and that all such females—say N in all—adopt strategy v; that is, they have vC sons and (1 − v)C daughters after mating. Let σm denote the proportion of sons who survive to maturity and σf the proportion of daughters. Then the number of daughters in the next generation is d = σf (1 − u)C + N σf (1 − v)C and the number of sons is s = σm u C + N σm v C. If the population is so large that our animal’s choice of strategy will have negligible effect on proportions, then an excellent approximation is σf (1 − v) d = . s σm v Formally, of course, (6.3) obtains in the limit as N → ∞. Let us now assume that females always find a mate, because some males will mate more than once if d > s; and that males are equally likely to find a mate if d < s, equally likely to find a second mate if s < d < 2s, etc. Thus, if M is the number of times that a son mates, then M is a random variable with expected value E[M ] = d/s.2 (6.3)
1
It derives from Fisher [98], but here is adapted from Charnov [61] and Maynard Smith [188]. There are two ways to derive this result. The first is simply to observe that it is obvious. The second is to suppose that (k − 1)s ≤ d < ks, where k ≥ 1 is an integer, and to observe that M = k − 1 if a son fails to mate a kth time but M = k if the son does mate a kth time. Then, because the probability of a kth mating is, by assumption, (d − (k − 1)s)/s, and the probability of no kth mating is therefore (ks − d)/s, we have E[M ] = (k − 1) · (ks − d)/s + k · (d − (k − 1)s)/s = d/s. 2
6.2. Damselfly duels: a war of attrition
217
Now, the number of genes that our animal transmits to the second generation is proportional to its number of surviving grandchildren, and hence to (6.4)
F = σf (1 − u)C · C + σm uC · CM,
because σf (1 − u)C of its daughters and σm uC of its sons survive to maturity, and because daughters all produce C offspring, whereas sons produce C offspring per mating.3 The expected value of this payoff is E[F ] = σf (1 − u)C 2 + σm uC 2 E[M ]. Thus, on setting E[M ] = d/s and using (6.3), our animal’s reward is (6.5)
f (u, v) = C 2 σf {v + (1 − 2v)u}/v,
provided of course that v = 0. We have glossed over some details in our eagerness to obtain the above reward. We have implicitly assumed that the population outbreeds—no individual mates with a brother or sister—and that all the males suffice to mate with all the females. Both assumptions are of negligible consequence if v = 0 and N → ∞. If v = 0, however, then there are no males in the population to mate with our animal’s daughters, and so it can have grandchildren only through its σm Cu surviving sons when they mate with the N σf C surviving females in the population, so that d/s = N σf ÷ σm u in place of (6.3), F = σm uC · CM in place of (6.4), and thus f (u, 0) = N C 2 σf when u > 0. But f (0, 0) = 0: when v = 0, no son means no grandchildren! So the best response to v = 0 is any positive u—in theory. In practice, however, it is unlikely that one or two sons could mate with the entire female population, so the best response to v = 0 is surely u = 1. Either way, it is now easy to calculate the optimal reaction set R and show that v = 12 is a continuously stable, weak ESS (Exercise 6.1). Thus, in the highly idealized circumstances described, the population should evolve to produce equal numbers of sons and daughters—irrespective of the proportions of sons and daughters that survive to maturity.
6.2. Damselfly duels: a war of attrition The study of the fitness consequences of behavior has come to be known as behavioral ecology.4 Research in this field poses the basic question, What does an animal gain, in fitness terms, by doing this rather than that? [28, p. 9] So behavioral ecology thrives on paradoxes—baffling inconsistencies between intuition and evidence that engage our attention and stimulate further investigation. A paradox arises because evidence fails to support an intuition, which (assuming the evidence to be sound) can happen only if the intuition relies on a false assumption about behavior, albeit an implicit one. So the way to resolve the paradox is to spot the false assumption. In other words, if a paradox of animal behavior exists, then we have wrongly guessed which game best models how a real population interacts, and to resolve this paradox we must guess again—if necessary, repeatedly—until eventually we guess correctly. Assuming the validity of our solution concept for 3 If the sons and daughters mated, then the daughter’s genes would be counted by the first term in (6.4) and the son’s genes by the second; see Exercise 6.28. Here, however, we simply assume that the population outbreeds. 4 For an introduction to the subject at large, see [42, 81].
218
6. Continuous Population Games
population games, i.e., assuming that observed behavior corresponds to some ESS, our task is to construct a game whose ESS corresponds to the observed behavior. Then the resolution of the paradox lies in the difference between the assumptions of this new model and the assumptions we had previously been making about the observed behavior (whether we realized it or not). Of course, a model population is only a caricature of a real population. But a paradox is only a caricature of real ignorance. So, in terms of realism, a game and a paradox are a perfect match. In this regard, our next example will illustrate how a population game can help to resolve a paradox. The game is a simple model of nonaggressive contest behavior, in which animals vie for an indivisible resource by displaying only—unlike in the Hawk-Dove game of §2.4, where the animals sometimes fight. The cost of displaying increases with time, and the winner is the individual whose opponent stops displaying first. So the game is a war of attrition. A common expectation for such contests, confirmed by experimental studies on a variety of animals, is that each animal compares its own strength to that of its opponent and withdraws when it judges itself to be the probable loser.5 We call this expectation the mutual-assessment hypothesis. The duration of such contests is greatest when opponents are of nearly equal fighting ability, so that it is more difficult to judge who is stronger and therefore the likely winner. But a series of contests over mating territories between male damselflies, staged by Marden and Waage [179], failed to follow this logic. Although the weaker animal ultimately conceded to its opponent in more than 90% of encounters, there was no significant negative correlation between contest durations and differences in strength. Why? We explore this question in terms of game theory. A possible answer is that the damselflies were not assessing one another’s strength. Animals who contest indivisible resources will vary in reserves of energy and other factors, but an animal’s state need not be observable to its opponent, and so we will assume that an animal has information only about its own condition. Let us also assume that variation in reserves is continuous, and that an animal’s state is represented by the maximum time it could possibly display before it would have to cede the resource for want of energy. Let this time be denoted by Tmax , and let T be the time for which the animal has already displayed. Then the larger the value of Tmax − T , the more likely it is that continuing to display will gain the animal the resource. Thus the animal should persist if its perception of Tmax − T is sufficiently large. We will refer to Tmax − T as the animal’s current reserves and to Tmax as its initial reserves.6 Our war-of-attrition model requires a result from psychophysics. Let π (> 0) denote the intensity of the stimulus of some physical magnitude, e.g., size or time, and let ρ(π) be an animal’s subjective perception of π. Then, in general, ρ (π) > 0 and ρ (π) < 0; that is, perception increases with intensity of stimulus, but the greater that intensity, the greater the increment necessary for perception. Now, for all sensory modalities in humans, the ratio between a stimulus and the increment required to make a difference just noticeable is approximately constant over the usable (middle) range of intensities. Therefore, to the extent that human 5 6
See, e.g., [225, p. 66]. An implicit assumption here is that energy is proportional to time [225, p. 72].
6.2. Damselfly duels: a war of attrition
219
psychophysics also applies to other animals,7 it is reasonable to assume that if a is the least observable intensity and π is increased steadily beyond a, then a first difference will be noticed when π = a(1 + b), a second when π = a(1 + b)2 , a third when π = a(1 + b)3 , and so on, where b is the relevant constant. If it is further assumed that these just noticeable differences are all equal—to c, say—and if zero on the subjective scale corresponds to a on the objective scale, then an animal’s subjective perception of the stimulus π = a(1 + b)k is ρ = kc. Thus π = a(1 + b)ρ/c or (6.6)
ρ(π) = γ ln (π/a) ,
where γ = c/ ln(1 + b) is a constant. In psychophysics, (6.6) is usually known as Fechner’s law (but occasionally as the relativity principle), and it clearly satisfies ρ (π) > 0, ρ (π) < 0. We will assume throughout that it provides an adequate model of the relationship between sensory and physical magnitudes. Now, let H and L be physical magnitudes whose difference determines the probability that a favorable outcome will be achieved if a certain action is taken. Then an animal should take the action if it perceives H to be sufficiently large compared to L. If it perceives each magnitude separately,8 then it should take the action if ρ(H)−ρ(L) is sufficiently large. But (6.6) implies that ρ(H)−ρ(L) = γ ln(H/L). So the action should be taken if H/L is sufficiently large. Thus, on setting H = Tmax and L = T above, an animal should persist if Tmax /T is sufficiently large—bigger than, say, 1/w—but otherwise give up. Accordingly, we define strategy w to mean (6.7)
stay
if
T < wTmax ,
go
if
T ≥ wTmax .
We interpret w as the proportion of an animal’s initial reserves that it is prepared to expend on a contest. Thus, if the initial reserves of a u-strategist and a v-strategist are denoted by X and Y , respectively, where X and Y are both drawn randomly from the distribution of Tmax , then the u-strategist will depart after time uX, and the v-strategist will depart after time vY . We assume that the value of the contested resource is an increasing function of current reserves, a reasonable assumption when the resource is a mating territory (although the assumption would clearly be violated—effectively reversed—in a contest for food). For simplicity, we assume that the increase is linear, i.e., the value of the resource is α(Tmax − T ) where α > 0. The parameter α has the dimensions of fitness (p. 71) per unit of time, and so we can think of α as the rate at which the victor is able to translate its remaining reserves into future offspring. Let β (> 0) denote the cost per unit time of persisting, in the same units. Then, because X and Y are (independent) random variables, the payoff to a u-strategist against a
7 See, for example, [296, pp. 15–16]. Although humans are still the subject of most empirical work, human psychophysics does seem to apply to many other animal taxa—including invertebrates [5]. 8 As argued by [225, p. 75]. If the animal actually perceives the difference, of course, then it should instead take the action if ρ(H − L) is sufficiently large, and hence if H − L is sufficiently large, but results are qualitatively the same [225].
220
6. Continuous Population Games
v-strategist is also a random variable, say F defined ⎧ ⎪ ⎨α(X − vY ) − βvY (6.8) F (X, Y ) = −βuX ⎪ ⎩ 0
by if uX > vY, if uX < vY, if u = 0 = v,
because the u-strategist wins if uX > vY but loses if uX < vY , and the cost of display is determined by the loser’s persistence time for both contestants. We ignore the possibility that uX = vY (other than where u = 0 = v) because reserves are continuously distributed, and so it occurs with probability zero. We also assume the resource is sufficiently valuable that α > β or θ < 1, where (6.9)
θ = β/α.
Let us now assume that initial reserves Tmax are distributed over (0, ∞) with probability density function g. Then the reward to a u-strategist is f (u, v) = E[F ] = ∞∞ F (x, y)dA, where we have used (2.22) and E denotes expected value. On 0 0 using (6.8), we obtain
" ' " ∞
(6.10) f (u, v) =
ux/v
{αx − (α + β)vy}g(y) dy
g(x) 0
0
" − βu
"
∞
dx '
vy/u
g(y)
xg(x) dx
0
dy
0
if (u, v) = (0, 0) but f (0, 0) = 0. For example, if Tmax is uniformly distributed with mean μ according to
1 if 0 < t < 2μ, 2μ (6.11) g(t) = 0 if 2μ < t < ∞, then (Exercise 6.2) (6.12)
f (u, v) =
⎧ ⎪ ⎨
αμu 3v {2 − 3θv − (1 − θ)u} αμ{3(1−{1+θ}v)u2 +({2+θ}u−1)v 2 } 3u2 ⎪
⎩
and R is defined by (6.13)
u = B(v) =
0
⎧ ⎪ ⎨
if u ≤ v, v = 0, if u > v, if u = 0 = v,
if u ≤ v, v = 0, if u > v, ⎪ ⎩ any u ∈ (0, 1] if v = 0. 2−3θv 2−2θ 2 u = 2+θ
Note that (0, 0) ∈ / R. Thus the unique ESS is v = v ∗ where (6.14)
v∗ =
2 , 2+θ
as illustrated by Figure 6.1. At this ESS, each animal is prepared to expend at least two thirds of its initial reserves, although only the loser actually does so. Moreover, because each animal is prepared to persist for time v ∗ Tmax at the ESS, the victor is always the animal with the higher value of Tmax —even though an opponent’s reserves are assumed to be unobservable. Thus victory by the stronger animal need not imply that an opponent’s reserves are being assessed. But animals still
6.2. Damselfly duels: a war of attrition
1
v
221
1
0
v∗ 1
0
u
v
0
v∗
0
a
1
u
b
Figure 6.1. Optimal reaction set for the war of attrition with uniform distribution of initial reserves when (a) 0 < θ < 23 and (b) 23 < θ < 1, where θ is defined by (6.9) and v ∗ by (6.14). R is drawn for (a) θ = 1/2 and (b) θ = 3/4.
assess their own reserves (and respond to the distribution of reserves among the population). So we refer to the expectation that an opponent’s reserves are not being assessed as the self-assessment hypothesis. As noted on p. 72, an effect that is small in a real population is typically absent from a model. So we would expect a 90% win rate for stronger males in the real world to translate into a 100% win rate for stronger males in a model world. And this is precisely what we have predicted: because both contestants are prepared to deplete their initial reserves by the same proportion, the weaker one invariably gives up first. But it is also what the mutual-assessment hypothesis predicts, at least in the absence of assessment errors, because the weaker animal withdraws as soon as it has judged itself to be the probable loser. So are the two competing hypotheses indistinguishable? We return to this question below (p. 223). Although the uniform distribution is not the only one for which an ESS always exists (Exercise 6.3), for many distributions an ESS exists only if θ is sufficiently small. For example, suppose that Tmax is distributed parabolically with mean μ according to
3t(2μ−t) if 0 < t < 2μ, 4μ3 (6.15) g(t) = 0 if 2μ < t < ∞. Then, on interchanging the order of integration in the second term of (6.10) and substituting from (6.15), we have
" ' " 2μ
f (u, v) =
ux/v
0
0
" (6.16a)
{αx − (α + β)vy}g(y) dy
g(x) − βu
= αμu
2μ
"
'
2μ
xg(x) 0
dx
g(y) dy
dx
ux/v
u(5u{(3 − θ)u − 4} + 14v{3 − (2 − θ)u}) −θ 35v 3
222
6. Continuous Population Games
for u ≤ v, v = 0; whereas, on interchanging the order of integration in the first term of (6.10) before substituting from (6.15), we have
" ' " 2μ
2μ
0
vy/u
" (6.16b)
{αx − (α + β)vy}g(x) dx
g(y)
f (u, v) =
"
2μ
− βu
'
vy/u
g(y) 0
dy
xg(x) dx
dy
0
(14u{(3 + θ)uv − 2v} + 5v 2 {3 − (4 + θ)u})v 2 = αμ 1 − (1 + θ)v + 4 35u
for u ≥ v. Differentiating, we obtain (24 + 13θ)(v ∗ − v) ∂f (6.17) = αμ, ∗ ∂u
35v
u=v=v
where (6.18)
v∗ =
24 24 + 13θ
(Exercise 6.4), and it follows from (2.47)–(2.49) that v ∗ is the only candidate for ESS. To show that v ∗ is indeed an ESS, we must verify (2.17). For u > v ∗ , it follows from (6.16b) that 2 u − v∗ 1 (6.19) f (v ∗ , v ∗ ) − f (u, v ∗ ) = 35 αμ Q+ (u) , 2 (24 + 13θ)u
where Q+ (u) = (216 + 47θ)u{48 + (24 + 13θ)u} − 8640 is strictly increasing with respect to u, and so, for u > v ∗ , Q+ (u) > Q+ (v ∗ ) = 10368(16 − 3θ)/(24 + 13θ) > 0. Hence f (v ∗ , v ∗ ) > f (u, v ∗ ) for all u > v ∗ . For u < v ∗ , it follows from (6.16a) that (6.20)
f (v ∗ , v ∗ ) − f (u, v ∗ ) =
1 483840
αμ (24 + 13θ)Q− (u)(v ∗ − u)2 ,
where Q− (u) = 48(108 − 169θ) + 4(108 + 41θ)(24 + 13θ)u − 5(3 − θ)(24 + 13θ)2 u2 . Because Q− (u) < 0 and Q− (v ∗ ) − Q− (0) = 96(18 + 71θ) > 0, the minimum of Q− on [0, v ∗ ] occurs at u = 0. Hence f (v ∗ , v ∗ ) − f (u, v ∗ ) is guaranteed to be positive ∗ ∗ ∗ for all u < v ∗ only if Q− (0) > 0, or θ < 108 169 . Then f (v , v ) > f (u, v ) for all 108 ∗ ∗ u = v , implying that v is a strong ESS. If θ = 169 , then it can be shown directly that v ∗ is a weak ESS (Exercise 6.4). Thus v ∗ is an ESS if (and only if) (6.21)
θ≤
108 169 .
∗ When θ > 108 169 , however, a population of v -strategists can be invaded by mutants who play u = 0, i.e., give up immediately. These results are illustrated by Figure 6.2, where R is sketched. Although giving up immediately can invade strategy v ∗ if θ > 108 169 , it is advantageous only when rare, and so it does not eliminate v ∗ . Suppose that giving up immediately (strategy 1) and expending proportion v ∗ (strategy 2) are found at frequencies x1 and x2 , respectively. Then the reward to strategy 1, which entails neither benefits nor costs, is W1 = 0, and for W2 (the reward to strategy 2) is f (v ∗ , 0) = αμ times the probability of meeting strategy 1 plus f (v ∗ , v ∗ ) times the
6.2. Damselfly duels: a war of attrition
223
Figure 6.2. Optimal reaction set (solid curve) for the war of attrition with parabolic distribution of initial reserves when (a) θ = 14 , (b) θ = 12 , (c) , and (d) θ = 78 , where θ is defined by (6.9) and v ∗ by (6.18). θ = 108 169
probability of meeting strategy 2. So (6.22)
W1 − W2 = −αμx1 − f (v ∗ , v ∗ )x2 = −f (v ∗ , v ∗ ) − {αμ − f (v ∗ , v ∗ )}x1 ,
where f (v ∗ , v ∗ ) = −2(169θ − 108)αμ/(455θ + 840) < 0. For small x1 , W1 − W2 is positive, so that strategy 1 increases in frequency; for large x1 , W1 −W2 is negative, so that x1 decreases. The population stabilizes where W1 = W2 or (6.23)
x1 =
2(169θ−108) 13(48+61θ)
108 169 , there still exists a polymorphic ESS at which the proportion of those who give up immediately is invariably less than 9%. The polymorphism persists because negative payoffs to strategy 2 on meeting itself are balanced by large positive payoffs on rarer occasions when it meets strategy 1. Thus the alternative strategies do equally well on average, and there is no incentive to switch from one to the other. Neither a uniform nor a parabolic distribution of initial reserves is especially realistic. But the results of this section generalize to other distributions (Exercise 6.5), enabling one to find an acceptable fit to the data on damselfly energy reserves collected by Marden, Waage and later Rollins [179], [178]. Now, we discovered earlier that, with respect to victory by the stronger contestant, these data are consistent with both the mutual-assessment and the self-assessment hypotheses. But there is also a difference. In our model of pure self-assessment, which has
224
6. Continuous Population Games
acquired the acronym WOA-WA,9 a contest ends when the loser gives up after using a fixed proportion of its reserves, and so we predict a positive correlation between final loser reserves and contest duration; whereas, as discussed earlier, the mutual-assessment hypothesis predicts a negative correlation between strength difference and contest duration. This difference10 demonstrates that game-theoretic models are capable of yielding testable predictions. Although the WOA-WA was originally developed to address a damselfly paradox, the model applies to other species. Empirical support for self-assessment has since emerged from staged contests in dragon lizards [196] and, to various degrees, in wasps [339], pigs [55], sheet-web spiders [352], and cave-dwelling wetas [95].
6.3. Games among kin versus games between kin In §6.2 we found that the war of attrition need have no monomorphic ESS if the cost of display is sufficiently high.11 We assumed, however, that contestants are unrelated. Here we study how nonzero relatedness modifies the conditions for a strategy to be an ESS with particular reference to the war of attrition. For simplicity, we assume that the strategy set is [0, 1], as in §6.1 and §6.2. According to Darwin, animals will behave so as to transmit as many as possible of their genes to posterity. By descent, any two blood relations share a nonnegligible proportion of genes for which there is variation in the population at large, and hence have a tendency to exhibit the same behavior—or to have the same strategy. But animals may also behave identically because of cultural association. Accordingly, let r be the probability that a strategy encounters itself by virtue of kinship, where kinship can be interpreted to mean either blood-relationship or similarity of character (as in common parlance); then 1 − r is the probability that the strategy encounters an opponent at random (still possibly itself). We call r the relatedness, and assume that r < 1. Nothing in our analysis will depend on whether animals tend to behave identically by virtue of shared descent or shared culture—except, as remarked in §2.1 (p. 62), the time scale of the dynamic by which an ESS can be reached. Let the population contain proportion 1 − of an orthodox strategy v and proportion of a mutant strategy u, and let f (u, v) denote, as usual, the reward to a u-strategist against a v-strategist. Then, because u and v are encountered with probabilities and 1 − , respectively, the reward to strategy s against a random opponent is (6.24)
9
w(s) = f (s, u) + (1 − )f (s, v),
For war of attrition without assessment, i.e., without mutual assessment; see, e.g., [9, 335]. It has been explored elsewhere [211, 225], although with inconclusive results. A difficulty is that in the absence of mutual assessment, an apparent negative relationship between strength difference and contest duration could arise as an artefact of a stronger positive relationship between loser reserves—or, more generally, loser RHP (p. 165)—and contest duration [335]. The matter is discussed at length by Briffa et al. [46, pp. 63–65]. 11 And variation in reserves is sufficiently low; see [211, 225]. 10
6.3. Games among kin versus games between kin
225
where s may be either u or v. Thus the payoff to s, allowing for s to encounter either itself with probability r or a random opponent with probability 1 − r, is (6.25)
W (s) = rf (s, s) + (1 − r)w(s) = rf (s, s) + (1 − r)f (s, v) + (1 − r){f (s, u) − f (s, v)}.
Again, s may be either u or v. Now, strategy v is an ESS among kin if W (v) > W (u) for all u = v when is sufficiently small. From (6.25), however, W (v) − W (u) has the form A + B with (6.26a)
A = f (v, v) − f (u, v) + r{f (u, v) − f (u, u)},
(6.26b)
B = (1 − r){f (v, u) − f (v, v) − f (u, u) + f (u, v)}.
So v is an ESS among kin if, for all u = v, either A > 0 or A = 0 and B > 0. That is, v is an ESS among kin if for all u = v, either (6.27a)
f (v, v) > (1 − r)f (u, v) + rf (u, u)
or (6.27b)
f (v, v) = (1 − r)f (u, v) + rf (u, u), f (v, u) + rf (u, v) > (1 + r)f (u, u)
(Exercise 6.8). If (6.27a) is satisfied for all u = v, then v is a strong ESS among kin, and otherwise v is a weak ESS among kin. These conditions are simpler to deal with if first we define φ by (6.28)
φ(u, v) = (1 − r)f (u, v) + rf (u, u).
Then it follows from (6.27) that v is an ESS among kin when, for all u = v, either (6.29a)
φ(v, v) > φ(u, v)
or (6.29b)
φ(v, v) = φ(u, v), φ(v, u) > φ(u, u)
(Exercise 6.8). These conditions are merely (2.18) with φ in place of f . So v is an ESS among kin of the game with reward function f when v is an ordinary ESS of the game with reward function φ, which we therefore interpret as the kin-modified reward of the original game. The optimal reaction set—obtained by setting f = φ in (6.2)—is found in the usual way, and if R intersects the line u = v uniquely (i.e., if φ has a unique maximum at u = v), then v is a strong ESS. Suppose, for example, that the war of attrition is played among kin and that initial reserves are uniformly distributed with mean μ so that, for u ≤ v = 0,12 (6.12) and (6.28) imply 1 1 (1 − r)(1 − θ)u2 − u− , (6.30) φ(u, v) = 13 αμ 2r + 2(1 − r) v
where θ is the cost/value ratio in (6.9) and λ = (6.31)
v∗ =
λ
2(1−r) r+(3−r)θ .
v
Let us define
2(1 − r) . 2 − r + (1 + r)θ
12 If v = 0, then φ has no maximum; from (6.32), it approaches its least upper bound αμ(1 − r/3) as u → 0, but φ(0, 0) = 0.
226
6. Continuous Population Games
1
v
1
0
v∗
0
1
u
v
0
v∗
0
a
1
u
b
Figure 6.3. Optimal reaction set for the war of attrition among kin with uniform distribution of initial reserves when (a) 3θ + (3 − θ)r < 2 and (b) 3θ + (3 − θ)r > 2, where θ is defined by (6.9) and v ∗ by (6.31). R is drawn for r = 0.1 and (a) θ = 1/2, (b) θ = 3/4. See Exercise 6.9.
Then φ has a unique maximum along the line (1 − θ)λu + v = λ if v ∗ ≤ v ≤ λ but at u = 0 if λ ≤ v ≤ 1 (Exercise 6.9). For u ≥ v = 0, on the other hand, we have αμ(1 − r)v 2 3u
(6.32) φ(u, v) =
1 2+θ− u
+
1 3 αμ{3
− r − (1 + 2θ)ru − 3(1 − r)(1 + θ)v},
which has a unique maximum where u is the only positive root of the cubic equation (1 + 2θ)ru3 + (2 + θ)(1 − r)v 2 u = 2(1 − r)v 2
(6.33)
(Exercise 6.9). In either case, the maximum occurs where u = v if, and only if, v = v ∗ (Figure 6.3). So v ∗ is the unique ESS among kin. Note that v ∗ decreases with r: the higher the relatedness, the lower the proportion of its initial reserves that an animal should be prepared to expend on contesting a resource. The upshot is that the average population reward at the ESS, namely, (6.34)
φ(v ∗ , v ∗ ) = f (v ∗ , v ∗ ) =
2(1 − θ + 3θr) αμ 2 − r + (1 + r)θ
(Exercise 6.9), is an increasing function of r. But the higher the average reward to the population, the more cooperative the outcome. Thus kinship induces cooperation, as in the prisoner’s dilemma among kin; see Exercise 6.13. For a parabolic distribution of initial reserves, the effect of kinship is even more striking because it can ensure that there is always a monomorphic ESS, even when θ is very close to 1 (although the required degree of relatedness, almost 18 , is perhaps rather high). From (6.15) and (6.28), we deduce after simplification that ∂φ αμ (6.35a) = {24(1 − r) − {24 − 11r + 13(1 + r)θ}v} , ∂u u=v 35v ∂2φ 12αμ (6.35b) = − (1 − r)(3 − {1 + 2θ}v) < 0 2 2 ∂u
u=v
35v
6.3. Games among kin versus games between kin
227
(Exercise 6.10). So φ invariably has a local maximum on u = v where v = v ∗ , defined by (6.36)
v∗ =
24(1 − r) . 24 − 11r + 13(1 + r)θ
This local maximum is also a global one if φ(v ∗ , v ∗ ) = f (v ∗ , v ∗ ) ≥ 0. But (6.15) and (6.36) imply that (6.37)
f (v ∗ , v ∗ ) =
2αμ{108 − 169θ + 35(1 + 13θ)r} , 35{24 − 11r + 13(1 + r)θ}
which is invariably positive if (6.38)
r≥
61 490
because θ < 1 (Exercise 6.10). If r < ESS if (6.39)
θ≤
≈ 0.1245
61 490 ,
on the other hand, then v ∗ is still an
108+35r 13(13−35r) .
61 The critical value of θ increases with r from 108 169 at r = 0 towards 1 as r → 490 , in ∗ ∗ agreement with (6.21) and (6.38). Note that φ(v , v ) again increases with r. Games among kin, in which a mutant strategy has a greater—by r—than infinitesimal probability of interacting with itself, must be carefully distinguished from games between kin, which are games between specific individuals whose proportion of (variable) genes shared by descent is precisely r. For example, parents and their offspring, or two sisters, share 12 , an aunt and her niece share 14 , and first cousins share 18 .13 Although it is traditional in evolutionary game theory to use the same symbol r for both relatedness between kin and relatedness among kin, possible implications of the different interpretations should always be kept in mind. In particular, because a game between kin is not a population game, the concept of ESS is—strictly speaking—inappropriate, but we can still apply the concept of Nash equilibrium.14 For k = 1, 2, let fk (u, v) denote Player k’s reward in a game between nonrelatives, and let φk (u, v) denote Player k’s reward in the corresponding game between individuals whose relatedness is r. Then, because fraction r of either’s reproductive success counts as reproductive success for the other, we have
(6.40)
φ1 (u, v) = f1 (u, v) + rf2 (u, v), φ2 (u, v) = f2 (u, v) + rf1 (u, v).
These rewards are usually known as the players’ inclusive fitnesses. Although in general we must allow for asymmetry (Exercise 6.14), it is sometimes reasonable to assume that the reward structure is symmetric, i.e., f2 (u, v) = f1 (v, u). Then the game can be analyzed in terms of a single reward or inclusivefitness function φ defined by (6.41) 13
φ(u, v) = f (u, v) + rf (v, u).
See, e.g., [52, Chapter 9]. This is not to say that a Nash equilibrium between kin could not correspond to an ESS of a larger population game in which Player 1 and Player 2 in the game between kin would correspond to roles that any individual could take in the population game. This point is illustrated by Exercises 1.27 and 2.17 (with regard to ordinary ESSs). 14
228
6. Continuous Population Games
Comparing (6.41) with (6.28), we see that a symmetric game between kin differs from the corresponding population game among kin not only in terms of interpretation, but also in terms of mathematical structure. Nevertheless, their properties often overlap. For example, if f is bilinear, then v ∗ is an ESS among kin only if (v ∗ , v ∗ ) is a Nash equilibrium of the corresponding symmetric game between kin (although the first may be strong but the second weak, as illustrated by Exercise 6.13); and sometimes an ESS v ∗ among kin corresponds to a symmetric Nash equilibrium between kin, even if f is not bilinear (as illustrated by Exercise 6.11). In general, however (and as Exercise 6.12 illustrates), there is no such correspondence—which is unsurprising, because the games are different!
6.4. Information and strategy: a mating game The games we have studied so far in this chapter are separable in a degenerate sense because p = 1 in (6.1). Moreover, although we have already seen a population game that is separable with p = 2—namely, Four Ways (§1.3)—its reward function is bilinear, and so its ESS is weak. Here is a game with p = 2 whose ESS is strong. Sperm competition is competition between the ejaculates of different males for fertilization of a given set of eggs [262]. For example, male 13-lined ground squirrels routinely queue for mating with oestrous females, typically in pairs, with the first male having an advantage [298]. If ejaculates are costly, in the sense that increased expenditure of reproductive effort on a given ejaculate reduces the number of matings that can be achieved, should a male expend more on a mating in the disfavored role to compensate for its disadvantage—or should it expend more on a mating in the favored role to capitalize on its advantage? We explore this question in terms of game theory. Let α denote the fairness of the competition between males, i.e., the effectiveness ratio of a unit of disfavored sperm compared to a unit of favored sperm: as α increases from 0 to 1, the unfairness of the favored male’s advantage in the “raffle” for fertilization decreases, until the raffle becomes fair when α = 1. Consider a game among a population of males who mate randomly either with unmated or with once mated females from a set of females who mate at most twice. The proportion of unmated females may be any positive number less than 1. Thus each male may occupy either of two roles, and a strategy consists of an amount of sperm to ejaculate in each. Unless the raffle is fair, one of these roles is favored; let us denote it by A, and the disfavored role by B. Let the amounts ejaculated in roles A and B be v1 and v2 , respectively, for the population strategy but u1 and u2 for a potential mutant. That is, a mutant’s strategy is a two-dimensional vector u = (u1 , u2 ) in which u1 is the amount of sperm ejaculated when that male is in role A, and u2 is the amount of sperm ejaculated when it is in role B. Similarly, the population strategy is v = (v1 , v2 ), where v1 and v2 are the amounts of sperm ejaculated in roles A and B, respectively. For any given female, let K denote its maximum potential future reproductive success from a given set of eggs, let X denote the favored male’s sperm expenditure, let Y denote the disfavored male’s expenditure, and let T denote the female’s effective total number of sperm. Then the proportion of its reproductive success that
6.4. Information and strategy: a mating game
229
accrues to the favored male is X/T , and the proportion that accrues to the disfavored male is 1 − X/T . To obtain T , we devalue the disfavored male’s expenditure by the fairness α. Thus (6.42)
T = X + αY.
Let Kg(T ) denote the female’s fitness as a function of effective total sperm number. Then it is reasonable to assume that the proportion g is a concave increasing function of T , with g(0) = 0 and g(∞) = 1.15 For the sake of simplicity, we satisfy these conditions by assuming (6.43)
g(T ) =
T +T
throughout, where —the number of sperm that would fertilize half of a female’s eggs—is relatively small, in a sense to be made precise. Let WA (X, Y ) denote the favored male’s expected reproductive gain from the female, and let WB (X, Y ) denote that of the disfavored male. Then, from (6.42)– (6.43), (6.44a) (6.44b)
X KX · Kg(T ) = , T + X + αY X KαY Kg(T ) = . WB (X, Y ) = 1 − T + X + αY
WA (X, Y ) =
Note that WB (X, X)/WA (X, X) = α: whenever sperm expenditures are equal, the disfavored male’s gain is lower than that of the favored male by factor α, the fairness. Note also that WA (X, Y ) + WB (X, Y ) = Kg(T ): because a female mates at most twice, the sum of reproductive gains for two mates must equal its fitness. The cost of a mating in terms of fitness increases with sperm expenditure, ultimately at a prohibitive rate. That is, denoting sperm expenditure by s (which may be either X or Y ) and mating cost by Kc(s), so that c(s) is a dimensionless quantity, we have c(0) = 0, c (s) > 0, and c (s) ≥ 0. For simplicity’s sake, we satisfy these conditions with (6.45)
c(s) = γs,
√ where γ has dimensions sperm , so that δ = γ is a dimensionless measure of uncertainty of fertilization in the absence of competition: the lower the value of δ, the lower the amount of sperm with which an uncontested male can expect to fertilize a given egg set. We now quantify our assumption that the egg set can be fertilized by a relatively small amount of sperm by requiring δ < 1. Let pA denote the probability that a mutant focal male is in role A, allocating u1 against a male who allocates v2 in role B, so that the focal male’s expected reproductive gain from the female is WA (u1 , v2 ). The cost of this mating is Kc(u1 ), by assumption, so that the mutant’s net increase of fitness is WA (u1 , v2 ) − Kc(u1 ). Similarly, let pB = 1 − pA denote the probability that the mutant is in role B, allocating u2 against a male who allocates v1 in role A, so that its net reward is WB (v1 , u2 ) − Kc(u2 ). Note that the proportion of unmated females is pA when mating first is favored but pB when mating second is favored (because the favored role is always role A). On multiplying the reward from each role by the probability −1
15
For a discussion of this point see [207], on which §6.4 is based.
230
6. Continuous Population Games
of occupying that role and adding, the reward to a u-strategist in a population of v-strategists (the expected payoff from the next female, who is analogous to the next customer in Store Wars (§1.4)) becomes (6.46)
f (u, v) = pA {WA (u1 , v2 ) − Kc(u1 )} + pB {WB (v1 , u2 ) − Kc(u2 )} 1 α − γ + KpB u2 −γ . = KpA u1 + u1 + αv2
+ v1 + αu2
Note that this expression is a special case of (6.1) with p = 2. Straightforward partial differentiation reveals that (6.47)
∂f + αv2 = − γ, ∂u1 ( + u1 + αv2 )2
α( + v1 ) ∂f = − γ. ∂u2 ( + αu2 + v1 )2
Hence the unique best reply to v = (v1 , v2 ) is u = (ˆ u1 , u ˆ2 ), where ⎧ ⎨ + αv2 − − αv if αγv < 1 − δ 2 , 2 2 γ (6.48a) u ˆ1 = ⎩ 0 if αγv2 ≥ 1 − δ 2 and (6.48b)
u ˆ2
⎧ ⎨ + v1 − + v1 αγ α = ⎩ 0
if γv1 < α − δ 2 , if γv1 ≥ α − δ 2
(Exercise 6.15). If v is the best reply to itself, i.e., if u ˆ1 = v1 and u ˆ2 = v2 , then v ,u ˆ2 = v2 with v1 = 0, because is the unique strong ESS. We cannot satisfy u ˆ1 = v1 ≥ 1 − δ 2 with v2 = /αγ − /α and α > δ 2 ; but (6.48) would then require αγv2 √ √ that would imply δ α ≥ 1 and α > δ, the first of which contradicts δ < 1 even if the second is satisfied (because 0 < α ≤ 1). So we must have v1 > 0. Nevertheless, it is possible to have either v2 = 0 or v2 > 0 (Exercise 6.15). The upshot is that the unique ESS for > 0 is v = (v1∗ , v2∗ ), where
δ(1 − δ) if 0 ≤ α ≤ δ, ∗ √ 2 2 γv1 = (6.49a) α +4δ α(1+α)−2δ 2 α(1+α)+α if δ < α ≤ 1, 2(1+α)2 ⎧ ⎨ 0 if 0 ≤ α ≤ δ, γv2∗ = 2δ 2 (α2 −δ 2 ) (6.49b)
√ 2 2 if δ < α ≤ 1 ⎩ 2 2 α 2δ (1+α)−α +α
α +4δ α(1+α)
(and both are dimensionless, because γ has dimensions sperm−1 ). For = 0, however, we obtain the somewhat surprising result that (6.49c)
v1∗ = v2∗ =
α γ(1 + α)2
(Exercise 6.15). Because (6.49) implies α(v1∗ − v2∗ ) = (1 − α) whenever v1∗ and v2∗ are both positive, we also have v1∗ = v2∗ when the raffle is fair, i.e., when α = 1; but otherwise v1∗ > v2∗ .16 Sperm expenditures at the ESS are plotted against α in Figure 6.4 in units of γ −1 , i.e., the right-hand sides of (6.49) are plotted. In each diagram, the solid 16 Note that this result requires the assumption that costs are independent of role: if γ is replaced by γA and γB in roles A and B, respectively, then α(γA v1∗ − γB v2∗ ) = (γB − αγA ) when v1∗ > 0, v2∗ > 0 at the ESS, implying v1∗ < v2∗ if γB < αγA .
6.4. Information and strategy: a mating game
ω
231
ω
0 0
δ
1
a 0 < δ < 0.5
α
0
δ
0
1
α
b 0.5 < δ < 1
Figure 6.4. Sperm expenditure versus fairness α at the ESS when roles are √ certain for fixed δ = γ, where is the number of sperm that would fertilize half of the egg set, γ is the marginal-cost parameter, and ω = δ(1−δ). The solid curve shows expenditure in favored role A, the dashed curve shows expenditure in disfavored role B, and the dotted curve shows expenditure in either role when δ = 0. Sperm expenditure is measured in units of γ −1 along the vertical axis. (Units of would preclude a comparison with the result for = 0.)
0.21
0
0
0.3
1
α
Figure 6.5. Sperm expenditure versus fairness at the ESS when roles are uncertain (solid curves) for p → 0 (lowest), p = 0.25, p = 0.5, p = 0.75, p → 1 (highest), and δ = 0.3. The separate expenditures when roles are certain are also shown for comparison (dotted curves). As in Figure 6.4, sperm expenditure is measured in units of γ −1 along the vertical axis.
curve corresponds to favored role A and the dashed curve to disfavored role B. The dotted curve corresponds to δ = 0, when sperm expenditure is the same in either role, by (6.49c). The form of the ESS depends on whether δ exceeds 12 . If not, i.e., 1 if 0 < δ ≤ 12 , then the maximum possible expenditure at the ESS is always 4γ , precisely the same value as when fertilization is assured (δ = 0); but the maximum occurs where α = 1/(1 + 4δ 2 ) instead of at α = 1 (as when δ = 0). If 12 < δ < 1, on the other hand, then expenditure is always lower than when fertilization is assured, the maximum occurring where α ≤ δ, i.e., in the absence of competition. In both cases, v1 − v2 is a strictly decreasing function of α when δ < α ≤ 1. That is, the divergence between expenditures in roles A and B increases with unfairness in the de facto competitive domain. All of the above assumes, however, that animals know which role they are in. A different result emerges if they know the proportion of unmated females but do not know (with certainty) whether they are first or second to mate. Then sperm expenditure is of necessity role-independent, say v for the population, u for
232
6. Continuous Population Games
a mutant. The reward to a u-strategist in a population of v-strategists becomes (6.50)
f (u, v) = pA {WA (u, v) − Kc(u)} + pB {WB (v, u) − Kc(u)} ( ) p (1 − p)α = Ku + −γ + u + αv
+ v + αu
on using p in place of pA to indicate that roles are now uncertain (thus, for consistency, 0 < p < 1), and the unique ESS v ∗ is given by ⎧ 0 if α ≤ αc , ⎨ (6.51) γv ∗ = 2δ 2 {p + α(1 − p) − δ 2 } if α > αc , ⎩ 2 2 2 2 2δ (1 + α) − α +
α + 4δ (1 + α){p + α (1 − p)}
−p where αc = δ1−p (Exercise 6.16). This ESS is plotted in Figure 6.5 for δ = 0.3, i.e., for the same value of δ as in the left-hand panel of Figure 6.4. Note that the ESS for uncertain roles agrees with the ESS for certain roles if the raffle is fair (α = 1): if expenditures are equal anyway, then it makes no difference whether animals are aware of their roles. For the most part, however, the ESSs differ. Typically, expenditure when the role is uncertain is higher than when the mating is known to be disfavored but lower than when it is known to be favored; in particular, the threshold αc , below which there is zero expenditure, is lower than the corresponding threshold δ for a mating that is known to to be disfavored. Increasing p increases the sperm expenditure. Note, however, that v1∗ is not the limit of v ∗ as p → 1, and v2∗ is not the limit of v ∗ as p → 0—unsurprisingly, because each allocation is a response to itself when roles are uncertain but to a separate allocation when roles are known. The result that v ∗ = 0 if α ≤ αc is at first paradoxical, because no eggs are fertilized if v ∗ = 0. But we have to remember that our game is a game among males, and that γK is the (marginal) opportunity cost of a unit of sperm expenditure, that is, the cost of the unit in terms of future mating opportunities that the male thereby forgoes. These forgone opportunities need not exist among the present set of females: because the males are out for themselves, they should have no qualms about jeopardizing the reproductive success of the present females if their own mating prospects elsewhere are brighter thereby. Moreover, if the current set of females were the only set available, then the cost of forgoing future matings would have to be close to zero, and α ≤ αc cannot be satisified if δ ≈ 0. All the same, it is perhaps unsatisfactory to resolve a paradox in terms of that which is not explicitly modelled. Sperm competition is a complex affair, requiring far more sophisticated game-theoretic models than we have developed here. For pointers to the literature, see §6.10. 2
6.5. Competition over divisible resources: a Hawk-Dove game We have already noted in §2.4 (p. 72) that when Doves interact, a divisible resource can either be equally shared or randomly allocated. But which? To elucidate the conditions that favor either option over the other, in this section we consider a population of individuals in pairwise competition over resources of varying sizes.
Value of resource of given size
6.5. Competition over divisible resources: a Hawk-Dove game
V 1
233
V 1
V(Z)
V(0.5Z) V(Z)
0.5V(Z)
0
0
0.5Z Z Size of resource
a Diminishing returns
1
z
0.5V(Z) V(0.5Z) 0
0
0.5Z Z Size of resource
1
z
b Increasing returns
Figure 6.6. Diminishing versus increasing returns to size in the value of a resource when the costs of partitioning and random allocation are both negligible. The arrows indicate that the difference between the value V 12 Z of 1 partitioning and the expected value 2 V (Z) of random allocation is positive or negative according to whether returns are (a) diminishing or (b) increasing. The quantity δ = E V 12 Z − 12 V (Z) is positive for diminishing returns and negative for increasing returns. The right-hand curve is based on the assumption that this distribution has negligible mass for sizes so large that returns become diminishing (as they ultimately must) and the curve begins to bow down. If there is significant probability mass in this region, then δ will be a sum of positive and negative contributions, and so |δ| could be very small; in which case, the ESS would insteaad be determined primarily by a balance between the costs of partitioning and random allocation because then B = δ − CP + CR ≈ CR − CP by (6.54).
Although increasing the size of a resource in general increases its value, the increase of value need not be proportional to the increase in size or—in terms of economics—returns to size need not be constant. The increase of value can instead be either less than proportional (diminishing returns to size) or greater than proportional (increasing returns to size). For example, twice as much food may be more than an animal can eat or store, and hence less than twice as valuable; conversely, twice as much territory may more than double an animal’s ability to attract mates, and hence be more than twice as valuable. So we must distinguish between the size of a resource, denoted by Z, and the value V (Z) associated with a resource of that size. As illustrated by Figure 6.6, diminishing returns to size yield V ( 12 Z) > 12 V (Z), and increasing returns to size yield V ( 21 Z) < 12 V (Z), with V ( 12 Z) = 12 V (Z) for constant returns to size. We begin by dealing with the subgame that arises when both individuals choose to be nonaggressive. Two such animals can settle a contest over a resource of size Z either by partitioning it, so that each obtains 12 Z and hence value V ( 12 Z), or by allocating it at random, so that each is equally likely to obtain all or none of it, accruing value V (Z) or 0 with equal probability. An animal’s strategy in this subgame is its probability of choosing to partition the resource. We assume that
234
6. Continuous Population Games
the resource is shared if either contestant opts to partition it, and that a sole winner is determined at random if neither opts to partition it; however, we would obtain exactly the same results if instead we assumed that the resource is shared only if both parties agree to partition it (Exercise 6.17). Let p2 and q2 be the respective probabilities that one of two Doves seeks to partition the resource. No partition occurs with probability (1−p2 )(1−q2 ), yielding a mean payoff to each contestant of 12 V (Z) − CR , where CR denotes the cost to each party of random allocation (which we ignored on p. 71, and may still be negligible). A partition occurs with probability 1 − (1 − p2 )(1 − q2 ), yielding a mean payoff to each contestant of V ( 12 Z) − CP , where CP is the (mutual) cost of partitioning. So, if E denotes expected value over the distribution from which Z is drawn, reward to a p2 -strategist then the against a q2-strategist is φ(p2 , q2 ) = E V 12 Z − CP {1 − (1 − p2 )(1 − q2 )} + 12 V (Z) − CR (1 − p2 )(1 − q2 ) or (6.52)
φ(p2 , q2 ) = A − B(1 − p2 )(1 − q2 ),
where (6.53)
A = E V 12 Z − CP
is the net benefit from a partition between Doves, and (6.54) B = E V 12 Z − 12 V (Z) − CP + CR = A − 12 E V (Z) + CR is the bias in favor of partitioning versus randomly allocating the resource; that is, B is the difference between the net benefit A from partition and the net benefit 1 E V (Z) − CR from random allocation. When the 2 R are both costs CP and C negligible, the bias reduces to the quantity δ = E V 12 Z − 12 V (Z) , which the arrows in Figure 6.6 reveal to be positive for diminishing returns to size but negative for increasing returns to size (and zero for constant returns to size). More generally, the bias B in favor of partitioning is positive for diminishing or constant returns (δ ≥ 0) if the cost of sharing through random allocation exceeds the cost of partition (CR > CP ), or in every case if the cost CR of random allocation is prohibitively high; and the bias is negative for increasing or constant returns (δ ≤ 0) if the cost of partition exceeds the cost of random allocation (CP > CR ), or in every case if the cost CP of partition is prohibitively high. The Dove-Dove subgame is embedded in a larger game among individuals who first choose whether or not to be aggressive. In this larger Hawk-Dove game, let the focal individual in a population of q-strategists have strategy p = (p1 , p2 ), where q = (q1 , q2 ). Here p1 and q1 are the two opponents’ probabilities of being aggressive, while p2 and q2 are their probabilities of seeking to partition the resource in the event that both are not aggressive (as above). Let CF be a cost that both animals pay if they fight, in which case, the payoff to the focal p-strategist is −CF or V (Z) − CF with equal probability, i.e., 12 E[V (Z)] − CF . This outcome arises with probability p1 q1 . Payoffs of E[V (Z)] or 0 to the focal individual arise with probabilities p1 (1 − q1 ) or (1 − p1 )q1 , respectively. Finally, with probability (1 − p1 )(1 − q1 ), the animals enter the subgame, whose payoff φ(p2 , q2 ) to the focal p-strategist is defined by (6.52). Thus, for the full Hawk-Dove game with strategy set S = {(p1 , p2 )|0 ≤ p1 , p2 ≤ 1} = [0, 1] × [0, 1], the reward to a p-strategist in a
6.5. Competition over divisible resources: a Hawk-Dove game
235
population of q-strategists is (6.55) f (p, q) = ( 21 E[V (Z)] − CF )p1 q1 + p1 (1 − q1 )E[V (Z)] + (1 − p1 )(1 − q1 )φ(p2 , q2 ), implying (6.56)
∂f = ( 12 E[V (Z)] − CF )q1 + (1 − q1 ){E[V (Z)] − φ(p2 , q2 )} ∂p1
and (6.57)
∂f = B(1 − p1 )(1 − q1 )(1 − q2 ). ∂p2
If 12 E[V (Z)] > CF , then pure Hawk is evolutionarily stable, because f (q, q) − f (p, q) = ( 21 E[V (Z)] − CF )(1 − p1 ) is positive for p = q when q = (1, q2 ) unless p1 = 1, and so a nonzero probability of Dove cannot enter the population. However, such an outcome is irrelevant to our present purpose, because the Dove-Dove subgame is never reached. Accordingly, we now assume that 1 2 E[V
(6.58)
(Z)] < CF
to focus on the case where q1 < 1 at the ESS. There cannot be such an ESS with 0 < q2 < 1, for that would require ∂f /∂p2 p=q = 0, which is impossible by (6.57). Furthermore, q = (0, 0) can be invaded by p = (1, 0), because then f (q, q) − f (p, q) = A − B − E[V (Z)] = − 12 E[V (Z)] − CR < 0, and so q is not an ESS; likewise, q = (0, 1) invaded by p = (1, 1), because then f (q, q) − f (p, q) = can be A − E[V (Z)] = E V 12 Z − E[V (Z)] − CP < 0 (since V is an increasing function), and so it is also not an ESS. Thus a strategy in S is a candidate for ESS only if it has the form q = (q1 , 0) or q = (q1 , 1) with 0 < q1 < 1; the first case corresponds to random allocation in the Dove-Dove subgame, the second to partitioning. In either case, for q to be a best reply to any p, we require ∂f /∂p1 p=q = 0. Let q1 = q1R solve this equation for q2 = 0; that is, on using (6.56), define (6.59)
q1R =
CR + 12 E[V (Z)] . CR + CF
Here q1R denotes the ESS frequency of Hawks for a mixed population in which competing Doves randomly allocate the resource; in view of (6.58), q1R decreases with fighting cost CF (as it would in the classic Hawk-Dove game) but increases with the cost CR of random allocation. Then with q = (q1R , 0), we obtain (6.60)
f (q, q) − f (p, q) =
Bp2 (p1 − 1)(CF − 12 E[V (Z)]) , CF + CR
which is negative unless p1 = 1 or p2 = 0 for B > 0, implying that q is not an ESS. When B < 0, however, f (q, q) ≥ f (p, q) for all p, and f (q, q) > f (p, q) fails to hold only when p1 = 1 or p2 = 0. So q = (q1R , 0), although not a strong ESS, is nevertheless a weak ESS, because for p = (1, p2 ), we obtain (CR + CF ){f (q, p) − f (p, p)} = (CF − 12 E[V (Z)])2 > 0, and for p = (p1 , 0), we obtain {f (q, p) − f (p, p)} = (CR + CF )(q1R − p1 )2 > 0 if p1 = q1R .
236
6. Continuous Population Games Correspondingly, let q1 = q1P solve ∂f /∂p1 p=q = 0 for q2 = 1, that is, define
(6.61)
q1P =
E[V (Z)] − E V 12 Z + CP
. 1 E[V (Z)] − E V 12 Z + CP + CF 2
Here q1P denotes the ESS frequency of Hawks for a mixed population in which competing Doves partition the resource; it decreases with fighting cost CF but increases with the cost CP of partition. Because f (p, q) = f (q, q) for any p when q = (q1P , 1), this strategy is not uniquely the best reply to itself and hence not a strong ESS; however, f (q, p) − f (p, p) = B(1 − p1 )2 (1 − p2 )2 1 + { 12 E[V (Z)] − E V Z + CP + CF }(p1 − q1P )2 2 is positive for all p = (q1P , 1) when B > 0 but can be made negative for p1 = q1P , p2 = 1 if B < 0, so that q = (q1P , 1) is a weak ESS if, and only if, B is positive. So for all practical purposes, the game has a unique ESS.17 If the bias in favor of partition is positive (B > 0), then the unique ESS is q = (q1P , 1), which corresponds to being aggressive with probability q1P , being nonaggressive with probability 1 − q1P , and, in the event that both contestants are nonaggressive, partitioning the resource. Note that the probabilities of being aggressive and of being nonaggressive are both positive by (6.58) and (6.61). If the bias in favor of partition is negative (B < 0), however, then the unique ESS is q = (q1R , 0), which corresponds to being aggressive with probability q1R , being nonaggressive with probability 1 − q1R , and, in the event that both contestants are nonaggressive, randomly allocating the resource. Note that the probabilities of being aggressive and of being nonaggressive are again both positive by (6.58) and (6.59). With regard to sharing, the key result is that when neither opponent is aggressive at the ESS, the resource is partitioned when the bias is positive and randomly allocated when it is negative. In particular, if CP and CR are both negligible, then there is always partitioning between Doves at the ESS for diminishing returns to size and always random allocation between Doves for increasing returns to size. Furthermore, if the resource is shared only when both animals opt to partition it and otherwise is randomly allocated, then the unique ESS is still q = (q1P , 1) for B > 0 and q = (q1R , 0) for B < 0 (Exercise 6.17). Thus our key result is robust: it is not sensitive to whether two Doves partition the resource if either party opts to partition it or only if both parties opt to partition it.18 17 The only exception to the existence of an ESS is that if B = 0 (which arises if V (Z) = γZ for some γ (constant returns to size) and CP = CR ) then technically the game has no ESS; rather, it has what is known as an evolutionarily stable set [336] containing all equivalent strategies of the form (q1∗ , q2 ) for all 0 ≤ q2 ≤ 1, where q1∗ = q1R = q1P from either (6.59) or (6.61). In this case, however, the Dove-Dove subgame is degenerate and does not distinguish between partitioning and random allocation, yielding the same reward to either as to every intermediate combination; in other words, the subgame serves no purpose and is best removed from the game. Under these conditions the unique ESS for zero bias is to be aggressive with positive probability q1∗ and seek to resolve the conflict nonaggressively with positive probability 1 − q1∗ by an indeterminate mechanism. Note, however, that zero bias is a very unlikely scenario: even with constant returns to size (δ = 0 in B = δ − CP + CR ), any slight discrepancy between CP and CR will suffice for B = 0. 18 This result can be regarded as a robust theorem in the sense described on p. 361.
6.6. Partitioning territory: a landmark game
237
1
2
Player 1's nest site L
Player 2's nest site 1 −L Landmark
Figure 6.7. The contested line segment
6.6. Partitioning territory: a landmark game There is considerable evidence that either physical objects or conspicuous features of the landscape—landmarks, for short—are used by animals as designators of territorial boundaries [137]. For example, territorial wasps have been shown to establish borders rapidly at horizontal wooden dowels placed by experimenters [86]. The resulting territory may well be smaller than could have been obtained through fighting. Then why are landmarks so readily adopted? A possible answer is that, although each of two territorial neighbors would like a larger territory than a landmark between them yields and might obtain one through fighting, worse outcomes are also possible. Through fighting, an animal might instead obtain a smaller territory, or no eventual boundary settlement might be reached, and regardless, fighting over a boundary can be protracted, and therefore costly (in terms of fitness). So neighbors may be favored to adopt a landmark as a conventional solution19 to the problem of territorial partitioning. Here we devise a game to explore this possibility. Let each of two residents vie for control of a line segment, whose length is defined (without loss of generality) to be the unit of distance. The residents have nests at opposite ends of this line segment, as shown in Figure 6.7. Their struggle ends when a boundary is established that divides the line segment into two smaller segments that represent the territories of the residents. We assume that these territories always fill the expanse between the nests. The marginal value of territory expansion is assumed to be the same constant b for all residents, and so the value of the territory thus obtained is simply b times the length of the relevant segment. Residents vary in fighting ability—or strength, for short—and the boundary between two residents is established where their effective fighting abilities are equal. Without loss of generality, we assume that strength, denoted by S, is a random variable taking values between 0 and 1; thus, 0 denotes the weakest animal that is strong enough to hold a territory, and 1 denotes the strongest. For all residents, the effective fighting ability that can be brought to bear at a particular location increases with S but declines with distance from the nest, denoted by d.20 For 19
See p. 81. Effective fighting ability may decline with distance from the nest either because an individual fights more effectively in more familiar surroundings or because it is more motivated to fight, the closer the approach of another animal to its nest (or both). In the latter regard, for example, Zucker [372, p. 234] notes that any displaying male Uca terpsichores fiddler crab ”defends the area around his burrow more vigorously as an intruder is presented closer to it.” 20
238
6. Continuous Population Games
the sake of simplicity, we assume that both relationships are linear; specifically, the territorial pressure p that an animal exerts increases at rate r1 with respect to strength and decreases at rate r2 with respect to distance, according to (6.62)
p = p0 + r1 S − r2 d.
We assume that p0 ≥ r2 , so that even the weakest owner exerts a positive pressure at its neighbor’s nest. To simplify our analysis, we also assume that r1 < r2 , so that even the weakest owner exerts a greater pressure at its own nest than even the strongest neighbor (because p0 > p0 + r1 − r2 ). Then the natural boundary, where neighbors exert equal territorial pressure, is guaranteed to fall between the nests: no resident is so much stronger than another that it obtains the entire contested area. Consider first where the territory boundary will be established in the absence of a landmark. Let X and Y be the strengths at the nest site (hereafter, simply strengths) of Player 1 and Player 2, respectively, and let the natural boundary be at distance ξ(X, Y ) from Player 1’s nest. Then, from (6.62), we have p0 + r1 X − r2 ξ = p0 + r1 Y − r2 (1 − ξ) or (6.63)
ξ(X, Y ) =
1 2
{1 + θ(X − Y )} ,
where (6.64)
θ = r1 /r2
satisfies θ < 1. Thus θ is the length of the interval centered around the midpoint of the contested line segment in which the natural boundary can lie, or, equivalently, the natural boundary falls at least 12 (1 − θ) from an animal’s nest. Note that 1 − ξ(X, Y ) = ξ(Y, X). We assume that Player 1 knows X but not Y , whereas Player 2 knows Y but not X: although each animal knows its own strength, neither discovers ξ without provoking a fight. We assume that X and Y are drawn from the same distribution on [0, 1] with probability density function g and cumulative distribution function s G (so that G(s) = 0 g(ξ) dξ, as in §2.6). Now assume that a landmark visible to both players is located at distance L (< 1) from Player 1’s nest, as indicated by the striped flag in Figure 6.7.21 We seek the conditions under which one or both players will accept the landmark as the territory boundary. In deciding whether to follow the landmark convention, each animal assesses whether the advantages due to quickly settling the dispute outweigh any losses due to a lower expected territory size. Since stronger animals can expect to obtain larger territories as a result of fighting, strong animals may be favored to reject a landmark that a weak animal would accept; so thresholds for aggression are suitable strategies. Accordingly, we define strategy s to mean (6.65)
accept landmark if
S ≤ s,
reject landmark if
S > s.
21 We assume the landmark to be unique, so we dodge the question of how animals would coordinate on a single landmark if there were more than one. An analogous question in the economics literature has been explored by Young [368], whose analysis at least very loosely suggests that the animals should tend to coordinate on the most centrally located landmark.
6.6. Partitioning territory: a landmark game
239
Figure 6.8. Sample space for the joint distribution of X and Y
Since willingness to accept a landmark will depend on the resulting territory size, the optimum threshold for aggression will depend on the distance to the landmark from the resident’s nest site, which is L for Player 1 and 1 − L for Player 2. Hence the optimum threshold will depend on L. Let Player 1 have strategy u, and let Player 2 have strategy v. If both animals accept the landmark (i.e., if X ≤ u and Y ≤ v or the point (X, Y ) belongs to region I in Figure 6.8), then Player 1’s benefit is bL and Player 2’s is b(1 − L), because b is the marginal benefit per unit distance of territory between nest and landmark. Otherwise, at least one animal fails to observe the landmark, which provokes a fight, whose outcome is invariably that territories merge at the natural boundary. We assume that such fights begin with an initial phase during which the animals attempt to defend overlapping territories and end with an escalated phase during which costs are higher for weaker animals. We also assume that if either animal accepts the landmark so that only one animal encroaches beyond it, then the overlap between initially defended regions is not as great as when neither animal accepts the landmark and, hence, both encroach beyond it; thus fighting costs are higher for each animal when neither accepts the landmark. For simplicity, in satisfying these assumptions, we model fighting costs for an individual with strength S by cL (S) if one animal accepts the landmark but by cH (S) if both reject it, where (6.66)
cL (S) = c0 + c1 (1 − S),
cH (S) = + cL (S)
and > 0. Thus c0 is a fixed cost of fighting, avoided only when both animals accept the landmark, is the additional fighting cost that each incurs when neither accepts the landmark, and c1 , in effect, is the marginal cost of a unit of weakness. We assume that the minimum cost of fighting cannot exceed the maximum value of the territory, or c0 < b. Thus constraints on the model’s parameters are (6.67)
0 < c0 /b < 1,
0 < θ < 1,
0 < L < 1,
c1 > 0,
> 0.
Let Fk (X, Y ) denote the payoff to Player k (conveniently abusing notation, to the extent that Fk also depends on u, v, and L), and let fk (u, v, L) = E[Fk (X, Y )] be the associated reward, where this time we opt for notation that emphasizes the
240
6. Continuous Population Games
dependence on L. Then
(6.68)
⎧ ⎪ if (X, Y ) ∈ I, ⎨bL F1 (X, Y ) = bξ(X, Y ) − cL (X) if (X, Y ) ∈ II, ⎪ ⎩ bξ(X, Y ) − cH (X) if (X, Y ) ∈ III,
(6.69)
⎧ ⎪ ⎨b(1 − L) F2 (X, Y ) = bξ(Y, X) − cL (Y ) ⎪ ⎩ bξ(Y, X) − cH (Y )
if (X, Y ) ∈ I, if (X, Y ) ∈ II, if (X, Y ) ∈ III,
where Regions I, II, and III are defined by Figure 6.8. Integrating over the sample space, we obtain " 1" 1 (6.70) fk (u, v, L) = Fk (x, y) g(x) g(y) dx dy 0
0
for k = 1, 2. Note that because a v-strategist at distance 1 − L from the landmark faces a u-strategist whenever a u-strategist at distance L from the landmark faces a v-strategist, we have f2 (u, v, L) = f1 (v, u, 1 − L). v Using ξ(x, y) + ξ(y, x) = 1 and G(v) = 0 g(s) ds in (6.68)–(6.70), we obtain " 1 " u f1 (u, v, L) = − cL (x)g(x) dx + G(v) cL (x)g(x) dx 0 0 " v " u (6.72) + b 12 − g(y) ξ(x, y)g(x) dx dy + LG(u)G(v)
(6.71)
0
0
− {1 − G(v)}{1 − G(u)} after simplification. Thus " v ∂f1 (6.73) = {cL (u) + bL}G(v) − b ξ(u, y)g(y) dy + {1 − G(v)} g(u). ∂u 0 For example, if strength is uniformly distributed, i.e., if g(s) = 1 for all 0 ≤ s ≤ 1, then (6.72) and (6.73) become 1 2 (b
(6.74) f1 (u, v, L) =
and (6.75)
− c1 ) − c0 − (1 − u − v) + 14 θbuv(v − u) − 12 c1 vu2 − 12 − L b + − c0 − c1 uv
∂f1 = v {bL + c0 + c1 (1 − u)} − 14 b{2 + 2uθ − θv} + (1 − v) ∂u
by (6.66). Corresponding expressions for f2 (u, v, L) and ∂f2 /∂v follow directly from (6.71), on interchanging u with v and L with 1 − L. A question of immediate interest can now be answered. When will both animals invariably honor the landmark, regardless of how strong they are? In other words, when is (1, 1) a Nash equilibrium of the community game defined by f1 and f2 ?
6.6. Partitioning territory: a landmark game
241
Figure 6.9. How the optimum aggression threshold varies with the distance of the landmark from the nest for a symmetric distribution (μ = 12 ) when θ < 4 min{c0 /b, 12 − (c0 + c1 )/b}. Player 1’s threshold—above which it rejects the landmark, below which it accepts—is shown solid and is plotted from left to right, i.e., with respect to increasing L. Player 2’s threshold is shown dashed and is plotted from right to left, i.e., with respect to increasing 1 − L. The height of the solid curve at a given distance from Player 1’s nest is determined by (6.81) and always equals the height of the dashed curve at the same distance from Player 2’s nest, by (6.80). Shading indicates where both players accept landmarks unconditionally. The figure is drawn for θ = 0.1, c0 /b = 0.125, and c1 /b = 0.3.
The condition is that 1 is a best response to 1 for both players or ∂f1 /∂u|u=1=v ≥ 0, ∂f2 /∂v|u=1=v ≥ 0. This condition reduces to " 1 " 1 c0 c0 ≤ L ≤ 1− (6.76a) ξ(1, y)g(y) dy − ξ(1, y)g(y) dy + b b 0 0 on using (6.66) in (6.73) and noting that CL (1) = c0 and G(1) = 1, and it further reduces to c0 c0 1 1 (6.76b) 2 {1 + θ(1 − μ)} − b ≤ L ≤ 2 {1 − θ(1 − μ)} + b 1 for an arbitrary distribution of strength with mean μ = 0 yg(y) dy, on using (6.63).22 In other words, both animals will invariably honor the landmark when c0 ≥ 12 θ{1 − μ}b and the landmark is no further than cb0 − 12 θ(1 − μ) from the midpoint of the line segment between nests. The greater the value of c0 , which represents the fixed cost of fighting if the landmark is not accepted, the more easily this condition is satisfied. In particular, for a symmetric distribution (μ = 12 ), both animals invariably honor the landmark when 14 (2 + θ) − c0 /b ≤ L ≤ 14 (2 − θ) + c0 /b, e.g., for 0.4 ≤ L ≤ 0.6 in Figure 6.9 where θ = 0.1 and c0 /b = 0.125. Note that (6.76a) is not in general independent of the variance σ 2 = 01 (y − μ)2 g(y) dy of the distribution; e.g., if we replaced (6.63) by ξ(X, Y ) = 12 1 + θ(X − Y )2 , then (6.76a) would become 2 2 2 2 1 1 2 {1 + θ((1 − μ) + σ )} − c0 /b ≤ L ≤ 2 {1 − θ((1 − μ) + σ )} + c0 /b. 22
242
6. Continuous Population Games
We can similarly show that (0, 0) is never a Nash equilibrium (Exercise 6.18) but that (0, 1) is a Nash equilibrium when c0 + c1 (6.77) L ≤ 12 {1 − θμ} − b 1 (which is possible only if 2 {1 − θμ}b exceeds c0 + c1 ) and, hence, by symmetry, that (1, 0) is the Nash equilibrium when—replacing L by 1 − L in (6.77)— 1 L ≥ 12 {1 + θμ} + c0 +c b . For example, in Figure 6.9, (0, 1) is the Nash equilibrium for L ≤ 0.05 and (1, 0) is the Nash equilibrium for L ≥ 0.95. More generally, we can compute the Nash equilibrium as the unique intersection of the player’s optimal reaction sets, which we write as R1 (L) and R2 (L) to emphasize their dependence on L.23 With g(u) = 0,24 the sign of ∂f1 /∂u in (6.73) is determined by the expression in large round parentheses. Solving ∂f1 /∂u = 0 for u reveals that it is convenient to define a function φ with two arguments by 1 2{1 − G(s)} (6.78) φ(s, L) = 2(c0 + c1 ) + 2c1 + θb G(s) ( ) " s θ + 2L − 1 + yg(y) dy b ; G(s) 0 for example, if strength is uniformly distributed, we obtain 1 2(1 − s) 1 (6.79) φ(s, L) = 2(c0 + c1 ) + 2L − 1 + 2 θs b + . 2c1 + θb
s
Then, on using (6.63) and (6.66) along the curve u = φ(v, L) we have in (6.73), ∂f1 /∂u = 0 and ∂ 2 f1 /∂u2 = − c1 + 21 θb G(v)g(φ(v, L)), which is negative for v > 0, and ∂f1 /∂u = g(u) > 0 if v = 0 (Exercise 6.18). Thus the unique maximum of f1 occurs where u = φ(v, L) if 0 < φ(v, L) < 1, and otherwise on the right- or lefthand boundary of the unit square according to whether φ(v, L) ≥ 1 or φ(v, L) ≤ 0; equivalently, R1 (L) is the curve with equation u = max {min (φ(v, L), 1) , 0}. By symmetry, it follows immediately that the unique maximum of f2 occurs where v = φ(u, 1 − L) if 0 < φ(u, 1 − L) < 1, and otherwise on the upper or lower boundary of the unit square according to whether φ(u, 1−L) ≥ 1 or φ(u, 1−L) ≤ 0; equivalently, R2 (L) is the curve with equation v = max {min (φ(u, 1 − L), 1) , 0}. Note in particular that φ(1, L) = {2(c0 + c1 ) + (2L − 1 + θμ)b}/{2c1 + θb} is Player 1’s best reply to unconditional acceptance by Player 2 when φ(1, L) < 1.25 What we have shown is that the community game defined by f1 and f2 has a unique strong Nash equilibrium for all 0 < L < 1. We denote this unique pair of mutual best replies by (u∗ (L), v ∗ (L)) where, in view of (6.71), (6.80) 23
u∗ (L) = v ∗ (1 − L).
That is, R1 (L) = {(u, v) ∈ D | f1 (u, v, L) = max f1 (u, v, L)}, 0≤u≤1
R2 (L) = {(u, v) ∈ D | f2 (u, v, L) = max f2 (u, v, L)}, 0≤v≤1
where D denotes the unit square. 24 Except possibly at u = 0 or u = 1. See Footnote 51 on p. 278. 25 As illustrated by Figure 6.10, where μ = 12 , for (a) 0 ≤ L ≤ 0.394, (b) 0 ≤ L ≤ 0.375, (c) 0.025 ≤ L ≤ 0.475, and (d) 0 ≤ L ≤ 0.275.
6.6. Partitioning territory: a landmark game
243
Figure 6.10. How the evolutionarily stable threshold α varies with the landmark position L for different values of θ (the relative influence of strength and distance from the nest on effective fighting ability) and c0 (the fixed costs of fighting). In all cases, strength is uniformly distributed and the marginal value of territory expansion, the marginal cost of a unit of weakness, and the additional fighting cost incurred when both animals reject the landmark are b = 1, c1 = 0.4, and = 0.01, respectively.
What most interests us, however, is the population game associated with this community game. Accordingly, let α(L) denote the aggression threshold of a focal individual in a large population whose nests are at distance L from a landmark, so that strategy α is defined by (6.65) with s = α(L). Then strategy α is a best reply to itself and, hence, is strong evolutionarily stable strategy if α(L) = u∗ (L) or, which is exactly the same thing, if α(L) = v ∗ (1 − L). We illustrate in Figure 6.10 by showing how the ESS threshold varies with L for b = 1, c1 = 0.4, = 0.01, and two different values of c0 and θ; u∗ (L) is shown as a solid curve, and v ∗ (L) as a dashed curve.
244
6. Continuous Population Games
For c0 ≥ 12 θ{1−μ}b, it follows from above that the ESS has an explicit analytical form, namely, ( ) 2(c0 + c1 ) + (2L − 1 + θμ)b (6.81) α(L) = max min ,1 ,0 , 2c1 + θb
as illustrated by Figures 6.9 and 6.10(b)–(d) for a symmetric distribution. Although the condition for landmarks near the center of the line segment to be unconditionally accepted by both animals is independent of c1 by (6.76b), (6.81) shows that the threshold increases with c1 for landmarks nearer to a nest (Exercise 6.19): the greater the cost of being weak, the stronger an animal must be to reject the landmark. Moreover, although the ESS threshold is independent of for c0 ≥ 12 θ{1−μ}b, if c0 < 12 θ{1 − μ}b, then (6.81) no longer applies, e.g., at intermediate values of L in Figure 6.10(a). On the one hand, in this particular case, the ESS can still in principle be found analytically, because simultaneous solution of u = φ(v, L) and v = φ(u, 1 − L) yields u = φ(φ(u, 1 − L), L), which (6.79) reveals to be quartic in u for a uniform distribution. On the other hand, the resulting expression is too cumbersome to be useful, except perhaps for L = 12 , where it yields 2 (6.82) α 12 = c0 + c1 − + (c0 + c1 − )2 + (4c1 + θb) , 4c1 + θb
which increases with (Exercise 6.19). For all other values of L, numerical solution of u = φ(v, L), v = φ(u, 1 − L) is preferable. Figure 6.10 illustrates that an animal should tend to be more aggressive (i.e., pick a lower threshold for rejecting the landmark) than its neighbor when the landmark is on its side of the midpoint, and it should not necessarily be unconditionally aggressive when its neighbor is accepting a landmark that is very close to its nest. The latter result is perhaps surprising. But we have to remember that even though Player 1 knows its own strength, say s, it does not know the strength of the other player, which is merely a random number Y between 0 and 1. Suppose that such a Player 1 contemplates being aggressive against a Player 2 who is honoring a landmark at L = 0 (i.e., right at Player 1’s nest). The payoff to being aggressive will be bξ(s, Y ). The cost of being aggressive will be cL (s). So the expected net gain from aggression is " 1 (6.83) bξ(s, y) dy − cL (s) = 14 (2 − θ + 2θs)b − c0 − c1 (1 − s), 0
which increases with s but is often negative when s is small. For example, if we set b = 1, c0 = 0.05, and c1 = 0.4 as in Figures 6.10(a) and (c), then (6.83) is positive 3 when θ = 0.5, although it is always positive when θ = 0.1. only for s ≥ 26 Despite condition (6.76b), whether both animals honor the landmark in general depends on their strengths. In this regard, we define two probabilities, p and q. Let p(L) denote the probability that both animals honor a landmark at distance L from Player 1’s nest: (6.84)
p(L) = Prob(X < u∗ (L), Y < v ∗ (L)) " u∗ (L) " v∗ (L) g(x) g(y) dx dy = G(u∗ (L))G(v ∗ (L)). = 0
0
6.6. Partitioning territory: a landmark game
245
Figure 6.11. The probability p(L) that a landmark will be mutually agreeable (dashed curves) and the probability q(L) that an animal will accept a smaller territory than it would have obtained through fighting (solid curves) for b = 1, c1 = 0.4, = 0.01, θ = 0.5, and two values of c0 when strength has a symmetric Beta distribution with variance σ 2 = 1/12 (lowermost curves), σ 2 = 0.05, σ 2 = 1/44, or σ 2 = 0.01 (uppermost curves). The corresponding density functions are sketched in Figure 6.12(b), with Figures 6.12(e) and 6.12(h) matching the uppermost curves in (a) and (b), respectively. Note that q(0.5) = 0.5p(0.5) by (6.87), and hence that q(0.5) = 0.5 for c0 = 0.25: in view of (6.76b), the interval over which a landmark is unconditionally accepted by both animals does not depend on the variance in this case.
Let q(L) denote the probability that an individual whose nest is at distance L from the landmark accepts a smaller territory at the ESS than it would have won if the landmark had not been honored by both players: q(L) = Prob(X < u∗ (L), Y < v ∗ (L), ξ(X, Y ) > L).
(6.85)
For L ≤ 12 , if v ∗ (L) < (1−2L)/θ, then the whole of Region I in Figure 6.8 lies below the line y = x + (1 − 2L)/θ on which ξ(x, y) = L, and so q(L) = p(L); whereas if v ∗ (L) > (1 − 2L)/θ, then the line divides Region I into a triangle and a pentagon with ξ(x, y) < L inside the triangle,26 and so " v∗ (L)−(1−2L)/θ " v∗ (L) (6.86) q(L) = p(L) − g(x) g(y) dy dx. 0
For L >
1 2
x+(1−2L)/θ
we calculate q from
(6.87)
q(L) = p(1 − L) − q(1 − L)
(Exercise 6.20). 26 Because v ∗ (L) − u∗ (L) is a nonincreasing function of L and (1 − 2L)/θ is a strictly decreasing one, they can assume the same value at most once, and it occurs at L = 12 . Because the second function exceeds the first as L → 0, it follows that (1 − 2L)/θ ≥ v∗ (L) − u∗ (L) for all L ≤ 12 . So the line y = x + (1 − 2L)/θ cannot divide Region I into two quadrilaterals—that would require v ∗ (L) > u∗ (L) + (1 − 2L)/θ).
246
6. Continuous Population Games
To illustrate, we assume that strength has a Beta distribution with density
(6.88)
g(s) =
Γ(m1 + m2 ) m1 −1 s (1 − s)m2 −1 , Γ(m1 )Γ(m2 )
m1 , m2 > 0 and Γ denotes the Euler Gamma function (defined by Γ(ρ) = where ∞ (−ξ) ρ−1 e ξ dξ). If B denotes the incomplete Beta function (i.e., B(s, m1 , m2 ) = 0s m −1 1 ξ (1 − ξ)m2 −1 dξ), then the mean μ, variance σ 2 and cumulative distribu0 tion function G are given by μ = m1 /(m1 + m2 ), σ 2 = μ2 (1 − μ)/(μ + m1 ), and Γ(m1 )Γ(m2 )G(s) = Γ(m1 + m2 )B(s, m1 , m2 ). We assume that m1 , m2 ≥ 1, so that the distribution is unimodal. If m1 = m2 = m, say, then m becomes an inverse measure of variability and the distribution is symmetric, as in Figure 6.11, where the four different variances σ 2 = 1 1 1 1 2 2 2 12 , σ = 20 , σ = 44 , and σ = 100 correspond to m = 1 (uniform distribution), m = 2, m = 5, and m = 12, respectively. For these four distributions, Figure 6.11 shows p(L) as a dashed curve, which is symmetric about L = 12 , and q(L) is shown as a solid curve. Figure 6.11 illustrates that the range of unconditionally accepted landmarks is greater when the fixed cost c0 is higher. Note that q(L) ≤ p(L) with q(L) = p(L) for L ≤ 12 (1 − θ) and q(L) = 0 for L ≥ 12 (1 + θ), because (6.63) implies 1 1 2 (1 − θ) ≤ ξ ≤ 2 (1 + θ). If m2 > m1 > 1, then the distribution is unimodal with positive skew, as in Figure 6.12(a) with m1 = 3 and m2 = 12; whereas if m1 > m2 > 1, then the distribution is unimodal with negative skew, as in Figure 6.12(c) with m1 = 12 and 1 m2 = 3. Both of these distributions have the same variance, σ 2 = 100 , as the most peaked of the four symmetric distributions plotted in Figure 6.12(b). Together, Figures 6.11 and 6.12 show that the probability of an animal accepting a smaller territory than it would have obtained through fighting can be remarkably high, being greatest when the landmark is somewhat displaced from the midpoint of the line segment towards the animal’s nest. For a symmetric distribution, the probability is higher when the variance is lower; and for a given variance, the probability is even higher when the distribution is positively skewed, as illustrated by Figure 6.12. The low variances under which landmarks are especially likely to be used are consistent with data on animal populations. Distributions of fighting ability are typically unimodal and fairly symmetric with coefficients of variation27 between 5 and 20% [212, p. 691]. Thus 0.0025μ2 ≤ σ 2 ≤ 0.04μ2 , implying that the variance will lie between about 0.625 × 10−3 and 10−2 for a distribution that is almost symmetric (μ ≈ 0.5), and in any event, will not exceed 0.04. Thus, of all the curves in Figure 6.11, the uppermost ones appear most relevant, and the distributions in Figure 6.12 are by no means unrealistic.
27 The coefficient of variation is defined as 100σ/μ%, where σ is the standard deviation (square root of the variance) of the distribution and μ is the mean.
6.6. Partitioning territory: a landmark game
Figure 6.12. The probabilities p(L) that a landmark will be mutually agreeable (dashed curves) and q(L) that an animal will accept a smaller territory than it would have obtained through fighting (solid curves) for b = 1, c1 = 0.4, = 0.01, θ = 0.5, and two values of c0 when strength has a Beta distribution with variance σ 2 = 0.01 and mean μ = 0.2 (left-hand column), μ = 0.5 (center column), or μ = 0.8 (right-hand column). The density function for each column is the solid curve in the top panel. The dashed curves in (b) correspond to the three distributions in Figure 6.11 whose variance exceeds 0.01.
247
248
6. Continuous Population Games
6.7. The influence of contests on clutch size Some animals reproduce only once in their lifetime and are, therefore, said to be semelparous. Pacific salmon are surely the best known example, but many insects also fall into this category. One such insect is a tiny wasp called Goniozus nephantidis, or G. nephantidis for short, which paralyses the larvae of another insect and lays its eggs on these hapless hosts, eventually killing them—for which reason, it is called a parasitoid. But how many eggs should G. nephantidis lay? We address this question here. Given that G. nephantidis is semelparous, one would expect it to behave so as to maximize fitness from its only clutch. Among biologists, the clutch size that maximizes fitness per clutch is known as the Lack clutch size [136, p. 272], and its calculation has traditionally been viewed as an exercise in nonstrategic optimization (corresponding to the upper left-hand quadrant of Figure 0.2). Fitness per clutch equals clutch size times fitness per egg. Let U denote clutch size, and let S(U ) denote fitness per egg, which we assume to decrease with U , because larger clutches mean more larvae competing for the same essential food resource— the host; and a less well-nourished larva will have lower fitness, by virtue of lower survival probability and smaller size on emergence. Then fitness per clutch equals U S(U ), which we assume to be a unimodal function with its maximum at U = L. We will refer to L as the traditional Lack clutch size. Hardy et al. [131, p. 128] calculated traditional Lack clutch size in G. nephantidis for hosts of a given size, and found it to be approximately 18. But the observed clutch size for hosts of the same size was only about 8–10. Why? Why does G. nephantidis lay only about half as many eggs as one would expect? As in §6.2, we are presented with a paradox. After paralysing a host, a female G. nephantidis waits 1–3 days before laying its eggs on it. During this interval between paralysis and parasitism of the host, its owner may be challenged to a contest by a conspecific intruder, a paralysed host being of equal value to both owner and intruder [270, p. 1364]. All other things being equal, larger adults are more likely to win contests than smaller adults, and larger adults are more likely to emerge from smaller clutches. We thus have a possible resolution of our paradox, namely, that traditional Lack clutch size in G. nephantidis, by virtue of being a nonstrategic optimum, fails to account for the influence on clutch size of such owner-intruder contests. To explore this idea, we develop a game. For the sake of simplicity, we consider a population of females that produce only females, which is a close approximation to the highly female-biased sex ratios observed in G. nephantidis [131]. Let the number of eggs in a clutch laid by an individual in the general population be V , and let U now represent the clutch size laid by a potential mutant; daughters are assumed to inherit their mothers’ strategies. Although U and V are in practice integers, we assume for simplicity that they vary continuously over positive real numbers. Likewise for simplicity, we assume that there are only two adult body sizes, small and large. We further assume that all hosts are of equal size, each surviving egg becomes a large adult or a small adult independently of any other egg and
6.7. The influence of contests on clutch size
249
the probability of becoming large decreases with clutch size. For an egg laid by a U -strategist, let Φ(U ) be its probability of becoming large. Then we assume that Φ (U ) < 0 for U > 0, and we continue to assume that S (U ) < 0, where S(U ) denotes fitness per egg in the absence of contest behavior. Although contests tend to be won by larger females, prior ownership also confers an advantage [270], which our model must allow for. We therefore assume that, in a contest between two small individuals, or between two large individuals, or between a large owner and a small intruder, the owner always wins; however, in a contest between a small owner and a large intruder, winning depends upon the advantage of ownership versus the advantage of size. Accordingly, we introduce a measure of the first advantage relative to the second by defining ρ to be the probability that a small owner defeats a large intruder. We refer to ρ as owner’s relative advantage.28 We continue to assume that an animal’s reward is the fitness of its clutch, the expected number of surviving offspring from a suitable host—one that is unparasitized, though possibly already paralysed. Because contest behavior makes this reward depend on the behavior of other animals, however, the clutch size that we expect to see is no longer the traditional Lack clutch size, but rather the ESS of a population game that describes how the wasps interact. Since the reward is still fitness per clutch, we will refer to this ESS as the true Lack clutch size. Our analysis is simplified if true Lack clutch size is scaled with respect to traditional Lack clutch size. We therefore define u = U/L, v = V /L, and we rewrite the fitness per egg and probability of becoming large as s(u) = S(Lu) and φ(u) = Φ(Lu), respectively. Thus our assumptions become s (u) < 0, φ (u) < 0 for u > 0, and the traditional fitness of a clutch whose size is proportion u of the traditional Lack clutch size becomes (6.89)
F (u) = Lus(u).
By assumption, F is a unimodal function whose maximum occurs where u = 1. For the sake of simplicity, we now further assume that F (u) < 0 for all u ≤ 1. Let Z(T ) be the probability that a suitable host located at time T is being guarded (i.e., it already has an owner), let Y (T ) be the probability that an owner will subsequently be intruded upon by another insect, and let k be the probability that a host is never found (during the entire vulnerable period of its development). Then (6.90)
(1 − Z(T ))(1 − Y (T )) = k
because, from our protagonist’s point of view, k is simply the probability 1 − Z(T ) that a host has not already been found times the probability 1 − Y (T ) that it is not subsequently discovered: the bigger the value of Z(T ), the smaller the value of Y (T ). In particular, if the vulnerable period of development has length p and time to next arrival follows an exponential distribution with parameter a,29 then Z(T ) = 1 − e−aT , Y (T ) = 1 − e−a(p−T ) , and (6.91) 28
k = e−aT e−a(p−T ) = e−ap .
In terms of Footnote 18 on p. 165, ρ is a measure of relative RHP. −aξ Which, in terms of (2.70), has on (0, ∞), hence distribution density function g(ξ) = ae function G(s) = 1 − e−as with mean 0∞ ξg(ξ) dξ = 1/a. 29
250
6. Continuous Population Games
We assume for simplicity that a, and hence also k, is independent of the population clutch size, v.30 We assume that there is a very narrow time window in which an insect can actually acquire a host, and thus that each host is the subject of at most one contest. To be able to reproduce, females must either find an unguarded host, and defend it against at most one intruder, or take over a guarded host in a contest. To compute the reward to a u-strategist in a population of v-strategists, let f (u, v) denote this reward, i.e., the expected number of offspring from a host that has just been discovered. First we compute the payoff from a host located at time T , conditional on being large. With probability 1 − Z(T ) the host is unguarded; in which case, the owner is guaranteed to keep the host, because even if it is subsequently intruded upon, the owner will win the contest by virtue of being large. On the other hand, with probability Z(T ), the host is guarded; in which case, the intruding u-strategist wins (and hence keeps) the host only if its owner is small, and then only with probability 1 − ρ. The owner is a v-strategist, and therefore small with probability 1−φ(v). So the probability that a large u-strategist retains a host is 1 − Z(T ) + Z(T ){1 − φ(v)}(1 − ρ); and so its payoff, conditional on being large, is {1 − Z(T ) + Z(T ){1 − φ(v)}(1 − ρ)}F (u). See the left-hand branch of the event tree in Figure 6.13. Next we compute the payoff from a host located at time T , conditional on being small. Then, with probability 1 − Z(T ) the host is unguarded, in which case the owner will keep the host unless subsequently disturbed by a large intruder who wins, which happens with probability 1 − ρ. The intruder is a v-strategist and, therefore, large with probability φ(v). The small owner will thus keep the host with probability 1 − Y (T )φ(v)(1 − ρ). With probability Z(T ), on the other hand, the host is guarded, in which case, the intruding u-strategist has no chance of winning because it is small. So, the probability that a small u-strategist retains a host it discovers is (1 − Z(T )){1 − Y (T )φ(v)(1 − ρ)}, and its payoff, conditional on being small, is (1 − Z(T )){1 − Y (T )φ(v)(1 − ρ)}F (u). However, the probabilities of being small and large (conditional upon survival) are φ(u) and 1 − φ(u), respectively. Thus the payoff from a host located at time T is {1 − Z(T ) + Z(T ){1 − φ(v)}(1 − ρ)}F (u)φ(u) + (1 − Z(T )){1 − Y (T )φ(v)(1 − ρ)}F (u){1 − φ(u)}. But T is a random variable, and so we compute the expected value of this payoff over the distribution of T to obtain the animal’s reward. Thus, if E denotes expected value, on using (6.90) we obtain f (u, v) = E [{(1 − ρ)(Z(T )(1 − φ(v)) + Y (T )(1 − Z(T ))φ(v))φ(u) + (1 − Z(T ))(1 − Y (T )(1 − ρ)φ(v))} F (u)] = E [1 − Z(T ) + (1 − ρ){Z(T )(1 − φ(v))φ(u) − (1 − k − Z(T ))(1 − φ(u))φ(v)}] F (u) 30 In nature, however, the mean time to next arrival at a site, i.e., 1/a, will be lower at higher population density, which increases with the fitness per clutch in the population, and hence with F (v). The effect of a increasing with F (v) is readily incorporated [221, p. 975], and is similar to that of reducing k in Figure 6.14: as intuition would suggest, clutch size optima decrease with strength of density dependence.
6.7. The influence of contests on clutch size
251
Figure 6.13. The event tree for the clutch-size game, showing payoffs to the protagonist (u-strategist). The left-hand branches are conditioned on being large, the right-hand ones on being small. The (conditional) probability associated with each branch is written across it. Internal vertices are represented by dots, and leaves by squares; payoffs are in rectangular boxes. To obtain the reward, multiply each payoff by the product of the probabilities on the branches leading to the associated leaf, then sum. A free (unguarded) host is assumed to be accepted.
or (6.92) f (u, v) =
1 − z + (1 − ρ){z(1 − φ(v))φ(u)
− (1 − k − z)(1 − φ(u))φ(v)} F (u)
where (6.93)
z = E [Z(T )] .
In particular, if time to next arrival has an exponential distribution with parameter a and if T is uniformly distributed between 0 and p (the length of the vulnerable period of development), then (6.91) implies " p 1 1−k (6.94) z = Z(t) dt = 1 + 0
p
ln(k)
(Exercise 6.21). Note that f = (1 − z)F when ρ = 1. Thus—because z is independent of u regardless of whether it depends on v, so that the multiplicative factor 1 − z can have no effect on the maximization of (1 − z)F —if the advantage of ownership is sufficiently strong that even small owners cannot lose, then the ESS invariably agrees with the traditional Lack clutch size (as also when contests cannot occur, because f = F when z = 0 and k = 1).
252
6. Continuous Population Games
For ρ < 1, however, it follows from our assumptions that f is decreasing for u ≥ 1,31 so that f has its maximum where u < 1 for any given v; and if this maximum occurs where u = v, so that v is a best reply to itself, then v is an ESS. We will denote this ESS by v ∗ . Thus v ∗ < 1: the true Lack clutch size is smaller than the traditional Lack clutch size, and because v ∗ increases with ρ and k (as Figure 6.14 will shortly demonstrate), the true Lack clutch size is smallest in the limit as ρ → 0 and k → 0. A contest is then inevitable, and f (u, v) approaches (1−φ(v))φ(u)F (u), so that the minimum optimal clutch size is the u that maximizes large-offspring fitness φ(u)F (u). We denote this minimum optimal clutch size by . In order to calculate the true Lack clutch size for ρ < 1, we must first choose explicit forms for both s and φ. Accordingly, we now set (6.95)
s(u) = e−u .
Because s(0) cannot affect where f is maximized, no generality is lost by assuming s(0) = 1, and it follows from (6.89) that all of our assumptions about F are satisfied. Specifically, F is unimodal with its maximum at u = 1, and F (u) < 0 for all u ≤ 1. We note in passing that other such forms of s yield qualitatively similar results. We now turn to φ. On the one hand, intuition strongly suggests that increasing clutch size will cause a smaller reduction in the probability of being large when clutch size is small than when clutch size is somewhat larger, i.e., ∂ 2 φ/∂u2 < 0 when u is small. On the other hand, the probability of being large must eventually approach zero as clutch size increases, and so ∂ 2 φ/∂u2 must eventually become positive. Let c denote the critical clutch size at which the graph of φ thus has an inflection point. Then we need a form satisfying ∂ 2 φ/∂u2 < 0 for u < c but ∂ 2 φ/∂u2 > 0 for u > c. We choose to satisfy these conditions by setting (6.96)
φ(u) = φ0 e− 2 (u/c) 1
2
with φ0 ≤ 1. It now follows from the end of the previous section that true Lack clutch size satisfies < v ∗ < 1, where (6.97) = 12 c c2 + 4 − c (Exercise 6.22). In the case where c ≥ 1, (6.96) implies φ (u) < 0 for all u < 1, so it is easy to demonstrate from (6.92) and our prior assumptions that ∂ 2 f /∂u2 < 0 for all u < 1 (Exercise 6.23), and it turns out that ∂ 2 f /∂u2 < 0 continues to hold for c < 1 as well, as long as c is not too small. Then, for any given v, f has a unique = 0, and this local maximum is also the local maximum at u = u ˜ defined by ∂f ∂u u=˜ u ˜=v global maximum of f . Thus the ESS v ∗ is the unique value of v for which u and 0 < v < 1. That is, v = v ∗ is the unique solution of the equation (6.98) c2 (1 − v) 1 − z + (1 − ρ)(2z − 1 + k){1 − φ(v)}φ(v) = v 2 (1 − ρ){z + (1 − k − 2z)φ(v)}φ(v) 31 The expression in (6.92) is a product of two positive factors: The first decreases with respect to u because φ (u) < 0. The second decreases for u > 1 because F is unimodal, with maximum at u = 1. So, by the product rule, for all u ≥ 1, ∂f /∂u is the sum of a negative and a nonpositive (since F (1) = 0) term, and hence is negative.
6.7. The influence of contests on clutch size
253
Figure 6.14. The ESS as a function of ρ for φ0 = 1 in (6.96), and various values of the critical clutch size for various values of the probability that a host is never found, namely, k = 0.5 (uppermost curve), k = 0.25, k = 0.1, k = 0.01, k = 10−6 , and k → 0 (lowermost curve, corresponding to (6.97))
(Exercise 6.23), where z is defined in terms of k by (6.93) and φ is defined by (6.96). This is a transcendental equation, which cannot be solved analytically, but is readily solved by numerical means. It is also readily verified, again numerically, that v ∗ is continuously stable. The resultant ESS is plotted in Figure 6.14, confirming that v ∗ increases with ρ and k. For sufficiently small values of c, there arises the possibility that ∂ 2 f /∂u2 changes sign twice, from negative to positive and back, for 0 < u < 1, so that f has two local maxima.32 Nevertheless, it is straightforward to determine which of these local maxima yields the global maximum, and where this clutch size yields a best reply to itself, there is again a unique ESS. −4 Suppose, for example, that φ0 = 1 and c = 0.2 in (6.96) with ρ = 0.5, k = 10 , ∂f and hence z ≈ 0.8914. Then the equation ∂u u=v = 0 is satisfied by three different values of v, namely, v = v1∗ ≈ 0.2249, v = v2∗ ≈ 0.5973, and v = v3∗ ≈ 0.9996 (Exercise 6.23). In the first instance, therefore, there are three candidates for an interior ESS. Because ∂ 2 f /∂u2 |u=v=v2∗ ≈ 0.2111L > 0 fails to satisfy (2.69), v2∗ can be excluded; it corresponds to a local minimum of f (u, v2∗ ), as opposed to a local maximum. Because ∂ 2 f /∂u2 |u=v=v1∗ ≈ −1.021L and ∂ 2 f /∂u2 |u=v=v3∗ ≈ −0.0396L, both satisfy (2.69) with strict inequality, v1∗ and v3∗ are local ESSs; moreover, because ∂ 2 f /∂u∂v|u=v=v1∗ ≈ −0.1426L and ∂ 2 f /∂u∂v|u=v=v3∗ ≈ −5.492 × 10−10 L, 32 To indicate why: When c is small, φ falls rapidly to zero as u increases, so that f becomes approximately F (u) times a positive function of v and hence has approximately the shape of F (u) with a maximum near 1—except when u is small. In that region, f is the product of increasing F (u) and a function that decreases rapidly because φ does so, thus generating a second local maximum nearer 0.
254
6. Continuous Population Games
f/L
f/L 0.03716
0.06 ρ = 0.5 0.037155 ρ = 0.5
v∗2
0.59
0.03
0.603
u
ρ = 0.5 f /L
ρ = 0.9
0.039938547 ρ = 0.5
0.039938546 0.039938545
0.9994 v∗3 0.9999
0 0
v∗1
0.4
v∗2
0.8
u v∗3
u
Figure 6.15. The relationship between fitness and interior ESS candidates when φ0 = 1, c = 0.2 and k = 10−4 (hence z ≈ 0.8914) for two values of ρ. A large dot denotes the unique (global) ESS, a small dot denotes a local ESS. The small dot representing v3∗ = 0.9996 for ρ = 0.5 is hidden by the large dot representing the ESS v ∗ = 0.9999 for ρ = 0.9; however, the lower inset confirms that v3∗ = 0.9996 is a local ESS. The solid curves show f (u, v1∗ ) for ρ = 0.5 and f (u, v ∗ ) for ρ = 0.9. On all dashed curves, ρ = 0.5; the main panel and the lower inset show f (u, v3∗ ), whereas the upper inset shows f (u, v2∗ ), confirming that v2∗ is a local minimum of f (u, v2∗ ) and is therefore not even a local ESS. Fitness is scaled with respect to traditional Lack clutch size L.
v1∗ and v3∗ are both continuously stable by (2.123). But v3∗ is a best reply to itself only if u is restricted to exceed about 0.4745 (and there is no reason why it should be), whereas v1∗ is the best reply to itself for all u ∈ [0, 1]. So v1∗ is the unique ESS values (and v3∗ is only a local ESS). If, on the other hand, ρ = 0.9 with the same ∂f ∗ = 0 has only one solution, v = v ≈ 0.9999, for the other parameters, then ∂u u=v which again is the unique ESS—an interior ESS, albeit very close to 1. These points are illustrated by Figure 6.15. Extensive numerical experimentation along these lines reveals the broader picture, which is that clutch size at the ESS is small or large according to whether ρ is small or large. However, there is an intermediate range of values of ρ for which no monomorphic ESS exists because a small clutch size and a large clutch size both
6.7. The influence of contests on clutch size
255
f /L 0.04 0.03 0.02 0.01 0
0
v*1
0.5
v*3
u
Figure 6.16. The relationship between fitness and interior ESS candidates when φ0 = 1, c = 0.2, k = 10−4 (hence z ≈ 0.8914) for ρ = 0.7. A dot denotes a local ESS. The lower curve shows f (u, v1∗ ); the upper curve, f (u, v3∗ ). As in Figure 6.15, fitness is scaled with respect to traditional Lack clutch size L.
have the property of being stable to small perturbations in clutch size, although a population at either clutch size is readily invaded by the other. This point is illusin Figure trated by Figure 6.16 for ρ = 0.7, with the same values of φ0 , c, and k as ∂f ∗ = 0 has three interior solutions, namely, v = v ≈ 0.2624, 6.15. Here again ∂u u=v 1 ∗ ∗ 2 2 v = v2 ≈ 0.5311, and v = v3 ≈ 0.9998 with ∂ f /∂u |u=v=v1∗ ≈ −0.4922L < 0, ∂ 2 f /∂u2 |u=v=v2∗ ≈ 0.2065L > 0, and ∂ 2 f /∂u2 |u=v=v3∗ ≈ −0.03974L < 0. So again (2.69) excludes v2∗ , and v1∗ , v3∗ are both local ESSs. This time, however, neither is an ESS, because Figure 6.16 shows that f (v1∗ , v3∗ ) > f (v3∗ , v3∗ ) implying that v1∗ can invade v3∗ , and that f (v3∗ , v1∗ ) > f (v1∗ , v1∗ ) implying that v3∗ can invade v1∗ . For such intermediate values of ρ, the evolutionarily stable state is a polymorphic mixture of the two clutch sizes, whose stable proportions may be found, as in §6.2. Let the clutch sizes v1∗ and v3∗ be found at frequencies x1 and x2 , respectively, where x1 + x2 = 1. Then the reward to the smaller clutch size against the population is f (v1∗ , v1∗ )x1 + f (v1∗ , v3∗ )x2 , the reward to the larger clutch size against the population is f (v3∗ , v1∗ )x1 + f (v3∗ , v3∗ )x2 , and the first exceeds the second by f (v1∗ , v3∗ ) − f (v3∗ , v3∗ ) − {f (v1∗ , v3∗ ) − f (v3∗ , v3∗ ) + f (v3∗ , v1∗ ) − f (v1∗ , v1∗ )}x1 , which is positive when x1 is small but negative when it is large (near 1). So for small x1 , v1∗ will increase in frequency, for large x1 , v1∗ will decrease in frequency, and the population will stabilize where the difference is zero, or (6.99)
x1 =
f (v1∗ , v3∗ ) − f (v3∗ , v3∗ ) . f (v1∗ , v3∗ ) − f (v3∗ , v3∗ ) + f (v3∗ , v1∗ ) − f (v1∗ , v1∗ )}x1
The overall dependence on ρ of the ESS—regardless of whether it is monomorphic or polymorphic—is illustrated by Figure 6.17. In terms of the paradox we set out to resolve, however, what is most important is to have demonstrated the existence of a parameter regime in which the true Lack clutch size is predicted to be around half of the traditional one. Specifically, from Figure 6.14, v ∗ ≈ 12 when the value of k is low and that of c is also low, but not so low as to shift the ESS into the polymorphic regime of Figure 6.17. So contest behavior remains a good candidate explanation for the observed clutch-size discrepancy in empirical work on G. nephantidis [131].
256
6. Continuous Population Games
Figure 6.17. The ESS as a function of ρ for φ0 = 1, c = 0.2, and k = 10−4 . In the unshaded region on the left, the ESS is monomorphic with small clutch size; in the unshaded region on the right, the ESS is monomorphic with large clutch size. In between, shading indicates the intermediate range of values of ρ where the evolutionarily stable state is a polymorphic mixture of the (local ESS) clutch sizes above and below the shading. The inset shows how the proportion of small clutch size v1∗ falls from 1 to 0 as ρ increases from the value at which f (v3∗ , v1∗ ) = f (v1∗ , v1∗ ) to the value at which f (v1∗ , v3∗ ) = f (v3∗ , v3∗ ), and the proportion of large clutch size v3∗ correspondingly rises. Note that the upper curve, which appears to be horizontal, in fact rises very slowly with ρ.
The more general prediction, that anticipation of future competition for hosts reduces clutch size, is consistent with results of later experiments on G. nephantidis by Goubault et al. [114]. But the model, although developed to answer a question about this species, also applies to other species, and support for its predictions emerges from experiments by Creighton on a burying beetle, Nicrophorus orbicollis [73, p. 1035].
6.8. Models of vaccination behavior The use of evolutionary game theory in epidemiology began with two papers by Bauch et al. [19, 20]. Here we study simplified versions of their models, which we recast to conform to our notation. The first model envisions a sudden outbreak of smallpox as the result of bioterrorism. We consider a population of humans who are susceptible to smallpox. With regard to getting vaccinated, each individual is designated as either Type I or Type II. Type I, or vaccinator , means getting vaccinated pre-emptively. Type II, or delayer , means seeking vaccination in the event of a smallpox attack. In this population, a strategy is a probability of choosing to be Type I. Assume that the vaccine, if taken, is 100% efficacious. Let dv be the probability of dying from the vaccine if you are Type I. You want this to be as low as possible. Since minimizing a quantity is the same as maximizing its negative, we define (6.100)
EI = −dv
6.8. Models of vaccination behavior
257
to be the payoff for being Type I. Let q be the population strategy, which we assume to equal the proportion of pre-emptively vaccinated individuals in the population.33 Let φs (q) be a delayer’s probability of becoming infected after a smallpox attack; assume that all delayers are duly vaccinated—late—after such an attack (if fortunate enough not to catch smallpox first), and let ds be the probability of dying from smallpox if infected by it. Then a delayer’s probability of dying in the event of an attack is ds times φs (q) plus dv times the probability of the complementary event, or ds φs (q) + dv {1 − φs (q)}. To obtain an individual’s probability of dying from being a delayer, we must multiply the probability of dying in the event of an attack by the probability of an attack, which we denote by r (for risk ); we assume that r < 1. Thus, by analogy with (6.100), we obtain the payoff for being Type II as EII (q) = −r(ds φs (q) + dv {1 − φs (q)})
(6.101)
= −r({ds − dv }φs (q) + dv ).
We also assume that (6.102)
dv < ds
and that (6.103)
φs (q) < 0,
φs (1) = 0,
so that EII (q) increases with q. Bauch and Earn [20] determined their φs from an epidemiological model with considerable empirical input.34 For the sake of simplicity, however, we largely assume that (6.104)
φs (q) = λ(1 − q)κ
with λ = φs (0) ≤ 1 < κ, so that (6.103) clearly holds. Then (6.105)
EII (q) = −r(λ{ds − dv }(1 − q)κ + dv )
with EII (q) > 0, by (6.101). The reward to a p-strategist in a population of q-strategists is now seen to be
(6.106)
f (p, q) = pEI + (1 − p)EII (q) = EII (q) + {EI − EII (q)}p,
and so we are ready to calculate the optimal reaction set (6.107) R = (p, q) ∈ D | f (p, q) = max f (p, q) , p
33 On the grounds that dv is very small compared to 1. After pre-emptive vaccination, the proportions of survivors and nonsurvivors among the vaccinated subpopulation are q · (1 − dv ) and qdv , respectively. All others (proportion 1 − q of the whole population) are unvaccinated and, therefore, also survive. Thus, removing the nonsurviving pre-emptively vaccinated from both populations, the effective proportion of pre-emptively vaccinated individuals among the population of survivors is
q(1 − dv ) q − qdv # surviving despite vaccination = = # surviving either way q(1 − dv ) + 1 − q 1 − qdv
=
q(1 − dv ) ≈ q 1 − qdv
when dv 1. 34 In much the same way that φ in (6.131) below is determined by (6.123). However, they used a model with five compartments and hence five ordinary differential equations instead of three. They used the extra compartments to distinguish infectious individuals from those who are infected but not yet infectious, and to distinguish those removed from the susceptible subpopulation through vaccination from those removed through contracting the disease—in either case, whether by dying or surviving and thus becoming immune.
258
6. Continuous Population Games
Figure 6.18. The optimal reaction set and ESS
where D = [0, 1] × [0, 1]. Let us first suppose that EI − EII (0) ≤ 0 or (6.108)
rλds ≤ {1 − (1 − λ)r}dv .
Then because EII is an increasing function, the coefficient of p in (6.106) is always either negative or zero. In the first case, for given q, f (p, q) is maximized by p = 0, and so R = {(u, v)|p = 0, 0 ≤ q ≤ 1}, shown as a thick solid line in Figure 6.18(a). In the second case, f (p, q) is maximized by p = 0 for q > 0 and by any p for q = 0 (when f (p, q) equals the constant −dv ). Thus R = {(u, v)|p = 0, 0 ≤ q ≤ 1} ∪ {(u, v)|0 ≤ p ≤ 1, q = 0}, again shown as a thick line in Figure 6.18(a), with the extra segment dashed. Either way, the unique intersection of R with p = q is (0, 0), and so q ∗ = 0 is the unique ESS.35 It is easiest to see why this result makes sense in the case where λ = 1 and (6.108) reduces to rds ≤ dv : if the probability of dying from an outbreak does not exceed the probability of dying from the vaccine, then it makes no sense to get vaccinated. Let us now suppose instead that EI − EII (0) > 0 or (6.109)
rλds > {1 − (1 − λ)r}dv .
Because (6.103) implies EI − EII (1) = (r − 1)dv < 0 and EI − EII (q) is decreasing, there must exist q ∗ ∈ (0, 1) such that (6.110)
EI − EII (q ∗ ) = 0,
regardless of whether (6.104) is assumed, and in that special case (1 − r)dv 1/κ (6.111) q∗ = 1 − . λr(ds − dv )
Because EII is an increasing function, the coefficient of p in (6.106) is positive for q ∈ [0, q ∗ ), zero for q = q ∗ , and negative for q ∈ (q ∗ , 1]. So f (p, q) is maximized for 35 If EI − EII (0) < 0, then the ESS is strong, because 0 is uniquely the best reply to itself: f (0, 0) − f (p, 0) = (EII (0) − EI )p > 0 for all p = 0, because EII is strictly increasing. If EI − EII (0) = 0, then the ESS is weak, with every p = 0 being an alternative best reply to 0 because f (p, 0) = f (0, 0) = EI for all such p. However, f (0, p) − f (p, p) = (EII (p) − EI )p > 0 for all p = 0 to satisfy (2.17c), again because EII is strictly increasing.
6.8. Models of vaccination behavior
given q by (6.112)
259
⎧ ⎪ 1 if 0 ≤ v < q ∗ , ⎨ p = B(q) = any p ∈ [0, 1] if q = q ∗ , ⎪ ⎩ 0 if q ∗ < q ≤ 1.
The set of all (p, q) that satisfy (6.112) is R: it consists of three straight-line segments joining (0, 1) to (0, q ∗ ) to (1, q ∗ ) to (1, 0), as shown in Figure 6.18(b). Because (q ∗ , q ∗ ) is the only point in which R intersects the line p = q, (q ∗ , q ∗ ) is the only symmetric Nash equilibrium, and so q ∗ is the only candidate for ESS. It cannot be a strong ESS, because there are infinitely many alternative best replies: f (p, q ∗ ) = f (q ∗ , q ∗ ) = EII (q ∗ ) = EI for all p by (6.106) and (6.110), so that (2.17b) never holds. Nevertheless, q ∗ is a weak ESS. From (6.106) we obtain f (q ∗ , p) − f (p, p) = EII (p) + {EI − EII (p)}q ∗ − {EII (p) + {EI − EII (p)}p} or (6.113)
f (q ∗ , p) − f (p, p) = {EI − EII (p)}(q ∗ − p).
But EI −EII (p) and q ∗ −p are both strictly decreasing and change sign from positive to negative at p = q ∗ . Hence the product in (6.113) is invariably positive for p = q ∗ , establishing that q ∗ is a weak ESS.36 In a population with frequency q of pre-emptive vaccination, the probability of dying from vaccination or from smallpox is q times the probability of dying for Type I plus 1 − q times the probability of dying for Type II. From society’s vantage point, this probability should be as low as possible. We can regard it as the cost of vaccination level q. Denoting this cost by C(q), we infer from the negatives of (6.100) and (6.101) that (6.114)
C(q) = qdv + (1 − q)r({ds − dv }φs (q) + dv ),
implying (6.115)
C (q) = (1 − r)dv − rλ(κ + 1)(ds − dv )(1 − q)κ
with (6.116)
C (q) = rκλ(κ + 1)(ds − dv )(1 − q)κ−1 > 0,
so that C(q) has a minimum at q = qm , where (1 − r)dv (6.117) qm = 1 −
rλ(κ + 1)(ds − dv )
1/κ .
36 In this regard our terminology differs from that of Bauch and Earn [20], who refer to q ∗ (which they denote by pind ) as a “strict Nash equilibrium” in the online appendix to their paper. In our view, it is neither strict (in this context, merely a synonym for strong) nor a Nash equilibrium. Bauch and Earn obtain their result by applying (2.12) directly, that is, by showing that f (q ∗ , q) > f (p, q) for p = q ∗ , where q = p + (1 − )q ∗ . By (6.106), of course, f (q ∗ , q) > f (p, q) is equivalent to {EI − EII (q)}(q ∗ − p) > 0, which holds when p = q ∗ for the same reason that (6.113) is then positive: because EII is a strictly increasing function, and EI − EII (q) is strictly decreasing and changes sign from positive to negative at p = q ∗ , just like both EI − EII (p) and q ∗ − p. By this token, however, every ESS would be a strong ESS because, as noted in §2.2 (p. 65), f (q ∗ , q) > f (p, q), p = q ∗ , and (2.17) are equivalent (with p and q replacing u and v). But (2.17) allows us to differentiate between the two subcategories of weak and strong, whereas (2.12) does not. Bauch and Earn appear to have no use for this distinction. The reason that q ∗ is not a Nash equilibrium is far simpler: a Nash equilibrium is a strategy combination, whereas an ESS is a strategy. Again as noted already, however, the ESS q ∗ does indeed correspond to the symmetric Nash equilibrium (q ∗ , q ∗ ).
260
6. Continuous Population Games
This pre-emptive vaccination frequency is the optimal coverage level from the perspective of group interest. It follows from (6.111) that 1/κ qm − q ∗ 1 = 1 − (6.118) ∗ 1−q
κ+1
is always positive. Thus the optimal coverage level for society always exceeds the level achieved under a voluntary vaccination program.37 Bauch and Earn [19] built their second game-theoretic model to study the effect of parents’ vaccination behavior on the prevalence of childhood diseases. Consider a population of parents whose children are susceptible to an infectious disease and must decide whether or not to have their children vaccinated. With regard to this decision, each individual is again either Type I or Type II; Type I is again vaccinator , and a strategy is again a probability of choosing to be Type I. This time, however, Type II means simply nonvaccinator (as opposed to delayer ). Each parent’s goal is to minimize the probability of adverse consequences—or morbidity risk, for short—from either the vaccine or the disease itself. Because parents act on their perceptions of risk (regardless of how accurate they are), let rv and ri denote the perceived morbidity risks from the vaccine and from the infectious disease, respectively, and let φ(q) denote the probability that an unvaccinated child will become infected when proportion q of all children are vaccinated; it is assumed that q is always equal to the population strategy. The payoff to being Type I or Type II in a population of q-strategists is −rv or −ri φ(q), respectively. Hence the reward to a p-strategist in a population of q-strategists is the expected value f (p, q) = p · {−rv } + (1 − p) · {−ri φ(q)} = − φ(q) + (r − φ(q))p ri , where (6.119)
r = rv /ri
is the perceived relative risk (of vaccination versus infection). Because the constant ri has no effect on maximization, we can scale it out and rewrite the reward as (6.120) f (p, q) = − φ(q) + (r − φ(q))p . The optimal reaction set is still given by (6.107), and we assume as before that φ is a decreasing function.38 So if r ≥ φ(0), then the ESS is q ∗ = 0 (Figure 6.19(a)): if vaccinating is sufficiently risky (relative to not vaccinating), then nobody vaccinates. If, on the other hand, r < φ(0), then the ESS is q ∗ > 0, where q ∗ is uniquely defined by (6.121)
r = φ(q ∗ )
(Figure 6.19(b)). Calculations to confirm that q ∗ = 0 is a strong ESS for r > φ(0) and q ∗ defined by (6.121) is a weak ESS for r ≤ φ(0) are virtually the same as before, and so we do not repeat them. 37 Bauch, Galvani, and Earn [20, pp. 10565–10566] use parameter values r = 10−2 , dv = 10−6 , ds = 0.3, and they obtain q ∗ ≈ 0.19, qm ≈ 0.47. We would obtain such values from (6.104) with κ ≈ 3.6 and λ ≈ 7 × 10−4 . 38 Although not a strictly decreasing function this time. As we shall see, there exists a critical value qc such that φ(q) = 0 for q ≥ qc . But φ indeed is strictly decreasing for q ∈ (0, qc ).
6.8. Models of vaccination behavior
261
Figure 6.19. The optimal reaction set and ESS
Further analysis now depends on φ, which we determine from the steady state solution of a standard three-compartment dynamic epidemiological model, the socalled SIR model [85]. Let S(t), I(t), and R(t) be the respective proportions of susceptible, infected, and recovered—and hence immune—individuals at time t in a population of fixed size, so that (6.122)
S(t) + I(t) + R(t) = 1.
Then—using S, I, and R in place of S(t), I(t), and R(t) to reduce clutter—the intercompartmental dynamics are described by the following three ordinary differential equations, (6.123a) (6.123b) (6.123c)
dS = μ(1 − q) − βSI − μS, dt dI = βSI − γI − μI, dt dR = μq + γI − μR. dt
Here μ is both the mean birth rate and the mean death rate, whose equality keeps the population size fixed; β is the mean transmission rate between infected and susceptible individuals; and γ −1 is the mean infectious period, so that infected individuals cease being infective and thus become recovered at rate γ. It is assumed for simplicity that Type I parents get their children vaccinated so soon after birth that there is no time for them to become infected. Thus, in effect, with Type I and Type II parents in proportions q and 1 − q, respectively, children are born into the recovered state at rate qμ and into the susceptible state at rate μ(1 − q). A new dimensionless parameter (6.124)
R0 =
β γ+μ
262
6. Continuous Population Games
Figure 6.20. The S-I phase plane
enables us to rewrite the first two equations in (6.123) as dS = μ(1 − q) − {(γ + μ)R0 I + μ}S, dt dI = (γ + μ)(R0 S − 1)I, (6.125b) dt and the third equation is unneeded, in view of (6.122). Because S ≤ 1, if R0 ≤ 1, then I can never increase. Hence an epidemic is possible only if R0 > 1. We refer to R0 as the basic reproductive number. Even when R0 > 1, however, no epidemic will occur if R(0) = q is sufficiently large or, equivalently, if S(0) = 1 − q is sufficiently small.39 In Figure 6.20 the isoclines, i.e., the line S = R−1 0 and the curve (6.125a)
(6.126)
S =
μ(1 − q) (γ + μ)R0 I + μ
in the S-I phase plane, are both shown dashed. From (6.125), dS/dt is positive to the left of the curve but negative to the right of it, as indicated by the arrows, whereas dI/dt is positive to the right of the line but negative to the left of it. So if the level of vaccination—that is, the proportion of Type I—exceeds the critical value 1 (6.127) qc = 1 − R0 so that 1 − q < R−1 0 , then it follows from Figure 6.20(a) that (6.128)
lim (S(t), I(t)) = (1 − q, 0),
t→∞
39 Here S(0) + R(0) = 1 is not exact but is an excellent approximation, in view of(6.122), because the proportion I(0) of infected at the start of an epidemic is always extremely small.
6.8. Models of vaccination behavior
263
and hence there is no epidemic. If, on the other hand, q < qc or 1 − q > R−1 0 , then it follows from Figure 6.20(b) that (6.129)
lim (S(t), I(t)) = (R−1 0 , I∞ ),
t→∞
where we define (6.130)
I∞ =
μ(qc − q) . γ+μ
The point (R−1 0 , I∞ ), where the curve and line intersect in Figure 6.20(b), is a stable node of the dynamical system defined by (6.125) because the eigenvalues of the Jacobian matrix are both negative there.40 At this long-run steady state, a child can leave the susceptible compartment only by becoming infected or dying. and I = I∞ in the subtracted term of (6.125a), we see that Setting S = R−1 0 individuals become infected at rate {(γ + μ)R0 I∞ }R−1 0 = μ(qc − q) and die at rate −1 −1 μR−1 0 with a total departure rate of {(γ + μ)R0 I∞ + μ}R0 = μ{qc − q + R0 }. So, when q < qc , the probability that an unvaccinated child will become infected when proportion q of all children are vaccinated is μ(qc − q) 1 = 1− . R (1 − q) μ{qc − q + R−1 } 0 0
When q > qc , of course, the probability is zero. Hence 1 ,0 . (6.131) φ(q) = max 1 − R0 (1 − q)
From our earlier analysis, we know that the unique ESS is q ∗ = 0 when r ≥ φ(0) and the solution of (6.121) when r < φ(0). Thus, in sum, there are three possibilities. The first is that R0 ≤ 1. Then φ(0) = 0, and so r ≥ φ(0) is trivially satisfied: nobody vaccinates and nobody gets infected. The second possibility is that R0 > 1, implying φ(0) = 1 − R−1 0 > 0, but that r ≥ rc , where it is convenient to define a new parameter (6.132)
rc = 1 −
1 R0
(which just happens to equal both qc and φ(0)). Then again nobody vaccinates (because r ≥ φ(0)), but some children get infected. Let us denote the proportion of such children by ci . Then from (6.127) and (6.130) with q = 0, we find ci = α, where μ 1 1− . (6.133) α = γ+μ
R0
The final possibility, of course, is that R0 > 1 with r < φ(0) = rc . Then some parents vaccinate, and some children get sick. The ESS proportion of vaccinators is the solution (6.134)
q∗ = 1 −
1 R0 (1 − r)
of equation (6.121). Thus, regardless of whether the relative risk r is perceived as large or small, as long as it is perceived as positive, the voluntary level of 40
See Footnote 31 on p. 93. For G1 , G2 defined by (6.125) with x1 = S and x2 = I, we find that the eigenvalues r1 , r2 of J(R−1 0 , I∞ ) satisfy r1 + r2 = −μ(1 − q)R0 < 0, r1 r2 = μ(γ + μ)(qc − q)R0 > 0 and are therefore both negative.
264
6. Continuous Population Games
q , ci 1 qc
ESS
INFECTION RATE
α 0 0
r1
r2
rc
1
r
Figure 6.21. The ESS proportions of Type I parents and infected children for R0 > 1
vaccination—whether (6.134) or zero—will always be lower than the critical level ∗ qc = 1−R−1 0 needed to eradicate the disease. From (6.127) and (6.130) with q = q , the corresponding proportion of sick children is (6.135)
ci =
μ rα ∗ (1 − R−1 0 − q ) = (1 − r)(R − 1) , γ+μ 0
which is lower than α (defined by (6.133)) because r < rc . For R0 > 1, Figure 6.21 illustrates the variation with respect to r of both the ESS q ∗ and the infection rate ci . Because (6.120), (6.131), and (6.134) yield 2 ∂ f ∂2f (6.136) + = 0 + φ (q ∗ ) = −R0 (1 − r)2 < 0, 2 ∗ ∂p
∂p∂q
p=q=q
the ESS is continuously stable by (2.123). Thus if a vaccine scare should increase perceived relative risk, say from r = r1 to r = r2 > r1 , then the population can be expected to shift accordingly to the new ESS, as indicated in Figure 6.21: the level of pre-emptive vaccination will fall, and the infection rate will correspondingly rise.
6.9. Stomatopod strife: a threat game Several decades ago, Adams and Caldwell [1] observed a series of contests between crustaceans called stomatopods, or mantis shrimps, who occupy cavities in coral rubble. If one stomatopod intrudes upon another, then the resident often defends its cavity by threatening with a pair of claw-like appendages. These threat displays often cause intruders to flee, so that contests are settled without any physical contact. In this regard, a surprising observation is that when stomatopods are weakened by molting, and are thus completely unable to fight, they threaten more frequently than stronger animals who are between molts. Moreover, threats by weaklings often deter much stronger intruders, who would easily win a fight if there were one; that is, weaklings often bluff. But if the very weakest residents in the population can threaten and get away with it, then why don’t all residents threaten? And if the threat display can be given by animals who cannot back it up, then why do their opponents respect it?
6.9. Stomatopod strife: a threat game
265
To explore this paradox, we develop a game in which a resident possessing a resource of value V can either threaten or not threaten in defense of it, and an intruder responds by either attacking or fleeing. We assume that, if there’s a fight, then the stronger animal wins. Furthermore, both contestants pay a combat cost, which is higher for the weaker animal; specifically, an animal of strength s pays C(s), where C (s) < 0. Threats increase the vulnerability of a resident to injury inflicted by an intruder. Thus a threat is a display of bravado that bears no cost if the resident is not attacked, but which carries a cost T in addition to the combat cost if the resident is attacked by a stronger opponent.41 Because the molt condition of stomatopods is not externally visible, we assume (as in the warof-attrition game) that each contestant is unaware of its opponent’s strength. So its own strength must determine its behavior. In terms of §6.2 (p. 221), the animal uses self-assessment. Let fighting strengths be continuously distributed between 0 and 1 with probability density function g, and consider a focal individual or protagonist with fighting strength X, called Player 1 for convenience. In the role of resident, Player 1 threatens if either X < u1 or X > u2 , but does not threaten if u1 < X < u2 .42 In the role of intruder, on the other hand, Player 1 attacks when X > u4 if its opponent threatens but when X > u3 if its opponent does not threaten; correspondingly, Player 1 flees when X < u4 or when X < u3 , according to whether its opponent threatens or not. Thus Player 1’s strategy is a four-dimensional vector u = (u1 , u2 , u3 , u4 ), whose first two components govern its behavior as resident, while its last two govern its behavior as intruder. The corresponding fighting strength and strategy of Player 1’s opponent, called Player 2, are denoted by Y and v = (v1 , v2 , v3 , v4 ), respectively; e.g., Player 2 threatens as resident if either Y < v1 or Y > v2 . Thus threats occur only when a resident’s strength is either above or below a certain threshold. Nevertheless, potential ESSs include ones such that only the strongest residents threaten (v2 v1 = 0), such that the weakest residents also threaten (v2 v1 > 0) or such that residents always threaten (v1 = v2 ). In the first case threats would carry reliable information, in the second case threats could be either honest or deceitful, and in the third case threats would carry no information at all. Using notation that temporarily suppresses dependence on u and v, let F (X, Y ) denote the payoff to a u-strategist (Player 1) against a v-strategist (Player 2), let Fk (X, Y ) denote the payoff to u against v in role k, and let pk be the probability of occupying role k. Then, if r stands for resident and i for intruder, we have pr + pi = 1 and (6.137)
F (X, Y ) = pr Fr (X, Y ) + pi Fi (X, Y ).
Note that F , Fr , and Fi are random variables, because X and Y are random variables, and it follows from (6.137) that the reward to a u-strategist in a population of v-strategists is (6.138) 41
f (u, v) = E[F (X, Y )] = pr fr (u, v) + pi fi (u, v),
T is for threat cost, but it is also called a vulnerability cost [2], [42, p. 408]. Because X is continuously distributed, the event that X = u1 or X = u2 occurs with zero probability, and so we ignore it. Similarly for X = u3 or X = u4 . 42
266
6. Continuous Population Games
Table 6.1. Payoff to a protagonist of strength X with strategy u = (u1 , u2 , u3 , u4 ) against an opponent of strength Y with strategy v = (v1 , v2 , v3 , v4 ). Whether its role is resident or intruder is denoted by r or i; ρ and σ are defined by (6.140).
k r r r r i i i i
Relative magnitudes of X and Y X < u1 or X > u2 and Y > v4 X < u1 or X > u2 and Y < v4 u1 < X < u2 and Y > v3 u1 < X < u2 and Y < v3 Y < v1 or Y > v2 and X > u4 Y < v1 or Y > v2 and X < u4 v1 < Y < v2 and X > u3 v1 < Y < v2 and X < u3
Fk (X, Y ) ρ(X, Y ) V σ(X, Y ) V σ(X, Y ) 0 σ(X, Y ) 0
where E denotes expected value and, for k = r or k = i, " 1" 1 Fk (x, y) dA, (6.139) fk (u, v) = 0
0
with dA denoting g(x) g(y) dx dy as in (2.22). It is convenient at this juncture to define ρ, σ, and τ as
σ(X, Y ) =
(6.140a)
V − C(X) −C(X) 0 −T
if X > Y, if X < Y,
if X > Y, if X < Y,
(6.140b)
τ (X, Y ) =
(6.140c)
ρ(X, Y ) = σ(X, Y ) + τ (X, Y ).
Thus ρ or σ, respectively, is the payoff to a threatening or nonthreating resident protagonist of strength X against an attacking opponent of strength Y . The protagonist’s payoff from any contest is now defined by Table 6.1, in which the first, second, fifth, and sixth rows correspond to threatening behavior by the resident, and the remaining four rows correspond to nonthreatening behavior. For u1 ≤ u2 , substitution from (6.140) into (6.139) now yields "u1 "1 (6.141) fr (u, v) =
"1 "1 ρ(x, y) dA +
0 v4
"u1 "v4 ρ(x, y) dA +
u2 v4
"1 "v4 u2 0
and
"
1
"
(6.142) fi (u, v) =
1
"
0
σ(x, y) dA +
1
"
v2
σ(x, y) dA + u4
v2
V dA u1 0
"
1
σ(x, y) dA + u4
"u2 "v3
u1 v3
"
v1
0
"u2 "1 V dA +
+
V dA 0
σ(x, y) dA. u3
v1
In each case, the first integral sign corresponds to integration variable x and the second to variable y. If u1 > u2 , however, then the u-strategist always threatens,
6.9. Stomatopod strife: a threat game
and in place of (6.141) we have " 1" (6.143) fr (u, v) = 0
267
"
1
1
"
v4
ρ(x, y) dA +
v4
V dA = Q, 0
0
say, where Q is independent of u. In other words, any strategy satisfying u1 > u2 is equivalent mathematically to any strategy satisfying u1 = u2 . Therefore, from now on we constrain u to satisfy (6.144)
0 ≤ u1 ≤ u2 ≤ 1,
0 ≤ u3 ≤ 1,
0 ≤ u4 ≤ 1.
It is a moot point whether the strategies thus excluded are all equivalent biologically: an animal who always threatens because its “high” threshold (for reliable communication) is normal but its “low” threshold (for deceitful communication) is abnormally high may be said to behave very differently from an animal who always threatens because its low threshold is normal but its high threshold is abnormally low, whereas our game does not distinguish between them. Nevertheless, the question becomes irrelevant, because u1 < u2 at the only ESS. To calculate the optimal reaction set R, we must maximize f defined by (6.138) with respect to u. We first observe that the game is separable, because (6.138) may be written as (6.1) with p = 4 and " u1 " v4 " u1 " 1 f1 (u1 , v) = pr (6.145a) {V − σ(x, y)} dA + pr τ (x, y) dA, 0
(6.145b) (6.145c)
u3 1
" (6.145d)
v3
0
v4
f2 (u2 , v) = pr Q − f1 (u2 , v), " 1 " v2 σ(x, y) dA, f3 (u3 , v) = pi v1 v1
"
f4 (u4 , v) = pi
"
1
"
1
σ(x, y) dA + pi u4
0
σ(x, y) dA, u4
v2
where Q is defined by (6.143). Thus maximization with respect to u3 may be performed separately from that with respect to u4 , and both independently of that with respect to u1 or u2 . This separability of the reward function makes the game analytically tractable. Some general features of R require only that V , C, T , and g are all positive, which we assume. From (6.140a) and (6.145c), " v2 " v2 ∂f3 = −pi σ(u3 , y)g(u3 )g(y) dy = pi C(u3 )g(u3 ) g(y) dy ∂u3 v1 v1 is positive for u3 < v1 < v2 , so the maximum of f3 for 0 ≤ u3 ≤ 1 must occur where v1 ≤ u3 ≤ 1. Again, (6.140a)–(6.140b) and (6.145a) imply that if v4 ≤ v3 , then ∂f1 ∂u1 < 0 for all 0 < u1 < 1 unless v4 = v3 = 1. Thus if v4 ≤ v3 , then the maximum of f1 must occur at u1 = 0, unless v4 = v3 = 1, in which case, f1 is independent of ∂f2 > 0 for all 0 < u2 < 1 unless v4 = v3 = 1, u1 . Correspondingly, (6.145b) yields ∂u 2 and so the maximum of f2 must occur at u2 = 1, unless v4 = v3 = 1, in which case, f2 is independent of u2 . Nevertheless, we cannot fully calculate R until we specify both C and g in (6.140)–(6.143). In this regard, we make two assumptions. First, combat cost
268
6. Continuous Population Games
Table 6.2. Quantities that appear in Tables 6.3–6.4. Note that all but the last two depend on v = (v1 , v2 , v3 , v4 ).
δ= ω4 = θ1 = θ3 = Δ=
(a+b)(v1 −v2 +1)−v1 v1 −v2 +1 (a+b)(v1 −v2 +1) 1+b(v1 −v2 +1) (1+a+b)(v4 −v3 )−t(1−v4 ) b(v4 −v3 ) v1 +(a+b)(v2 −v1 ) 1+b(v2 −v1 ) t(1−v4 ) ω3 v4 −v3
ω1 = γ4 = θ2 = θ4 = =1−
(1+a+b)v4 −(a+b)v3 −t(1−v4 ) 1+b(v4 −v3 ) (a+b)(v1 −v2 +1)−v1 +v2 1+b(v1 −v2 +1) a(v4 −v3 ) 1 − t−b(v 4 −v3 ) (a+b)(v1 −v2 +1)−v1 = δb b(v1 −v2 +1)
1−a b
γ3 =
a+b 1+b
decreases linearly with fighting strength according to (6.146)
C(s) = V {a + b(1 − s)},
with 0 < a < 1 and b > 0. Thus V > C for the strongest animal and V > C for every animal in the limit as b → 0, but in general there may be (weaker) animals for which C > V . Second, fighting strength is uniformly distributed between 0 and 1, i.e., g(x) = g(y) = 1 or dA = dx dy in (6.145). Furthermore, it is convenient to introduce a dimensionless threat-cost parameter (6.147)
t = T /V.
We can now proceed to calculate R.43 We obtain f3 (u3 , v) = 12 V (v2 −v1 )(1−u3 ){2(1−a)−b(1−u3 )}− 12 V (v2 −u3 )2 if v1 ≤ u3 ≤ v2 , whereas the last (squared) term must be omitted to obtain the correct expression for f3 if v2 ≤ u3 ≤ 1. It follows from Exercise 6.24 that the maximum of f3 for v1 ≤ u3 ≤ 1 (and hence also for 0 ≤ u3 ≤ 1) occurs at u3 = θ3 if b(1−v2 ) ≤ 1− a but at u3 = ω3 if b(1−v2 ) > 1−a, where θ3 and ω3 are defined in Table 6.2. Also, f4 (u4 , v) = 12 V (1−u4 ){2v1 −(2a+b{1−u4 })(v1 −v2 +1)}+ 12 V (1−v2 )2 − 12 V (v1 −u4 )2 when 0 ≤ u4 ≤ v1 , but the last (negative squared) term must be omitted to obtain the correct expression for f4 when v1 ≤ u4 ≤ v2 , and for v2 ≤ u4 ≤ 1 we obtain f4 (u4 , v) = 12 V (1 − u4 ){2v1 − 2v2 + u4 + 1 − (2a + b{1 − u4 })(v1 − v2 + 1)}. Provided v1 − v2 + 1 = 0, the maximum of f4 for 0 ≤ u4 ≤ 1 can now be shown to occur at u4 = ω4 if δ < bv1 , at u4 = θ4 if bv1 ≤ δ ≤ bv2 and at u4 = γ4 if δ > bv2 , where δ, γ4 , θ4 , and ω4 are defined in Table 6.2. If v1 − v2 + 1 = 0, which can happen only if v1 = 0 and v2 = 1, then f3 is maximized at u3 = γ3 (defined in Table 6.2) and any u4 maximizes f4 . These results imply that the maximum of fi —defined by (6.142)—is given by Table 6.3. We have already seen that when v3 ≥ v4 , fr is maximized for 0 ≤ u1 ≤ u2 ≤ 1 where u1 = 0, u2 = 1 (unless v3 = v4 = 1, in which case both u1 and u2 are arbitrary). Moreover, it is clear from (6.140a) and (6.145a)–(6.145b) that when v3 < v4 = 1, fr is maximized where u1 = u2 . Let us therefore assume that v3 < v4 < 1, and hence that (6.148)
b(1 − v4 ) < a + b(1 − v4 ) < a + b(1 − v3 ).
43 This calculation is rather complicated; readers who would prefer to take its outcome on trust are advised to skip ahead to p. 272.
6.9. Stomatopod strife: a threat game
Table 6.3.
269
Maximizers u3 and u4 of fi for 0 ≤ u3 , u4 ≤ 1
Constraints on v1 , v2 (≥ v1 ) δ < bv1 , b(1 − v2 ) ≤ 1 − a bv1 ≤ δ ≤ bv2 , b(1 − v2 ) ≤ 1 − a δ > bv2 , b(1 − v2 ) ≤ 1 − a δ < bv1 , b(1 − v2 ) > 1 − a bv1 ≤ δ ≤ bv2 , b(1 − v2 ) > 1 − a δ > bv2 , b(1 − v2 ) > 1 − a v1 = 0, v2 = 1
u3 θ3 θ3 θ3 ω3 ω3 ω3 γ3
u4 ω4 θ4 γ4 ω4 θ4 γ4 u4
Constraints on u3 , u4 v1 ≤ u3 ≤ v2 , u4 < v1 v1 ≤ u3 , u4 ≤ v2 v1 ≤ u3 ≤ v2 , u4 > v2 u3 > v2 , u4 < v1 u3 > v2 , v1 ≤ u4 ≤ v2 u3 > v2 , u4 > v2 u4 arbitrary
Then f1 (u1 , v) = 12 V u1 {2(1 + a + b)(v4 − v3 ) − 2t(1 − v4 ) − b(v4 − v3 )u1 } for 0 ≤ u1 ≤ v3 , 12 V (u1 − v3 )2 must be subtracted to obtain the correct expression for v3 ≤ u1 ≤ v4 , and, for v4 ≤ u1 ≤ 1, f1 (u1 , v) =
1 V {(v4 − v3 )(v4 + v3 + u1 {2(a + b) − bu1 }) 2 − 2tu1 (1 − v4 ) + t(u1 − v4 )2 }.
From Exercise 6.24, f1 varies between u1 = 0 and u1 = 1 as follows. If Δ ≥ 1+a+b, then f1 decreases between u1 = 0 and u1 = θ2 (> v4 ) but increases again between u1 = θ2 and u1 = 1. If 1 + a + b > Δ ≥ 1 + a + b(1 − v3 ), then f1 increases between u1 = 0 and u1 = θ1 (≤ v3 ), decreases between u1 = θ1 and u1 = θ2 , and increases again between u1 = θ2 and u1 = 1. If 1 + a + b(1 − v3 ) > Δ > a + b(1 − v4 ), then f1 increases between u1 = 0 and u1 = ω1 (which satisfies v3 < ω1 < v4 ), decreases between u1 = ω1 and u1 = θ2 , and increases again between u1 = θ2 and u1 = 1.44 Finally, if Δ ≤ a + b(1 − v4 ), then f1 increases monotonically between u1 = 0 and u1 = 1; its concavity is always downward for 0 ≤ u1 ≤ v4 , but it is upward or downward for v4 ≤ u1 ≤ 1 according to whether Δ > b(1 − v4 ) or Δ < b(1 − v4 ). Correspondingly, from (6.145b), f2 varies between u2 = 0 and u2 = 1 as follows. If Δ ≥ 1 + a + b, then f2 increases between u2 = 0 and u2 = θ2 and decreases again between u2 = θ2 and u2 = 1. If 1 + a + b > Δ ≥ 1 + a + b(1 − v3 ), then f2 decreases between u2 = 0 and u2 = θ1 , increases between u2 = θ1 and u2 = θ2 , and decreases again between u2 = θ2 and u2 = 1. If 1 + a + b(1 − v3 ) > Δ > a + b(1 − v4 ), then f2 decreases between u2 = 0 and u2 = ω1 , increases between u2 = ω1 and u2 = θ2 , and decreases again between u2 = θ2 and u2 = 1. Finally, if Δ ≤ a + b(1 − v4 ), then f2 decreases monotonically between u2 = 0 and u2 = 1. Thus the maximum of fr —defined by (6.141)—is given by Table 6.4. Note that the maximum corresponds to unconditional threatening if Δ ≤ a + b(1 − v4 ). Now, if v is an ESS, then the maximum in Table 6.3 must occur where u3 = v3 and u4 = v4 , the maximum in Table 6.4 must occur where u1 = v1 and u2 = v2 , and all conditions on u must be satisfied. Let us first of all look for a strong ESS. Then v must be the only best reply to itself. This immediately rules out the fourth and sixth rows of Table 6.4, where u1 and u2 do not yield a unique best reply to v3 and v4 , and although the fifth row of Table 6.4 does yield a unique best reply, it corresponds to the bottom row of Table 6.3, where u4 is not unique. 44 Note that Δ > a + b(1 − v4 ) and (6.148) imply t > b(v4 − v3 ) in θ2 (and hence, eventually, that η (s) > 0 for L < s < 1 in (6.152)).
270
6. Continuous Population Games
Table 6.4.
Maximizers u1 and u2 of fr for 0 ≤ u1 ≤ u2 ≤ 1
Constraints on v3 , v4 (> v3 ) u1 Δ≥ 1+a+b 0 1 + a + b > Δ ≥ 1 + a + b(1 − v3 ) θ1 1 + a + b(1 − v3 ) > Δ > a + b(1 − v4 ) ω1 Δ ≤ a + b(1 − v4 ), v4 > v3 u1 0 v3 ≥ v4 , v4 = 1 u1 v3 = v4 = 1
u2 θ2 θ2 θ2 u2 0 u2
Constraints on u1 , u2 u1 ≤ v3 , u2 > v4 u1 ≤ v3 , u2 > v4 v3 < u1 ≤ v4 , u2 > v4 u1 = u2 , u2 arbitrary u1 , u2 both arbitrary
Accordingly, we restrict our attention to the first three rows of Table 6.4. Then, for the maximum to occur at u2 = v2 , each possibility requires v2 > v4 . Thus the maximum at u4 = v4 in Table 6.3 must satisfy v4 < v2 , excluding the third and sixth row of that table. Again, the relative magnitudes of v3 and v4 in the first three rows of Table 6.4 all imply v3 < v4 < 1, so that the maximum at u3 = v3 in Table 6.3 cannot satisfy v3 ≥ v4 , and hence (because v4 < v2 ) cannot satisfy v3 ≥ v2 ; thus the fourth and fifth rows of Table 6.3 are excluded. The first row of the table is likewise excluded, because the maximum where u3 = v3 and u4 = v4 would have to satisfy v4 < v1 ≤ v3 , which is impossible because v4 > v3 . Only the second row of Table 6.3 now remains. Because the maximum at u3 = v3 must therefore satisfy v1 ≤ v3 , we have to exclude the third row of Table 6.4. But the maximum where u1 = v1 and u2 = v2 in Table 6.4 cannot now occur where u1 = 0 and u2 = θ2 because the second row of Table 6.3 would then imply 0 ≤ a + b ≤ bθ2 , which is impossible for a > 0. We have thus excluded the top row of Table 6.4, and only the second remains. We conclude that a strong ESS must correspond to the second row in each table. Let us now set v = (I, J, K, L) in Table 6.2, so that θ3 , θ4 , δ, ω4 , and γ4 depend on I and J, whereas θ1 , θ2 , Δ, and ω1 depend on K and L. Then what we have shown is that (I, J, K, L) is a strong ESS if it satisfies the equations I = θ1 , J = θ2 , K = θ3 , and L = θ4 . The last two equations yield (6.149)
K =
I+(a+b)(J−I) 1+b(J−I) ,
L =
(a+b)(I−J+1)−I . b(I−J+1)
Substituting into I = θ1 and J = θ2 , we obtain a pair of equations for I and J. The first has the form tab(1 − J)2 + d1 (1 − J) + d0 = 0,
(6.150) where
d0 = (1 − a){(1 + t)(1 − bI + b) + a}I is quadratic in I and d1 = −(a + b + at)(1 − bI + b) − bt(1 − a)I − a(a + b) is linear in I. The second equation is cubic in J and can be used in conjunction with the first to express J as a quotient of cubic and quadratic polynomials in I. Substitution back into the first equation yields a sextic equation for I, of which three solutions—namely, I = 0, I = 1 + a/b, and I = 1 + (1 + a)/b—can be found
6.9. Stomatopod strife: a threat game
271
Table 6.5. Coefficients for cubic equation (6.151)
c0 c1 c2 c3
= −a{(a + b)(1 + 2a + b) + at(1 + a + b)} = (1 + a + b){(1 + t){a + (1 + b)(1 + t) + b2 } + 2ab} + a(1 + t){1 + b + b(3a + 2b)} + a2 = −b{(1 + t){(2 + b)t + 2b2 + (3b + 2)(1 + a)} + a(1 + b)} = b2 (1 + b)(1 + t)
by inspection. None of these solutions satisfies 0 < I < 1. Thus, removing the appropriate linear factors, we find that I must satisfy the cubic equation (6.151)
c3 I 3 + c2 I 2 + c1 I + c0 = 0,
whose coefficients are defined by Table 6.5. Because c0 < 0 and c0 +c1 +c2 +c3 > 0, it is clear at once that there is always a real solution satisfying 0 < I < 1. It is not difficult (but a bit tedious) to show that this solution is the only solution satisfying 0 < I < 1; the other two solutions are either complex conjugates or, if they are real, satisfy I > 1. Moreover, only one solution of quadratic equation (6.150) for J satisfies J > I. Thus the strategy v = (I, J, K, L) defined by (6.149)–(6.151) is the only strong ESS. Nevertheless, there are several candidates for a weak ESS. First, from the last row of Table 6.3 and the fifth row of Table 6.4, we find that v ∗ = (0, 1, γ3 , λ) satisfies (2.17a) for any λ ≤ γ3 (where γ3 is defined in Table 6.2); however, v ∗ fails to satisfy (2.17b), because f (u, v ∗ ) = f (v ∗ , v ∗ ) for u = (0, 1, γ3 , u4 ), 0 ≤ u4 ≤ 1. From (6.141)–(6.142), we then find that f (v ∗ , u) = f (u, u) = 0, so that (2.17c) fails to hold. Thus v ∗ is not a weak ESS. Intuitively, never threatening cannot be an evolutionarily stable behavior because in equilibrium the threshold λ is irrelevant; even if the population strategy v satisfies v4 ≤ γ3 to begin with, there is nothing to prevent v4 from drifting to v4 > λ, in which case, never threatening is no longer a best reply. (In particular, there is nothing to prevent v4 from drifting to 1, and never threatening cannot be a best reply to an opponent who never attacks when threatened.) Second, from the last row of Table 6.4, we must investigate the possibility that there is a weak ESS of the form v = (v1 , v2 , 1, 1). Because a < 1, however, we see from Table 6.3 that this possibility requires θ3 = 1 or (1 − a)v1 + av2 = 1, which implies v1 = v2 = 1. But then, from the first three rows of Table 6.3, either v4 = γ3 or v4 = ω3 , contradicting v4 = 1. Hence there is no such ESS. The remaining possibility for a weak ESS is an always-threatening equilibrium with v1 = v2 = λ, say, which corresponds to the fourth row of Table 6.4, and therefore satisfies v4 > v3 . This equilibrium cannot correspond to the first row of Table 6.3, because v1 ≤ v3 ≤ v2 and v4 < v1 then imply v4 < λ ≤ v3 , contradicting v4 > v3 . For similar reasons, the equilibrium cannot correspond to either the second row of Table 6.3 (which would require v4 = λ = v3 ) or the fourth or fifth row (each of which would require v3 > v4 ). Thus the equilibrium must correspond to either the third or sixth row of Table 6.3, and hence have the form v ∗ = (λ, λ, ζ, γ3 ), where ζ = max(λ, ω3 ) satisfies ζ < γ3 . Then f (u, v ∗ ) = f (v ∗ , v ∗ ) and f (v ∗ , u) = f (u, u) for any u such that u1 = u2 and u4 = γ3 : although v ∗ satisfies (2.17a), it fails to
272
6. Continuous Population Games
Figure 6.22. The effect of varying (a) the fixed cost of fighting or (b) the threat cost on the ESS thresholds. In these examples b = 0.4 and (a) t = 0.4 or (b) a = 0.4. Light shading corresponds to the proportion of residents who threaten honestly, dark shading corresponds to the proportion of residents who threaten deceptively, and intermediate shading corresponds to the proportion of intruders deterred by threats.
satisfy (2.17b)–(2.17c), and so is not a weak ESS. Intuitively, always threatening cannot be an evolutionarily stable behavior because in equilibrium the threshold ζ is irrelevant; even if the population strategy v satisfies v3 < γ3 to begin with, there is nothing to prevent v3 from drifting to v3 ≥ γ3 , in which case, always threatening is no longer a best reply. (In particular, there is nothing to prevent v3 from drifting to 1, and always threatening cannot be a best reply to an opponent who never attacks when not threatened.) The upshot is that the sole ESS is the strong ESS, for which J > L > K > I (Figure 6.22). At this ESS, the weakest and strongest animals both threaten when resident, whereas those of intermediate strength do not. The proportion of intruders deterred by threats is L − K (because an intruder attacks if its strength exceeds K when not threatened, but only if its strength exceeds L when threatened). All such intruders would lose against a resident whose strength exceeds J (because J > L); the threats of the strongest residents are therefore honest. On the other hand, all deterred intruders would win against a resident whose strength does not exceed I (because I < K); the threats of the weakest residents are therefore deceptive—but they cannot be distinguished from honest threats without escalation. The proportions of residents who threaten deceptively, who do not threaten, and who threaten honestly are I, J − I, and 1 − J, respectively, and it is readily shown that J − I > 12 , so that fewer than half of the residents threaten (Exercise 6.25). Nevertheless, the proportion of threats that are deceptive can be considerable (Exercise 6.25). In other words, not only is bluffing a part of the ESS, but also it can persist at high frequency. To see why the weakest and strongest animals both threaten when resident while those of intermediate strength do not, it is instructive to compute the expected
6.9. Stomatopod strife: a threat game
273
difference in net gain between threatening and not threatening to a resident of known strength s against an intruder whose unknown strength S is drawn at random from the uniform distribution (so that Prob(S ≤ z) = z). We compute this quantity by subtracting the expected difference in total cost (i.e., combat cost plus threat cost) from the expected difference in benefit. On the one hand, the resident’s combat cost—if paid (i.e., if S > L or if S > K, according to whether the resident threatens or not)—is C(s). Thus the expected difference in combat cost between threatening and not threatening is C(s){Prob(S > L) − Prob(S > K)} = {K − L}C(s). Also, the resident’s threat cost is T if S > max(s, L) but zero otherwise, with expected value T Prob(S > max(s, L)) = T {1 − max(s, L)}—which is also the expected difference in threat cost, because the cost is avoided by not threatening. Adding, we find that the expected difference in total cost is {K − L}C(s) + T {1 − max(s, L)}. On the other hand, the benefit to a threatening resident, who wins if the intruder is weaker or does not attack, is V if S < max(s, L) but 0 if S > max(s, L), with expected value V Prob(S < max(s, L)). For a nonthreatening resident, the corresponding expected value is V Prob(S < max(s, K)), and their difference is the expected difference in benefit. So the expected difference in net gain is V {Prob(S < max(s, L))−Prob(S < max(s, K))}−{K −L}C(s)−T {1−max(s, L)}. Let us denote this quantity by V η(s), so that η is dimensionless. Then, from (6.146)–(6.147) and Exercise 6.26, we obtain ⎧ ⎪ if 0 ≤ s ≤ K, ⎨{1 + a + b(1 − s)}(L − K) − t(1 − L) (6.152) η(s) = if K ≤ s ≤ L, L − s + {a + b(1 − s)}(L − K) − t(1 − L) ⎪ ⎩ {a + b(1 − s)}(L − K) − t(1 − s) if L ≤ s ≤ 1. Furthermore, η(I) = 0 = η(J), η(s) < 0 if I < s < J, and η(s) > 0 if s < I or s > J. So the strongest and the weakest residents both threaten because the expected net gain from doing so exceeds that from not threatening; however, their threats are profitable for different reasons. The strongest residents threaten because, although their expected benefit (of avoiding combat costs) is low, their expected cost is even lower—they are very unlikely to meet an opponent strong enough to inflict the threat cost. At the other extreme, the weakest residents threaten because, although their expected cost is high—an intruder who attacks invariably inflicts the threat cost—their expected benefit from threatening is even higher. They are able thereby to deter some considerably stronger intruders (who would win a fight if there were one), and to do so without the cost of combat (which is highest for the weakest animals). Note, finally, that our partial-bluffing ESS arises only in special circumstances. The ESS does not persist in the limit as b → 0, or if we change the reward structure so that a threatening resident pays the threat cost either regardless of whether it is attacked, or only if it is attacked, but regardless of whether it wins or loses.45 From (6.146), we have V b = −C (s) where s is strength and C is cost of combat. Thus our model predicts a partial-bluffing ESS only if the combat cost is higher for weaker animals and a threatening resident pays an additional cost only when it is attacked and loses. And this is a strength of the model: it helps to identify the 45
For details, see Exercise 6.27.
274
6. Continuous Population Games
particular circumstances in which we might expect to observe a high frequency of bluffing in nature.
6.10. Commentary In this chapter we used continuous population games to study a variety of topics in behavioral ecology and epidemiology, namely, sex allocation (§6.1), kinship (§6.3), sperm competition (§6.4), resource partitioning (§6.5, §6.6), landmarks as conventions (§6.6), vaccination behavior (§6.8), and aspects of animal contest behavior (§6.2, §§6.5–6.7, §6.9). Within this, §6.2 is based on [225], §6.5 on [233], §6.6 on [212], §6.7 on [221], and §6.9 on [2]. This list of topics is by no means exhaustive, but it exemplifies the scope and variety of applications. Other topics for continuous population games include anisogamy (i.e., sexual reproduction by fusion of unlike gametes [53]), parent-offspring conflict [122], timing of arrival to breeding grounds in migratory birds [154], and seed dispersal [100] or other aspects of plant ecology [151, 199], although in this book we consider only games among animals. Evolutionary game theory did not fully emerge as a field of study in its own right until 1982 when Maynard Smith’s definitive monograph [188] consolidated the advances that he and others had made during the 1970s. But the application of game-theoretic reasoning to the study of sex ratios (§6.1) is significantly older, and it can be traced through Hamilton [125] all the way back to Fisher [98] in 1930. The study of sex allocation has since advanced considerably; for a masterly synthesis, see West [359]. The application of game theory to animal contest behavior (§6.2, §§6.5–6.7, §6.9) began with two basic models introduced by Maynard Smith [185, 192], the Hawk-Dove game (§2.4), and a war-of-attrition model. Both models have since been developed in various ways by numerous authors; see [77, 126, 128, 160, 166] and references therein. Other models of contest behavior include Enquist and Leimar’s sequential assessment model of mutual assessment or SAM [90] and Payne’s cumulative assessment model or CAM [266]. In Payne’s model, an opponent’s strength— despite not being directly assessed—affects the decision to withdraw because it determines the opponent’s ability to inflict costs (e.g., injuries) on its rival. These and other models are discussed in the context of current research by Kokko [160], as well as in most later chapters of Hardy and Briffa [130], which comprehensively covers both theoretical and empirical work on animal contest behavior. In particular, the CAM, SAM, and WOA-WA (§6.2), together with a more recent model [222] allowing for both self- and mutual assessment within the same game, have been central to an ongoing debate over the question of whether animals’ decisions to withdraw from contests rely on self- or mutual assessment. Even if assessment of opponents—for which evidence is sparse [88]—is cognitively possible, it need not be cost-effective [222]. Studies have shown that pure self-assessment (as in the WOA-WA) is far more common than once thought; yet different species conform to different models, and some conform to no existing model [95, p. 86]. Indeed assessment mechanisms can vary not only across species but even within contests [9]. So the question has no simple answer, and existing models seem unlikely to settle the debate; rather, newer models are needed. Here lies a golden opportunity for game theorists.
6.10. Commentary
275
In matters of assessment, analysis of contest behavior dovetails with that of communication or signalling [42], where game theory has been central in recent years; see [89, 149, 190, 299] and references therein. An early prediction was that honest threat displays are unlikely to be evolutionarily stable signals, because— unless it is physically impossible to be dishonest—they could be infiltrated by bluffs until it would no longer pay receivers (e.g., the intruders in §6.9) to respect them [187]. A later view was that threat displays must be honest to be stable, with signal costs enforcing reliability [118, 369]; similar logic applies to other signals, e.g., offspring’s calls for food from their parents [249]. As we saw in §6.9, however, stable communication can involve both reliable and deceptive signals, and the frequency of bluffs may be high [2].46 Thus, in a sense, game theory has come full circle over this issue. Sperm competition games began with Parker [261] in 1990. He and others have since developed a suite of models to deal with various strategic aspects of sperm expenditure, e.g., risk assessment [15, 16]. On the whole, this theory has developed in isolation from sex allocation theory (§6.1), although occasionally the two overlap; see, e.g., [267]. For a review of earlier work, see [262, 357]; for more recent advances, see [56, 264, 265, 283, 314] and references therein. Games between and among kin derive, respectively, from Hamilton [124] in 1964 and Grafen [116] in 1979;47 §6.3 is based on [206], which in turn is based on [140]. For a different perspective on games among relatives, see [7]. For an application to sperm competition, see [263]. By contrast, the use of evolutionary games48 in epidemiology did not begin until 2003, as noted in §6.8, but their literature has since grown apace. For these more recent developments, see [14, 26, 102, 238, 272, 282] and references therein. The use of landmarks as territorial boundaries has been comprehensively reviewed by Heap et al. [137]. In some cases (e.g., [214]) any individuals adopting a landmarked boundary will reduce their costs, but in §6.6 both residents must adopt the same landmark for either to benefit from reduced costs—observing the landmark is a convention.49 Unlike the game-theoretic literature on animal contests, epidemiology, or sperm competition, which is growing rapidly, the game-theoretic literature on nonhuman conventions is said to be “virtually nonexistent” [318, p. 607], consisting of §6.6 and little else. Yet games are natural tools for studying conventions, and Stephens and Heinen [318] have argued persuasively that conventions can arise in nearly every aspect of animal social interactions and that exploring them could greatly enrich the study of animal behavior. Here lies another golden opportunity for game theorists.
46 See also [12, 330], but beware of the statements that [2] involves “the assumption that cheating must be rare in an evolutionarily stable signalling system” [330, p. 225] and that a bluffing frequency as high as 44% is “unanticipated by current signalling theory” [12, p. 719]. Neither is correct. See Exercise 6.25. 47 But Grafen’s ESS conditions were discovered independently by Fagen [93]. 48 Although the use of game theory at large began much earlier; see [14, 168, 282]. 49 See p. 81.
276
6. Continuous Population Games
Finally, as stated at the beginning of §6.2, we tacitly assume that behavior observed in a real population can be adequately approximated by the ESS of a gametheoretic model (at least for the purpose of resolving a paradox). Although this assumption is at least highly plausible—if the rate of deviation or mutation is significantly lower than the rate of selection, and regardless of whether the underlying dynamic is cultural or genetic (§2.1)—it has generated considerable skepticism,50 especially among biologists who worry about its consistency with details of genetical inheritance in sexually reproducing populations. Game theorists have addressed this issue by showing that, under reasonable assumptions about the mapping between genotype and behavioral phenotype (= strategy)—which rarely is actually known [119, pp. 5–8]—ESSs correspond to stable, long-run equilibria of explicit genetic models; see [92, 104, 127, 141, 181] and references therein. For further discussion see [280], which systematically addresses the criticisms and argues powerfully in favor of evolutionary game theory.
Exercises 6 1. Sketch the optimal reaction set R of the game whose reward is defined by (6.5), and verify that v = 12 is its only ESS. Also, verify that the ESS is continuously stable. Is (6.5) valid if v = 1? 2. (a) Verify (6.12)–(6.14). (b) Verify that (6.14) is a continuously stable ESS. 3. (a) Obtain the reward function for the WOA-WA (§6.2) in which Tmax is exponentially distributed with mean μ according to g(t) =
1 −t/μ , μe
0 < t < ∞.
(b) Calculate R, the optimal reaction set. (c) Find the ESS, and verify that it is continuously stable. 4. (a) (b) (c) (d)
Verify (6.17). Show that (6.18) is a weak ESS if (6.21) holds with equality. Verify that (6.18) is a continuously stable ESS. Verify (6.22) and (6.23).
5. Show that if ω defined by " ∞ " ω {xg(x)}2 dx = 0
g(y)
0
satisfies μ ≥ μc , where μc =
"
∞
ω(2 + {ω + 2}θ) 1 + ωθ
y
xg(x) dx dy 0
"
∞
{xg(x)}2 dx,
0
then the unique ESS of the war of attrition with reward (6.10) is v∗ =
1 . 1 + ωθ
6. (a) Verify Exercise 6.5 for g defined by (6.11). 50
See, e.g., [237, p. 70].
Exercises 6
277
(b) Verify Exercise 6.5 for g defined by Exercise 6.3. (c) Verify Exercise 6.5 for g defined by (6.15). 7. Let initial reserves in the WOA-WA (§6.2) have a Weibull distribution with scale parameter s (> 0) and shape parameter c (≥ 1) defined by g(t) =
ctc−1 −(t/s)c . sc e
What is the largest value of the cost/value ratio θ in (6.9) for which an ESS exists when (a) c = 2 and (b) c = 3? In each case, find the ESS (assuming that θ does not exceed its critical value). 8. (a) Verify (6.27). (b) Verify that (6.29) is merely (2.18) with φ in place of f . 9. (a) Verify (6.30) and (6.32), and hence that (6.31) is the unique ESS among kin of the WOA-WA (§6.2) defined by (6.11). (b) Verify Figure 6.3. How does R change shape with r? (c) Verify (6.34). 10. (a) Verify (6.35). (b) Verify (6.37), and hence that (6.36) is the unique ESS among kin for the WOA-WA (§6.2) with parabolically distributed initial reserves if either 61 and (6.39) holds. (6.38) holds or r < 490 11. (a) The WOA-WA with exponentially distributed initial reserves (Exercise 6.3) has a unique ESS among kin, v ∗ . What is it? (b) Show that (v ∗ , v ∗ ) is the unique symmetric Nash equilibrium of the corresponding game between kin. 12. Show that an ESS among kin of the war of attrition with uniformly distributed initial reserves need not correspond to a Nash equilibrium of the associated symmetric game between kin, in the sense described at the end of §6.3. 13. Assuming, for simplicity’s sake, that (5.33) is satisfied with strict inequality, analyze the prisoner’s dilemma as (a) a (continuous) game among kin, and (b) a symmetric game between kin. 14. Suppose that two related females share a nest for breeding purposes. One of these animals dominates the other in the sense that it always enjoys a greater share of the pair’s total reproductive success, even though the subordinate animal can increase its relative share by allocating more of the pair’s resources to fighting over it (instead of to actually raising their young). Let the amounts of effort that the dominant and subordinate expend on this “tug-of-war” [281] have the effect of allocating fractions u and v, respectively, of the pair’s resources to contesting relative shares, leaving fraction 1 − u − v for reproduction per se. Then it is reasonable to assume that total reproductive success is K(1 − u − v) and that the relative shares are u/(u + bv) for the dominant and bv/(u + bv) for the subordinate, where b < 1; see [281], where it is assumed without loss of generality that K = 1. The interaction between these two animals can be analyzed as an asymmetric game between kin. (a) Obtain expressions for φ1 and φ2 in (6.40).
278
6. Continuous Population Games
(b) Sketch the optimal reaction sets, and show that there is a unique strong Nash equilibrium (u∗ , v ∗ ). (c) Find (u∗ , v ∗ ), and discuss its dependence on b and r. 15. Verify (6.47) and (6.48), and hence that (6.49) yields the unique ESS for the sperm competition game. Does (6.49c) make sense? 16. Show that (6.51) yields the unique ESS of the game whose reward is defined by (6.50). 17. Modify the analysis of competition over divisible resources (§6.5) to allow for the possibility that the resource is shared only if both parties agree to partition it, and confirm that the results are unaffected. 18. (a) Show that ∂f1 /∂u|u=0=v = ∂f2 /∂v|u=0=v = g(0) > 0 for the landmark game in §6.6, and hence that (0, 0) can never be a Nash equilibrium.51 (b) Verify (6.76). (c) Verify that, along the curveu = φ(v, L) defined by (6.78), ∂f1 /∂u = 0 and ∂ 2 f1 /∂u2 = − c1 + 12 θb G(v)g(φ(v, L)), with ∂f1 /∂u = g(u) > 0 if v = 0. 19. (a) Verify that α(L) in (6.81) increases with c1 (for values of L where it is not constant). (b) Verify that α( 12 ) in (6.82) increases with . 20. Establish (6.87). 21. Establish (6.94). 22. Verify that large-offspring fitness φ(u)F (u), defined by (6.89), (6.95), and (6.96), has a maximum at defined by (6.97) in the clutch-size game of §6.7. 23. (a) Show that ∂ 2 f /∂u2 < 0 for 0 < u < 1 with f defined by (6.92) and (6.96) when c ≥ 1 in the clutch-size game of §6.7. (b) Verify (6.98). (c) Use a software package to solve (6.98) numerically in the case where φ0 = 1 and c = 0.2 in (6.96) with ρ = 0.5, k = 10−4 , and hence z ≈ 0.8914, thus verifying that there are three solutions on the interval (0, 1). 24. Verify the calculations leading to Tables 6.3–6.4. 25. (a) Show that fewer than half of all residents threaten at the ESS of the game described in §6.9. (b) What is the proportion of threats that are deceptive? How large can it be? 26. (a) Verify (6.152). (b) Describe the graph of η, verifying that η(I) = 0 = η(J). 27. (a) Show that strategy (I, J, K, L) defined by (6.149)–(6.151) does not remain evolutionarily stable in the limit as b → 0. (b) Show that the ESS defined by (6.149)–(6.151) does not persist if a threat cost T is paid not only by animals that threaten and lose, but also by animals that threaten and win after an attack by the intruder. 51 For a uniform distribution, it is clear that g(0) = 1 > 0. For a Beta distribution, on the other hand, we have g(0) = 0 by (6.88). Nevertheless, because g(0) > 0 for all u = 0, a more refined argument can be employed to show that ∂f1 /∂u|u=v and ∂f2 /∂v|u=v 0 stay positive in the limit as u → 0 and v → 0; and a similar argument applies to the other end of the interval.
Exercises 6
279
(c) Show that the ESS also does not persist if a threatening resident invariably pays a threat cost T , regardless of whether it is attacked. 28. In §6.1 we assumed that individuals mate at random across the entire population. Suppose instead that (small) proportions α of males and β of females mate within their brood. (a) Find the new evolutionarily stable sex ratio by suitably modifying §6.1. (b) Find the new ESS by some other method. (c) How does the sex ratio vary with α? With β? Interpret.
Chapter 7
Discrete Population Games
We now turn our attention to three examples of games restricted to m pure strategies, with m = 2 in §7.1, m = 6 in §7.2, and m = 4 in §7.3. From §2.2, strategy k is a strong ESS of this discrete population game if and only if akk > ajk for all j = k, where aij is the payoff to an i-strategist in a population of j-strategists, that is, if and only if the largest element in column k of the m × m reward matrix A is the diagonal element akk . So the ESS analysis is straightforward, and our primary focus is on formulating each model and calculating its reward matrix, which involves several steps in all three cases, but especially the first and last.
7.1. Roving ravens: a recruitment game It is not unusual for an animal who finds a major new source of food to recruit others to the site. But it is comparatively rare for an animal to delay recruitment until, e.g., the following day.1 The best known example concerns male juvenile ravens. These animals will often return to their communal roost after spotting a large carcass and wait until the following morning before leading others to it. Why? Heinrich [138], who has conducted long-term studies of ravens in the forests of western Maine, suggests two possible answers. The first, the status-enhancement hypothesis, is that delayed recruitment is favored because a recruiter’s social status (and hence attractiveness to females) increases with the number of followers it leads to a food source. The second possible answer, the posse hypothesis, is that aggregation is favored because larger groups are more likely to usurp a defended carcass. Here we use a game-theoretic model to explore the logic of delayed recruitment in the light of these hypotheses. Consider a population of unrelated overwintering juveniles who forage independently by day, but roost together at night in groups of size N + 1. They search during a period of L daylight hours, from early morning until dusk, for a bonanza— i.e., an opened carcass. Bonanzas are rare and ephemeral, but bountiful. For the 1
Especially if the animal is a communally roosting bird or bat [42, p. 594].
281
282
7. Discrete Population Games
sake of simplicity, we assume that a bonanza lasts only a day (before, e.g., it is irretrievably buried by a heavy snowfall), but it is ample enough to satiate any juvenile who exploits it. Moreover, we assume that bonanzas reappear at a rate of precisely one per day. Thus the object of the search is to find the lone bonanza, wherever in a huge area it may have randomly appeared. We scale fitness so that feeding at a bonanza increases it by 1 unit. Also, we assume that juveniles feed solely at bonanzas, and we ignore the effect of predation. Thus any additional fitness increment is due to a rise in social status. We assume that the fitness of a discoverer increases by α units for every animal it recruits, and we will refer to this assumption as the status-enhancement effect. Each day, an isolated individual locates the bonanza with probability b, and hence with probability 1 − b finds no bonanza. We assume that 0 < b < 1. The time until a focal individual locates the bonanza by itself is a random variable Z, continuously distributed between 0 and ∞ with probability density function g and cumulative distribution function G defined by (2.70), so that G(0) = 0, G(L) = b, and g(z) = G (z). Because Z has a continuous distribution, two individuals cannot discover the bonanza simultaneously (although any number of individuals may discover it at some time during the day). Individuals cease to search either at the end of the day or when they become privy to the bonanza. But the focal individual need not discover the bonanza for itself. Let W denote the time until one of the N nonfocal individuals locates the bonanza. Then W is also continuously distributed between 0 and ∞, but with different distribution and density functions, say H and h, respectively. Because 1 − Prob(W ≤ w) equals the probability that none of N independent foragers locates the bonanza by time w, which is (1 − G(w))N , we have (7.1a) (7.1b)
H(w) = 1 − Prob(W ≤ w) = 1 − (1 − G(w))N , h(w) = H (w) = N (1 − G(w))N −1 G (w).
If Z < W < L, then the focal individual ceases to search at time Z. If Z > W , however, then its stopping time depends on the strategy of the other individuals. For the sake of tractability, we assume each individual in the population to be either an immediate recruiter (strategy 2, or IR) or a delayed recruiter (strategy 1, or DR). Both types respond to recruitment by others: this aspect of behavior is assumed to be fixed. If an IR-strategist discovers the bonanza, then it immediately recruits all other individuals within range. It then remains at the carcass with its gang of recruits until all return to the roost at dusk, i.e., at time L. If another individual subsequently discovers the carcass itself, independently of recruitment, then it neither rises in status nor joins the gang, but it is able to satiate itself if the gang acquires access to the food. (If necessary, we can think of such a late discoverer as lurking in the vicinity of the carcass until the gang has left and gorging itself rapidly before likewise returning to the roost.) Each juvenile will have access to the food if it is not defended or if it is defended but the gang contains enough recruits to repulse the resident adults. Let δ denote the probability that the bonanza is defended by a resident pair, and let ρ(I) denote the probability that a gang of I juveniles (including its leader) will repulse the residents. For the sake of simplicity, we assume that a lone animal has no chance
7.1. Roving ravens: a recruitment game
283
of repulsing them, but that the probability of a gang’s success thereafter increases in direct proportion to the number of recruits, with maximum probability σ (when the entire roost is at the bonanza): ρ(I) = (I − 1)σ/N.
(7.2)
With this assumption, the probability of access for a gang of size I is a(I) = (1 − δ) · 1 + δρ(I), or a(I) = 1 − δ{1 − (I − 1)σ/N }.
(7.3)
We will refer to the assumption embodied in (7.2) as the posse effect; it holds whenever bonanzas are defended with positive probability but is absent when bonanzas are undefended. In other words, there is a posse effect if and only if δ > 0. We are now ready to calculate the reward matrix, which we write as R S (7.4) A = T P for consistency with (5.1). Here aij denotes the increase in fitness to an individual playing strategy i against N individuals playing strategy j; for example, T is the reward to an IR-strategist among N DR-strategists. The calculation requires some combinatorial results associated with the binomial distribution, namely, (7.5a)
N # N k N −k = 1, k p (1 − p)
(7.5b)
N # k N −k k N = N p, k p (1 − p)
k=0
k=0
(7.5c)
N #
k2
N k N −k = N p + N (N − 1)p2 , k p (1 − p)
k=0
(7.5d)
N # k=0
k 1 − (1 − p)N +1 p (1 − p)N −k =
N 1 k+1 k
N
(N + 1)p
for 0 < p < 1, where k denotes the number of ways of choosing k objects from N (Exercise 7.1). First we calculate R. Although, in an IR population, at most one animal per day—the first discoverer—can rise in status, in a DR population several animals may rise in status by locating the carcass independently, in which case the corresponding increase of fitness is shared equally among them. A DR-strategist who discovers the bonanza delays recruitment to the following day at dawn. If it is the sole discoverer, then the other N juveniles follow it at dawn from the roost to the site of the carcass, but if several animals find the bonanza, then each is equally likely to lead. A juvenile who fails to discover the bonanza will fail to rise in status, but its fitness will still increase by 1 unit if some DR-strategist found the bonanza and the flock gains access to the food. So, conditional upon access, the increase of fitness to a DR-strategist in a DR population is * + −D) + (1 − b) Prob(D ≥ 1), (7.6) b E 1 + α(N 1+D
284
7. Discrete Population Games
where E denotes expected value and D is the number of other discoverers (so that N − D is the number of recruits). Thus, on using (7.5) and multiplying by the probability of access, we find that the increase of fitness to a DR-strategist in a population of DR-strategists is (7.7) R = (1 − δ{1 − σ}) (1 + α){1 − (1 − b)N +1 } − αb (Exercise 7.2). Next we calculate T . Suppose that the population contains a mutant IRstrategist (in addition to N DR-strategists). If this focal individual finds a bonanza at time Z, then it will immediately recruit the other individuals—all of them DRstrategists—within its range of attraction. Let F (Z) denote the number of other individuals still foraging at time Z, and let r denote the recruitment probability for each. In ravens, immediate recruitment appears to be predominantly vocal, although visual cues may play a lesser role. Thus, because individuals are assumed to search at random over a large area, we can interpret r as the ratio of the area of the animal’s call range to the total search area for the roost. Then the expected number of recruits is rF (Z). The focal individual’s fitness will increase by 1 unit for access to the food plus αrF (Z) units for the status of a discoverer, and the size of the gang will be rF (Z) + 1, so that the probability of access will be a(rF (Z) + 1), where a is defined by (7.3). Even if the IR-strategist fails to discover the bonanza, however, i.e., if Z > L, its fitness will increase by 1 if one of the DR-strategists discovers it (i.e., if W < L) and the group as a whole gains access the following dawn. But a payoff of 1 with conditional probability 1 − δ(1 − σ) is equivalent to a payoff of 1 − δ(1 − σ) with conditional probability 1. So the IR-strategist’s payoff against the N DR-strategists is ⎧ σrF (Z) ⎪ ⎪ 1 − δ 1 − (1 + αrF (Z)) if Z < L, ⎨ N (7.8) ψT (Z, W ) = 1 − δ(1 − σ) if Z > L, W < L, ⎪ ⎪ ⎩ 0 if min(Z, W ) > L, and the increase of fitness to an IR-strategist in a population of DR-strategists is its expected value, (7.9)
T = E [ψT (Z, W )] .
To calculate this expression, we must average over the distributions of both F (Z) and Z. Because F (Z) = k when N − k others have ceased to search, and because—in a population of DR-strategists—an animal at has ceasedk to search N −k time Z with probability G(Z), Prob(F (Z) = k) = N {1 − G(Z)} {G(Z)} . k Thus, if we temporarily fix the value of Z and denote the average of (7.8) over the distribution of F (Z) by φT (Z), for Z < L we obtain (7.10) φT (Z) = 1 − δ + r{N α(1 − δ) + (1 + αr)δσ}{1 − G(Z)} + (N − 1)αδσr 2 {1 − G(Z)}2 (Exercise 7.2). Now, on using the result that " L " 1 1 − (1 − b)i+1 (7.11) {1 − G(z)}i g(z) dz = xi dx = , 0
1−b
(i + 1)
7.1. Roving ravens: a recruitment game
285
Figure 7.1. Joint sample space of Z and W
with i = 0, 1, 2 and noting that Z and W are independent random variables and hence are distributed over the infinite quadrant in Figure 7.1 with joint probability density g(z)h(w) = G (z)H (w) per unit area, we deduce from (7.8) and (7.9) that " ∞ " L T = (7.12) h(w) φT (z)g(z) dz dw 0
0
"
+ (1 − δ(1 − σ))
"
L
∞
h(w)
g(z) dz dw = b(1 − δ) + r {N α(1 − δ) + (1 + αr)δσ} b 1 − 12 b + (N − 1)αδσr 2 b 1 − b + 13 b2 0
L
+ (1 − b){1 − (1 − b)N } (1 − δ(1 − σ)) . Next we calculate P . As remarked above, if the focal IR-strategist is not a mutant but belongs instead to a population of IR-strategists, then we must allow for the possibility that it ceases to search before the end of the day, not because it has discovered the bonanza itself, but because another IR-strategist has discovered the bonanza and the focal individual is near enough to be recruited. In that case, the focal individual’s status will not rise, but its fitness will still increase by 1 unit if (immediate) recruitment yields a large enough gang for access to the carcass. If the focal individual is first to discover the carcass itself, then its fitness increases by 1 plus α times the expected number of recruits, or 1 + αrN . If it is not the first discoverer of the carcass (implying Z > W ), then its fitness will increase by 1 if either it is recruited, or subsequently it discovers the carcass for itself. In the triangle shaded in Figure 7.1, the focal individual is recruited with conditional probability r; with conditional probability 1−r it is not recruited, but subsequently finds the bonanza itself. In either case, its reward is 1. In the open rectangle to the right of this triangle, however, the focal individual enjoys a reward of 1 only if it is recruited. But a payoff of 1 with conditional probability r is equivalent to an expected payoff of r with conditional probability 1. So the focal individual’s payoff,
286
7. Discrete Population Games
conditional upon access to the carcass, is ⎧ 1 + αrN ⎪ ⎪ ⎪ ⎨ 1 ψP (Z, W ) = ⎪ r ⎪ ⎪ ⎩ 0
if if if if
Z < min(L, W ), W < Z < L, W < L < Z, min(Z, W ) > L,
and its increase in fitness, conditional on access to the carcass, is E [ψP (Z, W )] = (1 + αrN )Prob(Z < min(L, W )) + 1 · Prob(W < Z < L) + r · Prob(W < L < Z). Whenever the reward is positive—regardless of whether the focal individual is the discoverer, a recruit, or neither—the expected size of the gang of juveniles is rN +1. So the probability of access is a(rN + 1) = 1 − δ(1 − σr), and the (unconditional) increase of fitness to an IR-strategist in an R population is P = {1 − δ(1 − σr)} E [ψP (Z, W )] , which readily reduces to (7.13)
P = (1 − δ + δσr)
1+
αN r{1 − (1 − b)N +1 } + (1 − r)b N +1
(Exercise 7.2). Finally, to complete the reward matrix, we calculate S. Let the population consist of N IR-strategists and a mutant DR-strategist. This focal individual obtains an immediate reward only if it responds to an IR-strategist’s call; if it finds the bonanza itself, then its reward is delayed to the following dawn. Thus we must allow for the fact that if the focal DR-strategist is the first discoverer, then its expected number of recruits is no longer rN , but rather the number of IR-strategists that were not made privy to the bonanza on the previous day (all others being satiated, by assumption). An IR-strategist is not made privy to the bonanza if it fails to find it, and it is not the case that both another IR-strategist finds it and the first animal is recruited. The probability of this event is q = (1 − G(L)) 1 − r{1 − (1 − G(L))N −1 } = (1 − b)(1 − r) + r(1 − b)N . The corresponding expected number of recruits is qN . Thus although the probability of access remains 1 − δ(1 − σr) when Z > W , for Z < W it changes to a (N q + 1) = 1 − δ(1 − σq). So the increase of fitness to a DR-strategist in an IR population is S = {1 − δ (1 − σq)} (1 + αqN )Prob(Z < min(L, W )) + {1 − δ(1 − σr)} Prob(W < Z < L) + r {1 − δ(1 − σr)} Prob(W < L < Z), and this expression reduces to (7.14)
S = P+
1 − {1 − b}N +1 (q − r) {δσ + αN (1 − δ + δσ{q + r})} N +1
(Exercise 7.2). Depending on the signs of R − T and S − P in the reward matrix (7.4), either IR or DR is potentially an ESS. If R − T and S − P are both positive, then DR
7.1. Roving ravens: a recruitment game
287
is the only ESS; if both are negative, then IR is the only ESS; if R − T is positive but S − P is negative, then both strategies are evolutionarily stable; and if R − T is negative but S − P is positive, then neither strategy is a monomorphic ESS but each can infiltrate the other, so that a mixture of both will yield a polymorphic ESS. As in Chapter 5, we will refer to the strategy that yields the higher reward to the population as the cooperative strategy. Thus IR is the cooperative strategy if P > R, and DR is the cooperative strategy if R > P . In practice, however, it seems extremely unlikely that b and r are large enough to make P > R for ravens; moreover, mutualistic food sharing via immediate recruitment has been adequately explained elsewhere [245], and is not the issue we set out to address. We therefore assume that R > P . So our game is an (N + 1)-player analogue of the cooperator’s dilemma, the games being equivalent for N = 1. But because we are concerned with larger group sizes, 2R > S + T in (5.2) no longer applies. First we ask whether a posse effect alone suffices for the evolution of delayed recruitment.2 Setting α = 0 in (7.12) and (7.14) yields R − T = bδσ 1 − r 1 − 12 b , (7.15) δσ(1 − b){1 − (1 − b)N +1 } r (7.16) 1− , S−P = N +1
r1
where (7.17)
r1 =
(1 − b) . 2 − b − (1 − b)N
By inspection, R−T is invariably positive, but the sign of S−P depends on r. Thus, in the absence of a status-enhancement effect, DR is always an evolutionarily stable strategy. Moreover, if r < r1 , then DR is the only evolutionarily stable strategy. If, on the other hand, r > r1 , then both IR and DR are evolutionarily stable; but if the population consists of immediate recruiters initially, then the noncooperative strategy will prevail. We can now answer the question of interest. If the immediate recruitment probability r exceeds r1 , then the posse effect alone does not suffice for the evolution of cooperation. If, however, r lies below this critical value, then the cooperative strategy DR is a dominant strategy and will invade an IR population. In sum, the posse effect alone suffices for the evolution of delayed from immediate recruitment if r is sufficiently small. In fact, such data as are available do tentatively suggest that r < r1 [219]. Next we ask whether the status-enhancement effect alone suffices for the evolution of delayed recruitment. Setting δ = 0 in (7.7) and (7.12)–(7.14), we find that R − T = N bα 1 − 12 b (r2 − r), (7.18) αN (1 − b){1 − (1 − b)N +1 } r (7.19) 1− , S−P = N +1
r1
2 Note that in this case R > P holds regardless: α = 0 in (7.7) and (7.13) yields R − P = (1 − r){1 − δ(1 − σr)}(1 − b){1 − {1 − b}N } + δσ(1 − r){1 − (1 − b)N +1 }, which is invariably positive because b, δ, r, and σ are all probabilities between 0 and 1.
288
7. Discrete Population Games
Table 7.1. Evolutionarily stable strategies in the absence of a posse effect. The critical values of r, namely, r1 and r2 are defined by (7.17) and (7.21), respectively, and invariably satisfy max(r1 , r2 ) < 1 − b < r0 . The critical value of b, namely, bc , is the value of b for which r1 = r2 .
0 < r < min(r1 , r2 ) 0 < b < bc , r1 < r < r2 bc < b < 1, r2 < r < r1 max(r1 , r2 ) < r < r0
DR unique ESS Two ESSs, DR and IR Polymorphic ESS IR unique ESS
and R − P = (1 + α)N br2 (1 − b/2)(1 − r/r0 ), where αN (7.20) r0 = (1 + α) 1 + 1+ N +1
(7.21)
r2 =
b (1 − b)(1 − {1 − b}N )
−1 ,
2(1 − b){1 − (1 − b)N } , N b(2 − b)
and r1 is defined by (7.17). Thus for N ≥ 3, which always holds in practice [219, p. 382], whether IR or DR is an ESS depends on r according to Table 7.1, and as illustrated by Figure 7.2 for α = 0.5 and various values of N . Note that R > P excludes the shaded region, whose curvilinear boundary is r = r0 . Note also that two ESSs seem quite unlikely to arise in practice. We can now answer the question of whether a status-enhancement effect by itself suffices for the evolution of delayed recruitment. The answer is yes, and for at least moderately large roosts the conditions for DR to infiltrate an IR population are essentially the same; but the conditions for DR to invade and become a monomorphic ESS are potentially much more stringent. Nevertheless, they must always be satisfied for sufficiently small b and r. Observations of juvenile ravens in a forest in North Wales support the results of this section. For details, see Wright et al. [364].
7.2. Cooperative wildlife management What can an African government do to deter illegal hunting of wildlife and, hence, promote its conservation? In strictly protected areas, such as large national parks, it can employ professional scouts to discourage and intercept poachers. But much of Africa’s most valuable wildlife is found elsewhere, in communities where people are living and hunting, such as Zambia’s Luangwa Valley, and there aren’t nearly enough professional scouts to go round. So, as a practical matter, governments must find ways to create an incentive for communities to police themselves. Several such schemes, through which governments promise financial rewards to communities for refraining from illegal hunting and to individuals for reporting poachers, have been implemented over the years, yet they have met with only limited success. For example, improved law enforcement in Zambia curbed the decimation of large mammals that was rampant in the 1970s and 1980s, but mainly because poachers switched to targeting smaller animals to reduce the risk of being detected [106]. Could these schemes have been better designed? Here we attempt to answer this
7.2. Cooperative wildlife management
289
Figure 7.2. ESS regimes for δ = 0 and α = 0.5. The shaded region has boundary r = r0 ; the other solid curve is r = r2 ; the dashed curve is r = r1 ; and P denotes a polymorphism. In general, r1 and r2 are independent of α but decrease with respect to N and b, whereas r0 decreases with respect to all three parameters (and approaches 1 − b in the limit as α → ∞, N → ∞); however, the dependence of r0 on N is weak.
question by constructing a discrete population game in which each player chooses whether or not to poach and whether or not to monitor the resource. Let a community consist of n + 1 decision-making families or individuals, to whom a government is offering incentives to conserve a wildlife resource.3 For the sake of simplicity, we assume that incentives are offered in cash (and thus avoid the issue of how noncash benefits could be equitably distributed among deserving individuals). The maximum potential community benefit per period for conserving the resource is B. It is paid in full if no individual is hunting illegally, but reduced proportionately to zero as the number of poachers rises from zero to n + 1;4 in any event, it is distributed equally among all individuals. Thus the community wage per period to each individual for conserving wildlife when m individuals are poaching is

(7.22) Wm = B/(n + 1) · (1 − m/(n + 1)).

3We use n in place of N to be consistent with [226], on which §7.2 is based.
4We implicitly assume that the government has a means of estimating the total number of poachers, even when infractions are not reported by the community.
A choice of hunting technologies is available to residents. For example, in the Luangwa Valley of Zambia, they may either snare surreptitiously or hunt with guns. Before increases in law enforcement levels, hunting with guns was preferred. But snaring subsequently became more prevalent, due to its lower probability of detection. These two options exemplify the tradeoffs a poacher may face. We use L to denote hunting with guns, a long-range technology, and S to denote hunting with snares, a short-range technology. Let VZ be the expected value of returns per period from using type-Z technology, let DiZ denote the associated probability of being detected if i individuals are monitoring, and let CZ denote the corresponding expected cost per period of a conviction for poaching. Then the expected benefit per period to each individual who poaches with type-Z technology, if i individuals are monitoring, is

(7.23) Zi = VZ · (1 − DiZ) + (VZ − CZ) · DiZ = VZ − CZ DiZ,
where Z is either L or S.5 For the sake of simplicity, we assume that each monitor has a constant probability qZ per period of observing while type-Z technology is used. So the probability that some monitor is observing—i.e., that not all monitors are not observing—is

(7.24) DiZ = 1 − (1 − qZ)^i,
where Z again is either L or S. Long-range technologies typically have a higher probability of detection, and so we assume that qL > qS. The existence of a community wage ensures that the net value of poaching with type-Z technology is less than VZ. When m individuals are poaching, an additional poacher reduces the community wage by Wm − Wm+1 = B/(n + 1)^2. Thus the net expected value per period of poaching with type-Z technology is

(7.25) vZ = VZ − B/(n + 1)^2.
If vL and vS were both negative, then there would exist no incentive to poach. The persistence of poaching suggests, however, that one of them is positive, and so we assume that

(7.26) max(vL, vS) > 0.
Expected returns VZ depend not only on the value of captured animals, but also on the numbers taken per period with type-Z technology. Accordingly, it will be convenient to denote the higher-value technology by H, regardless of whether H = L or H = S, and the other technology by O (so that H = L implies O = S, and vice versa). Thus (7.26) is equivalent to vH > 0. We assume that each individual who monitors is paid a scouting fee σ by the government. We envisage that this payment would be remuneration for a variety of monitoring duties (e.g., carrying out a wildlife census), of which game law enforcement is only one; thus monitors would have useful work to do, even if no one were breaking the law. We also assume that monitors would always report observed violations to the government, for which there would be an additional reward RZ to the first informant of a type-Z infraction. Monitoring is not without its costs,

5Note that (7.23) does not imply that convicted poachers keep their ill-gotten gains, because CZ may greatly exceed VZ.
Table 7.2. The strategy set

    Strategy   Definition
1   LM         Poach with type-L technology, monitor and report (other) lawbreakers
2   LX         Poach with type-L technology, do not monitor or report lawbreakers
3   SM         Poach with type-S technology, monitor and report
4   SX         Poach with type-S technology, do not monitor or report
5   NM         Do not poach, monitor and report
6   NX         Do not poach, do not monitor or report
however—even if no one is poaching, there is an opportunity cost c0 associated with the remaining duties. Furthermore, reporting poachers to the government incurs opprobrium, whose intensity increases with the number of poachers. For the sake of simplicity, we assume that the total expected cost of monitoring is c0 + cL j + cS k per period, where j and k are the numbers of type-L and type-S poachers, respectively. Given that hunters gain social status from providing their lineage dependents with meat and goods exchanged for meat [106, p. 943], cL and cS are probably much larger than c0, which is probably very small. Accordingly, we assume that

(7.27) cL ≫ c0,   cS ≫ c0,   c0 ≈ 0.
Observing a violation is not the same thing as being rewarded for it, however, because there may be more than one observer, and only the first to report an offence is assumed to be remunerated. Thus, for a monitoring individual, the probability of obtaining a reward from a specific violation is the probability that the violation is detected times the conditional probability that the monitor is first to report it. For the sake of simplicity, we assume that all monitors are equally likely to obtain a reward from each violation. Then the reward probability for type-Z technology is DiZ/i, where i is the number of monitors; note that this quantity is approximately equal to qZ when qZ is small (that is, when there is effectively no competition in the race to report). Combining our results, we find that the expected benefit per period to each monitoring individual, when j other individuals are engaged in type-L poaching, k other individuals are engaged in type-S poaching, and i individuals are monitoring, is

(7.28) Mijk = (jRL DiL + kRS DiS)/i + σ − c0 − cL j − cS k.
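The quantities (7.22)–(7.28) are straightforward to compute. The sketch below collects them as Python functions; the function names are ours, and any parameter values supplied to them are illustrative assumptions rather than estimates from [226].

```python
# Building blocks (7.22)-(7.28) of the poaching game; names are ours.

def community_wage(m, n, B):
    """W_m of (7.22): wage per individual when m of the n+1 individuals poach."""
    return (B / (n + 1)) * (1 - m / (n + 1))

def detection_prob(i, q):
    """D_i^Z of (7.24): probability that at least one of i monitors observes."""
    return 1 - (1 - q) ** i

def poaching_benefit(i, V, C, q):
    """Z_i of (7.23): expected benefit of poaching when i individuals monitor."""
    return V - C * detection_prob(i, q)

def monitoring_benefit(i, j, k, sigma, c0, cL, cS, RL, RS, qL, qS):
    """M_ijk of (7.28): expected benefit to a monitor when j other individuals
    poach with type-L technology, k with type-S, and i individuals (i >= 1)
    are monitoring."""
    rewards = (j * RL * detection_prob(i, qL)
               + k * RS * detection_prob(i, qS)) / i
    return rewards + sigma - c0 - cL * j - cS * k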
We assume that poachers may use either type-L or type-S technology, but not both. Thus each individual has six possible strategies, defined in Table 7.2. Decimation of large mammals without law enforcement broadly corresponds to a population at LX; reduced exploitation, in which improved law enforcement induces residents to take small mammals only, broadly corresponds to a population at SM; and conservation with law enforcement broadly corresponds to a population at NM. In essence, therefore, the wildlife management scheme in Zambia's Luangwa Valley failed to achieve a desideratum of NM, but instead evolved from LX in the 1970s and 1980s to SM in the 1990s. The goal of our analysis is to indicate how the desired outcome could yet be attained. Note that there is no direct feedback between individuals: any cooperation is mutualistic (§5.7).
Table 7.3. The reward matrix

      LM                  LX                  SM                  SX                  NM                NX
LM    Wn+1 + Ln + Mnn0    Wn+1 + L0 + M1n0    Wn+1 + Ln + Mn0n    Wn+1 + L0 + M10n    W1 + Ln + M·00    W1 + L0 + M·00
LX    Wn+1 + Ln           Wn+1 + L0           Wn+1 + Ln           Wn+1 + L0           W1 + Ln           W1 + L0
SM    Wn+1 + Sn + Mnn0    Wn+1 + S0 + M1n0    Wn+1 + Sn + Mn0n    Wn+1 + S0 + M10n    W1 + Sn + M·00    W1 + S0 + M·00
SX    Wn+1 + Sn           Wn+1 + S0           Wn+1 + Sn           Wn+1 + S0           W1 + Sn           W1 + S0
NM    Wn + Mnn0           Wn + M1n0           Wn + Mn0n           Wn + M10n           W0 + M·00         W0 + M·00
NX    Wn                  Wn                  Wn                  Wn                  W0                W0
The matrix of rewards per period to a focal individual, using a given row strategy against a population using a given column strategy, can now be determined from (7.22)–(7.25) and (7.28); see Table 7.3. We denote this matrix by A, so that aIJ is the reward to an individual using strategy I against n individuals using strategy J. For example, the reward to strategy NX against population strategy LM is a61 = Wn because strategy NX neither poaches nor monitors—the only benefit is the community wage, which is Wn, because there are n + 0 = n type-L poachers and no type-S poachers among the n + 1 individuals in the entire population. Similarly, the reward to strategy SM against population strategy SX is a34 = Wn+1 + S0 + M10n, which we obtain as follows. First, because there are n + 1 type-S poachers when strategy SM confronts n individuals using strategy SX, the community wage is Wn+1 (= 0, by (7.22) with m = n + 1). Second, because SX does not monitor and SM does not monitor itself, no one monitors the SM-strategist, so that the expected benefit of poaching is S0 (= VS, by (7.23) with i = 0 and Z = S). Finally, the benefit from monitoring is M10n because the focal individual is the only monitor, and although all n + 1 individuals are poaching, the focal individual does not monitor itself. Note that the benefits of monitoring to the focal individual are independent of the number of monitors if no one else is poaching, as indicated by the subscripted dots in columns 5 and 6 of Table 7.3. From (2.21), population strategy J is (strongly) stable if aJJ exceeds aIJ for all I ≠ J or, equivalently, if pJJ = 0 is the only nonpositive term in column J of the stability matrix P defined by

(7.29) pIJ = aJJ − aIJ,   1 ≤ I, J ≤ 6.
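Given any numerical reward matrix A (for instance, one obtained by substituting parameter estimates into Table 7.3), the stability test based on (7.29) is mechanical. The following is a minimal Python sketch; the helper names are ours:

```python
import numpy as np

def stability_matrix(A):
    """P of (7.29): p_IJ = a_JJ - a_IJ for a square reward matrix A."""
    return np.diag(A)[None, :] - A

def stable_strategies(A):
    """Indices J (0-based) for which population strategy J is (strongly)
    stable: p_JJ = 0 must be the only nonpositive entry in column J of P."""
    P = stability_matrix(A)
    stable = []
    for J in range(A.shape[0]):
        column = np.delete(P[:, J], J)   # drop p_JJ itself
        if np.all(column > 0):
            stable.append(J)
    return stable
```

For the 6 × 6 matrix of Table 7.3, stable_strategies returns the strategies of Table 7.2 that are candidates for a community norm.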
The first two columns of P are shown in Table 7.4(a), the next two columns in Table 7.4(b), and the last two columns in Table 7.4(c). Some general results emerge at once. First, because p21 + p12 = RL{1 − (1 − qL)^n − nqL} ≤ 0 for n ≥ 1, if p12 is positive, then p21 is negative, and vice versa. Thus if either LM or LX is a stable strategy, then the other is not: if conditions do not favor a switch to monitoring in a no-monitoring population, then they must favor a switch to no monitoring in a monitoring population, and vice versa. Similarly, SM and SX or NM and NX or LX and SX cannot both be stable strategies because p43 + p34 ≤ 0 or p65 + p56 = 0 or p42 + p24 = 0, respectively; and only one of LM, SM, and NM can be stable because p51 + p15 = 0 while both p31 + p13 = 0 and
Table 7.4. The population stability matrix
(a) Type-L technology columns

I   pI1                                               pI2
1   0                                                 −σ + c0 + n(cL − qL RL)
2   σ − c0 − ncL + DnL RL                             0
3   VL − DnL CL − VS + DnS CS                         VL − VS − σ + c0 + n(cL − qL RL)
4   VL − DnL(CL − RL) − VS + DnS CS + σ − c0 − ncL    VL − VS
5   vL − {1 − (1 − qL)^n} CL                          vL − σ + c0 + n(cL − qL RL)
6   vL − DnL(CL − RL) + σ − c0 − ncL                  vL

(b) Type-S technology columns

I   pI3                                               pI4
1   VS − DnS CS − VL + DnL CL                         VS − VL − σ + c0 + n(cS − qS RS)
2   σ − c0 − ncS + DnS(RS − CS) + VS − VL + DnL CL    VS − VL
3   0                                                 −σ + c0 + n(cS − qS RS)
4   σ − c0 − ncS + DnS RS                             0
5   vS − {1 − (1 − qS)^n} CS                          vS − σ + c0 + n(cS − qS RS)
6   σ − c0 − ncS + DnS(RS − CS) + vS                  vS

(c) No-poaching columns

I   pI5                                   pI6
1   {1 − (1 − qL)^n} CL − vL              −σ + c0 − vL
2   σ − c0 + {1 − (1 − qL)^n} CL − vL     −vL
3   {1 − (1 − qS)^n} CS − vS              −σ + c0 − vS
4   σ − c0 + {1 − (1 − qS)^n} CS − vS     −vS
5   0                                     −σ + c0
6   σ − c0                                0
p53 + p35 = 0. Furthermore, and unsurprisingly, NX cannot be a stable strategy, because (7.26) implies that either p26 or p46 is negative: lack of vigilance favors poaching. The upshot is that at most two strategies can be stable strategies, and if there are two stable strategies, then one must be a monitoring strategy (LM, SM, or NM) while the other must be a no-monitoring strategy (LX or SX). Indeed if one of these two no-monitoring strategies is stable, then it must be to use the higher-value technology, i.e., strategy HX, for if everyone else were using the other technology (strategy OX), then it would pay to switch. Nevertheless, our principal purpose is to find conditions for NM to be stable, which can happen only if p65 > 0 or

(7.30) σ > c0.

In particular, σ must be positive: each individual must be paid to monitor even if no one poaches, or else the agreement is unsustainable. On the other hand, (7.27) implies that σ need not be large.
In particular, σ must be positive: each individual must be paid to monitor even if no one poaches, or else the agreement is unsustainable. On the other hand, (7.27) implies that σ need not be large. Let us assume that (7.30) holds. Then p45 > p35 and p25 > p15 . So N M is a stable strategy if, and only if, p15 and p35 are both positive or, on using (7.25) in
Table 7.5. Some critical group sizes when qO < VO/CO = 0.5

qO      ncrit + 1        qO      ncrit + 1
0.005   140              0.025   29
0.01    70               0.03    24
0.015   47               0.05    15
0.02    36               0.1     8
Table 7.4(c),

(7.31) {1 − (1 − qH)^n} CH + B/(n + 1)^2 > VH

and

(7.32) {1 − (1 − qO)^n} CO + B/(n + 1)^2 > VO
both hold (where H denotes the higher-value technology and O the other type); (7.32) is automatically satisfied if vO < 0, by (7.25). As remarked earlier, the only other potentially stable strategy is HX. From Table 7.4(a), if H = L, i.e., VL > VS, then to exclude the possibility that LX is stable, we require p12 < 0 or n(cL − qL RL) < σ − c0; whereas, if H = S, then to exclude the possibility that SX is stable, we require p34 < 0 or n(cS − qS RS) < σ − c0. So NM is the only stable strategy if, in addition to (7.30) and (7.31),

(7.33) n(cH − qH RH) < σ − c0.
Ignoring the fanciful circumstance that qH RH almost exactly balances cH, (7.33) cannot be satisfied for positive cH − qH RH unless either σ − c0 is infeasibly large or n is unreasonably small. In practice, therefore, NM is the only stable strategy if (7.30), (7.31), and qH > cH/RH all hold. Because 1 − (1 − qH)^n cannot be less than qH for n ≥ 1, (7.31) must hold if qH CH/VH > 1. Similarly, (7.32) must hold if qO CO/VO > 1. But it also holds if vO < 0. Thus sufficient conditions for NM to be the only stable strategy are σ > c0,

(7.34) qH > max(VH/CH, cH/RH)

(which requires VH < CH and cH < RH) and

(7.35) either qO > VO/CO or vO < 0.
Otherwise, the stability of NM depends on group size: for NM to be stable, there is a critical value, say ncrit + 1, that n + 1 must exceed. Suppose, e.g., that (7.34) holds and VO/CO < 1, but (7.35) is false. Then the effect of B/VO on (7.32) is negligible, except to the extent of precluding (7.35), and so critical group size is well approximated by ensuring that {1 − (1 − qO)^ncrit} CO/VO exceeds 1:

(7.36) ncrit + 1 ≈ [ln(1 − VO/CO)/ln(1 − qO)] + 2,

where [z] denotes the integer part of z. The effect of group size is illustrated by Table 7.5.6

6And is potentially significant; e.g., Child [64] has argued that, for a community-based scheme to work, the community should be small enough to meet under a tree, which he interprets as having no more than 200 households.
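Under the stated approximation, (7.36) reproduces Table 7.5 directly; the following sketch (with our own function name) does the arithmetic:

```python
import math

def critical_group_size(qO, VO_over_CO):
    """n_crit + 1 of (7.36), with [z] the integer part of z."""
    return math.floor(math.log(1 - VO_over_CO) / math.log(1 - qO)) + 2

# Reproduces Table 7.5, where V_O / C_O = 0.5:
for qO in (0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.05, 0.1):
    print(qO, critical_group_size(qO, 0.5))
# 0.005 -> 140, 0.01 -> 70, 0.015 -> 47, ..., 0.1 -> 8
```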
In general, if a strategy is the only stable one, then we can expect it to emerge as the community norm; whereas if a second strategy is also stable, then we can expect the first to emerge only if it yields a higher community reward. Thus, for a conservation agreement to hold (when vH > 0, as we have assumed), either the government must ensure that NM is the only stable strategy, or if HX is also stable, then the government must ensure both that NM is stable and that it yields a higher community reward than HX. Now, in terms of the reward matrix A defined by Table 7.3, strategy J yields a higher community reward when it yields a higher value of aJJ (i.e., when it yields a higher reward to each individual if the whole population adopts it). So, because NM is the fifth strategy in Table 7.2 whereas HX is either the second or fourth, if HX is stable, then the government must ensure that a55 > max(a22, a44) or, on using Table 7.4,

(7.37) σ − c0 + B/(n + 1) > VH.
The higher the value of cH or the lower the value of RH, the greater the significance of the above inequality. If the conservation strategy NM is the only stable one, then the relatively high value of RH/cH that guarantees (7.33) will coerce the community into conserving the resource, and (7.37) need not hold. If NM is stable but RH/cH is too small to destabilize the anticonservation strategy HX, however, then the emergence of NM will require the community's voluntary cooperation, which can be induced by the government only if it pays a community benefit high enough to make NM yield a higher reward than HX or, on rearranging (7.37), if B > (n + 1){VH − σ + c0}. Then (n + 1)(VH − σ + c0) is the minimum cost to the government of ensuring that NM has the higher reward, but it also costs the government σ per individual, or (n + 1)σ in all, to render NM stable. Including this cost makes (n + 1)(VH + c0) the total minimum price tag for inducing conservation through community self-monitoring. Suppose, however, that the government neglects to pay the community for monitoring per se. In other words, suppose that σ = 0. Then although, by (7.37), NM still yields a higher reward than HX, it is no longer a stable strategy, because p65 is negative: if no one is poaching and no one is being paid to monitor, then it pays to switch to not monitoring, thus avoiding the opportunity cost c0. But although NX would conserve the resource, it is not a stable strategy. From Table 7.4(c), to render it stable requires vH < 0, or B > (n + 1)^2 VH: when σ = 0, the minimum price tag for inducing conservation is raised to (n + 1)^2 VH. In other words, the minimum cost to the government of an effective community agreement to conserve the resource when σ = 0 is greater than when σ > c0 by a factor (n + 1)VH/(VH + c0). If c0 ≪ VH, as c0 ≈ 0 would imply, then the relevant factor is approximately n + 1, which could never be less than 10 and is typically far greater [226]. So an agreement among residents to conserve a wildlife resource through community self-monitoring may be cheaper by at least an order of magnitude for a government to sustain if its community incentive structure separates benefits for not poaching and bonuses for making arrests from payments for monitoring per se. Our analysis clarifies the important distinction between the value of an agreement and its strategic stability; community benefits may strongly influence the former, yet scarcely influence the latter. For example, if B/(n + 1) is very much
greater than σ − c0, then the (individual) value of an agreement to adhere to NM, a55 = σ − c0 + B/(n + 1), is dominated by the magnitude of B; the influence of σ − c0 is negligible. But the magnitude of B need have no effect on the stability of the agreement; whereas the effect of σ − c0 on the agreement's stability can never be ignored because no self-monitoring agreement is sustainable without a payment to each individual that exceeds the opportunity cost of monitoring—even if no one is poaching. Otherwise, it pays to switch to neither poaching nor monitoring, which is not a stable strategy. Thus, to answer the question that motivated our analysis: it appears that community-based wildlife management schemes could, in fact, be better designed—residents could always be paid to monitor, even if no one is breaking the law.
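The cost comparison that closes this argument is easily made concrete. In the sketch below, the community size and the values of VH and c0 are hypothetical, chosen only to display the order-of-magnitude gap:

```python
def min_cost_with_scouting_fee(n, VH, c0):
    """(n+1)(V_H + c_0): minimum government outlay per period when
    monitoring is paid for (sigma just above c_0)."""
    return (n + 1) * (VH + c0)

def min_cost_without_scouting_fee(n, VH):
    """(n+1)^2 V_H: minimum outlay when sigma = 0, so that conservation
    must be sustained through the community benefit alone."""
    return (n + 1) ** 2 * VH

# Hypothetical numbers: a community of 50 families, V_H = 1, c_0 = 0.01.
n = 49
print(min_cost_with_scouting_fee(n, 1.0, 0.01))   # 50.5
print(min_cost_without_scouting_fee(n, 1.0))      # 2500.0, dearer by ~n+1
```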
7.3. An iterated Hawk-Dove game

In Chapter 2 we found that either of two conventional strategies, Bourgeois (B) and anti-Bourgeois (X), could enable a population to settle disputes over a territory or other resource of value V in the absence of any fighting, thus avoiding a cost of C to any loser, and that in terms of evolutionary stability, there was nothing to distinguish B from X, at least in the game of Owners and Intruders (§2.4). So shouldn't B and X arise about equally often? Yet in nature behavior resembling X is at best extremely rare, at least when the contested resource has lasting value, and for good reason—after all, if someone knocked on your door tonight and demanded your property, would you turn it over and run off to attack a neighbor? The "only case known to" Maynard Smith [188, p. 96] concerns a social spider, Oecobius civitas, found on the undersides of rocks in Mexico. Burgess [54, p. 105] provides the following description:

    If a spider is disturbed and driven out of its retreat, it darts across the rock and, in the absence of a vacant crevice to hide in, may seek refuge in the hiding place of another spider of the same species. If the other spider is in residence when the intruder enters, it does not attack but darts out and seeks a new refuge of its own. Thus once the first spider is disturbed the process of sequential displacement from web to web may continue for several seconds, often causing a majority of the spiders in the aggregation to shift from their home refuge to an alien one.
Despite this apparent exception, it is far more usual for territorial conflicts to be settled in favor of owners [188, p. 97]. So why does Owners and Intruders predict that either both B and X are ESSs (for V < C) or neither is an ESS (for V > C)? What crucial effects does the model exclude? Can we develop the model to instead allow for B to be very common but X to be very rare? In particular, can we explain how populations consisting of individuals who invariably fight over territories may evolve to become nonfighting populations that almost invariably observe the ownership-respecting convention sustained by Bourgeois, and only rarely—if ever—observe the ownership-disrespecting convention sustained by anti-Bourgeois? We explore these issues here. First we note that the evolutionary stability of X becomes even more puzzling when we consider that there are two ways to obtain the Dove-Dove payoff for contesting an indivisible resource of value V . The method used to obtain both
Table 7.6. Payoff matrix for Owners and Intruders in an unintrusive population

                                  P2
             HH            B             X             DD
      HH   (V − C)/2    3V/4 − C/4    3V/4 − C/4    V
P1    B    (V − C)/4    V/2           (V − C)/4     V/2
      X    (V − C)/4    3V/4 − C/4    V/2           V
      DD   0            V/2           0             V/2
a22 = V/2 in Table 2.1 and a44 = V/2 in Table 2.3 was to suppose that a nonowner displays at an owner's site, and the animals are then equally likely to blink first; we will refer to such Dove-like behavior as intrusive. We would have obtained the same payoff, however, if we had instead assumed that a nonowner who sees an owner at its site does not display but continues searching, with two Doves equally likely to be an owner or a nonowner when this interaction occurs. We will distinguish such Dove-like behavior as unintrusive. Unintrusiveness has no effect on Table 2.1, but it replaces Table 2.3 by Table 7.6, so that X is the only ESS for V < C (Exercise 2.8).7 Thus unintrusiveness seems to make our model even less consistent with nature! Nevertheless, we allow for both types of Dove in the following analysis.8

Until now, we have assumed that the roles of owner and intruder are mutually exclusive and recognized as such (p. 74). Yet mistakes over ownership of a resource may be relatively common in nature, for a variety of reasons [223, pp. 411–412]. We incorporate their effect by supposing that intruders believe themselves to be owners with probability β, and hence perceive their role correctly with probability 1 − β, whereas owners always perceive themselves as owners. Incorporation of this effect requires us to modify the definition of strategy set that we gave in Table 2.2, and at the same time allows us to simplify our notation by using H in place of HH and D in place of DD. Our modified definitions of strategy appear in Table 7.7.

7By deleting the last row and column from Table 7.6, we see that B would also be evolutionarily stable if D were absent from the population; thus, as noted before in §5.3 (p. 192), whether a strategy is an ESS is always a function of the other strategies present (more fundamentally, a model's predictions are always a function of the assumptions on which it is based). But B is merely a (symmetric) Nash-equilibrium strategy in the presence of D. Thus D cannot be prevented from drifting into a B population and changing its composition to a mixture of B and D. If this mixture is now infiltrated by H or X, however, then the proportion of B will decrease and the proportion of D will increase, because D does better than B against either H or X; in terms of §2.8 the mixture is metastable. If either H or X were to infiltrate repeatedly, then the proportion of D could become quite large (with a corresponding reduction in the proportion of B). But D is not even a Nash-equilibrium strategy and is readily invaded by either H or X. Thus the population would eventually evolve from the metastable mixture of B and D to the sole ESS, namely, X. If the frequency of a strategy in a metastable mixture decreases over time without interference by strategies other than those already represented in the mixture, then the reduction is said to be caused by random drift; whereas, if the frequency of a metastable strategy decreases because it does less well against an infiltrator than other strategies in the mixture, then the reduction is said to be caused by selection pressure.

8We regard an animal's degree of intrusiveness as a characteristic of its entire population. A possible cause of such variation in intrusiveness would be risk of predation, which can vary enormously between two populations of the same species, e.g., the funnel-web spider Agelenopsis aperta. In a desert grassland population of this species, the risk of predation by birds or spider wasps was virtually nonexistent; whereas in a desert riparian population, the risk to exposed spiders was extremely high [129, pp. 120–121]. Wars of attrition would make the second population an easy target for predators, and could be strongly selected against. So the first population may be better modelled as an intrusive population, and the second as an unintrusive one.
Table 7.7. The strategy set

Symbol   Name             Definition
H        Hawk             Always escalate
B        Bourgeois        Escalate if role perceived as owner, do not escalate if role perceived as intruder
X        anti-Bourgeois   Do not escalate if role perceived as owner, escalate if role perceived as intruder
D        Dove             Never escalate
Table 7.8. The states that an animal can occupy. With regard to state 4, the key point is that an animal in this state takes no further part in the competition—regardless of whether it is dead or alive.

i   State
1   Uninjured owner
2   Injured owner
3   Uninjured nonowner
4   Dead, or injured nonowner in an intrusive population
Until now, we have also assumed that an animal's role is independent of its strategy, and that its role is randomly assigned and cannot change. Yet these are rarely valid assumptions for animals involved in a series of contests with different opponents [160, p. 24]. If an intruder wins a contest—which will depend upon its strategy—then it will be an owner, not an intruder, for the next. To incorporate this effect, we need a multiperiod model. We therefore consider a sequence of Hawk-Dove contests that together constitute an iterated Hawk-Dove game, or IH-D for short. The duration of this IH-D for each individual is K units of time, where K is a discrete random variable with geometric distribution, defined by (5.22) with K in place of M:

(7.38) Prob(K ≥ k) = w^(k−1),   k ≥ 1.
So in every period there is constant probability w that the game will continue for at least a further period. To allow for changes of role, in Table 7.8 we define four states that a player can occupy; we will elaborate on these definitions (especially the last) below. We assume that all animals are initially uninjured nonowners, that is, in state 3. Thus we think of the population as colonizing fresh habitat, and we think of K as the time that remains before an animal must acquire a site for reproduction. The higher the risk of predation while it seeks a refuge, the smaller the value of w. The existence of two roles—owner and intruder—implies asymmetry among players at any particular encounter. But the game consists of all encounters, and a strategy is inherited before the start of the game. Thus if all animals initially face the same probabilities of being owner or intruder at any particular encounter, for any given strategy combination, then the game itself is still symmetric, and can still be described by a single 4 × 4 payoff matrix A, which we calculate below; see (7.70).
Let the population consist of N animals, let there be M suitable sites—of equal quality—for a territory, and define

(7.39) σ = N/M.
Then the larger the value of σ, the scarcer the supply of sites, with more sites than animals if σ < 1 and more animals than sites if σ > 1. For simplicity's sake, we assume that σ < 1; and we will later assume that N → ∞ and M → ∞, but in such a way that σ is finite. We assume that sites are randomly distributed across the habitat, and that in each unit of time there is probability ε that a searching animal will find a site. A site so found may be either empty or occupied by another animal; if the latter, then we shall refer to the occupier as the owner, and to the finder as the intruder. The intruder can either attack the owner, in the hope of obtaining a site without further search, or surrender the site to its owner and look for another one. Likewise, if the intruder attacks, then the owner can either also attack or else surrender the site to the intruder—who then becomes the owner—and search for another. To analyze this iterated game we make the following assumptions. First, at most one animal intrudes upon a site per unit of time. The lower the value of ε, the better this assumption, but it is anyhow forced upon us by a further assumption we make below; namely, that an animal can change state only at t = k, where t denotes time and k is an integer. If we allowed multiple intrusions during the same unit of time, then we would also have to allow multiple changes of state. Second, all sites are equally likely to be located by a searching animal; the smaller the habitat, the better this assumption. Third, if two animals actually engage in a fight, then the site owner wins with probability μ (hence the intruder with probability 1 − μ), and the loser always dies;9 however, there is probability λ that the victor is injured. We assume throughout that μ ≥ 1/2. Thus we allow an owner to have a greater probability of winning a fight, purely by virtue of being an owner [260, p. 227]; for example, owners were observed to be about twice as likely as intruders to win escalated contests between desert spiders of equal size observed at average sites, suggesting μ ≈ 2/3 [129, p. 124]. It will be convenient to set

(7.40) μ = (1 + ρ)/2,

where 0 ≤ ρ < 1, and to refer to ρ as owner advantage; it measures the extent beyond equiprobability to which the owner is favored to win in the event of a fight.10 Fourth, injured animals are unable to fight or search for a site. Fifth, sites are solely for reproduction and yield lower fitness to an injured animal, because an injured animal is less capable of attracting a mate or otherwise investing in reproduction. Specifically, an animal's objective is to own a site when t = K; and if it is an uninjured owner (state 1), then—without loss of generality—we define its fitness to be 1, whereas if it is an injured owner (state 2), then we define its fitness to be α, where α < 1 (although the value of α is irrelevant if λ = 0, because then the second state is never entered).

9This assumption may strike you as extreme, but it is by no means the only deliberate simplification that we have made in this book. In nature, not every loser would fight to the death; however, to incorporate a survival probability would merely further complicate the analysis without significantly or usefully affecting our principal conclusions. See the discussion at the beginning of Chapter 9.

10Thus, in terms of Footnote 18 on p. 165, owner advantage is a measure of absolute RHP, but it is convenient to use the same notation and terminology as in §6.7, where it measures relative RHP instead.
Table 7.9. The dimensionless parameters

β   Probability of intruder error in perceiving role (0 ≤ β < 1)
w   Probability of further competition (0 < w < 1)
σ   N/M, number of animals ÷ number of sites (0 < σ < ∞)
ε   Probability per period of locating a site (0 < ε < 1)
ρ   Owner advantage in the event of a fight (0 ≤ ρ < 1)
λ   Probability of injury in winning a fight (0 ≤ λ < 1)
α   Fitness gain to an injured animal from holding a site, where 1 is the gain to an uninjured animal (0 ≤ α < 1). The value of α is irrelevant if λ = 0.
We have now defined the seven dimensionless parameters recorded in Table 7.9. Both σ and ε measure site scarcity; the higher the value of σ, or the lower the value of ε, the more difficult it is to find a vacant site. Both λ and α measure (expected) fighting costs; the higher the value of λ, or the lower the value of α, the greater the risk attached to fighting. In other words, the true value V of the current site (i.e., its expected value assessed in the light of future alternatives) increases with σ and decreases with ε, whereas the true cost C of fighting over the current site increases with λ and decreases with α. An increase in true value and a decrease in true cost have the same effect on fitness. So the higher the value of σ or α, or the lower the value of ε or λ, the less likely it is that Bourgeois will be an ESS [117]. An uninjured animal is restricted to the strategy set {H, B, X, D} at each encounter; and because we have assumed that injured animals are unable to search or fight, the question of strategy for an injured animal does not arise. In an unintrusive population, the injured animal remains on its site until t = K or until it is attacked by an (uninjured) intruder, in which case, we assume that it loses the fight and dies. In an intrusive population, the same applies to H or X intruders, and in the event that B or D intrudes, we simply suppose that the injured animal is reduced to a Dove and, hence, is equally likely to keep its site or surrender it. In the case of surrender, it would become an injured nonowner—but not a dead one, which accounts for the qualification in the definition of state 4. The important point is that an animal in state 4 takes no further part in the competition—regardless of whether it is dead or alive. Although the IH-D is a discrete population game, as in §2.4 it will facilitate analysis to first define a set of probabilistic strategies of which {H, B, X, D} forms a subset. Accordingly, we reinterpret

(7.41) u = (u1, u2)

in (2.43) to mean that a u-strategist will escalate or display with probabilities u1 and 1 − u1, respectively, if its role is perceived as owner, but will escalate or display with probabilities u2 and 1 − u2 if its role is perceived as intruder, so that the strategies in Table 7.7 are embedded as

(7.42) H = (1, 1),   B = (1, 0),   X = (0, 1),   D = (0, 0).
As usual, to determine whether u is an ESS, we must obtain an expression for the reward f(u, v) = f(u1, u2, v1, v2) to a representative u-strategist in a population of v-strategists. We will assume in the first instance that this population is unintrusive but later modify our results to allow for intrusiveness. Let the random variable Xu(t) denote a u-strategist's state at time t, so that Xu has sample space {1, 2, 3, 4}. We will suppose that animals can change their state only at discrete instants of time, t = k, where k is a positive integer. Let xj(k) be the probability that a u-strategist is in state j at time k (i.e., immediately after time k), and let yj(k) be the probability that a v-strategist is in state j at time k; that is, define

(7.43) xj(k) = Prob(Xu(k) = j),   yj(k) = Prob(Xv(k) = j)

for j = 1, . . . , 4.11 Then, because any animal, whether u-strategist or v-strategist, must be in some state at time k, we have

(7.44) x1(k) + x2(k) + x3(k) + x4(k) = 1,   y1(k) + y2(k) + y3(k) + y4(k) = 1
for k = 0, . . . , K. For 1 ≤ i, j ≤ 4, let φij(k, u, v) be the conditional probability that a u-strategist in a population of v-strategists is in state j at time k + 1, given that it is in state i at time k; that is, define

(7.45a) φij(k, u, v) = Prob(Xu(k + 1) = j | Xu(k) = i),

where Prob(Y | Z) is standard notation for the conditional probability of Y, given Z. Note that three arguments and two subscripts are necessary to distinguish between, on the one hand, the probability that a u-strategist goes from state i to state j, which is φij(k, u, v); and, on the other hand, the probability that a v-strategist goes from state i to state j, which is

(7.45b) φij(k, v, v) = Prob(Xv(k + 1) = j | Xv(k) = i).
We assume that animals and sites are both so numerous that N → ∞ and M → ∞, but in such a way that the ratio σ defined by (7.39) is finite. Then every v-strategist is effectively playing only against v-strategists: the probability that a v-strategist will meet a u-strategist is negligible. We now obtain

xj(k + 1) = Σ_{i=1}^{4} Prob(Xu(k + 1) = j | Xu(k) = i) · Prob(Xu(k) = i),
yj(k + 1) = Σ_{i=1}^{4} Prob(Xv(k + 1) = j | Xv(k) = i) · Prob(Xv(k) = i)

for j = 1, . . . , 4 and 0 ≤ k ≤ K − 1, or, on using (7.45),

(7.46a) xj(k + 1) = Σ_{i=1}^{4} φij(k, u, v) xi(k),
(7.46b) yj(k + 1) = Σ_{i=1}^{4} φij(k, v, v) yi(k).

11Strictly, because xj depends on u, we should use xj(k, u) and xj(k, v), respectively, in place of xj(k) and yj(k), but that would be unnecessarily cumbersome.
(7.47)
φij (k, u, v) = 1
j=1
for all 1 ≤ i, j ≤ 4 and 0 ≤ u, v ≤ 1, because if a u-strategist is in state i at time k, then at time k + 1 it must either remain in state i or enter one of the other states. We assume that, during the interval 0 ≤ t < K, effects on fitness other than survival are negligible.12 So, from above (p. 299), if the contest lasts for K periods, then the payoff to a u-strategist against v-strategists is ⎧ ⎪ ⎨1 if Xu (K) = 1, FK (u, v) = α if Xu (K) = 2, ⎪ ⎩ 0 if Xu (K) ≥ 3. FK , so defined, is a random variable, whose expected value E [FK (u, v)] = 1 · Prob(Xu (K) = 1) + α · Prob(Xu (K) = 2) + 0 · Prob(Xu (K) ≥ 3) = x1 (K) + αx2 (K) is the reward to strategy u against strategy v if the contest lasts for K periods. Thus, because the length of the contest is random, the reward to strategy u against strategy v is the expected value of x1 (K)+αx2 (K), calculated over the distribution of K, that is, ∞ # f (u, v) = {x1 (k) + αx2 (k)}Prob(K = k) (7.48)
k=1
= (1 − w)
∞ #
wk−1 {x1 (k) + αx2 (k)},
k=1
because Prob(K = k) = Prob(K ≥ k) − Prob(K ≥ k + 1) = wk−1 − wk by (7.38). We can deduce the vectors x(k) and y(k) from (7.46) for any value of k if we first obtain an explicit expression for the 4 × 4 matrix Φ(k, u, v) defined by (7.45a). All possible transitions from states 1 or 2 are shown in Figure 7.3. We will first calculate the transition probabilities for the unintrusive case and subsequently indicate which terms require modification for the intrusive case. Accordingly, let us first suppose that the u-strategist is in state 1 (uninjured owner) at time k. Then it can move into state 4 (injured nonowner) only if it is intruded upon, is attacked, attacks back, and then loses; whereas it can move into 12 In spiders, for example, we ignore the cost in terms of fitness of abandoning a web, and simply assume that new construction costs are negligible compared to the necessity of eventually having a permanent site.
[Figure 7.3. Flow chart of all possible transitions from owner states 1 (uninjured) or 2 (injured), where q = q(y3(k)) and pv are defined by (7.49) and (7.50). All arc probabilities are conditional (on reaching the node from which the arc emanates) probabilities, and are equal to 1 for unlabelled arcs. The diagram is drawn for the intrusive case, using thick solid lines and larger arrowheads to indicate arcs that must be deleted for the unintrusive case (with the unused weight of 1/2 on a deleted arc transferred to the other arc emanating from the same parent node, to increase its weight from 1/2 to 1), and using dashed lines to surround nodes that are irrelevant to the unintrusive case. Numbers in shaded boxes refer to states; labels in unshaded boxes: A = attacked, AB = attack back, DNE = do not escalate, E = escalate, I = injured, IO = intruded on, L = loser, NA = not attacked, NI = no intrusion, U = uninjured, W = winner.]
state 3 (uninjured nonowner) only if it is intruded upon, is attacked, and promptly surrenders; and it can move into state 2 (injured owner) only if it is intruded upon, is attacked, attacks back, and then wins, but sustains an injury. At time k, the expected number of uninjured nonowners is N y3(k), because all of them are v-strategists, and each finds a site during the interval k < t < k + 1 with probability ε. Because they are all equally likely to find any site, the probability that one of them finds the particular site now occupied by the u-strategist is ε/M. The probability that one of them does not find the u-strategist is therefore 1 − ε/M, the probability that none of them finds the u-strategist is that number raised to the power of N y3(k), and so the probability that at least one of them intrudes is 1 − (1 − ε/M)^(N y3(k)) = 1 − (1 − ε/M)^(M σy3(k)) by (7.39). So, in the limit as M → ∞, the probability that at least one uninjured nonowner locates the u-strategist is q(y3(k)), where

(7.49) q(y3) = 1 − e^(−εσy3).
Having assumed that at most one animal intrudes upon a site per unit of time (p. 299), we interpret q(y3(k)) as the probability that an uninjured nonowner intrudes during the interval k < t < k + 1.13 Then, because another animal attacks as intruder with probability

(7.50) pv = βv1 + (1 − β)v2,
and because the u-strategist attacks as owner with probability u1 and loses with probability 1 − μ (conditional upon attacking), the probability that a nonowner intrudes and attacks, and that the u-strategist attacks and loses, is q(y3(k)) times pv times u1 times 1 − μ. This is all conditional, of course, on the u-strategist being in state 1, and assuming that the intruder's attack probability is independent of the u-strategist's surrender probability. We have thus established that

(7.51) φ14(k, u, v) = (1 − μ)u1 pv q(y3(k)).

Similarly, because the u-strategist surrenders as owner with probability 1 − u1,

(7.52) φ13(k, u, v) = (1 − u1)pv q(y3(k))

and, because the u-strategist wins with probability μ but (conditional upon winning) sustains an injury with probability λ,

(7.53) φ12(k, u, v) = μλu1 pv q(y3(k)).

The probability that the u-strategist remains in state 1 is now readily deduced from (7.47) with i = 1:

(7.54) φ11(k, u, v) = 1 − {1 − μ(1 − λ)u1}pv q(y3(k)).
Alternatively, (7.54) can be obtained by summing products of conditional probabilities around all three circuits from state 1 back to itself in Figure 7.3—one through no intrusion (NI), one through uninjured (U), and one through not attacked (NA, with the escalate (E) and do not escalate (DNE) nodes deleted for the time being, because for now we deal only with the unintrusive case).

13In the event that more than one animal located the site, we could suppose that the u-strategist interacted with the first, and that later arrivals ignored them both. As noted on p. 299, the smaller the value of ε, the more reasonable the interpretation (and for sufficiently small ε, we can even approximate (7.49) by q(y3) = εσy3).
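As a check on the algebra, the identity (7.47) for row 1 of Φ can be verified symbolically, for example with sympy:

```python
import sympy as sp

u1, pv, q, mu, lam = sp.symbols('u1 p_v q mu lambda')

phi14 = (1 - mu) * u1 * pv * q                    # (7.51)
phi13 = (1 - u1) * pv * q                         # (7.52)
phi12 = mu * lam * u1 * pv * q                    # (7.53)
phi11 = 1 - (1 - mu * (1 - lam) * u1) * pv * q    # (7.54)

# (7.47) with i = 1: the four transition probabilities sum to 1.
assert sp.simplify(phi11 + phi12 + phi13 + phi14 - 1) == 0
```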
Second, let us suppose that the u-strategist is in state 2 (injured owner). Then the u-strategist is injured and does not recover, whence

(7.55) φ21(k, u, v) = 0 = φ23(k, u, v).

The u-strategist either remains an injured owner or becomes an injured nonowner. Given that it is already injured, it surrenders its site if it is intruded upon, which happens with probability q(y3(k)), and if it is attacked, which happens with probability βv1 + (1 − β)v2 = pv. Thus, on using (7.47) with i = 2, we have

(7.56) φ22(k, u, v) = 1 − pv q(y3(k)),   φ24(k, u, v) = pv q(y3(k)).
All possible transitions from states 3 or 4 are shown in Figure 7.4. Let us first suppose that the u-strategist is in state 3 (uninjured nonowner). Then it will descend into state 4 (injured nonowner) only if it intrudes upon an uninjured owner, attacks, is attacked back, and then loses. At time k, the expected number of uninjured owners is N y1(k), because every one of them is a v-strategist; whence, if the u-strategist finds a site, then the probability that it is occupied by an uninjured owner is N y1(k)/M = σy1(k)—if, of course, this number does not exceed 1. Here we digress to remark that the expected number of owners should not exceed the number of sites available; i.e., N(y1 + y2) ≤ M or σ(y1 + y2) ≤ 1. Because our model underestimates the true probability of intrusion, and hence overestimates the probability of ownership, it is possible for N(y1 + y2) calculated from (7.46b) to exceed M, but only if σ is significantly greater than 1. Although we do not consider such cases,14 for completeness we correct for the anomaly by simply assuming that if (7.46b) predicts N(y1 + y2) > M, then all M sites are occupied with M y1 uninjured owners and M y2 injured owners, so that the proportions are consistent with the probabilities. Thus the probability that a site is occupied at time k by an uninjured owner is σy1(k) if σy1(k) + σy2(k) ≤ 1, but y1(k)/{y1(k) + y2(k)} if σy1(k) + σy2(k) > 1; that is, the probability is s1(y(k)), where we define

(7.57) sm(y) = min(σym, ym/(y1 + y2))
for m = 1, 2. Similarly, the probability that a site is occupied at time k by an injured owner is s2(y(k)). Let us now return to calculating the probability that a u-strategist in state 3 will descend into state 4 during the interval k < t < k + 1. The u-strategist finds a site with probability ε, and it is occupied by an uninjured owner with probability s1(y(k)), in which case, the u-strategist attacks as intruder with probability

(7.58) pu = βu1 + (1 − β)u2,

is attacked back by the owner with probability v1, and subsequently loses with probability μ. Thus, multiplying all the conditional probabilities together, we have

(7.59) φ34(k, u, v) = εμpu v1 s1(y(k)).
Similarly, because (conditional upon an intrusion and engagement) the u-strategist wins and is injured with probability (1 − μ)λ, we have

(7.60) φ32(k, u, v) = ε(1 − μ)λpu v1 s1(y(k)).

14For a discussion of the case where σ > 1, see [223, pp. 421–422].
[Figure 7.4. Flow chart of all possible transitions from nonowner states 3 (uninjured) or 4 (injured), where s1 = s1(y(k)), s2 = s2(y(k)), and pu are defined by (7.57) and (7.58). All other details are the same as in Figure 7.3, with labels in unshaded boxes defined as follows: A = attacked, AB = attacked back, DNE = do not escalate, E = escalate, I = injured, IO = injured owner, L = lose, NA = not attacked, NSF = no site found, SE = site empty, SF = site found, U = uninjured, UO = uninjured owner, W = winner.]
The probability that the u-strategist finds a site whose owner is either injured or uninjured and does not attack is ε times s1(y(k)) + s2(y(k)) times 1 − pu. The probability that the u-strategist does not find a site is 1 − ε. In either case, it remains in state 3. Thus

(7.61) φ33(k, u, v) = 1 − ε + ε{1 − pu}{s1(y(k)) + s2(y(k))}
and (7.47) with i = 3 implies

(7.62) φ31(k, u, v) = ε[1 − (1 − pu){s1(y(k)) + s2(y(k))} − {μ + λ(1 − μ)}pu v1 s1(y(k))].
Finally, because an injured animal is unable to search, even for unoccupied sites, we have

(7.63) φ4j(k, u, v) = 0 for 1 ≤ j ≤ 3,   φ44(k, u, v) = 1.
The matrix φ has now been defined for an unintrusive population. Six elements of this matrix, namely, φ11, φ13, φ22, φ24, φ31, and φ33, require modification if the population is intrusive. In (7.52), conditional on being intruded upon and not fighting, a u-strategist's probability of descending from state 1 to state 3 in an unintrusive population is simply pv, the probability that the intruder attacks. In an intrusive population, however, if the intruder does not attack, then owner and intruder are equally likely to obtain the site. Thus, conditional on being intruded upon and not fighting, a u-strategist's probability of descending from state 1 to state 3 is, not pv, but pv + (1 − pv)/2 = (1 + pv)/2. Thus (7.52) is replaced by

(7.64) φ13(k, u, v) = ½(1 − u1){1 + pv} q(y3(k))
and, from (7.47) with i = 1, (7.54) is replaced by

(7.65) φ11(k, u, v) = 1 − ½[1 − u1 + {1 + (1 − 2μ(1 − λ))u1}pv] q(y3(k)).

Similarly, because we assume that an injured owner in an intrusive population is equally likely to lose or keep its site when B or D intrudes, (7.56) is replaced by

(7.66) φ22(k, u, v) = 1 − ½{1 + pv}q(y3(k)),   φ24(k, u, v) = ½{1 + pv}q(y3(k)).
In state 3, the focal u-strategist finds a site and then does not fight for it with probability ε(1 − pu). Conditional thereupon, it remains in state 3 with probability ½ if the site is occupied by an injured owner, or with probability v1 · 1 + (1 − v1) · ½ = ½(1 + v1) if it is occupied by an uninjured owner, that is, with net conditional probability ½ s2(y(k)) + ½(1 + v1) s1(y(k)) = ½{(1 + v1)s1(y(k)) + s2(y(k))}. Thus

(7.67) φ33(k, u, v) = 1 − ε + ½ε{1 − pu}{(1 + v1)s1(y(k)) + s2(y(k))}

replaces (7.61), while (7.62) becomes

(7.68) φ31(k, u, v) = ε[1 − ½{1 − pu}{(1 + v1)s1(y(k)) + s2(y(k))} − {μ + λ(1 − μ)}pu v1 s1(y(k))]
by (7.47). Substitution of φ into (7.46) leads to a set of eight first-order, nonlinear recurrence equations. Because all animals are initially uninjured nonowners, that is, in state 3 (p. 298), we have

(7.69) x(0) = (0, 0, 1, 0) = y(0).
Given (7.69), the vectors x(k) and y(k) are readily calculated from k successive recursions of (7.46) for any strategy combination (u, v), and for any value of k, and so f (u, v) = f (u1 , u2 , v1 , v2 ) in (7.48) can be approximated to an arbitrarily high
degree of accuracy by truncating the infinite series after a sufficiently large number of terms. A series of such calculations yields the reward matrix

(7.70) A =
⎡ f(1, 1, 1, 1)   f(1, 1, 1, 0)   f(1, 1, 0, 1)   f(1, 1, 0, 0) ⎤
⎢ f(1, 0, 1, 1)   f(1, 0, 1, 0)   f(1, 0, 0, 1)   f(1, 0, 0, 0) ⎥
⎢ f(0, 1, 1, 1)   f(0, 1, 1, 0)   f(0, 1, 0, 1)   f(0, 1, 0, 0) ⎥
⎣ f(0, 0, 1, 1)   f(0, 0, 1, 0)   f(0, 0, 0, 1)   f(0, 0, 0, 0) ⎦,

which is a function of the seven parameters β, w, σ, ε, ρ, λ, and α listed in Table 7.9. So the ESS is a function of the same seven parameters.
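For readers who wish to reproduce calculations of this kind, the following Python sketch implements the unintrusive transition probabilities (7.49)–(7.63), iterates (7.46) from (7.69), and truncates (7.48). It is a sketch only: the intrusive modifications (7.64)–(7.68) are omitted, the truncation level is arbitrary, and agreement with the published tables in every decimal place is not guaranteed.

```python
import numpy as np

def transition(y, u1, pu, v1, pv, mu, lam, sigma, eps):
    """Unintrusive transition matrix (7.51)-(7.63): entry [i, j] is the
    probability that a focal strategist moves from state i+1 to state j+1,
    given the population state distribution y = (y1, y2, y3, y4)."""
    q = 1.0 - np.exp(-eps * sigma * y[2])                   # (7.49)
    tot = y[0] + y[1]
    s1 = min(sigma * y[0], y[0] / tot) if tot > 0 else 0.0  # (7.57)
    s2 = min(sigma * y[1], y[1] / tot) if tot > 0 else 0.0
    P = np.zeros((4, 4))
    P[0, 1] = mu * lam * u1 * pv * q                        # (7.53)
    P[0, 2] = (1.0 - u1) * pv * q                           # (7.52)
    P[0, 3] = (1.0 - mu) * u1 * pv * q                      # (7.51)
    P[0, 0] = 1.0 - P[0, 1] - P[0, 2] - P[0, 3]             # (7.54) via (7.47)
    P[1, 1] = 1.0 - pv * q                                  # (7.56)
    P[1, 3] = pv * q
    P[2, 1] = eps * (1.0 - mu) * lam * pu * v1 * s1         # (7.60)
    P[2, 2] = 1.0 - eps + eps * (1.0 - pu) * (s1 + s2)      # (7.61)
    P[2, 3] = eps * mu * pu * v1 * s1                       # (7.59)
    P[2, 0] = 1.0 - P[2, 1] - P[2, 2] - P[2, 3]             # (7.62) via (7.47)
    P[3, 3] = 1.0                                           # (7.63)
    return P

def reward(u, v, w, sigma=0.8, eps=0.1, rho=0.0, lam=0.6,
           alpha=0.5, beta=0.0, kmax=3000):
    """f(u, v) of (7.48), truncated after kmax terms of the series."""
    mu = 0.5 * (1.0 + rho)                                  # (7.40)
    pu = beta * u[0] + (1.0 - beta) * u[1]                  # (7.58)
    pv = beta * v[0] + (1.0 - beta) * v[1]                  # (7.50)
    x = np.array([0.0, 0.0, 1.0, 0.0])                      # (7.69)
    y = x.copy()
    f = 0.0
    for k in range(1, kmax + 1):
        x = x @ transition(y, u[0], pu, v[0], pv, mu, lam, sigma, eps)
        y = y @ transition(y, v[0], pv, v[0], pv, mu, lam, sigma, eps)
        f += (1.0 - w) * w ** (k - 1) * (x[0] + alpha * x[1])
    return f

# Reward matrix (7.70) for H, B, X, D embedded as in (7.42), together with
# the diagonal-largest-in-column criterion used for boldface in Table 7.10:
S = [(1, 1), (1, 0), (0, 1), (0, 0)]
A = np.array([[reward(u, v, w=0.8) for v in S] for u in S])
ess = [J for J in range(4)
       if all(A[J, J] > A[I, J] for I in range(4) if I != J)]
```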
We consider separately the effects of owner advantage and ownership uncertainty. To consider the first effect, we set β = 0 with ρ ≥ 0, so that μ ≥ ½ in (7.40). General relationships now exist between the reward matrices for unintrusive and intrusive populations. If we use Ā to distinguish the reward matrix for an intrusive population from that for an unintrusive one, then it is always true that

(7.71) ā11 = a11, ā13 = a13, ā14 = a14, ā22 = a22, ā31 = a31, ā33 = a33,
       ā23 > a23, ā24 > a24, ā32 < a32, ā34 < a34, ā42 < a42, ā43 > a43.
These twelve between-population relationships are identical to those that exist between the corresponding entries of Tables 2.3 and 7.6; however, whereas the remaining elements in Tables 2.3 and 7.6 are related by equality, now we obtain

(7.72) ā12 < a12,   ā21 > a21,   ā41 > a41,   ā44 > a44

instead. The first three of these inequalities arise because an injured Hawk is guaranteed to keep its site from a peaceable intruder only in an unintrusive population. Thus H does better against an unintrusive B population than against an intrusive one; and B and D do better against an intrusive population whose ESS is H than they do against an unintrusive one. The fourth inequality is more subtle: Doves can lose their territories in an intrusive D population, and so the population spends more time searching. Hence—because all sites are equally likely to be located in any given period—more vacant sites are found by the population and the population reward to D is higher.

Some intrapopulation relationships hold regardless of whether the population is intrusive or unintrusive (and so overbars are unnecessary in this paragraph). First, because territories can change hands so easily in an X population, the population spends more time searching than it would if its strategy were B, and hence more vacant sites are found by the population; thus the population reward is always higher in an X population than in a B population, or a33 > a22. Second, the population rewards to B, X, and D, namely, a22, a33, and a44, are independent of ρ, because owner advantage makes a difference only if fighting occurs; whereas the population reward to H, namely, a11, is a decreasing function of ρ. All of the above relationships are exemplified by Tables 7.10 and 7.11, which show results of reward-matrix calculations for both an unintrusive and an intrusive population.15

15The upper panels of these two tables also illustrate some relationships among rewards for an unintrusive population, which are parallel to relationships among rewards in Table 7.6. Specifically, because B and D are indistinguishable in an unintrusive population without H or X, we have a22 = a24 = a42 = a44, and because an X mutant is the same as an H mutant in an unintrusive B or D population, we have a12 = a32 and a14 = a34.
Table 7.10. Reward matrix in the IH-D for four values of w when σ = 0.8, ε = 0.1, λ = 0.6, α = 0.5, β = 0, and ρ = 0. H, B, X, and D are strategies 1, 2, 3, and 4, respectively; an asterisk marks a diagonal entry that is an ESS.

unintrusive population

w = 0.8:
        H         B         X         D
H    0.2932*   0.3229    0.3185    0.3571
B    0.2844    0.3147    0.2801    0.3147
X    0.2882    0.3229    0.3158    0.3571
D    0.2785    0.3147    0.2732    0.3147

w = 0.85:
        H         B         X         D
H    0.3362*   0.3771    0.3684    0.4255
B    0.3274    0.3692    0.3186    0.3692
X    0.3330    0.3771    0.3705*   0.4255
D    0.3226    0.3692    0.3121    0.3692

w = 0.9:
        H         B         X         D
H    0.3957    0.4545    0.4346    0.5263
B    0.3911    0.4509    0.3710    0.4509
X    0.3986    0.4545    0.4525*   0.5263
D    0.3919    0.4509    0.3688    0.4509

w = 0.95:
        H         B         X         D
H    0.4848    0.5744    0.5186    0.6897
B    0.5032    0.5940    0.4467    0.5940
X    0.5082    0.5744    0.5960*   0.6897
D    0.5287    0.5940    0.4693    0.5940

intrusive population

w = 0.8:
        H         B         X         D
H    0.2932*   0.3223    0.3185    0.3571
B    0.2860    0.3147    0.3005    0.3372
X    0.2882    0.3007    0.3158    0.3353
D    0.2803    0.2927    0.2957    0.3152

w = 0.85:
        H         B         X         D
H    0.3362*   0.3761    0.3684    0.4255
B    0.3299    0.3692    0.3457    0.3998
X    0.3330    0.3464    0.3705*   0.3963
D    0.3255    0.3386    0.3437    0.3698

w = 0.9:
        H         B         X         D
H    0.3957    0.4525    0.4346    0.5263
B    0.3950    0.4509    0.4073    0.4935
X    0.3986    0.4095    0.4525*   0.4868
D    0.3969    0.4063    0.4159    0.4517

w = 0.95:
        H         B         X         D
H    0.4848    0.5700    0.5186    0.6897
B    0.5095    0.5940*   0.4918    0.6527
X    0.5082    0.5016    0.5960*   0.6392
D    0.5377    0.5250    0.5466    0.5950
Specifically, Tables 7.10 and 7.11 show reward matrices for two values of ρ and four values of w, when σ = 0.8, ε = 0.1, λ = 0.6, α = 0.5, and β = 0. The corresponding ESSs are identified by marking any diagonal element that is the largest in its column. For ρ = 0, or equivalently μ = ½ by (7.40), as w increases the ESS changes from H to H or X and then to X, with a further change to B or X for an intrusive population (Table 7.10). The values chosen for σ, ε, λ, and α are purely arbitrary, but they are easily changed, and numerical experiment shows that there are always three critical values, namely, w1, at which the ESS changes from H to H or X, w2 (> w1) at which the ESS changes from H or X to X, and w3 (> w2) at which the ESS changes from X to B or X for an intrusive population (only). For example, Table 7.10 corresponds to w1 ≈ 0.83, w2 ≈ 0.88, and w3 ≈ 0.91.16 These critical values increase with σ and α but decrease with λ and ε. Note that if X is an ESS for μ > ½ (as in Table 7.11 for w = 0.9 and w = 0.95), then the ESS is paradoxical in the sense that sites are acquired conventionally by the contestant whose probability of winning a fight—if there were one—would be lower.

16Note that, for i = 1, 2, wi for an intrusive population is identical to wi for an unintrusive population for the following reason. As w increases, the ESS changes from H to H or X and then to X because a33 − a13 increases with w while a11 − a31 decreases. But a11, a13, a31, and a33 for an intrusive population are the same as for an unintrusive one, by (7.71). So the value of w for which a33 = a13, which is w1, and the value of w for which a11 = a31, which is w2, are the same for both populations.
Table 7.11. Reward matrix in the IH-D for four values of w when σ = 0.8, ε = 0.1, λ = 0.6, α = 0.5, β = 0, and ρ = 0.4. H, B, X, and D are strategies 1, 2, 3, and 4, respectively; an asterisk marks a diagonal entry that is an ESS.

unintrusive population

w = 0.8:
        H         B         X         D
H    0.2930*   0.3155    0.3258    0.3571
B    0.2905    0.3147    0.2866    0.3147
X    0.2816    0.3155    0.3158    0.3571
D    0.2785    0.3147    0.2732    0.3147

w = 0.85:
        H         B         X         D
H    0.3360*   0.3667    0.3788    0.4255
B    0.3357    0.3692    0.3278    0.3692
X    0.3236    0.3667    0.3705    0.4255
D    0.3227    0.3692    0.3121    0.3692

w = 0.9:
        H         B         X         D
H    0.3955    0.4390    0.4506    0.5263
B    0.4029    0.4509    0.3849    0.4509
X    0.3845    0.4390    0.4525*   0.5263
D    0.3920    0.4509    0.3688    0.4509

w = 0.95:
        H         B         X         D
H    0.4845    0.5496    0.5462    0.6897
B    0.5215    0.5940    0.4704    0.5940
X    0.4848    0.5496    0.5960*   0.6897
D    0.5288    0.5940    0.4693    0.5940

intrusive population

w = 0.8:
        H         B         X         D
H    0.2930*   0.3151    0.3258    0.3571
B    0.2920    0.3147    0.3074    0.3372
X    0.2816    0.2934    0.3158    0.3353
D    0.2803    0.2927    0.2957    0.3152

w = 0.85:
        H         B         X         D
H    0.3360    0.3661    0.3788    0.4255
B    0.3382    0.3692*   0.3556    0.3998
X    0.3236    0.3360    0.3705    0.3963
D    0.3255    0.3386    0.3437    0.3698

w = 0.9:
        H         B         X         D
H    0.3955    0.4378    0.4506    0.5263
B    0.4069    0.4509*   0.4223    0.4935
X    0.3845    0.3937    0.4525*   0.4868
D    0.3970    0.4063    0.4159    0.4517

w = 0.95:
        H         B         X         D
H    0.4845    0.5469    0.5462    0.6897
B    0.5279    0.5940*   0.5178    0.6527
X    0.4848    0.4748    0.5960*   0.6392
D    0.5378    0.5250    0.5466    0.5950
ρ
Bourgeois (B) or anti-Bourgeois (X)
Bourgeois (B) H or B Hawk (H) H
B
H
X H or X
0 0 High predation
w1
X w2 w3 1w Low predation
Figure 7.5. Illustration of ESS as a function of survival probability w and owner advantage ρ for an intrusive population. The probability that an owner wins is μ = 12 (1 + ρ), by (7.40).
7.3. An iterated Hawk-Dove game
311
Table 7.12. IH-D reward matrix for an intrusive population for two pairs of values of w, μ when σ = 0.8, = 0.1, λ = 0.6, α = 0.5, and β = 0; boldface indicates an ESS.
w = 0.775, ρ = 0.5 0.2756 0.2753 0.2635 0.2630
0.2933 0.2936 0.2745 0.2747
0.3060 0.2895 0.2945 0.2770
w = 0.87, ρ = 0.2 0.3306 0.3130 0.3116 0.2940
0.35743 0.35723 0.35044 0.34969
0.39746 0.39750 0.36298 0.36221
0.39869 0.37372 0.39895 0.36864
0.46083 0.43233 0.42784 0.39823
Further numerical experiment reveals that the critical values move closer together as ρ increases beyond 0, until eventually they coalesce, allowing B to replace X as the sole ESS in an intrusive population at higher values of ρ for a range of intermediate values of w, e.g., w = 0.85 when ρ = 0.4 (Table 7.11). The resultant dependence of the ESS on w and ρ is illustrated in Figure 7.5, where a point with coordinates (w, ρ) represents the environment that a population faces. Although this diagram is drawn only for an intrusive population and several details of the true picture (e.g., the size of the H-or-X region) have been exaggerated or suppressed for the sake of clarity,17 the overall topology is correct, and what it shows is the following. If initially high predation is associated with low owner advantage so that ρ is not much bigger than 0, and subsequently if predation decreases and if the population tracks the ESS, then it will evolve from H to X. Moving to the right below the dashed line in Figure 7.5, when the population moves into the H-or-X region it will stay at H, because it began at H, but it will switch to X in the X-only region, and it will stay at X in the B-or-X region, because it began at X. Likewise, above the dashed line, the population will shift from H to B under decreasing predation, staying at H through the narrow H-or-B region because it began at H, but then switching to B in the B-only region. Thus B can arise under decreasing predation only if ρ exceeds a critical value that is strictly greater than 0; if ρ = 0, then X will arise instead.18 Having shown that Bourgeois cannot arise from a nonfighting population when there is neither owner advantage nor ownership uncertainty, we now separately consider the effect of the latter by setting ρ = 0 with β > 0, so that μ = 12 in (7.40). Table 7.13 shows the corresponding reward matrices for σ = 0.8, = 0.1, λ = 0.6, α = 0.5 (all as in Table 7.10), and β = 0.4 for six values of w differing by increments of 0.02, with ESSs again identified by using boldface. As w increases, the ESS shifts from H to B, then to B or X, and finally to X, after passing through a range of values—between approximately 0.8904 and 0.9026—where no monomorphic ESS exists. Extensive numerical experiment using other parameter 17 For an unintrusive population, the principal difference is that the Bourgeois region of Figure 7.5 is replaced by a region in which no member of the strategy set {H, B, X, D} is an ESS; such a region also arises for an intrusive population in place of the narrow H-or-B transitional region when λ is sufficiently low. Among details suppressed in Figure 7.5 is that it is possible for H, B, and X all to be ESSs in a very small transitional region where the various regions depicted in the diagram all come together; an example is given by the second reward matrix in Table 7.12. All of these details are discussed at length elsewhere [203, pp. 208–213]. 18 Because the critical value of ρ is found to be an increasing function of σ, the more abundant the sites, i.e., the lower the value of σ, the lower is the owner advantage necessary to ensure that the population evolves to B; however, it is always positive.
312
7. Discrete Population Games
Table 7.13. IH-D reward matrix for an intrusive population for six values of w when σ = 0.8, = 0.1, λ = 0.6, α = 0.5, β = 0.4, and ρ = 0; boldface indicates an ESS.
w = 0.8725 0.3604 0.3579 0.3571 0.3530
0.3850 0.3823 0.3642 0.3600
0.4144 0.4162 0.4214 0.4224
0.4459 0.4474 0.4260 0.4264
0.4905 0.5052 0.5269 0.5487
0.5306 0.5453 0.5227 0.5437
w = 0.8925
0.4215 0.4063 0.4026 0.3836
0.4657 0.4490 0.4208 0.4021
0.3854 0.3844 0.3859 0.3835
0.4131 0.4120 0.3921 0.3895
0.5594 0.5399 0.5031 0.4794
0.4489 0.4553 0.4667 0.4744
0.4845 0.4907 0.4681 0.4751
0.7005 0.6802 0.6340 0.6053
0.5419 0.5722 0.6120 0.6648
0.5864 0.6168 0.5979 0.6521
w = 0.9125
0.5083 0.4901 0.4579 0.4368
w = 0.9325
0.4921 0.4749 0.4799 0.4554
w = 0.9525
0.5355 0.5180 0.5339 0.5061
0.6221 0.6016 0.5598 0.5334
w = 0.9725
0.5845 0.5681 0.6059 0.5750
0.6353 0.6232 0.7097 0.6779
0.8016 0.7843 0.7376 0.7092
β
No monomorphic ESS
1
0.4543 0.4380 0.4373 0.4157
H
H
B
H
X
X H or X
0
High predation
w1
w2
X
B or X B or X
X
0
B
w3
Low predation
1w
Figure 7.6. Illustration of intrusive-population ESS as a function of survival probability w and ownership uncertainty β
values shows that this pattern is representative of what happens more generally, and the resultant dependence of the ESS on w and β is illustrated in Figure 7.6. As in Figure 7.5 several details of the true picture have been exaggerated or suppressed for the sake of clarity,19 but the overall topology is correct. If initially high predation is associated with low ownership uncertainty so that β is not much 19
Details are discussed at greater length elsewhere [223, pp. 418–423].
7.3. An iterated Hawk-Dove game
313
bigger than 0, and subsequently if predation decreases and the population tracks the ESS, then it will evolve from H to X. If (w, β) in Figure 7.6 lies well below the dashed line but above the β = 0 axis as it moves to the right, then it will stay at H after moving into the H-or-X region, because it was at H beforehand; it will shift to X in the X-only region; and it will stay at X in the B-or-X region, because it was at X beforehand. At larger β but still below the dashed line, the population will move through a region where no monomorphic ESS exists after shifting from X, but it will still eventually shift to X and stay there.20 Likewise, above the dashed line in Figure 7.6, the population will shift from H to B under decreasing predation, after first moving through a region where no monomorphism exists, and it will stay at B, because it was at B beforehand—but only as long as (w, β) remains to the left of the thick solid curve. For sufficiently high w, the ESS will again shift to X. Two broad conclusions emerge from our analysis of the IH-D. First, ownership uncertainty by itself can facilitate the emergence of the Bourgeois convention from a population of fighters under decreasing predation in the absence of owner advantage (Figure 7.6). For B to become the sole ESS, the value of β must be high enough for (w, β) to clear the cusp of the X-only region, yet the value of w must not so high as to shift (w, β) across the thick solid curve. The critical value that β must exceed can be shown to increase with σ or α and decrease with λ or (so that it decreases with the cost of fighting), but it is always positive. If mistakes over ownership are very rare, then a nonfighting population will still emerge under decreasing predation, but it will be X, not B. Second, owner advantage by itself can facilitate the emergence of Bourgeois from a population of fighters, again under decreasing predation but in the absence of ownership uncertainty (Figure 7.5). Regardless of whether a population is intrusive or unintrusive, however, if owner advantage is sufficiently low, then the final ESS of a population that evolves from H under decreasing predation will always be X, not B; only in an intrusive population, and only if owner advantage is sufficiently high, does B emerge instead. Thus our analysis suggests that if strategy X is rare in nature whenever a contested resource is of value only if it can be held for a long time [188, p. 96], then X may be rare because owner advantage is almost always high enough to shift an ESS of H to an ESS of B under conditions of decreasing predation, and that if the behavior observed by Burgess [54] in Oecobius civitas 20 Within the region where no monomorphism exists, the ESS is a polymorphism of H and X below the dashed line, with a lower proportion of X at lower values of w and a higher proportion of X at higher values. For example, β = 0.2 lies below the dashed line when σ = 0.8, = 0.1, λ = 0.6, and α = 0.5 (with ρ = 0). For this value of β, the polymorphic segment extends from w ≈ 0.8873 to w ≈ 0.8954, and the ESS determined by the replicator equations (2.108) with A in (2.106) is, e.g., about 17% X with 83% H at w = 0.8893, 38% X with 62% H at w = 0.8914 and 65% X with 35% H at w = 0.8934. Above the dashed line in the region where no monomorphism exists, the ESS is a polymorphism of H and X at lower values of w, but shifts to a polymorphism of H and B at higher values of w, with the proportion of B increasing as w nears the B-only region. 
Suppose, for example, that w = 0.8925, as in the upper right-hand corner of Table 7.13. We see from the reward matrix that H will then be infiltrated by X, not B, because a31 > a11 , whereas a21 < a11 ; the resultant mixture determined by (2.108) is approximately 97% H and 3% X. For this value of β, namely, 0.4, the unshaded region stretches from w ≈ 0.890384 to w ≈ 0.902640. As w increases within it, the proportion of X at the ESS is never large (no more than about 16%), and at w ≈ 0.901236 the ESS shifts from a polymorphism of H and X to a polymorphism of H and B, in which the proportion of B increases with w. For example, it is about 16% B with 84% H at w = 0.9013, 41% B with 59% H at w = 0.9017, 66% B with 34% H at w = 0.9021 and 97% B with 3% H at w = 0.9026.
314
7. Discrete Population Games
is indeed strategy X, then it may have resulted under the same conditions from a rare instance of owner advantage being sufficiently low to shift an ESS of H to an ESS of X instead.
7.4. Commentary In this chapter we used three discrete population games to study topics in behavioral ecology and resource management, namely, mutualism in social foraging (§7.1), mutualism in wildlife conservation (§7.2), and competition for reproductive sites (§7.3); §7.1 is based on [219], §7.2 on [226] and §7.3 on [202, 223]. An aspect of dominance (§4.9) is implicit in the recruitment game of §7.1, because a bonanza’s discoverer may rise in status. This model focuses on immediate versus delayed recruitment, but social foraging games more often focus on producing (i.e., finding food) versus scrounging or kleptoparasitism (i.e., stealing food from others), in which case, a different aspect of dominance to consider is that individuals of high rank may coerce those of low rank into producing [18]. The definitive synthesis of social foraging theory is [108], with more recent contributions reviewed in [109]. Overlooked by both, however, is an alternative game-theoretic approach to kleptoparasitism based on using compartment models (like the one in §6.8) that begins with [48] and leads to [316]; see §17.5 of [49]. In terms of §5.7, the recruitment game showed that delayed recruitment to food bonanzas among juvenile ravens can arise mutualistically either as cooperation against the common enemy of adult ravens who may defend such bonanzas or as cooperation towards the common end of rising in social status (which is a property of the individual, albeit a relative one). Either way, cooperation is an incidental consequence of each individual’s selfish goal. In the first case, that goal is access to the bonanza. In the second case, that goal is acquiring prestige [369, pp. 125–150]. In neither case is there any direct feedback between specific individuals. In both cases, therefore, cooperation is mutualism [217, p. 270]. Despite that, cooperation via the status-enhancement effect in ravens is reminiscent of what has been called indirect reciprocity [6]. Here two remarks are in order. First, if one regards mutualism and reciprocity as the poles of a grand continuum, then the degree of feedback between cooperators should steadily increase between the first pole and the second; any behavior described as a form of reciprocity should be close to the second pole. Second, there is no central authority to act as supreme arbiter of names for subcategories. As a result, indirect reciprocity means different things to different people: either that a well-defined network of specific donors and recipients contains more than two individuals [41] or that individuals can enhance their status through generosity and be more generous to partners of higher status [6, 256]. The first interpretation is far closer to reciprocity, especially for a short network; and the second interpretation—which “does not require the same two individuals ever to meet again” [256, p. 573]—is so close to mutualism that, in one opinion, it “stretches the definition of reciprocity to the point of meaninglessness” [369, p. 149].
7.4. Commentary
315
Although game theory has long been applied to fisheries management [327], the enormous potential for game-theoretic modelling in other areas of wildlife conservation remains largely undeveloped. In this regard, models can be used for either of two purposes. The first is to predict the behavior of humans who exploit a wildlife resource, as in §7.2; for more recent work in this area, see [71, 156, 170, 279] and references therein. The second purpose is to predict the behavior of animals that are themselves the resource to be conserved [328, 329]; recent examples include [316, 366]. The iterated Hawk-Dove game of §7.3 provides a rationale for the evolution of a nonfighting population from a population of fighters through the adoption of a conventional strategy. More specifically, it provides a rationale for the adopted convention to be almost invariably Bourgeois, which respects ownership, and, correspondingly, for the opposite convention of anti-Bourgeois to almost never arise. This rationale (namely, that when predation is high, owner advantage will rarely be low—perhaps only in circumstances where sites are sheltered from predators and animals must venture farther afield to be preyed upon) is not, however, the basis of Maynard Smith’s view that X should be very rare. Rather, for him “the difficulty with strategy X is that it leads to a kind of infinite regress, because as soon as an owner loses a contest it becomes an intruder, and hence able to win its next contest” [188, p. 96]. He was therefore arguing, at least implicitly, that repeated costs of entry to and departure from a site would nullify its potential long-term value. Somewhat surprisingly, more than 30 years elapsed before the validity of his verbal reasoning was formally tested in [235]. The strategy set for this model is still {H, B, X, D} (effectively defined by Table 2.2, since ownership uncertainty is set to zero), but the parameter w in §7.3 becomes a fixed probability that the same two individuals meet again. The analysis reveals that, when w exceeds thresholds determined by the costs of taking and relinquishing ownership, B can become the only evolutionarily stable convention. Thus, on the one hand, the costs of infinite regress can facilitate the evolution of Bourgeois, on the other hand, there are always regions of parameter space where X is the only (monomorphic) ESS. So we cannot exclude the possibility that strategy X is at work in nature, albeit only rarely. As noted on p. 296, Maynard Smith certainly seemed to think that the behavior of Oecobius civitas is an example of X, although the evidence remains anecdotal [305, p. 1188]. The current consensus among empiricists is that although respect for ownership is widespread in nature, it is not as absolute as Bourgeois would imply; rather, it corresponds to a population strategy resembling a mix of B and H, whose proportion of B, denoted by χB , represents the degree of respect for ownership [305]. In [224] the analysis of [235] was extended to allow for polymorphic mixtures, and it identified a pathway through which any χB between 0 and 1 can evolve from a population of Hawks under increasing fighting costs and increasing w, provided only that the conveyance costs of entry √ to or departure from a site (scaled with 2 (7 − 4 2) ≈ 0.158. Collectively, these two models respect to site value) exceed 17 [224, 235] and the IH-D of §7.3 go a long way towards establishing a rationale for the pattern of respect for ownership that we observe in nature. 
Yet challenges remain [305, p. 1198], and not all aspects of territorial contests can be modelled with a discrete strategy set. For example, each of a pair of animals may discover a
316
7. Discrete Population Games
site and develop an attachment to it for a significant period of time without being aware of the other’s presence; and the question of how their residency duration affects their rival claims to ownership requires a continuous game. Complementary approaches to addressing the question are developed in [236]. Discrete population games may be n-player games in the sense that individuals interact within groups or populations of finite size n,21 on which the ESS depends; however, they are not n-player games in the sense of Chapter 1, where each of n individuals in a community game is allowed a different strategy. On the contrary, there is always a sense in which population games are effectively two-player games, because they are always interactions between a population strategy and a potential mutant strategy. Nevertheless, the scope of discrete population games is extensive, and it remains relatively unexplored. For a more advanced treatment of effects of population size, see, e.g., [251] or Chapter 12 of [49].
Exercises 7 1. Verify (7.5). 2. (a) Verify (7.7). (b) Verify (7.10) and (7.11). (c) Verify (7.13) and (7.14). 3. How must the analysis of §7.1 be modified if there is both a status-enhancement and a posse effect? 4. (a) How must Tables 2.3 and 7.6 for Owners and Intruders be modified to allow for owner advantage ρ = 2μ − 1 (> 0)? (b) How must the ESS analysis be modified? 5. (a) How must the reward matrix in Table 2.3 be modified to instead allow for ownership uncertainty β? (b) How must the ESS analysis be modified? for A defined by (7.70). 6. (a) Show that a14 = 1−w(1−) (b) Verify that this result agrees with Tables 7.10–7.12. 7. Why does a41 , the IH-D fitness of a Dove in a Hawk population, increase (slightly) with ρ between Table 7.10 and Table 7.11?
21
With n replaced by N + 1 in §§5.4–5.6 or §7.1 and n + 1 in §7.2.
Chapter 8
Triadic Population Games
Many strategic interactions take place within social networks, such as when male fiddler crabs or male fireflies advertise themselves to females [42, p. 9]. Even for contests between two individuals, a network of potential observers introduces new strategic elements. In particular, neighbors can eavesdrop on contests between other individuals to obtain information that is useful for later contests, animals can advertise their fighting prowess to discourage challengers, and either of two contestants can enlist the help of a third individual in fighting over a resource. Because triads are both the simplest groups in which such network effects can be studied and the groups beyond dyads in which analysis of population games is most likely to be tractable, especially when allowing for intrinsic variation among individuals, strategic models of triadic interaction—triadic games—have an important role to play in the study of animal behavior. Here we demonstrate how such models can illuminate a variety of behavioral phenomena within animal networks. All of our models are continuous population games as defined in Chapter 2. Two kinds of continuous triadic game have proven especially amenable to analysis; in both cases, strength is assumed to be continuously distributed on [0, 1] with probability density function g. In the first kind of game, which we call Type I, strategies are intensities—the payoff to an individual depends on its own strength and strategy, but not those of the other triad members—and the set of all possible outcomes from the triadic interaction has a discrete probability distribution for every conceivable strategy combination (u, v). Let there be K such outcomes in all, let wi (u, v) be the probability associated with outcome i, let Pi (u, X) be the corresponding payoff to the focal individual, where X ∈ [0, 1] is its strength, and let f (u, v) be its reward. Then
"1 ' K K # # wi (u, v) Pi (u, x) g(x) dx , wi (u, v) = 1. (8.1) f (u, v) = i=1
0
i=1
317
318
8. Triadic Population Games
In the second kind of game, which we call Type II, strategies are behavioral thresholds (e.g., for aggression) and the sample space [0, 1] × [0, 1] × [0, 1] of the triad’s three strengths—assumed to be independent—can be decomposed into a finite number K of mutually exclusive events, for all (u, v). Let Ωi (u, v) be the ith such event, and let Pi (X, Y, Z) be the corresponding payoff to the focal individual, where Y and Z are the other two strengths in the triad. Then """ K # Pi (x, y, z)g(x)g(y)g(z) dx dy dz. (8.2) f (u, v) = i=1
(x,y,z) ∈ Ωi (u,v)
The principal virtue of both Type I and Type II games is that they make it possible to construct purely analytical models of triadic interactions. Nevertheless, for tractability, judicious use of further simplifying assumptions may still be necessary. In particular, in a Type I model, zero variance may be assumed (as illustrated by §8.2), then both Pi (u, X) and the integral in curly brackets in (8.1) reduce to Pi (u). We must have variance in a Type II model, because otherwise behavioral thresholds are inoperable; however, we may instead make other simplifying assumptions, e.g., that strength difference is a perfectly reliable predictor of fight outcome.1 It could be argued that zero variance and infinite reliability are unrealistic; but results that rely on these assumptions should still yield an excellent approximation for populations in which variance is low or reliability is high, respectively. Furthermore, such additional assumptions are by no means always necessary. For example, a Type I model with positive variance is discussed in the following section, and a Type II model with finite reliability is discussed in §8.3. We now proceed with our examples of triadic games.
8.1. Winner and loser effects We often speak of someone having a winning streak or being “on a roll” after a string of consecutive successes. It seems that if you win today, then you have a higher probability of winning tomorrow (though you could still lose), and similarly for a losing streak where if you lose today, then you have a higher probability of losing tomorrow (though you could still win). We will refer to these effects as winner or loser effects, or collectively as prior-experience effects. Winner and loser effects have been observed in laboratory experiments with, e.g., beetles, crayfish, crickets, lizards, rats, snakes, spiders, some birds, and various fish [215, p. 41], as well as in humans [259]. Some of these experiments have produced only a loser effect, and some have produced both a loser and a winner effect, but only one experiment has ever produced a winner effect without a loser effect [112]. Why? Here we attempt to make sense of these observations with a Type I triadic game, having K = 22 in (8.1). Consider a game between animals who interact in triads chosen randomly from a large population. An interaction consists of three pairwise contests. The outcome of each contest is determined by the difference between the contestants’ perceptions of their strengths, which are known only to themselves, and may be revised in light 1
As illustrated by [229], as well as by §6.9, albeit in the context of dyadic games.
8.1. Winner and loser effects
319
of experience. The cost of a contest is determined by the extent to which an animal overestimates its strength, that is, by the difference between its actual strength— which we assume it does not know—and its perception of that strength. Both perceived and actual strengths are assumed to be numbers between 0 (weakest) and 1 (strongest); actual strengths are drawn at random from a continuous distribution on [0, 1] with probability density function g and cumulative distribution s function G (so that G(s) = 0 g(ξ) dξ, as in §2.6). Although there is no direct assessment of strength—i.e., in terms of §6.2 there is neither self-assessment nor mutual assessment—animals can still respond to the distribution of strength among the population at large. In general, the greater an animal perceives its strength, the harder it fights, and hence the greater its probability of winning; however, as remarked above, it is costly for an animal to overestimate his strength, and the cost increases with the magnitude of the overestimate. Thus the fundamental tradeoff to be captured by our model is between the benefits to an animal of raising its strength perception above that of its opponent and the costs of overestimating its actual strength. These costs are assumed to arise principally from excessive depletion of energy reserves when strength is overestimated. They could also, in principle, arise partially from increased risk of injury with long-term consequences; however, we assume that an animal’s actual strength does not change between contests, so that the short-term consequences of any such injury would have to be negligible. Specifically, let an animal whose strength perception is S1 defeat an opponent whose strength perception is S2 with probability (8.3)
W (S1 , S2 ) =
1 2
+ 12 (S1 − S2 ).
Note that W increases with S1 − S2 , W (1, 0) = 1, W (S, S) = 12 , and W (0, 1) = 0: animals with equal strength perceptions are equally likely to win a fight, and a perception of maximum strength is guaranteed to defeat one of minimum strength. Moreover, if L(S1 , S2 ) denotes the probability that an animal with strength perception S1 loses against an opponent with strength perception S2 , then (8.4)
L(S1 , S2 ) = W (S2 , S1 ) =
1 2
+ 12 (S2 − S1 ).
For an individual whose strength is X but whose perception of it is S, let k0 denote the fixed cost of fighting per contest, and let r be the maximum variable cost (which would be paid by the worst possible fighter if it perceived itself as best). Then the animal’s contest cost, denoted by c(S, X), is k0 when S ≤ X but increases with respect to S − X when S > X; moreover, c(S, S) = k0 and c(1, 0) = k0 + r. For the sake of simplicity, we satisfy these conditions by taking c(S, X) = k0 + r(S − X) if S > X, so that r becomes the marginal cost of overestimating strength perception (with respect to the magnitude of that overestimate). Every contestant pays the fixed cost k0 : this parameter reduces each animal’s excess over that of the gamma individual by the same amount 2k0 , and hence has no strategic effect. So, without loss of generality, we set k0 = 0:
0 if S ≤ X, (8.5) c(S, X) = r(S − X) if S > X.
320
8. Triadic Population Games
Table 8.1. Possible outcomes for a focal individual: F; O1 and O2 are its first and second opponents, respectively, and parentheses indicate a contest in which it is not involved.
case 1st k 1 F F 2 F O1 3 F 4 O1 5 6 O1 O1 F 7 F 8 9 F O1 10
winners 2nd 3rd F (O1) F (O2) O2 (O2) F (O1) O2 (O1) F (O2) O2 (O1) O2 (O2) (O1) F (O2) F (O2) O2 (O1) F
case k 11 12 13 14 15 16 17 18 19 20 21 22
winners 1st 2nd F (O1) O1 (O2) O1 (O1) O1 (O2) (O1) F (O2) F (O1) O1 (O2) F (O1) F (O2) O1 (O1) O1 (O2) O1
3rd O2 F O2 O2 F F F O2 O2 F O2 O2
A strategy in this game consists of a triple of numbers, an initial strength perception and a pair of revised perceptions, one to adopt in the event of a win and another to adopt in the event of a loss. All are numbers between 0 and 1. For the population, let v0 be initial strength perception, let v1 be the level to which strength perception rises after a win, and let v2 be the level to which strength perception falls after a loss. Then the population’s strategy is a three-dimensional vector v = (v0 , v1 , v2 ) satisfying (8.6)
0 ≤ v2 ≤ v0 ≤ v1 ≤ 1.
Correspondingly, let u = (u0 , u1 , u2 ) denote the strategy of a focal individual who is a potential mutant and changes its initial strength perception of u0 to u1 after a win but to u2 after a loss, where (8.7)
0 ≤ u2 ≤ u0 ≤ u1 ≤ 1.
Both of the focal individual’s opponents play the population strategy v. Thus a triad consists of a focal u-strategist and a pair of v-strategists. All orders of interaction within a triad are assumed to be equally likely, and no animal observes the outcome of the contest between the other two individuals. After a win, a v-strategist’s probability of a win against an opponent with the same initial perception increases from W (v0 , v0 ) = 12 to W (v1 , v0 ) = 12 + 12 (v1 − v0 ), i.e., by 12 (v1 − v0 ). When v1 > v0 , we will say that there is a winner effect of magnitude v1 − v0 (omitting the factor 12 for the sake of simplicity). Correspondingly, when v0 > v2 there is a loser effect of magnitude v0 − v2 , because the v-strategist’s probability of a loss against an opponent with the same initial perception increases by L(v2 , v0 ) − L(v0 , v0 ) = 12 (v0 − v2 ) by (8.4). Because there are eight possible ways in which any set of three pairwise contests can be won or lost and three possible orders of interaction between a u-strategist and two v-strategists, there are 24 possible outcomes in all; see Table 8.1 where four of
8.1. Winner and loser effects
321
the outcomes have been combined in pairs to yield a modified total of 22. An animal is called naive if it has yet to engage in a contest, and otherwise it is experienced. If the focal individual faces two naive individuals, then its reward is denoted by fnn (u, v); if it faces a naive individual followed by an experienced individual, then its reward is fne (u, v); and if it faces two experienced individuals, then its reward is fee (u, v). These three possibilities are equally likely, by assumption. So (8.8)
f (u, v) =
1 3
fnn (u, v) +
1 3
fne (u, v) +
1 3
fee (u, v)
is the unconditional reward to a u-strategist in a population of v-strategists. There are two possible outcomes for the triad overall as a result of this interaction. The first is that one animal wins twice, implying that another loses twice and the third both wins and loses. In this case, there is a perfectly linear dominance hierarchy (p. 167) with the double winner or alpha male on top, the double loser or gamma male at the base, and the remaining contestant or beta male at an intermediate rank. The second possibility is a circular arrangement, which arises if each animal both wins and loses. In the first case, we assume that the alpha male’s reproductive benefits from its position exceed those of the gamma male by 1 unit, and that the beta male’s benefits exceed those of the gamma male by b units, where 0 ≤ b < 1. In the second case, we assume that the total excess of 1 + b is divided equally among the triad. Thus b is an inverse measure of reproductive inequity. The middle rank of a linear hierarchy yields a higher benefit than an equal share from a circular triad if 12 < b < 1 but a lower benefit if 0 ≤ b < 12 ; and if b = 12 , then additional benefits are directly proportional to number of contests won. We now discuss each interaction order in turn. Let φnn (u, v, X) denote the payoff to a u-strategist with strength X when both opponent v-strategists are naive, so that each has strength perception v0 . The u-strategist’s strength perception for its opening contest is u0 , and so it incurs a fighting cost c(u0 , X). It wins with probability W (u0 , v0 ), and it loses with probability L(u0 , v0 ) = W (v0 , u0 ). For the subsequent contest, the u-strategist’s perception will increase from u0 to u1 following a win, but decrease from u0 to u2 following a loss. After a win, it wins again with probability W (u1 , v0 ), but it loses with probability L(u1 , v0 ); in either case, the cost is c(u1 , X). After a loss, it loses again with probability L(u2 , v0 ), but it wins with probability W (u2 , v0 ); in either case, the cost is c(u2 , X). The u-strategist becomes the alpha individual, with payoff 1 − c(u0 , X) − c(u1 , X) if it wins both contests, i.e., with probability W (u0 , v0 )W (u1 , v0 ); see Table 8.2 (Case 1). Similarly, the u-strategist becomes the gamma individual, with payoff 0−c(u0 , X)− c(u2 , X), if it loses both contests, i.e., with probability L(u0 , v0 )L(u2 , v0 ); again see Table 8.2 (Case 6). In either of these cases (i.e., if the focal individual either wins or loses twice), the outcome of the third contest does not affect its payoff. If the focal individual is both a winner and a loser, however, then the remaining contest determines whether it occupies the middle rank in a linear hierarchy or belongs to an egalitarian triad. Each of these outcomes can arise in two ways. The u-strategist will occupy the middle rank of a linear hierarchy if it wins before it loses and its second opponent prevails over its first (Table 8.2, Case 2), or if it loses before it wins and its first opponent prevails over its second (Case 3). The benefit is b in either case but the costs differ, as shown in Table 8.2. Similarly, the u-strategist will belong to
322
8. Triadic Population Games
Table 8.2. Payoff to focal individual, conditional on Cases 1–6: Two naive opponents Cases 7–14: A naive and an experienced opponent Cases 15–22: Two experienced opponents
case k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
probability ωk (u, v) W (u0 , v0 ) W (u1 , v0 ) W (u0 , v0 ) L(u1 , v0 ) W (v1 , v2 ) L(u0 , v0 ) W (u2 , v0 ) W (v1 , v2 ) W (u0 , v0 ) L(u1 , v0 ) W (v2 , v1 ) L(u0 , v0 ) W (u2 , v0 ) W (v2 , v1 ) L(u0 , v0 ) L(u2 , v0 ) W (u0 , v0 ) W (v2 , v0 ) W (u1 , v2 ) W (u0 , v0 ) W (v0 , v2 ) W (u1 , v1 ) W (u0 , v0 ) W (v0 , v2 ) L(u1 , v1 ) L(u0 , v0 ) W (v1 , v0 ) W (u2 , v2 ) W (u0 , v0 ) W (v2 , v0 ) L(u1 , v2 ) L(u0 , v0 ) W (v0 , v1 ) W (u2 , v1 ) L(u0 , v0 ) W (v1 , v0 ) L(u2 , v2 ) L(u0 , v0 ) W (v0 , v1 ) L(u2 , v1 ) W (v0 , v0 ) W (u0 , v1 ) W (u1 , v2 ) W (v0 , v0 ) W (u0 , v2 ) W (u1 , v1 ) W (v0 , v0 ) L(u0 , v1 ) W (u2 , v2 ) W (v0 , v0 ) W (u0 , v2 ) L(u1 , v1 ) W (v0 , v0 ) W (u0 , v1 ) L(u1 , v2 ) W (v0 , v0 ) L(u0 , v2 ) W (u2 , v1 ) W (v0 , v0 ) L(u0 , v1 ) L(u2 , v2 ) W (v0 , v0 ) L(u0 , v2 ) L(u2 , v1 )
payoff Pk (u, X) 1 − c(u0 , X) − c(u1 , X) b − c(u0 , X) − c(u1 , X) b − c(u0 , X) − c(u2 , X) 1 (1 + b) − c(u0 , X) − c(u1 , X) 3 1 (1 + b) − c(u0 , X) − c(u2 , X) 3 −c(u0 , X) − c(u2 , X) 1 − c(u0 , X) − c(u1 , X) 1 − c(u0 , X) − c(u1 , X) b − c(u0 , X) − c(u1 , X) b − c(u0 , X) − c(u2 , X) 1 (1 + b) − c(u0 , X) − c(u1 , X) 3 1 3 (1 + b) − c(u0 , X) − c(u2 , X) −c(u0 , X) − c(u2 , X) −c(u0 , X) − c(u2 , X) 1 − c(u0 , X) − c(u1 , X) 1 − c(u0 , X) − c(u1 , X) b − c(u0 , X) − c(u2 , X) b − c(u0 , X) − c(u1 , X) 1 (1 + b) − c(u0 , X) − c(u1 , X) 3 1 (1 + b) − c(u0 , X) − c(u2 , X) 3 −c(u0 , X) − c(u2 , X) −c(u0 , X) − c(u2 , X)
an egalitarian triad it wins before it loses and its first opponent prevails over its second (Case 4), or if it loses before it wins and its second opponent prevails over its first (Case 5). The benefit is now 13 (1 + b) in either case, and the costs again differ, as shown in Table 8.2. Let ωk (u, v) be the probability of Event k in Table 8.2, and let Pk (u, X) be the associated payoff to the focal individual. Then (8.9)
φnn (u, v, X) =
6 #
ωk (u, v) Pk (u, X),
k=1
and integration over the sample space of X yields " 1 (8.10) fnn (u, v) = E [φnn (u, v, X)] = φnn (u, v, x)g(x) dx 0
in (8.8), where E denotes expected value. A similar analysis yields the payoff φne (u, v, X) or φee (u, v, X) to a u-strategist against a naive opponent and an experienced one, or against two experienced opponents. The relevant expressions are obtained from Table 8.2 and analogues of (8.9)
8.1. Winner and loser effects
323
Table 8.3. Reward function coefficients
72J0 (v) = (1 − 2b) 2{2v1 − 2v2 + 3}v02 + 7v1 v2 − (2 − b)(1 − v0 )v22 + {24 + (1 + v0 )v12 }(1 + b) − 12{v0 + (1 − b)v1 + bv2 } − 3{v1 + v2 + (v1 − v2 + v1 v2 + 2)v0 } 72J1 (v) = 3(1 − 2v0 ) − (1 − 2b)v0 (v0 + v1 − 2v2 + 3) − (v1 − v2 )(1 + b) + 6(1 − b)(2 − v2 ) 72J2 (v) = (1 − 2b)v0 (v0 − 2v1 + v2 + 3) + (v2 − v1 )(2 − b) + 6b(2v0 + v1 + 2) + 3 72K0 (v) = 18 − (1 + b)v12 + 3v1 v2 − (2 − b)v22 − (1 − 2b)(2v0 {2v1 − 2v2 + 3} + 3{v1 + v2 }) 72K1 (v) = (1 − 2b)(6 + v0 + v1 − 2v2 ) + 9 72K2 (v) = (1 − 2b)(6 − v0 + 2v1 − v2 ) − 9 in which k is summed between 7 and 14 for a naive and an experienced opponent and between 15 and 22 for two experienced opponents. Then fne (u, v) and fee (u, v) are calculated by analogy with (8.10), and, on using W (v0 , v0 ) = 12 , it follows from (8.8) that the reward f (u, v) to a u-strategist against v-strategists is # # f (u, v) = 13 ωk (u, v) + 13 b ωk (u, v) k∈N1
k∈N2
+ 19 (1 + b)
# k∈N3
(8.11) −
1 6
"
ωk (u, v) −
1
c(u0 , x) g(x) dx 0
" "
−
1 6
1
{4W (u0 , v0 ) + W (u0 , v1 ) + W (u0 , v2 )}
c(u1 , x) g(x) dx 0
1
{4L(u0 , v0 ) + L(u0 , v1 ) + L(u0 , v2 )}
c(u2 , x) g(x) dx, 0
where the index sets N1 , N2 , and N3 are defined by N1 = {1, 7, 8, 15, 16}, N2 = {2, 3, 9, 10, 17, 18},
(8.12)
N3 = {4, 5, 11, 12, 19, 20}. It now follows from (8.5) that f (u, v) = J0 (v) + J1 (v)u1 + J2 (v)u2 (8.13)
+ {K0 (v) + K1 (v)u1 + K2 (v)u2 } u0 − rH(u0 ) − −
1 12 r {6 1 12 r {6
+ 6u0 − 4v0 − v1 − v2 } H(u1 ) − 6u0 + 4v0 + v1 + v2 } H(u2 )
with J0 , J1 , J2 , K0 , K1 , and K2 defined by Table 8.3 and H by " s " s (8.14) H(s) = (s − ξ) g(ξ) dξ = G(ξ) dξ. 0
0
The gradient of f is also readily calculated, and its components are recorded in Table ∂f (v0 , v1 , v2 ) 8.4. For u = v the expressions reduce to those in Table 8.5, where ∂u i ∂f denotes ∂ui u=v for i = 0, 1, 2.
324
8. Triadic Population Games
Table 8.4. The gradient of f ∂f ∂u0 ∂f ∂u1 ∂f ∂u2
= K0 (v) + K1 (v)u1 + K2 (v)u2 − rG(u0 ) − 12 r {H(u1 ) − H(u2 )} = J1 (v) + K1 (v)u0 −
1 12 r {6
+ 6u0 − 4v0 − v1 − v2 } G(u1 )
= J2 (v) + K2 (v)u0 −
1 12 r {6
− 6u0 + 4v0 + v1 + v2 } G(u2 )
Table 8.5.
The gradient of f , evaluated for u = v
{6(1 + v1 ) − (1 − 2b)(v1 − v2 + 2)v0 + (v1 − v2 )v2 } 1 − 24 {2(1 + b) + b(v1 − v2 )}(v1 + v2 ) − 12 r{H(v1 ) − H(v2 ) + 2G(v0 } ∂f 1 ∂u1 (v0 , v1 , v2 ) = 72 {6(1 − b)(v0 − v2 + 2) − (1 + b)(v1 − v2 ) + 3} 1 − 12 r(6 − v1 − v2 + 2v0 )G(v1 ) ∂f 1 ∂u2 (v0 , v1 , v2 ) = 72 {6b(v1 − v0 + 2) − (2 − b)(v1 − v2 ) + 3} 1 − 12 r(6 + v1 + v2 − 2v0 )G(v2 ) ∂f ∂u0 (v0 , v1 , v2 )
=
1 24
All of these results hold for an arbitrary distribution of actual strength, but such generality allows only limited further progress. Accordingly, we assume henceforward that strength has a uniform distribution—i.e., g(ξ) = 1, as in (2.23)—and hence that (8.15)
G(s) = s,
H(s) =
1 2 2s ,
0 ≤ s ≤ 1.
More realistic, nonuniform distributions are discussed in [209]. To any population strategy v, the best reply—u∗ , say—is u that maximizes f (u, v) subject to (8.7). This best reply is unique, because Table 8.4 implies ∂ 2 f /∂u20 = −rg(u0 ), ∂ 2 f /∂u21 = −r(6+6u0 −4v0 −v1 −v2 )g(u1 )/12, and ∂ 2 f ∂u22 = −r(6 − 6u0 + 4v0 + v1 + v2 )g(u2 )/12, all of which are negative in the interior of the constraint set. If u∗ = v, then v is a (strong) ESS. Note, however, that the uniqueness of u∗ does not by itself imply a unique ESS (Exercise 8.3), although for a uniform distribution there is indeed a unique ESS. We now proceed to discover its dependence on b and r. The ESS is most readily found by applying the Kuhn–Tucker conditions of constrained optimization theory; see, e.g., [175, Chapter 10]. Adapted to the purpose at hand, these conditions state that if u∗ maximizes f subject to constraints of the form hi (u) ≥ 0, i = 0, . . . , 3, then there must exist nonnegative Lagrange multipliers λi , i = 0, . . . , 3, such that the Lagrangian L = f + 3i=0 λi hi (u) satisfies ∇L = (0, 0, 0) and λi hi (u∗ ) = 0, i = 0, . . . , 3, with the gradient evaluated at u = u∗ . If these conditions are satisfied with hi (u∗ ) = 0, then we will say that Constraint i is active. Using (8.7) to write h0 (u) = 1 − u1 , h1 (u) = u1 − u0 , h2 (u) = u0 − u2 , and h3 (u) = u2 , it follows at once that u∗ maximizes f subject to 0 ≤ u2 ≤ u0 ≤ u1 ≤ 1
8.1. Winner and loser effects
325
if nonnegative λ0 , λ1 , λ2 , and λ3 exist such that ∂f (8.16a) − λ1 + λ2 = 0, ∂u0 u=u∗ ∂f (8.16b) − λ0 + λ1 = 0, ∂u1 u=u∗ ∂f (8.16c) − λ2 + λ3 = 0 ∗ ∂u2
u=u
and λ0 (1 − u∗1 ) = 0
(8.17a)
λ1 (u∗1 λ2 (u∗0
(8.17b) (8.17c) (8.17d)
− u∗0 ) − u∗2 ) λ3 u∗2
(Constraint 0),
= 0
(Constraint 1),
= 0
(Constraint 2),
= 0.
To determine the ESS, we must solve these equations for u∗ = v.2 We first establish that there is no ESS with v2 = 0. For suppose that v2 = 0. 1 Then λ2 − λ3 = 72 {3 − 2v1 + (12 − 6v0 + 7v1 )b} is positive from (8.16c) and Table 8.5; hence λ2 is positive, because λ3 ≥ 0. So Constraint 2 can be satisfied with u∗ = v only if also v0 = 0, in which case it follows from Table 8.5 and 1 {2(2v1 + 3) − (2 + v1 )bv1 } − 14 rv12 and λ0 − λ1 = (8.16a)–(8.16b) that λ1 − λ2 = 24 1 1 72 {12(1 − b) − (1 + b)v1 + 3} − 12 rv1 (6 − v1 ), the second of which implies λ0 > 0 and contradicts (8.17a) if v1 = 0; hence v1 > 0, requiring λ1 = 0, by (8.17b). Now, from λ2 > 0, λ1 = 0, and λ0 ≥ 0, we require 2(2v1 + 3) − (2 + v1 )bv1 < 6rv12 and 12(1 − b) − (1 + b)v1 + 3 ≥ 6rv1 (6 − v1 ), implying 3(1 + b)v12 > bv13 + 3v1 + 36, which is clearly false. So we must have v2 > 0 at the ESS, i.e., the last constraint is never active. It follows from (8.17d) that λ3 = 0, and hence from (8.16) that ∂f (v0 , v1 , v2 ) + ∂u0 ∂f = (v0 , v1 , v2 ) + ∂u0 ∂f = (v0 , v1 , v2 ) ∂u2
(8.18a)
λ0 =
(8.18b)
λ1
(8.18c)
λ2
∂f ∂f (v0 , v1 , v2 ) + (v0 , v1 , v2 ), ∂u1 ∂u2 ∂f (v0 , v1 , v2 ), ∂u2
for any ESS v. Next we establish that there is no ESS of the form v = (v0 , v1 , v0 ) with v1 > v0 . For suppose that such an ESS exists. Then (8.17b) implies λ1 = 0, so that (8.15), (8.18), and Table 8.5 imply (8.19a) {10 − 6r(3v1 + 4v0 ) + b}(v1 − v0 ) + 3(7 − 36rv0 + 4b) = 3b(v1 − v0 )2 , (8.19b)
λ0 =
(8.19c)
λ2 =
1 24 (5 1 24 (1
− 12rv1 − 4b) + − 12rv0 + 4b) +
1 72 (6rv1 1 72 (7b −
− 1 − b)(v1 − v0 ), 6rv0 − 2)(v1 − v0 ).
If v1 < 1, then (8.17a) implies that λ0 = 0. Thus v0 is determined by (8.19a) if v1 = 1, whereas v0 and v1 are jointly determined by (8.19a) and (8.19b) if v1 < 1. 2 The following calculations are rather complicated; readers who would prefer to take the outcome on trust are advised to skip ahead to p. 330.
326
8. Triadic Population Games
1 Figure 8.1. Variation of ζ = min(λ0 , λ2 ) with r and b for 36 (4b + 7) ≤ r ≤ 1 (10b + 31) when v = (v , 1, v ). Contours are drawn (from right to left) 0 0 18 for ζ = −0.87, . . . , −0.07, . . . , −0.01, with the difference in heights between contours decreasing from 0.1 to 0.02 at the ninth. The dot represents the global maximum at (r, b) = 14 , 12 , where ζ = 0.
First suppose that v1 = 1. Then, from Exercise 8.2, (8.19a) has a solution satisfying 0 < v0 < 1 only if 1 36 {4b
(8.20)
+ 7} < r
v1 = v0 = v2 1 = v1 = v0 > v2 1 = v1 > v0 > v2 1 > v1 > v0 > v2
loser effect? winner effect? No No No No Yes No Yes Yes Yes Yes
0 < v1 − v0 < 1 if 36r + 13b < 14. Whenever 0 < v0 < v1 < 1 is satisfied, however, the corresponding value of λ2 is readily obtained from (8.19c), and so the dependence of v0 , v1 , and λ2 on b and r can be determined. It is found that λ2 ≥ 0, v1 > v0 is never satisfied: as illustrated by Figure 8.2, λ2 is always negative if b < 12 , and v1 − v0 is always negative if b > 12 . It follows that no ESS of the form v = (v0 , v1 , v0 ) with v1 > v0 can exist. A similar analysis demonstrates that an ESS of the form v = (v0 , v0 , v2 ) with v0 > v2 is possible only if v0 = 1. For if v0 < 1, then neither Constraint 0 nor Constraint 2 is active, requiring λ0 = 0 = λ2 . These equations determine v0 and ∂f (v0 , v0 , v2 ) is determined by setting v1 = v0 v2 , the corresponding value of λ1 = ∂u 0 in Table 8.5, and it is found that λ1 < 0 if b < 12 and v0 − v2 < 0 if b > 12 whenever v2 < v0 < 1. So λ1 ≥ 0, v2 < v0 < 1 can never be satisfied. The upshot is that the only possible ESSs are of the types defined in Table 8.6. We discuss each type in turn. At a Type I ESS there is neither a winner nor a loser effect, i.e., v = (v0 , v0 , v0 ), and so Constraints 1 and 2 are both active, with λ1 , λ2 ≥ 0. From Table 8.5 with v1 = v0 = v2 , (8.18) reduces to (8.24)
(λ0 , λ1 , λ2 ) =
1 − 4rv0 7 − 36rv0 + 4b 1 − 12rv0 + 4b , , . 2 24 24
328
8. Triadic Population Games
Figure 8.3. Type of ESS as a function of r and b. See Table 8.6 for definitions of types.
At an ESS of Type Ia, where v0 = 1, Constraint 0 is also active, with λ0 ≥ 0. Thus, from (8.24) with v0 = 1, r≤
(8.25)
1 12
min(1 + 4b, 3).
At an ESS of Type Ib, v0 < 1, so that (8.17a) implies λ0 = 0. Then it follows from (8.24) that r ≥ 14 , b ≥ 12 , and (8.26)
v0 =
1 . 4r
The Type I region of the r-b plane is shaded in Figure 8.3. At a Type II ESS, a loser effect exists without a winner effect; Constraints 0 and 1 are active, but (8.17c) is satisfied with λ2 = 0, so that (8.15), (8.18), and Table 8.5 with v1 = 1 = v0 imply (8.27a)
6rv2 (5 + v2 ) = (2 − b)v2 + 13b + 1,
(8.27b)
λ0 =
41 − 10b − 5(1 + b)v2 − 3(1 − b)v22 − 72
λ1 =
1 24 {7
1 12 (7
− v2 )r − 14 (5 − v22 )r,
and (8.27c)
− v22 + (3 − v2 )(1 − v2 )b} − 14 (5 − v22 )r.
From Exercise 8.2, (8.27a) has a solution satisfying 0 < v2 < 1 only if 12r > 4b + 1, and the only such solution is 1 (8.28) v2 = (30r + b − 2)2 + 24(1 + 13b)r − 30r − b + 2 . 12r
− 3r Because (8.27b) implies that λ0 < 41−10b 72 2 for 0 < v2 < 1, (8.28) can satisfy 41−10b 1 λ0 ≥ 0 only if r < 108 . Thus, if b ≥ 2 , then the only possibility for λ0 ≥ 0 is16at a point in the triangle in the r-b plane with vertices at 14 , 12 , 13 , 12 , and 29 92 , 23 . But substitution of (8.28) into (8.27b)–(8.27c) yields expressions for λ0 and λ1 in terms of b and r alone, from which it is readily established that λ0 < 0 throughout the triangle (Exercise 8.2). It follows that a Type II ESS exists only if b < 12 . Although the equations of the curves on which λ0 = 0 and λ1 = 0—which intersect at 14 , 12 —can in principle be found analytically, the resultant expressions are too unwieldy to be of practical use. Instead, therefore, we proceed numerically. We find that λ0 > λ1 for b < 12 , and so the right-hand boundary of the region in which
8.1. Winner and loser effects
329
(1, 1, v2 ) is an ESS for v2 < 1 is determined by λ1 = 0 (Exercise 8.2). In Figure 8.3 it is the solid curve joining 14 , 12 to (r2 , 0), where r2 ≈ 0.233 is the larger positive root of the quartic equation 43200r 4 − 23040r 3 + 4332r 2 − 344r + 9 = 0. To the right of the Type II region there exists both a winner and a loser effect, i.e., v1 > v0 > v2 , so that (8.17b) and (8.17c) imply λ1 = 0 = λ2 . The ESS is of Type IIIa if v1 = 1. Then the equations 6r{4v0 − v22 + 1} + {2(1 + b) + b(1 − v2 )}(1 + v2 )
(8.29)
+ (1 − 2b)(3 − v2 )v0 = (1 − v2 )v2 + 12 and 6r(7 − 2v0 + v2 )v2 + (2 − b)(1 − v2 ) = 6b(3 − v0 ) + 3
(8.30)
jointly determine v0 and v2 , and (v0 , 1, v2 ) is an ESS as long as (8.31) λ0 =
1 12 {(v0
− v2 + 2)(1 − b) − (2v0 − v2 + 5)r} −
1 72 (1
− v2 )(1 + b) +
1 24
is nonnegative. As in the case of a Type II ESS, for any b there is an upper bound on r, which is found by solving (8.29) and (8.30) simultaneously with λ0 = 0 for v0 , v2 , and r. The resultant boundary is sketched in Figure 8.3.
Figure 8.4. Evolutionarily stable strategy v = (v0 , v1 , v2 ) as a function of b for (a) r = 0.2, (b) r = 0.4, (c) r = 0.6 and (d) as a function of r for b = 0.05; v0 is shown dashed, v1 and v2 (where different from v0 ) are shown solid. A loser effect (L) of magnitude v0 − v2 is indicated by darker shading, a winner effect (W) of magnitude v1 − v0 by lighter shading.
330
8. Triadic Population Games
Finally, to the right of this boundary, there is always an ESS of Type IIIb with 1 > v1 > v0 > v2 > 0, so that (8.17a) and (8.17c) imply λi = 0 for all i. That is, v0 , v1 , and v2 are jointly determined by (8.32)
6r{v12 − v22 + 4v0 } + {2(1 + b) + b(v1 − v2 )}(v1 + v2 ) = 6(1 + v1 ) − (1 − 2b)(v1 − v2 + 2)v0 + (v1 − v2 )v2 ,
(8.33)
6r(6 − v1 − v2 + 2v0 )v1 + (1 + b)(v1 − v2 ) = 6(1 − b)(v0 − v2 + 2) + 3,
and (8.34)
6r(6 − 2v0 + v1 + v2 )v2 + (2 − b)(v1 − v2 ) = 6b(v1 − v0 + 2) + 3.
These equations agree with (8.26) along the common boundary b = 12 between regions Ib and IIIb. We conclude that there is a unique ESS, as determined by Table 8.6 and Figure 8.3. The magnitude of any winner or loser effect decreases with b and is greatest at intermediate values of r, as illustrated by Figure 8.4. The uniqueness of the ESS turns out to be a special property of the uniform distribution; for nonuniform distributions, there may be small transition regions where different types of ESS overlap (Exercise 8.3). Nevertheless, the main prediction embodied in Table 8.6 and Figure 8.3, that a winner effect cannot exist without a loser effect, has been shown to hold for a much more general class of distributions, of which the uniform distribution is merely one special case [209]. Now two decades old, this prediction remains “well-supported by the empirical literature” [42, p. 423], even though, as remarked at the beginning of the section, one experiment has failed to corroborate it. Intuitively, a winner effect cannot exist without a loser effect because if costs are too low to support a loser effect, then they are also low enough to support such a high initial strength perception that there is no advantage to raising it after a win. Then what accounts for the sole exception to the rule? We return to this question in §8.4.
8.2. Victory displays A victory display—defined as a display performed by the winner of a contest, but not by the loser—is a familiar sight after sporting events among humans. But victory displays are not uniquely human; they have also been observed in other species, for example, black-capped chickadees [172], field crickets [24], little blue penguins [240], and mangrove crabs [62]. What are they for? A possible answer is that either they are an attempt to advertise victory to other members of a social group that did not pay sufficient attention to a contest or that they are an attempt to decrease the probability that the loser of a contest will initiate a new contest with the same individual [36]. In short, they are either for advertising or for browbeating. But which? We address this question with a pair of models, called Model A (for advertising) and Model B (for browbeating). Both models are Type I games, with K = 36 for Model A and K = 10 for Model B in (8.1), and in both cases, strategies are intensities with which an animal displays after winning. That
8.2. Victory displays
331
is, strategy s means that an s-strategist displays with intensity s after victory. We begin with Model A. As in any triadic game, we consider a population that is subdivided into triads, the smallest groups in which it is possible to compare the two rationales for victory displays. For simplicity, and because we want to study victory displays in their purest form, in isolation from other effects, we assume no variation in strength. It is therefore consistent to assume that a fight—if there is one—lasts for a fixed amount of time, with each contestant paying the same fixed cost. By contrast, post-contest costs differ between winner and loser, because a winner incurs the cost of a victory display, whereas a loser does not. There is also a basic benefit that accrues to each individual for belonging to the group, regardless of its status within it, but because this term is the same for every outcome, it has no strategic effect, and so we can proceed as though it were zero.4 Within any animal group, benefits beyond the basic level are determined by relative status. If there were always a linear dominance hierarchy, then rank would be a sufficient measure of status. But we already know that a linear dominance hierarchy need not form, even if every contest produces a winner and a loser (as assumed in §8.1); moreover, inconclusive contests are quite common [152]. So rank will not suffice as a measure of status. Consider, for example, a triad of animals A, B, C in which A defeats and dominates B and B defeats and dominates C; however, A does not dominate C, and C does not dominate A, because their contest was inconclusive. A has the highest status within this group and C has the least, just as if there were a linear dominance hierarchy with A on top and C on the bottom; but whereas the status of B may be about the same as if such a hierarchy existed, that of A is lower and that of C is higher. In essence, therefore, status is determined by a combination of dominance and what we shall refer to as “nonsubordination.” A contest outcome is one of dominance if one individual subordinates to the other, whereas the second does not; and a contest outcome is one of nonsubordination if either each defers (concedes dominance) to the other (as is possible only in Model A) or else neither defers to the other (as is possible only in Model B). We must capture the idea that dominance contributes more to fitness than nonsubordination, which in turn contributes more than being dominated. So for each individual, let fitness increase (beyond the basic level) by α for every individual it dominates and by b α for every individual it fails to dominate that also fails to dominate it. The cumulative benefit to an individual is then α times the associated number of dominances, #D, plus bα times the associated number of nonsubordinations, #NS. For example, in the triad described in the preceding paragraph, the benefits accruing to A, B, and C would be (1 + b)α, α, and bα, respectively, whereas in a linear hierarchy they would be 2α, α, and 0. Thus b, a dimensionless parameter satisfying 0 ≤ b < 1, is an inverse measure of the reproductive advantage of dominance, which is greatest when b = 0 and least when b = 1; in effect, the advantage of (complete) dominance within the triad is 1 − b, and so we will refer to parameter b as the dominance disadvantage. It will be convenient to scale costs with respect to α. Accordingly, let c0 α be the cost of a contest, and let c(s)α be 4
See Footnote 9 on p. 71.
332
8. Triadic Population Games
the cost of using strategy s, that is, of displaying with intensity s after winning. Then the associated (relative) payoff is (8.35)
α · #D + b α · #NS − c0 α · #C − c(s)α · #W,
where #C is the number of contests and #W is the number of wins. We assume that all possible orders of interaction are equally likely. Defining p(s) be the probability that a victory display of intensity s—that is, use of strategy s—is observed by the noncontestant, we also assume that (8.36)
c(0) = 0
with (8.37a)
c (s) > 0,
p (s) > 0
c (s) ≥ 0,
p (s) < 0.
and (8.37b)
In essence, doubling the intensity of display would at least double its cost, but less than double its effectiveness.5 Conversely, with p (s) > 0, we require p (s) < 0 to ensure that p(s) does not exceed 1. Both of the focal individual’s opponents play the population strategy v. Thus, as in §8.1, a triad consists of a focal u-strategist and a pair of v-strategists. We assume that observing the winner of a contest means indirectly observing the loser. Let the probability that an (indirectly) observed loser subsequently loses against the observer be 1+l 2 , where 0 ≤ l ≤ 1: there is a potential loser effect of sorts, but only if the loser has been observed by its next opponent. Our rationale for allowing l > 0 is that an individual has a higher estimate of its own chance of victory when taking on an individual that it has observed to lose, which remains consistent with our assumption of equal fighting strengths. Let λ denote the probability that an animal defers to an opponent when it has observed that the opponent won against its previous opponent, where 0 ≤ λ ≤ 1. Then the probability that an observed winner subsequently wins against the observer is λ · 1 + (1 − λ) · 12 = 1+λ 2 : there is a winner effect of sorts, but only if the winner has been observed by its next opponent. For greater generality we allow the probability of deference to an observed winner to vary with an individual’s own prior experience by using subscripts 0, 1, 2 to distinguish an untested individual, a prior loser, and a prior winner. That is, we denote the probability that an untested individual, a prior loser, or a prior winner defers to an observed winner by λ0 , λ1 , or λ2 , respectively, and we assume that (8.38)
0 < λ2 ≤ λ0 ≤ λ1 < 1.
We also allow for a prior loser to defer to an observed loser with probability λ3 . However, we assume that neither an untested animal nor a prior winner ever defers to an observed loser. We now consider each order of interaction in turn. Let us first consider the payoff to a u-strategist when it participates in the first two of the triad’s three 5 There is a biological rationale for c (s) > 0: in keeping with many physiological processes involving reagents in short supply, we assume that small increases in the magnitude of low strength signals are metabolically cheap, while further increases in the magnitude of high strength signals are expensive.
8.2. Victory displays
333
Table 8.7. Model A payoff to a focal individual F whose first and second opponents are O1 and O2, respectively, conditional on participation in the first two of the three contests (rows 1–5) or on participation in the first and last of the three contests (rows 6–19). Parentheses indicate a contest in which the focal individual is not involved. A bold letter indicates that the individual’s opponent deferred after observing it win an earlier contest. Note that O1 and O2 do not label specific individuals: O1 is whichever individual happens to be the focal individual’s first opponent for a given order of interaction, and the other individual is O2. Case k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1st F F F O1 O1 F F F F F F F O1 F O1 O1 O1 O1 O1
Winners 2nd 3rd F F O2 F O2 (O1) F (O1) F (O2) F (O2) F/O2 (O2) F (O2) O2 (O2) O2 (O1) F (O1) O2 (O2) F (O2) O2 (O2) O2 (O1) O2 (O1) F/O2
Probability ωk (u, v) 1 λ p(u) 2 0 1 {1 − λ0 p(u)} 4 1 {1 − λ0 p(u)} 4 1 {1 − l p(v)} 4 1 {1 + l p(v)} 4 1 {1 − l p(u)}λ1 p(u) 4 1 {1 − l p(u)}{1 − λ1 p(u)}{1 + l p(v)} 8 1 {1 + l p(u)}{1 − λ2 p(v)}λ2 p(u) 4 1 2 {1 + l p(u)}λ 2 p(v) p(u) 4 1 {1 + l p(u)}{1 − λ2 p(u)}{1 − λ2 p(v)} 8 1 {1 + l p(u)}{1 − λ2 p(u)}λ2 p(v) 4 1 {1 + l p(u)}{1 − λ2 p(u)}{1 − λ2 p(v)} 8 1 {1 + λ0 p(v)}{1 − λ32 } 8 1 {1 − l p(u)}{1 − λ1 p(u)}{1 − l p(v)} 8 1 {1 − λ p(v)}{1 − λ1 p(v)}{1 − l p(v)} 0 8 1 {1 − λ p(v)}λ1 p(v) 0 4 1 {1 − λ p(v)}{1 − λ1 p(v)}{1 + l p(v)} 0 8 1 {1 + λ p(v)}{1 − λ32 } 0 8 1 {1 + λ0 p(v)}λ32 4
Payoff Pk (u) {2 − 2c(u) − c0 }α {2 − 2c(u) − 2c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α −2c0 α {2 − 2c(u) − c0 }α {2 − 2c(u) − 2c0 }α {2 − 2c(u) − c0 }α {1 + b − c(u) − c0 }α {2 − 2c(u) − 2c0 }α {1 − c(u) − c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α −c0 α −2c0 α −2c0 α {b − c0 }α
dyadic interactions. The focal individual wins its first contest with probability 12 and is observed to win by its second opponent with probability p(u). If the focal individual wins and has been observed, with probability λ0 the opponent defers, and with probability 1 − λ0 the two individuals engage in a fight, which they are equally likely to win. Hence the u-strategist’s probability of winning twice through a single fight is 12 · p(u) · λ0 = 12 λ0 p(u); see Table 8.7, Case 1. Its probability of winning twice through a pair of fights, allowing for the possibility that its first win may not be observed, is 12 · p(u) · (1 − λ0 ) · 12 + 12 · (1 − p(u)) · 12 = 14 {1 − λ0 p(u)}; see Table 8.7, Case 2. In either case, the focal individual becomes the alpha individual at the top of a linear dominance hierarchy with benefit 2α, but its costs differ, because in the first case it avoids the fixed cost that would have been associated with the second fight. Similarly, the probability that the focal individual comes to rest at the bottom of a linear hierarchy is the probability that it loses twice in succession. The probability that it loses the first time is simply 12 . It is now a prior loser; and because its first opponent was a v-strategist, it has been indirectly observed to lose by its second opponent through the first opponent’s victory display with probability p(v), in which case, its probability of losing again is 1+l 2 . So the
334
8. Triadic Population Games
1 1 1 probability of losing twice is 12 · p(v) · 1+l 2 + 2 · (1 − p(v)) · 2 = 4 {1 + l p(v)}; see Table 8.7, Case 5. Proceeding likewise for the other two outcomes, we find that the reward to a u-strategist in a population of v-strategists, conditional upon participation in the first two contests, is
(8.39a)
f12 (u, v) = =
5 #
ωk (u, v)Pk (u)
k=1 1 4 {1
− c(u)}{λ0 p(u) − lp(v) + 4} − 12 {4 − λ0 p(u)}c0 α,
where ωk (u, v) is the probability of the event defined by Case k in Table 8.7 and Pk (u) is the corresponding payoff to the focal individual. We handle the other two possible orders of interaction similarly. Suppose, for example, that the focal individual F is involved in the first and the third of the three contests, that it wins the first, and that its (future) second opponent O2 wins the second contest, as in Cases 8–12 of Table 8.7. This event is equivalent to a double loss by F’s first opponent O1, and its probability is therefore 14 {1 + l p(u)}, the only difference from Case 5 above being that the winner of the first contest is a u-strategist instead of a v-strategist. The third contest is now between two prior winners. F has observed O2’s victory with probability p(v), O2 has observed F’s victory with probability p(u), and conditional thereupon, each defers to the other with probability λ2 . Thus the probability of nonsubordination in the third contest is λ22 p(u)p(v); see Table 8.7, Case 9. The focal individual escalates if it is not the case that it has seen O2 win and has deferred, i.e., with probability 1 − λ2 p(v); similarly, O2 escalates with probability 1 − λ2 p(u). So an actual fight occurs with probability {1 − λ2 p(u)}{1 − λ2 p(v)} and is won with equal probability by F (Table 8.7, Case 10) or O2 (Table 8.7, Case 12). The remaining possibility for the third contest is that the outcome is dominance: either F escalates with probability 1 − λ2 p(v) and O2 defers with probability λ2 p(u) (Table 8.7, Case 8) or else O2 escalates with probability 1 − λ2 p(u) and F defers with probability λ2 p(v) (Table 8.7, Case 11). Proceeding likewise for the other cases, the reward f13 (u, v) to a u-strategist in a population of v-strategists, conditional upon participation in the first and the last of the three contests, and the reward f23 (u, v), conditional upon participation in the last two of the three contests, are given by (8.39b)
f13 (u, v) =
19 #
ωk (u, v)Pk (u)
k=6
and (8.39c)
f23 (u, v) =
36 #
ωk (u, v)Pk (u),
k=20
respectively, where the terms of the first summation are defined by Table 8.7, and those of the second by Table 8.8. Now, since all possible orders of interaction are equally likely, the (unconditional) reward to a u-strategist in a population of v-strategists is (8.40)
f (u, v) =
1 3
f12 (u, v) +
1 3
f13 (u, v) +
1 3
f23 (u, v).
8.2. Victory displays
335
Table 8.8. Model A payoff to a focal individual F whose first and second opponents are O1 and O2, respectively, conditional on participation in the last two of the three contests. Parentheses indicate a contest in which the focal individual is not involved. A bold letter indicates that the individual’s opponent deferred after observing it win an earlier contest. Case k 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
1st (O1) (O1) (O2) (O2) (O2) (O1) (O1) (O2) (O2) (O1) (O2) (O1) (O1) (O2) (O2) (O1) (O1)
Winners 2nd 3rd F F F F F F F F/O2 F F O1 F O1 F F O2 F O2 F O2 O1 F O1 O2 O1 O2 O1 O2 O1 O2 O1 F/O2 O1 F/O2
Probability ωk (u, v) 1 {1 − λ0 p(v)}λ1 p(u) 4 1 {1 − λ0 p(v)}{1 − λ1 p(u)}{1 + l p(v)} 8 1 {1 + l p(v)}{1 − λ2 p(v)}λ2 p(u) 4 1 {1 + l p(v)}λ22 p(v) p(u) 4 1 {1 + l p(v)}{1 − λ2 p(u)}{1 − λ2 p(v)} 8 1 λ p(v){1 − λ32 } 0 4 1 {1 − λ0 p(v)}{1 − λ32 } 8 1 {1 + l p(v)}{1 − λ2 p(u)}λ2 p(v) 4 1 {1 + l p(v)}{1 − λ2 p(u)}{1 − λ2 p(v)} 8 1 {1 − λ0 p(v)}{1 − λ1 p(u)}{1 − l p(v)} 8 1 {1 − l p(v)}2 {1 − λ1 p(v)} 8 1 λ p(v){1 − λ32 } 4 0 1 {1 − λ0 p(v)}{1 − λ32 } 8 1 {1 − l p(v)}λ1 p(v) 4 1 {1 − l p(v)}{1 − λ1 p(v)}{1 + l p(v)} 8 1 λ p(v)λ32 2 0 1 {1 − λ0 p(v)}λ32 4
Payoff Pk (u) {2 − 2c(u) − c0 }α {2 − 2c(u) − 2c0 }α {2 − 2c(u) − c0 }α {1 + b − c(u) − c0 }α {2 − 2c(u) − 2c0 }α {1 − c(u) − c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α {1 − c(u) − 2c0 }α −c0 α −2c0 α −c0 α −2c0 α bα {b − c0 }α
Substitution from Table 8.7 into (8.39c), from Table 8.8 into (8.39c), and then from (8.39) into (8.40) yields a very cumbersome expression [227, p. 599] whose presentation would serve no purpose here. Nevertheless, because of (8.37) and 2 (8.38), and because b < 1, f defined by (8.40) satisfies ∂∂uf2 < 0 for all u ≥ 0, implying that zero is uniquely the best reply to itself for ∂f ∂u u=0 ≤ 0 and that v = v ∗ is uniquely the best reply to itself for ∂f (8.41) > 0, ∂u
u=0
where v = v ∗ > 0 is the only root of the equation ∂f (8.42) = 0. ∂u
u=v
In other words, v ∗ or 0 is the unique strong ESS, according to whether or not (8.41) holds. Thus (8.41) is the condition for a victory display at the ESS, and (8.42) determines the corresponding intensity. For the sake of simplicity, we now concentrate on the case in which (8.38) is satisfied with (8.43)
λ0 = λ1 = λ2 = λ > 0,
λ3 = 0,
so that the probability of deferring to a prior winner is independent of the prior experience of the observer, and prior losers do not defer to observed losers.6 In this 6
The effects of departures from (8.43) are considered in Exercise 8.5.
336
8. Triadic Population Games
case, (8.40) reduces to
f (u, v) = (1 − c(u)) 1 + − + +
1 24 {p(u)
− p(v)}(2λ(3 − lp(v)) − (l2 + λ2 )p(v))
2 2 1 24 λp(v){1 − c(u)} 2p(u){λ − l p(u)} + l(l + λ){p(u) + 1 2 12 λ p(u)p(v)(2 + l{p(u) + p(v)})(b − c0 ) 1 12 λ{6p(u) + (2l − λ)p(u)p(v) + (6 − λp(v))p(v)}c0 − 2c0
p(v)2 }
times α. For the sake of definiteness, we satisfy (8.36) and (8.37) with (8.44)
c(s) = γθs,
p(s) = + (1 − ) 1 − e−θs ,
(8.45)
where θ > 0 and 0 < < 1; here θ has the dimensions of intensity−1 , so that γ (> 0) is a dimensionless measure of the marginal cost of displaying (i.e., the cost per unit of increase of intensity) and is the baseline probability that a victor will be observed (by the noncontestant) in the absence of a display. Now (8.41) becomes (8.46)
γ < 1−
2λ{3−l{1−l}+(2{3+l}−3λ{1+l})c0 }−l2 +(2b{2+3l}−3−2l)λ2 . 2(12−λ2 2 {1+l })
This inequality cannot be satisfied in the limit as → 1 (where display is unnecessary), and is most readily satisfied in the limit as → 0, but only if the cost of displaying is not too excessive, specifically, if (8.47) γ < 14 + 12 c0 λ, which we assume. There is then always a critical value of the baseline probability, , above which winners do not display at the ESS, below which they display with intensity v ∗ , where v = v ∗ is the only root of (8.48) {1 − p(v)} 2lλ(l + (3{b − c0 } − 1)λ)p(v)2 + 6λ(1 + 2c0 ) − {l(l + 2λ) − 4lλc0 + λ2 (3 − 4b + 6c0 )}p(v) = 2γ 12 − {1 + lp(v)}λ2 p(v)2 + c(v){1 − p(v)} 6λ − {(l + λ)2 + 2λ2 }p(v) + 2lλ(l − λ)p(v)2 , by (8.42). Note that (8.46) cannot be satisfied in the limit as λ → 0, but that its right-hand side increases with λ; thus conditions are more favorable to a nonzero ESS, the greater the probability of deference to an observed winner. Equation (8.48), being cubic in e−θv , is best solved numerically. The resultant ESS is plotted as a dashed curve in Figures 8.5(a) and 8.5(b) for c0 = 0.1, l = 0.5, γ = 0.05, and various values of b and λ (such that (8.47) holds). We call these victory displays obligate, because a victor invariably displays. At least among animals with sufficient cognitive ability, however, victory displays may be facultative—and in a triadic interaction, there is no need to advertise after a second victory or after a first victory in the final contest, since there is no other individual that can be influenced by the display. Our model is readily adapted to deal with this possibility, which has no effect on ωk (u, v) but removes a display cost c(u)α from Pk (u) in 16 of the 36 cases in Tables 8.7 and 8.8, namely, Cases 1, 2, 4, 6–8, 10, 13, 15, 20–22, 24–26, and 30. The analysis proceeds as before, with only relatively minor modifications. The only effect on (8.46) is to reduce the denominator on the right-hand side from
8.2. Victory displays
337
Figure 8.5. Typical Model A (advertising) ESS. Intensity (scaled with respect to the parameter θ1 to make it dimensionless) is plotted as a function of the baseline probability of observing victors, , for both obligate display (dashed curves) and facultative display (solid) for two values of the probability of deference and three values of the dominance disadvantage, namely, b = 0 (lowermost curves), b = 0.5, and b = 0.9 (uppermost). Values of other parameters (all dimensionless) are c0 = 0.1 for the fixed cost of a contest, l = 0.5 for the loser effect (i.e., the probability that an observed loser again loses is 0.75), and γ = 0.05 for the marginal cost of displaying.
2 2 2 1 12 −λ {1 + l } to 2 (6 + {l − λ}); thus, in place of (8.47), we obtain γ < 2 + c0 λ, which (8.47) implies, and the left-hand side of (8.48) is unaltered, while the right-hand side simplifies to 2γ{6+(l−λ)p(v)}. The resultant ESS is exemplified by the solid curves in Figures 8.5(a) and 8.5(b). As we would expect, reducing the costs increases the intensity of display. Now we turn to Model B, whose essence is to observe a distinction between losing and subordination. According to the browbeating rationale, a victory display is an attempt to decrease the probability that the loser of a contest will initiate a new contest with the same individual. As long as there is a chance that the loser will challenge the winner to another fight in the future, the winner has won a battle for dominance, but not the war. If, on the other hand, the victory ensures that the loser will never challenge, then victory is tantamount to dominance. Thus an equivalent statement of the browbeating rationale is that a victory display is an attempt to ensure that victory equals dominance. Some things are the same as for Model A. Each contest lasts for a fixed amount of time, with each contestant paying the same fixed cost, strategy s means that an s-strategist displays with intensity s, and c(s) is the cost of a victory display with intensity s. But now let q(s) be the probability that a display of intensity s by a winner elicits submission on the part of the loser: thus, an s-strategist wins with probability 12 but wins and dominates only with a smaller probability, 1 2 q(s). Because (we assume) nothing is signalled or observed outside each dyad, the order of interaction is irrelevant: because both of the focal individual’s opponents play the population strategy v, a triad consists of a u-strategist and a pair of vstrategists, with two contests between a u-strategist and a v-strategist and one
338
8. Triadic Population Games
between a pair of v-strategists. For the first type of contest, the probability that the u-strategist comes to dominate the v-strategist is 12 q(u), the probability that the u-strategist comes to be dominated by the v-strategist is 12 q(v), the probability that a winning u-strategist fails to browbeat a losing v-strategist into submission is 12 {1 − q(u)}, and the probability that a winning v-strategist fails to browbeat a losing u-strategist into submission is 12 {1 − q(v)}. For a contest between two vstrategists, the corresponding probabilities have u replaced by v. We assume that (8.36) and (8.37) continue to hold, but with with q in place of p; that is, (8.49a)
c(0) = 0,
c (s) > 0,
c (s) ≥ 0
and q (s) > 0,
(8.49b)
q (s) < 0.
The various possible configurations are readily distinguished as indicated in Table 8.9. Suppose, for example, that a focal u-strategist wins twice but fails to dominate either opponent. Then, from (8.35), its payoff is α · 0 + bα · 2 − c0 α · 2 − c(u)α · 2 = {2b − 2c(u) − 2c0 }α. 2 The associated probability is 12 {1 − q(u)} ; see Table 8.9, Line 3. Or suppose that the focal u-strategist wins only once but succeeds in dominating that opponent, whereas it avoids being dominated by the other opponent. Then, from (8.35), its payoff is α · 1 + bα · 1 − c0 α · 2 − c(s)α · 1 = {1 + b − c(u) − 2c0 }α. It pays the cost of display only once, because it wins only once. There are two ways in which this payoff can arise. If the u-strategist dominates its first opponent and is undominated by its second, then the associated probability is 12 q(u) · 12 {1 − q(v)}; if the u-strategist dominates its second opponent and is undominated by its first, then the same associated probability arises as 12 {1 − q(v)} · 12 q(u). Adding, the probability associated with payoff {1 + b − c(u) − 2c0 }α is 12 q(u){1 − q(v)}; see Table 8.9, Line 4. The other eight cases are similar. The upshot is that the reward to a u-strategist in a population of v-strategists is
(8.50)
f (u, v) =
10 #
ρk (u, v)Qk (u)
k=1
= {q(u) + b{2 − q(u) − q(v)} − c(u) − 2c0 }α. Now the ESS is (8.51a)
v =
0 v∗
if if
c (0) ≥ (1 − b)q (0), c (0) < (1 − b)q (0),
where v ∗ is uniquely defined by (8.51b)
(1 − b)q (v ∗ ) = c (v ∗ )
(Exercise 8.6). For example, with costs the same as for Model A and (8.52) q(s) = δ + (1 − δ) 1 − e−θs ,
8.2. Victory displays
339
Table 8.9. Model B payoff to a focal u-strategist from all possible configurations. Here #W is the number of wins by the focal u-strategist, #D is the number of animals dominated by the u-strategist, and #NS is the number of animals to which the u-strategist is not subordinate.
Case k 1 2 3 4 5 6 7 8 9 10
Configuration #W #D #NS 2 2 0 2 1 1 2 0 2 1 1 1 1 1 0 1 0 2 1 0 1 0 0 2 0 0 1 0 0 0
Probability ρk (u, v) 1 q(u)2 4 1 q(u){1 − q(u)} 2 1 {1 − q(u)}2 4 1 q(u){1 − q(v)} 2 1 q(u)q(v) 2 1 {1 − q(u)}{1 − q(v)} 2 1 {1 − q(u)}q(v) 2 1 {1 − q(v)}2 4 1 {1 − q(v)}q(v) 2 1 q(v)2 4
Payoff Qk (u) 2{1 − c(u) − c0 }α {1 + b − 2c(u) − 2c0 }α 2{b − c(u) − c0 }α {1 + b − c(u) − 2c0 }α {1 − c(u) − 2c0 }α {2b − c(u) − 2c0 }α {b − c(u) − 2c0 }α 2{b − c0 }α {b − 2c0 }α −2c0 α
Figure 8.6. Typical Model B (browbeating) ESS. Intensity is plotted against baseline probability of submission, δ, for γ = 0.05 and various values of dominance disadvantage b.
. There is now a critical value of δ, the baseline probawe have v ∗ = 1θ ln (1−b)(1−δ) γ bility of submission (i.e., the probability that a victor elicits permanent submission from a loser in the absence of a display) above which winners do not display. On rearranging (8.51a), we find that the ESS is given by
γ , ln (1−b)(1−δ) if δ < 1 − 1−b γ (8.53) θv = γ 0 if δ ≥ 1 − 1−b . It is plotted in Figure 8.6 for γ = 0.05 (as in Figure 8.5) and various values of b. Note that p and q are equivalent mathematical functions: the only difference between p(s) defined by (8.45) and q(s) defined by (8.52) is that in the first case we denote the baseline probability of achieving the desired effect by p(0) = , whereas in the second case we denote the same probability by q(0) = δ. We use
340
8. Triadic Population Games
Figure 8.7. Comparison of advertising and browbeating ESSs. Intensity is plotted as a function of dominance disadvantage b. Values of the other parameters (all dimensionless) are c0 = 0.1 for the fixed cost of a contest, l = 0.5 for the loser effect (i.e., the probability that an observed loser again loses is 0.75), γ = 0.05 for the marginal cost of displaying, λ = 0.9 for the probability of deference, and = 0.1 = δ for the baseline probability of the desired effect (attention to remote victor in Model A, submission to current opponent in Model B) in the absence of a display. The advertising ESS is shown dashed for obligate signallers and dotted for facultative signallers.
different symbols for baseline probability because the desired effects (attention to remote victors and submission to current opponents, respectively) are different. Nevertheless, in either case, victory displays are evolutionarily stable if the desired effect has a significantly lower probability of arising in the absence of a display. Despite that, there is an important difference between predictions from Models A and B, suggesting at least a partial answer to the question of what victory displays are for. In the case of advertising, intensity of display at the ESS increases with respect to dominance disadvantage b; in the case of browbeating, intensity of display at the ESS decreases with respect to b. This result is already implicit in Figures 8.5 and 8.6, but in Figure 8.7 we have made it explicit by juxtaposing the two ESSs. The diagram shows that, all other things being equal, the intensity of advertising displays will be highest in the limit as b → 1, when the rewards to dominating an opponent are least; there is then little difference between dominating an opponent and not subordinating, and there is likely to be a largely equal distribution of reproductive benefits—or, as behavioral ecologists like to say, there is likely to be low reproductive skew. By contrast, the intensity of browbeating displays will be highest in the limit as b → 0, when the rewards to dominating an opponent are greatest; then reproductive skew is likely to be high, that is, reproductive benefits are likely to be distributed very unequally. Reproductive skew is likely to be low in socially monogamous species, in which a male and a female develop a strong pair bond and share parental care.7 Correspondingly, reproductive skew is likely to be high in polygynous species, in which males mate with multiple females, perhaps because resources that females need are spatially clumped and a male can monopolize mating opportunities by defending a 7 Typically while also demonstrating a penchant for frequent dalliances with “extra-pair” members of the opposite sex; see, e.g., [158].
8.3. Coalition formation: a strategic model
341
territory where the resources abound, a practice known as resource defense polygyny [111, p. 351]. In this regard, it is notable that three of the best candidates for advertising displays—tropical boubous [115], song sparrows [36, p. 123], and little blue penguins [240]—are socially monogamous species, while crickets and wetas, whose post-victory stridulation seems likely to represent a browbeating display [36, p. 122], [239, p. 113], frequently exhibit resource defense polygyny—as, e.g., in the Wellington tree weta [157].
8.3. Coalition formation: a strategic model In nature a coalition forms when at least two individuals join forces against a common target. Often the target belongs to the same social group, as we shall assume in this section; thus coalition formation is cooperation both with and against conspecifics. Coalitions have been observed in numerous mammalian taxa including primates [268], cetaceans [177], and social carnivores [315], typically for access to females among males but also for access to food among females. When, and why, do coalitions form? What determines the coalition structure within a group? Despite the centrality of coalitions to characteristic function games, Chapter 4 has little to say about this matter, because the existence of the grand coalition is assumed in §4.1–§4.8, and strategy is only implicit in §4.9. So we explore these questions here instead, using a Type II triadic model with K = 8. We are especially interested in identifying the circumstances in which a true coalition of less than all the players— in our case, a coalition of two versus one—is most likely to arise. A triadic model is appropriate because, although coalitions in nature vary in size, two against one is the coalition structure most commonly observed [29, 315]. We assume that each member of a triad knows its own strength but not that of either partner; in terms of §6.2 (p. 221), the animals use self-assessment. All three strengths are drawn from the same distribution with probability density function g on [0, 1]. As noted on p. 246, distributions of fighting ability are typically fairly symmetric in nature, so an appropriate choice of distribution for theoretical purposes is one that is perfectly symmetric on [0,1] with mean 12 . We choose (8.54)
g(ξ) =
Γ(2a) a−1 ξ (1 − ξ)a−1 , {Γ(a)}2
i.e., the symmetric version of the Beta distribution defined on p. 246. For a = 1 this distribution is uniform; for a > 1 it is unimodal, and its variance decreases with a 1 according to σ 2 = 14 (1 + 2a)−1 . Throughout, we assume that a ≥ 1, or σ 2 ≤ 12 . Stronger animals tend to escalate when involved in a fight, and weaker animals tend not to escalate. We assume that if an animal considers itself too weak to secure a resource alone, then it attempts to form a pact with each of its partners for mutual sharing of benefits and, if necessary, defense. Let Λ denote total group fitness (beyond the basic level accruing equally to all individuals), and let it cost θΛ (where 0 ≤ θ < 1) to attempt to make a pact. The attempt may not be successful, in which case, a pact seeker will refuse to fight alone; however, if all agree to a pact, then there are also no fights. We first consider how rewards are split between contestants and then consider the likelihood of winning fights and the cost of fighting. If there are three distinct
342
8. Triadic Population Games
ranks after fighting for a resource, then the alpha individual gets αΛ, where α > 12 , the beta individual gets (1 − α)Λ, and the gamma individual gets zero, i.e., nothing beyond the basic level of fitness. If, on the other hand, there is a three-way pact, then each gets 13 Λ. This is also the benefit that each receives if the animals fight one another and end up winning and losing a fight apiece, although then they will have had to pay the cost of fighting. We have so far dealt with two of four possible outcomes in terms of number of pact seekers, namely, three pact seekers, which we will find convenient to describe as universal peace, and zero pact seekers, which we will find convenient to describe as universal war. But there remains the possibility of two against one. If a coalition of two defeats the third individual, then each member of the coalition takes 12 Λ while the third individual takes zero (beyond the basic level). If, on the other hand, the individual defeats the coalition, then it gets αΛ while each member of the coalition takes 12 (1 − α)Λ. We will deal with the final possibility—a single pact seeker—later. We allow for the existence of a synergistic—or an antergistic—effect, so that the effective strength of a coalition of two whose individual strengths are S1 and S2 is not simply S1 + S2 but rather q{S1 + S2 }, where q need not equal 1; rather, q > 1 for synergy, and q < 1 for antergy. We will refer to the multiplier q as the synergicity.8 Let p(Δs) denote the probability of winning for a coalition (or individual) whose combined effective strength exceeds that of its opponent by Δs, so that (8.55)
p(Δs) + p(−Δs) = 1
for all Δs, which implies in particular that p(0) = 12 ; also, p(Δs) > 12 for Δs > 0 and p(Δs) < 12 for Δs < 0. We set p = 0 for Δs < −2 and p = 1 for Δs > 2 (both possible only if q > 1), and for Δs ∈ [−2, 2] we choose Γ(2r) 1 1 (8.56) p(Δs) = B 2 + 4 Δs, r, r , 2 Γ(r) where B is the incomplete Beta function, defined on p. 246.9 The graph of p is plotted in Figure 8.8(a) for four different values of r, which is a measure of the reliability of strength difference as a predictor of fight outcome. It will be convenient to scale fighting costs with respect to Λ. Accordingly, let c(Δs)Λ be the cost of a fight between coalitions whose effective strengths differ by Δs; c must be an even function (i.e., a function of |Δs|), and we assume that c(Δs) = 0 for |Δs| ≥ 2. For |Δs| ≤ 2, we choose k (8.57) c(Δs) = c0 1 − 14 |Δs|2 , which is graphed in Figure 8.8(b), for four different values of k, a measure of the sensitivity of cost with respect to strength difference, in the sense that a small difference in strength implies a large cost reduction when k is very high but virtually no cost reduction when k is very low. We assume that fighting costs are equally borne by all members of a coalition. So a lone individual bears the whole cost of fighting, whereas a pair of allies splits the cost equally. 8 A number of de facto synonyms may be used to describe combined effects [32, Table 1, p. 541], e.g., superadditive in place of synergistic for q > 1 and antagonistic in place of antergistic for q < 1 (with additive used for q = 1). My choice of terminology merely reflects a personal preference. 9 For a justification of this choice of contest success function, see [216, p. 2435].
8.3. Coalition formation: a strategic model
343
Figure 8.8. (a) The contest success function p defined by (8.56) for r = 0.1 (thin solid curve), r = 1 (dotted), r = 10 (dashed), and r = 100 (thick solid). (b) The cost function c defined by (8.57) for four different values of k.
Few data are available to suggest the approximate sizes in natural systems of parameters such as q, r, and k. Nevertheless, there are reasons to suspect that q must lie between 12 and 2 and in practice lies much closer to 1, that r at least exceeds 10 and much higher values are far from uncommon, and that k at least exceeds 25 but does not greatly exceed 50 in most cases [228, p. 284]. These estimates are reflected in the values we choose for illustration later. Let u be the coalition threshold for the focal individual or potential mutant, whom we label F: if F’s strength fails to exceed the value u, then it attempts to make a pact with each of the other two individuals in its triad, whom we label A and B. Let v be the corresponding threshold for A and B, who represent the population. Let X be the strength of the u-strategist F, and let Y and Z be the respective strengths of A and B, the two v-strategists. We can now decompose the sample space of strength combinations into eight mutually exclusive and exhaustive events as indicated in Table 8.10. In Case 1, all three individuals have below-threshold strengths. So each, including the focal individual F, pays θΛ to obtain a third of Λ. In Case 2, the strength Y of the v-strategist A is above its threshold, but the strength Z of the v-strategist B is below its threshold; the strength X of F is also below its threshold, and so F and B make a pact to fight A when the need arises. The total cost of the fight is c(q{X + Z} − Y ) to A and half that amount to each of F and B. The probability that the coalition {F, B} wins is p(q{X + Z} − Y ); the probability that A wins is p(Y − q{X + Z}). So the payoff to F is p(q{X + Z} − Y ) · { 12 − θ − 12 c(q{X + Z} − Y )}Λ + p(Y − q{X + Z}) · { 12 (1 − α) − θ − 12 c(q{X + Z} − Y )}Λ, which simplifies to (8.58) P2 (X, Y, Z) =
1 2 {αp(q{X
+ Z} − Y ) + 1 − α − 2θ − c(q{X + Z} − Y )}Λ
on using (8.55) with Δs = q{X + Z} − Y . Case 3 now follows by symmetry.
344
8. Triadic Population Games
Table 8.10. Payoff to a focal individual F of strength X with partners A, B of respective strengths Y , Z. The number of pact seekers is three in Case 1, two in Cases 2–4, one in Cases 5–7, and zero in Case 8.
Case Coalition i structure 1 {F, A, B} 2 {F, B}, {A} 3 {F, A}, {B} 4 {F}, {A, B} 5 {F}, {A}, {B} 6 {F}, {A}, {B}
X X X X X X
Event Ωi (u, v) < u, Y < v, Z < u, Y > v, Z < u, Y < v, Z > u, Y < v, Z < u, Y > v, Z > u, Y > v, Z
v
8
{F}, {A}, {B} X > u, Y > v, Z > v
Payoff Pi (X, Y, Z) { 13 − θ}Λ Equation (8.58) P2 (X, Z, Y ) Equation (8.59) −θΛ {(2α − 1)p(X − Y ) +1 − α − c(X − Y )}Λ {(2α − 1)p(X − Z) +1 − α − c(X − Z)}Λ Equation (8.60)
In Case 4, F takes on A and B by itself, and so it pays the full cost of fighting (but avoids the cost of pact making). If it wins it becomes the alpha individual, whereas if it loses it becomes the gamma individual. So its payoff is p(X − q{Y + Z}) · {α − c(X − q{Y + Z})}Λ + p(q{Y + Z} − X) · {0 − c(q{Y + Z} − X)}Λ, which simplifies to (8.59)
P4 (X, Y, Z) = {αp(X − q{Y + Z}) − c(X − q{Y + Z})}Λ.
Pact seekers do not fight alone. In Case 5, F accepts that it is the gamma individual and has tried to make a pact for nothing, while A and B contest dominance. In Case 6, B accepts that it is the gamma individual while F contests dominance with A, each paying the full cost of the fight. So the payoff to F is p(X − Y ) · {α − c(X − Y )}Λ + p(Y − X) · {1 − α − c(Y − X)}Λ, which simplifies to {(2α − 1)p(X − Y ) + 1 − α − c(X − Y )}Λ. Case 7 now follows by symmetry. In Case 8, F is involved in a pair of fights, one with each of A and B, whose order does not matter. Conditional on strengths having been drawn from the distribution, the probability that F wins both fights to become the alpha individual is π2 = p(X − Y )p(X − Z). The probability that F loses both fights to become the gamma individual is π0 = p(Y − X)p(Z − X). In either case, the outcome of the third contest is irrelevant to the focal individual’s payoffs. However, if F wins precisely one contest, then there are two possible outcomes for the group as a whole. The first is that either A or B wins twice to become the alpha individual (while the other becomes the gamma individual). The probability of this outcome is π11 = p(X − Y )p(Z − X)p(Z − Y ) + p(X − Z)p(Y − X)p(Y − Z). The second outcome is that A and B both also win one fight. The probability of this outcome is π12 = p(X − Y )p(Z − X)p(Y − Z) + p(X − Z)p(Y − X)p(Z − Y ). Hence the benefit to the focal individual is π2 · αΛ + π11 · (1 − α)Λ + π12 · 13 Λ + π0 · 0 and the
8.3. Coalition formation: a strategic model
345
cost is {c(X − Y ) + c(X − Z)}Λ, implying P8 (X, Y, Z) = {ζ(X, Y, Z) − c(X − Y ) − c(X − Z)}Λ,
(8.60a)
where (8.60b) ζ(X, Y, Z) = αp(X − Y )p(X − Z) + (1 − α) × {p(X − Y )p(Z − X)p(Z − Y ) + p(X − Z)p(Y − X)p(Y − Z)} + 13 {p(X − Y )p(Z − X)p(Y − Z) + p(X − Z)p(Y − X)p(Z − Y )}. The reward to a u-strategist in a population of v-strategists now follows from (8.2) with K = 8, that is, """ 8 # (8.61) f (u, v) = Pi (x, y, z)g(x)g(y)g(z) dx dy dz, i=1
(x,y,z) ∈ Ωi (u,v)
where g is the probability density function of the Beta distribution from which the strengths of the triad are drawn, i.e., (8.54). Let G denote the distribution function, defined by (2.70). Then, on noting from Table 8.10 that the first and fifth integrals of (8.61) are separable and that each of the others contains a separable integral, and combining both the second integral with the third and the sixth with the seventh and simplifying, we obtain f (u, v) = { 13 G(v)2 − θ}ΛG(u) + (1 − α)Λ{2 − G(u)}{1 − G(v)}G(v) " 1 " v " u g(x) g(y) {α p(q{x + z} − y) − c(q{x + z} − y)}g(z) dz dy dx +Λ 0
v
0
u
0
0
" v " v " 1 + Λ g(x) g(y) {α p(x − q{y + z}) − c(x − q{y + z})}g(z) dz dy dx " 1 " 1 " v + 2Λ g(x) g(y) {(2α − 1)p(x − y) − c(x − y)}g(z) dz dy dx u
v
0
" 1 " 1 " 1 + Λ g(x) g(y) {ζ(x, y, z) − c(x − y) − c(x − z)}g(z) dz dy, dx u
v
v
so that partial differentiation with respect to u implies (8.62) ∂f = Λg(u) 13 G(v)2 − θ − (1 − α){1 − G(v)}G(v) ∂u " 1 "v + g(y) {α p(q{u + z} − y) − c(q{u + z} − y)}g(z) dz dy v
"
0
"
v
−
v
{α p(u − q{y + z}) − c(u − q{y + z})}g(z) dz dy
g(y) 0
0
"
"
1
− 2
v
{(2α − 1)p(u − y) − c(u − y)}g(z) dz dy
g(y) v 1
" −
"
{ζ(u, y, z) − c(u − y) − c(u − z)}g(z) dz dy
g(y) v
0 1
v
346
8. Triadic Population Games
after some simplification (Exercise 8.7). For p given by (8.56), c given by (8.57), and g given by (8.54), we are now equipped to calculate the evolutionarily stable strategy or strategies as a function of the seven parameters, namely, c0 (maximum fighting cost), q (synergicity), θ (pact cost), α (proportion of additional group fitness to a dominant), r (reliability of strength difference as predictor of fight outcome), k (sensitivity of cost to strength difference), and σ 2 (variance in fighting strength). There are three possibilities. The first is that v = 0 is an ESS: unilateral aggression without coalitions always pays. The second possibility is that v = 1 is an ESS: it always pays to form a pact, which is therefore tripartite. The third possibility is that v is an interior ESS, i.e., 0 < v < 1: it pays to seek a coalition if insufficiently strong. It is only in this third case that a true coalition can arise. From §2.5, 0 is at least a local ESS if (8.62) is negative for v = 0 in the vicinity of u = 0, and it can be verified computationally that this ESS is also a global ESS; as indicated in §2.5 (p. 82), we plot the graph of f (u, 0) on [0, 1] to confirm that f (u, 0) < f (0, 0) for all u < 0 ≤ 1. Because (8.54) implies that usually10 g(0) = 0 even though g(u) > 0 for all u ∈ (0, 1), the sign of (8.62) for v = 0 in the vicinity of u = 0 is determined by the sign of the limit as u → 0 and v → 0 of the large term in braces, which reduces to "1 "1 (8.63) − θ − g(y) {ζ(0, y, z) − c(0 − y) − c(0 − z)}g(z) dz dy 0
0
because G(0) = 0. On using (8.55), (8.60b), and G(1) = 1, and exploiting symmetry, we find that (8.63) is negative and hence 0 is an ESS for θ > θ2 , where we define the critical value (8.64) " 1 " 1 2 c(−y)g(y) dy − α p(−y)g(y) dy θ2 (σ 2 , α, r, k, c0 ) = 2 0
" −
2 3
0
1
0
"
1
p(−y)p(z) {2 − 3α}p(z − y) + 1 g(y)g(z) dy dz,
0
which increases (making unilateral aggression less likely) with α, r, and c0 , is independent of q, and decreases with k.11 The dependence of θ2 on σ 2 is slightly more complicated. Whenever k, c0 , and α are all sufficiently large, θ2 increases with σ 2 as illustrated by the dashed curves in Figure 8.9. Here, the higher the variance, the more costly pact making must be for universal war to be an ESS: if pact making were cheaper at higher variance, then it would pay at least the weakest animals to seek a pact. Decreasing either c0 or α not only lowers the dashed curves in Figure 8.9 but flattens them out, so that eventually (i.e., as c0 → 0 and α → 12 ) θ2 either increases only very weakly with σ 2 or even slightly decreases; however, θ2 itself is then either only marginally positive or even negative. Thus the more important point is that it is possible for unconditional aggression to be an ESS even if pact making is costless (θ = 0), as is illustrated by the lower panels of Figure 8.9. 10
The exception occurs for a uniform distribution, i.e., when a = 1. To the extent that if θ > 0, then unconditional aggression must be an ESS in the limit of infinite sensitivity because the first term of (8.64) approaches zero as k → ∞ while the remaining terms in this expression are negative. 11
8.3. Coalition formation: a strategic model
347
Figure 8.9. Critical pact cost θ2 , defined by (8.64), above which θ must lie for universal war to be an ESS (dashed curves), and critical pact cost θ1 , defined by (8.66), below which θ must lie for universal peace to be an ESS (solid curves) as a function of variance for c0 = 1 = q. Note that q = 1 makes θ1 independent of r (Exercise 8.7).
Correspondingly, 1 is at least a local ESS if (8.62) is positive for v = 1 in the vicinity of u = 1, and it can again be verified computationally that this ESS is also a global ESS: this time, we plot the graph of f (u, 1) to confirm that f (u, 1) < f (1, 1) for all u ≤ 0 < 1. Because usually g(1) = 0 even though g(u) > 0 for all u ∈ (0, 1), the sign of (8.62) for v = 1 in the vicinity of u = 1 is determined by the sign of the limit as u → 1 and v → 1 of the large term in braces, which reduces to " (8.65)
1 3
−θ−
"
1
0
1
{α p(1 − q{y + z}) − c(1 − q{y + z})}g(z) dz dy
g(y) 0
348
8. Triadic Population Games
because G(1) = 1. This term is positive for θ < θ1 , where we define a second critical value, " 1 " 1 θ1 (σ 2 , α, r, k, c0 , q) = 13 + g(y) c(1 − q{y + z})g(z) dz dy 0 0 (8.66) " 1 " 1 − α
p(1 − q{y + z})g(z) dz dy,
g(y) 0
0
which decreases (making a tripartite coalition less likely) with σ 2 , k, and α but increases with c0 .12 Without a synergistic effect, θ1 is independent of reliability; however, when q = 1, θ1 increases with r for q > 1 but decreases with r for q < 1, as illustrated by Figure 8.10. That θ1 decreases with α while θ2 increases with α means that increasing α makes the conditions for universal war and universal peace both harder to satisfy. One can readily understand why increasing α makes universal peace harder: the strongest animals have more to fight for. It is less obvious why increasing α makes universal war harder. But the absence of universal war simply means that the weakest animals aren’t aggressive, and the most the weakest animals can reasonably hope to achieve through aggression is to become the beta individual, which is worth less when α is greater. To have θ above the dashed curve in Figure 8.9 would mean unconditional war without coalition building, to have θ below the solid curve would mean universal peace. Thus Figure 8.9 indicates that all-out war and all-out peace are both possible at low variance for a wide range of values of α, r, and θ (at least for sufficiently large k), but that only one is possible at high variance, except in a narrow range of values of θ for sufficiently low r and α. The effect of synergy on the critical pact cost for an ESS of 1 is illustrated by Figure 8.10. We can identify three regimes: synergy (q > 1), absence of synergistic effects (q = 1), and antergy (q < 1). With synergy, the higher the reliability of contest outcome, the greater the likelihood of universal peace; equivalently, the greater the extent to which a two-versus-one contest would be a lottery (low reliability), the greater the incentive for the strongest individual to fight the remaining pair. With antergy, the lower the reliability, the greater the likelihood of universal peace; equivalently, the higher the reliability, the greater the incentive for the strongest individual to exploit the reduction of handicap against a pair that antergy then affords. In the absence of any synergistic effect, reliability has no impact on the likelihood of universal peace: as q approaches 1, whether from above or below, the curves for different values of r in Figure 8.10 all approach the dotted curve, preserving their order in the process (i.e., the highest value of r always corresponds to the uppermost curve for q > 1 but the lowermost curve for q < 1). For v ∗ ∈ (0, 1) to be at least a local ESS, we require ∂f ∂2f = 0, < 0 (8.67) 2 ∗ ∗ ∂u
u=v=v
∂u
u=v=v
12 Note that in the limit of zero variance, (8.64) and (8.66) reduce to θ2 (0, α, r, k, c0 ) = 2c 12 −
− 12 2(2 − 3α)p 12 + 3α and θ1 (0, α, r, k, c0 , q) = 13 + c(1 − q) − αp(1 − q), respectively. These two expressions determine all intercepts in Figures 8.9 and 8.10. (The coefficient of α in
vertical-axis
the first expression is p − 12 {2p 12 − 1}, which must be positive, although it is small when r is large.)
1 3p
8.3. Coalition formation: a strategic model
349
Figure 8.10. Critical pact cost θ1 , defined by (8.66), below which θ must lie for universal peace to be an ESS, as a function of variance when k = 25, c0 = α = 1, and q = 1 (so that θ1 is independent of r, dotted curve), and for various other values of q and r (as indicated).
from §2.5. On differentiating (8.62) with respect to u and then setting u = v = v ∗ , and with the help of (8.60b) and the definition of G, we find after simplification that
∂f = Λg(v ∗ )ψ(v ∗ ) ∂u u=v=v∗
(8.68a) and
∂2f = Λg(v ∗ )χ(v ∗ ), ∂u2 u=v=v∗
(8.68b) where (8.69) ψ(v) =
− θ − (1 − α){1 − G(v)}G(v) " v g(y) {α p(q{v + z} − y) − c(q{v + z} − y)}g(z) dz dy + v 0 " v " v − g(y) {α p(v − q{y + z}) − c(v − q{y + z})}g(z) dz dy 2 1 3 G(v) " 1
0
0
"
"
1
− 2(2α − 1)G(v)
g(y)p(v − y) dy + 2 v
− α
"
1
g(y)p(v − y) dy "
2 3 (2
− 3α)
"
1
2
"
"1 −
g(y)p(v − y) dy
2 3 v
1
g(z)p(z − v) dz v
1
p(v − y)p(z − v)p(z − y)g(z) dz dy
g(y) v
g(y)c(v − y) dy v
v
−
1
v
350
8. Triadic Population Games
and (8.70a) "
"1
1
g(y)p(v − y) dy
χ(v) = − 2α v
"
v
"
1
g(y)p (v − y) dy
v
g(y) {α p (q{v + z} − y) − c (q{v + z} − y)}g(z) dz dy +q " 0v " vv g(y) {α p (v − q{y + z}) − c (v − q{y + z})}g(z) dz dy − 0
0
"
1
− 2(2α − 1)G(v)
g(y)p (v − y) dy + 2
v
"1 −
"
1
g(y)c (v − y) dy
v
"1 {φ(v, y, z)ω(y, z) + φ(v, z, y)ω(z, y)}g(z) dz dy
g(y) v
v
is defined with the help of (8.70b) (8.70c)
φ(v, y, z) = p (v − y)p(z − v) − p(v − y)p (z − v), ω(y, z) =
1 3 p(y
− z) + (1 − α)p(z − y)
(Exercise 8.7). Here, as in §2.6, a prime denotes differentiation with respect to argument; specifically, from (8.56)–(8.57), p (Δs) =
(8.71) and
Γ(2r) Γ(r)2
41−2r {(2 + Δs)(2 − Δs)}r−1
k−1 . c (Δs) = − 12 kc0 Δs 1 − 14 |Δs|2
(8.72)
Because g(v ∗ ) = 0, it follows at once from (8.67) that ψ(v ∗ ) = 0 and that χ(v ∗ ) must be negative for v ∗ to be an interior ESS. Note that θ > θ2 is equivalent to ψ(0) < 0 and that θ < θ1 is equivalent to ψ(1) > 0. Moreover, on differentiating (8.62) with respect to v, then setting u = v and combining with (8.68b), we find after simplification that 2 ∂ f ∂2f + = Λg(v ∗ )ρ(v ∗ ), (8.73) 2 ∗ ∂u
∂u∂v
u=v=v
where (8.74)
" 1 g(y){α p(2qv − y) − c(2qv − y)} dy ρ(v) = α − 1 + 83 − 3α G(v) + v " v " v − α p {1 − q}v − qy g(y) dy + 3 c {1 − q}v − qy g(y) dy 0
" + 2(2α − 1) p(0)G(v) − " − 2c(0) +
0 1
p(v − y)g(y) dy
v 1
{ζ(v, v, y) + ζ(v, y, v)}g(y) dy + χ(v) v
8.3. Coalition formation: a strategic model
351
1 Figure 8.11. The ESS v = v ∗ (r) for k = 50, θ = 0, σ 2 = 12 , q = 1, α = 1 ∗ with (a) c0 = 0.1, r0 ≈ 12.4, m1 ≈ 24.1, m2 ≈ 24.9, v (∞) ≈ 0.9 (dotted line) and (b) c0 = 0.3, m2 = r0 ≈ 2.6, m1 ≈ 2, v ∗ (∞) ≈ 0.97.
with ζ defined by (8.60b) and χ by (8.70). Because g(v ∗ ) = 0, it follows from (2.123) and (8.73) that v ∗ is continuously stable when ρ(v ∗ ) < 0. From above, if θ1 < θ < θ2 , so that neither 0 nor 1 is an ESS, then ψ(0) > 0 > ψ(1), implying that the equation (8.75)
ψ(v) = 0 ∗
must have at least one solution v on (0, 1); however, there may be more than one such value of v ∗ , as illustrated by Figures 8.11(a) and 8.12(a). For arbitrary values of the seven parameters c0 , q, θ, α, r, k, and σ 2 all such solutions are readily found by numerical methods.13 If χ(v ∗ ) < 0, then v ∗ is at least a local ESS, and we can verify that v ∗ is also a global ESS, i.e., f (u, v ∗ ) < f (v ∗ , v ∗ ) for all u ∈ [0, 1] such that u = v ∗ , by plotting f (u, v ∗ ) against u (as in Exercise 8.9). We now describe how reliability of contest outcome affects the ESS, which we denote by v ∗ (r) when the other six parameters are held constant. The results of a typical such calculation are illustrated by Figure 8.11, which shows v = v ∗ (r) for 1 (uniform distribution), q = 1, α = 1, and two different k = 50, θ = 0, σ 2 = 12 values of c0 . In general, there are two possible pairs of critical values for r. The first consists of the highest value at which 0 is an ESS, denoted by r0 , and the lowest value at which 1 is an ESS, denoted by r1 . The second pair of values, denoted by m1 and m2 , yields the lower and upper limits of an interval in which the ESS is not unique: for m1 < r < m2 , v = v ∗ (r) is either a doubly or a triply valued function. These points are illustrated by Figure 8.11(a), where r0 ≈ 12.4, m1 ≈ 24.1, m2 ≈ 24.9, and r1 does not exist: although v ∗ (r) continues to increase at higher values of r than are shown in the diagram, it never reaches the value 1, and instead approaches the asymptote v ∗ (∞) ≈ 0.9. For this relatively low maximum cost (here c0 = 0.1), v ∗ (r) is zero for r < r0 , is unique and increasing for r0 < r < m1 , is triply 13
For example, by using the Mathematica command FindRoot in conjunction with NIntegrate.
352
8. Triadic Population Games
1 Figure 8.12. The ESS v = v ∗ (r) for k = 50, θ = 0, σ 2 = 12 , q = 1.2, α = 1 with (a) c0 = 0.1, r0 ≈ 12.4, m1 ≈ 7.7, m2 ≈ 18.4, r1 ≈ 25.5 and (b) c0 = 0.3, m2 = r0 ≈ 2.6, m1 ≈ 0.8, r1 ≈ 1.76. An interior ESS is continuously stable where the curve is solid.
valued for m1 < r < m2 , and is again unique and increasing for r > m2 . At higher maximum cost (here c0 = 0.3), however, the picture changes as illustrated in Figure 8.11(b), where m1 ≈ 2 and r0 = m2 ≈ 2.6. There is still a narrow range of values of r for which ψ(v) = 0 has three solutions v; however, only the largest corresponds to an interior ESS because χ(v) > 0 for the other two. Thus v ∗ (r) = 0 for r < m1 ; there are two ESS thresholds, a high one and zero, for m1 < r < m2 ; and v ∗ (r) is unique for r > m2 , and increases towards the asymptote v ∗ (∞) ≈ 0.97.14 At even higher maximum cost, e.g., if c0 were replaced by 0.5 in Figure 8.11, there are no interior ESSs, but v = 1 is an ESS (because ψ(1) > 0, or equivalently θ < θ1 ). We have just seen that, in the absence of a synergistic effect (q = 1) and for sufficiently low maximum fighting cost, the ESS threshold for forming coalitions is low at low reliability of contest outcome but increases quite rapidly across a relatively narrow transition region to become high at high reliability. Even a modest synergistic effect appreciably widens this transition region, as illustrated by Figure 8.12(a). It is now important to note that when m1 < r < m2 , the highest and lowest ESS thresholds are both always found to be continuously stable, whereas the intermediate threshold is not; for example, when r = 16 in Figure 8.12(a), the three ESSs are v1 ≈ 0.076 with χ(v1 ) ≈ −0.039 and ρ(v1 ) ≈ −0.11; v2 ≈ 0.3 with χ(v2 ) ≈ −0.14 and ρ(v2 ) ≈ 0.11; and v3 ≈ 0.96 with χ(v3 ) ≈ −0.63 and ρ(v3 ) ≈ −0.19. From §2.9, the population will track the ESS as long as it is continuously stable. So what we expect to happen in a population where r increases sufficiently slowly from low to high values is that the ESS will follow the curve in Figure 8.12(a) until it starts to bend back, then jump to the upper branch as indicated by the solid arrow. Conversely in a population where r decreases sufficiently slowly from high to low values, the population will jump from the upper branch to the lower one as indicated by the dotted arrow. 14 The upper branch of the curve in Figure 8.11(b) could be continued leftward as a local ESS (which can be invaded by zero), before even that local ESS disappears at r ≈ 1.5; however, only global ESSs are shown in our diagrams.
8.3. Coalition formation: a strategic model
353
In a population at the ESS, there are four possible outcomes for every triad that draws three strengths from the distribution. The probability of universal war is p0 (v ∗ ) = Prob(X > v ∗ , Y > v ∗ , Z > v ∗ ) = {1 − G(v ∗ )}3 , where v ∗ denotes the ESS; if v ∗ = 0 then p0 = 1. The probability of universal peace is (8.76)
p3 (v ∗ ) = Prob(X < v ∗ , Y < v ∗ , Z < v ∗ ) = G(v ∗ )3
with p3 = 1 if v ∗ = 1. Similarly, the probability of a lone pact seeker ceding to a two-way struggle for dominance between the other two individuals is the probability that two of the strengths X, Y , Z lie above the threshold v ∗ while the other one lies below it, or p1 (v ∗ ) = 3G(v ∗ ){1 − G(v ∗ )}2 . Finally, and of special interest, the probability of a coalition of two individuals against the third is the probability that two of the strengths X, Y , Z lie below the threshold v ∗ while the other one lies above it, or (8.77)
p2 (v ∗ ) = 3G(v ∗ )2 {1 − G(v ∗ )}.
Because (8.77) is 0 if either v ∗ = 0 or v ∗ = 1, the probability p2 of a true coalition of two individuals against the third is nonzero only for an interior ESS. To illustrate, suppose that in Figure 8.12(b) the population tracks the ESS as r increases from a very low to a very high value, jumping from the lower curve to the upper curve as r passes through m2 (= r0 in this case). Then p2 is always 0, because v ∗ = 0 jumps from 0 to 1 as r crosses 2.6. By contrast, in Figure 8.11(a) or in Figure 8.12(a), the probability of a true coalition can be substantial. Suppose, for example, that the population tracks the ESS in Figure 8.11(a) as r increases from a very low to a very high value, jumping from the lower to the upper branch of the curve as r passes through m2 ≈ 24.9. Then p2 varies as shown by the solid curve in Figure 8.13(a). If q is increased by 20% but all other parameters retain their values, however, then p2 varies as shown in Figure 8.13(b) instead. Thus the probability of a true coalition is much lower with synergy than without it: synergy encourages a three-way pact, as indicated by Figure 8.13 (for reasons to be discussed below, on p. 355). To describe how variance σ 2 affects the probability p2 of a true coalition, we first describe how it affects the ESS, which we denote henceforth by v ∗ (σ 2 ). The results of a typical calculation are illustrated by Figure 8.14(a), which shows v = v ∗ (σ 2 ) for k = 50, θ = 0, r = 30, q = 1, α = 1, and c0 = 0.1. For very low variance the ESS is multivalued, for example, when σ 2 = 0.002, the three ESSs are given by v ∗ ≈ 0.357, v ∗ ≈ 0.454, and v ∗ ≈ 0.95, all of which are continuously stable. Nevertheless, it is likely that the multivalued ESS region is of greater mathematical than biological interest, for two reasons. The first is that the ESS is multivalued in Figure 8.14(a) only if σ 2 is less than about 0.00425 or if the coefficient of variation in strength is less than about 13%, which would be unusually low. The second reason is that, because the ESS is continuously stable, if ever the variance were high enough to make the ESS unique, then the population would track the upper curve—even if the variance subsequently became extremely low. Accordingly, in Figure 8.14(b) we plot p2 and p3 only for the upper curve of Figure 8.14(a). We already know from Figure 8.14 that p2 increases with variance, and from Figure 8.13 that it remains nonnegligible in the limit as r → ∞. Therefore, to explore how synergicity q affects the ESS, for the remainder of this section we
354
8. Triadic Population Games
Figure 8.13. The probability of a true coalition (p2 , solid curve) or universal 1 , α = 1, c0 = 0.1, and peace (p3 , dashed curve) for k = 50, θ = 0, σ 2 = 12 two values of q. (a) The population tracks the ESS in Figure 8.11(a) from left to right; p2 → p2 (v ∗ (∞)) ≈ 0.244 and p3 → p3 (v ∗ (∞)) ≈ 0.728 as r → ∞ (dotted lines). (b) The population tracks the ESS in Figure 8.12(a) from left to right. In both cases, vertical segments correspond to where the ESS jumps from the lower to the upper branch at r = m2 .
Figure 8.14. (a) The ESS v = v ∗ (σ 2 ) for k = 50, θ = 0, r = 30, q = 1, α = 1, and c0 = 0.1, which is multivalued when the variance is very low. (b) The probability of a true coalition (p2 , solid curve) or universal peace (p3 , dashed curve) when the population tracks the high ESS in the diagram on the left.
assume that both reliability and variance are maximal, and in this limit we denote the ESS by v ∗ (q). The results of a typical calculation appear in Figure 8.15(a), which shows v = v ∗ (q) for θ = 0, α = 1, c0 = 0.1, and two different values of k. The ESS is unique and continuously stable. For the maximum-variance (uniform) distribution, we infer from (8.54) with a = 1 and from (8.77) that G(v ∗ ) = v ∗ and p2 = 3v ∗ 2 {1 − v ∗ },
8.4. Commentary
355
Figure 8.15. (a) The ESS v = v ∗ (q) and (b) the probability p2 = p2 (v ∗ (q)) of a true coalition for θ = 0, α = 1, c0 = 0.1 and two values of k in the limit 1 ). The dotted lines are of maximum reliability and variance (r → ∞, σ 2 → 12 explained in the text.
which is maximal when v ∗ = 23 . Thus we can infer the value of q that yields the maximum probability p2 of a true coalition as easily from Figure 8.15(a)—as indicated by the dotted lines—as from Figure 8.15(b), where p2 itself is plotted against q. What we find is that the relevant synergicity is always less than 1 and increases slightly with k. Thus antergy is conducive to a true coalition, although not directly; rather, antergy is conducive to the strongest individual going it alone, as opposed to joining a tripartite coalition. This point is perhaps most readily appreciated by supposing that q is first high, so that p2 = 0 and p3 = 1 (as on the right of Figure 8.15), and all three animals are in a pact. It makes no sense for the strongest animal to withdraw from this pact, because high synergicity amplifies the combined strengths of the two weaker animals too much. If q is steadily reduced, however, then the degree of amplification will correspondingly diminish, until eventually it becomes in the strongest animal’s interest to withdraw from the pact and take on the other two. In sum, a clear prediction is that coalitions of two versus one are most likely to arise when strength has high reliability as a predictor of contest outcome (large r), there is a degree of antergy in combining strengths (q < 1) and their variance is high (large σ 2 ). These conditions do seem to mirror those in which coalitions are found in primate societies, e.g., in male savanna baboons [247, 248], where there is intriguing evidence that “at least one of the partners did not pull his weight” [248, p. 212]; in other words, that q < 1.15
8.4. Commentary The models we studied in this chapter exemplify how continuous triadic population games can be used to clarify central aspects of behavior within a network. We 15 There is also evidence that q < 1 for coalitions among fiddler crabs that arise when an animal intervenes to help a neighbor win a territorial contest [231, p. 272], although in this instance there is an asymmetry of role, which is absent from §8.3.
focused on winner and loser effects (§8.1, which is based on [209]), victory displays (§8.2, which is based on [227]), and coalition formation (§8.3, which is based on [228]). But triadic games can also be used to clarify other aspects of network behavior, such as social eavesdropping [229] and intervention in neighborhood disputes [231]. Indeed, because triadic games are a relatively recent development, they have received only limited attention to date, and their potential remains enormous [230, 304].

Winner and loser effects (§8.1) may help to explain the persistence of linear dominance hierarchies, in which no individuals are of equal or indeterminate rank. Such hierarchies can form in social groups through a series of pairwise contests in which animals are able to assess one another's strength prior to any escalation, with the winner of each contest dominating the loser. Because relative strength is never a perfectly reliable predictor of the outcome of a contest, however, some games are won by the weaker contestant; thus, for the hierarchy to persist, stronger animals must sometimes consider themselves subordinate to weaker ones, instead of attempting to reverse the asymmetry at a subsequent encounter. Why? Perhaps because there exist winner and loser effects: one effect of victory by a weaker animal may be to raise its perception of its strength and/or to lower that of its opponent, so that after they fight the opponent perceives its relative strength to lie below an evolutionarily stable aggression threshold, whereas previously it lay above, provoking a fight [215]. It is shown in [165] that winner and loser effects can enable a dominance hierarchy to form quite rapidly, with mutual aggression only in early contests.

The main prediction of §8.1 is that a winner effect cannot exist without a loser effect. This prediction has been upheld by numerous experiments over several decades [215, p. 40] and is consistent with two more recent models [94, 341] that make different assumptions.¹⁶ Yet a recent study of contests in the parasitoid wasp Eupelmus vuilleti has presented contrary evidence of a winner effect without a loser effect [112]. Here four remarks are in order. First, we should not expect a prediction from any model to be upheld if its assumptions are violated. In §8.1 we assumed that winning is equally valuable to both contestants, whereas an asymmetry in the value of winning appears to account for the winner effect observed without a loser effect in E. vuilleti [112]. Second, even though changes in self-perception of strength cannot explain this result, they may still account for winner or loser effects observed in other systems; evidence abounds that winner and loser effects may arise through different mechanisms, even within a single species [215, p. 40]. Third, the winner effect observed without a loser effect in E. vuilleti could not be replicated in a later experiment [113], indicating, together with other observations [215, pp. 41–42], that winner and loser effects are highly context-dependent. Lastly, this context-dependence yields ample scope for newer models [215], [305, p. 1196].

¹⁶It can therefore be regarded as a robust theorem in the sense described in Chapter 9 (p. 361).

In §8.2, to focus on factors favoring the evolution of victory displays, we ignored the question of why such displays should be respected, either by observers in Model A or by losers in Model B. As discussed at length in [227, pp. 603–604], a likely answer is that respecting displays avoids the costs of potentially protracted
disputes, and in that sense is analogous to respecting landmarks in §6.6. Our assumption that a loser does not display is relaxed in [234], which extends Model B to incorporate the effect of contest length as well as the loser's strategic response. The new model predicts that if the reproductive advantage of dominance over an opponent is sufficiently high—that is, if b is sufficiently low—then, in a population adopting the ESS, neither winners nor losers signal in contests that are sufficiently short, and only winners signal in longer contests, but with an intensity that increases with contest length. These predictions are consistent with the outcomes of laboratory studies in both crickets [24] and mangrove crabs [62].

Finally, the topic of coalition formation (§8.3), especially among nonhuman animals, has attracted increasing attention over the last decade from theoreticians [220] and empiricists [30] alike. Together they have identified a range of empirical patterns requiring theoretical explanations [30, p. 40]. The newer models needed to address these issues represent yet another golden opportunity for game theorists—at what is currently one of game theory's most promising frontiers.
Exercises 8

1. A large population of local manufacturers of home-brewed beer is interacting in triads because each local market is large enough only for three firms to compete. Each firm advertises its product on local television once a day, either in the morning or at night. If more than one firm in a local market (= triad) advertises at the same time, then none of the firms in that triad experiences a profit increase; whereas if one firm advertises when the others do not, then its profit increases by $α per day if it advertises in the morning or by $να per day if it advertises at night. All firms use a mixed strategy: advertising in the morning with probability p (and at night with probability 1 − p) is defined to be strategy p. Obtain an expression for the reward to a mutant p-strategist in a population of q-strategists, and show that q* = 1/(1 + √ν) is a continuously stable ESS.¹⁷

2. (a) Verify that (8.19a) has a solution satisfying 0 ≤ v₀ ≤ 1 for v₁ = 1 only if (8.20) is satisfied, and that (8.21) is the only such solution.
(b) Verify that (8.27a) has a solution satisfying 0 ≤ v₂ ≤ 1 for v₀ = v₁ = 1 only if 12r > 4b + 1, and that (8.28) is the only such solution.
(c) Verify that λ₀ defined by (8.27b) and (8.28) is negative throughout the triangle in the r–b plane with vertices at (1/4, 1/2), (1/3, 1/2), and (16/29, 92/23).
(d) Verify Figure 8.3.

3. In §8.1 we assumed that (actual) fighting strength has a uniform distribution. Suppose instead that strength is distributed between 0 and 1 with probability density function g defined by g(ξ) = 16224936{ξ(1 − ξ)}¹¹.
(a) What equation must v₂ satisfy for (1, 1, v₂) to be an ESS (with a loser but no winner effect) if b = 1/20 and r = 199/2000?
(b) Does this equation have a unique solution?

¹⁷This exercise was suggested by [107, p. 120].
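Readers who want a numerical sanity check of Exercise 1 before attempting the algebra can grid over mutant strategies. The sketch below uses the reward quoted in the hint to this exercise in Appendix B, f(p, q) = α{[(1 − q)² − νq²]p + νq²}; the value ν = 2 is an arbitrary choice for illustration.

```python
import math

# Reward to a mutant p-strategist among q-strategists (from the hint in
# Appendix B): f(p, q) = alpha * { [(1-q)^2 - nu*q^2] * p + nu*q^2 }.
def f(p, q, alpha=1.0, nu=2.0):
    return alpha * (((1.0 - q)**2 - nu * q**2) * p + nu * q**2)

nu = 2.0
q_star = 1.0 / (1.0 + math.sqrt(nu))
for p in [i / 100.0 for i in range(101)]:
    # First ESS condition: no mutant does better against q* (equality here,
    # because K(q*) = 0 makes q* a weak ESS).
    assert f(p, q_star, nu=nu) <= f(q_star, q_star, nu=nu) + 1e-12
    # Second condition (2.18): q* does strictly better against any mutant.
    if abs(p - q_star) > 1e-9:
        assert f(q_star, p, nu=nu) > f(p, p, nu=nu)
print("q* =", q_star, "passes both ESS conditions on a grid")
```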
4. Verify Tables 8.7 and 8.8.

5. This question refers to Model A in §8.2.
(a) Increasing the value of any probability of deference to a prior winner, regardless of whether it is λ₀ (for an untested individual), λ₁ (for a prior loser), or λ₂ (for a prior winner), will always have the same effect at the ESS. What is it?
(b) What is the effect of increasing λ₃ from 0, that is, allowing prior losers to defer to one another?
(c) What is the effect of increasing the loser effect l?

6. Verify that (8.51) yields the unique ESS for Model B in §8.2.

7. (a) Verify the calculations that lead to (8.62).
(b) Use symmetry to deduce that the last term of (8.66) becomes −½α when q = 1. Thus, in the absence of a synergistic effect, θ₁ in Figure 8.9 is independent of the reliability. (The last term of (8.66) also becomes −½α in the limit as r → 0, even for q ≠ 1.)
(c) Verify (8.68)–(8.74).

8. In §8.3, some expressions simplify considerably in the limit of maximum reliability and variance, i.e., as r → ∞ and σ² → 1/12.
(a) Show that (8.64) becomes θ₂ = 2c₀B(1/4, 1/2, 1 + k) in this limit, where B is the incomplete Beta function (defined on p. 246).
(b) Show that (8.66) reduces to
$$\theta_1 = \tfrac{1}{3} + \int_0^1\!\int_0^1 c(1 - q\{y + z\})\,dz\,dy - \alpha A_1(q)$$
with A₁(q) = 1 − ½(2 − 1/q)² if ½ ≤ q ≤ 1 but A₁(q) = 1/(2q²) if 1 < q ≤ 2. It is assumed throughout that ½ ≤ q ≤ 2 (see p. 343).
(c) Show that (8.69) reduces to
$$\psi(v) = \tfrac{1}{3}v^2 - \theta - (1 - \alpha)v(1 - v) + \alpha A_2(q, v) + \int_v^1 c(v - y)\,dy + \int_v^1\!\int_0^v c(q\{v + z\} - y)\,dz\,dy - \int_0^v\!\int_0^v c(v - q\{y + z\})\,dz\,dy,$$
where A₂(q, v) = ½(q − 1)(4 + 2/q − 1/q²)v² − vA₃(qv) if ½ ≤ q ≤ 1 but A₂(q, v) = v − (1 + ½q⁻²)v² − vA₄(qv) if 1 < q ≤ 2; A₃(ξ) = ½(1 − 2ξ)²/ξ if ½ ≤ ξ ≤ 1 but A₃(ξ) = 0 if 0 ≤ ξ ≤ ½; and A₄(ξ) = 1 − (3/2)ξ if 0 ≤ ξ ≤ ½ but A₄(ξ) = ½(1 − ξ)²/ξ if ½ ≤ ξ ≤ 1, with A₄(ξ) = 0 if 1 < ξ ≤ 2. These expressions are used for the asymptotes (dotted lines) in Figure 8.11 and the effects of synergicity in Figure 8.15.

9. Figure 8.11 illustrates that (8.75) may have three solutions, which for r = 16 (and the given values of the other parameters) are v₁ ≈ 0.076, v₂ ≈ 0.3, and v₃ ≈ 0.96 (p. 352). By plotting f(u, vᵢ) on a suitable interval for i = 1, . . . , 3, verify that v₁, v₂, and v₃ are all (global) ESSs.
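A sketch of the check asked for in Exercise 9 follows. It assumes you supply your own implementation my_f of the reward function of §8.3 (that function is not reproduced here), and it simply compares the resident's payoff with a grid of mutant payoffs while plotting f(u, vᵢ):

```python
import numpy as np
import matplotlib.pyplot as plt

def verify_global_ess(f, v_star, n=2001, tol=1e-9):
    """Plot f(u, v*) and check f(u, v*) <= f(v*, v*) on a grid of mutants u."""
    u = np.linspace(0.0, 1.0, n)
    payoffs = np.array([f(ui, v_star) for ui in u])
    resident = f(v_star, v_star)
    plt.plot(u, payoffs, label=f"f(u, {v_star})")
    plt.axhline(resident, linestyle="dotted")   # resident payoff f(v*, v*)
    plt.xlabel("mutant strategy u")
    plt.legend()
    return bool(np.all(payoffs <= resident + tol))

# Usage, once my_f implements the reward for the parameters of Figure 8.11:
# for v_star in (0.076, 0.3, 0.96):
#     print(v_star, verify_global_ess(my_f, v_star))
# plt.show()
```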
Chapter 9
Appraisal
A well-designed model is a deliberate simplification of reality. This guiding principle has been variously restated by Maynard Smith [184, p. 21] as "all good models in science leave out a lot. A model which included everything would be too complicated to analyze", and by Box and Draper [37, p. 414] as "all models are wrong, but some are useful". Games are useful simplifications of reality when they help to answer questions about behavior, which often appear in the form of a paradox. As noted in §6.2 (p. 218), the method by which we resolve such a paradox assumes that the observed behavior corresponds to some ESS. But humans and other animals have interacted strategically for many thousands of years. Over time they have evolved behavior to deal with such interactions; and so it is reasonable to suppose that, by a process of trial and error, the behavior they exhibit in familiar situations—a key restriction, because regularities are what science deals with¹—is already the behavior that optimizes their rewards. Then we can resolve the paradox by constructing a game whose ESS corresponds to the observed behavior.

For example, we saw in §7.3 that the spider Oecobius civitas can behave in a most peculiar way. A disturbed animal may enter the lair of another spider; the owner, far from shooing the intruder out, will scurry off to bump another spider; that one in turn will bump yet another spider; and so on, often until most of the spiders in a colony have been displaced from their lairs. This behavior is paradoxical. Yet O. civitas has frequented rocks for countless generations and must surely be familiar with getting disturbed, in which case we would expect the spider's strategy to be evolutionarily stable. But what is the game for which its strategy is an ESS? Concerning this question, in §7.3 we built a model of owner–intruder interactions, we postulated a reward and a strategy set, and we found a small region in parameter space with a unique ESS that resembles the spider's observed behavior. Thus the model suggests a reason for the spider's odd behavior, and the size of the region suggests a reason for its rarity.
¹See, e.g., Rubinstein [290, p. 921].
Regardless of whether an observed behavior is paradoxical, games can be useful in identifying conditions that favor the behavior in question. Our threat game (§6.9), whose partial-bluffing ESS arises only for a particular cost structure (p. 274), identifies conditions favoring a high frequency of bluffing. Our landmark game (§6.6) identifies conditions that favor an animal accepting a smaller territory than it would have obtained through fighting (p. 246), the triadic game in §8.3 identifies conditions that favor coalitions of two versus one (p. 355), the games in §8.2 distinguish conditions favoring advertising displays from those that favor browbeating displays (p. 340), and so on.

We have shown repeatedly that a well-designed population game tends to have a unique ESS, which varies across parameter space.² Different parameter values can represent either different species or different ecotypes of the same species, where an ecotype is defined as a population adapted to a specific ecological environment. Thus ESS variation across parameter space can be interpreted as variation of ecotype across environments, or ecotypic variation [129]. ESS variation over parameter space can also be interpreted as variation within a population, if we assume that ESS variation over many theoretical populations, each characterized by a different parameter value, will reasonably approximate variation of behavior with respect to that parameter within a single real population. For example, with regard to §8.3, different triadic conflicts over resources within a population may have different synergicities. This point is discussed at greater length elsewhere [234, p. 33].

²In doing so, we have also incidentally shown that the potential difficulties of nonexistence or nonuniqueness associated with an ESS or Nash equilibrium are more of a concern in theory than they are in practice, and so we have not dwelt on them.

An ESS may not be directly observable; for example, how do we know whether aggression by an owner in the IH-D (§7.3) is by a Hawk or by a Bourgeois? Yet inability to observe an ESS does not prevent games from being useful, because we can instead observe outcomes associated with the ESS, such as the frequency of true coalitions (Figure 8.15(b)) in §8.3 or the frequency of landmarks being honored (Figures 6.11 and 6.12) in §6.6. Games are not even prevented from being useful by imprecise knowledge of relevant behavioral data, but it is at least desirable and often essential that the model should be robust, i.e., insensitive to small changes in parameter values. For example, the partial-bluffing ESS of §6.9 holds for arbitrary values of the threat-cost or combat-cost parameters; it can be shown that §8.1's prediction of no winner effect without a loser effect is robust to changes in the distribution of fighting strength or in the cost function or contest success function [209, pp. 1181–1184]; in order to conclude that community-based wildlife management schemes could be better designed in §7.2, we required only that duties associated with monitoring have a low opportunity cost; the delayed-recruitment ESS of §7.1 holds as long as the search area is sufficiently large and each individual has a sufficiently low probability of locating the bonanza; and so on.

Levins [171, p. 422] has characterized the models that ecologists use as compromises between generality, realism, and precision, reflecting compromises between the simultaneously unattainable goals of, respectively, understanding, predicting, and modifying nature. From this perspective, our models are primarily general models. We hope that their assumptions are not unrealistic, but we acknowledge that some parameters (e.g., the marginal-cost parameter in §6.4's mating game)
may be difficult or even impossible to measure, and so we cannot expect precise quantitative predictions. But their predictions can still be tested. For example, a test of the self-assessment ESS in §6.2 is whether loser reserves correlate positively with contest duration; a test of the partial-bluffing ESS in §6.9 is whether animals who threaten before losing pay higher costs than those who lose without threatening; and §8.1’s prediction of no winner effect without a loser effect has frequently been tested, and corroborated, as discussed in §8.4. Yet even where empirical tests are inconclusive, games are useful simply because they allow us to explore the logic of a verbal argument rigorously—they act as proof-of-concept tests [301]. They often demonstrate what is difficult to intuit; for example, that victory by stronger animals need not imply that strength is being assessed (§6.2), that bluffing can persist at high frequency (§6.9), or that a degree of antergy favors coalitions of two against one (§8.3). What does the future hold for games? Perhaps some day there will be a general theory that fully and explicitly describes all aspects of strategic interaction. But animal behavior is extremely complicated, and much valuable progress towards understanding it can still be made, especially in the short term, not by seeking a general theory, but rather by building a greater variety of explicit models of specific conflicts in which behavior has been observed—in other words, by continuing the approach we have adopted in this book. Levins [171, p. 423] has expressed a rationale for it as follows: . . . we attempt to treat the same problem with several alternative models each with different simplifications but with a common biological assumption. Then, if these models, despite their different assumptions, lead to similar results, we have what we can call a robust theorem which is relatively free of the details of the model. Hence our truth is the intersection of independent lies.
Or as Maynard Smith and Szathmáry have said, "complex systems can best be understood by making simple models" [193]. In sum: The vast potential of games remains untapped in many areas. There is ample scope for newer models, especially in the field of behavioral ecology. I have already mentioned issues that strike me as especially worthy of attention,³ and others have offered their own perspectives on where the future lies; see, e.g., McNamara [197], Riechert [284], and Rusch and Gavrilets [291]. As an active field of research, games remain as promising today as they were when I wrote the first two editions of this book—and once again, I hope that it will help to attract you towards them.
³Issues that strike me as especially worthy of attention include self- versus mutual assessment (p. 274), nonhuman conventions (p. 275), wildlife conservation (p. 315), winner and loser effects (p. 356), coalition formation (p. 357), and triadic games in general (p. 356).
Appendix A
Bimatrix Games
In this appendix we elaborate further on elements of the bimatrix game introduced at the end of §1.1. Let Player 1 have m₁ pure strategies with payoff matrix A, and let Player 2 have m₂ pure strategies with payoff matrix B; A and B are both m₁ × m₂, as on p. 12. Let pᵢ be the probability that Player 1 adopts pure strategy i, and let all such probabilities form the m₁-dimensional row vector p = (p₁, . . . , p_{m₁}). Likewise, let qⱼ be the probability that Player 2 uses pure strategy j, let all such probabilities form the m₂-dimensional row vector q = (q₁, . . . , q_{m₂}), and let a superscripted T denote a transpose. Because Player 1's payoff from pure strategy i when Player 2 adopts pure strategy j is aᵢⱼ, and because Player 2 adopts pure strategy j with probability qⱼ, the expected payoff to Player 1 from pure strategy i is $\sum_{j=1}^{m_2} a_{ij}q_j = (Aq^T)_i$, the ith component of the m₁-dimensional column vector Aqᵀ. Player 1 obtains this reward with probability pᵢ, and so her unconditional reward is

(A.1a)  $\sum_{i=1}^{m_1} (Aq^T)_i\,p_i = \sum_{i=1}^{m_1} p_i (Aq^T)_i = pAq^T = \sum_{i=1}^{m_1}\sum_{j=1}^{m_2} p_i a_{ij} q_j.$

Correspondingly, because Player 2's payoff from pure strategy j when Player 1 adopts pure strategy i is bᵢⱼ, and because Player 1 adopts pure strategy i with probability pᵢ, the expected payoff to Player 2 from pure strategy j is $\sum_{i=1}^{m_1} p_i b_{ij} = (pB)_j$, the jth component of the m₂-dimensional row vector pB. Player 2 obtains this reward with probability qⱼ and so has unconditional reward

(A.1b)  $\sum_{j=1}^{m_2} (pB)_j\,q_j = pBq^T = \sum_{i=1}^{m_1}\sum_{j=1}^{m_2} p_i b_{ij} q_j.$
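As a concrete illustration of (A.1)—with made-up 2 × 3 matrices rather than a game from the text—the rewards are just matrix–vector products:

```python
import numpy as np

A = np.array([[3.0, 0.0, 1.0],
              [1.0, 2.0, 0.0]])   # Player 1's payoffs, m1 x m2
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])   # Player 2's payoffs, m1 x m2
p = np.array([0.25, 0.75])        # Player 1's mixed strategy (row vector)
q = np.array([0.5, 0.3, 0.2])     # Player 2's mixed strategy (row vector)

f1 = p @ A @ q                    # = p A q^T, since q is one-dimensional
f2 = p @ B @ q                    # = p B q^T
print(f1, f2)
```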
Now, in general a mixed strategy is a vector of probabilities for choosing pure strategies. Because the component probabilities of both p and q must sum to 1, however, the number of decision variables for each player when using a mixed strategy is one fewer than her number of pure strategies. Accordingly, we set s₁ = m₁ − 1, s₂ = m₂ − 1, and we define Player 1's strategy to be the s₁-dimensional vector u = (u₁, u₂, . . . , u_{s₁}) containing her probabilities of selecting one of her first
s₁ pure strategies; that is, we set pᵢ = uᵢ for 1 ≤ i ≤ s₁. We regard these s₁ probabilities as decision variables. But we do not view p_{m₁} as a decision variable, because the probability that Player 1 selects pure strategy m₁ is already determined by u. Correspondingly, we define Player 2's strategy to be the s₂-dimensional vector v = (v₁, v₂, . . . , v_{s₂}) containing her probabilities of selecting one of her first s₂ pure strategies; that is, we set qⱼ = vⱼ for 1 ≤ j ≤ s₂, which also determines q_{m₂}. The players' joint strategy combination is now the (s₁ + s₂)-dimensional row vector (u, v) = (u₁, u₂, . . . , u_{s₁}, v₁, v₂, . . . , v_{s₂}), and setting p = (u, p_{m₁}) and q = (v, q_{m₂}) in (A.1), the players' rewards from strategy combination (u, v) become¹

(A.2)  $f_1(u, v) = (u, p_{m_1})A(v, q_{m_2})^T, \qquad f_2(u, v) = (u, p_{m_1})B(v, q_{m_2})^T$

with $p_{m_1} = 1 - \sum_{i=1}^{s_1} u_i$ and $q_{m_2} = 1 - \sum_{j=1}^{s_2} v_j$. So the strategy sets are

(A.3a)  $\Delta_1 = \{u \in \mathbb{R}^{s_1} \mid u_i \ge 0 \text{ for } i = 1, \ldots, s_1 \text{ and } \sum_{i=1}^{s_1} u_i \le 1\}$

for Player 1 and

(A.3b)  $\Delta_2 = \{v \in \mathbb{R}^{s_2} \mid v_i \ge 0 \text{ for } i = 1, \ldots, s_2 \text{ and } \sum_{i=1}^{s_2} v_i \le 1\}$

for Player 2, where ℝ is the real numbers and ℝⁿ is n-dimensional Euclidean space, and the decision set is D = Δ₁ × Δ₂. Often s₁ = s₂ = s, say, in which case we write Δ₁ = Δ₂ as Δ. For example, we have s = 1 (and Δ = [0, 1]) for Crossroads in §1.1, s = 2 for Four Ways in §1.3, and s = 5 for Chump in Exercise 1.30. When a bimatrix game is extended in this way to allow for mixed strategies, the new game is often called the mixed extension of the original game.² Because mixed strategies are so often allowed, however, we prefer to call the new game simply a bimatrix game, and to distinguish the original game—in which only finitely many pure strategies are allowed—by calling it a discrete bimatrix game (as, e.g., on p. 18).
¹Strictly speaking, it abuses notation to equate (u, v) with (u₁, u₂, . . . , u_{s₁}, v₁, v₂, . . . , v_{s₂}), p with (u, p_{m₁}), and q with (v, q_{m₂}), as it fails to distinguish a two-dimensional row vector whose components are an (n − k)-dimensional and a k-dimensional row vector from an n-dimensional row vector. But it is a great convenience that rarely leads to any practical difficulties. See Footnote 14 on p. 22.

²See, e.g., [110, p. 29]. We have formulated the mixed extension by regarding mixtures of m₁ and m₂ pure strategies for Players 1 and 2, respectively, as s₁- and s₂-dimensional vectors (where s₁ = m₁ − 1 and s₂ = m₂ − 1). An alternative is to regard these mixtures as m₁- and m₂-dimensional vectors, effectively using p and q, respectively, as strategy vectors in place of u and v. Suppose, for example, that m₁ = m₂ = 3, as in §1.3. Then we could regard mixtures of, say, Player 1's three pure strategies as row vectors of the form (p₁, p₂, p₃), where pᵢ ≥ 0 for 1 ≤ i ≤ 3 and p₁ + p₂ + p₃ = 1. The strategy set is then the so-called standard 2-simplex, namely, the set $\Delta^2 = \{(p_1, p_2, p_3) \in \mathbb{R}^3 \mid p_i \ge 0,\ 1 \le i \le 3,\ \sum_{i=1}^3 p_i = 1\}$, a triangle in three-dimensional Euclidean space ℝ³ with corners at (1, 0, 0), (0, 1, 0), and (0, 0, 1). The strategy set Δ₁ = Δ defined by (A.3) or (1.28) is just the projection of Δ² onto the triangle with corners at (1, 0, 0), (0, 1, 0), and (0, 0, 0) in the plane p₃ = 0 (although our notation suppresses the third coordinate, which is always zero), and the "height" p₃ = 1 − p₁ − p₂ maintains a one-to-one correspondence between points (p₁, p₂) in Δ and points (p₁, p₂, p₃) in Δ². Either way, the strategy set is two-dimensional. More generally, with the standard n-simplex defined by $\Delta^n = \{x = (x_1, \ldots, x_n, x_{n+1}) \in \mathbb{R}^{n+1} \mid x_i \ge 0,\ 1 \le i \le n + 1,\ \sum_{i=1}^{n+1} x_i = 1\}$, one can formulate the mixed extension of the game with matrices A and B by expressing the rewards either as (A.1) with p ∈ Δ^{s₁}, q ∈ Δ^{s₂}, and D = Δ^{s₁} × Δ^{s₂} or as (A.2) with u ∈ Δ₁, v ∈ Δ₂, and D = Δ₁ × Δ₂, where Δ₁ defined by (A.3a) is the projection of Δ^{s₁} onto the hyperplane x_{m₁} = 0; likewise for Δ₂. Again, either way, Player 1's strategy set is s₁-dimensional, and Player 2's strategy set is s₂-dimensional. For the most part, and especially when attempting a complete analysis (finding all Nash equilibria), it is preferable to work with vectors of lower dimension—that is, to work with s_k-dimensional vectors in an s_k-dimensional set rather than (s_k + 1)-dimensional vectors in an s_k-dimensional set. Sometimes, however, it is preferable to work with vectors in the higher-dimensional space, especially when deriving theorems (as illustrated by the result established at the end of §2.2).
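The reduction from (A.1) to (A.2) is mechanical, as the following sketch (reusing the made-up matrices from the earlier example) illustrates; the last component of each probability vector is reconstructed from the decision variables:

```python
import numpy as np

# Rewards (A.2): u and v hold only the first s1 = m1 - 1 and s2 = m2 - 1
# probabilities; the final components p_{m1} and q_{m2} are determined.
A = np.array([[3.0, 0.0, 1.0], [1.0, 2.0, 0.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])

def rewards(u, v, A=A, B=B):
    p = np.append(u, 1.0 - u.sum())   # p = (u, p_{m1})
    q = np.append(v, 1.0 - v.sum())   # q = (v, q_{m2})
    return p @ A @ q, p @ B @ q

u = np.array([0.25])        # s1 = 1 decision variable for Player 1
v = np.array([0.5, 0.3])    # s2 = 2 decision variables for Player 2
print(rewards(u, v))        # agrees with (A.1) for p=(0.25,0.75), q=(0.5,0.3,0.2)
```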
Appendix B
Answers or Hints for Selected Exercises
Chapter 1

6. Let f₁ = Ju₁ + Ku₂ + L. Then region C in Figure 1.6 (which is drawn for γ > σ) corresponds to J < 0, K < 0; region B corresponds to J < 0, K > 0 to the right of the line between (α, β) and (σ/ω, 0), to J = 0, K > 0 on the line, and to K > J > 0 to the left of it; and region A corresponds to J > K > 0 below the line between (α, β) and (0, 1 − θ), to J > 0, K = 0 on the line, and to J > 0, K < 0 above it. Similarly, the seven rows of Table 1.6 correspond, respectively, to J > 0, K ≤ 0 or J > K > 0; J ≤ 0, K > 0 or K > J > 0; J < 0, K < 0; J = 0, K < 0; J < 0, K = 0; J = K > 0; and J = K = 0.

8. To obtain, e.g., (1.49c), note from (1.47) that if (u, v) ∈ D_C, then
$$\frac{\partial f_1}{\partial u} = \frac{2c}{25}\{(u - v + 6)(v - 3u - 6) + 70\}, \qquad \frac{\partial^2 f_1}{\partial u^2} = \frac{2c}{25}\{2(v - 6) - 3u\};$$
whence f₁ has a maximum for max(0, v − 6) ≤ u ≤ v − 1 at $u = \tfrac{1}{3}\bigl(2v + \sqrt{(v - 6)^2 + 210}\bigr) - 4$ provided this number—which clearly exceeds max(0, v − 6)—is less than or equal to v − 1, or v ≥ 11/2. Otherwise, the maximum is at u = v − 1.

9. To obtain, e.g., (1.51c), note that because (u, v) ∈ D_C implies
$$\frac{\partial f_2}{\partial v} = \tfrac{1}{75}c\{(3v - 2u - 12)^2 - (u + 6)^2 + 90\} \ge 0$$
if (u + 6)² ≤ 90, the maximum of f₂ on D_C is at v = u + 6 if u ≤ 3(√10 − 2). If u > 3(√10 − 2), however, then for u + 1 ≤ v ≤ u + 6, f₂^C has a local minimum at v = v_up(u), where $v_{\mathrm{up}}(u) = \tfrac{1}{3}\bigl(2u + \sqrt{(u + 6)^2 - 90}\bigr) + 4$, and a local maximum at v = v_down(u), where $v_{\mathrm{down}}(u) = \tfrac{1}{3}\bigl(2u - \sqrt{(u + 6)^2 - 90}\bigr) + 4$, provided that v_down(u) ≥ u + 1, or u ≤ 9/2. But this local maximum is not the maximum for 3(√10 − 2) ≤ u ≤ 9/2, because f₂^C(u, u + 6) − f₂^C(u, v_down(u)) ≥ 0 when 3(√10 − 2) ≤ u ≤ 4 and f₂^C(u, 10) − f₂^C(u, v_down(u)) ≥ 0 when 4 ≤ u ≤ 9/2. For 9/2 ≤ u ≤ 9, the maximum (for u + 1 ≤ v ≤ 10) must be at v = u + 1, because f₂^C(u, u + 1) − f₂^C(u, 10) = ⅕c(9 − 2u)(u − 9) ≥ 0.

11. As we move to the left on the interval 3(√10 − 2) ≤ u ≤ 9/2, the points (u, v_up(u)) and (u, v_down(u)) in the solution to Exercise 1.9 move closer together, until they coalesce at (3{√10 − 2}, 2√10). So R₂ in Figure 1.11(b) is defined by
$$v = \begin{cases} u + 6 & \text{if } 0 \le u \le 2(\sqrt{10} - 3),\\ 2\sqrt{10} & \text{if } 2(\sqrt{10} - 3) \le u \le 3(\sqrt{10} - 2),\\ \tfrac{2}{3}u - \tfrac{1}{3}\sqrt{(u + 6)^2 - 90} + 4 & \text{if } 3(\sqrt{10} - 2) \le u \le \tfrac{9}{2},\\ u + \tfrac{1}{2} & \text{if } \tfrac{9}{2} \le u \le 4\sqrt{10} - \tfrac{13}{2},\\ \tfrac{13}{2} & \text{if } 4\sqrt{10} - \tfrac{13}{2} \le u \le 2\sqrt{10}. \end{cases}$$
0 ⎢ 00 ⎣ −1 −1 0
0 −1 0 2 0 −1
1 0 −2 0 −2 0
−2 0 0 0 3 −2
0 2 −3 −3 0 0
⎤
0 0 1⎥ 0⎦ 0 0
31. (b) The unique Nash equilibrium is (θ 2 , θ) where θ =
b b+2 .
Chapter 2

3. (c) No—consider strategy combinations (1, 0) and (0, 1) in §2.1's Crossroads. But if u* = v*, i.e., if (v*, v*) is a symmetric strong Nash equilibrium, then v* must be a strong ESS.

5. See [200, pp. 13–14].

7. (c) We obtain
$$\frac{\partial f}{\partial u} = \bigl\{\tfrac{1}{2}\tau\eta(1 - u) + \delta - \tfrac{1}{2}\tau - (\delta + \epsilon)G(v)\bigr\}g(u) = (\delta + \epsilon)\{\lambda(1 - u) + 1 - \theta - G(v)\}g(u),$$
where $G(v) = \int_0^v g(\xi)\,d\xi$, in place of (2.29). Because g(u) ≠ 0, ∂f/∂u is negative, zero, or positive according to whether u is greater than, equal to, or less than {1 − θ + λ − G(v)}/λ. So R is defined by (2.30) with G(v) in place of v on the right-hand side of the equality sign.

11. Trajectories that begin on the boundary of Δ where x₁x₂ = 0 must remain on it and converge to (0, 0) as t → ∞.

14. (a) Define $y(t) = 1 - \sum_{k=1}^m x_k(t)$; then y(0) = 0, by (2.109). By (2.108), $\frac{dy}{dt} = -\sum_{k=1}^m \frac{dx_k}{dt} = -\kappa\sum_{k=1}^m x_k W_k + \kappa\overline{W}\sum_{k=1}^m x_k = -\kappa\overline{W}y$. The solution subject to y(0) = 0 is y(t) = 0, or (2.110).

18. From (2.80) we obtain $\frac{\partial^2 f}{\partial u^2} + \frac{\partial^2 f}{\partial u\,\partial v} = -C_i - C_e - \tfrac{1}{2}V < 0$, and so (2.123) holds.

22. v* = ½ is the sole (strong) ESS.

23. From (2.19), $C = \begin{pmatrix} -2\delta & -\delta - \frac{1}{2}\tau \\ -\delta + \frac{1}{2}\tau & -\delta - \epsilon \end{pmatrix}$ has negative eigenvalues. The result is a special case of a far more general one, namely, that any interior ESS of a matrix game is continuously stable [82, p. 610].
Chapter 3

2. (a) From ∂f/∂ξ = −5f₁ + 5f₂ + 20ξ − 8 = 0, ξ = ¼(f₁ − f₂) + 2/5. Now substitute into (3.7) and simplify.
(b) The small curvilinear triangles would correspond to v > 1 or v < 0 under the mapping f defined by (3.4).

5. For all δ, ε, τ₁, and τ₂ satisfying (3.1), the shape of the reward set is similar to that of Figure 3.1 (except that f₁ = f₂ is not an axis of symmetry when τ₁ ≠ τ₂). In fact, (3.11) generalizes to
$$\theta_1(f_1 + \tau_2) - \xi_1 f_2 = 0 \quad\text{if } (\epsilon - \delta)\theta_1 \le f_2 \le 0,$$
$$\frac{\{f_1 - f_2 - \frac{1}{2}(\tau_1 - \tau_2)\}^2 + \epsilon^2\nu^2}{\nu\{\tau_1(f_1 + \epsilon) + \tau_2(f_2 + \epsilon) + \tau_1\tau_2\}} = 1 \quad\text{if } f_1 \le (\epsilon - \delta)\theta_2,\ f_2 \le (\epsilon - \delta)\theta_1,$$
$$\xi_2 f_1 - \theta_2(f_2 + \tau_1) = 0 \quad\text{if } (\epsilon - \delta)\theta_2 \le f_1 \le 0,$$
where ν = (τ₁ + τ₂)/(δ + ε) and ξ₁, ξ₂ are defined by (3.43). So, when δ = 5, ε = 3, τ₁ = 4, and τ₂ = 2, unimprovable points in F are (−2, 0), (0, −4), and the interior of the arc of the parabola 16(f₁ − f₂ − 1)² − 48f₁ − 24f₂ − 231 = 0 that joins (−3, −13/4) to (−15/8, −35/8).
9. (b) The Nash bargaining solution is at (1, 1) in D, corresponding to (R, R) in F . By the symmetry of F about f1 = f2 , it suffices to show that d = (f1 − P )(f2 − P ) takes its maximum on the line segment from (R, R) to (T, S) at (R, R), which is readily established by using the equation of the line segment to write d as a function of either f1 or f2 alone.
Chapter 4

2. (b) ε₁ = −8/45 and C⁺(ε₁) = (19/45, 16/45, 2/9).

3. X¹ = C⁺(−1/9) = (1/3, 1/9, 2/9, 1/3).

4. X¹ = C⁺(−11/120) = (11/40, 29/120, 5/24, 11/40).

5. X¹ = C⁺(−7/80) = (21/80, 59/240, 11/48, 21/80).

6. (b) X¹ = (1/3, 1/3, 1/3).

9. Jed should arrive at 1:20 p.m. and leave at 2:40 p.m., Ned should leave at 1:20 p.m., and Ted should arrive at 2:40 p.m.

14. $2 for Jed, $9 apiece for Ned and Ted.

16. (a) The Shapley value (and the nucleolus) for a two-player game coincides with (4.101).
(b) The Shapley value for a three-player CFG is the imputation whose transpose is
$$x^{S\,T} = \frac{1}{6}\begin{pmatrix} \nu(\{1,2\}) \\ \nu(\{2,3\}) \\ \nu(\{3,1\}) \end{pmatrix} + \frac{1}{6}\begin{pmatrix} \nu(\{1,3\}) \\ \nu(\{2,1\}) \\ \nu(\{3,2\}) \end{pmatrix} + \frac{1}{3}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \frac{1}{3}\begin{pmatrix} \nu(\{2,3\}) \\ \nu(\{3,1\}) \\ \nu(\{1,2\}) \end{pmatrix}.$$
(c) Hence, e.g., the Shapley value for the three-person car pool is
$$x^S = \frac{1}{6(3 + 2d)}(10 + 4d,\ 7 + 4d,\ 1 + 4d).$$
By the method of §4.2, the core is the quadrilateral in which (2 + d)/(3 + 2d) ≤ x₁ ≤ (3 + d)/(3 + 2d), 0 ≤ x₂ ≤ 1/(3 + 2d), and (3 + d)/(3 + 2d) ≤ x₁ + x₂ ≤ 1. Hence x^S ∉ C⁺(0) if x₁^S + x₂^S < (3 + d)/(3 + 2d), or d < ½.

17. (a) 12x₂^S = ν({2,3}) + ν({2,4}) + ν({2,1}) + ν({2,3,4}) − ν({3,4}) + ν({2,3,1}) − ν({3,1}) + ν({2,4,1}) − ν({4,1}) + 3{1 − ν({3,4,1})},
12x₃^S = ν({3,4}) + ν({3,1}) + ν({3,2}) + ν({3,4,1}) − ν({4,1}) + ν({3,4,2}) − ν({4,2}) + ν({3,1,2}) − ν({1,2}) + 3{1 − ν({4,1,2})},
12x₄^S = ν({4,1}) + ν({4,2}) + ν({4,3}) + ν({4,1,2}) − ν({1,2}) + ν({4,1,3}) − ν({1,3}) + ν({4,2,3}) − ν({2,3}) + 3{1 − ν({1,2,3})}
by cyclic permutation of 1, 2, 3, and 4 in (4.100), where, obviously, {2, 3, 1} is the same as {1, 2, 3}, etc.

21. Hiring a pure mathematician.

22. (a) Yes.
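The formula in 16(b) is easy to mechanize. The sketch below computes the same value by averaging marginal contributions over all six orders of arrival, under the CFG normalization used here (ν({i}) = 0, ν({1, 2, 3}) = 1) and with made-up two-player coalition values:

```python
from itertools import permutations

def shapley3(nu):
    """Shapley value of a 0-1-normalized three-player CFG.
    nu maps frozensets of two players to coalition values."""
    def v(s):
        s = frozenset(s)
        if len(s) <= 1:
            return 0.0          # singletons and the empty set are worth 0
        if len(s) == 3:
            return 1.0          # the grand coalition is worth 1
        return nu[s]
    x = {i: 0.0 for i in (1, 2, 3)}
    for order in permutations((1, 2, 3)):   # average marginal contributions
        s = set()
        for i in order:
            x[i] += v(s | {i}) - v(s)
            s.add(i)
    return {i: xi / 6.0 for i, xi in x.items()}

print(shapley3({frozenset({1, 2}): 0.4,
                frozenset({1, 3}): 0.5,
                frozenset({2, 3}): 0.2}))
```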
23. Suppose that x ∈ C⁺(0) but x ∉ R, where R denotes the reasonable set. Then, from (4.15), there exists i ∈ N such that xᵢ > ν(T) − ν(T − {i}) for all T to which Player i belongs. In particular, xᵢ > ν(N) − ν(N − {i}) = 1 − ν(N − {i}). Also e(N − {i}, x) ≤ 0, from (4.19). But (4.12b) and (4.16) together imply that e(N − {i}, x) = ν(N − {i}) − (1 − xᵢ), which cannot be positive. So xᵢ ≤ 1 − ν(N − {i}), a contradiction.
Chapter 5

1. Define $S(n) = \sum_{k=1}^n \mathrm{Prob}(k \le M \le n)$. Then, for all n ≥ 2,
$$S(n) = \sum_{k=1}^n \sum_{j=k}^n \mathrm{Prob}(M = j) = \sum_{k=1}^{n-1} \sum_{j=k}^n \mathrm{Prob}(M = j) + \mathrm{Prob}(M = n)$$
$$= \sum_{k=1}^{n-1}\Bigl\{\sum_{j=k}^{n-1} \mathrm{Prob}(M = j) + \mathrm{Prob}(M = n)\Bigr\} + \mathrm{Prob}(M = n)$$
$$= \sum_{k=1}^{n-1} \sum_{j=k}^{n-1} \mathrm{Prob}(M = j) + (n - 1)\,\mathrm{Prob}(M = n) + \mathrm{Prob}(M = n) = S(n - 1) + n \cdot \mathrm{Prob}(M = n).$$
So
$$\sum_{n=1}^N n \cdot \mathrm{Prob}(M = n) = S(1) + \sum_{n=2}^N n \cdot \mathrm{Prob}(M = n) = S(1) + \sum_{n=2}^N \{S(n) - S(n - 1)\} = S(N).$$
Now let N → ∞.

9. (a) Use (P − S)(T − P) − (R − S)(T − R) = (R − P)(P + R − S − T).

11. (b) Noting that the recurrence equation has an equilibrium (independent of k) solution equal to 1/2 that fails to satisfy the starting condition, define the perturbation ηₖ as the deviation from 1/2 and obtain for it both a recurrence equation and a starting condition. Solve for ηₖ, and hence recover the solution as ηₖ + 1/2.

13. Not necessarily, since the frequency of third encounters depends on N.

23. Because ε < 1, it follows immediately from (5.102) that T > R and P > S, and (5.103) implies R > P. So (5.3) holds. Also, P + R − S − T = (1 − ε)(Q − q) > 0, implying P + R > S + T. But R > P. Hence 2R > S + T, implying (5.2).

25. (b) See, e.g., [210, p. 196].
(c) Use mathematical induction.
(d) The first two rows of U for ALLC versus TFT are identical to the first row of U for TFT versus itself, and the last two rows to the third row. Every row of U^k is (1 − 3ε, 2ε, ε, 0) + O(ε²) for k ≥ 2. Hence, to first order in ε, x(1) = (1 − 2ε, ε, ε, 0) and x(k) = (1 − 3ε, 2ε, ε, 0) for k ≥ 2. Similarly, the first two rows of U for ALLD versus TFT are identical to the second row of U for TFT versus itself, and the last two rows to the fourth row. Every row of U^k is (0, ε, 2ε, 1 − 3ε) + O(ε²) for k ≥ 2. Hence, to first order in ε, x(1) = (ε, 0, 1 − 2ε, ε) and x(k) = (0, ε, 2ε, 1 − 3ε) for k ≥ 2.
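The telescoping identity in the solution to Exercise 5.1 is easy to confirm numerically; the sketch below (illustrative only, using an arbitrary distribution on {1, . . . , N}) checks that the two sides agree:

```python
import random

# Check: sum_{n<=N} n*P(M=n) equals S(N) = sum_{k<=N} P(k <= M <= N).
random.seed(1)
N = 12
w = [random.random() for _ in range(N)]
total = sum(w)
prob = [wi / total for wi in w]             # P(M = n) for n = 1..N

lhs = sum(n * prob[n - 1] for n in range(1, N + 1))
rhs = sum(sum(prob[j - 1] for j in range(k, N + 1)) for k in range(1, N + 1))
print(abs(lhs - rhs) < 1e-12)               # True
```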
Chapter 6

2. (b) $\bigl[\frac{\partial^2 f}{\partial u^2} + \frac{\partial^2 f}{\partial u\,\partial v}\bigr]_{u=v=v^*} = -\tfrac{1}{6}(2 + \theta)^2\alpha\mu$, implying (2.123).

3. (a) See equation (22) of [206, p. 260].
(b) R is the straight line segment between $\bigl(\frac{2-\theta}{2+\theta}, 1\bigr)$ and $\bigl(\frac{2}{2+\theta}, 0\bigr)$.
(c) $v^* = \frac{1}{1+\theta}$, $\bigl[\frac{\partial^2 f}{\partial u^2} + \frac{\partial^2 f}{\partial u\,\partial v}\bigr]_{u=v=v^*} = -\tfrac{1}{4}(1 + \theta)^2\alpha\mu < 0$.

4. (b) $f(26/35, u) - f(u, u) = \frac{\alpha\mu}{207924080}(114244 + 63700u - 35525u^2)(26 - 35u)^2 > 0$ for all u ≠ 26/35 such that 0 ≤ u ≤ 1.
(c) Here $\bigl[\frac{\partial^2 f}{\partial u^2} + \frac{\partial^2 f}{\partial u\,\partial v}\bigr]_{u=v=v^*} = -\frac{(24 + 13\theta)^2}{840}\alpha\mu$ implies (2.123). With regard to Footnote 46 on p. 110, (2.123) is satisfied even if θ > 108/169; moreover, v* is then still a local ESS, because $\bigl[\frac{\partial^2 f}{\partial u^2}\bigr]_{u=v=v^*} = -\tfrac{1}{560}(16 - 3\theta)(24 + 13\theta)\alpha\mu < 0$. So in terms of AD, it is a CSS but not an ESS—that is, not a global ESS.

5. See [206, pp. 256–257].

6. (a) From (6.11) we obtain ω = ½, agreeing with (6.14), and invariably $\mu_c = \frac{4 + 5\theta}{3(2 + \theta)}\mu \le \mu$ (because θ < 1).
(b) Here we obtain ω = 1 and invariably $\mu_c = \frac{2 + 3\theta}{4(1 + \theta)}\mu \le \mu$.
(c) From (6.15) we obtain ω = 13/24, agreeing with (6.18), and $\mu_c = \frac{13(48 + 61\theta)}{840 + 455\theta}\mu \le \mu$ if and only if (6.21) holds.

7. (a) v* = 3/(3 + 2θ) for all θ < 1.
(b) v* = 2/(2 + θ) for $\theta \le \frac{6}{5 - 2^{4/3}} - 2 \approx 0.4192$. See [225, pp. 68–69].

9. (b) Increasing r reduces the curvature below the line u = v.

11. (a) $v^* = \frac{1 - r}{1 + (1 + r)\theta}$.
(b) See [206, p. 261].

12. See [206, pp. 263–264].

13. From Exercise 1.23, f(u, v) = −{(P − S)(1 − v) + (T − R)v}u + Tv + P(1 − v), where u is the probability of cooperation.
(a) Here ∂²φ/∂u² = −2r(S + T − P − R) < 0. With $r_1 = \frac{P - S}{T - P}$ and $r_2 = \frac{T - R}{R - S}$, the unique strong ESS among kin is v* = 0 if 0 < r ≤ r₁, $v^* = \frac{(1 + r_2)(r - r_1)}{(r_2 - r_1)(1 + r)}$ if r₁ < r < r₂, and v* = 1 if r₂ ≤ r < 1.
(b) The unique Nash equilibrium is (v*, v*); it is strong for r ≤ r₁ or r ≥ r₂ but weak for r₁ < r < r₂. In either case, the probability of cooperation increases with r for r₁ < r < r₂; there is no cooperation for r ≤ r₁, and full cooperation for r ≥ r₂.

14. (a) $\varphi_1(u, v) = \frac{u + rbv}{u + bv}(1 - u - v)K$, $\varphi_2(u, v) = \frac{ru + bv}{u + bv}(1 - u - v)K$.
(c) See [281, p. 269].

15. Note that the Hessian (p. 81) is (diagonal and) negative-definite everywhere. Equation (6.49c) is counterintuitive [261, p. 125], but see [208, pp. 104–106].
371
17. A partition will now occur with probability p2 q2 , instead of 1 − (1 − p2 )(1 − q2 ); hence in place of (6.52) we obtain φ(p2 , q2 ) = E V 12 Z − CP p2 q2 + 12 V (Z) − CR (1 − p2 q2 ) = E V 12 Z − CP p2 q2 + 12 E V (Z) − CR (1 − p2 q2 ) = 12 E V (Z) − CR + (E V 12 Z − 12 V (Z) −CP + CR p2 q2 = A − B + Bp2 q2 , as opposed to A − B(1 − p2 )(1 − q2 ), with A and B defined by (6.53)–(6.54). But the rest of the analysis is virtually the same as before. 25. (a) J − I decreases with a but increases with b and t. It is least as a → 1, b → 0 and t → 0; then J − I → 12 , by (A.42) of [2]. I (b) I−J+1 , which approaches 1 in the limit as a → 1. 26. (b) η(s) decreases in a piecewise-linear fashion between s = 0 and s = L, then increases linearly; η(I) = 0 and η(J) = 0 are merely rearrangements of I = θ1 and J = θ2 , respectively.
Chapter 7 2. See [219]. 3. See [219, pp. 383-389]. 4. (a) See Tables 1 and 2 of [223, p. 409]. (b) See [223, pp. 409–410] and Figure 2 of [305, p. 1193]; both papers use γ = C/V in place of θ = V /C and θ in place of β. 5. (a) Set μ = 12 in Table 5 of [223, p. 415]. (b) See [223, pp. 413–414] and Figure 1 of [305, p. 1190].
Chapter 8 1. Here f (p, q) = α{K(q)p + νq 2 } where K(q) = (1 − q)2 − νq 2 implies K(q ∗ ) = 0, hence f (p, q ∗ ) = f (q ∗ , q ∗ ); and √ √ f (q ∗ , p) − f (p, p) = αK(p){q ∗ − p} = α(1 + ν)(1 − p + νp)(q ∗ − p)2 > 0 for p = q ∗ . So q ∗ is a weak ESS by (2.18). The ESS is continuously stable 2 ∂2f = −2α{1 − q + νq} < 0. because ∂∂pf2 + ∂p∂q 3. (b) No, it has two solutions, where v2 ≈ 0.6036 and where v2 ≈ 0.9867. Both are ESSs; see [209, p. 1166]. 5. (a) It will increase the intensity of signalling at the ESS. For illustration, see Figure 2 of [227, p. 602]. (b) It increases the intensity of display at the ESS for obligate signallers but has no effect on the ESS for facultative ones. (c) It lowers the intensity of signalling at the ESS.
Bibliography
[1] E. S. Adams and R. L. Caldwell, Deceptive communication in asymmetric fights of the stomatopod crustacean Gonodactylus bredini, Animal Behaviour 39 (1990), 706–716. [2] E. S. Adams and M. Mesterton-Gibbons, The cost of threat displays and the stability of deceptive communication, Journal of Theoretical Biology 175 (1995), 405–421. [3] E. Ak¸ cay and J. Roughgarden, Extra-pair parentage: a new theory based on transactions in a cooperative game, Evolutionary Ecology Research 9 (2007), 1223–1243. [4] E. Ak¸ cay and J. Roughgarden, Negotiation of mutualism: rhizobia and legumes, Proceedings of the Royal Society of London B 274 (2007), 25–32. [5] K. L. Akre and S. Johnsen, Psychophysics and the evolution of behavior, Trends in Ecology and Evolution 29 (2014), no. 5, 291–300. [6] R. D. Alexander, The Biology of Moral Systems, Aldine De Gruyter, New York, 1987. [7] B. Allen and M. A. Nowak, Games among relatives revisited, J. Theoret. Biol. 378 (2015), 103– 116, DOI 10.1016/j.jtbi.2015.04.031. MR3350852 [8] C. H. Anderton and J. R. Carter, Principles of Conflict Economics, Cambridge University Press, Cambridge, UK, 2009. [9] G. Arnott and R. W. Elwood, Assessment of fighting ability in animal contests, Animal Behaviour 77 (2009), no. 5, 991–1004. [10] R. Axelrod, The Evolution of Cooperation, Basic Books, New York, 1984. [11] R. Axelrod and W. D. Hamilton, The evolution of cooperation, Science 211 (1981), no. 4489, 1390–1396, DOI 10.1126/science.7466396. MR686747 [12] P. R. Y. Backwell, J. H. Christy, S. R. Telford, M. D. Jennions, and N. I. Passmore, Dishonest signalling in a fiddler crab, Proceedings of the Royal Society of London B 267 (2000), 719–724. [13] P. R. Y. Backwell and M. D. Jennions, Coalition among male fiddler crabs, Nature 430 (2004), 417–417. [14] F. Bai, Uniqueness of Nash equilibrium in vaccination games, J. Biol. Dyn. 10 (2016), no. 1, 395–415, DOI 10.1080/17513758.2016.1213319. MR3546150 [15] M. A. Ball and G. A. Parker, Sperm competition games: sperm selection by females, J. Theoret. Biol. 224 (2003), no. 1, 27–42, DOI 10.1016/S0022-5193(03)00118-8. MR2069247 [16] M. A. Ball and G. A. Parker, Sperm competition games: the risk model can generate higher sperm allocation to virgin females, Journal of Evolutionary Biology 20 (2007), 767–779. [17] Z. Barta, Individual variation behind the evolution of cooperation, Philosophical Transactions of the Royal Society of London B 371 (2016), no. 1687, 20150087. [18] Z. Barta and L.-A. Giraldeau, The effect of dominance hierarchy on the use of alternative foraging tactics: a phenotype-limited producing-scrounging game, Behavioral Ecology and Sociobiology 42 (1998), 217–223. [19] C. T. Bauch and D. J. D. Earn, Vaccination and the theory of games, Proc. Natl. Acad. Sci. USA 101 (2004), no. 36, 13391–13394, DOI 10.1073/pnas.0403823101. MR2098103
373
[20] C. T. Bauch, A. P. Galvani, and D. J. D. Earn, Group interest versus self-interest in smallpox vaccination policy, Proc. Natl. Acad. Sci. USA 100 (2003), no. 18, 10564–10567, DOI 10.1073/pnas.1731324100. MR1998144 [21] P.A. Bednekoff, Mutualism among safe, selfish sentinels: a dynamic model, American Naturalist 150 (1997), 373–392. [22] J. Bendor and P. Swistak, Types of evolutionary stability and the problem of cooperation, Proceedings of the National Academy of Sciences USA 92 (1995), 3596–3600. [23] J. Bendor and P. Swistak, The evolutionary stability of cooperation, American Political Science Review 91 (1997), 290–307. [24] S. M. Bertram, V. L. M. Rook, and L. P. Fitzsimmons, Strutting their stuff: victory displays in the spring field cricket, Gryllus veletis, Behaviour 147 (2010), 1249–1266. [25] C. Bevi´ a and L. C. Corch´ on, Peace agreements without commitment, Games Econom. Behav. 68 (2010), no. 2, 469–487, DOI 10.1016/j.geb.2009.09.012. MR2655215 [26] S. Bhattacharyya, C. T. Bauch, and R. Breban, Role of word-of-mouth for programs of voluntary vaccination: a game-theoretic approach, Math. Biosci. 269 (2015), 130–134, DOI 10.1016/j.mbs.2015.08.023. MR3407552 [27] J. M. Bilbao, Cooperative games on combinatorial structures, Theory and Decision Library. Series C: Game Theory, Mathematical Programming and Operations Research, vol. 26, Kluwer Academic Publishers, Boston, MA, 2000. MR1887284 [28] T. R. Birkhead and P. Monaghan, Ingenious ideas: The history of behavioral ecology, Evolutionary Behavioral Ecology (D. F. Westneat and C. W. Fox, eds.), Oxford University Press, Oxford, 2010, pp. 3–15. [29] A. Bissonnette, H. de Vries, and C. P. van Schaik, Coalitions in barbary macaques, Macaca sylvanus: strength, success and rules of thumb, Animal Behaviour 78 (2009), 329–335. [30] A. Bissonnette, S. Perry, L. Barrett, J. C. Mitani, M. Flinn, S. Gavrilets, and F. B. M. de Waal, Coalitions in theory and reality: a review of pertinent variables and processes, Behaviour 152 (2015), 1–56. [31] E. N. Bodine, S. Lenhart, and L. J. Gross, Mathematics for the life sciences, Princeton University Press, Princeton, New Jersey, 2014. [32] W. Boedeker and T. Backhaus, The scientific assessment of combined effects of risk factors: different approaches in experimental biosciences and epidemiology, European Journal of Epidemiology 25 (2010), 539–546. [33] M. C. Boerlijst, M. A. Nowak, and K. Sigmund, The logic of contrition, Journal of Theoretical Biology 185 (1997), 281–293. [34] E. von B¨ ohm-Bawerk, Capital and Interest, Volume II, Libertarian Press, South Holland, Illinois, New York, 1959, translation of 1889 French original. [35] E. Borel, On games that involve chance and the skill of the players, Econometrica 21 (1953), 101–115, DOI 10.2307/1906947. MR0052752 [36] J. L. Bower, The occurrence and function of victory displays within communication networks, Animal Communication Networks (P.K. McGregor, ed.), Cambridge University Press, Cambridge, 2005, pp. 114–126. [37] G. E. P. Box and N. R. Draper, Response surfaces, mixtures, and ridge analyses, 2nd ed., Wiley Series in Probability and Statistics, Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, 2007. MR2293880 [38] R. Boyd, Mistakes allow evolutionary stability in the repeated prisoner’s dilemma game, J. Theoret. Biol. 136 (1989), no. 1, 47–56, DOI 10.1016/S0022-5193(89)80188-2. MR976117 [39] R. Boyd and J. P. Lorberbaum, No pure strategy is evolutionarily stable in the repeated prisoner’s dilemma game, Nature 327 (1987), 58–59. [40] R. 
Boyd and P. J. Richerson, The evolution of reciprocity in sizable groups, J. Theoret. Biol. 132 (1988), no. 3, 337–356, DOI 10.1016/S0022-5193(88)80219-4. MR946593 [41] R. Boyd and P. J. Richerson, The evolution of indirect reciprocity, Social Networks 11 (1989), no. 3, 213–236, DOI 10.1016/0378-8733(89)90003-8. MR1022848 [42] J. W. Bradbury and S. L. Vehrencamp, Principles of Animal Communication, 2nd ed., Sinauer, Sunderland, Massachusetts, 2011. ˚. Br¨ [43] A annstr¨ om, J. Johansson, and N. von Festenberg, The hitchhiker’s guide to adaptive dynamics, Games 4 (2013), no. 3, 304–328, DOI 10.3390/g4030304. MR3117853 [44] K. Brauchli, T. Killingback, and M. Doebeli, Evolution of cooperation in spatially structured populations, Journal of Theoretical Biology 200 (1999), 405–417. [45] M. Briffa and I. C. W. Hardy, Introduction to animal contests, Animal Contests (I. C. W. Hardy and M. Briffa, eds.), Cambridge University Press, Cambridge, 2013, pp. 1–4.
[46] M. Briffa, I. C. W. Hardy, M. P. Gammell, D. J. Jennings, D. D. Clarke, and M. Goubault, Analysis of animal contest data, Animal Contests (I. C. W. Hardy and M. Briffa, eds.), Cambridge University Press, Cambridge, 2013, pp. 47–85. [47] M. Broom and V. Kˇ rivan, Biology and evolutionary games, Handbook of Dynamic Game Theory (T. Ba¸ser and G. Zaccour, eds.), Springer, Cham, Switzerland, 2018, pp. 1039–1077. [48] M. Broom and G. D. Ruxton, Evolutionarily stable stealing: game theory applied to kleptoparasitism, Behavioral Ecology 9 (1998), 397–403. [49] M. Broom and J. Rycht´ aˇ r, Game-theoretical models in biology, Chapman & Hall/CRC Mathematical and Computational Biology Series, CRC Press, Boca Raton, FL, 2013. MR3052136 [50] J. L. Brown, Helping and Communal Breeding in Birds, Princeton University Press, Princeton, New Jersey, 1987. [51] J. S. Brown, Why Darwin would have loved evolutionary game theory, Proceedings of the Royal Society of London B 283 (2016), no. 1750, 20160847. [52] M. Bulmer, Theoretical Evolutionary Ecology, Sinauer, Sunderland, Massachusetts, 1994. [53] M. G. Bulmer and G. A. Parker, The evolution of anisogamy: a game-theoretic approach, Proceedings of the Royal Society of London B 269 (2002), 2381–2388. [54] J. W. Burgess, Social spiders, Scientific American 234 (1976), no. 3, 100–106. [55] I. Camerlink, S. P. Turner, M. Farish, and G. Arnott, The influence of experience on contest assessment strategies, Scientific Reports 7 (2017), 14492. [56] E. Cameron, T. Day, and L. Rowe, Sperm competition and the evolution of ejaculate composition, American Naturalist 169 (2007), no. 6, E158–E172. [57] T. Caplow, A theory of coalitions in the triad, American Sociological Review 21 (1956), no. 4, 489–493. [58] T. Caplow, Further development of a theory of coalitions in the triad, American Journal of Sociology 64 (1959), no. 4, 488–493. [59] T. Caraco and J. L. Brown, A game between communal breeders: when is food-sharing stable?, J. Theoret. Biol. 118 (1986), no. 4, 379–393, DOI 10.1016/S0022-5193(86)80160-6. MR829879 [60] E. Castillo, A. Cobo, F. Jubete, and R. E. Pruneda, Orthogonal sets and polar methods in linear algebra, Pure and Applied Mathematics (New York), John Wiley & Sons, Inc., New York, 1999. Applications to matrix calculations, systems of equations, inequalities, and linear programming; A Wiley-Interscience Publication. MR1673208 [61] E. L. Charnov, The Theory of Sex Allocation, Princeton University Press, Princeton, New Jersey, 1982. [62] P. Z. Chen, R. L. Carrasco, and P. K. L. Ng, Mangrove crab uses victory display to ”browbeat” losers from reinitiating a new fight, Ethology 123 (2017), 981–988. [63] J. M. Chertkoff, A revision of Caplow’s coalition theory, Journal of Experimental Social Psychology 3 (1967), 172–177. [64] B. Child, The practice and principles of community-based wildlife management in Zimbabwe: the CAMPFIRE programme, Biodiversity and Conservation 5 (1996), 369–398. [65] F. B. Christiansen, On conditions for evolutionary stability for a continuously varying character, American Naturalist 138 (1991), 37–50. [66] C. W. Clark and M. Mangel, Dynamic State Variable Models in Ecology, Oxford University Press, New York, 2000. [67] K. C. Clements and D. W. Stephens, Testing models of non-kin cooperation: mutualism and the prisoner’s dilemma, Animal Behaviour 50 (1995), 527–549. [68] T. H. Clutton-Brock, Cooperation between non-kin in animal societies, Nature 462 (2009), 51–57. [69] T. H. Clutton-Brock, M. J. O’Riain, P. N. M. Brotherton, D. 
Gaynor, R. Kansky, and M. Manser, Selfish sentinels in cooperative mammals, Science 284 (1999), 1640–1644. [70] A. M. Colman, Game theory and experimental games, International Series in Experimental Social Psychology, vol. 4, Pergamon Press, Oxford-Elmsford, N.Y., 1982. The study of strategic interaction. MR720187 [71] M. Colyvan, J. Justus, and H. M. Regan, The conservation game, Biological Conservation 144 (2011), 1246–1253. [72] A. A. Cournot, Researches into the Mathematical Principles of the Theory of Wealth, Macmillan, New York, 1897, translation of 1838 French original. [73] J. C. Creighton, Population density, body size, and phenotypic plasticity of brood size in a burying beetle, Behavioral Ecology 16 (2005), 1031–1036. [74] R. Cressman, Evolutionary dynamics and extensive form games, MIT Press Series on Economic Learning and Social Evolution, vol. 5, MIT Press, Cambridge, MA, 2003. MR2004666
[75] R. Cressman, J. Garay, A. Scarelli, and Z. Varga, Continuously stable strategies, neighborhood superiority and two-player games with continuous strategy space, International Journal of Game Theory 38 (2009), 221–247. [76] R. Cressman, A. Halloway, G. G. McNickle, J. Apaloo, J. S. Brown, and T. L. Vincent, Unlimited niche packing in a Lotka-Volterra competition game, Theoretical Population Biology 116 (2017), 1–17. [77] P. H. Crowley, Hawks, doves, and mixed-symmetry games, Journal of Theoretical Biology 204 (2000), 543–563. [78] P. H. Crowley, T. Cottrell, T. Garcia, M. Hatch, R. C. Sargent, B. J. Stokes, and J. M. White, Solving the complementarity dilemma: evolving strategies for simultaneous hermaphroditism, Journal of Theoretical Biology 195 (1998), 13–26. [79] P. H. Crowley, L. Provencher, S. Sloane, L. A. Dugatkin, B. Spohn, L. Rogers, and M. Alfieri, Evolving cooperation: the role of individual recognition, Biosystems 37 (1996), 49–66. [80] P. H. Crowley and R. C. Sargent, Whence tit-for-tat?, Evolutionary Ecology 10 (1996), 499–516. [81] N. B. Davies, J. R. Krebs, and S. A. West, An Introduction to Behavioural Ecology, 4th ed., Wiley-Blackwell, Oxford, 2012. [82] T. Day and P. D. Taylor, Evolutionary dynamics and stability in discrete and continuous games, Evolutionary Ecology Research 5 (2003), 605–613. [83] F. Dercole and S. Rinaldi, Analysis of evolutionary processes, Princeton Series in Theoretical and Computational Biology, Princeton University Press, Princeton, NJ, 2008. The adaptive dynamics approach and its applications. MR2382402 [84] L. A. Dugatkin, Cooperation Among Animals: An Evolutionary Perspective, Oxford University Press, New York, 1997. [85] D. J. D. Earn, A light introduction to modelling recurrent epidemics, Mathematical epidemiology, Lecture Notes in Math., vol. 1945, Springer, Berlin, 2008, pp. 3–17, DOI 10.1007/978-3-540-789116 1. MR2428371 [86] P. K. Eason, G. A. Cobbs, and K. G. Trinca, The use of landmarks to define territorial boundaries, Animal Behaviour 58 (1999), 85–91. [87] F. Y. Edgeworth, Mathematical Psychics, Kegan Paul, London, 1881. [88] R. W. Elwood and G. Arnott, Understanding how animals fight with Lloyd Morgan’s canon, Animal Behaviour 84 (2012), 1095–1102. [89] M. Enquist, P. L. Hurd, and S. Ghirlanda, Signaling, Evolutionary Behavioral Ecology (D. F. Westneat and C. W. Fox, eds.), Oxford University Press, Oxford, 2010, pp. 266–284. [90] M. Enquist and O. Leimar, Evolution of fighting behaviour: the effect of variation in resource value, J. Theoret. Biol. 127 (1987), no. 2, 187–205, DOI 10.1016/S0022-5193(87)80130-3. MR900328 [91] I. Eshel, Evolutionary and continuous stability, J. Theoret. Biol. 103 (1983), no. 1, 99–111, DOI 10.1016/0022-5193(83)90201-1. MR714279 [92] I. Eshel, M. W. Feldman, and A. Bergman, Long-term evolution, short-term evolution, and population-genetic theory, Journal of Theoretical Biology 191 (1998), 391–396. [93] R. M. Fagen, When doves conspire: evolution of nondamaging fighting tactics in a nonrandomencounter animal conflict model, Amer. Natur. 115 (1980), no. 6, 858–869, DOI 10.1086/283604. MR596843 [94] T. W. Fawcett and R. A. Johnstone, Learning your own strength: winner and loser effects should change with age and experience, Proceedings of the Royal Society of London B 277 (2010), 1427– 1434. [95] M. Fea and G. Holwell, Combat in a cave-dwelling weta (Orthoptera: Rhaphidophoridae) with exaggerated weaponry, Animal Behaviour 138 (2018), 85–92. [96] M. W. Feldman and E. A. C. 
Thomas, Behavior-dependent contexts for repeated plays of the prisoner’s dilemma. II. Dynamical aspects of the evolution of cooperation, J. Theoret. Biol. 128 (1987), no. 3, 297–315, DOI 10.1016/S0022-5193(87)80073-5. MR912010 [97] R. Ferriere and R. E. Michod, The evolution of cooperation in spatially heterogeneous populations, American Naturalist 147 (1996), 692–717. [98] R. A. Fisher, The genetical theory of natural selection, A complete variorum edition, Oxford University Press, Oxford, 1999. Revised reprint of the 1930 original; Edited, with a foreword and notes, by J. H. Bennett. MR1785121 [99] M. Flood, K. Lendenmann, and A. Rapoport, 2 × 2 games played by rats: different delays of reinforcement as payoffs, Behavioral Science 28 (1983), 65–78. [100] L. Fromhage and H. Kokko, Spatial seed and pollen games: dispersal, sex allocation, and the evolution of dioecy, Journal of Evolutionary Biology 23 (2010), 1947–1956. [101] D. Fudenberg and J. Tirole, Game theory, MIT Press, Cambridge, MA, 1991. MR1124618
[102] S. Funk, M. Salathé, and V. A. A. Jansen, Modelling the influence of human behaviour on the spread of infectious diseases: a review, Journal of the Royal Society Interface 7 (2010), 1247–1256.
[103] W. A. Gamson, A theory of coalitions in the triad, American Sociological Review 21 (1961), no. 4, 489–493.
[104] J. Garay and Z. Varga, When will a sexual population evolve to an ESS?, Proceedings of the Royal Society of London B 265 (1998), 1007–1010.
[105] R. Gatti, T. Goeschl, B. Groom, and T. Swanson, The biodiversity bargaining problem, Environmental and Resource Economics 48 (2011), 609–628.
[106] C. C. Gibson and S. A. Marks, Transforming rural hunters into conservationists: an assessment of community-based wildlife management programs in Africa, World Development 23 (1995), 941–957.
[107] H. Gintis, Game theory evolving: a problem-centered introduction to modeling strategic interaction, 2nd ed., Princeton University Press, Princeton, NJ, 2009. MR2502619
[108] L.-A. Giraldeau and T. Caraco, Social Foraging Theory, Princeton University Press, Princeton, New Jersey, 2000.
[109] L.-A. Giraldeau and F. Dubois, Social foraging and the study of exploitative behavior, Advances in the Study of Behavior 38 (2008), 59–104.
[110] J. González-Díaz, I. García-Jurado, and M. G. Fiestras-Janeiro, An introductory course on mathematical game theory, Graduate Studies in Mathematics, vol. 115, American Mathematical Society, Providence, RI; Real Sociedad Matemática Española, Madrid, 2010. MR2654274
[111] J. Goodenough, B. McGuire, and E. Jakob, Perspectives on animal behavior, John Wiley, Hoboken, New Jersey, 2010.
[112] M. Goubault and M. Decuignière, Previous experience and contest outcome: Winner effects persist in absence of evident loser effects in a parasitoid wasp, American Naturalist 180 (2012), 364–371.
[113] M. Goubault, M. Exbrayat, and R. L. Earley, Fixed or flexible? winner/loser effects vary with habitat quality in a parasitoid wasp, Unpublished manuscript, 2018.
[114] M. Goubault, A. F. S. Mack, and I. C. W. Hardy, Encountering competitors reduces clutch size and increases offspring size in a parasitoid with female-female fighting, Proceedings of the Royal Society of London B 274 (2007), 2571–2577.
[115] T. U. Grafe and J. H. Bitz, An acoustic postconflict display in the duetting tropical boubou (Laniarius aethiopicus): a signal of victory?, BMC Ecology 4 (2004), 1.
[116] A. Grafen, The hawk-dove game played between relatives, Animal Behaviour 27 (1979), 905–907.
[117] A. Grafen, The logic of divisively asymmetric contests: respect for ownership and the desperado effect, Animal Behaviour 35 (1987), 462–467.
[118] A. Grafen, Biological signals as handicaps, J. Theoret. Biol. 144 (1990), no. 4, 517–546, DOI 10.1016/S0022-5193(05)80088-8. MR1059592
[119] A. Grafen, Modelling in behavioural ecology, Behavioural Ecology: An Evolutionary Approach (J. R. Krebs and N. B. Davies, eds.), Blackwell Science, Oxford, 3rd ed., 1991, pp. 5–31.
[120] P. Grim, Spatialization and greater generosity in the stochastic prisoner’s dilemma, Biosystems 37 (1996), 3–17.
[121] E. Grman, T. M. P. Robinson, and C. A. Klausmeier, Ecological specialization and trade affect the outcome of negotiations in mutualism, American Naturalist 179 (2012), no. 5, 567–581.
[122] U. Grodzinski and R. A. Johnstone, Parents and offspring in an evolutionary game: the effect of supply on demand when costs of care vary, Proceedings of the Royal Society of London B 279 (2012), 109–115.
[123] C.-J. Haake, A. Kashiwada, and F. E. Su, The Shapley value of phylogenetic trees, J. Math. Biol. 56 (2008), no. 4, 479–497, DOI 10.1007/s00285-007-0126-2. MR2367029
[124] W. D. Hamilton, The genetical evolution of social behaviour, I & II, Journal of Theoretical Biology 7 (1964), 1–52.
[125] W. D. Hamilton, Extraordinary sex ratios, Science 156 (1967), 477–488.
[126] P. Hammerstein, The role of asymmetries in animal contests, Animal Behaviour 29 (1981), 193–205.
[127] P. Hammerstein, Darwinian adaptation, population genetics and the streetcar theory of evolution, Journal of Mathematical Biology 34 (1996), 511–532.
[128] P. Hammerstein and G. A. Parker, The asymmetric war of attrition, J. Theoret. Biol. 96 (1982), no. 4, 647–682, DOI 10.1016/0022-5193(82)90235-1. MR676768
[129] P. Hammerstein and S. E. Riechert, Payoffs and strategies in territorial contests: ESS analyses of two ecotypes of the spider, Agelenopsis aperta, Evolutionary Ecology 2 (1988), 115–138.
[130] I. C. W. Hardy and M. Briffa (eds.), Animal Contests, Cambridge University Press, Cambridge, 2013.
[131] I. C. W. Hardy, N. T. Griffiths, and H. C. J. Godfray, Clutch size in a parasitoid wasp: a manipulation experiment, Journal of Animal Ecology 61 (1992), 121–129.
[132] J. C. Harsanyi, Games with incomplete information played by “Bayesian” players. I. The basic model, Management Sci. 14 (1967), 159–182, DOI 10.1287/mnsc.14.3.159. MR0246649
[133] J. C. Harsanyi, Games with incomplete information played by “Bayesian” players. II. Bayesian equilibrium points, Management Sci. 14 (1968), 320–334, DOI 10.1287/mnsc.14.5.320. MR0246650
[134] J. C. Harsanyi, Games with incomplete information played by “Bayesian” players. III. The basic probability distribution of the game, Management Sci. 14 (1968), 486–502, DOI 10.1287/mnsc.14.7.486. MR0246651
[135] J. C. Harsanyi and R. Selten, A general theory of equilibrium selection in games, MIT Press, Cambridge, MA, 1988. With a foreword by Robert Aumann. MR956053
[136] M. P. Hassell and H. C. J. Godfray, The population biology of insect parasitoids, Natural Enemies: The Population Biology of Predators, Parasites and Diseases (M. J. Crawley, ed.), Blackwell Scientific Publications, Oxford, 1992, pp. 265–292.
[137] S. M. Heap, P. Byrne, and D. Stuart-Fox, The adoption of landmarks for territorial boundaries, Animal Behaviour 83 (2012), 871–878.
[138] B. Heinrich, Ravens in Winter, Simon and Schuster, New York, 1989.
[139] V. F. Hendricks and P. G. Hansen (eds.), Game theory: 5 questions, Automatic Press/VIP, Copenhagen, Denmark, 2007.
[140] W. G. S. Hines and J. Maynard Smith, Games between relatives, J. Theoret. Biol. 79 (1979), no. 1, 19–30, DOI 10.1016/0022-5193(79)90254-6. MR540949
[141] W. G. S. Hines and M. Turelli, Multilocus evolutionarily stable strategy effects: additive effects, Journal of Theoretical Biology 187 (1997), 379–388.
[142] M. W. Hirsch, S. Smale, and R. L. Devaney, Differential equations, dynamical systems, and an introduction to chaos, 3rd ed., Elsevier/Academic Press, Amsterdam, 2013. MR3293130
[143] J. Hirshleifer, The paradox of power, Economics and Politics 3 (1991), 177–200.
[144] J. Hofbauer and K. Sigmund, Adaptive dynamics and evolutionary stability, Appl. Math. Lett. 3 (1990), no. 4, 75–79, DOI 10.1016/0893-9659(90)90051-C. MR1080408
[145] J. Hofbauer and K. Sigmund, Evolutionary Games and Population Dynamics, Cambridge University Press, Cambridge, UK, 1998.
[146] A. I. Houston and J. M. McNamara, Singing to attract a mate: a stochastic dynamic game, J. Theoret. Biol. 129 (1987), no. 1, 57–68, DOI 10.1016/S0022-5193(87)80203-5. MR918858
[147] A. I. Houston and J. M. McNamara, Fighting for food: a dynamic version of the Hawk-Dove game, Evolutionary Ecology 2 (1988), 51–64.
[148] A. I. Houston and J. M. McNamara, Models of Adaptive Behaviour, Cambridge University Press, Cambridge, 1999.
[149] P. L. Hurd and M. Enquist, A strategic taxonomy of biological communication, Animal Behaviour 70 (2005), no. 5, 1155–1170.
[150] V. C. L. Hutson and G. T. Vickers, The spatial struggle of tit-for-tat and defect, Philosophical Transactions of the Royal Society of London B 348 (1995), 393–404.
[151] Y. Iwasa, D. Cohen, and J. A. León, Tree height and crown shape, as results of competitive games, J. Theoret. Biol. 112 (1985), no. 2, 279–297, DOI 10.1016/S0022-5193(85)80288-5. MR781895
[152] D. J. Jennings, M. P. Gammell, C. M. Carlin, and T. J. Hayden, Win, lose or draw: a comparison of fight structure based on fight conclusion in the fallow deer, Behaviour 142 (2005), 423–439.
[153] M. W. Jeter, Mathematical programming, Monographs and Textbooks in Pure and Applied Mathematics, vol. 102, Marcel Dekker, Inc., New York, 1986. An introduction to optimization. MR841969
[154] J. Johansson and N. Jonzén, Effects of territory competition and climate change on timing of arrival to breeding grounds: A game-theory approach, American Naturalist 179 (2012), no. 4, 463–474.
[155] E. Kalai and D. Samet, Persistent equilibria in strategic games, Internat. J. Game Theory 13 (1984), no. 3, 129–144, DOI 10.1007/BF01769811. MR762348
[156] A. Keane, J. P. G. Jones, and E. J. Milner-Gulland, Modelling the effect of individual strategic behaviour on community-level outcomes of conservation interventions, Environmental Conservation 39 (2012), no. 4, 305–315.
[157] C. D. Kelly, Allometry and sexual selection of male weaponry in Wellington tree weta, Hemideina crassidens, Behavioral Ecology 16 (2005), 145–152.
[158] B. Kempenaers and E. Schlicht, Extra-pair behaviour, Animal Behaviour: Evolution and Mechanisms (P. M. Kappeler, ed.), Springer, Berlin, 2010, pp. 359–411.
[159] T. Killingback, M. Doebeli, and N. Knowlton, Variable investment, the continuous prisoner’s dilemma, and the origin of cooperation, Proceedings of the Royal Society of London B 266 (1999), 1723–1728.
[160] H. Kokko, Dyadic contests: modelling fights between two individuals, Animal Contests (I. C. W. Hardy and M. Briffa, eds.), Cambridge University Press, Cambridge, 2013, pp. 5–32.
[161] H. Kokko, A. López-Sepulcre, and L. J. Morrell, From hawks and doves to self-consistent games of territorial behavior, American Naturalist 167 (2006), no. 6, 901–912.
[162] K. A. Konrad, Strategy and Dynamics in Contests, Oxford University Press, Oxford, 2009.
[163] D. M. Kreps, Game Theory and Economic Modelling, Clarendon Press, Oxford, 1990.
[164] J. Kropko, Mathematics for social scientists, SAGE Publications, Thousand Oaks, California, 2016.
[165] K. Kura, M. Broom, and A. Kandler, A game-theoretical winner and loser model of dominance hierarchy formation, Bull. Math. Biol. 78 (2016), no. 6, 1259–1290, DOI 10.1007/s11538-016-0186-9. MR3523139
[166] T. Kura and K. Kura, War of attrition with individual differences on RHP, Journal of Theoretical Biology 193 (1998), 335–344.
[167] A. H. Kydd, International Relations Theory: The Game-Theoretic Approach, Cambridge University Press, Cambridge, 2015.
[168] M. Land and O. Gefeller, A game-theoretic approach to partitioning attributable risks in epidemiology, Biometrical J. 39 (1997), no. 7, 777–792, DOI 10.1002/bimj.4710390705. MR1613679
[169] S. Le and R. Boyd, Evolutionary dynamics of the continuous iterated prisoner’s dilemma, J. Theoret. Biol. 245 (2007), no. 2, 258–267, DOI 10.1016/j.jtbi.2006.09.016. MR2306446
[170] T. E. Lee and D. L. Roberts, Devaluing rhino horns as a theoretical game, Ecological Modelling 337 (2016), 73–78.
[171] R. Levins, The strategy of model building in population biology, American Scientist 54 (1966), no. 4, 421–431.
[172] S. Lippold, L. P. Fitzsimmons, J. R. Foote, L. M. Ratcliffe, and D. J. Mennill, Post-contest behaviour in black-capped chickadees (Poecile atricapillus): loser displays, not victory displays, follow asymmetrical countersinging exchanges, Acta Ethologica 11 (2008), 67–72.
[173] J. P. Lorberbaum, D. E. Bohning, A. Shastri, and L. E. Sine, Are there really no evolutionarily stable strategies in the iterated prisoner’s dilemma?, J. Theoret. Biol. 214 (2002), no. 2, 155–169, DOI 10.1006/jtbi.2001.2455. MR1940377
[174] R. D. Luce and H. Raiffa, Games and decisions: introduction and critical survey, John Wiley & Sons, Inc., New York, N. Y., 1957. A study of the Behavioral Models Project, Bureau of Applied Social Research, Columbia University. MR0087572
[175] D. Luenberger, Linear and Nonlinear Programming, 2nd ed., Addison-Wesley, Reading, Massachusetts, 1984.
[176] O. L. Mangasarian, Nonlinear programming, McGraw-Hill Book Co., New York-London-Sydney, 1969. MR0252038
[177] J. Mann, R. C. Connor, P. L. Tyack, and H. Whitehead (eds.), Cetacean Societies: Field Studies of Dolphins and Whales, University of Chicago Press, 2000.
[178] J. H. Marden and R. A. Rollins, Assessment of energy reserves by damselflies engaged in aerial contests for mating territories, Animal Behaviour 48 (1994), 1023–1030.
[179] J. H. Marden and J. K. Waage, Escalated damselfly territorial contests are energetic wars of attrition, Animal Behaviour 39 (1990), 954–959.
[180] M. Mareš, Fuzzy cooperative games, Studies in Fuzziness and Soft Computing, vol. 72, Physica-Verlag, Heidelberg, 2001. Cooperation with vague expectations. MR1841340
[181] P. Marrow, R. A. Johnstone, and L. D. Hurst, Riding the evolutionary streetcar: where population genetics and game theory meet, Trends in Ecology and Evolution 11 (1996), 445–446.
[182] M. Maschler, Some tips concerning application of game theory to real problems, Game Practice: Contributions From Applied Game Theory (F. Patrone, I. Garcia-Jurado, and S. Tijs, eds.), Kluwer, Boston, 2000, pp. 1–5.
[183] M. Maschler, B. Peleg, and L. S. Shapley, Geometric properties of the kernel, nucleolus, and related solution concepts, Math. Oper. Res. 4 (1979), no. 4, 303–338, DOI 10.1287/moor.4.4.303. MR549121
[184] J. Maynard Smith, On Evolution, Edinburgh University Press, Edinburgh, 1972.
[185] J. Maynard Smith, The theory of games and the evolution of animal conflicts, J. Theoret. Biol. 47 (1974), no. 1, 209–221, DOI 10.1016/0022-5193(74)90110-6. MR0444115
[186] J. Maynard Smith, Evolution and the theory of games, American Scientist 64 (1976), no. 1, 41–45.
[187] J. Maynard Smith, Game theory and the evolution of behavior, Proceedings of the Royal Society of London B 205 (1979), 475–488.
[188] J. Maynard Smith, Evolution and the Theory of Games, Cambridge University Press, Cambridge, 1982.
[189] J. Maynard Smith, Can a mixed strategy be stable in a finite population?, J. Theoret. Biol. 130 (1988), no. 2, 247–251, DOI 10.1016/S0022-5193(88)80100-0. MR927216
[190] D. G. C. Harper, Maynard Smith: amplifying the reasons for signal reliability, J. Theoret. Biol. 239 (2006), no. 2, 203–209, DOI 10.1016/j.jtbi.2005.08.034. MR2223616
[191] J. Maynard Smith and G. A. Parker, The logic of asymmetric contests, Animal Behaviour 24 (1976), no. 2, 159–175.
[192] J. Maynard Smith and G. R. Price, The logic of animal conflict, Nature 246 (1973), 15–18.
[193] J. Maynard Smith and E. Szathmáry, The Origins of Life, Oxford University Press, Oxford, 1999.
[194] D. McFarland, A Dictionary of Animal Behaviour, Oxford University Press, Oxford, 2006.
[195] B. J. McGill and J. S. Brown, Evolutionary game theory and adaptive dynamics of continuous traits, Annual Review of Ecology and Systematics 38 (2007), 403–435.
[196] C. A. McLean and D. Stuart-Fox, Rival assessment and comparison of morphological and performance-based predictors of fighting ability in Lake Eyre dragon lizards, Ctenophorus maculosus, Behavioral Ecology and Sociobiology 69 (2015), 523–531.
[197] J. M. McNamara, Towards a richer evolutionary game theory, Journal of the Royal Society Interface 10 (2013), no. 88, 20130544.
[198] J. M. McNamara and O. Leimar, Variation and the response to variation as a basis for successful cooperation, Philosophical Transactions of the Royal Society of London B 365 (2010), 2627–2633.
[199] G. G. McNickle and R. Dybzinski, Game theory and plant ecology, Ecology Letters 16 (2013), 545–555.
[200] M. Mesterton-Gibbons, A game-theoretic analysis of a motorist’s dilemma, Math. Comput. Modelling 13 (1990), no. 2, 9–14, DOI 10.1016/0895-7177(90)90028-L. MR1045776
[201] M. Mesterton-Gibbons, An escape from “the prisoner’s dilemma”, J. Math. Biol. 29 (1991), no. 3, 251–269, DOI 10.1007/BF00160538. MR1089785
[202] M. Mesterton-Gibbons, Ecotypic variation in the asymmetric Hawk-Dove game: when is Bourgeois an ESS?, Evolutionary Ecology 6 (1992), 198–222 and 448.
[203] M. Mesterton-Gibbons, On the iterated prisoner’s dilemma in a finite population, Bulletin of Mathematical Biology 54 (1992), 423–443.
[204] M. Mesterton-Gibbons, Game-theoretic resource modeling, Natur. Resource Modeling 7 (1993), no. 2, 93–147, DOI 10.1111/j.1939-7445.1993.tb00143.x. MR1267942
[205] M. Mesterton-Gibbons, The Hawk-Dove game revisited: effects of continuous variation in resource-holding potential on the frequency of escalation, Evolutionary Ecology 8 (1994), 230–247.
[206] M. Mesterton-Gibbons, On the war of attrition and other games among kin, J. Math. Biol. 34 (1996), no. 3, 253–270, DOI 10.1007/BF00160496. MR1375815
[207] M. Mesterton-Gibbons, On sperm competition games: incomplete fertilization risk and the equity paradox, Proceedings of the Royal Society of London B 266 (1999), 269–274.
[208] M. Mesterton-Gibbons, On sperm competition games: raffles and roles revisited, J. Math. Biol. 39 (1999), no. 2, 91–108, DOI 10.1007/s002850050164. MR1714673
[209] M. Mesterton-Gibbons, On the evolution of pure winner and loser effects: a game-theoretic model, Bulletin of Mathematical Biology 61 (1999), 1151–1186.
[210] M. Mesterton-Gibbons, A Concrete Approach to Mathematical Modelling, John Wiley, New York, 2007, corrected reprint of 1989 original.
[211] M. Mesterton-Gibbons and E. S. Adams, Animal contests as evolutionary games, American Scientist 86 (1998), 334–341.
[212] M. Mesterton-Gibbons and E. S. Adams, Landmarks in territory partitioning: a strategically stable convention?, American Naturalist 161 (2003), 685–697.
[213] M. Mesterton-Gibbons and M. J. Childress, Constraints on reciprocity for non-sessile organisms, Bulletin of Mathematical Biology 58 (1996), 861–875.
[214] M. Mesterton-Gibbons and Y. Dai, An effect of landmarks on territory shape in a Convict cichlid, Bull. Math. Biol. 77 (2015), no. 12, 2366–2378, DOI 10.1007/s11538-015-0130-4. MR3432500
[215] M. Mesterton-Gibbons, Y. Dai, and M. Goubault, Modeling the evolution of winner and loser effects: a survey and prospectus, Math. Biosci. 274 (2016), 33–44, DOI 10.1016/j.mbs.2016.02.002. MR3471834
[216] M. Mesterton-Gibbons, Y. Dai, M. Goubault, and I. C. W. Hardy, Volatile chemical emission as a weapon of rearguard action: a game-theoretic model of contest behavior, Bull. Math. Biol. 79 (2017), no. 11, 2413–2449, DOI 10.1007/s11538-017-0335-9. MR3717369
[217] M. Mesterton-Gibbons and L. A. Dugatkin, Cooperation among unrelated individuals: evolutionary factors, Quarterly Review of Biology 67 (1992), 267–281.
[218] M. Mesterton-Gibbons and L. A. Dugatkin, Cooperation and the prisoner’s dilemma: toward testable models of mutualism versus reciprocity, Animal Behaviour 54 (1997), 551–557.
[219] M. Mesterton-Gibbons and L. A. Dugatkin, On the evolution of delayed recruitment to food bonanzas, Behavioral Ecology 10 (1999), 377–390.
[220] M. Mesterton-Gibbons, S. Gavrilets, J. Gravner, and E. Akçay, Models of coalition or alliance formation, J. Theoret. Biol. 274 (2011), 187–204, DOI 10.1016/j.jtbi.2010.12.031. MR2974954
[221] M. Mesterton-Gibbons and I. C. W. Hardy, The influence of contests on optimal clutch size: a game-theoretic model, Proceedings of the Royal Society of London B 271 (2004), 971–978.
[222] M. Mesterton-Gibbons and S. M. Heap, Variation between self and mutual assessment in animal contests, American Naturalist 183 (2014), 199–213.
[223] M. Mesterton-Gibbons, T. Karabiyik, and T. N. Sherratt, The iterated Hawk-Dove game revisited: the effect of ownership uncertainty on Bourgeois as a pure convention, Dyn. Games Appl. 4 (2014), no. 4, 407–431, DOI 10.1007/s13235-014-0111-5. MR3280565
[224] M. Mesterton-Gibbons, T. Karabiyik, and T. N. Sherratt, On the evolution of partial respect for ownership, Dyn. Games Appl. 6 (2016), no. 3, 359–395, DOI 10.1007/s13235-015-0152-4. MR3530233
[225] M. Mesterton-Gibbons, J. H. Marden, and L. A. Dugatkin, On wars of attrition without assessment, Journal of Theoretical Biology 181 (1996), 65–83.
[226] M. Mesterton-Gibbons and E. J. Milner-Gulland, On the strategic stability of monitoring: implications for cooperative wildlife management programs in Africa, Proceedings of the Royal Society of London B 265 (1998), 1237–1244.
[227] M. Mesterton-Gibbons and T. N. Sherratt, Victory displays: a game-theoretic analysis, Behavioral Ecology 17 (2006), 597–605.
[228] M. Mesterton-Gibbons and T. N. Sherratt, Coalition formation: a game-theoretic analysis, Behavioral Ecology 18 (2007), 277–286.
[229] M. Mesterton-Gibbons and T. N. Sherratt, Social eavesdropping: a game-theoretic analysis, Bull. Math. Biol. 69 (2007), no. 4, 1255–1276, DOI 10.1007/s11538-006-9151-3. MR2312215
[230] M. Mesterton-Gibbons and T. N. Sherratt, Animal network phenomena: insights from triadic games, Complexity 14 (2009), no. 4, 44–50, DOI 10.1002/cplx.20251. MR2500070
[231] M. Mesterton-Gibbons and T. N. Sherratt, Neighbor intervention: a game-theoretic model, J. Theoret. Biol. 256 (2009), no. 2, 263–275, DOI 10.1016/j.jtbi.2008.10.004. MR2973445
[232] M. Mesterton-Gibbons and T. N. Sherratt, Information, variance and cooperation: minimal models, Dyn. Games Appl. 1 (2011), no. 3, 419–439, DOI 10.1007/s13235-011-0017-4. MR2842291
[233] M. Mesterton-Gibbons and T. N. Sherratt, Divide and conquer: when and how should competitors share?, Evolutionary Ecology 26 (2012), 943–954.
[234] M. Mesterton-Gibbons and T. N. Sherratt, Signalling victory to ensure dominance: a continuous model, Advances in dynamic games, Ann. Internat. Soc. Dynam. Games, vol. 12, Birkhäuser/Springer, New York, 2012, pp. 25–38. MR2976429
[235] M. Mesterton-Gibbons and T. N. Sherratt, Bourgeois versus anti-Bourgeois: a model of infinite regress, Animal Behaviour 89 (2014), 171–183.
[236] M. Mesterton-Gibbons and T. N. Sherratt, How residency duration affects the outcome of a territorial contest: complementary game-theoretic models, J. Theoret. Biol. 394 (2016), 137–148, DOI 10.1016/j.jtbi.2016.01.016. MR3463305
[237] R. E. Michod, Darwinian Dynamics, Princeton University Press, Princeton, New Jersey, 1999.
[238] C. Molina and D. J. D. Earn, Game theory of pre-emptive vaccination before bioterrorism or accidental release of smallpox, Journal of the Royal Society Interface 12 (2015), no. 107, 20141387.
[239] K. Montroy, M. J. Loranger, and S. M. Bertram, Male crickets adjust their aggressive behavior when a female is present, Behavioural Processes 124 (2016), 108–114.
[240] S. C. Mouterde, D. M. Duganzich, L. E. Molles, S. Helps, and F. Helps, Triumph displays inform eavesdropping little blue penguins of new dominance asymmetries, Animal Behaviour 83 (2012), 605–611.
[241] A. Muthoo, Bargaining Theory with Applications, Cambridge University Press, Cambridge, 1999.
[242] J. F. Nash Jr., The bargaining problem, Econometrica 18 (1950), 155–162, DOI 10.2307/1907266. MR0035977
[243] J. Nash, Non-cooperative games, Ann. of Math. (2) 54 (1951), 286–295, DOI 10.2307/1969529. MR0043432
[244] C. Neuhauser and M. L. Roper, Calculus for biology and medicine, 4th ed., Pearson, New York, 2018.
[245] J. A. Newman and T. Caraco, Cooperative and non-cooperative bases of food-calling, Journal of Theoretical Biology 141 (1989), 197–209.
[246] I. Nishizaki and M. Sakawa, Fuzzy and multiobjective games for conflict resolution, Studies in Fuzziness and Soft Computing, vol. 64, Physica-Verlag, Heidelberg, 2001. MR1840381
[247] R. Noë, Alliance formation among male baboons: shopping for profitable partners, Coalitions and Alliances in Humans and other Animals (A. H. Harcourt and F. B. M. de Waal, eds.), Oxford University Press, Oxford, 1992, pp. 285–321.
[248] R. Noë, A model of coalition formation among male baboons with fighting ability as the crucial parameter, Animal Behaviour 47 (1994), 211–213.
[249] G. Nöldeke and L. Samuelson, How costly is the honest signaling of need?, Journal of Theoretical Biology 197 (1999), 527–539.
[250] M. Nowak, Stochastic strategies in the prisoner’s dilemma, Theoret. Population Biol. 38 (1990), no. 1, 93–112, DOI 10.1016/0040-5809(90)90005-G. MR1069150
[251] M. Nowak, Evolutionary Dynamics, Belknap Press, Cambridge, Massachusetts, 2006.
[252] M. A. Nowak, Evolving cooperation, J. Theoret. Biol. 299 (2012), 1–8, DOI 10.1016/j.jtbi.2012.01.014. MR2899045
[253] M. A. Nowak, S. Bonhoeffer, and R. M. May, More spatial games, Internat. J. Bifur. Chaos Appl. Sci. Engrg. 4 (1994), no. 1, 33–56, DOI 10.1142/S0218127494000046. MR1276803
[254] M. Nowak and K. Sigmund, The evolution of stochastic strategies in the prisoner’s dilemma, Acta Appl. Math. 20 (1990), no. 3, 247–265, DOI 10.1007/BF00049570. MR1081589
[255] M. Nowak and K. Sigmund, A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner’s dilemma game, Nature 364 (1993), 56–58.
[256] M. Nowak and K. Sigmund, Evolution of indirect reciprocity by image scoring, Nature 393 (1998), 573–577.
[257] G. Owen, Game theory, 2nd ed., Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York-London, 1982. MR697721
[258] K. M. Page and M. A. Nowak, Unifying evolutionary dynamics, J. Theoret. Biol. 219 (2002), no. 1, 93–98, DOI 10.1016/S0022-5193(02)93112-7. MR2043352
[259] L. Page and J. Coates, Winner and loser effects in human competitions. Evidence from equally matched tennis players, Evolution and Human Behavior 38 (2017), 530–535.
[260] G. A. Parker, Assessment strategy and the evolution of fighting behaviour, Journal of Theoretical Biology 47 (1974), 223–243.
[261] G. A. Parker, Sperm competition games: raffles and roles, Proceedings of the Royal Society of London B 242 (1990), 120–126.
[262] G. A. Parker, Sperm competition and the evolution of ejaculates: toward a theory base, Sperm Competition and Sexual Selection (T. R. Birkhead and A. P. Møller, eds.), Academic Press, San Diego, 1998, pp. 3–54.
[263] G. A. Parker, Sperm competition games between related males, Proceedings of the Royal Society of London B 267 (2000), 1027–1032.
[264] G. A. Parker, C. M. Lessells, and L. W. Simmons, Sperm competition games: A general model for precopulatory male-male competition, Evolution 67 (2012), no. 1, 95–109.
[265] G. A. Parker and T. Pizzari, Sperm competition and ejaculate economics, Biological Reviews 85 (2010), 897–934.
[266] R. J. H. Payne, Gradually escalating fights and displays: the cumulative assessment model, Animal Behaviour 56 (1998), no. 3, 651–662.
[267] I. Pen and F. J. Weissing, Sperm competition and sex allocation in simultaneous hermaphrodites: a new look at Charnov’s invariance principle, Evolutionary Ecology Research 1 (1999), 517–525.
[268] S. Perry and J. H. Manson, Manipulative Monkeys: The Capuchins of Lomas Barbudal, Harvard University Press, Cambridge, Massachusetts, 2008.
[269] H. Peters, Game theory, Springer-Verlag, Berlin, 2008. A multi-leveled approach. MR2492336
[270] G. Petersen and I. C. W. Hardy, The importance of being larger: parasitoid intruder-owner contests and their implications for clutch size, Animal Behaviour 51 (1996), 1363–1373.
[271] L. Phlips, The Economics of Imperfect Information, Cambridge University Press, Cambridge, 1988.
[272] P. Poletti, M. Ajelli, and S. Merler, Risk perception and effectiveness of uncoordinated behavioral responses in an emerging epidemic, Math. Biosci. 238 (2012), no. 2, 80–89, DOI 10.1016/j.mbs.2012.04.003. MR2947086
[273] W. H. Press and F. J. Dyson, Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent, Proceedings of the National Academy of Sciences USA 109 (2012), no. 26, 10409–10413.
[274] J. N. Pruitt and C. J. Goodnight, Site-specific group selection drives locally adapted group compositions, Nature 514 (2014), 359–362.
[275] D. A. Rand, Correlation equations and pair approximations for spatial ecologies, Advanced Ecological Theory (J. M. McGlade, ed.), Blackwell Science, Oxford, 1999, pp. 100–142.
[276] D. G. Rand and M. A. Nowak, Human cooperation, Trends in Cognitive Sciences 17 (2013), no. 8, 413–425.
[277] A. Rapoport, Applications of game-theoretic concepts in biology, Bull. Math. Biol. 47 (1985), no. 2, 161–192, DOI 10.1016/S0092-8240(85)90046-1. MR803560
[278] J. C. Reboreda and A. Kacelnik, The role of autoshaping in cooperative two-player games between starlings, Journal of the Experimental Analysis of Behavior 60 (1993), 67–83.
[279] S. M. Redpath, A. Keane, H. Andrén, Z. Baynham-Herd, N. Bunnefeld, A. B. Duthie, J. Frank, C. A. Garcia, J. Månsson, L. Nilsson, C. R. J. Pollard, O. S. Rakotonarivo, C. F. Salk, and H. Travers, Games as tools to address conservation conflicts, Trends in Ecology and Evolution 33 (2018), no. 6, 415–426.
[280] H. K. Reeve and L. A. Dugatkin, Why we need evolutionary game theory, Game Theory and Animal Behavior (L. A. Dugatkin and H. K. Reeve, eds.), Oxford University Press, New York, 1998, pp. 304–311.
[281] H. K. Reeve, S. T. Emlen, and L. Keller, Reproductive sharing in animal societies: reproductive incentives or incomplete control by dominant breeders?, Behavioral Ecology 9 (1998), 267–278.
[282] T. C. Reluga, An SIS epidemiology game with two subpopulations, J. Biol. Dyn. 3 (2009), no. 5, 515–531, DOI 10.1080/17513750802638399. MR2572751
[283] G. S. Requena and S. H. Alonzo, Sperm competition games when males invest in paternal care, Proceedings of the Royal Society of London B 284 (2017), no. 1860, 20171266.
[284] S. E. Riechert, Maynard Smith & Parker’s (1976) rule book for animal contests, mostly, Animal Behaviour 86 (2013), 3–9.
[285] G. Roberts and T. N. Sherratt, Development of cooperative relationships through increasing investment, Nature 394 (1998), 175–179.
[286] G. Romp, Game Theory, Oxford University Press, Oxford, 1997.
[287] K. H. Rosen, Discrete Mathematics and Its Applications, 8th ed., McGraw-Hill, New York, 2019.
[288] J. Roughgarden, Teamwork, pleasure and bargaining in animal social behaviour, Journal of Evolutionary Biology 25 (2012), 1454–1462.
[289] T. Roughgarden, Twenty lectures on algorithmic game theory, Cambridge University Press, Cambridge, 2016. MR3525052
[290] A. Rubinstein, Comments on the interpretation of game theory, Econometrica 59 (1991), 909–924.
[291] H. Rusch and S. Gavrilets, The logic of animal intergroup conflict: A review, Journal of Economic Behavior and Organization (2017), DOI 10.1016/j.jebo.2017.05.004.
[292] L. Samuelson, Evolutionary games and equilibrium selection, MIT Press Series on Economic Learning and Social Evolution, vol. 1, MIT Press, Cambridge, MA, 1997. MR1447191
[293] W. H. Sandholm, Population games and evolutionary dynamics, Economic Learning and Social Evolution, MIT Press, Cambridge, MA, 2010. MR2560519
[294] M. E. Schaffer, Evolutionarily stable strategies for a finite population and a variable contest size, J. Theoret. Biol. 132 (1988), no. 4, 469–478, DOI 10.1016/S0022-5193(88)80085-7. MR949816
[295] T. C. Schelling, Micromotives and Macrobehavior, Norton, New York, 1978.
[296] H. R. Schiffman, Sensation and Perception, 2nd ed., John Wiley, New York, 1982.
[297] D. Schmeidler, The nucleolus of a characteristic function game, SIAM J. Appl. Math. 17 (1969), 1163–1170, DOI 10.1137/0117107. MR0260432
[298] P. L. Schwagmeyer and G. A. Parker, Male mate choice as predicted by sperm competition in 13-lined ground squirrels, Nature 348 (1990), 62–64.
[299] W. A. Searcy and S. Nowicki, The Evolution of Animal Communication: Reliability and Deception in Signaling Systems, Princeton University Press, Princeton, New Jersey, 2005.
[300] R. Selten, A note on evolutionarily stable strategies in asymmetric animal conflicts, J. Theoret. Biol. 84 (1980), no. 1, 93–101, DOI 10.1016/S0022-5193(80)81038-1. MR577174
[301] M. R. Servedio, Y. Brandvain, S. Dhole, C. L. Fitzpatrick, E. E. Goldberg, C. A. Stern, J. Van Cleve, and D. J. Yeh, Not just a theory—the utility of mathematical models in evolutionary biology, PLoS Biology 12 (2014), no. 12, e1002017.
[302] L. S. Shapley, Cores of convex games, Internat. J. Game Theory 1 (1971/72), 11–26; errata, ibid. 1 (1971/72), 199, DOI 10.1007/BF01753431. MR0311338
[303] P. P. Shenoy, Caplow’s theory of coalitions in the triad reconsidered, Journal of Mathematical Psychology 18 (1978), 177–194.
[304] T. N. Sherratt and M. Mesterton-Gibbons, Models of group or multi-party contests, Animal Contests (I. C. W. Hardy and M. Briffa, eds.), Cambridge University Press, Cambridge, 2013, pp. 33–46.
[305] T. N. Sherratt and M. Mesterton-Gibbons, The evolution of respect for property, Journal of Evolutionary Biology 28 (2015), 1185–1202.
[306] M. Shubik, Game Theory in the Social Sciences: Concepts and Solutions, The MIT Press, Cambridge, Massachusetts, 1982.
[307] M. Shubik, Game theory in the social sciences, MIT Press, Cambridge, Mass.-London, 1982. Concepts and solutions. MR664926
[308] M. Shubik, Cooperative game solutions: Australian, Indian, and U.S. opinions, Journal of Conflict Resolution 30 (1986), no. 1, 63–76.
[309] A. N. Siegel, Combinatorial game theory, Graduate Studies in Mathematics, vol. 146, American Mathematical Society, Providence, RI, 2013. MR3097920
[310] K. Sigmund, The calculus of selfishness, Princeton Series in Theoretical and Computational Biology, Princeton University Press, Princeton, NJ, 2010. MR2590059
[311] K. Sigmund (ed.), Evolutionary game dynamics, Proceedings of Symposia in Applied Mathematics, vol. 69, American Mathematical Society, Providence, RI, 2011. Papers from the American Mathematical Society Short Course held in New Orleans, LA, January 4–5, 2011; AMS Short Course Lecture Notes. MR2893685
[312] K. Sigmund, Moral assessment in indirect reciprocity, J. Theoret. Biol. 299 (2012), 25–30, DOI 10.1016/j.jtbi.2011.03.024. MR2899048
[313] M. Simaan and J. B. Cruz Jr., On the Stackelberg strategy in nonzero-sum games, J. Optimization Theory Appl. 11 (1973), 533–555, DOI 10.1007/BF00935665. MR0332207
[314] L. W. Simmons, S. Lüpold, and J. L. Fitzpatrick, Evolutionary trade-off between secondary sexual traits and ejaculates, Trends in Ecology and Evolution 32 (2017), no. 12, 964–976.
[315] J. E. Smith, R. C. Van Horn, K. S. Powning, A. R. Cole, K. E. Graham, S. K. Memenis, and K. E. Holekamp, Evolutionary forces favoring intragroup coalitions among spotted hyenas and other animals, Behavioral Ecology 21 (2010), 284–303.
[316] R. Spencer and M. Broom, A game-theoretical model of kleptoparasitic behavior in an urban gull (Laridae) population, Behavioral Ecology 29 (2018), no. 1, 60–78.
[317] D. W. Stephens, Cumulative benefit games: achieving cooperation when players discount the future, Journal of Theoretical Biology 205 (2000), 1–16.
[318] D. W. Stephens and V. K. Heinen, Modeling nonhuman conventions: the behavioral ecology of arbitrary action, Behavioral Ecology 29 (2018), no. 3, 598–608.
[319] D. W. Stephens, C. M. McLinn, and J. R. Stevens, Discounting and reciprocity in an Iterated Prisoner’s Dilemma, Science 298 (2002), 2216–2218.
[320] D. W. Stephens, K. Nishimura, and K. B. Toyer, Error and discounting in the iterated prisoner’s dilemma, Journal of Theoretical Biology 176 (1995), 457–469.
[321] A. J. Stewart and J. B. Plotkin, Extortion and cooperation in the prisoner’s dilemma, Proceedings of the National Academy of Sciences USA 109 (2012), no. 26, 10134–10135.
[322] A. J. Stewart and J. B. Plotkin, From extortion to generosity, evolution in the iterated prisoner’s dilemma, Proc. Natl. Acad. Sci. USA 110 (2013), no. 38, 15348–15353, DOI 10.1073/pnas.1306246110. MR3153960
[323] A. J. Stewart and J. B. Plotkin, Collapse of cooperation in evolving games, Proceedings of the National Academy of Sciences USA 111 (2014), no. 49, 17558–17563.
[324] P. D. Straffin, The prisoner’s dilemma, UMAP Journal 1 (1980), no. 1, 101–103.
[325] P. D. Straffin and J. P. Heaney, Game theory and the Tennessee Valley Authority, Internat. J. Game Theory 10 (1981), no. 1, 35–43, DOI 10.1007/BF01770069. MR618170
[326] R. Sugden, The Economics of Rights, Co-operation and Welfare, Basil Blackwell, Oxford, 1986.
[327] U. R. Sumaila, Game Theory and Fisheries: Essays on the Tragedy of Free for All Fishing, Routledge, Abingdon, UK, 2013.
[328] W. J. Sutherland and K. Norris, Behavioural models of population growth rates: implications for conservation and prediction, Philosophical Transactions of the Royal Society of London B 357 (2002), 1273–1284.
[329] W. J. Sutherland and K. Norris, Predicting the ecological consequences of environmental change: a review of the methods, Journal of Applied Ecology 43 (2006), 599–616.
[330] S. Számadó, Cheating as a mixed strategy in a simple model of aggressive communication, Animal Behaviour 59 (2000), 221–230.
[331] M. Taborsky, J. G. Frommen, and C. Riehl, The evolution of cooperation based on direct fitness benefits, Philosophical Transactions of the Royal Society of London B 371 (2016), no. 1687, 20150472.
[332] A. D. Taylor and W. S. Zwicker, Simple games, Princeton University Press, Princeton, NJ, 1999. Desirability relations, trading, pseudoweightings. MR1714706
[333] P. D. Taylor, Evolutionary stability in one-parameter models under weak selection, Theoret. Population Biol. 36 (1989), no. 2, 125–143, DOI 10.1016/0040-5809(89)90025-7. MR1020493
[334] P. D. Taylor and L. B. Jonker, Evolutionarily stable strategies and game dynamics, Math. Biosci. 40 (1978), no. 1-2, 145–156, DOI 10.1016/0025-5564(78)90077-9. MR0489983
[335] P. W. Taylor and R. W. Elwood, The mismeasure of animal contests, Animal Behaviour 65 (2003), no. 6, 1195–1202.
[336] B. Thomas, Evolutionarily stable sets in mixed-strategist models, Theoret. Population Biol. 28 (1985), no. 3, 332–341, DOI 10.1016/0040-5809(85)90033-4. MR816922
[337] R. Trivers, The evolution of reciprocal altruism, Quarterly Review of Biology 46 (1971), 35–57 (reprinted in Clutton-Brock & Harvey, pp. 189–226).
[338] R. Trivers, Social Evolution, Benjamin/Cummings, Menlo Park, California, 1985.
[339] Y.-J. J. Tsai, E. M. Barrows, and M. R. Weiss, Pure self-assessment of size during male-male contests in the parasitoid wasp Nasonia vitripennis, Ethology 120 (2014), 816–824.
[340] E. van Damme, Stability and perfection of Nash equilibria, Springer-Verlag, Berlin, 1987. MR917062
[341] G. S. van Doorn, G. M. Hengeveld, and F. J. Weissing, The Evolution of Social Dominance I: Two-Player Models, Behaviour 140 (2003), 1305–1332.
[342] F. Vega-Redondo, Evolution, Games, and Economic Behaviour, Oxford University Press, Oxford, 1996.
[343] F. Vega-Redondo, Economics and the Theory of Games, Cambridge University Press, Cambridge, 2003.
[344] W. L. Vickery, How to cheat against a simple mixed strategy ESS, J. Theoret. Biol. 127 (1987), no. 2, 133–139, DOI 10.1016/S0022-5193(87)80124-8. MR900327
[345] W. L. Vickery, Reply to Maynard Smith, Journal of Theoretical Biology 132 (1988), 375–378.
[346] W. E. Vinacke and A. Arkoff, An experimental study of coalitions in the triad, American Sociological Review 22 (1957), no. 4, 406–414.
[347] T. L. Vincent and J. S. Brown, Evolutionary Game Theory, Natural Selection, and Darwinian Dynamics, Cambridge University Press, Cambridge, 2005.
[348] T. L. Vincent and W. J. Grantham, Optimality in parametric systems, John Wiley & Sons, Inc., New York, 1981. A Wiley-Interscience Publication. MR628316
[349] M. Vojnović, Contest Theory: Incentive Mechanisms and Ranking Methods, Cambridge University Press, New York, 2015.
[350] J. von Neumann and O. Morgenstern, Theory of games and economic behavior, 3rd ed., Princeton University Press, Princeton, N.J., 1980. MR565457
[351] L. M. Wahl and M. A. Nowak, The continuous prisoner’s dilemma, I & II, Journal of Theoretical Biology 200 (1999), 307–338.
[352] L. A. Walker and G. I. Holwell, The role of exaggerated male chelicerae in male-male contests in New Zealand sheet-web spiders, Animal Behaviour 139 (2018), 29–36.
[353] M. B. Walker, Caplow’s theory of coalitions in the triad reconsidered, Journal of Experimental Social Psychology 27 (1973), no. 3, 409–412.
[354] J. Wang, The theory of games, Oxford Mathematical Monographs, The Clarendon Press, Oxford University Press, New York, 1988. Translated from the Chinese; Oxford Science Publications. MR969605
[355] D. Waxman and S. Gavrilets, 20 questions on adaptive dynamics, Journal of Evolutionary Biology 18 (2005), 1139–1154.
[356] J. N. Webb, Game theory: Decisions, interaction and evolution, Springer Undergraduate Mathematics Series, Springer-Verlag London, Ltd., London, 2007. MR2265299
[357] M. A. Ball and G. A. Parker, Sperm competition games: sperm selection by females, J. Theoret. Biol. 224 (2003), no. 1, 27–42, DOI 10.1016/S0022-5193(03)00118-8. MR2069247
[358] J. W. Weibull, Evolutionary game theory, MIT Press, Cambridge, MA, 1995. With a foreword by Ken Binmore. MR1347921
[359] S. West, Sex Allocation, Princeton University Press, Princeton, New Jersey, 2009.
[360] S. A. West, C. El Mouden, and A. Gardner, Sixteen common misconceptions about the evolution of cooperation in humans, Evolution and Human Behavior 32 (2011), 231–262.
[361] M. J. West-Eberhard, The evolution of social behavior by kin selection, Quarterly Review of Biology 50 (1975), 1–35.
[362] D. S. Wilson, The Natural Selection of Populations and Communities, Benjamin/Cummings, Menlo Park, California, 1980.
[363] S. Wolfram, Cellular Automata and Complexity, Addison-Wesley, Reading, Massachusetts, 1994.
[364] J. Wright, R. E. Stone, and N. Brown, Communal roosts as structured information centres in the raven, Corvus corax, Journal of Animal Ecology 72 (2003), 1003–1014.
[365] J. Wu and R. Axelrod, How to cope with noise in the iterated prisoner’s dilemma, Journal of Conflict Resolution 39 (1995), 183–189.
[366] J. M. Wyse, I. C. W. Hardy, L. Yon, and M. Mesterton-Gibbons, The impact of competition on elephant musth strategies: A game-theoretic model, Journal of Theoretical Biology 417 (2017), 109–130.
[367] S. D. Yi, S. K. Baek, and J.-K. Choi, Combination with anti-tit-for-tat remedies problems of tit-for-tat, J. Theoret. Biol. 412 (2017), 1–7, DOI 10.1016/j.jtbi.2016.09.017. MR3582302
[368] H. P. Young, Individual Strategy and Social Structure, Princeton University Press, Princeton, New Jersey, 1998.
[369] A. Zahavi and A. Zahavi, The Handicap Principle, Oxford University Press, New York, 1997.
[370] F. Zeuthen, Problems of Monopoly and Economic Welfare, G. Routledge, London, 1930.
[371] V. I. Zhukovskiy and M. E. Salukvadze, The vector-valued maximin, Mathematics in Science and Engineering, vol. 193, Academic Press, Inc., Boston, MA, 1994. MR1242123
[372] N. Zucker, Shelter building as a means of reducing territory size in the fiddler crab, Uca terpsichores (Crustacea: Ocypodidae), The American Midland Naturalist 91 (1974), 224–236.
Index
active constraint, 324 adaptive dynamics, 110 admissible direction, 122 advertising victory display, 330, 341 African wild dogs, 205 Agelenopsis aperta, see spiders agreement binding, 12, 115, 129 conservation, 295 ALLC, 180 ALLD, 180 allocation in imputation, 138 allogrooming, 201 social grooming, 201 alpha individual, 167, 321, 342 Alternative, Theorem of the, 123 anisogamy, 274 antergy, 342, 355 anti-Bourgeois, 81, 296, 298, 313–315 antique dealing, 152 assessment cumulative, 274 mutual, 218, 274, 319 self, 87, 221, 265, 319, 341 sequential, 274 asymmetry of role, 55, 74 of value, 356 payoff, 12, 55, 128, 227 average reward, 88, 90, 91, 186 Axelrod’s prototype, 183
baboons, 355 backward convergence, 104 backward induction, 48, 49, 103 bacteria, 201 Banzhaf–Coleman index, 170 bargaining point, 125 bargaining set, 121 bargaining solution, Nash’s, 125, 126 uniqueness condition, 128 basic reproductive number, 262 bats, 205 Bayesian Nash equilibrium, see equilibria beetles, 256, 318 Nicrophorus orbicollis, 256 behavioral ecology, 217 best reply, 16, 20, 37, 176, 209 Beta function incomplete, 246, 342, 358 beta individual, 167, 321, 342 bimatrix game, see game binomial distribution, 283 bioterrorism, 256 birds, 109, 274, 318, 330 socially monogamous, 341 blood sharing, 205 blue jays, 211 bluffing, 275 by stomatopods, 264 bonanza, 42, 281 boubous, 341 Bourgeois, 81, 296, 298, 311, 313, 315 branch (of tree), 48, 178
Brouwer fixed-point theorem, 51 browbeating victory display, 330, 337, 340 byproduct mutualism, see mutualism CAM, 274 Cartesian product, 14 car pooling, 136, 144 cellular automata, 210 cetaceans, 341 CFG, 4, 135 characteristic function game, 4 coreless, 147, 148, 160 improper, 159 simple, 163 characteristic function, 137 normalized, 138 characteristic function game, see CFG cheating, see defection chickadees, 330 Chump, 55 circular triad, 321 classic Hawk-Dove game, see game clutch size, 248 coalition formation of, 1, 165, 341, 357 in a CFG, 135, 137, 138 coefficient of variation, 246, 353 commitment, 12 common end, 205, 314 common enemy, 202, 314 common knowledge, 52 communication, 275 as precursor of cooperation, 129 of threats, 275 of victory, 331, 357 prior to game, 12, 27 community game, 4, 9, 57, 65, 76, 115, 227, 240, 242, 316 community-based wildlife management, 289, 360 community norm, 295 community wage, 289 compartment model, 257, 261, 314 complete information, see information computer tournament, 182, 192 conditional strategy, see strategy connected set, 32 constant returns to size, 233 contest success function, see CSF contests, 41, 248 territorial, 71, 218, 237, 264, 296, 315
continuously stable, see ESS continuous game, see game continuous population game, see population game contour map, 155, 169, 326 contrite tit for tat, see CTFT contrition (in IPD), 209 convention, 81, 237, 238, 275 convergence stable, 97, 109, 110 convex set, 128 convex CFG, 142 cooperation evolution of, 175 in CFG, 135 in foraging, 205, 281 in strategic-form game, 115 in wildlife management, 288 via mutualism, see mutualism via reciprocity, see reciprocity cooperator’s dilemma, 175, 204, 287 coreless CFG, see CFG core (of CFG), 141, 159, 162 correspondence, 19, 34 cost of displaying, 218, 332 of combat, 265 of depreciation, 136 of mating, 229 of monitoring, 290 of overestimating strength, 319 of scorekeeping, 205 of threatening, 265 of travel, 27, 35, 136 crabs, 211, 237, 317, 330, 355, 357 crayfish, 318 crickets, 318, 330, 341, 357 critical group size, 294 Crossroads, 9, 12, 50, 57, 62, 88, 94, 116, 127 Crossroads II, 67 crustaceans, 211, 264, 317, 330 CSF, 42 contest success function, 42, 168, 342, 360 CTFT, 209 contrite tit for tat, 209 damselflies, 218 Darwin, 216, 224 decision tree, see tree decision variable, 2, 4, 5, 363, 364
decision set, 14, 36, 40, 115, 121 defection, 128, 175 delayed recruiter, see DR delayers, 256 deviation, multiple versus uniform, 62 dictator, 172 difference (of two sets), 139 differential equation, 89, 91, 94, 257, 261 diminishing returns to size, 233, 234, 236 discounting, 69, 210 discrete population game, see population game dispersal, 274 display, 71, 73, 218, 330 threat, 264, 275 victory, 330 dissatisfaction, 144, 162 distribution Beta, 245–247, 278, 341, 345 exponential, 249, 251, 276, 277 geometric, 182, 298 parabolic, 221, 226 uniform, 220, 324 Weibull, 277 dominance, 167, 168, 170, 314, 321, 331, 333, 356 dominance disadvantage, 331 dominance hierarchy, see linear dominance hierarchy dominant strategy, see strategy dominated strategy, see strategy Dove, 71, 73, 83, 86, 298 DR, 282, 283, 286–288 delayed recruiter, 282 drift random, 297 driving behavior, 1, 9, 20, 57 dummy, 172 dynamic interaction (in IPD), 196, 201, 205, 209 dynamic programming, 103, 109 dynamical system, 89, 92, 93, 261 dynamics continuous, 89, 91–93, 98 discrete, 94, 185 ε-core (of CFG), 159 eavesdropping, 356 ecological simulation, 192 ecotype, 360
effort academic, 2 in contests, 41 military, 42, 43 egalitarian imputation, 162 elimination, 95 empty coalition, 137 envelope, 117 equations of, 118 environmental adversity, 205 epidemic, 262 epidemiology, 256, 275 equilibria (dynamic) of continuous system, 93, 263 of discrete system, 187 equilibria (strategic) Bayesian Nash, 52 Nash, 6, 17–19, 25, 34, 37, 40, 95, 184, 227 existence of, 26, 51 multiplicity of, 26, 50 strong, 41 weak, 41 errors, 60, 108, 109 in assessment, 221 in perception, 210, 297, 313 of execution, 207 ESS, 6, 62, 218, 222, 276, 359 among kin, 225 as property of strategy set, 192, 297 boundary, 77–80, 98, 100 conditions for continuous, 64, 225 discrete, 66, 206, 281 first-order, 81 necessary, 76–78, 80 second-order, 81 sufficient, 82 continuously stable, 97, 217, 253, 254, 264, 276, 351, 353 interior, 76–82 local versus global, 81, 110, 253, 346–348, 351 strong, 64, 66, 69, 95, 182, 184, 209, 228, 230, 271 weak, 65, 66, 69, 217, 225, 228, 271, 272 essential CFG, 138 Eupelmus vuilleti, see wasps event tree, see tree evolution
biological, 62, 94, 276 cultural (by imitation), 62, 94, 276 evolutionarily stable set, 236 evolutionarily stable state, 95 evolutionarily stable strategy, see ESS excess (of coalition), 141 shorthand for, 154 expected future reproductive success, see fitness expected value as payoff, 12 experienced (contestant), 321 extensive form, 4, 47 fairness in CFG, 135, 149, 151, 157 in sperm competition, 228 fast driver, 13 feasible strategy combination, 14, 29, 115 Fechner’s law, 219 feedback, 205, 314 fiddler crabs, 211, 237, 317, 355 finite (sub)population of foraging birds, 281 of IPD players, 193, 197, 198 of residents among wildlife, 289 fireflies, 317 fisheries management, 315 fishes, 248, 318 fitness, 71, 100, 217 inclusive, 227 focal individual, 57 foraging, 101 among insects, 177, 204 among ravens, 281 forgiving strategy, 183 forward convergence, 105 Four Ways, 21, 65, 77, 78, 109 game against nature, 5, 100, 104, 105, 112 among kin, 224, 227, 275 between kin, 227, 275 bimatrix, 12, 51, 364 community, see community game constant-sum, 12 continuous, 26, 51, 66 cooperative, 4, 12, 115, 129, 135 definition of, 3 dynamic, 109 Hawk-Dove, 70, 218, 274
classic, 73 iterated, see IH-D key ingredients of, 6 matrix, 12, 26 noncooperative, 4, 9, 12, 27, 34, 129 population, see population game symmetric, 12, 40, 172 triadic, 317 vector, 5 zero-sum, 12, 22, 51 game theory, 3 value of, 361 game-theoretic modelling, 3 Gamma function Euler’s, 246 gamma individual, 167, 321, 342 generosity (in IPD), 210 genetic transmission, dynamic of, 62, 94 geometric series derivative formula, 196 sum formula, 184 global ESS, see ESS golden ratio, 169 Goniozus nephantidis, see wasps gossip, dynamic of, 62 grand coalition, 135, 138 group rationality local or global, 118 versus individual rationality, 128, 138 guppies, 205 Harsanyi and Selten criterion, 57 harsh environment, 204 Hawk, 71, 73, 83, 86, 298 Hawk-Dove I, 72 Hawk-Dove II, 73 Hawk-Dove game, see game Hessian, 81 highly mobile organism, 201 Hotelling model, 27 hunting technology, 290 ICD, 205 IH-D, 298 imitation, 62, 94 immediate recruiter, see IR impala, 201 implicit function theorem, 120 improper CFG, see CFG imputation, 138 inclusive fitness, 227 increasing returns to size, 233, 234, 236
index of power, see power individual rationality, 120, 121, 128 inessential CFG, 138 infection rate, 264 infiltration, 94 infinite regress, 315 information, 52, 86, 228, 265, 317 complete, 19, 50, 52 imperfect, 52 incomplete, 52 partial, 52 inherited behavior, 62, 276 initial conditions, 90, 182 insects, 177, 204, 217, 224, 248, 317, 318, 330, 341, 357 interaction, mode of, see mode intermediate driver, 13 interpersonal comparison, see utility intrusiveness, 111, 297, 301, 302, 307, 308, 311 invasion, 95 invasion fitness, 97, 99, 110 IPD, 180 spatial, 210 under dynamic interaction, 196 under static interaction, 195 with ALLC, ALLD, TFT, 181 with ALLD, STFT, TF2T, 185 with ALLD, TFT, 182 with ALLD, TFT, STFT, 184 with TFT, TF2T, STFT, 189 with TFT, TF2T, STFT, ALLD, 188, 191 IR, 282, 283, 285–288 immediate recruiter, 282 isocline, 92, 93, 112, 262 iterated cooperator’s dilemma, see ICD iterated Hawk-Dove game, see IH-D iterated prisoner’s dilemma, see IPD Jacobian determinant, 120, 124 matrix, 93, 124, 263 just noticeable difference, 219 Kalai and Samet’s criterion, 57 Kant’s categorical imperative, 61 kinship, 224 kleptoparasitism, 314 Kuhn–Tucker conditions, 324 Lack clutch size, 248
Lagrange multiplier, 324 Lagrangian, 324 landmark (territorial), 237, 275, 357, 360 leaf (on tree), 47, 178 least core (of CFG), 160 least rational core (of CFG), 143 lexicographic center, 150 linear dominance hierarchy, 167, 321, 331, 333, 356 linear programming, 157 lizards, 224, 318 local ESS, see ESS locally stable equilibrium, 93, 187, 190, 263 long-jumping, team, 158 long-range technology, 290 loser effect, 318, 332, 356 losing coalition (in CFG), 164 Luangwa Valley, 290 mangrove crabs, 330, 357 mantis shrimps, 264 marginal cost parameter, 232, 319 Mathematica, xi, 82, 127, 326, 351 matrix game, see game max-min strategy, see strategy Maynard Smith’s criterion, 57 meerkats, 110, 205 merit subjective versus objective, 125 metastable equilibrium, 89, 95, 187, 189–191, 297 minimal stabilizing frequency, 191 minimizing function, 50, 120, 121 mistakes, see errors mixed extension, 364 mixed infiltration, 95 mixed strategy, see strategy model analytical, 210 state-dependent, 109 mode of interaction (in IPD), 193, 209 molting (by stomatopods), 264 mongooses, 110 monitoring of wildlife resource, 289 payment for, 295 monogamy social, 340 monomorphic ESS, 95, 223, 255, 287 monomorphism, see monomorphic ESS
morbidity risk, 260 move (of IPD), 180 multiple ESSs, 90, 182, 353 multiple Nash equilibria, 26 multivalued function, 19, 34 mutation, 62 mutual-assessment hypothesis, 218, 221, 223 mutualism, 205, 291, 314 versus reciprocity, 205, 211 naive (contestant), 321 Nash equilibrium, see equilibria nasty strategy, 183 Nature as an extra player, 177 necessary condition(s) for ESSs, 76 for stable TFT, 192 for unimprovability, 121 neighboring point, 118 nice population (playing IPD) under dynamic interaction, 198 under static interaction, 197 nice strategy, 183 Nicrophorus orbicollis, see beetles node stable or unstable, 93, 112, 263 noncooperative game, see game nonlinear dynamics, 96 nonlinear programming, 127 nonsubordination, 331 normal form, 4, 52 normalized characteristic function, 138 nucleolus (of CFG), 150, 157 Oecobius civitas, see spiders opportunity cost, 232, 291 optimal reaction set, 16, 23, 25, 31, 37, 40, 60, 82, 267 optimization constrained, 127, 324 scalar, 5 vector, 5 order notation, 97, 122 big oh, 97, 207 little oh, 122 oviposition game, 1, 177, 204 Owners and Intruders, 74–77, 80, 81, 111, 296, 297, 316 ownership advantage, 249
pairwise contests, 66 paradox, 217, 232, 248, 255, 265, 276, 309, 359 of power, 44, 54 of animal behavior, 217 of the prisoner’s dilemma, 176 paradoxical ESS, 309 parasitoid, 248, 356 parent-offspring conflict, 274 Pareto optimality local or global, 118 partitioning, 233, 237 pattern (of interaction), see mode payoff matrix, 11, 66, 363 penguins, 330 perception (subjective), 218, 318 perfectness, 60 phase-plane analysis, 88, 93, 262 pigs, 224 plant ecology, 274 player, 3, 9 playing the field, 66 poachers, 202, 289 Pokeroo, 56 polygyny resource defense, 341 polymorphic ESS, 95, 223, 255, 287 polymorphism, see polymorphic ESS population dynamics, 88, 91, 94, 185 population game, 4, 57 continuous, 66, 82, 215–217, 224, 228, 232, 237, 248, 256, 264, 317, 318, 330, 341 discrete, 66, 73, 76, 90, 176, 215, 281, 289 triadic, 317, 318, 330, 341 posse effect, 283, 287 power, index of, 164 pre-imputation, 159 predator inspection, 205 price setting, 1, 26, 35 primates, 355 prior experience effect, see winner effect or loser effect prisoner’s dilemma, 54, 111, 133, 176, 177, 180 between insects, 179 between rangers, 202 continuous, 209 escape from, 211
experiments, 211 iterated, see IPD spatial, 209 with kin, 277 producing (versus scrounging), 314 properness in CFG, 159 of Nash equilibrium, 60 protagonist, 57 provocable strategy, 183 psychophysics, 218 pure infiltration, 95, 197, 199 pure strategy, see strategy raffle (for fertilization), 228 random allocation, 72, 233–236 random drift, 297 ranger game, 202 rational reaction set, 16 rational ε-core (of CFG), 143 rats, 211, 318 ravens, 281 reasonable set, 140, 160, 162 reciprocal altruism, 180 reciprocity, 180 indirect, 314 versus mutualism, 205 recognition, 90, 129, 177, 204, 205 recruitment, immediate versus delayed, 281, 314 recurrence equation, 94, 95, 196, 307 relatedness, 224 relativity principle, 219 reliability, 318, 342, 351 replicator equations, 91, 110 replicator-mutator equations, 110 reproductive number, 262 reproductive skew, 340 reproductive value, 71 reproductive inequity, 321 resource divisible, 42, 72, 232 indivisible, 42, 71, 72, 218, 296 partitioning, 233, 236, 237 resource holding potential, see RHP Retaliator, 73 reward, 15, 29, 36 kin-modified, 225 max-min, 120, 121, 125–127, 129 reward function, 6, 15, 115 reward matrix, 66 reward set, 115
convex, 128 reward vector, 115 RHP, 165 RNDM, 182, 212 robustness of model, 360 of TFT, 192, 210 roles certain, 74, 228 favored versus disfavored, 81, 228 intruder versus owner, 74, 75, 265 northbound versus southbound, 55 uncertain, 231, 297 root (of tree), 47, 178 round-robin competition, 192, 318, 333 rumor, dynamic of, 62 saddle point, 81, 93, 112 salmon (Pacific), 248 SAM, 274 sample space, 10 savings attributable versus nonattributable, 149 scorekeeping, 205 scouting fee, 290, 295 scrounging (versus producing), 314 scrub jays, 109 seed dispersal, 274 selection pressure, 297 self-assessment hypothesis, 221, 223 self-enforcing, 19 semelparous, 248 sensitivity, 342 sentinels, 205 separability, 215 separatrices, 92, 93 sessile organism, 201, 209 sex allocation, 216, 274, 275 sex ratio, 216, 274 Shapley–Shubik index, 164 Shapley value, 161 short-range technology, 290 signalling, see communication simplex, 159 simple CFG, see CFG SIR, 261 susceptible, infected, and recovered, 261 SIR model, 261 slow to cooperate, see STCO slow driver, 13
smallpox, 256 snakes, 318 social grooming, see allogrooming social grooming, 205 social status, 282, 291, 314 solution concept, 6 solution curve, see trajectories song sparrows, 341 spatial structure, 209 sperm competition, 228, 275 spiders, 1, 224, 318 Agelenopsis aperta, 297 Oecobius civitas, 296, 359 spinning arrow, 14, 109, 129 squirrels, 13-lined ground, 228 stability matrix, 292 starlings, 211 static interaction (in IPD), 195, 201, 207 status enhancement, 282, 287, 314 STCO, 183 slow to cooperate, 183 STFT, 183 sticklebacks, 205 stochastic dynamic programming, 103, 109 stomatopods, 264 Store Wars, 27 Store Wars II, 35, 126 strategic behavior, 1–3 strategic form, 4, 115 strategy, 9, 22, 74, 363 behavior, 75 conditional, 20, 180 cooperative, 175, 287 correlated, 130 deviant, 61, 62, 192 dominant, 13, 21, 176 dominated, 13 max-min, 50, 115, 120, 121 mixed, 14, 22, 51, 75, 109 mutant, 61, 215, 222, 224, 228, 229, 232, 284, 286, 320 noncooperative, 175 orthodox, 62, 192 pure, 9, 51, 109, 188, 363 randomizing, 75 reciprocal, 180, 183 stochastic, 75 threshold, 56, 67, 83, 104 uninvadable, 60, 62
strategy set, 6, 14, 22 strategy combination, 11, 14, 22, 40, 364 feasible, 14, 29, 115 max-min, 50 strongly dominant, 13 strong ESS, see ESS subgame, 10, 48, 49, 233 superadditivity (in CFG), 142, 159 susceptible, infected, and recovered, see SIR suspicious tit for tat, see STFT symmetric game, see game synergicity, 342 synergy, 342 tactic, 74 Taylor’s theorem, 97 Tennessee Valley Authority, 149 tentative solution, 118 territorial contest, see contests territorial partitioning, 237 testable predictions, 224, 361 TF2T, 183 TFT, 180 threat, see display tit for tat, see TFT tit for two tats, see TF2T trajectories (of dynamical system), 90, 92, 93, 187, 190 tree decision, 47 event, 177, 250 triadic game, 317 unilateral departure, 17 unimprovability in CFG, 138 local or global, 118, 119, 121 necessary conditions for, 122 set of candidates for, 122 unintrusiveness, 111, 297, 301, 302, 307, 308, 311 uninvadable strategy, see strategy unit square, 14 update matrix, 207 utility, interpersonal comparison of, 126, 131, 135 vaccination, 256 vaccinators, 256
variance, 241, 318, 353
vector game, see game
vector games, 5
vertex (on tree), 48, 178
victory display, 330
voting, 163
war of attrition, 218, 221, 223, 224, 265, 274, 276, 277
  among kin, 225, 226, 277
war of attrition without assessment, see WOA-WA
wasps, 224, 237
  Eupelmus vuilleti, 356
  Goniozus nephantidis, 248
weak ESS, see ESS
weakly dominant, 13
weakly dominated, 13, 181
wetas, 224, 341
wildlife conservation, 202, 288, 315
winner effect, 318, 332, 356
winning coalition (in CFG), 164
WOA-WA, 224, 274, 276, 277
  war of attrition without assessment, 224
Zambia, 290
zero-sum game, see game
Published Titles in This Series
37 Mike Mesterton-Gibbons, An Introduction to Game-Theoretic Modelling, Third Edition, 2019
35 Álvaro Lozano-Robledo, Number Theory and Geometry, 2019
34 C. Herbert Clemens, Two-Dimensional Geometries, 2019
33 Brad G. Osgood, Lectures on the Fourier Transform and Its Applications, 2019
32 John M. Erdman, A Problems Based Course in Advanced Calculus, 2018
31 Benjamin Hutz, An Experimental Introduction to Number Theory, 2018
30 Steven J. Miller, Mathematics of Optimization: How to do Things Faster, 2017
29 Tom L. Lindstrøm, Spaces, 2017
28 Randall Pruim, Foundations and Applications of Statistics: An Introduction Using R, Second Edition, 2018
27 Shahriar Shahriari, Algebra in Action, 2017
26 Tamara J. Lakins, The Tools of Mathematical Reasoning, 2016
25 Hossein Hosseini Giv, Mathematical Analysis and Its Inherent Nature, 2016
24 Helene Shapiro, Linear Algebra and Matrices, 2015
23 Sergei Ovchinnikov, Number Systems, 2015
22 Hugh L. Montgomery, Early Fourier Analysis, 2014
21 John M. Lee, Axiomatic Geometry, 2013
20 Paul J. Sally, Jr., Fundamentals of Mathematical Analysis, 2013
19 R. Clark Robinson, An Introduction to Dynamical Systems: Continuous and Discrete, Second Edition, 2012
18 Joseph L. Taylor, Foundations of Analysis, 2012
17 Peter Duren, Invitation to Classical Analysis, 2012
16 Joseph L. Taylor, Complex Variables, 2011
15 Mark A. Pinsky, Partial Differential Equations and Boundary-Value Problems with Applications, Third Edition, 1998
14 Michael E. Taylor, Introduction to Differential Equations, 2011
13 Randall Pruim, Foundations and Applications of Statistics, 2011
12 John P. D’Angelo, An Introduction to Complex Analysis and Geometry, 2010
11 Mark R. Sepanski, Algebra, 2010
10 Sue E. Goodman, Beginning Topology, 2005
9 Ronald Solomon, Abstract Algebra, 2003
8 I. Martin Isaacs, Geometry for College Students, 2001
7 Victor Goodman and Joseph Stampfli, The Mathematics of Finance, 2001
6 Michael A. Bean, Probability: The Science of Uncertainty, 2001
5 Patrick M. Fitzpatrick, Advanced Calculus, Second Edition, 2006
4 Gerald B. Folland, Fourier Analysis and Its Applications, 1992
3 Bettina Richmond and Thomas Richmond, A Discrete Transition to Advanced Mathematics, 2004
2 David Kincaid and Ward Cheney, Numerical Analysis: Mathematics of Scientific Computing, Third Edition, 2002
1 Edward D. Gaughan, Introduction to Analysis, Fifth Edition, 1998
In the present third edition, the author has added substantial new material on evolutionarily stable strategies and their use in behavioral ecology. The only prerequisites are calculus and some exposure to matrix algebra, probability, and differential equations.
For additional information and updates on this book, visit www.ams.org/bookpages/amstext-37
AMSTEXT/37
The Sally Series
This series was founded by the highly respected mathematician and educator Paul J. Sally, Jr.
Photo courtesy of Eliza Schneider-Green.
This book introduces game theory and its applications from an applied mathematician’s perspective, systematically developing tools and concepts for game-theoretic modelling in the life and social sciences. Filled with down-to-earth examples of strategic behavior in humans and other animals, the book presents a unified account of the central ideas of both classical and evolutionary game theory. Unlike many books on game theory, which focus on mathematical and recreational aspects of the subject, this book emphasizes using games to answer questions of current scientific interest.