Graduate Texts in Contemporary Physics Series Editors: R. Stephen Berry Joseph L. Birman Mark P. Silverman H. Eugene Stanley Mikhail Voloshin
Springer
New York Berlin Heidelberg Hong Kong London Milan Paris Tokyo
Nino Boccara
Modeling Complex Systems With 158 Illustrations
Nino Boccara Department of Physics University of Illinois–Chicago 845 West Taylor Street, #2236 Chicago, IL 60607 USA
[email protected] Series Editors R. Stephen Berry Department of Chemistry University of Chicago Chicago, IL 60637 USA
Joseph L. Birman Department of Physics City College of CUNY New York, NY 10031 USA
H. Eugene Stanley Center for Polymer Studies Physics Department Boston University Boston, MA 02215 USA
Mikhail Voloshin Theoretical Physics Institute Tate Laboratory of Physics The University of Minnesota Minneapolis, MN 55455 USA
Mark P. Silverman Department of Physics Trinity College Hartford, CT 06106 USA
Library of Congress Cataloging-in-Publication Data Boccara, Nino. Modeling complex systems / Nino Boccara. p. cm. — (Graduate texts in contemporary physics) Includes bibliographical references and index. ISBN 0-387-40462-7 (alk. paper) 1. System theory—Mathematical models. 2. System analysis—Mathematical models. I. Title. II. Series. Q295.B59 2004 003—dc21 2003054791 ISBN 0-387-40462-7
Printed on acid-free paper.
© 2004 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. 9 8 7 6 5 4 3 2 1
Pyé-koko di i ka vwè lwen, maché ou ké vwè pli lwen.¹

¹ Creole proverb from Guadeloupe that can be translated: The coconut palm says it sees far away, walk and you will see far beyond.
Preface
The preface is that part of a book which is written last, placed first, and read least. Alfred J. Lotka Elements of Physical Biology Baltimore: Williams & Wilkins Company 1925
The purpose of this book is to show how models of complex systems are built up and to provide the mathematical tools indispensable for studying their dynamics. This is not, however, a book on the theory of dynamical systems illustrated with some applications; the focus is on modeling, so, in presenting the essential results of dynamical system theory, technical proofs of theorems are omitted, but references for the interested reader are indicated. While mathematical results on dynamical systems such as differential equations or recurrence equations abound, this is far from being the case for spatially extended systems such as automata networks, whose theory is still in its infancy. Many illustrative examples taken from a variety of disciplines, ranging from ecology and epidemiology to sociology and seismology, are given.

This is an introductory text directed mainly to advanced undergraduate students in most scientific disciplines, but it could also serve as a reference book for graduate students and young researchers. The material has been taught to junior students at the École de Physique et de Chimie in Paris and the University of Illinois at Chicago. It assumes that the reader has certain fundamental mathematical skills, such as calculus.

Although there is no universally accepted definition of a complex system, most researchers would describe as complex a system of connected agents that exhibits an emergent global behavior not imposed by a central controller, but resulting from the interactions between the agents. These agents may
be insects, birds, people, or companies, and their number may range from a hundred to millions. Finding the emergent global behavior of a large system of interacting agents using analytical methods is usually hopeless, and researchers therefore must rely on computer-based methods. Apart from a few exceptions, most properties of spatially extended systems have been obtained from the analysis of numerical simulations. Although simulations of interacting multiagent systems are thought experiments, the aim is not to study accurate representations of these systems. The main purpose of a model is to broaden our understanding of general principles valid for the largest variety of systems. Models have to be as simple as possible. What makes the study of complex systems fascinating is not the study of complicated models but the complexity of unsuspected results of numerical simulations. As a multidisciplinary discipline, the study of complex systems attracts researchers from many different horizons who publish in a great variety of scientific journals. The literature is growing extremely fast, and it would be a hopeless task to try to attain any kind of comprehensive completeness. This book only attempts to supply many diverse illustrative examples to exhibit that common modeling techniques can be used to interpret the behavior of apparently completely different systems. After a general introduction followed by an overview of various modeling techniques used to explain a specific phenomenon, namely the observed coupled oscillations of predator and prey population densities, the book is divided into two parts. The first part describes models formulated in terms of differential equations or recurrence equations in which local interactions between the agents are replaced by uniform long-range ones and whose solutions can only give the time evolution of spatial averages. Despite the fact that such models offer rudimentary representations of multiagent systems, they are often able to give a useful qualitative picture of the system’s behavior. The second part is devoted to models formulated in terms of automata networks in which the local character of the interactions between the individual agents is explicitly taken into account. Chapters of both parts include a few exercises that, as well as challenging the reader, are meant to complement the material in the text. Detailed solutions of all exercises are provided.
Nino Boccara
Contents
Preface . . . vii

1 Introduction . . . 1
  1.1 What is a complex system? . . . 1
  1.2 What is a model? . . . 4
  1.3 What is a dynamical system? . . . 9

2 How to Build Up a Model . . . 17
  2.1 Lotka-Volterra model . . . 17
  2.2 More realistic predator-prey models . . . 24
  2.3 A model with a stable limit cycle . . . 25
  2.4 Fluctuating environments . . . 27
  2.5 Hutchinson's time-delay model . . . 28
  2.6 Discrete-time models . . . 31
  2.7 Lattice models . . . 33

Part I Mean-Field Type Models

3 Differential Equations . . . 41
  3.1 Flows . . . 41
  3.2 Linearization and stability . . . 51
    3.2.1 Linear systems . . . 51
    3.2.2 Nonlinear systems . . . 56
  3.3 Graphical study of two-dimensional systems . . . 66
  3.4 Structural stability . . . 69
  3.5 Local bifurcations of vector fields . . . 71
    3.5.1 One-dimensional vector fields . . . 73
    3.5.2 Equivalent families of vector fields . . . 82
    3.5.3 Hopf bifurcation . . . 83
    3.5.4 Catastrophes . . . 85
  3.6 Influence of diffusion . . . 91
    3.6.1 Random walk and diffusion . . . 91
    3.6.2 One-population dynamics with dispersal . . . 92
    3.6.3 Critical patch size . . . 94
    3.6.4 Diffusion-induced instability . . . 95
  Exercises . . . 98
  Solutions . . . 100

4 Recurrence Equations . . . 107
  4.1 Iteration of maps . . . 107
  4.2 Stability . . . 110
  4.3 Poincaré maps . . . 118
  4.4 Local bifurcations of maps . . . 120
    4.4.1 Maps on R . . . 120
    4.4.2 The Hopf bifurcation . . . 125
  4.5 Sequences of period-doubling bifurcations . . . 127
    4.5.1 Logistic model . . . 128
    4.5.2 Universality . . . 131
  Exercises . . . 136
  Solutions . . . 139

5 Chaos . . . 145
  5.1 Defining chaos . . . 147
    5.1.1 Dynamics of the logistic map f4 . . . 149
    5.1.2 Definition of chaos . . . 153
  5.2 Routes to chaos . . . 154
  5.3 Characterizing chaos . . . 156
    5.3.1 Stochastic properties . . . 156
    5.3.2 Lyapunov exponent . . . 158
    5.3.3 "Period three implies chaos" . . . 159
    5.3.4 Strange attractors . . . 161
  5.4 Chaotic discrete-time models . . . 169
    5.4.1 One-population models . . . 169
    5.4.2 The Hénon map . . . 169
  5.5 Chaotic continuous-time models . . . 174
    5.5.1 The Lorenz model . . . 174
  Exercises . . . 176
  Solutions . . . 177

Part II Agent-Based Models

6 Cellular Automata . . . 191
  6.1 Cellular automaton rules . . . 191
  6.2 Number-conserving cellular automata . . . 194
  6.3 Approximate methods . . . 207
  6.4 Generalized cellular automata . . . 213
  6.5 Kinetic growth phenomena . . . 221
  6.6 Site-exchange cellular automata . . . 228
  6.7 Artificial societies . . . 243
  Exercises . . . 258
  Solutions . . . 262

7 Networks . . . 275
  7.1 The small-world phenomenon . . . 275
  7.2 Graphs . . . 276
  7.3 Random networks . . . 279
  7.4 Small-world networks . . . 283
    7.4.1 Watts-Strogatz model . . . 283
    7.4.2 Newman-Watts model . . . 286
    7.4.3 Highly connected extra vertex model . . . 287
  7.5 Scale-free networks . . . 288
    7.5.1 Empirical results . . . 288
    7.5.2 A few models . . . 292
  Exercises . . . 299
  Solutions . . . 301

8 Power-Law Distributions . . . 311
  8.1 Classical examples . . . 311
  8.2 A few notions of probability theory . . . 317
    8.2.1 Basic definitions . . . 317
    8.2.2 Central limit theorem . . . 320
    8.2.3 Lognormal distribution . . . 324
    8.2.4 Lévy distributions . . . 325
    8.2.5 Truncated Lévy distributions . . . 328
    8.2.6 Student's t-distribution . . . 330
    8.2.7 A word about statistics . . . 331
  8.3 Empirical results and tentative models . . . 334
    8.3.1 Financial markets . . . 335
    8.3.2 Demographic and area distribution . . . 339
    8.3.3 Family names . . . 340
    8.3.4 Distribution of votes . . . 340
  8.4 Self-organized criticality . . . 341
    8.4.1 The sandpile model . . . 341
    8.4.2 Drossel-Schwabl forest fire model . . . 343
    8.4.3 Punctuated equilibria and Darwinian evolution . . . 345
    8.4.4 Real life phenomena . . . 347
  Exercises . . . 352
  Solutions . . . 354

Notations . . . 367
References . . . 371
Index . . . 389
1 Introduction
This book is about the dynamics of complex systems. Roughly speaking, a system is a collection of interacting elements making up a whole such as, for instance, a mechanical clock. While many systems may be quite complicated, they are not necessarily considered to be complex. There is no precise definition of complex systems. Most authors, however, agree on the essential properties a system has to possess to be called complex. The first section is devoted to the description of these properties. To interpret the time evolution of a system, scientists build up models, which are simplified mathematical representations of the system. The exact purpose of a model and what its essential features should be is explained in the second section. The mathematical models that will be discussed in this book are dynamical systems. A dynamical system is essentially a set of equations whose solution describes the evolution, as a function of time, of the state of the system. There exist different types of dynamical systems. Some of them are defined in the third section.
1.1 What is a complex system?

Outside the nest, the members of an ant colony accomplish a variety of fascinating tasks, such as foraging and nest maintenance. Gordon's work [152] on harvester ants1 has shed considerable light on the processes by which members of an ant colony assume various roles. Outside the nest, active ant workers can perform four distinct tasks: foraging, nest maintenance, patrolling, and midden work. Foragers travel along cleared trails around the nest to collect mostly seeds and, occasionally, insect parts. Nest-maintenance workers modify the nest's chambers and tunnels and clear sand out of the nest or vegetation
Pogonomyrmex barbatus. They are called harvester ants because they eat mostly seeds, which they store inside their nests.
from the mound and trails. Patrollers choose the direction foragers will take each day and also respond to damage to the nest or an invasion by alien ants. Midden workers build and sort the colony’s refuse pile. Gordon [150, 151] has shown that task allocation is a process of continual adjustment. The number of workers engaged in a specific task is appropriate to the current condition. When small piles of mixed seeds are placed outside the nest mound, away from the foraging trails but in front of scouting patrollers, early in the morning, active recruitment of foragers takes place. When toothpicks are placed near the nest entrance, early in the morning at the beginning of nest-maintenance activity, the number of nest-maintenance workers increases significantly. The surprising fact is that task allocation is achieved without any central control. The queen does not decide which worker does what. No master ant could possibly oversee the entire colony and broadcast instructions to the individual workers. An individual ant can only perceive local information from the ants nearby through chemical and tactile communication. Each individual ant processes this partial information in order to decide which of the many possible functional roles it should play in the colony. The cooperative behavior of an ant colony that results from local interactions between its members and not from the existence of a central controller is referred to as emergent behavior . Emergent properties are defined as largescale effects of locally interacting agents that are often surprising and hard to predict even in the case of simple interactions. Such a definition is not very satisfying: what might be surprising to someone could be not so surprising to someone else. A system such as an ant colony, which consists of large populations of connected agents (that is, collections of interacting elements), is said to be complex if there exists an emergent global dynamics resulting from the actions of its parts rather than being imposed by a central controller. Ant colonies are not the only multiagent systems that exhibit coordinated behaviors without a centralized control. Animal groups display a variety of remarkable coordinated behaviors [278, 64]. All the members in a school of fish change direction simultaneously without any obvious cue; while foraging, birds in a flock alternate feeding and scanning. No individual in these groups has a sense of the overall orderly pattern. There is no apparent leader. In a school of fish, the direction of each member is determined by the average direction of its neighbors [339, 331]. In a flock of birds, each individual chooses to scan for predators if a majority of its neighbors are eating and chooses to eat if a majority of its neighbors are already scanning [23]. The existence of sentinels in animal groups engaged in dangerous activities is a typical example of cooperation. Recent studies suggest that guarding may be an individual’s optimal activity once its stomach is full and no other animal is on guard [90]. Self-organized motion in schools of fish, flocks of birds, or herds of ungulate mammals is not specific to animal groups. Vehicle traffic on a highway exhibits
emergent behaviors such as the existence of traffic jams that propagate in the opposite direction of the traffic flow, keeping their structure and characteristic parameters for a long time [183], or the synchronization of average velocities in neighboring lanes in congested traffic [184]. Similarly, pedestrian crowds display self-organized spatiotemporal patterns that are not imposed by any regulation: on a crowded sidewalk, pedestrians walking in opposite directions tend to form lanes along which walkers move in the same direction. A high degree of self-organization is also found in social networks that can be viewed as graphs.2 The collection of scientific articles published in refereed journals is a directed graph, the vertices being the articles and the arcs being the links connecting an article to the papers cited in its list of references. A recent study [295] has shown that the citation distribution—that is, the number of papers N (x) that have been cited a total of x times—has a power-law tail, N (x) ∼ x−α with α ≈ 3. Minimally cited papers are usually referenced by their authors and close associates, while heavily cited papers become known through collective effects. Other social networks, such as the World Wide Web or the casting pattern of movie actors, exhibit a similar emergent behavior [37]. In the World Wide Web, the vertices are the HTML3 documents, and the arcs are the links pointing from one document to another. In a movie database, the vertices are the actors, two of them being connected by an undirected edge if they have been cast in the same movie. In October 1987, major indexes of stock market valuation in the United States declined by 30% or more. An analysis [321] of the time behavior of the U. S. stock exchange index S&P500 before the crash identifies precursory patterns suggesting that the crash may be viewed as a dynamical critical point. That is, as a function of time t, the S&P500 behaves as (t − tc )0.7 , where t is the time in years and tc ≈ 1987.65. This result shows that the stock market is a complex system that exhibits self-organizing cooperative effects. All the examples of complex systems above exhibit some common characteristics: 1. They consist of a large number of interacting agents. 2. They exhibit emergence; that is, a self-organizing collective behavior difficult to anticipate from the knowledge of the agents’ behavior. 3. Their emergent behavior does not result from the existence of a central controller. The appearance of emergent properties is the single most distinguishing feature of complex systems. Probably, the most famous example of a system that exhibits emergent properties as a result of simple interacting rules between its agents is the game of life invented by John H. Conway. This game is 2
3
A directed graph (or digraph) G consists of a nonempty set of elements V (G), called vertices, and a subset E(G) of ordered pairs of distinct elements of V (G), called directed edges or arcs. Hypertext Markup Language.
4
1 Introduction
played on an (infinite) two-dimensional square lattice. Each cell of the lattice is either on (occupied by a living organism) or off (empty). If a cell is off, it turns on if exactly three of its eight neighboring cells (four adjacent orthogonally and four adjacent diagonally) are on (birth of a new organism). If a cell is on, it stays on if exactly two or three of its neighboring cells are on (survival), otherwise it turns off (death from isolation or overpopulation). These rules are applied simultaneously to all cells. Populations evolving according to these rules exhibit endless unusual and unexpected changing patterns [137]. “To help people explore and learn about decentralized systems and emergent phenomena,” Mitchell Resnick4 developed the StarLogo5 modeling environment. Among the various sample projects consider, for example, the project inspired by the behavior of termites gathering wood chips into piles. Each cell of a 100 × 100 square lattice is either empty or occupied by a wood chip or/and a termite. Each termite starts wandering randomly. If it bumps into a wood chip, it picks the chip up and continues to wander randomly. When it bumps into another wood chip, it finds a nearby empty space and puts its wood chip down. With these simple rules, the wood chips eventually end up in a single pile (Figure 1.1). Although rather simple, this model is representative of a complex system. It is interesting to notice that while the gathering of all wood chips into a single pile may, at first sight, look surprising, on reflection it is no wonder. Actually it is clear that the number of piles cannot increase, and, since the probability for any pile to disappear is nonzero, this number has to decrease and ultimately become equal to one.6
1.2 What is a model? A model is a simplified mathematical representation of a system. In the actual system, many features are likely to be important. Not all of them, however, should be included in the model. Only the few relevant features that are thought to play an essential role in the interpretation of the observed phenomena should be retained. Models should be distinguished from what is usually called a simulation. To clarify this distinction, it is probably best to quote John Maynard Smith [234]: 4
5
6
See Mitchell Resnick’s Web page: http://mres.www.media.mit.edu/people/mres. Resnick’s research is described in his book [297]. StarLogo is freeware that can be downloaded from: http://el.www.media.mit.edu/groups/el/Projects/starlogo. Here is a similar mathematical model that can be solved exactly. Consider a random distribution of N identical balls in B identical boxes, and assume that, at each time step, a ball is transferred from one box to another, not necessarily different, with a probability P (n → n ± 1) of changing by one unit the number n of balls in a given box depending only on the number n of balls in that box. Moreover, if this probability is equal to zero for n = 0 (an empty box stays empty), then it can be shown that the probability for a given box to become empty is equal to 1 − n/N . Hence, ultimately all balls end up in one unique box.
1.2 What is a model?
5
Fig. 1.1. StarLogo sample project termites. Randomly distributed wood chips (left figure) eventually end up in a single pile (right figure). Density of wood: 0.25; number of termites: 75.
If, for example, one wished to know how many fur seals can be culled annually from a population without threatening its future survival, it would be necessary to have a description of that population, in its particular environment, which includes as much relevant detail as possible. At a minimum, one would require age-specific birth and death rates, and knowledge of how these rates varied with the density of the population, and with other features of the environment likely to alter in the future. Such information could be built into a simulation of the population, which could be used to predict the effects of particular management policies. The value of such simulations is obvious, but their utility lies mainly in analyzing particular cases. A theory of ecology must make statements about ecosystems as a whole, as well as about particular species at particular times, and it must make statements which are true for many different species and not for just one. Any actual ecosystem contains far too many species, which interact in far too many ways, for simulation to be a practical approach. The better a simulation is for its own purposes, by the inclusion of all relevant details, the more difficult it is to generalize its conclusions to other species. For the discovery of general ideas in ecology, therefore, different kinds of mathematical description, which may be called models, are called for. Whereas a good simulation should include as much detail as possible, a good model should include as little as possible. A simple model, if it captures the key elements of a complex system, may elicit highly relevant questions. For example, the growth of a population is often modeled by a differential equation of the form
6
1 Introduction
dN = f (N ), (1.1) dt where the time-dependent function N is the number of inhabitants of a given area. It might seem paradoxical that such a model, which ignores the influence of sex ratios on reproduction, or age structure on mortality, would be of any help. But many populations have regular sex ratios and, in large populations near equilibrium, the number of old individuals is a function of the size of the population. Thus, taking into account these additional features maybe is not as essential as it seems. To be more specific, in an isolated population (that is, if there is neither immigration nor emigration), what should be the form of a reasonable function f ? According to Hutchinson [177] any equation describing the evolution of a population should take into account that: 1. Every living organism must have at least one parent of like kind. 2. In a finite space, due to the limiting effect of the environment, there is an upper limit to the number of organisms that can occupy that space. The simplest model satisfying these two requirements is the so-called logistic model : dN N . (1.2) = rN 1 − dt K The word “logistic” was coined by Pierre Fran¸cois Verhulst (1804–1849), who used this equation for the first time in 1838 to discuss population growth.7 His paper [338] did not, at that time, arouse much interest. Verhulst’s equation was rediscovered about 80 years later by Raymond Pearl and Lowell J. Reed. After the publication of their paper [280], the logistic model began to be used extensively.8 Interesting details on Verhulst’s ideas and the beginning of scientific demography can be found in the first chapter of Hutchinson’s book [177]. In Equation (1.2), the constant r is referred to as the intrinsic rate of increase and K is called the carrying capacity because it represents the population size that the resources of the environment can just maintain (carry) without a tendency to either increase or decrease. The logistic equation is clearly a very crude model but, in spite of its obvious limitations,9 it is often a good starting point.10 The logistic equation contains two parameters. This number can be reduced if we express the model in non-dimensional terms. Since r has the 7
8
9 10
The French word “logistique” had, since 1840, the same meaning as the word “logistics” in English, but in old French, since 1611, it meant “l’art de compter”; i.e., the art of calculation. See Le nouveau petit Robert, dictionnaire de la langue fran¸caise (Paris: Dictionnaires Le Robert, 1993). For a critical review of experimental attempts to verify the validity of the logistic model, see Willy Feller [123]. See, e.g., Chapter 6 of Begon, Harper, and Townsend’s book [39]. On the history of the logistic model, see [185].
dimension of the inverse of a time and K has the dimension of a number of individuals, if we put N τ = rt and n = , K Equation (1.2) becomes dn = n(1 − n). (1.3) dτ This equation contains no more parameters. That is, if the unit of time is r−1 and the unit of number of inhabitants is K, then the reduced logistic equation (1.3) is universal; it is system-independent. Equation (1.3) is very simple and can be integrated exactly.11 Most equations cannot be solved analytically. But, following ideas going back to Poincar´e,12 a geometrical approach, developed essentially during the second half of the twentieth century, gives, in many cases of interest, a description of the qualitative behavior of the solutions. The reduction of equations to a dimensionless form simplifies the mathematics and, usually, leads to some insight even without solving the equation. Moreover, the value of a dimensionless variable carries more information than the value of the variable itself. For simple models such as Equation (1.2) the definition of scaled variables is straightforward. If the model is not so simple, reduced variables may be defined using a systematic technique. To illustrate this technique, consider the following model of insect population outbreaks due to Ludwig, Jones, and Holling [217]. Certain insect populations exhibit outbreaks in abundance as they move from a low-density equilibrium to a high-density equilibrium and back again. This is the case, for instance, of the spruce budworm (Choristoneura fumiferana), which feeds on the needles of the terminal shoots of spruce, balsam fir, and other evergreen trees in eastern North America. In an immature balsam fir and white spruce forest, the quantity of food for the budworms is low and their rate of recruitment (that is, the amount by which the population increases during one time unit) is low. It is then reasonable to assume that the budworm population is kept at a low-density equilibrium by its predators (essentially birds). However, as the forest gradually matures, more food becomes available, the rate of budworm recruitment increases, and the budworm density grows. Above a certain rate of recruitment threshold, avian predators can no longer contain the growth of the budworm density, which jumps to a high-level value. This outbreak of the budworm 11
11. Its general solution reads n(τ) = 1/(1 + a exp(−τ)), where a is an integration constant whose value depends upon the initial value n(0).
12. See Chapter 3.
density quickly defoliates the mature trees; the forest then reverts to immaturity, the rate of recruitment decreases, and the budworm density jumps back to a low-level equilibrium. The budworm can increase its density several hundredfold in a few years. Therefore, a characteristic time interval for the budworm is of the order of months. The trees, however, cannot put on foliage at a comparable rate. A characteristic time interval for trees to completely replace their foliage is of the order of 7 to 10 years. Moreover, in the absence of the budworm, the life span of the trees is of the order of 100 years. Therefore, in analyzing the dynamics of the budworm population, we may assume that the foliage quantity is held constant.13 The main limiting features of the budworm population are food supply and the effects of parasites and predators. In the absence of predation, we may assume that the budworm density B satisfies the logistic equation dB B , = rB B 1 − dt KB where rB and KB are, respectively, the intrinsic rate of increase of the spruce budworm and the carrying capacity of the environment. Predation may be taken into account by subtracting a term p(B) from the right-hand side of the logistic equation. What conditions should satisfy the function p? 1. At high prey density, predation usually saturates. Hence, when B becomes increasingly large, p(B) should approach an upper limit a (a > 0). 2. At low prey density, predation is less effective. Birds are relatively unselective predators. If a prey becomes less common, they seek food elsewhere. Hence, when B tends to zero, p(B) should tend to zero faster than B. A simple form for p(B) that has the properties of saturation at a level a and vanishes like B 2 is aB 2 p(B) = 2 . b + B2 The positive constant b is a critical budworm density. It determines the scale of budworm densities at which saturation begins to take place. The dynamics of the budworm density B is then governed by dB aB 2 B − 2 . (1.4) = rB B 1 − dt KB b + B2 This equation, which is of the general form (1.1), contains four parameters: rB , KB , a, and b. Their dimensions are the same as, respectively, t−1 , B, Bt−1 , and B. Since the equation relates two variables B and t, we have to define two dimensionless variables 13
This adiabatic approximation is familiar to physicists. For a nice discussion of its validity and its use in solid-state theory, see Weinreich’s book [343].
τ = t/t0 and x = B/B0. (1.5)
To reduce (1.4) to a dimensionless form, we have to define the constants t0 and B0 in terms of the parameters rB, KB, a, and b. Replacing (1.5) into (1.4), we obtain

dx/dτ = rB t0 x (1 − B0 x/KB) − a B0 t0 x²/(b² + B0² x²).

To reduce this equation to as simple a form as possible, we may choose either

t0 = 1/rB and B0 = KB,

or

t0 = b/a and B0 = b.
The first choice simplifies the logistic part of Equation (1.4), whereas the second one simplifies the predation part. The corresponding reduced forms of (1.4) are, respectively,

dx/dτ = x(1 − x) − αx²/(β² + x²) (1.6)
and
dx/dτ = rx(1 − x/k) − x²/(1 + x²). (1.7)

To study budworm outbreaks as a function of the available foliage per acre of forest, the second choice is better. To study the influence of the predator density, however, the first choice is preferable. Both reduced equations contain two parameters: the scaled upper limit of predation α and the scaled critical density β in the first case, and the scaled rate of increase r and the scaled carrying capacity k in the second case. It is not very difficult to prove that, if the evolution of a model is governed by a set of equations containing n parameters that relate variables involving d independent dimensions, the final reduced equations will contain n − d scaled parameters.
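As a concrete illustration of the second reduced form (1.7) — a sketch under assumed parameter values, not a calculation from the book — one can integrate dx/dτ = rx(1 − x/k) − x²/(1 + x²) with explicit Euler steps and observe that, for suitable r and k, low and high initial densities settle on two different stable levels, the low-density equilibrium and the outbreak level discussed above.

```python
def budworm_equilibrium(r=0.5, k=10.0, x0=0.1, dt=0.01, steps=200_000):
    """Integrate the reduced budworm model dx/dτ = r*x*(1 - x/k) - x**2/(1 + x**2)
    with explicit Euler steps and return the long-time density."""
    x = x0
    for _ in range(steps):
        x += dt * (r * x * (1.0 - x / k) - x * x / (1.0 + x * x))
    return x

# Starting from a low density and from a high density gives two different
# long-time levels, reflecting the coexistence of two stable equilibria.
print(budworm_equilibrium(x0=0.1))   # settles near the low-density equilibrium
print(budworm_equilibrium(x0=8.0))   # settles near the outbreak level
```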
1.3 What is a dynamical system? The notion of a dynamical system includes the following ingredients: a phase space S whose elements represent possible states of the system14 ; time t, which may be discrete or continuous; and an evolution law (that is, a rule that allows determination of the state at time t from the knowledge of the 14
S is also called the state space.
states at all previous times). In most examples, knowing the state at time t0 allows determination of the state at any time t > t0 . The two models of population growth presented in the previous section are examples of dynamical systems. In both cases, the phase space is the set of nonnegative real numbers, and the evolution law is given by the solution of a nonlinear first-order differential equation of the form (1.1). The name dynamical system arose, by extension, after the name of the equations governing the motion of a system of particles. Today the expression dynamical system is used as a synonym of nonlinear system of equations. Dynamical systems may be divided into two broad categories. According to whether the time variable may be considered as continuous or discrete, the dynamics of a given system is described by differential equations or finitedifference equations of the form15 dx = x˙ = X(x), dt xt+1 = f xt ),
(1.8) (1.9)
where t belongs to the set of nonnegative real numbers R+ in (1.8) and the set of nonnegative integers N0 (that is, the union of the set N of positive integers and {0}) in (1.9). Such equations determine how the state x ∈ S of the system varies with time.16 To solve (1.8) or (1.9) we need to specify the initial state x(0) ∈ S. The state of a system at time t represents all the information characterizing the system at this particular time. Here are some illustrative examples. Example 1. The simple pendulum. In the absence of friction, the equation of motion of a simple pendulum moving in a vertical plane is d2 θ g + sin θ = 0, dt2
(1.10)
where θ is the displacement angle from the stable equilibrium position, g the acceleration of gravity, and the length of the pendulum. If we put x1 = θ, 15
16
˙ x2 = θ,
Here, we are considering autonomous systems; that is, we are assuming that the functions X and f do not depend explicitly on time. A nonautonomous system may always be written as an autonomous system of higher dimensionality (see Example 1). Assuming of course that, for a given initial state, the equations above have a unique solution. Since we are essentially interested in applications, we will not discuss problems of existence and uniqueness of solutions. These problems are important for the mathematician, and nonunicity is certainly an interesting phenomenon. But for someone interested in applications, nonunicity is an unpleasant feature indicating that the model has to be modified, since, according to experience, a real system has a unique evolution for any realizable initial state.
1.3 What is a dynamical system?
11
Equation (1.10) may be written dx1 = x2 dt dx2 g = − sin x1 . dt This type of transformation is general. Any system of differential equations of order higher than one can be written as a first-order system of higher dimensionality. The state of the pendulum is represented by the ordered pair (x1 , x2 ). Since x1 ∈ [−π, π[ and x2 ∈ R, the phase space X is the cylinder S1 × R, where Sn denotes the unit sphere in Rn+1 . This surface is a two-dimensional manifold. A manifold is a locally Euclidean space that generalizes the idea of parametric representation of curves and surfaces in R3 .17 Example 2. Nonlinear oscillators. Models of nonlinear oscillators have been the source of many important and interesting problems.18 They are described by second-order differential equations of the form x ¨ + g(x, x) ˙ = 0. While the dynamics of such systems is already nontrivial (see, for instance, the van der Pol oscillator discussed in Example 16), the addition of a periodic forcing term f (t) = f (t + T ) yields x ¨ + g(x, x) ˙ = f (t)
(1.11)
and can introduce completely new phenomena. If we put x1 = x,
x2 = x, ˙
x3 = t,
Equation (1.11) may be written x˙ 1 = x2 , x˙ 2 = −g(x1 , x2 ) + f (x3 ), x˙ 3 = 1. Here again, this type of transformation is general. Any nonautonomous system of differential equations of order higher than one can be written as a first-order system of higher dimensionality. The state of the system is represented by the triplet (x1 , x2 , x3 ). If the period T of the function f is, say, 2π, the phase space is X = R × R × S 1 ; that is, a three-dimensional manifold. 17 18
See also Section 3.1. Refer, in particular, to [160].
12
1 Introduction
Example 3. Age distribution. A one-species population may be characterized by its density ρ. Since ρ should be nonnegative and not greater than 1, the phase space is the interval [0,1]. The population density is a global variable that ignores, for instance, age structure. A more precise characterization of the population should take into account its age distribution. If f (t, a) da represents the density of individuals whose age, at time t, lies between a and a + da, then the state of the system is represented by the age distribution function a → f (t, a). The total population density at time t is ∞ ρ(t) = f (t, a) da 0
and, in this case, the state space is a set of positive integrable functions on R+ . Example 4. Population growth with a time delay. In the logistic model the growth rate of a population at any time t depends on the number of individuals in the system at that time. This assumption is seldom justified, for reproduction is not an instantaneous process. If we assume that the growth rate N˙ (t)/N (t) is a decreasing function of the number of individuals at time t − T , the simplest model is N (t − T ) dN = rN (t) 1 − . (1.12) N˙ (t) = dt K This logistic model with a time lag is due to Hutchinson [176, 177], who was the first ecologist to consider time-delayed responses. To solve Equation (1.12), we need to know not only the value of an initial population but a history function h such that (∀u ∈ [0, T ])
N (−u) = h(u).
If we put x1 (t) = N (t),
x2 (t, u) = N (t − u),
we have
dN (t − u) ∂x2 ∂x2 = =− , dt ∂u ∂t and we may, therefore, write the logistic equation with a time lag under the form x2 (t, u) dx1 , = rx1 (t) 1 − dt K ∂x2 ∂x2 =− . ∂t ∂u Here, the state space is two-dimensional. x1 is a nonnegative real and u → x2 (t, u) a nonnegative function defined on the interval [0, T ]. The boundary conditions are x1 (0) = h(0) and x2 (0, u) = h(u) for all u ∈ [0, T ].
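Equation (1.12) is easy to explore numerically. The sketch below (an illustration, not the author's code) integrates the delayed logistic equation with a fixed time step, storing the past trajectory so that N(t − T) can be looked up; the constant history function and the values of r, K, T, and the step size are arbitrary choices. For rT large enough (roughly beyond π/2) the population oscillates around K instead of converging to it.

```python
def delayed_logistic(r=1.8, K=100.0, T=1.0, h=lambda u: 50.0,
                     dt=0.001, t_max=40.0):
    """Integrate dN/dt = r N(t) (1 - N(t-T)/K) with history N(-u) = h(u) on [0, T]."""
    lag = int(round(T / dt))                      # number of steps spanned by the delay
    # stored trajectory, oldest first: N(-T), N(-T+dt), ..., N(0)
    values = [h(T - i * dt) for i in range(lag + 1)]
    for _ in range(int(t_max / dt)):
        N_now = values[-1]
        N_delayed = values[-1 - lag]              # N(t - T)
        values.append(N_now + dt * r * N_now * (1.0 - N_delayed / K))
    return values

traj = delayed_logistic()
print("last few values:", [round(v, 1) for v in traj[-5:]])
```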
In the more general case of a logistic equation of the form 1 t dN = rN (t) 1 − N (t − u)Q(u) du , dt K 0
13
(1.13)
Q, called the delay kernel, is a positive integrable normalized function on R+ ; that is, a function defined for u ≥ 0 such that ∞ Q(u) du = 1. (1.14) 0
Here are two typical illustrative examples of delay kernels found in the literature19:

Q1(u) = (1/T) exp(−u/T),
Q2(u) = (1/T²) u exp(−u/T).
Hutchinson’s equation corresponds to the singular kernel Q = δT , where δT denotes the Dirac distribution at T .20 Taking into account (1.14), it is easy to verify that K is the only nontrivial equilibrium point of (1.13). The parameter K corresponds, therefore, to the carrying capacity of the standard logistic model. The history function h : u → N (−u) is defined on R+ , and we have dx1 1 ∞ = rx1 (t) 1 − x2 (t, u)Q(u) du , dt K 0 ∂x2 ∂x2 =− , ∂t ∂u with the boundary conditions x1 (0) = h(0) and x2 (0, u) = h(u) for all u ∈ R+ . Example 5. Random walkers on a lattice. Let ZL be a one-dimensional finite lattice of length L with periodic boundary conditions,21 and denote by n(t, i) the occupation number of site i at time t. n(t, i) = 0 if the site is vacant, and n(t, i) = 1 if the site is occupied by a random walker. The evolution rule of the system is such that, at each time step, a random walker selected at random—that is, the probability for a walker to be selected is uniform—will move with a probability 12 either to the right or the left neighboring site if this site is vacant. If the randomly selected site is not vacant, then the walker will not move. The state of the system at time t is represented by the function i → n(t, i), and the phase space is X = {0, 1}ZL . An element of such a phase space is called a configuration. 19 20
21
On delay models in population ecology, consult [97]. On distribution theory and its applications to differential and integral equations, see [46], Chapter 4. ZL denotes the set of integers modulo L. Similarly, ZdL represents a finite ddimensional lattice of volume Ld with periodic boundary conditions.
14
1 Introduction
In most situations of interest, the phase space of a dynamical system possesses a certain structure that the evolution law respects. In applications, we are usually interested in lasting rather than transient phenomena and so in steady states. Therefore, steady solutions of the governing equations of evolution are of special importance. Consider, for instance, Equation (1.2); its steady solutions, which are such that dN = 0, dt are N = 0 and N = K. In this simple case, it is not difficult to verify that, if the initial condition is N (0) > 0, N (t) tends to K when t tends to infinity. The expression “N (t) tends to K when t tends to infinity” is meaningful if, and only if, the phase space X has a topology. Roughly speaking, a topological space is a space in which the notion of neighborhood has been defined. A simple way to induce a topology is to define a distance, that is, to each ordered pair of points (x1 , x2 ) in X we should be able to associate a nonnegative number d(x1 , x2 ), said to be the distance between x1 and x2 , satisfying the following conditions: 1. d(x1 , x2 ) = 0 ⇔ x1 = x2 , 2. d(x1 , x2 ) = d(x2 , x1 ), 3. d(x1 , x3 ) ≤ d(x1 , x2 ) + d(x2 , x3 ). In the Euclidean space Rn , the distance is defined by 1/2 n α α 2 d(x1 , x2 ) = (x1 − x2 ) . α=1
If, as in Example 1, X is a manifold, we use a suitable coordinate system to define the distance. In Example 3, if we assume that age distribution functions are Lebesgue integrable,22 then the distance between two functions f1 and f2 may be defined by ∞ 1/p d(f1 , f2 ) = , |f1 (ξ)f2 (ξ)|p dξ 0
where p ≥ 1. In Example 5, the Hamming distance dH (c1 , c2 ) between two configurations c1 and c2 is defined by 1 |n1 (i) − n2 (i)|, L i=1 L
dH (c1 , c2 ) = 22
For an elementary presentation of the notion of measure and Lebesgue theory of integration, see [46], Chapter 1. In applications, this requirement is not restrictive, but it allows definition of complete metric spaces.
1.3 What is a dynamical system?
15
where n1 (i) and n2 (i) are, respectively, the occupation numbers of site i in configurations 1 and 2. When the evolution of a system is not deterministic, as is the case for the random walkers of Example 5, it is necessary to introduce the notion of a random process. Since a random process is a family of measurable mappings on the space Ω of elementary events in the phase space X, the phase space has to be measurable. To summarize the discussion above, we shall assume that, if the evolution is deterministic, the phase space is a metric space, whereas, if the evolution is stochastic, the phase space is a measurable metric space. To conclude this section, we present two examples of dynamical systems that can be viewed as mathematical recreations. Example 6. Bulgarian solitaire. Like many other mathematical recreations, Bulgarian solitaire has been made popular by Martin Gardner [138]. A pack of N = 12 n(n + 1) cards is divided into k packs of n1 , n2 , . . . , nk cards, where n1 + n2 + · · · + nk = N . A move consists in taking exactly one card of each pack and forming a new pack. By repeating this operation a sufficiently large number of times any initial configuration eventually converges to a configuration that consists of n packs of, respectively, 1, 2, . . . , n cards. For instance, if N = 10 (which corresponds to n = 4), starting from the partition {1, 2, 7}, we obtain the following sequence: {1, 3, 6}, {2, 3, 5}, {1, 2, 3, 4}. Numbers N of the form 12 n(n + 1) are known as triangular numbers. Then, what happens if the number of cards is not triangular? Since the number of partitions of a finite integer is finite, any initial partition leads into a cycle of partitions. For example, if N = 8, starting from {8}, we obtain the sequence: {7, 1}, {6, 2}, {5, 2, 1}, {4, 3, 1}, {3, 3, 2}, {3, 2, 2, 1}, {4, 2, 1, 1}, {4, 3, 1}. For any positive integer N , the convergence towards a cycle, which is of length 1 if N is triangular, has been proved by J. Brandt [71] (see also [1]). In the case of a triangular number, it has been shown that the number of moves before the final configuration is reached is at most equal to n(n − 1) [178, 118]. The Bulgarian solitaire is a time-discrete dynamical system. The phase space consists of all the partitions of the number N . Example 7. The original Collatz problem. Many iteration problems are simple to state but often intractably hard to solve. Probably the most famous one is the so-called 3x + 1 problem, also known as the Collatz conjecture, which asserts that, starting from any positive integer n, repeated iteration of the function f defined by
1 n, if n is even, f (n) = 21 2 (3n + 1), if n is odd,
16
1 Introduction
always returns 1. In what follows, we shall present a less known conjecture that, like the 3x + 1 problem, has not been solved. Consider the function f defined, for all positive integers, by ⎧ 2 ⎪ if n = 0 mod 3, ⎨ 3 n, 4 1 f (n) = 3 n − 3 , if n = 1 mod 3, ⎪ ⎩4 1 3 n + 3 , if n = 2 mod 3. Its inverse f −1 is defined by ⎧ 3 ⎪ if n = 0 mod 2, ⎨ 2 n, f (n) = 43 n + 14 , if n = 1 mod 4, ⎪ ⎩3 1 4 n − 4 , if n = 3 mod 4. f , which is bijective, is a permutation of the natural numbers. The study of the iterates of f has been called the original Collatz problem [198]. If we consider the first natural numbers, we obtain the following permutation: 1 2 3 4 5 6 7 8 9 ··· . 1 3 2 5 7 4 9 11 6 · · · While some cycles are finite, e.g. (3, 2, 3) or (5, 7, 9, 6, 4, 5), it has been conjectured that there exist infinite cycles. For instance, none of the 200,000 successive iterates of 8 is equal to 8. This is also the case for 14 and 16. For this particular dynamical system, the phase space is the set N of all positive integers, and the evolution rule is reversible.
2 How to Build Up a Model
Nature offers a puzzling variety of interactions between species. Predation is one of them. According to the way predators feed on their prey, various categories of predators may be distinguished [330, 39]. Parasites, such as tapeworms or tuberculosis bacteria, live throughout a major period of their life in a single host. Their attack is harmful but rarely lethal in the short term. Grazers, such as sheep or biting flies that feed on the blood of mammals, also consume only parts of their prey without causing immediate death. However, unlike parasites, they attack large numbers of prey during their lifetime. True predators, such as wolves or plankton-eating aquatic animals, also attack many preys during their lifetime, but unlike grazers, they quickly kill their prey. Our purpose in this chapter is to build up models to study the effects of true predation on the population dynamics of the predator and its prey. More precisely, among the various patterns of predator-prey abundance, we focus on two-species systems in which it appears that predator and prey populations exhibit coupled density oscillations. In order to give an idea of the variety of dynamical systems used in modeling, we describe different models of predatorprey systems.
2.1 Lotka-Volterra model The simplest two-species predator-prey model has been proposed independently by Lotka [214] and Volterra [340].1 Vito Volterra (1860–1940) was stimulated to study this problem by his future son-in-law, Umberto D’Ancona, who, analyzing market statistics of the Adriatic fisheries, found that, during the First World War, certain predacious species increased when fishing was severely limited. A year before, Alfred James Lotka (1880–1949) had come up with an almost identical solution to the predator-prey problem. His method was very general, but, probably because of that, his book did not receive the 1
See also [99] and [310].
18
2 How to Build Up a Model
attention it deserved.2 This model assumes that, in the absence of predators, the prey population, denoted by H for “herbivore,” grows exponentially, whereas, in the absence of prey, predators starve to death and their population, denoted by P , declines exponentially. As a result of the interaction between the two species, H decreases and P increases at a rate proportional to the frequency of predator–prey encounters. We have then H˙ = bH − sHP, P˙ = −dP + esHP,
(2.1) (2.2)
where b is the birth rate of the prey, d the death rate of the predator, s the searching efficiency of the predator, and e the efficiency with which extra food is turned into extra predators.3 Lotka-Volterra equations contain four parameters. This number can be reduced if we express the model in dimensionless form. If we put √ Ps b Hes , p= , (2.3) h= , τ = bd t, ρ = d b d (2.1) and (2.2) become dh = ρh (1 − p), dτ dp 1 = − p (1 − h). dτ ρ
(2.4) (2.5)
These equations contain only one parameter, which makes them much easier to analyze. Before analyzing this particular model, let us show how the asymptotic stability of a given equilibrium state of a two-species model of the form dx1 = f1 (x1 , x2 ), dτ dx2 = f2 (x1 , x2 ), dτ may be discussed. A more rigorous discussion of the stability of equilibrium points of a system of differential equations is given in Section 3.2. Here, we are just interested in the properties of the Lotka-Volterra model. Let (x∗1 , x∗2 ) be an equilibrium point of the differential system above, and put 2
3
On the relations between Lotka and Volterra, and how ecologists in the 1920s perceived mathematical modeling, consult Kingsland [186]. Chemists will note the similarity of these equations with the rate equations of chemical kinetics. For a treatment of chemical kinetics from the point of view of dynamical systems theory, see Gavalas [139].
2.1 Lotka-Volterra model
u1 = x1 − x∗1
19
and u2 = x2 − x∗2 .
In the neighborhood of this equilibrium point, neglecting quadratic terms in u1 and u2 , we have ∂f1 ∗ ∗ du1 = (x , x )u1 + dτ ∂x1 1 2 du2 ∂f2 ∗ ∗ (x , x )u1 + = dτ ∂x1 1 2
∂f1 ∗ ∗ (x , x )u2 , ∂x2 1 2 ∂f2 ∗ ∗ (x , x )u2 . ∂x2 1 2
The general solution of this linear system is u(τ ) = exp Df (x∗ )τ u(0), where f = (f1 , f2 ), x∗ = (x1 , x2 ), u = (u1 , u2 ), and Df (x∗ ) (called the community matrix in ecology) is the Jacobian matrix of f at x = (x∗1 , x∗2 ), that is, ⎡ ∂f ∂f1 ∗ ∗ ⎤ 1 (x∗1 , x∗2 ) (x , x ) ∂x2 1 2 ⎥ ⎢ ∂x1 ∗ Df (x ) = ⎣ ⎦. ∂f2 ∗ ∗ ∂f2 ∗ ∗ (x1 , x2 ) (x1 , x2 ) ∂x1 ∂x2 Therefore, the equilibrium point x∗ is asymptotically stable 4 if, and only if, the eigenvalues of the matrix Df (x∗1 , x∗2 ) have negative real parts.5 The Lotka-Volterra model has two equilibrium states (0, 0) and (1, 1). Since ⎡ ⎤ ⎤ ⎡ ρ 0 0 −ρ ⎦ , Df (1, 1) = ⎣ ⎦, Df (0, 0) = ⎣ 1 0 1 0 −ρ ρ 4
5
The precise definition of asymptotic stability is given in Definition 5. Essentially, it means that any solution x(t, 0, x0 ) of the system of differential equations satisfying the initial condition x = x0 at t = 0 tends to x∗ as t tends to infinity. Given a 2 × 2 real matrix A, there exists a real invertible matrix M such that J = M AM −1 may be written under one of the following canonical forms λ1 0 λ1 a −b , , . 0λ b a 0 λ2 The corresponding forms of eJτ are λ1 τ 0 1τ e eλτ , λ2 τ , 01 0 e
eaτ
eAτ can then be determined from the relation eAτ = M −1 eJτ M. More details are given in Subsection 3.2.1.
cos bτ − sin bτ . sin bτ cos bτ
20
2 How to Build Up a Model
(0, 0) is unstable and (1, 1) is stable but not asymptotically stable. The eigenvalues of the matrix Df (1, 1) being pure imaginary, if the system is in the neighborhood of (1, 1), it remains in this neighborhood. The equilibrium point (1, 1) is said to be neutrally stable.6 The set of all trajectories in the (h, p) phase space is called the phase portrait of the differential system (2.4–2.5). Typical phase-space trajectories are represented in Figure 2.1. Except the coordinate axes and the equilibrium point (0, 0), all the trajectories are closed orbits oriented counterclockwise.
1.4
predators
1.3 1.2 1.1 1 0.9 0.8 0.7 0.8
0.9
1 1.1 preys
1.2
1.3
Fig. 2.1. Lotka-Volterra model. Typical trajectories around the neutral fixed point for ρ = 0.8.
Since the trajectories in the predator-prey phase space are closed orbits, the populations of the two species are periodic functions of time (Figure 2.2). This result is encouraging because it might point toward a simple relevant mechanism for predator-prey cycles. There is an abundant literature on cyclic variations of animal populations.7 They were first observed in the records of fur-trading companies. The classic example is the records of furs received by the Hudson Bay Company from 1821 to 1934. They show that the numbers of snowshoe hares8 (Lepus americanus) and Canadian lynx (Lynx canadensis) trapped for the company vary periodically, the period being about 10 years. The hare feeds on a variety of herbs, shrubs, and other vegetable matter. The lynx is essentially single-prey oriented, and although it consumes other small animals if starving, it cannot 6
7 8
For the exact meaning of neutrally stable, see Section 3.2, and, in particular, Example 13. See, in particular, Finerty [127]. Also called varying hares. They have large, heavily furred hind feet and a coat that is brown in summer and white in winter.
2.1 Lotka-Volterra model predator
1.2 populations
21
1.1 1 0.9 prey
0.8 0
5
10 time
15
20
Fig. 2.2. Lotka-Volterra model. Scaled predator and prey populations as functions of scaled time.
live successfully without the snowshoe hare. This dependence is reflected in the variation of lynx numbers, which closely follows the cyclic peaks of abundance of the hare, usually lagging a year behind. The hare density may vary from one hare per square mile of woods to 1000 or even 10,000 per square mile.9 In this particular case, however, the understanding of the coupled periodic variations of predator and prey populations seems to require a more elaborate model. The two species are actually parts of a multispecies system. In the boreal forests of North America, the snowshoe hare is the dominant herbivore, and the hare-plant interaction is probably the essential mechanism responsible for the observed cycles. When the hare density is not too high, moderate browsing removes the annual growth and has a pruning effect. But at high hare density, browsing may reduce all new growth for several years and, consequently, lower the carrying capacity for hares. The shortage in food supply causes a marked drop in the number of hares. It has also been suggested that when hares are numerous, the plants on which they feed respond to heavy grazing by producing shoots with high levels of toxins.10 If this interpretation is correct, the hare cycles would be the result of the herbivore-forage interaction (in this case, hares are “preying” on vegetation), and the lynx, because they depend almost exclusively upon the snowshoe hares, track the hare cycles. A careful study of the variations of the numbers of pelts sold by the Hudson Bay Company as a function of time poses a difficult problem of interpretation. Assuming that these numbers represent a fixed proportion of the total 9
10
Many interesting facts concerning northern mammals may be found in Seton [312]. For a statistical analysis of the lynx-hare and other 10-year cycles in the Canadian forests, see Bulmer [73]. See [39], pp. 356–357.
22
2 How to Build Up a Model
populations of a two-species system, they seem to indicate that the hares are eating the lynx [144] since the predator’s oscillation precedes the prey’s. It should be the opposite: an increase in the predator population should lead to a decrease of the prey population, as illustrated in Figure 2.2. Although it accounts in a very simple way for the existence of coupled cyclic variations in animal populations, the Lotka-Volterra model exhibits some unsatisfactory features, however. Since, in nature, the environment is continually changing, in phase space, the point representing the state of the system will continually jump from one orbit to another. From an ecological viewpoint, an adequate model should not yield an infinity of neutrally stable cycles but one stable limit cycle. That is, in the (h, p) phase space, there should exist a closed trajectory C such that any trajectory in the neighborhood of C should, as time increases, become closer and closer to C. Furthermore, the Lotka-Volterra model assumes that, in the absence of predators, the prey population grows exponentially. This Malthusian growth11 is not realistic. Hence, if we assume that, in the absence of predation, the growth of the prey population follows the logistic model, we have H ˙ − sHP, (2.6) H = bH 1 − K P˙ = −dP + esHP, (2.7) where K is the carrying capacity of the prey. For large K, this model is just a small perturbation of the Lotka-Volterra model. If, to the dimensionless variables defined in (2.3), we add the scaled carrying capacity k=
Kes , d
(2.8)
Equations (2.6) and (2.7) become
dh h = ρh 1 − − p , dτ k 1 dp = − p (1 − h). dτ ρ
(2.9) (2.10)
The equilibrium points are (0, 0), (k, 0), and (1, 1 − 1/k). Note that the last equilibrium point exists if, and only if, k > 1; that is, if the carrying capacity of the prey is high enough to support the predator. Since 11
After Thomas Robert Malthus (1766–1834), who, in his most influential book [221], stated that because a population grows much faster than its means of subsistence—the first increasing geometrically whereas the second increases only arithmetically—“vice and misery” will operate to restrain population growth. To avoid these disastrous results, many demographers, in the nineteenth century, were led to advocate birth control. More details on Malthus and his impact may be found in [177], pp. 11–18.
2.1 Lotka-Volterra model
Df (0, 0) = and
ρ
0 −1/ρ
Df
0
,
1 1, 1 − k
Df (k, 0) =
=
−ρ
−ρk
0 −(1 − k)/ρ
−ρ/k
−ρ
(k − 1)/ρk 0
23
,
,
it follows that (0, 0) and (k, 0) are unstable, whereas (1, 1 − 1/k) is stable. A finite carrying capacity for the prey transformed the neutrally stable equilibrium point of the Lotka-Volterra model into an asymptotically stable equilibrium point. It is easy to verify that, if k is large enough for the condition ρ2 < 4k(k − 1) to be satisfied, the eigenvalues are complex, and, in the neighborhood of the asymptotically stable equilibrium point, the trajectories are converging spirals oriented counterclockwise (Figure 2.3). The predator and prey populations are no longer periodic functions of time, they exhibit damped oscillations, the predator oscillations lagging in phase behind the prey (Figure 2.4). If ρ2 > 4k(k−1), the eigenvalues are real and the approach of the asymptotically stable equilibrium point is nonoscillatory.
1.4 predators
1.2 1 0.8 0.6 0.4 0.2 0.6
0.8
1
1.2 1.4 preys
1.6
1.8
Fig. 2.3. Modified Lotka-Volterra model. A typical trajectory around the stable fixed point (big dot) for ρ = 0.8 and k = 3.5.
A small perturbation—corresponding to the existence of a finite carrying capacity for the prey—has qualitatively changed the phase portrait of the Lotka-Volterra model.12 A model whose qualitative properties do not change 12
A precise definition of what exactly is meant by small perturbation and qualitative change of the phase portrait will be given when we study structural stability (see Section 3.4).
24
2 How to Build Up a Model
populations
1.75 1.5 1.25
prey
1 0.75 0.5
predator
0.25 0
5
10
15 time
20
25
30
Fig. 2.4. Modified Lotka–Volterra model. Scaled predator and prey populations as functions of scaled time.
significantly when it is subjected to small perturbations is said to be structurally stable. Since a model is not a precise description of a system, qualitative predictions should not be altered by slight modifications. Satisfactory models should be structurally stable.
2.2 More realistic predator-prey models If we limit our discussion of predation to two-species systems assuming, as we did so far, that • • •
time is a continuous variable, there is no time lag in the responses of either population to changes, and population densities are not space-dependent,
a somewhat realistic model, formulated in terms of ordinary differential equations, should at least take into account the following relevant features13 : 1. Intraspecific competition; that is, competition between individuals belonging to the same species. 2. Predator’s functional response; that is, the relation between the predator’s consumption rate and prey density. 3. Predator’s numerical response; that is, the efficiency with which extra food is transformed into extra predators. Essential resources being, in general, limited, intraspecific competition reduces the growth rate, which eventually goes to zero. The simplest way to take 13
See May [229], pp. 80–84, Pielou [283], pp. 91–95.
2.3 A model with a stable limit cycle
25
this feature into account is to introduce into the model carrying capacities for both the prey and the predator. A predator has to devote a certain time to search, catch, and consume its prey. If the prey density increases, searching becomes easier, but consuming a prey takes the same amount of time. The functional response is, therefore, an increasing function of the prey density—obviously equal to zero at zero prey density—approaching a finite limit at high densities. In the Lotka-Volterra model the functional response, represented by the term sH, is not bounded. According to Holling [173, 174], the behavior of the functional response at low prey density depends upon the predator. If the predator eats essentially one type of prey, then the functional response should be linear at low prey density. If, on the contrary, the predator hunts different types of prey, the functional response should increase as a power greater than 1 (usually 2) of prey density. In the Lotka-Volterra model, the predator’s numerical response is a linear function of the prey density. As for the functional response, it can be argued that there should exist a saturation effect; that is, the predator’s birth rate should tend to a finite limit at high prey densities. Possible predator-prey models are aH P H H ˙ H = rH H 1 − , − K b+H aP P H P˙ = − cP, b+H or aH P H 2 H − , H˙ = rH H 1 − K b + H2 aP P H 2 P˙ = − cP. b + H2 In these two models, the predator equation follows the usual assumption that the predator’s numerical response is proportional to the rate of prey consumption. The efficiency of converting prey to predator is given by aH /aP . The term −cP represents the rate at which the predators would decrease in the absence of the prey. Following Leslie [201] (see Example 10), an alternative equation for the predator could be P ˙ , P = rP P 1 − cH an equation that has the logistic form with a carrying capacity for the predator proportional to the prey density.
2.3 A model with a stable limit cycle In a recent paper, Harrison [165] studied a variety of predator-prey models in order to find which model gives the best quantitative agreement with Luckin-
26
2 How to Build Up a Model
bill’s data on Didinium and Paramecium [215]. Luckinbill grew Paramecium aurelia together with its predator Didinium nasutum and, under favorable experimental conditions aimed at reducing the searching effectiveness of the Didinium, he was able to observe oscillations of both populations for 33 days before they became extinct. Harrison found that the predator-prey model aH P H H , (2.11) − H˙ = rH H 1 − K b+H aP P H P˙ = − cP, (2.12) b+H predicts the outcome of Luckinbill’s experiment qualitatively.14 In order to simplify the discussion of this model, we first define reduced variables. Equations (2.11) and (2.12) have only one nontrivial equilibrium point corresponding to the coexistence of both populations. It is the solution of the system15 aH P H aP H − rH 1 − = 0, − c = 0. K b+H b+H Denote as (H ∗ , P ∗ ) this nontrivial equilibrium point, and let h=
H , H∗
p=
P , P∗
τ = rH t,
k=
K , H∗
β=
b , H∗
Equations (2.11) and (2.12) become αh ph h dh − =h 1− , dτ k β+h αp ph dp = − γp, dτ β+h where16
αh =
1−
1 k
c γ= . r
(2.13) (2.14)
(β + 1) and αp = γ(β + 1).
Equations (2.13) and (2.14) contain only three independent parameters: k, β, and γ. In terms of the scaled populations, the nontrivial fixed point is (1, 1), and the expression of the Jacobian matrix at this point is 14
15
16
The reader interested in how Harrison modified this model to obtain a better quantitative fit should refer to Harrison’s paper [165]. Since the coordinates (H ∗ , P ∗ ) of an acceptable equilibrium point have to be positive, the coefficients of Equations (2.11) and (2.12) have to satisfy certain conditions. Note that there exist two trivial equilibrium points, (0, 0) and (K, 0). These relations express that (1, 1) is an equilibrium point of Equations (2.13) and (2.14).
2.4 Fluctuating environments
27
⎤
⎡
k−2−β 1 −1 + ⎥ ⎢ k(1 + β) k ⎥. Df (1, 1) = ⎢ ⎦ ⎣ βγ 0 1+β For (1, 1) to be asymptotically stable, the determinant of the Jacobian has to be positive and its trace negative. Since k > 1, the determinant is positive, but the trace is negative if, and only if, k < 2 + β. Below and above the threshold value kc = 2 + β, the phase portrait is qualitatively different. For k < kc , the trajectories converge to the fixed point (1, 1), whereas for k > kc they converge to a limit cycle, as shown in Figure 2.5. The value kc of the parameter k where this structural change occurs is called a bifurcation point.17 This particular type of bifurcation is a Hopf bifurcation (see Chapter 3, Example 22).
1.8
predators
1.6 1.4 1.2 1 0.8 0.6 0.4 0.5
1
1.5 preys
2
2.5
Fig. 2.5. Two trajectories in the (h, p)-plane converging to a stable limit cycle of the predator-prey model described by Equations (2.13) and (2.14). Big dots represent initial points. Scaled parameters values are k = 3.5, β = 1, and γ = 0.5.
2.4 Fluctuating environments Population oscillations may be driven by fluctuating environments. Consider, for example, the logistic equation N , (2.15) N˙ = rN 1 − K(t) 17
See Section 3.5.
28
2 How to Build Up a Model
where K(t) represents a time-dependent carrying capacity. If K(t) = K0 (1 + a cos Ωt), where a is a real such that |a| < 1, (2.15) takes the reduced form dn n =n 1− , (2.16) dτ 1 + a cos ωτ where n=
N , K0
ω=
Ω , r
population
population
1.2 1 0.8 0.6 0.4 0.2 0
5
10
15 20 time
25
30
τ = rt.
0.6 0.5 0.4 0.3 0.2 0.1 0 0
5
10
15 20 time
25
30
Fig. 2.6. Scaled population evolving according to the logistic equation with a periodic carrying capacity. ω = 1 and a = 0.5 (left); ω = 1 and a = 0.999 (right).
Numerical solutions of Equation (2.16), represented in Figure 2.6, show the existence of periodic solutions. Note that this nonlinear differential equation is a Riccati equation and, therefore, linearizable. If we put n = k u/u, ˙ where k(τ ) = 1 + a cos ωτ , (2.16) becomes k˙ u ¨+ − 1 u˙ = 0. k Hence,
u(τ ) = c1
eτ dτ + c2 , k(τ )
where c1 and c2 are constants to be determined by the initial conditions.
2.5 Hutchinson’s time-delay model Since reproduction is not an instantaneous process, Hutchinson suggested (see Chapter 1, Example 4) that he logistic equation modeling population growth should be replaced by the following time-delay logistic equation N (t − T ) dN = rN (t) 1 − . (2.17) N˙ (t) = dt K
2.5 Hutchinson’s time-delay model
29
Mathematically, this model is not trivial. Its behavior may, however, be understood as follows. K is clearly an equilibrium point (steady solution). If N (0) is less than K, as t increases, N (t) does not necessarily tend monotonically to K from below. The time delay allows N (t) to be momentarily greater than K. In fact, if at time t the population reaches the value K, it may still grow, for the growth rate, equal to r(1 − N (t − T )/K), is positive. When N (t − T ) exceeds K, the growth rate becomes negative and the population declines. If T is large enough, the model will, therefore, exhibit oscillations. In this problem, there exist two time scales: 1/r and T . As a result, the stability depends on the relative sizes of these time scales measured by the dimensionless parameter rT . Qualitatively, we can say that if rT is small, no oscillations—or damped oscillations—are observed, and, as for the standard logistic model, the equilibrium point K is asymptotically stable. If rT is large, K is no more stable, and the population oscillates. Hence, there exists a bifurcation point; that is, a threshold value for the parameter rT above which N (t) tends to a periodic function of time. This model is instructive for it proves that single-species populations may exhibit oscillatory behaviors even in stable environments. The following discussion shows how to analyze the stability of time-delay models and find bifurcation points. Consider the general time-delay equation (1.13) dN 1 t = rN (t) 1 − N (t − u)Q(u) du , dt K 0 where Q is the delay kernel whose integral over [0, ∞[ is equal to 1, and replace N (t) by K + n(t). Neglecting quadratic terms in n, we obtain dn(t) = −r dt
t
n(t − u)Q(u) du.
(2.18)
0
This type of equation can be solved using the Laplace transform.18 The Laplace transform of n is defined by ∞ L(z, n) = e−zt n(t) dt. 0
Substituting in (2.18) gives19 zL(z, n) − n0 = −rL(z, n)L(z, Q), 18
19
On using the Laplace transform to solve differential equations, see Boccara [46], pp. 94–103 and 226–231. If L(z, n) is the Laplace transform of n, the Laplace transform of n˙ is L(z, n) ˙ = zL(z, n) − n(0), and the Laplace transform of the convolution of n and Q, defined t by 0 n(t − u)Q(u) du (since the support of n and Q is R+ ), is the product L(z, n)L(z, Q) of the Laplace transforms of n and Q.
30
2 How to Build Up a Model
where n0 is the initial value n(0). Thus, L(z, n) =
n0 . z + rL(z, Q)
(2.19)
If, as in Equation (2.17), the delay kernel is δT , the Dirac distribution at T , its Laplace transform is e−zT , and we have n0 L(z, n) = . z + re−zT n(t) is, therefore, the inverse Laplace transform of L(z, n), that is c+i∞ dz n0 n(t) = , 2iπ c−i∞ z + re−zT where c is such that all the singularities of the function z → (z + re−zT )−1 are in the half-plane {z | Re z < c}. The only singularities of the integrand are the zeros of z + re−zT . As a simplification, define the dimensionless variable ζ = zT and parameter ρ = rT . We then have to find the solutions of ζ + ρe−ζ = 0.
(2.20)
If ζ = ξ + iη, where ξ and η are real, we therefore have to solve the system ξ + ρe−ξ cos η = 0, η − ρe
−ξ
sin η = 0.
(2.21) (2.22)
(i) If ζ is real, η = 0, and ξ is the solution of ξ + ρe−ξ = 0. This equation has no real root for ρ > 1/e, a double root equal to −1 for ρ = 1/e, and two negative real roots for ρ < 1/e. (ii) If ζ is complex, eliminating ρe−ξ between Equations (2.21) and (2.22) yields ξ = −η cot η, (2.23) which, when substituted back into (2.22), gives η = eη cot η sin η. ρ
(2.24)
Complex roots of Equation (2.20) are found by solving (2.22) for η and obtaining ξ from η by (2.21). Since complex roots appear in complex conjugate pairs, it is sufficient to consider the case η > 0. The function f : η → eη cot η sin η has the following properties (k is a positive integer): lim f (η) = 0,
η↑2kπ
lim
f (η) = −∞,
lim
f (η) = ∞.
η↓(2k−1)π η↑(2k−1)π
2.6 Discrete-time models
31
In each open interval ](2k−1)π, 2kπ[, f is increasing, and in each open interval ]2kπ, (2k + 1)π[, f is decreasing. Finally, in the interval [0, π[, the graph of f is represented in Figure 2.7.
Fig. 2.7. Graph of the function f : η → eη cot η sin η for η ∈]0, π[.
Since the slope at the origin f (0) = e, if ρ < 1/e, Equation (2.24) has no root in [0, π[ but a root in each interval of the form [2kπ, ηk ], where ηk is the solution of the equation eη = eη cot η sin η in the interval [2kπ, (2k + 1)π[. From (2.23) it could be verified that the corresponding value of ξ is negative. If 1/e < ρ < π/2, we find, as above, that there are no real roots and that there is an additional complex root with 0 < η < π/2 and, from (2.23), a negative corresponding ξ. Finally, if ρ > π/2, there is a complex root with π/2 < η < π and, from (2.23), a positive corresponding ξ. Consequently, if ρ < π/2, the equilibrium point N ∗ = K is asymptotically stable; but, for ρ > π/2, this point is unstable. ρ = π/2 is a bifurcation point. The method we have just described is applicable to any other delay kernel Q. Since each pole z∗ of the Laplace transform of n contributes in the expression of n(t) with a term of the form Aez∗ t , the problem is to find the poles of the Laplace transform of n given by (2.19) and determine the sign of their real part. These poles are the solutions of the equation z + rL(z, Q) = 0.
2.6 Discrete-time models When a species may breed only at a specific time, the growth process occurs in discrete time steps. To a continuous-time model, we may always associate a discrete-time one. If τ0 represents some characteristic time interval, the following finite difference equations
32
2 How to Build Up a Model
1 x1 (τ + τ0 ) − x1 (τ ) = f1 x1 (τ ), x2 (τ ) , τ0 1 x2 (τ + τ0 ) − x2 (τ ) = f2 x1 (τ ), x2 (τ ) , τ0
(2.25) (2.26)
coincide, when τ0 tends to zero, with the differential equations dx1 = f1 (x1 , x2 ), dτ dx2 = f2 (x1 , x2 ). dτ
(2.27) (2.28)
If τ0 is taken equal to 1, Equations (2.25) and (2.26) become x1 (τ + 1) = x1 (τ ) + f1 x1 (τ ), x2 (τ ) , x2 (τ + 1) = x2 (τ ) + f2 x1 (τ ), x2 (τ ) .
(2.29) (2.30)
Equations (2.29) and (2.30) are called the time-discrete analogues of Equations (2.27) and (2.28). Let (x∗1 , x∗2 ) be an equilibrium point of the difference equations (2.29) and (2.30),20 and put u1 = x1 − x∗1
and u2 = x2 − x∗2 .
In the neighborhood of this equilibrium point, we have ∂f1 ∗ ∗ (x , x )u1 (τ ) + ∂x1 1 2 ∂f2 ∗ ∗ u2 (τ + 1) = u2 (τ ) + (x , x )u1 (τ ) + ∂x1 1 2
u1 (τ + 1) = u1 (τ ) +
∂f1 ∗ ∗ (x , x )u2 (τ ), ∂x2 1 2 ∂f2 ∗ ∗ (x , x )u2 (τ ). ∂x2 1 2
The general solution of this linear system is τ u(τ ) = I + Df (x∗1 , x∗2 ) u(0), where I is the 2 × 2 identity matrix. Here τ is an integer. The equilibrium point (x∗1 , x∗2 ) is asymptotically stable if the absolute values of the eigenvalues of I +Df (x∗1 , x∗2 ) are less than 1. That is, in the complex plane, the eigenvalues of Df (x∗1 , x∗2 ) must belong to the open disk {z | |z − 1| < 1}. The stability criterion for discrete models is more stringent than for continuous models. The existence of a finite time interval between generations is a destabilizing factor. 20
That is, a solution of the system f1 (x∗1 , x∗2 ) = 0 f2 (x∗1 , x∗2 ) = 0.
2.7 Lattice models
33
For example, the time-discrete analogue of the reduced Lotka-Volterra equations (2.4) and (2.5) are h(τ + 1) = h(τ ) + ρ h(τ ) 1 − p(τ ) , (2.31) 1 p(τ + 1) = p(τ ) − p(τ ) 1 − h(τ ) . (2.32) ρ Since the eigenvalues of the Jacobian matrix at the fixed point (1, 1), which are i and −i, are outside the open disk {z | |z − 1| < 1}, the neutrally stable fixed point of the time-continuous model is unstable for the time-discrete model.
2.7 Lattice models To catch a prey, a predator has to be in the immediate neighborhood of the prey. In predator-prey models formulated in terms of either differential equations (ordinary or partial) or difference equations, the short-range character of the predation process is not correctly taken into account.21 One way to correctly take into account the short-range character of the predation process is to discretize space; that is, to consider lattice models. Dynamical systems in which states, space, and time are discrete are called automata networks. If the network is periodic, the dynamical system is a cellular automaton. More precisely, a one-dimensional cellular automaton is a dynamical system whose state s(i, t) ∈ {0, 1, 2, . . . , q − 1} at position i ∈ Z and time t ∈ N evolves according to a local rule f such that s(i, t + 1) = f s(i − r , t), s(i − r + 1, t), . . . , s(i + rr , t) . The numbers r and rr , called the left and right radii of rule f , are positive integers. In what follows, we briefly describe a lattice predator-prey model studied by Boccara, Roblin, and Roger [55]. Consider a finite two-dimensional lattice Z2L with periodic boundary conditions. The total number of vertices is equal to L2 . Each vertex of the lattice is either empty or occupied by a prey or a predator. According to the process under consideration, for a given vertex, we consider two different neighborhoods (Figure 2.8): a von Neumann predation neighborhood (4 vertices) and a Moore pursuit and evasion neighborhood (8 vertices). The evolution of preys and predators is governed by the following set of rules. 21
This will be manifest when we discuss systems that exhibit bifurcations. In phase transition theory, for instance, it is well-known that in the vicinity of a bifurcation point—i.e., a second-order transition point—certain physical quantities have a singular behavior (see Boccara [45], pp. 155–189). It is only above a certain spatial dimensionality—known as the upper critical dimensionality—that the behavior of the system is correctly described by a partial differential equation.
34
2 How to Build Up a Model
Fig. 2.8. Von Neumann predation neighborhood (left) and Moore pursuit and evasion neighborhood (right).
1. A prey has a probability dh of being captured and eaten by a predator in its predation neighborhood. 2. If there is no predator in its predation neighborhood, a prey has a probability bh of giving birth to a prey at an empty site of this neighborhood. 3. After having eaten a prey, a predator has a probability bp of giving birth to a predator at the site previously occupied by the prey. 4. A predator has a probability dp of dying. 5. Predators move to catch prey and prey move to evade predators. Predators and prey move to a site of their von Neumann neighborhood according to the occupation of the pursuit and evasion Moore neighborhood. This neighborhood is divided into four quarters along its diagonals, and prey and predator densities are evaluated in each quarter. Predators move to a neighboring site in the direction of highest local prey density of the Moore neighborhood. In case of equal highest density in two or four directions, one of them is chosen at random. If three directions correspond to the same highest density, predators select the middle one. Preys move to a neighboring site in the opposite direction of highest local predator density. If the four directions are equivalent, one is selected at random. If three directions correspond to the same maximum density, preys choose the remaining one. If two directions correspond to the same maximum density, preys choose at random one of the other two. If H and P denote, respectively, the prey and predator densities, then mHL2 preys and mP L2 predators are sequentially selected at random to perform a move. This sequential process allows some individuals to move more than others. The parameter m, which is a positive number, represents the average number of tentative moves per individual during a unit of time. Rules 1, 2, 3, and 4 are applied simultaneously. Predation, birth, and death processes are modeled by a three-state two-dimensional cellular automaton rule. Rule 5 is applied sequentially. At each time step, the evolution results from the application of the synchronous cellular automaton subrule followed by the sequential one.
2.7 Lattice models
35
To study such a lattice model, it is usually useful to start with the meanfield approximation that ignores space dependence and neglects correlations. In lattice models with local interactions, quantitative predictions of such an approximation are not very good. However, it gives interesting information about the qualitative behavior of the system in the limit m → ∞. If Ht and Pt denote the densities at time t of preys and predators, respectively, we have Ht+1 = Ht − Ht f (1, dh Pt ) + (1 − Ht − Pt )f (1 − Pt , bh Ht ), Pt+1 = Pt − dp Pt + bp Ht f (1, dh Pt ), where f (p1 , p2 ) = p41 − (p1 − p2 )4 . Within the mean-field approximation, the evolution of our predator-prey model is governed by two coupled finite-difference equations. The model has three fixed points that are solutions of the system: Hf (1, dh P ) − (1 − H − P )f (1 − P, bh t) = 0, dp Pt − bp Ht f (1, dh Pt ) = 0. In the prey-predator phase space ((H, P )-plane), (0, 0) is always unstable. (0, 1) is stable if dp − 4bp dh > 0. When dp − 4bp dh changes sign, a nontrivial fixed point (H ∗ , P ∗ ), whose coordinates depend upon the numerical values of the parameters, becomes stable, as illustrated in Figure 2.9.
0.2
Predators
0.18 0.16 0.14 0.12 0.1 0.08 0.06 0.1
0.2
0.3 Preys
0.4
0.5
Fig. 2.9. Orbit in the prey-predator phase space approaching the stable fixed point (0.288, 0.103) for bh = 0.2, dh = 0.9, bp = 0.2, dp = 0.2.
Increasing the probability bp for a predator to give birth, we observe a Hopf bifurcation and the system exhibits a stable limit cycle, as illustrated in Figure 2.10.
36
2 How to Build Up a Model
Predators
0.25 0.2 0.15 0.1 0.05 0.05 0.1 0.15 0.2 0.25 0.3 0.35 Preys Fig. 2.10. Orbit in the prey-predator phase space approaching the stable limit cycle for bh = 0.2, dh = 0.9, bp = 0.6, dp = 0.2.
We will not describe in detail the results of numerical simulations, which can be found in [55]. For large m values, the mean-field approximation provides useful qualitative—although not exact—information on the general temporal behavior as a function of the different parameters. If we examine the influence of the pursuit and evasion process (Rule 5)— i.e., if we neglect the birth, death, and predation processes (bh = bd = dh = dp = 0)—we observe the formation of small clusters of preys surrounded by predators preventing these preys from escaping. Preys that are not trapped by predators move more or less randomly avoiding predators. For bh = 0.2, dh = 0.9, bp = 0.2, dp = 0.2, the mean-field approximation exhibits a stable fixed point in the prey-predator phase space located at (H ∗ , P ∗ ) = (0.288, 0.103). Simulations show that, for these parameter values, a nontrivial fixed point exists only if m > m0 = 0.350. Below m0 , the stable fixed point is (1, 0). This result is quite intuitive; if the average number of tentative moves is too small, all predators eventually die and prey density grows to reach its maximum value. As m increases from m0 to m = 500, the location of the nontrivial fixed point approaches the mean-field fixed point. For bh = 0.2, dh = 0.9, bp = 0.6, dp = 0.2, the mean-field approximation exhibits a stable limit cycle in the prey-predator phase space. This oscillatory behavior is not observed for two-dimensional large lattices [84]. A quasicyclic behavior of the predator and prey densities on a scale of the order of the mean displacements of the individuals may, however, be observed. As mentioned above, cyclic behaviors observed in population dynamics have received a variety of interpretations. This automata network predator-prey model suggests another possible explanation: approximate cyclic behaviors could result as a consequence of a not too large habitat; i.e., when the size of the habitat is of the order of magnitude of the mean displacements of the individuals.
Part I
Mean-Field Type Models
Mean-field type models deal with average quantities. As models of complex systems, it is important to emphasize that they ignore space correlations between the elements of the system and replace local interactions by uniform long-range ones. These models are, therefore, rather crude representations of complex systems, but, as a first attempt to understand the behavior of multiagent systems, they are often very useful. The mean-field type models we shall study are formulated in terms of either differential equations or recurrence equations whose theory is well-developed. Since the purpose of this book is to show how models are built up and studied, we shall use the results of dynamical systems theory without giving proofs of most theorems, especially when these proofs are rather technical. We shall, however, make every effort to give precise definitions and state theorems with accuracy—avoiding unnecessary and overly general statements—and indicate references to mathematical texts where the interested reader will find correct proofs.
3 Differential Equations
The study of dynamical models formulated in terms of ordinary differential equations began with Newton’s attempts to explain the motion of bodies in the solar system. Except in very simple cases, such as the two-body problem, most problems in celestial mechanics proved extremely difficult. At the end of the nineteenth century, Poincar´e developed new methods to analyze the qualitative behavior of solutions to nonlinear differential equations. In a paper [287]1 devoted to functions defined as solutions of differential equations, he explains: Malheureusement, il est ´evident que, dans la grande g´en´eralit´e des cas qui se pr´esentent, on ne peut int´egrer ces ´equations a` l’aide des fonctions d´ej` a connues, ... Il est donc n´ecessaire d’´etudier les fonctions d´efinies par des ´equations diff´erentielles en elles-mˆemes et sans chercher `a les ramener a` des fonctions plus simples, ...2 The modern qualitative theory of differential equations has its origin in this work. There exists a wide variety of models formulated in terms of differential equations, and some of them are presented in this chapter.
3.1 Flows Consider a system whose dynamics is described by the differential equation 1
2
This paper is a revision of the work for which Poincar´e was awarded a prize offered by the king of Sweden in 1889. Unfortunately, it is clear that in most cases we cannot solve these equations using known functions, ... It is therefore necessary to study functions defined by differential equations for themselves without trying to reduce them to simpler functions, ...
42
3 Differential Equations
dx = x˙ = X(x), dt
(3.1)
where x, which represents the state of the system, belongs to the state or phase space S, and X is a given vector field . Figure 3.1 shows an example of a vector field. We have seen (Chapter 1, Example 1) that a differential equation of order higher than one, autonomous or nonautonomous, can always be written under the above form. To present the theory, we need to recall some definitions. Definition 1. A function f : Rn → Rn is differentiable at x0 ∈ Rn if there exists a linear transformation Df (x0 ) that satisfies lim
h→0
f (x0 + h) − f (x0 ) − Df (x0 )h = 0. h
(3.2)
The linear transformation Df (x0 ) is called the derivative of f at x0 . Instead of the word function, many authors use the words map or mapping. In this text, we shall indifferently use any of these terms. It is easily verified that the derivative Df is given by the n × n Jacobian matrix ∂(f1 , f2 , . . . , fn ) . ∂(x1 , x2 , . . . , xn ) Let U be an open subset of Rn ; the function f : U → Rn is continuously differentiable, or of class C 1 , in U if all the partial derivatives ∂fi ∂xj
(1 ≤ i ≤ n, 1 ≤ j ≤ n)
are continuous in U . More generally, f is of class C k in U if all the partial derivatives ∂ k fi ∂xj1 ∂xj2 · · · ∂xjk
(1 ≤ i ≤ n, 1 ≤ j1 ≤ n, 1 ≤ j2 ≤ n, . . . , 1 ≤ jk ≤ n)
exist and are continuous in U . Continuous but not differentiable functions are referred to as C 0 functions. A function is said to be smooth if it is differentiable a sufficient number of times. Definition 2. A function f : U → V , where U and V are open subsets of Rn , is said to be a C k diffeomorphism if it is a bijection, and both f and f −1 are C k functions. If f and f −1 are C 0 , f is called a homeomorphism. In many models, the phase space is not Euclidean. It may have, for instance, the structure of a circle or a sphere. But if, as in both these cases, we can define local coordinates, the notions of derivative and diffeomorphism can be easily extended. Phase spaces that have a structure similar to the
3.1 Flows
43
structure of a circle or a sphere are called manifolds. More precisely, M is a manifold of dimension n if, for any x ∈ M, there exist a neighborhood N (x) ⊆ M containing x and a homeomorphism h : N (x) → Rn that maps n N(x) onto in a neighborhood of h(x) ∈ R . Since we can define coordinates h N (x) ⊆ Rn , h defines local coordinates on N (x). The pair h N (x) , h is called a chart. In order to obtain a global description of M, we cover it with a family of open sets Ni , each associated with a chart (hi (Ni ), hi ). The set of all these charts is called an atlas. In all the models we shall study, even if the phase space is a manifold, the functions f will be given in terms of local coordinates. We shall, therefore, never be really involved with charts and atlases. In all definitions and theorems involving “differential manifolds of dimension n,” the reader could replace this expression by “open sets of Rn .” If, in Equation (3.1), the vector field X defined on an open subset U of Rn is C k , then given x0 ∈ U and t0 ∈ R, for |t − t0 | sufficiently small, there exists a solution of Equation (3.1) through the point x0 at t0 , denoted x(t, t0 , x0 ) with x(t0 , t0 , x0 ) = x0 . This solution is unique and is a C k function of t, t0 , and x0 . As we already pointed out (page 10, Footnote 16), we shall not give a proof of this fundamental theorem since a differential equation modeling a real system should have a unique evolution for any realizable initial state.3 The solution of Equation (3.1) being unique, we have x(t + s, t0 , x0 ) = x(s, t + t0 , x(t, t0 , x0 )). This property shows that the solutions of (3.1) form a one-parameter group of C k diffeomorphisms of the phase space. These diffeomorphisms are referred to as a phase flow or just a flow . The common notation for flows is ϕ(t, x) or ϕt (x), and we have ϕt ◦ ϕs = ϕt+s (3.3) for all t and s in R. Note that ϕ0 is the identity and that (ϕt )−1 exists and is given by ϕ−t .4 In most cases t0 = 0 and, from the definition of ϕt above we have d X(x) = (3.4) ϕ (x) . dt t t=0 Given a point x ∈ U ⊆ Rn , the orbit or trajectory of ϕ passing through x ∈ U is the set {ϕt (x) | t ∈ R} oriented in the sense of increasing t. There is only one trajectory of ϕ passing through any given point x ∈ U . That is, if two trajectories intersect, they must coincide. The set of all trajectories of a flow is called its phase portrait. A helpful representation of a flow is obtained by plotting typical trajectories (Figure 3.1). 3
4
The mathematically oriented reader may consult Hale [163] or Hirsch and Smale [171]. The mapping t → ϕt is an isomorphism from the group of real numbers R to the group {ϕt | t ∈ R}. This group is called an action of the group R on the state space U .
44
3 Differential Equations
Definition 3. Let X be a vector field defined on an open set U of Rn ; a point x∗ ∈ U is an equilibrium point of the differential equation x˙ = X(x) if X(x∗ ) = 0. Note that if x∗ is an equilibrium point, then ϕt (x∗ ) = x∗ for all t ∈ R. Thus x∗ is also called a fixed point of the flow ϕ. The orbit of a fixed point is the fixed point itself. A closed orbit of a flow ϕ is a trajectory that is not a fixed point but is such that ϕτ (x) = x for some x on the trajectory and a nonzero τ . The smallest nonzero value of τ is usually denoted by T and is called the period of the orbit. That is, we have ϕT (x) = x, but ϕt (x) = x for 0 < t < T .
4
1
2
0.5
0
0
-2
-0.5
-4
-1
-4
-2
0
2
4
-1
-0.5
0
0.5
1
Fig. 3.1. Vector field (left) and phase portrait (right) of the two-dimensional system x˙ 1 = x2 , x˙ 2 = −x1 − x22 . (0, 0) is a nonhyperbolic equilibrium point.
Definition 4. An equilibrium point x∗ of the differential equation x˙ = X(x) is said to be Lyapunov stable (or L-stable) if, for any given positive ε, there exists a positive δ (which depends on ε only) such that, for all x0 in the neighborhood of x∗ defined by x0 − x∗ < δ, the solution x(t, 0, x0 ) of the differential equation above satisfying the initial condition x(0, 0, x0 ) = x0 is such that x(t, 0, x0 ) − x∗ < ε for all t > 0. The equilibrium point is said to be unstable if it is not stable. An equilibrium point x∗ of a differential equation is stable if the trajectory in the phase space going through a point sufficiently close to the equilibrium point at t = 0 remains close to the equilibrium point as t increases. Lyapunov stability does not imply that, as t tends to infinity, the point x(t, 0, x0 ) tends to x∗ . But: Definition 5. An equilibrium point x∗ of the differential equation x˙ = X(x) is said to be asymptotically stable if it is Lyapunov stable and lim x(t, 0, x0 ) = x∗ .
t→∞
3.1 Flows
45
Example 8. Kermack-McKendrick epidemic model. To discuss the spread of an infection within a population, Kermack and McKendrick [182] divide the population into three disjoint groups. 1. Susceptible individuals are capable of contracting the disease and becoming infective. 2. Infective individuals are capable of transmitting the disease to others. 3. Removed individuals have had the disease and are dead, have recovered and are permanently immune, or are isolated until recovery and permanent immunity occur.5 Infection and removal are governed by the following rules. 1. The rate of change in the susceptible population is proportional to the number of contacts between susceptible and infective individuals, where the number of contacts is taken to be proportional to the product of the numbers S and I of, respectively, susceptible and infective individuals. The model ignores incubation periods. 2. Infective individuals are removed at a rate proportional to their number I. 3. The total number of individuals S + I + R, where R is the number of removed individuals, is constant, that is, the model ignores births, deaths by other causes, immigration, emigration, etc.6 Taking into account the rules above yields S˙ = −iSI, I˙ = iSI − rI, R˙ = rI,
(3.5)
where i and r are positive constants representing infection and the removal rates. From the first equation, it is clear that S is a nonincreasing function, whereas the second equation implies that I(t) increases with t, if S(t) > r/i, and decreases otherwise. Therefore, if, at t = 0, the initial number of susceptible individuals S0 is less than r/i, since S(t) ≤ S0 , the infection dies out, that is, no epidemic occurs. If, on the contrary, S0 is greater than the critical value r/i, the epidemic occurs; that is, the number of infective individuals first increases and then decreases when S(t) becomes less than r/i.7 Remark 1. This threshold phenomenon shows that an epidemic occurs if, and only if, the initial number of susceptible individuals S0 is greater than a threshold value Sth . For this model, Sth = r/i; i.e., in the case of a deadly disease, an epidemic has less chance to occur if the death rate due to the disease is high! 5 6 7
Models of this type are called SIR models. Or we could say that birth, death, and migration are in exact balance. That is, by definition, an epidemic occurs if the time derivative of the number of infective individuals I˙ is positive at t = 0.
3 Differential Equations
density
46
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
S
I
0
10
20 30 time
40
50
Fig. 3.2. Kermack-McKendrick epidemic model. Susceptible and infective densities as functions of time for i = 0.6, r = 0.1, S(0) = 0.7, and I(0) = 0.05. S(0) is above the threshold value, and, as a consequence of the epidemic, the total population is seriously reduced.
Note that S˙ is nonincreasing and positive, and R˙ is positive and less than or equal to the total population N , therefore, limt→∞ S(t) and limt→∞ R(t) exist. Since I(t) = N − S(t) − R(t), limt→∞ I(t) also exists.8 Moreover, from the first and third equations of (3.5), it follows that dS i = − S; dR r that is, iR S = S0 exp − r iN ≥ S0 exp − > 0. r
(3.6)
In this model, even in the case of a very serious epidemic, some individuals are not infected. The spread of the disease does not stop for lack of susceptible individuals (see Figure 3.2). Example 9. Hethcote-York model for the spread of gonorrhea [170]. Gonorrhea is a sexually transmitted disease that presents the following important characteristics that differ from other infections such as measles or mumps: 8
Equations (3.5) show that in the steady state I(∞) = 0. Then, R(∞) = N −S(∞), and Relation (3.6) shows that S(∞) is the only positive root of x = S0 exp(−i(N − x)/r). More details can be found in Waltman [341].
3.1 Flows
• • •
47
Gonococcal infection does not confer protective immunity, so individuals are susceptible again as soon as they recover from infection.9 The latent period is very short: 2 days, compared to 12 days for measles. The seasonal oscillations in gonorrhea incidence10 are very small (less than 10%), while the incidence of influenza or measles often varies by a factor of 5 to 50.
If we assume that the infection is transmitted only through heterosexual intercourse, we divide the population into two groups, Nf females and Nm males at risk, each group being divided into two subgroups, Nf Sf (resp. Nm Sm ) susceptible females (resp. males) and Nf If (resp. Nm Im ) infective females (resp. males). Nf and Nm are assumed to be constant. The dynamics of gonorrhea is then modeled by the four-dimensional system Nf S˙ f = −λf Sf Nm Im + Nf If /df , Nf I˙f = λf Sf Nm Im − Nf If /df , Nm S˙ m = −λm Sm Nf If + Nm Im /dm , Nm I˙m = λm Sm Nf If − Nm Im /dm , where λf (resp. λm ) is the rate of infection of susceptible females (resp. males), and df (resp. dm ) is the average duration of infection for females (resp. males). The rates λf and λm are different since the probability of transmission of gonococcal infection during a single sexual exposure from an infectious woman to a susceptible man is estimated to be about 0.2–0.3, while the probability of transmission from an infectious man to a susceptible woman is about 0.5–0.7. The average durations of infection df and dm are also different since 90% of all men who have a gonococcal infection notice symptoms within a few days after exposure and promptly seek medical treatment, while up to 75% of women with gonorrhea fail to have symptoms and remain untreated for some time. Since Sf + If = 1 and Sm + Im = 1, the four-dimensional system reduces to the two-dimensional system λf If I˙f = , (1 − If )Im − r df Im , I˙m = rλm (1 − Im )If − dm where r = Nf /Nm . The system has two equilibrium points (If , Im ) = (0, 0) and df dm λf λm − 1 r(df dm λf λm − 1) . (If , Im ) = , dm λm (r + df λf ) df λf (1 + rdm λm ) 9 10
Models of this type are called SIS models. Incidence is the number of new cases in a time interval.
48
3 Differential Equations
Since acceptable solutions should not be negative, we find that the nontrivial equilibrium point exists if df dm λf λm > 1. The coefficient λf /r (resp. λm r) represents the average fraction of females (resp. males) being infected by one male (resp. female) per unit of time. Since males (resp. females) are infectious during the period dm (resp. df ), then the average fraction of females (resp. males) being infected by one infective male (female) during his (resp. her) period of infection is λf dm /r (resp. λm df r). The condition λm df λf dm > 1 therefore expresses that the average fraction of females infected by one male will infect, during their period of infection, more than one male. In this case, gonorrhea remains endemic. If the condition is not satisfied, then gonorrhea dies out. As a consequence, for this model, if the nontrivial equilibrium point exists, it is asymptotically stable; if it does not, then the trivial fixed point is asymptotically stable. Example 10. Leslie’s predator-prey model. After the publication of the Lotka Volterra model Equations (2.1) and (2.1) , many other predator-prey models were proposed. In 1948, Leslie [201] suggested the system H ˙ − sHP, H = rH H 1 − K (3.7) P ˙ P = rP P 1 − , cH where H and P denote, respectively, the prey and predator populations. The equation for the preys is similar to Lotka-Volterra equation (2.1) except that, in the absence of predators, the growth of the preys is modeled by the logistic equation. The equation for the predators is a logistic equation in which the carrying capacity is proportional to the prey population. This model contains five parameters. There is only one nontrivial equilibrium point (H ∗ , P ∗ ), which is the unique solution of the linear system H rH 1 − = sP, P = cH. K If we put H h = ∗, H
P p = ∗, P
ρ=
rH , rP
k=
K , H∗
τ=
√
rH rP t,
(3.8)
Equations (3.7) become dh = ρh 1 − dτ dp 1 = p 1− dτ ρ
h − αph, k p . h
(3.9)
3.1 Flows
49
3
2.5
predators
2
1.5
1
0.5
0.5
1
1.5 preys
2
2.5
3
Fig. 3.3. Phase portrait of the scaled Leslie model for k = 5 and ρ = 1.5.
These equations contain two independent parameters, ρ and k, the extra parameter α being given in terms of these by11 1 . α=ρ 1− k The equilibrium points are (0, 0), (k, 0), and (1, 1). A few trajectories converging to the asymptotically stable equilibrium point (1, 1) are shown in Figure 3.3. Definition 6. Let ϕt : U → U and ψ t : W → W be two flows; if there exists a diffeomorphism h : U → W such that, for all t ∈ R, h ◦ ϕt = ψ t ◦ h, the flows ϕt and ψ t are said to be conjugate. In other words, the diagram ϕ
t U −−−− → ⏐ ⏐ h
U ⏐ ⏐ h
ψ
t W −−−− → W
11
This result follows from the first of the two equations (3.9) when we put dh = 0, h = 1, p = 1. dτ
(3.10)
50
3 Differential Equations
is commutative. The purpose of Definition 6 is to provide with a way of characterizing when two flows have qualitatively the same dynamics. Equation (3.10) can also be written ψ t = h ◦ ϕt ◦ h−1 . That is, h takes the orbits of the flow ϕt into the orbits of the flow ψ t . In other words, the flow ϕt becomes the flow ψ t under the change of coordinates h. It is readily verified that conjugacy is an equivalence relation, i.e., (ϕt ∼ ϕt ), (ϕt ∼ ψ t ) ⇒ (ψ t ∼ ϕt ), and (ϕt ∼ ψ t , ψ t ∼ χt ) ⇒ (ϕt ∼ χt ). If h is a C 1 function such that h ◦ ϕt = ψ t ◦ h, then differentiating both sides with respect to t and evaluating at t = 0 yields d d Dh(ϕt (x)) ϕt (x) ψ t h(x) , = dt dt t=0 t=0 and taking into account Relation (3.4), we obtain Dh(x)X(x) = Y(h(x)).
(3.11)
That is, if h is a differentiable flow conjugacy of the flows ϕt and ψ t , the derivative Dh(x) transforms X(x) into Y(h(x)). Remark 2. Definition 6 requires conjugacy to preserve the parameter t. If we are required to preserve only the orientation along the orbits of ϕt and ψ t , we obtain more satisfactory equivalence classes for flows. In that case, Relation (3.10) would have to be replaced by h ◦ ϕt = ψ τ (t,x) ◦ h, (3.12) where, for all x, the function t → τ (t, x) is strictly increasing; i.e., its derivative with respect to t has to be positive for all x. If there exist a homeomorphism h and a differentiable function τ such that Relation (3.12) is satisfied, it is said that the flows ϕt and ψ t are topologically equivalent. Here is a simple example. Consider the two one-dimensional flows: ϕ1 (t, x) = e−λ1 t x
and
ϕ2 (t, x) = e−λ2 t x,
where λ1 and λ2 are different positive numbers. Let h be such that h(e−λ1 t x) = e−λ2 t h(x).
(3.13)
If h is a diffeomorphism, then taking the derivative with respect to x of each side of this relation yields e−λ1 t h (e−λ1 t x) = e−λ2 t h (x). A diffeomorphism being invertible by definition, h (0) = 0, and we obtain e−λ1 t = e−λ2 t , which contradicts the assumption λ1 = λ2 . If, on the contrary, we assume that h is not differentiable everywhere (i.e., h is not a diffeomorphism but only a homeomorphism), then the homeomorphism12 12
h is continuous and invertible.
3.2 Linearization and stability ⎧ λ2 /λ1 ⎪ , if x < 0 ⎨−|x| h → 0, if x = 0 ⎪ ⎩ λ2 /λ1 , if x > 0 x
51
satisfies (3.13) and shows that ϕ1 and ϕ2 are topologically equivalent.
3.2 Linearization and stability In order to analyze a model described by a nonlinear differential equation of the form (3.1), we first have to determine its equilibrium points and study the behavior of the system near these points. Under certain conditions, this behavior is qualitatively the same as the behavior of a linear system. We therefore begin this section with a brief study of linear differential equations. 3.2.1 Linear systems The solution of the one-dimensional linear differential equation x˙ = ax, which satisfies the initial condition x(0) = x0 , is x(t) = x0 eat . Question. Let A be a time-independent linear operator defined on Rn ; is it possible to generalize the result above and say that, if x ∈ Rn , the solution to the linear differential equation x˙ = Ax,
(3.14)
which satisfies the initial condition x(0) = x0 , is x(t) = eAt x0 ? The answer is yes, provided we define and show how to express the linear operator eAt . Definition 7. Let A be a linear operator defined on Rn ; the exponential of A is the linear operator defined on Rn by13 eA =
∞ Ak k=0
13
k!
.
(3.15)
For this definition to make sense, it is necessary to show that the series converges and, therefore, to first define a metric on the space of linear operators on Rn in order to be able to introduce the notion of limit of a sequence of linear operators. If x is the norm of x ∈ Rn , the norm of a linear operator A may be defined as A = supx≤1 Ax. The distance d between two linear operators A and B on Rn is then defined as d(A, B) = A − B.
52
3 Differential Equations
Depending on whether the real linear operator A has real or complex distinct or multiple eigenvalues, the real linear operator eAt may take different forms, as described below.14 1. All the eigenvalues of A are distinct. If A has distinct real eigenvalues λi and corresponding eigenvectors ui , where i = 1, 2, . . . , k and distinct complex eigenvalues λj = αj + iβj and λj = αj − iβj and corresponding complex eigenvectors wj = uj + ivj and wj = uj − ivj , where j = k + 1, k + 2, . . . , , then the matrix15 M = [u1 , . . . , uk , uk+1 , vk+1 , . . . , u , v ] is invertible, and M−1 AM = diag[λ1 , . . . , λk , Bk+1 , . . . , B ]. The right-hand side denotes the matrix ⎡ λ1 0 · · · · · · · · · ⎢ 0 λ2 0 · · · · · · ⎢ ⎢· · · · · · · · · · · · · · · ⎢ ⎢· · · · · · · · · · · · · · · ⎢ ⎢ 0 0 · · · 0 λk ⎢ ⎢ 0 0 ··· ··· 0 ⎢ ⎣· · · · · · · · · · · · · · · 0 0 ··· ··· ···
··· ··· ··· ··· 0 Bk+1 ··· ···
··· ··· ··· ··· ··· ··· ··· ···
⎤ 0 0⎥ ⎥ · · ·⎥ ⎥ · · ·⎥ ⎥, 0⎥ ⎥ 0⎥ ⎥ · · ·⎦ B
where, for j = k + 1, k + 2, . . . , , the Bj are 2 × 2 blocks given by αj −βj . Bj = βj αj In this case, we have eAt = M diag[eλ1 t , . . . , eλk t , Ek+1 , . . . , E ]M−1 , the 2 × 2 block Ej being given by αj t
Ej = e 14
15
(3.16)
cos βj t − sin βj t . sin βj t cos βj t
For a simple and rigorous treatment of linear differential systems, see Hirsch and Smale [171]. If a1 , a2 , . . . , an are n independent vectors of Rn , the matrix [a1 , a2 , . . . , an ] denotes the matrix ⎤ ⎡ a11 a21 · · · an1 ⎢ a12 a22 · · · an2 ⎥ ⎥ ⎢ ⎣· · · · · · · · · · · · ⎦ . a1n a2n · · · ann
3.2 Linearization and stability
53
Note that the dimension of the vector space on which A and eAt are defined is n = 2 − k. 2. A has real multiple eigenvalues. If A has real eigenvalues λ1 , λ2 , . . . , λn repeated according to their multiplicity and if (u1 , u2 , . . . , un ) is a basis of generalized eigenvectors,16 then the matrix M = [u1 , u2 , . . . , un ] is invertible, and the operator A can be written as the sum of two matrices S + N, where SN = NS,
M−1 SM = diag[λ1 , λ2 , . . . , λn ],
and N is nilpotent of order k ≤ n.17 In this case, we have18 eAt = M diag[eλ1 t , eλ2 t , . . . , eλn t ] M−1 tk−1 . × I + Nt + · · · + Nk−1 (k − 1) !
(3.17)
3. A has complex multiple eigenvalues. If a real linear operator A, represented by a 2n × 2n matrix, has complex eigenvalues λj = αj + iβj and λj = αj − iβj , where j = 1, 2, . . . , n, there exists a basis of generalized eigenvectors wj = uj + ivj and wj = uj − ivj for Cn , (u1 , v1 , . . . , un , vn ) is a basis for R2n , the 2n × 2n matrix M = [u1 , v1 , . . . , un , vn ] is invertible, and the operator A can be written as the sum of two matrices S + N, where α −β2 α −βn α1 −β1 , 2 ,··· , n , SN = NS, M−1 SM = diag β1 α1 β2 α2 βn αn and N is nilpotent of order k ≤ 2n. In this case, we have cos β1 − sin β1 cos βn − sin βn , · · · , eαn t M−1 eAt = M diag eα1 t sin β1 cos β1 sin βn cos βn tk−1 . (3.18) × I + Nt + · · · + Nk−1 (k − 1) ! 16
17 18
If the real eigenvalue λ has multiplicity m < n, then for k = 1, 2, . . . , m, any nonzero solution of (A − λI)k u = 0 is called a generalized eigenvector. A linear operator N is nilpotent of order k if Nk−1 = 0 and Nk = 0. If the linear operators A and B commute, then eA+B = eA eB . The series defining the exponential of a nilpotent linear operator of order k is a polynomial of degree k − 1.
54
3 Differential Equations
4. A has both real and complex multiple eigenvalues. In this case, we use a combination of the results above to find the expression of linear operator eAt . Example 11. Classification of two-dimensional linear flows. Let tr A and det A denote, respectively, the trace and the determinant of the 2 × 2 timeindependent real matrix A. The eigenvalues of A are the roots of its characteristic polynomial: det(A − λI) = λ2 − tr A λ + det A.
(3.19)
We may distinguish the following cases: 1. If det A < 0, the eigenvalues of A are real and have opposite sign. The origin is said to be a saddle. 2. If det A > 0 and (tr A)2 ≥ 4 det A, the eigenvalues of A are real and have the same sign. The origin is said to be an attractive node if tr A < 0 and a repulsive node if tr A > 0. 3. If tr A = 0 and (tr A)2 < 4 det A, the eigenvalues of A are complex. The origin is said to be an attractive focus if tr A < 0 and a repulsive focus if tr A > 0. 4. If tr A = 0 and det A < 0, the eigenvalues of A are complex with a zero real part. The origin is said to be a center . The phase portraits corresponding to the different cases described above are represented in Figure 3.4. Attractive nodes and attractive foci are asymptotically stable equilibrium points for linear two-dimensional systems. Centers are Lyapunov stable but not asymptotically stable equilibrium points. To the classification above, we have to add two degenerate cases illustrated in Figure 3.5. 1. If tr A = 0 and det A = 0, one eigenvalue is equal to zero and the other one equal to tr A. The origin is said to be a saddle node.19 In this case, there exists a basis in which eAt = diag[1, eλt ], showing that the equations of the orbits are x = a, y > 0 and x = a, y < 0, with a ∈ R. 2. If tr A = 0 and det A = 0 with A = 0, there exists a basis in which A is of the form
0a . All the orbits are the straight lines y = y0 traveled at 00
constant velocity ay0 . We mentioned (Remark 2) that, if we require only the orientation along the orbits to be preserved, we obtain more satisfactory flow equivalence classes. In the case of two-dimensional linear systems it can be shown [164]20 that if the eigenvalues of the two matrices A and B have nonzero real parts, then the two linear systems x˙ = Ax and x˙ = Bx are topologically equivalent if, 19 20
Also called a fold or a tangent bifurcation point. See Section 3.5. See pp. 238–246.
3.2 Linearization and stability
a
b
c
d
e
f
55
Fig. 3.4. Two-dimensional linear flows. Phase portraits of the nondegenerate cases. (a) saddle, (b) attractive node, (c) repulsive node, (d) attractive focus, (e) repulsive focus, (f ) center.
Fig. 3.5. Two-dimensional linear flows. Phase portraits of the degenerate cases. Left: tr A = 0 and det A = 0. Right: tr A = 0 and det A = 0 with A = 0.
and only if, A and B have the same number of eigenvalues with negative (and hence positive) real parts. Consequently, up to topological equivalence, there are three distinct equivalence classes of hyperbolic21 two-dimensional linear systems; that is, cases (a), (b), and (c) in Figure 3.4. If the origin is a hyperbolic equilibrium point of a linear system x˙ = Ax, the subspace spanned by the (generalized) eigenvectors corresponding to the 21
That is, whose eigenvalues have nonzero real parts. See Definition 8.
56
3 Differential Equations
eigenvalues with negative (resp. positive) real parts is called the stable (resp. unstable) manifold of the hyperbolic equilibrium point. 3.2.2 Nonlinear systems Definition 8. Let x∗ ∈ U ⊆ Rn be an equilibrium point of the differential equation x˙ = X(x); x∗ is said to be hyperbolic if all the eigenvalues of the Jacobian matrix DX(x∗ ) have nonzero real part. The linear function x → DX(x∗ )x
(3.20)
is called the linear part of X at x∗ . Let x∗ be an equilibrium point of Equation (3.1). In order to determine the stability of x∗ , we have to understand the nature of the solutions near x∗ . Let x = x∗ + y. substituting in (3.1) and Taylor expanding about x∗ yields y˙ = DX(x∗ )y + O( y 2 ).
(3.21)
Since the stability of x∗ is determined by the behavior of orbits through points arbitrarily close to x∗ , we might think that the stability could be determined by studying the stability of the equilibrium point y = 0 of the linear system y˙ = DX(x∗ )y. The solution of (3.22) through the point y0 ∈ Rn at t = 0 is y(t) = exp DX(x∗ )t y0 .
(3.22)
(3.23)
Thus, the equilibrium point y = 0 is asymptotically stable if all the eigenvalues of DX(x∗ ) have negative real parts.22 Question: If all the eigenvalues of DX(x∗ ) have negative real parts, is the equilibrium point x∗ of Equation (3.1) asymptotically stable? The answer is yes. More precisely: Theorem 1. If x∗ is a hyperbolic equilibrium point of the differential equation x˙ = X(x), the flow generated by the vector field X in the neighborhood of x∗ is C 0 conjugate to the flow generated by DX(x∗ )(x − x∗ ). This result is known as the Hartman-Grobman theorem.23 Hence, if DX(x∗ ) has no purely imaginary eigenvalues, the stability of the equilibrium point x∗ of the nonlinear differential equation x˙ = X(x) can be determined from the study of the linear differential equation y˙ = DX(x∗ )y, where y = x − x∗ . If DX(x∗ ) has purely imaginary eigenvalues, this is not the case. The following examples illustrate the various possibilities. 22 23
See Subsection 3.2.1. For a proof, consult Palis and de Melo [276].
3.2 Linearization and stability
57
Example 12. The damped pendulum. The equation for the damped pendulum is θ¨ + 2aθ˙ + ω 2 sin θ = 0, (3.24) where θ is the displacement angle from the stable equilibrium position, a > 0 is the friction coefficient, and ω 2 is equal to the acceleration of gravity g divided by the pendulum length . If we put x1 = θ,
˙ x2 = θ,
Equation (3.24) may be written dx1 = x2 , dt dx2 = −ω 2 sin x1 − 2ax2 . dt The equilibrium points of this system are (nπ, 0), where n is any integer (n ∈ Z). The Jacobian of the vector field X = (x2 , −ω 2 sin x1 − 2ax2 ) is 0 1 . DX(x1 , x2 ) = −ω 2 cos x1 −2a
7.5 5 2.5 0 -2.5 -5 -7.5 -6
-4
-2
0
2
4
6
Fig. 3.6. Phase portrait of a damped pendulum x˙ 1 = x2 , x˙ 2 = −ω 2 sin x1 − 2ax2 for ω = 2.8 and a = 0.1. The equilibrium points 0 and ±2π are asymptotically stable, while the equilibrium points ±π are unstable (saddle points).
If n √ is even, the eigenvalues of the Jacobian matrix at (nπ, 0) are λ1,2 = −a ± a2 − ω 2 . If ω ≤ a, both eigenvalues are real and negative; if ω > a, the eigenvalues are complex conjugate and their real part is negative (it is equal to −a). Therefore, if n is an even integer, the equilibrium point (nπ, 0) is asymptotically stable (see Figure 3.6).
58
3 Differential Equations
√ If n is odd, the eigenvalues of the Jacobian matrix are λ1,2 = −a ± a2 + ω 2 , that is, real and of opposite signs. The equilibrium point (nπ, 0) is therefore unstable (see Figure 3.6). Remark 3. A hyperbolic equilibrium point x∗ of the differential equation x˙ = X(x) is called a sink if all the eigenvalues of DX(x∗ ) have negative real parts; it is called a source if all the eigenvalues of DX(x∗ ) have positive real parts; and it is called a saddle if it is hyperbolic and DX(x∗ ) has at least one eigenvalue with a negative real part and at least one eigenvalue with a positive real part.
Example 13. A perturbed harmonic oscillator. Consider the system x˙1 = x2 + λx1 (x21 + x22 ), x˙2 = −x1 + λx2 (x21 + x22 ), where λ is a parameter. These equations describe the dynamics of a perturbed harmonic oscillator. For all values of λ, the origin is an equilibrium point. The Jacobian at the origin of the vector field X = (x2 + λx1 (x21 + x22 ), −x1 + λx2 (x21 + x22 )) is 0 1 DX(0, 0) = . −1 −0 Its eigenvalues are ±i. The origin is a nonhyperbolic point, and to study its stability we have to analyze the behavior of the orbits close to the origin. If x1 (t), x2 (t) are the coordinates of a phase point at time t, its distance from the origin will increase or decrease according to the sign of the time derivative d 2 x (t) + x22 (t) = 2x1 (t)x˙ 1 (t) + 2x2 (t)x˙2 (t) = 2λ(x21 + x22 )2 . dt 1 Thus, as t tends to infinity, x(t) 2 tends to zero if λ < 0, and the origin is an asymptotically stable equilibrium; if λ > 0 with x0 = 0, x(t) 2 tends to infinity, showing that, in this case, the origin is an unstable equilibrium. We could have reached the same conclusion using polar coordinates defined by x1 = r cos θ, x2 = r sin θ. In terms of the coordinates (r, θ), the system becomes r˙ = λr3 ,
θ˙ = −1.
Since θ˙ = −1, the orbits spiral monotonically clockwise around the origin, and the stability of the origin is the same as the equilibrium point r = 0 of the one-dimensional system r˙ = λr3 . That is, r = 0 is asymptotically stable if λ < 0 and unstable if λ > 0 with r0 = 0.
3.2 Linearization and stability
59
Near a hyperbolic equilibrium point x∗ of a nonlinear system x˙ = X(x) we can define (local) stable and unstable manifolds. They are tangent to the respective stable and unstable manifolds of the linear system y˙ = DX(x∗ )y.24 If the equilibrium point x∗ is nonhyperbolic, we can define in a similar manner a center manifold tangent to the center subspace spanned by the nz (generalized) eigenvectors corresponding to the nz eigenvalues of the linear operator DX(x∗ ) with zero real parts. Note that while the stable and unstable manifolds are unique, the center manifold is not. The essential interest of the center manifold is that it contains all the complicated dynamics in the neighborhood of a nonhyperbolic point. The following classical example illustrates the characteristic features of the center manifold.25 Example 14. Consider the system x˙ 1 = x21 ,
x˙ 2 = −x2 .
The origin is a nonhyperbolic fixed point. The solutions are x1 (t) =
x1 (0) 1 − x1 (0)t
and x2 (t) = x2 (0)e−t .
Eliminating t, we find that the equations of the trajectories in the (x1 , x2 )space are 1 1 − x2 = 0. − x2 (0) exp x1 x1 (0) For x1 < 0, all the trajectories approach the origin with all the derivatives of x2 with respect to x1 equal to zero at the origin. For x1 ≥ 0, the only trajectory that goes through the origin is x2 = 0. The center manifold, tangent to the eigenvector directed along the x1 -axis, which corresponds to the zero eigenvalue of the linear part of the vector field at the origin is therefore not unique. Except for x2 = 0, all the center manifolds are not C ∞ . This example shows that the invariant center manifold, unlike the invariant stable and unstable manifolds, is not necessarily unique and as smooth as the vector field. 24
25
More precisely: Let x∗ be a hyperbolic equilibrium point of the nonlinear system x˙ = X(x), where X is a C k (k ≥ 1) vector field on Rn . If the linear operator DX(x∗ ) has nn eigenvalues with negative real parts and np = n − nn eigenvalues with positive real parts, there exists an nn -dimensional differentiable manifold s Wloc tangent to the stable subspace of the linear system y˙ = DX(x∗ )y at x∗ such s that, for all x0 ∈ Wloc , and all t > 0, limt→∞ ϕt (x0 ) = x∗ , and there exists an u np -dimensional differentiable manifold Wloc tangent to the unstable subspace of ∗ ∗ u the linear system y˙ = DX(x )y at x such that, for all x0 ∈ Wloc and all t < 0, limt→∞ ϕt (x0 ) = x∗ , where ϕt is the flow generated by the vector field X. This rather intuitive result is known as the stable manifold theorem. For more details on invariant manifolds consult Hirsch, Pugh, and Shub [172]. On center manifold theory, see Carr [79].
60
3 Differential Equations
In order to determine if an equilibrium point x∗ of the differential equation ˙x = X(x) is stable we have to study the behavior of the function x → x − x∗ in a neighborhood N (x∗ ) of x∗ . The Lyapunov method introduces more general functions. The essential idea on which the method rests is to determine how an adequately chosen real function varies along the trajectories of the flow ϕt generated by the vector field X. Definition 9. Let x∗ be an equilibrium point of the differential equation x˙ = X(x) on U ⊆ Rn . A C 1 function V : U → R is called a strong Lyapunov function for the flow ϕt on an open neighborhood N (x∗ ) of x∗ provided V (x) > V (x∗ ) and d V˙ (x) = V ϕt (x) 0, the orbit is crossing the curve from the inside to the outside. These remarks make the Lyapunov theorem quite intuitive.
3.2 Linearization and stability
61
1 2 2 x2 + g(1 − cos x1 ), 2 which represents the energy of the pendulum when the mass of the bob is equal to unity, is a weak Lyapunov function since V : (x1 , x2 ) →
∂V ∂V x˙ 1 + x˙ 2 ≡ 0. V (x1 , x2 ) > 0 for (x1 , x2 ) = (0, 0) and V˙ (x1 , x2 ) = ∂x1 ∂x2 The equilibrium point (0, 0) is Lyapunov stable. In the case of the damped pendulum (see Example 12), we have x˙ 1 = x2 , g x˙ 2 = − sin x1 − 2ax2 ,
(a > 0).
If here again we consider the function V defined by V (x1 , x2 ) = 12 2 x22 + g(1 − cos x1 ), we find that V˙ (x1 , x2 ) = −2a2 x22 . For the damped pendulum, V˙ (x1 , x2 ) is a strong Lyapunov function in a neighborhood of the origin. The origin is, therefore, asymptotically stable. Example 16. The van der Pol oscillator. The differential equation x ¨ + λ(x2 − 1)x˙ + x = 0
(3.25)
describes the dynamics of the van der Pol oscillator [291], which arises in electric circuit theory. It is a harmonic oscillator that includes a nonlinear friction term: λ(x2 − 1)x. ˙ If the amplitude of the oscillations is large, the amplitude-dependent “coefficient” of friction is positive, and the oscillations are damped. As a result, the amplitude of the oscillations decreases, and the amplitude-dependent “coefficient” of friction eventually becomes negative, corresponding to a sort of antidamping. If we put x1 = x and x2 = x, ˙ Equation (3.25) takes the form x˙ 1 = x2 ,
x˙2 = −x1 − λ(x21 − 1)x2 .
(3.26)
The equilibrium point (x1 , x2 ) = (0, 0) is nonhyperbolic, but we may study its stability using the Lyapunov method. Consider the function V : (x1 , x2 ) →
1 2 (x + x22 ). 2 1
It is positive for (x1 , x2 ) = (0, 0, ), and its time derivative V˙ (x1 , x2 ) = x1 x˙ 1 + x2 x˙2 = −λ(x21 − 1)x22
62
3 Differential Equations 3
3
2
2
1
1
0
0
-1
-1
-2
-2
-2
-1
0
1
2
3
-2
-1
0
1
2
3
Fig. 3.7. Phase portraits of the van der Pol equation: x ¨ + λ(x2 − 1)x˙ + x = 0. For λ = −0.5, the origin is an asymptotically stable equilibrium point (left), while for λ = 0.5 the orbits converge to a stable limit cycle (right).
is negative in a neighborhood of the equilibrium point (0, 0) if λ is negative. Hence, V is a strong Lyapunov function, which proves that (0, 0) is asymptotically stable if λ < 0. It is an attractive focus. If λ is positive, (0, 0) is unstable. It is a repulsive focus, and, as illustrated in Figure 3.7, trajectories converge to a stable limit cycle. Arnol’d gives the following simple proof of the existence of a stable limit cycle.27 Note first that the Lyapunov function V represents the energy of the system, which is a conserved quantity in the case of the harmonic oscillator, i.e., for λ = 0. If the parameter λ is very small, trajectories are spirals in which the distance between adjacent coils is of the order of λ. To determine if these spirals either approach the origin or recede from it, we may compute an approximate value of the increment ∆V over one revolution around the origin. Since V˙ (x1 , x2 ) = −λ(x21 − 1)x22 , and, to first order in λ, x1 (t) = A cos(t − t0 ) and x2 (t) = −A sin(t − t0 ), we obtain 2π ∆V A cos(t − t0 ), −A sin(t − t0 ) ∆V = −λ 0 A4 2 . = πλ A − 4 • If λ < 0 and the amplitude A of the oscillations is small, ∆V < 0, i.e., the system gives energy to the external world: the trajectory is a contracting spiral. • If λ > 0 and the amplitude A of the oscillations is small, ∆V > 0, i.e., the system receives energy from the external world: the trajectory is an expanding spiral. 27
See [9], pp. 150–151.
3.2 Linearization and stability
63
3
2
1
0
-1
-2
-2
-1
0
1
2
3
Fig. 3.8. Phase portrait of the van der Pol equation: x ¨ + λ(x2 − 1)x˙ + x = 0 for λ = 0.05. As explained in the text, for a small positive value of λ, the stable limit cycle (thick line) is close to the circle of radius 2 centered at the origin.
2.8 2.6 energy
2.4 2.2 2 1.8 1.6 0
10
20
30
40
50
time Fig. 3.9. Energy of the van der Pol oscillator as a function of time along the limit cycle for λ = 0.5.
•
If ∆V = 0, the energy is conserved, and, for a small positive λ, the trajectory is a cycle close to the circle x21 + x22 = A2 , where A is the positive root of A2 (1 − 14 A2 ) = 0, i.e., A = 2.
This result is illustrated in Figure 3.8. Note that for a finite positive value of λ, the energy is not conserved along a stable limit cycle but it varies periodically. As shown in Figure 3.9, the system receives energy from the external world during a part of the cycle and gives it back during the other part.
64
3 Differential Equations
If x∗ is an asymptotically stable equilibrium point of the differential equation x˙ = X(x) on U ⊆ Rn , it is of practical importance to determine its basin of attraction, that is, the set {x ∈ U | lim ϕt (x) = x∗ }. t→∞
(3.27)
The method of Lyapunov may be used to obtain estimates of the basin of attraction. The problem is to find the largest subset of U in which V˙ (x) is negative. Remark 4. In the case of the van der Pol equation, the symmetry of the equation, which is invariant under the transformation t → −t, λ → −λ, shows that, when λ < 0, the basin of attraction of the origin is the interior of the closed curve symmetrical, with respect to the 0x1 -axis, to the stable limit cycle obtained for |λ| (see Figure 3.7).
Definition 10. A point y ∈ Rn is an ω-limit point for the trajectory {ϕt (x) | t ∈ R} through x if there exists a sequence (tk ) going to infinity such that limk→∞ ϕtk (x) = y. The set of all ω-limit points of x is called the ω-limit set of x, and is denoted by Lω (x). α-limit points and the α-limit set are defined in the same way but with a sequence (tk ) going to −∞. The α-limit set of x is denoted Lα (x).28 Let y ∈ Lω (x) and z = ϕtk (y); then limk→∞ ϕt+tk (x) = z, showing that y and z belong to the ω-limit set Lω (x) of x. The ω-limit set of x is, therefore, invariant under the flow ϕt .29 Similarly, we could have shown that the α-limit set of x is invariant under the flow. If x∗ is an asymptotically stable equilibrium point, it is the ω-limit set of every point in its basin of attraction. A closed orbit is the α-limit and ω-limit set of every point on it. While, in general, limit sets can be quite complicated, for two-dimensional systems the situation is much simpler. The following result known as the Poincar´e-Bendixson theorem gives a criterion to detect limit cycles (see Definition 11 below) in systems modeled by a twodimensional differential equation: Theorem 2. A nonempty compact limit set30 of a two-dimensional flow defined by a C 1 vector field, which contains no fixed point, is a closed orbit.31 Definition 11. A limit cycle is a closed orbit γ such that either γ ⊂ Lω (x) or γ ⊂ Lα (x) for some x ∈ / γ. In the first case, γ is an ω-limit cycle; in the second case, it is an α-limit cycle. 28
29
30
31
The reason for this terminology is that α and ω are, respectively, the first and last letters of the Greek alphabet. A set M is also invariant under the flow ϕt if, for all x ∈ M and all t ∈ R, ϕt (x) ∈ M . For instance, fixed points and closed orbits are invariant sets. A set is compact if, from any covering by open sets, it is possible to extract a finite covering. Any closed bounded subset of a finite-dimensional metric space is compact. See Hirsch and Smale [171].
3.2 Linearization and stability
65
Closed orbits around a center are not limit cycles. A limit cycle is an isolated closed orbit in the sense that there exists an annular neighborhood of the limit cycle that contains no other closed orbits. The stability of limit cycles is studied in the next chapter Section 4.3. In order to prove the existence of a limit cycle using Theorem 2, one has to find a bounded subset D of R2 such that, for all x ∈ D, the trajectories {ϕt (x) | t > 0} remain in D32 and show that D does not contain an equilibrium point.
1.5 1 0.5 0 -0.5 -1
-1 -0.5
0
0.5
1
1.5
Fig. 3.10. Phase portrait of the two-dimensional system of Example 17
Example 17. Perturbed center. Consider the two-dimensional system x˙ 1 = x1 − x2 − x1 (x21 + x22 ),
x˙ 2 = x1 + x2 − x2 (x21 + x22 ).
Using polar coordinates, this system takes the following particularly simple form: r˙ = r − r3 , θ˙ = 1. For r = 1, r˙ = 0; therefore, if r1 and r2 are two real numbers such that 0 < r1 < 1 < r2 , we verify that, for r1 ≤ r < 1, r˙ > 0, and, for 1 < r ≤ r2 , r˙ < 0. Thus, the closed annular set {x | r1 ≤ (x21 +x22 )1/2 ≤ r2 } is positively invariant (see Footnote 32) and does not contain an equilibrium point. According to the Poincar´e-Bendixson theorem, this bounded subset contains a limit cycle (see Figure 3.10). 32
Such a subset D is said to be positively invariant.
66
3 Differential Equations
3.3 Graphical study of two-dimensional systems There exists a wide variety of models describing the interactions between two populations. If one seeks to incorporate in such a model a minimum of broadly relevant features, the equations describing the model might become difficult to analyze. It is, however, frequently possible to analyze graphically the behavior of the system without entering into specific mathematical details. The system N˙ 1 = N1 f1 (N1 , N2 ),
N˙ 2 = N2 f2 (N1 , N2 ),
(3.28)
represents a model of two interacting populations, whose growth rates N˙ 1 /N1 and N˙ 2 /N2 are, respectively, equal to f1 (N1 , N2 ) and f2 (N1 , N2 ). The general idea on which rests the qualitative graphical analysis of Equations (3.28) is to: (i) divide the positive quadrant of the (N1 , N2 )-plane in domains bounded by the sets {(N1 , N2 ) | f1 (N1 , N2 ) = 0} and {(N1 , N2 ) | f2 (N1 , N2 ) = 0},33 (ii) find, in each domain, the sign of both growth rates, which determine the direction of the vector field, and (iii) represent, in each domain, the direction of the vector field (i.e., the flow) by an arrow. As shown in the following example, a schematic phase portrait can then be easily obtained. Example 18. Lotka-Volterra competition model. Assuming that two species compete for a common food supply, their growth could be described by the following simple two-dimensional system: N1 N2 ˙ ˙ N 1 = r1 N 1 1 − − λ 1 N 1 N 2 , N 2 = r2 N 2 1 − − λ2 N1 N2 . (3.29) K1 K2 This competition model is usually associated with the names of Lotka and Volterra. Each population (N1 and N2 ) has a logistic growth but the presence of each reduces the growth rate of the other. The constants r1 , r2 , K1 , K2 , λ1 , and λ2 are positive. As usual, we define reduced variables writing √ r1 N1 N2 λ1 K2 λ2 K1 τ = r1 r2 t, ρ = , n1 = , n2 = , α1 = √ , α2 = √ , r2 K1 K2 r1 r2 r1 r2 and Equations (3.29) become dn1 = ρn1 (1 − n1 ) − α1 n1 n2 , dτ 33
dn2 1 = n2 (1 − n2 ) − α2 n1 n2 . dτ ρ
(3.30)
These sets, which are the preimages of the point (0, 0) by f1 and f2 , respectively, are called null clines.
3.3 Graphical study of two-dimensional systems
67
To determine under which conditions the two species can coexist, the two straight lines ρ(1 − n1 ) − α1 n2 = 0,
1 (1 − n2 ) − α2 n1 = 0, ρ
should intersect in the positive quadrant. There are two possibilities represented in Figure 3.11. We find that if a1 = ρ/α1 and a2 = 1/ρα2 are both n2
n2 a1
1
a1
1
1
a2
n1
a2
1
n1
Fig. 3.11. Lotka-Volterra competition model. Intersecting null clines. Dots show equilibrium points. In each domain, arrows represent the direction of the vector (n˙ 1 , n˙ 2 ), with time derivatives taken with respect to reduced time τ . Left: a1 = ρ/α1 > 1 and a2 = 1/ρα2 > 1. Right: a1 = ρ/α1 < 1 and a2 = 1/ρα2 < 1.
greater than 1, the nontrivial equilibrium point is asymptotically stable: the two populations will coexist. If, on the contrary, a1 = ρ/α1 and a2 = 1/ρα2 are both less than 1, the nontrivial equilibrium point is a saddle. The equilibrium points (0, 1) and (1, 0) are stable steady states. The population that will eventually survive depends upon the initial state. When the null clines do not intersect, as in Figure 3.12, only one population will eventually survive. It is population 1 if a1 = ρ/α1 > 1 and a2 = 1/ρα2 < 1, and population 2 if a1 = ρ/α1 < 1 and a2 = 1/ρα2 > 1. The equilibrium point (0, 0) is, in all cases, unstable. These results illustrate the so-called competitive exclusion principle whereby two species competing for the same limited resource cannot, in general, coexist. Note that the species that will eventually survive is the species whose growth rate is less perturbed by the presence of the other. It is possible to use the graphical method to study more complex models. For instance, Hirsch and Smale [171]34 discuss a large class of competition models. Their equations are of the form 34
In a much older paper, Kolmogorov [193] had already presented a qualitative study of a general predator-prey system.
68
3 Differential Equations n2
a1 n2 1
1
a1
a2
1
n1
1
a2
n1
Fig. 3.12. Lotka-Volterra competition model. Nonintersecting null clines. Dots show equilibrium points. In each domain, arrows represent the direction of the vector (n˙ 1 , n˙ 2 ), with time derivatives taken with respect to reduced time τ . Left: a1 = ρ/α1 > 1 and a2 = 1/ρα2 < 1. Right: a1 = ρ/α1 < 1 and a2 = 1/ρα2 > 1.
N˙ 1 = N1 f1 (N1 , N2 ),
N˙ 2 = N2 f2 (N1 , N2 ),
with the following assumptions: 1. If either species increases, the growth rate of the other decreases. Hence, ∂f1 ∂f2 < 0 and < 0. ∂N2 ∂N1 2. If either population is very large, neither species can multiply. Hence, f1 (N1 , N2 ) ≤ 0 and f2 (N1 , N2 ) ≤ 0 if either N1 > K or N2 > K. 3. In the absence of either species, the other has a positive growth rate up to a certain population and a negative growth rate beyond that. There are, therefore, constants a1 > 0 and a2 > 0 such that f1 (N1 , N2 ) > 0 for N1 < a1 and f1 (N1 , N2 ) < 0 for N1 > a1 and f2 (N1 , N2 ) > 0 for N2 < a2 and f2 (N1 , N2 ) < 0 for N2 > a2 . Analyzing different possible generic shapes of the null clines, Hirsch and Smale show that the ω-limit set of any point in the positive quadrant of the (N1 , N2 )-plane exists and is one of a finite number of equilibria; that is, there are no closed orbits.
3.4 Structural stability
69
3.4 Structural stability The qualitative properties of a model should not change significantly when the model is slightly modified: a model should be robust. To be precise we have to give a definition of what is a “slight modification.” That is, in the space V(U ) of all vector fields defined on an open set U ⊆ Rn , we have to define an appropriate metric.35 The metric, we said, has to be appropriate in the sense that, if two vector fields are close for this metric, then the dynamics they generate have the same qualitative properties. Actually, on the space V(U ), we shall first define an appropriate norm and associate a metric with that norm.36 X
x
Fig. 3.13. Two neighboring one-dimensional vector fields in the C 0 topology that do not have the same number of equilibrium points.
If X ∈ V(U ), its C 0 norm is defined by 35
We have already defined the notion of distance (or metric) (see page 14). A distance d on a space X is a mapping d : X × X → R+ satisfying the following conditions:
1. d(x, y) = 0 ⇐⇒ x = y. 2. d(x, y) = d(y, x). 3. d(x, z) ≤ d(x, y) + d(y, z). 36
The pair (X, d) is called a metric space. A mapping x → x defined on a vector space X into R+ is a norm if it satisfies the following conditions:
1. x = 0 ⇐⇒ x = 0. 2. For any x ∈ X and any scalar λ, λx = |λ|x. 3. For all x and y in X, x + y ≤ x + y. A vector space equipped with a norm is called a normed vector space.
70
3 Differential Equations
X 0 = sup X(x) , x∈U
where X(x) is the usual norm of the vector X(x) in Rn . The C 0 distance between two vector fields X and Y, which belong to V(U ), is then defined by d0 (X, Y) = X − Y 0 . As shown in Figure 3.13, two vector fields that are close in the C 0 topology37 may not have the same number of hyperbolic equilibrium points. To avoid this undesirable situation, we should define a distance requiring that the vector fields as well as their derivatives be close at all points of U . The C 1 distance between two vector fields X and Y belonging to V(U ) is then defined by d1 (X, Y) = sup { X(x) − Y(x) , DX(x) − DY(x) }. x∈U
Similarly we could define C k distances for k greater than 1. In the space V(U ) of all vector fields defined on an open set U ⊆ Rn , an ε-neighborhood of X ∈ V(U ) is defined by Nε (X) = {Y ∈ V(U ) | X − Y 1 < ε}. A vector field Y that belongs to an ε-neighborhood of X is said to be ε1 1 C of X. In this case, the components -close to X or an ε-C -perturbation X1 (x), X2 (x), . . . , Xn (x) and Y1 (x), Y2 (x), . . . , Yn (x) of, respectively, X and Y and their first derivatives are close throughout U . Theorem 3. Let x∗ be a hyperbolic equilibrium point of the flow ϕt generated by the vector field X ∈ V(U ). Then, there exists a neighborhood V of x∗ and a neighborhood N of X such that each Y ∈ N generates a flow ψ t that has a unique hyperbolic equilibrium point y∗ ∈ V . Moreover, the linear operators DX(x∗ ) and DY(y∗ ) have the same number of eigenvalues with positive and negative real parts. In this case, the flow ϕt generated by the vector field X is said to be locally structurally stable at x∗ . 37
A collection T of subsets of a set X is said to be a topology in X if T has the following properties:
1. X and ∅ belong to T . 2. If {Oi | i ∈ I} is an arbitrary collection of elements of T , then ∪i∈I Oi belongs to T . 3. If O1 and O2 belong to T , then O1 O2 belongs to T . The ordered pair (X, T ) is called a topological space, and the elements of T are called open sets in X. When no ambiguity is possible, one may speak of the “topological space X.”
3.5 Local bifurcations of vector fields
71
Since the flows exp DX(x∗ )t and exp DX(x∗ )t are equivalent, this result follows directly from Theorem 1, which proves that the flows ϕt and ψ t are conjugate.38 The vector field (x2 , −ω 2 sin x1 ) of the undamped pendulum generates a flow that is not structurally stable. Its equilibrium points are nonhyperbolic. The damped pendulum (Example 12), which is obtained by adding to the vector field of the undamped pendulum the perturbation (0, −2ax2 ), has hyperbolic equilibrium points, which are either asymptotically stable or unstable (see Figure 3.6). Similarly, a linear harmonic oscillator whose vector field is (x2 , −x1 ) has only one equilibrium point (0, 0), which is a center. The flow generated by the vector field is not structurally stable. As shown in Example 13, the perturbation (λx1 (x21 + x22 ), λx2 (x21 + x22 )) generates a qualitatively different flow: its ω-limit set is a stable limit cycle.
3.5 Local bifurcations of vector fields In Section 2.3 we studied the predator-prey model used by Harrison to explain Luckinbill’s experiment with Didinium and Paramecium. This model, whose dimensionless equations are αh ph h dh =h 1− , − dτ k β+h dp αp ph − γp, = dτ β+h
where αh =
1 1− k
(β + 1) and αp = γ(β + 1),
exhibits two qualitatively different behaviors. For k < β + 2, the equilibrium point (1, 1) is asymptotically stable, but, for k > β + 2, (1, 1) is unstable and the ω-limit set is a limit cycle. The change of behavior occurs for k = β + 2; that is, when (1, 1) is nonhyperbolic. Such a change in a family of vector fields, which depends upon a finite number of parameters, is referred to as a bifurcation. The van der Pol equation (Example 16) exhibits a similar bifurcation. Here again, at the bifurcation point, the equilibrium point (0, 0) is nonhyperbolic. Like many concepts of the qualitative theory of differential equations, the theory of bifurcations has its origins in the work of Poincar´e.39 Let 38
39
The flows have to be restricted on neighborhoods of the respective hyperbolic equilibrium points x∗ and y∗ on which Theorem 1 is valid. For more details, see Arrowsmith and Place [11]. Poincar´e was the first to use the French word bifurcation in this context [288].
72
3 Differential Equations
(x, µ) → X(x, µ)
(x ∈ Rn , µ ∈ Rr )
be a C k r-parameter family of vector fields.40 If (x∗ , µ∗ ) is an equilibrium point of the flow generated by X(x, µ) (i.e., X(x∗ , µ∗ ) = 0), we should be able to answer the question: Is the stability of the equilibrium point affected as µ is varied? If the equilibrium point is hyperbolic (i.e., if all the eigenvalues of the Jacobian matrix41 DX(x∗ , µ∗ ) have nonzero real part), then, from Theorem 3, it follows that the flow generated by X(x, µ), is locally structurally stable at (x∗ , µ∗ ). As a result, for values of µ sufficiently close to µ∗ , the stability of the equilibrium point is not affected. More precisely, since all the eigenvalues of the Jacobian DX(x∗ , µ∗ ) have nonzero real part, the Jacobian is invertible, and, from the implicit function theorem,42 it follows that there exists a unique C k function x : µ → x(µ) such that, for µ sufficiently close to µ∗ , X(x(µ), µ) = 0 with x(µ∗ ) = x∗ . Since the eigenvalues of the Jacobian DX x(µ), µ are continuous functions of µ, for µ sufficiently close to µ∗ , all the eigenvalues of this Jacobian have nonzero real part. Hence, equilibrium points close to (x∗ , µ∗ ) are hyperbolic and have the same type of stability as (x∗ , µ∗ ). If the equilibrium point (x∗ , µ∗ ) is nonhyperbolic (i.e., if some eigenvalues of DX(x∗ , µ∗ ) have zero real part), X(x, µ) is not structurally stable at (x∗ , µ∗ ). In this case, for values of µ close to µ∗ , a totally new dynamical behavior can occur. Our aim in this section is to investigate the simplest bifurcations that occur at nonhyperbolic equilibrium points in one- and two-dimensional systems. 40
41
42
The degree of differentiability has to be as high as needed in order to satisfy the conditions for the family of vector fields to exhibit a given type of bifurcation. See the necessary and sufficient conditions below. The notation DX(x∗ , µ∗ ) means that the derivative is taken with respect to x at the point (x∗ , µ∗ ). If there is a risk of confusion with respect to which variable the derivative is taken, we will write Dx X(x∗ , µ∗ ). We shall often use this theorem, in particular in bifurcation theory, where it plays an essential role. Here is a simplified version that is sufficient in most cases: Let f : I1 × I2 → R be a C k function of two real variables; if (x0 , y0 ) ∈ I1 × I2 and ∂f f (x0 , y0 ) = 0, (x0 , y0 ) = 0, ∂y then there exists an open interval I, containing x0 , and a C k function ϕ : I → R such that ϕ(x0 ) = y0 ,
and
f (x, ϕ(x)) = 0,
for all x ∈ I.
For a proof, see Lang [200], pp. 425–429, in which a proof of a more general version of the implicit function theorem is also given.
3.5 Local bifurcations of vector fields
73
3.5.1 One-dimensional vector fields In a one-dimensional system, a nonhyperbolic equilibrium point is necessarily associated with a zero eigenvalue of the derivative of the vector field at the equilibrium point. In this section, we describe the most important types of bifurcations that occur in one-dimensional systems x˙ = X(x, µ) (x ∈ R, µ ∈ R).
(3.31)
As a simplification, we assume that the bifurcation point (x∗ , µ∗ ) is (0, 0), that is, ∂X (0, 0) = 0. (3.32) X(0, 0) = 0 and ∂x In a neighborhood of a bifurcation point, the essential information concerning the bifurcation is captured by the bifurcation diagram, which consists of different curves. The locus of stable points is usually represented by a solid curve, while a broken curve represents the locus of unstable points (Figure 3.17). Saddle-node bifurcation Consider the equation x˙ = µ − x2 .
(3.33)
∗
For µ = 0, x = 0 is the only equilibrium point, and it is nonhyperbolic since DX(0, 0) = 0. The vector field X(x, 0) is not structurally stable and µ = 0 is a bifurcation value. For µ < 0, there are no equilibrium points, while, √ for µ > 0, there are two hyperbolic equilibrium points x∗ = ± µ. Since √ √ √ √ DX(± µ, µ) = ∓2 µ, µ is asymptotically stable, and − µ is unstable. The phase portraits for Equation (3.33) are shown in Figure 3.14. This type of bifurcation is called a saddle-node bifurcation.43 a
b
c Fig. 3.14. Saddle-node bifurcation. Phase portraits for the differential equation (3.33). (a) µ < 0, (b) µ = 0, (c) µ > 0.
43
Also called tangent bifurcation.
74
3 Differential Equations
Transcritical bifurcation Consider the equation x˙ = µx − x2 .
(3.34)
For µ = 0, x∗ = 0 is the only equilibrium point, and it is nonhyperbolic since DX(0, 0) = 0. The vector field X(x, 0) is not structurally stable, and µ = 0 is a bifurcation value. For µ = 0, there are two equilibrium points, 0 and µ. At the bifurcation point, these two equilibrium points exchange their stability (see Figure 3.15). This type of bifurcation is called a transcritical bifurcation.
a
b
c Fig. 3.15. Transcritical bifurcation. Phase portraits for the differential equation (3.34). (a) µ < 0, (b) µ = 0, (c) µ > 0.
Pitchfork bifurcation Consider the equation x˙ = µx − x3 .
(3.35)
For µ = 0, x∗ = 0 is the only equilibrium point. It is nonhyperbolic since DX(0, 0) = 0. The vector field X(x, 0) is not structurally stable and µ = 0 is a bifurcation value. For µ ≤ 0, 0 is the only equilibrium point, and it is asymptotically stable. For µ > 0, there are three equilibrium points, 0 is √ unstable, and ± µ are both asymptotically stable. The phase portraits for Equation (3.35) are shown in Figure 3.16. This type of bifurcation is called a pitchfork bifurcation.44 44
Sometimes called symmetry breaking bifurcation since it is the bifurcation that characterizes the broken symmetry associated with a second-order phase transition in statistical physics.
3.5 Local bifurcations of vector fields
75
a
b Fig. 3.16. Pitchfork bifurcation. Phase portraits for the differential equation (3.35). (a) µ ≤ 0, (b) µ > 0.
Necessary and sufficient conditions It is possible to derive necessary and sufficient conditions under which a oneparameter family of one-dimensional vector fields exhibits a bifurcation of one of the types just described. These conditions involve derivatives of the vector field at the bifurcation point. Saddle-node bifurcation. If the family X(x, µ) undergoes a saddle-node bifurcation, in the (µ, x)-plane there exists a unique curve of fixed points (see the top panel of Figure 3.17). This curve is tangent to the line µ = 0 at x = 0, and it lies entirely to one side of µ = 0. These two properties imply that dµ (0) = 0, dx
d2 µ (0) = 0. dx2
(3.36)
The bifurcation point is a nonhyperbolic equilibrium; i.e., X(0, 0) = 0, If we assume
∂X (0, 0) = 0. ∂x
∂X (0, 0) = 0, ∂µ
then, by the implicit function theorem, there exists a unique function µ : x → µ(x), such that µ(0) = 0, defined in a neighborhood of x = 0 that satisfies the relation X(x, µ(x)) = 0. To express Conditions (3.36), which imply that (0, 0) is a nonhyperbolic equilibrium point at which a saddle-node bifurcation occurs, in terms of derivatives of X, we have to differentiate the relation X(x, µ(x)) = 0 with respect to x. We obtain dX ∂X ∂X dµ (x, µ(x)) = (x, µ(x)) + (x, µ(x)) (x) = 0. dx ∂x ∂µ dx Hence, ∂X (0, 0) dµ (0) = − ∂x , ∂X dx (0, 0) ∂µ
3 Differential Equations
equilibrium
76
equilibrium
parameter
equilibrium
parameter
parameter
·
Fig. 3.17. Bifurcation diagrams with phase portraits. Bifurcation points are represented by • and hyperbolic equilibrium points by . Stable equilibrium points are on solid curves. Top: saddle-node; middle: transcritical; bottom: pitchfork.
which shows that ∂X (0, 0) = 0 and ∂x implies
∂X (0, 0) = 0, ∂µ
3.5 Local bifurcations of vector fields
77
dµ (0) = 0. dx If we differentiate X(x, µ(x)) = 0 once more with respect to x, we obtain d2 X ∂2X dµ ∂2X (x, µ(x)) + 2 (x, µ(x)) (x) (x, µ(x)) = dx2 ∂2x ∂x∂µ dx 2 2 ∂ X dµ d2 µ ∂X + 2 (x, µ(x)) (x) + (x, µ(x)) 2 (x) = 0. ∂ µ dx ∂µ dx Hence, taking into account the expression of dµ/dx(0) found above, ∂X d2 X d2 µ (0) + (0) = 0, (0, 0) dx2 ∂µ dx2 which yields ∂2X (0, 0) d µ ∂x2 . (0) = − ∂X d2 x (0, 0) ∂µ 2
That is, d2 µ (0) = 0 if dx2 In short, the one-parameter family
∂2X (0, 0) = 0. ∂x2
x˙ = X(x, µ) undergoes a saddle-node bifurcation if X(0, 0) = 0 and
∂X (0, 0) = 0, ∂x
showing that (0, 0) is a nonhyperbolic equilibrium point, and ∂X (0, 0) = 0 and ∂µ
∂2X (0, 0) = 0, ∂2x
which imply the existence of a unique curve of equilibrium points that passes through (0, 0) and lies, in a neighborhood of (0, 0), on one side of the line µ = 0. In order to exhibit a saddle-node bifurcation, the one-parameter family X has to be C k with k ≥ 2. Transcritical bifurcation. As for the saddle-node bifurcation, the implicit function theorem can be used to characterize the geometry of the bifurcation diagram in the neighborhood of the bifurcation point, assumed to be located at (0, 0) in the (µ, x)-plane, in terms of the derivatives of the vector field X. In the case of the transcritical bifurcation, in the neighborhood of the bifurcation point, the bifurcation diagram is characterized (see the middle panel of
78
3 Differential Equations
Figure 3.17) by the existence of two curves of equilibrium points: x = µ and x = 0. Both curves exist on both sides of (0, 0), and the equilibrium points on these curves exchange their stabilities on passing through the bifurcation point. The bifurcation point (0, 0) being nonhyperbolic, we must have X(0, 0) = 0,
∂X (0, 0) = 0. ∂x
Since two curves of equilibrium points pass through this point, ∂X (0, 0) = 0, ∂µ otherwise the implicit function theorem would imply the existence of only one curve passing through (0, 0). To be able to use the implicit function theorem, the result obtained in the discussion of the transcritical bifurcation will guide us. If the vector field X is assumed to be of the form X(x, µ) = x Ξ(x, µ), then x = 0 is a curve of equilibrium points passing through (0, 0). To obtain the additional curve, Ξ has to satisfy Ξ(0, 0). The values of the derivatives of Ξ at (0, 0) are determined using the definition of Ξ; i.e., ⎧ X(x, µ) ⎪ ⎪ , if x = 0, ⎨ x Ξ(x, µ) = ⎪ ⎪ ⎩ ∂X (0, µ), if x = 0. ∂x Hence, ∂Ξ ∂2X (0, 0) = 2 (0, 0), ∂x ∂ x and
If
∂2Ξ ∂3X (0, 0) = 3 (0, 0), 2 ∂ x ∂ x
∂Ξ ∂2X (0, 0) = (0, 0). ∂µ ∂µ∂x ∂Ξ (0, 0) = 0, ∂µ
from the implicit function theorem, it follows that there exists a function µ : x → µ(x), defined in the neighborhood of x = 0, that satisfies the relation Ξ(x, µ(x)) = 0. For the curve µ = µ(x) not to coincide with the curve x = 0 and to be defined on both sides of (0, 0), the function µ has to be such that dµ 0 < (0) < ∞. dx Differentiating with respect to x the relation Ξ(x, µ(x)) = 0, we obtain
3.5 Local bifurcations of vector fields
79
∂Ξ ∂2X (0, 0) (0, 0) 2 dµ (0) = − ∂x . = − ∂2 x ∂Ξ dx ∂ X (0, 0) (0, 0) ∂µ ∂µ∂x In short, the one-parameter family x˙ = X(x, µ) undergoes a transcritical bifurcation if X(0, 0) = 0 and
∂X (0, 0) = 0, ∂x
showing that (0, 0) is a nonhyperbolic equilibrium point, and ∂X (0, 0) = 0, ∂µ
∂2X (0, 0) = 0, ∂2x
and
∂2X (0, 0) = 0, ∂µ∂x
which imply the existence of two curves of equilibrium points on both sides (0, 0) passing through that point. In order to exhibit a transcritical bifurcation, the one-parameter family X has to be C k with k ≥ 2. Pitchfork bifurcation. The derivation of the necessary and sufficient conditions for a one-parameter family of vector fields X(x, µ) to exhibit a pitchfork bifurcation is similar to the derivation for such a family to exhibit a transcritical bifurcation. In the case of a pitchfork bifurcation, the bifurcation diagram is characterized (see the bottom panel of Figure 3.17) by the existence of two curves of equilibrium points: x = 0 and µ = x2 . The curve x = 0 exists on both sides of the bifurcation point (0, 0), and the equilibrium points on this curve change their stabilities on passing through this point. The curve µ = x2 exists only on one side of (0, 0) and at this point is tangent to the line µ = 0. The point (0, 0) being nonhyperbolic, we must have X(0, 0) = 0,
∂X (0, 0) = 0. ∂x
Since more than one curve of equilibrium points passes through this point, ∂X (0, 0) = 0. ∂µ As in the case of a transcritical bifurcation, we assume the vector field X to be of the form X(x, µ) = x Ξ(x, µ) to ensure that x = 0 is a curve of equilibrium points passing through (0, 0). To obtain the additional curve, Ξ has to satisfy Ξ(0, 0). The values of the derivatives of Ξ at (0, 0) may be determined using the definition of Ξ, i.e.,
80
3 Differential Equations
⎧ X(x, µ) ⎪ ⎪ , if x = 0, ⎨ x Ξ(x, µ) = ⎪ ⎪ ⎩ ∂X (0, µ), if x = 0. ∂x Then, ∂Ξ ∂2X (0, 0) = 2 (0, 0), ∂x ∂ x and
∂2Ξ ∂3X (0, 0) = 3 (0, 0), 2 ∂ x ∂ x
∂Ξ ∂2X (0, 0) = (0, 0). ∂µ ∂µ∂x
If
∂Ξ (0, 0) = 0, ∂µ from the implicit function theorem, it follows that there exists a function µ : x → µ(x), defined in the neighborhood of x = 0, that satisfies the relation Ξ(x, µ(x)) = 0. For the curve µ = µ(x) to have, close to x = 0, the geometric properties of µ = x2 , it suffices to have
dµ d2 µ (0) = 0. (0) = 0 and dx dx2 Differentiating the relation Ξ(x, µ(x)) = 0, we therefore obtain ∂Ξ ∂2X (0, 0) (0, 0) 2 dµ (0) = − ∂x =0, =− ∂2 x ∂Ξ dx ∂ X (0, 0) (0, 0) ∂µ ∂µ∂x ∂Ξ ∂3X (0, 0) (0, 0) 2 d µ ∂2x ∂3x (0) =− =0. =− ∂Ξ dx2 ∂2X (0, 0) (0, 0) ∂µ ∂µ∂x In short, the one-parameter family x˙ = X(x, µ) undergoes a pitchfork bifurcation if ∂X (0, 0) = 0, ∂x showing that (0, 0) is a nonhyperbolic equilibrium point, and X(0, 0) = 0 and
∂X (0, 0) = 0 ∂µ
∂2X (0, 0) = 0, ∂2x
∂2X (0, 0) = 0, ∂µ∂x
and
∂3X (0, 0) = 0, ∂3x
which imply the existence of two curves of equilibrium points passing through (0, 0), one being defined on both sides and the other one only on one side of this point. In order to exhibit a pitchfork bifurcation, the one-parameter family X has to be C k with k ≥ 3.
3.5 Local bifurcations of vector fields
81
Example 19. Bead sliding on a rotating hoop. Consider a bead of mass m sliding without friction on a hoop of radius R (Figure 3.18). The hoop rotates with angular velocity ω about a vertical axis. The acceleration due to gravity is g. If x is the angle between the vertical downward direction and the position vector of the bead, the Lagrangian of the system is L(x, x) ˙ =
1 1 mR2 x˙ 2 + mω 2 R2 sin2 x + mgR cos x, 2 2
and the equation of motion is mR2 x ¨ = mω 2 R2 sin x cos x − mgR sin x or R¨ x = ω 2 R sin x cos x − g sin x. Introducing the reduced variables
g R x
m Fig. 3.18. Bead sliding on a vertical rotating hoop without friction.
τ=
g t, R
µ=
R 2 ω , g
the equation of motion takes the form d2 x = sin x(µ cos x − 1). dτ 2
(3.37)
This equation may be written as a first-order system. If we put x1 = x and x2 = x, ˙ where now the dot represents derivation with respect to reduced time τ , Equation (3.37) is replaced by the system
82
3 Differential Equations
x˙ 1 = x2 ,
x˙ 2 = sin x1 (µ cos x1 − 1).
(3.38)
The state of the system represented by x = (x1 , x2 ) belongs to the cylinder [0, 2π[×R. In this phase space, there exist four equilibrium points (0, 0),
(π, 0),
(± arccos µ−1 , 0).
At these equilibrium points, the Jacobian of the system J(x∗1 , x∗2 ) is given by 0 1 0 1 J(0, 0) = , J(π, 0) = , µ−1 0 µ+1 0 ⎡ ⎤ 0 1 J(± arccos µ−1 , 0) = ⎣ 1 − µ2 ⎦ . 0 µ If 0 < µ < 1, (0, 0) is Lyapunov stable but not asymptotically stable, (π, 0) is unstable, and (± arccos µ−1 , 0) do not exist. For µ slightly greater than 1, (0, 0) is unstable, and the two equilibrium points (± arccos µ−1 , 0) are stable. The nonhyperbolic equilibrium point (x, µ) = (0, 1) is a pitchfork bifurcation point. For 0 < µ < 1, the bead oscillates around the point x = 0, while for µ slightly greater than 1, it oscillates around either x = arccos µ−1 or x = − arccos µ−1 . Since the Lagrangian is invariant under the transformation x → −x, this system exhibits a symmetry-breaking bifurcation similar to those characterizing second-order phase transitions in some simple magnetic systems. 3.5.2 Equivalent families of vector fields In a one-dimensional system, Conditions (3.32) are necessary but not sufficient for the system to exhibit a bifurcation. For instance, the equation x˙ = µ − x3 does not exhibit a bifurcation at the equilibrium point (x∗ , µ∗ ) = (0, 0). A simple analysis shows that, for all real µ, the equilibrium point x∗ = (µ)1/3 is always asymptotically stable. In such a case, it is said that the family of vector fields X(x, µ) does unfold the singularity in X(x, 0). More precisely, any local family, X(x, µ), at (x∗ , µ∗ ) is said to be an unfolding of the vector field X(x, µ∗ ). When X(x, µ∗ ) has a singularity at x = x∗ , X(x, µ)is referred to as an unfolding of the singularity. Definition 12. Two local families X(x, µ) and Y(y, ν) are said to be equivalent if there exists a continuous mapping (x, µ) → h(x, µ), defined in a neighborhood of (x∗ , µ∗ ), satisfying h(x∗ , µ∗ ) = y∗ , such that for each µ, x → h(x, µ) is a homeomorphism that exhibits the topological equivalence45 of the flows generated by X(x, µ) and Y(y, ν). 45
On topological equivalence, see Remark 2.
3.5 Local bifurcations of vector fields
83
Example 20. The family of vector fields X(x, µ) = µx − x2 can be written X(x, µ) = If y = h(x, µ) = x −
1 2
µ2 µ 2 . − x− 4 2
µ, then, for each µ,
y˙ = x˙ =
µ2 µ2 µ 2 − y 2 = Y (y, ν), = − x− 4 2 4
where ν = 14 µ2 . Hence, the two families X(x, µ) = µx−x2 and Y (y, ν) = ν−y 2 are topologically equivalent. 3.5.3 Hopf bifurcation If the differential equation x˙ = X(x, µ) exhibits a bifurcation of one of the types described above, clearly the two-dimensional system x˙ 1 = X(x1 , µ),
x˙ 2 = −x2
also exhibits the same type of bifurcation. We will not insist. In this section we present a new type of bifurcation that does not exist in a one-dimensional system: the Hopf bifurcation. This type of bifurcation appears at a nonhyperbolic equilibrium point of a two-dimensional system whose eigenvalues of its linear part are pure imaginary. We have already found Hopf bifurcations in two models: the prey-predator model used by Harrison, which predicts the outcome of Luckinbill’s experiment with Didinium and Paramecium qualitatively (Section 2.3), and the van der Pol oscillator (Example 16). The following theorem indicates under which conditions a two-dimensional system undergoes a Hopf bifurcation.46 Theorem 4. Let x˙ = X(x, µ) be a two-dimensional one-parameter family of vector fields (x ∈ R2 , µ ∈ R) such that 1. X(0, µ) = 0, 2. X is an analytic function of x and µ, and 3. Dx X(0, µ) has two complex conjugate eigenvalues α(µ) ± iω(µ) with α(0) = 0, ω(0) = 0, and dα/dµ|µ=0 = 0; then, in any neighborhood U ⊂ R2 of the origin and any given µ0 > 0 there is a µ < µ0 such that the equation x˙ = X(x, µ) has a nontrivial periodic orbit in U . 46
For a proof, see Hassard, Kazarinoff, and Wan [166], who present various proofs and many applications; see also Hale and Ko¸cak [164].
84
3 Differential Equations
Example 21. The van der Pol system revisited. In the case of the van der Pol system (3.26), we have 0 1 Dx X(0, λ) = . −1 λ The eigenvalues of the linear part of the van der Pol system are 12 (λ ± √ i 4 − λ2 ). If λ < 0, the origin is asymptotically stable, and if λ > 0 the origin is unstable. Since the real part of both eigenvalues is equal to 12 λ, all the conditions of Theorem 4 are verified. Example 22. Section 2.3 model. Harrison found that the model described by the system aH P H H ˙ − H = rH H 1 − , K b+H aP P H − cP, P˙ = b+H exhibits a stable limit cycle and is in good qualitative agreement with Luckinbill’s experiment. Using the reduced variables h=
H , H∗
p=
P , P∗
τ = rH t,
k=
K , H∗
β=
b , H∗
c γ= , r
where (H ∗ , P ∗ ) denotes the nontrivial fixed point, the model can be written dh αh ph h − =h 1− , dτ k β+h αp ph dp = − γp, dτ β+h
where αh =
1 1− k
(β + 1) and αp = γ(β + 1).
The Jacobian at the equilibrium point (1, 1) is ⎡ ⎤ 1 k−2−β −1 + ⎢ k(1 + β) k⎥ ⎢ ⎥. ⎣ ⎦ βγ 0 1+β Its eigenvalues are
k − 2 − β ± i 4(k − 1)k 2 βγ − (k − 2 − β)2 . 2k(1 + β)
The equilibrium point is asymptotically stable if k < 2+β. Since the derivative of the real part of the eigenvalues with respect to the parameter k is different from zero at the bifurcation point, all the conditions of Theorem 4 are verified: there exists a limit cycle for k > 2 + β.
3.5 Local bifurcations of vector fields
85
It is often desirable to prove that a two-dimensional system does not possess a limit cycle. The following theorem, known as the Dulac criterion [111], is often helpful to prove the nonexistence of a limit cycle. Theorem 5. Let Ω be a simply connected subset of R2 and D : Ω → R a C 1 function. If the function x → ∇DX(x) =
∂DX1 ∂DX2 + ∂x1 ∂x2
has a constant sign and is not identically zero in Ω, then the two-dimensional system x˙ = X(x) has no periodic orbit lying entirely in Ω. If there exists a closed orbit γ in Ω, Green’s theorem implies ∂DX1 ∂DX2 dx1 dx2 = 0, D(x)X1 (x) dx2 − D(x)X2 (x) dx1 = + ∂x1 ∂x2 γ Γ
where Γ is the bounded interior of γ (∂Γ = γ).47 But if γ is an orbit, we also have D(x) X1 (x) x˙ 2 − X2 (x) x˙ 1 dt = D(x) X1 (x)dx2 − X2 (x)dx1 = 0, γ
γ
where we have taken into account that x˙ i = Xi (x) for i = 1, 2. This contradiction proves the theorem. The function D is a Dulac function. There is no general method for finding an appropriate Dulac function. If D(x1 , x2 ) = 1, the theorem is referred to as the Bendixson criterion [40]. 3.5.4 Catastrophes Catastrophe theory studies abrupt changes associated with smooth modifications of control parameters.48 In this section, we describe one of the simplest types found in many models: the cusp catastrophe. 47
48
Since a closed orbit in R2 does not intersect itself, it is a closed Jordan’s curve and, according to Jordan’s theorem, it separates the plane into two disjoint connected components such that only one of these two components is bounded. The closed Jordan’s curve is the common boundary of the two components. On catastrophe theory, one should consult Thom [328]. On the mathematical presentation of the theory, refer to the recent book of Castrigiano and Hayes [82] with a preface by Ren´e Thom, the founder of the theory. On early applications of the theory, see Zeeman [354]. Many applications of catastrophe theory have been heavily criticized. For example, here is a quotation from the English translation by Wassermann and Thomas of Arnol’d [10], p. 9: I remark that articles on catastrophe theory are distinguished by a sharp and catastrophic lowering of the level of demands of rigour and also of novelty of published results. In [329] Thom expounds his philosophical standpoint and shows that a qualitative approach may offer a subtler explanation than a purely quantitative description.
Fig. 3.19. The bifurcation diagram of the differential equation ẋ = m₁ + m₂x − x³ in the (m₁, m₂) parameter plane: inside the cusp-shaped region there are three equilibria; outside it, a single equilibrium.
Consider the two-parameter family of maps on R
\[
(x, m_1, m_2) \mapsto X(x, m_1, m_2) = m_1 + m_2 x - x^3.
\]
Depending upon the values of the parameters, the differential equation ẋ = X(x, m₁, m₂) has either one or three equilibrium points. Since at bifurcation points the differential equation must have multiple equilibrium points, bifurcation points are the solutions of the system
\[
X(x, m_1, m_2) = 0, \qquad \frac{\partial X}{\partial x}(x, m_1, m_2) = 0;
\]
that is,
\[
m_1 + m_2 x - x^3 = 0, \qquad m_2 - 3x^2 = 0.
\]
Eliminating x, we find 27m₁² − 4m₂³ = 0, which is the equation of the boundary between the domains in the parameter space in which the differential equation has either one or three equilibrium points. It is the equation of a cusp (Figure 3.19).

The bifurcation diagram represented in Figure 3.20 will help us understand the nature of the cusp catastrophe. In this figure, the solid line corresponds to asymptotically stable equilibrium points and the broken line to unstable points. Suppose that the parameter m₂ has a fixed value, say 1. If m₁ > 2/(3√3), there exists only one equilibrium point, represented by point A. This equilibrium is asymptotically stable. As m₁ decreases, the point representing the equilibrium moves along the curve in the direction of the arrow. It passes point E, where nothing special occurs, to finally reach point B. If we try to further decrease m₁, the state of the system, represented by the x-coordinate of B, jumps to the value of the x-coordinate of C, and, for decreasing values of m₁, the asymptotically stable equilibrium points will be
represented by the points on the solid line below C. If now we start increasing m₁, the point representing the asymptotically stable equilibrium will go back to C and proceed up to point D, where, if we try to further increase m₁, it will jump to E and move up along the solid line towards A as the parameter is varied. The important fact is that the state of the system has experienced jumps at two different values of the control parameter m₁. The parameter values at which jumps take place depend upon the direction in which the parameter is varied. This phenomenon, well known in physics, is referred to as hysteresis, and the closed path BCDEB is called the hysteresis loop. It is important to note that the cusp catastrophe, described here, occurs in dynamical systems in which the vector field X is a gradient field; that is, X(x, m₁, m₂) = −∇Φ(x, m₁, m₂), where
\[
\Phi(x, m_1, m_2) = \tfrac{1}{4}x^4 - \tfrac{1}{2}m_2 x^2 - m_1 x.
\]
Fig. 3.20. Hysteresis. The closed path BCDEB is a hysteresis loop.
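A minimal numerical sketch of this hysteresis loop (an added illustration, not part of the original text): holding m₂ = 1 and sweeping m₁ slowly down and then back up, while letting x relax on the flow ẋ = m₁ + m₂x − x³, produces two jumps at different values of m₁, close to ∓2/(3√3) ≈ ∓0.385.

```python
import numpy as np

def relax(x, m1, m2, dt=0.01, steps=2000):
    """Let x relax toward a stable equilibrium of dx/dt = m1 + m2*x - x**3."""
    for _ in range(steps):
        x += dt * (m1 + m2 * x - x**3)
    return x

m2 = 1.0
m1_down = np.linspace(1.0, -1.0, 201)   # sweep the control parameter down...
m1_up = m1_down[::-1]                   # ...then back up

branch_down, branch_up = [], []
x = relax(1.0, m1_down[0], m2)          # start on the upper stable branch
for m1 in m1_down:
    x = relax(x, m1, m2)
    branch_down.append(x)
for m1 in m1_up:
    x = relax(x, m1, m2)
    branch_up.append(x)

jump_down = m1_down[np.argmax(np.abs(np.diff(branch_down)))]
jump_up = m1_up[np.argmax(np.abs(np.diff(branch_up)))]
print(f"downward sweep: jump near m1 = {jump_down:.3f}")
print(f"upward sweep:   jump near m1 = {jump_up:.3f}")
```

The two printed values differ in sign, which is exactly the dependence of the jump on the sweeping direction described above.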
Example 23. Street gang control.49 Street gangs have emerged as tremendously powerful institutions in many communities. In urban ghettos, they may very well be the most important institutions in the lives of a large proportion of adolescent and young adult males. To model the dynamics of a gang population, assume that the growth rate of the gang population size N is given by
\[
\dot N = g(N) - p(N),
\]
where g is the intrinsic growth function and p the police response function, which describes the amount of resources the police devote at each level of the population. It might be understandably objected that coercive methods are not sufficient; social programs and education also play an important role. The expression p(N) should, therefore, represent the amount of resources devoted by society as a whole. Using a slight variant of the logistic function, assume that
\[
g(N) = r(N + N_0)\left(1 - \frac{N}{K}\right),
\]
where r, K, and N₀ are positive constants. Note that the "initial condition" N₀ implies that a zero gang population is not an equilibrium. For the response function p, assume that, as gang membership in a community grows, the society will devote more resources to the problem. p should, therefore, be monotonically increasing, but since resources are limited, the investment will ultimately approach some maximum level. This pattern can be modeled by
\[
p(N) = \frac{aN^\xi}{b + N^\xi},
\]
where a, b, and ξ are positive constants. The parameter a is the maximum response level. This maximum is approached faster for decreasing values of b and ξ.50 The parameter ξ is a measure of how "tough" the society response is. Introducing the dimensionless variables
\[
\tau = rt, \quad n = \frac{N}{K}, \quad n_0 = \frac{N_0}{K}, \quad \alpha = \frac{a}{rK}, \quad \beta = \frac{b}{K^\xi},
\]
the dynamics of the reduced gang population n is modeled by
\[
\frac{dn}{d\tau} = (n + n_0)(1 - n) - \frac{\alpha n^\xi}{\beta + n^\xi}. \tag{3.39}
\]

49 I presented a first version of this model at a meeting of the Research Police Forum organized at Starved Rock (IL) by the Police Training Institute of the University of Illinois at Urbana-Champaign in the early 1990s and a more elaborate version a few weeks later at the Criminal Justice Authority (Chicago, IL). It was prepared in collaboration with Jonathan Crane from the Department of Sociology of the University of Illinois at Chicago and the Institute of Government and Public Affairs, Center for Prevention Research and Development.
The equilibrium points are the solutions to the equation
50 Note the similarity of this street gang control model with the Ludwig-Holling-Jones model of budworm outbreaks, Equation (1.6).
Fig. 3.21. Equilibrium points of the street gang model. The graphs on the left show the intersections of the graphs of N ↦ g(N) and N ↦ p(N); on the right are represented the graphs of n ↦ g(n) − p(n). Top: one stable low-density equilibrium; middle: two stable equilibria, one at low density and one at high density, separated by one unstable equilibrium at an intermediate density; bottom: one stable equilibrium at high density.
\[
(n + n_0)(1 - n) = \frac{\alpha n^\xi}{\beta + n^\xi}. \tag{3.40}
\]
As shown in Figure 3.21, depending upon the values of the parameters, there exist one or three equilibrium points. The equation of the boundary separating the domains in which there exist either one or three equilibrium points is determined by eliminating n between Equation (3.40) and
\[
\frac{d}{dn}\bigl[(n + n_0)(1 - n)\bigr] = \frac{d}{dn}\left(\frac{\alpha n^\xi}{\beta + n^\xi}\right);
\]
that is,
\[
1 - n_0 - 2n = \frac{\xi\alpha\beta n^{\xi - 1}}{(\beta + n^\xi)^2}. \tag{3.41}
\]
Eliminating α between (3.40) and (3.41), we solve for β, and then, replacing the expression of β in one of the equations, we obtain α. The parametric equation of the boundary is then
\[
\alpha = \frac{\xi(n + n_0)^2(1 - n)^2}{(2 - \xi)n^2 + (\xi - 1)(1 - n_0)n + \xi n_0}, \qquad
\beta = \frac{n^{\xi + 1}(1 - n_0 - 2n)}{(2 - \xi)n^2 + (\xi - 1)(1 - n_0)n + \xi n_0}.
\]
It is represented in Figure 3.22. It shows the existence of a cusp catastrophe in the model.
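The parametric boundary is straightforward to tabulate numerically. The short sketch below is an added illustration, not part of the original text; the values n₀ = 0.05 and ξ = 4 are arbitrary choices made here for the example.

```python
import numpy as np

n0, xi = 0.05, 4.0          # hypothetical parameter choices for this sketch

def boundary(n):
    """Parametric point (alpha(n), beta(n)) on the fold curve of the gang model."""
    den = (2 - xi) * n**2 + (xi - 1) * (1 - n0) * n + xi * n0
    alpha = xi * (n + n0)**2 * (1 - n)**2 / den
    beta = n**(xi + 1) * (1 - n0 - 2 * n) / den
    return alpha, beta

# beta(n) >= 0 requires n <= (1 - n0)/2; sample the admissible range
for n in np.linspace(0.05, 0.45, 9):
    a, b = boundary(n)
    print(f"n = {n:.2f}   alpha = {a:.4f}   beta = {b:.6f}")
```

Plotting the tabulated (α, β) pairs reproduces a cusp-shaped curve of the kind shown in Figure 3.22.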
Fig. 3.22. Boundary between the domains in which there are three equilibrium points and one equilibrium point in the (α, β) parameter space.
When dramatic increases in gang populations have occurred, there are usually numerous attempts to try to reverse the rise. Youth employment programs, educational programs in schools, and increases in police resources are common interventions. Such interventions can prevent some individuals from joining the gang, but none of them seem to have had much effect on overall gang populations. This model suggests one possible reason for this. Due to the existence of the hysteresis effect, small-scale interventions to reduce gang activity may have no permanent effect at all. An intervention may temporarily push gang membership to a slightly lower equilibrium, but unless the intervention is large enough to push the population all the way down to the unstable equilibrium, membership will move back up to the high equilibrium. While this implication is pessimistic, the model does suggest a strategy that could succeed. If the intervention is large enough to push gang membership below the unstable equilibrium, the gang population will revert to a low equilibrium, and because of the stability of the low equilibrium, the intervention does not have to be continued once the low level is achieved. Thus a
short-term but high-intensity intervention might succeed where a long-term, low-intensity strategy would fail.
3.6 Influence of diffusion

In all the models discussed so far, it has always been assumed that the various species were uniformly distributed in space. In other words, over the whole territory available to them, individuals were supposed to mix homogeneously. In a variety of problems, the spatial dimension of the environment cannot, however, be ignored. In this section, we study the influence of diffusion on the evolution of various populations51; that is, we investigate the dynamics of populations assuming that the individuals move at random.52 We will discover that diffusion can have a profound effect on the dynamics of populations.

3.6.1 Random walk and diffusion

Consider a random walker who, in a one-dimensional space, takes a step of length ξ during each time interval τ. Assuming that the steps are taken either in the positive or negative direction with equal probabilities, let p(x, t) dx be the probability that the random walker is between x and x + dx at time t. The random walker is between x and x + dx at time t if he was either between x + ξ and x + ξ + dx or between x − ξ and x − ξ + dx at time t − τ. Thus, the function p satisfies the following difference equation
\[
p(x, t) = \tfrac{1}{2}\bigl(p(x + \xi, t - \tau) + p(x - \xi, t - \tau)\bigr). \tag{3.42}
\]
If p is continuously differentiable, and if we assume that τ and ξ are small, then
\[
p(x \pm \xi, t - \tau) = p(x, t) - \frac{\partial p}{\partial t}(x, t)\,\tau + O(\tau^2)
\pm \frac{\partial p}{\partial x}(x, t)\,\xi + \frac{1}{2}\frac{\partial^2 p}{\partial x^2}(x, t)\,\xi^2 + O(\xi^3),
\]
and substituting in (3.42) yields
\[
\frac{\partial p}{\partial t}(x, t) = \frac{\xi^2}{2\tau}\frac{\partial^2 p}{\partial x^2}(x, t) + \frac{1}{\tau}\bigl(O(\tau^2) + O(\xi^3)\bigr).
\]
Hence, if, when both τ and ξ tend to 0, the ratio ξ²/2τ tends to a positive constant D, then p satisfies the diffusion equation
\[
\frac{\partial p}{\partial t} = D\frac{\partial^2 p}{\partial x^2}. \tag{3.43}
\]

51 The standard text on diffusion in ecology is Okubo [272].
52 The interest in random dispersal in populations was triggered by the paper of Skellam [314].
If, at t = 0, the random walker is at the origin, p(x, t) satisfies the initial condition
\[
\lim_{t\to 0} p(x, t) = \delta,
\]
where δ is the Dirac distribution, and for t > 0, the solution of (3.43) is53
\[
p(x, t) = \frac{1}{2\sqrt{\pi D t}}\exp\left(-\frac{x^2}{4Dt}\right). \tag{3.44}
\]
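A quick Monte Carlo check of this result (an illustration added here, not part of the text): simulating many independent walkers with step ξ and time interval τ, the empirical variance of the displacement grows like 2Dt with D = ξ²/2τ, as (3.44) predicts. The numerical values of ξ, τ, and the sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
xi, tau = 0.1, 0.005          # step length and time interval (arbitrary small values)
D = xi**2 / (2 * tau)         # diffusion coefficient xi^2 / (2 tau)
walkers, steps = 5_000, 1000

# each walker takes +xi or -xi with equal probability at every time step
positions = xi * rng.choice([-1, 1], size=(walkers, steps)).cumsum(axis=1)

for n in (250, 500, 1000):
    t = n * tau
    print(f"t = {t:.2f}:  var(x) = {positions[:, n - 1].var():.3f},  2Dt = {2 * D * t:.3f}")
```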
The random variable representing the distance of the random walker from the origin is, therefore, normally distributed. Its mean value is 0, and its variance 2Dt varies linearly with time. This result is important. It is indeed a direct consequence of the fact that p is a generalized homogeneous function of t and x.54

3.6.2 One-population dynamics with dispersal

As a simple example, consider the influence of random dispersal in a two-dimensional space on the evolution of a Malthusian population. Including a diffusion term in the equation for the population density ṅ = an, we obtain
53 If
\[
\hat p(k, t) = \int_{-\infty}^{\infty} p(x, t)\,e^{ikx}\,dx
\]
denotes the Fourier transform of x ↦ p(x, t), we have
\[
\frac{d\hat p}{dt} + Dk^2 \hat p = 0,
\]
and $\hat p(k, 0) = 1$. Hence, $\hat p(k, t) = \exp(-Dk^2 t)$. Therefore,
\[
p(x, t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \exp(-Dk^2 t - ikx)\,dk
= \frac{1}{2\sqrt{\pi D t}}\exp\left(-\frac{x^2}{4Dt}\right).
\]
This particular solution is known as a fundamental solution of the diffusion equation. It can be shown (see Boccara [46]) that, if the initial condition is p(x, 0) = f(x), then the solution of the diffusion equation is simply given by the convolution g ∗ f, where g is a fundamental solution. Fundamental solutions are not unique. In physics, fundamental solutions are referred to as Green's functions.
54 f : Rⁿ → R is a generalized homogeneous function if, for all λ ∈ R,
\[
f(\lambda^{a_1} x_1, \lambda^{a_2} x_2, \ldots, \lambda^{a_n} x_n) \equiv \lambda^{r} f(x_1, x_2, \ldots, x_n),
\]
where a₁, a₂, …, aₙ and r are real constants. Since λ can be any real number, we can replace λ by $1/x_1^{1/a_1}$; f may then be written as a function of the n − 1 reduced variables $x_i/x_1^{a_i/a_1}$ (i = 2, 3, …, n). It is said that $x_i$ scales as $x_1^{a_i/a_1}$. Since the probability density p satisfies the relation p(λ²t, λx) ≡ λ⁻¹p(t, x), x scales as √t.
\[
\frac{\partial n}{\partial t} = D\left(\frac{\partial^2 n}{\partial x_1^2} + \frac{\partial^2 n}{\partial x_2^2}\right) + an. \tag{3.45}
\]
The exponential growth of a spreading population is an acceptable approximation if, initially, the population consists of very few individuals that spread and reproduce in a habitat where natural enemies, such as competitors and predators, are lacking. If we assume that the dispersal is isotropic, the population dispersal is modeled by
\[
\frac{\partial n}{\partial t} = \frac{D}{r}\frac{\partial}{\partial r}\left(r\frac{\partial n}{\partial r}\right) + an, \tag{3.46}
\]
where $r = \sqrt{x_1^2 + x_2^2}$. Introducing the dimensionless variables
\[
\tau = at \quad\text{and}\quad \rho = r\sqrt{\frac{a}{D}}, \tag{3.47}
\]
Equation (3.46) becomes
\[
\frac{\partial n}{\partial \tau} = \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho\frac{\partial n}{\partial \rho}\right) + n. \tag{3.48}
\]
Assuming that, at time t = 0, there are N₀ individuals concentrated at the origin, the solution of (3.48) is
\[
n(\rho, \tau) = \frac{N_0}{4\pi\tau}\exp\left(\tau - \frac{\rho^2}{4\tau}\right). \tag{3.49}
\]
The total number of individuals that, at time t, are at a distance greater than R from the origin is given by
\[
N(R, t) = \int_R^{\infty} n\bigl(r\sqrt{a/D},\, at\bigr)\, 2\pi r\,dr
= N_0 \exp\left(at - \frac{R^2}{4Dt}\right).
\]
For R² = 4aDt², the number of individuals that, at time t, are outside a circle of radius R, equal to N₀, is negligible compared to the total number of individuals N₀e^{at}. Hence, the radius of an approximate boundary of the habitat occupied by the invading species is proportional to t. According to Skellam [314], this result is in agreement with data on the spread of the muskrat (Ondatra zibethica), an American rodent, introduced inadvertently into Central Europe in 1905. The fact that, for the random dispersal of a Malthusian population, R is proportional to t contrasts with simple diffusion, where space scales as the square root of time.
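The linear growth of the invasion front can be read off directly from the expression for N(R, t): solving N(R, t) = N₀ for R gives R(t) = 2√(aD)·t. The short sketch below is an added illustration; the values of a, D, and N₀ are arbitrary.

```python
import math

a, D, N0 = 0.5, 2.0, 100.0    # arbitrary growth rate, diffusion coefficient, initial size

def N_outside(R, t):
    """Number of individuals beyond radius R at time t: N0*exp(a*t - R^2/(4*D*t))."""
    return N0 * math.exp(a * t - R * R / (4 * D * t))

speed = 2.0 * math.sqrt(a * D)            # predicted front speed
for t in (5.0, 10.0, 20.0):
    R = speed * t                         # predicted front position
    print(f"t = {t:4.0f}:  R = {R:6.2f},  N(R, t)/N0 = {N_outside(R, t)/N0:.3f}")
```

The ratio stays equal to 1 at R = 2√(aD)t for every t, confirming that the contour N = N₀ advances at the constant speed 2√(aD).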
3.6.3 Critical patch size

Consider a refuge (i.e., a patch of favorable environment surrounded by a region where survival is impossible). If the population is diffusing, individuals crossing the patch boundary will be lost. The problem is to find the critical patch size ℓ_c such that the population cannot sustain itself against losses from individuals crossing the patch boundary if the patch size is less than ℓ_c but can maintain itself indefinitely if the patch size is greater than ℓ_c. As a simplification, we discuss a one-dimensional problem (i.e., we assume that in two dimensions, the patch Σ is an infinite strip of width ℓ):
\[
\Sigma = \bigl\{(x, y) \mid -\tfrac{\ell}{2} < x < \tfrac{\ell}{2},\ -\infty < y < \infty\bigr\}.
\]
Consider the scaled equation
\[
\frac{\partial n}{\partial t} = \frac{\partial^2 n}{\partial x^2} + n, \tag{3.50}
\]
where a, the growth rate of the population density, has been absorbed in t and √(D/a), the square root of the diffusion coefficient divided by the growth rate, in x (see (3.47)). The condition that survival is impossible outside Σ implies n(x, t) = 0 if x = ±ℓ/2. Assuming that the solution n(x, t) of (3.50), as a function of x, can be represented as a convergent Fourier series, one finds that this solution can be written as
\[
n(x, t) = \sum_{k=1}^{\infty} n_k \exp\!\left[\left(1 - \frac{k^2\pi^2}{\ell^2}\right)t\right]
\sin\!\left[\frac{k\pi}{\ell}\left(x + \frac{\ell}{2}\right)\right]. \tag{3.51}
\]
If ℓ < π, then, for all positive integers k, 1 − k²π²/ℓ² < 0, and n(x, t) given by (3.51) goes to zero exponentially as t tends to infinity. Therefore, for ℓ < π, the strip Σ is not a refuge. If ℓ > π, then, for any initial population density, Σ is a refuge since the amplitude of the first Fourier component of n(x, t) grows without limit when t tends to infinity. The reduced critical size is then ℓ_c = π. In terms of the original space variable, the critical size is √(D/a)·π.

Remark 5. If, for −ℓ/2 ≤ x ≤ ℓ/2, the initial density n(x, 0) is bounded by M, then, for all t > 0 and all |x| ≤ ℓ/2,
\[
n(x, t) \le \frac{4M}{\pi}\sum_{k=0}^{\infty} \frac{1}{2k + 1}
\exp\!\left[\left(1 - \frac{(2k + 1)^2\pi^2}{\ell^2}\right)t\right]
\sin\!\left[\frac{(2k + 1)\pi}{\ell}\left(x + \frac{\ell}{2}\right)\right]. \tag{3.52}
\]
If the growth of the population density obeys the reduced logistic equation ṅ = n(1 − n), Ludwig, Aronson, and Weinberg [216] have shown that it grows
less rapidly than if it were Malthusian and, therefore, goes to zero, as t → ∞, if ℓ < π. If ℓ > π, then, as t → ∞, n(x, t) tends to the solution of u″ + u(1 − u) = 0 satisfying the condition u(±ℓ/2) = 0. In their paper, Ludwig, Aronson, and Weinberg apply their method to the Ludwig-Jones-Holling model of spruce budworm outbreak (Equation (1.4)). They show that, for this system, there exist two critical strip widths. The smaller one gives a lower bound for the strip width that can support a nonzero population. The larger one is the lower bound for the strip width that can support an outbreak.
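A minimal finite-difference check of the critical size ℓ_c = π (an added illustration, not from the text): integrating ∂n/∂t = ∂²n/∂x² + n on a strip of reduced width ℓ with n = 0 at both boundaries, the total population decays when ℓ < π and grows when ℓ > π. The grid size, time horizon, and initial bump are arbitrary choices.

```python
import numpy as np

def evolve(length, nx=100, t_end=4.0):
    """Explicit Euler integration of dn/dt = d2n/dx2 + n with n = 0 at both ends."""
    dx = length / (nx + 1)
    dt = 0.2 * dx * dx                      # small enough for the explicit scheme
    x = np.linspace(dx, length - dx, nx)    # interior grid points only
    n = np.exp(-((x - length / 2) ** 2))    # an arbitrary initial bump
    total0 = n.sum() * dx
    for _ in range(int(t_end / dt)):
        lap = (np.roll(n, 1) + np.roll(n, -1) - 2 * n) / dx**2
        lap[0] = (n[1] - 2 * n[0]) / dx**2          # absorbing boundary on the left
        lap[-1] = (n[-2] - 2 * n[-1]) / dx**2       # and on the right
        n = n + dt * (lap + n)
    return n.sum() * dx / total0

for length in (2.5, np.pi, 4.0):
    print(f"l = {length:.3f}:  population ratio after t = 4:  {evolve(length):.3f}")
```

The ratio should come out well below 1 for ℓ = 2.5, near 1 for ℓ = π, and well above 1 for ℓ = 4, in line with the Fourier argument above.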
3.6.4 Diffusion-induced instability

Since diffusion tends to mix the individuals, it seems reasonable to expect that a system eventually evolves to a homogeneous state. Thus, diffusion should be a stabilizing factor. This is not always the case, and the Turing effect [334], or diffusion-induced instability, is an important exception. Turing's paper, which shows that the reaction (interaction) and diffusion of chemicals can give rise to a spatial structure, suggests that this instability could be a key factor in the formation of biological patterns.55 Twenty years later, Segel and Jackson [311] showed that diffusive instabilities could also appear in an ecological context. Consider two populations N₁ and N₂ evolving according to the following one-dimensional diffusion equations56:
\[
\frac{\partial N_1}{\partial t} = f_1(N_1, N_2) + D_1 \frac{\partial^2 N_1}{\partial x^2}, \tag{3.53}
\]
\[
\frac{\partial N_2}{\partial t} = f_2(N_1, N_2) + D_2 \frac{\partial^2 N_2}{\partial x^2}. \tag{3.54}
\]
f₁(N₁, N₂) and f₂(N₁, N₂) denote the interaction terms, D₁ and D₂ are the diffusion coefficients, and x is the spatial coordinate. We assume the existence of an asymptotically stable steady state (N₁*, N₂*) in the absence of diffusion. To examine the stability of this uniform solution to perturbations, we write
\[
N_1(t, x) = N_1^* + n_1(t, x) \quad\text{and}\quad N_2(t, x) = N_2^* + n_2(t, x). \tag{3.55}
\]
If n₁(t, x) and n₂(t, x) are small, we can linearize the equations obtained upon substituting (3.55) in Equations (3.53) and (3.54). We obtain57
\[
\frac{\partial n_1}{\partial t} = a_{11} n_1 + a_{12} n_2 + D_1 \frac{\partial^2 n_1}{\partial x^2}, \qquad
\frac{\partial n_2}{\partial t} = a_{21} n_1 + a_{22} n_2 + D_2 \frac{\partial^2 n_2}{\partial x^2},
\]
where the constants a_ij, for i = 1, 2 and j = 1, 2, are given by

55 A rich variety of models is discussed in Murray [255].
56 Extension to the more realistic two-dimensional case is straightforward.
57 N₁* and N₂* are such that f₁(N₁*, N₂*) = 0 and f₂(N₁*, N₂*) = 0.
\[
a_{ij} = \frac{\partial f_i}{\partial N_j}(N_1^*, N_2^*).
\]
Solving the system of linear partial differential equations above is a standard application of Fourier transform theory.58 Let
\[
n_1(t, x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat n_1(t, k)\,e^{-ikx}\,dk, \tag{3.56}
\]
\[
n_2(t, x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat n_2(t, k)\,e^{-ikx}\,dk. \tag{3.57}
\]
Replacing (3.56) and (3.57) in the system of linear partial differential equations yields the following system of ordinary linear differential equations
\[
\frac{d\hat n_1}{dt} = a_{11}\hat n_1 + a_{12}\hat n_2 - D_1 k^2 \hat n_1, \qquad
\frac{d\hat n_2}{dt} = a_{21}\hat n_1 + a_{22}\hat n_2 - D_2 k^2 \hat n_2.
\]
This system has solutions of the form
\[
\hat n_1(t, k) = \hat n_1^{\,0}(k)\,e^{\lambda t}, \qquad \hat n_2(t, k) = \hat n_2^{\,0}(k)\,e^{\lambda t},
\]
where λ is an eigenvalue of the 2 × 2 matrix
\[
\begin{bmatrix} a_{11} - D_1 k^2 & a_{12} \\ a_{21} & a_{22} - D_2 k^2 \end{bmatrix}.
\]
For a diffusive instability to set in, at least one of the conditions
\[
a_{11} - D_1 k^2 + a_{22} - D_2 k^2 < 0, \tag{3.58}
\]
\[
(a_{11} - D_1 k^2)(a_{22} - D_2 k^2) - a_{12} a_{21} > 0, \tag{3.59}
\]
should be violated. Since (N₁*, N₂*) is asymptotically stable, the conditions
\[
a_{11} + a_{22} < 0, \tag{3.60}
\]
\[
a_{11} a_{22} - a_{12} a_{21} > 0, \tag{3.61}
\]
are satisfied. From Condition (3.60) it follows that (3.58) is always satisfied. Therefore, an instability can occur if, and only if, Condition (3.59) is violated. If D₁ = D₂ = D, the left-hand side of (3.59) becomes
\[
a_{11}a_{22} - a_{12}a_{21} - Dk^2(a_{11} + a_{22}) + D^2k^4.
\]

58 For a discussion of Fourier transform theory and its applications to differential equations, see Boccara [46], Chapter 2, Section 4.
In this case, (3.59) is always satisfied since its left-hand side is the sum of three positive terms. Thus, if the diffusion coefficients of the two species are equal, no diffusive instability can occur. If D₁ ≠ D₂, Condition (3.59) may be written
\[
H(k^2) = D_1 D_2 k^4 - (D_1 a_{22} + D_2 a_{11})k^2 + a_{11}a_{22} - a_{12}a_{21} > 0.
\]
Since D₁D₂ > 0, the minimum of H(k²) occurs at k² = k_m², where
\[
k_m^2 = \frac{D_1 a_{22} + D_2 a_{11}}{2 D_1 D_2} > 0.
\]
The condition H(k_m²) < 0, which is equivalent to
\[
a_{11}a_{22} - a_{12}a_{21} - \frac{(D_1 a_{22} + D_2 a_{11})^2}{4 D_1 D_2} < 0,
\]
is a sufficient condition for instability. This criterion may also be written
\[
\frac{D_1 a_{22} + D_2 a_{11}}{(D_1 D_2)^{1/2}} > 2\,(a_{11}a_{22} - a_{12}a_{21})^{1/2} > 0. \tag{3.62}
\]
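These conditions are easy to check numerically. The sketch below is an added illustration; the interaction matrix is a made-up activator-inhibitor example (a₁₁ > 0, a₂₂ < 0), not a model from the text, and the diffusion coefficients are chosen so that the inhibitor diffuses much faster than the activator.

```python
import numpy as np

# hypothetical linearized interaction matrix and diffusion coefficients
a11, a12, a21, a22 = 1.0, -2.0, 1.5, -2.0
D1, D2 = 1.0, 20.0

# the homogeneous state must satisfy (3.60) and (3.61)
assert a11 + a22 < 0 and a11 * a22 - a12 * a21 > 0

def H(k2):
    """Left-hand side of condition (3.59); instability sets in where H(k^2) < 0."""
    return D1 * D2 * k2**2 - (D1 * a22 + D2 * a11) * k2 + a11 * a22 - a12 * a21

lhs = (D1 * a22 + D2 * a11) / np.sqrt(D1 * D2)
rhs = 2.0 * np.sqrt(a11 * a22 - a12 * a21)
print(f"criterion (3.62): {lhs:.3f} > {rhs:.3f} ?  ->", lhs > rhs)

k2 = np.linspace(0.0, 2.0, 2001)
print(f"min H(k^2) = {H(k2).min():.3f} at k^2 = {k2[np.argmin(H(k2))]:.3f}")
```

For these values the criterion holds and H(k²) dips below zero near k_m² = (D₁a₂₂ + D₂a₁₁)/(2D₁D₂) = 0.45, so a band of wavenumbers is unstable.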
Since the first term on the left-hand side of (3.62) is a homogeneous function of D1 and D2 , for given interactions between the two species, the occurrence of a diffusion-induced instability depends only on the ratio of the two diffusion coefficients D1 and D2 . From (3.60), a11 and a22 cannot both be positive. If they are both negative, Condition (3.62) is violated, and Condition (3.59) cannot be violated by increasing k; then, necessarily, a11 a22 < 0
(3.63)
a12 a21 < 0.
(3.64)
and, from (3.61), If a11 (resp. a22 ) is equal to zero, then, from (3.60), a22 (resp. a11 ) must be negative, and here again Condition (3.59) cannot be violated by increasing k 2 . Conditions (3.63)59 and (3.64), which are strict inequalities, are useful. An instability can be immediately ruled out if they are not verified.
59
2 This condition also follows from the fact that the expression of km has to be positive.
Exercises

Exercise 3.1 In a careful experimental study of the dynamics of populations of the metazoan Daphnia magna, Smith [316] found that his observations did not agree with the predictions of the logistic model. Using the mass M of the population as a measure of its size, he proposed the model
\[
\dot M = rM\,\frac{K - M}{K + aM},
\]
where r, K, and a are positive constants. Find the equilibrium points and determine their stabilities.

Exercise 3.2 The dimensionless Lotka-Volterra equations (2.4 and 2.5) are
\[
\frac{dh}{d\tau} = \rho h(1 - p), \qquad \frac{dp}{d\tau} = -\frac{1}{\rho}\,p(1 - h),
\]
where h and p denote, respectively, the scaled prey and predator populations, and ρ is a positive parameter. (i) Show that there exists a function (h, p) ↦ f(h, p) that is constant on each trajectory in the (h, p) plane. (ii) Use the result above to find a Lyapunov function defined in a neighborhood of the equilibrium point (1, 1).

Exercise 3.3 In order to develop a strategy for harvesting60 a renewable resource, say fish, consider the equation
\[
\dot N = rN\left(1 - \frac{N}{K}\right) - H(N),
\]
which is the usual logistic population model with an increase of mortality rate as a result of harvesting. H(N) represents the harvesting yield per unit time. (i) Assuming H(N) = CN, where C is the intrinsic catch rate, find the equilibrium population N*, and determine the maximum yield. (ii) If, as an alternative strategy, we consider harvesting with a constant yield H(N) = H₀, the model is
\[
\dot N = rN\left(1 - \frac{N}{K}\right) - H_0.
\]
Determine the stable equilibrium point, and show that when H₀ approaches ¼rK from below, there is a risk for the harvested species to become extinct.

Exercise 3.4 Assume that the system
\[
\dot N_1 = r_1 N_1\left(1 - \frac{N_1}{K_1}\right) - \lambda_1 N_1 N_2 - C N_1, \qquad
\dot N_2 = r_2 N_2\left(1 - \frac{N_2}{K_2}\right) - \lambda_2 N_1 N_2,
\]

60 For the economics of the sustainable use of biological resources, see Clark [89].
is an acceptable model of two competing fish species in which species 1 is subject to harvesting. For certain values of the parameters, we have seen in Example 18 that for C = 0 (no harvesting), this system has an unstable equilibrium point for nonzero values of both populations. In this particular case, what happens when the catching rate C increases from zero? Exercise 3.5 The competition between two species for the same resource is described by the two-dimensional system N˙ 1 = N1 f1 (N1 , N2 ),
N˙ 2 = N2 f2 (N1 , N2 ),
where f1 and f2 are differentiable functions. (i) Show that the slope of the null clines is negative. (ii) Assuming that there exists only one nontrivial equilibrium point (N1∗ , N2∗ ), find the condition under which this equilibrium is asymptotically stable. Exercise 3.6 In experiments performed on two species of fruit flies ( Drosophila pseudoobsura and D. willistini), Ayala, Gilpin, and Ehrenfeld [20] tested 10 different models of interspecific competition, including the Lotka-Volterra model, presented in Example 18, as a special case. They found that the model that gave the best fit was the system θ θ N1 1 N2 2 N2 N1 ˙ ˙ N1 = r1 N1 1 − − a12 − a21 , N2 = r2 N2 1 − , K1 K1 K2 K2 where r1 , r2 , K1 , K2 , θ1 , θ2 , a12 , and a21 are positive constants. Under which condition does this model exhibit an asymptotically stable nontrivial equilibrium point? Exercise 3.7 Consider the two-dimensional system x˙ 1 = x2 ,
x˙ 2 = −x1 + x2 (1 − 3x21 − 2x22 ),
which describes a perturbed harmonic oscillator. Use the Poincar´e-Bendixson theorem to prove the existence of a limit cycle. Hint: Using polar coordinates, show that there exists an invariant bounded set {(x1 , x2 ) ∈ R2 | 0 < r1 < x21 + x22 < r2 }, that does not contain an equilibrium point. Exercise 3.8 The second dimensionless Ludwig-Jones-Holling equation modeling budworm outbreaks Equation (1.6) reads dx x x2 , = rx 1 − − dt k 1 + x2 where x represents the scaled budworm density, and r and k are two positive parameters. (i) Show that, according to the values of r and k, there exist either two or four equilibrium points and study their stabilities. (ii) Determine analytically the domains in the (k, r)-space where this equation has either one or three positive equilibrium points. Show that the boundary between the two domains has a cusp point. Find its coordinates.
Solutions

Solution 3.1 In terms of the dimensionless time and mass variables
\[
\tau = rt \quad\text{and}\quad m = \frac{M}{K},
\]
the Smith model takes the form
\[
\frac{dm}{d\tau} = m\,\frac{1 - m}{1 + am}.
\]
The parameter a is dimensionless. The equilibrium points are m = 0 and m = 1. Since
\[
\frac{d}{dm}\left(m\,\frac{1 - m}{1 + am}\right) = \frac{1 - 2m - am^2}{(1 + am)^2},
\]
m = 0 is always unstable, and m = 1 is asymptotically stable if a > 0.

Solution 3.2 (i) Eliminating dτ between the two equations yields
\[
\frac{dh}{dp} = -\rho^2\,\frac{h(1 - p)}{p(1 - h)}
\quad\text{or}\quad
\frac{1 - h}{h}\,dh = -\rho^2\,\frac{1 - p}{p}\,dp.
\]
Integrating, we find that on any trajectory the function
\[
f : (h, p) \mapsto \rho^2(p - \log p) + h - \log h
\]
is constant. The constant depends upon the initial values h(0) and p(0). (ii) Since f (1, 1) = 1 + ρ2 , define V (h, p) = f (h, p) − (1 + ρ2 ). It is straightforward to verify that, in the open set ]0, ∞[×]0, ∞[, the function V has the following properties V (1, 1) = 0, V (h, p) > 0 for (h, p) = (1, 1), V˙ (h, p) ≡ 0. V is, therefore, a weak Lyapunov function, and the equilibrium point (1, 1) is, consequently, Lyapunov stable. Solution 3.3 (i) It is simpler, as usual, to define dimensionless variables in order to reduce the number of parameters. If τ = rt,
n=
N , K
c=
the equation becomes dn = n(1 − n) − cn. dτ
C , r
The equilibrium point n∗ is the solution of the equation 1 − n − c = 0 ( i.e., n∗ = 1 − c). This result supposes that c < 1 or, in terms of the original parameters, C < r. That is, the intrinsic catch rate C must be less than the intrinsic growth rate r for the population not to become extinct. It is easy to check that n∗ is asymptotically stable. The scaled yield is cn∗ = c(1 − c), and the maximum scaled yield, which corresponds to c = 12 , is equal to 14 , or, in terms of the original parameters, 14 rK. (ii) If the harvesting goal is a constant yield H0 , then, with τ = rt,
n=
N , K
h0 =
H0 , rK
the equation becomes dn = n(1 − n) − h0 . dτ √ The equilibrium√ points are the solutions of n(1−n)−h0 = 0 ( i.e., n∗1 = 21 (1+ 1 − 4h0 ) and n∗2 = 12 (1− 1 − 4h0 )). These solutions are positive numbers if the reduced constant yield h0 is less than 14 . n∗1 and n∗2 are, respectively, asymptotically stable and unstable. 1 When H0 approaches rK, the reduced variable h0 approaches 14 and the distance 4 √ ∗ ∗ |n1 − n2 | = 1 − 4h0 between the two equilibrium points becomes very small. Then, as a result of a small perturbation, n(t) might become less than the unstable equilibrium value n∗1 and, in that case, will tend to zero. Therefore, if H0 is close to 14 rK, there is a risk for the harvested species to become extinct. Solution 3.4 In terms of the dimensionless variables τ = ρ=
r1 , r2
√
r1 r2 t,
n1 =
λ1 K2 α1 = √ , r1 r2
N1 , K1
n2 =
λ2 K1 α2 = √ , r1 r2
N2 , K2 c= √
C , r1 r2
the equations become dn1 = ρ n1 (1 − n1 ) − α1 n1 n2 − cn1 , dτ
dn2 1 = n2 (1 − n2 ) − α2 n1 n2 . dτ ρ
If C = 0 ( i.e., c = 0), referring to Example 18, the nontrivial fixed point ρα2 − 1 α1 − ρ , ρ(α1 α2 − 1) α1 α2 − 1 is a saddle if α1 > ρ
and
ρα2 > 1.
In this case, the equilibrium points (0, 1) and (1, 0) are both asymptotically stable. Depending upon environmental conditions, either species 1 or species 2 is capable of dominating the natural system. Since, in this model, species 1 is harvested we will assume that species 1 is dominant. As a result of a nonzero c value, the asymptotically stable equilibrium ρ−c , 0 , ρ which was equal to (1, 0) for c = 0, moves along the n1 -axis, as indicated by the arrow on the left in Figure 3.23. For a c value such that
\[
\frac{\rho - c}{\rho} = \frac{1}{\rho\alpha_2}, \quad\text{that is,}\quad c = \rho - \frac{1}{\alpha_2},
\]
there is a bifurcation and, for c slightly greater than ρ − 1/α2 , as on the right in Figure 3.23, species 1 does not exist anymore. This extinction is, however, not a direct consequence of harvesting (c is still less than ρ, that is, C < r1 ) but the result of the competitive interaction. For c < ρ − 1/α2 , the only stable equilibrium point is (0, 1), so species 2 should become dominant. But, we assumed that, before harvesting, the state of the system was n∗1 = 1 and n∗2 = 0. In nature this might not be entirely true, a small population n2 could exist in a refuge and could grow according to the equations of the model once a sufficient increase of the catching rate had changed the values of the parameters. n2
Fig. 3.23. Modification of the null cline of species 1 as it is harvested.
This model is a possible explanation of the elimination, in the late 1940s and early 1950s, of the Pacific sardines that have been replaced by an anchovy population.61 Solution 3.5 (i) The equations of the null clines in the (N1 , N2 )-plane are f1 (N1 , N2 ) = 0
and
f2 (N1 , N2 ) = 0,
and
∂f2 dN2 ∂N1 =− ∂f2 dN1 ∂N2
and their slopes are given by ∂f1 dN2 ∂N1 =− ∂f1 dN1 ∂N2
for N˙ 1 = 0,
for N˙ 1 = 0.
Since all the partial derivatives, which represent limiting effects of each species on itself or on its competitor, are negative in a competition model, both slopes are negative. That is, each population is a decreasing function of the other, as should be the case for a system of two competing species. 61
See Clark [89].
(ii) The nontrivial equilibrium point (N1∗ , N2∗ ), if it exists, is the unique solution, in the positive quadrant, of the system f1 (N1 , N2 ) = 0,
f2 (N1 , N2 ) = 0.
This equilibrium point is asymptotically stable if the eigenvalues of the Jacobian matrix ⎤ ⎡ ∗ ∂f1 ∗ ∗ ∗ ∂f1 ∗ ∗ N (N , N ) N (N , N ) 1 1 2 1 1 2 ⎥ ⎢ ∂N1 ∂N2 J =⎣ ⎦ ∗ ∂f2 ∗ ∗ ∗ ∂f2 ∗ ∗ N2 (N1 , N2 ) N2 (N1 , N2 ) ∂N1 ∂N2 have negative real parts, that is, if tr J < 0
and
det J > 0.
All partial derivatives being negative, the condition on the trace is automatically satisfied. The only condition to be satisfied for the equilibrium point to be asymptotically stable is, therefore, ∂f1 ∂f2 ∂f1 ∂f2 (N1∗ , N2∗ ) (N1∗ , N2∗ ) > (N1∗ , N2∗ ) (N1∗ , N2∗ ). ∂N1 ∂N2 ∂N2 ∂N2 That is, a system of two competitive species exhibits a stable equilibrium if, and only if, the product of the intraspecific growth regulations is greater than the product of the interspecific growth regulations. This result has been given without proof by Gilpin and Justice [143]. It has been established by Maynard Smith [234] assuming, as suggested by Gilpin and Justice, that, at the point of intersection of the two null clines, the slope of the null cline of the species plotted along the x-axis is greater than the slope of the null cline of the species plotted along the y-axis. This property of the null clines is a consequence of the position of the stable equilibrium point found by Ayala [19] in his experimental study of two competing species of Drosophila. Solution 3.6 In terms of the dimensionless variables τ =
√
r1 r2 t,
ρ=
r1 , r2
n1 =
N1 , K1
n2 =
N2 , K2
α12 = a12
K2 , K1
α21 = a21
K1 , K2
the Ayala-Gilpin-Ehrenfeld model becomes dn1 = ρ (1 − nθ11 − α12 n2 ), dτ
dn1 1 = (1 − nθ22 − α21 n1 ). dτ ρ
Assuming the existence of a nontrivial equilibrium point, from the general result derived in the preceding exercise, this point is asymptotically stable if the condition ∂f1 ∗ ∗ ∂f2 ∗ ∗ ∂f1 ∗ ∗ ∂f2 ∗ ∗ (n1 , n2 ) (n1 , n2 ) > (n1 , n2 ) (n1 , n2 ), ∂n1 ∂n2 ∂n2 ∂n2 where f1 (n1 , n2 ) = ρ (1 − nθ11 − α12 n2 ) is satisfied; that is, if
and
f2 (n1 , n2 ) =
1 (1 − nθ22 − α21 n1 ), ρ
θ1 θ2 (n∗1 )θ1 −1 (n∗2 )θ2 −1 > α12 α21 ,
where n∗1 and n∗2 are the dimensionless coordinates of the equilibrium point.
Fig. 3.24. Phase portrait of the perturbed harmonic oscillator. Dots show initial values. Solution 3.7 In polar coordinates, the equations become r˙ = r sin2 θ(1 − 2r2 − r2 cos2 θ), θ˙ = −1 + sin θ cos θ(1 − 2r2 − r2 cos2 θ). For r = 21 , the first equation shows that r˙ =
1 4
sin2 θ(1 −
1 2
cos θ) ≥ 0
with equality only for θ = 0 and θ = π. The first equation also implies that r˙ ≤ r sin2 (1 − 2r2 ). √ Hence, for r = 1/ 2, r˙ ≤ 0, with equality only for θ = 0 and θ = π. These two results prove that, for any point x in the annular domain x ∈ R2 12 < x < √12 ,
the trajectory {ϕt (x) | t > 0} remains in D. Since the only equilibrium point of the planar system is the origin, which is not in D, D contains a limit cycle. The phase portrait of this system is represented in Figure 3.24. Solution 3.8 (i) Equilibrium points are the solutions of the equation x x2 f (x; r, k) = rx 1 − = 0. − k 1 + x2
This is an algebraic equation of degree 4 that has either two or four real solutions. The graph of the function f : x → rx(1 − x/k) − x2 /(1 + x2 ) represented in Figure 3.25 shows that the equilibrium point x0 = 0, which exists for all values of r and k, is always unstable (positive slope). Therefore, if there are only two equilibrium points, the nonzero
Fig. 3.25. Typical graph of the function x → rx(1 − x/k) − x2 /(1 + x2 ) when there exist four equilibrium points. For k = 12 and r = 0.5, the equilibrium points are located at x0 = 0, x = 0.704, xu = 1.794, and xh = 9.502. equilibrium point is stable, and if there exist four equilibrium points (as in Figure 3.25) there exist two stable equilibrium points x and xh at, respectively, low and high density separated by an unstable point xu . (ii) The boundary between the domains in the (k, r)-space in which there exist either one or three positive equilibrium points is determined by expressing that the equation x x r 1− (3.65) = k 1 + x2 has a double root. This occurs when x x d d ; 1− = r dx k dx 1 + x2 that is, if
1 − x2 r . (3.66) = k (1 + x2 )2 Solving (3.65) and (3.66) for k and r, we obtain the parametric representation of the boundary of the domain in which Equation (3.65) has either one or three positive solutions (see Figure 3.26): −
2x3 , x2 − 1 2x3 r= . (1 + x2 )2
k=
This model exhibits a cusp catastrophe. At the cusp point, the derivatives of k and r with respect to x both vanish. Its coordinates are k = 33/2 and r = 33/2 /8 ( i.e., k = 5.196 . . . and r = 0.650 . . .). Note that lim k(x) = ∞,
x→∞
lim r(x) = 0,
x→∞
and lim k(x) = ∞,
x→1
lim r(x) =
x→1
1 . 2
Fig. 3.26. Boundary between domains in (k, r)-space in which there exist either one or three positive equilibrium points.
4 Recurrence Equations
In this chapter, we study dynamical models described by recurrence equations of the form xt+1 = f (xt ), (4.1) where x, representing the state of the system, belongs to a subset J of Rn , f : J → J is a map, and t ∈ N0 . Since many notions are common to differential equations and recurrence equations, this chapter will be somewhat shorter than the preceding one. As for differential equations, models formulated in terms of recurrence equations such as (4.1) ignore the short-range character of the interactions between the elements of a complex system.
4.1 Iteration of maps Let f be a map defined on a subset J of Rn ; in what follows, for the iterates of f , we shall adopt the notation f t+1 = f ◦ f t ,
where
f1 = f
and t ∈ N0 (f 0 is the identity). If f is a diffeomorphism, then f −1 is defined and the definition of f t above remains valid for all t ∈ Z. In applications, the map f is seldom invertible.1 In the previous chapter on differential equations, a flow has been defined as a one-parameter group on the phase space. That is, if the phase space is an open set U of Rn , for all t ∈ R,2 the flow is a mapping ϕt : U → U 1
2
In the mathematical literature, diffeomorphisms are favored. Among the different motivations, Smale [315] mentions “its natural beauty” and the fact that “problems in the qualitative theory of differential equations are present in their simplest form in the theory of diffeomorphisms.” Or sometimes only for t ∈ R+ . In this case, the flow is a one-parameter semigroup.
such that the group property holds: ϕ0 is the identity and ϕt+s = ϕt ◦ ϕs (Relation (3.3)). In the case of maps, the analogue of the flow is the mapping f t : J → J, defined only for t ∈ Z or t ∈ N0 .3 The forward orbit of f through x is the set {f t (x) | t ∈ N0 } oriented in the sense of increasing t. If f is invertible, the backward orbit is the set {f −t (x) | t ∈ N0 }. In this case the (whole) orbit is the set {f t (x) | t ∈ Z}. A point x∗ ∈ J is an equilibrium point or a fixed point if f (x∗ ) = x∗ . The orbit of an equilibrium point is the equilibrium point itself. A point x∗ ∈ J is a periodic point if f τ (x∗ ) = x∗ for some nonzero τ . The smallest positive value of τ is the period of the point and is usually denoted by T . A periodic point of period T is a fixed point of f T . The set {f t (x∗ ) | t = 0, 1, . . . , T − 1} is a periodic orbit or a T -point cycle. A periodic orbit consists of a finite number of points. Example 24. Graphical analysis. In the case of one-dimensional maps, there exists a simple graphical method to follow the successive iterates of an initial point x0 . First plot the graphs of x → f (x) and x → x. Since the sequence of iterates is generated by the equation xt+1 = f (xt ), the iterate of the initial value x0 is on the graph of f at (x0 , f (x0 )), that is, (x0 , x1 ). The horizontal line from this point intersects the diagonal at (x1 , x1 ). The vertical line from this point intersects the graph of f at (x1 , f (x1 )); that is, (x1 , x2 ). Repeating this process generates the sequence (x0 , x1 ), (x1 , x1 ), (x1 , x2 ), (x2 , x2 ), (x2 , x3 ), . . . The equilibrium point x∗ is located at the intersection of the graphs of the two functions. The diagram (Figure 4.1) that consists of the graphs of the functions x → f (x) and x → x and the line joining the points of the sequence above is called a cobweb. It clearly shows if the sequence of iterates of the initial point x0 does or does not converge to the equilibrium point. Definition 13. Two maps f and g defined on J ⊆ Rn are said to be conjugate if there exists a homeomorphism h such that h◦f =g◦h or
h ◦ f ◦ h−1 = g.
This relation implies that, for all t ∈ N, h ◦ f t ◦ h−1 = gt . 3
This mapping is sometimes called a cascade in the Russian literature on dynamical systems; see, for instance, Anosov and Arnold [7].
Fig. 4.1. Cobwebs in the vicinity of equilibrium points. The equilibrium point in the left diagram is asymptotically stable (|f (x∗ )| < 1), while the equilibrium point in the right diagram is unstable (|f (x∗ )| > 1). Dots represent initial values.
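The cobweb construction of Example 24 translates directly into a few lines of code (an added sketch, not part of the text; the logistic map and the parameter value 2.8 are arbitrary choices made here).

```python
def cobweb_points(f, x0, n):
    """Return the cobweb vertices (x0,x1), (x1,x1), (x1,x2), ... for x_{t+1} = f(x_t)."""
    pts, x = [], x0
    for _ in range(n):
        y = f(x)
        pts.append((x, y))      # vertical segment up to the graph of f
        pts.append((y, y))      # horizontal segment over to the diagonal
        x = y
    return pts

f = lambda x: 2.8 * x * (1.0 - x)        # a logistic map with a stable fixed point
for p in cobweb_points(f, 0.1, 5):
    print(f"({p[0]:.4f}, {p[1]:.4f})")
# the vertices approach the fixed point x* = 1 - 1/2.8 ~ 0.643, where |f'(x*)| < 1
```

Joining the printed vertices in order reproduces a cobweb of the kind shown in the left panel of Figure 4.1.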
As for flows, h takes the orbits f into the orbits of g. In particular, if x∗ is a periodic point of f of period T , then h(x∗ ) is a periodic point of period T of g. This follows from gT (h(x∗ )) = h(f T (x∗ )) = h(x∗ ), which proves that h(x∗ ) is a periodic point of g. To verify that T is the (smallest) period, assume that there exists T0 < T such that gT0 (h(x∗ )) = h(x∗ ). Since gT0 (h(x∗ )) = h(f T0 (x∗ )), this would imply f T0 (x∗ ) = x∗ , which is not true. Example 25. The logistic map f4 and the binary tent map T2 , defined on the interval [0, 1] by
\[
f_4(x) = 4x(1 - x) \quad\text{and}\quad
T_2(x) = \begin{cases} 2x, & \text{if } 0 \le x < \tfrac{1}{2}, \\ 2 - 2x, & \text{if } \tfrac{1}{2} \le x \le 1, \end{cases}
\]
are conjugate. To prove this result, it suffices to show that the function h : x ↦ sin²(πx/2) is a conjugacy. This is indeed the case since
1. h : [0, 1] → [0, 1] is continuous,
2. h(x₁) = h(x₂) ⇒ x₁ = x₂,
3. h([0, 1]) = [0, 1],
4. h′ exists and is positive on ]0, 1[, so h⁻¹ : [0, 1] → [0, 1] exists and is continuous, and
5. (f₄ ∘ h)(x) = 4 sin²(πx/2)(1 − sin²(πx/2)) = sin²(πx) = (h ∘ T₂)(x).
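The conjugacy of Example 25 can also be checked numerically (an added illustration): h ∘ T₂ and f₄ ∘ h agree to machine precision, and h maps an orbit of T₂ onto the corresponding orbit of f₄.

```python
import math

def f4(x):            # logistic map
    return 4.0 * x * (1.0 - x)

def T2(x):            # binary tent map
    return 2.0 * x if x < 0.5 else 2.0 - 2.0 * x

def h(x):             # the conjugacy h(x) = sin^2(pi*x/2)
    return math.sin(math.pi * x / 2.0) ** 2

# check f4(h(x)) == h(T2(x)) on a grid of points in [0, 1]
err = max(abs(f4(h(i / 1000)) - h(T2(i / 1000))) for i in range(1001))
print(f"max |f4(h(x)) - h(T2(x))| on the grid: {err:.2e}")

# h maps an orbit of T2 onto the corresponding orbit of f4
x, y = 0.2, h(0.2)
for _ in range(5):
    x, y = T2(x), f4(y)
    print(f"h(T2 orbit) = {h(x):.12f}   f4 orbit = {y:.12f}")
```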
4.2 Stability Definition 14. A fixed point x∗ of the recurrence equation xt+1 = f (xt ) is said to be Lyapunov stable if, for any given positive ε, there exists a positive δ (which depends on ε only) such that f t (x) − x∗ < ε for all x such that x − x∗ < δ and all t > 0. A fixed point x∗ is said to be asymptotically stable if it is Lyapunov stable and if there exists a positive δ such that, for all x such that x − x∗ < δ, limt→∞ f t (x) = x∗ . Fixed points that are not stable are unstable. That is, the iterates of points close to a Lyapunov stable fixed point remain close to it, while the iterates of points close to an asymptotically stable fixed point move towards it as t increases. Since a T -periodic point x∗ is a fixed point of f T , its stability is determined by the behavior of the sequence of iterates: f T (x), f 2T (x), f 3T (x), . . ., where x is close to x∗ . Definition 15. If a metric d is defined on the phase space S, a map f : S → S is said to be contracting if there exists a positive real λ < 1 such that, for any pair (x, y) ∈ S × S, d f (x), f (y) ≤ λd(x, y). (4.2) Iterating inequality (4.2) yields d(f t (x), f t (y)) ≤ λd(f t−1 (x), f t−1 (y)) ≤ · · · ≤ λt d(x, y),
(4.3)
so lim d(f t (x), f t (y)) = 0.
t→∞
This result means that the asymptotic behavior of all points is the same. On the other hand, as a consequence of the triangular inequality, for s > t, d(f s (x), f t (x)) ≤ d(f s (x), f s−1 (x)) + d(f s−1 (x), f s−2 (x)) + · · · + d(f t+1 (x), f t (x)) ≤ λs−1 d(f (x), x) + λs−2 d(f (x), x) + · · · + λt d(f (x), x) ≤
λt d(f (x), x), 1−λ
(4.4)
that is, lim d(f s (x), f t (x)) = 0. For a given x ∈ S, the sequence of iterates f t (x) t∈N is a Cauchy sequence.4 Therefore, if the space S is complete, the sequence f t (x) is convergent, and t→∞
4
A sequence (xn ) is a Cauchy sequence if, for any positive ε > 0, there exists a positive integer n0 (ε) such that d(xn , xn+k ) < ε for all n > n0 (ε) and k > 0. A metric space is said to be complete if every Cauchy sequence is convergent.
from (4.3) the limit of the sequence is the same for all x ∈ S. Denote this limit by x∗ . For any x and any positive integer t, using again the triangular inequality, we obtain d(f (x∗ ), x∗ ) ≤ d(f (x∗ ), f t+1 (x)) + d(f t+1 (x), f t (x)) + d(f t (x), x∗ ) ≤ (1 + λ)d(f t (x), x∗ ) + λt d(f (x), x). Since limt→∞ d(f t (x), x∗ ) = 0 and limt→∞ λt = 0, we find that f (x∗ ) = x∗ , which shows that the limit x∗ of the sequence f t (x) is a fixed point of the recurrence equation (4.1). If, in the inequality (4.4) we take the limit s → ∞, we find that λt d(f (x), x), d(f t (x), x∗ ) ≤ 1−λ that is the iterates of all points x ∈ S converge exponentially to the fixed point x∗ . The considerations above prove the following fundamental result: Theorem 6. Let f : S → S be a contracting map defined on a complete metric space. Then the sequence of iterates (f t (x))t∈N converges exponentially to the unique fixed point x∗ of f . Example 26. Newton’s method. Any solution of the equation f (x) = 0 is a solution of the equation f (x) x=x− g(x) for any function g that is not equal to zero in the vicinity of the solution. In order to optimize the convergence of the iterates of the function defined by the right-hand side of the equation above towards the root x∗ of f (x) = 0, the absolute value of the derivative of this function at x = x∗ should be as small as possible. From f (x) f (x∗ ) f (x∗ )g (x∗ ) d x− + =1− 2 , dx g(x) g(x∗ ) g(x∗ ) ∗ x=x
it follows that, if we take g(x) = f (x), d dx
f (x) x− g(x)
The iteration xt+1 = xt −
= 0. x=x∗
f (xt ) f (xt )
is known as Newton’s method for solving the equation f (x) = 0. For instance, the numerical values of the solutions of the equation x2 − 2 = 0 (which are √ xt ± 2) are the asymptotically stable equilibrium points of the map x → − 2 1 5 . Starting from the initial value x = 3, the first iterates are xt 1.833333333333, 1.46212121212194, 1.4149984298, 1.414213780047, 1.414213562373, . . . . The convergence towards the equilibrium point is very fast. After only five √ iterations, we obtain the value of 2 with 12 exact digits after the decimal point. Theorem 6 has many important consequences.6 For instance, in the case of linear maps, if all the eigenvalues of a linear map A : Rn → Rn have absolute values less than one, the map is contracting and the iterates of every point converge to the origin exponentially. In the case of a nonlinear recurrence equation xt+1 = f (xt ), as for nonlinear differential equations, under certain conditions, the stability of an equilibrium point x∗ can be determined by the eigenvalues of the linear operator Df (x∗ ). Definition 16. Let x∗ ∈ S be an equilibrium point of the recurrence equation xt+1 = f (xt ); x∗ is said to be hyperbolic if none of the eigenvalues of the Jacobian matrix Df (x∗ ) has modulus equal to one. The linear function x → Df (x∗ )x is called the linear part of f at x∗ . The following result is the Hartman-Grobman theorem for maps. It is the analogue of Theorem 1 for flows. Theorem 7. If x∗ is a hyperbolic equilibrium point of the recurrence equation xt+1 = f (xt ), the mapping f in the neighborhood of x∗ is C 0 conjugate to the linear mapping Df (x∗ ) in the neighborhood of the origin. Therefore, if Df (x∗ ) has no eigenvalues with modulus equal to 1, the stability of the equilibrium point x∗ of the nonlinear recurrence equation xt+1 = f (xt ) can be determined from the study of the linear recurrence equation yt+1 = Df (x∗ )yt , where y = x − x∗ . If Df (x∗ ) has eigenvalues with modulus equal to 1, this is not the case. The following examples illustrate the various possibilities. 5
6
√ √ The restrictions of the map to the open intervals ] 2, ∞[ and ] − ∞, − 2[ are obviously contracting. If the initial point √ x0 does not lie in√these intervals, then √ its first iterate f (x√ 2, or in ]−∞, − 2[, 0 ) = x1 is either in ] 2, ∞[, for 0 < x0 < for −∞ < x0 < − 2. For more details, refer to Katok and Hasselblatt [181].
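Returning to Example 26, the Newton iteration is a one-line recurrence. The sketch below (added here, not from the text) reproduces the iterates quoted above for f(x) = x² − 2 starting from x₀ = 3.

```python
def newton(f, fprime, x0, steps):
    """Iterate x_{t+1} = x_t - f(x_t)/f'(x_t) and yield the successive iterates."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        yield x

f = lambda x: x * x - 2.0
fprime = lambda x: 2.0 * x
for x in newton(f, fprime, 3.0, 5):
    print(f"{x:.12f}")
```

The first iterate is 1.833333333333 and the fifth already agrees with √2 to twelve decimal places, illustrating the quadratic convergence of the method.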
Fig. 4.2. Typical graph of a function f in a one-population model of the form Nt+1 = f(Nt).
Example 27. One-population models. If a species breeds only at a particular time of the year, whether adults do or do not survive to breed in the next season has an important effect on their population dynamics. Due to the existence of breeding seasons, the population growth may be described by a recurrence equation of the form Nt+1 = f (Nt ).7 A reasonable function f should satisfy the following two conditions: • the image f (N ) of any positive N should be positive, and • f should be increasing for small N and decreasing for large N . The graph of a function f satisfying these two conditions is represented in Figure 4.2. A cobweb analysis of single-population models is presented in Figure 4.3. As a first model, consider Nt Nt+1 = Nt exp r 1 − , (4.5) K where r and K are positive constants. If nt denotes the dimensionless population Nt /K, we have nt+1 = nt exp (r(1 − nt )) .
(4.6)
There are two equilibrium points 0 and 1. Since the values of the derivative of the function n → n exp(r(1 − n)) at n = 0 and n = 1 are, respectively, er and 1 − r, 0 is always unstable and 1 is asymptotically stable if 0 < r < 2. Another example is that of Hassel [167]: Nt+1 =
7
rNt a , Nt 1+ K
Various examples are given in May and Oster [231].
(4.7)
Fig. 4.3. Cobweb analysis of single-population models. Left figure: nt+1 = nt exp(r(1 − nt )) for r = 1.8 and n0 = 0.1; the equilibrium point n∗ = 1 is asymptotically stable. Right figure: nt+1 = rnt /(1 − nt )a for r = 12, a = 4, and n0 = 1.5; the equilibrium point n∗ = r1/a − 1 is asymptotically stable.
where r, K, and a are constants greater than 1. If nt denotes the dimensionless population Nt /K, we have nt+1 =
rnt . (1 + nt )a
(4.8)
There are two equilibrium points, 0 and r1/a − 1. Since the values of the derivative of the function n → rn/(1 + n)a at n = 0 and n = r1/a − 1 are, respectively, r and 1 − a(1 − r−1/a ), 0 is always unstable and r1/a − 1 is asymptotically stable if 0 < a(1 − r−1/a ) < 2. Example 28. Host-parasitoid models. A parasitoid is an insect having a lifestyle intermediate between a parasite and a usual predator. Parasitoid larvae live inside their hosts, feeding on the host tissues and generally consuming them almost completely. If we assume that the host has discrete nonoverlapping generations and is attacked by a parasitoid during some interval of its life cycle, a simple model is Ht+1 = rHt e−aPt , Pt+1 = cHt (1 − e−aPt ),
(4.9) (4.10)
where Ht and Pt are, respectively, the host and parasitoid populations at time t, and r, a, and c are positive constants. This model, whose interest is essentially historical, was proposed by Nicholson and Bailey [268] in 1935. The term eaPt represents, at time step t, the fraction of hosts that escape detection. They are the only ones that can live and reproduce. For the parasitoid, the factor 1 − eaPt is the probability that, at time t, a host is discovered. Each discovered host gives rise to c new parasitoids at the next time step.
The model has two equilibrium points, (0, 0) and (H ∗ , P ∗ ), where H∗ =
r log r , ac(r − 1)
P∗ =
log r . a
The nontrivial fixed point exists only for r > 1. The eigenvalues of the Jacobian matrix re−aP −arHe−aP c(1 − e−aP ) acHe−aP at (0, 0) are 0 and r. For (H ∗ , P ∗ ) to exist, r must be greater than 1, and (0, 0) is unstable. At the equilibrium point (H ∗ , P ∗ ) the eigenvalues are 1 r − 1 + log r ± i 4(r − 1)r log r − (r − 1 + log r)2 . 2(r − 1) The modulus of these two complex conjugate eigenvalues is equal to r log r . r−1 For r > 1, it is greater than 1, and (H ∗ , P ∗ ) is unstable (see the left panel of Figure 4.4). Nicholson and Bailey showed numerically that the oscillations of the populations of the two species seemed to increase without limit. They mentioned that such a behavior “is certainly not compatible with what we observe to happen.” To eliminate the unrealistic behavior of the Nicholson-Bailey model, Beddington, Free, and Lawton [38] introduced a density-dependent self-regulation for the host. Their model is Ht+1 = Ht exp r(1 − Ht /K) − aPt , (4.11) Pt+1 = cHt 1 − exp(−aPt ) . (4.12)
Fig. 4.4. Host-parasitoid models. Small dots represent initial states and bigger dots • equilibrium points. For the Nicholson-Bailey model (left figure), the nontrivial equilibrium point is unstable for all values of the parameters (here r = 1.2, a = 0.001, and c = 1). For the reduced Beddington-Free-Lawton model (right figure), there exists in the (r, q) parameter space a domain in which the nontrivial fixed point is stable (here r = 1 and q = 0.5).
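The contrast between the two host-parasitoid models is easy to reproduce numerically (an added sketch, not part of the text). The Nicholson-Bailey parameters are those quoted in the caption of Figure 4.4; for the second model the reduced Beddington-Free-Lawton form (4.13)-(4.14), introduced just below, is used with r = 1 and q = 0.5; the initial conditions are arbitrary.

```python
import math

def nicholson_bailey(r, a, c, H, P, steps):
    """Iterate the Nicholson-Bailey model (4.9)-(4.10); return the host trajectory."""
    hosts = []
    for _ in range(steps):
        H, P = r * H * math.exp(-a * P), c * H * (1.0 - math.exp(-a * P))
        hosts.append(H)
    return hosts

def beddington(r, q, h, p, steps):
    """Iterate the reduced Beddington-Free-Lawton model (4.13)-(4.14); return scaled hosts."""
    alpha = r * (1.0 - q)
    gamma = 1.0 / (1.0 - math.exp(-alpha))
    hosts = []
    for _ in range(steps):
        h, p = h * math.exp(r * (1.0 - q * h) - alpha * p), gamma * h * (1.0 - math.exp(-alpha * p))
        hosts.append(h)
    return hosts

nb = nicholson_bailey(1.2, 0.001, 1.0, H=900.0, P=110.0, steps=80)
print("Nicholson-Bailey host range, steps 0-39 :", round(min(nb[:40])), "-", round(max(nb[:40])))
print("Nicholson-Bailey host range, steps 40-79:", round(min(nb[40:])), "-", round(max(nb[40:])))

bfl = beddington(1.0, 0.5, h=1.3, p=0.8, steps=400)
print("Beddington-Free-Lawton scaled host, last value:", round(bfl[-1], 4))
```

The host range of the Nicholson-Bailey model should widen from the first window of iterations to the second, reflecting the growing oscillations, while the reduced Beddington-Free-Lawton host should settle at the scaled fixed point h* = 1.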
In the absence of parasitoids, the host population evolves according to the recurrence equation (4.5), which has a stable equilibrium for 0 < r < 2. To simplify the discussion of this model, we introduce reduced variables. If (H ∗ , P ∗ ) is the nontrivial equilibrium point, let h=
H , H∗
p=
P , P∗
q=
H∗ . K
If, in the presence of parasitoids, the host population in the steady state is nonzero, it is expected to be less than the carrying capacity K. The scaled parameter q is a measure of the extent to which the host population at equilibrium is reduced with respect to K. The evolution of the scaled populations h and p is the solutions of the equations ht+1 = ht exp(r(1 − qht ) − αpt ), pt+1 = γht (1 − exp(−αpt )),
(4.13) (4.14)
where the scaled constants α and γ depend on the two independent scaled parameters r and q. Their expressions satisfy the relations α = r(1 − q),
γ=
1 . 1 − e−α
The Jacobian matrix at the nontrivial equilibrium point (h∗ , p∗ ) = (1, 1) is ⎡ ⎤ 1 − qr −r(1 − q) J(1, 1) = ⎣ r(1 − q) ⎦ . 1 er(1−q) − 1 Its eigenvalues are the solutions to the quadratic equation λ2 − tr J(1, 1) λ + det J(1, 1) = 0.
(4.15)
The equilibrium point (1, 1) is asymptotically stable if the solutions of this equation 1 T ± T 2 − 4D , λ1,2 = 2 where T = tr J(1, 1), D = det J(1, 1), are such that
T ± T 2 − 4D < 2.
(4.16)
To find the conditions under which (4.16) is satisfied, we have to consider two cases: 1. If T 2 − 4D 0, the eigenvalues are real and the stability conditions (4.16) √ may be written (T ± √ T 2 − 4D)2 < 4. It is easy to verify that, solving for T the equations (T ± T 2 − 4D)2 = 4, we find T = ±(1 + D). Hence, if |T | < 1 + D, (4.16) is satisfied. The trace and the determinant of the Jacobian J(1, 1) should, therefore, satisfy the conditions | tr J(1, 1)| < 1 + det J(1, 1) < 2
(4.17)
for the eigenvalues, either real or complex, to have a modulus less than 1 (see the right panel of Figure 4.4). Figure 4.5 shows the domain in which the nontrivial fixed point (1, 1) is asymptotically stable. As for models described in terms of differential equations, models represented by maps should not change significantly when the model is slightly modified. A result similar to Theorem 3 can be stated for maps. First we define, as we did for flows, when one can say that two maps are close. In the space M(U ) of all C 1 maps defined on an open set U ⊆ Rn , an ε-neighborhood of f ∈ M(U ) is defined by Nε (f ) = {g ∈ M(U ) | f − g 1 < ε}. A map g that belongs to an ε-neighborhood of f is said to be ε-C 1 -close to f or an ε-C 1 -perturbation of f . In this case, the components (f1 (x), . . . , fn (x)) and (g1 (x), . . . , gn (x)) of, respectively, f and g and their first derivatives are close throughout U . Theorem 8. Let x∗ be a hyperbolic equilibrium point of the C 1 map f ∈ M(U ). Then, there exists a neighborhood V (x∗ ) of x∗ and a neighborhood
N (f ) of f such that each g ∈ N (f ) has a unique hyperbolic equilibrium point y∗ ∈ V (x∗ ) of the same type as x∗ . In this case, the map f is said to be locally structurally stable at x∗ .
4.3 Poincar´ e maps Definition 17. Let ϕt be a flow on the phase space S, generated by the vector field X, and Σ a subset of S of codimension 1 8 such that: 1. For all values of t, every orbit of ϕt meets Σ. 2. If x ∈ Σ, X(x) is not tangent to Σ. Then, Σ is said to be a global cross-section of the flow. Let y ∈ Σ and τ (y) be the least positive time for which ϕτ (y) (y) ∈ Σ; then the map P : Σ → Σ defined by P(y) = ϕτ (y) (y) is the Poincar´e map9 for Σ of the flow ϕt . Example 29. A trivial example. In Example 17, it was shown that, according to the Poincar´e-Bendixson theorem, the two-dimensional system x˙ 1 = x1 − x2 − x1 (x21 + x22 ),
x˙ 2 = x1 + x2 − x2 (x21 + x22 ),
had a limit cycle. Using polar coordinates, this system becomes r˙ = r − r3 ,
θ˙ = 1.
Under this particularly simple form, we can integrate these two decoupled differential equations. We find −1/2 1 −2t e r(t) = 1 + , θ(t) = θ0 + t, r02 − 1 where r0 = r(0) and θ0 = θ(0). If r0 is close to 1, it is clear that the orbit tends to the limit cycle r(t) = 1, which is stable. This example is so simple that we can determine a Poincar´e map explicitly. If Σ is the ray θ = θ0 through the origin, then Σ is orthogonal to the limit cycle r(t) = 1, and the orbit passing through the point (r0 , θ0 ) ∈ Σ at t = 0 intersects Σ again at t = 2π. Thus, the map defined on Σ by −1/2 1 −4π P (r0 ) = 1 + −1 e r02 is a Poincar´e map. r0 = 1 is a fixed point for this map, and it is stable since P (1) = e−4π < 1. 8 9
The codimension of Σ is dim(S) − dim(Σ). Also called the first return map.
If ϕt is the flow generated by the vector field X, taking into account (3.4), we have d ϕτ (x) X(ϕt (x)) = dτ τ =t d = ϕ ϕ (x) dτ t τ τ =0 d = Dx ϕt ϕτ (x) dτ τ =0 (4.18) = Dx ϕt X(x) . If x0 is a point on a periodic orbit of period T , then ϕT (x0 ) = x0 , and from (4.18) it follows that X(x0 ) = X(ϕT (x0 )) = Dx0 ϕT X(x0 ) . That is, X(x0 ) is an eigenvector of the linear operator Dx0 ϕT associated with the eigenvalue 1. If x0 and x1 are two points on a periodic orbit of period T , there exists s such that x1 = ϕs (x0 ). But, we always have ϕT ◦ ϕs (x) = ϕs ◦ ϕT (x). Hence, taking the derivative of both sides of the relation above at x0 , we obtain Dx1 ϕT Dx0 ϕs = Dx0 ϕs Dx0 ϕT , which shows that the linear operators Dx1 ϕT and Dx0 ϕT are (linearly) conjugate by Dx0 ϕs . From the considerations above, it follows that if γ is a periodic orbit of the flow ϕt in Rn and x0 ∈ γ, the n eigenvalues of the linear operator Dx0 ϕT are 1, λ1 , λ2 , . . . , λn−1 . The n − 1 eigenvalues λ1 , λ2 , . . . , λn−1 are called the characteristic multipliers of the periodic orbit. Since Dx1 ϕT and Dx0 ϕT are conjugate, these eigenvalues do not depend upon the point x0 on the orbit. If, for all i = 1, 2, . . . , n − 1, |λi | = 1, the orbit is said to be hyperbolic. If the absolute values of all the characteristic multipliers are less than 1, the orbit is a periodic attractor, if the absolute values of all the characteristic multipliers are greater than 1, the orbit is a periodic repeller, and if the hyperbolic orbit is neither a periodic attractor nor a periodic repeller, it is a periodic saddle. It can be shown that the eigenvalues of the derivative of the Poincar´e map at a point x0 on a periodic orbit γ are the characteristic multipliers of the periodic orbit γ.10 This result reduces the problem of the stability of a periodic orbit of a flow to the problem of the stability of a fixed point of a map. 10
See Robinson [299].
120
4 Recurrence Equations
4.4 Local bifurcations of maps The bifurcation theory of equilibrium points of maps is similar to the theory for vector fields. We shall, therefore, only highlight the differences when they occur. Let (x, µ) → f (x, µ) be a C k 11 r-parameter family of maps on Rn . If (x∗ , µ∗ ) is an equilibrium point (i.e., x∗ = f (x∗ , µ∗ )), is the stability of the equilibrium point affected as µ is varied? If the equilibrium point is hyperbolic (i.e., if none of the eigenvalues of the Jacobian matrix Df (x∗ , µ∗ ) has unit modulus), then for values of µ sufficiently close to µ∗ , the stability of the equilibrium point is not affected. Moreover, in this case, from the implicit function theorem, it follows that there exists a unique C k function x : µ → x(µ) such that, for µ sufficiently close to µ∗ , f (x(µ), µ) = x(µ) with x(µ∗ ) = x∗ . Since the eigenvalues of the Jacobian matrix Df x(µ), µ are continuous functions of µ, for µ sufficiently close to µ∗ , none of the eigenvalues of this Jacobian matrix have unit modulus. Hence, equilibrium points close to (x∗ , µ∗ ) are hyperbolic and have the same type of stability as (x∗ , µ∗ ). If the equilibrium point (x∗ , µ∗ ) is nonhyperbolic (i.e., if some eigenvalues of Df (x∗ , µ∗ ) have a modulus equal to one), for values of µ close to µ∗ , a totally new dynamical behavior can occur. We shall only study the simplest bifurcations that occur at nonhyperbolic equilibrium points; that is, we investigate one-parameter maps f on R such that ∂f f (0, 0) = 0 and (0, 0) = 1 or − 1, ∂x and one-parameter maps f on R2 such that f (0, 0) = 0 and spec Dx (0, 0) = {α ± iω | (α, ω) ∈ R2 , α2 + ω 2 = 1}, where spec(A) is the spectrum of the linear operator A; that is, the set of complex numbers λ such that A − λI is not an isomorphism. 4.4.1 Maps on R Saddle-node bifurcation Consider the one-parameter family of maps x → f (x, µ) = x + µ − x2 . 11
(4.19)
The degree of differentiability has to be as high as needed in order to satisfy the conditions for the family of maps to exhibit a given type of bifurcation. See the necessary and sufficient conditions below.
4.4 Local bifurcations of maps
121
We have
∂f (0, 0) = 1. ∂x In the neighborhood of the nonhyperbolic equilibrium point (0, 0), equilibrium points are solutions to the equation f (0, 0) = 0 and
f (x, µ) − x = µ − x2 = 0. For µ < 0 there are no equilibrium points, while for µ > 0 there are two √ √ √ √ hyperbolic equilibrium points x∗ = ± µ. Since f (± µ, µ) = 1 ∓ 2 µ, µ is √ asymptotically stable, and − µ is unstable. This type of bifurcation is called a saddle-node bifurcation. Phase portraits and the bifurcation diagram for the recurrence equation xt+1 = f (xt , µ) are identical to those obtained for the differential equation (3.33). See Figures 3.14 and 3.17. As for flows, we can derive the necessary and sufficient conditions for a one-parameter map on R to undergo a saddle-node bifurcation. The derivation of these conditions makes use of the implicit function theorem and is similar to what has been done for one-parameter flows on R. A one-parameter family of C k (k ≥ 2) maps (x, µ) → f (x, µ) exhibits a saddle-node bifurcation at (x, µ) = (0, 0) if ∂f f (0, 0) = 0, (0, 0) = 1, ∂x and ∂2f ∂f (0, 0) = 0. (0, 0) = 0, ∂µ ∂x2
Transcritical bifurcation Consider the one-parameter family of maps x → f (x, µ) = x + µx − x2 .
(4.20)
We have
∂f (0, 0) = 1. ∂x In the neighborhood of the nonhyperbolic fixed point (0, 0), equilibrium points are solutions to the equation f (0, 0) = 0 and
f (x, µ) − x = µx − x2 = 0. For all values of µ, there exist two fixed points: 0 and µ. If µ < 0, 0 is stable and µ unstable, while for µ > 0, 0 becomes unstable and µ stable. Hence, there exist two curves of fixed points that exchange their stabilities when passing through the bifurcation point. This type of bifurcation is called a transcritical bifurcation. Phase portraits and the bifurcation diagram for the recurrence
122
4 Recurrence Equations
equation xt+1 = f (xt , µ) are identical to those obtained for the differential equation (3.34). See Figures 3.15 and 3.17. Here again, using the implicit function theorem, it may be shown that a one-parameter family of C k (k ≥ 2) maps (x, µ) → f (x, µ) on R undergoes a transcritical bifurcation at (x, µ) = (0, 0) if ∂f (0, 0) = 1, ∂x
f (0, 0) = 0, and
∂f (0, 0) = 0, ∂µ
∂2f (0, 0) = 0, ∂x2
∂2f (0, 0) = 0. ∂µ∂x
Pitchfork bifurcation Consider the one-parameter family of maps x → f (x, µ) = x + µx − x3 .
(4.21)
We have
∂f (0, 0) = 1. ∂x In the neighborhood of the nonhyperbolic fixed point (0, 0), equilibrium points are solutions to the equation f (0, 0) = 0 and
f (x, µ) − x = µx − x3 = 0. For µ ≤ 0, 0 is the only fixed point and it is asymptotically stable. For µ > 0, √ there are three fixed points, 0 is unstable, and ± µ are both asymptotically stable. The curve x = 0 exists on both sides of the bifurcation point (0, 0), and the fixed points on this curve change their stabilities on passing through this point. The curve µ = x2 exists only on one side of (0, 0) and is tangent at this point to the line µ = 0. This type of bifurcation is called a pitchfork bifurcation. Phase portraits and the bifurcation diagram for the recurrence equation xt+1 = f (xt , µ) are identical to those obtained for the differential equation (3.35). See Figures 3.16 and 3.17. Using the implicit function theorem, it may be shown that a one-parameter family of C k (k ≥ 3) maps (x, µ) → f (x, µ) on R undergoes a pitchfork bifurcation at (x, µ) = (0, 0) if f (0, 0) = 0,
∂f (0, 0) = 1, ∂x
and ∂f (0, 0) = 0, ∂µ
∂2f (0, 0) = 0, ∂x2
∂2f (0, 0) = 0, ∂µ∂x
∂3f (0, 0) = 0. ∂x3
4.4 Local bifurcations of maps
123
Period-doubling bifurcation Consider the one-parameter family of maps x → f (x, µ) = −(1 + µ)x + x3 .
(4.22)
We have
∂f (0, 0) = −1. ∂x The fixed point (0, 0) is a nonhyperbolic fixed point. The other fixed points are the solutions to the equation f (0, 0) = 0 and
f (x, µ) − x = x(x2 − 2 − µ). For all values of µ, 0 is a fixed point; it is stable for −2 < µ < 0 and unstable for√either µ ≤ −2 or µ ≥ 0. For µ ≥ −2, there exist two other fixed points, ± 2 + µ that are both unstable. Hence for µ > 0, the map (x, µ) → f (x, µ) has three fixed points all of them being unstable. This behavior is new; it could not exist for one-dimensional vector fields. The second iterate f 2 of f is (x, µ) → f 2 (x, µ) = (1 + µ)2 x − (2 + 4µ + 3µ2 + µ3 )x3 + 3(1 + m)2 x5 − 3(1 + m)x7 + x9 . It can be verified that f 2 (0, 0) = 0 and
∂f 2 (0, 0) = 1, ∂x
and ∂f 2 (0, 0) = 0, ∂µ
∂2f 2 (0, 0) = 0, ∂x2
∂2f 2 (0, 0) = 2, ∂µ∂x
∂3f 2 (0, 0) = −12. ∂x3
Hence, the one-parameter family of maps (x, µ) → f 2 (x, µ) undergoes a pitchfork bifurcation at the point (0, 0) (i.e., the new fixed points of f 2 are period 2 points of f ). This new type of bifurcation is called a period-doubling bifurcation.12 Using the implicit function theorem, it is possible to find the necessary and sufficient conditions for a one-parameter family of C k (k ≥ 3) maps (x, µ) → f (x, µ) on R to exhibit a period-doubling bifurcation at (x, µ) = (0, 0). From the discussion above, it follows that (0, 0) should be a nonhyperbolic fixed point of f such that f (0, 0) = 0 and
∂f (0, 0) = −1, ∂x
and, at (0, 0), f 2 should exhibit a pitchfork bifurcation; i.e., 12
Also called a flip bifurcation.
124
4 Recurrence Equations
∂f 2 (0, 0) = 0, ∂µ
∂2f 2 (0, 0) = 0, ∂x2
∂2f 2 (0, 0) = 0, ∂µ∂x
∂3f 2 (0, 0) = 0. ∂x3
For a small positive value of the parameter µ, the map f 2 has two fixed √ √ points, which are x∗1 = − µ and x∗2 = µ, such that f (x∗1 , µ) = x∗2
and f (x∗2 , µ) = x∗1 .
Hence, x∗1 and x∗2 are period 2 points of the map f , and the periodic orbit {x∗1 , x∗2 } is asymptotically stable if x∗1 and x∗2 are asymptotically stable fixed points of f 2 . Since ∂f ∂f ∗ ∂f ∗ ∂f 2 ∗ (x1 , µ) = (x1 , µ) (x , µ) ∂x ∂x ∂x ∂x 1 ∂f ∂f ∗ = (x2 , µ) (x∗1 , µ), ∂x ∂x the stability condition is ∂f ∗ (x1 , µ) ∂f (x∗2 , µ) < 1. ∂x ∂x √ √ With x∗1 = − µ and x∗2 = µ, we find
(4.23)
∂f √ ∂f √ ( µ, µ) = (− µ, µ) = −1 + 2µ. ∂x ∂x √ √ Hence, the 2-point cycle { µ, − −µ} is asymptotically stable for µ sufficiently small. Example 30. Period-doubling bifurcation in a one-population model. In Example 27, two different one-population models were presented. In the first model, the reduced population n evolves according to the recurrence equation nt+1 = nt exp(r(1 − nt )). Does this model exhibit a period-doubling bifurcation? Let f (n, r) = n exp(r(1 − n)). Since
∂f (1, 2) = −1, ∂n the nonhyperbolic fixed point (1, 2) may be a period-doubling bifurcation point. Let us check the remaining four other conditions. The second iterate of f is (n, r) → f 2 (n, r) = ner(1−n) exp r(1 − ner(1−n) ) , f (1, 2) = 1 and
and we have
4.4 Local bifurcations of maps
125
2
1.5
1
0.5
0 0
0.5
1
1.5
2
Fig. 4.6. Stable 2-point cycle of the one-population model nt+1 = nt exp(r(1 − nt ) obtained for r = 2.1. A transient of 40 iterations has been discarded.
∂f 2 (1, 2) = 0, ∂r
∂2f 2 (1, 2) = 0, ∂2n
∂2f 2 (1, 2) = 2, ∂r∂n
∂3f 2 (1, 2) = −8. ∂3n
The one-parameter family of maps (n, r) → n exp(r(1−n)) exhibits, therefore, a period-doubling bifurcation at the point (n, r) = (1, 2). If r is slightly greater than 2, the map f (n, r) has two period 2 points, n∗1 and n∗2 , such that f (n∗1 , r) = n∗2 and f (n∗2 , r) = n∗1 . To check the stability of these periodic points, we have to verify that Condition (4.23) is satisfied. If, for example, r = 2.1, we find that (n∗1 , n∗2 ) = (0.629294 . . . , 1.370706 . . .) and ∂f ∂f (0.629294, 2.1) (1.370706, 2.1) = 0.603965. ∂n ∂2 The 2-point cycle {0.629294 . . . , 1.370706 . . .}, shown in Figure 4.6, is asymptotically stable.
4.4.2 The Hopf bifurcation As in the case of families of maps on R, nonhyperbolic points in families of maps on R2 are usually bifurcation points. The various types of bifurcations that may occur for one-dimensional systems may also occur for higherdimensional systems. There exist, however, other possible types of bifurcations. In this section, we discuss a bifurcation that appears when the eigenvalues of the linear part of the two-dimensional part in the neighborhood of a nonhyperbolic equilibrium point cross the unit circle.
126
4 Recurrence Equations
Example 31. Delayed logistic model. One-population models of the form xt+1 = f (xt ) are not adequate for species whose individuals need a certain time to reach sexual maturity. By analogy with the continuous-time equation (1.12), consider the Maynard Smith delayed logistic recurrence equation [233] nt+1 = rnt (1 − nt−1 ). In order to write this second-order equation as a first-order two-dimensional system, let xt = nt−1 and yt = nt . The delayed logistic equation then takes the form xt+1 = yt ,
yt+1 = ryt (1 − xt ).
The map f : (x, y) → (y, ry(1 − x)) has two fixed points, r−1 r−1 , . (0, 0) and r r The eigenvalues of the linear operator at the nontrivial fixed point r − 1 r − 1 Df ,r , r r √ are 12 (1 ± 5 − 4r). If r > 54 , the eigenvalues are complex conjugate. Their modulus is equal to 1 when 1 1 + (4r − 5) = 1, 4
that is, for r = 2,
and, in this case, they are equal to e±iπ/3 (i.e., to two of the sixth roots of unity).13 For r < 2, the nontrivial fixed point is asymptotically stable, hence, for r > 2, we expect the existence of a stable limit cycle. This is the case, as shown in Figure 4.7. The nonhyperbolic point ( 12 , 12 ), 2 is, therefore, a bifurcation point. Because of its connection with the Hopf bifurcation for flows, this bifurcation is also named after Hopf. Theorem 9. Let (x, µ) → f (x, µ) be a one-parameter C k (k ≥ 5) family of maps in R2 such that: 1. For µ near 0, f (0, µ) = 0. 2. For µ near 0, Df (0, µ) has two complex conjugate eigenvalues λ(µ) and λ(µ), with |λ(0)| = 1. 3. λ(0) is not a qth root of unity for q = 1, 2, 3, 4 or 5. d 4. |λ(µ)| > 0 for µ = 0. dµ 13
See Theorem 9 for the importance of this remark.
4.5 Sequences of period-doubling bifurcations
127
0.8
0.6
0.4
0.2 0.2
0.4
0.6
0.8
Fig. 4.7. Stable cycle for the two-dimensional map (x, y) → (y, ry(1 − x)) (delayed logistic model) for r = 2.1. Initial point: (0.3, 0.5), number of iterations: 400.
Then, there exists a smooth change of coordinates that brings f in polar coordinates under the form g(r, θ, µ) = (|λ(µ)|r + a(µ)r3 , θ + b(µ) + c(µ)r2 ), where terms of degree 5 and higher in r have been neglected. a, b, and c are smooth functions of µ. If a(0) < 0 (resp. a(µ) > 0) then, for µ < 0 (resp. µ > 0), the origin is stable (resp. unstable) and, for µ > 0 (resp. µ < 0)), the origin is unstable (resp. stable) and surrounded by an attracting (resp. repelling) invariant circle. We could verify that in the case of the delayed logistic equation of Example 31, the conditions of the theorem are satisfied (for r = 2 the eigenvalues are the sixth root of unity). The proof of this theorem is rather laborious. See either Devaney [102] or Arrowsmith and Place [11].
4.5 Sequences of period-doubling bifurcations When a one-parameter family of maps on R, (x, µ) → f (x, µ), exhibits a period-doubling bifurcation at a nonhyperbolic fixed point (x1 , µ1 ), the family of second iterates (x, µ) → f 2 (x, µ) exhibits at (x1 , µ1 ) a pitchfork bifurcation. Then, when µ is slightly modified, say increased, this bifurcation gives birth to a 2-point cycle, and both periodic points of this cycle are hyperbolic fixed points of f 2 . As µ is further increased, it might happen that the family of second iterates undergoes a period-doubling bifurcation at a nonhyperbolic fixed point (x2 , µ2 ), giving birth to a 2-point cycle for f 2 (i.e., a 4-point cycle for f ). This process may occur repeatedly, generating an infinite sequence of period-doubling bifurcations. The interesting fact is that these sequences possess some universal properties, which are discussed below.
128
4 Recurrence Equations
4.5.1 Logistic model The discrete logistic model, described by the recurrence equation nt+1 = rnt (1 − nt ),
(4.24)
has been extensively studied. It is the simplest nonlinear map, but nevertheless its properties are far from being trivial. In an earlier study of the iteration of quadratic polynomials, Myrberg [256] already mentioned the existence of considerable difficulties: En nous limitant dans notre travail au cas le plus simple non lin´eaire, c’est-`a-dire aux polynˆ omes r´eels du second degr´e, nous observons que mˆeme dans ce cas sp´ecial on rencontre des difficult´es consid´erables, dont l’explication exigera des recherches ult´erieures.14 As a one-population model, Equation (4.24) represents the evolution of a reduced population n = N/K, where K is the carrying capacity. From a purely mathematical point of view, it has the advantage of being extremely simple, but, as a population model, it has the disadvantage of requiring n to belong to the closed interval [0, 1] since, for n ∈ / [0, 1], the sequence of iterates of n tends to −∞. The fixed points of (4.24) are the solutions of n = f (n, r) = rn(1 − n); i.e., n = 0 and n = (r − 1)/r. The derivative of f with respect to n being equal to r(1 − 2n), if r < 1, n = 0 is asymptotically stable and n = (r − 1)/r, which does not belong to [0, 1], is unstable. The nonhyperbolic fixed point (n, r) = (0, 1) is a transcritical bifurcation point. If 1 < r < 3, n = 0 is unstable and n = (r − 1)/r is asymptotically stable. For r = 3, the derivative of f at n = (r − 1)/r = 2/3 is equal to −1. Since f 2 (n, r) = r2 n(1 − n)(1 − (rn(1 − n)), we verify that f 2 (2/3, 3) = 2/3,
∂f 2 (2/3, 3) = 1, ∂r
and ∂f 2 (2/3, 3) = 0, ∂r ∂2f 2 (2/3, 3) = 2, ∂r∂n 14
∂2f 2 (2/3, 3) = 0, ∂2n ∂3f 2 (2/3, 3) = − 108. ∂3n
Limiting ourselves to the simplest nonlinear case; that is, to quadratic real polynomials, it is observed that even in this special case considerable difficulties are encountered, whose explanation will require more work in the future.
4.5 Sequences of period-doubling bifurcations
129
These results show that the nonhyperbolic fixed point (2/3, 3) is a perioddoubling bifurcation point. The map f 2 has four fixed points that are the solutions of the equation f 2 (n, r) = n. Two solutions are already known, namely the two unstable fixed points, n = 0 and n = (r − 1)/r, of f . The remaining two solutions are15 n=
1 (r + 1 ± r2 − 2r − 3). 2r
They are the two components of the 2-point cycle. They are defined only for r ≥ 3. The domain of stability of this cycle is determined by the condition ∂f 1 ∂f 1 2 − 2r − 3), r 2 − 2r − 3), r < 1. (1 + r + (1 + r − r r ∂n 2r ∂n 2r √ That is, √ the 2-point cycle is asymptotically stable if 3 < r < 1 + 6. For r > 1 + 6, the system undergoes an infinite sequence of period-doubling bifurcations. A few iterations of the logistic map for increasing values of r are shown in Figure 4.8. Let (rk )k∈N be the sequence of parameter values at which a period-doubling bifurcation occurs. This sequence is such that the 2k -point cycle is stable for rk < r < rk+1 . Then, if {nk,1 , nk,2 , . . . , nk,2k } denotes the 2k -point cycle, for i = 1, 2, . . . , 2k and rk < r < rk+1 , we have k
f 2 (nk,i , r) = nk,i .
(4.25)
Moreover, for i = 1, 2, . . . , 2k ,16 k
∂f 2 (nk,i , rk ) = +1, ∂n
k
∂f 2 (nk,i , rk+1 ) = −1. ∂n
(4.26)
Numerically, one can easily discover that (rk ) is an increasing bounded sequence of positive numbers. This sequence has, therefore, a limit r∞ . If we assume that the asymptotic behavior of rk is of the form 15
Let n1,1 and n1,2 be these solutions. They are such that f (n1,1 , r) = n1,2 ,
f (n1,2 , r) = n1,1 .
Hence, 1+r 1+r , , n1,1 n1,2 = r r2 are the solutions of the quadratic equation
n1,1 + n1,2 = which shows that n1,1 and n1,2
r2 n2 − r(1 + r)n + 1 + r = 0. 16
The numerical values of the nk,i (i = 1, 2, . . . , 2k ) depend on r.
130
4 Recurrence Equations
0.8 0.65 0.6 0.55
0.7 0.6 4
8
12
16
0.8
0.8
0.6
0.6
4
8
12
16
4
8
12
16
0.4
0.4 4
8
12
16
Fig. 4.8. Iterations of the logistic map f : (n, r) → rn(1 − n) for r = 2.5 (fixed point), r = 3.2 (2-point cycle), r = 3.48 (4-point cycle), and r = 3.55 (8-point cycle). Sixteen iterates are shown after 100 have been discarded.
rk ∼ r∞ −
a , δk
(4.27)
where a and δ are two positive numbers, then, according to (4.27), lim
k→∞
rk − rk−1 = δ; rk+1 − rk
(4.28)
δ is known as the Feigenbaum number. To determine the numerical values of r1 , r2 , r3 , . . . and δ, one proceeds as follows.√We have seen that the exact values of r1 and r2 are, respectively, 3 and 1 + 6. Once the approximate value of r3 is found, r2 − r1 δ1 = r3 − r2 gives an approximate value of δ, which is used to estimate r 4 ≈ r3 +
r3 − r2 . δ1
As k increases, solving Equation (4.25), we locate rk and determine a better value of δ.17 This is the method followed by Feigenbaum [121, 122] who found δ = 4.6692016091029... . 17
The first period-doubling bifurcations occur for the following parameter values:
4.5 Sequences of period-doubling bifurcations
131
4.5.2 Universality The interesting fact, found by Feigenbaum [121], is that the rate of convergence δ of the sequence (rk ) is universal in the sense that it is the same for all recurrence equations of the form xt+1 = f (xt , r) that exhibit an infinite sequence of period-doubling bifurcations, if f is continuous and has a unique maximum xc (such a map is said to be unimodal ) with f (xc )−f (x) ∼ (xc −x)2 . If the order of the maximum is changed, the ratio will also change. Consider the map g(x, a) = 1 − axz , (z > 1). It is easy to verify that, for z = 2, f and g are topologically equivalent; i.e., there exists a map such that ◦ g = f ◦ , where (x) =
1 4
r−
1 2
x + 21 .
The parameters a and r are such that r2 − 2r − 4a = 0. The following table gives the value of δ for different values of the exponent z.
z δ
2 4 6 8 4.669 7.284 9.296 10.948
The existence of an infinite sequence of period-doubling bifurcations for r ≥ 3 can be made plausible by a simple geometric argument. Consider the graphs of f 2 for two parameter values, one slightly less than 3 and the other one slightly greater than 3 (Figure 4.9). They show how the asymptotically stable point for r < 3 splits into an asymptotically stable 2-point orbit for r > 3. Consider now the graph of f 2 for a value of r between r1 and r2 . As shown in Figure 4.10, it has four fixed points. Two of them, 0 and (r − 1)/r, are unstable and the other two correspond to the stable 2-point cycle. In the interval [1/r, (r − 1)/r], the graph of f 2 , after a symmetry about a horizontal axis, is similar to the graph of f for a certain parameter value s. More precisely, the map F , defined by F = L ◦ f 2 ◦ L−1 , (4.29) where L is the linear map L : x →
x − (r − 1)/r , (2 − r)/r
(4.30)
r1 = 3.0,
r2 = 3.449499... ,
r3 = 3.544090... ,
r4 = 3.564407... ,
r5 = 3.568759... ,
r6 = 3.569692... ,
r7 = 3.569891... ,
r8 = 3.569934... .
The limit of the sequence (rk ) is r∞ = 3.5699456... , and the value of a in Relation (4.27) is 2.6327... .
132
4 Recurrence Equations
0.8
0.8
0.75
0.75
0.7
r=2.85
0.7
0.65
0.65
0.6
0.6
0.55
0.55
0.55 0.6 0.65 0.7 0.75 0.8
r=3.15
0.55 0.6 0.65 0.7 0.75 0.8
Fig. 4.9. Graphs of the second iterates of the logistic map for r = 2.85 and r = 3.15 showing how the period-doubling bifurcation at the point (2/3, 3) gives birth to a stable 2-point orbit. 1
1 r=3.3
0.8
0.8 r=2.24
0.6
0.6
0.4
0.4
0.2
0.2
0
0 0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
Fig. 4.10. In the boxed domain, the graph of the second iterate of the logistic map for r = 3.3 is similar to the graph of the logistic map for r = 2.24.
is such that F (0) = F (1) = 0, and if we choose a parameter value s such that f ( 12 , s) = F ( 12 , r), the graphs of (x, s) → f (x, s) and (x, r) → F (x, r) are very close, as shown in Figure 4.11. For r close to r2 , in the boxed domain, the scenario represented in Figure 4.9 will this time occur for f 4 ; that is, the stable fixed point of f 2 in the boxed domain, the asymptotically stable point for r < r2 , splits into an asymptotically stable 2-point cycle for r > r2 . This 2-point cycle for f 2 is a 4-point cycle for f , which is asymptotically stable for r2 < r < r3 , and so on ... . This argument can be made more rigorous using renormalization-group methods.18 18
See Coullet and Tresser [95] and Feigenbaum [121].
4.5 Sequences of period-doubling bifurcations
133
0.5 0.4 0.3 0.2 0.1 0.2
0.4
0.6
0.8
1
Fig. 4.11. Graph of the transform F of f (see text) for r = 3.3 (broken line) compared to the graph of the logistic curve for r = 2.24 (full line). 2
A fixed point x∗ of a map f is said to be superstable if f (x∗ ) = 0.19 In the case of the logistic map f : (n, r) → rn(1 − n), ( 12 , 2) is a superstable fixed point. Superstable cycles are defined in a similar way. In the case of the logistic map, a 2k -point cycle that includes the point 12 is superstable if, for k the parameter value r = Rk , (f 2 ) ( 12 , Rk ) = 0. Since rk < Rk < rk+1 , we have Rk − Rk−1 lim = δ. k→∞ Rk+1 − Rk It is actually easier to calculate δ accurately from the relation above than from (4.28) because, when r is close to rk from above, the 2k -cycle is weakly stable, and the method to determine rk converges slowly. The sequence (dk ), where dk = f 2
k−1
( 12 , Rk ) −
represents the distance between the point point cycle, is such that the limit20 − lim
k→∞
1 2
1 2
and its nearest point on the 2k -
dk =α dk+1
is also a universal constant equal to 2.5029078751 . . . [121]. While Feigenbaum’s universality is quantitative, in 1973 Metropolis, Stein, and Stein [241] had already discovered a qualitative universality of finite limit sets of families of unimodal maps fr defined on the unit interval [0, 1] such that fr (0) = fr (1) = 0.21 19 20
21
Such a point is said to be critical. The minus sign in the definition of α is related to the fact that the nearest point to 12 on the 2k -point cycle is alternatively below and above 12 as k increases. In their paper, Metropolis, Stein, and Stein enumerate a list of conditions the maps should satisfy. In particular, they allow maps such as, for example,
134
4 Recurrence Equations
In order to illustrate this qualitative universality, consider the logistic map (x, r) → fr (x) = rx(1 − x). fr (x) is maximum for x = 12 , and fr ( 12 ) = 0. For a given integer k > 1, solving for r the equation frk ( 12 ) =
1 2
gives the values of the parameter r for which there exists a superstable k-point cycle. For example, if k = 5, there are three different solutions: 3.7389149 . . ., 3.9057065 . . ., and 3.9902670 . . .. In each case, the successive iterates of x = 12 are either greater than or less than 12 . Following Metropolis, Stein, and Stein, according to whether an iterate is greater than or less than 12 , it is said to be of type R or of type L. Using this convention, the itinerary along a superstable 5-point cycle is characterized by a pattern of four letters: 1 2 1 2 1 2
→R →L →R →R → →R →L →L → R → →R →L →L → L →
1 2 1 2 1 2
for r = 3.7389149 . . . , for r = 3.9057065 . . . , for r = 3.9902670 . . . .
Omitting the initial and final points 12 , the patterns are written in the simplified form RLR2 , RL2 R, RL3 . These sequences of R and L symbols are called U -sequences, where U stands for universal. They are universal in the sense that they appear for all unimodal maps in the same order for increasing values of the parameter r. Note that any U -sequence necessarily starts with R, followed by L if the sequence contains more than one symbol; that is, if the period of the superstable cycle is greater than 2. Many one-population models, described by continuous maps with only one quadratic maximum, exhibit universal infinite sequences of period-doubling bifurcations. This is the case, for example, for the map (n, r) → n exp(r(1 − n)) (see Figure 4.12), that undergoes period-doubling bifurcations for the following parameter values22 : r1 = 2,
r2 = 2.526...,
r3 = 2.656...,
r4 = 2.682...,
r∞ = 2.692... .
Since n exp(r(1 − n)) is maximum for x = 1/r, superstable cycles must include this point. The one-population model exhibits superstable 2k -point cycles for the following parameter values: ⎧r ⎪ x, ⎪ ⎪ ⎨a fr (x, a) = r, ⎪ ⎪ ⎪ ⎩ r (1 − x), a
22
if 0 ≤ x ≤ a, if a ≤ x ≤ 1 − a, if 1 − a ≤ x ≤ 1,
where a and r are positive reals such that 1 − a < r < 1. See also Exercise 4.6.
4.5 Sequences of period-doubling bifurcations
135
1.4 1.05 1
1 0.95
0.6 4
8
12
16
1.8
2
1.3
1.4
0.8
0.8
0.3
0.2 4
8
12
16
4
8
12
16
4
8
12
16
Fig. 4.12. Iterations of the first one-population map of Example 27 (n, r) → ner(1−n) for r = 1.8 (fixed point), r = 2.1 (2-point cycle), r = 2.55 (4-point cycle), and r = 2.67 (8-point cycle). Sixteen iterates are shown after 200 have been discarded.
R0 = 1,
R1 = 2.25643 . . . ,
R2 = 2.59352 . . . ,
R3 = 2.671 . . .
It can be verified that the U -sequences corresponding to the 2-, 4-, and 8-point cycles are R, RLR, RLR3 LR, the same as for the superstable cycles of the logistic map. Remark 6. The patterns in Figure 4.13 illustrate an interesting general feature mentioned by Metropolis, Stein, and Stein [241]. If P is a pattern for a given superstable cycle, then the pattern of its first harmonic is P XP , where X = L if P contains an odd number of Rs, and X = R otherwise.
136
Recurrence Equations
2
2
1.5
1.5
1
1
0.5
0.5
0
0 0
0.5
1
1.5
2
2
2
1.5
1.5
1
1
0.5
0.5
0
0.5
1
1.5
2
0
0.5
1
1.5
2
0
0 0
0.5
1
1.5
2
Fig. 4.13. Superstable cycles of the one-population map of Example 27 (n, r) → ner(1−n) for r = 1 (fixed point), r = 2.25643 (2-point cycle), r = 2.59352 (4-point cycle), and r = 2.671 (8-point cycle).
Exercises Exercise 4.1 An acceptable discrete one-population model of sexually reproducing organisms should be such that, if the population density is small, the organisms are sparsely distributed in their habitat, resulting in a low mating rate and a density-dependent growth rate less than 1. This effect is named after Allee [6]. If the population density is large, intraspecific competition is strong, and it is reasonable to assume that, in this case, the density-dependent growth rate has to be less than 1. The model nt+1 = rnxt (1 − nt ), where nt is the population density at time t, and r and x are positive constants, is a generalization of the logistic model. Under which condition(s) does it exhibit the Allee effect? In this case, find its equilibrium points and determine their stabilities.
Exercises
137
Exercise 4.2 The recurrence equation nt+1 = rn2t (1 − nt ) − cnt , is a model of harvesting a species that exhibits the Allee effect. The constant c is the intrinsic catching rate. Show that, for a fixed value of r greater than a critical value rc , the system undergoes a saddle-node bifurcation at a point (n∗ , c∗ ). What does this bifurcation imply from an ecological point of view? Exercise 4.3 In an SIR model for the spread of an infectious disease, based on disease status, the individuals are divided into three disjoint groups: 1. Susceptible individuals are capable of contracting the disease and becoming infective. 2. Infective individuals are capable of transmitting the disease to susceptibles. 3. Removed individuals have had the disease and are dead, have recovered and are permanently immune, or are isolated until recovery and permanent immunity occur. Let St , It , and Rt denote the densities of individuals belonging to the three different groups defined above. If we assume that the interactions responsible for the spread of the disease have a very long range ( i.e., each susceptible individual can be equally infected by any infective individual), it can be shown that the spread of the infectious disease can be modeled by the recurrence equations23 St+1 + It+1 + Rt+1 = ρ,
It+1 = It + St 1 − exp(−iIt ) − rIt ,
Rt+1 = Rt + rIt , where i is a coefficient proportional to the probability for a susceptible individual to be infected, r is the probability for an infective individual to be removed, and ρ is the total initial density. (i) Show that as time t goes to infinity, the limits S∞ , I∞ , and R∞ exist. What is the value of I∞ ? (ii) Show that, as for the Kermack-McKendrick epidemic model (Example 8), there is an epidemic if, and only if, the initial density of susceptible individuals S0 is greater than 23
If each susceptible individual can be equally infected by any infective individual, the probability for a susceptible to be infected can be written i/N , where i is a positive real number and N the total number of individuals in the population. Hence, between times t and t + 1, the density of infected individuals due to infection of susceptible individuals is given by N −1 iIt St 1 − 1 − , N which, in the limit N → ∞ (i.e., in the limit of infinite-range interactions), is equal to St 1 − (1 − exp(−iIt )) . See Boccara and Cheong [49].
138
Recurrence Equations
a critical value. The initial conditions are such that R0 = 0 and I0 S0 . (iii) What is the behavior as a function of t of It if S0 is either less than or greater than its critical value? Exercise 4.4 (i) Extend the previous one-population SIR epidemic model to describe the spread of a sexually transmitted disease among heterosexual individuals. m m m f f f (ii) Show that the limits S∞ , I∞ , R∞ , S∞ , I∞ , and R∞ , where the superscripts m and f refer, respectively, to the male and female populations, exist. What are the values of m f I∞ and I∞ ?
(iii) According to the initial values S0m , I0m , S0f , and I0f and the ratios rm /im and rm /im , discuss the existence of epidemics in one or both populations. Exercise 4.5 Consider an SIS epidemic model; i.e., a model in which, after recovery, infected individuals again become susceptible to catch the disease (as, e.g., with common cold). (i) Assuming, as in Exercise 4.3, that each susceptible individual can be equally infected by any infective individual, derive the recurrence equations satisfied by the densities of susceptible and infected individuals denoted, respectively, by St and It .24 (ii) Show that this model exhibits a transcritical bifurcation between an endemic state ( i.e., a state in which the stationary density of infected individuals I∞ is nonzero), and a disease-free state in which I∞ is equal to zero. Exercise 4.6 Determine the first period-doubling bifurcations of the one-parameter map (n, r) → f (n, r) = −rn log n, which is a discrete version of the classical Gompertz map [147].
24
See Boccara and Cheong [52].
Solutions
139
Solutions Solution 4.1 Let f denote the one-parameter family of maps (n, r) → rnx (1 − n). For intermediate values of n, the growth rate rnx−1 (1−n) should be larger than 1; otherwise extinction is automatic. If x > 1, the value of the population density maximizing the growth rate is (x − 1)/x, and the corresponding value of the growth rate is r
(x − 1)x−1 . xx
This quantity is larger than 1 if r>
xx . (x − 1)x−1
If, as a simple example, we assume that x = 2, then r has to be greater than 4, and, in this case, the recurrence equation nt+1 = f (nt , r) has three fixed points: √ √ r+ r−4 √ , 0 and 2 r which are asymptotically stable, and √
√ r− r−4 √ , 2 r
which is always unstable (Figure 4.14). The nonhyperbolic equilibrium point ( 12 , 4) is a saddle-node bifurcation point.
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2 0
0 0
0.2 0.4 0.6 0.8
1
0
0.2 0.4 0.6 0.8
1
Fig. 4.14. Cobwebs for the map f : (n, r) → rnx (1 − n) for x = 2 and r = 5. The asymptotically stable fixed points are 0 and 0.723607, and the unstable fixed point is 0.276393. Starting from an initial population density n0 = 0.25 (left cobweb), the population becomes extinct, while for an initial population density n0 = 0.3, n converges to the high-density asymptotically stable fixed point.
140
Recurrence Equations
Solution 4.2 If 0 < r < 4, 0 is the only fixed√point of the map n → rn2 (1−n). If r > 4, √ √ √ √ √ there exist three fixed points 0 and ( √r ± r − 4)/2 r; 0 and ( r + r − 4)/2 r √ √ are asymptotically stable, while ( r − r − 4)/2 r is unstable. The harvesting model makes sense only if r > rc = 4. If c > 0 and r − 4c − 4 > 0, the map n → rn2 (1 − n) − cn has two nontrivial fixed points: √ √ r ± r − 4c − 4 √ n∗± = ; 2 r n∗+ is asymptotically stable and n∗− unstable. Therefore, a saddle-node bifurcation occurs at the nonhyperbolic point (n∗ , c∗ ) = ( 12 , 1− 14 r). From an ecological point of view, if c is less than but close to its bifurcation value (c 1 − 14 ), there is a serious risk of extinction of the harvested species. Solution 4.3 (i) The three recurrence equations show that, as time t increases, St cannot increase and Rt cannot decrease. Since these quantities are bounded, the limits S∞ and R∞ exist. The second equation shows that I∞ also exists and is equal to zero. (ii) By definition, there is an epidemic if I1 > I0 .25 According to the second equation, this occurs if S0 1 − exp(−iI0 ) − rI0 > 0. If the initial conditions are R0 = 0 and I0 S0 , then S0 1 − exp(−iI0 ) − rI0 = I0 (iS0 − r) + O I02 ). Hence, an epidemic occurs if, and only if, S0 >
r . i
This is the threshold phenomenon of Kermack and McKendrick (see Chapter 3, Remark 1). (iii) Since, as t increases, St does not increase, if S0 is less than r/i, for all values of t, St remains less than r/i. Hence, It goes monotonically to zero as t tends to ∞. If S0 > r/i, the density of infective individuals increases as long as St is greater than r/i. Since St cannot increase, when it reaches the value r/i, It reaches its maximum value and then tends monotonically to zero as t tends to ∞. Solution 4.4 (i) Assuming that each susceptible individual can be equally infected by any infective individual (see Footnote 23), the equations describing the spread of a sexually transmitted disease among heterosexuals are m m m St+1 + It+1 + Rt+1 = ρm ,
m It+1 = Itm + Stm 1 − exp(−im Itf ) − rm Itm ,
m = Rtm + rm Itm , Rt+1 f f f St+1 + It+1 + Rt+1 = ρf ,
f It+1 = Itf + Stf 1 − exp(−if Itm ) − rf Itf ,
f Rt+1 = Rtf + rf Itf , 25
Refer to Chapter 3, Footnote 7.
Solutions
141
where ρm and ρf are two constants representing the initial densities of males and females respectively. im (resp. if ) is proportional to the probability for a male (resp. female) to be infected by a female (resp. male). (ii) As for the one-population model, it follows that Stm and Stf cannot increase whereas m m f f Rtm and Rtf cannot decrease. Hence, all the limits S∞ , R∞ , S∞ , R∞ exist. The third m f and sixth equations show that the limits I∞ and I∞ also exist and are both equal to zero. (iii) Due to the coupling between the two populations, a wider variety of situations may occur. For instance, if the initial conditions are R0m = R0f = 0
and
I0m S0m , I0f S0f ,
we have I1m − I0m = im S0m I0f − rm I0m + O (I0f )2 , I1f − I0f = if S0f I0m − rf I0f + O (I0m )2 . Hence, according to the initial values S0m , S0f , I0m , and I0f , we may observe the following behaviors: 1. If im S0m I0f − rm I0m < 0 and if S0f I0m − rf I0f < 0, then I1m < I0m and I1f < I0f . Since Stm and Stf are nonincreasing functions of time, Itm and Itf go monotonically to zero as t tends to ∞. No epidemic occurs. 2. If im S0m I0f − rm I0m > 0 and if S0f I0m − rf I0f > 0, then I1m > I0m and I1f > I0f . The densities of infected individuals in both populations increase as long as the densities of susceptible and infective individuals satisfy the relations im Stm Itf − rm Itm > 0 and if Stf I0m − rf Itf > 0 and then tend monotonically to zero. 3. If im S0m I0f − rm I0m < 0 and if S0f I0m − rf I0f > 0, then I1m < I0m and I1f > I0f . m But, since It+1 depends on Itf , the density of infected males does not necessarily goes monotonically to zero. After having decreased for a few time steps, due to the increase of the density of infected females, it may increase if im Stm Itf − rm Itm becomes positive. The spread of the disease in the female population may trigger an epidemic in the male population, as shown in Figure 4.15. If, however, the increase of the density of infected females is not high enough, then the density of infected males will decrease monotonically, whereas the density of infected females will increase as long as the density of susceptible individuals Stf is greater than rf Itm /if Itm and then tend monotonically to zero. The disease spreads only in the female population, whereas no epidemic occurs among males. 4. If im S0m I0f − rm I0m > 0 and if S0f I0m − rf I0f < 0, our conclusions are similar to the case above. Solution 4.5 (i) As in Exercise 4.3, assuming that each susceptible individual can be equally infected by any infective individual, we can write St+1 + It+1 = ρ,
It+1 = It + St 1 − exp(−iIt ) − rIt ,
where i is a positive real number proportional to the probability for a susceptible individual to catch the disease and r is the probability for an infected individual to recover.
Recurrence Equations
infective densities
142
0.025 0.02 0.015 0.01 0.005 0 0
Fig. 4.15. Time evolution population SIR model when in the other one. im = 0.5, R0m = 0, R0f = 0, ρm = 0.35,
20
40
60 time
80
100
of the densities of infected individuals for the twothe epidemic in one population triggers an epidemic if = 0.9, rm = 0.3, rf = 0.1, I0m = 0.01, I0f = 0.01, ρf = 0.35, and number of iterations = 100.
ρ is the constant total density of individuals. (ii) Eliminating St from the second equation, we obtain It+1 = (1 − r)It + (ρ − It ) 1 − exp(−iIt ) . The steady-state density of infected individuals I∞ therefore satisfies the equation I∞ = (1 − r)I∞ + (ρ − I∞ ) 1 − exp(−iI∞ ) . This equation has the obvious solution I∞ = 0, which characterizes the disease-free state. In the vicinity of the bifurcation point, I∞ is small, and the equation above can be written 2 I∞ = (1 − r + ρi)I∞ − i + 12 ρi2 I∞ + O (I∞ )3 , which shows that, in the (i, r)-plane, the transcritical bifurcation occurs along the line ρi − r = 0. The disease-free state is stable if ρi < r and unstable if ρi > r. The endemic state is stable in the latter case. Solution 4.6 For n ∈ [0, 1] and r > 0, the map f has a quadratic maximum at n = e−1 . For r = e, the maximum reaches the value 1. There exist two fixed points: 0 and e−1/r . The derivative of f is equal to −r(1 + log n), which shows that 0 is always unstable and e−1/r is asymptotically stable for 0 < r < 2. Since 1 1 ∂f √ , 2 = −1, f √ , 2 = 1 and ∂n e e √ the nonhyperbolic point (1/ e, 2) may be a period-doubling bifurcation point. We have to check the remaining conditions. The second iterate of f is (n, r) → f 2 (r, n) = r2 n log n log − rn log n ,
Solutions and we have
1 √ , 2 = 0, e 1 ∂2f 2 √ , 2 = 2, ∂r∂n e ∂f 2 ∂r
143
1 √ , 2 = 0, e ∂3f 2 1 √ , 2 = − 16e. ∂3n e
∂2f 2 ∂2n
The one-parameter family of maps (n, r) → −rn log n exhibits, therefore, a period√ doubling bifurcation at the point (n, r) = (1/ e, 2). It can be shown numerically that the 2-point cycle is asymptotically stable for 2 < r < 2.39536... . For r = 2.39536... , there is a new period-doubling bifurcation, and we find that the 4-point cycle is asymptotically stable for 2.39536... < r < 2.47138... . There is actually an infinite sequence of period-doubling bifurcations. The first period-doubling bifurcations of this infinite sequence occurring for the parameter values: r1 = 2,
r2 = 2.39536...,
r3 = 2.47138...,
r4 = 2.48614....
A few iterations for different values of r are represented in Figure 4.16.
0.6
0.8
0.5
0.6
0.4
0.4 4
8
12
16
0.9
0.9
0.7
0.7
0.5
0.5
0.3
0.3 4
8
12
16
4
8
12
16
4
8
12
16
Fig. 4.16. Iterations of the f : (n, r) → −rn log n for r = 1.4 (fixed point), r = 2.1 (2-point cycle), r = 2.43 (4-point cycle), and r = 2.483 (8-point cycle). 16 iterates are shown after 1000 have been discarded.
5 Chaos
In 1963, Edward Lorenz [212] published a numerical analysis of a simplified model of thermal convection represented by three quadratic differential equations. He discovered that all nonperiodic solutions of his deterministic model were bounded but showed irregular fluctuations. Some thirty years later [213], here is how he described his discovery: ..., while conducting an extensive experiment in the theory of weather forecasting, I had come across a phenomenon that later came to be called “chaos”—seemingly random and unpredictable behavior that nevertheless proceeds according to precise and often easily expressed rules. Earlier investigators had occasionally encountered behavior of this sort, but usually under rather different circumstances. Often they failed to recognize what they had seen, and simply became aware that something was blocking them from solving their equations or otherwise completing their studies. My situation was unique in that, as I eventually came to realize, my experiment was doomed to failure unless I could construct a system of equations whose solutions behaved chaotically. Chaos suddenly became something to be welcomed, at least under some conditions, and in the ensuing years I found myself turning more and more toward chaos as a phenomenon worthy of study for its own sake. The study of chaos can be traced back to Poincar´e [290]. In Science et M´ethode, first published in 1909, he already indicates the possibility for certain systems to be subject to sensitive dependence on initial conditions: Si nous connaissions exactement les lois de la nature et la situation de l’univers a` l’instant initial, nous pourrions pr´edire exactement la situation de ce mˆeme univers `a un instant ult´erieur. Mais, lors mˆeme que les lois naturelles n’auraient plus de secret pour nous, nous ne pourrons connaˆıtre la situation initiale qu’approximativement. Si cela nous permet de pr´evoir la situation ult´erieure avec la mˆeme approximation, c’est tout ce qu’il nous faut, nous disons que le ph´enom`ene a
146
5 Chaos
´et´e pr´evu, qu’il est r´egi par des lois ; mais il peut arriver que de petites diff´erences dans les conditions initiales en engendrent de tr`es grandes dans les ph´enom`enes finaux ; une petite erreur sur les premi`eres produirait une erreur ´enorme sur les derniers. La pr´ediction devient impossible et nous avons le ph´enom`ene fortuit.1 And a few lines below, he shows how these considerations allow an answer to the question: Pourquoi les m´et´eorologistes ont tant de peine a` pr´edire le temps avec quelque certitude ? 2 More than 50 years later, computers were available, and in his seminal paper, Lorenz used almost identical terms to answer the same question: When our results concerning the instability of nonperiodic flow are applied to the atmosphere, which is ostensibly nonperiodic, they indicate that prediction of the sufficiently distant future is impossible by any method unless the present conditions are known exactly. In view of the inevitable inaccuracy and incompleteness of weather observation, precise very-long forecasting would seem to be non-existent. While the kind of complicated dynamics Lorenz discovered cannot exist in one- or two-dimensional systems of differential equations, this is not the case for recurrence equations. The modern theory of deterministic chaos started after the publication, in 1976, of a review article in which Robert May [230] called attention to the very complicated dynamics of some very simple onepopulation models: Not only in research, but also in the everyday world of politics and economics, we would all be better off if more people realized that simple nonlinear systems do not necessarily possess simple dynamical properties. 1
2
If we could know exactly the laws of nature and the situation of the universe at the initial instant, we should be able to predict exactly the situation of this same universe at a subsequent instant. But even when the natural laws should have no further secret for us, we could know the initial situation only approximately. If that permits us to foresee the subsequent situation with the same degree of approximation, this is all we require, we say that the phenomenon has been predicted, that it is ruled by laws. But this is not always the case; it may happen that slight differences in the initial conditions produce very great differences in the final phenomena; a slight error in the former would make an enormous error in the latter. Prediction becomes impossible and we have the fortuitous phenomenon. (English translation [290], p. 397.) Why have the meteorologists such difficulty in predicting the weather with any certainty? (English translation [290], p. 398.)
5.1 Defining chaos
147
5.1 Defining chaos In the previous chapter, we have seen that, in the case of the logistic map (n, r) → f (n, r) = rn(1 − n), there exists an infinite sequence of parameter values (rk )k∈N for which f undergoes a period-doubling bifurcation, a 2k point cycle being asymptotically stable for all r ∈]rk , rk+1 [. The sequence (rk ) has a limit r∞ ≈ 3.599692. A natural question is: What is the dynamics for r > r∞ ? The answer is given by the bifurcation diagram represented in Figure 5.1. The bifurcation diagram has been computed for 300 parameter values equally spaced between 2.5 and 4. For each value of r, 300 iterates are calculated, but only the last 100 have been plotted. 1 0.8 0.6 0.4 0.2
2.6
2.8
3
3.2
3.4
3.6
3.8
4
Fig. 5.1. Bifurcation diagram of the logistic map (n, r) → rn(1 − n). The parameter r, plotted on the horizontal axis, varies from 2.5 to 4, and the reduced population n, plotted on the vertical axis, varies between 0 and 1.
For most values of r ∈]r∞ , 4], the successive iterates of f seem to wander in an apparently random manner. For example, if r = 4, √ as illustrated in Figure 5.2, the trajectory through the initial point n0 = 2 − 1 appears to be dense3 in [0.1]. A numerical experiment cannot, of course, determine whether the trajectory converges to an asymptotically stable periodic orbit of very high period or is dense in an interval. If the trajectory is dense, say in [0.1], we can divide this interval into a certain number of subintervals (bins) and count how many iterates fall into each subinterval. If the number of iterates is high enough, this 3
Let I be an interval of R. A subset J of I is dense in I if the closure of J coincides with I. In other words, any neighborhood of any point in I contains points in J. For example, the set of rational numbers Q is dense in R.
148
5 Chaos 1
0.8
0.6
0.4
0.2
0 0
0.2
0.4
0.6
0.8
1
Fig. 5.2. Cobweb showing √ a sequence of 250 iterates of the map n → 4n(1 − n). The initial value is n0 = 2 − 1 with a precision of 170 significant digits.
could determine, if it exists, an approximate invariant measure,4 as shown in Figure 5.3.
Fig. 5.3. Histogram representing the approximate probability distribution of the iterates of the logistic map f4 : n → 4n(1 − n). The histogram has been obtained from 75,000 iterates distributed in 25 bins.
4
Let f be a map defined on an interval I of R; a measure µ on I is invariant for f if, for any measurable subset E ⊂ I, µ(E) = µ(f −1 (E)).
5.1 Defining chaos
149
Remark 7. As already noticed by Poincar´e, some maps may have a sensitive dependence on initial conditions. This feature, which is a necessary condition for a map to be chaotic (see Definition 20), implies that the precision—i.e., number of significant digits—of each iterate is decreasing. In the case of the map n → 4n(1 − n), this precision decreases, on average, by 0.6 digit/iteration. In order to determine numerically a meaningful histogram, it is therefore essential to choose an initial value with a sufficient number of significant digits.
The considerations above are essentially heuristic, but, at least in the case of the map f4 : n → 4n(1 − n), it is possible to completely understand its dynamics. 5.1.1 Dynamics of the logistic map f4 In Example 25, we have seen that the logistic map f4 and the binary tent map T2 , defined on [0, 1] by
2x, if 0 ≤ x < 12 , f4 (x) = 4x(1 − x) and T2 (x) = 2 − 2x, if 12 ≤ x ≤ 1, are conjugate. This property greatly simplifies the study of the dynamics of f4 . Let 0.x1 x2 x3 . . . be the binary representation of x ∈ [0, 1], i.e., x=
∞ xi i=1
2i
,
where, for all i ∈ N, xi ∈ {0, 1}. The binary representation of T2 (x) is then given by
0.x2 x3 x4 . . . , if 0 ≤ x ≤ 12 , T2 (x) = 0.(1 − x2 )(1 − x3 )(1 − x4 ) . . . , if 12 ≤ x ≤ 1. These formulae are both correct for x = 12 . The binary representation of 12 being either 0.1000 . . . or 0.01111 . . ., the binary representation of T2 (x) is, in both cases, 0.1111 . . ., which is equal to 1. The description of the iterates of x in terms of their binary representation leads to some remarkable results due to James Whittaker [344]. 1- If the binary representation of x is finite (i.e., if there exists a positive integer n such that xi = 0 for all i > n), then after at most n + 1 iterations the orbit of x will reach 0 and stay there. Hence, there exists a dense set of points whose orbit reaches the origin and stays there. 2- If the binary representation of x is periodic with period p, then the orbit of x is periodic with a period equal to p or a divisor of p. For example, for p = 2,
150
5 Chaos
x is either equal to 0.010101 . . . = 13 or 0.101010 . . . = 23 , and we verify that T2 ( 13 ) = 23 and T2 ( 32 ) = 23 . In both cases, the orbit after, at most, one iteration reaches the fixed point 23 (period 1). More generally, suppose that x1+p = x1 . If after p iterations there have been an even number of conjugations (i.e., changes of xi in 1 − xi ), we are back to the initial point x = 0.x1 x2 x3 . . .;,but if, after p iterations, there have been an odd number of conjugations, we are not back to the initial point but to the point whose binary representation is 0.(1 − x1 )(1 − x2 )(1 − x3 ) · · · . However, in this case, T2 (0.(1 − x1 )(1 − x2 )(1 − x3 ) · · · ) = T2 (0.x1 x2 x3 . . .), and after p + 1 iterations we are back to the first iterate T2 (x). Here is an example. Consider x = 0.001110011100111 . . ., whose binary representation is periodic with period 5. We have x = 0.001110011100111 . . . T2 (x) = 0.01110011100111 . . . T22 (x) = 0.1110011100111 . . . T23 (x) = 0.001100011000 . . . T24 (x) T25 (x) T26 (x)
conjugation
= 0.01100011000 . . . = 0.1100011000 . . . = 0.011100111 . . .
this is not x but this is T2 (x).
Using this method, we could show that the binary tent map has periodic orbits of all periods, and the set of all periodic points is dense in [0.1]. T2 and f4 being conjugate, the logistic map f4 has the same property. Regarding the existence of a periodic orbit, an amazing theorem, due to ˇ Sarkovskii, indicates which periods imply which other periods. First, define ˇ among all positive integers Sarkovskii’s order relation by 3 5 7 · · · 2 · 3 2 · 5 · · · 2 2 · 3 22 · 5 · · · 23 · 3 23 · 5 · · · · · · 23 22 2 1. That is, first list all the odd numbers, followed by 2 times the odd numbers, 22 times the odd numbers, etc. This exhausts all the positive integers except the powers of 2 that are listed last in decreasing order. Since is an order ˇ relation, it is transitive (i.e., n1 n2 and n2 n3 imply n1 n3 ). Sarkovskii’s theorem is5 : Theorem 10. Let f : R → R be a continuous map. If f has a periodic orbit of period n, then, for all integers k such that n k, f has also a periodic orbit of period k. 5
ˇ For a proof, see Stefan [326] or Collet and Eckmann [91].
5.1 Defining chaos
151
We said above that the binary tent map T2 and its conjugate, the logistic ˇ map f4 , have periodic orbits of all periods. Following Sarkovskii’s theorem, to prove this result it suffices to prove that either T2 or f4 has a periodic orbit of period 3. In order to generate an orbit of period 3 for f4 , we can start from x = 0.100100100 . . ., which generates an orbit of period 3 for the binary tent map. From the binary representation of x, we find that x satisfies the equation 8x = 4 + x, thus x = 47 . In order to generate the corresponding period 3 orbit for f4 , we have to start from the point h(x) = sin2 (2π/7). All these periodic orbits are unstable. Hence, periodic orbits computed with a finite precision will always present, after a number of iterations depending upon the precision, an erratic behavior. This feature is illustrated in Figure 5.4.
1 0.8 0.6 0.4 0.2 20
40
60
80
100
20
40
60
80
100
1 0.8 0.6 0.4 0.2
Fig. 5.4. Unstable orbit of period 3 of the map x → 4x(1 − x) generated from an initial value with a precision equal to either 16 (top) or 60 (bottom).
152
5 Chaos
3- Now let us turn our attention to initial points whose binary representation is neither finite nor periodic such as, for instance, x = 0.101001000100001 . . .. It is readily verified that, given ε > 0 and k = 0, 1, 2, . . ., there exists a positive integer t such that |T2t (x) − 2−k | < ε. Theorem 11. Let x0 ∈ [0, 1]; then, arbitrarily close to x0 , there exists x ∈ [0, 1] such that the orbit through x {T2t (x) | t ∈ N}, is dense in [0, 1]. Denote s0 ∈ [0, 1], a number whose finite binary representation agrees with sufficiently many initial digits of x0 so that s0 is as close as desired to x0 . Let (sij ) be a sequence of numbers in [0, 1] with finite binary representations, where, for a given number sij , i is the finite number of digits of its binary representation and j its decimal value (0 ≤ j ≤ 2i − 1). For example, s20 = 0.00,
s21 = 0.01,
s22 = 0.10,
s23 = 0.11.
Now define recursively the binary representation of x as follows. First let x = s0 , then to the digits of s0 add either the digits of each element of the sequence (sij ) in the order s10 , s11 , s20 , s21 , s22 , s23 , ... or the digits obtained by conjugation s10 = 0.1, s11 = 0.0, s20 = 0.11, ... according to the following rule. Once the digits of n − 1 elements of either the sequence (sij ) or the sequence (sij ) have been added, if the resulting finite representation contains an even number of conjugations, we add the digits of the nth element of (sij ); otherwise we add the digits of the nth element of (sij ). The infinite binary representation so obtained gives the number x close to x0 . Now for any y ∈ [0, 1] and any positive ε, there is an sij such that 2−i < ε/2 and |sij − y| < ε/2. From this it follows that one of the iterates of x, say s, will have the same initial digits as sij , so |s − y| ≤ |s − sij | + |sij − y|
δ. A map has sensitive dependence on initial conditions if there exist points arbitrarily close to x that eventually separate from x by at least δ under iteration of f . Note, however, that sensitive dependence on initial conditions requires that some points, but not necessarily all points, close to x eventually separate from x under iteration. If a map has sensitive dependence on initial conditions, one has to be very careful when drawing conclusions from numerical determinations of orbits. As a consequence of Theorem 11, the binary tent map T2 and its conjugate f4 are topologically transitive and possess sensitive dependence on initial conditions. 5.1.2 Definition of chaos We are now ready to define chaos. The following definition is one of the simplest. It is due to Devaney [102]. Definition 20. Let S be a set. The mapping f : S → S is said to be chaotic on S if 1. f has sensitive dependence on initial conditions, 2. f is topologically transitive, 3. f has periodic points that are dense in S. And after having chosen this definition, Devaney adds the following comment: To summarize, a chaotic map possesses three ingredients: unpredictability, indecomposability, and an element of regularity. A chaotic system is unpredictable because of sensitive dependence on initial conditions. It cannot be broken down or decomposed into two subsystems (two invariant open subsets) which do not interact under f because of topological transitivity. And, in the midst of this random behavior, we nevertheless have an element of regularity, namely the periodic points which are dense. According to Devaney’s definition, the binary tent map T2 and the logistic map f4 are chaotic. Actually, Devaney’s definition of chaos is redundant. It can be shown [31] that sensitive dependence on initial conditions follows from topological transitivity and the existence of a dense set of periodic points. For general maps it is the only redundancy [12]. However, if one considers maps on an interval, finite or infinite, then continuity and topological transitivity imply the existence of
a dense set of periodic points and sensitive dependence on initial conditions. The proof of this result, which uses the ordering of R, cannot be extended to maps defined in R^n for n > 1 or to the unit circle S^1 [337].

Remark 8. The word "chaotic" to qualify the apparently random behavior of deterministic nonperiodic orbits of one-dimensional maps seems to have been used for the first time in 1975 by Li and Yorke [204].
5.2 Routes to chaos

A route to chaos is a scenario in which a parameter-dependent system that has a simple deterministic time evolution becomes chaotic as a parameter is changed. The infinite sequence of period-doubling bifurcations preceding the onset of chaos is called the period-doubling route to chaos. The period-doubling route is followed by a variety of deterministic systems, but it is not the only one. For example, the intermittency route, discovered by Pomeau and Manneville [292], is another scenario through which a deterministic system can reach chaos. As described by the authors, for values of a control parameter r less than a critical transition value rT, the attractor is a periodic orbit. For r slightly greater than rT, the orbit remains apparently periodic during long time intervals (called laminar phases), but this regular behavior seems to be abruptly and randomly disrupted by bursts of finite duration. These bursts occur at seemingly random times, much larger than, and not correlated with, the period of the underlying oscillations. As r increases substantially above rT, bursts become more and more frequent, and regular oscillations are no longer apparent. Pomeau and Manneville have identified three types of intermittency transitions according to the nature of the bifurcations characterizing these transitions.

An example of an intermittent transition occurs for the logistic map (n, r) → f(n, r) = rn(1 − n) at rT = 1 + 2√2. If r is slightly greater than rT, there exist a stable and an unstable 3-point cycle (Figure 5.5). These cycles coalesce at rT through a saddle-node bifurcation for f^3(n, r), and for r slightly less than rT, we observe intermittent bursts, as shown in Figure 5.6. Numerical calculations were done with a precision of 200 significant digits on the initial value; 100 iterations have been discarded, and 170 are represented. For this type of intermittency, Pomeau and Manneville have shown that the average time between two bursts goes to infinity as r tends to rT from below as (rT − r)^{−1/2}.
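The following minimal sketch (not the author's code; it assumes the mpmath library is available for extended-precision arithmetic) reproduces the kind of computation just described: the logistic map is iterated just below rT = 1 + 2√2 with roughly 200-digit precision, and the printed orbit alternates between long, nearly period-3 laminar stretches and chaotic bursts.

```python
# Hedged sketch of the intermittency experiment described in the text.
# Precision and iteration counts mirror Figure 5.6; mpmath is assumed to be installed.
from mpmath import mp, mpf, sqrt

mp.dps = 200                            # about 200 significant digits
r = 1 + 2 * sqrt(2) - mpf(1) / 1000     # slightly below the transition value r_T
x = mpf("0.5")

orbit = []
for t in range(270):
    x = r * x * (1 - x)
    if t >= 100:                        # discard the first 100 iterates
        orbit.append(float(x))

for t, value in enumerate(orbit):
    print(t, f"{value:.4f}")            # laminar phases appear as slowly drifting 3-cycles
```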
Fig. 5.5. Stable 3-point cycle of the logistic map (n, r) → rn(1 − n) for r = 1 + 2√2 + 1/1000. Numerical calculations were done with a precision of 200 significant digits on the initial value. One hundred iterations have been discarded.
Fig. 5.6. Intermittent bursts of the logistic map (n, r) → rn(1 − n) for r = 1 + 2√2 − 1/1000. Numerical calculations were done with a precision of 200 significant digits on the initial value. One hundred iterations have been discarded.
5.3 Characterizing chaos

Computer studies have shown that chaos is probably present in many models. In the absence of a mathematical proof, when can we say that the time evolution of a system is chaotic? Some indications are given in the following sections.6

5.3.1 Stochastic properties

The histogram represented in Figure 5.3, while suggesting the existence of an invariant measure for the map f4, shows that chaos is a form of deterministic randomness. In what follows, we describe the stochastic properties associated with the chaotic behavior of the binary tent map T2 and the logistic map f4. That is, for each map, we determine an invariant probability density ρ such that the probability ρ(x) dx measures how frequently the interval [x, x + dx] is visited by the dense orbit of a point x0 ∈ [0, 1].

Before going into any further detail, let us recall a few basic results of ergodic theory.7 Let f : S → S be a map. A subset A of S is invariant if f(A) = A. A measure µ is invariant for the map f if, for all measurable subsets A of S, µ(f^{−1}(A)) = µ(A). The map f is ergodic with respect to the invariant measure µ if any measurable invariant subset A of S is such that either µ(A) = 0 or µ(A) = µ(S). If f is ergodic with respect to the invariant probability measure µ8 and g : S → R is an integrable function with respect to µ, then, for almost all x0 ∈ S,
lim_{t→∞} (1/t) Σ_{i=1}^{t} g ∘ f^i(x0) = ∫_S g(x) dµ(x).
That is, the time average is equal to the space average. If there exists a positive real function ρ such that dµ(x) = ρ(x) dx, ρ is called an invariant probability density.

If the map f : [0, 1] → [0, 1] is such that any point x has k preimages y1, y2, ..., yk by f (i.e., for all i = 1, 2, ..., k, f(yi) = x), the probability of finding an iterate of f in the interval [x, x + dx] is the sum of the probabilities of finding its k preimages in the intervals [yi, yi + dyi]. Hence, from the relation
ρ(x) dx = Σ_{i=1}^{k} ρ(yi) dyi,
it follows that
ρ(x) = Σ_{i=1}^{k} ρ(yi)/|f′(yi)|,   (5.1)
where we have taken into account that dx/dyi = f′(yi). Equation (5.1) is called the Perron-Frobenius equation.

In the case of the binary tent map T2, the Perron-Frobenius equation reads
ρ(x) = (1/2)(ρ(x/2) + ρ(1 − x/2)).
This equation has the obvious solution ρ(x) = 1. That is, the map T2 preserves the Lebesgue measure. If I1 ∪ I2 is the preimage by T2 of an open interval I of [0, 1], we verify that m(I1 ∪ I2) = m(I1) + m(I2) = m(I), where m denotes the Lebesgue measure.

Using the relation f4 ∘ h = h ∘ T2, where h : x → sin²(πx/2), the density ρ of the invariant probability measure for the map f4 is given by
ρ(x) dx = |dh^{−1}/dx| dx   (5.2)
        = dx/(π√(x(1 − x))).   (5.3)

6 See Ruelle [303].
7 See, for example, Shields [313].
8 µ is a probability measure if µ(S) = 1.
The graph of ρ, represented in Figure 5.7, should be compared with the histogram in Figure 5.3.
Fig. 5.7. Invariant probability density ρ : x → 1/(π√(x(1 − x))) for the logistic map f4. Compare with Figure 5.3.
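The following minimal sketch (not taken from the book) checks this density numerically: a long orbit of f4 is binned, and the normalized bin counts are compared with ρ(x) = 1/(π√(x(1 − x))) evaluated at the bin centers.

```python
# Hedged sketch: empirical invariant density of f_4 versus the analytic density (5.3).
import math
import random

def f4(x):
    return 4.0 * x * (1.0 - x)

random.seed(1)
x = random.random()
bins = [0] * 20
n_iter = 200_000
for _ in range(n_iter):
    x = f4(x)
    bins[min(int(x * 20), 19)] += 1

for k, count in enumerate(bins):
    left, right = k / 20, (k + 1) / 20
    empirical = count / n_iter / (1 / 20)      # histogram estimate of the density
    center = (left + right) / 2
    exact = 1.0 / (math.pi * math.sqrt(center * (1 - center)))
    print(f"[{left:.2f},{right:.2f})  empirical={empirical:.3f}  exact={exact:.3f}")
```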
5.3.2 Lyapunov exponent

Sensitive dependence on initial conditions is a necessary but not sufficient condition for a map f to be chaotic.9 It is nonetheless interesting to be able to measure this essential feature of a chaotic map. Consider two neighboring initial points x0 and x0 + ε. If after t iterations of the map f the distance |f^t(x0 + ε) − f^t(x0)| grows exponentially, we may define the exponent λ(x0) by
|f^t(x0 + ε) − f^t(x0)| ∼ ε e^{tλ(x0)},
or more precisely by
λ(x0) = lim_{t→∞} lim_{ε→0} (1/t) log |(f^t(x0 + ε) − f^t(x0))/ε| = lim_{t→∞} (1/t) log |df^t/dx (x0)|.   (5.4)
λ(x0) is called the Lyapunov exponent. e^{λ(x0)} is the average factor by which the distance between two neighboring points is stretched after one iteration. Applying the chain rule to Relation (5.4), the Lyapunov exponent can also be expressed as
λ(x0) = lim_{t→∞} (1/t) log |Π_{i=0}^{t−1} f′(xi)| = lim_{t→∞} (1/t) Σ_{i=0}^{t−1} log |f′(xi)|,   (5.5)
where xi = f^i(x0), for i = 0, 1, 2, ..., t − 1. In the case of a chaotic map, except for a set of measure zero, the Lyapunov exponent does not depend upon x0. If the map is ergodic with respect to an invariant measure µ, then the Lyapunov exponent expressed as a time average by (5.5) is also given by the space average
λ = ∫ log |f′(x)| dµ(x).   (5.6)

9 See Exercise 5.4 for an example of a map that has sensitive dependence on initial conditions and dense periodic points but is, however, not chaotic.

Example 32. Lyapunov exponent of the logistic map f4. The map f4 is ergodic with respect to the invariant measure
dµ(x) = dx/(π√(x(1 − x)));
thus
λ = (1/π) ∫₀¹ log |4(1 − 2x)| dx/√(x(1 − x)) = ∫₀¹ log(4|cos πu|) du = log 2,
where the substitution x = sin²(πu/2) has been used. Since a chaotic map has sensitive dependence on initial conditions, its Lyapunov exponent is necessarily positive. Notice that the result found for the logistic map f4 could have been obtained without any calculation since f4 is conjugate to T2 and, for all x ∈ [0, 1], |T2′(x)| = 2.
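The following minimal sketch (not from the book) estimates λ for f4 from the time average (5.5); for a typical initial point the result is close to log 2 ≈ 0.693, in agreement with the space average (5.6).

```python
# Hedged sketch: Lyapunov exponent of the logistic map f_4 from the time average (5.5).
import math

def f4(x):
    return 4.0 * x * (1.0 - x)

def df4(x):
    return 4.0 - 8.0 * x

x = 0.123456789
for _ in range(1000):            # discard a transient
    x = f4(x)

total, n = 0.0, 100_000
for _ in range(n):
    total += math.log(abs(df4(x)))
    x = f4(x)

print("numerical lambda =", total / n, "  log 2 =", math.log(2))
```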
5.3.3 "Period three implies chaos"

In their paper Period Three Implies Chaos, published in 1975, Li and Yorke [204] proved the following theorem.

Theorem 12. Let I be an interval and let f : I → I be a continuous map. Assume there is a point x ∈ I such that its first three iterates satisfy either
f³(x) ≤ x < f(x) < f²(x)  or  f³(x) ≥ x > f(x) > f²(x).   (5.7)
Then, for all positive integers k, there exists in I a periodic point of period k. Furthermore, there exists an uncountable set S ⊂ I, containing no periodic or eventually periodic points,10 such that, for all pairs of different points x1 and x2 in S,
lim sup_{t→∞} |f^t(x1) − f^t(x2)| > 0  and  lim inf_{t→∞} |f^t(x1) − f^t(x2)| = 0.   (5.8)

For Li and Yorke, "chaotic behavior" means the existence of
1. infinitely many periodic points, and
2. an uncountable subset of I becoming highly folded under successive iterations.

According to Šarkovskii's theorem, the existence of infinitely many periodic points follows from the existence of a periodic point of either an odd period [205] or a period not equal to 2^n [274]. Condition (5.7) has been improved by Li, Misiurewicz, Pianigiani, and Yorke [205], making the existence of chaotic behavior easier to verify: if there exists an initial point x0 such that
f^n(x0) < x0 < f(x0),   (5.9)
where n is an odd integer, then there exists a periodic orbit of odd period k with 1 < k ≤ n.
Fig. 5.8. Magnification of the bifurcation diagram of Figure 5.1 around the period-3 window.
Fig. 5.9. Three-point cycle of the logistic map (n, r) → rn(1 − n) for r = 3.835. Fifty iterations are represented after 150 have been discarded.
It is important to realize that, when the Li-Yorke conditions of Theorem 12 are satisfied, it may happen that, for most initial values x0 ∈ I, the orbit {f^t(x0) | t ∈ N} is eventually periodic. For example, the magnification of the bifurcation diagram of the logistic map represented in Figure 5.8 shows the existence of 3-point cycles in a narrow range of values of the parameter r. Convergence to a typical cycle is shown in Figure 5.9. Such behavior is certainly not chaotic. However, according to the Li-Yorke theorem there exists an
10 A point is eventually periodic if it tends to a periodic point.
uncountable set of orbits that are chaotic. These orbits may, however, be rare and thus difficult to exhibit numerically. Therefore, time series obtained either numerically or experimentally are considered chaotic when apparently dense orbits in an invariant subset and sensitive dependence on initial conditions can be observed.

Example 33. Condition (5.9) is easily verified for the two discrete one-population models of Example 27. For n_{t+1} = n_t exp(r(1 − n_t)) with r = 3, we find n5 < n0 < n1, where n0 = 0.3, n1 = 2.44985, n5 = 0.0877748. For n_{t+1} = r n_t (1 + n_t)^{−a} with r = 40 and a = 8, we find n7 < n0 < n1, where n0 = 0.15, n1 = 1.96141, n7 = 0.0265951.
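The inequalities of Example 33 are easy to reproduce numerically. The sketch below is not from the book, and the function names ricker and hassell are ours; it simply iterates both maps and checks condition (5.9).

```python
# Hedged sketch: verifying the Li-Yorke condition (5.9) for the maps of Example 33.
import math

def ricker(n, r=3.0):
    return n * math.exp(r * (1.0 - n))

def hassell(n, r=40.0, a=8.0):
    return r * n / (1.0 + n) ** a

def check(f, n0, odd_k):
    orbit = [n0]
    for _ in range(odd_k):
        orbit.append(f(orbit[-1]))
    nk, n1 = orbit[odd_k], orbit[1]
    print(f"n{odd_k}={nk:.7f} < n0={n0} < n1={n1:.5f} :", nk < n0 < n1)

check(ricker, 0.3, 5)    # expects n5 close to 0.0877748
check(hassell, 0.15, 7)  # expects n7 close to 0.0265951
```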
In Subsection 5.4.1 it is shown that apparently dense orbits are observed in both cases.

5.3.4 Strange attractors

A closed set A is an attracting set for a map f if there exists a neighborhood N of A such that lim_{t→∞} f^t(N) = A. If a domain D is such that f^t(D) ⊂ D for all positive t, D is called a trapping region. If we can find such a domain, then
A = ⋂_{t∈N} f^t(D).
If an attracting set contains a dense orbit, it is called an attractor. Attractors of chaotic systems often have complex structures. For example, a one-dimensional attractor can be an uncountable set of zero Lebesgue measure. In order to characterize such sets, we need to introduce a new type of measure. First, let us describe the typical classical example of such a set.

Example 34. The Cantor set [76]. Let (Jn) be a sequence of subsets of [0, 1] defined by
J0 = [0, 1],
J1 = [0, 1/3] ∪ [2/3, 1],
J2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [6/9, 7/9] ∪ [8/9, 1],
...
Jn is the union of 2^n pairwise disjoint closed intervals of length 3^{−n}. It is obtained from Jn−1 by removing from each closed interval of length 3^{−(n−1)} an open interval of length 3^{−n} with the same center. The set
C = ⋂_{n=1}^{∞} Jn
is called the triadic Cantor set or ternary Cantor set. Note that, for all n ∈ N, the endpoints of all closed intervals whose union at stage n is Jn belong to C. C is closed since it is the intersection of closed sets. It contains no interval,11 so its interior12 is empty; that is, all its points are boundary points. Since any open set containing a point x of C contains at least another point of C, all points of C are limit points.13
At each stage of the construction of C, the open intervals that constitute the "middle thirds" of the closed intervals left at the previous stage are removed. Thus any element x of the ternary Cantor set can be written
x = Σ_{n=1}^{∞} x_n/3^n,
where, for all positive integers n, xn = 0 or 2. In other words, the ternary expansion (that is, the expansion to the base 3) of an element x ∈ C does not contain the digit 1. One might think that this statement is not correct since, for example, 7/9 = 2/3 + 1/3² belongs to C, and its ternary expansion, which is 0.21, contains the digit 1. There is indeed an infinite number of elements of C that have this property. The expansion of all of them contains only a finite number of terms, the last one being 1/3^{n0}. If this term is replaced by the series
Σ_{n=n0}^{∞} 2/3^{n+1},
which is equal to 1/3^{n0}, the statement becomes correct. We shall always respect this convention and write 7/9 as 0.2022.... Note that, with this convention, each point in C has a unique infinite expansion.
The Cantor set is not countable. If C were countable, there would exist a bijection ϕ : N → C, and the set ϕ(N) = {ϕ(n) | n ∈ N} would coincide with C. Let
11 At stage n, Jn contains no interval of length greater than 3^{−n}.
12 Let (X, T) be a topological space. A point x is an interior point of a subset E ⊂ X if there exists an open set U ∈ T containing x such that U ⊂ E. The set of interior points of E is the interior of E. A point x is a boundary point of a set E if there exists an open set U ∈ T containing x that contains at least one point of E and one point of X\E. The set of boundary points of E is called the boundary of E.
13 Limit points are also called accumulation points.
(∀n ∈ N)  ϕ(n) = Σ_{k=1}^{∞} x_k^{(n)}/3^k,
and consider the sequence (xn ) such that
x_n = 0 if x_n^{(n)} = 2, and x_n = 2 if x_n^{(n)} = 0. The number
x = Σ_{n=1}^{∞} x_n/3^n
is clearly in C but not in ϕ(N). Hence, there is no such bijection ϕ.14
The Cantor set is equipotent to the interval [0, 1]. To prove it, we have to exhibit a bijection ψ : [0, 1] → C. Let
x = Σ_{n=1}^{∞} x_n/2^n
be an element of [0, 1], where, for all n ∈ N, xn = 0 or 1, and define the Cantor function ψ by
ψ(x) = Σ_{n=1}^{∞} 2x_n/3^n.
It is clear that ψ([0, 1]) = C. Furthermore, if
y = Σ_{n=1}^{∞} y_n/3^n
is an element of C, then, for all n ∈ N, yn = 0 or 2, and
x = ψ^{−1}(y) = Σ_{n=1}^{∞} (y_n/2)/2^n
exists and is unique. Then ψ^{−1}(C) = [0, 1], which finally proves that ψ : [0, 1] → C is a bijection.
The Lebesgue measure of the Cantor set is zero. Since Jn is obtained from Jn−1 by removing 2^{n−1} open intervals of length 3^{−n}, we have
m(C) = 1 − Σ_{n=0}^{∞} 2^n/3^{n+1} = 0.
14 This method of proof, called Cantor's diagonal process, is frequently used to demonstrate that a set is not countable.
To make a distinction between sets of zero Lebesgue measure, the Hausdorff dimension is a helpful notion [168]. Let A be a subset of R; the Hausdorff outer measure in dimension d of A is
H*_d(A) = lim_{ε→0} inf Σ_{j∈J} (m(Ij))^d,   (d > 0),
where the infimum is taken over all finite or countable coverings of A by open intervals Ij whose Lebesgue measure (here length) m(Ij) is less than ε. The limit exists (but may be infinite) since the infimum can only increase as ε decreases.
1. If H*_{d1}(A) < ∞, then, for d2 > d1, H*_{d2}(A) = 0. Indeed, the condition m(Ij) < ε, for all j ∈ J, implies
(m(Ij))^{d2} = (m(Ij))^{d2−d1} (m(Ij))^{d1} ≤ ε^{d2−d1} (m(Ij))^{d1}.
Therefore,
inf Σ_{j∈J} (m(Ij))^{d2} ≤ Σ_{j∈J} (m(Ij))^{d2} ≤ ε^{d2−d1} Σ_{j∈J} (m(Ij))^{d1}.
Letting ε → 0, the result follows.
2. If 0 < H*_{d1}(A) < ∞, then, for d2 < d1, H*_{d2}(A) = ∞. This is a corollary of the preceding result.
Let A be a subset of R; the number inf{d | H*_d(A) = 0} is called the Hausdorff dimension of the subset A and denoted by dH(A). The Hausdorff outer measure and Hausdorff dimension of a subset of R^k can be defined in a similar way using the Lebesgue measure (volume) m of open hypercubes.

Example 35. Hausdorff dimension of the ternary Cantor set. C is self-similar, that is,
3(C ∩ [0, 1/3]) = C  and  3(C ∩ [2/3, 1]) − 2 = C,
where, for a given real number k, kA denotes the set {kx | x ∈ A}. But
H*_d(C) = H*_d(C ∩ [0, 1/3]) + H*_d(C ∩ [2/3, 1]),
and
H*_d(C ∩ [0, 1/3]) = H*_d(C ∩ [2/3, 1]) = (1/3^d) H*_d(C),
since, if A = kB, then H*_d(A) = k^d H*_d(B). Finally,
H*_d(C) = (2/3^d) H*_d(C).
This relation shows that H*_d(C) takes a finite nonzero value if, and only if, 2/3^d = 1. Hence, the Hausdorff dimension dH(C) of the ternary Cantor set C is given by
dH(C) = log 2 / log 3.

Remark 9. There exist many different definitions of dimension. Let A be a subset of R^k and N(ε, A) be the minimum number of open hypercubes of side ε needed to cover A. Then, the capacity of A is
dC(A) = − lim sup_{ε→0} log N(ε, A) / log ε.
In the case of the ternary Cantor set, taking ε = 3^{−n}, we have
dC(C) = lim_{n→∞} log 2^n / log 3^n = log 2 / log 3.
For the Cantor set dC (C) = dH (C). This is not always the case. In general, dC (A) ≥ dH (A) since in the definition of the Hausdorff dimension of a subset A of Rk , we consider coverings of A by open hypercubes of variable sides.
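A minimal box-counting sketch (not from the book) illustrates this computation for the ternary Cantor set. Working in integer units of 3^{−m}, where m is the construction stage, keeps the box count exact: with boxes of side ε = 3^{−k} one finds N(ε) = 2^k, so every ratio equals log 2 / log 3.

```python
# Hedged sketch: box-counting (capacity) estimate for the ternary Cantor set.
import math

m = 10                                     # construction stage J_m
lefts = [0]                                # integer left endpoints, in units of 3^-m
length = 3 ** m                            # current interval length in the same units
for _ in range(m):
    length //= 3
    lefts = [a for left in lefts for a in (left, left + 2 * length)]

for k in range(1, m + 1):
    box = 3 ** (m - k)                     # box side eps = 3^-k in units of 3^-m
    n_boxes = len({left // box for left in lefts})
    print(f"k={k:2d}  N={n_boxes:5d}  log N / log 3^k = {math.log(n_boxes) / math.log(3 ** k):.4f}")

print("log 2 / log 3 =", math.log(2) / math.log(3))
```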
Sets like the ternary Cantor set are called fractals by Mandelbrot [224, 225], who defined them as sets for which the Hausdorff dimension strictly exceeds the topological dimension. This definition, which has the advantage of being precise, is a bit restrictive. It excludes sets that can be accepted as fractals. Self-similarity, which is an essential characteristic of fractals, should be included in their definition. In the following example, we study a two-dimensional map having a fractal attractor.

Example 36. The generalized baker's map. The three-parameter family of two-dimensional maps defined by
b_{r1,r2,a}(x, y) = (r1 x, y/a)  if 0 ≤ x ≤ 1, 0 ≤ y ≤ a,
b_{r1,r2,a}(x, y) = (r2 x + 1/2, (y − a)/(1 − a))  if 0 ≤ x ≤ 1, a ≤ y ≤ 1,
where the parameters r1, r2, and a are positive numbers less than 1/2, is called the generalized baker's map [120].15 It can be verified that the map transforms the unit square [0, 1] × [0, 1] into the two rectangles
The map models the process whereby a baker kneads dough by stretching and folding.
[0, r1] × [0, 1] ∪ [1/2, 1/2 + r2] × [0, 1],
and, after a second iteration, we obtain four rectangles:
[0, r1²] × [0, 1] ∪ [r1/2, r1(1/2 + r2)] × [0, 1] ∪ [1/2, 1/2 + r1 r2] × [0, 1] ∪ [(1 + r2)/2, 1/2 + r2(1/2 + r2)] × [0, 1].
At this stage it is important to note the following self-similarity property: if the interval [0, r1] on the x-axis is magnified by a factor 1/r1, the set [0, r1²] × [0, 1] ∪ [r1/2, r1(1/2 + r2)] × [0, 1] becomes a replica of the set obtained after the first iteration. Similarly, if the interval [1/2, 1/2 + r2], after a translation equal to −1/2, is magnified by a factor 1/r2, the set [1/2, 1/2 + r1 r2] × [0, 1] ∪ [(1 + r2)/2, 1/2 + r2(1/2 + r2)] × [0, 1] also becomes a replica of the set obtained after the first iteration.
To determine the capacity and Hausdorff dimension of the attractor of the generalized baker's map, we first note that this attractor is the product of a Cantor set16 along the x-axis and the interval [0, 1] along the y-axis. Thus the capacity and the Hausdorff dimension of the attractor are, respectively, equal to 1 + dC(A) and 1 + dH(A), where dC(A) and dH(A) are the capacity and the Hausdorff dimension of its intersection A with the x-axis. To find dC(A), we use the self-similarity property. We write
N(ε, A) = N1(ε, A) + N2(ε, A),
where N1(ε, A) is the number of intervals of length ε needed to cover the part of the attractor along the x-axis that lies in the interval [0, r1], and N2(ε, A) is the number of intervals of length ε needed to cover the part that lies in the interval [1/2, 1/2 + r2]. From the scaling property,
N1(ε, A) = N(ε/r1, A)  and  N2(ε, A) = N(ε/r2, A).
Thus
N(ε, A) = N(ε/r1, A) + N(ε/r2, A).
If dC(A) exists, N(ε, A) should behave as ε^{−dC} when ε is small enough. Hence,
1 = r1^{dC} + r2^{dC}.
If r1 = r2 = 1/2, dC = 1, and for r1 = r2 = r, dC decreases with r. Again using self-similarity, we find that the Hausdorff dimension dH(A) of the attractor A is equal to its capacity dC(A).
By “Cantor set” we mean a closed set with no interior points, and such that all points are limit points. A closed set that, as a Cantor set, coincides with the set of all its limit points is said to be perfect.
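The implicit equation 1 = r1^{dC} + r2^{dC} has no closed-form solution in general, but it is easy to solve numerically. The following sketch, not from the book, uses plain bisection.

```python
# Hedged sketch: capacity d_C of the baker's-map attractor along the x-axis,
# obtained by solving 1 = r1**d + r2**d by bisection.
def capacity(r1, r2, tol=1e-12):
    lo, hi = 0.0, 1.0                       # r1, r2 <= 1/2 implies 0 < d_C <= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r1 ** mid + r2 ** mid > 1.0:     # the sum decreases as d increases
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(capacity(0.5, 0.5))    # -> 1.0, as stated in the text
print(capacity(0.3, 0.3))    # symmetric case: d_C = log 2 / log(1/0.3)
print(capacity(0.4, 0.2))    # an asymmetric example
```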
Chaotic attractors that are fractals are usually called strange attractors. More precisely, according to Ruelle [300, 301, 302]:

Definition 21. A bounded set A ⊂ R^k is a strange attractor for a map f if there is a k-dimensional neighborhood N of A such that, for all t ∈ N, f^t(N) ⊂ N, and if, for all initial points in N, the map f has sensitive dependence on initial conditions.

As for the generalized baker's map, strange attractors occur in dissipative systems17 and in most cases result from a stretching and folding process.
Fig. 5.10. Graph of the logistic map n → 4.5 n(1 − n). The intervals I1, J1, and I2 referred to in the text are marked on the horizontal axis.
Example 37. A strange repeller: the logistic map for r > 4. Consider again the logistic map (n, r) → fr(n) = rn(1 − n) but, this time, for parameter values greater than 4. Since fr(0.5) > 1, there exists an open interval J1 centered at n = 1/2 such that, for all x ∈ J1, lim_{n→∞} f_r^n(x) = −∞. Hence, for the orbit of a point x ∈ [0, 1] to be bounded, x has to belong to [0, 1]\J1, that is, to I1 ∪ I2 = [0, 1] ∩ f_r^{−1}([0, 1]) = f_r^{−1}([0, 1]), as shown in Figure 5.10. This condition, which is necessary but obviously not sufficient, can be rewritten
⋃_{i1=1,2} I_{i1} = f_r^{−1}([0, 1]).
That is, systems in which the determinant of the Jacobian matrix is less than 1. This implies that the Hausdorff dimension of a strange attractor is less than the dimension of the phase space.
More generally, for the orbit of a point x ∈ [0, 1] to be bounded, x has to belong to the union of 2^n disjoint closed intervals
⋃_{i1,i2,...,in=1,2} I_{i1,i2,...,in} = f_r^{−n}([0, 1]).
If we let n go to infinity, we find that the orbit of a point x ∈ [0, 1] is bounded if, and only if, x belongs to the invariant set Cr defined by
Cr = ⋂_{n=1}^{∞} ⋃_{i1,i2,...,in=1,2} I_{i1,i2,...,in}.   (5.10)
The set Cr is closed since it is the intersection of closed sets. The Lebesgue measure (i.e., the length) of the closed intervals I_{i1,i2,...,in} decreases as n increases. Denoting by m(I) the Lebesgue measure of the interval I, we shall prove that
lim_{n→∞} m(I_{i1,i2,...,in}) = 0   (5.11)
for r > 2 + √5.
First, let us show that, for all x ∈ I1 ∪ I2, |f_r′(x)| > 1 if r > 2 + √5. Since f_r′(x) = r − 2rx and f_r″(x) = −2r < 0, |f_r′(x)| is minimum on I1 ∪ I2 for x such that fr(x) = 1, that is, for x = (r ± √(r² − 4r))/(2r). But
|f_r′((r ± √(r² − 4r))/(2r))| = √(r² − 4r).
Thus, |f_r′((r ± √(r² − 4r))/(2r))| > 1 if r² − 4r > 1; that is, if r > 2 + √5.
Let λr = inf_{x∈I1∪I2} |f_r′(x)|. We shall prove by induction that
m(I_{i1,i2,...,in}) ≤ λ_r^{−n}.   (5.12)
Consider the interval Ii = [ai, bi], where i = 1, 2. The images by fr of its endpoints are 0 and 1 for i = 1, and 1 and 0 for i = 2. By the mean value theorem, there exists a point ci ∈ Ii such that fr(bi) − fr(ai) = f_r′(ci)(bi − ai). Hence,
1 = |fr(bi) − fr(ai)| = |f_r′(ci)| m(Ii) ≥ λr m(Ii).
That is, m(Ii) ≤ λ_r^{−1}, which shows that (5.12) is true for n = 1. Let a_{i1,i2,...,in} and b_{i1,i2,...,in} denote the endpoints of the interval I_{i1,i2,...,in}. For in = 1, 2, fr(I_{i1,i2,...,in}) = I_{i1,i2,...,in−1}. Hence,
m(I_{i1,i2,...,in−1}) = |b_{i1,i2,...,in−1} − a_{i1,i2,...,in−1}| = |fr(b_{i1,i2,...,in}) − fr(a_{i1,i2,...,in})| = |f_r′(c_{i1,i2,...,in})| |b_{i1,i2,...,in} − a_{i1,i2,...,in}| ≥ λr m(I_{i1,i2,...,in}),
and, if we assume that (5.12) is true for n − 1, we finally obtain
m(I_{i1,i2,...,in}) ≤ λ_r^{−1} m(I_{i1,i2,...,in−1}) ≤ λ_r^{−n}.
Thus, in the limit n → ∞, Relation (5.11) is verified. The set Cr is a Cantor set since it possesses the characteristic properties of such a set given in Footnote 16: it is closed with no interior points, and all its points are limit points. At the same time it is a repeller since any point in a neighborhood of Cr that does not belong to Cr eventually goes to −∞. Cr is then a strange repeller.
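The escape mechanism of Example 37 is easy to illustrate numerically. The sketch below, not from the book, iterates a grid of initial points for r = 4.5 and measures the fraction that has left [0, 1] after n steps; the fraction remaining shrinks towards the zero-measure set Cr.

```python
# Hedged sketch: fraction of initial points escaping [0, 1] under f_r, r > 4.
def escaped_fraction(r, n_steps, n_points=100_000):
    inside = 0
    for i in range(n_points):
        x = (i + 0.5) / n_points
        for _ in range(n_steps):
            x = r * x * (1.0 - x)
            if x < 0.0 or x > 1.0:
                break
        else:                      # the orbit stayed in [0, 1] for n_steps iterations
            inside += 1
    return 1.0 - inside / n_points

r = 4.5
for n in (1, 2, 4, 8, 16):
    print(f"n={n:2d}  escaped fraction = {escaped_fraction(r, n):.4f}")
```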
5.4 Chaotic discrete-time models

5.4.1 One-population models

We have seen that, for continuous maps defined on an interval, topological transitivity implies the existence of a dense set of periodic points and sensitive dependence on initial conditions. To verify that a one-dimensional map is chaotic, it is therefore sufficient to prove that it is topologically transitive. It can be shown [299] that if a map f is topologically transitive on an invariant set X, then the orbit of some point x ∈ X is dense in X. Except for very particular maps, it is, in general, almost impossible to determine a specific point that has a dense orbit. However, when analyzing models, a map is considered to be chaotic if it is possible to find numerically an orbit that "looks" dense. In Figures 5.11 and 5.13 are represented the bifurcation diagrams of two one-population models. These bifurcation diagrams have been computed for 200 equally spaced parameter values. For each parameter value, 300 iterates are calculated, but only the last 100 have been plotted. Cobwebs corresponding to "apparently" dense orbits are represented in Figures 5.12 and 5.14.
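The following minimal sketch (not the author's code) reproduces the computation just described for the map n → n e^{r(1−n)}; plotting the (r, n) pairs, for instance with matplotlib, gives a diagram like Figure 5.11.

```python
# Hedged sketch: bifurcation-diagram data for the one-population map n -> n*exp(r*(1-n)).
import math

def bifurcation_points(r, n0=0.5, n_transient=200, n_keep=100):
    n = n0
    for _ in range(n_transient):
        n = n * math.exp(r * (1.0 - n))
    kept = []
    for _ in range(n_keep):
        n = n * math.exp(r * (1.0 - n))
        kept.append(n)
    return kept

diagram = []
for k in range(200):                     # 200 equally spaced parameter values
    r = 1.5 + 1.5 * k / 199              # r from 1.5 to 3, as in Figure 5.11
    diagram += [(r, n) for n in bifurcation_points(r)]

print(len(diagram), "points; last few:", diagram[-3:])
```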
5.4.2 The Hénon map

In order to find a simpler model than the Lorenz system (Equations (5.14) below) that nevertheless possesses a strange attractor, Hénon [169] considered
Fig. 5.11. Bifurcation diagram of the one-population map (n, r) → n e^{r(1−n)}. The parameter r, plotted on the horizontal axis, varies from 1.5 to 3, and the population n, plotted on the vertical axis, varies between 0 and 2.5.
Fig. 5.12. Cobweb showing a sequence of 250 iterates of the map (n, r) → n e^{r(1−n)} for r = 3. The initial value is n0 = √2 with a precision of 150 significant digits.
the two-dimensional map18
H_{a,b}(x, y) = (1 + y − ax², bx).   (5.13)
The Jacobian matrix
DH_{a,b}(x, y) = \begin{pmatrix} −2ax & 1 \\ b & 0 \end{pmatrix}
18 It can be shown that the Hénon map is one of the reduced forms of the most general quadratic map whose Jacobian determinant is constant.
Fig. 5.13. Bifurcation diagram of the one-population map (n, r) → rn(1 + n)^{−8}. The parameter r, plotted on the horizontal axis, varies from 5 to 40, and the population n, plotted on the vertical axis, varies between 0 and 2.
Fig. 5.14. Cobweb showing a sequence of 250 iterates of the map (n, r) → rn(1 + n)^{−8} for r = 40. The initial value is n0 = √3 with a precision of 150 significant digits.
has a constant determinant equal to −b. The map is, consequently, a uniform contraction for |b| < 1 and area-preserving for |b| = 1. For b ≠ 0, the map is invertible, and we have
H_{a,b}^{−1}(x, y) = (b^{−1} y, x − 1 + a b^{−2} y²).
If a > a0 = −(1 − b)²/4, the map has two fixed points:
x*_{1,2} = (b − 1 ± √((1 − b)² + 4a)) / (2a),   y*_{1,2} = b x*_{1,2}.
One is always unstable, while the other one is asymptotically stable if a1 = 3(1 − b)²/4 > a > a0. For b = 0.3, which is the value chosen by Hénon, a0 = −0.1225 and a1 = 0.3675. For a fixed value of b, as a increases, the map exhibits an infinite sequence of period-doubling bifurcations. Derrida, Gervois, and Pomeau [101] have determined the parameter values ak of the first 11 period-doubling bifurcations and found that the rate of convergence is equal to the Feigenbaum number δ. Their estimate of a∞ is 1.058048.... The universality of the period-doubling route to chaos is, therefore, valid for dissipative maps of dimensionality higher than 1.
Fig. 5.15. Ten thousand successive points obtained by iteration of the Hénon map starting from the unstable fixed point (0.631354..., 0.189406...).
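A minimal sketch (not the author's program) of the computation behind Figure 5.15: the iteration starts at the unstable fixed point, rounding errors push the orbit off it and onto the attractor, and 10,000 successive points are collected.

```python
# Hedged sketch: points of the Henon attractor for a = 1.4, b = 0.3.
import math

a, b = 1.4, 0.3
x = (b - 1 + math.sqrt((1 - b) ** 2 + 4 * a)) / (2 * a)   # unstable fixed point x* = 0.631354...
y = b * x

points = []
for _ in range(10_000):
    x, y = 1 + y - a * x * x, b * x      # rounding errors move the orbit onto the attractor
    points.append((x, y))

print("x range:", min(p[0] for p in points), max(p[0] for p in points))
print("y range:", min(p[1] for p in points), max(p[1] for p in points))
```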
After some preliminary numerical experiments, Hénon found a strange attractor for the parameter values a = 1.4 and b = 0.3. It is represented in Figure 5.15. Two enlargements of the attractor are represented in Figures 5.16 and 5.17, showing its fine structure, which looks identical at all scales. Both frames contain the unstable fixed point (0.631354..., 0.189406...), which apparently lies on the upper boundary of the attractor. The Hausdorff dimension of the strange attractor of the Hénon map has been determined numerically by Russell, Hanson, and Ott [304]19, who found
The authors divided the phase space into boxes of side ε and, after many iterations, counted how many boxes contained at least one iterate. The dimension is found assuming that, for a sufficiently small ε, the number of boxes behaves as ε^{−d}. Strictly speaking, such a dimension approaches the value dC of the capacity. However, for most strange attractors, dC = dH (see Farmer, Ott, and Yorke [120]).
Fig. 5.16. Enlargement of the Hénon attractor, increasing the number of iterations to 10^5. The initial point in both cases is the unstable fixed point.
Fig. 5.17. Enlargement of the Hénon attractor, increasing the number of iterations to 10^6. The initial point is the unstable fixed point.
dH = 1.261±0.003 for a = 1.4 and b = 0.3, and dH = 1.202±0.003 for a = 1.2 and b = 0.3.20 20
All these numerical considerations do not constitute a proof of the existence of a strange attractor. Sometimes such a proof can be given. For the Hénon map, see Tresser, Coullet, and Arneodo [332].
5.5 Chaotic continuous-time models

5.5.1 The Lorenz model

In his historical paper, published in 1963 [212], Lorenz derived, from a model of fluid convection, a three-parameter family of three ordinary differential equations that appeared, when integrated numerically, to have extremely complicated solutions. These equations are
ẋ = σ(y − x),
ẏ = rx − y − xz,   (5.14)
ż = xy − bz,
where σ, r, and b are real positive parameters. The system is invariant under the transformation (x, y, z) → (−x, −y, z). Figures 5.18, 5.19 and 5.20 represent the projections of the Lorenz strange attractor, calculated for σ = 10, r = 28, and b = 8/3, on, respectively, the planes x0y, x0z, and y0z.
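The projections of Figures 5.18-5.20 can be reproduced with a simple fixed-step integrator. The sketch below, not from the book, uses a classical fourth-order Runge-Kutta scheme with step dt = 0.01 over t ∈ [0, 40].

```python
# Hedged sketch: RK4 integration of the Lorenz equations (5.14).
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)

def rk4_step(state, dt):
    k1 = lorenz(state)
    k2 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = lorenz(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
                 for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4))

dt, state = 0.01, (0.0, 0.0, 1.0)        # initial condition of Figures 5.18-5.20
trajectory = []
for _ in range(4000):                    # t in [0, 40]
    state = rk4_step(state, dt)
    trajectory.append(state)

print("final state:", state)             # plotting (x, y), (x, z), (y, z) gives the projections
```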
Fig. 5.18. Projection on the x0y plane of a numerical solution of the Lorenz equations for t ∈ [0, 40] with (x0, y0, z0) = (0, 0, 1).
The orbit is obviously not periodic. As t increases, the orbit winds first around the unstable nontrivial fixed point (x∗ , y ∗ , z ∗ ) = (−8.48528 . . . , −8.48528 . . . , 27) and then around the other unstable fixed point, (x∗ , y ∗ , z ∗ ) = (8.48528 . . . , 8.48528 . . . , 27),
Fig. 5.19. Projection on the x0z plane of a numerical solution of the Lorenz equations for t ∈ [0, 40] with (x0, y0, z0) = (0, 0, 1).
Fig. 5.20. Projection on the y0z plane of a numerical solution of the Lorenz equations for t ∈ [0, 40] with (x0, y0, z0) = (0, 0, 1).
without ever settling down. Its shape does not depend upon a particular choice of initial conditions. The divergence of the flow (trace of the Jacobian matrix) is equal to −(σ + b + 1). Thus a three-dimensional volume element is contracted, as a function of time t, by a factor e−(σ+b+1)t . It can be shown that there is a bounded
ellipsoid E ⊂ R3 that all trajectories eventually enter.21 Taken together, the existence of the bounded ellipsoid and the negative divergence of the flow imply that there exists a bounded set of zero Lebesgue measure inside the ellipsoid E towards which all trajectories tend.
Exercises

Exercise 5.1 The converse of Šarkovskii's theorem is also true.22 Show that the piecewise linear map defined on [1, 5] by
f(x) = 2x + 1 if 1 ≤ x ≤ 2,  f(x) = −x + 7 if 2 ≤ x ≤ 3,  f(x) = −2x + 10 if 3 ≤ x ≤ 4,  f(x) = −x + 6 if 4 ≤ x ≤ 5,
has a periodic orbit of period 5 but no periodic orbit of period 3.

Exercise 5.2 Consider the symmetric tent map defined by
Tr(x) = rx if 0 ≤ x ≤ 1/2,  Tr(x) = r(1 − x) if 1/2 ≤ x ≤ 1.
Find the values of the parameter r for which the map Tr has periodic orbits of period 3.

Exercise 5.3 Find the condition under which the invariant probability density ρ is equal to 1 (i.e., the Lebesgue measure) for the asymmetric tent map defined by
T_{a,b}(x) = ax if 0 ≤ x ≤ b/(a + b),  T_{a,b}(x) = b(1 − x) if b/(a + b) ≤ x ≤ 1.

Exercise 5.4 Let f : R+ → R+ be the map defined by
f(x) = 3x if 0 ≤ x ≤ 1/3,  f(x) = −3x + 2 if 1/3 ≤ x ≤ 2/3,  f(x) = 3x − 2 if 2/3 ≤ x ≤ 1,  f(x) = f(x − 1) + 1 if x ≥ 1.
Show that f has sensitive dependence on initial conditions, has periodic points that are dense in R+, but is not chaotic.

Exercise 5.5 Determine the bifurcation diagram of the one-parameter family of maps (n, r) → −rn log n, which is a discrete version of the Gompertz model (see Exercise 4.6), for 1.8 ≤ r ≤ 2.7.
21 For the most complete study of the Lorenz model, consult Sparrow [323].
22 This example is taken from Li and Yorke [204].
Exercise 5.6 Drosophila population dynamics has been investigated to clarify whether the observed fluctuations about the carrying capacity were the result of a chaotic behavior [251, 282]. Populations Nt and Nt+1 at generations t and t + 1, respectively, were fitted on the discrete theta model
N_{t+1} = N_t (1 + r(1 − (N_t/K)^θ)),
where K is the carrying capacity and r and θ are two positive parameters that measure the growth rate and its asymmetry. If we denote by nt = Nt/K the reduced population, find the first period-doubling bifurcation points and determine the bifurcation diagram of the two-parameter family of maps (n, r, θ) → f_{r,θ}(n) = n + rn(1 − n^θ).

Exercise 5.7 Find the invariant probability density of the two-dimensional chaotic map (x_{t+1}, y_{t+1}) = (yt, 4xt(1 − xt)).

Exercise 5.8 Consider the two-parameter family of one-dimensional maps defined by
f_{a,b}(x) = a + bx/(1 + x²).
Study its bifurcation diagram for a ∈ [−5, 0] and b ∈ [11, 12].
Solutions

Solution 5.1 Figure 5.21 shows that the map f has a unique fixed point in the interval [3, 4], solution of the equation −2x + 10 = x, i.e., x = 10/3. This fixed point is unstable (f′(10/3) = −2). Since f³ has no other fixed point, f has no 3-point cycle, while it is readily verified that it has a 5-point cycle, namely {1, 3, 4, 2, 5}.

Solution 5.2 If {x*1, x*2, x*3} is a 3-point cycle, we have to distinguish three possibilities.
1. Two points of the cycle (say x*1 and x*2) are less than 1/2, and the third one (x*3) is greater. In this case, the three points satisfy the equations
x*2 = r x*1,  x*3 = r x*2,  x*1 = r(1 − x*3).
Hence,
x*1 = r/(1 + r³),  x*2 = r²/(1 + r³),  x*3 = r³/(1 + r³).
Since we assumed that 0 < x*1 < 1/2, 0 < x*2 < 1/2, 1/2 < x*3 < 1, the parameter r has to satisfy the condition
(1 + √5)/2 < r ≤ 2.
Fig. 5.21. Exercise 5.1. Graphs of the maps f (left) and f³ (right). They show that f has a 5-point cycle {1, 3, 4, 2, 5} but no 3-point cycle.
2. One point of the cycle (say x*1) is less than 1/2, and the other two (x*2 and x*3) are greater. In this case, the three points satisfy the equations
x*2 = r x*1,  x*3 = r(1 − x*2),  x*1 = r(1 − x*3).
Hence,
x*1 = r/(1 + r + r²),  x*2 = r²/(1 + r + r²),  x*3 = r(1 + r)/(1 + r + r²).
Since we assumed that 0 < x*1 < 1/2, 1/2 < x*2 < 1, 1/2 < x*3 < 1, here again the parameter r has to satisfy the condition
(1 + √5)/2 < r ≤ 2.
These two 3-cycles are shown in Figure 5.22.
3. If one point of the cycle (say x*2) is equal to 1/2, then, among the other two, one (x*3) is greater than 1/2, and the second one (x*1) is less than 1/2. Then x*1 and x*3 have to satisfy the relations
r x*1 = 1/2,  r/2 = x*3,  r(1 − x*3) = x*1.
In this case, r is the root of r³ − 2r² + 1 = 0 in the semi-open interval ]1, 2]; that is, r = (1 + √5)/2. This particular 3-cycle is represented in Figure 5.23.

Solution 5.3 Let ρ be the invariant probability density for the asymmetric tent map T_{a,b}. If I is an open interval in [0, 1], its preimage T_{a,b}^{−1}(I) is the union of two disjoint intervals I1 and I2 whose Lebesgue measure is
m(I1 ∪ I2) = m(I1) + m(I2) = (1/a + 1/b) m(I).
Fig. 5.22. The two different types of 3-point cycles for the symmetric tent map Tr for r = 1.9.
Fig. 5.23. The 3-point cycle for the symmetric tent map Tr for r = (1 + √5)/2.
Hence the map T_{a,b} preserves the Lebesgue measure if
1/a + 1/b = 1.
This relation is satisfied in particular by the symmetric binary tent map T2 and by any asymmetric tent map T_{a,b} with maximum
T_{a,b}(b/(a + b)) = ab/(a + b) = 1.
We could also have derived this result from the Perron-Frobenius equation, which, in the case of the asymmetric tent map, reads
ρ(x) = (1/a) ρ(x/a) + (1/b) ρ(1 − x/b).
This equation is satisfied by ρ(x) = 1 if
1/a + 1/b = 1.
The graph of the asymmetric tent map for a = 3 and b = 3/2, which satisfy the condition above, is plotted in Figure 5.24.
Fig. 5.24. An example of an asymmetric tent map T_{a,b} for a = 3 and b = 3/2 with invariant probability density ρ = 1 (see the text).
Solution 5.4 The map f (see Figure 5.25) has sensitive dependence on initial conditions because |f′(x)| = 3 for all x ∈ R+. Between two consecutive integral values of x, f^n has 3^n − 2 unstable fixed points. Since the distance between two consecutive fixed points is less than 3^{−n+1}, these periodic points are dense in R+. The map f, however, is not chaotic since it is not topologically transitive, because f([k, k + 1]) = [k, k + 1] for any nonnegative integer k.

Solution 5.5 As for all bifurcation diagrams represented in this chapter, this one (see Figure 5.26) has been computed for 300 equally spaced parameter values, and for each parameter value, 300 iterates are calculated, but only the last 100 have been plotted.

Solution 5.6 The map f_{r,θ} has two fixed points: 0 and 1. 0 is always unstable (f′_{r,θ}(0) = 1 + r), and 1, which is the reduced carrying capacity, is asymptotically stable if 0 < rθ < 2 (f′_{r,θ}(1) = 1 − rθ). For rθ = 2, the family of maps undergoes a period-doubling bifurcation. While the stability of the fixed point 1 depends only on the product rθ, this is not the case for the 2-point cycle. To simplify, we shall, therefore, fix the value of r and find the values of θ for which the family f_{r,θ} undergoes the first period-doubling bifurcations. For r = 1 the 2-point cycle is stable for 2 < θ < 2.47887, and the 4-point cycle is stable for 2.47887 < θ < 2.5876. The bifurcation diagram is represented in Figure 5.27.
Fig. 5.25. An example of a map in R+ (left figure) that has sensitive dependence on initial conditions and dense periodic points but that is not topologically transitive and, therefore, not chaotic. The right figure shows the graph of f².
Fig. 5.26. Bifurcation diagram of the discrete map (n, r) → −rn log n. The parameter r, plotted on the horizontal axis, varies from 1.8 to 2.7, and the population n, plotted on the vertical axis, varies between 0 and 1.

Solution 5.7 Let f : (x, y) → (y, 4x(1 − x)); then f² : (x, y) → (4x(1 − x), 4y(1 − y)), which shows that f² : (x, y) → (f4(x), f4(y)), where f4 is the chaotic logistic map x → 4x(1 − x). Thus, taking into account the expression of the invariant probability density (5.3) found in Subsection 5.3.1, the invariant probability density of the two-dimensional map f is given by
ρ(x, y) = 1/(π²√(xy(1 − x)(1 − y))).
Fig. 5.27. Bifurcation diagram of the discrete theta model (n, r, θ) → n + rn(1 − n^θ). The value of r is equal to 1, the parameter θ, plotted on the horizontal axis, varies from 1.5 to 3.4, and the population n, plotted on the vertical axis, varies between 0 and 1.4.
Fig. 5.28. Bifurcation diagram of the two-parameter map (x, a, b) → a + bx/(1 + x²) for b = 11. The parameter a, plotted on the horizontal axis, varies from −5 to 0, and x, plotted on the vertical axis, varies between 0 and 5.

Solution 5.8 Figures 5.28 and 5.29 show the bifurcation diagram of the map f_{a,b} for b equal to 11 and 12, respectively. For b = 11 and a = −5, the attractor is a stable fixed point. Then, as a increases from −5 to −3, we observe two period-doubling bifurcations, but as a increases further, the system undergoes two reverse period-doubling bifurcations and its attractor is again a fixed point. For b = 12 and a = −5, the attractor is, as for b = 11, a stable fixed point. But here, as a increases, the map becomes apparently chaotic after a sequence of period-doubling bifurcations, and as a increases further, through a reverse route, the attractor again becomes a fixed point. In the chaotic region, periodic windows are clearly visible.
Fig. 5.29. Bifurcation diagram of the two-parameter map (x, a, b) → a + bx/(1 + x²) for b = 12. The parameter a, plotted on the horizontal axis, varies from −5 to 0, and x, plotted on the vertical axis, varies between 0 and 5.

Figure 5.30 shows the bifurcation diagram obtained for a = −3 and increasing b from 11 to 12.
Fig. 5.30. Bifurcation diagram of the two-parameter map (x, a, b) → a + bx/(1 + x²). The parameter a is equal to −3, the parameter b, plotted on the horizontal axis, varies from 11 to 12, and x, plotted on the vertical axis, varies between 0 and 3.5.
Part II
Agent-Based Models
In the preceding chapters, interacting individual actors have never been explicitly taken into account, we have only been interested in the evolution of average quantities. For instance, when modeling a predator-prey system, we only considered predator and prey densities and wrote down equations to determine how they varied as a function of time. In such an approach, each species is supposed to be homogeneously distributed in space. Even when space is taken into account, as in a model formulated in terms of partial differential equations, we have to assume that population densities are smooth functions of space, and an element of volume dx3 is supposed to contain a large number of agents for “local densities” to be defined. Agent-based modeling is more ambitious; it deals directly with spatially distributed agents, which may be animals, people, or companies. The agent-based models we shall discuss are formulated in terms of automata networks whose theory is not yet so well-developed. An automata network is a fully discrete dynamical system. It consists of a graph where each vertex takes states in a finite set.1 The state of a vertex changes in time according to a rule that only depends upon the states of neighboring vertices in the graph. Automata networks are more precisely defined as follows. Definition. Let G = (V, E) be a graph, where V is a set of vertices and E a set of edges. Each edge joins two vertices not necessarily distinct. An automata network, defined on V , is a triple (G, Q, {fi | i ∈ V }), where G is a graph on V , Q a finite set of states and fi : Q|Ui | → Q a mapping, called the local evolution rule or local transition rule associated with vertex i. Ui = {j ∈ V | {j, i} ∈ E} is the neighborhood of i ( i.e., the set of vertices connected to i), and |Ui | denotes the number of vertices belonging to Ui . The graph G is assumed to be locally finite; i.e., for all i ∈ V , |Ui | < ∞. Automata networks are ideal tools for agent-based modeling. Each agent has a position in space that is one of the vertices of the graph and an individuality represented by the state of the vertex where it is located. The interaction between the agents is described by the local evolution rule. The states may just be integers. For instance, in a simple epidemic model, the state of a vertex could be equal to 0, 1 or 2 to represent, respectively, an empty site, a site occupied by a susceptible (i.e., a healthy individual susceptible to catch the disease), or a site occupied by an infective (i.e., an individual capable communicating the disease). Note that these numbers are just symbols. We could replace 0, 1, and 2 by, respectively, the letters e, s, and i. States may also be vectors, whose different components would represent, for example, an individual’s sex, income, political affiliation, list of people met at previous times, etc. The word “vector” is not to be understood as an element of a linear space. It only denotes a list of symbols that are numbers, letters, or even lists of symbols. 1
On graph theory, see Section 7.2.
In the following chapters, the notion of critical behavior will often play an important role. Critical behavior manifests itself in many-component systems and is characteristic of a cooperative behavior of the various components. This notion has been introduced in statistical physics for many-body systems such as ferromagnetic materials, alloys, or liquid helium, which exhibit second-order phase transitions; that is, a phase change as a function of a tuning parameter, such as temperature. A ferromagnetic material (i.e., a system having a spontaneous nonzero magnetization), becomes, as its temperature is (in general) increased, paramagnetic (i.e., its magnetization in the absence of an external magnetic field is equal to zero). In an ordered alloy such as β-brass—a 50% copper and 50% zinc alloy—the atoms of copper and zinc are located on two identical sublattices, one sublattice containing more copper and the other more zinc. As the temperature is increased, the alloy becomes disordered (i.e., both sublattices contain equal fractions of copper and zinc). Liquid helium, which behaves as an ordinary liquid at temperatures above 2.19 K, becomes superfluid at lower temperatures. The temperature at which these phase transitions occur is called the critical temperature, and the system at the critical temperature—more generally at the critical point—is said to be in a critical state. At the critical point, physical quantities such as entropy, volume, or magnetization that are first derivatives of the free energy are continuous, in contrast with second-order derivatives, such as the specific heat or the magnetic susceptibility, which are singular. These singular behaviors reflect the long-range nature of the correlations in the vicinity of the critical point. Close to a second-order phase transition, correlation functions of fluctuating quantities (such as spins in the case of a para-ferromagnetic phase transition) at two different points decrease exponentially with a characteristic correlation length ξ. As the temperature T approaches the critical temperature Tc , ξ diverges as (T − Tc )−ν if T > Tc and (T − Tc )−ν if T < Tc . This cooperative effect is characteristic of criticality. It implies that certain physical quantities either vanish or diverge as powers of |T − Tc | as T approaches Tc . Today, when a many-agent system displays a power-law behavior for some observable, most researchers agree that this is a sign of some cooperative effect and a manifestation of the system complexity. Despite the great variety of physical systems that exhibit second-order phase transitions, their critical behaviors, characterized by a set of critical exponents, fall into a small number of universality classes that only depend on the symmetry of the order parameter (such as the magnetization for a ferromagnet) and space dimension. Critical behavior is universal in the sense that it does not depend upon details whose characteristic size is much less than the correlation length, such as lattice structure, range of interactions (as long as this range is finite), spin length, etc. Moreover, for a given secondorder phase transition, one needs to know only a rather small number of critical exponents to determine all other exponents. For instance, in the case of a para-ferromagnetic second-order phase transition, the specific heat at
constant magnetic field CB diverges as (T − Tc)^{−α} if T > Tc and (Tc − T)^{−α′} if T < Tc, the magnetization M, which is identically equal to zero for T > Tc, goes to zero as (Tc − T)^β if T < Tc, and the isothermal susceptibility χT diverges as (T − Tc)^{−γ} if T > Tc and (Tc − T)^{−γ′} if T < Tc. If we assume that the free energy F, close to the critical point, is a generalized homogeneous function of T − Tc and M, that is, a function satisfying, for all values of λ, the relation
F(λ(T − Tc), λ^β M) ≡ λ^{2−α} F(T − Tc, M),
which implies that F(T − Tc, M) can be written under the form
F(T − Tc, M) = |T − Tc|^{2−α} f(M/|T − Tc|^β),
where f is a function of only one variable, it can be shown2 that the critical exponents satisfy the following so-called scaling relations:
α = α′,  γ = γ′,  and  α + 2β + γ = 2.
An important distinguishing feature of the power-law behavior of physical quantities in the neighborhood of a critical point is that these quantities have no intrinsic scale. The function x → e−x/ξ has an intrinsic scale ξ, whereas the function x → xa has no intrinsic scale: power laws are self-similar. Quantities exhibiting a power-law behavior have been observed in a variety of disciplines ranging from linguistics and geography to medicine and economics. As mentioned above, the emergence of such a behavior is regarded as the signature of a collective mechanism. Second-order phase transitions are always associated with a broken symmetry. That is, the symmetry group of the ordered phase (the phase characterized by a nonzero value of the order parameter) is a subgroup of the disordered phase (the phase characterized by an order parameter identically equal to zero). To the order parameter, we can always associate a symmetrybreaking field. In the presence of such a field, the order parameter has a nonzero value, and, in this case, the system cannot exhibit a second-order phase transition.3 Depending upon the nature of the order parameter, a second-order phase transition exists only above a critical space dimensionality, called the lower critical space dimension. Critical exponents, which depend upon space dimensionality, have their mean-field values above another critical space dimensionality, called the upper critical space dimension. In the case of the Ising model, the lower and upper critical space dimensions are, respectively, equal to 1 and 4. 2 3
See Boccara [45]. The nature of the broken symmetry is not always obvious as, for instance, in the case of the normal-superfluid or normal-superconductor phase transitions. See Boccara [44].
6 Cellular Automata
Depending upon the nature of the graph G, there exist many types of automata networks. In this chapter, we focus on cellular automata in which the set of vertices, usually called sites, is either Zn , if the n-dimensional cellular automaton is infinite, or the torus ZnL , if the n-dimensional cellular automaton is finite. ZL denotes the set of integers modulo L. If the set of vertices is ZnL , the cellular automaton is said to satisfy periodic boundary conditions. In what follows, we always assume that this is the case. Cellular automata, constructed from many identical simple components but together capable exhibiting complex behavior, are ideal tools for developing models of complex systems [349].
6.1 Cellular automaton rules

Deterministic one-dimensional cellular automaton rules are defined as follows. Let s(i, t) ∈ Q represent the state at site i ∈ Z and time t ∈ N; a local evolution rule is a map f : Q^{rℓ+rr+1} → Q such that
s(i, t + 1) = f(s(i − rℓ, t), s(i − rℓ + 1, t), ..., s(i + rr, t)),   (6.1)
where the integers rℓ and rr are, respectively, the left radius and right radius of the rule f; if rℓ = rr = r, r is called the radius of the rule. The local rule f, which is a function of n = rℓ + rr + 1 arguments, is often said to be an n-input rule. The function St : i → s(i, t) is the state of the cellular automaton at time t; St belongs to the set Q^Z of all configurations. Since the state St+1 at time t + 1 is entirely determined by the state St at time t and the local rule f, there exists a unique mapping Ff : S → S such that
St+1 = Ff(St),   (6.2)
which is called the cellular automaton global rule or the cellular automaton evolution operator induced by the local rule f.
In the case of two-dimensional cellular automata, there are several possible lattices and neighborhoods. If, for example, we consider a square lattice and a (2r1 + 1) × (2r2 + 1)-neighborhood, the state s((i1, i2), t + 1) ∈ Q of site (i1, i2) at time t + 1 is determined by the state of the (2r1 + 1) × (2r2 + 1)-block of sites centered at (i1, i2) by
s((i1, i2), t + 1) = f( s((i1 − r1, i2 − r2), t), ..., s((i1 + r1, i2 + r2), t) ),   (6.3)
where f : Q^{(2r1+1)×(2r2+1)} → Q is the two-dimensional cellular automaton local evolution rule.
Most models presented in this chapter will be formulated in terms of probabilistic cellular automata. In this case, the image by the evolution rule of any (rℓ + rr + 1)-block, for a one-dimensional cellular automaton, or of any given (2r1 + 1) × (2r2 + 1)-block, for a two-dimensional cellular automaton, is a discrete random variable with values in Q.
The simplest cellular automata are the so-called elementary cellular automata, in which the finite set of states is Q = {0, 1} and the rule's radii are rℓ = rr = 1. Sites in a nonzero state are sometimes said to be active. It is easy to verify that there exist 2^{2^3} = 256 different elementary cellular automaton local rules f : {0, 1}³ → {0, 1}. The local rule of an elementary cellular automaton can be specified by its look-up table, giving the image of each of the eight three-site neighborhoods. That is, any sequence of eight binary digits specifies an elementary cellular automaton rule. Here is an example:
111 → 1, 110 → 0, 101 → 1, 100 → 1, 011 → 1, 010 → 0, 001 → 0, 000 → 0.
Following Wolfram [348], a code number may be associated with each cellular automaton rule. If Q = {0, 1}, this code number is the decimal value of the binary sequence of images. For instance, the code number of the rule above is 184 since 10111000_2 = 2^7 + 2^5 + 2^4 + 2^3 = 184_10. More generally, the code number N(f) of a one-dimensional |Q|-state n-input cellular automaton rule f is defined by
N(f) = Σ_{(x1,x2,...,xn)∈Q^n} f(x1, x2, ..., xn) |Q|^{|Q|^{n−1} x1 + |Q|^{n−2} x2 + ... + |Q|^0 xn}.
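The following sketch, not from the book, turns a Wolfram code number into the corresponding look-up table and iterates the rule on a ring of 50 sites with periodic boundary conditions; for rule 184 the density of active sites is conserved, as discussed in the next section.

```python
# Hedged sketch: elementary CA from its Wolfram code number, applied on a periodic ring.
import random

def elementary_rule(number):
    # table[(left, center, right)] = image; neighborhoods ordered 111, 110, ..., 000
    return {(k >> 2 & 1, k >> 1 & 1, k & 1): (number >> k) & 1 for k in range(8)}

def step(config, table):
    L = len(config)
    return [table[(config[(i - 1) % L], config[i], config[(i + 1) % L])] for i in range(L)]

random.seed(0)
table = elementary_rule(184)
config = [random.randint(0, 1) for _ in range(50)]
initial_density = sum(config) / len(config)
for _ in range(50):
    config = step(config, table)
print("initial density:", initial_density, " final density:", sum(config) / len(config))
```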
Figure 6.1 represents the first 50 iterations of the elementary cellular automaton rule 184. The cellular automaton size is equal to 50. The initial
Fig. 6.1. First 50 iterations of the elementary cellular automaton rule 184 starting from a random initial configuration with an equal number of 0s (light grey) and 1s (dark grey). Time is oriented downwards. Number of lattice sites: 50.
configuration is random, and the density of active sites is exactly equal to 1/2. As a result of the local rule, we observe the growth of a self-organized pattern. After a number of time steps of the order of the size of the cellular automaton, the initially disordered configuration becomes perfectly ordered; i.e., the attractor consists of the two configurations
...01010101...  and  ...10101010....
In the case of cellular automata, the attractor is called the limit set and is defined by
ΛF = lim_{t→∞} F^t(S) = ⋂_{t≥0} F^t(S),
where F is the global evolution rule and S = Q^Z is the set of all configurations. As mentioned in Chapter 1, such an emergent property is an essential characteristic of a complex system. In the case of the elementary cellular automaton 184, the perfect order exists only if the density of active sites is exactly equal to 1/2. If this is not the case, the limit set Λ_{F184}, illustrated in Figure 6.2, is the set of all configurations consisting of finite sequences of alternating 0s and 1s separated by finite sequences of either 0s, if the density of active sites is less than 1/2, or 1s, if the density of active sites is greater than 1/2.
The evolution towards the limit set can be viewed as the elimination of defects, which are sequences of 0s or sequences of 1s. These defects can also be viewed as interacting particlelike structures evolving in a regular background.1
Fig. 6.2. First 50 iterations of the elementary cellular automaton rule 184 starting from a random initial configuration. The density of active sites is exactly equal to 0.4 in the left figure and 0.6 in the right one. Time is oriented downwards. The number of lattice sites is equal to 50.
6.2 Number-conserving cellular automata

Elementary cellular automaton 184 belongs to a special class of cellular automata. It is number-conserving; i.e., it satisfies the condition

\[
(\forall t\in\mathbb{N})\qquad \sum_{i=1}^{L} s(i,t) = \text{constant},
\]

where L is the cellular automaton size. It is not difficult to establish a necessary and sufficient condition for a one-dimensional |Q|-state n-input cellular automaton rule to be number-conserving [62]. An extension of these results to infinite lattices and higher dimensions can be found in Durand, Formenti, and Roka [112]. It can be shown that any one-dimensional cellular automaton can be simulated by a number-conserving cellular automaton; see Moreira [249].
On interacting particlelike structures in spatiotemporal patterns representing the evolution of one-dimensional cellular automata, see Boccara, Nasser, and Roger [48].
Definition 22. A one-dimensional |Q|-state n-input cellular automaton rule f is number-conserving if, for all cyclic configurations of length L >= n, it satisfies

\[
f(x_1,x_2,\ldots,x_{n-1},x_n)+f(x_2,x_3,\ldots,x_n,x_{n+1})+\cdots+f(x_L,x_1,\ldots,x_{n-2},x_{n-1}) = x_1+x_2+\cdots+x_L. \qquad (6.4)
\]

Theorem 13. A one-dimensional |Q|-state n-input cellular automaton rule f is number-conserving if, and only if, for all (x_1, x_2, ..., x_n) in Q^n, it satisfies

\[
f(x_1,x_2,\ldots,x_n) = x_1 + \sum_{k=1}^{n-1}\Bigl(f(\underbrace{0,0,\ldots,0}_{k},x_2,x_3,\ldots,x_{n-k+1}) - f(\underbrace{0,0,\ldots,0}_{k},x_1,x_2,\ldots,x_{n-k})\Bigr). \qquad (6.5)
\]
To simplify the proof, we need the following lemma.

Lemma 1. If f is a number-conserving rule, then

\[
f(0,0,\ldots,0) = 0. \qquad (6.6)
\]
Write Condition (6.4) for a cyclic configuration of length L >= n where all elements are equal to zero.
To prove that Condition (6.5) is necessary, consider a cyclic configuration of length L >= 2n - 1 that is the concatenation of a sequence (x_1, x_2, ..., x_n) and a sequence of L - n zeros, and express that the n-input rule f is number-conserving. We obtain

\[
f(0,\ldots,0,x_1)+f(0,\ldots,0,x_1,x_2)+\cdots+f(x_1,x_2,\ldots,x_n)+f(x_2,x_3,\ldots,x_n,0)+\cdots+f(x_n,0,\ldots,0) = x_1+x_2+\cdots+x_n, \qquad (6.7)
\]

where all the terms of the form f(0, 0, ..., 0), which are equal to zero according to (6.6), have not been written. Replacing x_1 by 0 in (6.7) gives

\[
f(0,\ldots,0,x_2)+\cdots+f(0,x_2,\ldots,x_n)+f(x_2,x_3,\ldots,x_n,0)+\cdots+f(x_n,0,\ldots,0) = x_2+\cdots+x_n. \qquad (6.8)
\]
Subtracting (6.8) from (6.7) yields (6.5). Condition (6.5) is obviously sufficient since, when summed on a cyclic configuration, all the left-hand side terms except the first cancel. Remark 10. The proof above shows that if we can verify that a cellular automaton rule f is number-conserving for all cyclic configurations of length 2n − 1, then it is number-conserving for all cyclic configurations of length L > 2n − 1.
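Remark 10 turns the definition into a finite test. The following Python sketch (ours; the helper name is arbitrary) checks Condition (6.4) on all cyclic configurations of length 2n - 1.

from itertools import product

def is_number_conserving(f, n, q):
    """Check Condition (6.4) for an n-input rule f over Q = {0,...,q-1}
    on every cyclic configuration of length L = 2n - 1 (sufficient by Remark 10)."""
    L = 2 * n - 1
    for config in product(range(q), repeat=L):
        total = sum(f(*(config[(i + j) % L] for j in range(n))) for i in range(L))
        if total != sum(config):
            return False
    return True

# Elementary rule 184 is number-conserving; rule 18 is not.
rule184 = lambda x1, x2, x3: (184 >> (4 * x1 + 2 * x2 + x3)) & 1
rule18  = lambda x1, x2, x3: (18  >> (4 * x1 + 2 * x2 + x3)) & 1
assert is_number_conserving(rule184, 3, 2)
assert not is_number_conserving(rule18, 3, 2)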
The following corollaries are simple necessary conditions for a cellular automaton rule to be number-conserving.

Corollary 1. If f is a one-dimensional |Q|-state n-input number-conserving cellular automaton rule, then, for all x in Q,

\[
f(x,x,\ldots,x) = x. \qquad (6.9)
\]

To prove (6.9), which is a generalization of (6.6), write Condition (6.5) for x_1 = x_2 = ... = x_n = x.

Corollary 2. If f is a one-dimensional |Q|-state n-input number-conserving cellular automaton rule, then

\[
\sum_{(x_1,x_2,\ldots,x_n)\in Q^n} f(x_1,x_2,\ldots,x_n) = \tfrac{1}{2}\,(|Q|-1)\,|Q|^{n}. \qquad (6.10)
\]
When we sum (6.5) over (x_1, x_2, ..., x_n) in Q^n, all the left-hand-side terms except the first cancel, and the sum over the remaining terms is equal to (0 + 1 + 2 + ... + (|Q| - 1))|Q|^{n-1} = (1/2)(|Q| - 1)|Q|^n.
Number-conserving cellular automata can be used to model closed systems of interacting particles. So far, we have only considered cellular automata evolving according to deterministic rules. In many applications it is often necessary to allow for probabilistic rules as defined on page 192.

Example 38. Highway car traffic. Vehicular traffic can be treated as a system of interacting particles driven far from equilibrium. The so-called particle-hopping models describe car traffic in terms of probabilistic cellular automata. The first model of this type was proposed by Nagel and Schreckenberg [257]. These authors consider a finite lattice of length L with periodic boundary conditions. Each cell is either empty (i.e., in state e) or occupied by a car (i.e., in state v), where v = 0, 1, ..., v_max denotes the car velocity (cars are moving to the right). If d_i is the distance between cars i and i + 1, car velocities are updated in parallel according to the following subrules:

\[
v_i\bigl(t+\tfrac12\bigr) = \min\bigl(v_i(t)+1,\; d_i(t)-1,\; v_{\max}\bigr),
\]
\[
v_i(t+1) = \begin{cases} \max\bigl(v_i(t+\tfrac12)-1,\,0\bigr), & \text{with probability } p,\\ v_i\bigl(t+\tfrac12\bigr), & \text{with probability } 1-p,\end{cases}
\]

where v_i(t) is the velocity of car i at time t. Then, if x_i(t) is the position of car i at time t, cars are moving according to the rule

\[
x_i(t+1) = x_i(t) + v_i(t+1).
\]

That is, at each time step, each car increases its speed by one unit (acceleration a = 1), respecting the safety distance and the speed limit. But the model also
includes some noise: with a probability p, each car decreases its speed by one unit. Although rather simple, the model exhibits features observed in real highway traffic (e.g., with increasing vehicle density, it shows a transition from laminar traffic flow to start-stop waves, as illustrated in Figure 6.3).
Fig. 6.3. First 50 iterations of the Nagel-Schreckenberg probabilistic cellular automaton traffic flow model. The initial configuration is random with a density equal to 0.24 in the left figure and 0.48 in the right one. In both cases vmax = 2 and p = 0.2. The number of lattice sites is equal to 50. Empty cells are very light grey while cells occupied by a car with velocity v equal to 0, 1, and 2 have darker shades of grey. Time increases downwards.
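For readers who wish to reproduce patterns such as those of Figure 6.3, here is a minimal Python sketch (ours, not the authors' code) of one parallel update of the Nagel-Schreckenberg rule; it assumes at least two cars on the ring, and the names are arbitrary.

import random

def nasch_step(pos, vel, L, vmax, p):
    """One parallel update of the Nagel-Schreckenberg model on a ring of L cells.
    pos[i], vel[i]: position and velocity of car i (cars ordered along the ring)."""
    n = len(pos)
    new_vel = []
    for i in range(n):
        gap = (pos[(i + 1) % n] - pos[i]) % L   # d_i: distance to the car ahead
        v = min(vel[i] + 1, gap - 1, vmax)      # acceleration, safety distance, speed limit
        if random.random() < p:                 # random braking with probability p
            v = max(v - 1, 0)
        new_vel.append(v)
    new_pos = [(pos[i] + new_vel[i]) % L for i in range(n)]
    return new_pos, new_vel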
When discussing road configurations evolving according to various illustrative rules, the knowledge of both car positions and velocities proves necessary. Therefore, although we are dealing with two-state cellular automaton rules, we shall not represent the state of a cell by its occupation number (i.e., 0 or 1), but by a letter in the alphabet {e, 0, 1, . . . , vmax } indicating that the cell is either empty (i.e., in state e) or occupied by a car with a velocity equal to v (i.e., in state v ∈ {0, 1, . . . , vmax }). Configurations of cells of this type will be called velocity configurations, or configurations for short. When using velocity configurations, we shall not represent cellular automaton rules by their rule tables but make use of a representation that clearly exhibits the particle motion. This motion representation, or velocity rule, introduced by Boccara and Fuk´s [59], may be defined as follows. If the integer r is the rule’s radius, list all the (2r + 1)-neighborhoods of a given particle represented by a 1 located at the central site of the neighborhood. Then, to each neighborhood, associate an integer v denoting the velocity of this particle, which is the number of sites this particle will move in one time step, with the convention that v is positive if the particle moves to the right and
negative if it moves to the left. For example, rule 184 is represented by the radius 1 velocity rule:

•11 -> 0,    •10 -> 1.    (6.11)

The symbol • represents either a 0 or a 1 (i.e., either an empty or an occupied site). This representation, which clearly shows that a car can move to the next neighboring site on its right if, and only if, this site is empty, is shorter and more explicit.
Elementary cellular automaton rule 184 is a particular Nagel-Schreckenberg highway traffic flow model in which the probability p of decelerating is equal to zero and the maximum speed v_max = 1. In this particular case, Fuks [133] has been able to determine the exact expression of the average car velocity v(rho, t) as a function of car density rho and time t. He found

\[
v(\rho,t) = \begin{cases} 1 - \dfrac{\bigl(4\rho(1-\rho)\bigr)^{t}}{\sqrt{\pi t}}, & \text{if } \rho < \tfrac12,\\[8pt] \dfrac{1-\rho}{\rho}\left(1 - \dfrac{\bigl(4\rho(1-\rho)\bigr)^{t}}{\sqrt{\pi t}}\right), & \text{otherwise.}\end{cases}
\]

In the limit t -> infinity, this yields

\[
v(\rho,\infty) = \begin{cases} 1, & \text{if } \rho < \tfrac12,\\[4pt] \dfrac{1-\rho}{\rho}, & \text{otherwise.}\end{cases}
\]

This deterministic model exhibits a second-order phase transition at rho = 1/2 between a free-moving phase and a jammed phase (see Figure 6.2). This phase transition may be characterized by the order parameter m(rho) = 1 - v(rho, infinity), which is equal to zero in the free-moving phase.2 The control parameter, whose role is usually played by the temperature in statistical physics, is here played by the car density.
It is straightforward to generalize the traffic flow model described by rule 184 to higher maximum velocities (see Figure 6.4). If, as before, d_i is the distance between car i and car i + 1, car velocities are updated in parallel according to the rule3

\[
v_i(t+1) = \min\bigl(d_i(t)-1,\; v_{\max}\bigr),
\]

and car positions are updated according to the rule

2 In phase transition theory, the order parameter is equal to zero in the phase of higher symmetry; its nonzero value characterizes the broken symmetry of the ordered phase.
3 A similar model has also been studied by Fukui and Ishibashi [134].
Fig. 6.4. First 30 iterations of the generalized deterministic rule 184 traffic flow model. The initial configuration is random with a density equal to 0.24 in the left figure and 0.48 in the right one. In both cases vmax = 2. The number of lattice sites is equal to 50. Empty cells are very light grey while cells occupied by a car with velocity v equal to 0, 1, and 2 have darker shades of grey. Time increases downwards.
x_i(t + 1) = x_i(t) + v_i(t + 1). In the steady state, the average car velocity v(rho, infinity) is given, as a function of the car density rho, by

\[
v(\rho,\infty) = \begin{cases} v_{\max}, & \text{if } \rho < \rho_c = \dfrac{1}{1+v_{\max}},\\[6pt] \dfrac{1-\rho}{\rho}, & \text{otherwise,}\end{cases} \qquad (6.12)
\]

and the order parameter characterizing the second-order phase transition between the free-moving phase and the jammed phase is

\[
m(\rho) = v_{\max} - v(\rho,\infty), \qquad (6.13)
\]

or

\[
m(\rho) = \begin{cases} 0, & \text{if } \rho < \rho_c,\\[4pt] \dfrac{\rho-\rho_c}{\rho\,\rho_c}, & \text{otherwise.}\end{cases}
\]

In the jammed phase close to the critical car density rho_c, the critical behavior of the order parameter is characterized by the exponent beta = 1 defined by m(rho) ~ (rho - rho_c)^beta. While the value of the critical exponent beta can be found exactly, this is not the case for the other critical exponents. The exponents gamma and delta have been determined by Boccara and Fuks [60]. They characterize, respectively, the critical behavior of the susceptibility and the response to the symmetry-breaking field at the critical point.
If rho < rho_c, any configuration in the limit set consists of perfect tiles of v_max + 1 cells, as shown below,4
Note that, in what follows, v is the velocity with which the car is going to move at the next time step.
[ v_max | e | e | · · · | e ]

in a sea of cells in state e (see Figure 6.4). If rho > rho_c, a configuration belonging to the limit set only consists of a mixture of tiles containing v + 1 cells of the type

[ v | e | e | · · · | e ]
where v = 0, 1, ..., v_max. If we introduce random braking, then, even at low density, some tiles become defective, which causes the average velocity to be less than v_max. The random-braking probability p can, therefore, be viewed as a symmetry-breaking field conjugate to the order parameter m defined by (6.13). This point of view implies that the phase transition characterized by m will be smeared out in the presence of random braking, as in ferromagnetic systems placed in a magnetic field. The susceptibility, defined as

\[
\chi(\rho) = \lim_{p\to 0}\frac{\partial m}{\partial p},
\]

diverges, in the vicinity of rho_c, as (rho_c - rho)^{-gamma} for rho < rho_c and as (rho - rho_c)^{-gamma'} for rho > rho_c, and, at rho = rho_c, the response function

\[
\frac{m(\rho_c,0)-m(\rho_c,p)}{p^{\phantom{1}}}
\]

goes to zero, when p -> 0, as p^{1/delta}. For the model corresponding to v_max = 2,5 using numerical simulations and a systematic approximation technique, described in Example 42, Boccara and Fuks found gamma = gamma' ≈ 1 and delta ≈ 2. It is interesting to note that these values are found in equilibrium statistical physics in the case of second-order phase transitions characterized by nonnegative order parameters above the upper critical space dimension.
Close to the phase transition point, critical exponents obey scaling relations. If we assume that, in the vicinity of the critical point (rho = rho_c, p = 0), the order parameter m is a generalized homogeneous function of rho - rho_c and p of the form

\[
m = |\rho-\rho_c|^{\beta}\, f\!\left(\frac{p}{|\rho-\rho_c|^{\beta\delta}}\right), \qquad (6.14)
\]

where the function f is such that f(0) = 0, then, differentiating (6.14) with respect to p and taking the limit p -> 0, we readily obtain

\[
\gamma = \gamma' = (\delta-1)\beta. \qquad (6.15)
\]
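For completeness, here is the short computation behind (6.15), assuming only that the scaling function f in (6.14) is differentiable at the origin:

\[
\chi(\rho)=\lim_{p\to 0}\frac{\partial m}{\partial p}
 =\lim_{p\to 0}|\rho-\rho_c|^{\beta}\,|\rho-\rho_c|^{-\beta\delta}\,
   f'\!\left(\frac{p}{|\rho-\rho_c|^{\beta\delta}}\right)
 = f'(0)\,|\rho-\rho_c|^{-\beta(\delta-1)},
\]

so that comparison with chi ~ |rho - rho_c|^{-gamma} (and the same computation on the other side of rho_c) gives gamma = gamma' = (delta - 1)beta.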
For v_max = 2, the cellular automaton local rule must have at least a left radius r_l = 2 and a right radius r_r = 1. Using the Wolfram codification, the corresponding 4-input rule code number is 43944.
These relations, verified by the numerical values, confirm the existence of a universal scaling function, which has also been checked directly [60].
Boccara [61] has shown that this highway traffic flow model satisfies, with other deterministic traffic flow models, a variational principle. For the sake of simplicity, in our discussion of this variational principle, it is sufficient to consider the case v_max = 2; that is, the radius 2 velocity rule

••11• -> 0,    ••101 -> 1,    ••100 -> 2.    (6.16)

According to velocity rule (6.16) and our choice of configuration representation, a cell occupied by a car with velocity v must be preceded by at least v empty cells. Each configuration is therefore a concatenation of the following four types of tiles:

[ 2 | e | e ],   [ 1 | e ],   [ 0 ],   [ e ]
The first tile, which corresponds to cars moving at the speed limit v_max = 2, is a perfect tile; the next two tiles, corresponding to cars with a velocity less than 2 (here 1 and 0), are defective tiles; and the last tile is a free empty cell, that is, an empty cell that is not part of either a perfect or a defective tile.
If rho < rho_c = 1/3 (left part of Figure 6.4), only the first configurations contain defective tiles. After a few time steps, these tiles progressively disappear, and the last configurations contain only perfect tiles and free empty cells. Hence, all cars move at v_max = 2, and the system is in the free-moving phase. If rho > rho_c (right part of Figure 6.4), at the beginning, the same process of annihilation of defective tiles takes place, but, in this case, not all defective tiles eventually disappear. A few cars move at v_max, while other cars either have a reduced speed (v = 1) or are stopped (v = 0). The system is in the jammed phase.
To analyze the annihilation process of defective tiles in generalized rule 184 models, we need to define what we call a local jam.

Definition 23. In deterministic generalized rule 184 models of traffic flow, a local jam is a sequence of defective tiles preceded by a perfect tile and followed by either a perfect tile or free empty cells.

From this definition, it follows that:

Theorem 14. In the case of deterministic generalized rule 184 models of traffic flow, the number of cars that belong to a local jam is a nonincreasing function of time.

This result is a direct consequence of the fact that, by definition, a local jam is preceded by a car that is free to move, and, according to whether a new car joins the local jam from behind or not, the number of cars in the local jam remains unchanged or decreases by one unit. Note that the jammed
car just behind the free-moving car leading the local jam itself becomes free to move at the next time step.
In order to establish the variational principle for any value of v_max, we will first prove the following lemma.

Lemma 2. In the case of deterministic generalized rule 184 models of traffic flow, the number of free empty cells is a nonincreasing function of time.

Let us analyze how the structure of the most general local jam changes in one time step. A local jam consisting of n defective tiles is represented below:

\[
\cdots\; v_1 \underbrace{e\,e\cdots e}_{v_1}\; v_2 \underbrace{e\,e\cdots e}_{v_2}\;\cdots\; v_n \underbrace{e\,e\cdots e}_{v_n}\; v_{\max} \underbrace{e\,e\cdots e}_{v_{\max}}\;\cdots,
\]
where, for i = 1, 2, ..., n, 0 <= v_i < v_max. At the next time step, a car with velocity v_i located in cell k moves to cell k + v_i. Hence, if the local jam is followed by v_0 free empty cells, where v_0 >= 0, we have to distinguish two cases:

1. If v_0 + v_1 < v_max, then the number of jammed cars remains unchanged but the leftmost jammed car, whose velocity was v_1, is replaced by a jammed car whose velocity is v_0 + v_1:

\[
\cdots\, v_{\max}\overbrace{e\,e\cdots e}^{v_{\max}+v_0}\, v_1\, e\,e\cdots e\; v_2\, e\,e\cdots e\,\cdots\, v_n\, e\,e\cdots e\; v_{\max}\, e\,e\cdots e\,\cdots
\]
\[
\cdots\; e\,(v_0+v_1)\, e\,e\cdots e\; v_2\, e\,e\cdots e\,\cdots\, v_n\, e\,e\cdots e\; v_{\max}\, e\,e\cdots e\,\cdots
\]

2. If v_0 + v_1 >= v_max, then the local jam loses its leftmost jammed car:

\[
\cdots\, e\, v_1\, e\,e\cdots e\; v_2\, e\,e\cdots e\,\cdots\, v_n\, e\,e\cdots e\; v_{\max}\, e\,e\cdots e\,\cdots
\]
\[
\cdots\, v_{\max}\underbrace{e\,e\cdots e}_{v_{\max}+v_0'}\, v_2\, e\,e\cdots e\,\cdots\, v_n\, e\,e\cdots e\; v_{\max}\, e\,e\cdots e\,\cdots
\]
and, at the next time step, the local jam is followed by v_0' = v_0 + v_1 - v_max < v_0 free empty cells.
If we partition the lattice into tile sequences whose endpoints are perfect tiles, then, between two consecutive perfect tiles, either there is a local jam, and the proof above shows that the number of free empty cells between the two perfect tiles cannot increase, or there is no local jam, and the number of free empty cells between the two perfect tiles remains unchanged.

Remark 11. If a configuration contains no perfect tiles, then it does not contain free empty cells. Such a configuration belongs, therefore, to the limit set, and as we shall see below, the system is in its steady state. On a circular highway, if a configuration contains only one perfect tile, the reasoning above applies without modification.

Remark 12. The proof above shows that, at each time step, the rightmost jammed car of a local jam moves one site to the left. Local jams can only move backwards.
We can now prove the following variational principle:

Theorem 15. In the case of deterministic generalized rule 184 models of traffic flow, for a given car density rho, the average car flow is a nondecreasing function of time and reaches its maximum value in the steady state.

If L is the lattice length, N the number of cars, and N_fec(t) the number of free empty cells at time t, we have

\[
N_{\mathrm{fec}}(t) = L - N - \sum_{i=1}^{N} v_i(t),
\]

since, at time t, car i is necessarily preceded by v_i(t) empty cells. Dividing by L we obtain

\[
\frac{N_{\mathrm{fec}}(t)}{L} = 1 - \frac{N}{L} - \frac{N}{L}\,\frac{1}{N}\sum_{i=1}^{N} v_i(t).
\]

Hence, for all t,

\[
\rho_{\mathrm{fec}}(t) = 1 - \rho - \rho\, v(\rho,t), \qquad (6.17)
\]

where rho_fec(t) is the density of free empty cells at time t and v(rho, t) the average car velocity at time t for a car density equal to rho. This last result shows that, when time increases, since the density of free empty cells cannot increase, the average car flow rho v(rho, t) cannot decrease.
The annihilation process of defective tiles stops when there are either no more defective tiles or no more free empty cells. Since a perfect tile consists of v_max + 1 cells, if the car density rho is less than the critical density rho_c = 1/(v_max + 1), there exist enough free empty cells to annihilate all the defective tiles, and all cars eventually become free to move. If rho > rho_c, there are not enough free empty cells to annihilate all defective tiles, and eventually some cars are not free to move at v_max. At the end of the annihilation process, all subsequent configurations belong to the limit set, and the system is in equilibrium or in the steady state.
Note that the existence of a free-moving phase for a car density less than rho_c = 1/(v_max + 1) can be seen as a consequence of Relation (6.17). When t goes to infinity, according to whether rho_fec is positive or zero, this relation implies

\[
v(\rho,\infty) = \min\!\left(\frac{1-\rho}{\rho},\, v_{\max}\right).
\]

If the system is finite, its state eventually becomes periodic in time, and the period is equal to the lattice size or one of its submultiples. Since local jams move backwards and free empty cells move forwards, equilibrium is reached after a number of time steps proportional to the lattice size.
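The monotonicity asserted by Lemma 2 and Theorem 15 is easy to check numerically. The Python sketch below (ours; names are arbitrary) iterates the deterministic v_max = 2 model and verifies, through Relation (6.17), that the density of free empty cells never increases.

import random

def deterministic_step(pos, L, vmax):
    """One update of the deterministic generalized rule 184 model: v_i = min(d_i - 1, vmax)."""
    n = len(pos)
    vel = [min((pos[(i + 1) % n] - pos[i]) % L - 1, vmax) for i in range(n)]
    return [(pos[i] + vel[i]) % L for i in range(n)], vel

def free_empty_cell_density(L, n_cars, vel):
    # Relation (6.17): rho_fec = 1 - rho - rho * (average velocity)
    return 1 - n_cars / L - sum(vel) / L

L, vmax, rho = 50, 2, 0.48
pos = sorted(random.sample(range(L), int(rho * L)))
prev = 1.0
for t in range(100):
    pos, vel = deterministic_step(pos, L, vmax)
    rfec = free_empty_cell_density(L, len(pos), vel)
    assert rfec <= prev + 1e-12   # Lemma 2: the number of free empty cells never increases
    prev = rfec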
Remark 13. In the case of rule 184, the existence of a free-moving phase for a particle density rho less than the critical density rho_c = 1/2 obviously implies that the average velocity v(rho, t), as a function of time t, is maximum when t -> infinity. For rho > 1/2, this property is still true since the dynamics of the holes (empty sites) is governed by the conjugate of rule 184 (i.e., rule 226),6 which describes exactly the same dynamics as rule 184 but for holes moving to the left. Therefore, for all values of the particle density, the average velocity takes its maximum value in the steady state.
Recently, many papers on the application of statistical physics to vehicular traffic have been published. For a review see, for example, Chowdhury, Santen, and Schadschneider [87]. Example 39. Pedestrian traffic. While considerable attention has been paid to the study of car traffic since the early 1990s, comparatively very few cellular automaton models of pedestrian dynamics have been published. In contrast with cars, pedestrians can accelerate and brake in a very short time, and their average velocity is sharply peaked. Pedestrian traffic models can provide valuable tools to plan and design a variety of pedestrian areas such as shopping malls, railway stations, or airport terminals. One of the first pedestrian traffic models is due to Fukui and Ishibashi [135, 136], who studied pedestrians in a passageway moving in both directions. The passageway is represented as a square lattice of length L and width W < L with periodic boundary conditions in which each cell is either occupied by one pedestrian or empty. Pedestrians are divided in two groups walking in opposite directions. Eastbound pedestrians may move only at odd time steps, while westbound pedestrians may move only at even time steps. At each odd time step, an eastbound pedestrian moves to his right-nearest cell when it is empty. If this cell is occupied by another eastbound pedestrian, he does not move, but if it is occupied by a westbound pedestrian, he tries to change lanes. At each even time step, a westbound pedestrian moves to his left-nearest cell when it is empty. If this cell is occupied by another westbound pedestrian, he does not move, but if it is occupied by an eastbound pedestrian, he tries to change lanes. When a pedestrian cannot move forward, he tries to change lanes according to the following rules: 1. In the first model, called the sidestepping model, (i) if both adjacent cells are empty, he moves to one of them chosen at random giving precedence, however, to pedestrians moving in the same direction without changing lanes, (ii) if only one cell is empty, he moves to this cell, here again giving 6
If f is an n-input two-state deterministic cellular automaton rule, its conjugate, denoted Cf , is defined by Cf (x1 , x2 , · · · , xn ) = 1 − f (1 − x1 , 1 − x2 , · · · , 1 − xn ).
precedence to a pedestrian moving without changing lanes, and (iii) if none is empty, he does not move. 2. In a second model, called the diagonal-stepping model, instead of trying to move to an adjacent cell, the pedestrian tries to move to one of the cells in front of the adjacent cells—in the direction of motion—applying the same rule as in the first model. In their first paper, Fukui and Ishibashi study the problem of one westbound pedestrian walking across the passageway among a fixed density ρ of eastbound pedestrians. In the case of the sidestepping model, they find that, for ρ ≤ ρc = (W − 1)/2W , the average flow of eastbound pedestrians grows linearly with ρ, for ρc ≤ ρ ≤ 1 − ρc , the flow remains constant, and, for 1 − ρc < ρ ≤ 1, the flow decreases linearly with ρ and becomes equal to zero for ρ = 1. In the limit W → ∞, as it should be, ρc = 12 .7 Phase transitions at ρ = ρc and ρ = 1 − ρc are second order. In the case of the diagonal-stepping model, for ρ ≤ 1 − ρc , the average flow of eastbound pedestrians grows linearly with ρ and then decreases linearly to reach zero for ρ > 1 − ρc . At ρ = 1 − ρc the phase transition is second order. In their second paper, Fukui and Ishibashi consider an equal number of westbound and eastbound pedestrians walking along a passageway who avoid collisions trying first to move to one of the cells in front of the adjacent cells, or, if these cells are occupied, trying to move to one of the adjacent cells. Numerical simulations seem to indicate that, as a function of pedestrian density, the system exhibits a first-order phase transition. Adopting a slightly different point of view, it is possible to build up a much simpler purely deterministic bidirectional pedestrian traffic model. Consider a square lattice of length L and width W < L with periodic boundary conditions in which Nw and Ne cells are, respectively, occupied by westbound and eastbound pedestrians, with Nw + Ne < LW . As in the Fukui–Ishibashi models, a pedestrian moves forward to the cell in front of him if it is empty. If this cell is occupied by another pedestrian moving in the same direction, the pedestrian does not move, but, if it is occupied by a pedestrian moving in the opposite direction, the pedestrian moves to the cell in front of his right adjacent cell (with respect to the moving direction), and if this cell is also occupied, he moves to his right adjacent cell. If both cells are occupied, the pedestrian does not move. In all cases, pedestrians who can move forward have the right of way. Eastbound (resp. westbound) pedestrians move at odd (resp. even) time steps. As a result of the lane-changing rule, to avoid collisions, the local walking rule, which is of the form ⎛ ⎞ s(i + 1, j − 1, t) s(i + 1, j, t) s(i + 1, j + 1, t) s(i, j, t) s(i, j + 1, t) ⎠ , s(i, j, t + 1) = f ⎝ s(i, j − 1, t) s(i − 1, j − 1, t) s(i − 1, j, t) s(i − 1, j + 1, t) 7
Refer to car traffic rule 184 and see Figure 6.1.
where i ∈ {1, L} and j ∈ {1, W }, depends on a lesser number of variables and is not probabilistic. Figure 6.5 shows an example of spontaneous lane formation of pedestrians moving in the same direction.
Fig. 6.5. Multilane bidirectional pedestrian traffic. Lattice length: 100; lattice width: 10; number of pedestrians: 150 moving to the left (light grey cells) and 150 to the right (black cells); number of iterations: 200. Empty cells are grey.
A few other lattice models of pedestrian traffic have been studied [252, 253, 254]. In the first paper, Muramatsu, Irie, and Nagatani consider a square lattice of length L and width W < L whose sites are occupied by two types of biased random walkers. That is, eastbound walkers cannot move west and westbound walkers cannot move east, but both types may move either north or south. For each walker, the respective probabilities of moving to any of the three authorized neighboring sites depend upon the occupancy of these sites. For instance, if all three sites are empty, a walker will move forward with probability D + (1 − D)/3 and sideways with probability (1 − D)/3; if one lateral site is occupied, then the walker will move forward with probability D + (1 − D)/2 and laterally with probability (1 − D)/2. The probabilistic walking rules depend, therefore, upon a unique parameter D called the drift strength. Lateral walls are reflecting, and the right and left boundaries are open. A constant flow of walkers enters the system at each of these open boundaries. The walking rules are applied sequentially. In the case of a passageway, numerical simulations seem to indicate the existence of a first-order jamming phase transition depending upon D and W . A promising model has been recently proposed by Burstedde, Klauck, Schadschneider, and Zittartz [75]. Pedestrians move on a square lattice with at most one pedestrian per cell. A 3 × 3 matrix M is associated with each pedestrian. The nine elements of this matrix give the respective probabilities for a pedestrian to move to one of the eight neighboring sites. The central element refers to the probability of not moving. In the simple problem of bidirectional traffic in a passageway, we only need two different types of matrices, one for pedestrians moving east and the other for pedestrians moving west. More complex problems could be handled using many more different types of matrices. One of the key features of this model is the existence of a floor field, which is a substitute for the long-range interactions necessary to take into account the geometry of a building, such as the existence of emergency exits. Actually, the authors introduce two floor fields: the static floor field matrix S, which does not depend upon time and the presence of pedestrians, is used to specify regions in space such as doors, in contrast with the dynamic
floor field matrix D, which is modified by the presence of pedestrians and has its own dynamics and that is used to represent, for instance, the trace left by pedestrian footsteps. The expression of the probability for a pedestrian to move from site i to site j is then given by an expression of the form pij = AMij Sij Dij (1 − nij ),
19 where A is a normalization constant to ensure that, for all i, j=1 pij = 1 and nij the occupation number of the neighboring cell j of cell i. The system is updated synchronously. This model has been applied [188] to study a variety of situations such as the evacuation of a large room through one or two doors, the evacuation of a lecture hall, the formation of lanes in a long corridor, etc. Remark 14. Cars and pedestrians are self-propelled particles that, as a result of local interactions, exhibit self-organized collective motion. There exist many other systems that display similar collective behaviors, such as, for instance, flocks of birds, schools of fish, or herds of quadrupeds. Cellular automata have rarely been used to model these systems. Researchers have mostly used so-called off-lattice models. Simple two-dimensional models, closely related to the XY model of ferromagnetism,8 that exhibit flocking behavior have been proposed by different authors [331, 339]. Essentially, these models consist of a system of particles located in a plane, driven with a constant absolute velocity, whose direction is, at each time step, given by the average direction of motion of the particles in their neighborhood with the addition of some random perturbation. Such models undergo a second-order phase transition characterized by an order parameter breaking the symmetry of the two-dimensional rotational invariance of the average velocity of the particles. Extensive numerical simulations have been performed [98] on systems of 104 to 105 particles in a 100×100 plane with periodic boundary conditions.9 Typically, the authors run simulations of 105 time steps. If the noise is uniformly distributed in the interval [− 12 η, 12 η], the average velocity of the particles (the order parameter) behaves as (ηc − η)β with β = 0.42 ± 0.03, where ηc is the critical noise amplitude. For a fixed lattice size, ηc behaves as a function of the particle density ρ as ρκ with κ = 0.45 ± 0.05. When the absolute value of the particle velocity v0 is equal to zero, this model coincides with a diluted XY model and does not exhibit long-range order. For a nonvanishing v0 , the model belongs to a new universality class of dynamic, nonequilibrium models.
6.3 Approximate methods

If we assume that the finite set of states Q is equal to {0, 1}, the simplest statistical quantity characterizing a cellular automaton configuration is the
9
The XY model is a system of two-dimensional spins (or classical vectors) located at the sites of a d-dimensional lattice. Its Hamiltonian is H = −J ij Si · Sj , where the sum runs over all first-neighboring pairs of spins, and J is a positive or negative constant. According to the Mermin-Wagner theorem [240], such a system does not exhibit long-range order for d ≤ 2. Particles are located anywhere in the plane, in particular, a lattice cell may contain more than one particle.
average density of active (nonzero) sites denoted by rho. If the initial configuration is random (i.e., if the states of the different cells are uncorrelated), the iterative application of the cellular automaton rule introduces correlations between these states. Starting from an initial random configuration with a density of active sites equal to rho(0), the simplest approximate method to determine the density as a function of time t is the mean-field approximation. This method neglects the correlations, introduced by the application of the cellular automaton rule, between the states of the cells. This approximation gives a simple expression of the form rho(t + 1) = f_MFA(rho(t)), where f_MFA is a map derived from the look-up table of the local rule f of the cellular automaton assuming that the probability for a cell to be in state 1 at time t is equal to rho(t).

Example 40. Elementary cellular automaton rule 184. Since the preimages of 1 by rule f_184 are 111, 101, 100, 011, the probability rho(t + 1) for a cell to be active at time t + 1 is the sum of the following terms:

rho(t)^3,   2 rho(t)^2 (1 - rho(t)),   rho(t)(1 - rho(t))^2.

The evolution of rho described by the mean-field approximation for rule f_184 is, therefore, determined by the map

rho -> rho^3 + 2 rho^2 (1 - rho) + rho(1 - rho)^2.

Taking into account the identity rho + (1 - rho) = 1, this yields

rho(t + 1) = rho(t),   (6.18)
which is exact since rule 184 is number-conserving. It can be shown that, for all number-conserving cellular automaton rules, the mean-field approximation always gives Relation (6.18). This, obviously, does not mean that, for this set of rules, the mean-field approximation is exact.

Example 41. Elementary cellular automaton rule 18. Since 18_10 = 00010010_2, the look-up table for rule 18 is

neighborhood   111  110  101  100  011  010  001  000
image            0    0    0    1    0    0    1    0

The preimages of 1 by rule f_18 being 100 and 001,
Fig. 6.6. Limit set of the elementary cellular automaton rule 18. One hundred successive iterates of an initial random configuration have been discarded. The following 50 iterations are represented. Time is oriented downwards. Number of lattice sites: 50.
the mean-field map is rho -> 2 rho (1 - rho)^2. Within the framework of the approximation, in the steady state, the density of active sites rho_inf is thus given by the nontrivial solution of the equation rho_inf = 2 rho_inf (1 - rho_inf)^2; that is, rho_inf = 1 - 1/sqrt(2) = 0.292893.... This result is not exact. It can be shown [47] that the configurations in the limit set consist of sequences of 0s of odd lengths separated by isolated 1s (see Figure 6.6). More precisely, the structure of the configurations belonging to the limit set results from random concatenations of the 2-blocks 00 and 01 with equal probability. The exact density of active sites of such configurations is thus equal to 1/4.
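The mean-field numbers quoted above are easy to reproduce; the following Python sketch (ours) builds the mean-field map of any elementary rule directly from its look-up table and iterates it.

def mean_field_map(rule_code):
    """Return the map rho -> sum over preimages of 1 of rho^k (1-rho)^(3-k)."""
    def step(rho):
        total = 0.0
        for x1 in (0, 1):
            for x2 in (0, 1):
                for x3 in (0, 1):
                    if (rule_code >> (4 * x1 + 2 * x2 + x3)) & 1:
                        k = x1 + x2 + x3
                        total += rho ** k * (1 - rho) ** (3 - k)
        return total
    return step

f18 = mean_field_map(18)
rho = 0.5
for _ in range(1000):
    rho = f18(rho)
print(rho)                       # ~0.2929, i.e. 1 - 1/sqrt(2); the exact density is 1/4
print(mean_field_map(184)(0.3))  # 0.3: for number-conserving rule 184 the map is the identity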
In order to build up better approximate methods than the mean field, we have to take explicitly into account the existence of correlations between the states of the cells. The local structure theory [162] is a systematic generalization of the mean-field approximation. In what follows, as a simplification, the theory is presented assuming that the set of cell states Q is {0, 1}.
An n-block B_n is a sequence s_1 s_2 ... s_n of length n, where, for i = 1, 2, ..., n, s_i is in Q. If B_n denotes the set of all n-blocks, an n-input rule f is a map from B_n into Q. The truncation operators L and R map any n-block B_n to an (n - 1)-block by truncation from the left or right, respectively. That is, if B_n = s_1 s_2 ... s_n, then LB_n = s_2 s_3 ... s_n and RB_n = s_1 s_2 ... s_{n-1}.
L and R commute. Applied to a 1-block, L and R yield the null block. A map P_n from the set of all blocks of length less than or equal to n into [0, 1] is a block probability distribution of order n if it satisfies the following self-consistency conditions:

1. For i = 0, 1, 2, ..., n, P_n(B_i) >= 0.10
2. For i = 0, 1, 2, ..., n,  \sum_{B_i \in \mathcal{B}_i} P_n(B_i) = 1.
3. For all blocks B' of length n - 1,  P_n(B') = \sum_{B \ \mathrm{st}\ LB = B'} P_n(B).
4. For all blocks B' of length n - 1,  P_n(B') = \sum_{B \ \mathrm{st}\ RB = B'} P_n(B).
In B st LB = B and B st RB = B , “st” stands for “such that.” Satisfaction of these last two conditions together implies that the probability of any block in the domain of Pn is the sum of the probabilities of blocks that contain the given block at a particular position. For example, considering all 3-blocks B3 = {000, 001, 010, 011, 100, 101, 110, 111}, the probability distribution P3 must satisfy the conditions P3 (00) = P3 (000) + P3 (001) = P3 (000) + P3 (100), P3 (01) = P3 (010) + P3 (011) = P3 (001) + P3 (101), P3 (10) = P3 (100) + P3 (101) = P3 (010) + P3 (110), P3 (11) = P3 (110) + P3 (111) = P3 (011) + P3 (111). These four constraints on P3 imply that only four (22 ) parameters are needed to describe P3 instead of eight (23 ). In general, for Pn , 2n−1 parameters are needed. Note that the conditions P3 (0) = P3 (00) + P3 (01) = P3 (00) + P3 (10), P3 (1) = P3 (10) + P3 (11) = P3 (01) + P3 (11), do not further restrict the number of parameters. The problem now is to construct block probability distributions satisfying the consistency relations. Given a block probability distribution Pn , we want to define a process to generate block probability distributions Pm for m > n. 10
Since there is only one null-block, its probability is equal to 1.
It is reasonable to assume that, although the iterative application of a cellular automaton rule produces configurations in which nearby cells are correlated, these correlations die away with increasing cell separation. That is, for a block B that is long enough, the conditional probability of finding the block B augmented by a cell in state s on the right does not significantly depend upon the state of the left-most cell of B. Then

P(Bs)/P(B) = P(LBs)/P(LB),   or   P(Bs) = P(LBs)P(B)/P(LB).

This last formula should be symmetric between the addition of cells to the right or left of B; changing Bs into B, B into RB, and LB into LRB, it can be written under the symmetric form

P(B) = P(LB)P(RB)/P(LRB).

We shall use this relation to define an operator mapping an n-block probability distribution P_n to an (n + 1)-block probability distribution, which gives the probability of (n + 1)-blocks B by

P_{n+1}(B) = P_n(LB)P_n(RB)/P_n(LRB).   (6.19)
Since (Pn ) = Pn for blocks of length n or less, we have to verify that the consistency relations are satisfied for blocks of length n + 1. Let B and B be blocks of length n and n + 1, respectively; then
(Pn )(B) =
B st LB=B
B st LB=B
Pn (LB)Pn (RB) Pn (LRB)
Pn (B )Pn (RB) Pn (RB ) B st LB=B Pn (B ) = Pn (RB). Pn (RB ) =
B st LB=B
If B = RB, then LB = B implies LRB = RB = LB , and the relation Pn (B ) = Pn (RB ) B st LB =RB
shows that
B st LB=B
Pn (RB) = Pn (RB ),
which yields
B
(Pn )(B) = Pn (B ).
st LB=B
Replacing the operator L by R, we also obtain (Pn )(B) = Pn (B ). B st RB=B
Moreover,
(Pn )(B) =
B ∈Bn B st LB=B
B∈Bn+1
=
Pn (RB)Pn (LB) Pn (LRB)
Pn (B )
B ∈Bn
= 1.

Example 42. Traffic flow models. In Example 38, we studied a cellular automaton model of traffic flow generalizing rule 184 for a speed limit v_max = 2. Here we show how to apply the local structure theory to this model. To construct a local structure approximation, it is more convenient to represent configurations of cars as binary sequences, where 0s represent empty spaces and 1s represent cars. Since for v_max = 2 the speed of a car is determined by the states of, at most, two sites in front of it,11 the minimal block size needed to obtain nontrivial results is 3 (the site occupied by a car plus two sites in front of it). In what follows, we limit our attention to the order-three local structure approximation. Using 3-block probabilities, we can write a set of equations describing the time evolution of these probabilities,

\[
P_{t+1}(\sigma_2\sigma_3\sigma_4) = \sum_{\substack{s_i\in\{0,1\}\\ i=0,1,\ldots,6}} w(\sigma_2\sigma_3\sigma_4\,|\,s_0 s_1 s_2 s_3 s_4 s_5 s_6)\,P_t(s_0 s_1 s_2 s_3 s_4 s_5 s_6), \qquad (6.20)
\]
where Pt (σ2 σ3 σ4 ) is the probability of block σ2 σ3 σ4 at time t, and w(σ2 σ3 σ4 | s0 s1 s2 s3 s4 s5 s6 ) is the conditional probability that the rule maps the sevenblock s0 s1 s2 s3 s4 s5 s6 into the three-block σ2 σ3 σ4 . States of lattice sites at time t are represented by s variables, while σ variables represent states at time t + 1, so that, for example, s3 is the state of site i = 3 at time t, and σ3 is the state of the same site at time t + 1. Conditional probabilities w are easily computed from the definition of the rule. Equation (6.20) is exact. The approximation consists of expressing the seven-block probabilities in terms of three-block probabilities using Relation (6.19). That is, 11
As a cellular automaton rule, it is the 4-input rule 43944. See Footnote 5.
\[
P_t(s_0 s_1 s_2 s_3 s_4 s_5 s_6) = \frac{P_t(s_0 s_1 s_2)\,P_t(s_1 s_2 s_3)\,P_t(s_2 s_3 s_4)\,P_t(s_3 s_4 s_5)\,P_t(s_4 s_5 s_6)}{P_t(s_1 s_2)\,P_t(s_2 s_3)\,P_t(s_3 s_4)\,P_t(s_4 s_5)}, \qquad (6.21)
\]

where P_t(s_i s_{i+1}) = P_t(s_i s_{i+1} 0) + P_t(s_i s_{i+1} 1) for i = 1, ..., 4. Equations (6.20) and (6.21) define a dynamical system whose fixed point approximates the three-block probabilities of the limit set of the cellular automaton rule. Due to the nonlinear nature of these equations, it is not possible to find the fixed point analytically; it can be done numerically. From the knowledge of the three-block probabilities P_t(s_2 s_3 s_4), different quantities can be calculated.12 We have seen that the expression of the average velocity v(rho, infinity) in the steady state as a function of the car density rho can be determined exactly by (6.12). As a test of the local structure theory, we can check that the value of the average velocity as a function of rho, which is given by

v(rho, infinity) = 2 P_inf(100) + P_inf(101),

agrees with the exact relation. The car density is given by

rho = P_t(1) = P_t(100) + P_t(101) + P_t(110) + P_t(111).

More interestingly, we can determine the velocity probability distribution in the steady state; that is, the probabilities P(v = 0), P(v = 1), and P(v = 2) for a car velocity to be equal to 0, 1, or 2 as a function of car density in the limit t -> infinity. These probabilities are given by

P(v = 0) = P_inf(110) + P_inf(111),   P(v = 1) = P_inf(101),   P(v = 2) = P_inf(100).

For 0 < rho <= rho_c = 1/3, P(v = 0) = 0, P(v = 1) = 0, and P(v = 2) = 1. For rho_c <= rho <= 1, the velocity probability distributions obtained from the expressions above are represented in Figure 6.7.
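To make Relation (6.19) concrete, here is a small Python sketch (ours, with arbitrary names) of the extension operator that builds (n + 1)-block probabilities from n-block probabilities; iterating it is how the seven-block probabilities of (6.21) are assembled from three-block probabilities.

from itertools import product

def extend(P):
    """Bayesian extension (6.19): build (n+1)-block probabilities from n-block ones.
    P is a dict keyed by tuples of bits."""
    M = {}
    for block, prob in P.items():
        # marginalize the rightmost cell; for a self-consistent P this equals the left marginal
        M[block[:-1]] = M.get(block[:-1], 0.0) + prob
    out = {}
    for rb, p_rb in P.items():            # rb plays the role of RB (B without its last cell)
        for s in (0, 1):
            B = rb + (s,)
            lb, lrb = B[1:], B[1:-1]      # LB and LRB of the (n+1)-block B
            denom = M.get(lrb, 0.0)
            out[B] = p_rb * P.get(lb, 0.0) / denom if denom > 0 else 0.0
    return out

# Example: uncorrelated 3-block probabilities at density 0.3 extend consistently to 4-blocks.
rho = 0.3
P3 = {b: (rho if b[0] else 1 - rho) * (rho if b[1] else 1 - rho) * (rho if b[2] else 1 - rho)
      for b in product((0, 1), repeat=3)}
P4 = extend(P3)
assert abs(sum(P4.values()) - 1.0) < 1e-12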
6.4 Generalized cellular automata

In this section, we define a new class of evolution operators that contains as particular cases all cellular automaton global evolution rules.13 In the following discussion, we suppose that the finite set of cell states Q coincides with
13
The local structure theory and numerical simulations have been used by Boccara and Fuk´s [60] to determine critical exponents. See Example 38. This result had been derived first for a restricted set of rules by Boccara, Fuk´s, and Geurten [58].
Fig. 6.7. Velocity probability distributions for the deterministic car traffic flow model with vmax = 2 determined using the local structure theory.
{0, 1}. Moreover, we may always assume that the radii r_l and r_r of the cellular automaton local rule f are equal, by formally expressing f as a function of 2r + 1 variables, where r = max{r_l, r_r}. Under that form, f does not depend upon the values of the extra |r_l - r_r| variables.14 Let S_t : i -> s(i, t) be the state of the system at time t, and put

\[
\sigma(i,t) = \sum_{n=-\infty}^{\infty} s(i+n,t)\,p(n),
\]

where p is a given probability measure on the set of all integers Z;15 that is, a nonnegative function such that

\[
\sum_{n=-\infty}^{\infty} p(n) = 1.
\]
Note that the set of all radius r rules contains all n-input rules for n ≤ 2r + 1. For example, a rule f with r = 0 and rr = 1 may be represented as a function of three variables (x1 , x2 , x3 ) → f (x1 , x2 , x3 ) satisfying, for Q = {0, 1}, the condition f (x1 , x2 , x3 ) = f (1 − x1 , x2 , x3 ). The expression “probability measure” does not imply that we are considering stochastic cellular automata. A probability measure is just a finite measure such that the measure of the whole space is equal to 1.
Since s(i, t) ∈ {0, 1}, for all i ∈ Z and t ∈ N, it is clear that σ(i, t) ∈ [0, 1]. The state St+1 of the system at time t + 1 is then determined by the function i → s(i, t + 1) = IA (σ(i, t)), where IA is a given indicator function on [0, 1], that is, a function such that, for all x ∈ [0, 1],
1, if x ∈ A ⊂ [0, 1], IA (x) = 0, otherwise. Since the state St+1 at time t + 1 is entirely determined by the state St at time t, the probability measure p, and the subset A of [0, 1], there exists a unique mapping Fp,A : S → S such that St+1 = Fp,A (St ). Fp,A is an evolution operator defined on the state space S = {0, 1}Z . Theorem 16. Let Ff be a cellular automaton global rule on S = {0, 1}Z induced by a local rule f ; then, there exists an evolution operator Fp,A such that, for any configuration x ∈ S, Fp,A (x) = Ff (x). The set of all configurations with prescribed values at a finite number of sites is called a cylinder set. Cellular automaton rules being translationinvariant, we only need to consider cylinder sets centered at the origin. Since we assumed that both radii of the local rule are equal to r, we are interested in the 22r+1 cylinder sets associated with the different (2r + 1)-blocks (x−r , x−r+1 , . . . , xr ). Each (2r + 1)-block may be characterized by the binary number νr = x0 x−1 x1 x−2 x2 · · · x−r xr ; that is, νr =
r n=1
x−n 22r−2n+1 +
r
xn 22r−2n .
n=0
The cylinder set corresponding to a specific (2r + 1)-block is denoted C(r, nu_r). For any configuration x belonging to the cylinder set C(r, nu_r), the set of all numbers

\[
\xi\bigl(C(r,\nu_r)\bigr) = \sum_{n=-\infty}^{\infty} x_n\,p(n)
\]

belongs to the subinterval (called a C-interval in what follows)

\[
\bigl[\xi_{\min}\bigl(C(r,\nu_r)\bigr),\; \xi_{\max}\bigl(C(r,\nu_r)\bigr)\bigr] \subset [0,1],
\]

where

\[
\xi_{\min}\bigl(C(r,\nu_r)\bigr) = \sum_{n=-r}^{r} x_n\,p(n)
\]
and
\[
\xi_{\max}\bigl(C(r,\nu_r)\bigr) = \xi_{\min}\bigl(C(r,\nu_r)\bigr) + \sum_{|n|=r+1}^{\infty} p(n).
\]
There are 22r+1 different ξmin C(r, νr ) and as many different C-intervals. 2r+1 If we want to define 22 different Fp,A evolution operators, representing 2r+1 2 the 2 different cellular automaton local evolution rules f , we first have to find a probability measure p such that the 22r+1 C-intervals are pairwise disjoint. Then, according to the local rule f to be represented, we choose a subset A of [0, 1] such that some of these C-intervals are strictly included in A, whereas the others have an empty intersection with A. The only problem is therefore to find the conditions to be satisfied by p. Let ⎧ β−1 ⎪ ⎨ 2n , if n < 0, p(n) = ββ− 1 (6.22) ⎪ ⎩ , if n ≥ 0. β 2n+1 We verify that, for β > 1, ∞ p(n) = 1. n=−∞
In the trivial case of cellular automaton rules with r = 0, we have the following two C-intervals: ξmin C(0, 0) = 0,
1 ξmax C(0, 0) = , β
and
β−1 ξmin C(0, 1) = , ξmax C(0, 1) = 1. β These C-intervals are disjoint if ξmin C(0, 1) > ξmax C(0, 0) or
(6.23)
β−1 1 > ; β β
that is, for β > 2. Increasing the radius by one unit, multiply the number of C-intervals by 4. That is, from each closed C-interval of the form ⎡ ⎤ ⎣ an p(n), an p(n) + p(n)⎦ , |n|≤r
|n|≤r
|n|>r
where (a−r , a−r+1 , · · · , ar−1 , ar ) is a specific (2r + 1)-block, we generate four closed C-intervals, each one being associated with one of the following (2r+3)blocks:
(0, a−r , a−r+1 , · · · , ar−1 , ar , 0), (1, a−r , a−r+1 , · · · , ar−1 , ar , 0),
(0, a−r , a−r+1 , · · · , ar−1 , ar , 1), (1, a−r , a−r+1 , · · · , ar−1 , ar , 1).
These new C-intervals are disjoint if, and only if, the following three conditions are satisfied: ∞
p(r + 1) >
p(n),
(6.24)
|n|=r+2 ∞
p(−r − 1) > p(r + 1) +
p(n),
(6.25)
|n|=r+2
p(−r − 1) + p(r + 1) > p(−r − 1) +
∞
p(n).
(6.26)
|n|=r+2
Conditions (6.24) and (6.26) are identical. Therefore, replacing p(n) by its expression, and noting that ∞
p(n) =
|n|=r+2
1 β 2r+3
,
we have finally to satisfy the conditions β−1 1 > 2r+3 2r+3 β β
and
β−1 β−1 1 1 > 2r+3 + 2r+3 = 2r+2 . 2r+2 β β β β
That is, beta has to be greater than 2. Therefore, the global evolution rule induced by any radius r cellular automaton local rule f can be represented as an evolution operator F_{p,A}, where the probability measure p is given by (6.22) for beta > 2 and A is a subset of [0, 1] that, according to f, includes some C-intervals, whereas the others have an empty intersection with it. A is not unique.

Example 43. Radius 1 rules. The table below lists the eight different C-intervals for beta = 2.5.

triplet   nu_r   C-interval          triplet   nu_r   C-interval
(0,0,0)     0    [0, 0.064]          (0,1,0)     4    [0.6, 0.664]
(0,0,1)     1    [0.096, 0.16]       (0,1,1)     5    [0.696, 0.76]
(1,0,0)     2    [0.24, 0.304]       (1,1,0)     6    [0.84, 0.904]
(1,0,1)     3    [0.336, 0.4]        (1,1,1)     7    [0.936, 1]
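The entries of this table are easy to regenerate. The Python sketch below (ours) reads the two branches of (6.22) as p(n) = (beta - 1)/beta^(2n+1) for n >= 0 and p(n) = (beta - 1)/beta^(2|n|) for n < 0 (the reading that reproduces the tabulated values), and uses the fact that the tail sum over |n| > r equals 1/beta^(2r+1).

from itertools import product

def p(n, beta):
    # probability measure (6.22), with the reading stated in the text above
    return (beta - 1) / beta ** (2 * n + 1) if n >= 0 else (beta - 1) / beta ** (-2 * n)

beta = 2.5
tail = 1 / beta ** 3                       # sum of p(n) over |n| > 1 (radius r = 1)
for triplet in product((0, 1), repeat=3):  # triplet = (x_{-1}, x_0, x_1)
    xm1, x0, x1 = triplet
    xi_min = xm1 * p(-1, beta) + x0 * p(0, beta) + x1 * p(1, beta)
    print(triplet, round(xi_min, 3), round(xi_min + tail, 3))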
Using this table, we can represent any elementary cellular automaton global evolution rule. For example, the global rule induced by local rule 18 (see Example 41), defined by
\[
f_{18}(x_1,x_2,x_3) = \begin{cases} 1, & \text{if } (x_1,x_2,x_3) = (0,0,1) \text{ or } (1,0,0),\\ 0, & \text{otherwise,}\end{cases} \qquad (6.27)
\]
may be represented by an evolution operator Fp,A , where p is given by (6.22) with β = 2.5, and A is any subset of [0, 1] containing the union [0.096, 0.16] ∪ [0.24, 0.304] but whose intersection with all the other six C-intervals is empty. In particular, we could choose A = [0.09, 0.31], illustrating the fact that A is not unique. Remark 15. In the process described above, four new closed intervals, necessary to represent radius (r + 1) rules, are obtained by removing three open intervals from each closed C-interval used to represent radius r rules. This process is reminiscent of the recursive construction of the triadic Cantor set (see Example 34). In the limit r → ∞ (i.e., when all open intervals have been removed), we obtain a Cantor-like set Σ; that is, a closed set with no interior points and such that all points are limit points (see page 166, Footnote 16).16 This set consists of all the numbers represented by expansion of the form ∞ σ(x) = xn p(n), (6.28) n=−∞
where p is given by (6.22) for β > 2 and, for all n ∈ Z, xn ∈ {0, 1}. The right-hand side of (6.28) is the expansion of σ(x) in (nonintegral) base β.17 The Lebesgue measure of Σ is obtained by subtracting from 1 the sum of the lengths of all the removed intervals. That is, 1−
∞ ∞ 2(β − 2) 4(β − 2) β−2 β−2 β−2 β−2 =1− − 4 −2 − 2 − 2r+2 2 − 1) 2r+3 β β β β β − 1 β(β r=0 r=0
=
6 . β(β 2 − 1)
As shown in Figure 6.8, the Lebesgue measure of Σ is equal to 1 for β = 2 (Σ coincides, in this case, with the set of all numbers of the interval [0, 1] in base 2) and goes to zero as β tends to infinity. For a finite value of β, Σ has a nonzero Lebesgue measure (equal, for example, to 0.457143 for β = 2.5).18 Remark 16. An interesting feature of the definition of the space Fp of evolution operators for a given probability measure p is that it is possible to define the distance d(Fp,A1 , Fp,A2 ) between two evolution operators Fp,A1 and Fp,A2 . The basic 16
Let x(1) and x(2) be two elements of S = {0, 1}Z ; the distance between the two corresponding points σ(x(1) ) and σ(x(2) ) in Σ is defined by ∞ (2) |x(1) d σ(x(1) ), σ(x(2) ) = n − xn |p(n), n=−∞
17
18
where p(n) is given by (6.22). In the mathematical literature, these representations of a real number of the interval [0, 1] are called β-expansions. For a review of their properties, see Blanchard [43]. On Cantor-like sets, see Boccara [57].
Fig. 6.8. Lebesgue measure of the Cantor set Sigma as a function of beta.

property of d, defined on F_p, is that the distance between two different evolution operators (that is, two operators associated with two different subsets A_1 and A_2 of [0, 1]) should be zero if the evolution operators F_{p,A_1} and F_{p,A_2} represent the same evolution rule (remember, the subset A defining F_{p,A} is not unique). The distance d : F_p x F_p -> [0, 1] is defined by

\[
d(F_{p,A_1},F_{p,A_2}) = \int_0^1 \bigl|I_{A_1}(\sigma)-I_{A_2}(\sigma)\bigr|\,dF(\sigma), \qquad (6.29)
\]
where dF (σ) is a measure we have to define. The best way to understand the meaning of this measure is to use the language of probability theory. Let (Xn )n∈Z be a doubly infinite sequence of identically distributed Bernoulli random variables such that, for all n ∈ Z, P (Xn = 0) = P (Xn = 1) = 12 , and consider the cumulative distribution function F of the random variable ∞
Xn p(n),
n=−∞
that is, the function F : [0, 1] → [0, 1] such that19 ∞ F (σ) = P Xn p(n) ≤ σ . n=−∞
In other words, F (σ) is the probability that the random variable ∞ n=−∞ Xn p(n) does not exceed σ. F is a singular measure such that F (0) = 0 and F (1) = 1. It is represented in Figure 6.9. The meaning of the distance d(Fp,A1 , Fp,A2 ) defined by (6.29) becomes clear. It is the measure of the symmetrical difference between the sets A1 and A2 , that is, the set of all numbers σ(x) ∈ [0, 1] that belong to A1 XOR A2 . Hence 19
On probability theory, see, for example, Grimmett and Stirzaker [158].
Fig. 6.9. Cumulative distribution function F : [0, 1] -> [0, 1] of the random variable Sum_{n=-inf}^{inf} X_n p(n) for beta = 2.5.

\[
d(F_{p,A_1},F_{p,A_2}) = \int_{A_1\,\triangle\,A_2} dF(\sigma),
\]

where A_1 triangle A_2 is the symmetrical difference (A_1 U A_2) \ (A_1 n A_2). For example, the distance between rule 18 defined by (6.27) and rule 22 defined by

\[
f_{22}(x_1,x_2,x_3) = \begin{cases} 1, & \text{if } (x_1,x_2,x_3) = (0,0,1) \text{ or } (1,0,0) \text{ or } (0,1,0),\\ 0, & \text{otherwise,}\end{cases} \qquad (6.30)
\]

is, according to the table of C-intervals, the measure of any interval including the C-interval [0.6, 0.664] but having an empty intersection with all the other seven C-intervals. By construction of the distribution function F, all C-intervals for a given radius are equally probable (have the same measure). Thus, the distance between rule 18 and rule 22 is equal to 1/8. It does not depend upon the value of beta for beta > 2.

Remark 17. For a given p, the space F of all generalized cellular automaton evolution operators {F_{p,A} | A subset of [0, 1]} defined on Z is a metric space for the distance given by (6.29). It is easy to verify that the set of all cellular automaton global rules is countable and dense in F, exactly as the set of positive rational numbers less than 1 is dense in the closed interval [0, 1]. Thus, just as any irrational number in the interval [0, 1] can be approximated by a rational number in the same interval as close as we want, any generalized cellular automaton evolution operator F_{p,A} can be approximated by a cellular automaton global rule (i.e., with a finite radius) as close as we want. In other words, given F_{p,A}, we can always find a sequence of cellular automaton global rules (F_{f_n}) induced by the local rules f_n (n in N) that converges to F_{p,A} for the metric d defined by (6.29).
6.5 Kinetic growth phenomena Kinetic growth phenomena are the result of stochastic growth processes taking place in time. As a result, the systems we shall consider evolve according to a probabilistic growth rule. Strictly speaking, the percolation phenomenon described below is not a kinetic growth process, but it often plays an important role in many kinetic growth processes.20 This is the reason why it is briefly described in this section. Example 44. Percolation models. Percolation processes were defined in the late 1950s by Broadbent and Hammersley [72]. Discussing the spread of a fluid through a random medium, these authors distinguish diffusion processes, in which the random mechanism is ascribed to the fluid, from percolation processes, where, in contrast, the random mechanism is ascribed to the medium. Among the examples given to illustrate what is meant by percolation processes, Broadbent and Hammersley consider gas molecules (fluid) adsorbed on the surface of a porous solid (medium). The gas molecules move through all pores large enough to admit them. If p is the probability for a pore to be wide enough to allow for the passage of a gas molecule, they discovered that there exists a critical probability pc below which there is no adsorption in the interior. Similarly, in a large orchard, if the probability p for an infectious disease to spread from one tree to its neighbors is less than a critical probability pc , the disease will not spread through the orchard, only a small number of trees will be infected. There exist two types of standard percolation models: bond and site percolation models (see Figures 6.10 and 6.11, respectively). In what follows, definitions are given for two-dimensional square lattices. Extension to higher dimensionalities and different lattice symmetries is straightforward. On a square lattice, the so-called Manhattan distance is defined by d(x, y) = |x1 − y1 | + |x2 − y2 |, where x = (x1 , x2 ) and y = (y1 , y2 ) are two lattice sites. Viewed as a graph, a square lattice consists of a set of vertices Z2 (called sites) and a set of edges (called bonds) E = {(x, y) | (x, y) ∈ Z2 , |x − y| = 1}. That is, the set of edges is the set of all straight line segments joining nearest-neighboring vertices. In the bond percolation model, we assume that each edge is either open with a probability p or closed with a probability 1 − p, independently of all other edges. In contrast, in the site percolation model, we assume that each vertex is either open or occupied with a probability p or closed or empty with a probability 1 − p, independently of all other vertices. An alternating sequence (x0 , e0 , x1 , e1 , x2 , . . . , xn−1 , en−1 , xn ), of distinct vertices xi and edges ei−1 = (xi−1 , xi ) (i = 0, 1, 2, . . . , n) is called a path of length n. For the bond (resp. 20
20 Many examples showing the unifying character of the percolation concept can be found in de Gennes [141].
For the bond (resp. site) percolation model, a path is said to be open (resp. closed) if all its edges (resp. vertices) are open (resp. closed). Two different sites are said to belong to the same open cluster if they are connected by an open path.
Fig. 6.10. Bond percolation model. In the left figure, the probability for a bond to be open (darker bond) is p = 0.4 and, in the right figure, this probability is p = 0.7.
Fig. 6.11. Site percolation model. In the left figure, the probability for a site to be occupied (darker site) is p = 0.4 and, in the right figure, this probability is p = 0.7.
It can be shown that if the dimension of the lattice is greater than or equal to 2, then the critical probabilities (or percolation thresholds) p_c^bond and p_c^site for the bond and site percolation models, respectively, satisfy the relation21

0 < p_c^bond < p_c^site < 1.    (6.31)

21 See Grimmett [159].
Except for a very few lattice structures, critical probabilities can only be determined using numerical simulations. For example, it can be shown that, for the square lattice, the bond critical probability p_c^bond is exactly equal to 0.5 [159], while the site critical probability p_c^site is only approximately equal to 0.592746 . . . [266]. While critical probabilities, for a given space dimensionality, depend on the symmetry of the lattice and whether bond or site percolation is considered, critical exponents are universal. Here are two exactly known results.22
1. The correlation length ξ diverges as |p − p_c|^(−ν) when p tends to p_c either from below or from above, with ν = 4/3.
2. The percolation probability, which is the probability P_∞(p) that a given vertex belongs to an infinite cluster, is clearly equal to zero for p < p_c but, for p > p_c, it behaves as (p − p_c)^β with β = 5/36.23
The percolation process we have described is not the only one. Another important process in applications is the directed percolation process. Consider the square lattice represented in Figure 6.12 in which open bonds are randomly distributed with a probability p. In contrast with the usual bond percolation problem, here bonds are directed downwards, as indicated by the arrows.
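The numerical values quoted above are easy to probe with a small simulation. The following sketch (not from the book; a NumPy illustration with freely chosen lattice size and sample count) estimates, for site percolation on an L × L square lattice, the probability that an open cluster connects the top row to the bottom row; as L grows, this spanning probability rises sharply near p_c^site ≈ 0.5927.

```python
import numpy as np
from collections import deque

def spans(lattice):
    """True if an open cluster connects the top row to the bottom row (4-neighbor connectivity)."""
    L = lattice.shape[0]
    seen = np.zeros_like(lattice, dtype=bool)
    queue = deque()
    for j in range(L):
        if lattice[0, j]:
            seen[0, j] = True
            queue.append((0, j))
    while queue:
        i, j = queue.popleft()
        if i == L - 1:
            return True
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < L and 0 <= nj < L and lattice[ni, nj] and not seen[ni, nj]:
                seen[ni, nj] = True
                queue.append((ni, nj))
    return False

def spanning_probability(p, L=64, samples=200, seed=0):
    """Fraction of configurations (each site open with probability p) containing a spanning cluster."""
    rng = np.random.default_rng(seed)
    return np.mean([spans(rng.random((L, L)) < p) for _ in range(samples)])

for p in (0.55, 0.59, 0.63):
    print(p, spanning_probability(p))
```

A similar test with open bonds instead of open sites would locate the bond threshold near 0.5.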
Fig. 6.12. A configuration of directed bond percolation on a square lattice.
22 These exact results are obtained from the equivalence of both percolation models with Potts models; see Wu [350].
23 For percolation, the lower and upper critical space dimensions are, respectively, equal to 1 and 6.
If we imagine a fluid flowing downwards from wet sites in the first row, one problem is to find the probability P(p) that, following directed open bonds, the fluid will reach sites on an infinitely distant last row. There clearly exists a threshold value p_c above which P(p) is nonzero. If the downwards direction is considered to be the time direction, the directed bond percolation process may be viewed as a two-input one-dimensional cellular automaton rule f such that

s(i, t + 1) = f(s(i, t), s(i + 1, t))
            = 0, if s(i, t) + s(i + 1, t) = 0,
              1, with probability p, if s(i, t) + s(i + 1, t) = 1,
              1, with probability 1 − (1 − p)², if s(i, t) + s(i + 1, t) = 2.

If ρ is the density of wet (active) sites,24 the mean-field map is

ρ → 2pρ(1 − ρ) + (1 − (1 − p)²)ρ² = 2pρ − p²ρ².

This map shows the existence of a directed bond percolation threshold, or a directed bond percolation probability p_c^DBP, equal to 1/2. That is, within the mean-field approximation, using the image of the flowing fluid, above p_c^DBP the fluid has a nonzero probability of reaching an infinitely distant last row. These values are not exact. Numerical simulations show that p_c^DBP equals 0.6445 ± 0.0005 for the bond model.25 The most general two-input one-dimensional totalistic probabilistic cellular automaton rule,26 called the Domany-Kinzel cellular automaton rule [103], may be written

s(i, t + 1) = f(s(i, t), s(i + 1, t))
            = 0, if s(i, t) + s(i + 1, t) = 0,
              1, with probability p_1, if s(i, t) + s(i + 1, t) = 1,
              1, with probability p_2, if s(i, t) + s(i + 1, t) = 2.

Directed bond percolation corresponds to p_1 = p and p_2 = 2p − p². The case p_1 = p_2 = p is also interesting; it describes the directed site percolation process. In this case, numerical simulations show that the site percolation probability p_c^DSP = 0.7058 ± 0.0005.
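The Domany-Kinzel rule lends itself to direct simulation. The sketch below (an illustration written for this text, not the authors' code; the ring size, number of time steps, and sampled values of p are arbitrary choices) iterates the probabilistic rule with p_1 = p and p_2 = 2p − p² and prints the resulting density of active sites, which vanishes below the directed bond percolation threshold quoted above.

```python
import numpy as np

def domany_kinzel(p1, p2, L=2000, steps=2000, rng=np.random.default_rng(1)):
    """Density of active sites after `steps` updates of the Domany-Kinzel rule
    on a ring of L sites, starting from a fully active configuration."""
    s = np.ones(L, dtype=np.int8)
    for _ in range(steps):
        right = np.roll(s, -1)          # s(i+1, t)
        total = s + right               # 0, 1, or 2 active inputs
        r = rng.random(L)
        s = np.where(total == 1, (r < p1).astype(np.int8),
            np.where(total == 2, (r < p2).astype(np.int8), 0))
    return s.mean()

for p in (0.60, 0.64, 0.68):            # directed bond percolation: p1 = p, p2 = 2p - p**2
    print(p, domany_kinzel(p, 2 * p - p ** 2))
```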
24 The density ρ, which is equal to P(p), is the order parameter of the second-order phase transition.
25 Refer to Kinzel [187] for critical exponents and scaling laws.
26 A cellular automaton n-input rule is said to be totalistic if it only depends upon the sum of the n inputs.
Example 45. General epidemic process. The general epidemic process (see Figure 6.13) describes, according to Grassberger [153, 77], who studied its critical properties, “the essential features of a vast number of population growth phenomena.” In its simplest version the growth process can be described as follows. Initially the cluster consists of the seed site located at the origin. At the next time step, a nearest-neighboring site is randomly chosen. This site is either added to the cluster with a probability p or rejected with a probability 1 − p. At all subsequent time steps, the same process is repeated: a nearest-neighboring site of any site belonging to the cluster is selected at random, and it is either added to the cluster with a probability p or rejected with a probability 1 − p. It is clear that there exists a critical probability p_c such that for p > p_c, the seed site has a nonzero probability of belonging to an infinite cluster.
Fig. 6.13. Grassberger’s general epidemic process. The cluster represents the spread of the epidemic to 4000 sites for p = 0.6.
In order to determine the critical behavior of the general epidemic model, Grassberger performed extensive numerical simulations on a slightly different model that belongs to the same universality class and whose static properties are identical to a bond percolation model. In this model, every lattice site of a two-dimensional square lattice is occupied by only one individual, who cannot move away from it. The individuals are either susceptible, infected, or immune. At each time step, every infected individual infects each nearest-neighboring susceptible site with a probability p and becomes immune with probability 1. For this model, the critical probability p_c is exactly equal to 1/2. If, at time t = 0, all the sites of one edge of the lattice are infected, among other results, Grassberger found that, at p_c, the average number of immune sites per row parallel to the initial infected row increases as a function of time as t^x, where the critical exponent x is equal to 0.807 ± 0.01.27
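The model Grassberger simulated can be sketched in a few lines. The code below (a simplified illustration, not Grassberger's program; the lattice size is an arbitrary choice, and the strip is taken periodic only in the direction parallel to the infected edge) starts from one fully infected row, lets every infected site infect each susceptible nearest neighbor with probability p before becoming immune, and returns the average number of immune sites per row once the epidemic has died out.

```python
import numpy as np

S, I, R = 0, 1, 2    # susceptible, infected, immune

def general_epidemic(p, L=200, rng=np.random.default_rng(2)):
    """Synchronous SIR growth: each infected site infects every susceptible
    nearest neighbor independently with probability p, then becomes immune."""
    state = np.full((L, L), S, dtype=np.int8)
    state[0, :] = I                               # one fully infected edge at t = 0
    while (state == I).any():
        infected = state == I
        attacked = np.zeros((L, L), dtype=bool)
        for shift, axis in ((1, 0), (-1, 0), (1, 1), (-1, 1)):
            attack = np.roll(infected, shift, axis=axis)
            if axis == 0:                         # no wrap across the top/bottom edges
                attack[0 if shift == 1 else -1, :] = False
            attacked |= attack & (rng.random((L, L)) < p)
        state[infected] = R                       # infected sites become immune
        state[attacked & (state == S)] = I
    return (state == R).sum(axis=1).mean()        # average number of immune sites per row

print(general_epidemic(0.5))
```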
Example 46. Forest fires. A fire is characterized by its intensity,28 its direction of travel, and its rate of spread.29 MacKay and Jan [219] proposed a simple bond percolation model to describe the propagation of fire in a densely packed forest from a centrally located burning tree. The fire is supposed to be of low intensity so that its propagation can be viewed as a localized surface phenomenon (i.e., a burning tree is only able to ignite its nearest neighbors).
Fig. 6.14. Burning forest after 70 time steps. Burning, burnt and unburnt trees are, respectively, represented by white, black, and grey squares. The forest is a square lattice of size 101 × 101. In the left figure, the fire propagates isotropically with a probability p = 0.6. In the right figure, the fire propagates northward (resp. southward) with a probability pn = 0.65 (resp. ps = 1 − pn ) and eastward or westward with the same probability pe = pw = 0.6.
More specifically, the authors assume that trees occupy the sites of a two-dimensional (square)30 lattice. At each time step, the fire propagates from burning trees to their nearest neighbors with probability p, and burning trees become burnt. At t = 0, all sites are occupied by unburnt trees except for the
27 For more detailed numerical results concerning this model, refer to Grassberger [153].
28 Fire intensities are measured in kilocalories per second per meter of fire front. Low-intensity fires of about 800–900 kcal s^(−1) m^(−1) are two-dimensional phenomena, while high-intensity fires of about 15,000–25,000 kcal s^(−1) m^(−1) are three-dimensional phenomena.
29 The rate of spread of a fire depends upon many factors: recent rainfall, oxygen supply, wind velocity, age and type of trees, etc.
30 In their original paper, the authors assumed the lattice to be triangular.
central tree, which is burning. This growth model is identical to the model for the spread of blight in a large orchard mentioned by Broadbent and Hammersley (see Example 44).31 It is also similar to the general epidemic process. Precise numerical simulations [269] show that the critical probability p_c = 0.5. For p = p_c, the total number of burnt trees M(t) grows as a function of time t as t^α, where α = 1.59, while the radius of gyration R(t) increases with time as t^β, where β = 0.79. It is possible to slightly modify the model above in order to take into account the influence of wind. We can, for example, suppose that the probability for a tree to be ignited by a neighboring burning tree is equal to p_n (resp. p_s = 1 − p_n) if it is located north (resp. south) of the burning tree and p_e = p_w if it is located either east or west of the burning tree (see Figure 6.14). The critical probability p_{e,c} = p_{w,c} depends upon the value of the probability p_n. Von Niessen and Blumen [269] give the following values:

p_n      0.5     0.6     0.7     0.8     0.9
p_{e,c}  0.5     0.459   0.385   0.287   0.160

Example 47. Diffusion-limited aggregation. Cluster formation from a single particle is of interest in many domains such as, for instance, dendritic growth, flocculation, soot formation, tumor growth, and cloud formation. In 1981, Witten and Sander [347] proposed a model for random aggregates in which an initial seed particle is located at the center of a square lattice. A second particle is added at some random site at a large distance from the origin. This particle walks at random on the lattice until either it reaches a site adjacent to the seed and stops or first reaches the boundary of the lattice and is removed. A third particle is then introduced at a random distant point and walks at random until either it joins the cluster or reaches the boundary and is removed, and so forth. Figure 6.15 represents a (small) aggregate of 250 particles. Numerical simulations showed that these structures have remarkable scaling and universal properties [237, 238]. In particular, these aggregates are fractals with a Hausdorff dimension equal to 5d/6, where d is the Euclidean dimensionality. In 1983, Meakin [239] and Kolb, Botet, and Jullien [191] investigated simultaneously a related model in which, in the initial state, N_0 lattice sites are occupied. Clusters—which may consist of only one particle—are picked at random and moved to a nearest-neighboring site selected at random. If a cluster contacts one or more clusters, these clusters merge to form a larger cluster. Clusters therefore grow larger and larger until one large cluster remains. Different variants of this model have been considered. Clusters may move with a probability independent of their size or with a probability inversely proportional to their size.
See also Frish and Hammersley [131].
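The Witten-Sander growth process is also easy to sketch. The version below (an illustrative toy rather than the original algorithm: walkers are launched on a circle just outside the cluster and discarded when they wander beyond a kill radius instead of at the lattice boundary, and a Moore-neighborhood test is used for sticking; all radii are arbitrary choices) grows a small aggregate around a central seed.

```python
import numpy as np

def dla(n_particles=250, size=201, rng=np.random.default_rng(3)):
    """Grow a diffusion-limited aggregate of n_particles around a central seed."""
    grid = np.zeros((size, size), dtype=bool)
    c = size // 2
    grid[c, c] = True                          # seed particle
    r_max = 1
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    for _ in range(n_particles - 1):
        while True:                            # relaunch until the walker sticks
            theta = rng.uniform(0, 2 * np.pi)  # launch point just outside the cluster
            i = c + int((r_max + 2) * np.cos(theta))
            j = c + int((r_max + 2) * np.sin(theta))
            while True:
                if grid[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2].any():
                    grid[i, j] = True          # adjacent to the cluster: stick here
                    r_max = max(r_max, int(np.hypot(i - c, j - c)) + 1)
                    break
                di, dj = moves[rng.integers(4)]
                i, j = i + di, j + dj
                if (i - c) ** 2 + (j - c) ** 2 > (r_max + 10) ** 2:
                    break                      # wandered too far: discard and relaunch
            if grid[i, j]:
                break
    return grid

print(dla().sum(), "occupied sites")
```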
Fig. 6.15. Diffusion-limited aggregate of 250 particles.
6.6 Site-exchange cellular automata

A site-exchange cellular automaton is an automata network whose evolution rule consists of two subrules:
1. The first subrule is a probabilistic cellular automaton rule, such as, for example, a “diluted” one-dimensional rule f defined by

s(i, t + 1) = X f(s(i − r_ℓ, t), s(i − r_ℓ + 1, t), . . . , s(i + r_r, t)),    (6.32)

where f is a one-dimensional deterministic cellular automaton rule with left and right radii equal, respectively, to r_ℓ and r_r, and X is a Bernoulli random variable such that

P(X = 0) = 1 − p and P(X = 1) = p.

When p = 1, rule (6.32) coincides with the deterministic cellular automaton rule f.
2. The second subrule is a site-exchange rule defined as follows. An occupied site (i.e., a site whose state value is 1) is selected at random and swapped with another site (either empty or occupied) also selected at random. The second site is either a nearest neighbor of the first site (short-range move) or any site of the lattice (long-range move). Between the application of the first subrule at times t and t + 1, this operation is repeated ⌊m × ρ(t) × L⌋ times, where m is a positive real number called the degree of mixing, ρ(t) the density of occupied sites at time t, L the total number of lattice sites, and ⌊x⌋ the largest integer less than or equal to x.
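The two subrules can be written down directly. The sketch below (written for this text; the choice of elementary rule 18 as the deterministic rule f, the lattice size, and the values of p and m are arbitrary, and the density ρ is measured after the diluted subrule) applies the diluted rule synchronously and then performs the random long-range swaps.

```python
import numpy as np

def site_exchange_step(s, rule, p, m, rng):
    """One time step: diluted radius-1 CA subrule followed by long-range swaps."""
    L = s.size
    left, right = np.roll(s, 1), np.roll(s, -1)
    new = rule(left, s, right)                    # deterministic local rule f
    x = rng.random(L) < p                         # Bernoulli variable X of Eq. (6.32)
    s = np.where(x, new, 0)
    # site-exchange subrule: floor(m * rho * L) swaps of a random occupied site
    for _ in range(int(m * s.mean() * L)):
        occupied = np.flatnonzero(s)
        if occupied.size == 0:
            break
        i = rng.choice(occupied)                  # random occupied site
        j = rng.integers(L)                       # any site of the lattice (long-range move)
        s[i], s[j] = s[j], s[i]
    return s

# elementary rule 18: f(x1, x2, x3) = 1 iff exactly one outer neighbor is 1 and x2 = 0
rule18 = lambda l, c, r: ((l + r == 1) & (c == 0)).astype(np.int8)
rng = np.random.default_rng(4)
s = (rng.random(1000) < 0.5).astype(np.int8)
for _ in range(100):
    s = site_exchange_step(s, rule18, p=0.75, m=0.5, rng=rng)
print("density of occupied sites:", s.mean())
```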
Site-exchange cellular automata were introduced by Boccara and Cheong [49, 52] and used in theoretical epidemiology, in which infection by contact and recovery are modeled by two-dimensional probabilistic cellular automaton rules applied synchronously, whereas the motion of the individuals, which is an essential factor, is modeled by a sequential site-exchange rule. While short-range moves are clearly diffusive moves on a finite-dimensional lattice, long-range moves may be viewed as diffusive motion of the individuals on an infinite-dimensional lattice. In order to help clarify the mixing process that results from the motion of the individuals, it might be worthwhile to characterize the mixing process as a function of m and the lattice dimensionality d. Consider a random initial configuration c(0, ρ) of walkers with density ρ on a d-dimensional torus Z^d_L. Select sequentially mρL^d walkers and move them to a neighboring site also selected at random if, and only if, the randomly selected neighboring site is vacant. L^d is the total number of sites of the torus (finite lattice), and m represents the average number of tentative moves per random walker since only a fraction of these moves are effective. The expression at random means that all possible choices are equally probable. Let c(mρL^d, ρ) denote the resulting configuration.32

Theorem 17. The d-dimensional sequential multiple random walkers model is equivalent to a one-random-walker model in which the walker has a probability 1/(2dρL^d) to move to any given one of its 2d nearest-neighboring sites and a probability 1 − 1/(ρL^d) not to move, where ρ is the density of random walkers.

To simplify the proof, consider a one-dimensional torus (d = 1), and let P(mρL, ρ, j) be the probability that the site j is occupied after mρL tentative moves. P(mρL, ρ, j) is therefore the average value ⟨n(mρL, ρ, j)⟩_rw over all the possible random walks starting from the same initial configuration. In order to find the evolution equation of P(mρL, ρ, j), consider the set of configurations in which n(mρL, ρ, j − 1), n(mρL, ρ, j), and n(mρL, ρ, j + 1) have fixed values whereas, for all i ≠ j − 1, j, j + 1, n(mρL, ρ, i) takes any value with the restriction that the total density is equal to ρ. Let p_{x1 x2 x3}(mρL, ρ, j) denote the probability of a configuration such that

n(mρL, ρ, j − 1) = x_1,    n(mρL, ρ, j) = x_2,    n(mρL, ρ, j + 1) = x_3.
The probability P(mρL + 1, ρ, j) for site j to be occupied by a random walker after mρL + 1 tentative moves is the sum of different terms.
1. If site j is empty after mρL tentative moves, then at least one of its nearest-neighboring sites has to be occupied. The probability for a given occupied site to be selected being 1/(ρL), and the probability to move either to the right or to the left being equal to 1/2, the contribution to P(mρL + 1, ρ, j) is in this case
The following exact results were established by Boccara, Cheong, and Oram [54].
(1/(2ρL)) [p_{001}(mρL, ρ, j) + p_{100}(mρL, ρ, j)] + (1/(ρL)) p_{101}(mρL, ρ, j).

2. If site j is occupied by a random walker after mρL tentative moves, the site has to remain occupied. The different contributions to P(mρL + 1, ρ, j) are, in this case, the sum of the following three terms:

(1 − 1/(ρL)) p_{010}(mρL, ρ, j),

since the occupied site j should not be chosen to move,

(1 − 1/(2ρL)) [p_{011}(mρL, ρ, j) + p_{110}(mρL, ρ, j)],

since the occupied site j remains occupied unless it is selected and tries to move to its empty neighboring site, and

p_{111}(mρL, ρ, j).

Hence,

P(mρL + 1, ρ, j) = (1/(2ρL)) [p_{001}(mρL, ρ, j) + p_{100}(mρL, ρ, j)]
                 + (1 − 1/(ρL)) p_{010}(mρL, ρ, j)
                 + (1 − 1/(2ρL)) [p_{011}(mρL, ρ, j) + p_{110}(mρL, ρ, j)]
                 + (1/(ρL)) p_{101}(mρL, ρ, j) + p_{111}(mρL, ρ, j).

Taking into account the relations

P(mρL, ρ, j − 1) = p_{100}(mρL, ρ, j) + p_{101}(mρL, ρ, j) + p_{110}(mρL, ρ, j) + p_{111}(mρL, ρ, j),
P(mρL, ρ, j)     = p_{010}(mρL, ρ, j) + p_{011}(mρL, ρ, j) + p_{110}(mρL, ρ, j) + p_{111}(mρL, ρ, j),
P(mρL, ρ, j + 1) = p_{001}(mρL, ρ, j) + p_{011}(mρL, ρ, j) + p_{101}(mρL, ρ, j) + p_{111}(mρL, ρ, j),

we finally obtain the following discrete diffusion equation:

P(mρL + 1, ρ, j) = P(mρL, ρ, j) + (1/(2ρL)) [P(mρL, ρ, j + 1) + P(mρL, ρ, j − 1) − 2P(mρL, ρ, j)].    (6.33)

For a d-dimensional torus, we would have obtained
P(mρL^d + 1, ρ, j) = P(mρL^d, ρ, j) + (1/(2dρL^d)) [ Σ_{i nn j} P(mρL^d, ρ, i) − 2d P(mρL^d, ρ, j) ],    (6.34)
where the summation runs over the 2d nearest neighbors i of j. This equation may also be written

P(mρL^d + 1, ρ, j) = (1 − 1/(ρL^d)) P(mρL^d, ρ, j) + (1/(2dρL^d)) Σ_{i nn j} P(mρL^d, ρ, i),    (6.35)

showing that the problem of ρL^d random walkers on a d-dimensional torus is equivalent to the problem of one random walker that may move to any single neighboring site with probability 1/(2dρL^d) or not move with probability 1 − 1/(ρL^d).

To characterize the mixing process, consider the Hamming distance

d_H(c(0, ρ), c(mρL^d, ρ)) = (1/L^d) Σ_{j=1}^{L^d} (n(0, ρ, j) − n(mρL^d, ρ, j))²,
where n(0, ρ, j) and n(mρL^d, ρ, j) are the occupation numbers of site j in c(0, ρ) and c(mρL^d, ρ), respectively. If m is very large, c(0, ρ) and c(mρL^d, ρ) are not correlated and the Hamming distance, which is the average over space of

(n(0, ρ, j) − n(mρL^d, ρ, j))² = n(0, ρ, j) + n(mρL^d, ρ, j) − 2 n(0, ρ, j) n(mρL^d, ρ, j),

is equal to 2ρ(1 − ρ).

Theorem 18. In the limit L → ∞, the average of the reduced Hamming distance δ_H, defined by

δ_H(m, d, ρ) = lim_{L→∞} ⟨d_H(c(0, ρ), c(mρL^d, ρ))⟩ / (2ρ(1 − ρ)),

is a function of m and d only, i.e., it does not depend on ρ.

We shall first consider a complete graph (i.e., a graph in which any pair of vertices are nearest neighbors). If the graph has N vertices, we have

P(mρN + 1, ρ, j) = P(mρN, ρ, j) + (1/(ρ(N − 1)N)) Σ_{k=1}^{N} [P(mρN, ρ, k) − P(mρN, ρ, j)].    (6.36)
Taking into account the relation

Σ_{k=1}^{N} P(mρN, ρ, k) = ρN,

Equation (6.36) may also be written

P(mρN + 1, ρ, j) = (1 − 1/(ρ(N − 1))) P(mρN, ρ, j) + 1/(N − 1).    (6.37)
Since this equation involves only one site—a typical result of mean-field equations—its solution is easy to obtain. We find

P(mρN, ρ, j) = (1 − 1/(ρ(N − 1)))^{mρN} (n(0, ρ, j) − ρ) + ρ.

Initial configurations of numerical simulations are random; hence, only averages over all random walks and all initial configurations with a given density ρ of random walkers are meaningful quantities. Averaging over all random walks starting from the same initial configuration c(0, ρ) = {n(0, ρ, j) | j = 1, 2, . . . , N}, the Hamming distance is

d_H(c(0, ρ), c(mρN, ρ)) = (1/N) Σ_{j=1}^{N} [n(0, ρ, j) + P(mρN, ρ, j) − 2 n(0, ρ, j) P(mρN, ρ, j)],

that is,

d_H(c(0, ρ), c(mρN, ρ)) = 2ρ − (2/N) Σ_{j=1}^{N} n(0, ρ, j) P(mρN, ρ, j).

Replacing P(mρN, ρ, j) by its expression yields

n(0, ρ, j) P(mρN, ρ, j) = (1 − 1/(ρ(N − 1)))^{mρN} n(0, ρ, j)(n(0, ρ, j) − ρ) + ρ n(0, ρ, j)
                        = (1 − 1/(ρ(N − 1)))^{mρN} (n(0, ρ, j)² − ρ n(0, ρ, j)) + ρ n(0, ρ, j),

and, taking the average over space, we obtain

⟨n(0, ρ, j) P(mρN, ρ, j)⟩_space = (1 − 1/(ρ(N − 1)))^{mρN} (⟨n(0, ρ, j)²⟩_space − ρ²) + ρ².
Since ⟨n(0, ρ, j)²⟩_space = ⟨n(0, ρ, j)⟩_space = ρ, letting N go to infinity, we finally obtain 2ρ(1 − ρ)(1 − e^{−m}), i.e., δ_H(m, ∞, ρ) = 1 − e^{−m}, where the symbol ∞ refers to the fact that, in the limit N → ∞, a complete graph is equivalent to an infinite-dimensional lattice.

Consider now Equation (6.33), and replace P(mρL, ρ, j) by its expression in terms of its Fourier transform P̂(mρL, ρ, k) defined by

P(mρL, ρ, j) = (1/√L) Σ_{k=1}^{L} P̂(mρL, ρ, k) e^{2iπkj/L}.
This yields

P̂(mρL + 1, ρ, k) = (1 − (2/(ρL)) sin²(πk/L)) P̂(mρL, ρ, k)

and, therefore,

P̂(mρL, ρ, k) = (1 − (2/(ρL)) sin²(πk/L))^{mρL} P̂(0, ρ, k).
The average Hamming distance over all initial configurations is

(1/L) Σ_{j=1}^{L} ⟨P(0, ρ, j) + P(mρL, ρ, j) − 2 P(0, ρ, j) P(mρL, ρ, j)⟩_ic
    = 2ρ − (2/L) Σ_{k=1}^{L} ⟨P̂(0, ρ, k) P̂(0, ρ, −k)⟩_ic (1 − (2/(ρL)) sin²(πk/L))^{mρL},

where ⟨f(j)⟩_ic denotes the average over all initial configurations of f(j). Since

⟨P̂(0, ρ, k) P̂(0, ρ, −k)⟩_ic = (1/L) Σ_{j1=1}^{L} Σ_{j2=1}^{L} ⟨P(0, ρ, j_1) P(0, ρ, j_2)⟩_ic e^{2iπk(j_1 − j_2)/L},

we find

(2/L) Σ_{k=1}^{L} ⟨P̂(0, ρ, k) P̂(0, ρ, −k)⟩_ic (1 − (2/(ρL)) sin²(πk/L))^{mρL}
    = (2/L²)(Lρ + L(L − 1)ρ²) + (2/L)(ρ − ρ²) Σ_{k=1}^{L−1} (1 − (2/(ρL)) sin²(πk/L))^{mρL}.
Finally, in the limit L → ∞, the average Hamming distance over all random walks and all initial configurations is equal to

2ρ(1 − ρ) (1 − ∫_0^1 e^{−2m sin²(πq)} dq),

and the reduced Hamming distance is thus given by δ_H(m, 1, ρ) = 1 − e^{−m} I_0(m), where the function I_0, defined by

I_0(m) = (1/π) ∫_0^π e^{m cos u} du,

is the modified Bessel function of the first kind of order zero. δ_H(m, 1, ρ) does not, therefore, depend upon ρ. For large values of m, I_0(m) behaves as e^m/√(2πm); hence, δ_H(m, 1, ρ) approaches unity as m tends to infinity as 1/√m. Equation (6.34) could be solved in a similar way, and, in particular, we would find that δ_H(m, d, ρ) does not depend upon ρ and approaches unity as m tends to infinity as 1/√(m^d). For small m, it is easy to verify that, for all d, the reduced Hamming distance behaves as m.
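This prediction can be checked with a short simulation. The sketch below (an independent numerical check written for this text; lattice size, density, and the sampled values of m are arbitrary) performs the sequential short-range moves on a ring, computes the reduced Hamming distance, and compares it with 1 − e^{−m} I_0(m), here evaluated with numpy.i0; agreement is up to finite-size fluctuations.

```python
import numpy as np

def mix(config, m, rng):
    """Perform floor(m * rho * L) sequential tentative nearest-neighbor moves."""
    c = config.copy()
    L = c.size
    for _ in range(int(m * c.mean() * L)):
        walkers = np.flatnonzero(c)
        i = rng.choice(walkers)                   # random walker
        j = (i + rng.choice((-1, 1))) % L         # random nearest neighbor on the ring
        if c[j] == 0:                             # move only if the target site is vacant
            c[i], c[j] = 0, 1
    return c

def reduced_hamming(m, L=10000, rho=0.3, rng=np.random.default_rng(5)):
    c0 = (rng.random(L) < rho).astype(np.int8)
    cm = mix(c0, m, rng)
    return np.mean((c0 - cm) ** 2) / (2 * rho * (1 - rho))

for m in (0.5, 1.0, 2.0):
    print(m, round(reduced_hamming(m), 3), round(1 - np.exp(-m) * np.i0(m), 3))
```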
Example 48. Period-doubling route to chaos for a global variable of a one-dimensional totalistic cellular automaton. As a first example of a site-exchange cellular automaton, we shall study the behavior, as a function of the degree of mixing m in the case of long-range moves, of the infinite-time limit density of active sites ρ(∞, m) of the two-state one-dimensional radius r totalistic cellular automaton rule33 defined by

s(i, t + 1) = X f( Σ_{j=−r}^{r} s(i + j, t) ),
where

f(x) = 1, if S_min ≤ x ≤ S_max,
       0, otherwise.    (6.38)
The spatiotemporal pattern of the cellular automaton evolving according to the radius 3 totalistic rule with S_min = 1 and S_max = 6 is represented in Figure 6.16. It can be shown [50] that the limit set of this cellular automaton consists of sequences of 0s and 1s whose lengths are multiples of 6. The distributions of 0s and 1s are identical, which implies that the asymptotic density
Totalistic cellular automaton rules of this type were studied in detail by Bidaux, Boccara, and Chaté [42].
Fig. 6.16. First 100 iterations of the radius 3 totalistic cellular automaton rule (6.38) (with Smin = 1 and Smax = 6) starting from a random initial configuration with 25 nonzero sites. 0s and 1s are, respectively, represented by white and black squares. Total number of lattice sites: 96.
of nonzero sites is 1/2. The average number of sequences of length 6n per site is 1/(3 × 2^{n+3}). The prediction of the mean-field approximation is very bad. The mean-field map, given by

ρ → 1 − ρ^7 − (1 − ρ)^7 = 7ρ(1 − ρ)(1 − 2ρ + 3ρ² − 2ρ³ + ρ⁴),

is chaotic. Since the mean-field result should be correct when the number of tentative moves per active site m tends to infinity, the asymptotic density of active sites ρ(∞, m) should evolve chaotically when m is large enough. Interestingly, increasing m from 0, Boccara and Roger [50] observed for ρ(∞, m) a complete sequence of period-doubling bifurcations, as illustrated in Figure 6.17. In order to estimate the Feigenbaum number δ, some numerical simulations were done on lattices having L = 1.6 × 10^7 sites to be able to locate the first four bifurcations with a reasonable accuracy. The parameter values of these bifurcation points are m_1 = 0.340 ± 0.005, m_2 = 0.655 ± 0.005, m_3 = 0.780 ± 0.001, and m_4 = 0.813 ± 0.001, yielding δ_1 = 2.5 and δ_2 = 3.8, where δ_n = (m_{n+1} − m_n)/(m_{n+2} − m_{n+1}). Taking into account the very small number of bifurcations, these results suggest that the period-doubling sequence has the universal behavior found for maps with a quadratic maximum. Plotting ρ(t + 1, m) as a function of ρ(t, m) for different values of m allowed verification of this hypothesis. Note that ∂ρ(∞, m)/∂m is infinite at m = 0.
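The chaotic character of the mean-field map can be seen by simply iterating it; a minimal sketch (illustration only; the initial density and the number of iterations are arbitrary):

```python
# Mean-field map of the radius 3 totalistic rule with S_min = 1, S_max = 6:
# rho -> 1 - rho**7 - (1 - rho)**7, a unimodal map of [0, 1] into itself.
def mean_field(rho):
    return 1 - rho ** 7 - (1 - rho) ** 7

rho = 0.05
orbit = []
for _ in range(1000):
    rho = mean_field(rho)
    orbit.append(rho)
# the tail of the orbit wanders irregularly instead of settling on a short cycle
print(orbit[-10:])
```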
Fig. 6.17. Bifurcation diagram for the asymptotic density of active sites ρ(∞, m) of the radius 3 totalistic cellular automaton rule (6.38) (with S_min = 1 and S_max = 6). The parameter m, which is the number of tentative moves per active site, is plotted on the horizontal axis and varies from 0 to 1.7.
Example 49. SIR epidemic model in a population of moving individuals. In an SIR epidemic model, individuals are divided into three disjoint groups:
1. susceptible individuals capable of contracting the disease and becoming infective,
2. infective individuals capable of transmitting the disease, and
3. removed individuals who have had the disease and are dead, have recovered and are permanently immune, or are isolated until recovery and permanent immunity occurs.
The possible evolution of an individual may, therefore, be represented by the transfer diagram

S --p_i--> I --p_r--> R,

where p_i denotes the probability for a susceptible to be infected and p_r the probability for an infective to be removed.
In a two-dimensional cellular automaton model, each site of a finite square lattice is either empty or occupied by a susceptible or an infective. The spread of the disease is governed by the following rules:
1. Susceptibles become infective by contact (i.e., a susceptible may become infective with a probability p_i if, and only if, it is in the neighborhood of an infective). This hypothesis neglects latent periods (i.e., an infected susceptible becomes immediately infective).
2. Infectives are removed (or become permanently immune) with a probability p_r. This assumption states that removal is equally likely among infectives. In particular, it does not take into account the length of time the individual has been infective.
3. The time unit is the time step. During one time step, the two preceding rules are applied after the individuals have moved on the lattice according to a specific rule.
4. An individual selected at random may move to a site also chosen at random. If the chosen site is empty, the individual will move; otherwise, the individual will not move. The set in which the site is randomly chosen depends on the range of the move. To illustrate the importance of this range, two extreme cases may be considered. The chosen site may either be one of the four nearest neighbors (short-range move) or any site of the graph (long-range move). If N is the total number of individuals on Z²_L, mN individuals, where the positive real number m is the degree of mixing, are sequentially selected at random to perform a move. This sequential process allows some individuals to move more than others.
This model assumes that the population is closed. It ignores births, deaths by other causes, immigrations, and emigrations.

In order to have an idea of the qualitative behavior of the model, let us first study its mean-field approximation. Denoting by S_t, I_t, and R_t the respective densities of susceptible, infective, and removed individuals at time t, we have

S_{t+1} = ρ − I_{t+1} − R_{t+1},
I_{t+1} = I_t + S_t (1 − (1 − p_i I_t)^4) − p_r I_t,
R_{t+1} = R_t + p_r I_t,

where ρ is a constant representing the total density of individuals. From the equations above, it follows that, as functions of t, S_t is positive nonincreasing, whereas R_t is positive nondecreasing. Therefore, the infinite-time limits S_∞ and R_∞ exist. Since I_t = ρ − S_t − R_t, it follows also that I_∞ exists and satisfies the relation R_∞ = R_∞ + p_r I_∞, which shows that I_∞ = 0. In such a model, there is no endemic steady state.
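The mean-field recurrences can be iterated directly; the short sketch below (an illustration written for this text, using the parameter values of Figure 6.18) reproduces the threshold behavior: an epidemic wave for p_i = 0.3 and a monotonic decay for p_i = 0.2.

```python
def sir_mean_field(p_i, p_r, s0=0.69, i0=0.01, steps=50):
    """Iterate the SIR mean-field recurrences; returns the list of infective densities."""
    rho = s0 + i0                   # total (conserved) density of individuals, R_0 = 0
    s, i, r = s0, i0, 0.0
    history = [i]
    for _ in range(steps):
        new_i = i + s * (1 - (1 - p_i * i) ** 4) - p_r * i
        new_r = r + p_r * i
        s, i, r = rho - new_i - new_r, new_i, new_r
        history.append(i)
    return history

print(max(sir_mean_field(0.3, 0.6)))   # epidemic: the infective density rises before dying out
print(max(sir_mean_field(0.2, 0.6)))   # no epidemic: the infective density decays monotonically
```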
Fig. 6.18. Time evolution of the density of infective individuals for the onepopulation SIR epidemic model using the mean-field approximation. The initial densities of susceptible and infective individuals are, respectively, equal to 0.69 and 0.01. For pr = 0.6 and pi = 0.3, an epidemic occurs, whereas there is no epidemic if, for the same value of pr , pi = 0.2.
If the initial conditions are R_0 = 0 and I_0 ≪ S_0, I_1 is small, and we have

I_1 − I_0 = (4 p_i S_0 − p_r) I_0 + O(I_0²).

Hence, according to the initial value of the density of susceptible individuals, we may distinguish two cases:
1. If S_0 < p_r/(4p_i), then I_1 < I_0, and, since S_t is a nonincreasing function of time t, I_t goes monotonically to zero as t tends to ∞. That is, no epidemic occurs.
2. If S_0 > p_r/(4p_i), then I_1 > I_0. The density I_t of infective individuals increases as long as S_t is greater than the threshold p_r/(4p_i) and then tends monotonically to zero. An epidemic occurs.
This shows that the spread of the disease occurs only if the initial density of susceptible individuals is greater than a threshold value. This threshold theorem was established for the first time by Kermack and McKendrick [182] using an epidemic model formulated in terms of a set of three differential equations (see Example 8, page 45). I_t being, in general, very small, the second recurrence equation of the model is well approximated by

I_{t+1} = I_t + 4 p_i S_t I_t − p_r I_t,
which shows that the mean-field approximation is equivalent to a time-discrete formulation of the Kermack-McKendrick model. Figure 6.18 shows two typical time evolutions of the density of infective individuals. In the numerical simulations done by Boccara and Cheong [49], the total density of individuals was equal to 0.6, slightly above the site percolation threshold in two dimensions for the square lattice (approximately equal to 0.592746 . . .), in order to be able to see cooperative phenomena. As the degree of mixing m increases, as expected, the density of infective individuals tends to the mean-field result. Long-range moves are a much more effective mixing process. For instance, in the case of an epidemic with permanent removal, simulation and mean-field results almost coincide for m = 2 for long-range moves compared to m = 250 for short-range moves. In the case of infective individuals recovering and becoming immune, the convergence to mean-field behavior is somewhat slower since the presence of the inert immune population on the lattice interferes with the mixing process. As shown by Kermack and McKendrick [182], the spread of the disease does not stop for lack of susceptible individuals. For any given value of m, the stationary density of susceptible individuals S_∞(m) can be small but is always positive. In the case of short-range moves and permanent removal, S_∞(m) tends to its mean-field value S_∞(∞) as m^{−α} with α = 1.14 ± 0.11. If, instead of short-range moves, we consider long-range moves, the convergence to mean-field behavior is faster: the exponent α is equal to 1.73 ± 0.11. All these results were obtained for p_i = 0.5, p_r = 0.3, and with 100 × 100 lattices. The results above show that the mean-field approximation is valid in the limit m → ∞. A recent analysis of the spatiotemporal behavior of the spread of influenza for the epidemic of winter 1994–1995 [67] reveals a high degree of mixing of the population, making mean-field models appropriate. A possible explanation could be the change in transportation systems, allowing people to move over greater distances more easily and more frequently than in the past.

Example 50. SIS epidemic model in a population of moving individuals. In an SIS epidemic model, individuals are divided into two disjoint groups:
1. susceptible individuals capable of contracting the disease and becoming infective, and
2. infective individuals capable of transmitting the disease to susceptibles.
If p_i denotes the probability for a susceptible to be infected and p_r the probability for an infective to recover and return to the susceptible group, the possible evolution of an individual may be represented by the following transfer diagram:

S --p_i--> I --p_r--> S.

In a two-dimensional cellular automaton model, each site of a finite square lattice is either empty or occupied by a susceptible or an infective.
The spread of the disease is governed by the following rules.
1. Susceptible individuals become infected by contact (i.e., a susceptible may become infective with a probability p_i if, and only if, it is in the neighborhood of an infective). This hypothesis neglects incubation and latent periods: an infected susceptible becomes immediately infective.
2. Infective individuals recover and become susceptible again with a probability p_r. This assumption states that recovery is equally likely among infective individuals but does not take into account the length of time the individual has been infective.
3. The time unit is the time step. During one time step, the two preceding rules are applied synchronously, and the individuals move on the lattice according to a specific rule.
4. As in the model above, an individual selected at random may perform either short-range or long-range moves.
Here again, this model assumes that the population is closed. It ignores births, deaths by other causes, immigrations, or emigrations.

Let us first study the qualitative behavior of the model using the mean-field approximation. Denoting by S_t and I_t the respective densities of susceptible and infective individuals at time t, we have

S_{t+1} = ρ − I_{t+1},
I_{t+1} = I_t + S_t (1 − (1 − p_i I_t)^4) − p_r I_t,

where ρ is a constant representing the total density of individuals. Eliminating S_t in the second equation yields

I_{t+1} = (1 − p_r) I_t + (ρ − I_t)(1 − (1 − p_i I_t)^4).

In the infinite-time limit, the stationary density of infective individuals I_∞ satisfies the equation

I_∞ = (1 − p_r) I_∞ + (ρ − I_∞)(1 − (1 − p_i I_∞)^4).

I_∞ = 0 is always a solution to this equation. This value characterizes the disease-free state. It is a stable stationary state if, and only if, 4ρp_i − p_r ≤ 0. If 4ρp_i − p_r > 0, the stable stationary state is given by the unique positive solution of the equation above. In this case, a nonzero fraction of the population is infected. The system is in the endemic state. For 4ρp_i − p_r = 0, the system, within the framework of the mean-field approximation, undergoes a transcritical bifurcation similar to a second-order phase transition characterized by a nonnegative order parameter, whose role is played, in this model, by the stationary density of infected individuals I_∞.
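The transcritical bifurcation of the mean-field recurrence is easy to exhibit numerically. The sketch below (illustration only; ρ, p_r, the initial density, and the number of iterations are arbitrary choices) iterates the recurrence for I_t to its stationary value on both sides of the threshold 4ρp_i = p_r.

```python
def sis_stationary(p_i, p_r, rho=0.6, i0=0.01, steps=5000):
    """Iterate I_{t+1} = (1 - p_r) I_t + (rho - I_t)(1 - (1 - p_i I_t)^4) to its fixed point."""
    i = i0
    for _ in range(steps):
        i = (1 - p_r) * i + (rho - i) * (1 - (1 - p_i * i) ** 4)
    return i

p_r = 0.3
for p_i in (0.10, 0.125, 0.15, 0.30):   # mean-field threshold: p_i = p_r / (4 rho) = 0.125
    print(p_i, round(sis_stationary(p_i, p_r), 4))
```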
In the numerical simulations done by Boccara and Cheong [52], the total density of individuals was equal to 0.6, slightly above the site percolation threshold in two dimensions for the square lattice, in order to be able to observe cooperative effects. Most simulations were performed on a 100 × 100 lattice and some on a 200 × 200 lattice to check possible size effects. In the case of short-range moves, for given values of p_r and m there exists a critical value p_{ci} of the probability for a susceptible to become infected. At this bifurcation point, the stationary density of infective individuals I_∞(m) behaves as (p_i − p_{ci})^β. When m = 0, the exponent β is close to 0.6, which is the value obtained for two-dimensional directed percolation. When m = ∞, that is, for the mean-field approximation, β = 1. Because it neglects correlations, which play an essential role in the neighborhood of a second-order phase transition, the mean-field approximation cannot correctly predict the critical behavior of short-range interaction systems [45]. For standard probabilistic cellular automata, this is also the case [42, 53]. For a given value of p_r, the variations of β and p_{ci} as functions of m are found to exhibit two regimes reminiscent of crossover phenomena. In the small-m regime (i.e., for m ≲ 10), p_{ci} and particularly β have their m = 0 values. In the large-m regime (i.e., for m ≳ 300), p_{ci} and β have their mean-field values. In agreement with what is known in phase transition theory, the exponent β does not seem to depend upon p_r; i.e., its value does not change along the second-order transition line. The fact that, for m = 0, the value of β for this model is equal to the two-dimensional directed percolation value strongly suggests that the critical properties of this model are universal (i.e., model-independent). For given values of p_i and p_r, the asymptotic behaviors of the stationary density of infective individuals I_∞(m) for small and large values of m may be characterized by the exponents

α_0 = lim_{m→0} log(I_∞(m) − I_∞(0)) / log m,
α_∞ = lim_{m→∞} log(I_∞(∞) − I_∞(m)) / log m.

It is found that α_0 = 0.177 ± 0.15 and α_∞ = −0.945 ± 0.065. The fact that α_0 is rather small shows the importance of motion in the spread of a disease. The stationary number of infective individuals increases dramatically when the individuals start to move. In other words, the response ∂I_∞(m)/∂m of the stationary density I_∞(m) to the degree of mixing m tends to ∞ when m tends to 0. In the case of long-range moves, for a fixed value of p_r, the variations of p_{ci} and β are very different from those for short-range moves. Whereas for short-range moves β and p_{ci} do not vary in the small-m regime, for long-range moves, on the contrary, the derivatives of β and p_{ci} with respect to m tend to ∞ as m tends to 0. For small m, the asymptotic behaviors of β and p_{ci} may therefore be characterized by the exponents
α_β = lim_{m→0} log(β(m) − β(0)) / log m,
α_{p_{ci}} = lim_{m→0} log(p_{ci}(m) − p_{ci}(0)) / log m.

Both exponents are found to be close to 0.5.

Example 51. Rift Valley Fever in Senegal. The Rift Valley Fever is a viral disease infecting sheep, cattle, and humans in contact with infected animals. The virus is transmitted among susceptible hosts by mosquitoes such as the Aedes vexans and Culex pipiens. The eggs may remain in diapause (i.e., a period of reduced metabolic activity) until the damp soil on which the eggs are laid is flooded to form a pool suitable for the larvae. A model for the propagation of the disease in Senegal explaining the appearance of endemicity in subsahelian regions has been given by Dubois, Favier, and Sabatier [110]. In order to take into account seasonal migration of herds and shepherds, the authors consider a two-dimensional square lattice in which a viremic herd at a given site can infect, with a probability α, herds located in its von Neumann neighborhood and, with a probability β, a randomly selected distant site. This model is not, strictly speaking, a site-exchange cellular automaton. A herd infecting a distant site does so by moving back and forth between two consecutive time steps, i.e., a given herd always stays at the same location. A time step corresponds to a full year during which the seasonal migration took place. It is clear that interesting results are expected only for nonzero values of both α and β. While for α ≠ 0 and β = 0 we have a classical cellular automaton epidemic model, for α ≠ 0 and β ≠ 0 we have what is called a small-world model (see next chapter), which may exhibit an explosive propagation of the disease.

When the motion of the individuals is an important factor of an agent-based model, instead of a sequential site-exchange rule, one could use a synchronous moving rule defined, for example, as follows. Consider N random walkers on a d-dimensional hypercubic lattice, each lattice site being occupied by at most one random walker. At each time step, each walker decides to move to one of his 2d neighboring sites with a probability equal to 1/2d. If this neighboring site is empty, the walker performs the move; otherwise, he does not. If two or more random walkers choose to move to the same empty site, then one can imagine many rules to resolve this conflict. For example, timorous walkers could agree not to move, while amiable walkers could agree on randomly selecting one of them to move to the empty site.34 The multiple amiable random walkers should have, when the number of iterations is small, a degree of mixing slightly higher than the timorous model (see Exercise 6.6). The essential difference between sequential site-exchange rules and multiple
A cellular automaton model implementing these rules has been published by Nishidate, Baba, and Gaylord [270].
random walker cellular automaton rules is that the degree of mixing, measured by the reduced Hamming distance between two configurations, can be continuously increased from zero in the first case, whereas it jumps to a finite value after only one iteration in the second case.
6.7 Artificial societies

Lately there has been a growing interest in agent-based modeling in the social sciences.35 Following Epstein and Axtell, agent-based models of social processes are referred to as artificial societies. The aim of these models, which may include [117] “trade, migration, group formation, combat, interaction with an environment, transmission of culture, propagation of disease, and population dynamics,” is “to discover fundamental local or micro mechanisms that are sufficient to generate the macroscopic social structures and collective behaviors of interest.”

Example 52. Spatial segregation. In probably the first attempt at agent-based modeling in the social sciences, Schelling [306, 307] devised a very simple model of spatial segregation in which individuals prefer that a minimum fraction of their neighbors be of their own “color.” Consider a two-dimensional cellular automaton on Z²_L (i.e., an L × L square lattice with periodic boundary conditions). Each site is either occupied by an individual with a probability ρ or empty with a probability 1 − ρ. Individuals are of two different types. The state of an empty site is 0, while the state of an occupied site is an integer a ∈ {1, 2} representing the individual type (race, gender, social class). The local evolution rule consists of the following steps.
1. Each individual counts how many individuals in his Moore neighborhood (eight sites) share with him the same type.
2. If this number is larger than or equal to four, the individual stays at the same location; if this is not the case, the individual tries to move to a randomly selected nearest-neighboring site.
3. Individuals who decide to move do so according to the multiple random walkers model described on page 242, in which when two or more random walkers are facing the same empty site they agree not to move (timorous walkers).
Starting from a random configuration, Figure 6.19 shows the formation of segregated neighborhoods. Since an unsatisfied individual moves to a randomly selected location, he may dissatisfy his new neighbors who, consequently, will try to move to a new location. Long-range moves not being permitted, some unsatisfied individuals who would like to move to new locations cannot do so if they are trapped in a neighborhood fully occupied
See, in particular, Epstein and Axtell [117] and Axelrod [17].
Fig. 6.19. Spatial segregation. Lattice size: 20 × 20; number of individuals: 238 (138 of type 1 and 130 of type 2 represented by light grey and dark grey squares, respectively). The left figure represents the initial random configuration, and the right one shows the formation of segregated neighborhoods after 2000 iterations. Note the existence of two individuals trapped in Moore neighborhoods fully occupied by individuals of a different type.
by individuals of a different type. Because individuals move away from other individuals of different type, they also move away from empty sites that have no type. As a consequence, empty sites also segregate. Remark 18. If we increase the number of types, say from two to four (see Figure 6.20), as expected, it takes more iterations to observe well-segregated neighborhoods.
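A synchronous version of this rule fits in a few lines. The sketch below (an illustration written for this text, not Schelling's original procedure; lattice size, density, and the number of iterations are arbitrary, and conflicts are resolved with the timorous rule described above) performs repeated updates of the two-type model.

```python
import numpy as np

def schelling_step(grid, rng):
    """One step: individuals with fewer than 4 like Moore neighbors each pick a random
    nearest-neighboring site; a move is made only if the target is empty and uncontested."""
    L = grid.shape[0]
    same = np.zeros_like(grid, dtype=int)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0):
                shifted = np.roll(np.roll(grid, di, axis=0), dj, axis=1)
                same += (shifted == grid) & (grid != 0)
    unhappy = list(zip(*np.nonzero((grid != 0) & (same < 4))))
    targets = {}
    for (i, j) in unhappy:
        di, dj = ((1, 0), (-1, 0), (0, 1), (0, -1))[rng.integers(4)]
        targets.setdefault(((i + di) % L, (j + dj) % L), []).append((i, j))
    empty_before = grid == 0
    for t, movers in targets.items():
        if empty_before[t] and len(movers) == 1:     # empty target, no conflict (timorous walkers)
            (i, j) = movers[0]
            grid[t], grid[i, j] = grid[i, j], 0
    return len(unhappy)

rng = np.random.default_rng(6)
grid = rng.choice([0, 1, 2], size=(20, 20), p=[0.4, 0.3, 0.3])
for _ in range(2000):
    n_unhappy = schelling_step(grid, rng)
print("unsatisfied individuals after 2000 steps:", n_unhappy)
```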
Since individuals were only permitted to perform short-range moves, this model is not a very realistic model of spatial segregation. However, it is rather realistic if we view it as a model of clustering of people, say by gender or generation, at a party. In a more realistic model of spatial segregation, we should allow unsatisfied individuals to perform long-range moves to reach a distant preferable location. A model of spatial segregation allowing individuals to perform long-range moves has been proposed by Gaylord and D’Andria in an interesting book [140] intended for the reader who wishes to create his own computer programs to model artificial societies. As in the model above, individuals of two different types occupy a fraction ρ of the sites of a finite square lattice with periodic boundary conditions. The state of an empty site is 0, and the state of an occupied site is either 1 or 2 according to the individual type. The evolution rule consists of the following steps. 1. Build up the list of the individuals of each type who want to move using the rule that an individual wants to move if his Moore neighborhood contains fewer individuals of his own type than the other type.
Fig. 6.20. Spatial segregation. Lattice size: 20 × 20; number of individuals: 268 (70, 69, 61, and 68 of, respectively, types 1, 2, 3, and 4, represented by darker shades of grey from light grey to black). The left figure represents the initial random configuration, and the right one shows the formation of segregated neighborhoods after 10,000 iterations.
2. List the positions of all individuals of type 1 who want to move.
3. List the positions of all individuals of type 2 who want to move.
4. List the positions of all empty sites that are suitable for individuals of type 1.
5. List the positions of all empty sites that are suitable for individuals of type 2.
6. If the length of the list of a given type of individual who wants to move is not equal to the length of the list of empty sites suitable to that type of individual, match the lengths of the two lists by randomly eliminating elements of the longest list.
7. Carry out the moves by swapping the positions of individuals of a given type who want to move with the positions of the empty sites suitable to that type of individual.
As expected, it takes many fewer iterations to observe the formation of segregated neighborhoods when individuals are permitted long-range moves. Figure 6.21 shows the initial random configuration and the resulting configuration after 200 iterations. Note that in this model an individual does not take into account empty sites in deciding to stay or to move. This is why, in contrast with the previous model, there is no segregation of empty sites.

Example 53. Dissemination of culture. Axelrod [16] has proposed a model for the dissemination of culture in which an individual's culture is described by a list of features such as political affiliation or dress style. For each feature there is a set of traits, which are the alternative values the feature may have, such
Fig. 6.21. Spatial segregation allowing long-range moves. Lattice size: 20×20; number of individuals: 254 (125 of type 1 and 129 of type 2 represented by light grey and dark grey squares, respectively). The left figure represents the initial random configuration, and the right one shows the formation of segregated neighborhoods after 200 iterations.
as the different colors of a piece of clothing. If we assume that there are five features and each feature can take on any one of ten traits, then the culture of an individual is represented by a sequence of five digits such as (4, 5, 9, 0, 2). The degree of cultural similarity between two individuals is defined as the percentage of their features that have the identical trait. For example, the degree of cultural similarity between (4, 5, 9, 0, 2) and (4, 7, 5, 0, 2) is equal to 60%, for these two sequences share traits for three of the five cultural features. The basic idea is that individuals who are similar to each other are likely to interact and then become even more similar. This process of social influence is implemented by assuming that the probability of interaction between two neighboring individuals is equal to their degree of similarity. The local evolution rule consists of the following steps. 1. Select at random a site to be active and one of its neighbors. 2. Select at random a feature on which the active site and its neighbor differ, and change the active site’s trait on this feature to the neighbor’s trait on this feature with a probability equal to the degree of similarity between the two sites. If the two sites have a degree of similarity equal to 100%, nothing happens. Applying this evolution rule repeatedly, Axelrod found that cultural features tend to be shared over larger and larger regions. Since two neighboring sites with completely different cultures cannot interact, the process eventually stops with several surviving cultural regions.
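The elementary Axelrod update is equally short. The sketch below (an illustration written for this text, not Axelrod's implementation; lattice size, number of features, number of traits, number of events, and the periodic boundaries are arbitrary choices) repeatedly applies the two steps above on a square lattice of "villages" and counts the distinct cultures that remain.

```python
import numpy as np

def axelrod_step(culture, rng):
    """One event: a random site interacts with a random neighbor with probability
    equal to their cultural similarity and copies one differing trait."""
    L, _, F = culture.shape                     # L x L sites, F cultural features
    i, j = rng.integers(L, size=2)              # active site
    di, dj = ((1, 0), (-1, 0), (0, 1), (0, -1))[rng.integers(4)]
    ni, nj = (i + di) % L, (j + dj) % L         # random neighbor
    same = culture[i, j] == culture[ni, nj]
    similarity = same.mean()                    # degree of cultural similarity
    if 0 < similarity < 1 and rng.random() < similarity:
        f = rng.choice(np.flatnonzero(~same))   # a feature on which they differ
        culture[i, j, f] = culture[ni, nj, f]   # the active site adopts the neighbor's trait

rng = np.random.default_rng(7)
culture = rng.integers(10, size=(20, 20, 5))    # 5 features, 10 traits per feature
for _ in range(200000):
    axelrod_step(culture, rng)
cultures = {tuple(c) for c in culture.reshape(-1, culture.shape[-1])}
print(len(cultures), "distinct cultures remain")
```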
An interesting result concerns the influence of the number of features and number of traits per feature on the number of surviving cultural regions: The number of surviving cultural regions is much smaller if there are many cultural features with few traits per feature than if there are few cultural features with many traits per feature. Increasing the number of cultural features increases the interaction probability since there is a greater chance that two neighboring sites will have the same trait on at least one feature. On the contrary, increasing the number of traits per feature has the opposing effect since, in this case, there is a smaller chance that two neighboring sites will have the same trait. In the original Axelrod model, there is no movement and the sites may be viewed as villages. Gaylord and D’Andria [140] proposed a modified version of the Axelrod model incorporating random motion of the individuals36 and bilateral, instead of unilateral, cultural exchange. Their model is a twodimensional cellular automaton on Z2L . Each site is either occupied by an individual with a probability ρ or empty with a probability 1 − ρ. The state of an empty site is 0, while the state of an occupied site is an ordered pair (i, j). The first element i is a list of s specific traits of cultural features, each trait having an integer value between 1 and m. The second element j, randomly selected at each time step in the set {n, e, s, w}, indicates the direction faced by the individual (i.e., north, east, south or west). The local evolution rule consists of two subrules. 1. The first subrule is a bilateral cultural exchange rule: two individuals facing each other interact if they have some, but not all, traits in common. The interaction is carried out by randomly selecting one feature with traits having different values and changing these traits to a common value for both individuals with a probability equal to the degree of cultural similarity between the two individuals. The new value taken by the trait is randomly selected between the two differing values. For example, if a north-facing individual’s traits list is (1, 4, 2, 7, 9) and her south-facing neighbor’s traits list is (1, 8, 2, 7, 5), these individuals will change either their second or fifth feature trait with a probability equal to 0.6. If the fifth trait is randomly chosen, then the new lists will be (1, 4, 2, 7, x) and (1, 8, 2, 7, x), where x is an integer selected at random in the set {5, 6, 7, 8, 9}. 2. The second rule is the multiple random walkers moving rule defined on page 242 in which when two or more random walkers are facing the same empty site they agree not to move (timorous walkers). 36
Mobility had already been included in Axelrod’s model and was shown to result in fewer stable cultural regions. See Axtell, Axelrod, Epstein, and Cohen [18]. See an adapted version in Axelrod [17].
Fig. 6.22. Dissemination of culture among individuals, each of whom has two possible traits for each of their two features. Empty sites are white, and the traits lists (1, 1), (1, 2), (2, 1), and (2, 2) are darker shades of grey from light grey to black. Lattice size: 20 × 20; number of individuals: 335. The left figure represents the initial random configuration, and the right one is the resulting configuration after 10,000 iterations.
Figure 6.22 shows the dissemination of culture among 335 individuals on a 20 × 20 lattice. Each individual has two possible traits for each of their two features. The table below gives the number of individuals for each traits list for the initial random configuration and for the configuration obtained after 10,000 time steps.

traits list              (1, 1)   (1, 2)   (2, 1)   (2, 2)
initial configuration       102       89       70       74
final configuration         211        0      114       10

The traits list (1, 2) has completely disappeared, and only ten individuals having the traits list (2, 2) remain. The dominant “cultures” are (1, 1) and (2, 1).

In the social sciences, the interaction between members of a population is often modeled as a game. Game theory was originally designed by John von Neumann and Oskar Morgenstern [258] to solve problems in economics. In a two-person game, the players try to outsmart one another by anticipating each other's strategies. The “stone-paper-scissors” game played by most children makes clear what is meant by strategy. The game's rules are:
1. scissors defeats paper,
2. paper defeats stone,
3. stone defeats scissors.
Assuming that the winner wins one unit from the loser, the outcomes are listed in the following table:
                              player B
                      stone      paper     scissors
          stone      (0, 0)    (−1, 1)     (1, −1)
player A  paper      (1, −1)    (0, 0)     (−1, 1)
          scissors   (−1, 1)   (1, −1)      (0, 0)

Each player has three strategies: “stone,” “paper,” and “scissors.” The number of units received by players A and B adopting, respectively, strategies i and j are given by the ordered pair (x_ij, y_ij). For example, if A adopts strategy “scissors” and B strategy “paper,” A wins one unit and B loses one unit, represented by (1, −1). The entries in the table above are referred to as payoffs, and the table of entries is the payoff matrix. Usually, only the payoffs received by player A from player B are listed, and the payoff matrix of the “stone-paper-scissors” game is

[  0  −1   1 ]
[  1   0  −1 ]    (6.39)
[ −1   1   0 ].

In the “stone-paper-scissors” game, the gain of player A equals the loss of player B, and vice versa. Such a game is said to be a zero-sum game. Since the game is entirely determined by the payoff matrix, it is called a matrix game. Suppose players A and B have, respectively, m and n strategies, and let a_ij be the payoff A receives from B. The payoff matrix is

A = [ a_11  a_12  · · ·  a_1n ]
    [ a_21  a_22  · · ·  a_2n ]
    [  · · ·  · · ·  · · ·  · · · ]
    [ a_m1  a_m2  · · ·  a_mn ].

If player A adopts strategy i, his payoff is at least min_{1≤j≤n} a_ij. Since player A wishes to maximize his payoff, he should choose strategy i so as to receive a payoff not less than

max_{1≤i≤m} min_{1≤j≤n} a_ij.

If player B adopts strategy j, his loss is at most max_{1≤i≤m} a_ij. Since player B wishes to minimize his loss, he should choose strategy j so as to lose not more than

min_{1≤j≤n} max_{1≤i≤m} a_ij.
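Both quantities are easy to compute for any payoff matrix. The sketch below (illustration only) evaluates them for the “stone-paper-scissors” matrix (6.39); the two values differ (−1 versus 1), which, as discussed below, means that this game has no saddle point.

```python
import numpy as np

def maximin_minimax(a):
    """Return (max_i min_j a_ij, min_j max_i a_ij) for a payoff matrix a."""
    a = np.asarray(a)
    return a.min(axis=1).max(), a.max(axis=0).min()

# payoff matrix (6.39) of the stone-paper-scissors game (payoffs of player A)
stone_paper_scissors = [[0, -1, 1],
                        [1, 0, -1],
                        [-1, 1, 0]]
lower, upper = maximin_minimax(stone_paper_scissors)
print(lower, upper)    # -1 1: the two values differ, so the game has no saddle point
```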
Theorem 19. Let the payoff matrix of a matrix game be [a_ij]; then

max_{1≤i≤m} min_{1≤j≤n} a_ij ≤ min_{1≤j≤n} max_{1≤i≤m} a_ij.    (6.40)

Since, for all i = 1, 2, . . . , m and for all j = 1, 2, . . . , n, we have

min_{1≤j≤n} a_ij ≤ a_ij and a_ij ≤ max_{1≤i≤m} a_ij,

for all (i, j), we have

min_{1≤j≤n} a_ij ≤ max_{1≤i≤m} a_ij.

The left-hand side of the inequality above does not depend upon j; hence, taking the minimum with respect to j of both sides yields

min_{1≤j≤n} a_ij ≤ min_{1≤j≤n} max_{1≤i≤m} a_ij.

This result being true for all i, taking the maximum with respect to i of both sides, we finally obtain inequality (6.40).

If the elements of the payoff matrix [a_ij] satisfy the relation

max_{1≤i≤m} min_{1≤j≤n} a_ij = min_{1≤j≤n} max_{1≤i≤m} a_ij,    (6.41)
this common value is called the value of the game, and there exist i_s and j_s such that

min_{1≤j≤n} a_{i_s,j} = max_{1≤i≤m} min_{1≤j≤n} a_ij and max_{1≤i≤m} a_{i,j_s} = min_{1≤j≤n} max_{1≤i≤m} a_ij.

Thus,

min_{1≤j≤n} a_{i_s,j} = max_{1≤i≤m} a_{i,j_s}.

But

min_{1≤j≤n} a_{i_s,j} ≤ a_{i_s j_s} ≤ max_{1≤i≤m} a_{i,j_s}.

Hence,

min_{1≤j≤n} a_{i_s,j} = a_{i_s j_s} = max_{1≤i≤m} a_{i,j_s}.

Therefore,

a_{i_s j} ≥ a_{i_s j_s} ≥ a_{i j_s},    (6.42)

which shows that a_{i_s j_s} is equal to the value of the game. Relation (6.42) shows that whichever strategy is adopted by player B, if player A adopts strategy i_s, the payoff cannot be less than the value of the game, and whichever strategy
is adopted by player A, if player B adopts strategy j_s, the payoff cannot be greater than the value of the game. This is the reason why strategies i_s and j_s are said to be optimal strategies for, respectively, players A and B. The pair (i_s, j_s) is called a saddle point of the game, and (i, j) = (i_s, j_s) is a solution of the game. The existence of a saddle point ensures that Relation (6.41) is satisfied. In this case, the element a_{i_s j_s} of the payoff matrix is, at the same time, the minimum of its row and the maximum of its column. Using this remark, we verify that the "stone-paper-scissors" game has no saddle point.
Saddle points are not necessarily unique. If (i_{s_1}, j_{s_1}) and (i_{s_2}, j_{s_2}) are two different saddle points, then the relations
\[
a_{i j_{s_1}} \le a_{i_{s_1} j_{s_1}} \le a_{i_{s_1} j}
\quad\text{and}\quad
a_{i j_{s_2}} \le a_{i_{s_2} j_{s_2}} \le a_{i_{s_2} j},
\]
which are valid for all i and j, imply
\[
a_{i_{s_1} j_{s_1}} \le a_{i_{s_1} j_{s_2}} \le a_{i_{s_2} j_{s_2}} \le a_{i_{s_2} j_{s_1}} \le a_{i_{s_1} j_{s_1}};
\]
that is,
\[
a_{i_{s_1} j_{s_1}} = a_{i_{s_1} j_{s_2}} = a_{i_{s_2} j_{s_2}} = a_{i_{s_2} j_{s_1}}.
\]
Hence, if (i_{s_1}, j_{s_1}) and (i_{s_2}, j_{s_2}) are two different saddle points, (i_{s_1}, j_{s_2}) and (i_{s_2}, j_{s_1}) are also saddle points.
When saddle points exist, suitable strategies are clear, but if, as in the "stone-paper-scissors" game, there are no saddle points—that is, when the elements of the payoff matrix [a_{ij}] are such that
\[
\max_{1\le i\le m}\min_{1\le j\le n} a_{ij} < \min_{1\le j\le n}\max_{1\le i\le m} a_{ij}
\]
—the game has no solution in the sense given above. In such a game, players cannot select optimal strategies; they are led to adopt mixed strategies. Let [a_{ij}]_{1\le i\le m,\,1\le j\le n} be the payoff matrix of a matrix game. Mixed strategies for players A and B, respectively, are sequences (x_i)_{1\le i\le m} and (y_j)_{1\le j\le n} of nonnegative numbers such that
\[
\sum_{i=1}^{m} x_i = 1 \quad\text{and}\quad \sum_{j=1}^{n} y_j = 1. \tag{6.43}
\]
If player A adopts strategy i with probability x_i and player B adopts strategy j with probability y_j, then the expected payoff of player A is
\[
\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j.
\]
If S_m (resp. S_n) denotes the set of all sequences X = (x_i)_{1\le i\le m} (resp. Y = (y_j)_{1\le j\le n}) of nonnegative numbers satisfying Conditions (6.43), then the expected payoff of player A using the mixed strategy X = (x_i) ∈ S_m is at least
\[
\min_{Y \in S_n} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j,
\]
and, to maximize this quantity, he should choose a mixed strategy X ∈ S_m such that his expected payoff would be at least equal to
\[
\max_{X \in S_m}\min_{Y \in S_n} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j.
\]
Similarly, player B should choose a mixed strategy Y ∈ S_n to prevent player A from receiving an expected payoff greater than
\[
\min_{Y \in S_n}\max_{X \in S_m} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j.
\]
Proceeding as we did to prove Theorem 19, it can be shown that
\[
\max_{X \in S_m}\min_{Y \in S_n} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j \le \min_{Y \in S_n}\max_{X \in S_m} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j.
\]
Actually, John von Neumann proved in 1928 that for all finite37 two-person zero-sum games38
\[
\max_{X \in S_m}\min_{Y \in S_n} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j = \min_{Y \in S_n}\max_{X \in S_m} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} x_i y_j. \tag{6.44}
\]
This important result is known as the minimax theorem. The pair of sequences (X_s, Y_s) ∈ S_m × S_n is said to be a solution of the game, and X_s and Y_s are optimal mixed strategies for, respectively, players A and B. The common value of both sides of Relation (6.44) is called the value of the game.
In a two-person zero-sum game, the players have conflicting interests: what is gained by one player is lost by the other. In a nonzero-sum game, this is not necessarily the case. Consider, for example, the celebrated prisoner's dilemma. Two prisoners, A and B, suspected of committing a serious crime are isolated in different cells and each is asked whether the other is guilty. Both prisoners want, naturally, to get the shortest prison sentence, and both know the consequences of their decisions. Each prisoner has two strategies: to accuse or not to accuse the other.
1. If both defect—that is, each of them accuses the other—both go to jail for five years.

37 That is, all matrix games having a finite number of strategies.
38 See von Neumann and Morgenstern [258] and, for an alternative simple proof, Jianhua [180].
2. If both cooperate—that is, neither of them accuses the other—both go to jail for one year, say, on account of carrying concealed weapons.
3. If one defects while the other does not, the former goes free and the latter goes to jail for 15 years.
For both prisoners, the only rational choice is to defect. Although each prisoner does not know what the other will do, if he defects, he will either get five years instead of 15 if the other also defects, or go free if the other does not defect. The paradox is that when both prisoners act selfishly and do not cooperate, they both get a five-year sentence, whereas both could have been sentenced to only one year if they had cooperated.
More formally, in the two-person prisoner's dilemma game, if both players cooperate, they receive a payoff R (reward); if both defect, they receive a lesser payoff P (punishment); and if one defects while the other cooperates, the defector receives the largest payoff T (temptation), while the cooperator receives the smallest S (sucker), as shown in the following table:

                cooperation   defection
cooperation       (R, R)        (S, T)
defection         (T, S)        (P, P)

where T > R > P > S and R > (T + S)/2, so that, in an iterated game, in which players can modify their strategies at each time step, each player gains on average more per game by agreeing to repeatedly cooperate (R) than by agreeing to alternately cooperate and defect out of phase ((T + S)/2).
As emphasized by Axelrod and Hamilton in their 1981 paper [15], if two players can meet only once, the best strategy is to defect, but if an individual can recognize a previous interactant and remember some aspects of the previous outcomes, then the strategic situation becomes an iterated prisoner's dilemma with a much richer set of possibilities. A strategy would take the form of a decision rule that determines the probability of cooperation or defection as a function of the history of the interaction so far. Here are a few examples of evolving strategies adopted by players on a square lattice with periodic boundary conditions.
Example 54. Iterated prisoner's dilemma game: individuals have memory of past encounters.
On an L × L lattice with periodic boundary conditions Z_L^2, each site is either occupied by an individual with probability ρ or empty with probability 1 − ρ. The state of an empty site is 0, while the state of an occupied site is a five-tuple where:
1. the first element is an integer representing the individual's name;
2. the second element is either d or c, indicating whether the individual is a defector or a cooperator;
3. the third element is a list of defectors who have interacted with the individual in previous encounters;
4. the fourth element is an integer representing the sum of the payoffs from previous time steps;
5. the last element, randomly selected at each time step in the set {n, e, s, w}, indicates the direction faced by the individual (i.e., north, east, south, or west).
The local evolution rule consists of the following two subrules.
1. Two facing individuals decide to interact if neither of them is on the other individual's list of previously encountered defectors.
2. The second subrule is the multiple random walkers moving rule defined on page 242, in which two or more random walkers facing the same empty site agree not to move (timorous walkers).
In agreement with what is found in the literature, we used the payoff values T = 5, R = 3, P = 1, and S = 0. Figure 6.23 shows that, in the long run, cooperators do better than defectors. In this model, information about defectors is kept private. If individuals are allowed to share information, the average gain of defectors eventually declines.39
Fig. 6.23. Average gain of defectors and cooperators in an iterated prisoner’s dilemma game in which individuals refuse to interact with previously encountered defectors. Lattice size: 30 × 30, number of cooperators: 360, number of defectors: 355, total number of iterations: 1000. The left figure represents the average gain for each population for the first 150 iterations, showing that at the beginning the average gain of defectors is higher.
In Example 54, each individual always adopted the same strategy; and if the cooperators' average gain was higher in the long run than the defectors' average gain, this was due to the fact that the former refused to interact with defectors they had previously encountered.

39 See Gaylord and D'Andria [140], p. 57.
While defection is the unbeatable strategy if individuals know they will never meet again, strategic choices become more complex when, after the current encounter, the two opponents will meet again with a nonzero probability w. To find out what type of strategy was the most effective, Axelrod in 1980 conducted two computer tournaments [13, 14]. The strategies were submitted by a variety of people, including game theorists in economics, sociology, and political science, as well as professors of evolutionary biology, physics, and computer science. In both cases, the strategies were paired in a round-robin tournament, and in both cases the winner was Anatol Rapoport,40 who submitted the simplest strategy, called tit for tat. This strategy consists in cooperating on the first encounter (i.e., being nice) and then doing whatever the opponent did in the previous encounter (i.e., being provocable and forgiving). Axelrod and Hamilton [15] argued that, if two individuals have a probability w of meeting again greater than a threshold value, tit for tat is an evolutionarily stable strategy, which means that a population of individuals using that strategy cannot be invaded by a rare mutant adopting a different strategy.41 They determined the w threshold value as follows.
1. If two players adopt the tit-for-tat strategy, each receives a payoff R at each time step, that is, a total payoff42
\[
R + Rw + Rw^2 + \cdots = \frac{R}{1-w}.
\]
2. If a player adopts the tit-for-tat strategy against another player who always defects, the latter receives a payoff T the first time and P thereafter, so it cannot invade a population of tit-for-tat strategists if
\[
\frac{R}{1-w} \ge T + \frac{Pw}{1-w}
\]
or
\[
w > \frac{T-R}{T-P}. \tag{6.45}
\]
3. If a player adopts the tit-for-tat strategy against another player who alternates defection and cooperation, the latter receives the sequence of payoffs T, S, T, S, ..., and so cannot invade a population of tit-for-tat strategists if
\[
\frac{R}{1-w} \ge \frac{T + Sw}{1 - w^2}
\]
or
\[
w > \frac{T-R}{R-S}. \tag{6.46}
\]
From these two results, it follows that no mutant adopting one of these two strategies can invade a population of tit-for-tat strategists if w is such that
\[
w > \max\left(\frac{T-R}{T-P}, \frac{T-R}{R-S}\right). \tag{6.47}
\]

40 An interesting early discussion of the prisoner's dilemma game "addressed specifically to experimental psychologists and generally to all who believe that knowledge about human behavior can be gained by the traditional method of interlacing theoretical deductions with controlled observations" can be found in Rapoport and Chammah with the collaboration of Carol Orwant [293].
41 On the concept of stability in evolutionary game theory, see Maynard Smith [235] and Cressman [96].
42 The game is supposed to go on forever.
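As an illustrative check (not part of the original text), the short Python sketch below evaluates the expected discounted payoffs appearing in items 1–3 for the conventional values T = 5, R = 3, P = 1, S = 0, and compares them with the threshold (6.47); with these values the threshold is w > max(1/2, 2/3) = 2/3.

```python
T, R, P, S = 5, 3, 1, 0   # conventional iterated prisoner's dilemma payoffs

def v_tft_vs_tft(w):      # total discounted payoff of TFT against TFT: R/(1-w)
    return R / (1 - w)

def v_alld_vs_tft(w):     # payoff of an always-defector against TFT: T + Pw/(1-w)
    return T + P * w / (1 - w)

def v_alt_vs_tft(w):      # payoff of an alternating defector/cooperator against TFT: (T + Sw)/(1-w^2)
    return (T + S * w) / (1 - w**2)

w_star = max((T - R) / (T - P), (T - R) / (R - S))   # threshold (6.47)
for w in (0.5, 0.9):
    print(w, v_tft_vs_tft(w) >= v_alld_vs_tft(w), v_tft_vs_tft(w) >= v_alt_vs_tft(w))
print("threshold:", w_star)   # 0.666...
```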
According to Axelrod and Hamilton [15], if w satisfies (6.47), a population of tit-for-tat strategists cannot be invaded not only by a mutant who either always defects or alternates defection and cooperation but also, as a consequence, by any mutant adopting another strategy. This result, which would imply that the tit-for-tat strategy is evolutionarily stable, has been questioned by Boyd and Lorberbaum [70] (see also May [232]). Essentially, the argument of these authors is that Axelrod and Hamilton proved that, if tit for tat becomes common in a population where w is sufficiently high, it will remain common because such a strategy is collectively stable. A strategy S_e is collectively stable if, for any possible strategy S_i, V(S_e | S_e) ≥ V(S_i | S_e), where V(S_1 | S_2) is the expected fitness of individuals who use strategy S_1 when interacting with individuals using strategy S_2. But collective stability does not imply evolutionary stability.43
Tit for tat is nevertheless a very successful strategy, and there have been a few attempts to test tit for tat as a model to explain cooperative behavior among animals in nature. It seems that the first test was reported by Lombardo [211], who examined the interactions between parent and nonbreeding tree swallows (Tachycineta bicolor) and showed that parents and nonbreeders seem to resolve their conflicting interests by apparently adopting a tit-for-tat strategy. He used stuffed model nonbreeders and simulated acts of defection by nonbreeders by making it appear as though nonbreeders had killed nestlings. A more sophisticated test has been reported by Milinski [244], who studied the behavior of three-spined sticklebacks (Gasterosteus aculeatus) when approaching a potential predator to ascertain whether it poses a threat or not. Using a system of mirrors, single three-spined sticklebacks "inspecting" a predator were provided with either a simulated cooperating companion or a simulated defecting one. In both cases, the test fish adopted the tit-for-tat strategy.
Example 55. Iterated prisoner's dilemma game: tit-for-tat strategists against defectors.
Consider a population of mobile players in which most individuals adopt the tit-for-tat strategy and the rest always defect. As illustrated in Figure 6.24, the average gain of individuals adopting the tit-for-tat strategy is eventually higher than the average gain of a small fraction of individuals who always defect.43

43 See Exercise 6.10.
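The population-level effect described in Example 55 can be illustrated, in a much simplified, well-mixed (non-spatial) setting, by the following sketch: a population containing mostly tit-for-tat players and a few unconditional defectors plays a fixed number of pairwise rounds, and the average gain of each type is compared. The population sizes are those quoted in Figure 6.24; the number of rounds and the random pairing scheme are illustrative assumptions, not the book's model.

```python
import random

T, R, P, S = 5, 3, 1, 0
N_TFT, N_ALLD, ROUNDS = 459, 62, 100   # population sizes as in Figure 6.24; ROUNDS is arbitrary

def play(strat1, strat2, rounds):
    """Total payoffs of two strategies over `rounds` encounters.
    A strategy maps the opponent's previous move ('c', 'd' or None) to its own move."""
    payoff = {('c', 'c'): (R, R), ('c', 'd'): (S, T), ('d', 'c'): (T, S), ('d', 'd'): (P, P)}
    p1 = p2 = 0
    last1 = last2 = None
    for _ in range(rounds):
        m1, m2 = strat1(last2), strat2(last1)
        a, b = payoff[(m1, m2)]
        p1, p2 = p1 + a, p2 + b
        last1, last2 = m1, m2
    return p1, p2

tft = lambda last: 'c' if last in (None, 'c') else 'd'   # tit for tat
alld = lambda last: 'd'                                  # always defect

# Each player is matched with a randomly chosen partner (well-mixed approximation).
players = ['tft'] * N_TFT + ['alld'] * N_ALLD
gain = {'tft': 0.0, 'alld': 0.0}
for a, b in zip(players, random.sample(players, len(players))):
    pa, pb = play(tft if a == 'tft' else alld, tft if b == 'tft' else alld, ROUNDS)
    gain[a] += pa
    gain[b] += pb

print("average gain per TFT player:", gain['tft'] / N_TFT)
print("average gain per defector:", gain['alld'] / N_ALLD)
```

Because tit-for-tat players mostly meet each other and sustain mutual cooperation, their average gain exceeds that of the defectors, in qualitative agreement with Figure 6.24.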
Fig. 6.24. Average gain of individuals adopting the tit-for-tat strategy or always defecting in an iterated prisoner's dilemma game. Lattice size: 25 × 25; number of individuals who always defect: 62; number of individuals adopting the tit-for-tat strategy: 459; total number of iterations: 600.
Example 56. Iterated prisoner's dilemma game: imitating the most successful neighbor.
Nowak and May [271] have obtained fascinating results concerning the evolution of cooperation among individuals located at the sites of a two-dimensional square lattice. The individuals interact with their neighbors through simple deterministic rules and have no memory. The evolution produces constantly changing, surprising spatial patterns reminiscent of Persian carpets, in which both cooperators and defectors persist forever.
Consider an L × L square lattice in which initially all sites are occupied by cooperators except a few sites located in the middle of the lattice and disposed symmetrically that are occupied by defectors. At each time step, each individual plays with all the individuals located in his Moore neighborhood, including himself. He then compares his payoff to the payoffs of his neighbors and adopts at the next time step the strategy of the most successful neighbor. The outcome depends on the initial configuration and the value of the single parameter b characterizing the reduced payoff values:
\[
S = P = 0, \qquad R = 1, \qquad T = b > 1.
\]
For b = 2, successive patterns are shown in Figure 6.25. Note that regions occupied by individuals who defect or cooperate twice in a row are separated by narrow boundaries of individuals who just changed their strategy. Since the early 1980s, the iterated prisoner’s dilemma game has become the most important model for studying the evolution of cooperation among a population of selfish individuals, and a very large number of articles have been published on the subject.
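The update rule of Example 56 is simple enough to sketch directly. The following Python code is a minimal sketch, not the authors' original program: it uses periodic boundary conditions, 0-based indices for the four initial defector sites quoted in the caption of Figure 6.25, and keeps a site's own strategy in case of a payoff tie; these are assumptions on details the text does not fix.

```python
import numpy as np

L, b, steps = 60, 2.0, 20
s = np.ones((L, L), dtype=int)          # 1 = cooperator, 0 = defector
for (i, j) in [(28, 28), (28, 32), (32, 28), (32, 32)]:
    s[i, j] = 0                         # initial defectors

def step(s):
    # Each site plays with the 9 sites of its Moore neighborhood (itself included).
    # With S = P = 0, R = 1, T = b, a cooperator earns 1 per cooperating neighbor,
    # and a defector earns b per cooperating neighbor.
    coop_neighbors = sum(np.roll(np.roll(s, di, 0), dj, 1)
                         for di in (-1, 0, 1) for dj in (-1, 0, 1))
    payoff = np.where(s == 1, coop_neighbors, b * coop_neighbors)
    # Each site then adopts the strategy of the most successful site in its neighborhood.
    best_payoff, best_strategy = payoff.copy(), s.copy()
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            p = np.roll(np.roll(payoff, di, 0), dj, 1)
            q = np.roll(np.roll(s, di, 0), dj, 1)
            better = p > best_payoff
            best_payoff = np.where(better, p, best_payoff)
            best_strategy = np.where(better, q, best_strategy)
    return best_strategy

for _ in range(steps):
    s = step(s)
print("fraction of cooperators after", steps, "steps:", s.mean())
```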
Fig. 6.25. Successive patterns generated from an initial state in which all individuals located at the sites of a 60 × 60 square lattice are cooperators except the four sites (28, 28), (28, 32), (32, 28), and (32, 32), which are occupied by defectors. A site is colored based on the strategies adopted at times t − 1 and t by the individual located at that site according to the following code: if the individual defects at times t − 1 and t, the site is black; if the individual defects at t − 1 and cooperates at t, the site is dark grey; if the individual cooperates at t − 1 and defects at t, the site is light grey; and if the individual cooperates at t − 1 and t, the site is very light grey.
Exercises

Exercise 6.1 We have seen that configurations in the limit set of elementary cellular automaton rule 18 consist of sequences of 0s of odd lengths separated by isolated 1s, as illustrated in Figure 6.6. The evolution towards the limit set can be viewed in terms of interacting particle-like defects.44 Defects in the case of rule 18 are only of one type, namely two

44 Boccara, Nasser, and Roger [48].
consecutive 1s, which generate a sequence of an even number of 0s. These defects (kinks) were first studied by Grassberger [154], who showed that they perform a random walk. When two defects meet, they annihilate, and this implies, in the case of an infinite lattice, that the density of these defects decreases as t^{-1/2}, where t is the number of iterations. Illustrate this annihilation process starting from an initial configuration containing only two defects.

Exercise 6.2 Let S = {0, 1}^Z be the set of all configurations and M a surjective mapping on S.45
(i) If F is a global one-dimensional cellular automaton rule of radius r, show that there exists a unique global cellular automaton rule Φ of radius not greater than r such that46
\[
\Phi \circ M = M \circ F \tag{R}
\]
if, and only if, for any configuration c ∈ S, the image by M ◦ F of the set M^{-1}(c) contains only one element. When Φ exists, it is called the transform of F by M. Prove that, if Φ exists, it is unique.
(ii) It can be shown that an n-input mapping M is surjective if, and only if, each n-block has exactly the same number of preimages. Using this result, show that the 3-input mapping M_30, whose Wolfram code number is 30 (Wolfram code numbers are defined on page 192), is surjective.
(iii) Determine all radius 1 rules F that have a transform Φ under the 3-input surjective mapping 30.

Exercise 6.3 Consider a string of 0s and 1s of finite length L; the density classification problem is to find a two-state cellular automaton rule f_ρ such that the evolution of the string according to f_ρ (assuming periodic boundary conditions) converges to a configuration that consists of 1s only (resp. 0s only) if the density of 1s (resp. 0s) is greater than (resp. less than) a given value ρ. It has been found that there exists no such rule [199]. The problem can, however, be solved using two cellular automaton rules. Following Fukś [133], if ρ = 1/2, the solution is first to let the string evolve according to radius 1 rule 184 for a number of time steps of the order of magnitude of half of the string length and then again to let the last iterate evolve according to radius 1 rule 232 for a number of time steps of the order of magnitude of half the string length. Justify this result.

Exercise 6.4 A two-lane one-way car traffic flow consists of two interacting lanes with cars moving in the same direction, with the possibility for cars blocked in their lane to shift to the adjacent lane. If we assume that the road is a circular highway with neither entries nor exits, such a system may be modeled as a one-dimensional cellular automaton with periodic boundary conditions in which the state of cell i is a two-dimensional vector s_i = (s_i^1, s_i^2), where (s_i^1, s_i^2) ∈ {0, 1} × {0, 1},

45 A mapping M on S is surjective if, for all c ∈ S, the set of all preimages M^{-1}(c) of c by M is not empty. That is, each configuration c ∈ S has at least one preimage.
46 M has to be surjective because the domain of definition of Φ has to be the whole space S.
s_i^1 and s_i^2 representing, respectively, the occupation numbers of lane 1 and lane 2 at location i, as shown below:

 ...  s_{i-1}^1   s_i^1   s_{i+1}^1  ...
 ...  s_{i-1}^2   s_i^2   s_{i+1}^2  ...

If the speed limit is v_max = 1, we may adopt the following moving rule:
1. If the site ahead of a site occupied by a car is empty, the car moves to that site.
2. If the site ahead of a site occupied by a car is occupied, the car shifts to the adjacent site if this site and the site behind are both empty.
This rule is a radius 1 rule of the form
\[
(s_{i,1}, s_{i,2})_{t+1} = f\big((s_{i-1,1}, s_{i-1,2})_t, (s_{i,1}, s_{i,2})_t, (s_{i+1,1}, s_{i+1,2})_t\big).
\]
In traffic engineering, the car flow—that is, the product of the car density by the average car velocity—as a function of the car density is called the fundamental diagram. Write down explicitly traffic rule f, and determine the fundamental diagram of this model.

Exercise 6.5 Let
\[
Q = \underbrace{\{0, 1, 2, \ldots, q-1\} \times \cdots \times \{0, 1, 2, \ldots, q-1\}}_{\ell}.
\]
An n-input one-dimensional cellular automaton ℓ-vectorial rule
\[
(x_1, x_2, \ldots, x_n) \mapsto f(x_1, x_2, \ldots, x_n),
\]
where, for i = 1, 2, ..., n, x_i ∈ Q, is number-conserving if, for all cyclic configurations of length L ≥ n,
\[
\sum_{j=1}^{\ell} \big( f_j(x_1, x_2, \ldots, x_n) + f_j(x_2, x_3, \ldots, x_{n+1}) + \cdots + f_j(x_L, x_1, \ldots, x_{n-1}) \big)
= \sum_{j=1}^{\ell} \big( x_{1,j} + x_{2,j} + \cdots + x_{L,j} \big),
\]
where (f_1, f_2, \ldots, f_\ell) are the components of f and (x_{i,1}, x_{i,2}, \ldots, x_{i,\ell}) the components of x_i. Extend the proof of Theorem 13 and find the necessary and sufficient condition for vectorial rule f to be number-conserving.

Exercise 6.6 In order to evaluate the degree of mixing of the multiple random walkers cellular automaton model described at the end of Section 6.6 (Nishidate-Baba-Gaylord model), study, as a function of the number of time steps, the reduced Hamming distance between the initial configuration and the configuration obtained after t time steps.

Exercise 6.7 The Eden model [114] is one of the simplest growth models. Consider a finite square lattice with free boundary conditions, and suppose that, at time t, some sites are occupied while the other sites are empty. The growth rule consists of selecting at random an empty site that is a nearest neighbor of an occupied site, and occupying that site at t + 1. Assume that at t = 0 only the central site of the lattice is occupied. According to Eden, this process may reasonably represent the growth of bacterial cells or tissue cultures of cells that are constrained from moving. For example, the common sea lettuce (Ulva lactuca) grows as a sheet only two cells thick at its periphery. Write a small program to generate Eden clusters.
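A minimal Python sketch of the Eden growth rule just described (the lattice size and number of growth steps are arbitrary choices):

```python
import random

L, STEPS = 201, 4000           # lattice size and number of growth steps (arbitrary)
occupied = {(L // 2, L // 2)}  # start from the central site

def neighbors(site):
    i, j = site
    return [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]

# Perimeter sites: empty nearest neighbors of the cluster.
perimeter = set(neighbors((L // 2, L // 2)))

for _ in range(STEPS):
    site = random.choice(tuple(perimeter))   # pick a perimeter site at random
    occupied.add(site)
    perimeter.remove(site)
    for nb in neighbors(site):               # update the perimeter (free boundary conditions)
        if nb not in occupied and 0 <= nb[0] < L and 0 <= nb[1] < L:
            perimeter.add(nb)

print("cluster size:", len(occupied))
```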
Fig. 6.26. Bethe lattice for z = 3.

Exercise 6.8 In most lattice models, the existence of closed paths (also called loops) causes difficulties in finding exact solutions. In statistical physics, special lattices with no loops have been very helpful in understanding critical phenomena. Such lattices, called Bethe lattices, are undirected graphs in which all vertices have the same degree z. An example for z = 3 is represented in Figure 6.26.
(i) Find the site percolation critical probability of a Bethe lattice, and show that the critical exponent β, characterizing the critical behavior of the site percolation probability P_\infty^{site}(p) for p > p_c^{site}, is equal to 1 for all values of z. Examine the case z = 3.
(ii) Answer the same questions for the bond percolation model.

Exercise 6.9 Motivated by the problem of one fluid displacing another from a porous medium under the action of capillary forces, Wilkinson and Willemsen [345] considered a new kind of percolation problem, which may describe any kind of invasion process proceeding along a path of least resistance. The invasion percolation model considered here consists of a two-dimensional square lattice in which a random number, uniformly distributed in the interval [0, 1], is assigned to each lattice site. A cluster of invaders grows by occupying the lattice site of its perimeter that has the smallest random number. By perimeter of a cluster we mean all the sites that are nearest neighbors of cluster sites. Assuming that clusters of invaders grow from a single site localized at the origin, write and run a program to generate invader clusters.

Exercise 6.10 On page 256 we mentioned that Boyd and Lorberbaum questioned the evolutionary stability of the tit-for-tat strategy. In their discussion, they considered the following two strategies:
1. tit for two tats, which allows two consecutive defections before retaliating, and
2. suspicious tit for tat, which defects on the first encounter but thereafter plays tit for tat.
Compare how tit for tat and tit for two tats behave against suspicious tit for tat, and show that tit for tat is not evolutionarily stable.

Exercise 6.11 In a society, individuals often adopt prevailing political ideas, beliefs, fashion, etc., of their neighbors. In order to model such a process of conforming, consider
the following two-dimensional two-state cellular automaton. Each cell of a square lattice with periodic boundary conditions is occupied by an individual who at each time step can adopt only one of two possible behaviors (being Democrat or Republican, going or not going to church, buying an American or a foreign car, etc.). In order to fit into his neighborhood, at each time step, each individual adopts the prevailing norm of his neighborhood, i.e., if two or more of his four neighbors share his own idea, he keeps it; otherwise he changes. Run a simulation to exhibit the formation of clusters of individuals having identical behaviors.
Solutions

Solution 6.1 The annihilation of two defects for a cellular automaton evolving according to rule 18 is illustrated in Figure 6.27.
Fig. 6.27. Rule 18: annihilation of two defects. The defects (two consecutive 1s) at positions 19 and 28 in the initial configuration meet and annihilate at t = 32. Time is oriented downwards.
Solution 6.2 (i) The process defining the transform Φ of F by the surjective mapping M may be conveniently represented by the following commutative diagram:
\[
\begin{array}{ccc}
S & \xrightarrow{\;M\;} & S \\
\downarrow F & & \downarrow \Phi \\
F(S) & \xrightarrow{\;M\;} & \Phi(S)
\end{array}
\]
Since M is surjective, each configuration c ∈ S has a preimage. Let c_1 and c_2 be two different preimages of a given c (M not being invertible in general, the set M^{-1}(c) of preimages of a given c may contain more than one element). If the transform of rule F exists, then
\[
(\Phi \circ M)(c_1) = \Phi(c) = (M \circ F)(c_1), \qquad (\Phi \circ M)(c_2) = \Phi(c) = (M \circ F)(c_2);
\]
that is, (M ◦ F)(c_1) = (M ◦ F)(c_2). Hence, the transform Φ of a given rule F by a surjective mapping exists if, and only if, for any configuration c ∈ S, the set (M ◦ F)(M^{-1}(c)) contains one, and only one, element. In other words, all the images by M ◦ F of the configurations in M^{-1}(c) are identical.
If Φ_1 and Φ_2 are such that, for all c ∈ S, (Φ_1 ◦ M)(c) = (Φ_2 ◦ M)(c) = (M ◦ F)(c), then, for all c ∈ S,
\[
\big((\Phi_1 - \Phi_2) \circ M\big)(c) = 0,
\]
which implies Φ_1 = Φ_2 and proves that, when Φ exists, it is unique. Since M and F are local, Φ is local, and it is readily verified that its radius cannot be greater than the radius r of F.
(ii) The look-up table of the 3-input mapping 30 is

111  110  101  100  011  010  001  000
 0    0    0    1    1    1    1    0

The following table, which lists all the preimages by M_30 of each 3-block, shows that each 3-block has an equal number of preimages and proves that M_30 is surjective.

  111     110     101     100     011     010     001     000
 00100   00101   01010   01101   00010   10101   00001   00000
 01001   00110   01011   01110   00011   10110   11010   11101
 10010   00111   01100   01111   10100   10111   11011   11110
 10011   01000   10001   10000   11001   11000   11100   11111
(iii) To determine all radius 1 rules F that have a transform Φ under the 3-input surjective mapping 30, we have to systematically check when Relation (R) is satisfied for a pair of radius 1 rules (F, Φ). We find that only seven rules F have a transform Φ. The results are summarized in the table below.47

F   0   30   170   204   225   240   255
Φ   0   30   170   204   120   240     0

Solution 6.3 As illustrated in Figure 6.1, the limit set of the cellular automaton rule 184 consists of alternating 0s and 1s if the density of 1s (active sites) is exactly equal to 1/2. If the density is not equal to 1/2, as shown in Figure 6.2, the limit set consists of finite sequences of alternating 0s and 1s separated by finite sequences of either 0s, if the density of active sites is less than 1/2, or 1s, if the density of active sites is greater than 1/2. Hence, after a number of time steps of the order of magnitude of half the lattice size, any initially disordered configuration becomes ordered as much as possible; that is, according to whether the density of active sites is less than or greater than 1/2, the 1s or the 0s are isolated.

47 More results on transformations of cellular automaton rules can be found in Boccara [51].
Fig. 6.28. Successive applications of rules 184 and 232 to determine whether the density of an initial configuration is either less than or greater than 1/2. Empty (resp. active) sites are light (resp. dark) grey. The density is equal to 0.48 in the two left figures and to 0.52 in the two right figures. The lattice size is 50 and the number of iterations is in all cases equal to 30. Time is oriented downwards.
If such a configuration evolves according to the 3-input majority rule, that is, the rule f such that
\[
f(x_1, x_2, x_3) = \begin{cases} 0, & \text{if } x_1 + x_2 + x_3 = 0 \text{ or } 1, \\ 1, & \text{if } x_1 + x_2 + x_3 = 2 \text{ or } 3, \end{cases}
\]
then, after a number of time steps of the order of magnitude of half the lattice size, the configuration consists either of 0s only, if the density of active sites is less than 1/2, or of 1s only, if the density of active sites is greater than 1/2. Radius 1 rule 232 is precisely the majority rule above. Figure 6.28 illustrates these considerations. If ρ = 1/2, we obtain a string of alternating 0s and 1s.
The number of necessary time steps is equal to the number of time steps necessary to completely eliminate one type of defect. Since the two types of defects, for both rules, travel in opposite directions with the same speed v = 1 (see Figure 6.28), we find that for each rule the system has to evolve for a number of time steps of the order of magnitude of half the string length. The density classification problem can also be solved for any rational value of ρ [85, 86].

Solution 6.4 The 3-input traffic rule f reads
\[
\begin{aligned}
f\big((\bullet,\bullet),(1,1),(a,b)\big) &= (a,b), & f\big((\bullet,0),(1,0),(0,\bullet)\big) &= (0,0),\\
f\big((\bullet,0),(1,0),(1,\bullet)\big) &= (0,1), & f\big((\bullet,1),(1,0),(0,\bullet)\big) &= (0,1),\\
f\big((\bullet,1),(1,0),(1,\bullet)\big) &= (1,1), & f\big((0,\bullet),(0,1),(\bullet,0)\big) &= (0,0),\\
f\big((0,\bullet),(0,1),(\bullet,1)\big) &= (1,0), & f\big((1,\bullet),(0,1),(\bullet,0)\big) &= (1,0),\\
f\big((1,\bullet),(0,1),(\bullet,1)\big) &= (1,1), & f\big((a,b),(0,0),(\bullet,\bullet)\big) &= (a,b),
\end{aligned}
\]
where • stands for either 0 or 1, and a and b are equal to either 0 or 1. Note that the two lanes play a symmetric role. Figure 6.29 represents the flow diagram of the two-lane one-way car traffic flow with vmax = 1. The average car flow has been determined numerically for the following car densities: 0.1, 0.2, 0.3, 0.4, 0.405, 0.406, 0.407, 0.408, 0.41, 0.43, 0.45, 0.47, 0.49, 0.51, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95.
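The rule is easy to simulate directly. The sketch below is a minimal Python implementation of the two subrules (synchronous update, periodic boundary conditions); the lattice length, density, and number of iterations are illustrative and smaller than those used for Figure 6.29.

```python
import random

L, density, steps = 200, 0.45, 500
# lanes[l][i] = 1 if site i of lane l is occupied
lanes = [[1 if random.random() < density else 0 for _ in range(L)] for _ in range(2)]

def update(lanes):
    """One synchronous step of the two-lane rule; returns (new lanes, forward moves)."""
    new = [[0] * L for _ in range(2)]
    moves = 0
    for l in range(2):
        for i in range(L):
            if not lanes[l][i]:
                continue
            ahead, other = (i + 1) % L, 1 - l
            if lanes[l][ahead] == 0:                 # subrule 1: move forward
                new[l][ahead] = 1
                moves += 1
            elif lanes[other][i] == 0 and lanes[other][(i - 1) % L] == 0:
                new[other][i] = 1                     # subrule 2: shift to the other lane
            else:
                new[l][i] = 1                         # blocked: stay
    return new, moves

flow = 0.0
for t in range(steps):
    lanes, moves = update(lanes)
    if t >= steps // 2:                               # measure after transients
        flow += moves / (2 * L)
print("density:", sum(map(sum, lanes)) / (2 * L), "average flow:", flow / (steps - steps // 2))
```

One can check that the update is conflict-free: the condition of subrule 2 (target site and the site behind it both empty) rules out two cars aiming at the same site in the same step, so the number of cars is conserved.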
Fig. 6.29. Fundamental diagram of a two-lane one-way road with v_max = 1. Lattice size: 1000.
All simulations were done on a 1000-site lattice. Starting from a random initial road configuration with an exact car density value, the car flow is measured after 2000 iterations. Each point of the fundamental diagram is an average over 10 different simulations. The critical car density ρ_c is approximately equal to 0.406.
Remark 19. Since the speed limit is equal to 1, as for traffic rule 184, one could expect that the critical density should be ρ_c = 1/2. This would be the case if, for ρ = 1/2, cars were equally distributed between the two lanes. This is rarely the case, and, due to the existence of defects in the steady state, the average velocity for ρ = 1/2 is less than 1. This feature is illustrated in Figure 6.30.
Solution 6.5 We shall prove that an n-input one-dimensional cellular automaton vectorial rule f is number-conserving if, and only if, for all (x_1, x_2, \ldots, x_n) ∈ Q^n, it satisfies
Fig. 6.30. Car traffic on a two-lane one-way road for a car density exactly equal to 0.5 on a 60-site lattice. The top figure represents the initial configuration and the bottom one the configuration after 120 iterations.
\[
\sum_{j=1}^{\ell} f_j(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{\ell} x_{1,j}
+ \sum_{j=1}^{\ell} \sum_{k=1}^{n-1} \Big( f_j(\underbrace{0, 0, \ldots, 0}_{k}, x_2, x_3, \ldots, x_{n-k+1}) - f_j(\underbrace{0, 0, \ldots, 0}_{k}, x_1, x_2, \ldots, x_{n-k}) \Big). \tag{6.48}
\]
To prove this result, we proceed exactly as for the proof of Theorem 13. First, if we write that a cyclic configuration of length L ≥ n, which consists of 0s only, is number-conserving, we verify that f(0, 0, \ldots, 0) = \mathbf{0}, where \mathbf{0} = (\underbrace{0, 0, \ldots, 0}_{\ell}).
Then, if we consider a cyclic configuration of length L ≥ 2n − 1 which is the concatenation of a sequence (x1 , x2 , . . . , xn ) and a sequence of L − n 0s, and express that the n-input rule f is number-conserving, we obtain
\[
\begin{aligned}
&\sum_{j=1}^{\ell} f_j(0, 0, \ldots, 0, x_1) + \sum_{j=1}^{\ell} f_j(0, 0, \ldots, 0, x_1, x_2) + \cdots + \sum_{j=1}^{\ell} f_j(x_1, x_2, \ldots, x_n) \\
&\quad + \sum_{j=1}^{\ell} f_j(x_2, x_3, \ldots, x_n, 0) + \cdots + \sum_{j=1}^{\ell} f_j(x_n, 0, \ldots, 0)
= \sum_{j=1}^{\ell} (x_{1,j} + x_{2,j} + \cdots + x_{n,j}),
\end{aligned}
\]
where, for j = 1, 2, \ldots, \ell, all the terms of the form f_j(0, 0, \ldots, 0) that are equal to zero have not been written. Replacing x_{1,j} by 0 for all j ∈ {1, 2, \ldots, \ell} in the relation above and subtracting the relation so obtained shows that Condition (6.48) is necessary. Condition (6.48) is obviously sufficient since, when summed over a cyclic configuration, all the left-hand-side terms except the first cancel. One easily verifies that the two-lane traffic rule of the preceding exercise satisfies Condition (6.48).

Solution 6.6 Figure 6.31 shows, for the timorous and amiable multiple random walkers models, plots of the reduced Hamming distance between the initial configuration and the configuration obtained after t time steps for two different densities of random walkers.
Fig. 6.31. Reduced Hamming distance between a random initial configuration and the configuration obtained after t time steps for t varying from 1 to 30. Each point represents an average reduced Hamming distance over 40 simulations. Lattice size is 200 × 200. The left (resp. right) figure corresponds to timorous (resp. amiable) random walkers.
Increasing the density ρ of random walkers decreases the reduced Hamming distance δ_H. For timorous random walkers, starting from a random initial configuration, δ_H jumps to 0.729 for ρ = 0.4 and to 0.615 for ρ = 0.6. When the number of time steps increases, as expected, δ_H approaches 1. After 30 time steps, δ_H is equal to 0.989 for ρ = 0.4 and to 0.987 for ρ = 0.6. For amiable random walkers, the mixing process is, for a small number of iterations, slightly more efficient: δ_H jumps to 0.861 or 0.797 for ρ equal to 0.4 or 0.6. After 30 time steps, δ_H is equal to 0.989 and 0.991. Both models seem to have the same asymptotic behavior. As for the sequential rule, the asymptotic reduced Hamming distances of these synchronous rules probably do not depend upon ρ.
Solution 6.7 To generate an Eden cluster, we have to manipulate two lists of sites: a list of the t sites that, at time t − 1, form the Eden cluster and the list of perimeter sites (i.e., the list of sites that are first neighbors of sites belonging to the Eden cluster). Then at time t a randomly selected site from the second list is added to the first. Implementing this algorithm, we obtain clusters as illustrated in Figure 6.32.
Solution 6.8 Let us first label the sites of the Bethe lattice as shown in Figure 6.33. For z = 3, the lattice consists of three binary trees whose root is site 0. Note that, in the case of an infinite lattice, our labelling is purely conventional: all sites play exactly the same role and any site could have been chosen as root 0.
(i) If p is the probability for a site to be occupied, let Q(p) be the probability that a path starting from an occupied site is not connected to infinity along a given branch. The probability that site 0 is not connected to infinity along a specific branch, say branch (0, 1), is then
\[
Q(p) = 1 - p + pQ^2(p)
\]
since either site 1 is empty (probability 1 − p) or occupied (probability p) and not connected to infinity along either branch (1, 11) or branch (1, 12) (probability Q^2(p)). For an arbitrary value of z > 2, the equation above is to be replaced by
\[
Q(p) = 1 - p + pQ^{z-1}(p). \tag{6.49}
\]
For all values of the positive integer z, Q(p) = 1 is always a solution to Equation (6.49). This solution corresponds to P_\infty^{site}(p) = 0 (i.e., to p < p_c^{site}). When p is increasing, it
Fig. 6.32. An example of an Eden cluster obtained on a square lattice after 4000 iterations of the growth rule.
Fig. 6.33. Labelled Bethe lattice sites for z = 3.
reaches its critical value p_c^{site}; Q(p_c^{site}) = 1 becomes a double root.48 Thus, p_c^{site} satisfies the equation
\[
\frac{\partial}{\partial Q}\big(1 - p + pQ^{z-1}\big)\Big|_{Q=1} = p(z - 1) = 1;
\]
that is,
\[
p_c^{site} = \frac{1}{z-1}.
\]
48 As a function of p, the map Q ↦ 1 − p + pQ^{z−1} exhibits a transcritical bifurcation corresponding to the second-order phase transition characterized by the incipient infinite cluster.
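As a quick numerical check (not part of the original solution), the fixed-point iteration below solves Q = 1 − p + pQ^{z−1} for its smallest nonnegative root and evaluates p(1 − Q^z), the percolation probability derived below as Equation (6.50); for z = 5 and p = 0.6 it reproduces the value Q ≈ 0.4184 quoted in Figure 6.34, and the percolation probability vanishes for p below 1/(z − 1).

```python
def Q_root(p, z, iterations=10000):
    """Smallest nonnegative root of Q = 1 - p + p*Q**(z-1), by fixed-point iteration."""
    q = 0.0
    for _ in range(iterations):
        q = 1.0 - p + p * q ** (z - 1)
    return q

def P_inf_site(p, z):
    """Site percolation probability p*(1 - Q**z), derived below as Equation (6.50)."""
    return p * (1.0 - Q_root(p, z) ** z)

print(Q_root(0.6, 5))            # ~0.418385, as in Figure 6.34
for p in (0.2, 0.25, 0.3, 0.5):  # for z = 5 the threshold is 1/(z-1) = 0.25
    print(p, P_inf_site(p, 5))
```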
Fig. 6.34. Graphical solution of the equation Q = 1 − p + pQ^{z−1}. For z = 5 and p = 0.6, we find Q = 0.418385...
As shown in Figure 6.34, for p > p_c^{site}, Equation (6.49) has two real roots, Q(p) = 1 and 0 < Q(p) < 1. For z = 3, this second root is
\[
Q(p) = \frac{1-p}{p}.
\]
The percolation probability P_\infty^{site}(p) being the probability for an occupied site to be connected to infinity, we have
\[
P_\infty^{site}(p) = p\big(1 - Q^z(p)\big) \tag{6.50}
\]
since the site has to be occupied (probability p) and connected to infinity through at least one of the z outgoing edges (probability 1 − Q^z(p)). For z = 3, we have
\[
P_\infty^{site}(p) = p\left(1 - \left(\frac{1-p}{p}\right)^{3}\right),
\]
and in the vicinity of p_c^{site} = 1/2, we find
\[
P_\infty^{site}(p) = 6\big(p - \tfrac{1}{2}\big) + O\big((p - \tfrac{1}{2})^2\big),
\]
which shows that β = 1. In order to verify that β = 1 for all values of z, we first find the expansion in powers of p − 1/(z − 1) of the solution Q(p) < 1 to Equation (6.49). Let
\[
Q(p) = 1 + a_1\left(p - \frac{1}{z-1}\right) + a_2\left(p - \frac{1}{z-1}\right)^2 + O\left(\left(p - \frac{1}{z-1}\right)^3\right).
\]
Then, substituting this expression in (6.49) yields
\[
a_1 = \frac{2(z-1)}{2-z}.
\]
Replacing
\[
Q(p) = 1 + \frac{2(z-1)}{2-z}\left(p - \frac{1}{z-1}\right) + O\left(\left(p - \frac{1}{z-1}\right)^2\right)
\]
in the expression of P_\infty^{site}(p) given by Equation (6.50), we obtain
\[
P_\infty^{site}(p) = \frac{2z}{z-2}\left(p - \frac{1}{z-1}\right) + O\left(\left(p - \frac{1}{z-1}\right)^2\right),
\]
which proves that β = 1 for all values of z.
(ii) In the case of the bond percolation model, p is the probability for a bond to be open. Let Q(p) denote the probability that an outgoing path from a given site along a specific branch is closed. For z = 3, the probability that site 0 is not connected to infinity along a specific branch, say branch (0, 1), is then
\[
Q(p) = \big(1 - p + pQ(p)\big)^2
\]
since the two branches (1, 11) and (1, 12) are either closed (probability 1 − p) or open but not connected to infinity (probability pQ(p)). For an arbitrary value of z > 2, the equation above is to be replaced by
\[
Q(p) = \big(1 - p + pQ(p)\big)^{z-1}. \tag{6.51}
\]
For all values of the positive integer z, Q(p) = 1 is always a solution to Equation (6.51). This solution corresponds to P_\infty^{bond}(p) = 0 (i.e., to p < p_c^{bond}). When p is increasing, it reaches its critical value p_c^{bond}; Q(p_c^{bond}) = 1 becomes a double root. Thus, p_c^{bond} satisfies the equation
\[
\frac{\partial}{\partial Q}\big(1 - p + pQ\big)^{z-1}\Big|_{Q=1} = p(z - 1) = 1,
\]
and we find, as for the site percolation model,
\[
p_c^{bond} = \frac{1}{z-1}.
\]
For p > p_c^{bond}, Equation (6.51) has two real roots, Q(p) = 1 and 0 < Q(p) < 1. For z = 3, this second root is
\[
Q(p) = \left(\frac{1-p}{p}\right)^2.
\]
The percolation probability P_\infty^{bond}(p) being the probability for a site, say site 0, to be connected to infinity, we have
\[
P_\infty^{bond}(p) = 1 - Q^z(p) \tag{6.52}
\]
since site 0 has to be connected to infinity through at least one of its z outgoing edges. For z = 3, we have
\[
P_\infty^{bond}(p) = 1 - \left(\frac{1-p}{p}\right)^{6},
\]
and in the vicinity of p_c^{bond} = 1/2, we find
\[
P_\infty^{bond}(p) = 24\big(p - \tfrac{1}{2}\big) + O\big((p - \tfrac{1}{2})^2\big),
\]
which shows that β = 1. To prove that β = 1 for all values of z, we proceed as for the site model by first finding the expansion in powers of p − 1/(z − 1) of the solution Q(p) < 1 to Equation (6.51). Let
\[
Q(p) = 1 + a_1\left(p - \frac{1}{z-1}\right) + a_2\left(p - \frac{1}{z-1}\right)^2 + O\left(\left(p - \frac{1}{z-1}\right)^3\right).
\]
Then, substituting this expression in (6.51) yields
\[
a_1 = \frac{2(z-1)^2}{2-z}.
\]
Replacing
\[
Q(p) = 1 + \frac{2(z-1)^2}{2-z}\left(p - \frac{1}{z-1}\right) + O\left(\left(p - \frac{1}{z-1}\right)^2\right)
\]
in the expression of P_\infty^{bond}(p) given by Equation (6.52), we obtain
\[
P_\infty^{bond}(p) = \frac{2z(z-1)^2}{z-2}\left(p - \frac{1}{z-1}\right) + O\left(\left(p - \frac{1}{z-1}\right)^2\right),
\]
T + wS + w2 T + w3 S + · · · =
Playing against each other, a suspicious tit-for-tat and a tit-for-two-tats strategist, after the first encounter, lock into a pattern of mutual cooperation, and their respective total payoffs are
272
6 Cellular Automata
Fig. 6.35. Example of a cluster of invaders obtained on a square lattice after 2000 iterations of the invasion rule. wR , 1−w wR S + wR + w2 R + w3 R + · · · = S + . 1−w
T + wR + w2 R + w3 R + · · · = T +
Hence, in the presence of mutant suspicious tit-for-tat strategists, tit-for-two-tats strategists do better than tit-for-tat strategists showing that the tit-for-tat strategy is not evolutionarily stable. Remark 20. While this example may cast doubts on the general result of Axelrod and Hamilton [15] as stressed by May [232] it may turn out that tit for tat is more robust than what Boyd and Lorberbaum [70] think since it depends on these strategies being rigidly adhered to. In the real world, cooperation is not always rewarded and defection not always punished. If, for example, a tit-for-tat strategist punishes defection with a probability slightly less than 1, tit-for-tat and suspicious tit-for-tat strategists would eventually settle on mutual cooperation. Solution 6.11 If the set of states is {0, 1}, where state 0 corresponds to say opinion A and 1 to opinion B, the evolution rule is totalistic (i.e., it depends only upon the sum of the states of the nearest neighbors) and can be written t < 3, 0, if 0 ≤ Si,j t+1 t si,j = f (Si,j ) = t ≤ 5, 1, if 3 ≤ Si,j where t Si,j = sti,j−1 + sti−1,j + sti,j + sti+1,j + sti,j+1 .
Figure 6.36 shows the initial random configuration followed by three other configurations exhibiting cluster formation after 50, 100, and 200 iterations. Note that the patterns after 100 and 200 iterations are identical, showing that clusters do not evolve after a rather small number of time steps. While many individuals change their behavior
Solutions
273
during the evolution, the number of individuals having a given behavior does not change so much. In our simulation, initially the individuals were divided into two groups of 1214 (light grey) and 1286 (dark grey) individuals. In the steady state these groups consisted of 1145 (light grey) and 1355 (dark grey) individuals, respectively.
Fig. 6.36. Formation of clusters of individuals having identical behavior. Initially the individuals have either behavior A or B with equal probability. Lattice size: 50 × 50. From left to right and top to bottom, the figures represent configurations after 0, 50, 100, and 200 iterations.
7 Networks
7.1 The small-world phenomenon In the late 1980s, Stanley Milgram [243] performed a simple experiment to show that, despite the very large number of people living in the United States and the relatively small number of a person’s acquaintances, two persons chosen at random are very closely connected to one another. Here is how he says his study was carried out: The general idea was to obtain a sample of men and women from all walks of life. Each of these persons would be given the name and address of the same target person, a person chosen at random, who lives somewhere in the United States. Each of the participants would be asked to move a message toward the target person, using only a chain of friends and acquaintances. Each person would be asked to transmit the message to the friend or acquaintance who he thought would be most likely to know the target person. Messages could move only to persons who knew each other on a first-name basis. In a first study, the letters, given to people living in Wichita, Kansas, were to reach the wife of a divinity student living in Cambridge, Massachusetts, and in a second study, the letters, given to people living in Omaha, Nebraska, had to reach a stockbroker working in Boston and living in Sharon, Massachusetts. Out of 160 chains that started in Nebraska, 44 were completed. The number of intermediaries needed to reach the target person in the Nebraska study is distributed as shown below: (2,2) (3,4) (4,9) (5,8) (6,11) (7,5) (8,1) (9,2) (10,2) . The first number of the ordered pairs is the chain length (number of intermediaries) and the second one the number of such completed chains. The median of this distribution is equal to 5 and its average is 5.43. These results are certainly not very accurate, the statistics being too poor to make
276
7 Networks
serious claims. However, the fact that two randomly chosen persons are connected by only a short chain of acquaintances, referred to as the small-world phenomenon, has been verified for many different social networks. One striking example is the collaboration graph of movie actors. Consider the graph whose vertices are the actors listed in the Internet Movie Database1 in which two actors are connected by an edge if they acted in a film together. In April 1997,2 this graph had 225,226 vertices, with an average number of edges per vertex equal to 61. It is found that the average chain length (in the sense of Milgram) or degree of separation between two actors selected at random from the giant connected component of this graph, which includes about 90% of the actors listed in the database, is equal to 3.65.3 Networks, like Milgram’s network of personal acquaintances, are everywhere, and recently a substantial number of papers dealing with their basic properties have been published. Examples include neural networks, food webs, metabolic networks, power grids, collaboration networks, distribution networks, highway systems, airline routes, the Internet, and the World Wide Web.
7.2 Graphs From a mathematical point of view, a network is a graph G that is an ordered pair of disjoint sets (V, E), where V is a nonempty set of elements called vertices, nodes, or points, and a subset E of ordered pairs of distinct elements of V , called directed edges, arcs, or links. An edge may join a vertex to itself. If we need to specify that V (resp. E) is the set of vertices (resp. edges) of a specific graph G, we use the notation V (G) (resp. E(G)).4 Edges can also be undirected. A network of personal acquaintances is an undirected graph, while the routes of an airline company are a directed graph. A graph H is a subgraph of G if V (H) ⊂ V (G) and E(H) ⊂ E(G). In a directed graph (also called a digraph), if (x, y) is an arc joining vertex x to vertex y, vertex y is said to be adjacent to x or a neighbor of x. The set of all vertices adjacent to a vertex x is the neighborhood N (x) of x. In an 1 2 3
4
Web site: http://us.imdb.com. These data are taken from Watts and Strogatz [342]. On the Web site: http://www.cs.virginia.edu/oracle/ one can select two actors and find their degree of separation. In the world of mathematics, there exists a famous collaboration graph centered on the Hungarian mathematician Paul Erd¨ os, who traveled constantly, collaborated with hundreds of mathematicians, and wrote more than 1400 papers. Mathematicians who coauthored a paper with Erd¨ os have an Erd¨ os number equal to 1, mathematicians who wrote a paper with a colleague who wrote a paper with Erd¨ os have an Erd¨ os number equal to 2, and so forth. It appears that among mathematicians who coauthored papers, less than 2% have an Erd¨ os number larger than 8 (http://www.oakland.edu/ grossman/erdoshp.html). On graph theory, see Bollob´ as [65].
Fig. 7.1. Examples of directed and undirected graphs with six vertices and nine edges.
undirected graph, the edge joining x and y is denoted {x, y}, and, in this case, x and y are said to be adjacent. Figure 7.1 shows both types of graphs.
The number of vertices |V(G)| of a graph G is the order of G, and the number of edges |E(G)| of G is the size of G. The notation G(N, M) represents an arbitrary graph of order N and size M. A graph of order N is empty if it has no edges (i.e., if its size M = 0) and complete if every two vertices are adjacent (i.e., if its size M = \binom{N}{2}).
Graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2) are isomorphic if there exists a bijection ϕ : V_1 → V_2 such that {x, y} ∈ E_1 if, and only if, {ϕ(x), ϕ(y)} ∈ E_2. In other words, G_1 and G_2 are isomorphic if the bijection ϕ between their vertex sets V_1(G_1) and V_2(G_2) preserves adjacency. Isomorphic graphs have, of course, the same order and same size.
The degree d(x) of vertex x is the number |N(x)| of vertices adjacent to x. In the case of a directed graph, we have to distinguish the in-degree d_in(x) and the out-degree d_out(x) of a vertex x, which represent, respectively, the number of incoming and outgoing edges. A graph is said to be regular if all vertices have the same degree. A vertex of degree zero is said to be isolated.
An alternating sequence (x_0, e_1, x_1, e_2, \ldots, e_\ell, x_\ell) of vertices x_i (i = 0, 1, 2, \ldots, \ell) and edges e_j = (x_{j-1}, x_j) (j = 1, 2, \ldots, \ell) is a path or a chain of length ℓ joining vertex x_0 to vertex x_\ell. Without ambiguity, a path can also be represented by a sequence of vertices (x_0, x_1, \ldots, x_\ell). This notation assumes that, for all j = 1, 2, \ldots, \ell, all edges e_j = (x_{j-1}, x_j) exist. The chemical distance d(x, y), or simply distance, between vertices x and y is the smallest length of a path joining x to y.
If a path (x_0, x_1, \ldots, x_\ell) is such that ℓ ≥ 3, x_0 = x_\ell, and all vertices x_j (j = 1, 2, \ldots, \ell) are distinct from each other, the path is said to be a cycle. A graph is connected if, for every pair of distinct vertices {x, y}, there is a path joining them. A connected graph of order greater than 1 does not contain isolated vertices. If a graph G is not connected, then it is the union of connected subgraphs. These connected subgraphs are the components of G. A connected graph without any cycle is called a tree.
Graphs considered here contain neither multiple edges (i.e., more than one edge joining a given pair of vertices) nor loops (i.e., edges joining a vertex to itself). A graph G of order N can be represented by its N × N adjacency matrix A(G), whose elements a_{ij} are given by
\[
a_{ij} = a_{ji} = \begin{cases} 1, & \text{if there exists an edge between vertices } i \text{ and } j, \\ 0, & \text{otherwise.} \end{cases}
\]
\[
A(G(10, 20)) = \begin{bmatrix}
0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0
\end{bmatrix}. \tag{7.1}
\]
Figure 7.2 shows the graph of order 10 and size 20 represented by the adjacency matrix (7.1). The ij-element of the square of the adjacency matrix is given by
\[
\sum_{k=1}^{N} a_{ik} a_{kj};
\]
it is nonzero if, for some k = 1, 2, \ldots, N, both a_{ik} and a_{kj} are equal to 1. If, for a specific value of k, a_{ik} a_{kj} = 1, this implies that there exists a path of length 2 joining vertex i to vertex j, and the value of \sum_{k=1}^{N} a_{ik} a_{kj} represents, therefore, the number of different paths of length 2 joining vertex i to vertex j. More generally, it is straightforward to verify that the value of the element ij of the nth power of the adjacency matrix is the number of different paths of length n joining vertex i to vertex j. Note that a specific edge may belong to a path more than once. In particular, in the case of an undirected graph, the value of the diagonal element ii of the square of the adjacency matrix is equal to the
Fig. 7.2. Undirected graph of order 10 and size 20 represented by the adjacency matrix (7.1).
number of paths of length 2 joining vertex i to itself. It represents, therefore, the degree of vertex i. These considerations are illustrated by matrix (7.2), which represents the square of the adjacency matrix (7.1).
\[
A^2(G(10, 20)) = \begin{bmatrix}
5 & 3 & 1 & 2 & 2 & 3 & 2 & 2 & 1 & 2 \\
3 & 5 & 2 & 1 & 1 & 2 & 1 & 3 & 1 & 2 \\
1 & 2 & 5 & 0 & 2 & 1 & 0 & 3 & 2 & 3 \\
2 & 1 & 0 & 3 & 1 & 1 & 3 & 0 & 0 & 2 \\
2 & 1 & 2 & 1 & 3 & 1 & 1 & 1 & 0 & 3 \\
3 & 2 & 1 & 1 & 1 & 3 & 1 & 1 & 1 & 1 \\
2 & 1 & 0 & 3 & 1 & 1 & 3 & 0 & 0 & 2 \\
2 & 3 & 3 & 0 & 1 & 1 & 0 & 5 & 3 & 1 \\
1 & 1 & 2 & 0 & 0 & 1 & 0 & 3 & 3 & 0 \\
2 & 2 & 3 & 2 & 3 & 1 & 2 & 1 & 0 & 5
\end{bmatrix}. \tag{7.2}
\]
Sometimes it might be useful to consider that the vertices of a graph are not of the same type. The collaboration graph of movie actors can be viewed as a set of actors (vertices of type 1) and a set of films (vertices of type 2), with links between a film and an actor when the actor appeared in the film. Such graphs are said to be bipartite. In a bipartite graph, there exist no edges connecting vertices of the same type.
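Returning to the adjacency matrix (7.1), the path-counting interpretation is easy to check numerically. The following sketch (not part of the original text) builds the matrix with NumPy and verifies that the diagonal of its square gives the vertex degrees and that its entries count paths of length 2.

```python
import numpy as np

# Adjacency matrix (7.1) of the graph G(10, 20) of Figure 7.2
A = np.array([
    [0,1,1,0,1,0,0,1,0,1],
    [1,0,1,0,1,1,0,0,0,1],
    [1,1,0,1,0,1,1,0,0,0],
    [0,0,1,0,0,0,0,1,1,0],
    [1,1,0,0,0,0,0,1,0,0],
    [0,1,1,0,0,0,0,0,0,1],
    [0,0,1,0,0,0,0,1,1,0],
    [1,0,0,1,1,0,1,0,0,1],
    [0,0,0,1,0,0,1,0,0,1],
    [1,1,0,0,0,1,0,1,1,0]])

A2 = A @ A
print(np.array_equal(np.diag(A2), A.sum(axis=1)))   # True: diagonal of A^2 = vertex degrees
print(A2[0, 1])            # 3: number of distinct paths of length 2 between vertices 1 and 2
print((A @ A @ A)[0, 0])   # number of closed paths of length 3 through vertex 1
```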
7.3 Random networks
Except when mentioned otherwise, in what follows we shall deal with undirected graphs. A random graph is a graph in which the edges are randomly
distributed. Networks of intricate structure could be tentatively modeled by random graphs. But, in order to verify if they are acceptable models, we have to study some of their basic properties. To give a precise definition of a random graph, we have to define a space whose elements are graphs and a probability measure on that space.5 For graphs with a given number N of vertices, there exist two closely related models. N ) 2 graphs with exactly M edges. If the 1. For 0 ≤ M ≤ N2 , there are (M N −1 probability of selecting any of them is ( 2 ) , we define the probability M
space G(N, M ) of uniform random graphs. 2. Let 0 ≤ p ≤ 1. If the probability of selecting a graph with exactly m N edges in the set of all graphs of order N is pk (1 − p)( 2 )−m —that is, if an edge between a pair of vertices exists independently of the other edges with a probability p and does not with a probability 1 − p—the resulting probability space is denoted G(N, p), and graphs of this type are called binomial random graphs. In a uniform random graph, the probability that there exists an edge between a given pair of vertices is M/ N2 , while the average number of edges of a binomial random graph is p N2 . If p is the probability that there exists an edge between two specific vertices, the probability that a specific vertex x has a degree d(x) = m is N −1 m P d(x) = m = p (1 − p)N −1−m . m In most applications, p is small and N is large, so, in the limit p → 0, N → ∞, and pN = λ, we find that for both uniform and binomial random graphs, the vertex degree probability distribution is Poisson6 : λm . (∀x) P d(x) = m = e−λ m! For a given number of vertices N , if the number M of edges is small, a random graph tends to have only tree components of orders depending upon the value of M .7 As the number of edges grows, there is an important qualitative change first studied by Erd¨ os and R´enyi and referred to as the phase transition, which is the sudden formation of a giant component (i.e., a component whose order is a fraction of the order N of the graph). This component, formed from smaller ones, appears when M = N/2 + O(N 2/3 ).8 5 6
7 8
On random graphs, see Janson, L uczak, and Ruci´ nski [179]. This classical result, which is very simple to establish using either characteristic functions or generating functions, can be found in almost any textbook on probability theory. See Exercise 7.2. The interested reader will find precise results in Bollob´ as [65], p. 240. For details on this “sudden jump” of the largest component, refer to Janson, L uczak, and Ruci´ nski [179], Chapter 5.
The phase transition in random graphs is clearly similar to the percolation transition described on page 221. In a random graph of order N, each of the \binom{N}{2} possible edges exists with a probability p = 2M/(N(N − 1)). The formation of the giant component may, therefore, be viewed as a percolation phenomenon in N dimensions. Since, in most real networks, N is very large, the percolation threshold is extremely small (see Exercise 6.8). This agrees with the result mentioned in the previous paragraph (i.e., the giant component appears for a number of edges M ∼ N, that is, for a probability p = O(N⁻¹)). When studying the properties of a large network, we shall only consider the giant component.

The structural properties of a network are usually quantified by its characteristic path length, its clustering coefficient, and its vertex degree probability distribution. The characteristic path length L of a network is the average shortest path length between two randomly selected vertices. The clustering coefficient C of a network is the conditional probability that two randomly selected vertices are connected given that they are both connected to a common vertex. That is, if x is a given vertex and d(x) the number of other vertices linked to x (i.e., the degree of x), since these d(x) vertices may be connected together by at most ½ d(x)(d(x) − 1) edges, the clustering coefficient Cx of vertex x is the fraction of this maximum number of edges present in the actual network, and the clustering coefficient C of the network is the average of the clustering coefficients of all vertices. The characteristic path length takes its minimum value L = 1 and the clustering coefficient its maximum value C = 1 in the case of a complete graph.

If N is the order of a random graph, d = λ the average vertex degree, and p ≈ d/N the probability that there exists an edge between two vertices, then, for a random graph, the conditional probability that two randomly selected vertices are connected given that they are both connected to a common vertex coincides with the probability that two randomly selected vertices are connected, which is d/N. Thus

    Crandom ≈ d/N.
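The two quantities just defined are easy to measure on any finite graph. The following Python sketch (ours; the function names and the toy graph are illustrative only) computes the characteristic path length by breadth-first search and the clustering coefficient as the average, over vertices, of the fraction of pairs of neighbors that are themselves linked (vertices with fewer than two neighbors are skipped, which is one possible convention).

    from collections import deque
    from itertools import combinations

    def shortest_path_lengths(adj, source):
        """Breadth-first search: distance from source to every reachable vertex."""
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    def characteristic_path_length(adj):
        """Average shortest path length over all connected ordered pairs of vertices."""
        total, pairs = 0, 0
        for u in adj:
            for v, d in shortest_path_lengths(adj, u).items():
                if v != u:
                    total += d
                    pairs += 1
        return total / pairs

    def clustering_coefficient(adj):
        """Average over vertices of the fraction of neighbor pairs that are linked."""
        coeffs = []
        for u, neighbors in adj.items():
            k = len(neighbors)
            if k < 2:
                continue  # C_x is left undefined for vertices of degree 0 or 1
            links = sum(1 for v, w in combinations(neighbors, 2) if w in adj[v])
            coeffs.append(2 * links / (k * (k - 1)))
        return sum(coeffs) / len(coeffs)

    # Toy usage on a small graph given as adjacency sets.
    adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
    print(characteristic_path_length(adj), clustering_coefficient(adj))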
The maximum, over all pairs of vertices, of the shortest path length between them is called the diameter D of the graph. Since a given vertex has, on average, d first neighbors, d² second neighbors, etc., the diameter Drandom of a random graph is such that d^Drandom ≈ N; hence

    Drandom ≈ log N / log d.
This result shows that for large random networks the characteristic path length Lrandom satisfies
    Lrandom ∼ log N / log d.
Let us compare these results with data obtained from real networks. The collaboration network of movie actors mentioned on page 276 has been studied by different authors [342, 37]. The vertices of this collaboration graph are the actors listed in the Internet Movie Database, and there is an edge linking two actors if they acted in a film together. This network is continuously growing; it had 225,226 vertices in April 1997 and almost half a million in May 2000 [267]. The table below indicates a few structural properties of the collaboration graph of movie actors, such as the order Nactors, the average vertex degree dactors, the characteristic path length Lactors, and the clustering coefficient Cactors. For comparison, the characteristic path length Lrandom and the clustering coefficient Crandom of a random network of the same order and same average vertex degree are also given.9

    Movie actors
    Nactors   dactors   Lactors   Lrandom   Cactors   Crandom
    225,226   61        3.65      2.99      0.79      0.00027

9 The data are taken from Watts and Strogatz [342].
10 Data on many more networks can be found in Albert and Barabási [5].

While the characteristic path lengths Lactors and Lrandom have the same order of magnitude, the clustering coefficients Cactors and Crandom are very different, suggesting that a random graph is not a satisfactory model for the collaboration network of movie actors. This is the case for many other real networks. Here we shall mention only one more example10: scientific collaboration networks. These networks have been extensively studied by Newman [263, 264, 265]. The vertices are authors of scientific papers, and two authors are joined by an edge if they have written an article together. Bibliographic data from the years 1995–1999 were drawn from the following publicly available databases:
1. The Los Alamos e-Print Archive, a database of unrefereed preprints in physics, self-submitted by their authors.
2. MEDLINE, a database of papers on biomedical research published in refereed journals.
3. SPIRES, a database of preprints and published papers in high-energy physics.
4. NCSTRL, a database of preprints in computer science submitted by participating institutions.
Except in theoretical high-energy physics and computer science, where the average number of authors per paper is small (about 2), the size of the giant component is greater than 80%. Since authors may identify themselves in different ways on different papers (e.g., using first initials only, using all initials,
or using full names), the determination of the true number of distinct authors in a database is problematic. In the table below, all initials of each author were used. This rarely confuses two distinct authors for the same person but could misidentify the same person as two different people. The order N, the average vertex degree d, the characteristic path length L, and the clustering coefficient C are given for the various networks. As above, the characteristic path length Lrandom and the clustering coefficient Crandom of a random network of the same order and same average vertex degree are given for comparison.

    Scientific collaboration networks
    Network              N          d     L    Lrandom   C      Crandom
    Los Alamos archive   52,909     9.7   5.9  4.79      0.43   0.00018
    MEDLINE              1,520,251  18.1  4.6  4.91      0.066  0.000011
    SPIRES               56,627     173   4.0  2.12      0.726  0.003
    NCSTRL               11,994     3.59  9.7  7.34      0.496  0.0003
7.4 Small-world networks

7.4.1 Watts-Strogatz model

The collaboration network of movie actors and the various scientific collaboration networks have short characteristic path lengths and high clustering coefficients. In what follows, any large network possessing these two properties will be called a small-world network, where by large we mean that the order N of the corresponding graph is much larger than the average vertex degree d. Random graphs, which have short characteristic path lengths but very small clustering coefficients, are not acceptable models of small-world networks. On the other hand, regular lattices, which obviously have high clustering coefficients but large characteristic path lengths, cannot be used to model small-world networks.11

11 For a regular one-dimensional lattice with periodic boundary conditions in which each site is linked to its r nearest neighbors (i.e., the degree of each vertex is equal to 2r), it can be shown (see Exercise 7.1) that the characteristic path length Llattice and the clustering coefficient Clattice are given exactly by

    Llattice = (1 − (r/(N − 1)) ⌊(N − 1)/(2r)⌋)(⌊(N − 1)/(2r)⌋ + 1),    Clattice = 3(r − 1)/(2(2r − 1)).

A small-world model with a high clustering coefficient, such as a regular lattice, and a short characteristic path length, such as a random network, has been suggested by Watts and Strogatz [342]. Starting from a ring lattice
Fig. 7.3. Constructing the Watts-Strogatz small-world network model. The graph is of order N = 15, the average vertex degree d = 2r = 4, and the probability p = 0.2.
with N vertices and d = 2r edges per vertex, they rewire each edge at random with probability p. More precisely, the rewiring is done as follows. The vertices along the ring are visited one after the other; each edge connecting a vertex to one of its r right neighbors is reconnected to a randomly chosen vertex other than itself with a probability p and left unchanged with a probability 1 − p. Two vertices are not allowed to be connected by more than one edge. This construction is illustrated in Figure 7.3. Note that this rewiring process keeps the average vertex degree unchanged. As p varies from 0 to 1, the perfectly regular network becomes more and more disordered. For p = 1, however, the network is not locally equivalent to a random network, since the degree of any vertex remains larger than or equal to r, as shown in Figure 7.4. Nevertheless, for p = 1, the characteristic path length and the clustering coefficient have the same order of magnitude as those of a random network. What is particularly interesting in this model is that, starting from p = 0, a small increase of p causes a sharp drop in the value of the characteristic path length but almost no change in the value of the clustering coefficient. That is, for small values of p, the Watts-Strogatz model has a clustering coefficient CWS(p) of the same order of magnitude as Clattice, while its characteristic path length LWS(p) is of the order of magnitude of Lrandom. To get an idea of the structural properties of the Watts-Strogatz small-world network model compared to those of regular lattices and random networks, it suffices to consider a rather small network with N = 500 vertices, d = 8, and p = 0.05. We find:
1. For a regular lattice, Llattice = 31.6894, Clattice = 0.642857.
2. For a random network, Lrandom = 3.21479, Crandom = 0.00930834.
Fig. 7.4. For p = 1, the Watts-Strogatz small-world network model is NOT a random graph. N = 15 and d = 2r = 4.
3. For a Watts-Strogatz small-world network, LWS = 5.50771, CWS = 0.555884.

Hence, for a small value of p, the clustering coefficient remains of the same order of magnitude, while the characteristic path length is much shorter. It is possible to modify the rewiring process of the Watts-Strogatz small-world network model in order to obtain a random network for p = 1. Starting from a regular ring lattice, we visit each edge and either replace it, with a probability p, by an edge between two randomly selected vertices or leave it unchanged with a probability 1 − p, subject to the condition that no more than one edge may exist between two vertices. Small-world graphs of this type may have disconnected subgraphs, as shown in Figure 7.5. For the same values of N, d, and p as above, we find, in this case, LmodifiedWS = 5.73836 and CmodifiedWS = 0.556398, which are very close to the values obtained for the original Watts-Strogatz rewiring process.

Call a network in which the characteristic path length grows as the number of vertices a "large-world" network, as opposed to a small-world network, in which the characteristic path length grows as the logarithm of the number of vertices. On the basis of numerical simulations, it has been shown that, for the Watts-Strogatz model, as p increases from zero, the appearance of the small-world behavior is not a phase transition but a crossover phenomenon [36, 262, 8, 37]. As a function of N and p, the characteristic path length satisfies the scaling relation

    LWS(N, p) = N f(N/ξ(p)),

where ξ(p) ∼ p⁻¹ and

    f(x) ∼ constant       if x ≪ 1,
    f(x) ∼ (log x)/x      if x ≫ 1.
Fig. 7.5. For the modified Watts-Strogatz small-world network model, the graph may become disconnected, and some vertices may have a degree equal to 0. N = 15, d = 2r = 4, and p = 1.
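The following Python sketch (ours, not from the text) implements the rewiring procedure described above under the stated assumptions: vertices 0, ..., N−1 sit on a ring, each initially linked to its r nearest neighbors on each side, and every "rightward" edge is rewired with probability p while avoiding self-loops and duplicate edges.

    import random

    def watts_strogatz(N, r, p, seed=None):
        """Return the adjacency sets of a Watts-Strogatz small-world graph."""
        rng = random.Random(seed)
        adj = {i: set() for i in range(N)}
        # Regular ring lattice: each vertex linked to its r nearest neighbors on each side.
        for i in range(N):
            for j in range(1, r + 1):
                adj[i].add((i + j) % N)
                adj[(i + j) % N].add(i)
        # Rewire each rightward edge with probability p, keeping the graph simple.
        for i in range(N):
            for j in range(1, r + 1):
                old = (i + j) % N
                if rng.random() < p and old in adj[i]:
                    candidates = [v for v in range(N) if v != i and v not in adj[i]]
                    if candidates:
                        new = rng.choice(candidates)
                        adj[i].discard(old); adj[old].discard(i)
                        adj[i].add(new); adj[new].add(i)
        return adj

With N = 500, r = 4 (so d = 8) and p = 0.05, together with the routines for L and C sketched earlier, one should recover values of the same order as those quoted above, although the exact numbers depend on the random seed.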
7.4.2 Newman-Watts model

While the rewiring process of the original Watts-Strogatz small-world network model cannot yield a disconnected graph, the modified rewiring process described above may disconnect the graph. Although, for small values of p, there always exists a giant connected subgraph, the existence of small disconnected subgraphs makes the characteristic path length poorly defined, since we may find pairs of vertices with no path joining them, and assuming that, in such a case, the path length is infinite does not solve the problem. What is usually done is to ignore the small detached subgraphs and define the characteristic path length on the giant connected part. While such a procedure may be acceptable for numerical studies, for analytic work, having to deal with poorly defined quantities is not very satisfactory.

To avoid disconnecting the network, Newman and Watts [261] suggested adding new edges between randomly selected pairs of vertices instead of rewiring existing edges. That is, in order to randomize a regular lattice, on average pN new links, referred to as shortcuts, are added between pairs of randomly selected vertices, where, as usual, p is a probability and N the number
Fig. 7.6. Newman-Watts small-world network model with shortcuts. The graph is of order N = 15 and the probability p = 0.2.
of vertices of the lattice (see Figure 7.6). In this model, it is allowed to have more than one edge between a pair of vertices and to have edges connecting a vertex to itself. If we take, as above, N = 500, d = 8, and p = 0.05, for the Newman-Watts small-world model, we find LNW = 7.03808 and CNW = 0.636151.

7.4.3 Highly connected extra vertex model

Instead of creating direct shortcuts between sites as in the Newman-Watts model, we can create shortcuts connecting distant sites through an extra site connected to an average of pN sites, as illustrated in Figure 7.7. In the case of a large network, this extra site would represent the existence of a well-connected individual. More realistically, for a large network, we should consider the existence of a small group of well-connected individuals. In the case of the collaboration network of movie actors, these extra vertices would correspond to the small group of very popular actors, while for the collaboration network of scientists, they would correspond to particularly productive authors. If we take, as above, N = 500, d = 8, and p = 0.05, for the model with only one highly connected extra vertex, we find LHCEV = 6.07453 and CHCEV = 0.633522.
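A minimal sketch (our own, with hypothetical function names) of the Newman-Watts shortcut construction described above: starting from the same ring lattice as before, about pN randomly chosen pairs of vertices are joined by new edges, and nothing is rewired or removed. Multiple edges and self-loops are allowed, as in the text.

    import random

    def newman_watts(N, r, p, seed=None):
        """Ring lattice of degree 2r plus, on average, p*N random shortcuts."""
        rng = random.Random(seed)
        edges = []
        for i in range(N):
            for j in range(1, r + 1):
                edges.append((i, (i + j) % N))   # underlying regular lattice
        for _ in range(N):
            if rng.random() < p:                 # on average p*N shortcuts
                edges.append((rng.randrange(N), rng.randrange(N)))
        return edges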
Fig. 7.7. Extra vertex connected on average to pN sites. The graph is of order N = 15 and the probability p = 0.2.
7.5 Scale-free networks

7.5.1 Empirical results

At the end of Section 7.3, we reported results on the existence of a short characteristic path length and a high clustering coefficient for the collaboration network of movie actors and different scientific collaboration networks, showing that these networks are therefore better modeled by small-world network models than by random graphs. There is, however, a third important structural property of real networks that we have not yet examined, namely the vertex degree probability distribution.

World Wide Web

The World Wide Web, which has become the most useful source of information, is a huge network whose order, larger than one billion, is continuously growing. This growth is totally unregulated: any individual or institution is free to create Web sites with an unlimited number of documents and links. Moreover, existing Web sites keep evolving with the addition and deletion of documents and links. The vertices of this network are HTML documents called Web pages, and the edges are the hyperlinks pointing from one document to another. Since the Web has directed links, as mentioned on page 277, we have to distinguish the in-degree din(x) and the out-degree dout(x) of a vertex x, which represent, respectively, the number of incoming and outgoing edges. In 1999, Albert, Jeong, and Barabási [2] studied a subset of the Web (the nd.edu
domain) containing 325,729 documents and 1,469,680 links.12 They found that both the vertex in-degree and out-degree probability distributions have power-law tails extending over about 4 orders of magnitude; i.e.,

    P(dout = k) ∼ k^(−γout),    P(din = k) ∼ k^(−γin),
with γout = 2.45 and γin = 2.1. As pointed out by the authors, this result shows that, while the owner of a Web page has complete freedom in choosing the number of links and the addresses to which they point, the overall system obeys scaling laws characteristic of self-organized systems. Despite its huge size, Barabási, Albert, and Jeong [34] found that the World Wide Web is a highly connected graph: two randomly chosen documents are, on average, 19 clicks away from each other. More precisely, they showed that the average distance between any two documents is given by 0.35 + 2.06 log N, where N is the total number of documents. This logarithmic dependence shows that an "intelligent" agent should be able to find in a short time the information he is looking for by navigating the Web.

12 In Barabási, Albert, and Jeong [34], the authors also studied the domains whitehouse.gov, yahoo.com, and snu.ac.kr.

Scientific collaborations

Like the World Wide Web, networks of scientific collaboration are evolving networks.13 An extensive investigation of the dynamical properties of these networks has been recently carried out [35]. To reveal the dynamics of these networks, it is important to know the time at which the vertices (authors) and edges (coauthored papers) have been added to the network. The databases analyzed by the authors contain titles and authors of papers in important journals in mathematics and neuroscience published over an 8-year period from 1991 to 1998. In mathematics, the number of different authors is 70,975 and the number of published papers is 70,901. In neuroscience, these numbers are, respectively, 209,293 and 210,750. Both networks are scale-free. The exponents characterizing the power-law behavior of the vertex degree probability distributions are γM = 2.4 for mathematics and γNS = 2.1 for neuroscience. An important part of this study is devoted to the time evolution of various properties of these networks.

13 The collaboration network of movie actors is also an evolving network, but the decision to collaborate is usually the responsibility of the casting director. In contrast, the decision for two scientists to collaborate is made at the author level.

1. While the order of these networks increases with time, the characteristic path length is found to decrease with time. This apparently surprising result could be due to new edges being created between already existing vertices: two authors, who had not yet written a paper together, coauthor a new article. Since the average number of coauthors in neuroscience is
larger than in mathematics, the characteristic path length is smaller in neuroscience than in mathematics.
2. Clustering coefficients are found to decay as a function of time. Convergence to an asymptotic value, if it exists, appears to be very slow.
3. The size of the largest component, greater for neuroscience than for mathematics, grows with time. For neuroscience, this size seems to be very close to an asymptotic value, while this is, apparently, not the case for mathematics.
4. The average vertex degree increases with time, much faster for neuroscience than for mathematics.

These time-dependent properties suggest that one has to be very careful when interpreting certain results. For instance, numerical simulations of a model based on the empirical results show that the vertex degree probability distribution exhibits a crossover between two different power-law regimes characterized by the exponents γsmall = 1.5 for small degrees and γlarge = 3 for large degrees.

First papers by new authors (i.e., authors who did not yet belong to the network) are found to be coauthored preferentially with old authors who already have large numbers of coauthors rather than with less connected ones. This phenomenon is called preferential attachment. More exactly, it is found that the probability that an old author with d links is selected by (or selects) a new author is proportional to d^α, with α ≈ 0.8 for mathematics and α ≈ 0.75 for neuroscience. Moreover, old authors also write papers together. If d1 and d2 are the numbers of coauthors of author 1 and author 2, respectively, it is found that the probability that two old authors write a new paper together is proportional to d1 d2, exhibiting another aspect of preferential attachment.

Citations

Another interesting study of the vertex connectivity within a network has been carried out by Redner [295], who investigated two large data sets:
1. The citation distribution of 783,339 papers published in 1981 and cited between 1981 and June 1997 that have been cataloged by the Institute for Scientific Information, with a total of 6,716,198 citations.
2. The citation distribution, as of June 1997, of the 24,296 papers cited at least once that were published in volumes 11 through 50 of Physical Review D from 1975 to 1994, with a total of 351,872 citations.
In this network, the vertices are the papers and the directed edges are the links from citing to cited papers. The main result of the study is that the citation distribution is, for large citation numbers, reasonably well approximated by a decreasing power law characterized by an exponent γcite ≈ 3, while, for small citation numbers, a stretched exponential (i.e., exp(−(x/x0)^β) with β ≈ 0.4) provides a better
fit. The fact that two different functions are needed to fit the data is, according to Redner, a consequence of different mechanisms generating the citation distributions of rarely cited and frequently cited papers. The former are essentially cited by their authors and close associates and are usually forgotten a short time after their publication, while the latter become known through collective effects, and their impact extends over a much longer time. In contrast with Redner's suggestion, Tsallis and de Albuquerque [333] showed that the whole distribution is fairly well fitted by a single function, namely

    x ↦ N0 (1 + (q − 1)λx)^(q/(1−q)),

with N0 = 2332, λ = 0.13, and q = 1.64 for the Physical Review D papers, and N0 = 46604, λ = 0.11, and q = 1.53 for the Institute for Scientific Information papers. In both cases, the distribution has a decreasing power-law tail with an exponent γcite = q/(q − 1) equal to 2.56 for the first data set and to 2.89 for the second one. The fit by a single function might suggest that the underlying mechanisms responsible for the citation distributions of rarely and frequently cited papers are not necessarily different, but the authors do not put forward any theoretical argument indicating why the parameter q should be approximately equal to 1.5.

It is worth mentioning that, in 1966, Mandelbrot [223] suggested a modification of Zipf's law (see page 314) concerning the asymptotic behavior of the frequency f of occurrence of a word as a function of its rank r, of the form

    f(r) ∼ A/(1 + ar)^γ,
which coincides with the form used by Tsallis and de Albuquerque.

The empirical results above concerning the citation network, in which the vertices are the papers and the directed edges are links from citing to cited papers, are related to the vertex in-degree distribution. Vázquez [336] investigated the out-degree distribution (i.e., the distribution of the number of papers cited). Analyzing the data collected from the Science Citation Index report concerning 12 important journals for the period 1991–1999, he found that, in all cases, the out-degree probability distribution exhibits a maximum for an out-degree value dm. The values of dm and f(dm) depend upon the journal: dm varies from 13 to 39 and f(dm) from 0.024 to 0.079. A plot of the scaled probability distribution f(d)/f(dm) as a function of d/dm reveals that, for large out-degrees, the probability distribution decreases exponentially, with a rate equal to either 0.4 or 1.6 according to whether or not there is a restriction on the number of pages per article.

Human sexual contacts

The network of human sexual contacts has been recently investigated [206]. The authors analyzed data gathered in a 1996 Swedish survey of sexual behavior. They found that the cumulative distribution functions of the number of sexual partners, either during the 12-month period preceding the survey or during the entire life, have power-law behaviors for males and females. For
the 12-month period, the characteristic exponents are equal to 2.54 ± 0.2 for females with more than four partners and to 2.31 ± 0.2 for males with more than five partners. Over the entire life, these exponents are 2.1 ± 0.3 for females and 1.6 ± 0.3 for males when the number of partners is greater than 20. Since the number of different partners cannot be very large over a one-year period, power-law behaviors are observed over only one order of magnitude. Over an entire life, the number of partners of some promiscuous individuals could be quite large, but the survey apparently did not gather enough data on older individuals. The authors did not mention whether the survey they analyzed took into consideration only heterosexual contacts or also homosexual ones. If only heterosexual contacts are considered, the network of human sexual contacts is an example of a bipartite graph (see page 279).

E-mails

E-mail networks, which consist of e-mail addresses connected by e-mails, appear to have a scale-free structure. A study of the log file of the e-mail server at the University of Kiel, recording the source and destination of all e-mails exchanged between students over a 3-month period, has shown that the vertex degree distribution has a power-law behavior characterized by an exponent γe-mail = 1.82 ± 0.10 over about 2 orders of magnitude [113]. Since the study was restricted to one e-mail server, only the accounts at this server are known exactly. When only e-mails exchanged between students having an account at this server are taken into account, the exponent characterizing the power-law behavior of the "internal" vertex degree distribution, equal to 1.47 ± 0.12, is smaller than the exponent for the whole network. The authors also determined the characteristic path length and clustering coefficient of this e-mail network. They found Le-mail = 5.33 ± 0.03 and Ce-mail = 0.113, showing that the e-mail network is a small world.

7.5.2 A few models

A common feature of both the random network model and the Watts-Strogatz small-world network model is that the probability of finding a vertex with a large degree decreases exponentially. In contrast, power-law vertex degree probability distributions imply that highly connected vertices have a large chance of being present.

Example 57. Barabási-Albert models. According to Barabási and Albert [37, 5], there are two generic aspects of real networks that are not incorporated in the two types of models above. First, most real networks continuously expand. The collaboration network of movie actors grows by the addition of new actors, and the World Wide Web grows by the addition of new Web pages. Second, when constructing a random network, the addition, with a probability p, of an edge is done by selecting the end vertices with equal probability
among all pairs of vertices. In growing real networks, a new vertex has a higher probability of being connected to a vertex that already has a large degree. For instance, a new actor is more likely to be cast with a well-known actor than with a less well-established one; a new paper has a higher probability of citing an already much-cited article. In order to incorporate these two features, Barabási and Albert suggested the following model. Start at t = 0 with an empty graph of small order N0. At t = 1, add a new vertex and a number m ≤ N0 of edges linking that vertex to m randomly selected different vertices. Then, at all times t > 1, add a new vertex and m new edges connecting this vertex to m existing vertices. In order to take into account preferential attachment of a new vertex to highly connected vertices, the probability of connecting the new vertex to an existing vertex xi is

    d(xi) / Σ_{j=1}^{N_{t−1}} d(xj),

where d(xi) is the degree of vertex xi and N_{t−1} is the number of already existing vertices.14 At time t, the network is of order Nt = N0 + t and of size Mt = mt. These linear behaviors imply that the average vertex degree is constant in time. Numerical simulations by the authors showed that their model evolves into a stationary scale-free network with a power-law probability distribution for the vertex degree, P(d = k) ∼ k^(−γBA), where γBA = 2.9 ± 0.1 over about 3 orders of magnitude.15 In their simulations, the authors considered different values of m = N0 = 1, 3, 5, and 7 for t = 300,000, and different values of t = 100,000, 150,000, and 200,000 for m = N0 = 5.

To verify that continuous growth and preferential attachment are necessary ingredients to build up a model of a scale-free network, the authors studied two other models. In the first one, they consider a growing network in which each new added vertex has an equal probability of connecting to any of the already existing vertices. In the second model, they consider a network with a fixed number N of vertices. Initially the network consists of N vertices and no edges; then, at each time step, a randomly selected vertex is connected to another vertex with preferential attachment. They found that neither of these two models developed a stationary power-law probability distribution. Because of preferential attachment, a vertex that acquires more connections than another will increase its connectivity at a higher rate (i.e., a sort of "rich get richer" phenomenon). Barabási and Albert showed that, as a function of time t, the degree of a vertex increases as t^(βBA), where βBA = 1/2.

14 The linear dependence upon d(xi) of the probability to connect is an essential assumption. See Remark 22.
15 See Exercise 7.3.
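A minimal Python sketch (ours, with assumed names) of this growth rule: the initial N0 vertices are taken to be isolated, the first added vertex attaches to m of them uniformly at random, and every later vertex attaches to m distinct existing vertices chosen with probability proportional to their current degree.

    import random

    def barabasi_albert(n_steps, m, n0, seed=None):
        """Grow a network by preferential attachment; return the edge list and degrees."""
        rng = random.Random(seed)
        edges = []
        degree = [0] * n0
        # t = 1: the first added vertex connects to m randomly selected initial vertices.
        new = len(degree)
        degree.append(0)
        for target in rng.sample(range(n0), m):
            edges.append((new, target))
            degree[new] += 1
            degree[target] += 1
        # t > 1: each new vertex connects to m distinct existing vertices with
        # probability proportional to their degree (preferential attachment).
        for _ in range(n_steps - 1):
            new = len(degree)
            degree.append(0)
            targets = set()
            while len(targets) < m:
                targets.add(rng.choices(range(new), weights=degree[:new], k=1)[0])
            for target in targets:
                edges.append((new, target))
                degree[new] += 1
                degree[target] += 1
        return edges, degree

On log-log axes, the histogram of the final degrees should display an approximately straight tail with slope close to −3, consistent with the exponent γBA quoted above.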
The Barabási-Albert exponent γBA ≈ 3 is close to the value found by Redner [295] for the network of citations, but other real growing networks, such as the World Wide Web or scientific collaboration networks, have power-law vertex degree probability distributions characterized by different exponent values.

Remark 21. Error and attack tolerance. In a scale-free network, the power-law vertex degree probability distribution implies that the majority of vertices have only a few links. Thus, if a small fraction of randomly selected vertices is removed, the characteristic path length remains essentially the same, since highly connected vertices have a very small probability of being selected. A scale-free network should, therefore, exhibit a high degree of tolerance against errors. On the other hand, the intentional removal of a few highly connected vertices could destroy the small-world properties of the network, making scale-free networks highly vulnerable to deliberate attacks. These properties have been verified for the Barabási-Albert model [4]. In contrast, random networks, in which all vertices have approximately the same number of links, are much less vulnerable to attack, but their characteristic path length, which increases monotonically with the fraction of removed vertices, shows that they are less tolerant against errors.

While the Barabási-Albert model predicts the emergence of power-law scaling, as pointed out by the authors, the agreement between the measured and predicted exponents is less than satisfactory. This led them to propose an extended model [3]. The dynamical growth process of this new model, which incorporates addition of new vertices, addition of new edges, and rewiring of edges, starts like the first model, but, instead of just adding new vertices linked to existing ones with preferential attachment, at each time step one of the following operations is performed.
1. With a probability p, add m new edges. That is, randomly select a vertex as one endpoint of an edge and attach that edge to another existing vertex with preferential attachment. Repeat this m times.
2. With a probability q, rewire m edges. That is, randomly select a vertex xi and an edge {xi, xj}; replace that edge by a new edge {xi, xj'}, the new endpoint xj' being chosen with preferential attachment. Repeat this m times.
3. With a probability 1 − p − q, add a new vertex and m new edges linking the new vertex to m existing vertices with preferential attachment.
In the (p, q)-plane, the system exhibits two regimes:
1. an exponential regime, as for the uniform or binomial random network models and the Watts-Strogatz small-world network model;
2. a scale-free regime, but with an exponent γ characterizing the power-law decay depending continuously on the parameters p and q, showing a lack of universality and accounting for the variations observed for real networks.
Using this model to reproduce the properties of the collaboration network of movie actors, Albert and Barabási found an excellent fit taking p = 0.937, q = 0 (rewiring is absent), and m = 1.
Example 58. Dorogovtsev-Mendes-Samukhin model. Starting from the assumption that preferential attachment is the only known mechanism of self-organization of a growing network into a scale-free structure, Dorogovtsev, Mendes, and Samukhin [104] proposed a model similar to the first Barabási-Albert model but in which a new parameter, representing the initial attractiveness of a site, leads to a more general power-law behavior. In this model, at each time step, a new vertex and m new edges are added to the network. Here, the edges are directed and come out from nonspecified vertices; that is, they may come from the new vertex, from already existing vertices, or even from outside of the network. In the Barabási-Albert model, the new links all come out from the new vertex. The probability that a new edge points to a given vertex xt, added at time t, is proportional to the attractiveness of this site, defined as

    A_{xt} = A + d(xt),

where A ≥ 0 is the initial attractiveness of a site (the same for all sites) and d(xt) denotes, as usual, the degree of vertex xt. In the Barabási-Albert model, all sites have an initial attractiveness A equal to m. The Dorogovtsev-Mendes-Samukhin model is equivalent to a system of an increasing number of particles distributed in a growing number of boxes, such that, at each time step, m new particles (incoming links) have to be distributed between an increasing number (one per time step) of boxes (sites) according to the rule above. Writing down the master equation for the evolution of the distribution of the connectivity of a given site as a function of time, the authors were able to derive the exact expression of the vertex degree probability distribution. In particular, for the Barabási-Albert model, they obtained (see Exercise 7.4)

    P(d = k) = 2m(m + 1) / ((k + m)(k + m + 1)(k + m + 2)),

which, for large k, behaves as k^(−3). Another interesting result is the existence of a simple universal relation between the exponent β characterizing the connectivity growth rate of a vertex and the exponent γ characterizing the power-law behavior of the vertex degree probability distribution. They found that β(γ − 1) = 1, which is, in particular, satisfied by the Barabási-Albert exponents βBA = 1/2 and γBA = 3.
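As a quick numerical illustration (ours, not from the text), one can tabulate this closed-form distribution and compare it with the pure power law k^(−3); for k much larger than m the ratio settles to the constant 2m(m + 1).

    # Exact distribution quoted above, compared with its asymptotic k**-3 behavior.
    def p_exact(k, m):
        return 2 * m * (m + 1) / ((k + m) * (k + m + 1) * (k + m + 2))

    m = 3
    for k in (10, 100, 1000, 10000):
        print(k, p_exact(k, m), p_exact(k, m) / k**-3)  # ratio tends to 2*m*(m+1) = 24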
Example 59. Krapivsky-Rodgers-Redner model. Using a simple growing network model, Krapivsky, Rodgers, and Redner were able to reproduce the observed in-degree and out-degree probability distributions of the World Wide Web as well as find correlations between in-degrees and out-degrees of each vertex [196]. In their model, the network grows according to one of the following two processes:
1. With probability p, a new vertex is added to the network and linked to an existing target vertex, the target vertex being selected with a probability depending only upon its in-degree.
2. With probability 1 − p, a new edge is created between two already existing vertices. The originating vertex is selected with a probability depending upon its out-degree and the target vertex with a probability depending upon its in-degree.

If N is the order of the network, and Din and Dout denote, respectively, the total in-degree and out-degree, then, according to the growth processes, the evolution equations of these quantities are

    N(t + 1) = N(t) + 1 with probability p,    N(t + 1) = N(t) with probability 1 − p,
    Din(t + 1) = Din(t) + 1,    Dout(t + 1) = Dout(t) + 1.

Hence,

    N(t) = pt,    Din(t) = Dout(t) = t.

These results show that the average in-degree and out-degree are both time-independent and equal to 1/p. In order to determine the joint degree distributions, the authors assume that the attachment rate A(din, dout), defined as the probability that a newly introduced vertex connects with an existing vertex of in-degree din and out-degree dout, and the creation rate C((din(x1), dout(x1)), (din(x2), dout(x2))), defined as the probability of adding a new edge pointing from vertex x1, whose in-degree and out-degree are, respectively, din(x1) and dout(x1), to vertex x2, whose in-degree and out-degree are, respectively, din(x2) and dout(x2), are simply given by

    A(din, dout) = din + λ,
    C((din(x1), dout(x1)), (din(x2), dout(x2))) = (din(x2) + λ)(dout(x1) + µ),

where λ > 0 and µ > −1 to ensure that both rates are positive for all permissible values of in-degrees din ≥ 0 and out-degrees dout ≥ 1. Using the assumptions above regarding the attachment and creation rates to write down and solve the rate equation for the evolution of the joint degree distribution N(din, dout, t), defined as the average number of vertices with din incoming and dout outgoing links, Krapivsky, Rodgers, and Redner obtained N(din, dout, t) = n(din, dout) t. The expression of n(din, dout) reveals interesting features. Power-law probability distributions for the in-degrees and out-degrees of vertices are dynamically generated. Choosing the values [5]
    p = 0.1333,    λ = 0.75,    µ = 3.55,

it is possible to match the empirical values regarding the World Wide Web:

    din = dout = 7.5,    γin = 2.1,    γout = 2.7.
Correlations between the in-degree and out-degree of a vertex develop spontaneously. For instance, the model predicts that popular Web sites (i.e., those with a large in-degree) tend to list many hyperlinks (i.e., have a large out-degree), and sites with many links tend to be popular. Finally, the model also predicts power-law behavior when, e.g., the in-degree is fixed and the out-degree varies.

Remark 22. The linear dependence of the attachment rate upon din appears to be a necessary condition for a growing network to display a scale-free behavior [195, 197, 104]. If the attachment rate varies as din^α, then:
1. for 0 < α < 1, the degree distribution exhibits a stretched exponential decay;
2. for α = 1, the degree distribution has a power-law behavior characterized by the exponent γBA = 3;
3. for an attachment rate asymptotically linear in din, the degree distribution has a nonuniversal power-law behavior characterized by an exponent γ > 2;
4. for 1 < α < 2, the number of vertices with a smaller number of links grows slower than linearly in time, while one vertex has the rest of the links;
5. for α > 2, all but a finite number of vertices are linked to one particular vertex.
The statistical analysis of the two databases containing paper titles and authors of important journals in mathematics and neuroscience seems to indicate a nonlinear dependence of the attachment rate (see page 290), in contradiction with the reported power-law behavior of the vertex degree probability distribution.
Example 60. Vázquez model. When a scientist enters a new field, he is usually familiar with only some of the already published papers dealing with this field. Then, little by little, he discovers other papers on the subject by following the references included in the papers he knows. When, finally, the new paper is accepted for publication, it includes, in its list of references, papers discovered in this way; i.e., with the addition of a new vertex to the citation network, new outgoing links, discovered recursively, are simultaneously added to the network. This "walking on the network" mechanism is the basic ingredient of a citation network model proposed by Vázquez [335]. Starting from a single vertex, at each time step the following subrules are applied successively.
1. A new vertex with an outgoing link to a randomly selected existing vertex is added.
2. Then, other outgoing links between the new vertex and each neighbor of the randomly selected existing vertex are added, each with a probability p. If at least one link has been added in this way, the process is repeated by adding outgoing links between the new vertex and the second neighbors of the randomly selected vertex, and so forth. When no links are added, the time step is completed.
This evolution rule does not include preferential attachment as in the Barabási-Albert models but, since vertices with large in-degrees are more frequently reached during the walking process, the basic mechanism of the evolution rule induces an effective preferential attachment. Numerical simulations show that, if p < pc ≈ 0.4, the asymptotic behavior of the vertex in-degree distribution decreases exponentially, while, for p > pc, this distribution decreases as a power law with an exponent γV ≈ 2. A value close to 2 has been found for the distribution of in-degrees in the case of the World Wide Web (see page 289).

The small-world network models described in the preceding section were built up by adding some randomness to an underlying regular lattice. While these models are certainly interesting toy models, real small-world networks, such as acquaintance networks, form dynamically, starting from some random structure. On the other hand, many real networks, and in particular some acquaintance networks, have been shown to possess a scale-free structure. This important feature is successfully explained in terms of network growth and preferential attachment. But models based on these two ingredients have a small clustering coefficient, which arguably disqualifies them as small-world network models.

Example 61. Davidsen-Ebel-Bornholdt model. Davidsen, Ebel, and Bornholdt proposed a small-world network model (i.e., a graph with a short characteristic path length and a high clustering coefficient) evolving towards a scale-free network despite the fact that it keeps a fixed number of vertices [100]. The two essential ingredients of this model are a local connection rule based on transitive linking and a finite vertex life span. In order to model an acquaintance network, the authors start from a random graph of order N in which each vertex represents a person and each undirected edge between two vertices indicates that these two persons know each other. The network evolves as the result of the application at each time step of the following subrules:
1. Assuming that people are usually introduced to each other by a common acquaintance, randomly select one vertex and two of its neighbors and, if these two vertices are not connected, add an edge between them; otherwise do nothing. If the selected vertex has fewer than two neighbors, add an edge between this vertex and another one selected at random.
2. Since people have a finite life span, with a probability p ≪ 1, one randomly selected vertex and all the edges connected to this vertex are removed from the network and replaced by a new vertex and an edge connecting this vertex to a randomly selected existing vertex.
The condition p ≪ 1 implies that the average life span of a person is much larger than the rate at which people make new acquaintances. Due to this finite life span, the average vertex degree does not grow forever but tends to a finite value that depends upon the value of p.
Numerical simulations of a model with N = 7000 vertices show that, for p = 0.0025, the vertex degree probability distribution behaves as a decreasing power law characterized by an exponent γDEB = 1.35, the average vertex degree is d = 149.2, the clustering coefficient is CDEB = 0.63, and the characteristic path length is LDEB = 2.38. The vertex degree distribution exhibits a cutoff at high degree values resulting from the finite vertex life span.
Exercises

Exercise 7.1 Consider a regular finite one-dimensional lattice having N sites and periodic boundary conditions (i.e., a ring), in which each site is linked to its r nearest neighbors.
(i) Find the exact expression of the characteristic path length Llattice.
(ii) Find the exact expression of the clustering coefficient Clattice.
(iii) What is the asymptotic expression of Llattice for large N and fixed r? What is the maximum value of Clattice?

Exercise 7.2 Consider a rather large random graph of a few thousand vertices, and check numerically that the probability distribution of the vertex degrees of a random graph is a Poisson distribution.

Exercise 7.3 In order to study the evolution as a function of time t of the degree d(xi) of a given vertex xi of the Barabási-Albert scale-free network model, Barabási, Albert, and Jeong [33] assume that the degree is a continuous function of time whose growth rate is proportional to the probability d(xi)/(2mt) that the new vertex, added at time t, will connect to vertex xi. Taking into account that, when adding a new vertex to the network, m new edges are added at the same time, the authors show that d(xi) satisfies the equation

    ∂d(xi)/∂t = d(xi)/(2t).

(i) Find the solution to this equation.
(ii) Assuming that, at time t, the random variable ti at which vertex xi has been added to the network is uniformly distributed in the interval [0, N0 + t[, use the result of (i) to determine, in the infinite-time limit, the vertex degree probability distribution.
(iii) In order to verify that continuous growth and preferential attachment are necessary ingredients to build up a model of a scale-free network, Barabási and Albert studied two other models (see page 293). In the first one, they consider a growing network in which each new added vertex has an equal probability of connecting to any of the already existing vertices. In the second model, they consider a network with a fixed number N of vertices. Initially the network consists of N vertices and no edges. Then, at each time step, a randomly selected vertex is connected to another vertex with preferential attachment. Use the method of (i) and (ii) to find the time evolution of the degree of a specific vertex and the vertex degree probability distribution for these two models.
Exercise 7.4 Following a method due to Krapivsky, Redner, and Leyvraz [195, 296], it can be shown that, for the Barabási-Albert network model, the average number N_d(t) of vertices with d outgoing edges at time t is a solution of the differential equation

    dN_d/dt = ((d − 1)N_{d−1} − dN_d)/(2t) + δ_{dm},

where δ_{dm} = 1 if d = m and 0 otherwise.
(i) Justify this equation.
(ii) Show that the vertex degree probability distribution

    P(d) = lim_{t→∞} N_d(t)/t
satisfies a linear recurrence equation, and obtain its expression.

Exercise 7.5 When dealing with discrete random variables, the generating function technique is a very effective problem-solving tool,16 and it can be used successfully when dealing with random graphs [267]. Let X be a discrete random variable defined on the set of nonnegative integers N0 = {0, 1, 2, . . .}. If the probability P(X = k) is equal to pk, the generating function GX of this probability distribution is defined by

    GX(x) = Σ_{k=0}^{∞} pk x^k.
The relation

    (∀k ∈ N0)  0 ≤ pk ≤ 1

implies that the radius of convergence of the series defining GX is at least equal to 1; that is, for |x| < 1 the series is normally convergent. In this exercise, G1 denotes the generating function of the vertex degree probability distribution (i.e., pk is the probability that a randomly chosen vertex has k first neighbors).
(i) What is the generating function of the sum of the degrees of n randomly chosen vertices?
(ii) What is the generating function of the vertex degree probability distribution of an end vertex of a randomly chosen edge?
(iii) What is the generating function of the number of outgoing edges of an end vertex of a randomly chosen edge?
(iv) We have seen that, for uniform and binomial random graphs, the vertex degree probability distribution is the Poisson distribution

    pk = e^{−λ} λ^k/k!

when N → ∞, p → 0, and Np = λ. In this case, what is the generating function of the probability distribution of the number of outgoing edges of an end vertex of a randomly chosen edge?
16 See, for example, Boccara [56], pp. 42–49.
(v) What is the generating function of the probability distribution of the number of second neighbors of a randomly selected vertex?
(vi) What is the average number of second neighbors of a randomly selected vertex?

Exercise 7.6
(i) Let St and It denote the respective densities of susceptible and infective individuals occupying the sites of a random network at time t. If pi is the probability for a susceptible to be infected by a neighboring infective, show that the average value of the probability for a susceptible to be infected, at time t, by one of his neighbors is 1 − exp(−d pi It), where d is the average vertex degree.
(ii) If pr is the probability for an infective to recover and become susceptible again, show that, within the mean-field approximation, the system undergoes a transcritical bifurcation from a disease-free to an endemic state. Denote by ρ the total constant density of individuals.
(iii) The previous model neglects births and deaths. In order to model the spread of a more serious disease, assume that infected individuals do not recover, and denote by ds and di, respectively, the probabilities of dying for a susceptible and an infective. Dead individuals are permanently removed from the network. If bs and bi are the respective probabilities for a susceptible and an infective to give birth to a susceptible at a vacant vertex of the network during one time unit, show that, within the mean-field approximation, this model exhibits various bifurcations. Discuss numerically the existence and stability of the different steady states.
Solutions

Solution 7.1 (i) Let the sites of the regular lattice be numbered from 0 to N − 1. If ℓ < N/2, we have

    d(0, ℓ) = d(0, N − ℓ) = ℓ/r              if ℓ = 0 mod r,
    d(0, ℓ) = d(0, N − ℓ) = ⌊ℓ/r⌋ + 1        otherwise,

where d(x, y) denotes the distance between sites x and y. Hence, grouping sites in r-blocks on both sides of site 0, we are left with N − 1 − 2r⌊(N − 1)/(2r)⌋ sites that cannot be grouped into two r-blocks, and their distance from 0 is ⌊(N − 1)/(2r)⌋ + 1 (if N − 1 = 0 mod 2r, there are no sites left, but the final result is always valid). The sum of the distances from site 0 to all the other N − 1 sites of the lattice is therefore given by

    2r(1 + 2 + · · · + ⌊(N − 1)/(2r)⌋) + (N − 1 − 2r⌊(N − 1)/(2r)⌋)(⌊(N − 1)/(2r)⌋ + 1)
        = r⌊(N − 1)/(2r)⌋(⌊(N − 1)/(2r)⌋ + 1) + (N − 1 − 2r⌊(N − 1)/(2r)⌋)(⌊(N − 1)/(2r)⌋ + 1)
        = (N − 1 − r⌊(N − 1)/(2r)⌋)(⌊(N − 1)/(2r)⌋ + 1)
and, since site 0 does not play any particular role, dividing this result by N − 1 yields the characteristic path length:

    Llattice = (1 − (r/(N − 1)) ⌊(N − 1)/(2r)⌋)(⌊(N − 1)/(2r)⌋ + 1).

(ii) Let −r, −r + 1, −r + 2, . . . , −1, 1, 2, . . . , r − 1, r be the 2r nearest neighbors of site 0. To avoid counting edges twice, starting from site −r, we only count edges whose endpoints are on the right of each site of the list above. For 0 ≤ j < r, site −r + j is linked to r − 1 sites on its right, and, for 0 ≤ k < r, site r − k is linked to k sites on its right. The total number of edges is thus

    r(r − 1) + (r − 1) + (r − 2) + · · · + 2 + 1 + 0 = (3/2) r(r − 1).

The maximum number of edges connecting pairs of nearest neighbors of site 0 is r(2r − 1). Thus, the clustering coefficient of the lattice is given by

    Clattice(r) = 3(r − 1)/(2(2r − 1)).

(iii) For large N and fixed r, we find

    Llattice ∼ N/(4r).
This result can be obtained without knowing the exact expression of Llattice. It suffices to note that site N/2, which is one of the most distant sites from site 0, is at a distance N/(2r). The characteristic path length is, therefore, approximately equal to N/(4r). For r ≥ 1, Clattice is an increasing function of r that tends to 3/4 when r → ∞. Clattice(r) is, therefore, bounded by 3/4.

Solution 7.2 Writing a small program to build up a random graph with N = 5,000 vertices and a total number of edges E = 20,000, for a specific example, we find that the degrees of the vertices vary between 0 and 13. Degrees equal to 0 imply the existence of disconnected subgraphs. Actually, the graph connectivity, which is defined as the fraction of vertices belonging to the giant component, is found to be equal to 0.964242. The numerical frequencies of the different degrees compared to the theoretical frequencies predicted by the Poisson distribution are listed in the table below.

    degree       0      1       2       3       4       5       6
    numerical    89     363     715     996     986     787     520
    theoretical  91.58  366.31  732.63  976.83  976.83  781.47  520.98

    degree       7       8       9      10     11    12    13
    numerical    297     143     65     24     9     5     1
    theoretical  297.70  148.85  66.16  26.46  9.62  3.21  0.99

The agreement is quite good. The plots of the numerical and theoretical frequencies shown in Figure 7.8 almost coincide. The standard deviation of the numerical degree distribution is found to be equal to 1.98544, compared to the theoretical value √d = 2.
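A small program of the kind mentioned in Solution 7.2 could look like the following Python sketch (ours; the function name is illustrative). It builds a uniform random graph G(N, M) and compares the empirical degree frequencies with the Poisson prediction N e^(−λ) λ^k/k!. N and M are free parameters; here they are chosen so that the average degree is d = 4, the value used in Figure 7.8.

    import math
    import random

    def random_graph_degrees(N, M, seed=None):
        """Sample a G(N, M) uniform random graph and return the degree of each vertex."""
        rng = random.Random(seed)
        degree = [0] * N
        edges = set()
        while len(edges) < M:
            u, v = rng.randrange(N), rng.randrange(N)
            if u != v and (min(u, v), max(u, v)) not in edges:
                edges.add((min(u, v), max(u, v)))
                degree[u] += 1
                degree[v] += 1
        return degree

    N, M = 5000, 10000                  # average degree d = 2M/N = 4
    degree = random_graph_degrees(N, M, seed=0)
    lam = 2 * M / N
    for k in range(max(degree) + 1):
        numerical = degree.count(k)
        theoretical = N * math.exp(-lam) * lam**k / math.factorial(k)
        print(k, numerical, round(theoretical, 2))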
Fig. 7.8. Poisson test. Plots of the degree frequencies for z varying from 0 to 13 obtained from a numerical simulation of a random graph with 5000 vertices and an average degree d = 4 (light grey) compared to the theoretical frequency values predicted by a Poisson distribution (darker grey).

Solution 7.3 (i) The solution d(xi, t) to the equation

    ∂d(xi)/∂t = d(xi)/(2t),

which, for t = ti, is equal to m, is

    d(xi, t) = m (t/ti)^(1/2).

(ii) Taking into account the expression of d(xi, t), the probability P(d(xi, t) ≤ d) that the degree of a given vertex is less than or equal to d can be written

    P(d(xi, t) ≤ d) = P(ti ≥ m²t/d²).

But

    P(ti ≥ m²t/d²) = 1 − P(ti < m²t/d²),

and since, at time t, the random variable ti is uniformly distributed in the interval [0, N0 + t[, the probability P(ti < T) that, at time t, ti is less than T is T/(N0 + t). Hence,

    P(d(xi, t) ≤ d) = 1 − m²t/(d²(N0 + t)).

Thus, in the infinite-time limit, the vertex degree cumulative distribution function Fvertex is such that

    Fvertex(d) = lim_{t→∞} (1 − m²t/(d²(N0 + t))) = 1 − m²/d²,
and the vertex degree probability density is given by

    fvertex(d) = ∂Fvertex(d)/∂d = 2m²/d³.

As it should be, the function Fvertex, defined on the interval [m, ∞[, is monotonically increasing from 0 to 1. In agreement with numerical simulations, the exponents βBA and γBA are, respectively, equal to 1/2 and 3 (see page 293).

(iii) For the first model, with growth but no preferential attachment, the differential equation satisfied by d(xi) is

    ∂d(xi)/∂t = m/(N0 + t − 1),

since the probability that, at time t, one of the m new edges added to the network connects to an already existing vertex is the same for all the N0 + t − 1 vertices of the network. The solution to this equation, which, for t = ti, is equal to m, is

    d(xi, t) = m (log((N0 + t − 1)/(N0 + ti − 1)) + 1).

As above, the probability P(d(xi, t) ≤ d) that the degree of a given vertex is less than or equal to d can be written

    P(d(xi, t) ≤ d) = P(ti ≥ (N0 + t − 1) exp(1 − d/m) − N0 + 1).

Assuming that, at time t, the random variable ti is uniformly distributed in the interval [0, N0 + t[, we can write

    P(ti ≥ (N0 + t − 1) exp(1 − d/m) − N0 + 1)
        = 1 − P(ti < (N0 + t − 1) exp(1 − d/m) − N0 + 1)
        = 1 − ((N0 + t − 1) exp(1 − d/m) − N0 + 1)/(N0 + t).

For this first model, in the infinite-time limit, the cumulative distribution function F¹vertex of the vertex degree is given by

    F¹vertex(d) = 1 − exp(1 − d/m),

and the corresponding vertex degree probability density f¹vertex is such that

    f¹vertex(d) = (e/m) e^(−d/m).
For the second model, with preferential attachment but no growth (i.e., a fixed number of vertices N), the degree growth rate is given by the probability 1/N of first
selecting the vertex from which the edge originates plus the probability of preferentially attaching that edge to vertex xi. This second probability is of the form

    A d(xi) / Σ_{j=1}^{N} d(xj),

where Σ_{j=1}^{N} d(xj) = 2t and A = N/(N − 1) to exclude from the summation edges originating and terminating at the same vertex. Hence, we have

    ∂d(xi)/∂t = 1/N + (N/(N − 1)) d(xi)/(2t).

The solution that, for t = 0, satisfies d(xi, 0) = 0 (the initial network consists of N vertices and no edges) is

    d(xi, t) = (2(N − 1)/(N(N − 2))) t + C t^(N/(2(N−1))),

where C is a constant. Since N ≫ 1, the expression above can be written

    d(xi, t) ≈ (2/N) t + C t^(1/2).
2 t + Ct1/2 . N
d(xj ) = 2t, the constant C is equal to zero, and
2 t. N That is, for large values of N and t all vertices have approximately the same degree. This implies a normal vertex degree distribution centered around its mean value. In their paper, Barab´ asi, Albert, and Jeong [33] give the results of numerical simulations that agree with all the analytical results above. d(xi , t) ≈
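These scalings are easy to check with a direct simulation of growth with preferential attachment. A minimal sketch (Python with NumPy assumed; N0, m, and the number of time steps are arbitrary illustrative values, and duplicate targets are simply avoided by resampling):

import numpy as np

rng = np.random.default_rng(1)
N0, m, T = 5, 3, 20000        # initial vertices, edges per new vertex, time steps

degree = [0] * N0
targets = []                  # vertex labels, each repeated as many times as its degree

for t in range(T):
    new = len(degree)
    degree.append(0)
    if not targets:           # first step: attach to the initial vertices at random
        chosen = list(rng.choice(N0, size=m, replace=False))
    else:                     # preferential attachment: pick endpoints of random edges
        chosen = set()
        while len(chosen) < m:
            chosen.add(targets[rng.integers(len(targets))])
        chosen = list(chosen)
    for x in chosen:
        degree[new] += 1
        degree[x] += 1
        targets.extend([new, x])

degree = np.array(degree)
# for large degrees the distribution should decay roughly as d^(-3)
values, counts = np.unique(degree[degree > 0], return_counts=True)
print(list(zip(values[:10], counts[:10])))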
Solution 7.4 (i) When a new vertex with m outgoing edges is added to the network, the probability that a new edge connects to a vertex with degree d is given by

md / Σ_{k≥m} k N_k = md/(2mt) = d/(2t).

Hence, the first and second terms of the right-hand side of the Krapivsky-Redner-Leyvraz rate equation represent, respectively, the increase and decrease of N_d when new edges are connected either to vertices of degree d − 1 or to vertices of degree d. The last term accounts for new vertices of degree m.

(ii) The term Σ_{k≥m} k N_k = 2mt represents the sum of the degrees of all the vertices, clearly equal to twice the sum of all edges. On average, in the infinite-time limit, N_d(t) is therefore linear in t. Thus, replacing N_d(t) by tP(d) in the differential equation, we obtain

P(d) = (1/2)[(d − 1)P(d − 1) − dP(d)] + δ_{dm},

or
P(d) = ((d − 1)/(d + 2)) P(d − 1),   if d > m,
P(d) = 2/(m + 2),                    if d = m.
Hence, for d > m,

P(d) = 2m(m + 1)/(d(d + 1)(d + 2)).
For large d, P(d) ∼ d^{−γ_BA} with γ_BA = 3.

Solution 7.5 (i) This is a classical result: If G_X and G_Y are the respective generating functions of the discrete random variables X and Y, then the generating function of the random variable X + Y is G_{X+Y} = G_X G_Y. Hence, the generating function of the sum of the degrees of n randomly chosen vertices is the nth power of the generating function G₁ of the vertex degree probability distribution.

(ii) The probability that an edge reaches a vertex of degree k is proportional to k p_k. Since the sum of all probabilities has to be equal to 1, the generating function of the vertex degree probability distribution of an end vertex of a randomly chosen edge is defined by

Σ_{k=0}^{∞} k p_k x^k / Σ_{k=0}^{∞} k p_k = x G′₁(x)/G′₁(1).
This result assumes that the average vertex degree G′₁(1) = Σ_{k=1}^{∞} k p_k exists (i.e., is finite).
(iii) The previous result gave us the generating function of the vertex degree probability distribution of an end vertex of a randomly chosen edge. The number of outgoing edges of one end vertex is equal to its degree minus 1, since one edge was selected to reach this vertex. Thus, the generating function G_out of the number of outgoing edges is equal to the generating function of the vertex degree probability distribution of an end vertex of a randomly chosen edge divided by x; that is,

G_out(x) = G′₁(x)/G′₁(1).
(iv) The expression of the generating function of the Poisson probability distribution is

G₁(x) = Σ_{k=0}^{∞} e^{−λ} (λ^k/k!) x^k = Σ_{k=0}^{∞} e^{−λ} (λx)^k/k! = e^{λ(x−1)},

where λ = G′₁(1) is the average vertex degree. Hence, the generating function of the probability distribution of the number of outgoing edges of an end vertex of a randomly selected edge is, in this case,

G_out(x) = G′₁(x)/G′₁(1) = e^{λ(x−1)} = G₁(x);
that is, the generating function of the vertex degree probability distribution. Whether a vertex is randomly selected or it is an end vertex of a randomly selected edge, the probability distribution of the number of outgoing edges of this vertex is the same!

(v) A randomly selected vertex has a probability p_k of having k first neighbors, and each of these neighbors has a probability q_ℓ of having ℓ outgoing edges, where q_ℓ is the coefficient of x^ℓ in the series expansion of G_out(x); that is,

q_ℓ = (1/ℓ!) (dℓ/dxℓ) G_out(x) |_{x=0}.

The generating function of the number of second neighbors of a randomly selected vertex is therefore defined by

G₂(x) = G₁(G_out(x)) = G₁(G′₁(x)/G′₁(1)).

(vi) If G_X is the generating function of a discrete random variable X, the average value of X, if it exists, is given by

⟨X⟩ = Σ_{k=0}^{∞} k p_k = G′_X(1).
The average number of second neighbors of a randomly selected vertex is then

(d/dx) G₁(G′₁(x)/G′₁(1)) |_{x=1} = G″₁(1).

Solution 7.6 (i) Let S_t and I_t denote the respective densities of susceptible and infective individuals occupying the network vertices at time t. If p_i is the probability for a susceptible to be infected by a neighboring infective, the probability for a susceptible to be infected, at time t, by one of his k neighbors is

P(p_i, I_t, k) = 1 − (1 − p_i I_t)^k.

Since, in a random network, the vertex degree probability distribution is Poisson, the average value of P(p_i, I_t, k) as a function of the random variable k is given by

⟨P(p_i, I_t, k)⟩_Poisson = Σ_{k=0}^{∞} [1 − (1 − p_i I_t)^k] e^{−d} d^k/k! = 1 − exp(−d p_i I_t).

(ii) If p_r is the probability for an infective to recover and become susceptible again, the mean-field equations governing the time evolution of this SIS epidemic model are

S_{t+1} = ρ − I_{t+1},
I_{t+1} = I_t + S_t [1 − exp(−d p_i I_t)] − p_r I_t,

where ρ is the total constant density of individuals. Eliminating S_t in the second equation yields

I_{t+1} = (1 − p_r) I_t + (ρ − I_t)[1 − exp(−d p_i I_t)].
In the infinite-time limit, the stationary density of infective individuals I∞ satisfies the equation

I∞ = (1 − p_r) I∞ + (ρ − I∞)[1 − exp(−d p_i I∞)].

I∞ = 0 is always a solution to this equation. This value characterizes the disease-free state. It is a stable stationary state if, and only if, dρp_i − p_r < 0. If dρp_i − p_r > 0, the stable stationary state is given by the unique positive solution of the equation above. In this case, a nonzero fraction of the population is infected: the system is in the endemic state. For dρp_i − p_r = 0, within the framework of the mean-field approximation, the system undergoes a transcritical bifurcation similar to a second-order phase transition characterized by a nonnegative order parameter, whose role is played, in this model, by the stationary density of infected individuals I∞.

(iii) If b_s and b_i are, respectively, the probabilities for a susceptible and an infective to give birth to a susceptible at a vacant site during one time unit, reasoning as in (i), the probability that either a susceptible or an infective will give birth to a susceptible at a neighboring vacant site is

(1 − S_t − I_t)[1 − exp(−d(b_s S_t + b_i I_t))].

Taking into account this result, the mean-field equations governing the time evolution of the model are

S_{t+1} = S_t + (1 − S_t − I_t)[1 − exp(−d(b_s S_t + b_i I_t))] − d_s S_t − (1 − d_s) S_t [1 − exp(−d p_i I_t)],
I_{t+1} = I_t + (1 − d_s) S_t [1 − exp(−d p_i I_t)] − d_i I_t,

where d_s and d_i are, respectively, the probabilities for a susceptible and an infective to die during one time unit. In the infinite-time limit, the stationary densities of susceptible and infective individuals S∞ and I∞ satisfy the equations

S∞ = S∞ + (1 − S∞ − I∞)[1 − exp(−d(b_s S∞ + b_i I∞))] − d_s S∞ − (1 − d_s) S∞ [1 − exp(−d p_i I∞)],
I∞ = I∞ + (1 − d_s) S∞ [1 − exp(−d p_i I∞)] − d_i I∞.

There exist different solutions.

1. (S∞, I∞) = (0, 0) is always a solution of the system of equations above. The eigenvalues of the Jacobian at (0, 0) being 1 − d_i and 1 + d b_s − d_s, this point, which corresponds to extinction of all individuals, is asymptotically stable if d b_s < d_s (i.e., when the average birth rate of susceptible individuals is less than their death rate).

2. I∞ = 0 is always a solution to the second recurrence equation. There exists, therefore, a disease-free state (S♥, 0), where the infinite-time limit of the susceptible density S♥ is the nonzero solution to the equation

(1 − S♥)[1 − exp(−d b_s S♥)] − d_s S♥ = 0.

This solution is stable if
d p_i (1 − d_s) S♥ − d_i < 0.

It can be verified that, during the evolution towards the disease-free steady state,

• if S₀ > d_i/(d p_i (1 − d_s)), then I₁ > I₀: there is an epidemic,
• if S₀ < d_i/(d p_i (1 − d_s)), then I₁ < I₀: there is no epidemic,

where S₀ and I₀ denote the susceptible and infective initial densities.¹⁷ For example, if d = 4, p_i = 0.34, b_s = 0.2, b_i = 0.1, d_s = 0.355, and d_i = 0.5, the condition d b_s > d_s is verified. In the infinite-time limit, the system is in a disease-free state, and since d_i/(d p_i (1 − d_s)) = 0.57, there is no epidemic for S₀ = 0.54 and an epidemic for S₀ = 0.6. For these parameter values S♥ = 0.468, and the eigenvalues of the Jacobian at the point (S♥, 0) are 0.911 and 0.625, confirming the asymptotic stability of this point.

3. There also exists a solution (S♣, I♣) ≠ (0, 0) that satisfies the system of equations

0 = (1 − S♣ − I♣)[1 − exp(−d(b_s S♣ + b_i I♣))] − d_s S♣ − (1 − d_s) S♣ [1 − exp(−d p_i I♣)],
0 = (1 − d_s) S♣ [1 − exp(−d p_i I♣)] − d_i I♣.

¹⁷ This result is another illustration of the Kermack-McKendrick threshold theorem; see pages 45 and 137.
Fig. 7.9. Limit cycle. Parameter values: d = 4, p_i = 0.8, b_s = 0.049, b_i = 0.0001, d_s = 0.001, and d_i = 0.18.

For d = 4, p_i = 0.8, b_s = 0.6, b_i = 0.1, d_s = 0.355, and d_i = 0.5, the nontrivial fixed point (S♣, I♣) = (0.3495, 0.2450), and the eigenvalues of the Jacobian at the point are complex conjugate and equal to 0.448522 ± i 0.401627. This indicates the existence of damped oscillations in the neighborhood of this asymptotically stable point representing an endemic state.

4. Playing with the numerical values of the parameters, we can modify the location of the nontrivial fixed point (S♣, I♣) and increase the modulus of the eigenvalues of
the Jacobian at this point. When the modulus of these eigenvalues becomes equal to 1, the system exhibits a Hopf bifurcation. For d = 4, pi = 0.8, bs = 0.054, bi = 0.0001, ds = 0.001, and di = 0.18, the convergence to the fixed point (S♣ , I♣ ) = (0.0623303, 0.0646393) is extremely slow since the eigenvalues of the Jacobian at this point, equal to 0.983503 ± i 0.180775, have a modulus equal to 0.999979, indicating the proximity of a Hopf bifurcation. 5. Decreasing the value of the parameter bs , the system exhibits a limit cycle as illustrated in Figure 7.9.
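The mean-field recurrence equations above can be iterated directly. A minimal sketch (Python assumed; the parameter values are those quoted above, and the small initial infective density I0 = 0.01 is an arbitrary choice):

import math

def iterate(d, p_i, b_s, b_i, d_s, d_i, S0, I0, steps=2000):
    """Iterate the mean-field SIS model with births and deaths; return the trajectory."""
    S, I = S0, I0
    traj = [(S, I)]
    for _ in range(steps):
        birth = (1 - S - I) * (1 - math.exp(-d * (b_s * S + b_i * I)))
        infection = (1 - d_s) * S * (1 - math.exp(-d * p_i * I))
        S, I = S + birth - d_s * S - infection, I + infection - d_i * I
        traj.append((S, I))
    return traj

# disease-free example from the text: no epidemic for S0 = 0.54, epidemic for S0 = 0.6
for S0 in (0.54, 0.6):
    t = iterate(4, 0.34, 0.2, 0.1, 0.355, 0.5, S0, 0.01)
    print(S0, " I1 > I0:", t[1][1] > t[0][1], "  final (S, I):", t[-1])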
8 Power-Law Distributions
8.1 Classical examples

Many empirical analyses suggest that power-law behaviors in the distribution of some quantity are quite frequent in nature.¹ However, we have to keep in mind that these behaviors may be purely apparent (see pages 324 and 353) or simply transient or spurious (see page 345). Here are two classical examples completed with some recent developments.

Example 62. Pareto law. The Italian economist and sociologist Vilfredo Pareto (1848–1923) is known, in particular, for his pioneering work on the distribution of income. Pareto [277] considered data for England, a few Italian towns, some German states, Paris, and Peru. A log-log plot of the number N(x) of individuals with income greater than x showed² that N(x) behaved as x^{−α}, where the exponent α, referred to as the Pareto exponent, is in all cases close to 1.5.

x      N          x      N         x        N
150    400,648    600    42,072    2000     9880
200    234,185    700    34,269    3000     6069
300    121,996    800    29,311    4000     4161
400    74,041     900    25,033    5000     3081
500    54,419     1000   22,899    10,000   1104

¹ In Subsection 7.5.1, we presented a few scale-free networks characterized by power-law vertex degree probability distributions.
² If N_total is the total number of individuals, the quantity F(x) = (N_total − N(x))/N_total is the probability that an individual has an income less than or equal to x. F is known in probability theory as the cumulative distribution function; see Subsection 8.2.1.
The table above gives the number N(x) as a function of x in pounds sterling for a category of British taxpayers (businessmen and professionals) for the fiscal year 1893–1894. The data are taken from Pareto's Cours d'Économie Politique, volume 2, page 305. Figure 8.1 shows a log-log plot of these data (dots) and the least-squares fit (straight line). Actually, from our fit of Pareto's data, we find α = 1.34. Many papers have been published on the probability density of income distribution in various countries.³ It appears that this probability density follows a power-law behavior in the high-income range but, according to Gibrat [142],⁴ it is lognormal in the low-to-middle range. This probability distribution, which depends upon two parameters, is discussed in Subsection 8.2.3. It can be mistaken for a power-law distribution, as shown in Figure 8.5.
Fig. 8.1. Distribution of income of a category of British taxpayers (businessmen and professionals) for the year 1893–1894. x is the annual income and N (x) the number of taxpayers whose annual income is greater than x.
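The least-squares fit quoted above is easily reproduced from the table (Python with NumPy assumed):

import numpy as np

x = np.array([150, 200, 300, 400, 500, 600, 700, 800, 900, 1000,
              2000, 3000, 4000, 5000, 10000])
N = np.array([400648, 234185, 121996, 74041, 54419, 42072, 34269, 29311,
              25033, 22899, 9880, 6069, 4161, 3081, 1104])

# fit log N = -alpha log x + constant
slope, intercept = np.polyfit(np.log(x), np.log(N), 1)
print("Pareto exponent alpha =", -slope)    # close to 1.34 for these data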
In a recent study of the overall profile of Japan's personal income distribution, Souma [322]⁵ found that only the top 1% of the Japanese income data for the year 1998 follows a power-law distribution with a Pareto exponent equal to α = 2.06; the remaining 99% of the data fit a lognormal distribution with m = 4 million yen and β = 1/√(2πσ²) = 2.68.⁶ Souma also studied the time dependence from 1955 to 1998 of Pareto's exponent α and the index β and
³ Pareto law, which states N(x) ∼ x^{−α}, implies the probability density f(x) ∼ x^{−(1+α)}.
⁴ See also Badger [22] and Montroll and Shlesinger [248].
⁵ See also analyses of large and precise data sets by Fujiwara, Souma, Aoyama, Kaizoji, and Aoki [132].
⁶ β is called the Gibrat index.
found that the values of the Pareto exponent for Japan and the US during that period were strongly correlated. Pointing out that many recent papers dealing with income probability distributions “do little or no comparison at all with real statistical data,” Drăgulescu and Yakovenko [107, 108] analyzed the data on income distribution in the United States from the Bureau of the Census and the Internal Revenue Service. They found that the individual income of about 95% of the population is not lognormal but described by an exponential distribution of the form exp(−r/R)/R.⁷ The parameter R, which represents the average income, changes every year. For example, from data collected from the Internal Revenue Service⁸ for the year 1997, R = 35,200 US$ from a database size equal to 1.22 × 10⁸. Above 100,000 US$ the income distribution has a power-law behavior with a Pareto exponent α = 1.7 ± 0.1. The exponential distribution is valid for individual incomes. For households with two persons filing jointly, the total income r is the sum r₁ + r₂ of the two individual incomes, and the income probability distribution p₂ of the household is

p₂(r) = ∫₀^r p(r′) p(r − r′) dr′ = (r/R²) e^{−r/R}.

Data obtained from the Web site of the Bureau of the Census⁹ are in good agreement with this distribution. The individual and family income distributions differ qualitatively. The former increases monotonically towards low incomes and reaches its maximum at zero. The latter has a maximum at a finite income r_max = R and vanishes at zero. That is, for individuals, the most probable income is zero, while, for a family with two earners, it is equal to R. According to Drăgulescu and Yakovenko, hierarchy could explain the existence of two regimes, exponential and power-law, for the income distribution. People have leaders, who have leaders of higher order, and so forth. The number of people decreases exponentially with the hierarchical level. So, if the income increases linearly with the hierarchical level, an exponential distribution follows; but if the income increases exponentially with the hierarchical level, a power law is obtained. The linear increase is probably more realistic for moderate incomes, while the exponential increase should be the case for very high incomes.
8
9
This discrepancy with Souma’s result may be due, according to the authors, to the fact that, in Japan, below a relatively high threshold, people are not required to file a tax form. Information on tax statistics can be found on the Internal Revenue Service Web site: http://www.irs.gov/taxstats/. http://ferret.bls.census.gov/. FERRET, which stands for Federal Electronic Research and Review Electronic Tool, is a tool developed and supported by the U.S. Bureau of the Census.
Example 63. Zipf law. Analyzing word frequency in different natural languages, George Kingsley Zipf (1902–1950) discovered that the frequency f of a word is inversely proportional to its rank r [355, 356], where a word of rank r is the rth word in the list of all words ordered with decreasing frequency. For example, James Joyce's novel Ulysses contains 260,430 words. If words such as give, gives, gave, given, giving, giver, and gift are considered to be different, there are, in Ulysses, 29,899 different words. Zipf's data, taken from Human Behavior and the Principle of Least Effort, p. 24, are reproduced in the table below.

r     f       r     f      r        f
10    2653    200   133    3000     8
20    1311    300   84     4000     6
30    926     400   62     5000     5
40    717     500   50     10,000   2
50    556     1000  20     20,000   1
100   265     2000  12     29,899   1
A log-log plot of the word frequency f as a function of the word rank r is shown in Figure 8.2 with a straight line representing the least-squares fit. The slope of this line is equal to 1.02.
Fig. 8.2. Log-log plot of word frequency f as a function of word rank r (dots) and the least-squares fit (straight line). Zipf data taken from Human Behavior and the Principle of Least Effort are reproduced in the table above.
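A rank–frequency analysis of this kind takes only a few lines for any text. A minimal sketch (Python with NumPy assumed; corpus.txt is a placeholder file name and the tokenization is deliberately crude):

import re
from collections import Counter
import numpy as np

text = open("corpus.txt", encoding="utf8").read().lower()    # hypothetical corpus file
words = re.findall(r"[a-z']+", text)

freq = np.array(sorted(Counter(words).values(), reverse=True), dtype=float)
rank = np.arange(1, len(freq) + 1)

# the least-squares fit of log f against log r gives (minus) the Zipf exponent
slope, _ = np.polyfit(np.log(rank), np.log(freq), 1)
print("Zipf exponent:", -slope)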
In its original form, Zipf’s law, observed on a rather small corpus, accounts for the statistical behavior of word frequencies for ranks between a few hundred and a few thousand. To test its validity for higher rank values, much larger corpora have to be analyzed. Recently, Montemurro [246] studied a very large corpus made up of 2606 books with a vocabulary size of 448,359
different words. He found that Zipf's law is verified with an exponent equal to 1.05 up to r ≈ 6000. Above this value, there is a crossover to a new regime characterized by an exponent equal to 2.3. These two regimes have also been found by Ferrer i Cancho and Solé [125], who processed the British National Corpus, which is a collection of text samples, both spoken and written, comprising about 9 × 10⁷ words. The values of the two exponents found by these authors are 1.01 ± 0.02 and 1.92 ± 0.07. Words seem, therefore, to belong to two different sets: a kernel vocabulary used by most speakers and a much larger vocabulary used for specific communications. For words belonging to the second set, being much less usual, it is not so surprising that their frequency decreases much faster as a function of their rank than that of the more usual words belonging to the first set. It is clear that the frequency-ordered list carries only a small part of the information on the role of words. If, for instance, we reorder at random the different words of a text, the word frequency as a function of word rank is unchanged, while the text meaning is totally lost. It is, therefore, not sufficient to determine how often a word is used; we also need to assess where a word is used. A step in this direction has been taken by Montemurro and Zanette [247]. Consider a text corpus (36 plays by Shakespeare containing a total of 885,535 words divided into 23,150 different words) as made up of the concatenation of P parts (the different plays). Determine the frequency f_i of a particular word in part i (i = 1, 2, . . . , P). The quantity

p_i = f_i / Σ_{i=1}^{P} f_i

is the probability of finding the word in part i and

S = −(1/log P) Σ_{i=1}^{P} p_i log p_i
the entropy associated with this word. Note that 0 ≤ S ≤ 1, where the minimum value 0 corresponds to a word that appears only in one part and the maximum value to a word equally distributed among all parts. It is clear that very frequent words tend to be more uniformly distributed, which implies that the entropy S of a word increases with its global frequency. To explain the behavior of the entropy, the authors generated a random version of Shakespeare’s 36 plays. This was done by shuffling the list of all the 885,535 words and then dividing this list in 36 parts, part j containing exactly the same number of words as play j (j = 1, 2, . . . , 36). Comparing the entropy of the actual plays with the entropy of the randomized plays, while the tendency of the entropy to increase with the global word frequency is preserved, it is found that large fluctuations of the value of the entropy are absent in the randomized version. On average, words have higher entropies in the random version than in the actual corpus.
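A minimal sketch of the entropy computation defined above (Python assumed; parts stands for a hypothetical list of word lists, one per play):

import math
from collections import Counter

def word_entropies(parts):
    """Return {word: S} where S = -(1/log P) * sum_i p_i log p_i over the P parts."""
    P = len(parts)
    counts = [Counter(part) for part in parts]
    vocabulary = set().union(*counts)
    entropies = {}
    for w in vocabulary:
        f = [c[w] for c in counts]
        total = sum(f)
        S = 0.0
        for fi in f:
            if fi:
                p = fi / total
                S -= p * math.log(p)
        entropies[w] = S / math.log(P)
    return entropies

# tiny illustrative example with three "plays"
parts = [["to", "be", "or", "not", "to", "be"], ["be", "happy"], ["or", "else"]]
print(word_entropies(parts))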
Language can also be described as a network of linked words. That is, a language can be described as a graph whose vertices are words, and undirected edges represent connections between words. Word interactions are not easy to define precisely, but it seems that different reasonable definitions provide very similar network structures. Following Ferrer i Cancho and Sol´e [126], an edge may connect an adjective and a noun (e.g., red flowers), an article and a noun (e.g., the house), a verb and an adverb (e.g., stay here), etc. This means that two words are at distance 1, i.e., they are connected by an edge if they are nearest neighbors in at least one sentence. The existence of a link between two words could be given a more precise definition saying that words i and j are at distance 1 if the probability pij of the co-occurrence of the pair (i, j) is larger than the product pi pj of their probabilities of occurring in the language. Taking into account this restrictive condition, Ferrer i Cancho and Sol´e found that in a network of 460,902 vertices (words) and 1.61 × 107 edges, the average degree d = 70.13. The vertex degree probability distribution has two different power-law regimes characterized by the exponents γ1 = 1.50 for vertex degrees less than 103 and γ2 = 2.70 for vertex degrees larger than 103 . Furthermore, the characteristic path length of the word network Lwn = 2.67, and the clustering coefficient Cwn = 0.437, which shows that the word network is a small world. Considering the language as a self-organizing growing network, and using the Barab´ asi-Albert idea of preferential attachment (see page 292) (i.e., a new word has a probability of being connected to an old word x proportional to the degree d(x) of that word), Dorogovtsev and Mendes [105] found that the vertex degree probability distribution exhibits a crossover between two power-law behaviors with exponents γ1 = 1.5 for vertex degrees d 5000 and γ2 = 3 for vertex degrees d 5000. Another application of statistical linguistics is the idea of extracting keywords from interword spacing [275]. The interword spacing is defined as the word count between a word and the next occurrence of the same word in a text. For each word, all interword spacings are counted up and the standard deviation is computed. It is found that high standard deviation is a better criterion as a search engine keyword than high frequency. Standard deviations of interword spacings seem also to offer the possibility of identifying texts due to the same author. If, for a given text, words are ranked according to the value of the standard deviation of their interword spacing and standard deviation is plotted versus the logarithm of the rank, it appears that plots corresponding to different texts by the same author almost coincide. This result, reported by Berryman, Allison, and Abbot [41], is in favor of the hypothesis that the biblical books of Luke and The Acts of the Apostles have a common author.
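Keyword extraction from interword spacings can be sketched as follows (Python assumed; corpus.txt is again a placeholder file name, and the threshold of three occurrences per word is an arbitrary choice):

import re
import statistics
from collections import defaultdict

def spacing_std(text):
    """Standard deviation of the interword spacing (word count between successive
    occurrences of the same word) for every word occurring at least three times."""
    words = re.findall(r"[a-z']+", text.lower())
    positions = defaultdict(list)
    for i, w in enumerate(words):
        positions[w].append(i)
    result = {}
    for w, pos in positions.items():
        if len(pos) >= 3:
            spacings = [b - a for a, b in zip(pos, pos[1:])]
            result[w] = statistics.pstdev(spacings)
    return result

# words with a large spacing standard deviation are good keyword candidates
scores = spacing_std(open("corpus.txt", encoding="utf8").read())
for w, s in sorted(scores.items(), key=lambda kv: -kv[1])[:20]:
    print(w, round(s, 1))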
8.2 A few notions of probability theory

8.2.1 Basic definitions

Although ideas about probability are very old, the development of probability theory, as a scientific discipline, really started in 1654 when Pascal (1623–1662) and Fermat (1601–1665) analyzed simple games of chance in a correspondence published for the first time in 1679 [124]. In the meantime, the basic concepts of probability theory were clearly defined and used by Huygens (1629–1695) in his treatise De Ratiociniis in Aleæ Ludo published in 1657. In the early 18th century, Jakob Bernoulli (1654–1705) proved the law of large numbers, and de Moivre (1667–1754) established a first version of the central limit theorem. In the early 19th century, important contributions, essentially on limit theorems, were made by Laplace (1749–1827), Gauss (1777–1855), and Poisson (1781–1840). These results were extended in the late 19th and early 20th centuries by Chebyshev (1821–1894), Markov (1856–1922), and Lyapunov (1857–1918). The axiomatization of the theory started with the works of Bernstein (1878–1956), von Mises (1883–1953), and Borel (1871–1956), but the formulation based on the notion of measure, which became generally accepted, was given by Kolmogorov (1903–1987) in 1933.

According to Kolmogorov [192], a probability space is an ordered triplet (Ω, A, P), where Ω is the sample space or space of elementary events, A is the set of all events, and P is a probability measure defined on A. The set A is such that:

1. The empty set ∅ is an event.
2. If {A_n | n ∈ N} is a countable family of events, then ∪_{n∈N} A_n and ∩_{n∈N} A_n are events.
3. If A is an event, the complement A^c of A is an event.

The probability measure P defined on A has the following properties:

1. P(∅) = 0.
2. For all countable sequences (A_n) of pairwise disjoint events,

P(∪_{n=1}^{∞} A_n) = Σ_{n=1}^{∞} P(A_n).

3. P(Ω) = 1.

A random variable X is a real function defined on Ω such that, for all x ∈ R, the set {ω ∈ Ω | X(ω) ≤ x} is an event; i.e., an element of A. The cumulative distribution function of the random variable X is the function F_X : R → [0, 1] defined by

F_X(x) = P(X ≤ x),
where the notation P(X ≤ x) represents the probability of the event {ω ∈ Ω | X(ω) ≤ x}. The cumulative distribution function is a measure. This measure is absolutely continuous if there exists a nonnegative function f_X, called the probability density of the random variable X, such that

F_X(x) = ∫_{−∞}^{x} f_X(u) du.

In this case, X is said to be an absolutely continuous random variable. If the cumulative distribution function is piecewise constant with discontinuities at points x_k (k ∈ N) such that, for all k, ∆F_X(x_k) = F(x_k) − F(x_k − 0) > 0, the random variable is said to be discrete, and the sequence of numbers (p_k), where p_k = P({x_k}), is called a discrete probability distribution (see an example on page 280). There exist continuous cumulative distribution functions that are constant almost everywhere; i.e., they increase on a set of zero Lebesgue measure. Such distributions are said to be singular (see an example on page 219). In most applications, when dealing with a random variable X, we never refer to the sample space Ω of elementary events. We always refer to the cumulative distribution function F_X, or, when X is absolutely continuous, to its probability density f_X. Both functions are often abusively called probability distributions. It is clear that any nonnegative integrable function f such that

∫_{−∞}^{∞} f(x) dx = 1

defines a probability density. The following characteristics are of frequent use. They do not, however, always exist. The mean value or average value of a random variable X is

m₁(X) = ⟨X⟩ = ∫_{−∞}^{∞} x dF_X(x)
       = Σ_{k=1}^{∞} x_k p_k,            if X is discrete,
       = ∫_{−∞}^{∞} x f_X(x) dx,         if X is absolutely continuous.

More generally, the moment of order r of X is defined by

m_r(X) = ⟨X^r⟩ = ∫_{−∞}^{∞} x^r dF_X(x).
The variance of X is
σ²(X) = ⟨(X − ⟨X⟩)²⟩ = ⟨X²⟩ − ⟨X⟩² = m₂(X) − m₁²(X).

The square root σ(X) of the variance is called the standard deviation. If there is no risk of ambiguity, we shall omit X and simply write m₁, m_r, and σ². An important function, which is always defined, is the characteristic function ϕ_X of the probability distribution of X defined by

ϕ_X(t) = ⟨e^{itX}⟩ = ∫_{−∞}^{∞} e^{itx} dF_X(x).

In the case of absolutely continuous random variables, the characteristic function is the Fourier transform of the probability density: ϕ_X = f̂_X.¹⁰ If the moment m_r(X) of the random variable X exists, then

ϕ_X(t) = Σ_{k=0}^{r} (m_k(X)/k!) (it)^k + o(t^r);

that is, if m_r(X) exists, then ϕ_X is at least of class C^r and, for k = 0, 1, . . . , r, ϕ^{(k)}(0) = i^k m_k(X). There exist various modes of convergence for sequences of random variables. Let us just mention one mode that usually leads to short and elegant proofs of limit theorems:

Definition 24. The sequence of random variables (X_n) converges in distribution to the random variable X if for every bounded continuous function f

lim_{n→∞} ⟨f(X_n)⟩ = ⟨f(X)⟩.

This mode of convergence is denoted X_n →d X. That is, to prove that (X_n) converges to X in distribution, we may just check if either

lim_{n→∞} F_{X_n}(x) = F_X(x)

at every point of continuity of F_X or

lim_{n→∞} ϕ_{X_n}(t) = ϕ_X(t).

¹⁰ The Fourier transform of the function f is denoted f̂.
8.2.2 Central limit theorem

The most important absolutely continuous probability distribution is the normal or Gauss distribution, whose density, defined on R, is

f_Gauss(x) = (1/(√(2π) σ)) exp(−(x − m)²/(2σ²)),

where m is real and σ positive. It is easily verified that, for a Gaussian random variable, ⟨X⟩ = m and ⟨X²⟩ − ⟨X⟩² = σ². A Gaussian random variable of mean m and variance σ² is denoted N(m, σ²). As its name emphasizes, the central limit theorem plays a central role in probability theory. In its simplest form it can be stated:

Theorem 20. Let (X_j) be a sequence of independent identically distributed random variables with mean m and standard deviation σ; if S_n = X₁ + X₂ + · · · + X_n, then

(S_n − nm)/(σ√n) →d N(0, 1).

That is, the scaled sum of n independent identically distributed random variables with mean m and standard deviation σ converges in distribution to a normal random variable of zero mean and unit variance. Following Lyapunov, the simplest method for proving the central limit theorem is based on characteristic functions of probability distributions. Since the random variables X_j are identically distributed, they have the same characteristic function ϕ_X. For j ∈ N, let Y_j = (X_j − m)/σ and denote by ϕ_Y their common characteristic function. Then, the characteristic function ϕ_{U_n} of the random variable

U_n = (1/√n) Σ_{j=1}^{n} Y_j = (S_n − nm)/(σ√n)

is

ϕ_{U_n}(t) = (ϕ_Y(t/√n))^n.

For a fixed t and a large n,

ϕ_Y(t/√n) = 1 − t²/(2n) + o(t²/n).

Thus

ϕ_{U_n}(t) = (1 − t²/(2n) + o(t²/n))^n

and

lim_{n→∞} ϕ_{U_n}(t) = exp(−t²/2),

which proves the theorem since t ↦ e^{−t²/2} is the characteristic function of a Gaussian random variable N(0, 1) (see page 325).
Example 64. The probability distribution of a Poisson random variable is

P_Poisson(X = k) = e^{−λ} λ^k/k!.

It is easy to verify that the sum of n independent identically distributed Poisson variables is a Poisson variable. Thus, for λ = 1, the central limit theorem implies

lim_{n→∞} Σ_{k ≤ n + x√n} e^{−n} n^k/k! = (1/√(2π)) ∫_{−∞}^{x} e^{−u²/2} du.

Figure 8.3 shows the approach to a normal distribution for the scaled sum of 50 independent Poisson random variables with a unit parameter value; see also Exercise 8.2.
Fig. 8.3. Convergence to the normal distribution of the scaled sum of 50 independent Poisson random variables with unit parameter value.
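The convergence illustrated in Figure 8.3 can be checked numerically (Python with NumPy assumed; the sample size is an arbitrary illustrative value):

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)
n, samples = 50, 100000

# scaled sums of n independent Poisson(1) variables: (S_n - n) / sqrt(n)
S = rng.poisson(1.0, size=(samples, n)).sum(axis=1)
U = (S - n) / np.sqrt(n)

# compare the empirical cumulative distribution with the standard normal one
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    empirical = np.mean(U <= x)
    normal = 0.5 * (1 + erf(x / sqrt(2)))
    print(f"x = {x:+.1f}   empirical {empirical:.4f}   N(0,1) {normal:.4f}")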
The central limit theorem holds under more general conditions. Here is an example¹¹:

Theorem 21. Let (X_j) be a sequence of independent random variables such that, for all j,

⟨X_j⟩ = m_j,   ⟨(X_j − m_j)²⟩ = σ_j²,   ⟨|X_j − m_j|³⟩ < ∞.

If

lim_{n→∞} (Σ_{j=1}^{n} ⟨|X_j − m_j|³⟩) / (Σ_{j=1}^{n} σ_j²)^{3/2} = 0,

then

(Σ_{j=1}^{n} (X_j − m_j)) / (Σ_{j=1}^{n} σ_j²)^{1/2} →d N(0, 1).

¹¹ See, for example, Loève [210].
Remark 23. According to the central limit theorem, the normal distribution is universal in the sense that the distribution of the scaled sum of a large number of random variables subject to some restrictive conditions is a Gaussian random variable. In the averaging process, we are thus losing information concerning the specific random variables. This implies that, for given average value and variance, the normal distribution is the distribution containing minimal information.12
Random variables observed in nature can often be interpreted as sums of many individual mutually independent random variables, which, as a consequence of the central limit theorem, should be normally distributed. Coefficients characterizing the shape of a probability distribution as skewness and kurtosis excess are often used to check possible deviations from normality. These coefficients are, respectively, defined by the scale-invariant expressions13 (X − m1 )3 , σ3 (X − m1 )4 − 3. α4 = σ4
α3 =
They are both equal to zero for normally distributed random variables. The coefficient α3 is a measure of the asymmetry of the probability distribution, and α4 can be, more specifically, used as a measure of departure from Gaussian behavior. In the case of a unimodal probability distribution (i.e., probability distribution with only one relative maximum—the mode), the median m, : defined by FX (m) : = 12 , may give, when it differs from the average value m1 , a measure of skewness (see page 332).14 Example 65. A discrete random variable X that can take the value k = 0, 1, 2, . . . , n with the probability n k Pbinomial (X = k) = p (1 − p)n−k k is a binomial random variable. Its skewness and kurtosis excess are given by 12 13
14
This result can, in fact, be established directly, as shown in Exercise 8.1. The coefficient (X − m1 )4 /σ 4 , which is also used in statistics, is called the kurtosis. On the technical aspects of statistics, the interested reader may consult Sachs [305]. On the mathematical aspects, see Wilks [346].
8.2 A few notions of probability theory
1 − 2p
α3 =
np(1 − p)
and α4 =
323
1 − 6p(1 − p) . np(1 − p)
As n → ∞, both coefficients go to zero. This result follows from the central limit theorem. Since the binomial distribution of parameters n and p is the probability distribution of the sum of n independent identically distributed Bernoulli random variables Xj (j = 1, 2, . . . , n), where, for all j, PBernoulli (Xj = 1) = p
and PBernoulli (Xj = 0) = 1 − p,
the central limit theorem implies15 x 2 n k 1 n−k lim =√ e−u /2 du. p (1 − p) n→∞ k √ 2π −∞ k≤np+x
np(1−p)
The central limit theorem tells us that the sum Sn of n independent identically distributed random variables X1 , X2 , . . . , Xn with finite variance is well-approximated by a Gaussian distribution with mean nm1 and variance nσ 2 when n is large enough. Typically, for a large finite value of n, deviations of Sn from nm1 are of the order of n1/2 , which is small compared to nm1 . Deviations of Sn from nm1 may be much greater. The theory of large deviations studies, in particular, the asymptotic behavior of P (|Sn − nm1 | > an), where a > 0. The following result is often useful16 : Theorem 22. Let (Xj ) be a sequence of independent identically distributed random variables; if the function17 M : t → exp(t(X1 − m1 )) is finite in some neighborhood of the origin, then, if P (X1 − m1 > a) > 0 for a positive a, 1/n lim (P (Sn − nm1 > na)) = e−ψ(a) , n→∞
where Sn = X1 + X2 + · · · + Xn and ψ(a) = − log inf e−at M (t) t>0
is positive. That is, P (Sn − nm1 > na) decays exponentially. For an application of this result, see Exercise 8.3. 15
16 17
This result was first established by de Moivre for p = to any value of p. See Grimmett and Stirzaker [158], p. 184. M is the moment-generating function.
1 2
and extended by Laplace
Fig. 8.4. Probability density function of the lognormal distribution for x0 = e and σ = 1.
8.2.3 Lognormal distribution

The lognormal probability density function, which depends upon two parameters x₀ and σ, is given by

f_lognormal(x) = (1/(x√(2πσ²))) exp(−(log(x/x₀))²/(2σ²)).

That is, if X is a lognormal random variable, log X is normally distributed with ⟨log X⟩ = log x₀ and ⟨(log X)²⟩ − ⟨log X⟩² = σ². The graph of f_lognormal is represented in Figure 8.4. A lognormal distribution results when many random variables cooperate multiplicatively. Let X_i (i = 1, 2, . . . , n) be independent random variables. Since

log Π_{i=1}^{n} X_i = Σ_{i=1}^{n} log X_i,
if the independent random variables log Xi (i = 1, 2, . . . , n) satisfy the conditions of validity of the central limit theorem, the scaled sum of such random variables is a lognormal random variable. The lognormal distribution is used in biology, where, for instance, the sensitivity of an individual to a drug may depend multiplicatively upon many characteristics such as weight, pulse frequency, systolic and diastolic blood pressure, cholesterol level, etc. It is also used in finance, where it is assumed that the increments of the logarithm of the price rather than the change of prices are independent random variables (see page 336). In data analysis, power-law behavior of certain observables is detected using log-log plots. When adopting such a procedure, a lognormal distribution can be mistaken for a power-law distribution over a rather large interval, as illustrated in Figure 8.5. (See Exercise 8.6 for another example of apparent power-law behavior.)
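The multiplicative mechanism is easily illustrated numerically (Python with NumPy assumed; the factor distribution, here uniform on [0.5, 1.5], is an arbitrary positive choice):

import numpy as np

rng = np.random.default_rng(3)
n, samples = 100, 50000

# product of n independent positive factors; its logarithm is a sum of n terms
factors = rng.uniform(0.5, 1.5, size=(samples, n))
products = factors.prod(axis=1)

logs = np.log(products)
print("mean and std of log(product):", logs.mean(), logs.std())
# a normal histogram of log(product) means the product itself is approximately lognormal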
Fig. 8.5. Apparent power-law behavior of the lognormal distribution. The dotted line represents the sequence (log(10,000 n), log f_lognormal(10,000 n)), n = 1, 2, . . . , 1000. The parameters of the lognormal distribution are x₀ = e and σ = −3.873. The slope of the straight line giving the least-squares fit is equal to −3.873.
8.2.4 Lévy distributions

Let X₁ and X₂ be two absolutely continuous random variables and f_{X₁} and f_{X₂} their respective probability densities. If X₁ and X₂ are defined on the same space and are independent, the probability density f_{X₁+X₂} of their sum is given by

f_{X₁+X₂}(x) = ∫_{−∞}^{+∞} f_{X₁}(x₁) f_{X₂}(x − x₁) dx₁.

That is, the probability density f_{X₁+X₂} of the sum of the two independent random variables X₁ and X₂ is the convolution f_{X₁} ∗ f_{X₂} of their probability densities. Since the characteristic function ϕ_X of an absolutely continuous random variable X is the Fourier transform of its probability density, we have

ϕ_{X₁+X₂}(t) = ϕ_{X₁}(t) ϕ_{X₂}(t).

It is straightforward to verify that the characteristic function of a normal (Gaussian) random variable with probability density function

x ↦ (1/(√(2π) σ)) exp(−(x − m)²/(2σ²)),

where m is real and σ positive, is

t ↦ exp(itm − t²σ²/2).
Using this result, we find that the characteristic function of the sum of two independent normal random variables is given by
exp(itm₁ − t²σ₁²/2) exp(itm₂ − t²σ₂²/2) = exp(it(m₁ + m₂) − t²(σ₁² + σ₂²)/2),

which proves that the sum of two independent normal random variables is also a normal random variable. Probability distributions having this property are said to be stable under addition.¹⁸ Paul Lévy [202, 203, 145] addressed the problem of determining the class of all stable distributions. He found that the general form of the characteristic function L̂_{α,β} of a stable probability distribution is such that

log L̂_{α,β}(t) = imt − a|t|^α [1 − iβ (t/|t|) tan(πα/2)],       if α ≠ 1,
log L̂_{α,β}(t) = imt − a|t| [1 − iβ (2/π) (t/|t|) log|t|],       if α = 1,        (8.1)

where 0 < α ≤ 2,¹⁹ a > 0, m is real, and −1 ≤ β ≤ 1. The indices α and β refer, respectively, to the peakedness and skewness of Lévy probability density functions. If β = 0, the Lévy probability density function is symmetric about x = m (see Figure 8.6).
5
Fig. 8.6. Right figure: symmetric L´ evy stable probability density functions for α = 1.5 (lighter grey) and α = 2 (normal distribution). Left figure: tails of both probability density functions.
The particular case α = 2, a = 12 σ 2 , and β = 0 corresponds to the normal distribution. Another simple example of a stable distribution is the Cauchy distribution (also called the Lorentzian distribution). It corresponds to α = 1 and m = β = 0, and its probability density is L1,0 (x) = 18
19
a , π(x2 + a2 )
(a > 0).
Using the characteristic function method, one can easily verify that binomial and Poisson probability distributions are stable under addition. α For α > 2, the Fourier transform of e−a|t| may become negative for some values of x and cannot be an acceptable probability density function.
If m = β = 0, we have

L_{α,0}(x) = (1/(2π)) ∫_{−∞}^{+∞} exp(−itx − a|t|^α) dt = (1/π) ∫_{0}^{∞} e^{−at^α} cos(tx) dt.

Except for α equal to either 1 or 2, there is no simple expression for L_{α,0}(x). It is, however, possible to find its asymptotic expansions for large values of x. For 0 < α < 2, one finds

L_{α,0}(x) ∼ a Γ(1 + α) sin(πα/2) / (π|x|^{1+α}),

where Γ is the Euler function.²⁰ The case α = 2, which corresponds to the normal distribution, is singular with its exponential tail. For 0 < α ≤ 1, the mean value of a Lévy random variable is not defined, while for 1 < α < 2, the mean value is defined but not the variance. The probability that the sum S_n of n independent identically distributed symmetric Lévy random variables lies in the interval ]s_n, s_n + ds_n] is

(1/π) ∫_{0}^{∞} e^{−an|t|^α} cos(t s_n) dt ds_n.

Changing n^{1/α} t in τ yields

(1/π) ∫_{0}^{∞} e^{−a|τ|^α} cos(τ s_n n^{−1/α}) dτ d(s_n n^{−1/α}),

showing that the correct scaled sum is the random variable U_n = S_n n^{−1/α}. Note that for α = 2, which corresponds to the Gaussian case, we obtain the right scaling. The central limit theorem may be restated saying that the Gaussian distribution with zero mean and unit variance is, in the space of probability distributions, the attractor of scaled sums of large numbers of independent identically distributed random variables with finite variance; or, equivalently, scaled sums of large numbers of independent identically distributed random variables with finite variance are in the basin of attraction of the Gaussian distribution with zero mean and unit variance. Is there an equivalent result for scaled sums of independent identically distributed random variables with infinite variance? The answer is yes.

²⁰ For Re z > 0, the Γ function is defined by Γ(z) = ∫_{0}^{∞} t^{z−1} e^{−t} dt.
Consider the random variable

S_n = Σ_{j=1}^{n} X_j,

where the X_j (j = 1, 2, . . . , n) are independent identically distributed random variables. Suppose that their common probability density function f behaves for large x as

f(x) ∼ C₋ |x|^{−(1+α)},   as x → −∞,
f(x) ∼ C₊ x^{−(1+α)},     as x → ∞,

where 0 < α < 2, and let

β = (C₊ − C₋)/(C₊ + C₋).
Then, the probability density function of the scaled sum tends to a L´evy probability density of index α and asymmetry parameter β. The asymptotic power-law behavior of stable L´evy probability distributions makes them good candidates when modeling systems characterized by Pareto’s tails with a Pareto exponent equal to α. If, in a random walk, the jump length probability distribution is a symmetric L´evy distribution, the existence of a long tail implies unfrequent long jumps connecting distant clusters of visited points. This feature is illustrated in Figure 8.7 on Cauchy random walks in one and two dimensions.21 Random walks with a jump length probability distribution given by a L´evy stable distribution are called L´evy flights. 8.2.5 Truncated L´ evy distributions Finite-size effects prohibit the observation of infinitely long tails. If a measured probability distribution looks L´evy-like, since it cannot extend to infinity, there must exist some sort of cutoff. This remark led Mantegna and Stanley [226] to define truncated L´evy probability densities by
CLα,0 (x), if |x| ≤ , T (x) = 0, otherwise, where C is a normalizing factor and a cutoff length. An example of such a distribution is represented in Figure 8.8. For a L´evy random variable X, the probability P (X > ) that X is greater than is a decreasing function of . The truncation therefore eliminates the 21
The probability density function of the Cauchy distribution considered here is x →
1 , π(1 + x2 )
which is the L´evy distribution L1,0 for a = 1.
75 50 25 50
100
150
200
250
-25 -50 -75 -100
Fig. 8.7. Cauchy random walks showing occasional long jumps. Top: onedimensional walk, number of steps: 250. Bottom: two-dimensional walk, dot indicates initial point, number of steps: 100.
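A Cauchy (Lévy) flight such as those of Figure 8.7 can be generated as follows (Python with NumPy assumed; the number of steps is that of the one-dimensional walk in the figure):

import numpy as np

rng = np.random.default_rng(4)
steps = 250

# one-dimensional walk with Cauchy-distributed jump lengths (Levy distribution L_{1,0}, a = 1)
jumps = rng.standard_cauchy(steps)
walk = np.cumsum(jumps)
print("largest single jump:", np.abs(jumps).max())
print("final position:", walk[-1])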
0.5 0.4 0.3 0.2 0.1 -3
-2
-1
1
2
3
Fig. 8.8. A truncated symmetric L´ evy probability density function for α = 1.5 equal to zero for |x| > 2.
possible but rare occurrence of certain events. For very large values of n, the scaled sum of n independent identically distributed truncated L´evy random variables should approach a Gaussian distribution as a consequence of the
330
8 Power-Law Distributions
central limit theorem. But it is clear that the larger is, the slower the convergence will be, since, for a large , the truncation eliminates values of X that, even for sums of nontruncated L´evy random variables, would have had a negligible probability of occurring. There should therefore exist a value n∗ of n such that, for n < n∗ , the scaled sum would approach a L´evy random variable, while, for n > n∗ , the scaled sum would approach a Gaussian random variable. This crossover has indeed been observed by Mantegna and Stanley, who found n∗ to be proportional to α . If, instead of the sharp cutoff of Mantegna and Stanley, one considers with Koponen a smooth cutoff [194]—that is, a probability density of the form
A− e−λx |x|−1−α , if x < 0, TKoponen (x) = if x ≥ 0, A+ e−λx x−1−α , where 0 < α < 2—the expression of its characteristic function can be exactly calculated. This allowed Koponen to prove for sums of independent random variables, whose common probability distribution is TKoponen (x), the existence of the two regimes observed by Mantegna and Stanley. 8.2.6 Student’s t-distribution The t-distribution was introduced in statistics by William Sealy Gosset (1876– 1937), writing under the pseudonym Student [327]. For x ∈ R, its probability density function is given by 1
− 12 (1+µ) (1 + µ) x2 ft (x) = . 1+ 1 µ (πµ) 2 Γ 12 µ Γ
2
In Figure 8.9 are represented the graphs of two√Student’s probability density functions for µ = 1 and µ = 10. Since Γ ( 12 ) = π, for µ = 1, ft (x) coincides −1 with the Cauchy distribution π(1 + x2 ) . The Student’s probability distribution arises in a commonly used statistical test of hypotheses concerning the mean of a small sample of observed values drawn from a normally distributed population when the population standard deviation is unknown (see page 333). It is simple to show that, as µ → ∞, the Student’s distribution tends to the Gaussian distribution with zero mean and unit variance. We have Γ 12 (1 + µ) 1 1 1 − ··· 1− + =√ 1 4µ 32µ2 2π (πµ) 2 Γ 12 µ and
8.2 A few notions of probability theory
331
0.3
0.2
0.1
-4
-2
2
4
Fig. 8.9. Probability density function of the Student t-distribution for µ = 1 (black) and µ = 10 (grey).
x2 log 1 + µ
− 12 (1+µ)
Thus 1 2 1 ft (x) = √ e− 2 x 2π
which proves that
x2 1 = − (1 + µ) log 1 + 2 µ 2 1 x x4 = − (1 + µ) − 2 + ··· 2 µ 2µ x2 x4 − 2x2 + ··· . =− + 2 4µ
1−
1 + ··· 4µ
1+
x4 − 2x2 + ··· 4µ
,
1 2 1 lim ft (x) = √ e− 2 x . 2π
µ→∞
8.2.7 A word about statistics If, for example, we are interested in the distribution of annual incomes of people living in a specific country, we assume that the income is a random variable X taking the observed values x1 , x2 , . . ., where each index refers to a specific individual of the population. In order to estimate the parameters characterizing the probability distribution of X, we have to select a set of observed values {xi | i = 1, 2, . . . , n}, which is said to be a sample of size n.22 22
In mathematical statistics a random sample is a sequence of random variables (X1 , X2 , . . . , Xn ) with a common probability distribution. Such a random sample can be viewed as the result of n successive drawings from an infinite population, the population probability distribution remaining unchanged throughout the drawings.
Assuming that the sample is representative,23 numerical values computed from the sample are called statistics. Here are some examples: 1. the arithmetic mean
n
x= 2. the standard deviation
xi
i=1
n
;
; < n x) has, for both positive and negative tails, a power-law behavior [284, 148]. That is, P (|g| > x) ∼
1 , xα
where, in the region 2 ≤ |g| ≤ 80,
3.10 ± 0.03, for positive g, α= 2.84 ± 0.12, for negative g. The S&P 500 index, defined as the sum of the market capitalizations of 500 companies representative of the US economy, has also been analyzed [149]. The probability P (|g| > x) of the normalized return also exhibits power-law tails characterized by the exponents
2.95 ± 0.07, for positive g, α= 2.75 ± 0.13, for negative g, for ∆t = 1 min, over the 13-year period 1984–1996, and in the region 3 ≤ |g| ≤ 50. This result is surprising since the returns of the S&P 500 are weighted sums 29
There is no unique definition of the volatility. For example, it could also be defined as an average over a time window T = n∆t, i.e., τ +n−1 1 |G∆t (t)|, n t=τ
where n is an integer.
8.3 Empirical results and tentative models
337
of the returns of 500 companies, and, as a consequence of the central limit theorem, one would expect to find that these returns are normally distributed unless there exist significant dependencies between the returns of the different companies. Investigating the power-law behavior for sampling time interval ∆t, it appears that the exponent α ≈ 3 for ∆t varying from a few minutes up to several days. For larger values of ∆t, there is a slow convergence to a Gaussian behavior. There exist many other stock market indices. A study, by the same authors, of the NIKKEI index of the Tokyo stock exchange (for the 14-year period 1984–1997) and the Hang Seng index of the Hong Kong stock exchange (for the 18-year period 1980–1997) shows that, in both cases, the probability P (|g| > x) has similar power-law tails with exponents close to 3, suggesting the existence of a universal behavior for these distributions. In the recent past, most papers on price dynamics have essentially tried to answer the question: How can one approximately determine with good accuracy the probability distribution function of the logarithm of price differences from market data?30 While, for small ∆t, it is clear that this distribution is not Gaussian, it is less clear to decide whether it is a truncated L´evy distribution with an exponential cutoff or a Student’s t-distribution, which, incidentally, are not very easy to distinguish. More recently, the search for plausible mechanisms to explain the observed phenomena and build up models that are not just curve-fitting has been suggested by some authors.31 To conclude this section, we briefly present an example of a time series model widely used in finance.32 One important feature of financial time series is the phenomenon of conditional heteroskedasticity; i.e., the time-dependence of the variance of the price increments. For instance, traders may react to certain news by buying or selling many stocks, and a few days later, when the news is valued more properly, a return to the behavior preceding the arrival of the news may be observed. This could induce a large increase or decrease of the returns followed by the opposite change in the following days; and it may be reasonable to assume that these two abrupt changes are correlated. Let W = {Wt | t ∈ N0 } be a stochastic process, where the Wt are independent identically distributed random variables with zero mean and unit variance, and consider the process X = {Xt | t ∈ N0 } defined by Xt = σt Wt , 30 31
32
2 with σt2 = a0 + a1 Xt−1
(a0 > 0, 0 < a1 < 1 σ0 = 1). (8.2)
Mantegna and Stanley [227], Chapter 8. See Bouchaud and Potters [68], Chapter 2; Lux and Marchesi [218]; Huang and Solomon [175]; and Bouchaud [69]. On time series models, see Franses [129].
Processes such as X are called ARCH processes.33 ARCH processes are specified by the probability density fW of the random variables Wt and the parameters a0 and a1 .34 Since Wt = 0, the unconditional mean value of Xt is equal to zero, and its unconditional variance Xt2 is given by 2 Xt2 = a0 + a1 Xt−1 ,
which shows the autocorrelation of the squared time series. Solving the equa2 tion above for Xt−1 = Xt2 yields Xt2 =
a0 . 1 − a1
Relations (8.2) show that large (resp. small) absolute values of Xt are expected to be followed by large (resp. small) absolute values even while Xt−τ Xt = 0; i.e., the time series is uncorrelated. The ARCH model can therefore describe time series with sequences of data points looking like outliers, and as a conse2 quence of the relation σt2 = a0 + a1 Xt−1 , these outliers will appear in clusters (see Exercise 8.7). Clusters of outliers lead to nonzero kurtosis excess. From (8.2), we have 4 Xt4 = 3(a20 + 2a0 a1 Xt2 + a21 Xt−1 ) a 0 4 + 3a21 Xt−1 . = 3a20 + 6a0 a1 1 − a1 4 Solving for Xt−1 = Xt4 yields
Xt4 =
3a20 (1 + a1 ) , (1 − a1 )(1 − 3a21 )
which shows that the unconditional kurtosis excess is thus given by 6a21 Xt4 −3= 4 2 Xt 1 − 3a21 if 0 < a1 < 3− 2 . Since Wt3 = 0, the skewness of Xt is zero. An interesting property of ARCH processes is that they can dynamically generate power-law tails [285, 286] (see Exercises 8.9 and 8.10). For instance, the probability distribution of the first differences of the S&P 500 index over the 12-year period 01/84–12/95, for ∆t = 1 min, is well-fit by the probability distribution of an ARCH process with fW given by a truncated L´evy probability density determined by the parameters α = 1.4, a = 0.275, and = 8 1
33
34
The acronym ARCH stands for autoregressive conditional heteroskedasticity. Such time series models were initially proposed by Engle [116]. This is actually an ARCH model of order 1. In an ARCH model of order τ , σt2 is 2 2 2 given by a0 + a1 Xt−1 + a2 Xt−2 + · · · + aτ Xt−τ .
8.3 Empirical results and tentative models
339
(see pages 326 and 328 for parameter definitions) and the parameters of the ARCH process a1 = 0.4 and a0 chosen to fit the empirical standard deviation σ = 0.07. The probability density fX appears to exhibit two power-law regimes: a first one characterized by the exponent α1 = 1.4 of the L´evy distribution up to the cutoff length followed by a second one with a dynamically generated exponent α2 = 3.4 for x > . 8.3.2 Demographic and area distribution The demographic distribution of humans on Earth is extremely heterogeneous. Highly concentrated areas (cities) alternate with large extensions of much lower population densities. If s is the population size of a city (number of inhabitants) and r(s) its rank, where the largest city has rank 1, Zipf [355] found empirically that 1 r(s) ∼ , s which, for the frequency as a function of population size, implies f (s) ∼
1 . s2
This result appears to be universal. Statistical studies of 2700 cities of the world having more than 105 inhabitants, 2400 cities of the United States having more than 104 inhabitants, and 1300 municipalities of Switzerland having more than 103 inhabitants give exponents very close to 2. More precisely: 2.03 ± 0.05 for the world (in the range s < 107 ), 2.1 ± 0.1 for the United States, and 2.0 ± 0.1 for Switzerland [351]. Moreover, around huge urban centers, the distribution of the areas covered by satellite cities, towns, and villages obeys the same universal law. Zanette and Manrubia proposed a simple model to account for the exponent −2 characterizing the power-law behavior of the frequency f as a function of population size s [351]. Consider an L × L square lattice and denote by s(i, j; t) ≥ 0 the population size at site (i, j) ∈ Z2L at time t. This system evolves in discrete time steps according to the following two subrules. 1. The one-input rule: ⎧ 1−q ⎪ ⎨ s(i, j; t), with probability p, p s(i, j; t + 12 ) = q ⎪ ⎩ s(i, j; t), with probability 1 − p, 1−p where 0 < p < 1 and 0 ≤ q ≤ 1. 2. A diffusion process in which a fraction a of the population at site (i, j) at time t + 12 is equally divided at time t + 1 between its nearest-neighboring sites.
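A minimal sketch of these two subrules (Python with NumPy assumed; the lattice size, the number of time steps, the parameter values, and the periodic boundary conditions are illustrative choices, not those of the original paper):

import numpy as np

rng = np.random.default_rng(5)
L, steps = 100, 1000
p, q, a = 0.3, 0.1, 0.2          # arbitrary values with 0 < p < 1 and 0 <= q <= 1

s = np.ones((L, L))              # homogeneous initial population

for _ in range(steps):
    # subrule 1: with probability p a site keeps (1-q)/p of its population,
    # otherwise it keeps q/(1-p) of it; the population is conserved on average
    mask = rng.random((L, L)) < p
    s = np.where(mask, (1 - q) / p * s, q / (1 - p) * s)
    # subrule 2: a fraction a of each site is divided equally among its four neighbors
    moving = a * s
    s = s - moving + 0.25 * (np.roll(moving, 1, 0) + np.roll(moving, -1, 0)
                             + np.roll(moving, 1, 1) + np.roll(moving, -1, 1))

print("total population:", s.sum())    # exactly conserved here because of the periodic boundaries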
Note that for such an evolution rule the total population is conserved in the limit L → ∞. Zanette and Manrubia claim that numerical simulations on a 200 × 200 square lattice starting from an initially homogeneous population size distribution (s(i, j; 0) = 1 for all (i, j) ∈ Z²_200), and for different values of the parameters p, q, and a, show that, after a transient of 10^3 time steps, the time average of the size frequency has a well-defined power-law dependence characterized by an exponent α = −2.01 ± 0.01. This result has been questioned by Marsili, Maslov, and Zhang [228], who found that, for the same parameter values, the total population tends to zero as a function of time. In their reply, Zanette and Manrubia [352] argued that extinction is a finite-size effect and that simulations in large systems clearly show a transient in which a power-law distribution builds up before finite-size effects play their role.
8.3.3 Family names
A statistical study of the distribution of family names in Japan shows that, if s denotes the size of a group of individuals bearing the same name, the number n(s) of groups of size s has a decreasing power-law behavior characterized by an exponent equal to 1.75 ± 0.05 in five different Japanese communities [245]. The data were obtained from 1998 telephone directories. Similar results have been obtained from (i) a sample of the United States 1990 census and (ii) all the family names of the Berlin phone book beginning with A. In both these cases, however, the exponent is close to 2 [353]. It has been argued that the discrepancy between these two results could be attributed to transient effects, most Japanese family names having been created in the rather recent past (about 120 years ago) [353]. In order to obtain reliable results, statistical analyses of family name frequency should be carried out on very large population sizes. This is not the case for the two studies above.
8.3.4 Distribution of votes
On October 4, 1998, Brazil held general elections with compulsory vote. The state of São Paulo, with 23,321,034 voters and 1260 local candidates, is the largest electoral group. It has been found [93] that the number of candidates N(v) receiving a fraction v of the votes behaves as v^{−α} with α = 1.03 ± 0.03 over almost two orders of magnitude. The authors claimed that this power-law behavior is also observed for the other Brazilian states having a large enough number of voters and candidates. A similar result (α = 1.00 ± 0.02) holds also for all the candidates for state deputy in the country. (Detailed information on Brazilian elections can be found on the Web site of the Tribunal Superior Eleitoral, http://www.tse.gov.br; for the 1998 elections, go to http://www.tse.gov.br/eleitorado/eleitorado98/index.html.) A similar analysis was carried out for the 2002 elections [94]. In spite of a new rule concerning alliances between parties approved by the National
Congress, the same power law was observed for the number of candidates N(v) receiving a fraction v of the votes. It should be noted that, while a scale-free behavior is found for elections to the National Congress and State Houses, there is no evidence of such a behavior for the municipal elections held in 2000. According to the authors, this discrepancy may be due to the fact that, in municipal elections, candidates are much closer to voters, while candidates for the National Congress or State Houses are usually only known through the media. A possible explanation of the power-law behavior, again according to the authors, is to assume that the success of a candidate is the result of a multiplicative process in which each factor "should be related to attributes and/or abilities of the candidate to persuade voters." Assuming that the corresponding random variables are independent and in large number, the central limit theorem implies that the distribution of votes should be lognormal, which can be mistaken for a power-law distribution over a few orders of magnitude (see page 324; indeed, playing with the two parameters of the lognormal distribution, it is not difficult to adjust the value of the exponent of the apparent power-law behavior).
8.4 Self-organized criticality
In the late 1980s, Bak, Tang, and Wiesenfeld [24, 25, 27] reported "the discovery of a general organizing principle governing a class of dissipative coupled systems." These systems evolve naturally towards a critical state, with no intrinsic time or length scale, in which a minor event starts a chain reaction that can affect any number of elements in the system. This self-organized criticality therefore would be an explanation of the occurrence of power-law behaviors. In a recent book [29], Bak even argues that self-organized criticality "is so far the only known general mechanism to generate complexity."
8.4.1 The sandpile model
Consider a finite L × L square lattice. The state of a cell is a nonnegative integer representing the number of sand grains stacked on top of one another in that cell. In the initial state, each cell contains a random number of sand grains equal to 0, 1, 2, or 3. Then, to cells sequentially chosen at random, we add one sand grain. When the number of sand grains in one cell becomes equal to 4, we stop adding sand grains, remove the four grains, and equally distribute them among the four nearest-neighboring cells. That is, the number of sand grains in the cell that reached the threshold value 4 at time t becomes equal to 0 at time t + 1, and the number of sand grains in the four nearest-neighboring cells increases by one unit. This process is repeated as long as
the number of sand grains in a cell reaches the threshold value. (A similar transition rule defined on a one-dimensional lattice was first studied by J. Spencer [324]; for exact results on one-dimensional sandpile models, see E. Goles [146].) When the number of sand grains in a boundary cell reaches the value 4, the same process applies but the grains "falling off" the lattice are discarded. A sequence of toppling events occurs when nearest-neighboring cells of a cell reaching the threshold value contain three sand grains. Such a sequence is called an avalanche. An avalanche is characterized by its size (that is, the number of consecutive toppling events) and its duration, equal to the number of update steps. An avalanche of size 6 and duration 4 is shown in Figure 8.11.
Fig. 8.11. An example of an avalanche of size 6 and duration 4. Cells occupied by n sand grains, where n is equal to 0, 1, 2, 3, and 4 are, respectively, white, light grey, medium grey, dark grey, and black. The system evolves from left to right and top to bottom.
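The toppling rule is easy to experiment with. The sketch below records the size of every avalanche; it follows the rule described above (random sequential driving, grains lost at the open boundary), while the lattice size and the number of added grains are arbitrary choices made here for illustration.

```python
import numpy as np

def sandpile_avalanche_sizes(L=50, grains=200_000, seed=0):
    """Drive the L x L sandpile and record the size of each avalanche."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 4, size=(L, L))        # initial heights 0, 1, 2, or 3
    sizes = []
    for _ in range(grains):
        i, j = rng.integers(0, L, size=2)
        z[i, j] += 1                            # add one grain at a random cell
        size = 0
        while (z >= 4).any():                   # relax until every cell is below threshold
            for i, j in np.argwhere(z >= 4):
                z[i, j] -= 4
                size += 1                       # one toppling event
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < L and 0 <= nj < L:
                        z[ni, nj] += 1          # grains falling off the edge are lost
        if size:
            sizes.append(size)
    return np.array(sizes)
```

A log-log histogram of the returned sizes approximates the avalanche-size distribution N(s) discussed below.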
The sandpile model illustrates the basic idea of self-organized criticality. For a large system, adding sand grains leads at the beginning to small avalanches. But as time increases, the system reaches a stationary state where the amount of added sand is balanced by the sand leaving the system along the boundaries. In this stationary state, there are avalanches of all sizes up to the size of the entire system. Numerical simulations show that the number N(s) of avalanches of size s behaves as s^{−τ}, where, in two-dimensional systems, the exponent τ is close to 1. Careful experiments carried out on piles of rice [130] have shown that self-organized criticality
is not as 'universal' and insensitive to the details of a system as was initially supposed, but that instead its occurrence depends on the detailed mechanism of energy dissipation. Power-law behaviors were only observed for rice grains of elongated shape. The authors of this experimental study were then led to reanalyze previous experiments on granular systems. They found that the flow over the rim of the pile had a stretched-exponential distribution that is definitely incompatible with self-organized criticality. A distinctive observable consequence of self-organized critical dynamics, as illustrated by the sandpile model, is that the distribution of fluctuations or avalanche size has a power-law behavior. But, as we shall discover, the converse is not necessarily true.
8.4.2 Drossel-Schwabl forest fire model
A large number of numerical studies have been dedicated to the analysis of this model. Originally, a forest fire model that should exhibit self-organized criticality had been proposed by Bak, Chen, and Tang [26]. Their model is a three-state cellular automaton. State 0 represents an empty site, state 1 a green tree, and state 2 a burning tree. At each time step, a green tree has a probability p of growing at an empty site, a green tree becomes a burning tree if there is at least one burning tree in its von Neumann neighborhood, and sites occupied by burning trees become empty. As stressed by Grassberger and Kantz [155], for finite p, "the model is in the universality class of the epidemic model with recovery which is isomorphic to directed percolation." Numerical simulations on rather small lattices led Bak, Chen, and Tang to conjecture that the system exhibits self-organized criticality in the limit p → 0, i.e., when the time for a tree to burn is much smaller than the average time to wait for a green tree to grow at an empty site. Simulations on much larger lattices and for much smaller values of p performed by Grassberger and Kantz [155] and Moßner, Drossel, and Schwabl [250] provided evidence that the system does not exhibit self-organized criticality. In two dimensions, starting from a random distribution of trees (green and burning), for small values of p, after a number of iterations of the order of 1/p, the fire propagates along rather regular fronts, as illustrated in Figure 8.12.
Fig. 8.12. Forest fire model of Bak, Chen, and Tang. Lattice size: 200; initial green tree density: 0.3; initial burning tree density: 0.01; tree growth rate p = 0.05; number of iterations: 40. Color code: burning trees are black, green trees are grey, and empty sites are light grey.
Following Drossel and Schwabl [109], for the model to become critical, a mechanism allowing small forest clusters to burn has to be included. This can be done by introducing the following extra subrule: A green tree with no burning tree in its von Neumann neighborhood has a probability f (with f ≪ p) of being struck by lightning and becoming a burning tree. This model involves three time scales: the fire propagation rate from burning trees to green neighbors, the growth rate of green trees, and the lightning rate (1 ≫ p ≫ f). As emphasized by Grassberger [157]:
The growth of trees does not lead to a state which is inherently unstable (as the addition of sand grains to the top of the pile does), but only to a state susceptible to being burnt. Without lightning, the tree density in any patch of forest can go far beyond criticality. When the lightning finally strikes, the surviving trees have a density far above critical. Actually, Grassberger showed [156] that the "forest" consists of large patches of roughly uniform density, most of which are either far below or far above the critical density for spreading. Since these patches occur with all sizes, fires also have a broad distribution of sizes. Grassberger confirmed the criticality of the Drossel-Schwabl model, but he found critical exponents in disagreement with those obtained by Drossel and Schwabl:
1. If s is the number of trees burnt in one fire, in the limit p/f → ∞, the fire size distribution behaves as s^{1−τ} with τ = 2.15 ± 0.02.
2. For a finite value of p/f, the power-law behavior above is cut off at s_max ∼ (p/f)^λ with λ = 1.08 ± 0.02.
3. The average cluster radius ⟨R²⟩^{1/2} scales as (p/f)^ν with ν = 0.584 ± 0.01.
In their paper, Drossel and Schwabl [109], using mean-field arguments, found τ = 2, λ = 1, and ν = 0.5. Their numerical simulations apparently confirmed these results, but they were carried out for rather small values of p/f. This, however, is not the end of the story. Following recent studies [308, 279, 309], which introduced corrections to the scaling laws above, Grassberger [157] presented numerical simulations on very large systems (65,536 × 65,536) with helical boundary conditions, large p/f values (256,000), long transients (approximately 10^7 lightnings), and a very large number of fires (between 9.3 × 10^6 for p/f = 256,000 and 10^9 for p/f ≤ 250). While confirming previous results obtained for much smaller lattice sizes and p/f values, the essential conclusion of this study is that, without ambiguity, previous results do not describe the true critical behavior of the Drossel-Schwabl forest fire model. Still worse, according to Grassberger, his own simulations do not reach the true asymptotic regime. Most probably, all proposed scaling laws are just transients, and there seems to be no indication of any power laws. These results are to be taken as a warning, especially when dealing with real systems [157]: The situation becomes even worse when going to real life phenomena. It does not seem unlikely that many of the observed scaling laws are just artifacts or transients. Problems of spurious scaling in models which can be simulated with very high precision such as the present one should be warnings that not every power law supposedly seen in nature is a real power law.
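Readers who wish to probe such claims themselves can start from a minimal sketch of the Bak-Chen-Tang rule with the Drossel-Schwabl lightning subrule added. Periodic boundaries and the parameter values below are assumptions made for illustration; the asymptotic regime discussed by Grassberger requires far larger lattices and p/f ratios than any desktop run.

```python
import numpy as np

EMPTY, TREE, FIRE = 0, 1, 2

def forest_fire_step(grid, p, f, rng):
    """One synchronous update: burn out, spread fire, strike lightning, grow trees."""
    new = grid.copy()
    new[grid == FIRE] = EMPTY                          # burning trees become empty
    burning_nbr = np.zeros_like(grid, dtype=bool)
    for axis, shift in ((0, 1), (0, -1), (1, 1), (1, -1)):
        burning_nbr |= np.roll(grid == FIRE, shift, axis=axis)
    trees = grid == TREE
    new[trees & burning_nbr] = FIRE                    # fire spreads to green neighbors
    new[trees & ~burning_nbr & (rng.random(grid.shape) < f)] = FIRE   # lightning
    new[(grid == EMPTY) & (rng.random(grid.shape) < p)] = TREE        # growth
    return new

rng = np.random.default_rng(1)
grid = rng.choice([EMPTY, TREE], size=(200, 200), p=[0.7, 0.3])
for _ in range(10_000):
    grid = forest_fire_step(grid, p=0.01, f=0.00005, rng=rng)
```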
8.4.3 Punctuated equilibria and Darwinian evolution In the foreword to the Canto edition of John Maynard Smith’s book on the theory of evolution [236], Richard Dawkins tells us that: Natural selection is the only workable explanation for the beautiful and compelling illusion of ‘design’ that pervades every living body and every organ. Knowledge of evolution may not be strictly useful in everyday commerce. You can live some sort of life and die without ever hearing the name of Darwin. But if, before you die, you want to understand why you lived in the first place, Darwinism is the one subject that you must study. This book is the best general introduction to the subject now available. And, in the introduction of this highly recommended book, John Maynard Smith so presents Eldredge and Gould’s theory of punctuated equilibria [115, 325]: A claim of which perhaps too much has been made concerns ‘punctuated equilibria’. Gould and Eldredge suggested that evolution has not proceeded at a uniform rate, but that most species, most of the time, change very little, and that this condition of ‘stasis’ is occasionally interrupted by a rapid burst of evolution. I do not doubt that this picture is sometimes, perhaps often, true. My difficulty is that I cannot see that it makes a profound difference to our view of evolution.
Punctuated equilibrium behavior, with its intermittent bursts interrupting long periods of stasis, is very similar to self-organized critical behavior. Bak and Sneppen [28] proposed a model of biological evolution of interacting species that exhibits co-evolutionary avalanches. Consider a onedimensional lattice of size L, with periodic boundary conditions. Each lattice site represents a species, and its state is a random number between 0 and 1 that measures the evolutionary barrier of the species. At each time step, the system—called an ecology—is updated by locating the species with the lowest barrier and mutating it by assigning a new random barrier to that species. At the same time, the left and right neighbors are also assigned new random barriers.
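A sketch of this update rule is given below, with the same setup as in Figure 8.13 (200 species, 20,000 updates, periodic boundary conditions); implementation details beyond the rule stated above are choices made here.

```python
import numpy as np

def bak_sneppen(n_species=200, steps=20_000, seed=0):
    """Count how many times each species is mutated under the Bak-Sneppen rule."""
    rng = np.random.default_rng(seed)
    barriers = rng.random(n_species)
    mutations = np.zeros(n_species, dtype=int)
    for _ in range(steps):
        k = barriers.argmin()                      # species with the lowest barrier
        for j in (k - 1, k, (k + 1) % n_species):  # it and its two neighbors mutate
            barriers[j] = rng.random()             # negative index wraps around
            mutations[j] += 1
    return mutations

counts = bak_sneppen()   # compare with Figure 8.13 after discarding a transient
```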
Fig. 8.13. Number of mutations of each species. The number of species is 200. The system has been updated 20,000 times, and the first 10,000 iterations have been discarded.
Figure 8.13 shows that some species do not change at all while others change many times. Flyvberg, Sneppen, and Bak [128] gave a mean-field theory of a slightly different model in which, instead of the three barriers above (that of the species with the lowest barrier and those of its two nearest neighbors), the random barriers of K species are modified: the barrier of the species with the lowest barrier and the barriers of K − 1 other species chosen at random. In the limit N → ∞, the authors found that avalanches come in all sizes, and the larger ones are distributed according to a power law with an exponent equal to −3/2. Actually this model has been solved exactly [63], and the mean-field exponent was found to be exact.
Analyzing the distribution of distances between successive mutations, Bak and Sneppen found a power-law behavior indicating that the ecology has organized into a self-critical state. 8.4.4 Real life phenomena Here are presented some observed phenomena that may provide illustration of the idea of self-organized criticality. Example 66. Mass extinctions. Most species that ever lived on Earth are extinct. Recently, several well-documented mass extinctions have been analyzed, and the interpretation of the available data has generated many controversies. Analyzing the extinction record of the past 250 ma (millions of years), Raup and Sepkoski [294] discovered, using a variety of tests, a periodicity of 26 ma. Discarding the existence of purely biological or earthbound cycles, the authors favor extraterrestrial causes, such as meteorite impacts due to the passage of the solar system through the spiral arms of the Milky Way, as a possible explanation of periodic mass extinctions. More recently, the existence of power-law functional forms in the distribution of the sizes of extinction events has been put forward in favor of a self-organized criticality mechanism [317, 319, 320]. In particular, it has been reported [320] that many time series are statistically self-similar with 1/f power spectra. These findings, according to the authors, support the idea that a nonlinear response of the biosphere to perturbations provides the main mechanism for the distribution of events. These results have been criticized by Kirchner and Weil [189], who claimed that the apparent self-similarity and 1/f scaling are artefacts of interpolation methods—the analyzed time series containing actually much more interpolated (86%) than real data. On the other hand, in a detailed paper on mass extinction, Newman [260] reviewed the various models based on self-organized criticality and analyzed the available empirical evidence. Using a simple model he previously used in collaboration with Sneppen [259, 318] to study earthquake dynamics, he showed that self-organized criticality is not the only mechanism leading to power laws. Newman’s model assumes that the ultimate cause for the extinction of any species is environmental stress of any kind such as climate change or meteorite impact. His model may be described as follows. Consider a system of N species, and assign to species i a random number xi from a probability distribution with a given density fth (x), which, for example, may be uniform on the interval [0, 1]. The number xi , called the threshold level of species i, represents the amount of stress above which it becomes extinct. Then, at each time step, the system evolves according to two subrules: 1. Select a random number η, called the stress level, from a probability distribution with a given density fst (η). Assuming that small stresses are
more common than large ones, the probability density is chosen to fall off away from zero. All species whose threshold level is less than η become extinct. Extinct species are replaced by an equal number of new species to which are assigned threshold levels distributed according to fth (x). The number of extinct species s is the size of the avalanche taking place at this time step. 2. A small fraction f of all species, chosen at random, are assigned new random threshold levels. Although this system never becomes critical in the sense of possessing longrange spatial correlations, it does organize into a stationary state characterized by avalanches with a power-law size distribution. If fst (x) is a Gaussian distribution with zero mean and standard deviation σ = 0.1, it is found that the distribution of the size of extinction events has a power-law behavior, over many orders of magnitude, with an exponent equal to 2.02 ± 0.02. Moreover, for a variety of stress probability distributions—e.g., Cauchy or exponential— the exponent varies slightly but remains close to 2, in agreement with fossil data. In self-organized critical models, the system is slowly driven towards some instability and the relaxation mechanism is local. In Newman’s model, on the contrary, the system is not slowly driven to produce intermittent bursts and there are no (local) interactions between the species. Example 67. Earthquakes. Earthquakes are the result of seismic waves produced when some sort of stored energy is suddenly released. Earthquakes are classified in magnitude according to the Richter scale [298]. Essentially, the magnitude number is the logarithm of the maximum seismic wave amplitude measured at a certain distance from the epicenter.38 Earthquakes having a magnitude equal to 8 or more are extremely devastating. For example, the San Francisco earthquake of April 18, 1906, had a magnitude of 8.25, and was felt along the west coast from Los Angeles in the south to Coos Bay in southwestern Oregon in the north. In the late 1960s, the plate tectonics model was developed to explain seismic patterns. According to this theory, the lithosphere (i.e., the Earth’s upper shell) consists of nearly a dozen large slabs called plates. These plates move relative to each other and interact at their boundaries, where most faults lie. The majority of earthquakes are the result of movements along faults. The San Andreas fault, which extends northwest from Southern California for about 1000 km, results from the abutment of two major plates. The San Francisco earthquakes of 1906 and 1989 were both caused by movement along this fault. The frequency of shallow earthquakes as a function of their magnitude obeys the Gutenberg-Richter law [161], which states that the logarithm of the frequency of shocks decreases linearly with magnitude. That is, 38
The epicenter is the point on the surface of Earth that is directly above the source of the earthquake, called the focus.
log10 N (M > m) = a − bm, where the left-hand side is the base 10 logarithm of the number of earthquakes whose magnitude M is larger than m. Data collected [30] from the Web site of the Southern California Earthquake Data Center 39 show that for a total number of 335,076 earthquakes recorded during the period 1984–2000, the exponent b is equal to 0.95 for magnitudes larger than 2. Most of the various models that have been proposed to account for the Gutenberg-Richter law are based on the idea of self-organized criticality. One popular model has been suggested by Olami, Feder, and Christensen [273].40 Consider a two-dimensional square array of blocks in which each block is connected to its four nearest neighbors by springs of elastic constant K. Additionally, each block is connected to a single rigid driving plate by another set of springs of elastic constant KL as well as connected frictionally to a fixed rigid plate. The blocks are driven by the relative movement of the two rigid plates. When the force on one of the blocks is larger than some threshold value Fth , the block slips to a zero-force position. The slip of one block redefines the forces on its nearest neighbors, and this may result in further slips, starting an avalanche of slips. If ∆i,j is the displacement of block (i, j) from its zero-force position, the total force exerted by the springs on block (i, j) is Fi,j = K(4∆i,j − ∆i−1,j − ∆i+1,j − ∆i,j−1 − ∆i,j+1 ) + KL ∆i,j . If v is the velocity of the moving plate relative to the fixed plate, the total force on a given block increases at a rate proportional to KL v until the force on one block reaches its threshold value and triggers an earthquake. This mechanism is referred to as a stick-slip frictional instability.41 The redistribution of forces after a local slip of block (i, j) is such that Fi,j → 0,
Fi±1,j → Fi±1,j + [K/(4K + KL)] Fi,j,   Fi,j±1 → Fi,j±1 + [K/(4K + KL)] Fi,j.
The parameter α = K/(4K + KL) varies between 0 and 0.25. The system evolves in time according to the following rules.
1. Consider an L × L array of blocks satisfying a zero-force condition on the boundary.
2. Initialize forces on blocks to random values between 0 and Fth.
3. If, for a pair (i, j), Fi,j > Fth, then redistribute the force Fi,j on the neighboring blocks as indicated above. An earthquake is taking place.
4. Repeat step 3 until the earthquake is fully evolved.
5. Locate the block on which the exerted force is maximum. If Fmax is the value of this force, increment the force exerted on each block by Fth − Fmax and repeat step 3.
39. http://www.scecdc.scec.org/.
40. A purely deterministic one-dimensional version of this model had been studied by Carlson and Langer [78]. Both models are variants of the Burridge-Knopoff model [74].
41. The earthquake is the "slip," and the "stick" is the period between seismic events during which elastic strain accumulates.
The numerically determined probability distribution of the size of the earthquakes (i.e., the total number of relaxations, which is proportional to the energy released during an earthquake) is found to exhibit self-organized criticality for a wide range of values of the parameter α. (The energy E released during the earthquake is given in terms of the magnitude m by the relation log10 E = c + dm, where c and d are positive constants; from the Gutenberg-Richter law, the probability for an earthquake to have an energy larger than E therefore behaves as E^{−B}, where B = b/d. In the modern scientific literature, an earthquake size is measured by its seismic moment M0, defined as GAu, where G is the shear modulus, A the rupture area, and u the mean slip.) The exponent values depend upon α. Lattice sizes used in the computer simulations were rather small: from 15 to 50. As stressed by the authors, it is important to note that in this model, in contrast with the original Bak-Tang-Wiesenfeld model, only a fraction of the transported quantity is dissipated in each relaxation event: for KL > 0, the redistribution of forces is nonconservative. This fraction is controlled by the parameter α. Soon after the publication of the paper of Olami, Feder, and Christensen, Klein and Rundle [190], noticing that the behavior, as a function of the lattice size, of the exponential cutoff observed by the authors could not be correct, considered the observed critical behavior suspect. A few years later, de Carvalho and Prado [80], revisiting the Olami-Feder-Christensen model [273], found, in contrast with its authors and a subsequent study by Middleton and Tang [242], that it can be critical only in the conservative case, i.e., for α = 0.25. Christensen, Hamon, Jensen, and Lise [88], repeating the simulations of de Carvalho and Prado with the same parameter values, claimed that the numerical results did not really exclude that the Olami-Feder-Christensen model remains critical also in the nonconservative case. In their reply, de Carvalho and Prado [81] agreed that, on numerical evidence, a true critical behavior in the nonconservative case could not be ruled out, but they thought that it was unlikely. Recently, extensive numerical simulations [207, 208] have shown that the model displays scaling behavior on rather large square lattices (512 × 512). The probability density function of earthquake sizes obeys the Gutenberg-Richter law with a universal exponent τ = 1.8 (equal to 1 + B, where B is the exponent characterizing the power-law behavior of the cumulative distribution function) that does not depend upon the parameter α. It has been found, however, that the probability distribution of earthquakes initiated close to the boundary of the system does not obey
finite-size scaling (this feature was already apparent in the numerical results of Olami, Feder, and Christensen, and it is the reason why Klein and Rundle [190] cast doubt on the critical behavior). On the other hand, numerical investigations of the Olami-Feder-Christensen model on a random graph [209] show that, for an average vertex degree equal to 4, there is no criticality in the system, as the cutoff in the probability distribution of earthquake sizes does not scale with system size. On the contrary, if each vertex has the same degree (a weaker disorder), the system does exhibit criticality and obeys finite-size scaling with universal critical exponents that are different from those of the lattice model. Self-organized criticality is not the only mechanism that may explain the scale-free Gutenberg-Richter law. The Newman-Sneppen model [259, 318], described as a model of mass extinction on page 347, may be used as an earthquake model by replacing the word 'species' by 'point of contact' in a subterranean fault. Choosing an exponential distribution for fst(x), the exponent of the size distribution is found to be equal to 1 + B = 1.84 ± 0.03. In order to investigate possible connections with a spatially organized system, Newman and Sneppen [259] implemented their model on a lattice and, at each time step, they eliminated not only those agents whose stress thresholds fall below the stress level but also their neighbors. As for noninteracting systems, they found an avalanche power-law distribution with an exponent close to 2. To conclude this discussion on earthquake models, let us just mention another, completely different model [83]. Its essential ingredient is the fractal nature of the solid-solid contact surfaces involved in the stick-slip mechanism. The authors take the total contact area between the two surfaces to be proportional to the elastic strain energy accumulated during the sticking period, as the solid-solid friction force due to interactions between asperities increases. This energy is released as one surface slips over the other. Assuming that the fractal dimension df of the surfaces in contact is less than 2, the exponent B = b/d of the Gutenberg-Richter law is found to be approximately equal to 1/df.
Example 68. Rainfall. Conventional methods of rain measurement are based on the idea of collecting rain in a container and measuring the amount of water after a certain time. Recently, high-resolution data have been collected by the Max-Planck Institute for Meteorology (Hamburg, Germany). Data collected at 250 meters above sea level with a one-minute resolution have been analyzed [281]. The authors define a rain event as a sequence of successive nonzero rain rates (i.e., two rain events are separated by at least one minute). The size of a rain event is the height, in millimeters, of the rain collected in a water column during the rain event. The drought duration is the time, in minutes, between two consecutive rain events. They found that the number N(M) of rain events of size M per year and the number N(D) of drought periods of duration D per year have the power-law behaviors
N(M) ∼ M^{−1.36}  and  N(D) ∼ D^{−1.42},
which extend over at least three orders of magnitude. These power-law behaviors are, according to the authors, consistent with a self-organized critical process. As for a sandpile to which sand grains are constantly added, the atmosphere is driven by a slow and constant energy input from the Sun, leading to water evaporation. The energy is thus stored in the form of clouds and suddenly released in bursts when the vapor condenses to water drops. The power-law distribution N(M) ∼ M^{−1.36} is the analog of the power-law distribution of the avalanche size of the sandpile model.
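Several of the exercises and solutions below estimate a tail exponent "using the method described on page 334". That method is presumably the rank-ordering construction of the empirical cumulative distribution; under that assumption, a minimal recipe reads as follows (the tail fraction kept for the fit is an arbitrary choice).

```python
import numpy as np

def tail_exponent(sample, tail_fraction=0.02):
    """Fit the exponent of the positive tail of the empirical CCDF on a log-log rank plot."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    ccdf = 1.0 - np.arange(1, n + 1) / (n + 1.0)       # empirical P(X > x)
    k = max(int(n * tail_fraction), 10)                 # keep only the largest values
    slope, _ = np.polyfit(np.log(x[-k:]), np.log(ccdf[-k:]), 1)
    return slope                                         # exponent of the cumulative distribution

# Example: a Student's t sample with 2 degrees of freedom has a CCDF tail close to x^{-2}
rng = np.random.default_rng(0)
print(tail_exponent(rng.standard_t(2, size=100_000)))
```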
Exercises
Exercise 8.1 (i) Show that, among all continuous probability density functions f defined on R subject to the conditions
\int_R x f(x) dx = m,   \int_R (x − m)^2 f(x) dx = σ^2,
the normal distribution maximizes the entropy defined as
S(f) = −k \int_R f(x) log f(x) dx,
where k is a constant depending upon the entropy unit. (ii) What kind of auxiliary condition(s) should be imposed so that an a priori interesting probability density function f would maximize the entropy? Consider the case f(x) = a/[π(a^2 + x^2)], where x ∈ R.
Exercise 8.2 The exponential probability density function is defined, for x ≥ 0, by
f_exp(x) = λ e^{−λx}   (λ > 0).
(i) Find the probability density function of the sum of n independent identically distributed exponential random variables. (ii) Show that, as n → ∞,
\int_0^{n + x√n} [u^{n−1}/(n − 1)!] e^{−u} du → (1/√(2π)) \int_{−∞}^{x} exp(−u^2/2) du.
(iii) Illustrate the approach to the Gaussian distribution of the scaled sum of n independent identically distributed exponential random variables for two values of n, one somewhat small and the other one sufficiently large for the two distributions to be reasonably close in an interval of a few units around the origin.
Exercise 8.3 Consider a sequence of independent identically distributed Bernoulli random variables (X_j) such that, for all j,
P(X_j = −1) = 1/2   and   P(X_j = 1) = 1/2.
Use Theorem 22 to show that, if 0 < a < 1 (this example is taken from G. R. Grimmett and D. R. Stirzaker [158], p. 187, where a few more examples can be found),
lim_{n→∞} (P(S_n > an))^{1/n} = 1/√((1 + a)^{1+a} (1 − a)^{1−a}).
Exercise 8.4 If (xj )1≤j≤n is a sequence of observed values of a random variable X, in order to limit risk, one is often interested in finding the probability distributions of minimum and maximum values. (i) Consider a sequence of n independent identically distributed random variables (Xj )1≤j≤n ; knowing the common cumulative distribution function FX of these random variables, determine the cumulative distribution functions FYn and FZn of the random variables Yn = min(X1 , X2 , . . . , Xn )
and
Zn = max(X1 , X2 , . . . , Xn ).
(ii) Assuming that the n independent random variables have a common Cauchy probability density function centered at the origin and unit parameter, find the limit
lim_{n→∞} P(Z_n > nz/π).
Exercise 8.5 In statistical physics, the Boltzmann-Gibbs distribution applies to large systems in equilibrium with an infinite reservoir of energy. In this case, the system is said to be in thermal equilibrium, which means that its temperature is constant and its average energy is fixed. Claiming that in a closed economic system the total amount of money is conserved, Drăgulescu and Yakovenko [106] argue that, in a large system of N economic agents, money plays the role of energy and the probability that an agent has an amount of money between m and m + dm is p(m)dm with p(m) of the form C e^{−m/T}, where C = 1/T is a normalizing factor and T an effective "temperature." (i) Assuming that m can take any nonnegative value, derive this expression of p(m). What is the meaning of the "temperature" T? (ii) What is the form of p(m) if m values are restricted to the interval [m1, m2]? Discuss, in this case, the sign of T.
Exercise 8.6 Show that a power law with an exponential cutoff distribution of the form x^{−α} (1 − ε)^{−x}, where ε is a small positive number, can be mistaken for a pure power-law distribution.
Exercise 8.7 (i) In order to show that the ARCH model can describe time series with sequences of data points looking like outliers, generate a time series from the ARCH process defined by (8.2) with independent identically distributed Gaussian random variables W_t with zero mean and unit variance, i.e., for all t > 0, W_t = N(0, 1). (ii) Determine the different statistics (mean, variance, skewness, kurtosis excess) for the generated time series X_t.
Exercise 8.8 (i) Use the method described on page 334 to obtain an approximate graph of the cumulative distribution function from a random sample drawn from a Student's t-distribution for µ = 2.
(ii) Consider a rather large random sample to determine numerically the exponent characterizing the power-law behavior of the tail of the Student's t-distribution.
Exercise 8.9 Study numerically the positive tail of the probability distribution of a random sequence generated from the ARCH process of Exercise 8.7.
Exercise 8.10 The ARCH process of order τ is defined by the relations
X_t = σ_t W_t,   with   σ_t^2 = a_0 + a_1 X_{t−1}^2 + a_2 X_{t−2}^2 + · · · + a_τ X_{t−τ}^2,
where a_0 > 0 and a_j ≥ 0 for j = 1, 2, . . . , τ. In some applications, it appears that the order τ takes a high value, and one therefore has to estimate many parameters. It would be convenient to have a simpler model involving fewer parameters. The GARCH model, proposed by Bollerslev [66], involves only three parameters. It is defined by the relations
X_t = σ_t W_t,   with   σ_t^2 = a_0 + a_1 X_{t−1}^2 + b_1 σ_{t−1}^2,   (8.3)
where the Wt are independent identically distributed Gaussian random variables with zero mean and unit variance, a0 , a1 , and b1 are positive, and a1 + b1 < 1. (i) Show that the GARCH model may describe time series with data points looking like outliers. (ii) Study numerically the positive tail of the probability distribution of a random sequence generated from a GARCH process.
Solutions
Solution 8.1 (i) The probability density function f that maximizes the entropy
S(f) = −k \int_R f(x) log f(x) dx
and satisfies the conditions
\int_R f(x) dx = 1,   \int_R x f(x) dx = m,   \int_R (x − m)^2 f(x) dx = σ^2,
is the solution to the equation
(δ/δf) [ k \int_R f(x) log f(x) dx + λ_1 \int_R f(x) dx + λ_2 \int_R x f(x) dx + λ_3 \int_R (x − m)^2 f(x) dx ] = 0,
where λ1 , λ2 , and λ3 are Lagrange multipliers to be determined from the conditions above and δ/δf denotes the functional derivative with respect to the function f . Taking the derivative yields k(log f (x) + 1) + λ1 + λ2 x + λ3 (x − m)2 = 0.
This shows that the function f should be proportional to e^{−λ_2 x − λ_3 (x−m)^2}, and expressing that it has to satisfy the three conditions above yields
f(x) = (1/(σ√(2π))) exp(−(x − m)^2/(2σ^2)).
f is the probability density function of the normal distribution. Its entropy is equal to
S_normal = (k/2)(1 + log 2π) + k log σ = k(1.41894 . . . + log σ).
To verify that the entropy of the normal distribution is maximum, it suffices to consider any other probability density with the same mean value m and standard deviation σ and check that its entropy is smaller. The entropy of the symmetric exponential distribution
(1/(σ√2)) exp(−(√2/σ)|x − m|)
is
S_exponential = k(1 + (1/2) log 2 + log σ) = k(1.34657 . . . + log σ) < S_normal.
Since the entropy measures the lack of information, the central limit theorem shows that averaging a large number of independent identically distributed random variables leads to a loss of information about the probability distribution of the random variables.
(ii) The probability density function f that maximizes the entropy
S(f) = −k \int_R f(x) log f(x) dx
subject to the n auxiliary conditions
\int_R F_j(x) f(x) dx = c_j   (j = 1, 2, . . . , n),
with F1 (x) = 1 (normalization of f ), should verify k(log f (x) + 1) + λ1 + λ2 F2 (x) + · · · + λn Fn (x) = 0, where λ1 , λ2 , . . . , λn are Lagrange multipliers. If we want f to be the Cauchy probability density function, the equation above becomes k(log(a/π) + 1) + λ1 − k log(a2 + x2 ) + λ2 F2 (x) + · · · + λn Fn (x) = 0. Since all the moments of the Cauchy distribution are infinite, we cannot expect to satisfy this equation expanding log(a2 + x2 ) and writing down an infinite number of conditions for all moments of even order. The only possibility is to choose λ1 to cancel all constant terms and F2 (x) proportional to log(a2 + x2 ), which would be, as pointed out by Montroll and Schlesinger [248], a rather unlikely a priori choice.
Solution 8.2 (i) There are at least two different methods to obtain this result.
First method. Let us first find the probability density function of the sum of two independent identically distributed exponential random variables. It is given by
f_{2exp}(s) = λ^2 \int_0^s e^{−λx} e^{−λ(s−x)} dx = λ^2 s e^{−λs}.
For the sum of three exponential random variables, we find
f_{3exp}(s) = λ^3 \int_0^s x e^{−λx} e^{−λ(s−x)} dx = (λ^3 s^2/2) e^{−λs}.
This last result suggests that the probability density function of the sum of n independent identically distributed exponential random variables could be the Gamma distribution of index n defined by
f_Gamma(s) = (λ^n s^{n−1}/(n − 1)!) e^{−λs}.
This is easily verified by induction since
\int_0^s (λ^n x^{n−1}/(n − 1)!) e^{−λx} λ e^{−λ(s−x)} dx = (λ^{n+1} s^n/n!) e^{−λs}.
Second method. The characteristic function of an exponential random variable is given by
φ_exp(t) = \int_0^∞ λ e^{−λx} e^{itx} dx = (1 − it/λ)^{−1}.
On the other hand, the characteristic function of a Gamma random variable of index n is
φ_Gamma(t) = \int_0^∞ (λ^n x^{n−1}/(n − 1)!) e^{−λx} e^{itx} dx = (1 − it/λ)^{−n},
which proves that the sum of n independent identically distributed exponential random variables is a Gamma random variable of index n.
(ii) For λ = 1, the mean m_1 and the standard deviation σ of the corresponding exponential random variable are both equal to 1. Hence, from the central limit theorem, it follows that, as n → ∞,
\int_0^{n + x√n} (u^{n−1}/(n − 1)!) e^{−u} du → (1/√(2π)) \int_{−∞}^{x} exp(−u^2/2) du.
(iii) Figure 8.14 shows the approach to the Gaussian distribution of the scaled sum of 10 and 50 independent identically distributed exponential random variables.
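The comparison of Figure 8.14 can be reproduced numerically. The sketch below quantifies the gap with a Kolmogorov-type distance; the sample size and this particular measure of discrepancy are choices made here, not taken from the text.

```python
import numpy as np
from math import erf, sqrt

def scaled_sum_cdf_gap(n, samples=100_000, seed=0):
    """Largest gap between the empirical CDF of (S_n - n)/sqrt(n), S_n a sum of n
    Exp(1) variables, and the standard normal CDF (cf. Figure 8.14)."""
    rng = np.random.default_rng(seed)
    s = rng.exponential(size=(samples, n)).sum(axis=1)
    z = np.sort((s - n) / np.sqrt(n))
    empirical = np.arange(1, samples + 1) / samples
    normal = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in z])
    return np.abs(empirical - normal).max()

print(scaled_sum_cdf_gap(10), scaled_sum_cdf_gap(50))   # the gap shrinks as n grows
```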
Fig. 8.14. Convergence to the normal distribution of the scaled sum of 10 (left figure) and 50 (right figure) exponential random variables with unit parameter value. Continuous curves represent the Gaussian distribution and dashed curves distributions of scaled sums of independent exponential random variables for λ = 1.
Solution 8.3 The mean value of the Bernoulli random variables X_j is zero, and their moment-generating function M is given by M(t) = cosh t. This function is finite in any finite neighborhood of the origin. From Theorem 22 it follows that, for 0 < a < 1,
lim_{n→∞} (P(S_n > na))^{1/n} = inf_{t>0} e^{−at} cosh t.
As a function of e^t, e^{−at} cosh t has a minimum for e^t = √((1 + a)/(1 − a)), so
inf_{t>0} e^{−at} cosh t = 1/√((1 + a)^{1+a} (1 − a)^{1−a}),
which proves the required result.
Solution 8.4 (i) By definition,
F_{Y_n}(y) = P(min(X_1, X_2, . . . , X_n) ≤ y) = 1 − P(min(X_1, X_2, . . . , X_n) > y) = 1 − P(X_1 > y, X_2 > y, · · · , X_n > y) = 1 − P(X_1 > y) P(X_2 > y) · · · P(X_n > y) = 1 − (1 − F_X(y))^n.
Similarly,
F_{Z_n}(z) = P(max(X_1, X_2, . . . , X_n) ≤ z) = P(X_1 ≤ z, X_2 ≤ z, · · · , X_n ≤ z) = P(X_1 ≤ z) P(X_2 ≤ z) · · · P(X_n ≤ z) = (F_X(z))^n.
(ii) The cumulative distribution function F_X(x) of a Cauchy random variable X centered at the origin and with unit parameter is given by
F_X(x) = (1/π) \int_{−∞}^{x} du/(1 + u^2) = 1/2 + arctan(x)/π.
Hence, using the relations arctan(x) + arctan(1/x) = π/2 and arctan(x) = x + o(x) as x → 0,
lim_{n→∞} P(Z_n > nz/π) = 1 − lim_{n→∞} (1/2 + (1/π) arctan(nz/π))^n = 1 − lim_{n→∞} (1 − (1/π) arctan(π/(nz)))^n = 1 − lim_{n→∞} (1 − 1/(zn) + o(n^{−1}))^n = 1 − e^{−1/z}.
Remark 26. The function z → e−1/z is the cumulative distribution function of a positive random variable. As a consequence of the result above, the scaled random variable πZn /n converges in distribution to a positive random variable Z such that FZ (z) = e−1/z . For z = 1/ log(2), e−1/z = 12 ; therefore, the probability that, in a sequence of n realizations of a symmetric Cauchy random variable with unit parameter, the maximum of the n observed values has a probability equal to 12 to be greater than n/(π log 2) ≈ 0.4592241 n. Solution 8.5 (i) To derive the expression of p(m), we proceed as in Exercise 8.1. The entropy associated with a probability density function p is ∞ S(p) = − p(m) log p(m) dm. 0
Using Lagrange multipliers to maximize this entropy with respect to the function p under the constraints
\int_0^∞ p(m) dm = 1   and   \int_0^∞ m p(m) dm = M/N,
which express that p(m) has to be normalized and that the average amount of money per agent, equal to the total amount of money M divided by the number of agents N, is fixed, we have to maximize
−\int_0^∞ p(m) log p(m) dm + λ_1 \int_0^∞ p(m) dm + λ_2 \int_0^∞ m p(m) dm.
Writing that the functional derivative with respect to p of the expression above is zero yields
−log p(m) − 1 + λ_1 + λ_2 m = 0.
This shows that p is an exponential distribution, which can be written
p(m) = (1/T) exp(−m/T)   (T > 0).
The meaning of the parameter T is found by expressing that the average amount of money per agent is fixed. This gives
\int_0^∞ m p(m) dm = T = M/N.
That is, when the system is in equilibrium, the parameter T , which plays the role of the “temperature” in statistical physics, is the average amount of money per agent. It is always positive. Remark 27. Maximizing the entropy means that the disorder associated with the distribution of an amount of money M between N agents under the weak constraint that fixes the average amount per agent is maximized. Taking into account more restrictive constraints such as a fixed standard deviation would give a Gaussian distribution (as in Exercise 8.1) corresponding to a much more egalitarian society. Note that, for the exponential distribution, the most probable amount of money is equal to zero! (ii) In the derivation of the probability distribution of money above, it was assumed that the amount of money could vary from 0 to infinity. If m ∈ [m1 , m2 ], the maximization of the entropy under constraints leads again to an exponential distribution, but now we have to satisfy the conditions m2 m2 M Ce−m/T dm = 1 and Cme−m/T dm = m = . N m1 m1 Hence, CT e−m1 /T − e−m2 /T = 1, CT e−m1 /T (m1 + T ) − e−m2 /T (m2 + T ) = m. The first equation determines the normalizing factor C. The probability density function is then given by e−m/T p(m) = . T (e−m1 /T − e−m2 /T ) Replacing C by its expression in the second equation yields em2 /T (m1 + T ) − em1 /T (m2 + T ) = m. em2 /T − em1 /T Let
Let m_+ = (m_1 + m_2)/2 and m_− = (m_2 − m_1)/2, or m_1 = m_+ − m_− and m_2 = m_+ + m_−. In terms of m_+ and m_−, the relation between T and the average amount of money m becomes
coth(m_−/T) − T/m_− = −(m − m_+)/m_−.
The odd function L : x → coth x − 1/x, familiar in statistical physics, is the Langevin function. Its graph is represented in Figure 8.15. The relation above shows that the sign of T is the same as the sign of −(m − m_+). If, for example, m_1 = −5, m_2 = 5, and m = 4, then −(m − m_+) = −4, and solving for T, we find T = −1. The graph of the corresponding probability density function is shown in Figure 8.16.
Fig. 8.15. Graph of the Langevin function.
Fig. 8.16. Probability density function of the distribution of money when the amount of money possessed by one individual is restricted to the interval [m1 , m2 ]. Solution 8.6 In order to show that a power law with an exponential cutoff distribution can be mistaken for a pure power-law distribution, consider the probability density function f (x) = Cx−1.5 (1 − 0.0005)x (x ≥ 1), where C is a normalizing factor approximately equal to 0.520. Figure 8.17 shows a log-log plot of the sequence f (10 k) for k = 1, 2, . . . , 100. A least-squares fit of the apparent power-law behavior of this sequence gives an exponent equal to −1.6. While the random variable distributed according to f has a mean value equal to 40.2, the random variable defined for x ≥ 1 with a power-law probability density proportional to x−1.6 has no mean value. Solution 8.7 (i) Consider the ARCH model defined by the relations Xt = σt Wt ,
with   σ_t^2 = a_0 + a_1 X_{t−1}^2   (a_0 > 0, 0 < a_1 < 1),
Fig. 8.17. Log-log plot of the sequence f (10 k) for k = 1, 2, . . . , 100, which exhibits an apparent power-law behavior, with an exponent equal to −1.6, over about two orders of magnitude. The slightly shifted straight line represents the least-squares fit.
where, for all positive integers t, Wt = N (0, 1), and choose, for example, a0 = 0.6, a1 = 0.5, and σ1 = 1. Figure 8.18 clearly shows the existence of outliers present in a generated sequence of 1000 data points.
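A minimal generator for this ARCH(1) process is sketched below, using the parameter values quoted above; exactly how σ_1 enters the first update is an assumption made here, and replacing the variance recursion by equation (8.3) turns it into a GARCH generator.

```python
import numpy as np

def arch1(n=1000, a0=0.6, a1=0.5, sigma1=1.0, seed=0):
    """Generate X_t = sigma_t W_t with sigma_t^2 = a0 + a1 X_{t-1}^2 and W_t = N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    sigma = sigma1
    for t in range(n):
        x[t] = sigma * rng.normal()
        sigma = np.sqrt(a0 + a1 * x[t] ** 2)     # conditional standard deviation for the next step
    return x

x = arch1()
print(x.mean(), x.var(), ((x - x.mean()) ** 4).mean() / x.var() ** 2 - 3)  # kurtosis excess
```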
Fig. 8.18. ARCH model: (Xt) time series for t = 1, 2, . . . , 1000.
Another interesting way to show the existence of outliers in the generated time series (Xt) is the scatter plot of Xt versus Xt−1 represented in Figure 8.19. By comparison, the scatter plot of the Gaussian time series is shown in Figure 8.20.
(ii) The different statistics for the time series Wt and Xt are given in the following table. To obtain a better accuracy, we generated a sequence of 50,000 data points.
        mean        variance   skewness   kurtosis excess
Wt      −0.000854   0.997      −0.0213    0.0240
Xt      −0.00242    1.20       −0.0847    3.86
The numerical values of the statistics of the time series Wt are to be compared to the theoretical values, respectively equal to 0, 1, 0, and 0. We have also represented the
Fig. 8.19. Scatter plot of Xt versus Xt−1 .
Fig. 8.20. Scatter plot of Wt versus Wt−1.
histograms of both time series in Figure 8.21. The heights of the bars being scaled so that their areas sum to unity, the histograms represent approximate probability density functions.
Solution 8.8 (i) The probability density of the Student's t-distribution for µ = 2 is given by
f_t(x) = 1/(2 + x^2)^{3/2}.
Following the method described on page 334, from a random sample of 2000 points, we obtain the plot represented in Figure 8.22. (ii) In order to study numerically the power-law behavior of the positive tail of the probability distribution, we have generated a random sample of size n = 100,000 drawn from a Student's t-distribution for µ = 2 and considered the last 2000 points of the ordered random sequence. Using the notation of page 334, a log-log plot of the sequence of points (ξ_k, 1 − η_k) for k > 98,000 is represented in Figure 8.23. From a least-squares
Fig. 8.21. Histograms of time series (Wt) (left) and (Xt) (right). The heights of the bars are scaled so that their areas sum to unity.
Fig. 8.22. Approximate cumulative distribution function of the Student’s tdistribution for µ = 2. fit, the exponent characterizing the power-law behavior of the cumulative distribution function is found to be equal to −1.99, in very good agreement with the asymptotic behavior of the probability density function ft (x) ∼ x−3 for large x. Here the tail corresponds to x-values greater than 4.8 (log 4.8 = 1.57). Solution 8.9 If we consider the ARCH model of Exercise 8.7, from the sequence of 50,000 points generated to compute the four statistics (see solution of Exercise 8.7), and using the method described on page 334, from the log-log plot of the last 2000 points of the ordered sequence, represented in Figure 8.24, it can be shown that the tail of the distribution exhibits a power-law behavior. A least-squares fit gives α = −3.62 for the exponent characterizing the power-law behavior for large positive x values of the cumulative distribution function (see Figure 8.24). The last 100 points have been discarded. Solution 8.10 Choosing a0 = 0.5, a1 = 0.5, b1 = 0.3, and σ1 = 1, the time series and the scatter plot represented, respectively, in Figures 8.25 and 8.26 show the existence of outliers in a sequence of 1000 data points generated by the GARCH process defined by (8.3). (ii) Different statistics for a time series Xt of 100,000 data points generated by the GARCH process defined above are given in the following table.
Fig. 8.23. Power-law behavior of the cumulative distribution function of the Student's t-distribution for µ = 2 determined from the last 2000 points of an ordered random sequence of size n = 100,000. A least-squares fit gives an exponent equal to −1.99, in very good agreement with the theoretical value −2.
Fig. 8.24. Power-law behavior of the positive tail of the probability distribution of the Xt sequence of 50,000 points generated by the ARCH model studied in Exercise 8.7. The straight line representing the least-squares fit has been slightly translated for clarity. The exponent is equal to −3.62.
        mean       variance   skewness   kurtosis excess
Xt      0.000933   2.45       0.142      10.5
Using the method described on page 334, from the log-log plot of the last 5000 points of the ordered sequence represented in Figure 8.27, it can be shown that the tail of the distribution exhibits a power-law behavior. A least-squares fit gives α = 3.62 for the exponent characterizing the power-law behavior for large positive x values of the cumulative distribution function.
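A GARCH(1,1) series with the parameters chosen above can be generated by a one-line change of the variance recursion used in the ARCH sketch of Solution 8.7. Again, this is a sketch under the definition of equation (8.3), not the author's code.

```python
import numpy as np

def garch11(n=100_000, a0=0.5, a1=0.5, b1=0.3, sigma1=1.0, seed=0):
    """Generate X_t = sigma_t W_t with sigma_t^2 = a0 + a1 X_{t-1}^2 + b1 sigma_{t-1}^2."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    sigma2 = sigma1 ** 2
    for t in range(n):
        x[t] = np.sqrt(sigma2) * rng.normal()
        sigma2 = a0 + a1 * x[t] ** 2 + b1 * sigma2   # GARCH(1,1) variance recursion
    return x
```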
Fig. 8.25. GARCH model: (Xt ) time series for t = 1, 2, . . . , 1000.
Fig. 8.26. GARCH model: Scatter plot of Xt versus Xt−1 .
Fig. 8.27. Power-law behavior of the positive tail of the probability distribution of the Xt sequence of 100, 000 points generated by the GARCH process. The straight line representing the least-squares fit has been slightly translated for clarity. The exponent is equal to −3.08.
Notations
The essential mathematical notations used in the text are grouped below.
indicates the end of a proof. R set of real numbers. R+ set of positive numbers. C set of complex numbers. Z set of all integers. ZL set of integers modulo L. N set of positive integers. N0 set of nonnegative integers. Q set of rational numbers. ∅ empty set. x∈X x is an element of the set X . S⊂X S is a subset of X . A∪B set of elements either in A or B. A∩B set of elements that are in A and B. Ac complement of set A. A\B set of all elements in A that are not in B. AB set of elements either in A or B but not in A and B. |A| number of elements of set A. {x ∈ M | P (x)} all elements of M that have the property P (x). a∼b a is equivalent to b. a⇒b a implies b. a⇔b a implies b and conversely. x largest integer less than or equal to x. x! smallest integer greater than or equal to x. z complex conjugate of z. m(A) Lebesgue measure of the set A. f :X→Y f is a mapping from the set X into the set Y .
f : x → f (x) mapping f takes the point x to the point f (x). f −1 (x) set of all preimages of x. f ◦g composite of mappings f and g, g being applied first. x is an element of the n-dimensional vector space Rn . x ∈ Rn x = (x1 , x2 , . . . , xn ) x1 , x2 , . . . , xn are the components of vector x. x norm of vector x. dim(S) dimension of (manifold) S. [a, b] closed interval; i.e., {x ∈ R | a ≤ x ≤ b}. ]a, b[ open interval; i.e., {x ∈ R | a < x < b}. [a, b[ {x ∈ R | a ≤ x < b}. ]a, b] {x ∈ R | a < x ≤ b}. d(x, y) distance between points x and y in a metric space. N (x) neighborhood of x. x˙ time derivative of vector x. DX(x) derivative of vector field X at x. Dx X(x, µ) derivative with respect to x of vector field X at (x, µ). tr A trace of square matrix A. det A determinant of square matrix A. diag[λ1 , . . . , λn ] diagonal matrix whose diagonal elements are λ1 , . . . , λn . diag[B1 , . . . , Bn ] block-diagonal matrix whose diagonal blocks are B1 , . . . , Bn . [a1 , a2 , . . . , an ] n × n matrix whose element aij is the j-component of the n-dimensional vector ai . spec(A) spectrum of linear operator A. x→0 there exist two positive constants A and a such that f (x) = O g(x) |f (x)| ≤ A|g(x)| for |x| < a. x→0 f (x) = o g(x) for any ε > 0, there exists δ > 0 such that |f (x)| ≤ ε|g(x)| for |x| < δ. f (x) ∼ g(x) f (x) and g(x) have the same asymptotic behavior. f (x + 0) limε→0 , where ε > 0. f (x − 0) limε→0 , where ε < 0. a≈b a is approximately equal to b. ab a is less than or approximately equal to b. ab a is greater than or approximately equal to b. G(N, M ) graph of order N and size M . A(G) adjacency matrix of graph G. V (G) set of vertices of graph G. E(G) set of edges of graph G. N (x) neighborhood of vertex x of a graph. d(x) degree of the vertex x of a graph. din (x) in-degree of the vertex x of a digraph. out-degree of the vertex x of a digraph. dout (x)
L_N  characteristic path length of network N.
C_N  clustering coefficient of network N.
D_N  diameter of network N.
\binom{n}{k}  binomial number.
P(X = x)  probability that the random variable X is equal to x.
F_X  cumulative distribution function of the random variable X, defined by F_X(x) = P(X ≤ x).
f_X  probability density function of the absolutely continuous random variable X.
X_n \xrightarrow{d} X  the sequence of random variables (X_n) converges in distribution to X.
m  median of a distribution, defined by F(m) := 1/2.
⟨X⟩  average value of the random variable X.
m_r(X)  moment of order r of a random variable X; i.e., m_r(X) = ⟨X^r⟩.
σ^2(X)  variance of a random variable X; i.e., σ^2(X) = m_2(X) − m_1^2(X).
φ_X  characteristic function of the random variable X.
N(m, σ^2)  normal random variable of mean m and variance σ^2.
f̂  Fourier transform of the function f.
L_{α,β}  probability density function of the stable Lévy distribution.
f_t  probability density of a Student's t-distribution.
W = {W_t | t ≥ 0}  stochastic process.
References
1. Akin E. and Davis M., Bulgarian Solitaire, American Mathematical Monthly 92 237–250 (1985) ´si A.-L., Diameter of the World Wide 2. Albert R., Jeong H., and Baraba Web, Nature 401 130–131 (1999) ´si A.-L., Topology of Evolving Networks: Local Events 3. Albert R. and Baraba and Universality, Physical Review Letters 85 5234–5237 (2000) ´si A.-L., Error and Attack Tolerance of 4. Albert R., Jeong H., and Baraba Complex Networks, Nature 406 378–382 (2000) ´si A.-L., Statistical Mechanics of Complex Networks, 5. Albert R. and Baraba Review of Modern Physics 74 47–97 (2002) 6. Allee W. C., Animal Aggregations: A Study in General Sociology (Chicago: The University of Chicago Press 1931). Reprinted from the original edition by AMS Press, New York (1978) 7. Anosov D. V. and Arnold V. I. (editors), Dynamical Systems I: Ordinary Differential Equations and Smooth Dynamical Systems (Heidelberg: SpringerVerlag 1988) 8. Argollo de Menezes M., Moukarzel C. F., and Penna T. J. P., FirstOrder Transition in Small-World Networks, Europhysics Letters 50 5–8 (2000) 9. Arnol’d V. I., Ordinary Differential Equations (Berlin: Springer-Verlag 1992) 10. Arnol’d V. I., Catastrophe Theory (Berlin: Springer-Verlag 1992) 11. Arrowswmith D. K. and Place C. M., An Introduction to Dynamical Systems (Cambridge, UK: Cambridge University Press 1990) 12. Assaf D., IV, and Gadbois S., Definition of Chaos, American Mathematical Monthly 99 865 (1992) 13. Axelrod R., Effective Choice in the Prisoner’s Dilemma, Journal of Conflict Resolution 24 3–25 (1980) 14. Axelrod R., More Effective Choice in the Prisoner’s Dilemma, Journal of Conflict Resolution 24 379–403 (1980) 15. Axelrod R. and Hamilton W. D., The Evolution of Cooperation, Science 211 1390–1396 (1981) 16. Axelrod R., The Dissemination of Culture: A Model with Local Convergence and Global Polarization, Journal of Conflict Resolution 41 203–226 (1997), Reprinted in Axelrod [17] pp 148–177
17. Axelrod R., The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration (Princeton, NJ: Princeton University Press 1997) 18. Axtell R., Axelrod R., Epstein J. M., and Cohen M. D., Aligning Simulation Models: A Case Study and Results, Computational and Mathematical Organization Theory 1 123–141 (1996). See an adapted version in Axelrod [17], pp. 183–205. 19. Ayala F. J., Experimental Invalidation of the Principle of Competitive Exclusion, Nature 224 1076–1079 (1969) 20. Ayala F. J., Gilpin M. E., and Ehrenfeld J. G., Competition Between Species: Theoretical Models and Experimental Tests, Theoretical Population Biology 4 331–356 (1973) 21. Bachelier L., Th´ eorie de la sp´ eculation (Paris: Gauthier-Villars 1900). English translation by A. James Boness, in The Random Character of Stock Market Prices, pp. 17–78, Paul H. Cootner (editor) (Cambridge, MA: The MIT Press 1964) 22. Badger W. W., An Entropy-Utility Model for the Size Distribution of Income, in Mathematical Models as a Tool for the Social Sciences, pp. 87–120, B. J. West (editor) (New York: Gordon and Breach 1980) 23. Bahr D. B. and Bekoff M., Predicting Flock Vigilance from Simple Passerine Interactions: Modelling with Cellular Automata, Animal Behavior 58 831–839 (1999) 24. Bak P., Tang C., and Wiesenfeld K., Self-Organized Criticality: An Explanation of 1/f noise, Physical Review Letters 59 381–384 (1987) 25. Bak P., Tang C., and Wiesenfeld K., Self-Organized Criticality Physical Review A 38 364–374 (1988) 26. Bak P., Chen K., and Tang C., A Forest-Fire Model and Some Thoughts on Turbulence, Physics Letters A 147 297–300 (1990) 27. Bak P. and Chen K., Self-Organized Criticality, Scientific American 264 46–53 (1991) 28. Bak P. and Sneppen K., Punctuated Equilibrium and Criticality in a Simple Model of Evolution, Physical Review Letters 24 4083–4086 (1993) 29. Bak P., How Nature Works: The Science of Self-Organized Criticality (New York: Springer-Verlag 1996) 30. Bak P., Christensen K., Danon L., and Scanlon T., Unified Scaling Law for Earthquakes, preprint:cond-mat/0112342 31. Banks J., Brooks J., Cairns G., Davis G., and Stacey P., On Devaney’s Definition of Chaos, American Mathematical Monthly 99 332–334 (1992) ´si A.-L. and Albert R., Emergence of Scaling in Random Networks, 32. Baraba Science 286 509–512 (1999) ´si A.-L., Albert R., and Jeong H., Mean-Field Theory for Scale-Free 33. Baraba Random Networks, Physica A 272 173–187 (1999) ´si A.-L., Albert R., and Jeong H., Scale-Free Characteristics of 34. Baraba Random Networks: the Topology of the World Wide Web, Physica A 281 69–77 (2000) ´si A.-L., Jeong H., Ne ´da Z., Ravasz E., Schubert A., and Vicsek 35. Baraba T., Evolution of the Social Network of Scientific Collaborations, Physica A 311 590–614 (2002) 36. Barrat A., Comment on “Small-World Networks: Evidence for a Crossover Picture”, preprint:cond-mat/9903323
References
373
´le ´my M. and Amaral L. A. N., Erratum: Small-World Networks: 37. Barthe Evidence for a Crossover Picture [Phys. Rev. Lett. 82, 3180 (1999)], Physical Review Letters 82 5180 (1999) 38. Beddington J. R., Free C. A., and Lawton J. H., Dynamic Complexity in Predator-Prey models Framed in Difference Equations, Nature 255 58–60 (1975) 39. Begon M., Harper J. L., and Townsend C. R., Ecology: Individuals, Populations and Communities (Oxford: Blackwell 1986) 40. Bendixson I., Sur les courbes d´efinies par des ´ equations diff´ erentielles, Acta Mathematica 24 1–88 (1901) 41. Berryman M. J., Allison A., and Abbott D., Classifying Ancient Texts by Inter-Word Spacing, preprint:physics/0206080 ´ H., Order of the Dimension Versus Space 42. Bidaux R., Boccara N., and Chate Dimension in a Family of Cellular Automata, Physical Review A 39 3094–3105 (1989) 43. Blanchard F., β-expansions and Symbolic Dynamic, Theoretical Computer Science 65 131–141 (1989) 44. Boccara N., On the Microscopic Formulation of Landau Theory of Phase Transitions, Solid State Communications 11 39–40 (1972) 45. Boccara N., Sym´etries bris´ees (Paris: Hermann 1976) 46. Boccara N., Functional Analysis: An Introduction for Physicists (Boston: Academic Press 1990) 47. Boccara N., Nasser J., and Roger M., Annihilation of Defects During the Evolution of Some One-Dimensional Class-3 Deterministic Cellular Automata, Europhysics Letters 13 489–494 (1990) 48. Boccara N., Nasser J., and Roger M., Particlelike Structures and Their Interactions in Spatiotemporal Patterns Generated by One-Dimensional Deterministic Cellular Automaton Rules, Physical Review A 44 866–875 (1991) 49. Boccara N. and Cheong K., Automata Network SIR Models for the Spread of Infectious Diseases in Populations of Moving Individuals, Journal of Physics A: Mathematical and General 25 2447–2461 (1992) 50. Boccara N. and Roger M., Period-Doubling Route to Chaos for a Global Variable of a Probabilistic Automata Network, Journal of Physics A: Mathematical and General 25 L1009–L1014 (1992) 51. Boccara N., Transformations of One-Dimensional Cellular Automaton Rules by Translation-Invariant Local Surjective Mappings, Physica D 68 416–426 (1993) 52. Boccara N. and Cheong K., Critical Behaviour of a Probabilistic Automata Network SIS Model for the Spread of an Infectious Disease in a Population of Moving Individuals, Journal of Physics A: Mathematical and General 26 3707–3717 (1993) 53. Boccara N., Nasser J., and Roger M., Critical Behavior of a Probabilistic Local and Nonlocal Site-Exchange Cellular Automaton, International Journal of Modern Physics C 5 537–545 (1994) 54. Boccara N., Cheong K., and Oram M., Probabilistic Automata Epidemic Model with Births and Deaths Exhibiting Cyclic Behaviour, Journal of Physics A: Mathematical and General 27 1585–1597 (1994) 55. Boccara N., Roblin O., and Roger M., An Automata Network PredatorPrey Model with Pursuit and Evasion, Physical Review E 50 4531–4541 (1994)
374
References
56. Boccara N., Probabilit´es (Paris: Ellipses 1995) 57. Boccara N., Int´egration (Paris: Ellipses 1995) 58. Boccara N., Fuk´ s H., and Geurten S., A New Class of Automata Networks, Physica D 103 145–154 (1997) 59. Boccara N. and Fuk´ s H., Cellular Automaton Rules Conserving the Number of Active Sites, Journal of Physics A: Mathematical and General 31 6007–6018 (1998) 60. Boccara N. and Fuk´ s H., Critical Behaviour of a Cellular Automaton Highway Traffic Model, Journal of Physics A: Mathematical and General 33 3407– 3415 (2000) 61. Boccara N., On the Existence of a Variational Principle for Deterministic Cellular Automaton Models of Highway Traffic Flow, International Journal of Modern Physics C 12 1–16 (2001) 62. Boccara N. and Fuk´ s H., Number-Conserving Cellular Automaton Rules, Fundamenta Informaticae 52 1–13 (2002), 63. Boer J. de, Derrida B., Flyvberg H., Jackson A. D., and Wettig T., Simple Model of Self-Organized Biological Evolution, Physical Review Letters 73 906–909 (1994) 64. Boinski S. and Garber P. A. (editors), On the Move: How and Why Animals Travel in Groups, (Chicago: The University of Chicago Press 2000) ´s B., Modern Graph Theory (New York: Springer-Verlag 1998) 65. Bolloba 66. Bollerslev T., Generalized Autoregressive Conditional Heteroskedasticity, Journal of Econometrics 31 307–327 (1986) 67. Bonabeau E., Toubiana L., and Flahaut A., Evidence for Global Mixing in Real Influenza Epidemics, Journal of Physics A: Mathematical and General 31 L361–L365 (1998) 68. Bouchaud J.-P. and Potters M., Theory of Financial Risks: From Statistical Physics to Management Risks (Cambridge, UK: Cambridge University Press 2000) 69. Bouchaud J.-P., An Introduction to Statistical Finance, Physica A 313 238– 251 (2002) 70. Boyd R. and Lorberbaum J. P., No Pure Strategy is Evolutionarily Stable in the Repeated Prisoner’s Dilemma Game, Nature 327 58–59 (1987) 71. Brandt J., Cycles of Partitions, Proceedings of the American Mathematical Society 85 483–487 (1982) 72. Broadbent S. R. and Hammersley J. M., Percolation Processes: I. Crystal and Mazes, Proceedings of the Cambridge Philosophical Society 53 629–641 (1957) 73. Bulmer M. G., A Statistical Analysis of the 10-Year Cycle in Canada, Journal of Animal Ecology 43 701–718 (1974) 74. Burridge R. and Knopoff L., Models of Theoretical Seismicity, Bulletin of the Seismological Society of America 57 341–371 (1967) 75. Burstedde C., Klauck K., Schadschneider A., and Zittartz J., Simulation of Pedestrian Dynamics Using a Two-Dimensional Cellular Automaton, Physica A 295 507–525 (2001) 76. Cantor C., De la puissance des ensembles parfaits de points, Acta Mathematica 4 381–392 (1884) 77. Cardy J. L. and Grassberger P., Epidemic Models and Percolation, Journal of Physics A: Mathematical and General 18 L267–L271 (1985)
References
375
78. Carlson J. M. and Langer J. S., Mechanical Model of an Earthquake Fault, Physical Review A 40 6470–6484 (1989) 79. Carr J., Applications of Centre Manifold Theory (New York: Springer-Verlag 1981) 80. Carvalho J. X. de and Prado C. P. C., Self-Organized Criticality in the Olami-Feder-Christensen Model, Physical Review Letters 84 4006–4009 (2000) 81. Carvalho J. X. de and Prado C. P. C., de Carvalho and Prado Reply, Physical Review Letters 87 039802 (1p) (2001) 82. Castrigiano D. P. L. and Hayes S. A., Catastrophe Theory (Reading, MA: Addison-Wesley 1993) 83. Chakrabarti B. K. and Stinchcombe R. B., Stick-Slip Statistics for Two Fractal Surfaces: A Model for Earthquakes, preprint:cond-mat/9902164 ´ H. and Manneville P., Collective Behaviors in Spatially Extended Sys84. Chate tems with Local Interactions and Synchronous Updating, Progress of Theoretical Physics 87 1–60 (1992) 85. Chau H. F., Yan K. K., Wan K. Y., and Siu L. W., Classifying Rational Densities Using Two One-Dimensional Cellular Automata, Physical Review E 57 1367–1369 (1998) 86. Chau H. F., Siu L. W., and Yan K. K., One-Dimensional n-ary Density Classification Using Two Cellular Automaton Rules, International Journal of Modern Physics C 10 883–889 (1999) 87. Chowdhury D., Santen L., and Schadschneider A., Statistical Physics of Vehicular Traffic and Some Related Systems, Physics Reports 329 199–329 (2000) 88. Christensen K., Hamon D., Jensen H. J. , and Lise S., Comment on “SelfOrganized Criticality in the Olami-Feder-Christensen Model”, Physical Review Letters 87 039801 (1p) (2001) 89. Clark C. W., Mathematical Bioeconomics: The Optimal Management of Renewable Resources (New York: John Wiley & Sons 1990) 90. Clutton-Brock T. H., O’Riain M. J., Brotherton P. N. M., Gaynor D., Kansky R., Griffin A. S., and Manser M., Selfish Sentinels in Cooperative Mammals, Science 284 1640–1644 (1999) 91. Collet P. and Eckmann J.-P., Iterated Maps on the Interval as Dynamical Systems (Boston: Birkh¨ auser 1980) 92. Cootner P. H., Comments on the Variation of Certain Speculative Prices in The Random Character of Stock Market Prices, pp. 333–337, Paul H. Cootner (editor) (Cambridge, MA: The MIT Press 1964) 93. Costa Filho R. N., Almeida M. P., Andrade Jr J. S., and Moreira J. E., Scaling Behavior in a Proportional Voting Process, Physical Review E 60 1067–1068 (1999) 94. Costa Filho R. N., Almeida M. P., Andrade Jr J. S., and Moreira J. E., Brazilian Elections: Voting for a Scaling Democracy, preprint: cond-mat/0211212 95. Coullet P. and Tresser C., It´erations d’endomorphismes et groupe de renormalisation, Journal de Physique Colloque 39 C5–C25 (1978) 96. Cressman R., The Stability Concept of Evolutionary Game Theory: A Dynamic Approach (Berlin: Springer-Verlag 1992) 97. Cushing J. M., Integrodifferential Equations and Delay Models in Population Dynamics (Berlin: Springer-Verlag 1977)
376
References
´ k A., Stanley H. E., and Vicsek T., Spontaneously Ordered Motion of 98. Cziro Self-Propelled Particles, Journal of Physics A: Mathematical and General 30 1375–1385 (1997) 99. D’Ancona U., The Struggle for Existence (Leiden: Brill 1954) 100. Davidsen J., Ebel H., and Bornholdt S., Emergence of a Small World from Local Interactions: Modeling Acquaintance Networks, Physical Review Letters 88 128701 (4p) (2002) 101. Derrida B., Gervois A., and Pomeau Y., Universal Metric Properties of Bifurcations and Endomorphisms, Journal of Physics: Mathematical and General A 12 269–296 (1979) 102. Devaney R. L., Chaotic Dynamical Systems (Redwood City, CA: AddisonWesley 1989) 103. Domany E. and Kinzel W., Equivalence of Cellular Automata to Ising Models and Directed Percolation, Physical Review Letters 53 311–314 (1984) 104. Dorogovtsev S. N., Mendes J. F. F., and Samukhin A. N., Structure of Growing Networks with Preferential Linking, Physical Review Letters 85 4633– 4636 (2000) 105. Dorogovtsev S. N. and Mendes J. F. F., Language as an Evolving Word Web, Proceedings of the Royal Society B 268 2603–2606 (2001) ˘gulescu A. A. and Yakovenko V. M., Statistical Mechanics of Money, 106. Dra The European Physical Journal B 17 723–729 (2000) ˘gulescu A. A. and Yakovenko V. M., Evidence for the Exponential 107. Dra Distribution of Income in the USA, The European Physical Journal B 20 585– 589 (2001) ˘gulescu A. A. and Yakovenko V. M., Exponential and Power-Law 108. Dra Probability Distributions of Wealth and Income in the United Kingdom and the United States, Physica A 299 213–221 (2001) 109. Drossel B. and Schwabl F., Self-organized Critical Forest-Fire Model, Physical Review Letters 69 1629–1632 (1992) 110. Dubois M. A., Favier C., and Sabatier P., Mathematical Modelling of the Rift Valley Fever in Senegal, in Simulation and Industry 2001, pp. 530–531, Norbert Giambiasi and Claudia Frydman (editors) (Ghent: SCS Europe Bvba, 2001) 111. Dulac M. H., Sur les cycles limites, Bulletin de la Soci´et´e Math´ematique de France 51 45–188 (1923) ´ ka Z., Number-Conserving Cellular Au112. Durand B., Formenti E., and Ro tomata: From Decidability to Dynamics, preprint:nlin.CG/0102035 113. Ebel H., Mielsch M. -I., and Bornholdt S., Scale-Free Topology of e-mail Networks, Physical Review E 66 035103(R) (4p) (2002) 114. Eden M., A Two-Dimensional Growth Process, in The Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, Volume 3, pp. 223–239, Jerzy Neyman (editor) (Berkeley: University of California Press 1961) 115. Eldredge N. and Gould S. J., Punctuated Equilibria: An Alternative to Phyletic Gradualism, in Models in Paleobiology, T. J. M. Schopf (editor), pp. 82– 115 (San Francisco: Freeman and Cooper 1972) 116. Engle R. F., Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of the United Kingdom Inflation, Econometrica 50 987–1008 (1982)
References
377
117. Epstein J. M. and Axtell R., Growing Artificial Societies: Social Science from the Bottom Up (Cambridge, MA: The MIT Press 1996) 118. Etienne G., Tableaux de Young et Solitaire Bulgare, Journal of Combinatorial Theory A 58 181–197 (1991) 119. Fama E. F., Mandelbrot and the Stable Paretian Hypothesis in The random character of stock market prices, pp 297–306, Paul H. Cootner (editor), (Cambridge, MA: The MIT Press 1964) 120. Farmer J. D., Ott E., and York J. A., The Dimension of Chaotic Attractors, Physica D 7 153–180 (1983) 121. Feigenbaum M. J., Quantitative Universality for a Class of Nonlinear Transformations, Journal of Statistical Physics 19 25–52 (1978) 122. Feigenbaum M. J., Universal Behavior in Nonlinear Systems, Los Alamos Science 1 4–27 (1980) 123. Feller W., On the Logistic Law of Growth and Its Empirical Verification in Biology, Acta Biotheoretica 5 51–65 (1940) 124. Fermat, P. de, Varia Opera Mathematica D. Petri de Fermat, Senatoris Tolosani (Tolosæ: Johannem Pech 1679) ´ R. V., Two Regimes in the Frequency of 125. Ferrer i Cancho R. and Sole Words and the Origins of Complex Lexicons, Journal of Quantitative Linguistics 8 165–173 (2001) ´ R. V., The Small-World of Human Language, 126. Ferrer i Cancho R. and Sole Proceedings of the Royal Society of London B 268 2261–2266 (2001) 127. Finerty J. P., The Population Ecology of Cycles in Small Mammals, (New Haven: Yale University Press 1980) 128. Flyvberg H., Sneppen K., and Bak P., Mean Field Theory for a Simple Model of Evolution, Physical Review Letters 71 4087–4090 (1993) 129. Franses P. H., Time Series Models for Business and Economic Forecasting (Cambridge, UK: Cambridge University Press 1998) 130. Frette V., Christensen K., Malthe-Sørenssen A., Feder J., Jøssang T., and Meakin P., Avalanche Dynamics in a Pile of Rice, Nature 379 49–52 (1996) 131. Frish H. L. and Hammersley J. M., Percolation Processes and Related Topics, Journal of the Society for Industrial and Applied Mathematics 11 894–918 (1963) 132. Fujiwara Y., Souma W., Aoyama H., Kaizoji T., and Aoki M., Growth and Fluctuations of Personal Income, preprint:cond-mat/0208398 133. Fuk´ s H., Solution of the Density Classification Problem with Two Cellular Automaton Rules, Physical Review E 55 R2081–R2084 (1997) 134. Fukui M. and Ishibashi Y., Traffic Flow in 1D Cellular Automaton Model Including Cars Moving with High Speed, Journal of the Physical Society of Japan 65 1868–1870 (1996) 135. Fukui M. and Ishibashi Y., Self-Organized Phase Transitions in Cellular Automaton Models for Pedestrians, Journal of the Physical Society of Japan 68 2861–2863 (1999) 136. Fukui M. and Ishibashi Y., Jamming Transition in Cellular Automaton Models for Pedestrians on Passageway, Journal of the Physical Society of Japan 68 3738–3739 (1999) 137. Gardner M., The Fantastic Combinations of John Conway’s New Solitaire Game“Life”, Scientific American 223 120–123 (1970)
378
References
138. Gardner M., Mathematical Games, Scientific American 249 12–21 (1983) 139. Gavalas G. E., Nonlinear Differential Equations of Chemically Reacting Systems (New York: Springer-Verlag 1968) 140. Gaylord R. J. and D’Andria L. J., Simulating Society: A Mathematica Toolkit for Modeling Socioeconomic Behavior (New York: Springer-Verlag 1998) 141. Gennes P.-G. de, La percolation: un concept unificateur, La Recherche 7 919– 926 (1976) 142. Gibrat R., Les in´egalit´es ´economiques (Paris: Sirey 1931) 143. Gilpin M. E. and Justice K. E., Reinterpretation of the Invalidation of the Principle of Competitive Exclusion, Nature 236 273–274 and 299–301 (1972) 144. Gilpin M. E., Do Hares Eat Lynx?, American Naturalist 107 727–730 (1973) 145. Gnedenko B. V. and Kolmogorov A. N., Limit Distributions for Sums of Independent Random Variables, translated from Russian by K. L. Chung (Cambridge, MA: Addisson-Wesley 1954) 146. Goles E., Sand Piles, Combinatorial Games and Cellular Automata, in Instabilities and Nonequilibrium Structures, pp. 101–121, E. Tirapegui and W. Zeller (editors) (Dordrecht: Kluwer Academic Publishers 1991) 147. Gompertz B., On the Nature of the Function Expressing Human Mortality Philosophical Transactions 115 513–585 (1825) 148. Gopikrishnan P., Meyer M., Amaral L. A. N., and Stanley H. E., Inverse Cubic Law for the Distribution of Stock Price Variations, The European Physical Journal B 3 139–140 (1998) 149. Gopikrishnan P., Plerou V., Amaral L. A. N, Meyer M., and Stanley H. E., Scaling of the Distribution of Fluctuations of Financial Market Indices, Physical Review E 60 5305–5316 (1999) 150. Gordon D. M., Dynamics of Task Switching in Harvester Ants, Animal Behavior 38 194–204 (1989) 151. Gordon D. M.,The Organization of Work in Social Insect Colonies, Nature 380 121–124 (1996) 152. Gordon D. M., Ants at Work: How an Insect Society is organized (New York: The Free Press 1999) 153. Grassberger P., On the Critical Behavior of the General Epidemic Process and Dynamical Percolation, Mathematical Biosciences 63 157–172 (1983) 154. Grassberger P., Chaos and Diffusion in Deterministic Cellular Automata, Physica D 10 52–58 (1984) 155. Grassberger P. and Kantz H., On a Forest-Fire Model with Supposed SelfOrganized Criticality, Journal of Statistical Physics 63 685–700 (1991) 156. Grassberger P., On a Self-Organized Critical Forest-Fire Model, Journal of Physics A: Mathematical and General 26 2081–2089 (1993) 157. Grassberger P., Critical Behavior of the Drossel-Schwabl Forest-Fire Model, New Journal of Physics 4 17 (15p) (2002), preprint:cond-mat/0202022 158. Grimmett G. R. and Stirzaker D. R., Probability and Random Processes (Oxford: Clarendon Press 1992) 159. Grimmett G., Percolation (Berlin: Springer-Verlag 1999) 160. Guckenheimer J. and Holmes P., Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields (New York: Springer-Verlag 1983) 161. Gutenberg B. and Richter C. F., Seismicity of the Earth, Geological Society of America Special Papers, Number 34 (1941) 162. Gutowitz H. A., Victor J. D., and Knight B. W., Local Structure Theory for Cellular Automata, Physica D 28 18–48 (1987)
References
379
163. Hale J. K., Ordinary Differential Equations (New York: Wiley-Interscience 1969) 164. Hale J. K. and Koc ¸ ak H., Dynamics and Bifurcations (Heidelberg: SpringerVerlag 1991) 165. Harrison G. W., Comparing Predator-Prey Models to Luckinbill’s Experiment with Didinium and Paramecium, Ecology 76 357–374 (1995) 166. Hassard B. D., Kazarinoff N. D., and Wan Y.-H, Theory and Applications of Hopf Bifurcation (Cambridge, UK: Cambridge University Press 1981) 167. Hassel M. P., Density-Dependence in Single-Species Populations, Journal of Animal Ecology 44 283–295 (1975) 168. Hausdorff F., Dimension und a ¨ußeres Maß, Mathematische Annalen 79 157– 179 (1918) ´non M., A Two-dimensional Mapping with a Strange Attractor, Communi169. He cations in Mathematical Physics 50 69–77 (1976) 170. Hethcote H. W. and Yorke J. A., Gonorrhea Transmission Dynamics and Control (Heidelberg: Springer-Verlag 1984) 171. Hirsch M. W. and Smale S., Differential Equations, Dynamical Systems and Linear Algebra (New York: Academic Press 1974) 172. Hirsch M. W., Pugh C. C., and Shub M., Invariant Manifolds (Heidelberg: Springer-Verlag 1977) 173. Holling C. S., The Components of Predation as Revealed by a Study of SmallMammal Predation of the European Pine Sawfly, Canadian Entomologist 91 293–320 (1959) 174. Holling C. S., The Functional Response of Predators to Prey Density and Its Role in Mimicry and Population, Memoirs of the Entomological Society of Canada 45 1–60 (1965) 175. Huang Z. F. and Solomon S., Power, L´evy, Exponential and Gaussian-like Regimes in Autocatalytic Financial Systems, The European Physical Journal B 20 601–607 (2001) 176. Hutchinson G. E., Circular Cause Systems in Ecology, Annals of the New York Academy of Sciences 50 221–246 (1948) 177. Hutchinson G. E., An Introduction to Population Ecology, (New Haven: Yale University Press 1978) 178. Igusa K., Solution of the Bulgarian Solitaire Conjecture, Mathematics Magazine 58 259–271 (1985) ´ski A., Random Graphs (New York: Aca179. Janson S., L uczak T., and Rucin demic Press 2000) 180. Jianhua W., The Theory of Games (Oxford: Clarendon Press 1988) 181. Katok A. and Hasselblatt B., Introduction to the Modern Theory of Dynamical Systems (Cambridge, UK: Cambridge University Press 1975) 182. Kermack W. O. and McKendrick A. G., A contribution to the Mathematical Theory of Epidemics, Proceedings of the Royal Society A 115 700–721 (1927) 183. Kerner B. S. and Rehborn H., Experimental Features and Characteristics of Traffic Jams, Physical Review E 53 R1297–R1300 (1996) 184. Kerner B. S. and textscRehborn H., Experimental Properties of Complexity in Traffic Flows, Physical Review E 53 R4275–R4278 (1996) 185. Kingsland S. E., The Refractory Model: The Logistic curve and the History of Population Ecology, The Quaterly Review of Biology 57 29–52 (1982) 186. Kingsland S. E., Modeling Nature: Episodes in the History of Population Ecology (Chicago: University of Chicago Press 1995)
380
References
187. Kinzel W., Directed Percolation, in Percolation, Structures and Processes, Annals of the Israel Physical Society, 5, pp. 425–445 (1983) 188. Kirchner A. and Schadschneider A., Simulation of Evacuation Processes Using a Bionics-Inspired Cellular Automaton Model for Pedestrian Dynamics, Physica A 312 260–276 (2002) 189. Kirchner J. W. and Weil A., No fractals in Fossil Extinction Statistics, Nature 395 337–338 (1998) 190. Klein W. and Rundle J., Comment on “Self-Organized Criticality in a Continuous, Nonconservative Cellular Automaton Modeling Earthquakes”, Physical Review Letters 71 1288–1288 (1993) 191. Kolb M., Botet R., and Jullien R., Scaling of Kinetically Growing Clusters, Physical Review Letters 51 1123–1126 (1983) 192. Kolmogorov A. N., Grundbegriffe der Wahrscheinlichkeitsrechnung (Berlin: Verlag von Julius Springer 1933), English translation edited by Nathan Morrison: Foundations of the Theory of Probability (New York: Chelsea Publishing Company 1950) 193. Kolmogorov A. N., Sulla teoria di Volterra della lotta per l’esistenza, Giornale dell’Istituto Italiano degli Attuari 7 74–80 (1936) 194. Koponen I., Analytic Approach to the Problem of Convergence of Truncated L´evy Flights Towards the Gaussian Stochastic Process, Physical Review E 52 1197–1199 (1995) 195. Krapivsky P. L., Redner S., and Leyvraz F., Connectivity of Growing Random Networks, Physical Review Letters 85 4629–4632 (2000) 196. Krapivsky P. L., Rodgers G. J., and Redner S., Degree Distributions of Growing Networks, Physical Review Letters 86 5401–5404 (2001) 197. Krapivsky P. L. and Redner S., Organization of Growing Random Networks, Physical Review E 63 066123 (14p) (2001) 198. Lagarias J. C., The 3x + 1 Problem and Its Generalizations, American Mathematical Monthly 92 3–23 (1985) 199. Land M. and Belew R. K., No Perfect Two-State Cellular Automata for Density Classification, Physical Review Letters 74 5148–5150 (1995) 200. Lang S., Undergraduate Analysis (Heidelberg: Springer-Verlag 1983) 201. Leslie P. H., Some Further Notes on the Use of Matrices in Population Mathematics, Biometrika 35 213-245 (1948) ´vy P., Th´ 202. Le eorie de l’addition des variables al´ eatoires, 2`eme ´edition (Paris: Gauthier-Villars 1954) ´vy P., Processus stochastiques et mouvement brownien (Paris: Gauthier203. Le Villars 2`eme ´edition 1965) 204. Li T.-Y. and York J. A., Period Three Implies Chaos, American Mathematical Monthly 82 985–992 (1975) 205. Li T.-Y., Misiurewicz M., Pianigiani G., and York J. A., Odd Chaos, Physics Letters 87 A 271–273 (1982) 206. Liljeros F., Edling C. R., Amaral L. A. N., Stanley H. E., and ˚ Aberg Y., The Web of Human Sexual Contacts, Nature 411 907–908 (2001) 207. Lise S. and Paczuski M., Self-Organized Criticality and Universality in a Nonconservative Earthquake Model, Physical Review E 63 036111 (5p) (2001) 208. Lise S. and Paczuski M., Scaling in a Nonconservative Earthquake Model of Self-Organized Criticality, Physical Review E 64 046111 (5p) (2001) 209. Lise S. and Paczuski M., A Nonconservative Earthquake Model of SelfOrganized Criticality on a Random Graph, preprint:cond-mat/0204491
References
381
`ve M., Probability Theory (Berlin: Springer-Verlag 1977) 210. Loe 211. Lombardo M. P., Mutual Restraint in Tree Swallows: A Test of the Tit for Tat Model of Reciprocity, Science 227 1363–1365 (1985) 212. Lorenz E. N., Deterministic Nonperiodic Flow, Journal of Atmospheric Sciences 20 130–141 (1963) 213. Lorenz E. N., The Essence of Chaos, (Seattle: University of Washington Press 1993) 214. Lotka A. J., Elements of Physical Biology (Baltimore: Williams and Wilkins 1925). Reprinted under the new title: Elements of Mathematical Biology (New York: Dover 1956) 215. Luckinbill L. S., Coexistence in Laboratory Populations of Paramecium aurelia and its Predator Didinium nasutum, Ecology 54 1320–1327 (1973) 216. Ludwig D., Aronson D. G., and Weinberg H. F., Spatial Patterning of the Spruce Budworm, Journal of Mathematical Biology 8 217–258 (1979) 217. Ludwig D., Jones D. D., and Holling C. S., Qualitative Analysis of Insect Outbreak Systems: The Spruce Budworm and Forest, Journal of Animal Ecology 47 315–332 (1978) 218. Lux T and Marchesi M., Scaling and Criticality in a Stochastic Multi-Agent Model of a Financial Market, Nature 397 498–500 (1999) 219. MacKay G. and Jan N., Forest Fires as Critical Phenomena, Journal of Physics A: Mathematical and General 17 L757–L760 (1984) 220. Malescio G., Dokholyan N. V., Buldyrev S. V., and Stanley H. E., Hierarchical Organization of Cities and Nations, preprint:cond-mat/0005178 221. Malthus T. R., An Essay on the Principle of Population, as it Affects the Future Improvement of Society, with Remarks on the Speculations of Mr. Goodwin, M. Condorcet, and Other Writers (London: Johnson 1798) 222. Mandelbrot B. B., The Variation of Certain Speculative Prices, Journal of Business 35 394–419 (1963) reprinted in The Random Character of Stock Market Prices, pp. 307–332, Paul H. Cootner (editor) (Cambridge, MA: The MIT Press 1964) 223. Mandelbrot B. B., Information, Theory and Psycholinguistics: A Theory of Words Frequencies in Readings in Mathematical Social Science, P. Lazarfeld and N. Henry (editors) (Cambridge, MA: The MIT Press, 1966) 224. Mandelbrot B. B., Fractals: Form, Chance, and Dimension (San Francisco: Freeman 1977) 225. Mandelbrot B. B., The Fractal Geometry of Nature (San Francisco: Freeman 1982) 226. Mantegna R. N. and Stanley H. E., Stochastic Process with Ultraslow Convergence to a Gaussian: The Truncated L´ evy Flight, Physical Review Letters 73 2946–2949 (1994) 227. Mantegna R. N. and Stanley H. E., An Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge, MA: Cambridge University Press 1999) 228. Marsili M., Maslov S., and Zhang Y.-C., Comment on “Role of Intermittency in Urban Development: A Model of Large-Scale City Formation,” Physical Review Letters 80 4830 (1998) 229. May R. M., Stability and Complexity in Model Ecosystems (Princeton: Princeton University Press 1974) 230. May R. M., Simple Mathematical Models with Very Complicated Dynamics, Nature 261 459–467 (1976)
382
References
231. May R. M. and Oster G. F., Bifurcations and Dynamic Complexity in Simple Ecological Models, American Naturalist 110 573–599 (1976) 232. May R. M., More Evolution of Cooperation, Nature 327 15–17 (1987) 233. Maynard Smith J., Mathematical Ideas in Biology (Cambridge, UK: Cambridge University Press 1968) 234. Maynard Smith J., Models in Ecology (Cambridge, UK: Cambridge University Press 1974) 235. Maynard Smith J., Evolution and the Theory of Games (Cambridge, UK: Cambridge University Press 1982) 236. Maynard Smith J., The Theory of Evolution (Cambridge, UK: Cambridge University Press 1993) 237. Meakin P., Diffusion-Controlled Cluster Formation in Two, Three, and Four Dimensions, Physical Review A 27 604–607 (1983) 238. Meakin P., Diffusion-Controlled Cluster Formation in 2–6 Dimensional Spaces, Physical Review A 27 1495–1507 (1983) 239. Meakin P., Formation of Fractal Clusters and Networks of Irreversible Diffusion-Limited Aggregation, Physical Review Letters 51 1119–1122 (1983) 240. Mermin N. D. and Wagner H., Absence of Ferromagnetism or Antiferromagnetism in One- and Two-Dimensional Isotropic Heisenberg Models, Physical Review Letters 17 1133–1136 (1966) 241. Metropolis M., Stein M. L., and Stein P. R., On Finite Limit Sets for Transformations of the Unit Interval, Journal of Combinatorial Theory, 15 25– 44 (1973) 242. Middleton A. A. and Tang C., Self-Criticality in Nonconserved Systems, Physical Review Letters 74 742–745 (1995) 243. Milgram S., The Small-World Problem, Psychology Today 1 61–67 (1967) 244. Milinski M., Tit for Tat in Sticklebacks and the Evolution of Cooperation, Nature 325 433-435 (1987) 245. Miyazima S., Lee Y., Nagamine T., and Miyajima H., Power-Law Distribution of Family Names in Japanese Societies, Physica A 278 282–288 (2000) 246. Montemurro M. A., Beyond the Zipf-Mandelbrot Law in Quantitative Linguistics, preprint:cond-mat/0104066 247. Montemurro M. A. and Zanette D., Entropic Analysis of the Role of Words in Literary Texts, Advances in Complex Systems 5 7–17 (2002) 248. Montroll E. W. and Shlesinger M. F., Maximum Entropy Formalism, Fractals, Scaling Phenomena, and 1/f Noise: A Tale of Tails, Journal of Statistical Physics 32 209–230 (1983) 249. Moreira A., Universality and Decidability of Number-Conserving Cellular Automata, Theoretical Computer Science, 292 711–721 (2003) 250. Moßner W. K., Drossel B., and Schwabl F., Computer Simulations of the Forest-Fire Model, Physica A 190 205–217 (1992) 251. Mueller L. D. and Ayala F. J., Dynamics of Single-Species population Growth: Stability or Chaos, Ecology 62 1148–1154 (1981) 252. Muramatsu M., Irie T., and Nagatani T., Jamming Transition in Pedestrian Counter Flow, Physica A 267 487–498 (1999) 253. Muramatsu M. and Nagatani T., Jamming Transition in Two-Dimensional Pedestrian Traffic, Physica A 275 281–291 (2000) 254. Muramatsu M. and Nagatani T., Jamming Transition of Pedestrian Traffic at a Crossing with Open Boundaries, Physica A 286 377–390 (2000)
References
383
255. Murray J. D., Mathematical Biology (Heidelberg: Springer-Verlag 1989) 256. Myrberg P. J., Sur l’it´eration des polynˆ omes quadratiques, Journal de Math´ematiques Pures et Appliqu´ees 41 339–351 (1962) 257. Nagel K. and Schreckenberg M., A Cellular Automaton Model for Freeway Traffic, Journal de Physique I 2 2221–2229 (1992) 258. Neumann J. von and Morgenstern O., Theory of Games and Economic Behavior, 3d edition (Princeton, NJ: Princeton University Press 1953) 259. Newman M. E. J. and Sneppen K., Avalanches, Scaling, and Coherent Noise, Physical Review E 54 6226–6231 (1996) 260. Newman M. E. J., A Model of Mass Extinction, Journal of Theoretical Biology 189 235–252 (1997) 261. Newman M. E. J. and Watts D. J., Scaling and Percolation in the SmallWorld Network Model, Physical Review E 60 7332–7342 (1999) 262. Newman M. E. J. and Watts D. J., Renormalization Group Analysis of the Small-World Model, Physics Letters A 263 341-346 (1999) 263. Newman M. E. J., The Structure of Scientific Collaboration Networks, Proceedings of the National Academy of Sciences of the United States of America 98 404-409 (2001) 264. Newman M. E. J., Scientific Collaboration Networks. I. Network Construction and Fundamental Results Physical Review E 64 016131 (8p) (2001) 265. Newman M. E. J., Scientific Collaboration Networks. II. Shortest Paths, Weighted Networks, and Centrality, Physical Review E 64 016132 (7p) (2001) 266. Newman M. E. J. and Ziff R. M., Fast Monte Carlo Algorithm for Site and Bond Percolation, Physical Review E 64 016706 (16p) (2001) 267. Newman M. E. J., Strogatz S. H., and Watts D. J., Random Graphs with Arbitrary Degree Distributions and Their Applications, Physical Review E 64 026118 (17p) (2001) 268. Nicholson A. J. and Bailey V. A., The Balance of Animal Populations. Part I, Proceedings of the Zoological Society of London 3 551–598 (1935) 269. Niessen W. von and Blumen A., Dynamics of Forest Fires as a Directed Percolation, Journal of Physics A: Mathematical and General 19 L289–L293 (1986) 270. Nishidate K., Baba M., and Gaylord R. J., Cellular Automaton Model for Random Walkers, Physical Review Letters 77 1675–1678 (1997) 271. Nowak M. A. and May R. M., Evolutionary Games and Spatial Chaos, Nature 359 826–829 (1992) 272. Okubo A., Diffusion and Ecological Problems: Mathematical Models (Heidelberg: Springer-Verlag 1980) 273. Olami Z., Feder H. J. S., and Christensen K., Self-Organized Criticality in a Continuous, Nonconservative Cellular Automaton Modeling Earthquakes, Physical Review Letters 68 1244–1247 (1992) 274. Oono Y., Period = 2n Implies Chaos, Progress in Theoretical Physics (Japan) 59 1028–1030 (1978) ˜o M, Carpena P., Bernaola-Galva ´n P., Mun ˜oz E., and Somoza 275. Ortun A. M., Keyword Detection in Natural Languages and DNA, Europhysics Letters 57 759–764 (2002) 276. Palis J. and Melo W. de, Geometric Theory of Dynamical Systems, An Introduction (New York: Springer-Verlag 1982) ´ 277. Pareto V., Cours d’Economie Politique Profess´e ` a l’Universit´e de Lausanne (Lausanne: F. Rouge 1896–1897)
384
References
278. Parrish J. K. and Hammer W. M. (editors), Animal Groups in Three Dimensions (Cambridge, UK: Cambridge University Press 1997) 279. Pastor-Satorras R. and Vespignani A., Corrections to Scaling in the Forest-Fire Model, Physical Review E 61 4854–4859 (2000) 280. Pearl R. and Reed L. J., On the Rate of Growth of the Population of the United States since 1790 and Its Mathematical Representation, Proceedings of the National Academy of Sciences of the United States of America 21 275–288 (1920) 281. Peters O., Hertlein C., and Christensen K., A Complexity View of Rainfall, Physical Review Letters 88 018701(4p) (2002) 282. Philippi T. E., Carpenter M. P., Case T. J., and Gilpin M. E., Drosophila Population Dynamics: Chaos and Extinction, Ecology 68 154–159 (1987) 283. Pielou E. C., Mathematical Ecology (New York: John Wiley 1977) 284. Plerou V., Gopikrishnan P., Amaral L. A. N., Meyer M., and Stanley H. G., Scaling of the Distribution of Price Fluctuations of Individual Companies, Physical Review E 60 6519–6529 (1999) 285. Podobnik B., Ivanov P. C., Lee Y., and Stanley H. E., Scale-Invariant Truncated L´ evy Process, Europhysics Letters 52 491–497 (2000) 286. Podobnik B., Ivanov P. C., Lee Y., Chessa A., and Stanley H. E., Systems with Correlations in the Variance: Generating Power-Law Tails in Probability Distributions, Europhysics Letters 50 711–717 (2000) ´ H., M´emoire sur les courbes d´ 287. Poincare efinies par une ´equation diff´ erentielle, Journal de Math´ematiques, 3`eme s´erie 7 375–422 (1881), and 8 251–296 (1882). Reprinted in Œuvres de Henri Poincar´e, tome I (Paris: Gauthier-Villars 1951) ´ H., Sur l’´equilibre d’une masse fluide anim´ 288. Poincare ee d’un mouvement de rotation, Acta Mathematica 7 259–380 (1885). Reprinted in Œuvres de Henri Poincar´e, tome VII (Paris: Gauthier-Villars 1952) ´ H., Sur le probl`eme des trois corps et les ´ 289. Poincare equations de la dynamique, Acta Mathematica 13 1–270 (1890). Reprinted in Œuvres de Henri Poincar´e, tome VII (Paris: Gauthier-Villars 1952) ´ H., Science et M´ethode, (Paris: Flammarion 1909). English transla290. Poincare tion by G. B. Halsted, in The Foundations of Science: Science and Hypothesis, The Value of Science, Science and Method (Lancaster, PA: The Science Press 1946) 291. Pol B. van der, Forced Oscillations in a Circuit with Nonlinear Resistance, London, Edinburgh and Dublin Philosophical Magazine 3 65–80 (1927) 292. Pomeau Y. and Manneville P., Intermittent Transition to Turbulence in Dissipative Dynamical Systems, Communications in Mathematical Physics 74 189–197 (1980) 293. Rapoport R. and Chammah A. M. with the collaboration of Orwant C. J., Prisoner’s Dilemma: A Study in Conflict and Cooperation (Ann Arbor: The University of Michigan Press 1965) 294. Raup D. M. and Sepkoski J. J., Periodicity of Extinctions in the Geologic Past, Proceedings of the National Academy of Sciences of the United States of America, 81 801–805 (1984) 295. Redner S., How Popular Is Your Paper? An Empirical Study of the Citation Distribution, The European Physical Journal B 4 131–134 (1998) 296. Redner S., Aggregation Kinetics of Popularity, Physica A 306 402–411 (2002) 297. Resnick M., Turtles, Termites, and Traffic Jams, Explorations in Massively Parallel Microworlds (Cambridge, MA: The MIT Press 1997)
References
385
298. Richter C. F., An Instrumental Earthquake Magnitude Scale, Bulletin of the Seismological Society of America 25 1–32 (1935) 299. Robinson C., Dynamical Systems: Stability, Symbolic Dynamics and Chaos (Boca Raton: CRC Press 1995) 300. Ruelle D. and Takens F., On the Nature of Turbulence, Communications in Mathematical Physics 20 167–192 (1971) 301. Ruelle D., Strange Attractors, The Mathematical Intelligencer 2 126–137 (1980) 302. Ruelle D., Chaotic Evolution and Strange Attractors (Cambridge, UK: Cambridge University Press 1989) 303. Ruelle D., Deterministic Chaos: The Science and the Fiction, Proceedings of the Royal Society A 427 241–248 (1990) 304. Russell D. A., Hanson J. D., and Ott E., Dimension of Strange Attractors, Physical Review Letters 45 1175–1178 (1980) 305. Sachs L., Applied Statistics: A Handbook of Techniques, translated from German by Zenon Reynarowych (New York: Springer-Verlag 1984) 306. Schelling T., Models of Segregation, American Economic Review, Papers and Proceedings 59 (2) 488–493 (1969) 307. Schelling T., Micromotives and Macrobehavior (New York: Norton 1978) 308. Schenk K., Drossel B., Clar S., and Schwabl F., Finite-Size Effects in the Self-Organized Critical Forest-Fire Model, The European Physical Journal B 15 177–185 (2000) 309. Schenk K., Drossel B., and Schwabl F., Self-Organized Critical Forest-Fire Model on Large Scales, Physical Review E 65 026135 (8p) (2002) 310. Scudo F. M., Vito Volterra and Theoretical Ecology, Theoretical Population Biology 2 1–23 (1971) 311. Segel L. A. and Jackson J. L., Dissipative Structures: An Explanation and an Ecological Example, Journal of Theoretical Biology 37 545–559 (1972) 312. Seton E. T., Life Histories of Northern Mammals: An Account of the Mammals of Manitoba (New York: Arno Press 1974) 313. Shields P. C., The Ergodic Theory of Discrete Sample Paths, Graduate Studies in Mathematics Series, volume 13, (Prividence, RI: American Mathematical Society 1996 314. Skellam J. G., Random Dispersal in Theoretical Populations, Biometrika 38 196–218 (1951) 315. Smale S., Differentiable Dynamical Systems, Bulletin of the American Mathematical Society 73 747–817 (1967) 316. Smith F. E., Population Dynamics of Daphnia magna and a New Model for Population Growth, Ecology 44 651–663 (1963) 317. Sneppen K., Bak P., Flyvbjerg H., and Jensen M. H., Evolution as a Self-Organized Critical Phenomenon, Proceedings of the National Academy of Sciences of the United States of America, 92 5209–5213 (1995) 318. Sneppen K. and Newman M. E. J., Coherent Noise, Scale Invariance and Intermittency in Large Systems, Physica D 110 209–223 (1997) ´ R. V. and Manrubia S. C., Extinction and Self-Organized Criticality 319. Sole in a Model of Large-Scale Evolution, Physical Review E 54 R42–R45 (1996) ´ R. V., Manrubia S. C., Benton M., and Bak P., Self-Similarity of 320. Sole Extinction Statistics in the Fossil Record, Nature 388 765–767 (1997) 321. Sornette D., Johansen A., and Bouchaud J.-P., Stock Market Crashes, Precursors and Replicas, Journal de Physique I 6 167–175, (1996)
386
References
322. Souma W., Physics of Personal Income, preprint:cond-mat/02022388 323. Sparrow C., The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors (New York: Springer-Verlag 1982) 324. Spencer J., Balancing Vectors in the Max Norm, Combinatorica 6 55-65 (1986) 325. Stanley S. M., Macroevolution, Pattern and Process (San Francisco: W. H. Freeman (1979) ˇ ˇ 326. Stefan P., A Theorem of Sarkovskii on the Existence of Periodic Orbits of Continuous Endomorphisms of the Real Line, Communications in Mathematical Physics 54 237–248 (1977) 327. “Student”, On the Probable Error of the Mean, Biometrika 6 1-25 (1908) 328. Thom R., Stabilit´e structurelle et morphogen`ese (Paris: Inter´editions 1972). English translation: Structural Stability and Morphogenesis (Reading, MA: Addison-Wesley 1972) 329. Thom R., Pr´edire n’est pas expliquer (Paris: Flammarion 1993) 330. Thompson J. N., Interaction and Coevolution (New York: John Wiley 1982) 331. Toner J. and Tu Y., Long-Range Order in a Two-Dimensional Dynamical XY Model: How Birds Fly Together, Physical Review Letters 75 4326–4329 (1995) 332. Tresser C., Coullet P., and Arneodo A., Topological Horseshoe and Numerically Observed Chaotic Behaviour in the H´ enon Mapping, Journal of Physics A: Mathematical and General 13 L123–L127 (1980) 333. Tsallis C. and Albuquerque M. P. de, Are Citations of Scientific Papers a Case of Nonextensivity, The European Physical Journal B 13 777–780 (2000) 334. Turing A. M., The Chemical Basis of Morphogenesis, Philosophical Transaction of the Royal Society B 237 37–72 (1952) ´zquez A., Knowing a Network by Walking on it: Emergence of Scaling, 335. Va Europhysics Letters 54 430–435 (2001) ´zquez A., Statistics of Citation Networks, preprint:cond-mat/0105031 336. Va 337. Vellekoop M. and Berglund R., On Intervals, Transitivity = Chaos, American Mathematical Monthly 101 353–355 (1994) 338. Verhulst F., Notice sur la loi que la population suit dans son accroissement, Correspondances Math´ematiques et Physiques 10 113–121 (1838) 339. Vicsek T., Czirk A., Ben-Jacob E., Cohen I., and Shochet O., Novel Type of Phase Transition in a System of Self-Driven Particles, Physical Review Letters 75 1226–1229 (1995) 340. Volterra V., Variazioni e fluttuazioni del numero d’individui in specie animali conviventi, Rendiconti dell’Accademia dei Lincei, 6 (2) 31–113 (1926). An abridged English version has been published in Fluctuations in the Abundance of a Species Considered Mathematically, Nature 118 558–560 (1926) 341. Waltman P., Deterministic Threshold Models in the Theory of Epidemics, (Heidelberg: Springer-Verlag 1974) 342. Watts D. J. and Strogatz S. H., Collective Dynamics of ‘Small-World’ networks, Nature 393 440–442 (1998) 343. Weinreich G., Solids: Elementary Theory for Advanced Students (New York: John Wiley 1965) 344. Whittaker J. V., An Analytical Description of Some Simple Cases of Chaotic Behavior, American Mathematical Monthly 98 489–504 (1991)
References
387
345. Wilkinson D. and Willemsen J. F., Invasion Percolation: A New Form of Percolation Theory, Journal of Physics A: Mathematical and General 16 3365– 3376 (1983) 346. Wilks S. S., Mathematical Statistics (New York: John Wiley 1962) 347. Witten T. A. and Sander L. M., Diffusion-Limited Aggregation,, a Kinetic Critical Phenomenon, Physical Review Letters 47 1400–1403 (1981) 348. Wolfram S., Statistical Physics of Cellular Automata, Reviews of Modern Physics, 55 601–644 (1983) 349. Wolfram S., Cellular Automata as Models of Complexity, Nature 311 419– 424 (1984) 350. Wu F. Y., The Potts Model, Reviews of Modern Physics 54 235–268 (1982) 351. Zanette D. H. and Manrubia S. C., Role of Intermittency in Urban Development: A Model of Large-Scale City Formation, Physical Review Letters 79 523–526 (1997) 352. Zanette D. H. and Manrubia S. C., Zanette and Manrubia Reply, Physical Review Letters 80 4831 (1998) 353. Zanette D. H. and Manrubia S. C., Vertical Transmission of Culture and the Distribution of Family Names, Physica A 295 1–8 (2001) 354. Zeeman C., Catastrophe Theory: Selected Papers 1972–1977 (Reading, MA: Addison-Wesley 1977) 355. Zipf G. K., Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology (Reading, MA: Addison Wesley 1949), reprinted by (New York: Hafner 1965) 356. Zipf G. K., The Psycho-biology of Language: An Introduction to Dynamic Philology (Cambridge, MA: The MIT Press 1965)
Index
α-limit point, 64 α-limit set, 64 β-expansions, 221 ω-limit point, 64 ω-limit set, 64 ε-C 1 -close, 70, 117 ε-C 1 -perturbation, 70, 117 ε-neighborhood, 70, 117 n-block, 209 n-input rule, 191 absolutely continuous probability measure, 318 accumulation point, 162 active site, 192 age distribution, 12 agent-based modeling, 187, 243 moving agents, 242 Allee effect, 136 animal groups, 2 ants, 1–2 apparent power-law behavior, 311, 324, 341, 361 ARCH model, 337–339, 353, 354, 360, 363 Arnol’d, V. I., 62 artificial society, 243 atlas, 43 attachment rate, 296 attracting set, 161 attractive focus, 54 attractive node, 54 automata network, 187 avalanche, 342
duration, 342 size, 342 Axelrod model of culture dissemination, 245–248 Axelrod, R., 243, 245, 246, 253, 255, 256, 272 Ayala, F. J., 103, 177 Ayala-Gilpin-Ehrenfeld model, 103 Bachelier, L., 335 backward orbit, 108 baker’s map, 165 basin of attraction, 64 Beddington-Free-Lawton model, 115 Bendixson criterion, 85 Bethe lattice, 261 bidirectional pedestrian traffic model, 205 bifurcation diagram, 73 flip, 123 fold, 54 Hopf, 27, 83–85, 125–127, 310 period-doubling, 123–125, 143 pitchfork, 73, 74, 79–80, 122–123 point, 27, 29 saddle-node, 73–77, 120, 139, 140 symmetry breaking, 74 tangent, 54, 73 transcritical, 74, 77–79, 121–122, 268, 301, 308 bifurcations of maps, 120–127 of vector fields, 71–85
390
Index
sequence of period-doubling, 127–135, 143 binomial distribution, 322 bird flocks, 2 block probability distribution, 210 boundary of a set, 162 boundary point, 162 budworm outbreaks, 7–9, 99 Bulgarian solitaire, 15 burst, 155 Burstedde-Klauck-SchadschneiderZittartz pedestrian traffic model, 206 C0 conjugate, 56, 112 distance, 70 function, 42 norm, 69 topology, 69, 70 C1 class, 42 distance, 70 Ck class, 42 diffeomorphism, 42, 43 distance, 70 function, 42, 43 vector field, 43 C-interval, 215 Cantor diagonal process, 163 Cantor function, 163 Cantor set, 165, 166, 169, 218 capacity, 165 carrying capacity, 6, 8, 9, 13, 21–25, 28 cascade, 108 catastrophe, 85–91 Cauchy distribution, 326, 352, 355 Cauchy sequence, 110 cellular automaton, 33 evolution operator, 191 generalized rule 184, 198, 201 global rule, 191 limit set, 193, 194, 199, 200, 202, 203, 209, 213, 234, 258, 263 rule 184, 192–194, 198, 202, 205, 208 rule 18, 208, 263 rule 232, 264 totalistic rule, 224, 234, 235, 272
center, 54 center manifold, 59 central control, 2 central limit theorem, 320–323, 330, 337, 356 characteristic path length, 281, 285, 299, 302 characteristic polynomial, 54 chart, 43 citation network, 290, 291, 294, 297 Clark, C. W., 98 closed orbit, 44 clustering coefficient, 281, 299, 302 cobweb, 108 codimension, 118 collaboration network of movie actors, 276, 279, 282, 283, 287, 288, 292, 294 Collatz conjecture, 15 original problem, 15 collective stability, 256 community matrix, 19 competition, 66, 67, 99, 102 competitive exclusion principle, 67 complete metric space, 110 conditional heteroskedasticity, 337 configuration, 191 conforming, 261 conjugate flows, 49 maps, 108 contracting map, 110 cooperative effect, 188 critical behavior, 188, 225 critical car density, 265 critical density, 203 critical exponents, 188, 199, 226 critical patch size, 94 critical point, 133, 188 critical probability, 221 critical space dimension lower, 189, 223 upper, 33, 189, 200, 223 critical state, 188 cross-section, 118 cumulative distribution function, 219, 220, 303, 311, 317 approximate, 334, 362
Index cusp catastrophe, 85 cylinder set, 215 damped pendulum, 71 de Gennes, P.-G., 221 decentralized systems, 4 defect, 194 defective tiles, 201–203 degree of cultural similarity, 246, 247 degree of mixing, 228–239 degree of separation, 276 delayed logistic model, 126 demographic distribution, 339 dense orbit, 152, 156, 161, 169 dense periodic points, 176, 180 dense subset, 147 density classification problem, 259 derivative, 42 Devaney, R. L., 127, 153 Didinium, 71, 83 diffeomorphism, 42, 43 diffusion, 91 diffusion equation, 91 diffusion-induced instability, 95 diffusion-limited aggregation, 227 dimensionless, 7–9 directed percolation, 223 probability, 224 discrete diffusion equation, 230 discrete one-population model, 113, 124, 128, 161, 169 disease-free state, 138, 142, 240, 301, 308, 309 distance, 14, 69 between two evolution operators, 218, 219 between two linear operators, 51 Domany-Kinzel cellular automaton, 224 drought duration, 351 Dulac criterion, 85 dynamical system, 9–14 earthquake stick-slip frictional instability, 349 earthquakes, 348–351 Burridge-Knopoff model, 349 epicenter, 348 focus, 348 Gutenberg-Richter law, 348
391
lithosphere, 348 magnitude number, 348 Olami-Feder-Christensen model, 349 plate tectonics, 348 Richter scale, 348 seismic moment, 350 econophysics, 335 Eden model, 260, 267 elementary cellular automaton, 192 e-mail networks, 292 emergent behavior, 2–4 endemic state, 138, 142, 240, 301, 308, 309 epidemic model Boccara-Cheong SIR, 236–239 Boccara-Cheong SIS, 239–242 Grassberger, 224 Hethcote-York, 46 equilibrium point, 44 equivalence relation, 50 Erd¨ os number, 276 ergodic map, 156, 158 ergodic theory, 156 error and attack tolerance, 294 Euler Γ function, 327 event, 317 evolution law, 9, 14 evolution operator, 213–216 evolutionarily stable strategy, 255 existence and unicity of solutions, 10, 43 exponential cutoff, 353, 360 exponential distribution, 313, 358 exponential of an operator, 51 extinction, 101, 140 family names, 340 Feigenbaum number, 130, 172, 235 financial markets, 335–339 finite vertex life span, 328 first return map, 118 fish schools, 2 fixed point, 20, 23, 35, 36, 44, 108 flocking behavior, 207 floor field, 206 static, 206 flow, 43, 44, 49, 50, 56, 66, 72, 107–109, 118 forest fires, 226–227
392
Index
critical probability, 227 direction of travel, 226 intensity, 226 rate of spread, 226 forward orbit, 108 Fourier series, 94 fractal, 165, 227 free empty cells, 201, 202 free-moving phase, 198, 199, 201 fruit flies, 99 Fukui-Ishibashi pedestrian traffic model, 204 Fukui-Ishibashi traffic flow model, 198 fundamental diagram, 260 fundamental solution, 92 game of life, 3 game theory, 248 minimax theorem, 252 mixed strategies, 251 optimal mixed strategies, 254 optimal strategies, 251, 252 payoff matrix, 249 saddle point, 251 solution of the game, 251, 252 strategy, 250 value of the game, 250, 252 zero-sum game, 249 Gamma distribution, 356 GARCH model, 354, 363 Gauss distribution, 320, 327 general epidemic process, 224–226 generalized eigenvector, 53 generalized homogeneous function, 92, 189 generating function, 300, 306 Gibrat index, 312 global evolution rule, 193, 213, 217 Gompertz, B., 138, 176 Gordon, D. M., 1, 2 graph, 3, 187, 276 adjacency matrix, 278 adjacent vertex, 276 arc, 276 bipartite graph, 279 chain, 277 characteristic path length, 281 chemical distance between two vertices, 277
clustering coefficient, 281 complete, 277 component, 278 connected, 278 cycle, 278 degree of a vertex, 277 diameter, 281 directed edge, 276 distance between two vertices, 277 empty, 277 giant component, 276, 280, 281, 302, in-degree of a vertex, 277, 288, 289, 291, 295–297 isolated vertex, 277 isomorphism, 277 link, 276 neighborhood of a vertex, 276 order, 277 out-degree of a vertex, 277, 288, 289, 291, 295–297 path, 277 phase transition, 280 random, 279 binomial, 280 characteristic path length, 281 clustering coefficient, 281 diameter, 281 uniform, 280 regular, 277 size, 277 transitive linking, 298 tree, 278 vertex, 276 graphical analysis, 108 Grassberger, P., 224, 225, 259, 343–345 Green’s function, 92 growing network model Barab´ asi-Albert, 292–294 Dorogovtsev-Mendes-Samukhin, 295 extended Barab´ asi-Albert, 294 Krapivsky-Rodgers-Redner, 295–297 V´ azquez, 297–298 growth rate, 66 H´enon map, 170 H´enon strange attractor, 172 habitat size, 36 Hamming distance, 14, 231 Harrison, G. W., 25
Index Hartman-Grobman theorem, 56 harvesting, 98, 137, 140 Hassel one-population model, 113 Hausdorff dimension, 164, 172, 227 Hausdorff outer measure, 164 highway car traffic, 196–204 histogram, 148 Holling, C. S., 25 homeomorphism, 42, 50 host-parasitoid Beddington-Free-Lawton model, 115 Nicholson-Bailey model, 114 host-parasitoid model, 114 Hutchinson, G. E., 6, 12, 13, 22, 28 Hutchinson time-delay model, 12, 28 hyperbolic point, 56, 112 hysteresis, 87 hysteresis loop, 87 implicit function theorem, 72, 75–78, 120–123 infective individual, 45, 137, 187, 236, 239 infinite-range interactions, 137 interior of a set, 162 interior point, 162 Internet Movie Database, 276, 282 invariant measure, 148, 156 invariant probability density, 156, 176–178, 181 invariant probability measure, 157 invariant set, 64 invasion percolation, 261, 271 iterated prisoner’s dilemma, 253, 256 Jacobian, 56, 72, 112, 120 jammed phase, 198, 199 Kermack-McKendrick threshold theorem, 238, 309 Kingsland, S. E., 18 Kolmogorov, A. N., 67, 317 kurtosis excess, 322 L´evy distribution, 327, 328 L´evy flight, 328 laminar phase, 154 laminar traffic flow, 197 Langevin function, 359
393
large deviations, 323 law of large numbers, 333 Leslie, P. H., 25 Leslie’s model, 48 Li, T.-Y., 154, 159 Li-York condition, 160 theorem, 159 limit cycle, 22, 25, 27, 35, 64, 309 limit point, 162 limit set, 209 linear differential equation, 56 function, 56, 112 part, 56, 112 recurrence equation, 112 system, 56 transformation, 42 linearization, 51 local evolution rule, 33, 187, 191, 192, 216, 243, 246, 247, 254 local jam, 201 local structure theory, 209 local transition rule, 187 logistic model, 66 fluctuating environments, 27 time-continuous, 6–9, 12–14 time-delay, 12, 28 lognormal distribution, 312, 324 long-range move, 228, 237, 239, 244 look-up table, 192 Lorentz distribution, 326 Lorenz, E. N., 145, 146, 174 Lorenz equations, 174 Lorenz strange attractor, 174 Lotka, J., 17 Lotka-Volterra competition model, 66 modified model, 23 time-continuous model, 17, 98 time-discrete model, 33 Luckinbill’s experiment, 26 Ludwig-Jones-Holling spruce budworm model, 7, 88, 95, 99 Lyapunov exponent, 158–159 logistic map, 158 Lyapunov function, 98 strong, 60 weak, 60, 100
394
Index
Lyapunov theorem, 60 lynx-hare cycle, 21 majority rule, 264 Malthus, T. R., 22 Mandelbrot, B. B., 165, 291, 335 Manhattan distance, 221 manifold, 11, 43 Manneville, P., 154 market, 3 market index Hang Seng, 337 NIKKEI, 337 S&P 500, 337 mass extinctions, 347–348 May, R. M., 24, 113, 146, 256, 257, 272 Maynard Smith, J., 4, 103, 126, 255 mean-field approximation, 35, 208–209, 237–240, 301, 307 metric, 51, 69 Milgram, S., 275, 276 model, 4–14 moment-generating function, 323 motion representation, 197 movie actors, 3 multiple random walkers, 13, 229, 231, 242, 260, 267 amiable, 242 timorous, 242 243, 247, 254 Muramatsu-Irie-Nagatani pedestrian traffic model, 206 Murray, J. D., 95 muskrat, 93 Myrberg, P. J., 128 Nagel-Schreckenberg traffic flow model, 196 neighborhood, 189 Moore, 33 von Neumann, 33 Newman, M. E. J., 282, 286, 287, 347, 348, 351 Newton’s method, 111 Nicholson-Bailey model, 114 nilpotent operator, 53 nonlinear oscillator, 11 norm, 69 norm of a linear operator, 51 normal distribution, 92, 320, 352
null cline, 66, 99, 102 number-conserving cellular automaton, 194, 260, 265 off-lattice models, 207 open cluster, 222 open set, 70 orbit, 43, 108 order parameter, 189 Paramecium, 71, 83 Pareto exponent, 311–313, 328 law, 311 particle-hopping model, 196 particlelike structures, 194 pedestrian traffic, 204–207 pendulum damped, 57, 61 simple, 10, 60 percolation, 223–226 bond, 223 probability, 225 site, 223 percolation 221–224 bond 221 probability 223 site 221 threshold, 222 perfect set, 166 perfect tile, 199, 201–203 period of a closed orbit, 44 period of a periodic point, 108 period-doubling bifurcation, 129 period-doubling route to chaos, 234 periodic orbit, 108, 147, 149–151, 159, 160, 176 periodic point, 108 Perron-Frobenius equation, 157 perturbed center, 65 perturbed harmonic oscillator, 58, 71, 99 phase flow, 43 phase portrait, 20, 43 phase space, 9–15, 42 Pielou, E. C., 24 Poincar´e-Bendixson theorem, 64, 99 Poincar´e, H., 41, 145 Poincar´e map, 118
Index Poincar´e recurrence theorem, 152 Pomeau, Y., 154, 172 population growth, 5 positively invariant set, 65 predation, 17 predator’s functional response, 24, 25 predator’s numerical response, 24 preferential attachment, 290, 293 prisoner’s dilemma, 252 probabilistic cellular automaton, 196 probabilistic cellular automaton rule, 192 probability density, 318 Cauchy, 326, 330, 352, 355, 357 exponential, 313, 358 Gamma, 356 Gauss, 320, 327 L´evy, 327, 328 lognormal, 312, 324 Lorentzian, 326 normal, 320, 352 Student, 330, 362 truncated L´evy, 328, 330 probability distribution Bernoulli, 228 binomial, 322 Poisson, 280, 299, 300, 302, 306, 307, 321 probability measure, 156, 317 probability space, 317 radii of a rule, 33, 191 rain event, 351 rainfall, 351–352 random dispersal, 91 critical patch size, 94 induced instability, 95–97 one-population models, 92 random network, 279–284, 292, 294 epidemic model on a, 301 random variable, 317 absolutely continuous, 318 average value, 318 Bernoulli, 228, 323, 352 binomial, 322 Cauchy, 326, 357 characteristic function, 319 convergence almost sure, 338 convergence in distribution, 319
   convergence in probability, 333
   discrete, 318
   exponential, 352
   Gaussian, 320–325
   kurtosis, 322
   kurtosis excess, 322, 338, 353, 361, 363
   Lévy, 325–328, 335
   lognormal, 324
   Lorentzian, 326
   maximum of a sequence, 353
   mean value, 318, 353, 361, 363
   median, 322
   minimum of a sequence, 353
   mode, 322
   moment-generating function, 323
   moment of order r, 318
   normal, 325
   sample of observed values, 332
   skewness, 327, 353, 361, 363
   standard deviation, 319
   truncated Lévy, 328
   variance, 318, 353, 361, 363
random walk
   Cauchy, 328
random walk and diffusion, 91
refuge, 94, 102
removed individual, 45, 137, 236
repulsive focus, 54
repulsive node, 54
Resnick, M., 4
return, 335
   normalized, 336
rich get richer phenomenon, 293
Rift Valley Fever, 242
rotating hoop, 81
route to chaos
   intermittency, 154
   period-doubling, 154
Ruelle, D., 156, 167
rule code number, 192
saddle, 54, 58
saddle node, 54
sample space, 317
Šarkovskii theorem, 150, 151, 159, 176
scaled variables, 7
scaling relation, 189, 200, 285
scatter plot, 361
scientific collaboration networks, 282, 283, 289, 294
second-order phase transition, 188, 198, 199, 308
self-consistency conditions, 210
self-organization, 2–3
self-organized criticality, 341–352
self-propelled particles, 207
self-similar function, 191
self-similar set, 164
sensitive dependence on initial conditions, 153, 161, 176, 180
sentinel, 2
Seton, E. T., 21
sexual contact network, 291
sexually transmitted disease, 46, 138
short-range move, 228, 237, 239, 241, 244
simulation, 4
singular measure, 219, 318
sink, 57
SIR epidemic model
   Boccara-Cheong, 137
   Kermack-McKendrick, 45, 137
SIS epidemic model
   Boccara-Cheong, 138
   Hethcote-Yorke, 47
site-exchange cellular automata, 228–243
Skellam, J. G., 93
skewness, 322
small-world model
   Davidsen-Ebel-Bornholdt, 298–299
   highly connected extra vertex, 287
   Newman-Watts, 286–287
   shortcut, 286
   Watts-Strogatz, 283–286
   Watts-Strogatz modified, 285
Smith model, 98, 100
smooth function, 42
social networks, 3
source, 58
space
   metric, 69
   normed vector, 69
space average, 156, 158
space of elementary events, 317
spatial segregation model, 243–245
spectrum, 120
stability
   asymptotic, 23, 44, 110
   Lyapunov, 44, 110
   structural, 24, 69, 118
stable manifold, 56, 59
stable manifold theorem, 59
stable Paretian hypothesis, 335
stable probability distribution, 326
StarLogo, 4
start-stop waves, 197
state
   at a site, 191
   of a cellular automaton, 191
state space, 9
statistics
   arithmetic mean, 332
   kurtosis excess, 332
   median, 332
   skewness, 332
   standard deviation, 332
stock exchange, 3
stock market, 3
strange attractor, 167, 169, 172, 174
strange repeller, 167
street gang control, 87–91
Strogatz, S. H., 283, 285, 286, 292, 294
Student's test, 333
   confidence interval, 334
subgraph, 276
superstable, 133
surjective mapping, 259
susceptible individual, 45, 47, 137, 187, 236, 239
suspicious tit for tat, 261
symmetrical difference, 219
symmetry-breaking field, 199
T-point cycle, 108
tent map, 109
   asymmetric, 176, 178
   binary, 149
   symmetric, 176
tentative move, 229
termites, 4
theta model, 177
three-spined sticklebacks, 256
threshold phenomenon, 45
time-discrete analogue, 32
tit for tat, 255, 256
tit for two tats, 261
topological space, 70
topological transitivity, 152
topologically equivalent flows, 50, 82
topology, 70
traffic, 3
traffic flow models, 212–213
trajectory, 43
trapping region, 161
tree swallow, 256
truncated Lévy distribution, 328, 330
truncation operator, 209
Turing effect, 95
two-dimensional linear flows, 54
two-lane car traffic flow model, 259, 265
two-species competition model, 99
U-sequences, 134
unfolding, 82
unimodal map, 131, 133
universality, 131–135, 172, 227, 235, 294, 337, 339
universality class, 188, 207, 225
unstable, 44
unstable manifold, 56, 59
van der Pol oscillator, 61
variational principle, 201, 202
vector field, 42
velocity configuration, 197
velocity probability distribution, 213
velocity rule, 197
Verhulst, P. F., 6
volatility, 336
Volterra, V., 17
voting, 340
Watts, D. J., 283, 285, 286, 292, 294
Web pages, 288
Whittaker, J. V., 149
word network, 316
World Wide Web, 3, 288, 289, 294, 295, 298
Yorke, J. A., 154, 159
Zipf law, 314