Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen Advisory Board: W. Brauer
D. Gries
J. Stoer
899
Wolfgang Banzhaf Frank H. Eeckman (Eds.)
Evolution and Biocomputation
Computational Models of Evolution
Springer
Series Editors Gerhard Goos Universität Karlsruhe Vincenz-Priessnitz-Straße 3, D-76128 Karlsruhe, Germany Juris Hartmanis Department of Computer Science, Cornell University 4130 Upson Hall, Ithaca, NY 14853, USA Jan van Leeuwen Department of Computer Science, Utrecht University Padualaan 14, 3584 CH Utrecht, The Netherlands
Volume Editors Wolfgang Banzhaf Department of Computer Science, University of Dortmund D-44221 Dortmund, Germany Frank H. Eeckman Human Genome Center, Lawrence Berkeley Laboratory Berkeley, CA 94720, USA
CR Subject Classification (1991): F.2, I.2, G.2, J.3 ISBN 3-540-59046-3 Springer-Verlag Berlin Heidelberg New York CIP data applied for This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1995 Printed in Germany Typesetting: Camera-ready by author SPIN: 10485391 45/3140-543210 - Printed on acid-free paper
Preface

Biology is the eternal interdisciplinary subject. With its position right between the physical sciences and the behavioral/social sciences, its progress always depends on good relationships with neighboring disciplines. This was true even in the good old times of the 18th and 19th centuries, when many of the lasting contributions of biology depended on collaboration with geography and geology, as in the case of evolutionary theory, or mathematics, as in the case of Mendelian genetics. The 20th century cell biological and molecular revolution is fueled by the influx of concepts and techniques from chemistry and physics. Now, close to the end of the 20th century, a new ally to biology is emerging, computer science.

I consider the connection of biology to computer science to be of fundamental significance for the future of biological science (as a biologist I cannot talk about the future of computer science). This relationship develops on two levels: the methodological and the fundamental conceptual level. Obviously, advances in computer science are important in handling the new types of data biology is currently producing. They range from nucleotide sequences to data about foodweb structure, each requiring new techniques to handle and analyze.

However, the connection to computer science goes beyond data handling and analysis and addresses one of the deep unsolved problems of biology: the problem of organization. As far as we know, every biological event is based on ordinary physico-chemical processes; no special vital force keeps organisms alive. What makes organisms different from the so-called inanimate world is the spatial and temporal organization of these physico-chemical processes. But there is no scientific paradigm which allows us to tackle this problem in a systematic fashion. Similarly, computation is due to the spatial and temporal organization of data streams based on fairly ordinary physical events in processors. 
This is the conceptual ground on which biology and computer science meet and have, in my opinion, the chance to make a lasting impact. It is this assessment of the importance of computer science to biology that makes me especially welcome the publication of this book. It brings together computer scientists and biologists to discuss one of the areas where the connection has already been established, evolutionary theory and evolutionary algorithms, but where the communication between the disciplines is still limited. Computer scientists and biologists have fundamentally different ways of looking at evolutionary theory. This makes their interaction so promising and interesting. For a biologist evolution is a fact, and he/she uses population genetics theory to understand how it happened. In contrast, computer scientists, at least those working in evolutionary algorithms, want to harness the principle of natural selection to find solutions to new problems. This means that the computer scientists have a forward looking perspective, while the biologist has a backward looking perspective. Consequently it was the computer scientists who first had to deal seriously with the problem that mutation/selection only works under certain conditions, a fact known as the representation problem. In biology this topic was rarely discussed and mostly overlooked, even if it is a serious problem in explaining the evolution of complex adaptations (Frazetta, 1975, Riedl, 1975,
Bonner, 1988). Who or what has chosen the right genetic representation for the living species to be able to adapt? Another area, reflected in the contributions to this book, where computer science and biology have much in common is the universality of organizational principles. The root of computer science is the discovery of abstract, universal calculation machines that work regardless of their hardware realization. The best known universal principle in biology has already been mentioned: natural selection. But there is the justified expectation that there ought to be more of these principles. One reason is that the principle of natural selection does not easily explain the class of problems with the labels "The origin of..." (Fontana and Buss, 1994). This is the philosophical basis of artificial life research (Langton, 1989). I agree, but remark that no such principle has been found yet (perhaps the "edge of chaos"?), and I have also not seen a paradigm which promises progress along these lines. However, if principles exist, they are most likely to be found by the combined efforts of people trained in both computer science and biology.
Yale University November 1994
Günter P. Wagner
References

Bonner, J.T. (1988): The evolution of complexity. Princeton University Press, Princeton, NJ.
Fontana, W. and Buss, L.W. (1994): "The arrival of the fittest": toward a theory of biological organization. Bull. Math. Biol. 56:1 - 64.
Frazetta, T.H. (1975): Complex Adaptations in Evolving Populations. Sinauer Ass., Sunderland, MA.
Holland, J.H. (1992): Adaptation in natural and artificial systems. MIT Press, Cambridge, MA.
Langton, C.G. (1992): Computation at the edge of chaos: Phase transitions and emergent computation. Manuscript.
Langton, C.G. (1989): Preface to Artificial life. Edited by C.G. Langton, Santa Fe Institute, Studies in the Sciences of Complexity Series, Vol VI, Addison Wesley, Redwood City, CA.
Rechenberg, I. (1973): Evolutionsstrategie. Friedrich Frommann Verlag, Stuttgart.
Riedl, R. (1975): Die Ordnung des Lebendigen. Systembedingungen der Evolution. Verlag Paul Parey, Hamburg and Berlin.
Contents

Editors' Introduction
W. Banzhaf and F.H. Eeckman

Aspects of Optimality Behavior in Population Genetics Theory
W.J. Ewens and A. Hastings

Optimization as a Technique for Studying Population Genetics Equations
A. Hastings and G.A. Fox

Emergence of Mutualism
G. Duchateau-Nguyen, G. Weisbuch and L. Peliti

Three Illustrations of Artificial Life's Working Hypothesis
M.A. Bedau

Self-Organizing Algorithms Derived from RNA Interactions
W. Banzhaf

Modeling the Connection Between Development and Evolution: Preliminary Report
E. Mjolsness, C.D. Garrett, J. Reinitz and D.H. Sharp

Soft Genetic Operators in Evolutionary Algorithms
H.-M. Voigt

Analysis of Selection, Mutation and Recombination in Genetic Algorithms
H. Mühlenbein and D. Schlierkamp-Voosen

The Role of Mate Choice in Biocomputation: Sexual Selection as a Process of Search, Optimization and Diversification
G.F. Miller and P.M. Todd

Genome Growth and the Evolution of the Genotype-Phenotype Map
L. Altenberg

About the Contributors

Index
EDITORS' INTRODUCTION

This volume comprises papers presented at an interdisciplinary workshop on biocomputation entitled "Evolution as a computational process" held in Monterey, California, in July 1992. The Monterey workshop brought together scientists from diverse backgrounds to discuss the implications of viewing evolution as a computational process. As such, evolution can be considered in the more general framework of biocomputation.

Biocomputation may be broadly defined by its emphasis on understanding biological systems as computational devices. Many biocomputation subgroups have identified themselves clearly over the years: computational population biology, computational biochemistry, computational neuroscience, etc. Altogether, biocomputation is situated at the intersection between the biological sciences and computer science. Scientists and engineers with different backgrounds converge here, bringing with them specific insights and viewpoints and exporting new ideas to outside areas.

Biocomputation may also be considered as part of an ambitious enterprise to uncover the secrets of the living universe. We would like to understand the genetic library each of us is carrying around; we would like to formulate principles of information processing in organisms and other living systems that have evolved over billions of years; we would like to know (or at least have a well founded scientific hypothesis) whether life should be seen as a single and unique event in the history of our Universe or whether there is a large probability of other forms of life elsewhere, maybe in a nearby galaxy.

Computer scientists and engineers have already started to use strategies that are modeled loosely after Nature's recipes for optimization, adaptation and improvement of designs. Examples are neural networks [1]-[3] and evolutionary algorithms [4]-[8], which have recently entered the world of industrial applications after decade-long investigations in academia. 
In our opinion we have only just started to scratch the surface and it seems likely that many more treasures will await us as we further our understanding of biological computation.

A central notion in biocomputation is that of Emergent Properties or Emergence. Emergence started as a philosophical idea early in this century [9]-[12]. It describes the dynamical process of qualitative changes, e.g. in the form of a creation of new structures and capabilities, and of complexity growth in nonlinear systems due to increased interaction between components. Consequently,
researchers are now paying more attention to the dynamical aspects of the origin of systems. Investigations of emergent phenomena in natural and artificial systems [13]-[16] are playing a prominent role in our understanding of self-organization and evolution. Since emergence usually needs exponential growth rates (at least early on in its dynamics) we can assume that positive feed-back loops are in effect. A wealth of emergent phenomena can therefore be found in communication links. Such links are effective at the organismal level, for example the emergence of language from primitive utterances, or at the societal level, as in the emergence of common technologies through reinforcement.

Instabilities caused by positive feed-back loops in a system are required to move the system from one qualitative stage to the next. The requirement for complex systems to teeter "at the edge of chaos" [17], [18] or near instabilities [19] is therefore understandable. Only the violent forces of instability allow a system to truly evolve. However, continuous exponential growth of unstable modes is simply not possible in a finite world. Sooner or later the stabilizing forces of evolution will limit growth by subjugating organisms to selection. Selection is a consequence of competition that itself results from finite resources. It is only when the instabilities are held at bay by selection that we can start to see structure in a system. Stabilizing selection and de novo emergence are the main themes of evolution viewed as a computational process.

The general course of evolution has often been associated with that of an adaptive search. Hence it has been a longstanding controversy in evolutionary biology whether evolution really is a dynamical process that searches for optima [20] or not. What then is optimized, and what is the measure of quality in evolution? 
Warren Ewens and Alan Hastings [21] discuss this question in the context of a one-locus population genetics model and propose a new interpretation of results obtained by Svirezhev [22]. The basic idea is to formulate a Hamiltonian principle similar to the ones formulated in physics for various dynamical problems. Evolutionary dynamics then follows naturally by computation of extrema of the corresponding scalar function in the integrand. Alan Hastings and Gordon Fox [23] further elaborate on this idea and derive results for two-locus models.

Why then, one might ask, are there multiple solutions if evolution is an optimization process? Guillemette Duchateau et al. [24] answer this question in a dynamical model of the emergence of coelenterate-algae symbiosis. Interestingly, they are able to show a region of co-existence between symbiosis and selfishness. They interpret their results by drawing an analogy to phases in thermodynamics.

The thermodynamical metaphor is also at the center of the argument of Mark Bedau [25]. In the context of simple artificial life models he has devised, he examines statistical macrovariables like mean values and diversity measures of traits within a population as well as adaptive evolutionary activity in the population as a whole. He demonstrates convincingly that the identification of macrovariables is key to understanding such models.
One of us (W.B., [26]) highlights another important aspect of self-organization: the appearance of organizing entities acting on themselves. He does so by discussing a model of self-organizing algorithms. Since the days of von Neumann [27], this theme has been reverberating in the self-organization literature [28]. It finds its expression here in a very simple form using sequences of binary numbers.

Lee Altenberg [29] considers the genotype-phenotype mapping and demonstrates the advantage of his "constructional" selection in the process of adaptation. Specifically, he considers the variational aspect of the representation problem and pleiotropy.

The relation and even interaction of evolution and development is discussed in the paper by Eric Mjolsness et al. [30]. Their model is based on a regulatory network for development introduced earlier [31] stating a grammar for development. First observations in the model show the emergence of different cell types in a simulation of multicellular organisms.

The next two papers discuss a class of algorithms that have become prominent in recent years [4]-[8]. Hans-Michael Voigt [32] blends evolutionary algorithms with fuzzy logic by introducing soft genetic operators. He compares the performance of these newly invented operators to what he calls hard genetic operators and is able to draw favorable conclusions for the fuzzy operators. Heinz Mühlenbein and Dirk Schlierkamp-Voosen [33] study the Breeder genetic algorithm and derive theoretical and empirical conclusions from the behavior of this algorithm applied to selected problems. The well known response-to-selection equation of quantitative genetics [34] plays a central role in their argument.

Finally, Geoffrey Miller and Peter Todd [35] provide strong arguments against the popular idea that natural selection can explain evolution. 
Going back to Darwin, they state that sexual selection is central in explaining innovation upon which natural selection might act only later on. As such, the emergence of new traits can be understood as resulting from the communication events of sexual selection. The more general statement about the necessity of instabilities through positive feed-back loops, mentioned in the beginning of this introduction, finds a very clear confirmation here. The workshop in Monterey was a truly interdisciplinary event and we attempted to bring together researchers on both sides of the issue, the biological and the computational. The aim of our meeting was to highlight and explore the notion of evolution as a giant computation being carried out over a vast spatial and temporal scale. We hope that the collection of essays presented here successfully reflects the spirit and enthusiasm at the meeting. Indeed, the impression was that computer scientists, mathematicians and physicists can learn about optimization from looking at evolution and that biologists may learn about evolution from studying artificial life, game theory, and mathematical optimization.
We would like to acknowledge the Institute for Scientific Computing Research at Lawrence Livermore National Laboratory (LLNL) and the Biocomputation Center at Sandia National Laboratory (SNL) for their generous support of this meeting. We would also like to thank the organizing committee for providing us with this opportunity for interdisciplinary communication. It is our pleasure to thank all the participants of the workshop and especially the invited speakers, for their valuable contributions. Chris Ghinazzi did a wonderful job coordinating the meeting and the special event banquet. We are grateful to Helge Baier who generated an index for the book. Last but not least we would like to express our gratitude to Dr. Alfred Hofmann from Springer, Heidelberg, for his friendly and helpful cooperation.
Wolfgang Banzhaf
Frank Eeckman
Dortmund and Berkeley, November 1994
References

1. Hecht-Nielsen, R. (1989): Neurocomputing. Addison-Wesley, Reading, MA.
2. Hertz, J., Krogh, A. and Palmer, R. (1991): Introduction to the Theory of Neural Computation. Addison-Wesley, Redwood City, CA.
3. Wasserman, P.D. (1993): Advanced Methods in Neural Computing. Van Nostrand Reinhold, New York, NY.
4. Rechenberg, I. (1973): Evolutionsstrategien. Frommann-Holzboog, Stuttgart.
5. Holland, J.H. (1975): Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.
6. Schwefel, H.P. (1981): Numerical Optimization. Wiley, Chichester, UK.
7. Goldberg, D. (1989): Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA.
8. Michalewicz, Z. (1992): Genetic Algorithms + Data Structures = Evolution Programs. Springer, Berlin.
9. Morgan, Lloyd C. (1923): Emergent Evolution. Williams & Norgate, London.
10. Pepper, S.C. (1926): Emergence. Philos. 23:241 - 245.
11. Ablowitz, R. (1939): The Theory of Emergence. Philos. Science 177:393 - 396.
12. Angyal, A. (1939): The Structure of Wholes. Philos. Sci. 6:25 - 37.
13. Forrest, S. (1991): Emergent Computation. MIT Press, Cambridge, MA.
14. Kampis, G. (1991): Self-modifying Systems in Biology and Cognitive Science. Pergamon Press, Oxford, UK.
15. Cariani, P. (1991): Emergence and Artificial Life. In: Langton, C., Taylor, C., Farmer, J. and Rasmussen, S. (Eds.): Artificial Life II. Addison-Wesley, Redwood City, CA, 775 - 797.
16. Baas, N. (1994): Emergence, Hierarchies and Hyperstructures. In: Langton, C. (Ed.): Artificial Life III. Addison-Wesley, Redwood City, CA, 515 - 537.
17. Langton, C. (1991): Life at the edge of chaos. In: Langton, C., Taylor, C., Farmer, J. and Rasmussen, S. (Eds.): Artificial Life II. Addison-Wesley, Redwood City, CA, 41 - 91.
18. Kauffman, S. and Johnsen, S. (1991): Co-Evolution to the Edge of Chaos: Coupled Fitness Landscapes, Poised States and Co-Evolutionary Avalanches. In: Langton, C., Taylor, C., Farmer, J. and Rasmussen, S. (Eds.): Artificial Life II. Addison-Wesley, Redwood City, CA, 325 - 369.
19. Haken, H. (1983): Synergetics, an Introduction. Springer, Berlin.
20. Dupre, J. (1987): The latest on the best. MIT Press, Cambridge, MA.
21. Ewens, W. and Hastings, A. (1995): Aspects of Optimality Behavior in Population Genetics Theory. This volume, 7 - 17.
22. Svirezhev, Y.M. (1972): Optimum principles in genetics. In: Studies on Theoretical Genetics. USSR Academy of Science, Nowosibirsk. [in Russian]
23. Hastings, A. and Fox, G. (1995): Optimization as a Technique for Studying Population Genetics Equations. This volume, 18 - 26.
24. Duchateau, G., Weisbuch, G. and Peliti, L. (1995): Emergence of Mutualism. This volume, 27 - 52.
25. Bedau, M. (1995): Three Illustrations of Artificial Life's Working Hypothesis. This volume, 53 - 68.
26. Banzhaf, W. (1995): Self-organizing Algorithms derived from RNA interactions. This volume, 69 - 102.
27. von Neumann, J. (1966): Theory of Self-reproducing Automata. Edited and completed by Burks, A.W. University of Illinois Press, Urbana, IL.
28. Langton, C. (1989): Artificial Life. In: Artificial Life. Langton, C. (Ed.). Addison Wesley, Redwood City, CA.
29. Altenberg, L. (1995): "Constructional" Selection and the Evolution of the Genotype-Phenotype Map. This volume, 205.
30. Mjolsness, E., Garrett, C., Reinitz, J. and Sharp, D. (1995): Modeling the connection between Development and Evolution. This volume, 103 - 123.
31. Mjolsness, E., Sharp, D. and Reinitz, J. (1991): A connectionist model of development. Journal of Theoretical Biology 152:429 - 453.
32. Voigt, H.M. (1995): Soft Genetic Operators in Evolutionary Algorithms. This volume, 123 - 141.
33. Mühlenbein, H. and Schlierkamp-Voosen, D. (1995): Analysis of Selection, Mutation and Recombination in Genetic Algorithms. This volume, 142 - 168.
34. Falconer, D.S. (1981): Introduction to quantitative Genetics. Longman, London.
35. Miller, G. and Todd, P. (1995): The role of mate choice in biocomputation: Sexual selection as a process of search, optimization and diversification. This volume, 169 - 204.
Organizing Committee of the Biocomputation Workshop
Evolution as a Computational Process

Joachim Buhmann, Lawrence Livermore National Laboratory, now at Bonn University
Michael Colvin, Sandia National Laboratory
Richard Durbin, Medical Research Council, Cambridge
Frank Eeckman, Lawrence Livermore National Laboratory, now at Lawrence Berkeley Laboratory
Richard Judson, Sandia National Laboratory
Nora Smiriga, Lawrence Livermore National Laboratory
Aspects of Optimality Behavior in Population Genetics Theory

W.J. Ewens 1 and Alan Hastings 2

1 Department of Biology, University of Pennsylvania, Philadelphia, PA 19104
2 Division of Environmental Studies, Center for Population Biology, and Institute for Theoretical Dynamics, University of California, Davis, CA 95616

Abstract. Optimality principles are central to many areas of the physical sciences, and often the simplest way of finding the evolutionary behavior of some dynamical system is by finding that path satisfying some optimality criterion. This paper discusses two aspects of the evolutionary paths followed by gene frequencies under natural selection as derived by optimality principles. The first, due to Svirezhev, is that when fitnesses depend on the genes at a single locus only, and random mating occurs, the evolutionary paths of gene frequencies, as determined by natural selection, minimize a functional which can be thought of as the sum of a kinetic and a potential energy. The second principle applies when fitness depends on all loci in the genome and random mating does not necessarily occur. The set of gene frequencies starts at some point p in gene frequency space and, some time later, under natural selection, is at some point q. There is a natural non-euclidean metric in the space of gene frequencies, and with this metric the distance from p to q is some value d. Then of all points in gene frequency space at distance d from p, the point q corresponding to natural selection maximizes the so-called partial increase in mean fitness, a central concept in a recent interpretation of the Fundamental Theorem of Natural Selection.
1 Optimality
It has long been known that many phenomena in the natural sciences exhibit optimality behavior, and the formalization of this goes back to the times of Fermat, Euler, Lagrange and Hamilton. An account of the use of optimality principles in science has been given in a recent paper by Schoemaker (1991) and the associated discussion. This discussion focused on the physical sciences, with comparatively little attention being paid to the biological sciences. Nevertheless, optimality concepts are of central interest in the biological sciences, as well as in areas such as biocomputation and the use of genetic algorithms which employ
biological concepts. The various chapters in this book witness this focus on optimality in these areas: in particular we refer to the companion paper by Hastings and Fox (1993). Optimality in the physical sciences is frequently associated with simplicity: often the easiest way of arriving at a physical principle is through an optimality requirement. By contrast, optimality considerations in the evolutionary biological sciences are sometimes associated with complexity and the resultant difficulties of reaching an optimum; a current trend in genetical evolution (Kauffman, 1993) focuses on the "complexity catastrophe" reached when a biological entity has evolved to such a complex state that it cannot readily evolve further to a different but more desirable state. On the other hand, a similarity between the physical and biological sciences concerns the choice of a suitable metric in the space in which dynamic behavior occurs. It is well known, for example, that in general relativity optimality behavior is exhibited in a space-time co-ordinate system endowed with a suitable metric: we will show later how choice of an appropriate metric in the space of gene frequencies leads to an optimality behavior that is not readily perceived using the standard euclidean metric. At a higher level, the Darwinian theory itself can be viewed as one in which a population continually strives for optimization through natural selection. In the controversy surrounding the two most important theories concerning the rewriting of the Darwinian paradigm in a Mendelian framework, proposed respectively by R.A. Fisher and Sewall Wright, the main point at issue concerned the different conditions assumed under each theory to be best suited to optimizing the evolutionary process. 
We will discuss later the interpretation of the centerpiece of Fisher's theory, encapsulated in his "Fundamental Theorem of Natural Selection", and will claim that it has consistently been misunderstood since its introduction, and argue further that it is best presented in association with an evolutionary optimization behavior which we describe later. In this chapter we focus on two aspects of optimization which derive from the central dynamical equations of biological evolution, viewed as a genetic process describing changes in gene frequencies under natural selection. The first aspect concerns optimality properties of the path integral of a certain function of gene frequencies when mating is random and fitness depends on the genetic constitution at a single gene locus. The second concerns the case where fitness depends on many loci and mating is not necessarily at random, and focuses on the concept of partial increase in mean fitness. To make this exposition self-contained, we first outline the equations which describe the dynamics of evolutionary change when viewed as a genetic process. We assume throughout a monoecious diploid population of size so large that random changes in gene frequency can be ignored.
2 Dynamical Equations
We consider first the case of a gene locus "A", admitting alleles (gene types) $A_1, A_2, \ldots, A_k$. At the time of conception of a certain (parental) generation, the frequency of $A_iA_i$ is assumed to be $P_{ii}$ while that of $A_iA_j$ is $2P_{ij}$ $(i \neq j)$. It follows that the gene (more properly allelic) frequency $p_i$ of $A_i$ at this time is

$$p_i = \sum_j P_{ij}. \tag{1}$$

Under random mating we have $P_{ij} = p_ip_j$ (both for $i = j$ and $i \neq j$), and we will sometimes assume that this is the case. The (viability) fitness of $A_iA_j$, defined as a measure of the probability that an individual of this genotype will survive until the age of reproduction, is written $w_{ij}$. It follows that the frequency $P'_{ij}$ of this genotype at the age of reproduction is

$$P'_{ij} = \frac{w_{ij}P_{ij}}{\bar{w}}, \tag{2}$$

where $\bar{w}$, defined by

$$\bar{w} = \sum_i \sum_j P_{ij}w_{ij}, \tag{3}$$

is the mean fitness of the population. From this, the frequency $p'_i$ of $A_i$ at this later age is

$$p'_i = \sum_j \frac{w_{ij}P_{ij}}{\bar{w}}. \tag{4}$$

Thus the change $\delta_i$ in the frequency of $A_i$ between the two life stages is

$$\delta_i = \sum_j \frac{w_{ij}P_{ij}}{\bar{w}} - p_i, \quad (i = 1, 2, \ldots, k). \tag{5}$$
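As an illustration of equations (1) to (5), the following is a minimal sketch in Python, not code from the chapter. The function name `selection_step` and the two-allele numbers are hypothetical, chosen only so that mating is visibly non-random (an excess of homozygotes).

```python
def selection_step(P, w):
    """One round of viability selection on genotype frequencies.

    P[i][j] follows the convention of the text: P[i][i] is the frequency of
    A_iA_i and 2*P[i][j] is the frequency of A_iA_j (i != j), so P is
    symmetric and its entries sum to 1. w[i][j] is the fitness of A_iA_j.
    """
    k = len(P)
    # Equation (1): allele frequencies are the row sums of P.
    p = [sum(P[i]) for i in range(k)]
    # Equation (3): mean fitness of the population.
    w_bar = sum(P[i][j] * w[i][j] for i in range(k) for j in range(k))
    # Equation (2): genotype frequencies at the age of reproduction.
    P_new = [[w[i][j] * P[i][j] / w_bar for j in range(k)] for i in range(k)]
    # Equations (4) and (5): allele frequencies after selection, and their change.
    p_new = [sum(row) for row in P_new]
    delta = [p_new[i] - p[i] for i in range(k)]
    return p, w_bar, P_new, delta


# Hypothetical example: A1A1 = 0.30, A1A2 = 2 * 0.20, A2A2 = 0.30.
P = [[0.30, 0.20], [0.20, 0.30]]
w = [[1.0, 0.9], [0.9, 0.6]]
p, w_bar, P_new, delta = selection_step(P, w)
```

Because the frequencies after selection still sum to one, the entries of `delta` should sum to zero, which gives a quick sanity check on any implementation of (5).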
Since we normally assume that the frequency of $A_i$ in the daughter generation at the time of conception is the same as that in the parental generation at the age of reproduction, this is also the change in the frequency of $A_i$ from the time of conception of one generation to the time of conception of the next. To this extent, (5) represents a part of any model of the dynamical behavior of gene frequency change under natural selection. To develop further properties of this dynamic behavior, further assumptions are necessary. One assumption often made is that of random mating. Under random mating the above equations simplify to

$$p'_i = \frac{p_iw_i}{\bar{w}}, \tag{6}$$

$$\delta_i = \frac{p_i(w_i - \bar{w})}{\bar{w}}, \quad (i = 1, 2, \ldots, k), \tag{7}$$

$$\bar{w} = \sum_i \sum_j p_ip_jw_{ij}, \tag{8}$$

where we define $w_i$ by

$$w_i = \sum_j p_jw_{ij}. \tag{9}$$

We may think of $w_i$ as the marginal fitness of the allele $A_i$.
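The random-mating recursion (6) to (9) can be iterated over generations. The sketch below is only an illustration under stated assumptions: the function name `random_mating_step`, the fitness values (chosen to give heterozygote advantage) and the starting frequencies are all hypothetical.

```python
def random_mating_step(p, w):
    """Apply equations (6)-(9) once; return (p_next, w_bar, marginal)."""
    k = len(p)
    # Equation (9): marginal fitness of each allele.
    marginal = [sum(p[j] * w[i][j] for j in range(k)) for i in range(k)]
    # Equation (8): mean fitness under random mating.
    w_bar = sum(p[i] * marginal[i] for i in range(k))
    # Equation (6): allele frequencies at conception of the next generation.
    p_next = [p[i] * marginal[i] / w_bar for i in range(k)]
    return p_next, w_bar, marginal


# Hypothetical fitnesses in which the heterozygote is fittest.
w = [[0.8, 1.0], [1.0, 0.6]]
p = [0.1, 0.9]
history = []  # mean fitness of each generation before selection acts
for generation in range(200):
    p, w_bar, marginal = random_mating_step(p, w)
    history.append(w_bar)
```

With these particular numbers the frequency of $A_1$ should settle near 2/3, the point at which both marginal fitnesses equal the mean fitness, and the recorded mean fitness should not decrease from one generation to the next.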
The above analysis assumes discrete generations. It is often more appropriate to consider time as continuous, in which case, under random mating, (7) is replaced by

$$\dot{p}_i = p_i(w_i - \bar{w}), \quad (i = 1, 2, \ldots, k), \tag{10}$$

with $\bar{w}$ being given by (8), where a superscript dot denotes a derivative with respect to time. (We do not give the continuous-time analogue of the more general equation (2), since to do this would require specific assumptions being made about the mating scheme.)

There are two further quantities of major importance in population genetics theory which we now define and consider at some length. The first of these was introduced by Fisher (1958) and is central to his concept of evolution, which he saw being described essentially as changes in gene frequencies in a population over time, as opposed to changes in gametic frequencies, under the action of natural selection. This concept is that of the average effect of the gene $A_i$, which is defined by a minimization procedure in the following way. Suppose first that the fitness $w_{ij}$ can be written in the form

$$w_{ij} = \bar{w} + \alpha_i + \alpha_j \tag{11}$$

for parameters $(\alpha_1, \ldots, \alpha_k)$ which satisfy, as they must, from (11),

$$\sum_j p_j\alpha_j = 0. \tag{12}$$

Thus if any individual in the population is chosen at random and a randomly chosen gene in that individual is replaced by an $A_i$ gene, the mean fitness change of that individual is

$$\alpha_i - \sum_j p_j\alpha_j = \alpha_i. \tag{13}$$

This explains the terminology "average effect of $A_i$", which in this case is a constant. More generally the genotype fitnesses cannot be written as in (11), and the average effects (which now depend on gene and genotype frequencies) are chosen so as to minimize

$$\sum_i \sum_j P_{ij}(w_{ij} - \bar{w} - \alpha_i - \alpha_j)^2 \tag{14}$$

subject to (12). If we write

$$D = \mathrm{diag}(p_1, p_2, \ldots, p_k), \quad P = \{P_{ij}\}, \tag{15}$$

$p' = (p_1, p_2, \ldots, p_k)$, $\delta' = (\delta_1, \delta_2, \ldots, \delta_k)$, the minimizing values for $(\alpha_1, \alpha_2, \ldots, \alpha_k)$, collected in the vector $\alpha$, are found implicitly as the solutions to the equations

$$(D + P)\alpha = \bar{w}\delta, \tag{16}$$
where the components in 5 are found from (5). In the random-mating case we can solve these equations explicitly, to obtain c~ _
- wi - w,
(17)
Pi
so that $\alpha_i$ is the excess of the marginal fitness of $A_i$ over the mean fitness. When random mating is not assumed, no simple explicit formula for $\alpha_i$ exists, and equation (16) must be solved numerically.

The second central quantity, also introduced into the genetics literature by Fisher, is the additive genetic variance, denoted $\sigma_A^2$, which can be thought of as that component of the total variance in fitness which can be ascribed to differences in the (marginal) fitnesses of the various alleles $A_1, A_2, \ldots, A_k$. In the random-mating case, this is given by

\[ \sigma_A^2 = 2 \sum_i p_i (w_i - \bar{w})^2. \tag{18} \]

It follows from (7), (17) and (18) that an alternative expression for $\sigma_A^2$ is

\[ \sigma_A^2 = 2 \bar{w} \sum_i \alpha_i \delta_i. \tag{19} \]
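As a rough numerical illustration of (16) to (19), the sketch below solves the linear system for the average effects and evaluates the additive genetic variance. It assumes that P is the symmetric matrix of ordered genotype frequencies (entries summing to one), that mean fitness is the P-weighted average of the genotype fitnesses, and that the gene frequency change is the usual one-generation selection change; these stand in for equations (1) to (5), which are not reproduced here. The function name `average_effects` and the numerical values are hypothetical.

```python
import numpy as np

def average_effects(P, w):
    """Solve (D + P) alpha = w_bar * delta for the average effects.

    P[i, j]: ordered genotype frequencies (symmetric, entries sum to 1).
    w[i, j]: fitness of genotype A_i A_j (symmetric).
    """
    p = P.sum(axis=1)                          # gene frequencies p_i
    w_bar = float((P * w).sum())               # mean fitness (assumed definition)
    delta = (P * w).sum(axis=1) / w_bar - p    # one-generation change in p_i
    alpha = np.linalg.solve(np.diag(p) + P, w_bar * delta)   # equation (16)
    sigma2_A = 2.0 * w_bar * float(alpha @ delta)            # equation (19)
    return p, w_bar, delta, alpha, sigma2_A

# Random-mating example with three alleles (illustrative numbers only).
p0 = np.array([0.2, 0.3, 0.5])
w = np.array([[1.0, 1.2, 0.9],
              [1.2, 0.8, 1.1],
              [0.9, 1.1, 1.0]])
P = np.outer(p0, p0)                           # random mating: P_ij = p_i p_j
p, w_bar, delta, alpha, sigma2_A = average_effects(P, w)
marginal = w @ p0                              # marginal fitnesses w_i
```

Under random mating the computed `alpha` should agree with the explicit formula (17), and `sigma2_A` with (18); for a general P the same function simply returns the numerical solution of (16).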
When mating is not at random the analysis is more complex. However, it is found eventually that the additive genetic variance is still defined by (19) if we define $\bar{w}$ by (3), $\delta$ by using (5), and $\alpha$ as the solution of the equations implied in (16). All these results can be generalized to the case where fitness depends on the genes present at an arbitrary number of loci. Those aspects of this generalization which are of interest to us here follow immediately from the above equations, and thus will be described later at a more appropriate point.

3 A Hamilton's Principle in Population Genetics
The equations of motion in physics can typically be obtained via an optimization procedure based on the calculus of variations, which determines a path that minimizes the difference between potential and kinetic energy along the path. This is embodied in Hamilton's principle, which states that for gradient systems, that is, systems whose equations of motion are the gradient of a potential, the motion can be obtained by finding the stationary point of the integral of the Lagrangian, which is defined as the difference between kinetic and potential energy. We will now present an analogue to this approach for single locus population genetics, which was first discovered by Svirezhev (1972). Our computations, however, are presented in a somewhat different form than his.

We first return to the point that the physical systems for which dynamical equations can be obtained via variational arguments are gradient systems. We thus would expect that a similar approach might work for continuous time single locus population genetics systems with random mating, where the dynamic equations can likewise be obtained as the gradient of a potential (with the appropriate metric), as shown by Shahshahani (1979), Akin (1979) and perhaps best explained in Hofbauer and Sigmund (1988, pp. 242-245). As an analogue to the difference between kinetic and potential energy, we define the function $f$ (Svirezhev, 1972), which includes a term corresponding to dynamics and a term corresponding to half the additive genetic variance (given in (18)), by

\[ f = \sum_i \frac{(\dot{p}_i)^2}{p_i} + \sum_i p_i (w_i - \bar{w})^2. \tag{20} \]
This form can also be motivated by noting that there are two ways of viewing the equations of single locus population genetics as being derived from the gradient of a potential. One can change the metric, as indicated earlier, or one can make the change of variables $p_i = y_i^2/4$ (that is, $y_i = 2\sqrt{p_i}$), which makes the single locus dynamic equations into a gradient system under the ordinary metric, with the phase space being the surface of a sphere restricted to the positive orthant. Under this transformation, the first term in (20) becomes $\sum_i (\dot{y}_i)^2$, which is the kinetic energy.

To show that the actual dynamics of allele frequencies can be found from a variational principle, we cannot simply minimize the integral of $f$ along evolutionary paths, since we must include the additional constraint that all the allele frequencies sum to one, that is

\[ G = \sum_i p_i - 1 = 0. \tag{21} \]
The claim we will now demonstrate is that the equations (10) of motion for single locus population genetics can be obtained by minimizing the integral of (20) along the evolutionary path taken by the allele frequencies, subject to the constraint (21). Standard results from the calculus of variations imply that the solution to the problem

\[ \text{minimize } \int_{t_1}^{t_2} f \, dt, \quad \text{subject to } G = 0, \tag{22} \]

where $t_1$ and $t_2$ are the initial and final values of time and the allele frequencies are specified at the initial and final time, satisfies the system of variational equations

\[ \frac{\partial F}{\partial p_i} - \frac{d}{dt} \frac{\partial F}{\partial \dot{p}_i} = 0. \tag{23} \]
Here $F$ is obtained from $f$ by using a Lagrange multiplier, so that

\[ F = f + \mu G, \tag{24} \]

where $\mu$ is a function to be determined. The function $f$ does not involve time explicitly, so we can integrate the system (23) via a straightforward computation (e.g. Weinstock, 1974, pp. 48-53) and obtain the first order equations

\[ \dot{p}_i = p_i (w_i - \bar{w}) \quad (i = 1, 2, \ldots, k). \tag{25} \]
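The first order equations (25) are easy to explore numerically. The following is a minimal sketch, not the authors' computation: it integrates (25) with a hand-written fourth-order Runge-Kutta step for an arbitrary, illustrative symmetric fitness matrix, assuming random mating so that the marginal fitness of allele i is the frequency-weighted average of row i of the fitness matrix. The names `selection_rhs`, `mean_fitness` and `integrate`, and the step size, are hypothetical choices.

```python
import numpy as np

def selection_rhs(p, w):
    """Right-hand side of (25): dp_i/dt = p_i (w_i - w_bar)."""
    marginal = w @ p              # marginal fitnesses w_i under random mating
    w_bar = p @ marginal          # mean fitness
    return p * (marginal - w_bar)

def mean_fitness(p, w):
    return float(p @ w @ p)

def integrate(p, w, dt=1e-3, steps=5000):
    """Classical fourth-order Runge-Kutta integration; returns the whole path."""
    path = [p.copy()]
    for _ in range(steps):
        k1 = selection_rhs(p, w)
        k2 = selection_rhs(p + 0.5 * dt * k1, w)
        k3 = selection_rhs(p + 0.5 * dt * k2, w)
        k4 = selection_rhs(p + dt * k3, w)
        p = p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(p.copy())
    return np.array(path)

# Illustrative fitness matrix and starting frequencies (assumed values).
w = np.array([[1.0, 1.2, 0.9],
              [1.2, 0.8, 1.1],
              [0.9, 1.1, 1.0]])
path = integrate(np.array([0.2, 0.3, 0.5]), w)
fitness_along_path = np.array([mean_fitness(p, w) for p in path])

# The two terms of f evaluated at one point on the selection path.
p_mid = path[100]
p_dot = selection_rhs(p_mid, w)
kinetic_term = float(np.sum(p_dot ** 2 / p_mid))
variance_term = float(np.sum(p_mid * (w @ p_mid - mean_fitness(p_mid, w)) ** 2))
```

Along the computed path the frequencies stay on the simplex and mean fitness does not decrease, and the two terms of $f$ coincide; on an arbitrary path between the same end points the two terms would generally differ.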
Since these equations are identical to (10), we have demonstrated that the equations for single locus population genetics can be obtained via a variational argument, as first shown by Svirezhev (1972). Note that the course of this demonstration shows that $\mu = 2\bar{w}$.

It is important to understand the limitations of what has been shown. First, the integral is taken with respect to time, and not with respect to allele frequencies. Secondly, although it is true that

\[ \sum_i \frac{(\dot{p}_i)^2}{p_i} = \sum_i p_i (w_i - \bar{w})^2 \tag{26} \]
if the dynamic equations (10) hold, (26) is not true for an arbitrary evolutionary path. Thus, it is not correct to say that the integral of the additive genetic variance is minimized by the evolutionary path determined by natural selection. Finally, we have been unable to extend this approach to more than one locus. We conjecture that this will prove to be impossible because, in contrast to the equations describing one-locus population genetics, the equations of multilocus population genetic systems can be shown not to be gradient systems in general (Hofbauer and Sigmund, 1988). The only possible extension might be to a special case of the single locus mutation selection equations. If the mutation rate from allele $i$ to allele $j$ depends only on the identity of allele $j$ and not that of allele $i$, then the dynamic equations are a gradient system, as shown by Hofbauer and Sigmund (1988).

4 Optimality and the Fundamental Theorem
Our first aim in this section is to define the "partial increase" in mean fitness (a more complete description is given by Ewens (1989)). This will be done in the case of a general population (that is, random mating is not necessarily assumed) evolving in discrete time according to (2). A definition of mean fitness alternative to (3) is

\[ \bar{w} = \sum_i \sum_j P_{ij} (\bar{w} + \alpha_i + \alpha_j), \tag{27} \]

and there appears to be strong evidence that Fisher viewed the right-hand side in (27) as a more natural definition of mean fitness than the right-hand side in (3). The partial change in mean fitness during the course of one generation is defined as the change in the right-hand side in (27) due to a change in $P_{ij}$ alone: that is, the partial change in mean fitness is, by definition,

\[ \sum_i \sum_j (P'_{ij} - P_{ij}) (\bar{w} + \alpha_i + \alpha_j). \tag{28} \]
Using (2), this is easily seen to reduce to

\[ 2 \sum_i \alpha_i \delta_i = 2 \alpha' \delta, \tag{29} \]

and then use of (19) shows that this is exactly $\sigma_A^2 / \bar{w}$. Since (29) depends on changes in genotype frequencies only through changes in gene frequencies, we may use the argument following (5) to describe (28) as the partial change in mean fitness from one generation to another. Price (1972) and Ewens (1989) argue that this conclusion was viewed by Fisher as the statement of his "Fundamental Theorem".

Suppose now we consider arbitrary changes $(d_1, d_2, \ldots, d_k)$ in the gene frequencies, and define a vector $d$ by $d' = (d_1, d_2, \ldots, d_k)$. The interpretation of $\alpha_i$ as the average effect of $A_i$ shows that we may think of $2 d' \alpha$ as the partial change in mean fitness due to these gene frequency changes. Suppose now that we impose the constraint

\[ 2 d' \alpha = \frac{\sigma_A^2}{\bar{w}} \tag{30} \]
(as well as the natural constraint $\sum_i d_i = 0$): this requirement is that the partial increase in mean fitness should equal that arising through natural selection. We now ask what quadratic form $d'Td$ in these arbitrary changes is minimized when $d = \delta$, that is, at the natural selection values, subject to the constraint (30). The introduction of a Lagrange multiplier shows that we must minimize the function

\[ d'Td + 2\lambda (d'\alpha) \tag{31} \]

and straightforward differentiation leads to the equation

\[ Td = -\lambda \alpha, \tag{32} \]

which may be written

\[ T^{-1} \alpha = \mathrm{const} \cdot d. \tag{33} \]

We want this equation to be solved by $d = \delta$, and comparison with (16) shows immediately that to do this we may take $T = (D + P)^{-1}$. Thus the quadratic form we seek is

\[ d' (D + P)^{-1} d, \tag{34} \]

and we may say that the quadratic form (34) is minimized, subject to the constraint (30), at the natural selection vector $d = \delta$. A statement equivalent to this is: subject to the condition $d'(D + P)^{-1} d = \sigma_A^2 / (2\bar{w}^2)$, the vector $d$ of gene frequency changes which maximizes the partial increase in mean fitness is the natural selection vector $\delta$. We can restate this conclusion in a more useful way if we define (34) as a new metric giving the distance between old $(p_1, \ldots, p_k)$ and new $(p_1 + d_1, \ldots, p_k + d_k)$ gene frequency values, by saying that if the distance between two sets of gene frequencies is prescribed to be the natural selection value, then the natural selection changes in gene frequency maximize the partial increase in mean fitness.

In this way we can begin to ascribe an optimality character to natural selection, but the statement as described is of little value unless we can first find a "natural" interpretation for the metric (34). Before doing this, we note that in the particular case of random mating, the metric (34) simplifies to $\sum_i d_i^2 / p_i$. In his original derivation of the results described in the previous section, Svirezhev (1972) used precisely this metric. This was done purely for mathematical convenience and no interpretation of this metric in biological terms was needed (or offered). Thus our interpretation of the more general metric (34) can also be regarded as a biological justification for the metric that Svirezhev, for purely mathematical reasons, found it convenient to employ.

We now turn to the interpretation of (34). The quantity (14) which is minimized in the definition of the average effects is, up to a linear function,

\[ \alpha' (D + P) \alpha - 2 \bar{w} \alpha' \delta. \tag{35} \]
Consider for the moment the minimization of

\[ \alpha' (D + P) \alpha \tag{36} \]

subject to

\[ \alpha' \delta = \frac{\sigma_A^2}{2\bar{w}}. \tag{37} \]

Introducing a Lagrange multiplier, this is done by the absolute minimization of

\[ \alpha' (D + P) \alpha - 2 \lambda \alpha' \delta. \tag{38} \]

This minimization occurs where

\[ (D + P) \alpha = \lambda \delta, \tag{39} \]

which is precisely (16) if we choose $\lambda = \bar{w}$. In other words, minimization of (36) subject to (37) is identical, from (38), to the absolute minimization of

\[ \alpha' (D + P) \alpha - 2 \bar{w} \alpha' \delta. \tag{40} \]

But this is exactly (35). In other words, the average effects can be defined, not only through the original definition of minimizing (14) subject to (12), but also by the minimization of (36) subject to (37). Suppose we now define a vector $g$ by

\[ g = \frac{(D + P) \alpha}{\bar{w}}, \tag{41} \]

so that $\alpha = \bar{w} (D + P)^{-1} g$. Then (36) and (37), jointly, define the minimization of

\[ \mathrm{const} \cdot g' (D + P)^{-1} g \tag{42} \]

subject to

\[ g' (D + P)^{-1} \delta = \mathrm{const}, \tag{43} \]

which in view of (16) is

\[ g' \alpha = \mathrm{const}. \tag{44} \]

But minimization of (42) subject to (43) is precisely the minimization of (34) subject to (30). The two procedures, namely the minimization of a quadratic form to define average effects, and the maximization of the partial increase in mean fitness, are, in fact, the same mathematical procedure, simply presented in different ways. Thus insofar as the definition of average effects through the minimization of (14) is regarded as natural and meaningful, use of (34) as a distance metric describing the distance between old and new gene frequencies also becomes natural and meaningful, and we summarize by saying that in a gene frequency space endowed with the "natural" metric (34), natural selection possesses the optimizing property of maximizing the partial increase in mean fitness for any set of gene frequencies which are at the same distance from the original as those arising through natural selection.

The above analysis is in discrete time. An analogous analysis holds in continuous time, with in effect the same result.

All of the above makes the (unrealistic) assumption that fitnesses depend on the genotype at one locus only. It is however possible to generalize the analysis immediately to the case where fitnesses depend, in a completely arbitrary way, on the genetic make-up of the entire genome, and where no specific assumptions need be made about linkage arrangements, recombination values, the number of loci in the genome or the number of alleles at each locus. To do this, we first order the loci in some agreed way and then the genes at each locus. We now redefine $D$ as a diagonal matrix whose elements are, in turn, the gene frequencies at the various loci, $P$ as a block diagonal matrix, each block corresponding to one gene locus and having as entries the various within-locus genotype frequencies, and $Q$ as a certain (off-block-diagonal) matrix of pairwise two-locus genotypic frequencies (Castilloux and Lessard, 1995). Appropriate generalizations of the mean fitness and the additive genetic variance $\sigma_A^2$ are also made. Our first task is to define the average effects of all the alleles at all the loci. To do this we define a vector $\alpha$ of these average effects, where the alleles whose average effects are described in this vector are conformal with the alleles whose frequencies are displayed in $D$.
Then the natural generalization of the procedure which leads to (16) shows that the average effects are defined, implicitly, as the solutions to the equation

\[ (D + P + Q) \alpha = \bar{w} \delta, \tag{45} \]

where $\delta$ is a vector of allelic frequency changes, with again the alleles being conformal with the alleles whose average effects are given in $\alpha$. The similarity with (16) is immediate. Carrying through an analysis directly generalizing that given above, we find that a natural metric in the space of gene frequencies is

\[ d' (D + P + Q)^{-1} d, \tag{46} \]

and that subject to the requirement that the distance between old and new gene frequency sets, as measured by (46), is $\sigma_A^2 / (2\bar{w}^2)$, the vector of gene frequency changes which maximizes the partial increase in mean fitness is again the natural selection vector. Details of this procedure are given in Ewens (1992). In this way we have shown, in a completely general setting (that is, considering the entire genome, all alleles at all loci, arbitrary fitnesses, arbitrary genotype frequencies and arbitrary recombination structure, and with a natural metric in gene-frequency space), that natural selection operates in a meaningful optimizing manner.
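The single-locus version of this optimality statement can be checked numerically. The sketch below is only an illustration under stated assumptions: genotype frequencies and fitnesses are drawn at random (so mating is not random), mean fitness and the selection change in gene frequencies are computed from the usual single-locus definitions, and candidate vectors $d$ are generated by perturbing the natural selection vector within the constraints $\sum_i d_i = 0$ and fixed $d'\alpha$. The helper names `distance` and `project` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4

# Arbitrary (non-random-mating) genotype frequencies: symmetric, positive, sum to 1.
M = rng.random((k, k))
P = (M + M.T) / 2.0
P /= P.sum()
W = rng.uniform(0.5, 1.5, (k, k))
w = (W + W.T) / 2.0                       # symmetric genotype fitnesses

p = P.sum(axis=1)                         # gene frequencies
w_bar = float((P * w).sum())              # mean fitness (assumed definition)
delta = (P * w).sum(axis=1) / w_bar - p   # natural selection change in p
D_plus_P = np.diag(p) + P
alpha = np.linalg.solve(D_plus_P, w_bar * delta)   # average effects, as in (16)

def distance(d):
    """Quadratic form (34): d' (D + P)^{-1} d."""
    return float(d @ np.linalg.solve(D_plus_P, d))

# Rows of C encode the two constraints: sum(d) fixed and d' alpha fixed.
C = np.vstack([np.ones(k), alpha])

def project(r):
    """Remove from r its components along the constraint rows."""
    return r - C.T @ np.linalg.solve(C @ C.T, C @ r)

base = distance(delta)
perturbations = [0.1 * project(rng.standard_normal(k)) for _ in range(1000)]
excess = [distance(delta + e) - base for e in perturbations]
```

Every entry of `excess` should be non-negative up to rounding: among all changes with the same partial increase in mean fitness, the natural selection vector is the closest to the starting frequencies in the metric (34), which is the dual form of the statement in the text.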
References

Akin, E. (1979). The Geometry of Population Genetics. Lecture Notes in Biomathematics 31. Springer-Verlag, Berlin.
Castilloux, A.-M., and Lessard, S. (1995). The Fundamental Theorem of Natural Selection in Ewens' Sense (case of many loci), (submitted).
Ewens, W.J. (1989). An interpretation and proof of the Fundamental Theorem of Natural Selection. Theoret. Pop. Biol. 36, 167-180.
Ewens, W.J. (1992). An optimizing principle of natural selection in evolutionary population genetics. Theoret. Pop. Biol. 42, 333-346.
Fisher, R.A. (1958). The Genetical Theory of Natural Selection. Dover, New York.
Hastings, A., and Fox, G. (1995). Optimization as a way of studying population genetics equations. (This volume.)
Hofbauer, J., and Sigmund, K. (1988). The Theory of Evolution and Dynamical Systems. Cambridge University Press, Cambridge.
Kauffman, S.A. (1993). The Origins of Order. Oxford University Press, New York.
Price, G.R. (1972). Fisher's Fundamental Theorem made clear. Ann. Hum. Genet. 36, 129-140.
Schoemaker, P.J.H. (1991). The quest for optimality: A positive heuristic of science? Behav. Brain Sci. 14, 205-245.
Shahshahani, S. (1979). A new mathematical framework for the study of linkage and selection. Memoirs of the American Mathematical Society, Vol. 17, No. 211. Amer. Math. Soc., Providence.
Svirezhev, Y.M. (1972). Optimum principles in genetics, in Studies on Theoretical Genetics. USSR Academy of Science, Novosibirsk. [In Russian with English summary.]
Weinstock, R. (1974). Calculus of Variations with Applications to Physics and Engineering. Dover, New York.
Optimization as a Technique for Studying Population Genetics Equations

Alan Hastings (1) and Gordon A. Fox (2)

(1) Division of Environmental Studies, Center for Population Biology, and Institute for Theoretical Dynamics, University of California, Davis, CA 95616
(2) Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721

Abstract. We use methods from dynamic optimization to study the possible behavior of simple population genetic models. These methods can be used, at least conceptually, to determine limits to the behavior of optimization algorithms based on genetic equations.
1 Introduction

The primary focus of this book is to look at how to use the equations of population genetics to study and understand problems in optimization. Most of the rationale for using ideas borrowed from natural selection to solve problems in optimization comes from Fisher's fundamental theorem. Unfortunately, it is well known that Fisher's result applies only to random-mating, single-locus population genetic models with constant selection (see for example, Ewens and Hastings, 1995). Multilocus population genetic models are complicated nonlinear dynamic equations. The dynamics and the equilibrium behavior of these multilocus equations are not well understood, except for some special cases. In this chapter, we will describe approaches for trying to understand bounds to the behavior of these equations by using optimization methods. This work may, in turn, provide some insights on the performance of methods that use genetic equations to solve optimization problems.

We will summarize two primary approaches: equilibrium behavior of the two-locus models, and dynamics of two-locus models. In both cases, the approach has been to use optimization methods to find limits to the behavior of the equations (Hastings, 1981; Fox and Hastings, 1992) for fitnesses that are only known within some bounds. One reason for this is that in population genetics the fitnesses are not well specified. Thus the fitnesses are treated either as the unknowns or as parameters to be determined.

To place these approaches in a larger context, we will begin by examining the simpler one-locus population genetic equations. This will provide background and motivation for the methods we will discuss for studying the multilocus equations. Moreover, the single-locus viewpoint, in combination with the multilocus results, will help illustrate the role played by recombination and linkage disequilibrium in the dynamics and equilibrium behavior of multilocus population genetic equations.
2 Single Locus Population Genetic Models
Here, we will start with the simplest case, a single locus with two alleles. Let the alleles be $A$ and $a$, and denote the frequency of $A$ by $p$ and the frequency of $a$ by $q$. We will begin with a description of the deterministic discrete time model with random mating and nonoverlapping generations. Let the fitness of the genotypes $AA$, $Aa$, and $aa$ be denoted by $w_{AA}$, $w_{Aa}$, and $w_{aa}$, respectively. We define the average fitness of the allele $A$ as

\[ w_A = p w_{AA} + q w_{Aa} \tag{1} \]

(with $w_a = p w_{Aa} + q w_{aa}$ defined analogously for the allele $a$) and the average fitness of the population as

\[ \bar{w} = p w_A + q w_a. \tag{2} \]

The dynamics of the allele frequencies are then given by the equation

\[ p' = \frac{p w_A}{\bar{w}}, \tag{3} \]
where $p'$ is the allele frequency in the next generation. The equilibrium behavior of this model is easy to analyze. The usual approach is to view the fitnesses as parameters and then solve for the equilibrium value of $p = p'$ (e.g., Ewens, 1979; Nagylaki, 1992). If one could readily estimate the fitnesses, this would make it easy to predict the evolution of gene frequencies in natural populations. However, it is much easier to measure allele frequencies than it is to estimate fitnesses in natural populations. Attractive as this approach may be, then, it is usually impossible to implement in practice. So here we will use an alternate approach (Hastings, 1981): view the equilibrium allele frequency as the parameter, and the fitnesses as unknowns. Doing so allows us to find values for the fitnesses that may explain the observed allele frequencies.

2.1 Equilibria
In the simple one-locus, two allele case, this alternate approach leads to a single linear equation with two unknowns. To see this, note that only the relative fitnesses are important, thus reducing the three unknown fitnesses to two, if, e.g., we normalize $w_{Aa}$ to be one. Thus, a particular equilibrium allele frequency can be 'explained' by any of the fitnesses in a one dimensional set of possible fitnesses. If we add the constraint (which is easy to specify in this case) that the equilibrium be stable, this restricts the possible fitnesses to those lying along a line that satisfy the constraint that $w_{AA}$ and $w_{aa}$ be less than one. No further information can be deduced: there is no minimum or maximum strength to selection that we can find. Any specified set of allele frequencies is an equilibrium (in fact, a stable equilibrium) for some set of fitnesses. This view can be extended to an arbitrary number of alleles, and for any specified set of allele frequencies we can find a set of possible fitnesses. These results will not hold for the multilocus problem.
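To make the one-dimensional family of fitnesses concrete, here is a small sketch of the two-allele recursion together with one possible parametrization of the fitnesses that produce a prescribed stable equilibrium. With the heterozygote fitness normalized to one, the equilibrium condition (equal average fitnesses of the two alleles) is satisfied by taking the homozygote fitnesses as one minus a common factor `s` times the frequency of the other allele; the parameter `s` and the function names are hypothetical conveniences, not notation from the text.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """One generation of selection: p' = p * w_A / w_bar."""
    q = 1.0 - p
    w_A = p * w_AA + q * w_Aa      # average fitness of allele A
    w_a = p * w_Aa + q * w_aa      # average fitness of allele a
    w_bar = p * w_A + q * w_a      # average fitness of the population
    return p * w_A / w_bar

def fitnesses_for_equilibrium(p_hat, s):
    """Fitnesses (w_AA, w_Aa, w_aa), with w_Aa = 1, having equilibrium p_hat.

    s > 0 is a free selection-strength parameter (assumed small enough that
    both homozygote fitnesses stay positive); both homozygotes are then less
    fit than the heterozygote, so the equilibrium is stable.
    """
    return 1.0 - s * (1.0 - p_hat), 1.0, 1.0 - s * p_hat

p_hat = 0.3
final_frequency = {}
for s in (0.05, 0.5):              # weak and strong selection, same equilibrium
    p = 0.9
    for _ in range(5000):
        p = next_frequency(p, *fitnesses_for_equilibrium(p_hat, s))
    final_frequency[s] = p
```

Both values of `s` lead to the same equilibrium frequency, which illustrates why an observed equilibrium alone places no bound on the strength of selection in the single-locus case.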
2.2 Dynamics
Given the simplicity of the single-locus model, some aspects of dynamical behavior can be studied using the discrete time model. To facilitate comparison with multilocus results and techniques, we will turn to the continuous time model with random mating and overlapping generations to examine dynamics. The model that we will use is not exact, but is a reasonable approximation when selection is weak (see for example, Nagylaki, 1992). Here, it will be more convenient to include multiple alleles explicitly. Let $p_i$ be the frequency of allele $i$. Then, define $m_{ij}$ as the Malthusian parameter for the genotype $ij$, so $m_{ij} = b_{ij} - d_{ij}$, where $b_{ij}$ and $d_{ij}$ are the birth and death rates of the genotype $ij$. The model we will use is

\[ \frac{dp_i}{dt} = p_i (m_i - \bar{m}), \tag{4} \]

where

\[ m_i = \sum_j p_j m_{ij} \tag{5} \]

and

\[ \bar{m} = \sum_j p_j m_j. \tag{6} \]

As a precursor to the study of the multilocus model, we will phrase a study of the dynamics of this model using an optimization procedure. This is motivated in part by a biological question: how far have two populations diverged from one another? Such "genetic distances" (e.g., Nei, 1987) are usually based on measurements of the allele frequencies at a locus in two populations. Most often, these distances have been defined without an underlying biological justification. Here, we will define the genetic distance using an optimization problem, based on the assumption that selection (of an unknown and possibly time-varying pattern) has led to the divergence between the two populations. Define a starting set of allele frequencies

\[ p(0) = p_0 \tag{7} \]

and a final set of allele frequencies

\[ p(T) = p_T. \tag{8} \]
Our problem is to find the minimum time for the allele frequencies in a single population obeying (4) to go from $p_0$ to $p_T$, where the fitnesses $m_{ij}$ are unspecified and can vary with time, but obey the constraint
\[ \beta > \alpha_F \tag{3.1} \]
The fitness of B ($\beta$) is higher than the fitness of A ($\alpha_F$) because A provides D and E with nutrients. But when E is bound to A, the fitness of A ($\alpha$) is larger than that of B thanks to the cooperation of E.

\[ \varepsilon_F < \delta_F = \gamma < \ldots \]

... $k_{se}$) allows E to spend more time inside A than D. This results in a larger time-averaged fitness of E. The basic hypothesis is that the host normally rejects "guests" at a certain rate, but it can somehow appreciate the degree of cooperation of E and reject it less frequently than D. This selective rejection of D might be due to some molecular recognition mechanism, such as those known to exist in polyps and sponges, or simply because the increase in the level of nutrient produced by E decreases the rejection rate of E by A. The only "advantage" of E with respect to D in this model is $k_{se} < k_{sd}$. We then want to check when, i.e. for what set of parameters, this condition is sufficient to overcome the advantage in fitness of D over E and bring about the emergence of mutualism.
Fig. 2: Set of the elementary processes that modify the populations. Thin continuous lines represent fast association processes between host and guest organisms, with kinetic constants $k_{ld}$ and $k_{le}$, and fast dissociation processes, with kinetic constants $k_{sd}$ and $k_{se}$. Bold lines represent reproduction and death processes (with rate $d$). The Greek letters represent the fitness coefficients associated with the reproduction processes of the organisms.
4 The Single site model (SSM)
4.1 The differential system

The set of differential equations describing the corresponding population dynamics is:

\[ \frac{dA}{dt} = \frac{(\alpha_F (A + AD) + \alpha\, AE)\, f_1}{A + AD + AE + B} - (d + m n) A + d (AD + AE) + m (B + AD) + k_{se}\, AE + k_{sd}\, AD - k_{le}\, A\, E - k_{ld}\, A\, D \tag{4.1} \]

\[ \frac{dAD}{dt} = -2 (d + m n)\, AD + m\, AE - k_{sd}\, AD + k_{ld}\, A\, D \tag{4.2} \]

\[ \frac{dAE}{dt} = -2 (d + m n)\, AE + m\, AD - k_{se}\, AE + k_{le}\, A\, E \tag{4.3} \]

\[ \frac{dB}{dt} = \frac{\beta\, B\, f_1}{A + AD + AE + B} - (d + m n) B + m (A + AE + AD) \tag{4.4} \]

\[ \frac{dC}{dt} = \frac{\gamma\, C\, f_2}{C + D + E + AD + AE} - (d + m n) C + m (D + AD) \tag{4.5} \]

\[ \frac{dD}{dt} = \frac{(\delta_F D + \delta\, AD)\, f_2}{C + D + E + AD + AE} - (d + m n) D + d\, AD + m (C + E + n\, AD) + k_{sd}\, AD - k_{ld}\, A\, D \tag{4.6} \]

\[ \frac{dE}{dt} = \frac{(\varepsilon_F E + \varepsilon\, AE)\, f_2}{C + D + E + AD + AE} - (d + m n) E + d\, AE + m (D + n\, AE) + k_{se}\, AE - k_{le}\, A\, E \tag{4.7} \]

Here $AD$ and $AE$ denote the populations of bound host-guest pairs, not products. These equations simply sum the contributions of the processes described in Figure 2 (section 3.2) and set the time variation of each population.
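For readers who want to experiment with the single site model, the right-hand sides of (4.1) to (4.7) can be transcribed as a plain function. This is a sketch under stated assumptions rather than the authors' simulation code: the dictionary keys and the function name `ssm_rhs` are hypothetical, the numerical values follow Table 1 with two illustrative choices inside the quoted ranges (`eps = 10`, `ksd = 6`), and no particular integration scheme is implied. Because association and dissociation are much faster than reproduction and death, a solver suited to stiff systems is probably the safer choice when integrating.

```python
def ssm_rhs(state, par):
    """Time derivatives (dA, dAD, dAE, dB, dC, dD, dE) of equations (4.1)-(4.7)."""
    A, AD, AE, B, C, D, E = state
    d, m, n, f1, f2 = par["d"], par["m"], par["n"], par["f1"], par["f2"]
    host_pool = A + AD + AE + B           # organisms sharing resource f1
    guest_pool = C + D + E + AD + AE      # organisms sharing resource f2
    loss = d + m * n                      # death plus mutation away from the type
    dA = ((par["alphaF"] * (A + AD) + par["alpha"] * AE) * f1 / host_pool
          - loss * A + d * (AD + AE) + m * (B + AD)
          + par["kse"] * AE + par["ksd"] * AD
          - par["kle"] * A * E - par["kld"] * A * D)
    dAD = -2.0 * loss * AD + m * AE - par["ksd"] * AD + par["kld"] * A * D
    dAE = -2.0 * loss * AE + m * AD - par["kse"] * AE + par["kle"] * A * E
    dB = par["beta"] * B * f1 / host_pool - loss * B + m * (A + AE + AD)
    dC = par["gamma"] * C * f2 / guest_pool - loss * C + m * (D + AD)
    dD = ((par["deltaF"] * D + par["delta"] * AD) * f2 / guest_pool
          - loss * D + d * AD + m * (C + E + n * AD)
          + par["ksd"] * AD - par["kld"] * A * D)
    dE = ((par["epsF"] * E + par["eps"] * AE) * f2 / guest_pool
          - loss * E + d * AE + m * (D + n * AE)
          + par["kse"] * AE - par["kle"] * A * E)
    return [dA, dAD, dAE, dB, dC, dD, dE]

# Parameter values from Table 1; eps and ksd are illustrative picks within range.
params = {
    "d": 1e-2, "m": 1e-5, "n": 3, "f1": 80.0, "f2": 100.0,
    "alphaF": 8.0, "beta": 10.0, "gamma": 8.0, "deltaF": 8.0, "epsF": 7.0,
    "alpha": 12.0, "delta": 12.0, "eps": 10.0,
    "kld": 1.0, "kle": 1.0, "ksd": 6.0, "kse": 0.1,
}
# Example initial state quoted in the text (order: A, AD, AE, B, C, D, E), E absent.
initial_state = [0.01, 0.0, 0.0, 80000.0, 20000.0, 60000.0, 0.0]
derivatives = ssm_rhs(initial_state, params)
```

A quick consistency check is that, with the mutation rate set to zero, the association and dissociation terms cancel when the derivatives of a species' free and bound forms are added, leaving only growth and death.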
4.2 Simulation results

The initial conditions are such that the populations B, C, A, D and AD are at equilibrium, the mutant E being absent; for instance: A=0.01, AE=0, AD=0, B=80000, C=20000, D=60000.

For each set of parameters, the initial conditions, determined by numerical simulations of the differential system in the absence of E, are those relevant to the possible emergence of mutualism: a true symbiont E is introduced into an ecosystem at equilibrium containing only the primeval species B and C, the commensal D and the host species A. The set of parameters listed in Table 1 was used in the simulations, unless otherwise specified.

Varying parameters and initial conditions, three types of attractors are obtained by numerical integration of the system (equations (4.1) to (4.7)). All attractors are point attractors. The most sensitive parameters, in terms of dynamical behavior, are the difference in fitness ($\delta - \varepsilon$) and in the rejection rate ($k_{sd} - k_{se}$) between species D and E. The resource parameters $f_1$ and $f_2$ simply change the scale of populations. Three time scales, fast, intermediate and slow, are fixed by $k_{sd}$, $d$ and $m$, respectively. The ratio $d/m$ determines the ratio between the populations of dominant and less fit species.
Table 1: Values of the parameters used in the numerical simulations.

Parameter                                   Value
d (frequency of death)                      10^-2
m (mutation rate)                           10^-5
n (number of genes)                         3
f1 (food shared between A and B)            80
f2 (food shared between C, D and E)         100

Fitness coefficients
alpha_F (free organisms A)                  8
beta (organisms B)                          10
gamma (organisms C)                         8
delta_F (free organisms D)                  8
epsilon_F (free organisms E)                7
alpha (organisms A in AE)                   12
delta (organisms D in AD)                   12
epsilon (organisms E in AE)                 between 8 and 12

Association and dissociation constants
kld (organisms D)                           1
kle (organisms E)                           1
ksd (organisms D)                           between 0.1 and 12
kse (organisms E)                           0.1
For small $k_{sd}$, i.e., when D spends in A an amount of time comparable to E, AE does not prevail over AD. The commensal D benefits from the nutrients it gets from A, and develops faster than E. A does not benefit from D and is not able to grow faster than B. Finally the primeval population B remains at the higher level. A small population of host A is mainly infested by commensals D, which gives a small but sufficient advantage to D to overcome C.

Emergence of mutualism is observed for large $k_{sd}$, i.e. when D is rapidly expelled from A. A gets support from E and is able to overcome B. The symbiotic organism AE becomes predominant over the primeval populations C and B. The primeval organism populations are only maintained by the mutations.

For some values of the fitness parameters a coexistence region is observed for intermediate values of $k_{sd}$. In this region, one observes coexistence of the couples AD and AE. The fitness of A in these couples is comparable to that of B, which coexists with them. A high population of free D is also maintained.
Fig. 3: Equilibrium populations measured at time t=100000 as a function of $k_{sd}$ for $\varepsilon = 10$. The three regimes, predominance of the ancestors, coexistence and emergence of mutualism, are separated by sharp transitions. Populations of A and E, very small, are not represented on the diagram. Continuous lines were obtained from initial conditions in the absence of E species (emergence conditions). The dotted lines are obtained with initial populations of the attractors: ADc and AEc corresponding to the coexistence attractor and AEm to the mutualism attractor. The arrow indicates that the coexistence attractor exists up to larger values of $k_{sd}$.
The three regimes (egoism, coexistence and mutualism) can be observed in Figure 3, which shows the equilibrium levels of the populations when $k_{sd}$ increases. The transition values of $k_{sd}$ between the three dynamics depend upon the difference in fitness between D and E. This diagram is complicated because the regime that is reached depends not only on the parameters, but also on the initial conditions. The discontinuities in populations are due to the non-linearity of the equilibrium equations, obtained by setting the time derivatives to zero. For the same set of parameters several solutions exist, one or two of which are attractors. Dependence on the initial conditions and hysteresis are then observed.

One consequence of this hysteresis is that the condition for the stability of mutualism when facing the invasion of a new commensal is less stringent than the condition for the emergence of mutualism facing the same commensal already established. The emergence of mutualism implies a transition from a different attractor and is achieved for a larger $k_{sd}$ (or a smaller $\delta - \varepsilon$) than its failure against an invading commensal, which implies dynamics starting from the mutualism attractor. In the second case the transition is achieved for a lower $k_{sd}$ (or a larger $\delta - \varepsilon$).
5 Slow manifold analysis
The slow manifold analysis allows us to interpret the simulation results and to predict the transitions among the different dynamical regimes. The interaction processes among the different populations have very different time scales: the exchange interactions between free and linked species are very fast with respect to the population growth and death terms, which are of the same order. Mutations are even slower. One can then suppose that the populations of free and linked organisms follow their ratio at exchange equilibrium, given by:

\[ (k_{sd} + 2d)\, AD = k_{ld}\, A\, D \tag{5.1} \]

\[ (k_{se} + 2d)\, AE = k_{le}\, A\, E \tag{5.2} \]
After a fast decay towards exchange equilibria, the populations follow a slow manifold whose equations are obtained by equating to zero combinations of the time derivatives in which the association and dissociation terms cancel (fast exchange). Mutation terms are neglected since they are small with respect to proliferation and death rates:

\[ \frac{dA}{dt} + \frac{dAD}{dt} + \frac{dAE}{dt} = \frac{(\alpha_F (A + AD) + \alpha\, AE)\, f_1}{A + AD + AE + B} - d (A + AD + AE) = 0 \tag{5.3} \]

\[ \frac{dD}{dt} + \frac{dAD}{dt} = \frac{(\delta_F D + \delta\, AD)\, f_2}{C + D + E + AD + AE} - d (D + AD) = 0 \tag{5.4} \]

\[ \frac{dE}{dt} + \frac{dAE}{dt} = \frac{(\varepsilon_F E + \varepsilon\, AE)\, f_2}{C + D + E + AD + AE} - d (E + AE) = 0 \tag{5.5} \]

\[ \frac{dB}{dt} = \frac{\beta\, B\, f_1}{A + AD + AE + B} - d\, B = 0 \tag{5.6} \]

\[ \frac{dC}{dt} = \frac{\gamma\, C\, f_2}{C + D + E + AD + AE} - d\, C = 0 \tag{5.7} \]
The set of elementary processes that give rise to these equations is reduced to the bold line processes of Figure 2.
5.1 The "species" and their effective fitness

The first three equations represent the time derivatives of populations of "species" A, D and E. Here the term "species", as opposed to organisms, refers to a set of organisms, whether free or bound. The total population of species A, for instance, is A+AD+AE. The reproduction of these species involves an "effective fitness", which is:
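\[ \frac{\alpha_F (A + AD) + \alpha\, AE}{A + AD + AE} \] for species A, composed of populations A+AD+AE,

\[ \frac{\delta_F D + \delta\, AD}{D + AD} \] for species D, composed of populations D+AD,

and \[ \frac{\varepsilon_F E + \varepsilon\, AE}{E + AE} \] for species E, composed of populations E+AE.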
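The dependence of these effective fitnesses on the dissociation constants can be sketched by combining them with the exchange equilibria (5.1) and (5.2), which fix the ratio of bound to free guests once the free host population is known. In the sketch below the free host population `free_A` is an arbitrary illustrative number (in the full model it is itself a dynamical variable), the fitness coefficients are taken from Table 1 with the bound fitness of E set to 10, and the function names are hypothetical.

```python
def bound_to_free_ratio(k_assoc, k_dissoc, d, free_host):
    """Ratio AD/D (or AE/E) at exchange equilibrium, from (5.1)-(5.2)."""
    return k_assoc * free_host / (k_dissoc + 2.0 * d)

def effective_fitness(fit_free, fit_bound, ratio):
    """Average fitness of a guest species whose bound/free ratio is `ratio`."""
    return (fit_free + fit_bound * ratio) / (1.0 + ratio)

d = 1e-2          # death rate from Table 1
free_A = 0.5      # assumed free host population, for illustration only

# Species E: free fitness 7, bound fitness 10 (within the 8-12 range), kse = 0.1.
fitness_E = effective_fitness(7.0, 10.0, bound_to_free_ratio(1.0, 0.1, d, free_A))

# Species D: free fitness 8, bound fitness 12, for several dissociation constants.
fitness_D = {
    ksd: effective_fitness(8.0, 12.0, bound_to_free_ratio(1.0, ksd, d, free_A))
    for ksd in (0.1, 1.0, 4.0, 12.0)
}
```

With these illustrative numbers the effective fitness of D falls from close to its bound value towards its free value as the dissociation constant grows, crossing below that of E, which is the qualitative mechanism behind the transitions discussed in the text.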
The effective fitnesses are thus average fitnesses, intermediate between those of free and bound organisms (see Figure 4). Since the processes of association and dissociation are faster than those of death and reproduction, the selection process, which is based on the reproduction/death balance, occurs for "species" rather than for individuals. We then expect that those species with larger effective fitness, or larger fitnesses as far as B and C are concerned, will become predominant. Since the effective fitnesses depend on the ratio of free to bound organisms, they vary with the dissociation constant, which gives rise to the observed transitions.

The maximisation of effective fitnesses plays a role similar to the minimisation of free energies in the thermodynamics of solutions. The phases, nearly pure chemical species or mixtures, that are obtained for a given set of thermodynamic conditions are those which minimize free energy under these conditions. The same is true here: the observed dynamical regimes involve either a strict selection of species, as in the case of egoism and mutualism, or coexistence of species, but in both cases the highest fitnesses are selected. To carry on this analogy, the mutation rate, which increases the populations of subdominant species, has a role similar to temperature in physical systems. In ordinary physical systems, phase transitions are obtained by changes of intensive variables such as temperature or the concentrations of some chemical species. In the systems of biologically interacting species that we describe here, the most biologically plausible factor for possible changes in dynamical regime is the appearance of new mutants with different parameters for fitness, death or association. Resources are extensive variables in this model, the equivalent of volumes in the thermodynamics of solutions.
5.2 Dominant species

We have seven equations with seven unknowns, and we should, in principle, be able to compute all variables, which we expect a priori to be of the same order of magnitude. In fact this is never observed; if one tries to solve the system directly, some variables have to be set to zero to avoid contradictions. This means that only some of the variables have strictly positive values; they correspond to the dominant populations with larger effective fitness (see Figure 3) and they can be directly computed by solving the slow manifold equations without mutation terms. The other variables are in fact smaller by a factor m/d; they can be computed from the dominant populations by taking into account the mutation terms that we have previously neglected.

The last five equations (5.3) to (5.7) describe the balance between proliferation and death of species A, D, E, B and C. Not all populations can satisfy these equations at the same time. Equations (5.4) and (5.7), for instance, when combined, give:

$$\gamma = \frac{\delta_F\,D + \delta\,AD}{D + AD} \tag{5.8}$$

which implies:

$$\delta_F < \gamma < \delta \tag{5.9}$$

in contradiction with expression (3.2), δF = γ. In fact, the species with larger effective fitness, D, predominates with a large population of order δ.f2/d (cf. equation (1.2)). The mutation term has to be re-introduced in equation (5.7) in order to compute C, which is smaller than D+AD by a factor of order m/d.

The different dynamical regimes which are observed in computer simulations correspond to different branches of the slow manifold where different sets of populations are dominant. There is a limited domain of parameters where a particular attractor solution exists, which gives a necessary condition to reach this attractor. To actually solve the equations, one selects a set of predominant populations, guided by the simulation results. The self-consistency of the choice is checked by solving the equilibrium equations. The following algebraic computations using the slow manifold approximation allow us to predict the possible attractors for a given set of parameters. But, unless the attractor is unique, knowing the domains of existence of the attractors does not predict which possible attractor is reached for every initial condition.
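The step from equations (5.4) and (5.7) to the relation for γ used earlier in this subsection can be spelled out as follows. This is a sketch that assumes both C and D+AD are strictly positive, which is exactly the assumption that turns out to be untenable.

```latex
\begin{align*}
\text{(5.7) with } C > 0: \quad
  & \frac{\gamma\, f_2}{C + D + E + AD + AE} = d \\
\text{(5.4) with } D + AD > 0: \quad
  & \frac{(\delta_F\, D + \delta\, AD)\, f_2}{C + D + E + AD + AE} = d\,(D + AD) \\
\text{dividing the second line by the first:} \quad
  & \frac{\delta_F\, D + \delta\, AD}{D + AD} = \gamma
\end{align*}
```

The left-hand side is a weighted mean of δF and δ, so it lies strictly between them whenever both D and AD are positive. It cannot equal γ = δF, hence C cannot be a dominant population alongside D.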
5.3 The coexistence regime

Let us start with the coexistence regime, where all species except C are dominant. The following change of variables is made:

$$x = \frac{AE}{E + AE}, \qquad y = \frac{AD}{D + AD}, \qquad z = \frac{E + AE}{D + AD + E + AE}, \qquad K = \frac{k_{sd} + 2d}{k_{se} + 2d} \tag{5.10}$$

Equations (5.1) and (5.2) being combined into one, the system may now be simplified into:

$$\frac{x}{1 - x} = \frac{K\,y}{1 - y} \tag{5.11}$$
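The combination of (5.1) and (5.2) into (5.11) can be made explicit. The sketch below assumes equal linking constants for D and E (k_ld = k_le), which is an assumption introduced here to recover the stated form; with unequal constants the ratio k_le/k_ld would multiply K.

```latex
\begin{align*}
\text{from (5.1):} \quad & \frac{y}{1 - y} = \frac{AD}{D} = \frac{k_{ld}\, A}{k_{sd} + 2d} \\
\text{from (5.2):} \quad & \frac{x}{1 - x} = \frac{AE}{E} = \frac{k_{le}\, A}{k_{se} + 2d} \\
\text{ratio, with } k_{ld} = k_{le}: \quad
  & \frac{x}{1 - x} \Big/ \frac{y}{1 - y} = \frac{k_{sd} + 2d}{k_{se} + 2d} = K
\end{align*}
```

The free host population A cancels in the ratio, which is why (5.11) involves only the bound fractions x and y and the constant K.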
A is neglected with respect to AE and AD. This is based on simulation results. The magnitude of A could also be estimated from equations (5.1) and (5.2), taking into account the fact that AD/D and AE/E are of order 1. The following equations (5.12 - 5.15) are derived from equations (5.3 - 5.6):

$$\alpha_F\,y\,(1 - z) + \alpha\,x\,z = \beta\,(x\,z + y\,(1 - z)) \tag{5.12}$$

$$\delta_F\,(1 - y) + \delta\,y = \frac{d}{f_2}\,(C + D + E + AD + AE) \tag{5.13}$$

$$\varepsilon_F\,(1 - x) + \varepsilon\,x = \frac{d}{f_2}\,(C + D + E + AD + AE) \tag{5.14}$$

$$A + AD + AE + B = \frac{\beta\,f_1}{d} \tag{5.15}$$
Figure 4 is a graphical resolution of equations (5.11), (5.13) and (5.14). This diagram shows the linear increase of the effective fitnesses as a function of the fraction of linked organisms. Equating the left-hand sides of equations (5.13) and (5.14) shows that equality among effective fitnesses, which is the condition for coexistence of species D and E, is obtained if and only if the segments delimited on the horizontal by the fitness lines are in the ratio corresponding to equation (5.11). Two solutions exist for large values of K, but only the one closer to x=1, corresponding to the highest fitness, is stable. The other one is close to y=0. When K decreases, the two solutions get closer until they merge at the transition from coexistence to egoism.

The algebraic solution is obtained by equating the left-hand sides of equations (5.13) and (5.14), which allows us to express y as a function of x. Putting the resulting y in equation (5.11) gives a second degree equation in x. Setting the discriminant of this equation to zero gives the lower boundary in ksd for the existence of the coexistence attractor (the corresponding transition line is the lower dotted line in Figure 3). The boundary depends only upon the ratio of association constants K and upon the fitnesses of species E and D. It is independent of available resources. An upper boundary in ksd is obtained by using equation (5.12) to compute the actual magnitudes of the populations; it is reached for rather large values of ksd, when B goes to 0. In between, equations (5.11) to (5.15) allow us to compute the populations of the dominant species, and C is computed from the original equation (4.5) with mutations.
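The algebraic procedure just described (express y through x from the equality of (5.13) and (5.14), substitute into (5.11), examine the discriminant) can be sketched in Python as follows. The function name is hypothetical, and the fitness values in the usage lines are illustrative assumptions (a 30% fitness gain inside the host), not the parameters behind Figure 3.

```python
import math


def coexistence_fractions(delta_F, delta, eps_F, eps, K):
    """Return the (x, y) pairs in (0, 1) solving (5.11) with (5.13) = (5.14)."""
    # Equal effective fitnesses give a linear relation y = a + b * x.
    a = (eps_F - delta_F) / (delta - delta_F)
    b = (eps - eps_F) / (delta - delta_F)
    # Substituting into x * (1 - y) = K * y * (1 - x): q2*x^2 + q1*x + q0 = 0.
    q2 = b * (K - 1.0)
    q1 = 1.0 - a - K * (b - a)
    q0 = -K * a
    if q2 == 0.0:
        roots = [-q0 / q1] if q1 != 0.0 else []
    else:
        disc = q1 * q1 - 4.0 * q2 * q0
        if disc < 0.0:
            return []  # no coexistence solution for this K
        s = math.sqrt(disc)
        roots = sorted([(-q1 - s) / (2.0 * q2), (-q1 + s) / (2.0 * q2)])
    pairs = [(x, a + b * x) for x in roots]
    return [(x, y) for x, y in pairs if 0.0 < x < 1.0 and 0.0 < y < 1.0]


# Hypothetical fitnesses: delta_F = 8, delta = 10.4, eps_F = 7, eps = 9.1.
two_solutions = coexistence_fractions(8.0, 10.4, 7.0, 9.1, K=10.0)
no_solution = coexistence_fractions(8.0, 10.4, 7.0, 9.1, K=5.0)
```

For these values the larger K yields two admissible solutions and the smaller K yields none; the lower boundary of the coexistence domain is the K at which the discriminant vanishes.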
[Figure 4: effective fitnesses δ(y) and ε(x) plotted against the bound fractions x, y (horizontal axis, 0 to 1).]

Fig. 4: Effective fitnesses and limits of stability. When x (resp. y), the fraction of associated true symbionts E (resp. parasites D), varies between 0 and 1, the effective fitness ε(x), averaged over both associated and free forms, varies between εF and ε (resp. δF < δ(y) < δ). Equality of effective fitnesses (the transition condition) and the association equilibria require that the segments delimited on the horizontal line satisfy the ratio K defined in the text (see equation (5.11)).
5.4 The mutualistic regime
In the other regimes, the set of dominant species is even smaller. We could of course systematically try all possible combinations of species to find out whether they satisfy equations (5.11) to (5.15), but the easiest way is to use the simulation results to select them. In the mutualistic regime, the dominant population is AE. It is computed from equation (5.3) by neglecting all other populations on the f1 branch:

$$AE = \frac{\alpha\,f_1}{d} \tag{5.16}$$

Using this result in equation (5.5) gives a second degree equation for x, the ratio of bound E to total E (as defined above in (5.10)):

$$(\varepsilon - \varepsilon_F)\,x^2 + \varepsilon_F\,x - \frac{\alpha\,f_1}{f_2} = 0 \tag{5.17}$$

The condition x < 1 gives the following inequality between the fitnesses and the available foods:

$$\varepsilon\,f_2 > \alpha\,f_1 \tag{5.18}$$
When both sides are divided by d, they represent the maximum population sizes of species E and A. Inequality (5.18) is then interpreted as the possibility for E to eventually bind to all available A. This means that mutualism can be established by compensating the difference in fitness between E and D inside A by a large enough rejection rate for D, provided that the food available to species D and E is sufficient for E to always saturate A. Otherwise, however large ksd is with respect to kse, the effective fitness of A is not larger than β, and mutualism cannot be established.

The mutualistic regime is observed as long as the effective fitness of species E is larger than that of D. A transition to coexistence occurs when both fitnesses become equal. The effective fitness of E is expressed in terms of x computed from (5.17). x is used to compute y by (5.11), which is then used to compute the effective fitness of D. The transition line of Figure 3 (the upper dotted line), which limits the domain of existence of mutualism, is obtained when the values of x and y are substituted in the equality:

$$\delta_F\,(1 - y) + \delta\,y = \varepsilon_F\,(1 - x) + \varepsilon\,x \tag{5.19}$$
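The bound fraction x of equation (5.17) and the saturation test (5.18) can be sketched in Python as below. The function names and the sample numbers are assumptions for illustration, and the sketch assumes ε is not smaller than εF so that the square root is real.

```python
import math


def mutualism_bound_fraction(eps_F, eps, alpha, f1, f2):
    """Positive root x of (eps - eps_F)*x**2 + eps_F*x - alpha*f1/f2 = 0, eq (5.17)."""
    r = alpha * f1 / f2
    q2 = eps - eps_F
    if q2 == 0.0:
        return r / eps_F  # the equation is linear when eps == eps_F
    return (-eps_F + math.sqrt(eps_F * eps_F + 4.0 * q2 * r)) / (2.0 * q2)


def can_saturate_host(eps, alpha, f1, f2):
    """Inequality (5.18): symbionts E can bind to all available hosts A."""
    return eps * f2 > alpha * f1


# Hypothetical values: eps_F = 7, eps = 9.1, equal food supplies.
x_ok = mutualism_bound_fraction(7.0, 9.1, alpha=6.0, f1=1.0, f2=1.0)
x_too_large = mutualism_bound_fraction(7.0, 9.1, alpha=10.0, f1=1.0, f2=1.0)
```

When the inequality fails, the root of (5.17) exceeds 1, which is not an admissible fraction; this is the algebraic counterpart of the statement that mutualism cannot be established if E cannot saturate A.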
5.5 The selfish regime

The dominant species in the selfish regime are B and D, given by equations (5.4) and (5.6):

$$B = \frac{\beta\,f_1}{d}, \qquad D = \frac{\delta\,f_2}{d} \tag{5.20}$$

The most important mutants, AD and E, are obtained from equations (5.3) and (5.5), where the most important mutation terms, those involving B and D, have been reintroduced in the right-hand side:

$$AD = \frac{\beta\,m\,B}{d\,(\beta - \alpha)}, \qquad E = \frac{\delta_F\,m\,D}{d\,(\delta_F - \varepsilon_F)} \tag{5.21}$$
A and AE are finally obtained from the kinetic equations (5.1) and (5.2). The limit of stability of the selfish regime is reached when the effective fitness of A reaches that of B, and the same is true for E and D. In the algebraic computation of the stability limit, the fitnesses α and εF in the denominators have to be replaced by the effective fitnesses obtained by iterating the above computation in order to achieve a reasonable accuracy. Let us note that the stability of the egoism attractor depends upon the mutation rate, which is not the case for the other regimes.

All the quantitative predictions of the slow manifold analysis concerning the values of the populations at equilibrium and the limits of stability of the different attractors have been checked against the numerical integration of the complete model. The agreement is excellent and discrepancies never exceed one percent. Apart from the transitions between different attractors according to the initial conditions, which can only be obtained by numerical integration, the slow manifold analysis is then a very good predictor of the dynamics.

Figure 5 is a summary of the limits of stability of the different attractors obtained from numerical simulations and algebraic analysis of the slow manifold. We are presently concerned with the case d* = d (the black dots in the figure). Let us note the vertical asymptote of the lower mutualism limit when both maximum populations are equal, as obtained from equation (5.17). The monotonically decreasing upper curve is the upper limit of egoism. When this limit is crossed, a transition to mutualism is obtained in the left part of the diagram and to coexistence in the right part. The bifurcation point between the two possible regimes after the transition lies very close to, and to the left of, the intersection with the lower stability limit of mutualism. In other words, nearly as soon as mutualism is a possible attractor, it is reached when egoism becomes unstable.
The condition for this scenario is the possibility for the symbionts to saturate the available hosts. Otherwise, mutualism is impossible and transition to coexistence is observed. The diagram of Figure 3 with several reachable attractors is in fact only obtained for a small parameter region in the vicinity of the equality of maximum populations.
[Figure 5: stability limits plotted against P1/P2 (horizontal axis, 0 to 2.5) and ksd (vertical axis). Legend: mutualism, egoism, coexistence, mutualism (d>d*), egoism (d>d*).]

Fig. 5: Limits of stability in the plane (P1/P2, ksd). The transition from egoism to coexistence or mutualism depends on the saturation ratio relating the maximum populations of host (A species), P1 = f1.α.d*, to symbiont (E species), P2 = f2.ε.d (equations (5.18) and (7.5)). The vertical line at P1 = P2 is a limit to the existence of mutualism, which is only possible in the left part of the diagram. The limits of stability of the three regimes are drawn when the endosymbionts enjoy either an increase in fitness (black dots) or in lifetime (d>d*) (open dots). The upper decreasing curves are the upper stability limits of egoism (diamonds), and the lower curves are the lower stability limits of mutualism (squares). The horizontal straight line is the lower stability limit of coexistence, for both cases. In the lower ksd regions egoism, and in the upper regions mutualism or coexistence, according to the possibility of saturation of the host by the symbionts, are the only possible attractors. In between, which attractor is reached depends on the initial conditions (hysteresis).
6 The multi-site model (MSM)
Binary associations are not the rule and in most cases the host offers a number of sites, p, to the invading species (Figure 6).
Fig. 6: An organism A with p sites occupied or not by E's or D's.
This is the case, for instance, for corals with algae inside polyps. The straightforward extension of the above model would be to write a large system of ordinary differential equations including all species such as AD_rE_s, where r and s vary from 0 to p and indicate the number of D and E organisms inside A. In this large system, we can expect that after any reproduction, mutation or death event, the induced fluctuations in the rate of occupancy of the host by the endosymbionts D and E decay very fast while the populations reach their values on the slow manifold. Since we do not care much about these fluctuations, we can greatly simplify the model.

A simple approach, which takes into account the different time scales for association and growth, is to suppose that association and dissociation processes are quasi-instantaneous. Apart from the free species, we only have to deal with the variables A, AD and AE. AD (resp. AE) is the population of organisms of type D (resp. E) inside A. A is the population of organisms A, which offer p sites per individual to invading organisms D and E. Among these p sites, AE/A sites are occupied by E organisms and AD/A are occupied by D organisms. In other words, we are neglecting fluctuations of the occupancy rates of D and E inside the different A organisms. This approximation, equivalent to a mean field approach, is based on the fast association kinetics. The differential system is then written:
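$$\frac{dA}{dt} = \frac{(\alpha_F\,(p\,A - AE) + \alpha\,AE)\,f_1}{p\,(A + B)} - (d + m\,n)\,A + m\,B \tag{6.1}$$

$$\frac{dAD}{dt} = \frac{\delta\,AD\,f_2}{C + D + E + AD + AE} - 2\,(d + m\,n)\,AD + m\,AE - k_{sd}\,AD + k_{ld}\,(p\,A - AD - AE)\,D \tag{6.2}$$

$$\frac{dAE}{dt} = \frac{\varepsilon\,AE\,f_2}{C + D + E + AD + AE} - 2\,(d + m\,n)\,AE + m\,AD - k_{se}\,AE + k_{le}\,(p\,A - AD - AE)\,E \tag{6.3}$$

$$\frac{dB}{dt} = \frac{\beta\,B\,f_1}{A + B} - (d + m\,n)\,B + m\,A \tag{6.4}$$

$$\frac{dC}{dt} = \frac{\gamma\,C\,f_2}{C + D + E + AD + AE} - (d + m\,n)\,C + m\,(D + AD) \tag{6.5}$$

$$\frac{dD}{dt} = \frac{\delta_F\,D\,f_2}{C + D + E + AD + AE} - (d + m\,n)\,D + (d\,AD) + m\,(C + E + n\,AD) + k_{sd}\,AD - k_{ld}\,(p\,A - AD - AE)\,D \tag{6.6}$$

$$\frac{dE}{dt} = \frac{\varepsilon_F\,E\,f_2}{C + D + E + AD + AE} - (d + m\,n)\,E + (d\,AE) + m\,(D + n\,AE) + k_{se}\,AE - k_{le}\,(p\,A - AD - AE)\,E \tag{6.7}$$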
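For readers who want to experiment, the right-hand side of the system (6.1) to (6.7) can be written as a plain Python function. This is a minimal sketch: the dictionary keys, the function name and every numerical value in `example_par` and `example_state` are assumptions made for illustration, not the parameters of the simulations reported here.

```python
def msm_rhs(state, par):
    """Time derivatives of the multi-site model, eqs (6.1)-(6.7)."""
    A, AD, AE = state["A"], state["AD"], state["AE"]
    B, C, D, E = state["B"], state["C"], state["D"], state["E"]
    p, d, m, n = par["p"], par["d"], par["m"], par["n"]
    f1, f2 = par["f1"], par["f2"]
    loss = d + m * n                   # death plus mutation outflow
    N2 = C + D + E + AD + AE           # populations sharing food f2
    free_sites = p * A - AD - AE       # unoccupied sites offered by the hosts
    # Net binding fluxes: linking minus dissociation.
    bind_D = par["k_ld"] * free_sites * D - par["k_sd"] * AD
    bind_E = par["k_le"] * free_sites * E - par["k_se"] * AE
    return {
        "A": (par["alpha_F"] * (p * A - AE) + par["alpha"] * AE) * f1
             / (p * (A + B)) - loss * A + m * B,
        "AD": par["delta"] * AD * f2 / N2 - 2 * loss * AD + m * AE + bind_D,
        "AE": par["eps"] * AE * f2 / N2 - 2 * loss * AE + m * AD + bind_E,
        "B": par["beta"] * B * f1 / (A + B) - loss * B + m * A,
        "C": par["gamma"] * C * f2 / N2 - loss * C + m * (D + AD),
        "D": par["delta_F"] * D * f2 / N2 - loss * D + d * AD
             + m * (C + E + n * AD) - bind_D,
        "E": par["eps_F"] * E * f2 / N2 - loss * E + d * AE
             + m * (D + n * AE) - bind_E,
    }


# Hypothetical parameters and state, for illustration only.
example_par = dict(p=10, d=0.01, m=1e-4, n=2, f1=1.0, f2=1.0,
                   alpha_F=5.0, alpha=6.5, beta=6.0, gamma=8.0,
                   delta_F=8.0, delta=10.4, eps_F=7.0, eps=9.1,
                   k_ld=0.1, k_sd=1.0, k_le=0.1, k_se=0.1)
example_state = dict(A=5.0, AD=2.0, AE=3.0, B=6.0, C=1.0, D=4.0, E=2.0)
derivatives = msm_rhs(example_state, example_par)
```

A useful sanity check on such a transcription is that the binding fluxes cancel in the sums dAD/dt + dD/dt and dAE/dt + dE/dt, which is the property exploited by the slow manifold analysis. The function can be passed to any ODE integrator; since association is fast compared with growth, a small time step or a stiff solver is likely to be needed.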
The growth term for A is a linear function of the fraction of sites occupied by E. Mutations in A release infesting organisms, resulting in population changes in AD, AE, D and E. They appear as proportional to these populations rather than to A because of the occupancy rates. For instance, when an A organism mutates with frequency m, it frees AE/A organisms of species E. The corresponding source term in the E differential equation is then:

$$m \cdot A \cdot \frac{AE}{A} = m\,AE \tag{6.8}$$
When simulated with p=10 or 100, this model (MSM) gives time evolutions of the populations which are very similar to those obtained with the single site model (SSM). Populations are the same in both models for the same set of parameters, provided that the following changes are made:

single site model      multi-site model
f1                     f1/p
B                      p.B
A                      p.A - AD - AE
Except for their short time linking dynamics, both SSM and MSM are pretty much the same, once one realizes that free sites of the A species in the MSM play the same role as free A in the SSM. When the slow manifold analysis is carried out for the multi-site model with the above changes, most equations are identical. The last four equations (5.4) to (5.7) of the simplified dynamical system are the same for both models (single site and multi-site, resp. SSM and MSM). For the MSM the first three equations are written:

$$k_{se}\,AE = k_{le}\,(p\,A - AD - AE)\,E \tag{6.9}$$

$$k_{sd}\,AD = k_{ld}\,(p\,A - AD - AE)\,D \tag{6.10}$$

$$\frac{(\alpha_F\,(p\,A - AE) + \alpha\,AE)\,f_1}{p\,(A + B)} = d\,A \tag{6.11}$$
The first two equations, once combined, give the same equation, (5.11), in x and y as in the SSM. The third equation (6.11) is equivalent to equation (5.3) of the SSM when one notices that the free A population of the SSM corresponds to the number of free sites in A, p.A-AD-AE, of the MSM. f1 in the SSM then corresponds to f1/p in the MSM.

Since the dynamical equations on the slow manifold are the same in both models, we expect the attractors and their domains of existence to be the same, which is verified by the computer simulations. The only difference concerns the evolution towards the attractors. At short time scales of order (ksd)^-1 some differences are indeed observed, consistent with the approximations made in the MSM on the binding dynamics. Otherwise, at larger time scales the dynamics are identical, and even the transition lines from any given initial conditions are very close. Their relative distance in the (ε, ksd) plane is never more than 3% of the parameters. In conclusion of this section, all the dynamical results that were obtained for the simple but rather unrealistic single site model remain valid for the evolution of the multi-site model, which possesses some biological relevance.
7 Discussion
7.1 Death rates of hosts and endosymbionts
In the case of coelenterates and algae, the host lifetime is longer than the endosymbiont lifetime. The assumption of equal death rates for both is unrealistic, but we have tested with numerical simulations that the qualitative features of the model, such as the existence of the three dynamical regimes, are preserved when the death ratio is changed up to a factor of 10. For the purpose of comparison, when changing the death ratio, we kept constant the ratio of maximum populations of host and symbiont (P1/P2 in Figure 5), which controls the existence of coexistence, by changing f1. (Here dh is the death rate of the host and varies from 10^-2 to 10^-3, the death rate d of the endosymbiont remaining unchanged.) This result could have been expected from the fact that the slow manifold equations remain unchanged.

7.2 Selective protection by the host
Mutualistic associations sometimes involve protection in exchange for food, for instance in the case of damselfish and anemone, or coelenterate and algae. This protection against predators might be described in our model by changing the death rate inside the host with respect to the death rate outside. By regrouping the first two terms of the differential equation (1.1), written here with a_i for the fitness coefficient of population P_i, f for the food supply and N for the total population sharing it:

$$\frac{dP_i}{dt} = \left(\frac{a_i\,f}{N} - d\right) P_i + \dots$$

one sees that a change in the death rate is equivalent to the same relative change in the fitness coefficient. We have done series of simulations and algebraic computations on the slow manifold to study the "extreme" case when the only benefit enjoyed by the endosymbionts D and E (or the damselfish in the case of fishes and protective hosts) is a reduction of their death rate d* with respect to their death rate d when they are free:

d = 0.01, d* = 0.007

In contrast with section 4, for the present simulations, fitnesses are kept constant inside and outside the host:

δF = δ = 8
εF = ε = 7
All other parameters are the same as for previous cases. The 30% decrease in death rate for endosymbionts is comparable to the 30% increase in fitnesses used in sections 4 and 6. We then expect comparable behaviors when the other parameters, such as the dissociation rate ksd and the maximum population ratio (f1.d*.α)/(f2.d.ε), are the same. This is indeed observed on the phase diagram of Figure 5, where the open dots, corresponding to the model where protection is offered by the host, fall alongside the black dots, corresponding to the model where the fitness of the symbionts increases inside the host. The slow manifold analysis follows the procedures that we have developed in section 5. Only equations (5.4) and (5.5) are changed and become:

$$\frac{dD}{dt} + \frac{dAD}{dt} = \frac{\delta\,(D + AD)\,f_2}{C + D + E + AD + AE} - (d\,D + d^{*}\,AD) = 0 \tag{7.1}$$

$$\frac{dE}{dt} + \frac{dAE}{dt} = \frac{\varepsilon\,(E + AE)\,f_2}{C + D + E + AD + AE} - (d\,E + d^{*}\,AE) = 0 \tag{7.2}$$

Equation (7.1) can be rewritten:

$$\frac{dD}{dt} + \frac{dAD}{dt} = \left(\frac{\delta\,f_2}{C + D + E + AD + AE} - \frac{d\,D + d^{*}\,AD}{D + AD}\right)(D + AD) = 0 \tag{7.3}$$
which shows that the per capita rate of increase is now averaged over the two populations, free and associated, via the death term. The previous analysis (section 4) has shown that instabilities of the different regimes were obtained when average fitnesses became equal. It is now generalized by comparing the averaged per capita rates of increase, which shows that the transitions are obtained by equating average death terms divided by the corresponding fitness. The transitions are then obtained when:

$$\frac{1}{\delta}\,\frac{d\,D + d^{*}\,AD}{D + AD} = \frac{1}{\varepsilon}\,\frac{d\,E + d^{*}\,AE}{E + AE} \tag{7.4}$$
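Each side of condition (7.4) is an average death rate divided by a fitness. A small Python sketch, written in terms of the bound fractions y = AD/(D+AD) and x = AE/(E+AE), makes the comparison concrete; the function name is hypothetical, while the numerical values are the ones quoted in this section (d = 0.01, d* = 0.007, δ = 8, ε = 7).

```python
def scaled_death(fitness, d_free, d_bound, bound_fraction):
    """Average death rate of a species divided by its fitness, one side of (7.4)."""
    mean_death = d_free * (1.0 - bound_fraction) + d_bound * bound_fraction
    return mean_death / fitness


# Species E fully bound (x = 1) versus species D with two thirds bound (y = 2/3).
side_E = scaled_death(7.0, 0.01, 0.007, bound_fraction=1.0)
side_D = scaled_death(8.0, 0.01, 0.007, bound_fraction=2.0 / 3.0)
```

With these values both sides equal 0.001, so a fully protected E population balances a D population that is two thirds bound. For any smaller bound fraction of D the scaled death rate of D is larger, and E has the higher per capita rate of increase.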
The same arguments concerning equalities of the per capita rates of increase allow us to predict the transitions in the most general case, when the symbiosis involves simultaneous benefits in fitness and protection.
7.3 Selective digestion by the host

We have tried to check whether a selective digestion of the endosymbionts could result in mutualism. We were led to this hypothesis by the fact that parasitic algae are sometimes digested by Hydra normally involved in association with Chlorella. ksd and kse in equations (4.2) and (4.3) now correspond to digestion coefficients. The source terms (ksd.AD) and (kse.AE) are then removed from equations (4.6) and (4.7), since the symbionts die from the digestion process. The positive terms in ksd are kept in equation (4.1), since the digestion of the symbionts frees A. No mutualism is observed when the digestion is faster than or of the same order of magnitude as the death process, which corresponds to the most reasonable hypothesis. The slow manifold analysis allows us to understand the dynamical behavior of the system. Equations (5.4) and (5.5) now become:

$$\frac{dD}{dt} + \frac{dAD}{dt} = \frac{(\delta_F\,D + \delta\,AD)\,f_2}{C + D + E + AD + AE} - (d + m\,n)\,(D + AD) - k_{sd}\,AD = 0 \tag{7.5}$$

$$\frac{dE}{dt} + \frac{dAE}{dt} = \frac{(\varepsilon_F\,E + \varepsilon\,AE)\,f_2}{C + D + E + AD + AE} - (d + m\,n)\,(E + AE) - k_{se}\,AE = 0 \tag{7.6}$$
The digestion terms add up to the death term and mutation term (d+m.n), which constitutes a strong handicap for species D and E with respect to organisms C. C remains predominant on its branch, except when ksd is notably smaller than d. Egoism is then the only observed regime, when ksd is larger than d.
8 Conclusions
Let us summarize the results and discuss the biological significance of the model. Mutualism does not contradict the Darwinian theory of selection of the fittest, provided that one compares species according to their effective fitness, which takes into account the benefits enjoyed by the symbionts while they are associated. Mutualism can be established when a recognition mechanism allows the host to discriminate between parasites and bona fide symbionts. One is then led to look for possible recognition mechanisms in biological associations. In the case of the rumen/enterobacteria association {Begon, Harper and Townsend, 1986}, the evident candidate for the recognition function is the immune system. For the rhizobium of legumes, associating the legume cells and nitrogen-fixing bacteria, lectins able to recognize the polysaccharides on the cell walls of the bacteria are also rather convincing molecular mechanisms {Lis and Sharon, 1986}. Other molecular recognition mechanisms have been documented in very simple organisms (sponges, tunicates, coelenterates) which could be involved in symbiotic associations {Douglas, 1988; Taylor, 1973}. Negative electric charges on the surface of symbiotic Chlorella cell walls allow Hydra to recognize them {McNeil, Hohman and Muscatine, 1981}. Another mechanism, which seems to apply to the Hydra/Chlorella system, is the detection by the Hydra of maltose released by the Chlorella (maltose is the benefit enjoyed by the Hydra from the Chlorella) {Hohman, McNeil and Muscatine, 1982; McAuley and Smith, 1982; Muscatine and McNeil, 1989}.
The differential model that we have built allows us to predict the possible emergence of mutualism according to the individual characteristics of the organisms. It applies to most endosymbiotic systems, where association times are shorter than the organisms' lifetimes. It takes into account exchanged benefits such as food and protection. Most of the discussion involved the single site model, but we have shown in section 6 that this simpler model is equivalent to the multiple site model in the time range of interest to us.
The possibility of a coexistence regime with both commensals and true symbionts present with comparable populations was an unexpected result of the model. We are tempted to consider that the possible saturation of the host by the symbionts is the normal case, which excludes the possibility of observing coexistence. On the other hand, coexistence could be a possibility in those many systems where we cannot figure out the benefits of the association for each individual organism involved, e.g. lichen, where the benefit for the algae is not obvious {Begon, Harper and Townsend, 1986}, or intestinal flora in insects or mammals. Since the coexistence situation is less favorable to the host than complete mutualism, we might imagine that further mutations would select host organisms with larger rejection rates of commensals. The real existence in biological systems of the coexistence regime is still an open question worth investigating.
Although most results can be obtained through direct numerical simulation, the slow manifold analysis brings some insight into the important concepts: effective fitnesses and effective per capita increase, the saturation limit for mutualism, equivalence between single site and multiple site models, and why selective digestion is insufficient to establish mutualism. An extended version of this manuscript has been published in {Weisbuch and Duchateau, 1993}.
Acknowledgments

Computer simulations used the GRIND software {De Boer, 1983}. The Laboratoire de Physique Statistique is associated with the CNRS (URA 1306) and Paris VI and Paris VII Universities. This work was started at the Santa Fe Institute, which we thank for its hospitality. We thank NATO (CRG 900998) and the Fondation Curie external grants program for partial support. We thank Richard Belew, Bernard Derrida, Nancy Koppel, Alan Perelson, Jonathan Roughgarden and Eörs Szathmáry for helpful discussions.
References

Begon, M., J. L. Harper and C. R. Townsend. 1986. Ecology. Blackwell Scientific Publications.
Boucher, D. H., S. James and K. H. Keeler. 1982. Ecology of Mutualism. Annual Review of Ecology and Systematics 13, 315-347.
De Angelis, D. L., W. M. Post and C. C. Travis. 1986. Mutualistic and Competitive Systems. In Positive Feedback in Natural Systems. S. A. Levin (Ed.) p 290. Berlin: Springer-Verlag.
De Boer, R. J. 1983. GRIND: Great Integrator for Differential Equations. Bioinformatics Group, University of Utrecht, The Netherlands.
Douglas, A. E. 1988. Nutritional interactions as signals in the green hydra symbiosis. In Cell to cell signals in plant, animal and microbial symbiosis. S. Scannerini, D. Smith, P. Bonfante-Fasolo and V. Gianinazzi-Pearson (Eds.) p 283-296. Berlin: Springer-Verlag.
Hohman, T. C., P. L. McNeil and L. Muscatine. 1982. Phagosome-lysosome fusion inhibited by algal symbionts of Hydra viridis. Journal of Cell Biology 94, 56-63.
Ikegami, T. and K. Kaneko. 1990. Computer Symbiosis - Emergence of Symbiotic Behavior Through Evolution. Physica D 42, 235-243.
Lindgren, K. 1992. Evolutionary Phenomena in Simple Dynamics. In Artificial Life II. C. G. Langton, C. Taylor, J. D. Farmer and S. Rasmussen (Eds.) p 295-312. Addison-Wesley.
Lis, H. and N. Sharon. 1986. Lectins as molecules and as tools. Annual Review of Biochemistry 55, 35-67.
McAuley, P. J. and D. C. Smith. 1982. The green hydra symbiosis. VII Conservation of the host cell habitat by the symbiotic algae. Proceedings of the Royal Society of London B 216, 415-426.
McNeil, P. L., T. C. Hohman and L. Muscatine. 1981. Mechanisms of nutritive endocytosis. II The effect of charged agents on phagocytic recognition by digestive cells. Journal of Cell Science 52, 243-269.
Morse, P. and H. Feshbach. 1953. Methods of Theoretical Physics. McGraw-Hill.
Muscatine, L. and P. L. McNeil. 1989. Endosymbiosis in Hydra and the evolution of internal defense systems. American Zoologist 29(2), 371-386.
Nowak, M. A. and R. M. May. 1992. Evolutionary games and spatial chaos. Nature, 826-829.
Roughgarden, J. 1975. Evolution of Marine Symbiosis - A Simple Cost-Benefit Model. Ecology 56, 1201-1208.
Szathmáry, E. and L. Demeter. 1987. Group selection of early replicators and the origin of life. Journal of Theoretical Biology 128, 463-486.
Taylor, C. E., L. Muscatine and D. R. Jefferson. 1989. Maintenance and Breakdown of the Hydra-Chlorella Symbiosis: a Computer Model. Proceedings of the Royal Society of London B 238, 277-289.
Taylor, D. 1973. The cellular interactions of algal-invertebrate symbiosis. Advances in Marine Biology 11, 1-56.
Weisbuch, G. 1984. Un modèle de l'évolution des espèces à trois niveaux, basé sur les propriétés globales des réseaux booléens. Comptes rendus de l'Académie des Sciences de Paris 298(III (14)), 375-378.
Weisbuch, G. and G. Duchateau. 1993. Emergence of mutualism: Application of a differential model to endosymbiosis. Bulletin of Mathematical Biology 55(6), 1063-1090.
Wilson, D. S. 1983. The Effect of Population Structure on the Evolution of Mutualism: a Field test involving burying beetles and mites. American Naturalist 121, 851-870.
Three Illustrations of Artificial Life's Working Hypothesis

Mark A. Bedau
Reed College, 3203 SE Woodstock Blvd., Portland OR 97202, USA
Email: [email protected]

Abstract. Artificial life uses computer models to study the essential nature of the characteristic processes of complex adaptive systems--processes such as self-organization, adaptation, and evolution. Work in the field is guided by the working hypothesis that simple computer models can capture the essential nature of these processes. This hypothesis is illustrated by recent results with a simple population of computational agents whose sensorimotor functionality undergoes open-ended adaptive evolution. These might illuminate three aspects of complex adaptive systems in general: punctuated equilibrium dynamics of diversity, a transition separating genetic order and disorder, and a law of adaptive evolutionary activity.
1 Artificial Life's Working Hypothesis
Artificial life studies computer models of the processes characteristic of complex adaptive systems--processes like self-organization, self-reproduction, adaptation, and evolution. Complex adaptive systems take many forms, each of which differs from the others in myriad ways. By abstracting away from the diverse details, artificial life hopes to reveal fundamental principles governing broad classes of complex adaptive systems. This hope rests on artificial life's working hypothesis that simple computer models can capture the essential nature of complex adaptive systems [1]. I propose to pursue artificial life's working hypothesis by applying a "thermodynamic" methodology [5, 6, 3, 4, 2, 7]. Recently it has been suggested that there is a close, intrinsic connection between the content of evolution and thermodynamics (e.g., Brooks and Wiley [8]). By contrast, I envisage the two fields as sharing the methodology of developing and investigating statistical macrovariables. Thermodynamics investigates macrovariables like temperature, pressure, and specific heat, and the fruits of this method include simple, basic laws and classifications (like the ideal gas law and the phase transition separating the solids and liquids). By analogy, the "thermodynamic" approach in artificial life seeks to identify statistical macrovariables that capture the distinctive features of complex adaptive systems. The most straightforward sign that this methodology is bearing fruit would be the demonstration that appropriate macrovariables can be used to frame simple, basic laws and classifications that apply to broad classes of complex adaptive systems.
This methodology involves formulating statistical macrovariables that are general enough to apply across a wide variety of systems, and then using these variables to search for underlying quantitative order unifying different systems. It is natural to begin this endeavour with simple models, for macrovariables are easiest to formulate initially in simple models and simple models are easiest to study. Furthermore, simple models can reveal the essential nature of complex adaptive systems in general--at least, that is artificial life's working hypothesis. This working hypothesis might be false, of course. It is at odds with the conclusions often drawn from the historicity, contingency, and variety of evolving biological systems (e.g., [25, 16]). One should bear in mind though that processes rife with historicity, complexity, and variety may well still fall under simple, basic laws and classifications, especially if these laws and classifications emerge through the application of statistical macrovariables. The "thermodynamic" methodology applied to simple computer models is a promising way to identify such laws and classifications, if they exist.
2 A Simple Model of Evolution
The model studied here is designed to be simple yet able to capture the essential features of an evolutionary process [27, 5, 6, 3, 4, 2, 7]. This model is motivated by the view that evolving life is typified by a population of agents whose continued existence depends on their sensorimotor functionality, i.e., their success at using local information to find and process the resources needed to survive and flourish. Thus, information processing and resource processing are the two internal processes that dominate agents' lives, and their primary goal--whether they know this or not--is to enhance their sensorimotor functionality by suitably coordinating these two internal processes. Since the requirements of sensorimotor functionality typically alter as the contingencies of evolution change, continued viability and vitality call for sensorimotor functionality to adapt in an open-ended, autonomous fashion. The present model attempts to create agents with sensorimotor functionality that can undergo this open-ended, autonomous evolutionary adaptation.

The model consists of agents residing in a two-dimensional world, sensing their local environment, moving, and ingesting resources. All that exists in the world besides the agents are heaps of resources that are concentrated at particular locations, with levels decreasing with distance from a central location. The resource is refreshed periodically in time and randomly in space. Agents interact with the resource field at each time step by extracting any resource found at their current site and storing it in their internal resource reservoir. Agents must continually replenish their internal resource supply to survive. Agents pay a resource tax just for living and a movement tax proportional to the distance traveled. If an agent's internal resource supply drops to zero, it dies and disappears from the world. On the other hand, an agent can remain alive indefinitely if it can continue to find sufficient resources.
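The resource bookkeeping just described can be illustrated with a minimal sketch. The tax values, the wrap-around world, the order of taxing and ingesting, and all names (`Agent`, `step_agent`, `LIVING_TAX`, `MOVEMENT_TAX`) are assumptions made for illustration; the text does not specify them.

```python
import math
from dataclasses import dataclass

# Hypothetical parameter values; the text gives no numeric settings.
LIVING_TAX = 1.0      # resource paid every time step just for living
MOVEMENT_TAX = 0.5    # resource paid per unit of distance traveled


@dataclass
class Agent:
    x: int
    y: int
    reservoir: float  # internal resource supply


def step_agent(agent, field, dx, dy, world_size):
    """Move an agent, charge both taxes, and ingest the resource at its new site.

    `field` maps (x, y) sites to resource amounts. The world is assumed to
    wrap around at its edges. Returns True if the agent is still alive.
    """
    agent.x = (agent.x + dx) % world_size
    agent.y = (agent.y + dy) % world_size

    # Living tax plus a movement tax proportional to the distance traveled.
    distance = math.hypot(dx, dy)
    agent.reservoir -= LIVING_TAX + MOVEMENT_TAX * distance

    # Extract everything found at the current site.
    site = (agent.x, agent.y)
    agent.reservoir += field.get(site, 0.0)
    field[site] = 0.0

    # An agent whose supply has dropped to zero dies.
    return agent.reservoir > 0.0
```

In a full simulation this function would be called once per agent per time step, with `(dx, dy)` supplied by the agent's sensorimotor strategy.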
An agent's movement is governed by its genetically hardwired sensorimotor strategy. A sensorimotor strategy is simply a map taking sensory data from a local neighborhood (the five-site von Neumann neighborhood) to a vector indicating a magnitude and direction for movement:

$$S : (s_1, \ldots, s_5) \rightarrow \mathbf{v} = (r, \theta). \tag{1}$$
An agent's sensory data has two bits of resolution for each site, allowing the agents to recognize four resource levels (minimal resources, somewhat more resources, much more resources, maximal resources). Its behavioral repertoire is also finite, with four bits of resolution for magnitude $r$ (zero, one, ..., fifteen steps), and three bits for direction $\theta$ (north, northeast, east, ...). A unit step in the NE, SE, SW, or NW direction is defined as movement to the next diagonal site, so its magnitude is $\sqrt{2}$ times greater than a unit step in the N, E, S, or W direction. Each movement vector $\mathbf{v}$ thus produces a displacement $(x, y)$ in a square space of possible spatial destinations from an agent's current location. The graph of the strategy map $S$ may be thought of as a look-up table with $2^{10}$ entries, each entry taking one of $2^7$ possible values. This look-up table represents an agent's overall sensorimotor strategy. The entries are input-output pairs that link each sensory state (input) that an agent could possibly encounter with a specific behavior (output). The different entries in the look-up table represent genetic loci, and the movement vectors assigned to them represent alleles. Since agents have 1024 loci, each containing one out of a possible 128 alleles, the total number of different genotypes is $128^{1024}$. Although finite, this space of genotypes allows for evolution in a huge space of genetic possibilities, which simulates the much larger number of possibilities in the biological world. In order to investigate how adaptation affects the evolutionary dynamics of this model, I introduce a behavioral noise parameter, $B_0$, defined as the probability that an agent's behavior is chosen at random from the $2^7$ possible behaviors, rather than determined by the agent's genetically encoded sensorimotor strategy. Thus, behavioral noise severs the link between genotype and phenotype.
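The look-up-table genotype can be sketched as follows. The bit layout (magnitude in the high four bits, direction in the low three), the clockwise ordering of directions from north, and the function names are assumptions chosen for illustration; only the counts (five sites, two bits per site, 1024 loci, 128 alleles) come from the text.

```python
import random

N_SITES = 5                                  # five-site von Neumann neighborhood
BITS_PER_SITE = 2                            # four resource levels per site
N_LOCI = 2 ** (N_SITES * BITS_PER_SITE)      # 1024 possible sensory states
N_ALLELES = 2 ** 7                           # 4 bits of magnitude + 3 bits of direction

# Unit steps for N, NE, E, SE, S, SW, W, NW (assumed ordering, y pointing north).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def locus_index(levels):
    """Pack five sensed resource levels (each 0..3) into a locus index 0..1023."""
    if len(levels) != N_SITES:
        raise ValueError("expected one resource level per neighborhood site")
    index = 0
    for level in levels:
        if not 0 <= level < 2 ** BITS_PER_SITE:
            raise ValueError("resource level out of range")
        index = (index << BITS_PER_SITE) | level
    return index


def decode_allele(allele):
    """Split a 7-bit allele into magnitude r (0..15) and a direction index (0..7)."""
    return allele >> 3, allele & 0b111


def displacement(allele):
    """Return the (x, y) displacement encoded by an allele."""
    r, direction = decode_allele(allele)
    ux, uy = DIRECTIONS[direction]
    return r * ux, r * uy


def random_genotype(rng):
    """Draw one allele uniformly at random for each of the 1024 loci."""
    return [rng.randrange(N_ALLELES) for _ in range(N_LOCI)]
```

For example, `displacement(random_genotype(random.Random(0))[locus_index([0, 1, 2, 3, 0])])` looks up the movement that a randomly generated agent would make in one particular sensory state.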
If $B_0 = 1$, then agents survive and reproduce differentially, and children inherit their parents' strategy elements (except for mutations), but the inherited strategies reflect only random genetic drift rather than the process of adaptation. Sensorimotor strategies evolve over generations. An agent reproduces (asexually) when its internal resource supply crosses a threshold. The parent produces one child, which is given half of its parent's supply of resources. Parental allele values are inherited except when a point mutation at a locus gives a child a randomly chosen allele value. The mutation rate $\mu$ determines the probability with which individual loci mutate during reproduction. At the limit of $\mu = 1$, every allele value will mutate and thus each allele of the child is chosen completely randomly. It is important to note that selection and adaptation in the model are "intrinsic" or "indirect" in the sense that survival and reproduction are determined solely by the contingencies involved in each agent's finding and expending resources. No externally-specified fitness function governs the evolutionary dynamics [27, 5]. Good strategies for flourishing in this model would allow agents
to acquire and manage resources efficiently. However, it is an open question which specific strategies would efficiently acquire and manage resources, and there might be no universally optimal strategy. A strategy's worth is relative to the environment; a strategy might be optimal in one environment and suboptimal in another. The environment of the present model consists of the fluctuating resource field and the competing strategies possessed by the agents in the population. Both of these environmental components change during the course of evolution. The strategies directly evolve, and the resource field indirectly changes because different populations of strategies affect it differently. For this reason, the model has the potential to show an open-ended evolutionary dynamic consisting of the perpetual creation of adaptive novelty. This potential for an unpredictably shifting adaptive landscape is one reason the model resists treatment by the analytical methods used in traditional mathematical population genetics [9, 14, 15]. Not only are there thousands of loci and hundreds of alleles per locus, but the vicissitudes of natural selection indirectly cause unpredictable fluctuations in the finite population's size, age structure, and genotype distribution. In general, the only way to discern any underlying order in the model's behavior is through extensive computer simulation focussed on appropriate statistical macrovariables. These complications notwithstanding, the model is an unabashedly abstract and idealized representation of a population of evolving agents, lacking many of the features often emphasized in the biological literature. For example, the environment lacks the spatial structure required for migration effects, there are no explicit interactions (such as predation) among organisms, there is no intron/exon distinction in the chromosome, and there is no "continuity" of mutation (mutated allele values are not "near" previous values). 
Nevertheless, my working hypothesis is that this model captures the fundamental features of complex adaptive systems, and is thus a useful model for investigating the essential aspects of more realistic systems.
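The roles of the mutation rate and the behavioral noise parameter described in this section can be sketched as below. This is an illustration under stated assumptions, not the model's actual code: the function names are hypothetical, and the reproduction threshold is left to the caller because the text gives no value for it.

```python
import random

N_ALLELES = 128  # number of possible behaviors (alleles) per locus


def choose_behavior(genotype, locus, behavioral_noise, rng):
    """Return the behavior an agent performs in the sensory state `locus`.

    With probability `behavioral_noise` the behavior is drawn at random from
    all possible behaviors; otherwise the genetically encoded allele is used.
    """
    if rng.random() < behavioral_noise:
        return rng.randrange(N_ALLELES)
    return genotype[locus]


def reproduce(parent_genotype, parent_resources, mutation_rate, rng):
    """Asexual reproduction: one child, point mutations, resources split in half.

    Each locus mutates independently with probability `mutation_rate`; a
    mutated locus receives an allele chosen uniformly at random.
    Returns (child_genotype, child_resources, remaining_parent_resources).
    """
    child = [
        rng.randrange(N_ALLELES) if rng.random() < mutation_rate else allele
        for allele in parent_genotype
    ]
    child_resources = parent_resources / 2.0
    return child, child_resources, parent_resources - child_resources
```

With `behavioral_noise=1.0` the genotype never influences behavior, so inherited alleles can only drift; with `mutation_rate=1.0` every allele of the child is redrawn at random.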
3 Measurement of Population Diversity
Population diversity is one plausible statistical macrovariable for artificial life to investigate. But how might population diversity be measured? My proposal, very roughly, is to represent the population as a cloud of points in an abstract genetic space, and then define the population's diversity as the spread of that cloud. In the present model, an allele is a movement vector, a spatial displacement, and an agent's genotype is a set of spatial displacements. To capture the total population diversity, D, then, collect all the displacements of all agents in all environments into a cloud, and measure the spread or variance of that cloud. We can divide this total diversity D into two components. First, collect the spatial displacements of each agent in the population in a given environment, i.e., the traits encoded across the population at a given locus, and calculate the spread of this locus's cloud. The average spread or variance of all such locus distributions is a population's within-locus diversity, W. Now, form another,
second-order collection of the centroid of each locus's cloud, i.e., a cloud of the "mean" displacement at each locus. The spread or variance of this second-order cloud is the population's between-locus diversity, B; it measures the diversity of the different mean population responses. More formally, I define total diversity as the mean squared deviation between the average movement of the whole population, averaged over all agents and over all environmental conditions, and the individual movements of particular agents subject to particular conditions, i.e.,

$$D = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \left[ \left( x_{ij} - \bar{x}^{IJ} \right)^2 + \left( y_{ij} - \bar{y}^{IJ} \right)^2 \right], \tag{2}$$
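where $I$ is the number of agents $i$, $J$ is the number of environmental conditions (or, in the present model, loci) $j$, $(x_{ij}, y_{ij})$ is the movement vector of agent $i$ subject to input $j$, and $\bar{x}^{IJ} = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij}$ (similarly for $\bar{y}^{IJ}$). So, $(\bar{x}^{IJ}, \bar{y}^{IJ})$ is the $(x, y)$ displacement of the population averaged over all agents $i$ and loci (environments) $j$. Then, the within- and between-locus components of the total diversity are defined as follows:

$$W = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \left[ \left( x_{ij} - \bar{x}^{I}_{j} \right)^2 + \left( y_{ij} - \bar{y}^{I}_{j} \right)^2 \right], \tag{3}$$

$$B = \frac{1}{J} \sum_{j=1}^{J} \left[ \left( \bar{x}^{I}_{j} - \bar{x}^{IJ} \right)^2 + \left( \bar{y}^{I}_{j} - \bar{y}^{IJ} \right)^2 \right], \tag{4}$$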
where $\bar{x}^{I}_{j} = \frac{1}{I} \sum_{i=1}^{I} x_{ij}$ (and similarly for $\bar{y}^{I}_{j}$). So, $(\bar{x}^{I}_{j}, \bar{y}^{I}_{j})$ is the $(x, y)$ displacement of the population in locus (environment) $j$ averaged over all agents $i$. (Further formal analysis of diversity and its components is developed elsewhere [6, 3, 4].) From the analysis of variance [20], we know that the total diversity is the sum of the within- and between-locus components, $D = W + B$. The relative size of D, W, and B reflects a population's genetic structure, as two extreme kinds of populations can illustrate. First, consider a population consisting of "random agents," in the sense that each agent's alleles are chosen randomly from the set of possible alleles, different agents' alleles being chosen independently. In this case, the distribution across the population at any given locus will be a huge cloud covering the whole set of possible spatial displacements, so the population's within-locus diversity W will be quite large. Since the centroid of each of these huge clouds will be virtually the same point--the center of the space of possible behavioral displacements--the distribution of these centers of gravity will be quite tight, and so the between-locus diversity will be nearly zero, $B \approx 0$. The population's total diversity will approximately equal the within-locus diversity, $D \approx W$. A second extreme case is a population consisting of "quasi-clonal" (nearly genetically identical) agents that act differently in different environments. In this case, the within-locus diversity is nearly zero, $W \approx 0$, since the average spread of
the cloud of behavioral displacements at each environment-locus is minimal. On the other hand, since the average behaviors in different environments are quite different, the between-locus diversity is large and approximately equal to the total diversity, $D \approx B$. In this way, the relations among D, W, and B clearly distinguish the quasi-clonal and random agent populations.

3.1 Punctuated Equilibria
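Because the results in this and the following subsection are all stated in terms of D, W, and B, it may help to see the definitions of Sect. 3 (Eqs. 2-4) as a computation. The following is a minimal sketch; the data layout (`population[i][j]` holding the displacement of agent i at locus j) and the function name are assumptions made for illustration.

```python
def diversity(population):
    """Compute total, within-locus, and between-locus diversity (D, W, B).

    `population[i][j]` is the (x, y) displacement encoded by agent i at
    locus j. Every agent must have the same number of loci.
    """
    n_agents = len(population)
    n_loci = len(population[0])

    # Centroid of each locus's cloud, averaged over agents.
    locus_means = []
    for j in range(n_loci):
        mean_x = sum(agent[j][0] for agent in population) / n_agents
        mean_y = sum(agent[j][1] for agent in population) / n_agents
        locus_means.append((mean_x, mean_y))

    # Grand centroid, averaged over agents and loci.
    grand_x = sum(m[0] for m in locus_means) / n_loci
    grand_y = sum(m[1] for m in locus_means) / n_loci

    total = 0.0
    within = 0.0
    for agent in population:
        for j, (x, y) in enumerate(agent):
            total += (x - grand_x) ** 2 + (y - grand_y) ** 2
            within += (x - locus_means[j][0]) ** 2 + (y - locus_means[j][1]) ** 2

    d = total / (n_agents * n_loci)
    w = within / (n_agents * n_loci)
    b = sum((mx - grand_x) ** 2 + (my - grand_y) ** 2
            for mx, my in locus_means) / n_loci
    return d, w, b
```

A perfectly clonal population whose loci encode different movements gives W = 0 and D = B, while a population that varies only within a single locus gives B = 0 and D = W, matching the two extreme cases described in Sect. 3.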
One of the most controversial topics in recent evolutionary biology has been the existence, cause, and implications of punctuated equilibria [13, 17, 10, 23, 26]. Artificial life systems might shed some new light on this controversy, since they often display punctuated equilibria in quantities like species concentration and average fitness (e.g., [19, 21, 28]). Yet the causes of these punctuated dynamics remain uncertain. Ecological complications such as host-parasite interactions or genetic complications such as extensive epistasis are typically thought to be implicated, and it is almost universally assumed that adaptation plays an essential role. My observations question whether any of these factors are essential. I measured diversity in a series of simulations in which mutation rate and the presence or absence of adaptation were varied, while all other parameters of the model, including the size of the world and the resource environment, were held constant. Alleles were assigned to the founder population randomly, with displacement direction chosen from the eight compass directions and distance in steps chosen from zero, one and two. Thus, in the founder population, the total diversity was relatively low, $D = 2.5$, and virtually all of the total diversity was in the within-locus component, $D \approx W$ and $B \approx 0$. Diversity dynamics in the present model routinely display clear punctuated equilibria when the mutation rate is suitably low. Figure 1 shows the typical dynamics of diversity for simulations in which $\mu = 10^{-2}$. Diversity remains largely static for significant periods of time, but every now and then diversity is punctuated by very rapid changes. The resulting picture is characterized by relatively flat plateaus separated by abrupt cliffs. (Figure 1 shows the within- and between-locus diversity components, W and B. The interesting diversity punctuations occur with respect to B. B approximates D since W is very low in these simulations and $D = W + B$, so the punctuations also occur with respect to D.)

It is notable that these punctuated equilibria occur in such a simple model. None of the ecological or genetic complications usually thought to be implicated are explicitly present in the model. For example, the model allows no explicit ecological interactions like those between host and parasite and the genetic structure has no epistasis. It is true that the model could support the emergence of implicit sub-populations that follow competing or cooperating resource-finding strategies. If such sub-populations were to exist, they would produce a substantial within-locus diversity W, for the average trait at given loci would differ between the sub-populations. The slightly positive values of within-locus diversity W in the simulation with adaptation (Fig. 1, top) are too low to be consistent with significantly different sub-populations. The simulations without adaptation
Fig. 1. Punctuated equilibria in diversity dynamics from the first 1,000,000 time steps of two typical low-mutation simulations ($\mu = 10^{-2}$). Adaptation above ($B_0 = 0$), no adaptation below ($B_0 = 1$). Time series for the two diversity components, W and B, are shown. The founder populations in these simulations have fairly low diversity, so punctuations initially tend to increase diversity, as shown here. On longer time scales, punctuations are equally likely to decrease and increase diversity.
(Fig. 1, bottom) show W values virtually equal to zero, which means that the population is virtually clonal and so has no sub-populations. Thus, although interactions between sub-populations might sometimes contribute to punctuations in some of the simulations, in general sub-populations play no fundamental role in the punctuated equilibria we observe in this model. The most striking aspect of these punctuations is their presence even when adaptation is absent. Although punctuated equilibria in the absence of adaptation occur only when the mutation rate $\mu$ is suitably low, the effect is quite robust. Therefore, the presumption that punctuated equilibria must reflect the operation of adaptation is simply wrong. If punctuated equilibria are observed in the presence of adaptation, without additional evidence one cannot assume that adaptation plays any important role in their genesis. Evidently, there is an intrinsic tendency for evolving systems absent adaptation--that is, stochastically branching, trait-transmitting processes--to produce punctuated diversity dynamics, provided the branching rate is suitably poised.

3.2 Transition Separating Genetic Order and Disorder
Punctuated diversity dynamics fit into a broader pattern suggesting that evolving systems can be classified into two qualitatively different categories. I measured total diversity D and its within-locus W and between-locus B components in a series of pairs of adaptation/no-adaptation simulations, smoothly varying the mutation rate $\mu$ (on a log scale). The resulting diversity data reveal a transition separating two qualitatively different kinds of genetic systems. One indication of this transition comes from the qualitative nature of the observed diversity dynamics. As noted in the previous section, when $\mu$ is low diversity dynamics typically consist of punctuated equilibria, the frequency of which is proportional to the mutation rate. On the other hand, when $\mu$ is high the diversity dynamics exhibit noisy fluctuations around a stable equilibrium value. The amplitude of these fluctuations is inversely proportional to the mutation rate. The relationship between the total diversity and its two components clearly indicates the two different kinds of genetic systems and the transition between them. When the mutation rate is low, the total diversity is well approximated by the between-locus diversity, $D \approx B$. This shows that low mutation systems consist of the sort of "quasi-clonal" population mentioned in Sect. 3. On the other hand, when the mutation rate is high, the total diversity is well approximated by the within-locus diversity, $D \approx W$. Thus, high mutation systems consist of the sort of "random agent" population also mentioned in Sect. 3. The way in which the transition between these quasi-clonal and random populations depends on mutation rate can be made vivid by plotting the component diversity, i.e., the extent to which the total diversity D is dominated by neither W nor B but has a large contribution from each. The component diversity can be defined as the proportion of the area of a square of side D that is covered by a rectangle with sides 2W and 2B:

$$C = \frac{4WB}{D^2}. \tag{5}$$
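As a quick check on the normalization of the component diversity, assume only that $D = W + B$ (from the analysis of variance in Sect. 3), that $W, B \ge 0$, and that $D > 0$. Then

$$C = \frac{4WB}{(W+B)^2} = \frac{(W+B)^2 - (W-B)^2}{(W+B)^2} = 1 - \left( \frac{W-B}{W+B} \right)^2 .$$

So $C$ lies between zero and one, reaches one exactly when $W = B = D/2$, and vanishes exactly when one of the two components is zero.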
(The factor of 4 scales $C$ so that $0 \le C \le 1$.) I noted above that W will be near zero in a quasi-clonal population, and B will be near zero in a random population. Thus, the component diversity C will be near zero in both of these two kinds of populations. The component diversity C can approach one only if neither diversity component dominates the total diversity, which would entail that the population is neither quasi-clonal nor random. Figure 2 shows the time average of the component diversity C as a function of the mutation rate, for systems both with and without adaptation. A transition between two qualitatively different genetic systems is clearly indicated. Notice that C is close to zero if the mutation rate is either high or low, and C approaches its maximal value of one at intermediate mutation rates, roughly, $10^{-3} < \mu < 10^{-2}$. It is striking that this transition exists whether or not the agents' genetic strategies are adapting during the course of evolution. Even if all genes are merely drifting because of the operation of behavioral noise, we still see the two qualitatively different genetic systems and the transition between them. (In fact, the transition seems to be sharper without adaptation. Further details about the diversity dynamics and the effects of mutation and adaptation are described elsewhere [6, 3, 4].)

Fig. 2. Transition in diversity dynamics, reflected by the time average of the component diversity, C, as a function of mutation rate (shown on a log scale to improve resolution). The transition separates two regions of qualitatively different behavior. Systems with low mutation rate $\mu$ are genetically "ordered"--the genetic structure of each agent in general is highly correlated with those of the other agents. High $\mu$ systems are genetically "disordered"--the genetic structure of each agent in the population is uncorrelated with those of the other agents. (The leftmost data points represent not $\mu = 10^{-8}$ but $\mu = 0$.)

Figure 2 paints a picture of an abstract space of evolving systems with two qualitatively distinct regions dividing the mutation spectrum. Low mutation systems are genetically "ordered," consisting of a population of genetically identical (or nearly identical) agents--a quasi-clonal population. Different loci encode different traits, and from time to time this more or less static distribution of traits across loci abruptly shifts, causing punctuations in the prevailing genetic stasis. By contrast, high mutation systems are genetically "disordered," consisting of a population of genetically dissimilar agents, each of which has a random collection of alleles--a random population. Over time, the gene pool is a continually fluctuating random distribution. These ordered and disordered regions are separated by a transitional region. (Whether this transitional region itself contains further structure is a topic of ongoing work.)
Fig. 3. Time averages of the amount of uningested resource in the world as mutation rate $\mu$ is varied (shown on a log scale to improve resolution). In one set of simulations adaptation operates normally; in the other set of simulations adaptation is prevented with behavioral noise $B_0 = 1$. The "bars" surrounding each point indicate the standard deviation of the time series of resource values. (The leftmost data points represent not $\mu = 10^{-8}$ but $\mu = 0$.)
Figure 3 shows that this transition separating genetic order and disorder has a striking connection with population fitness. Since my model is resource-driven, the population's overall fitness is reflected by its efficiency at extracting the
available resources from the environment. (Exactly the same amount of resources was pumped into all simulations.) A crude (inverse) measure of this resource-extraction efficiency is the amount of residual (uningested) resource present in the world. The time average of residual resource is plotted against mutation rate in Fig. 3. When the dependence of residual resource is compared with the diversity transition shown in Fig. 2, we can see that maximal resource-extraction efficiency occurs when the mutation rate is at or slightly below the transition (a region that one might describe as near "the edge of disorder"). As the mutation rate rises significantly into the region in which systems are disordered, resource-extraction efficiency falls off dramatically. (There is some indication that fitness also falls off if the mutation rate is well into the region of ordered systems, but this is unclear since it is difficult to gather clean statistics at very low mutation rates.) Although the transition between genetic order and disorder exists whether or not adaptation happens, effective adaptation is evidently optimal around this edge of disorder. This effect might reflect a balance between two competing demands of evolutionary learning. On the one hand, the need to remember what has been learned requires a sufficiently low mutation rate; on the other hand, the need to explore novel possibilities requires a sufficiently high mutation rate. Optimal evolutionary learning, then, requires a mutation rate that appropriately balances these competing needs. This optimally poised mutation rate appears to coincide with the region around the edge of disorder.
4 Measurement of Adaptive Evolutionary Activity
A fundamental feature of any complex adaptive system is its adaptive evolutionary dynamics. But how might this property be measured? I think that we should conceive of adaptive evolutionary activity as the creation through the evolutionary process of sensorimotor functionality, i.e., of sensorimotor traits that are beneficial to the agents that possess them and that persist in the population because of this benefit. But how might this process be measured, especially when we might not know which traits have any functionality, and, if they do, what kind and how much? The difficulty--some would say impossibility--of answering this question was stressed in a classic paper by Gould and Lewontin [18] which subsequently generated a flood of critical debate (e.g., [12, 11, 29, 22, 24]). I propose that we can address this issue by measuring the extent to which a trait is well-tested by natural selection. Every time an agent uses one of its sensorimotor traits, natural selection has an opportunity to provide some feedback about the trait's benefit or cost. If the trait persists in the lineage through repeated use and, in particular, accumulates more usage than would be expected a priori, then we have evidence that it is persisting because of its beneficial effects. Measuring a trait's adaptive significance in this way, then, involves measuring the extent to which its use exceeds a priori expectations. In the context of the present model, sensorimotor traits are alleles. To measure the "raw" usage of an allele, assign a usage variable $u_{is}$ to the $s$th allele of the $i$th agent. An allele's usage variable is set to zero when the allele first enters the population through mutation (or at the very beginning of the simulation). Then, usage is incremented every time an allele is actually used, i.e., when the agent receives the sensory input genetically linked with the $s$th locus and the behavior encoded by the $s$th allele is thereby triggered:

$$u_{is}^{t+1} = \begin{cases} u_{is}^{t} + 1 & \text{if } i \text{ uses the } s\text{th allele at } t \\ u_{is}^{t} & \text{otherwise} \end{cases} \tag{6}$$
Recall that, if $B_0 > 0$, behavioral noise can prevent the $s$th allele from actually producing $i$'s behavior at $t$; in this case, $i$ would not use the $s$th allele even after receiving the sensory input that normally triggers its use. If $B_0 = 1$ then $u_{is}^{t} = 0$ for all $i$, $s$, and $t$. Not all raw usage indicates an allele's adaptive significance, however, since harmful alleles also accumulate usage. (In fact, it is only when harmful alleles are used that natural selection can eliminate them.) To determine an allele's proven adaptive value, we need to screen off that usage that might not signify the allele's adaptive value. One way to do this is to measure the duration during which an allele lineage could accumulate usage in the absence of adaptation, and then count an allele's usage as having adaptive significance only if the allele's age exceeds this duration. More precisely, one can measure the extent of the adaptive evolutionary activity underlying all traits in a given simulation of the model--with some particular setting of model parameters (resource field, resource taxes, size of world, etc.)--as follows. Let the age $A_{is}$ of the $s$th allele of the $i$th agent be defined as the number of time steps since that allele was originally introduced into $i$'s genetic lineage by a mutation at the $s$th locus. Then measure the age distribution of all alleles of all agents at the model parameter settings of interest except that behavioral noise is fully turned on, $B_0 = 1$. Adaptation cannot affect the allele age distribution when behavioral noise is always present--genotype and phenotype are unconnected--so the allele age distribution reflects only genetic drift. Given this measured distribution of ages $A_{is}$, define the drift duration, $t_\mu$, as the shortest duration which exceeds $A_{is}$ for all $s$ and $i$. We can be quite confident that, in the simulation of interest, no allele can survive in the population for longer than the drift duration if the allele's presence is due to chance alone. 
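To calculate the "net" usage $\tilde{u}_{is}$ of the $s$th allele of the $i$th agent, we modify Eq. 6 by adding the constraint that the allele's age must exceed the drift duration $t_\mu$:

$$\tilde{u}_{is}^{t+1} = \begin{cases} \tilde{u}_{is}^{t} + 1 & \text{if } i \text{ uses the } s\text{th allele at } t \text{ and } A_{is} \ge t_\mu \\ \tilde{u}_{is}^{t} & \text{otherwise} \end{cases} \tag{7}$$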
Finally, adaptive evolutionary activity $A_t$ is simply the sum of the net usage:

$$A_t = \sum_{i,s} \tilde{u}_{is}^{t}. \tag{8}$$
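The usage bookkeeping of Eqs. 6-8 can be sketched as follows. This is an illustrative reading under stated assumptions: the record layout and names are hypothetical, time is taken to be discrete, and the drift duration is computed as one step more than the oldest allele age observed in a separate run with behavioral noise fully on.

```python
from dataclasses import dataclass


@dataclass
class AlleleRecord:
    """Counters for one allele of one agent; reset when a mutation replaces it."""
    age: int = 0        # time steps since the allele entered the lineage
    raw_usage: int = 0  # every actual use (Eq. 6)
    net_usage: int = 0  # only uses made once the age has reached the drift duration (Eq. 7)


def drift_duration(drift_ages):
    """Shortest whole number of time steps exceeding every age seen under pure drift."""
    return max(drift_ages) + 1


def update_usage(record, used, t_drift):
    """Advance one allele by one time step, counting raw and net usage."""
    if used:
        record.raw_usage += 1
        if record.age >= t_drift:
            record.net_usage += 1
    record.age += 1


def evolutionary_activity(records):
    """Sum net usage over all alleles of all agents at the current time (Eq. 8)."""
    return sum(record.net_usage for record in records)
```

When a mutation introduces a new allele at a locus, the corresponding record would simply be replaced by a fresh `AlleleRecord()`, and children would inherit copies of their parent's records for unmutated loci.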
4.1 A Law of Adaptive Evolutionary Activity
The drift duration $t_\mu$ was measured in a series of simulations across the mutation spectrum. (Limited computational resources prevented measurement of $t_\mu$ for $\mu < 10^{-3}$.) All model parameters were set exactly as in the simulations discussed in Sect. 3.1 and Sect. 3.2 above. Then the time average $A = \langle A_t \rangle_t$ of evolutionary activity was measured across the mutation spectrum, for various values of behavioral noise, $0 \le B_0 \le 0.25$.

Fig. 4. Average evolutionary activity A as a function of mutation rate for several values of behavioral noise, $0 \le B_0 \le 0.25$. To facilitate comparison with Fig. 2 and Fig. 3, the same mutation rate scale is used. Due to the computational resources necessary for the calculation of the drift duration $t_\mu$ when $\mu < 10^{-3}$, evolutionary activity has not yet been measured at lower mutation rates.
Figure 4 shows how A was observed to depend on the mutation rate $\mu$. We see that, within the range of mutation rates sampled, evolutionary activity approximately follows a power law:

$$A \propto \mu^{\alpha}, \tag{9}$$

with $\alpha \approx -2.3 \pm 0.3$. Notice that the dependence of adaptive evolutionary activity A on the mutation rate corresponds very closely with the dependence of resource-extraction efficiency on mutation rate depicted in Fig. 3. It is notable that the approximate power law behavior of A in Fig. 4 holds up at a dozen different (relatively low) values of behavioral noise. This suggests that the law of adaptive evolutionary activity in Eq. 9 is fairly robust. An open question (requiring significant computational resources to answer) is how A will change when $\mu$ passes through and below the transition separating genetic order and disorder shown in Fig. 2. This question is especially intriguing given the adaptive significance of the transition revealed when Fig. 2 is overlaid with Fig. 3.
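One common way to estimate a power-law exponent such as the one in Eq. 9 is an ordinary least-squares fit on log-log axes. The sketch below illustrates that technique only; it is not claimed to be the fitting procedure used for Fig. 4, and the function name is hypothetical.

```python
import math


def fit_power_law(mu_values, activity_values):
    """Fit A = c * mu**alpha by least squares on log10-log10 axes.

    Returns (alpha, c). All inputs must be strictly positive.
    """
    xs = [math.log10(mu) for mu in mu_values]
    ys = [math.log10(a) for a in activity_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Slope of the regression line is the exponent alpha.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx

    # Intercept gives the prefactor c.
    c = 10 ** (mean_y - alpha * mean_x)
    return alpha, c
```

Applied to synthetic values generated from an exact power law (not measurements from the model), the fit recovers the exponent that was put in; with real, noisy time averages one would also want an uncertainty estimate for the slope.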
5 The Status of Artificial Life's Working Hypothesis
The three results discussed here--punctuated equilibria in diversity dynamics, the transition separating genetic order and disorder, and the empirical law of adaptive evolutionary activity--illustrate the possible fruits of artificial life's working hypothesis that simple computer models can capture the essential nature of complex adaptive systems. I say possible fruits because it is not clear that these three effects are part of the essential nature of complex adaptive systems in general. Still, the results in the present model are sufficiently compelling for us to seriously entertain the hypotheses that these punctuation, transition, and power law effects have some significant universal application. These three specific hypotheses about punctuation, transition, and adaptation must be sharply distinguished from the general working hypothesis that underlies this whole line of research in artificial life. The specific hypotheses are candidates for confirmation or disconfirmation in the short run, but the working hypothesis is not. In the short run, the working hypothesis is to be judged by whether it generates fruitful lines of research. When held to this standard, the results presented above give the working hypothesis some provisional credibility. The punctuation, transition, and adaptation results found in the present simple model will prompt the search for evidence for similar effects in other complex adaptive systems, both artificial and natural, and this in turn will prompt the development of maximally general formulations of macrovariables like D, W, B, and A. These are exciting and promising lines of research. In the long run, working hypotheses often can be effectively confirmed or disconfirmed. Artificial life's working hypothesis will win confirmation if enough of the specific hypotheses (like punctuation, transition, and adaptation) it spawns prove to be compelling. 
Whether this is so is an empirical matter, one which the "thermodynamic" methodology illustrated in this paper is well suited to address. But how plausible are the three specific hypotheses about punctuation, transition, and adaptation? Are punctuated equilibrium diversity dynamics, a transition separating genetic order and disorder, and a power law dependence of evolutionary activity on mutation rate part of the essential nature of some significant class of complex adaptive systems? These questions remain open. But there is a straightforward empirical method by which we can pursue their answers. The hypotheses are eminently testable. Testing such hypotheses in a wide variety of artificial and natural systems is my vision of artificial life as-it-could-be.
Self-Organizing Algorithms Derived from RNA Interactions
Wolfgang Banzhaf
Department of Computer Science, Dortmund University, Baroper Str. 301, 44221 Dortmund, Germany
banzhaf@tarantoga.informatik.uni-dortmund.de
Abstract. We discuss algorithms based on the RNA interaction found in Nature. Molecular biology has revealed that strands of RNA, besides being autocatalytic, can interact with each other. They play a double role of being information carriers and enzymes. The first role is realized by the 1-dimensional sequence of nucleotides on a strand of RNA, the second by the 3-dimensional form strands can assume under appropriate temperature and solvent conditions. We use this basic idea of having two alternative forms of the same sequence to propose a new Artificial Life algorithm. After a general introduction to the area we report our findings in a specific application studied recently: an algorithm which allows sequences of binary numbers to interact. We introduce folding methods to achieve 2-dimensional alternative forms of the sequences. Interactions between 1- and 2-dimensional forms of binary sequences generate new sequences, which compete with the original ones due to selection pressure. Starting from random sequences, replicating and self-replicating sequences are generated in considerable numbers. We follow the evolution of a number of sample simulations and analyse the resulting self-organising system.
1 The Age of RNA
A new age is dawning in molecular biology, the age of RNA [1]. Over roughly the last decade many discoveries were made that have completely changed our understanding of RNA. Whereas the Fifties, Sixties and Seventies were dedicated mainly to exploring the enormous richness of the molecular worlds of DNA and proteins, the Eighties clearly marked an explosion of knowledge in RNA-related problems and facts. What is so interesting about ribonucleic acids (RNAs) that chemists and biologists are flocking in large numbers into this research field? What might be the consequences for our understanding of the mechanisms of Life? Finally, what kind of computational models could be derived from this new world that would offer insights into the functioning of a distinct category of algorithms, algorithms of self-organizing systems? This chapter is dedicated to exploring the latter question, mostly by discussing computational aspects of recent revolutionary discoveries in biochemistry. We shall put forward a new class of algorithms that shows signs of self-organization. Essential features of this class are derived from new findings in RNA chemistry. In our opinion, it is possible that those findings might have reverberations in mathematics, physics, studies of complex systems and even engineering (besides heavily impacting biology and chemistry).
Fig. 1. The sugar-phosphate backbone of RNA and DNA. Only a slight difference can be seen between RNA (a) and DNA (b). One hydroxyl group is absent in DNA.
A few words are in order to highlight the specifics of RNA as opposed to DNA. Basically, there are two differences between DNA and RNA: One concerns a mere oxygen atom bound in a hydroxyl group in the latter, which makes macromolecules of RNA much more prone to form secondary structures within themselves. DNA macromolecules, for their part, prefer to form stable double helices with complementary strands. The other difference between DNA and RNA is the set of nitrogenous bases connected to their respective backbone (see figure 1). For RNA, these bases are adenine (A), guanine (G), cytosine (C) and uracil (U); for DNA they are adenine (A), guanine (G), cytosine (C) and thymine (T). Apart from an additional methyl group in T as compared to U, they are identical. The primary mechanism for forming 2- and 3-dimensional structures is the interaction via hydrogen bonds between corresponding bases that form base pairs. Basically, a polynucleotide can gain energy by forming such hydrogen bonds, which translates into stability for the resulting structure. Figure 2 shows the two most stable pairings in RNA. A typical example of the secondary structure of an RNA polynucleotide is shown in figure 3 b. This is often called the phenotypic form of the macromolecule consisting of the sequence shown in (a). By assuming this shape, RNA is more reactive than DNA with its inert form of a double helix, which effectively conserves the sequence on its strands. And here it is, the main functional difference between RNA and DNA: DNA is highly specialized in conserving information residing in the order
Fig. 2. The two most important base-pairings via hydrogen bonds in RNA. (a) U-A and (b) C-G. Other pairings are also possible, notably U-G, but they are not very stable. Dashed lines symbolize hydrogen bonds.
of its bases, whereas RNA is both an information store (in the base sequence) and a reactive agent, not very specialized in either of these functions. No wonder there exist numerous different (and more specialized) kinds of RNA (mRNA, tRNA, rRNA and snRNA, to name a few), all performing different kinds of functions in the information processing machinery of a cell [2]. In 1989 the Nobel Prize in Chemistry went to Sidney Altman of Yale University and to Thomas R. Cech of the University of Colorado for their pivotal role in the discovery that molecules of RNA can really act as catalysts (ribozymes) [4, 5]. It was subsequently established that certain RNA molecules can accelerate reactions by a factor as high as $10^{12}$, which is comparable to the effect of protein enzymes built from amino acids. An entire new branch of biotechnology has sprung up since [6] to make use of these new functional building blocks for drug design [7, 8]. Furthermore, early on in evolution some organisms seem to have managed the transition into pure RNA form: viruses. Viruses have an intimate knowledge of the replication mechanisms of cells, but are not able to survive on their own. As parasites, however, they have succeeded in exploiting cellular replication mechanisms for their own purposes. It is therefore suspected that they derived from early self-replicating life forms [9]. This leads us naturally to another important topic regarding sequences of RNA. Many scientists [10, 11, 12, 13] now believe that molecules of RNA were the predecessors of a much refined DNA-protein system that allowed Life to self-replicate and to perpetuate itself. Theories of the origin of life have long considered the double function of RNA as one important aspect of a system capable of self-replication.
The chicken-egg problem of our DNA-protein system could thus find a plausible explanation: Presumably a much less specialized RNA system performing both information storage and enzymatic function could have bootstrapped itself and might later have led to the more efficient DNA-protein system, with various kinds of RNA acting in auxiliary roles that support DNA-protein. Figure 4 shows a sketch of the dependencies between present-day DNA, RNA and proteins.

(a) GAAUACACGGAAUUCGCCCGGACUCGGUUCGAUUCCGAGUCCGGGCACCAC

Fig. 3. (a) Genotypic form of a t-RNA sequence from E. coli, (b) phenotypic form of the same sequence [3].

Fig. 4. The DNA-protein system has to hold information about its own information conservation in itself. Various kinds of RNA play auxiliary roles. Arrows indicate a supporting function. (Diagram nodes: DNA, m-RNA, proteins.)
In the same spirit Stuart Kauffman writes in a recent book: "I shall argue that life started as a minimally complex collection of peptide or RNA catalysts capable of achieving simultaneously collective reflexive catalysis of the set of polymers ( ... ) and a coordinated web of metabolism." [13]
2 Evolutionary Algorithms and beyond
Evolutionary Algorithms (EAs) make use of ideas gleaned from natural evolution. Information, e.g. information useful for solving an optimization problem, is conserved and evolved over time by providing a population of entities that breed with each other to generate better solutions. Starting from random solutions, the EA narrows down solutions in successive generations until it cannot find an improvement of a solution any more. Based on the external problem to be solved, the EA assigns fitness values to each individual in the population, which are then used to determine the eligibility of the particular solution at hand for breeding and perpetuation into the next generation. A kind of artificial replication and selection takes place which results in a change in the content of successive generations. Genetic Algorithms (GAs) are a prominent example of this idea. At the level of the genotypic representation, John Holland proposed this scheme in 1975 [14]. Similar considerations have been undertaken at the level of the phenotype of a solution and are summarized in a 1973 book by Ingo Rechenberg [15]. The algorithms considered here, however, are different in that they start with a system capable of self-organization. This is done in close analogy to the RNA system in nature by postulating that the same physical entities that are used for information storage exist in an alternative form that allows them to interact with each other. We propose to consider artificial systems with the characteristic feature of being genotype and phenotype at the same time. We shall look at one specific example and study some phenomena that emerge in such a system. We will then point out various routes to generalizing the system and put it into a broader perspective. The system we have chosen to look at in more detail is based on the most fundamental material of information processing in computers, the binary numbers 0 and 1.
Sequences of these numbers constitute both data and programs in the von Neumann paradigm of computing, which has been so pervasive over the last 50 years. We thus will study binary sequences that come in two alternative forms, a 1-dimensional "genotypic" form and a 2-dimensional "phenotypic" form.
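The generic evolutionary algorithm outlined at the start of this section can be illustrated by a minimal sketch. It is not the algorithm of Holland [14] or Rechenberg [15]; the function name `evolve`, the truncation selection of the better half, the uniform crossover and all parameter values are assumptions chosen only for illustration.

```python
import random


def evolve(fitness, length=20, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Toy evolutionary algorithm on binary strings (illustrative assumptions only)."""
    rng = random.Random(seed)
    # Start from random solutions.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness decides eligibility for breeding: keep the better half.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover followed by bit-flip mutation.
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        # Parents survive unchanged, so the best fitness never decreases.
        pop = parents + children
    return max(pop, key=fitness)


# Example: maximize the number of ones in the string.
best = evolve(sum)
```

Here the fitness is supplied from outside, which is exactly the point of contrast with the self-organizing systems considered in this chapter.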
3 The basic algorithm
As we deal with the evolution of (binary) strings of symbols, two principles have to be embodied in the algorithm:

i) Machines, which we call operators, should exist that are able to change strings according to certain rules we have to define.
ii) Strings should be equivalent to machines in that mapping methods determine which operator can be generated from which string.

Since we wanted to construct an algorithm as simple as possible, we settled for binary strings. However, the requirements mentioned in i) and ii) are sufficiently general to allow for other objects. Here, we consider strings $s$, consisting of concatenated numbers "0" and "1":

$$s = (s_1, s_2, \ldots, s_i, \ldots, s_N), \qquad s_i \in \{0, 1\}, \qquad 1 \le i \le N$$
To keep things as simple as possible we choose a square number for $N$, the length of string $s$. An important question arises immediately: How can operators be formed from binary strings? In Nature, nucleotide strands tend to fold together according to the laws of chemistry and physics. Bond formation in Nature is governed mainly by a tendency of the strands to relax into energy-minimal configurations in physical space. This process might be called a compilation of nucleotide programs into enzymatic "executables". Here we try to keep things as straightforward as possible and only consider two-dimensional configurations of binary numbers which, in a mathematical sense, are operators.
3.1 The folding of operators
For binary strings the following procedure is feasible: Strings $s$ with $N$ components fold into operators $P$ which can be represented mathematically by square matrices of size $\sqrt{N} \times \sqrt{N}$ (remember, $N$ should be a square number!), as is schematically depicted in Figure 5. In principle, any kind of mapping of the topologically one-dimensional strings of binary numbers into a two-dimensional (square) array of numbers is allowed. Depending on the method of folding, we can expect various transformation paths between strings. Also, the folding need not be deterministic, but can map strings stochastically into different operators. We then assume that an operator $P_s$, formed from string $s$, can act on another string in turn and generate still another string (see Figure 6):

$$P_s \, s' \Rightarrow s''$$

It is important to keep in mind that neither $P_s$ nor $s'$ is required to be deleted by this operation¹. Rather, a new string $s''$ is generated by the cooperation of $P_s$ and $s'$. Thus, we consider the system to be open, with an ongoing generation of new strings (from some sort of raw materials). In this interpretation, only the information carried by $P_s$ and $s'$ is considered as something essential, and it is this information that is required to be conserved.

¹ It is also possible to require that only one of the two, either string or operator, should be conserved. Qualitatively, the behaviour of the system is the same.

Fig. 5. A string $s$ folds into an operator $P_s$

One can imagine some possibilities to balance the continued production of new strings, all having to do with resource limitations necessarily imposed on such a system:

(1) The system might have a fixed number of strings. Each new string produced by an interaction causes the deletion of an already existing string. The string to be replaced can be selected either by chance, i.e. according to its frequency in the ensemble, or by some quality criterion, its length, the number of "1"s in it, etc.
(2) After an initial period of unrestricted growth the increase in string numbers might level off to zero.
(3) At the outset, a restricted number of elements that constitute strings might be provided. As a consequence, an initial growth period in the number of strings would cause a rapid depletion in the supply of raw material, in our case of "0"s and "1"s, which in turn would restrict the formation of new strings.
The net effect of these counter-measures (as well as others one may devise) is to force strings into a competition for available resources. Only those strings will have a chance to survive in macroscopic numbers which are able either i) to reproduce themselves or ii) to reproduce with the help of others or iii) to lock into reaction cycles with mutually beneficial transformations. We shall study and demonstrate this behaviour in the next sections.
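As a minimal sketch of the folding step, one may take the simplest deterministic mapping: cut the string into $\sqrt{N}$ consecutive pieces and stack them as rows. This row-wise layout is only one assumed choice among the many mappings the text allows, and the names `fold` and `unfold` are hypothetical.

```python
import math


def fold(s):
    """Fold a binary string of square length N into a sqrt(N) x sqrt(N) operator.

    Row-wise folding is an assumption; any other mapping would do as well.
    """
    n = len(s)
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("string length must be a square number")
    return [list(s[r * side:(r + 1) * side]) for r in range(side)]


def unfold(p):
    """Inverse of fold: read the operator row by row back into a string."""
    return [bit for row in p for bit in row]


operator = fold([0, 1, 1, 0])  # a 2 x 2 operator
```

With a deterministic folding such as this one, every string corresponds to exactly one operator and vice versa.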
Fig. 6. An operator $P$ acts upon a string $s$ to produce a new string $s'$
3.2 Operators at work
For the moment, however, we have to come back to the question of how exactly an operator can act on a string. Consider Figure 6. We can think of $s$ as being concatenated from $\sqrt{N}$ fragments of length $\sqrt{N}$ each. The operator $P$ is able to transform one of these fragments at a time (semi-local operation). In this way, it moves down the string in steps of size $\sqrt{N}$ until it has finally completed the production of a new string $s'$. Then, operator $P$ unfolds back into its corresponding string form and is released, together with $s$ and $s'$, into the ensemble of strings, which will be called string soup from now on. A particular example of the action of an operator on a string is the computation of scalar products:

$$s'_{i+k\sqrt{N}} = \sum_{j=1}^{\sqrt{N}} P_{ij}\, s_{j+k\sqrt{N}}, \qquad i = 1, \ldots, \sqrt{N}, \qquad k = 0, \ldots, \sqrt{N}-1 \qquad (1)$$

where $k$ counts the steps the operator has taken down the string. This computation, however, will not conserve the binary character of strings unless we introduce a nonlinearity. Therefore, later on we shall examine in more detail the following related computation:

$$s'_{i+k\sqrt{N}} = \sigma\left[\sum_{j=1}^{\sqrt{N}} P_{ij}\, s_{j+k\sqrt{N}} - \Theta\right], \qquad i = 1, \ldots, \sqrt{N}, \qquad k = 0, \ldots, \sqrt{N}-1 \qquad (2)$$

where $\sigma$ symbolizes the squashing function

$$\sigma(x) = \begin{cases} 1 & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$
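The action of an operator according to Eq. (2) can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for the simulations: the operator is obtained by row-wise folding, $\Theta$ is read as a threshold passed in as the hypothetical parameter `theta`, and the names `fold` and `act` are invented here.

```python
import math


def fold(s):
    """Row-wise folding of a binary string into a square operator (assumed mapping)."""
    side = math.isqrt(len(s))
    if side * side != len(s):
        raise ValueError("string length must be a square number")
    return [list(s[r * side:(r + 1) * side]) for r in range(side)]


def act(operator, s, theta=1):
    """Apply Eq. (2): thresholded scalar products, one fragment of s at a time."""
    side = len(operator)
    if len(s) != side * side:
        raise ValueError("string and operator sizes do not match")
    new_string = []
    for k in range(side):  # the operator moves down the string in steps of sqrt(N)
        fragment = s[k * side:(k + 1) * side]
        for row in operator:  # i = 1, ..., sqrt(N)
            total = sum(p * x for p, x in zip(row, fragment))
            # Squashing function: 1 if the argument is >= 0, else 0.
            new_string.append(1 if total - theta >= 0 else 0)
    return new_string


identity = fold([1, 0, 0, 1])
copy = act(identity, [1, 0, 1, 1], theta=1)  # the identity operator replicates the string
```

With `theta=1` the operator folded from 1001 copies any string it meets, which gives a first impression of how replicating sequences can arise in such a system.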
$$\ldots - \langle f(t) \rangle = b_t \cdot I \cdot \sigma_p(t) \qquad (2)$$
where $\langle f(t) \rangle$ is the average fitness at generation $t$, $b_t$ is the inheritance coefficient at generation $t$, $I$ is the selection intensity, and $\sigma_p(t)$ is the phenotypic variance at generation $t$. The underlying assumption for this equation is a normal fitness distribution within the population. The selection intensity $I$ is also a feature of the normal probability distribution

$$I = \frac{\phi(x)}{1 - \Phi(x)} \qquad (3)$$

where $\Phi(x)$ is the normal probability distribution, and $\phi(x)$ is the normal probability density function. The dependence of the selection intensity $I$ on the percentage $T$ of selected parents is shown in Figure 1. Selected values are given in Table 1.
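Equation (3) can be evaluated numerically with a few lines of code. The sketch below assumes truncation selection, so that the fraction of selected parents is $T = 1 - \Phi(x)$; this link between $T$ and $x$ is an assumption not spelled out in the text, and the function name `selection_intensity` is hypothetical.

```python
from statistics import NormalDist


def selection_intensity(T):
    """Selection intensity I of Eq. (3) for a fraction T (0 < T < 1) of selected parents.

    Assumes truncation selection on a standard normal fitness distribution,
    i.e. the truncation point x satisfies T = 1 - Phi(x).
    """
    if not 0.0 < T < 1.0:
        raise ValueError("T must lie strictly between 0 and 1")
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1.0 - T)  # truncation point
    return std_normal.pdf(x) / (1.0 - std_normal.cdf(x))


intensity_half = selection_intensity(0.5)  # roughly 0.8
```

Under this assumption, selecting the better half of the population gives $I \approx 0.8$ and selecting the best 20% gives $I \approx 1.4$.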
Table 1. Selection intensity $I$ for $N \to \infty$ ($N$: population size)

I    0.34    0.8    0.97    1.2    1.4    1.6    …
The theoretical results for the BGA [10] concerning the relation of selection and recombination are obtained for binary problems with an underlying binomial distribution. Unfortunately, such a binomial distribution cannot be assumed for the EA with soft genetic operators. Therefore, we made extensive simulation studies concerning the performance and the scaling behavior of EASY using the test functions given in Tables 2 and 3. It should be noted that EASY is an instantiation of the Multivalued Evolutionary Algorithm (MEA) [20].
Fig. 1. Selection intensity $I$ vs. percentage $T$ of selected parents
Fig. 2. a) Crisp recombination and b) soft recombination
2.1 Soft Modal Recombination
Let $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ be the parent feature values. Then for discrete recombination the offspring feature values $(z_1, \ldots, z_n)$ are generated by

$$z_i \in \{x_i, y_i\}. \qquad (4)$$

$x_i$ or $y_i$ are chosen with probability 0.5. This discrete recombination scheme is depicted in Figure 2a). To check the robustness of such a recombination scheme (uniform crossing over, discrete recombination) we analyzed the sphere model $F_{sphere}$ from Table 2, which is a basic one in the analysis of mathematical optimization and evolutionary algorithms, e.g. [5, 2, 14, 16].
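Discrete recombination as in Eq. (4) amounts to a coin flip per feature. A minimal sketch follows; the function name `discrete_recombination` and the optional `rng` argument are assumptions made for illustration.

```python
import random


def discrete_recombination(x, y, rng=random):
    """Eq. (4): each offspring feature is x_i or y_i, chosen with probability 0.5."""
    return [xi if rng.random() < 0.5 else yi for xi, yi in zip(x, y)]


mother = [0.0, 1.0, 2.0, 3.0]
father = [4.0, 5.0, 6.0, 7.0]
child = discrete_recombination(mother, father, random.Random(0))
```

Every offspring feature is a copy of a parental value, so no new feature values enter the population by this operator alone.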
Table 2. Simple Test Functions (columns: Function, Constraints; rows include $F_{sphere}$ and $F_{ellipsoid}$)
The result is shown in Figure 3 labeled Discrete Recombination.
Fig. 3. Behavior of different modal recombination operators for the sphere model, $n = 32$, $N = 512$, $\epsilon = 10^{-12}$
It conforms with the results of the BGA [10]. But this means that selection and discrete recombination do not give a sustained development. The question is how to get a more robust recombination scheme. Contrary to existing recombination schemes (Discrete Recombination [16], Intermediate Recombination [16], Extended Intermediate Recombination [10], Extended Line Recombination [10], Fuzzy Min-Max-Recombination [19], Uniform Crossover [14, 18], Linear Crossover [22], BLX-α Crossover [3], 1-Point- and x-Point-Crossover [2, 6]) we introduce a soft recombination scheme gleaned from fuzzy set theory [23] but used stochastically

$$p(z_i) \in \{\phi(x_i), \phi(y_i)\} \qquad (5)$$

with triangular probability distributions $\phi(r)$ having the modal values $x_i$ and $y_i$ with $x_i - \alpha |y_i - x_i| \le r \le x_i + \alpha |y_i - x_i|$ and $y_i - \alpha |y_i - x_i| \le r \le y_i + \alpha |y_i - x_i|$, $\alpha \ge 0.5$, for $x_i \le y_i$. This soft recombination scheme with $\alpha = 0.5$, which is used throughout this paper, is sketched in Figure 2b). The result using soft recombination is shown in Figure 3 labeled Continuous Modal Recombination. With this recombination scheme it is possible to generate a sustained convergence for all generations, at least for the sphere model.
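One way to read Eq. (5) operationally is sketched below: for each feature, one of the two parental values is picked as the mode with probability 0.5, and the offspring value is then drawn from a triangular distribution centred on that mode with half-width $\alpha |y_i - x_i|$. The equal choice between the two modes and the function name `soft_recombination` are assumptions; the text fixes only the shape and support of the distributions.

```python
import random


def soft_recombination(x, y, alpha=0.5, rng=random):
    """Soft modal recombination: sample each offspring feature from a
    triangular distribution whose mode is x_i or y_i (assumed equally likely)."""
    child = []
    for xi, yi in zip(x, y):
        mode = xi if rng.random() < 0.5 else yi
        half_width = alpha * abs(yi - xi)
        # random.triangular(low, high, mode); returns low when low == high.
        child.append(rng.triangular(mode - half_width, mode + half_width, mode))
    return child


child = soft_recombination([0.0, 2.0], [1.0, 2.0], alpha=0.5, rng=random.Random(1))
```

Unlike discrete recombination, this operator can produce feature values that neither parent carries, while identical parental values are passed on unchanged.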
Fig. 4. Upper row: Sphere model (left) and hyper ellipsoid function (right), $I = 0.8, 1.1, 1.4$. Lower row: Continuous zeromin (left) and negsphere function (right), $I = 0.8, 1.4$. $n = 32$, $N = 512$, $\epsilon = 10^{-12}$, soft recombination, 5 runs overlaid for every graph
The EA with soft recombination is characterized by the population size $N$ and the selection intensity $I$ only. Furthermore, the number of features $n$ has to be taken into account. Based on these parameters a number of questions arise concerning the convergence of the EA, i.e. we want to predict the number of generations $gen^*(N, I, n)$ such that $|f^* - f| \le \epsilon$, where $f$ is the optimal value approximation. The questions concerned are:

• What is the influence of the selection intensity on $gen^*(I)$ for convergent populations?
• Does the population size $N$ influence the convergence to the optimum, and if so, what is the dependence $gen^*(N)$?
• What is the critical population size $N^*$ for which the convergence probability $p_{con} = 1$, i.e. 100% convergence is assured? How does the convergence probability decline if the population size is decreased below $N^*$?
• How does the number of features $n$ influence $gen^*(n)$, i.e. what is the scaling behavior of the EA with soft recombination?

To check the influence of these parameters we used the simple test functions given in Table 2.
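Table 3. Test Functions for Global Optimization

Function    Constraints

$F_6(x) = n \cdot 10 + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2\pi x_i) \right)$    $-600 \le x_i \le 600$

$F_7(x) = \sum_{i=1}^{n} -x_i \sin\left(\sqrt{|x_i|}\right)$    $-500 \le x_i \le 500$

$F_8(x) = \sum_{i=1}^{n} x_i^2 / 4000 - \prod_{i=1}^{n} \cos\left(x_i / \sqrt{i}\right) + 1$    $-600 \le x_i \le 600$

$F_9(x) = -20 \exp\left(-0.2 \sqrt{\tfrac{1}{n} \sum_{i=1}^{n} x_i^2}\right) - \exp\left(\tfrac{1}{n} \sum_{i=1}^{n} \cos(2\pi x_i)\right) + 20 + e$    $-30 \le x_i \le 30$

$F_{10}(x) = \prod_{i=1}^{n} \left[ \ldots \sum_{k=0} |2^k x_i - \mathrm{nint}^1(2^k x_i)| \ldots \right]$    $-1000 \le x_i \le 1000$

¹ This corresponds to the Fortran generic intrinsic function nint.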
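For reference, the four standard functions among these can be written down directly. The sketch below uses their usual textbook forms (Rastrigin, Schwefel, Griewangk, Ackley); the Python function names are hypothetical, and the factor 0.2 in Ackley's function is taken from the common definition rather than read off the table.

```python
import math


def rastrigin(x, a=10.0):
    """F6: n*a + sum(x_i^2 - a*cos(2*pi*x_i)); global minimum 0 at the origin."""
    return a * len(x) + sum(xi * xi - a * math.cos(2.0 * math.pi * xi) for xi in x)


def schwefel(x):
    """F7: sum(-x_i * sin(sqrt(|x_i|)))."""
    return sum(-xi * math.sin(math.sqrt(abs(xi))) for xi in x)


def griewangk(x):
    """F8: sum(x_i^2)/4000 - prod(cos(x_i/sqrt(i))) + 1; minimum 0 at the origin."""
    quadratic = sum(xi * xi for xi in x) / 4000.0
    product = math.prod(math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1))
    return quadratic - product + 1.0


def ackley(x):
    """F9 in its usual form; minimum 0 at the origin."""
    n = len(x)
    root_mean_square = math.sqrt(sum(xi * xi for xi in x) / n)
    mean_cosine = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return -20.0 * math.exp(-0.2 * root_mean_square) - math.exp(mean_cosine) + 20.0 + math.e


origin = [0.0] * 32
values_at_origin = [rastrigin(origin), griewangk(origin), ackley(origin)]
```

Rastrigin's, Griewangk's and Ackley's functions all have their global minimum of zero at the origin, which makes the stopping criterion based on $\epsilon$ easy to apply.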
Fig. 5. Upper row: Rastrigin's function $F_6$ and Schwefel's function $F_7$, $n = 32$, $N = 5120$; lower row: Griewangk's function $F_8$ and Ackley's function $F_9$, $n = 32$, $N = 512$, $\epsilon = 10^{-12}$, soft recombination, 5 runs overlaid for every graph
Dependence on the selection intensity I. We checked the influence of the selection intensity $I$ for large population sizes $N \gg 1$. The convergence behavior for the sphere and the ellipsoid model as well as for the zeromin and negsphere model as a function of the selection intensity is shown in Figure 4. It is quite obvious that the number of generations until convergence depends inversely on the selection intensity. Making a Mathematica fit we get the relation

$$gen^*(I) = \frac{k_I}{I^{2 \ln(2)}}, \qquad k_I = \text{const.} \qquad (6)$$
Furthermore, it is interesting to notice that there is no difference in the convergence of the sphere and the ellipsoid model.
For the test functions of Table 3 we get the results shown in Figure 5 (Schwefel's function $F_7$ is normalized to make a log-plot possible), which confirm the relation (6). For these functions we observe different regions of convergence. The behavior of the EA with soft recombination reflects the self-referential structure of the functions to be optimized. For Griewangk's function $F_8$ we get the structure for one feature corresponding to Figure 6. The region of a low fractal dimension corresponds to a high convergence speed and vice versa.

Dependence on the population size N. The considerations in the previous section are valid for large population sizes $N \gg 1$. If the population size is large enough the number of generations until convergence $gen^*(N)$ is independent of the population size $N$, i.e.

$$gen^*(N) = k_N, \qquad k_N = \text{const.} \quad \text{for} \quad N \gg 1. \qquad (7)$$
What happens for small population sizes? Figure 7 shows the influence of the population size on the probability of convergence for the sphere model and for Griewangk's function $F_8$. Obviously there exists a lower limit $N^*$ of the population size with a convergence probability $p_{con}(N > N^*) = 1$.

Dependence on the number of features n. The scaling behavior of the EA with soft recombination, i.e. the dependence of the convergence speed on the number of features, is very interesting for large scale optimization problems. For the sphere model and for Griewangk's function $F_8$ we get the results for $n = 32, 64, 128$ shown in Figure 8. Estimating the scaling behavior of the EA with soft recombination by using a Mathematica fit we get the following relation for a large population size $N \gg 1$

$$gen^*(n) = k_n \cdot n^{1/(2 \ln(2))}, \qquad k_n = \text{const.} \qquad (8)$$
This relation is depicted in Figure 9.

Soft Modal Recombination Summarized. Summarizing the convergence behavior for soft modal recombination one finally gets for $N \gg 1$ the estimate

$$gen^*(I, n) = k_{I,n} \cdot \frac{n^{1/(2 \ln(2))}}{I^{2 \ln(2)}}, \qquad k_{I,n} = \text{const.} \qquad (9)$$
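To get a feeling for the estimate (9), it helps to look at ratios, in which the unknown constant cancels. The sketch below assumes the form of Eq. (9) as given above; the function name `gen_star` and the default value `k=1.0` are placeholders, since the fitted constant is not stated.

```python
import math


def gen_star(intensity, n, k=1.0):
    """Estimate (9): k * n**(1/(2 ln 2)) / I**(2 ln 2); k is an unknown constant."""
    exponent_n = 1.0 / (2.0 * math.log(2.0))
    exponent_i = 2.0 * math.log(2.0)
    return k * n ** exponent_n / intensity ** exponent_i


# Ratios are independent of k.
ratio_features = gen_star(1.0, 128) / gen_star(1.0, 32)   # n quadrupled
ratio_intensity = gen_star(0.8, 32) / gen_star(1.6, 32)   # I doubled
```

Quadrupling the number of features multiplies the estimated number of generations by $4^{1/(2\ln 2)} = e \approx 2.72$, and doubling the selection intensity divides it by $2^{2\ln 2} \approx 2.61$.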
2.2 Soft Modal Mutations
Mutations for the Breeder Genetic Algorithm [10] for continuous parameter optimization problems are introduced for a mutation base $b_m = 2$ by

$$\Delta_i \in \pm\{2^{-15} A_m, 2^{-14} A_m, \ldots, 2^{0} A_m\} \qquad (10)$$
Fig. 6. Self-referential structure of Griewangk's function $F_8$, left: equally low fractal dimension, right: high fractal dimension
where $A_m = R_m (x_{max} - x_{min})$ defines the absolute mutation range and $R_m$ the relative mutation range. Mutations $\Delta_i$ for changing a feature $x_i$ to $x_i + \Delta_i$ are chosen randomly with uniform distribution from the given set. For the BGA, $R_m$ is usually set to $R_m = 0.1$. The discrete modal mutation scheme is a generalization of the BGA mutation scheme, i.e. the number of discrete values $k_{low}$ now depends on a lower limit of the relative mutation range $R_{min}$, and the base of the mutations $b_m > 1$ need not be $b_m = 2$, such that discrete modal mutations are from
(11) with
I l~
i
(12)
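The BGA mutation scheme of Eq. (10) can be illustrated with a minimal sketch. The function names, the `base` and `k_max` parameters, and the clipping of the mutated value to the feature domain are assumptions made for this example and are not taken from the original algorithm description. The step set of the discrete modal scheme itself (Eqs. 11 and 12) is not reproduced here.

```python
import random


def bga_mutation_steps(x_min, x_max, rel_range=0.1, base=2.0, k_max=15):
    """Positive step sizes {base^-k * A_m : k = k_max, ..., 0}, smallest first."""
    a_m = rel_range * (x_max - x_min)  # absolute mutation range A_m
    return [a_m * base ** (-k) for k in range(k_max, -1, -1)]


def mutate_feature(x, x_min, x_max, rng=random, **kwargs):
    """Add one uniformly chosen step with a random sign.

    Clipping to [x_min, x_max] is an assumption of this sketch.
    """
    step = rng.choice(bga_mutation_steps(x_min, x_max, **kwargs))
    sign = rng.choice((-1.0, 1.0))
    return min(x_max, max(x_min, x + sign * step))
```

With the default base 2 and a relative range of 0.1, the largest step is one tenth of the feature domain and the smallest is 2^-15 times that, so the same operator supports both coarse and very fine moves.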
Fig. 7. Ratio of convergent runs p_con vs. population size N for the sphere model (left) and Griewangk's function F8 (right), n = 32, ε = 10^-12, 20 runs
Discrete modal mutations are depicted schematically in Figure 10a). Since the set of possible mutations contains only discrete mutation steps, we checked the robustness of such a scheme by means of the multimodal test function set from [10, 20]. We extended the Multivalued Evolutionary Algorithm (MEA) [20] by the discrete modal mutation scheme. The new algorithm specification parameters are then the mutation base b_m, the relative mutation range R_m and the minimal relative mutation range R_min. All other parameters are used as for the MEA, i.e. the population size N, the selection intensity I, the number of genes m for a phenotypic feature, the number n of phenotypic features x_i, i = 1, ..., n, and the mutation probability p_m. The algorithm stops if the best fitness value f* within the population is below a given threshold. For Rastrigin's function F6 and Ackley's function F9, Figures 11 (left) show the average number of function evaluations versus values of the relative mutation range 0 > 1 the response to selection of a large population changing by mutation is approximately
$R(t) = 1 + (1 - p(t))\,e^{-p(t)} - p(t)\,e^{-(1 - p(t))}$   (32)
Proof. Let the parents have i bits wrong, let $s_i$ be the probability of a success by mutation, and let $f_i$ be the probability of a defect mutation. $s_i$ is approximately given by the product of the probability of changing at least one of the wrong bits and the probability of not changing any of the correct bits [21]. Therefore

$s_i = (1 - (1 - m)^i)(1 - m)^{n-i}$

Similarly

$f_i = (1 - (1 - m)^{n-i})(1 - m)^i$

From equation 31 and $1 - (1 - m)^i \approx i \cdot m$ we obtain

$s_t = (1 - p(t))\,(1 - \frac{1}{n})^{n p(t)}$

$f_t = p(t)\,(1 - \frac{1}{n})^{n (1 - p(t))}$

Because $(1 - \frac{1}{n})^n \approx e^{-1}$ we get

$s_t = (1 - p(t))\,e^{-p(t)}$

$f_t = p(t)\,e^{-(1 - p(t))}$

We are left with the problem of estimating imp and red. In a first approximation we set both to 1 because a mutation rate of m = 1/n changes one bit on the average. We have not been able to estimate S(t) analytically. Simulations show that for T = 50% S(t) decreases from about 1.15 at the beginning to about 0.9 at GEN_opt. Therefore S(t) = 1 is a reasonable approximation. This completes the proof.

Equation 32 defines a difference equation for p(t + 1). We did not succeed in solving it analytically. We have found that the following linear approximation gives almost the same results.

Empirical Law 2 Under the assumptions of empirical law 1 the response to selection can be approximated by

$R(t) = 2 - 2p(t)$   (33)

The number of generations until p(t) = 1 - ε is reached is given by

$GEN_{1-\epsilon} \approx \frac{n}{2} \ln \frac{1 - p_0}{\epsilon}$   (34)
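To see how close the linear law is to Equation 32, both difference equations can be iterated numerically. The sketch below assumes that R(t) is the per-generation gain in mean fitness n·p(t), so that p(t+1) = p(t) + R(t)/n, and it reads Equation 34 as (n/2)·ln((1 - p0)/ε). The function names and the example values n = 64, p0 = 0.5, ε = 1/64 are chosen for illustration only.

```python
import math


def response_eq32(p):
    """Response to selection according to Equation 32."""
    return 1.0 + (1.0 - p) * math.exp(-p) - p * math.exp(-(1.0 - p))


def response_linear(p):
    """Linear approximation of Equation 33."""
    return 2.0 - 2.0 * p


def generations_until(response, n, p0, eps, max_gen=100000):
    """Iterate p(t+1) = p(t) + R(t)/n until p(t) >= 1 - eps."""
    p, t = p0, 0
    while p < 1.0 - eps and t < max_gen:
        p += response(p) / n
        t += 1
    return t


def generations_closed_form(n, p0, eps):
    """Closed-form estimate, read here as (n/2) * ln((1 - p0) / eps)."""
    return 0.5 * n * math.log((1.0 - p0) / eps)


if __name__ == "__main__":
    n, p0, eps = 64, 0.5, 1.0 / 64
    print("Eq. 32 iteration :", generations_until(response_eq32, n, p0, eps))
    print("Eq. 33 iteration :", generations_until(response_linear, n, p0, eps))
    print("closed form      :", round(generations_closed_form(n, p0, eps), 1))
```

For p0 ≥ 0.5 the response of Equation 32 is never smaller than the linear one, so under these assumptions the linear law slightly overestimates the number of generations.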
Proof. The proof is identical to the proof of theorem 2.

In figure 4 the development of the mean fitness is shown. The simulations have been done with two popsizes (N = 1024, 64) and two mutation rates (m = 1/n, 4/n). The agreement between the theory and the simulation is very good. The evolution of the mean fitness of the large population and the small population is almost equal. This demonstrates that for mutation a large population is inefficient. A large mutation rate has an interesting effect. The mean fitness increases faster at the beginning, but the population never finds the optimum. This observation again suggests using a variable mutation rate. But we have already mentioned that the increase in performance from using a variable mutation rate will be rather small. Mutation spends most of its time in getting the last bits correct. But in this region a mutation rate of m = 1/n is optimal.

The major results of this section can be summarized as follows: Mutation in large populations is not effective. It is more efficient with very strong selection. The response to selection becomes very small when the population is approaching the optimum. The efficiency of the mutation operator critically depends on the mutation rate.
Fig. 4. Mean fitness for theory and simulations for various N and mutation probabilities
7 Competition between Mutation and Crossover
The previous sections have qualitatively shown that the crossover operator and the mutation operator perform well in different regions of the parameter space of the BGA. In figure 5 crossover and mutation are compared quantitatively for a popsize of N = 1024. The initial population was generated with p0 = 1/64. The mean fitness of the population with mutation is larger than that of the population with crossover until generation 18. Afterwards the population with crossover performs better. This was predicted by the analysis.
Fig. 5. Comparison of mutation and crossover
The question now arises how best to combine mutation and crossover. This can be done by at least two different methods. First, one can try to use both operators in a single genetic algorithm with their optimal parameter settings. This means that a good mutation rate and a good population size have to be predicted. This method is used for the standard breeder genetic algorithm BGA. Results for popular test functions will be given later.

Another method is to apply a competition between subpopulations using different strategies. Such a competition is in the spirit of population dynamics. It is the foundation of the Distributed Breeder Genetic Algorithm. Competition of strategies can be done on different levels, for example the level of the individuals, the level of subpopulations or the level of populations. Bäck et al. [3] have implemented the adaptation of strategy parameters on the individual level. The strategy parameters of the best individuals are recombined, giving the new stepsize for the mutation. Herdy [17] uses a competition on the population level. In this case whole populations are evaluated at certain intervals. The strategies of the successful populations proliferate, strategies in populations with bad performance die out.

Our adaptation lies between these two extreme cases. The competition is done between subpopulations. Competition requires a quality criterion to rate a group, a gain criterion to reward or punish the groups, an evaluation interval, and a migration interval. The evaluation interval gives each strategy the chance to demonstrate its performance in a certain time window. By occasional migration of the best individuals, groups which performed badly are given a better chance for the next competition. The sizes of the subgroups have a lower limit. Therefore no strategy is lost. The rationale behind this algorithm will be published separately.

In the experiments the mean fitness of the species was used as quality criterion. The isolation interval was four generations, the migration interval eight generations. The gain was four individuals. In the case of two groups the population size of the better group increases by four, the population size of the worse group decreases by four. If there are more than two groups competing, then a proportional rating is used.

Figure 6 shows a competition race between two groups, one using mutation only, the other crossing-over. The initial population was randomly generated with p0 = 1/64. The initial population is far away from the optimum. Therefore first the population using mutation only grows, then the crossover population takes over. The first figure shows the mean fitness of the two groups. The migration strategy ensures that the mean fitness of both populations is almost equal. In figure 7 competition is done between three groups using different mutation rates. At the beginning the group with the highest mutation rate grows, then both the middle and the lowest mutation rate grow. At the end the lowest mutation rate takes over. These experiments confirm the results of the previous sections. In the next section we will compare the efficiency of a BGA using mutation, crossover and an optimal combination of both.
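The two-group gain rule described above can be sketched as a single update step. This is only an illustration: the function name `compete` and the lower size limit `min_size=8` are hypothetical (the text states that a lower limit exists but gives no value), and the proportional rating for more than two groups is not covered.

```python
def compete(sizes, mean_fitness, gain=4, min_size=8):
    """One evaluation step of the two-group competition.

    sizes        -- (size of group 0, size of group 1)
    mean_fitness -- quality criterion of each group (mean fitness)
    gain         -- individuals moved from the worse to the better group
    min_size     -- hypothetical lower limit, so that no strategy is lost
    """
    if len(sizes) != 2 or len(mean_fitness) != 2:
        raise ValueError("this sketch handles exactly two groups")
    if mean_fitness[0] == mean_fitness[1]:
        return tuple(sizes)  # tie: leave the sizes unchanged
    better = 0 if mean_fitness[0] > mean_fitness[1] else 1
    worse = 1 - better
    # Never shrink the worse group below the lower limit.
    transfer = min(gain, max(0, sizes[worse] - min_size))
    new_sizes = list(sizes)
    new_sizes[better] += transfer
    new_sizes[worse] -= transfer
    return tuple(new_sizes)
```

Calling `compete((32, 32), (10.0, 12.0))` moves four individuals to the second group; once the worse group has reached the lower limit, its size stays fixed, which keeps its strategy available for later phases of the search.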
Fig. 6. Competition between mutation and crossover (ONEMAX, n = 64; panels show MeanFit and N versus Gen)
Fig. 7. Competition between different mutation rates (ONEMAX, n = 64; p = 1/n, 4/n, 16/n; panels show MeanFit and N versus Gen)
8 The Test Functions
The outcome of a comparison of mutation and crossover depends on the fitness landscape. Therefore a carefully chosen set of test functions is necessary. We will use test functions which we have theoretically analyzed in [21]. They are similar to the test functions used by Schaffer [26]. The test suite consists of

ONEMAX(n)
MULTIMAX(n)
PLATEAU(k,l)
SYMBASIN(k,l)
DECEPTION(k,l)

The fitness of ONEMAX is given by the number of 1's in the string. MULTIMAX(n) is similar to ONEMAX, but its global optima have exactly n/2 1's contained in the string. It is defined as follows
MULTIMAX(n, X)
We have included the MULTIMAX(n) function in the test suite to show the dependence of the performance of the crossover operator on the fitness function. MULTIMAX(n) poses no difficulty for mutation. Mutation will find one of the many global optima in O(n) time. But crossover has difficulties when two different optimal strings are recombined. This will lead with high probability to a worse offspring. An example is shown below for n = 4:

1100 ⊗ 0011

With probability P = 10/16 crossover will create an offspring worse than the midparent. The average fitness of an offspring is 3/2. Therefore the population will need many generations in order to converge. More precisely: the number of generations between the time when an optimum is first found and the convergence of the whole population is very high. MULTIMAX is equal to ONEMAX away from the global optima. In this region the heritability is one. When the population approaches the optima, the heritability drops sharply to zero. The response to selection is almost 0.

For the PLATEAU function k bits have to be flipped in order that the fitness increases by k. The DECEPTION function has been defined by Goldberg [16]. The fitness of DECEPTION(k,l) is given by the sum of l deceptive functions of size k. A deceptive function and a smoothed version of order k = 3 are defined in the following table:

bit   DECEP   SYMBA   |   bit   DECEP   SYMBA
111   30      30      |   100   14      14
101   0       26      |   010   22      22
110   0       22      |   001   26      26
011   0       14      |   000   28      28

A DECEPTION function has 2^l local maxima. Neighboring maxima are k bits apart. Their fitness value differs by two. The basin of attraction of the global optimum is of size k^l, the basin of attraction of the smallest optimum is of size (2^k - 1)^l. The DECEPTION function is called deceptive because the search is misled to the wrong maximum (0, 0, ..., 0). The global optimum is particularly isolated. The SYMBASIN(k,l) function is like a deceptive function, but the basins of attraction of the two peaks are equal.
In the simulations we used the values given in the above table for SYMBA.
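The test functions above can be written down compactly. The sketch below takes the DECEP and SYMBA values from the table for k = 3. The MULTIMAX definition used here, min(number of 1's, n minus number of 1's), is an assumption that matches the verbal description (optima at exactly n/2 ones); it is not quoted from the original definition. All function names are chosen for this example. The last helper enumerates every equally likely offspring of uniform crossover, which reproduces the probability 10/16 quoted for the parents 1100 and 0011.

```python
from itertools import product

# Block values for k = 3, taken from the table in the text.
DECEP = {"111": 30, "101": 0, "110": 0, "011": 0,
         "100": 14, "010": 22, "001": 26, "000": 28}
SYMBA = {"111": 30, "101": 26, "110": 22, "011": 14,
         "100": 14, "010": 22, "001": 26, "000": 28}


def onemax(bits):
    """Number of 1's in the string."""
    return sum(bits)


def multimax(bits):
    """Assumed form: optimal when exactly n/2 bits are 1."""
    ones = sum(bits)
    return min(ones, len(bits) - ones)


def block_sum(bits, table, k=3):
    """Sum of the table values over consecutive blocks of k bits."""
    if len(bits) % k:
        raise ValueError("string length must be a multiple of k")
    return sum(table["".join(str(b) for b in bits[i:i + k])]
               for i in range(0, len(bits), k))


def uniform_crossover_offspring(a, b):
    """All 2^n equally likely offspring of uniform crossover of a and b."""
    return [tuple(x if pick else y for x, y, pick in zip(a, b, mask))
            for mask in product((0, 1), repeat=len(a))]


if __name__ == "__main__":
    a, b = (1, 1, 0, 0), (0, 0, 1, 1)
    midparent = (multimax(a) + multimax(b)) / 2
    children = uniform_crossover_offspring(a, b)
    worse = sum(1 for c in children if multimax(c) < midparent)
    print(f"{worse}/{len(children)} offspring are worse than the midparent")
    print("DECEPTION(3,10) at all ones :", block_sum((1,) * 30, DECEP))
    print("DECEPTION(3,10) at all zeros:", block_sum((0,) * 30, DECEP))
```

The all-zeros string scores only 20 below the optimum of DECEPTION(3,10), which is why a search that follows the local gradient is drawn to it.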
9 Numerical Results
All simulations have been done with the breeder genetic algorithm BGA. In order to keep the number of simulations small, several parameters were fixed. The mutation rate was set to m = 1/n where n denotes the size of the problem. The parents were selected with a truncation threshold of T = 35%. Sometimes T = 50% was used.
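The experimental setup (truncation selection with threshold T, mutation rate m = 1/n, optional uniform crossover) can be sketched for ONEMAX as follows. This is a simplified illustration and not the authors' implementation: random mating among the selected parents, copying the best individual into the next generation, and all names and default values are assumptions of the sketch.

```python
import random


def bga_onemax(n=64, pop_size=64, truncation=0.35, use_crossover=True,
               use_mutation=True, max_gen=500, seed=0):
    """Return the best fitness per generation for a simplified BGA on ONEMAX."""
    rng = random.Random(seed)
    m = 1.0 / n  # mutation rate m = 1/n
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best_history = []
    for _ in range(max_gen):
        pop.sort(key=sum, reverse=True)
        best_history.append(sum(pop[0]))
        if best_history[-1] == n:
            break  # optimum found
        # Truncation selection: the best T percent become parents.
        n_parents = max(1, int(truncation * pop_size))
        parents = pop[:n_parents]
        children = [pop[0][:]]  # assumption: the best individual is kept
        while len(children) < pop_size:
            a = rng.choice(parents)
            b = rng.choice(parents) if use_crossover else a
            # Uniform crossover: each bit comes from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            if use_mutation:
                child = [1 - bit if rng.random() < m else bit for bit in child]
            children.append(child)
        pop = children
    return best_history


if __name__ == "__main__":
    for label, kwargs in (("M", dict(use_crossover=False)),
                          ("C", dict(use_mutation=False)),
                          ("M&C", dict())):
        history = bga_onemax(**kwargs)
        print(label, "generations run:", len(history) - 1, "best:", history[-1])
```

Because the best individual is copied unchanged, the recorded best fitness never decreases, which makes runs with different operators easy to compare generation by generation.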
In the following tables we report the average number of generations needed until the best individual is above a predefined fitness value. With these values it is possible to imagine a type of race between the populations using the different operators. Table 2 shows the results for ONEMAX of size 64. FE denotes the number of function evaluations necessary to reach the optimum. SD is the standard deviation of GENt if crossover is applied only. In all other cases it is GEN_opt, the number of generations until the optimum was found. The initial population was randomly generated with a probability p0 = 0.5 that there is a 1 at a locus. The numerical values are averages over 100 runs.
OP     N     ?     ?     ?     ?    63    64    SD     FE
M      2    41    94   156   183   226   309    82    618
M     64    18    40    65    80   102   143    56   9161
C*    64     7    11    15    15    17    19   1.1   1210
C    128     5     9    12    12    13    15   0.8   189?
M&C    4    23    51    81    96   115   152    47    608
M&C   64     7    13    17    19    20    22   2.1   2102

Table 2. ONEMAX(64); C* found optimum in 84 runs only
The simulations confirm the theory. Mutation in small populations is a very effective search. But the variance SD of GEN_opt is very high. Furthermore, the success of mutation decreases when the population approaches the optimum. A large population reduces the efficiency of a population using mutation. Crossover is more predictable. The progress of the population is constant. But crossover critically depends on the size of the population. The most efficient search is done by the BGA using both mutation and crossover with a population size of N = 4. In table 3 the initial population was generated farther away from the optimum (p0 = 1/8). In this experiment, mutation in small populations is much more efficient than crossover. But the combined search also performs well.
OP     N    24    32    62    63    64    SD      FE
M      2    14    24   192   237   307    85     615
M     64     8    16    96   117   161    72   10388
C*   256     6     9    24    25    27   0.9    6790
C    320     6     9    24    25    26   0.9    8369
M&C    4    11    19   114   136   180    52     725
M&C   64     5     8    29    31    34     3    2207

Table 3. ONEMAX(64); p0 = 1/8; C* found optimum in 84 runs only
In table 4 results are presented for the PLATEAU function. The efficiency of the small population with mutation is slightly worse than for ONEMAX. But the efficiency of the large population is much better than for ONEMAX. This can be easily explained. The large population is doing a random walk on the plateau. The BGA with mutation and crossover and a popsize of N = 4 has the best efficiency.
OP     N   288   291   294   297   300    SD     FE
M      4    27    42    64    95   184   107    737
M     64     5     8    13    19    31     9   2064
C*    64     3     4     6     7     9     1    569
C    128     3     4     5     6     8     1   1004
M&C    4    22    32    49    73   134    63    539
M&C   64    10    10    10    10    12     2    793

Table 4. PLATEAU(3,10); C* found optimum in 78 runs only
In table 5 results are shown for the DECEPTION(3,10) function.

OP      N    283    291    294    297    300     SD      FE
M       4    419   3520   4721   6632   9797   4160   39192
M      16    117    550    677    827   1241    595   19871
M      64     35    202    266    375    573    246   36714
C*     32     11
M&C     4    597   3480   4760   6550   9750   3127   38245
M&C    16    150    535    625    775   1000    389   16004
M&C*   64   1170

Table 5. DECEPTION(3,10); *: stagnated far from optimum
We observe a new behavior. Mutation clearly outperforms uniform crossover. But note that a popsize of N = 16 is twice as efficient as a popsize of N = 4. The performance decreases till N = 1. Mutation is most efficient with a popsize between 12 and 24. In very difficult fitness landscapes it pays off to try many different searches in parallel. The BGA with crossover only does not come near the optimum. Furthermore, increasing the size of the population from 32 to 4000 gives worse results. This behavior of crossover also dominates the BGA with mutation and crossover. The BGA does not find the optimum if it is run with popsizes greater than 50. This is a very unpleasant fact. There exists only a small range of popsizes where the BGA will find the optimum.
It is known that the above problem would vanish if we used 1-point crossover instead of uniform crossover. But then the results depend on the bit positions of the deceptive function. For the ugly deceptive function [21] 1-point crossover performs worse than uniform crossover. Therefore we will not discuss experiments with 1-point crossover here. The results for SYMBASIN are different. In table 6 the results are given. For mutation this function is only slightly easier to optimize than the DECEPTION function. Good results are achieved with popsizes between 8 and 64. But the SYMBASIN function is a lot easier to optimize for uniform crossover. The BGA with mutation and crossover performs best. Increasing the popsize decreases the number of generations needed to find the optimum.
297l a 0 0 l ~ iI 41 1092215035857404420029621 16 24 125 205 391 765 530 12250 64 18 46 68 106 221 136 14172 6 16 18 19! 20 4 4 14 15 17! 18 0,2136741 aa 1642 2987'ssa719105 l18a[a6421
12
16115 95 186 331 615 418]9840
64.12 I
aa
5a
99 2:611 15}11~176
Table 6. SYMBASIN(3,10);C*: only 50% reached the optimum
The absolute performance of the BGA is impressive compared to other algorithms. We will only mention ONEMAX and DECEPTION. For ONEMAX the number of function evaluations needed to locate the optimum (FE_opt) scales like e · n · ln(n) (empirical law 1). Goldberg [15] observed a scaling of O(n^1.7) for his best algorithm. To our knowledge the previous best results for DECEPTION and uniform crossover have been achieved by the CHC algorithm of Eshelman [10]. The CHC algorithm needed 20960 function evaluations to find the optimum. The BGA needs about 16000 function evaluations. The efficiency can be increased if steepest ascent hillclimbing is used [21]. In the last table we will show that the combination of mutation and crossover also gives good results for continuous functions. In table 7 results for Rastrigin's function [22] are shown. The results are similar to the results of the ONEMAX function. The reason for this behavior has been explained in [22]. A BGA using mutation and discrete recombination with a popsize of N = 4 performs most efficiently.
OP     N   1.0    .1   .01  .001    SD      FE
M      4   594   636   691   801    40    3205
M     64   139   176   225   286     9   18316
M&C    4   531   599   634   720    38    2881
M&C   64    50    66    91   123     3    7932

Table 7. Rastrigin's function (n = 10)
10 Conclusion
The theoretical analysis of evolutionary algorithms has suffered in the past from the fact that the methods developed in quantitative genetics, especially those for understanding artificial selection, have been largely neglected. Many researchers still believe that the schema theorem [14] is the foundation of the theory. But the schema theorem is nothing else than a simple version of Fisher's fundamental theorem of natural selection. In population genetics it was discovered very early that this theorem has very limited applications. We have shown in this paper that the behaviour of evolutionary algorithms can be well understood by the response to selection equation. It turned out that the behaviour of the breeder genetic algorithm is already complex for one of the most simple optimization functions, the ONEMAX function. This function can play the same role for evolutionary algorithms as the ideal gas in thermodynamics. For the ideal gas the thermodynamic laws can be theoretically derived. The laws for real gases are extensions of the basic laws. In the same manner the equations derived for ONEMAX will be extended for other optimization functions. For this extension a statistical approach using the concept of heritability and the genotypic and phenotypic variance of the population can be used. This approach is already used in the science of artificial breeding.
References

1. H. Asoh and H. Mühlenbein. On the mean convergence time of genetic populations without selection. Technical report, GMD, Sankt Augustin, 1994.
2. Thomas Bäck. Optimal mutation rates in genetic search. In S. Forrest, editor, 5th Int. Conf. on Genetic Algorithms, pages 2-9, San Mateo, 1993. Morgan Kaufmann.
3. Thomas Bäck and Hans-Paul Schwefel. A Survey of Evolution Strategies. In Proceedings of the Fourth International Conference of Genetic Algorithms, pages 2-9, San Diego, 1991. ICGA.
4. Thomas Bäck and Hans-Paul Schwefel. An Overview of Evolutionary Algorithms for Parameter Optimization. Evolutionary Computation, 1:1-24, 1993.
5. R. K. Belew and L. Booker, editors. Proceedings of the Fourth International Conference on Genetic Algorithms, San Mateo, 1991. Morgan Kaufmann.
6. H. J. Bremermann, M. Rogson, and S. Salaff. Global properties of evolution processes. In H. H. Pattee, editor, Natural Automata and Useful Simulations, pages 3-42, 1966.
7. M. G. Bulmer. The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford, 1980.
8. J. F. Crow. Basic Concepts in Population, Quantitative and Evolutionary Genetics. Freeman, New York, 1986.
9. J. F. Crow and M. Kimura. An Introduction to Population Genetics Theory. Harper and Row, New York, 1970.
10. L. J. Eshelman. The CHC Adaptive Search Algorithm: How to Have Safe Search when Engaging in Nontraditional Genetic Recombination. In G. Rawlins, editor, Foundations of Genetic Algorithms, pages 265-283, San Mateo, 1991. Morgan Kaufmann.
11. D. S. Falconer. Introduction to Quantitative Genetics. Longman, London, 1981.
12. R. A. Fisher. The Genetical Theory of Natural Selection. Dover, New York, 1958.
13. S. Forrest, editor. Proceedings of the Fifth International Conference on Genetic Algorithms, San Mateo, 1993. Morgan Kaufmann.
14. David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, 1989.
15. D. E. Goldberg. Genetic algorithms, noise, and the sizing of populations. Complex Systems, 6:333-362, 1992.
16. D. E. Goldberg, K. Deb, and B. Korb. Messy genetic algorithms revisited: Studies in mixed size and scale. Complex Systems, 4:415-444, 1990.
17. Michael Herdy. Reproductive Isolation as Strategy Parameter in Hierarchical Organized Evolution Strategies. In PPSN 2 Bruxelles, pages 207-217, September 1992.
18. J. H. Holland. Adaptation in Natural and Artificial Systems. Univ. of Michigan Press, Ann Arbor, 1975.
19. M. Kimura. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, 1983.
20. H. Mühlenbein, M. Gorges-Schleuter, and O. Krämer. Evolution Algorithms in Combinatorial Optimization. Parallel Computing, 7:65-85, 1988.
21. Heinz Mühlenbein. Evolution in time and space - the parallel genetic algorithm. In G. Rawlins, editor, Foundations of Genetic Algorithms, pages 316-337, San Mateo, 1991. Morgan Kaufmann.
22. Heinz Mühlenbein and Dirk Schlierkamp-Voosen. Predictive Models for the Breeder Genetic Algorithm: Continuous Parameter Optimization. Evolutionary Computation, 1(1):25-49, 1993.
23. Heinz Mühlenbein and Dirk Schlierkamp-Voosen. The science of breeding and its application to the breeder genetic algorithm. Evolutionary Computation, 1(4):335-360, 1994.
24. Ingo Rechenberg. Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Information. Fromman Verlag, Freiburg, 1973.
25. H. Schaffer, editor. Proceedings of the Third International Conference on Genetic Algorithms, San Mateo, 1989. Morgan Kaufmann.
26. J. D. Schaffer and L. J. Eshelman. On crossover as an evolutionary viable strategy. In R. K. Belew and L. Booker, editors, Proceedings of the Fourth International Conference on Genetic Algorithms, pages 61-68, San Mateo, 1991. Morgan Kaufmann.
27. H.-P. Schwefel. Numerical Optimization of Computer Models. Wiley, Chichester, 1981.
28. G. Syswerda. Uniform crossover in genetic algorithms. In H. Schaffer, editor, 3rd Int. Conf. on Genetic Algorithms, pages 2-9, San Mateo, 1989. Morgan Kaufmann.
The Role of Mate Choice in Biocomputation: Sexual Selection as a Process of Search, Optimization, and Diversification

Geoffrey F. Miller 1 and Peter M. Todd 2

1 School of Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton, BN1 9QH, UK
[email protected], ac.uk
2 Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA
ptodd@pstar.psy.du.edu

Abstract. The most successful, complex, and numerous species on earth are composed of sexually-reproducing animals and flowering plants. Both groups typically undergo a form of sexual selection through mate choice: animals are selected by conspecifics and flowering plants are selected by heterospecific pollinators. This suggests that the evolution of phenotypic complexity and diversity may be driven not simply by natural-selective adaptation to econiches, but by subtle interactions between natural selection and sexual selection. This paper reviews several theoretical arguments and simulation results in support of this view. Biological interest in sexual selection has exploded in the last 15 years (see Andersson, 1994; Cronin, 1991), but has not yet been integrated with the biocomputational perspective on evolution as a process of search and optimization (Holland, 1975; Goldberg, 1989). In the terminology of sexual selection theory, mate preferences for 'viability indicators' (e.g. Hamilton & Zuk, 1982) may enhance evolutionary optimization, and mate preferences for 'aesthetic displays' (e.g. Fisher, 1930) may enhance evolutionary search and diversification. Specifically, as a short-term optimization process, sexual selection can: (1) speed evolution by increasing the accuracy of the mapping from phenotype to fitness and thereby decreasing the 'noise' or 'sampling error' characteristic of many forms of natural selection, and (2) speed evolution by increasing the effective reproductive variance in a population even when survival-relevant differences are minimal, thereby imposing an automatic, emergent form of 'fitness scaling', as used in genetic algorithm optimization methods (see Goldberg, 1989).
As a longer-term search process, sexual selection can: (3) help populations escape from local ecological optima, essentially by replacing genetic drift in Wright's (1932) "shifting balance" model with a much more powerful and directional stochastic process, and (4) facilitate the emergence of complex innovations, some of which may eventually show some ecological utility. Finally, as a process of diversification, sexual selection can (5)
promote spontaneous sympatric speciation through assortative mating, increasing biodiversity and thereby increasing the number of reproductively isolated lineages performing parallel evolutionary searches (Todd & Miller, 1991) through an adaptive landscape. The net result of these last three effects is that sexual selection may be to macroevolution what genetic mutation is to microevolution: the prime source of potentially adaptive heritable variation, at both the individual and species levels. Thus, if evolution is understood as a biocomputational process of search, optimization, and diversification, sexual selection can play an important role complementary to that of natural selection. In that role, sexual selection may help explain precisely those phenomena that natural selection finds troubling, such as the success of sexually-reproducing lineages, the speed and robustness of evolutionary adaptation, and the origin of otherwise puzzling evolutionary innovations, such as the human brain (Miller, 1993). Implications of this view will be discussed for biology, psychology, and evolutionary approaches to artificial intelligence and robotics.
1 Introduction
Sexual selection through mate choice (Darwin, 1871) has traditionally been considered a minor, peripheral, even pathological process, tangential to the main work of natural selection and largely irrelevant to such central issues in biology as speciation, the origin of evolutionary innovations, and the optimization of complex adaptations (see Cronin, 1991). But this traditional view is at odds with the fact that the most complex, diversified, and elaborated taxa on earth are those in which mate choice operates: animals with nervous systems, and flowering plants. The dominance of these life-forms, and the maintenance of sexual reproduction itself, has often been attributed to the advantages of genetic recombination. But recombination alone is not diagnostic of animals and flowering plants: bacteria and non-flowering plants both do sexual recombination. Rather, the interesting common feature of animals and flowering plants is that both undergo a form of sexual selection through mate choice. Animals are sexually selected for reproduction by opposite-sex conspecifics (Darwin, 1871; see Andersson, 1994), and flowering plants are sexually selected by the heterospecific pollinators such as insects and hummingbirds that they attract to further their own reproduction (Sprengel, 1793; Darwin, 1862; see Barth, 1991). Indeed, Darwin's dual fascination with animal courtship (Darwin, 1871) and with the contrivances of flowers to attract pollinators (Darwin, 1862) may reflect his understanding that these two phenomena shared some deep similarities.

The importance of mate choice in evolution can be appreciated by considering the special properties of neural systems as generators of selection forces. The brains and sensory-motor systems of organisms make choices that affect the survival and reproduction of other organisms in ways that are quite different from the effects of inanimate selection forces (as first emphasized by Morgan,
1888). 1 This sort of psychological selection (Miller, 1993; Miller & Cliff, 1994; Miller & Freyd, 1993) by animate agents can have much more direct, accurate, focused, and striking results than simple biological selection by ecological challenges such as unicellular parasites or physical selection by habitat conditions such as temperature or humidity. Recently, several biologists have considered the evolutionary implications of "sensory selection", perhaps the simplest form of psychological selection (see Endler, 1992; Enquist & Arak, 1993; Guilford & Dawkins, 1991; Ryan, 1990; Ryan & Keddy-Hector, 1992). This paper emphasizes the evolutionary effects of mate choice because mate choice is probably the strongest, most common, and best-analyzed type of psychological selection. But there are many other forms of psychological selection both within and between species. For example, the effects of psychological selection on prey by predators result in mimicry, camouflage, warning coloration, and protean (unpredictable) escape behavior. Artificial selection on other species by humans, whether for economic or aesthetic purposes, is simply the most self-conscious and systematic form of psychological selection. Thus, we can view sexual selection by animals choosing mates as mid-way between brute natural selection by the inanimate environment, and purposive artificial selection by humans.

But the big questions remain: What distinctive evolutionary effects arise from psychological selection, and in particular from sexual selection through mate choice? And how does sexual selection interact with other selective forces arising from the ecological and physical environment? The traditional answer has been that sexual selection either copies natural selection pressures already present (e.g. when animals choose high-viability mates) making it redundant and impotent, or introduces new selection pressures irrelevant to the real work of adapting to the econiche (e.g.
when animals choose highly ornamented mates), making it distracting and maladaptive. In this paper we take a more positive view of sexual selection. By viewing evolution as a 'biocomputational' process of search, optimization, and diversification in an adaptive landscape of possible phenotypic designs, we can better appreciate the complementary roles played by sexual selection and natural selection. We suggest that the success of animals and flowering plants is no accident, but is due to the complex interplay between the dynamics of sexually-selective mate choice and the dynamics of naturally-selective ecological factors. Both processes together are capable of generating complex adaptations and biodiversity much more efficiently than either process alone. Mate choice can therefore play a critical role in biocomputation, facilitating not only short-term optimization within populations, but also the longer-term search for new adaptive zones and new evolutionary innovations, and even speciation and the macroevolution of biodiversity.

1 Mate choice may also be possible without brains, occurring in plants through a variety of mechanisms of female choice and male competition (see Willson & Burley, 1983; Andersson, 1994). However, these mechanisms seem for the most part to be instantiated in and have effects at the microscopic and molecular levels, in contrast to the mostly macroscopic effects of selection by animal nervous systems.
This paper begins with a discussion of the historical origins of the idea of mate choice (section 2) and the evolutionary origins of mate choice mechanisms (section 3). We then explore how mate choice can improve biocomputation construed as adaptive population movements on fitness landscapes, by allowing faster optimization to fitness peaks (section 4), easier escape from local optima (section 5), and the generation of evolutionary innovations (section 6). Moving from serial to parallel search, we then consider how sexual selection can lead to sympatric speciation and thus to evolutionary search by multiple independent lineages (section 7). Finally, section 8 discusses some implications of these ideas for science (particularly biology and evolutionary psychology) and some applications in engineering (particularly genetic algorithms research and evolutionary optimization techniques). This theoretical paper complements our earlier work on genetic algorithm simulations of sexual selection (Todd & Miller, 1991, 1993; Miller & Todd, 1993; Miller, 1994; Todd, in press); in further work we test these ideas with more extensive simulations (Todd & Miller, in preparation) and comparative biology research (Miller, accepted, a; Miller, 1993).

2 The evolution of economic traits through natural selection versus the evolution of reproductive traits through sexual selection
Darwin (1859, 1871) clearly distinguished between natural selection and sexual selection as different kinds of processes operating on different kinds of traits according to different kinds of evolutionary dynamics. For him, natural selection improved organisms' abilities to survive in an environment that is often hostile and always competitive, while sexual selection honed abilities to attract and select mates and to produce viable and attractive offspring. But this critical distinction between natural and sexual selection was lost with the Modern Synthesis (Dobzhansky, 1937; Huxley, 1942; Mayr, 1942; Simpson, 1944), when natural selection was redefined as any change in gene frequencies due to the fitness effects of heritable traits, whether through differential survival or differential reproduction. The theory of sexual selection through mate choice had been widely dismissed after Darwin, and this brute-force redefinition of natural selection to encompass virtually all non-random evolutionary processes did nothing to revive interest in mate choice. Fisher (1915, 1930) was one of the few biologists of his era to worry about the origins and effects of mate choice. He developed a theory of "runaway sexual selection," in which an evolutionary positive-feedback loop is established (via genetic linkage) between female preferences for certain male traits, and the male traits themselves. As a result, both the elaborateness of the traits and the extremity of the preferences could increase at an exponential rate. Fisher's model could account for the wildly exaggerated male traits seen in many species, such as the peacock's plumage, but it did not explain the evolutionary origins of female preferences themselves, and was not stated in formal genetic terms. Huxley (1938) criticized Fisher's model in a hostile and confused review of sexual
selection theory, which kept Darwin's theory of mate choice in limbo for decades to come. In the last 15 years, however, there has been an explosion of work on sexual selection through mate choice. The new population genetics models of O'Donald (1980), Lande (1981), and Kirkpatrick (1982) supported the mathematical feasibility of Fisher's runaway sexual selection process. Behavioral experiments on animals showed that females of many species do exhibit strong preferences for certain male traits (e.g. Andersson, 1982; Catchpole, 1980; Ryan, 1985). New comparative morphology has supported Darwin's (1871) claim that capricious elaboration is the hallmark of sexual selection: for instance, Eberhard (1985) argued that the only feasible explanation for the wildly complex and diverse male genitalia of many species is evolution through female preference for certain kinds of genital stimulation. Evolutionary computer simulation models such as those of Collins and Jefferson (1992) and Miller and Todd (1993) have confirmed the plausibility, robustness, and power of runaway sexual selection. Once biologists started taking the possibility of female choice seriously, evidence for its existence and significance came quickly and ubiquitously. Cronin (1991) provides a readable, comprehensive, and much more detailed account of this history, and Andersson (1994) gives the most authoritative review of the literature. Largely independently of this revival of sexual selection theory, Eldredge (1985, 1986, 1989) has developed a general model of evolution based on the interaction of a "genealogical hierarchy" composed of genes, organisms, species, and monophyletic taxa, and an "ecological hierarchy" composed of organisms, "avatars" (sets of organisms that each occupy the same ecological niche), and ecosystems. 
Phenotypes in this view are composed of two kinds of traits: "economic traits" that arise through natural selection to deal with the ecological hierarchy, and "reproductive traits" that arise through sexual selection to deal with other entities (e.g. potential mates) in the genealogical hierarchy. Eldredge (1989) emphasizes that the relationship between economic success and reproductive success can be quite weak, and that reproductive traits are legitimate biological adaptations, as shown by recent research on mate choice and courtship displays (see Andersson, 1994). Eldredge also grants genealogical units their own hierarchy separate from the ecological one, but does not emphasize the possibility of evolutionary dynamics occurring entirely within the genealogical hierarchy, without any ecological relevance. The one exception is Eldredge's discussion of how "specific mate recognition systems" (SMRSs) might be disrupted through stochastic effects, resulting in spontaneous speciation. But other processes occurring purely within the genealogical hierarchy, such as Fisher's (1930) runaway process, are not mentioned. Thus, even in his authoritative review of macroevolutionary theory (Eldredge, 1989), which consistently views evolutionary change in terms of movements through adaptive landscapes, Eldredge overlooks the adaptive autonomy of sexual selection, and the adaptive interplay between sexual selection and natural selection. But the time is now right to take sexual selection seriously in both roles: (1) as a potentially autonomous evolutionary process that can operate entirely
within Eldredge's "genealogical hierarchy", and (2) as a potentially important complement to natural selection that can facilitate adaptation to Eldredge's "ecological hierarchy" in various ways. The remainder of this paper focuses on this second role. But to understand the dynamic interplay between natural and sexual selection, we must first understand their different characteristic dynamics. Natural selection typically results in convergent evolution onto a few (locally) optimal solutions given pre-established problems posed by the econiche. In natural selection by the ecological niche or the physical habitat, organisms adapt to environments, but not vice-versa (except in relatively rare cases of tight coevolution; see Futuyma & Slatkin, 1983). This causal flow of selection from environment to organism makes natural selection fairly easy to study empirically and formally, because one can often identify a relatively stable set of external conditions (i.e. a 'fitness function') to which a species adapts. Moreover, natural selection itself is primarily a hill-climbing process, good at exploiting adaptive peaks, but somewhat weak at discovering them. By contrast, sexual selection often results in an unpredictable, divergent pattern of evolution, with lineages speciating spontaneously and exploring the space of phenotypic possibilities according to their capriciously evolved mate preferences. In sexual selection, the mate choice mechanisms that constitute the selective 'environment' can themselves evolve under various forces, including the current distribution of available phenotypes. Thus, the environment and the adaptations (the traits and preferences) can co-evolve under sexual selection, as Fisher (1930) realized. The causal flow of sexual selection forces is bi-directional, and thus more complex and chaotic.
The resulting unpredictable dynamics may look entirely anarchic, without structure and due entirely to chance, but are in fact 'autarchic', in that a species evolving through strong selective mate choice is a self-governing system that in a sense determines its own evolutionary trajectory. Indeed, sexual selection could be considered the strongest form of biological self-organization that operates apart from natural selection, but it is a form almost entirely overlooked by those who study self-organization from a biocomputational perspective (e.g. Brooks & Maes, 1994; Kauffman, 1993). If one visualizes sexual selection dynamics as branching, divergent patterns that explore phenotype space capriciously and autonomously, and natural selection dynamics as convergent, hill-climbing patterns that seek out adaptive peaks, then their potential complementarity can be understood. The overall evolutionary trajectory of a sexually-reproducing lineage results from the combined effects of sexual selection dynamics and natural selection dynamics (plus the stochastic effects of genetic drift and neutral drift): an interplay of capriciously directed divergence and ecologically directed convergence. This interplay might help explain evolutionary patterns that have proven difficult to explain under natural selection alone, particularly the abilities of lineages to optimize complex adaptations, to escape from local evolutionary optima, to generate evolutionary innovations, and to split apart into sympatric species. This interplay between capricious, divergent sexual selection and directed, convergent natural selection is analogous to the interplay between genetic mutation and natural selection. The major difference is that the high-level variation in phenotypic design produced by sexual selection is much richer, more complex, and typically less deleterious than the low-level variation in protein structure produced by random genetic mutation. Thus, many of the phenomena that seem difficult to account for through the interaction of low-level genetic mutation and natural selection, might be better accounted for through the interaction of higher-level sexual-selective effects and natural selection. But we should consider the evolutionary origins of mate choice before we consider its evolutionary effects.

3 Why mate choice mechanisms evolve
Darwin (1871) analyzed the evolutionary effects but not the evolutionary origins of mate preferences. Fisher (1915, 1930) went further in discussing how mate preferences might co-evolve with the traits they prefer, by becoming genetically linked to them, but he too did not directly consider the selection pressures on mate choice itself. Recently, the question of how selective mate choice can evolve has occupied an increasingly important position in sexual selection theory (e.g. Bateson, 1983; Kirkpatrick, 1982, 1987; Pomiankowski, 1988; Sullivan, 1989); the issue becomes particularly acute when mate choice is costly in terms of energy, time, or risk (Iwasa et al., 1991; Pomiankowski, 1987, 1990; Pomiankowski et al., 1991). The mysterious origins of mate choice can be made clearer if the adaptive utility of choice in general is appreciated. Little sleep is lost over the issues of how habitat choice, food choice, or nesting place choice could ever evolve given their costs; the same acceptance ought to apply to mate choice. Animal nervous systems have two basic functions: (1) generating adaptive survival behavior that registers, and exploits or avoids, important objects and situations in the ecological environment, such as food, water, prey, and predators ("ecological affordances"), and (2) generating adaptive reproductive behavior that registers and exploits important objects in the sexual environment, such as viable, fertile, and attractive mates ("reproductive affordances"). Current theories of how animals make adaptive choices among ecological affordances are substantially more sophisticated than theories of how animals make adaptive choices among reproductive affordances. However, by seeing both ecological affordances and reproductive affordances as examples of "fitness affordances" in general (Miller & Cliff, 1994; Miller & Freyd, 1993; Todd & Wilson, 1993), we can see the underlying similarity between both sorts of adaptive choice behavior.
The key to choosing food adaptively is to have an evolved food-choice mechanism that has internalized the likely survival effects of eating different kinds of foods: from an evolutionary perspective, the internally represented utility of a food item should reflect its objectively likely prospective fitness effects on the animal, given the animal's energy requirements, biochemistry, gut morphology, etc. By analogy, the key to choosing mates adaptively is an evolved mate choice mechanism that has internalized the likely long-term fitness consequences of reproducing with
different kinds of potential mates, given a certain recurring set of natural and sexual selection pressures. The adaptive benefit of choice in each case is that negative fitness affordances that threatened survival or fertility in the past can be avoided, and positive fitness affordances that enhanced survival or fertility in the past can be exploited. Thus, choice is a way of internalizing ancestral selection pressures into current psychological mechanisms. This view of the evolution of choice suggests that mate choice mechanisms can be analyzed according to normative criteria of adaptiveness. The internally represented sexual attractiveness of a potential mate should reflect its objectively likely prospective fitness value as a mate, in terms of the likely viability and sexual attractiveness of any offspring that one might have with it. Thus, the efficiency and normativity of a mate choice mechanism could in principle be assessed with the same theoretical rigor as a mechanism for any other kind of adaptive choice. Mate choice is well-calibrated if the perceived sexual attractiveness of potential mates is highly correlated with the actual viability, fertility, and attractiveness of the offspring they would produce. The observable traits of potential mates that correlate primarily with offspring survival prospects can be termed "viability indicators" (Zahavi, 1975), and the observable traits that correlate primarily with offspring reproductive prospects can be called "aesthetic displays" of the sort analyzed by Darwin (1871) and Fisher (1930). In fact, most sexually-elaborated traits such as the peacock's tail will probably play both roles to some extent, with their large costs making them useful viability indicators (e.g. Petrie, 1992) but the details of their design making them attractive aesthetic displays (e.g. Petrie et al., 1991). 
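The calibration criterion above can be made concrete with a small sketch. It measures calibration as the Pearson correlation between perceived attractiveness and realized offspring quality; the choice of Pearson correlation, the variable names, and all the numbers are illustrative assumptions, not data or methods from the studies cited.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five potential mates (invented for illustration).
perceived_attractiveness = [0.2, 0.4, 0.5, 0.7, 0.9]

# Offspring quality that tracks perceived attractiveness closely ...
offspring_quality_tracking = [0.25, 0.35, 0.55, 0.65, 0.95]
# ... versus offspring quality that is nearly unrelated to it.
offspring_quality_unrelated = [0.6, 0.2, 0.9, 0.3, 0.5]

well_calibrated = pearson(perceived_attractiveness, offspring_quality_tracking)
poorly_calibrated = pearson(perceived_attractiveness, offspring_quality_unrelated)
```

Under this reading, a mate choice mechanism whose attractiveness scores give a correlation near 1 is well-calibrated, while one giving a correlation near 0 conveys little about offspring prospects.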
Now we can ask, what actually gets "evolutionarily internalized" from the environment (Shepard, 1984, 1987) in the case of mate preferences? Mate choice mechanisms may in some cases evolve to 'represent' the recent history of a population's evolutionary trajectory through phenotype space, that is, the recent history of natural selection and sexual selection patterns that have been operating in the population. Sustained, directional movement through phenotype space typically implies that directional selection is operating, or that a fitness gradient is being climbed in a certain direction. Mate preferences that are in agreement with this directional movement, internalizing the species' recent history, will then be more successful, assuming the movement continues. In this case, mate preferences can be described as 'anticipatory' assessments of past selection pressures that will probably continue to be applied in the future, in particular to one's offspring. This picture of how mate preferences evolve has clear implications for sexual selection dynamics. If a population has not been moving through phenotypic space, e.g. it is perched atop an adaptive peak due to stabilizing selection, as most populations are most of the time, then mate preferences will probably evolve to favor potential mates near the current peak, and they will tend to reinforce the stabilizing natural selection that is currently in force. (If biased mutation tends to displace individuals from the peak more often in one direction than in another, then mate preferences may evolve to counteract that recurrent deleterious mutation by having a directional component; see Pomiankowski et al., 1991.) But if a population has been evolving and moving through phenotype space, then mate preferences can evolve to 'point' in the direction of movement, conferring more evolutionary 'momentum' on the population than it would have under natural selection alone. These sorts of directional mate preferences (Kirkpatrick, 1987; Miller & Todd, 1993) can be visualized as momentum vectors in phenotype space that can keep populations moving along a certain trajectory, in some cases even after natural-selective forces have shifted.
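The momentum picture can be sketched as a toy one-dimensional model. This is only an illustration under strong assumptions: the phenotype is a single number, the mate preference is an exponentially weighted memory of recent movement, and the names `simulate`, `preference_weight`, and `memory` are hypothetical rather than taken from any of the cited models.

```python
def simulate(generations, natural_push, preference_weight, memory=0.5):
    """Track the mean phenotype of a population over time.

    natural_push(gen) gives the per-generation shift due to natural selection.
    The mate preference is a running memory of recent movement; it adds an
    extra shift proportional to preference_weight (assumed to be below 1,
    so the feedback stays bounded).
    """
    position = 0.0
    preference = 0.0
    trajectory = [position]
    for gen in range(generations):
        step = natural_push(gen) + preference_weight * preference
        position += step
        # Preferences internalize the recent direction of movement.
        preference = memory * preference + (1 - memory) * step
        trajectory.append(position)
    return trajectory

def push(gen):
    """Directional natural selection that stops after generation 10."""
    return 1.0 if gen < 10 else 0.0

natural_only = simulate(20, push, preference_weight=0.0)
with_preferences = simulate(20, push, preference_weight=0.5)
```

In this sketch the population without directional preferences stops moving as soon as natural selection stops pushing, while the population with preferences moves further during the push and keeps drifting in the same direction for some generations afterwards.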
Another effect could be seen when a population has been splitting apart due to some form of genetic divergence (which we will discuss more in section 7.1). In this case, mate preferences in each sub-population can evolve to favor breeding within the sub-population, and not between sub-populations, thereby reinforcing the speciation. The divergent mate preferences of two populations splitting apart can be visualized as vectors pointing in different directions. These sexual-selective vectors will reinforce and amplify the initial effects of divergence by imposing disruptive (sexual) selection against individuals positioned phenotypically in between the parting populations. Thus, directional mate preferences will often evolve to be congruent with whatever directional natural selection (if any) is operating on a population, whether it applies to a unified population or one splitting apart into subspecies. Sexual selection may thereby smooth out and reinforce the effects of natural selection.
But sexual selection vectors can often point in different directions from natural selection vectors, resulting in a complex evolutionary interplay between these forces. The evolution of mate preferences can be influenced by a number of factors other than natural selection for mate preferences in favor of high-viability traits. For example, stochastic genetic drift can act on mate preferences as it can act on any phenotypic trait; this effect is important in facilitating spontaneous speciation and in the capriciousness of runaway sexual selection. Intrinsic sensory biases in favor of certain kinds of courtship displays, such as louder calls or brighter colors, may affect the direction of sexual selection (Endler, 1992; Enquist & Arak, 1993; Guilford & Dawkins, 1991; Ryan, 1990; Ryan & Keddy-Hector, 1992). Also, an intrinsic psychological preference for novelty, as noted by Darwin (1871) and in work on the "Coolidge effect" (Dewsbury, 1981), may favor low-frequency traits and exert "apostatic selection" (Clarke, 1962), a kind of centrifugal selection that can maintain stable polymorphisms, facilitate speciation, and hasten the evolution of biodiversity. Thus, a number of effects may lead mate choice mechanisms to diverge from preferring the objectively highest-viability mate as the sexiest mate. These effects will often make sexual-selective vectors diverge from natural-selective gradients in phenotype space, and give sexual selection its capricious, divergent, unpredictable nature. Now that we have considered the evolutionary origins of mate preferences, we can consider their evolutionary effects.
4 Ecological optimization can be facilitated by selective mate choice

Natural selection is often analyzed theoretically, and implemented computationally, as a fairly simple 'fitness function' that maps from phenotypic traits to reproductive success scores (Goldberg, 1989). But natural selection as it actually operates in the wild is often a horribly noisy, irregular, and inaccurate process. Predators might often eat the prey animal that has the better vision, larger brain, and longer legs, simply because that animal happened to be closer at dinner time than the duller, blinder, slower animal over the hill. A lethal virus may attack and eliminate the animal with the better immune system simply because that animal happened to drink from the wrong pond. Anyone who doubts the noisiness and inaccuracy of natural selection should consider the relative lack of speed with which animals evolve in the wild in comparison to evolution under artificial selection by human breeders, who cull undesirable traits with much more accuracy and thoroughness. Maynard Smith (1978, p. 12) observed that evolution can happen up to five orders of magnitude (100,000 times) faster under artificial selection than under typical natural selection, at least over the short term. The fundamental reason for this disparity is that Nature (i.e. the physical habitat or biological econiche) has no incentive to maximize the selective efficiency or accuracy of natural selection, whereas human breeders do have incentives to maximize the efficiency and accuracy of artificial selection. Likewise, animals choosing mates have very heavy incentives to maximize the efficiency and accuracy of their mate choice, and thereby the efficiency and accuracy of the sexual selection that they impose. Thus, it would be extremely surprising if the selective efficiency and accuracy of natural selection were typically as high as that of sexual selection through mate choice.
Habitats and econiches are not well-adapted to impose natural selection, whereas animals are well-adapted to choose mates and thereby to impose sexual selection. (This difference is often obscured in genetic algorithms research, where fitness functions are specifically designed by humans to be efficient and accurate selectors and allocators of offspring.) Given the relative noisiness and inefficiency of natural selection itself, how can the "organs of extreme perfection and complication" that Darwin (1859) so admired ever manage to evolve? We believe they may do so with substantial assistance from selective mate choice, at least in animals and flowering plants. As we saw in the previous section, sexually reproducing animals have strong incentives to internalize whatever natural selection pressures are being applied to their population in the form of selective mate preferences. For example, these preferences can inhibit mating with individuals that probably survived by luck rather than by genetic merit, whatever genetic merit means given current natural-selective and sexual-selective pressures. By avoiding mates that have low apparent viability but happen to still be alive anyway, parents can keep from having offspring that would probably not be so lucky. Conversely, by mating with individuals who clearly show high viability and sexual attractiveness, parents may give their offspring a genetic boost with respect to natural and sexual
selection for generations to come. For example, an average individual who mates with someone with twice their viability or attractiveness may increase their long-term reproductive success (e.g. number of surviving grand-children) by roughly 50% relative to random mating, by having their genes 'hitch-hike' in bodies with the better genes of their mate. This inheritance of genetic and economic advantage through mate choice can have several important effects on the optimization of complex adaptations, because the brains and sensory systems involved in mate choice can act as highly efficient 'lenses' for reflecting, refracting, recombining, amplifying, and focusing natural selection pressures. First, the noisiness of natural selection can be substantially reduced by mate choice, leading to smoother, faster evolutionary optimization. It might take a while for mate preferences to accurately internalize the current regime of natural selection, but once in place, such preferences can exert much more accurate, less noisy selection than natural selection itself can. For example, natural selection by viruses alone (a biological selector) might yield a low correlation between heritable immune system quality and reproductive success, because the infected animals might be too sick to have a full-sized litter, but still manage to have several offspring despite their illness. But mate choice based on observed health and immune capacity may boost this correlation much higher, if conspecifics refuse to mate at all with an individual who bears the viral infection, and thereby lower the sick individual's reproductive success to nil. The higher the correlation between heritable phenotypic traits and reproductive success, the faster the evolution (Fisher, 1930). Mate choice can therefore heavily penalize individuals who show a tendency to get sick, whereas natural selection heavily penalizes only those individuals who actually have fewer offspring or die.
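The immune-system example can be illustrated with a short calculation. The sketch below uses the selection differential from quantitative genetics (the offspring-weighted mean trait of parents minus the population mean, which equals the covariance between trait and relative fitness) as one possible measure of how strongly a trait is tied to reproductive success. The trait values and offspring counts are invented for illustration, and the function name is hypothetical.

```python
def selection_differential(traits, offspring_counts):
    """Offspring-weighted mean trait of parents minus the population mean trait."""
    total_offspring = sum(offspring_counts)
    weighted_mean = sum(z * w for z, w in zip(traits, offspring_counts)) / total_offspring
    population_mean = sum(traits) / len(traits)
    return weighted_mean - population_mean

# Hypothetical heritable immune quality of five individuals (higher is better).
immune_quality = [1, 2, 3, 4, 5]

# Viral selection alone: sick animals still manage to breed, so offspring
# number is only weakly tied to immune quality.
offspring_virus_only = [3, 4, 3, 4, 4]

# Mate choice on visible health: the sickest individuals are refused as mates.
# Total offspring (18) is kept the same to make the comparison fair.
offspring_with_mate_choice = [0, 0, 3, 6, 9]

s_virus_only = selection_differential(immune_quality, offspring_virus_only)
s_mate_choice = selection_differential(immune_quality, offspring_with_mate_choice)
```

With these made-up numbers the selection differential is roughly twelve times larger under mate choice than under viral selection alone, even though the same number of offspring are produced in both cases.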
Here, the brains and sensory systems involved in mate choice act to focus the noisy, diffuse, unreliable forces of natural selection into smoother, steeper gradients of sexual selection. Thus, much of the work of constructing and optimizing complex adaptations may be performed by mate choice mechanisms tuned to reflect natural selection pressures, rather than by the natural selection pressures themselves. Of course, most animals that fail to reproduce (especially in r-selected species that produce large numbers of offspring with little parental care) will do so because they fail to survive to reproductive maturity in the first place, being spontaneously aborted, never hatching, or dying due to illness, starvation, or predation. Out of the countless eggs and sperm that adult salmon release during mating, only a very few zygotes will survive the rigors of childhood and up-river migration to successfully choose mates and spawn themselves. Natural selection may eliminate almost all of the individuals in a particular generation in this way. As Darwin (1859) noted in his discussion of the inevitability of competition, the manifest capability of organisms to reproduce far outstrips the carrying capacity of their environment, so natural selection will eliminate the vast majority of individuals. In contrast, even the most intensive mate choice in highly polygynous species will not cull the remaining reproductively mature individuals from the mating game with anything like this kind of ferocious efficiency. A large number of bachelor males may not leave behind any offspring, but most of the
females and a significant number of males will, making sexual selection look like a much weaker force in terms of the percentages of individuals affected. But the efficiency of a selective process depends most heavily on the correlation between heritable phenotypic features and selective outcomes. In natural selection, this correlation may often be quite low, because, as stressed earlier, Nature typically has no incentive to increase its selective efficiency. By contrast, this correlation may be quite high in sexual selection, because animals have large incentives to increase their mate choice efficiency. Thus, although sexual selection typically affects fewer individuals per generation than natural selection, sexual selection may account for most of the nonrandom change in heritable phenotypic traits, i.e. most of the evolution. Second, mate choice can magnify relative fitness differences, thereby increasing the speed and robustness of optimization. In genetic algorithms research, populations often converge to have nearly equal performance on the user-imposed objective fitness function after a few dozen generations, and further optimization becomes difficult because the relatively small fitness differences are insufficient to result in much evolution. Methods for 'fitness scaling' such as linear rescaling or rank-based selection can overcome this problem by mapping small differences in objective fitness (corresponding to ecological success) onto large differences in reproductive success (Goldberg, 1989). We believe that in nature, sexual selection can provide an automatic form of fitness scaling that helps populations avoid this sort of evolutionary stagnation. Again, sexually reproducing animals have incentives to register slight differences in the observed viability of potential mates and to mate selectively with higher-viability individuals.
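The contrast between unscaled and rank-based selection in a genetic algorithm can be sketched as follows. This is a minimal illustration, assuming a simple linear ranking in which an individual's share of offspring is proportional to its rank; the function names and the fitness values are hypothetical, and Goldberg (1989) describes several variants that differ in detail.

```python
def proportional_shares(fitnesses):
    """Expected share of offspring when reproduction is proportional to raw fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def rank_shares(fitnesses):
    """Expected share of offspring when reproduction is proportional to rank.

    Rank 1 is the least fit individual. Ties are broken by list order in this
    sketch; a fuller implementation would average tied ranks.
    """
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = [0] * len(fitnesses)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    total = sum(ranks)
    return [r / total for r in ranks]

# A nearly converged population: objective fitnesses differ by only a few percent.
fitnesses = [0.97, 0.98, 0.99, 1.00]

unscaled = proportional_shares(fitnesses)
ranked = rank_shares(fitnesses)

# Advantage of the best individual over the worst under each scheme.
unscaled_advantage = unscaled[3] / unscaled[0]
ranked_advantage = ranked[3] / ranked[0]
```

Here the best individual gets only about 3% more offspring than the worst under unscaled selection, but four times as many under rank-based selection, which is the kind of magnification the text attributes to choosy mates.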
The result of this choosiness will be automatic fitness scaling that maintains substantial variance in reproductive success and thereby keeps evolution humming along even when every individual is similar in fitness (e.g. when near some optimum). Here, brains and sensory systems act via mate choice to magnify small fitness differences, effectively separating individuals who would otherwise have indistinguishable fitnesses (and have the same number of offspring) into different distinguishable fitnesses -- and thereby greatly increasing the variance in the number of offspring. Third, mate choice mechanisms can pick out phenotypic traits that are different from those on which natural selection itself acts, but that are highly correlated with natural-selective fitness. For example, bilateral symmetry may be an important correlate of ecological success for many vertebrates. But natural selection might increase the degree of symmetry in a particular lineage only very indirectly through its effects on several different correlates of symmetry, such as locomotive efficiency (individuals with asymmetric legs won't be able to get around as well and so will be selected against on the grounds of their locomotive inefficiency, rather than being selected against for asymmetry per se). By contrast, mate preferences for perceivable facial and body form can directly select for symmetry in a way that natural selection cannot. [2]
[2] Symmetry is a useful general-purpose cue of developmental competence (Møller & Pomiankowski, 1993), because deleterious mutations, injuries, and diseases often disrupt symmetry. Furthermore, an animal choosing a mate based on its ability to develop symmetrically need not know the "intended" optimal form of a particular bilateral structure -- it only needs the circuitry for detecting differences between the two matched halves of the structure. Symmetrically-structured sensory surfaces and neural circuits (e.g. eyes and brains) may make such symmetry judgments easy, because they facilitate the comparison of the corresponding left and right features of perceived objects. The utility of symmetric body-plans as displays of developmental competence, and of symmetric brains and senses as mechanisms for choosing symmetric mates, could make developing a symmetric phenotype a common attractor state for many evolving lineages.
In general, mate choice can complement natural selection by operating on perceivable phenotypic attributes that underlie a wide array of economic traits, but that would typically be shaped only indirectly by a number of different, weak, indirect natural selection pressures. To continue our analogy between brains and optical devices, mate choice mechanisms can act as panoramic lenses, bringing into view a wider array of phenotypic features than natural selection alone would tend to focus on. Natural selection is extremely efficient at eliminating major genetic blunders, such as highly deleterious mutations or disruptive chromosome duplications - it simply prevents the afflicted individual from reaching reproductive maturity. But the subtler task of shaping and optimizing complex adaptations may be more difficult for direct ecological selection pressures to manage. Natural selection alone can of course accomplish wonderful things, given enough time: 3.5 billion years of prokaryote evolution (amounting to many trillions of generations) has produced some quite intricate biochemical adaptations in these single-celled organisms. But for larger-bodied animals with longer generation times, we believe that selective mate choice plays a major role in the optimization of complex adaptations. For such species, the efficacy of natural selection may depend strongly on shaping mate choice mechanisms that 'take over' via sexual selection and do much of the difficult evolutionary work. There are suggestive data that support this hypothesis. Bateson (1988) replotted data from Wyles, Kunkel, and Wilson (1983), and found a strong positive correlation across several taxa between rate of evolution (assessed by a measure of morphological variability across eight traits) and relative brain size. For example, song birds have larger brains than other birds, and apparently evolve faster; humans have the largest brains of all primates, and apparently evolve the fastest.
Bateson (1988) interpreted this correlation in terms of larger brains allowing better habitat choice, a stronger "Baldwin effect" (in which the ability to learn speeds up the evolution of unlearned traits -- see Hinton and Nowlan, 1987), and various forms of "behaviorally induced environmental change" -- but he overlooked the potential effects of brain size on sexual selection patterns. We believe it is more important that larger brains allow more powerful and subtle forms of selective mate choice. Indeed, the vastly enlarged human brain has allowed us not only to (unconsciously) impose strong sexual selection on members of our own species (Darwin, 1871; Miller, 1993), but also to impose very strong artificial selection on members of other species (Darwin, 1859). The correlation between brain size and rate of evolution provides a suggestive start for studies of the relationship between the capacity for selective mate choice and the rate and course of evolution, but clearly much more data is needed on this issue.
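The fitness-scaling effect described earlier in this section can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions, not a model from this chapter: `proportional_offspring` assumes reproduction strictly proportional to fitness, and `choosy_offspring` assumes that each mating goes to the fittest of `k` candidates sampled at random with replacement. The function names, the value of `k`, and the five fitness values are all hypothetical.

```python
from statistics import pvariance

def proportional_offspring(fitnesses):
    """Expected offspring per individual when reproduction is proportional to fitness."""
    total = sum(fitnesses)
    n = len(fitnesses)
    return [n * f / total for f in fitnesses]

def choosy_offspring(fitnesses, k):
    """Expected offspring when each mating goes to the best of k sampled candidates.

    Assumptions: candidates are drawn uniformly with replacement, the chooser
    ranks them without error, and no two individuals have equal fitness.
    """
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])  # worst to best
    expected = [0.0] * n
    for rank, i in enumerate(order):
        # Probability that this individual is the best among k draws.
        p_win = ((rank + 1) / n) ** k - (rank / n) ** k
        expected[i] = n * p_win
    return expected

# A population near an optimum: fitness values are almost identical.
fitnesses = [1.00, 1.01, 1.02, 1.03, 1.04]
prop = proportional_offspring(fitnesses)
choosy = choosy_offspring(fitnesses, k=3)
print("variance in expected offspring, proportional:", pvariance(prop))
print("variance in expected offspring, choosy:", pvariance(choosy))
```

Because the choosy rule depends only on rank, the spread in expected offspring stays the same however small the underlying fitness differences become, which is the sense in which choosiness acts as automatic fitness scaling in this toy setting.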
5 Escaping evolutionary local optima through sexual selection
5.1 The relative power of 'sexual-selective drift', genetic drift, and neutral drift
Populations can become perched on some adaptive peak in the fitness landscape through the optimizing effect of sexual and natural selection acting together. But many such peaks are only local evolutionary optima, and better peaks may exist elsewhere. Once a population has converged on such a locally optimal peak, how then can it move off that peak, incurring a temporary ecological fitness cost, to explore the surrounding adaptive landscape and perhaps find a higher-fitness peak elsewhere? Wright's (1932, 1982) "shifting balance" theory was designed to address this problem of escaping from local evolutionary optima. He suggested that genetic drift operating in quasi-isolated populations can sometimes allow one population to move far enough away from its current fitness peak that it enters a new adaptive zone at the base of a new and higher fitness peak. Once that population starts to climb the new fitness peak, its genes can spread to other populations, so that the evolutionary innovations involved in climbing this peak can eventually reach fixation throughout the species. Thus, the species as a whole can climb from a lower peak to a higher one. Wright's (1932) model anticipated some of the recent concerns about how to take "adaptive walks" that escape from local optima in rugged fitness landscapes (Kauffman, 1993). In very rugged landscapes, short steps (defined relative to the landscape's ruggedness) of the sort generated by genetic point mutations are unlikely to allow individuals or populations to escape a local optimum. This is similar to Darwin's (1883) problem of how minor mutations can accumulate into useful adaptations if they have no utility in their initial form.
But jumping further across the landscape does not guarantee success, either: longer steps of the sort generated by macromutations (as favored by Goldschmidt, 1940) are unlikely to end up anywhere very reasonable; most mutations are deleterious, and major mutations even more so. The central problem is how to match the "foray length" of population movements away from local optima with the "correlation length" of the adaptive landscape, and thereby facilitate directional excursions away from the current adaptive peak to explore the surrounding fitness landscape. Wright's shifting balance model suggests that genetic drift might provide enough random jiggling around the local optimum to sometimes knock the population over into another adaptive zone, but the theoretical analysis of adaptive walks in rugged fitness landscapes (Kauffman, 1993) indicates that this is unlikely to be a common occurrence.
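The idea of matching foray length to correlation length can be illustrated with a deliberately simple one-dimensional toy. The sketch below is not Kauffman's NK model or any landscape used in this chapter; the function `fitness`, the peak spacing of 1.0, and the step sizes are assumptions chosen only for illustration. It asks what fraction of single mutations of a given maximum size, starting from a local peak, land on a fitter phenotype.

```python
import math

def fitness(x):
    """Toy rugged landscape: one peak at every integer, taller peaks nearer x = 0."""
    k = math.floor(x + 0.5)              # index of the nearest peak
    height = max(0.0, 10.0 - abs(k))     # peak height falls off away from the origin
    return height * math.cos(math.pi * (x - k)) ** 2

def improvement_probability(x0, step, samples=20001):
    """Fraction of evenly spaced mutations in [x0 - step, x0 + step] fitter than x0."""
    f0 = fitness(x0)
    better = 0
    for i in range(samples):
        x = x0 - step + 2.0 * step * i / (samples - 1)
        if fitness(x) > f0:
            better += 1
    return better / samples

start = 4.0  # sits exactly on a local peak (height 6 in this toy landscape)
for step in (0.05, 1.2, 4.5, 50.0):
    print("max step", step, "-> fraction of fitter mutants:",
          round(improvement_probability(start, step), 3))
```

In this toy, steps much shorter than the spacing between peaks never improve on the local optimum, while steps much longer than the landscape's structure mostly land in poor regions; intermediate steps do best. With only one dimension the penalty for long jumps is mild, and it would be expected to become more severe as the number of dimensions grows.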
Our model of population movement in phenotype space via mate choice is similar to Wright's shifting balance theory, but it provides a mechanism for exploring the local adaptive landscape that can be much more powerful and directional than random genetic drift: sexual selection. Here, we are relying on a kind of 'sexual-selective drift' resulting from the stochastic dynamics of mate choice and runaway sexual selection to displace populations from local optima. We suspect that with mate choice, the effects of sexual-selective drift will almost always be stronger and more directional than simple genetic drift for a given population size, and will be more likely to take a population down from a local optimum and over into a new adaptive zone. Genetic drift relies on passive sampling error to move populations down from economic adaptive peaks, whereas sexual selection relies on active mate choice, which can overwhelm even quite strong ecological selection pressures. Our simulations have shown that with directional mate preferences in particular, populations move around through phenotype space much more quickly than they would under genetic drift alone, and not uncommonly in direct opposition to natural selection forces (Miller & Todd, 1993). Thus, sexual selection can be seen as a way of making Wright's shifting balance model much more powerful, by allowing active mate choice dynamics to replace passive genetic drift as the main source of evolutionary innovation. Aside from classical genetic drift (sampling error in small populations), "neutral drift" through adaptively neutral mutations (Kimura, 1983) might conceivably play an important role in allowing populations to explore high-dimensional adaptive landscapes. The idea is this: the more dimensions there are to an adaptive landscape, the less problematic local optima will be, because the more equal-fitness 'ridges' there will be from one optimum to another in the space.
A local optimum may be a peak with respect to each of two phenotypic dimensions, but it is unlikely to be a peak with respect to each of a thousand dimensions, so there will be plenty of room for adaptively neutral exploration of phenotype space (see Eigen, 1992; Schuster, 1988). Under this model, populations can drift around through adaptive landscapes without incurring fitness costs for doing the exploration. The neutral drift theory is usually applied to molecular evolution (DNA base pair substitutions typically do not change expressed protein functionality), but it could in principle extend to morphology and behavior. To take an implausible example, if quadrupedalism and bipedalism happen to have equal locomotive efficiency in a certain environment (such as the Pleistocene savanna of Africa), a population might drift from the former to the latter without incurring much fitness cost in between, and without natural selection in favor of bipedalism per se. Although both ways of moving may be equal in locomotive efficiency, they have very different implications with respect to other potential activities such as tool use. Once the population drifts into bipedalism, it will happen to enter a new adaptive zone wherein natural selection can favor new adaptations for tool use, resulting in an evolutionary innovation with respect to tool use. Thus, if the problem of local optima in high-dimensional adaptive landscapes really is overstated, then neutral drift from one adaptive zone to another might facilitate the
discovery of evolutionary innovations associated with different adaptive peaks. However, we believe that for complex phenotypic adaptations at the level of morphology and behavior, the problems of local optima are not so easily overcome. The evolutionary conservatism characteristic of many morphological and behavioral traits in many taxa suggests that neutral drift has trouble operating on such traits. Still, so little is known about neutral drift above the level of molecules that such arguments are not convincing. We can nonetheless ask, if neutral drift theory does apply to complex phenotypic traits, is neutral drift through phenotype space likely to be faster with or without the capricious dynamics of sexual selection? Here again, we believe that populations capable of mate choice will be more likely to exploit the possibilities of neutral drift and move along fitness ridges, because mate choice can confer more mobility and momentum on evolving populations.
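The contrast drawn in this section between passive genetic drift and 'sexual-selective drift' can be sketched with a toy simulation. This is not the simulation reported in Miller & Todd (1993); it is a minimal illustration under assumptions introduced here: a single heritable trait, stabilizing natural selection around an economic peak at trait value 0, no separate sexes (any individual may be drawn as 'mother' or 'father'), and a directional preference under which the 'father' is the largest-trait individual among `choosiness` sampled candidates. All names and parameter values are hypothetical.

```python
import math
import random

def simulate(choosiness, generations=100, pop_size=100, seed=0):
    """Return the population's mean trait value after evolving from the peak at 0.

    choosiness = 1 means random mating (drift only); choosiness = k > 1 means
    each 'father' is the largest-trait individual among k sampled candidates.
    """
    rng = random.Random(seed)
    pop = [0.0] * pop_size
    for _ in range(generations):
        # Stabilizing natural selection: survival weight is highest at trait 0.
        weights = [math.exp(-t * t / 50.0) for t in pop]
        mothers = rng.choices(pop, weights=weights, k=pop_size)
        candidates = rng.choices(pop, weights=weights, k=pop_size * choosiness)
        next_pop = []
        for i, mother in enumerate(mothers):
            suitors = candidates[i * choosiness:(i + 1) * choosiness]
            father = max(suitors)  # directional preference for larger trait values
            # Offspring inherit the mid-parent value plus a little heritable noise.
            next_pop.append((mother + father) / 2.0 + rng.gauss(0.0, 0.2))
        pop = next_pop
    return sum(pop) / pop_size

drift_only = simulate(choosiness=1)
with_choice = simulate(choosiness=5)
print("mean trait after 100 generations, drift only:", round(drift_only, 2))
print("mean trait after 100 generations, mate choice:", round(with_choice, 2))
```

Under these assumptions the randomly mating population stays close to the economic peak, while the population with a directional preference is carried well away from it despite the opposing natural selection. How far it travels depends entirely on the assumed strengths of preference and stabilizing selection.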
5.2 The role of sexual dimorphism in escaping local optima through sexual selection
As Darwin (1871) noted, females are usually choosier than males about their mates, so sexual selection typically acts more strongly on males. Sexually dimorphic selection pressures will often result in sexually dimorphic traits, although dimorphism in a trait tends to evolve much more slowly than the trait itself (Lande, 1980, 1987). Thus, Darwin was able to use sexual dimorphism as a diagnostic feature for a trait having evolved through sexual selection. But the effects of sexual dimorphism on longer-term evolutionary processes have rarely been considered. Highly elaborated male courtship displays, whether behavioral or morphological, are often costly in terms of the male's 'economic' success with respect to the surrounding econiche. Indeed, according to Zahavi's (1975) handicap theory, this cost is indirectly the reason why elaborated displays can evolve under sexual selection, as an indication of the male's vitality in being able to overcome the handicapping costly courtship display. If we view a dimorphic population as situated in an adaptive landscape that represents purely ecological (economic) fitness, then the females will be situated close to the fitness peak, while the males will be situated some distance from the peak, and thus lower on the fitness landscape. As the male displays become more elaborated and more costly, the males will end up further away from the fitness peak representing economic optimality. Thus, sexual dimorphism in courtship traits leads to a kind of sexual division of labor with respect to the job of exploring adaptive landscapes. Males get pushed off economic fitness peaks by the pressure of female choice in favor of highly elaborated, costly courtship displays. Due to the typical lack of male choosiness, the females can stay more comfortably situated near the economic fitness peak.
Thus, males become the explorers of the adaptive landscape, compelled to wander through the space of possible phenotypic designs by the demands of female choice to 'bring home' a sexy, interesting, and expensive courtship display: The economic costs of wandering through phenotype space are compensated for by the reproductive benefits of attracting mates with a costly, elaborated courtship
display. In most species most of the time, the males will reach some equilibrium distance (Fisher, 1930; Kirkpatrick, 1982), close enough to the economic fitness peak to survive, but far enough away to demonstrate their viability and to incur the costs of an elaborate display, and the species will be recognized as having some sexually dimorphic traits. But sometimes, in some species, the males might stumble upon a new adaptive zone in the course of their wanderings. That is, a sexually elaborated trait, or some phenotypic side-effect of it, could prove economically useful, and become subject to favorable natural selection. The males would then start to climb the new economic fitness peak; and once the males reach a level of economic benefit on this new peak that exceeds the benefit obtainable on the old fitness peak, then there can be selection for females as well to move from their position on the old peak to the new, higher, peak. This selection on females would act to eliminate the sexual dimorphism that maintained the useful new traits in the males alone, so that the females too could inherit the new trait (from their fathers initially). Thus, once the males enter a new adaptive zone and start to climb a higher fitness peak, a combination of natural selection and reduced sexual dimorphism may move the entire population, males and females, to the top of the new fitness peak. Populations that successfully shift from one adaptive peak to another will show little sexual dimorphism for the original courtship traits that brought them into the region of the new peak, since selection on the females will have worked to remove it; instead, they will be recognized as beneficiaries of an evolutionary innovation that is characteristic of both males and females. 
So it may be difficult to recognize modern species that have undergone this peak-jumping process except through careful analysis of the fossil record; computer simulation may be more useful in determining whether this peak-jumping mechanism is plausible. Such hypothesized rapid shifts between fitness peaks resemble what Simpson (1944) called "quantum evolution" or what Eldredge and Gould (1972) called "punctuations". The quantum evolution term is apt because our theory suggests that populations capable of sexual dimorphism can do a kind of 'quantum tunneling' between adaptive peaks: the normal economic costs that slow movement across low-fitness valleys between peaks can be overridden by genealogical (sexually selected) benefits to the males, allowing them to traverse the valleys much more quickly. The females can then join the males once a new peak is actually discovered. The result could be much more rapid movement between peaks than would be possible under natural selection alone. This rapid tunneling between peaks looks strange from the perspective of the purely economic adaptive landscape that represents only natural selection pressures. But that landscape is not the whole picture: the effects of sexual selection establish a separate 'reproductive landscape' with different dimensions and perhaps a different topography for males and females. The economic and reproductive landscapes together combine to form a master adaptive landscape; what looks like paradoxical downhill movement or quantum tunneling in the purely economic landscape traversed by natural selection may actually be hillclimbing in the combined landscape that includes sexual selection pressures.
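The relation between the economic landscape and the combined landscape can be sketched numerically. The example below is a toy under assumptions that are not taken from this chapter: a single display trait `x`, a two-peaked `economic` landscape, an exponentially increasing male mating success `reproductive`, and a `combined` landscape formed as their product. The text does not specify how the two landscapes combine, so the product is purely an illustrative choice, as are all names and constants.

```python
import math

def economic(x):
    """Ecological fitness: an old peak at x = 0 and a higher peak at x = 10."""
    return math.exp(-x ** 2 / 12.5) + 1.5 * math.exp(-(x - 10.0) ** 2 / 12.5)

def reproductive(x):
    """Assumed mating success of males under a directional female preference."""
    return math.exp(0.5 * x)

def combined(x):
    """Assumed master landscape for males: economic fitness times mating success."""
    return economic(x) * reproductive(x)

def hill_climb(landscape, start, step=0.1):
    """Move to the fitter neighbouring grid point until neither neighbour is better."""
    i = round(start / step)
    while True:
        # Listing i first means ties are resolved by staying put.
        best = max((i, i - 1, i + 1), key=lambda j: landscape(j * step))
        if best == i:
            return i * step
        i = best

stuck = hill_climb(economic, 0.0)      # natural selection alone, from the old peak
males = hill_climb(combined, 0.0)      # males climbing the combined landscape
females = hill_climb(economic, males)  # economic hill-climbing from where males end up
print("economic climb from old peak stops at x =", round(stuck, 1))
print("combined climb from old peak stops at x =", round(males, 1))
print("economic climb from the males' position stops at x =", round(females, 1))
```

In this toy, the walk that looks like a descent into a valley on the economic landscape is an uninterrupted ascent on the combined one, and it ends beyond the new economic peak, so the males settle at some distance from economic optimality. A purely economic climb started from that position then reaches the new, higher peak.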
But won't these initially economically unfeasible excursions by the males threaten their survival, and hence that of the species as a whole? Sexual selection is often maligned for just this reason, as "a fascinating example of how selection may proceed without adaptation" (Futuyma, 1986, p. 278), on the principle that the economic costs of highly elaborated male courtship displays might predispose a species to extinction -- e.g. as argued by Haldane (1932), Huxley (1938), and Kirkpatrick (1982). But as Pomiankowski (1988) has emphasized, the relationship between male economic success and population viability is quite complex and unclear. Reproductive output in sexually-reproducing species is typically limited by the number of females, not by the number of males. The population's rate of replacement will not necessarily be decreased by the loss of male viability due to elaborated courtship displays. On the contrary: "a population denuded of males will have more resources available for females and so may support an absolutely larger reproductive output for a given resource base" (Pomiankowski, 1988). Thus, the population-level costs of sexually elaborated traits may be minimal, and the individual-level benefits may be large, due to sexual selection. This makes quantum tunneling between adaptive peaks through sexual selection a plausible mechanism for generating evolutionary innovations and escaping local ecological optima. At first glance, our proposal bears an uncomfortable resemblance to traditional sexist images of males going out to hunt and sometimes returning with meat for the benefit of their families. But females may also do some important exploration of the adaptive landscape, with respect to different phenotypic dimensions. Under Fisher's (1930) runaway selection model for example, female preferences and male traits both become elaborated through sexual selection. Females become ever-choosier and more discriminating.
The benefits of selective mate choice can favor the evolution of new sensory, perceptual, and decision-making adaptations in females, despite their economic costs. Thus, while males are exploring the space of possible secondary sexual characteristics and behavioral courtship displays under sexual selection, females may be exploring the space of possible sensory, perceptual, and cognitive traits. If the females happen upon a mate choice mechanism such as a new form of color vision or better timbre perception that also happens to have economic benefits in their econiche, then we would expect such mechanisms to be further modified and elaborated through natural selection, and inherited by males as well, eventually showing low dimorphism. Thus, females can also tunnel between peaks in the space of possible perceptual systems, deriving the reproductive benefits of selective mate choice even when a perceptual system shows little ecological benefit. In summary, sexual selection provides the easiest, fastest, and most efficient way for populations to escape local ecological optima. Sexual dimorphism with respect to courtship traits and mate preferences allows a sexual division of labor in searching the adaptive landscape. Many morphological and behavioral innovations that currently show high economic utility and low sexual dimorphism may have originated as parts of male courtship displays. Likewise, many sensory, perceptual, and decision-making innovations could have originated as components
of female choice mechanisms, and later have been modified for ecological applications. Those innovations that did not happen to show any ecological utility remained in their sexually dimorphic form, and are typically not recognized as innovations at all.
6 Sexual selection and evolutionary innovations
6.1 The mystery of evolutionary innovations
Evolutionary innovations are important because natural selection crafts adaptations out of innovations: "Innovation is the mainspring of evolution" (Jablonski & Bottjer, 1990, p. 253). Classic examples of major evolutionary innovations include the bony skeleton of vertebrates, the jaws of gnathostomes, the amniote egg, feathers, continuously growing incisors, large brains in hominids, the insect wing, and insect pollination of angiosperms (Cracraft, 1990). But the complete list of major evolutionary innovations is almost endless, being virtually synonymous with the diagnostic characters of all successful higher taxa, and the complete list of minor innovations would include essentially all diagnostic characters of all species. But, for all their biological importance and large number, the causal origins of evolutionary innovations have been long contended and remain poorly understood. Virtually every major evolutionary theorist has tackled the problem of evolutionary innovations, e.g. Darwin (1859, 1871, 1883), Romanes (1897), Weismann (1917), Wright (1932, 1982), Simpson (1953), Mayr (1954, 1960, 1963), and Gould (1977). But the major questions remain unresolved (see Nitecki, 1990, for a recent review). This section reviews the history of evolutionary thinking about innovations; section 6.2 examines the most baffling features of innovations; section 6.3 suggests that sexual selection through mate choice can help explain the strange pattern of innovations in animals and flowering plants; section 6.4 outlines some limits to our hypothesis; and section 6.5 concludes the discussion of innovations. Darwin, particularly in the sixth edition of the Origin of species (Darwin, 1883), worried about the early evolutionary stages of "organs of extreme perfection" such as the human eye and the bird's wing. How could these innovations be preserved and elaborated before they could possibly assume their later survival function (such as vision or flight)?
The problem for Darwin was to account for the origin of phenotypic innovation that was more complex and well-integrated than what random mutation could produce, but that was not yet useful enough in the struggle for existence to have been favored by natural selection. Mutations seemed able to generate only trivial or disastrous phenotypic changes, and so could not account for the origins of useful innovations, whereas natural selection could only optimize innovations already in place. Nor could Darwin convince skeptics that some mysterious interplay between mutation and selection could account for evolutionary innovations. Darwin's difficulty in accounting for evolutionary innovations was one of the weakest and most often-attacked aspects of his theory of natural selection. Even
his most ardent followers were anxious about this problem. Romanes (1897) was very concerned to show how "adaptive characters", or evolutionary novelties, originate. For him, this was the central question of evolutionary theory, much more important than the question of how species originate, but one that he was never able to answer to his own satisfaction. Simpson (1953) later proposed that "key mutations" can cause a lineage to enter a new "adaptive zone" such that the lineage undergoes an adaptive radiation, splitting apart into a large number of species to exploit all the ecological opportunities in that new adaptive zone. Similarly, Mayr (1963) defined an evolutionary innovation as "any newly acquired structure or property that permits the performance of a new function, which, in turn, will open a new adaptive zone" (Mayr, 1963, p. 602). However, both Simpson and Mayr were better able to describe innovation's effects than to explain its causes. Their notion that major innovations are closely associated with adaptive radiations has been a persistent theme in innovation theory, appearing more recently under the guise of "key evolutionary innovations" in Liem (1973, 1990), and "key characters" in Van Valen (1971). Over this long history, several kinds of explanations have been offered to explain the emergence of evolutionary innovations. Goldschmidt (1940) suggested that macromutations could produce fully functioning novelties in the form of "hopeful monsters". The problem with this view is that random macromutations are overwhelmingly unlikely to generate the sort of structural complexity and integration characteristic of innovations even in their early stages. Complex innovations cannot be explained by undirected random mutation. On the other hand, Fisher (1930) took the Darwinian hard line and maintained that innovations could indeed be produced purely through natural-selective hill-climbing. 
The difficulty with this idea is that it ignores the problem of local optima, as discussed in section 5. Significant innovation corresponds to fairly substantial movement through a multi-dimensional adaptive landscape. But because many adaptive landscapes have complex structures (Eigen, 1992; Kauffman, 1993), with many peaks, ridges, valleys, and local optima, long movements through such landscapes may often require escaping from local optima. As section 5.1 emphasized, this problem of escaping local optima may be more serious at the level of complex phenotypic design than at the level of genetic sequences or protein shapes (cf. Eigen, 1992) -- and most evolutionary innovations of interest to biologists are innovations in complex phenotypic design. (However, see Dawkins, 1994, for a description of a recently simulated example of a possible course of evolution for a complex adaptation -- the vertebrate eye -- that proceeds rapidly and directly from flat skin to fish-eye in 400,000 generations without getting stuck in local optima.) Thus, the evolution of a new phenotypic innovation may often reflect escape from a local adaptive optimum and the discovery of a better solution elsewhere in the space of possible phenotypes (Wright, 1932; Patterson, 1988). Finally, other theorists have put forth explanations of the origins of innovation that stress the role of phenotypic structure in allowing for innovations. In these theories, innovative adaptations can arise through phenotypic by-products
of other adaptive change (Mayr, 1963), through various mechanisms of phenotypic self-organization (e.g. Eigen, 1992; Kauffman, 1993), and through changes in developmental mechanisms, particularly 'heterochronies' that affect the relative timing of the development of different traits (Bonner, 1982; Goodwin et al., 1983; Gould, 1977; Müller, 1990; Raff, 1990; Raff & Raff, 1987). These sorts of phenotypic constraints and correlations are probably important, but as we will see, they cannot explain the most striking features of the distribution of evolutionary innovation. There are three major problems for these and the other traditional theories about evolutionary innovation just described; we will now examine these challenges in turn.
6.2 Three puzzling aspects of evolutionary innovation
First, there is a disparity between the huge number of minor varietal innovations and the small number of ecologically useful innovations. Darwin (1883, p. 156) stressed this problem when he quoted Milne Edwards: "Nature is prodigal in variety but niggardly in innovation. Why ... should there be so much variety and so little real novelty?". The vast majority of characteristic innovations are "inconsequential" (Liem, 1990); they are what Francis Bacon called "the mere Sport of Nature" when he disparaged the apparently pointless variety of animals, plants, and fossils (quoted in Cook, 1991). Only very few of the initially inconsequential minor innovations may lead to major innovative evolutionary shifts in form or function that allow the invasion of major new habitats and adaptive zones. But if evolutionary innovations spread through populations under the influence of traditional natural selection for their ecological utility, why do so few varietal innovations show the sort of ecological utility that characterizes key innovations? Second, there is often a disparity in time between the causal origin of an innovation and the ultimate ecological and evolutionary effect of an innovation. The causes of evolutionary innovations must be clearly separated from their possible effects on diversification, niche exploitation, or adaptive radiation (Cracraft, 1990). "Key innovations" that allow a monophyletic taxon to radiate outwards into a number of new niches can only be identified post hoc, after their success has been demonstrated evolutionarily. Immediately after they originate, evolutionary innovations are just innovations pure and simple. Their prospective future ecological utility as fully elaborated traits cannot bring them into being initially. If we wish to understand the actual causal origins of evolutionary innovations, we must look within the species where the innovation originated, not at the ultimate macroevolutionary consequences of the innovation.
Liem has stressed this point, observing that "An evolutionary novelty may remain in a stasis for extended times when it does not convey an improvement in the matter/energy transfer" (Liem, 1990, p. 161), and "historical tests also show that there is often a great delay between the emergence of a KEI [key evolutionary innovation] and the onset of the diversification it is assumed to cause" due to its newfound ecological utility (Liem, 1990, p. 165). Earlier, he also noted that "adaptive radiations will not occur until after an evolutionary novelty has reached a certain degree of development" (Liem, 1973, p. 426). Jablonski (1986, 1990) has also observed
that many innovations fail to persist, let alone trigger a diversification indicative of ecological utility. Thus, to understand key innovations, we must explain the origin and elaboration of many integrated morphological and behavioral systems that only rarely manifest much survival utility. We seem to need a form of iterative Darwinian selection other than natural selection for ecologically useful survival traits, to account for the period of evolution of an innovation between its first appearance and its eventual ecological significance. Third, the distribution of innovations in animals and flowering plants is not random with respect to phenotypic features, but is highly concentrated in features subject to sexual selection. Traditional theories of innovation through natural selection or through phenotypic constraints and correlations have trouble accounting for this distribution, which is seen most clearly when we consider the methods of biological taxonomy. The most common features used by taxonomists to distinguish one species from another should logically be the sorts of features most characteristic of (at least minor) evolutionary innovations. This is an almost tautological result of the fact that taxa, including species, are in some sense made up of their innovations (Weismann, 1917): their innovations are their critical defining features. The most commonly used defining features for species appear to be primary and secondary sexual traits, and behavioral courtship displays, which Mayr (1960) designated "species recognition signals". And a great many of these traits, used in the identification of species of animals and flowering plants and discussed in speciation research, are just the sort of characteristics most likely to have arisen by sexual selection through mate choice. Studies of evolutionary innovation that rely on reconstructing explicit phylogenies often rely on such features. For example, in Cracraft's (1990, pp.
31-35) analysis of evolutionary innovations in the Pionopsitta genus of South American parrots, every single one of the 30 innovations discussed was a distinctive plumage color pattern or plumage growth pattern that could have been elaborated through mate choice, such as "bright orange-red shoulder patch", "crown bright red in male, not female", "yellow collar around head", or "crown and back of neck black". Moreover, it is often easier in taxonomy to identify the species of a male than of a female animal, because secondary sexual characters are typically more elaborated in males, whereas females more often retain camouflaged and ancestral forms (Eberhard, 1985). So, in Eldredge's (1989) terminology, reproductive rather than economic traits are often used to distinguish between species. In section 7.1, we argue that speciation can result from a stochastic divergence of mate choice criteria in a geographically-united population leading to a disruption of the mate recognition system within a given species. Under this scenario, most of the traits distinguishing one species from another -- that is, most minor evolutionary innovations -- are likely to be sexual characters or courtship displays that arose through mate choice. Moreover, the biological species concept, which views species as reproductively isolated populations, virtually demands that the innovations that distinguish one species from another must function as reproductive isolators -- that is, as traits subject to selective or assortative mate choice. Thus,
both the empirical methods of taxonomists and the theoretical presuppositions of the biological species concept suggest that most evolutionary innovations in animals and flowering plants arose through sexual selection acting on traits capable of creating reproductive isolation between populations, particularly primary and secondary sexual characteristics and courtship behaviors. To explain evolutionary innovations then, we need to account for the following facts: (1) Most innovations are too complex and well-integrated to have resulted simply from random mutation or genetic drift, and are too structurally and functionally novel (i.e. functionally non-neutral) to have resulted simply from neutral drift. (2) Many innovations may require escape from an evolutionary local optimum, which natural-selective hill-climbing tends to oppose. (3) Most innovations remain minor, showing very little ecological utility and not leading to adaptive radiations. (4) Those innovations that do eventually become ecologically important often show a long delay between their origin and their proliferation. Finally, (5) most innovations in animals and flowering plants, i.e. most traits taxonomically useful in distinguishing species, are heavily concentrated in phenotypic traits subject to mate choice, and this distribution cannot be explained by models of innovation relying on general phenotypic correlations and constraints. In general then, the origins of evolutionary innovations must be explained in terms of some kind of selection between individuals that has little effect on ecological success and that only rarely leads to macroevolutionary success. "Irrespective of whether innovations are perceived as 'large' or 'small', they all must arise and become established at the level of individuals and populations, not higher taxa" (Cracraft, 1990, p. 28). 
Thus, innovations that characterize an entire population or species must be explained at some level above that of simple mutation or developmental constraints, but below that of macroevolutionary 'sifting' between species (Vrba & Gould, 1986), and aside from that of natural selection for ecological utility.
6.3 The role of mate choice in generating evolutionary innovations
Sexual selection through mate choice can account for all of these features of evolutionary innovation in animals and flowering plants. Thus, Darwin's "prodigal variety" may arise from a long-overlooked wellspring of innovation - - the effects and side-effects of mate choice. These sexually-selected varietal novelties could be called "courtship innovations." From these humble origins, a few incipient courtship innovations may continue to be elaborated into more and more complex morphological and behavioral characteristics. At various points in this evolutionary course of elaboration, a tiny minority of courtship innovations and their phenotypic by-products will happen to show some ecological utility, and may be modified to form new "economic innovations" that have some ecological utility. And a tiny minority of these economic innovations will prove important enough that they allow adaptive radiations and later come to be recognized as "key innovations." Thus, the causal origins of key innovations may often be the same as the causal origins of courtship innovations: elaboration of a trait by
sexual selection through mate choice. The net result of sexual selection's innovativeness may be that sexual selection is to macroevolution what genetic mutation is to microevolution: the prime source of potentially adaptive heritable variation, at both the individual and species levels.
6.4 What kinds of evolutionary innovations can be generated through sexual selection?
Our theory that many evolutionary innovations arise at first through the effects of selective mate choice, or as side-effects of sexually-selected traits, must be clarified and given some caveats. First, and most obviously, the theory applies only to biological systems where mate choice operates in some fashion. We have lumped together flowering plants and animals because they both undergo a form of sexual selection by animals with nervous systems, either heterospecific pollinators or conspecifics. Evolutionary innovations in asexual lineages, and in sexually reproducing organisms that are too simple to exercise heritable patterns of nonrandom mate choice, must be explained in some other way. But since innovations seem to emerge much more slowly and sparsely in lineages without mate choice, there is less that needs explaining. Thus, we would expect the frequency distribution of evolutionary innovations to be highly skewed across lineages, clustered in species subject to high levels of selective mate choice. As sections 6.1 and 7.2 argue, this is just what we see. Second, selective mate choice can directly affect only those phenotypic traits that are perceivable to the animal doing the selecting, given its sensory and perceptual capabilities. Thus, mate choice typically applies to macroscopic morphology and manifest behavior. But it also applies indirectly to any microscopic morphology, physiology, neural circuitry, or biochemistry that affects the appearance of the perceivable traits or behaviors, e.g.
the iridescence of bird feathers produced by microscopic diffractive structures on feathers, the complex courtship behavior generated by hidden neural circuits, or the persistent bird song allowed by an efficient metabolism. Furthermore, elaboration of these sexually selected traits may often have phenotypic side-effects on many other traits, and ecologically useful innovations may sometimes emerge from these side-effects. So we would expect the frequency distribution of evolutionary innovations across phenotypic traits to be highly skewed, clustered around traits that are directly subject to mate choice (such as genitals, secondary sexual morphology, and courtship behaviors), and spreading outwards from these traits to others that are structurally, behaviorally, or developmentally correlated. Third, as a corollary of the previous point about phenotypic side-effects, our theory may have fairly limited application to evolutionary innovation in the traits of flowering plants, apart from flowers themselves. Pollinators can directly select for flower traits such as shape, color, smell, and size, but it is unclear how easy it would be for floral innovations to become modified into ecologically useful new kinds of seeds, fruits, or chemical defenses, much less new kinds of twigs, leaves, or roots. Moreover, despite the fact that the complexity of plant behavior has often been underestimated (see Darwin, 1876; Simon, 1992), plants cannot
use shifts in behavior and habit to smooth the way for changes of morphological function as easily as animals do (Darwin, 1883; Bateson, 1988). As a result, the modification of courtship innovations into economic innovations in plants may be more difficult than in animals. However, polymorphism and sympatric speciation could almost certainly be facilitated through flower selection by pollinators, as the data from Eriksson and Bremer (1992) suggest. So the effects of pollinator choice might at least explain the higher speciation rates and high rates of floral innovation in flowering plants.
6.5 Summary: An overview of evolutionary innovation through sexual selection
Species perched on adaptive peaks will generally have mate choice mechanisms complementary to the natural-selective pressures keeping them there, so long periods of stasis will ensue for most species, most of the time. But occasionally, directional preferences, or intrinsic perceptual biases in preferences, or genetic drift acting on preferences, can lead to runaway dynamics that take a population (or at least the males) away from the ecological fitness peak. So the effects of mate choice can be visualized as vectors that pull populations away from adaptive peaks out on long forays into the unknown, where they may or may not encounter new ecological opportunities and evolve economically useful traits. If they do not encounter new opportunities, little is lost: the males will have sexually dimorphic courtship innovations, and the females will have mate choice mechanisms, both of which have some economic costs but substantial reproductive benefits. But if they do encounter new opportunities, much is gained: if the male courtship innovation or the female mate choice mechanism happens to be modifiable into a useful economic innovation, then it will be elaborated through natural selection and its degree of sexual dimorphism will decrease. The lucky population will enter a new adaptive zone, rapidly climb the new peak, and may often become reproductively isolated from other populations. The result could look like a period of rapid evolution concentrated around a speciation event, just as described by punctuated equilibrium theory (Eldredge & Gould, 1972). Moreover, if the new adaptive zone happens to be particularly large and fruitful, and the economic innovation proves particularly advantageous, then the event will look like the establishment of a key evolutionary innovation, and may lead to the formation of new higher taxa.
7 Speciation
7.1 Sympatric speciation through sexual selection
Parallel computation can be faster than serial computation. This principle also applies to evolutionary processes of 'biocomputation'. At one level, the adaptive power of natural selection exploits parallelism across the genes, gene complexes, and individuals within a population. But at another level, a single population exploring an adaptive landscape is not as efficient as a set of populations exploring
the landscape in parallel. As section 5.2 discussed, sexual dimorphism between males and females allows one sub-population (the females) to stay perched on an old adaptive peak while another (the males) explores the surrounding phenotype space for other adaptive peaks. Are there any more powerful methods of parallel search in biocomputation that would allow many 'search parties' to branch out across the adaptive landscape? Speciation does exactly that. When a biological lineage splits apart into reproductively isolated subpopulations, one search party is replaced by two independent parties. Here again, we can ask whether mate choice and sexual selection can help biocomputation, this time through facilitating speciation. Though vitally interested in both speciation and mate choice, Darwin did not seem to perceive this connection, and the Origin of species (1859) in fact offered no clear mechanism of any sort whereby speciation could happen. The biologists of the Modern Synthesis (e.g. Dobzhansky, 1937; Huxley, 1942; Mayr, 1942) saw species as self-defined reproductive communities, and yet often argued against the idea that sexual selection, the obvious agent of reproductive self-definition, could induce speciation, because their attitude towards Darwin's theory of selective mate choice was so hostile. Instead, two major theories of speciation developed during the Modern Synthesis, and both suggested that speciating populations are split apart by some divisive force or "cleaver" external to the population itself. The cleaver separates the population in twain genetically and phenotypically, and then reproductive barriers arise afterwards through genetic drift or through selection against hybridization. In Mayr's (1942) model of allopatric (spatially separated) speciation, the cleaver is a new geographic barrier arising to separate previously interbreeding populations. For example, a river may shift course to isolate one population from another. 
Some combination of genetic drift, the "founder effect" (genetic biases resulting from populations starting with a very few isolated individuals), and disruptive selection then causes the two newly isolated groups to diverge phenotypically. Once enough phenotypic divergence accumulates, the populations can no longer interbreed even when the physical barrier disappears, and so are recognized as separate species. Speciation for Mayr was thus generally a side-effect of geographical separation. In Dobzhansky's (1937) model of sympatric (spatially united) speciation, the cleaver is more abstract: it is a low-fitness valley in an adaptive landscape, rather than a barrier in geographic space. For example, an adaptive landscape might contain two high-fitness peaks (econiches) separated by a low-fitness valley. This valley could establish disruptive selection against interbreeding between the peaks, thereby driving an original population starting in the valley to split and diverge towards the separate peaks in two polymorphic subpopulations. Dobzhansky further suggested that after divergence, reproductive isolation evolves through selection against hybridization: since hybrid offspring will usually fall genetically back in the lower-fitness valley, mechanisms to prevent cross-breeding between the separate populations will tend to evolve. Thus the evolution of reproductive isolation (speciation itself) is viewed as a conservative
process of consolidating adaptive change rather than a radical process of differentiation. Vrba (1985) and Futuyma (1986) concur that speciation serves a conservative function, acting like a 'ratchet' in macroevolution: only reproductive isolation allows a newly diverged population to effectively consolidate its adaptive differentiation; otherwise, the parent species will tend to genetically re-absorb it. A recent development in sympatric models is Paterson's (1985) concept of specific mate recognition systems (SMRSs). SMRSs are phenotypic mechanisms a species uses to maintain itself as a self-defining reproductive community: in our terms, a set of mate choice mechanisms for assortative mating. A species is thus considered the largest collection of organisms with a shared SMRS. In Paterson's view, sympatric disruption and divergence of these SMRSs themselves (through some unspecified processes) can lead to speciation. Eldredge (1989, p. 120) emphasizes the potential macroevolutionary significance of SMRSs: "significant adaptive change in sexually reproducing lineages accumulates only in conjunction with occasional disruptions of the SMRSs." Historically, the acceptability of sympatric models has depended on the perceived ability of disruptive selection to generate stable polymorphisms and eventual reproductive isolation. A large number of experiments reviewed by Thoday (1972) show that disruptive selection is sufficient to generate phenotypic divergence even in the face of maximal gene flow between populations (which Mayr, 1963, p. 472, saw as the Achilles' heel of sympatric speciation models), and that mechanisms of reproductive isolation can then evolve to avoid hybrids and consolidate that divergence. Computer models by Crosby (1970) showed that sympatric speciation could occur when populations choose different micro-habitats, evolve stable polymorphisms through disruptive selection, and then evolve reproductive barriers to avoid hybridization. But the speciation debate has continued to grind down to a question of whose cleaver is bigger: Mayr's (1942) geographic barriers or Dobzhansky's (1937) fitness valleys. To address this issue, we (Todd & Miller, 1991) developed a computer simulation of sexual selection that allowed for the possibility of "spontaneous" sympatric speciation through the interaction of assortative mating and genetic drift acting in a finite population.
We found that spontaneous speciation could indeed happen, even in the absence of any geographic isolation and even without any natural selection - - no cleaver is necessary beyond the mate choices of individuals in the population. The rate of speciation increased with mutation rate and depended on the exact type of mate preference implemented. Preferences for individuals similar to one's own phenotype yielded the highest speciation rate, while inherited preferences for individuals with particular specific phenotypes yielded lower rates of speciation. In further investigations we found that spontaneous speciation also happens robustly with directional mate preferences, when the directional preference vectors happen to diverge and split the population into two subpopulations heading off on different trajectories through phenotype space (Miller & Todd, 1993); and that speciation can happen robustly as well when an individual's mate preferences are learned from the phenotypes of their
parents through the process of 'sexual imprinting' (Todd & Miller, 1993; Todd, in press).
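The kind of simulation described above can be illustrated with a minimal sketch. This is not the Todd & Miller (1991) model; it is a toy version built on assumptions introduced here purely for illustration: a one-dimensional phenotype, hermaphroditic individuals, a Gaussian preference for phenotypically similar mates (width `tolerance`), blending inheritance plus mutation noise, and no natural selection at all. Clusters are counted as groups separated by a phenotypic gap larger than `gap`. Whether the population actually splits depends on the parameter values and the random seed.

```python
import math
import random


def choose_mate(chooser, population, tolerance, rng):
    """Pick a mate with probability weighted by phenotypic similarity.

    The chooser's own phenotype is in the pool (weight 1.0), so the
    weights can never all underflow to zero.
    """
    weights = [
        math.exp(-((chooser - other) ** 2) / (2 * tolerance ** 2))
        for other in population
    ]
    return rng.choices(population, weights=weights, k=1)[0]


def next_generation(population, tolerance, mutation_sd, rng):
    """One generation of assortative mating with drift and mutation."""
    offspring = []
    for _ in range(len(population)):
        parent = rng.choice(population)  # uniform: no natural selection
        mate = choose_mate(parent, population, tolerance, rng)
        child = (parent + mate) / 2 + rng.gauss(0, mutation_sd)
        offspring.append(child)
    return offspring


def count_clusters(population, gap):
    """Count phenotype clusters separated by more than `gap`."""
    ordered = sorted(population)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > gap)


def simulate(size=100, generations=200, tolerance=0.5,
             mutation_sd=0.2, gap=1.0, seed=0):
    """Run the toy model and return (final population, cluster count)."""
    rng = random.Random(seed)
    population = [rng.gauss(0, 1) for _ in range(size)]
    for _ in range(generations):
        population = next_generation(population, tolerance, mutation_sd, rng)
    return population, count_clusters(population, gap)


if __name__ == "__main__":
    final_population, clusters = simulate()
    print("clusters after simulation:", clusters)
```

Raising `mutation_sd` or narrowing `tolerance` in this sketch is one way to explore the qualitative claim that speciation rate depends on mutation rate and on the type of mate preference.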
7.2 Sexual selection and the origins of biodiversity
There is some biological evidence that speciation rates are indeed higher when selective mate choice plays a more important role. Ryan (1986) found a correlation between cladal diversity in frogs and complexity of their inner ear organs (amphibian papilla), which are responsible for the operation of female choice on male calls. He reasoned that "since mating call divergence is an important component in the speciation process, differences in the number of species in each lineage should be influenced by structural variation of the inner ear [and hence the operation of mate choice]" (p. 1379). Immelmann (1972, p. 167) has argued that mate preferences derived from imprinting on the phenotypes of one's parents may speed speciation in ducks, geese, and the like: "imprinting may be of special advantage in any rapidly evolving group, as well as wherever several closely related and similar species occur in the same region [i.e. sympatric situations]. Interestingly enough, both statements really do seem to apply to all groups of birds in which imprinting has been found to be a widespread phenomenon...". The enormous diversity of insects (at least 750,000 documented species, maybe as many as 10 million in the wild) might seem at first sight to contradict the notion that mate choice facilitates speciation, since few (except Darwin) seem willing to attribute much mate choice to insects. But Eberhard (1985, 1991, 1992) has shown that male insect genitalia evolve largely through the effects of cryptic female choice, in such a way that speciation could be promoted. Further evidence for speciation through mate choice comes from a consideration of biodiversity and the numbers of species across different kingdoms and phyla. There seems to be a striking correlation between a taxon's species diversity and the taxon's evolutionary potential for sexual selection through mate choice, resulting in highly skewed richness of species across the five kingdoms.
Recent estimates of biodiversity suggest there may be somewhere between 10 and 80 million species on earth (May, 1990, 1992). But of the 1.5 million or so species that have actually been identified and documented so far by taxonomists, the animal kingdom contains about 1,110,000, the plant kingdom contains about 290,000, the fungi contain about 90,000, the protists contain about 40,000, and the monera contain only about 5000 (Cook, 1991). (It should be noted that sampling biases might account for a small amount of the skewness here: many animals and plants are larger and easier to notice and to classify than fungi, protists, or monera.) Although the vast majority of species in each kingdom can undergo some form of genetic recombination through sexual reproduction, only in the animals and the flowering plants is selective mate choice of central importance. Of the 290,000 documented species of plants, about 250,000 are angiosperms (flowering plants) fertilized by animal pollinators. And of the 1,110,000 documented species of animals, those with sufficient neural complexity to allow for some degree of mate choice (particularly the arthropods, molluscs,
and chordates) are much more numerous than those without. Thus, species diversity is vastly greater among taxa wherein a more or less complex nervous system mediates mate choice, either a conspecific's nervous system in the case of animals, or a heterospecific pollinator's nervous system in the case of flowering plants. This pattern is the opposite of what we might expect if allopatric speciation were the primary cause of biodiversity. The effects of geographic separation (allopatry) should obviously be weaker for species whose reproduction is mediated by a mobile animal. Animals can search over wide areas for mates and pollinators can fly long distances. So allopatric speciation would predict lower species diversity among taxa whose reproduction is mediated by mobile animals with reasonably complex nervous systems, which is exactly the opposite of what we observe. A similar problem holds for sympatric speciation through disruptive selection: animals with complex nervous systems should find it easier to generate conditional behavior that exploits different fitness peaks (ecological niches) flexibly, without having to speciate in order to specialize. Yet it is precisely such animals that seem to speciate most quickly. To further explore the role of selective mate choice in creating species biodiversity, we need to analyze the degree of mate choice in the various taxa more accurately, adjust the speciation rates between taxa for number of generations of evolution (and thus organism size), and if possible take into account the amount of geographic spread and migratory range of the species involved. In this way, we hope to gain more evidence to show that sympatric speciation through mate choice, particularly through assortative mating, is the best explanation available for the extreme biodiversity of animals and flowering plants, and is thus the most powerful mechanism for dividing up and spreading out evolution's exploratory search of the adaptive landscape.
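To make the skew in documented species counts concrete, the figures quoted above from Cook (1991) can be tallied directly. The short sketch below only restates those numbers; the variable names are illustrative, and the combined "animals plus angiosperms" figure is an upper bound on the taxa subject to nervous-system-mediated mate choice, since the text gives no separate count for animals lacking sufficient neural complexity.

```python
# Documented species per kingdom, as quoted in the text (Cook, 1991).
documented_species = {
    "animals": 1_110_000,
    "plants": 290_000,
    "fungi": 90_000,
    "protists": 40_000,
    "monera": 5_000,
}
angiosperms = 250_000  # flowering plants, a subset of the plant kingdom

total = sum(documented_species.values())  # 1,535,000, i.e. "1.5 million or so"
shares = {kingdom: count / total for kingdom, count in documented_species.items()}

# Upper bound on taxa where a nervous system mediates mate choice:
# all animals (an overestimate) plus animal-pollinated flowering plants.
animals_plus_angiosperms = documented_species["animals"] + angiosperms  # 1,360,000

if __name__ == "__main__":
    for kingdom, share in shares.items():
        print(f"{kingdom:>8}: {share:6.1%} of documented species")
    print(f"animals + angiosperms: {animals_plus_angiosperms / total:.1%}")
    print(f"angiosperm share of plants: {angiosperms / documented_species['plants']:.1%}")
```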
8 Implications and applications
8.1 Implications for biology and psychology
Biologists have been exploring the nuances of natural selection almost continuously since Darwin's time, and much has been learned. By contrast, Darwin's (1871) theory of sexual selection through mate choice was virtually ignored until about 15 years ago, so the implications of sexual selection are only beginning to be realized. This paper has made some strong claims about how natural selection and sexual selection might interact to explain long-standing mysteries in biology, such as how complex adaptations get optimized, how species split apart, and how evolutionary innovations are constructed before they show any ecological utility. From the perspective of traditional natural selection research and the Modern Synthesis, these claims may look strange and implausible. But Darwin may not have found them so. Taking mate choice seriously does not mean abandoning Darwinism, adaptationism, optimality theory, game theory, or anything else of proven value in biology. It simply means recognizing a broader class of selection
pressures and a richer set of evolutionary dynamics than have been analyzed so far. Psychology has barely begun to recognize the role of natural selection in constructing mental and behavioral adaptations, much less the role of sexual selection in doing so. One of our motivations for exploring the interaction of natural and sexual selection is our conviction that sexual selection may have played a critical role in the evolution of our unique human morphology (Szalay & Costello, 1991) and psychology (Miller, 1993). The evolution of the human brain can be seen as a problem of escaping a local optimum: the ecologically efficient 500 cc. brain of the Australopithecines, who were perfectly good at bipedal walking, gathering, scavenging, and complex social life with their normal ape-sized brains. During the rapid encephalization of our species in the last two million years, through the Homo habilis and Homo erectus stages up through archaic Homo sapiens, our ancestors showed very little ecological progress: tool making was at a virtual standstill, the hunting of even small animals was still quite inefficient, and we persisted alongside unencephalized Australopithecine species for well over a million years. These facts suggest an evolutionary pattern just like that of other key innovations, as discussed in section 6.2: that large brains did not give our lineage any significant ecological advantages until the last 100,000 years, when big-game hunting and complex tool-making started to develop quite rapidly, long after we had attained roughly our present brain size. Instead, we propose that brain size probably evolved through runaway sexual selection operating on both males and females (Miller, 1993). Human encephalization represents the most mysterious example of innovative escape from a local ecological optimum, and we think the runaway dynamics of selective mate choice had everything to do with this escape.
8.2 Applications in genetic algorithms research and evolutionary design optimization
If mate choice has been critical to the innovation, optimization, and diversification of life on our planet, we might expect that mate choice will also prove important in the design of complex artificial systems using genetic algorithms and other evolutionary optimization techniques. Evolutionary engineering methods are often defended by claiming that we have a 'sufficiency proof' that natural selection alone is capable of generating complex animals with complex behaviors. But this is not strictly true: all we really know is that natural and sexual selection in concert can do this. Indeed, the traditional assumption in genetic algorithms research that sexual recombination per se is the major advantage of sexual reproduction (Holland, 1975; Goldberg, 1989) may be misleading. If instead the process of selective mate choice is what gives evolutionary power and subtlety to sexual reproduction, then current genetic algorithms work may be missing out on a major benefit of simulating sex. For those interested in evolving robot control systems (e.g. Cliff, Husbands, & Harvey, 1992; Harvey, Husbands, & Cliff, 1992, 1993) or other complex design
structures (e.g. Goldberg, 1989; Koza, 1993; see Forrest, 1993) through simulated natural selection, we suggest that incorporating processes of simulated sexual selection may help speed optimization, avoid local evolutionary optima, develop important new evolutionary innovations, and increase parallel search and niche differentiation through speciation. These effects may become particularly important as we move from pre-defined noise-free fitness functions to more complex, noisy, emergent fitness functions of the sort that arise when actually simulating ecosystems, coevolution, and other more naturalistic interactions. Also, to the extent that the human brain evolved through runaway sexual selection (Miller, 1993), simulated sexual selection may help us cross the border between artificial life and artificial intelligence sometime in the future.
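One way such simulated mate choice might be added to a conventional genetic algorithm is sketched below. Every design decision here is an assumption made for illustration rather than a method taken from the works cited: genomes are bit strings, the 'economic' fitness is the toy OneMax count of 1 bits, the first parent is picked by fitness tournament (standing in for natural selection), and that parent then chooses the most similar of a few random suitors by Hamming distance (standing in for assortative mate choice).

```python
import random


def fitness(genome):
    """Toy 'economic' fitness: number of 1 bits (OneMax)."""
    return sum(genome)


def hamming(a, b):
    """Number of positions at which two genomes differ."""
    return sum(x != y for x, y in zip(a, b))


def tournament(population, rng, k=3):
    """Natural-selection step: fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def choose_mate(chooser, population, rng, suitors=5):
    """Mate-choice step: the chooser picks the most similar random suitor."""
    candidates = rng.sample(population, suitors)
    return min(candidates, key=lambda c: hamming(chooser, c))


def crossover(a, b, rng):
    """Single-point crossover between two equal-length genomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(pop_size=40, length=20, generations=30, rate=0.02, seed=0):
    """Run the toy GA with mate choice and return the final population."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        new_population = []
        for _ in range(pop_size):
            chooser = tournament(population, rng)
            mate = choose_mate(chooser, population, rng)
            new_population.append(mutate(crossover(chooser, mate, rng), rate, rng))
        population = new_population
    return population


if __name__ == "__main__":
    final = evolve()
    print("best fitness:", max(fitness(g) for g in final))
```

Swapping `choose_mate` for a different preference rule (for example a directional preference, or one encoded in the genome itself) is the natural place to experiment; the sketch makes no claim about which rule performs best.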
9 Conclusions
Natural selection is fairly good at climbing fitness peaks in adaptive landscapes representing 'economic' traits. Sexual selection through mate choice has complementary strengths: it is good at making this natural-selective hill-climbing faster and more accurate, at allowing escape from local optima, at generating courtship innovations that may prove useful as economic innovations, and at creating biodiversity and parallel niche differentiation through speciation. The two processes together yield a very powerful form of biocomputation that rapidly and efficiently explores the space of possible phenotypes, as shown by the diversity and complexity of animals and flowering plants on our planet. We are all the products not only of selection for survival, but also of selection for sexiness: dark-bright alloys forged in death and shaped by love.
10 Acknowledgments
Geoffrey Miller has been supported by NSF Research Grant INT-9203229 and NSF-NATO Post-Doctoral Grant RCD-9255323. For comments, support, advice, and/or inspiration relevant to this work, we are indebted to: Dave Cliff, Helena Cronin, Inman Harvey, Phil Husbands, Andrew Pomiankowski, Roger Shepard, and John Maynard Smith.
References
Andersson, M. (1994): Sexual selection. Princeton: Princeton U. Press.
Barth, F. G. (1991): Insects and flowers: The biology of a partnership. Princeton: Princeton U. Press.
Bateson, P. (Ed.). (1983): Mate choice. Cambridge, UK: Cambridge U. Press.
Bateson, P. (1988): The active role of behavior in evolution. In M.-W. Ho & S. W. Fox (Eds.), Evolutionary processes and metaphors (pp. 191-207). New York: John Wiley.
Bonner, J. T. (Ed.). (1982): Evolution and development. Berlin: Springer-Verlag.
Brooks, R. A., & Maes, P. (Eds.). (1994): Artificial Life IV. Cambridge, MA: MIT Press/Bradford Books.
Clarke, B. C. (1962): The evidence for apostatic selection. Heredity (London), 24, 347-352.
Cliff, D., Husbands, P., & Harvey, I. (1992): Evolving visually guided robots. In J.-A. Meyer, H. L. Roitblat, & S. W. Wilson (Eds.), From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior (pp. 374-383). Cambridge, MA: MIT Press.
Cook, L. M. (1991): Genetic and ecological diversity: The sport of nature. London: Chapman & Hall.
Cracraft, J. (1990): The origin of evolutionary novelties: Pattern and process at different hierarchical levels. In M. Nitecki (Ed.), Evolutionary innovations (pp. 21-44). Chicago: U. Chicago Press.
Cronin, H. (1991): The ant and the peacock: Altruism and sexual selection from Darwin to today. Cambridge, UK: Cambridge U. Press.
Crosby, J. L. (1970): The evolution of genetic discontinuity: Computer models of the selection of barriers to interbreeding between subspecies. Heredity, 25, 253-297.
Darwin, C. (1859): On the origin of species (1st ed.). London: John Murray.
Darwin, C. (1862): On the various contrivances by which orchids are fertilized by insects. London: John Murray.
Darwin, C. (1871): The descent of man, and selection in relation to sex. London: John Murray.
Darwin, C. (1876): The movements and habits of climbing plants (2nd ed.). New York: D. Appleton & Co.
Darwin, C. (1883): On the origin of species (6th ed.). New York: D. Appleton & Co.
Dawkins, R. (1994): The eye in a twinkling. Nature, 368, 690-691.
Dewsbury, D. A. (1981): Effects of novelty on copulatory behavior: The Coolidge Effect and related phenomena. Psychological Review, 89(3), 464-482.
Dobzhansky, T. (1937): Genetics and the origin of species. (Reprint edition 1982). New York: Columbia U. Press.
Endler, J. A. (1992): Signals, signal conditions, and the direction of evolution. American Naturalist, 139, S125-S153.
Eberhard, W. G. (1985): Sexual selection and animal genitalia. Cambridge, MA: Harvard U. Press.
Eberhard, W. G. (1991): Copulatory courtship and cryptic female choice in insects. Biol. Rev., 66, 1-31.
Eberhard, W. G. (1992): Species isolation, genital mechanics, and the evolution of species-specific genitalia in three species of Macrodactylus beetles (Coleoptera, Scarabaeidae, Melolonthinae). Evolution, 46(6), 1774-1783.
Eigen, M. (1992): Steps towards life: A perspective on evolution. Oxford: Oxford U. Press.
Eldredge, N. (1985): Unfinished synthesis: Biological hierarchies and modern evolutionary thought. New York: Oxford U. Press.
Eldredge, N. (1986): Information, economics, and evolution. Ann. Review of Ecology and Systematics, 17, 351-369.
Eldredge, N. (1989): Macroevolutionary dynamics: Species, niches, and adaptive peaks. New York: McGraw-Hill.
Eldredge, N., & Gould, S. J. (1972): Punctuated equilibria: An alternative to phyletic gradualism. In T. J. M. Schopf (Ed.), Models in paleobiology (pp. 82-115). San Francisco: Freeman, Cooper.
Enquist, M., & Arak, A. (1993): Selection of exaggerated male traits through female aesthetic senses. Nature, 361(6411), 446-448.
Fisher, R. A. (1915): The evolution of sexual preference. Eugenics review, 7, 184-192.
Fisher, R. A. (1930): The genetical theory of natural selection. Oxford: Clarendon Press.
Forrest, S. (Ed.) (1993): Proceedings of the Fifth International Conference on Genetic Algorithms. San Francisco: Morgan Kaufmann.
Futuyma, D. (1986): Evolutionary biology. Sunderland, MA: Sinauer.
Futuyma, D., & Slatkin, M. (Eds.). (1983): Coevolution. Sunderland, MA: Sinauer.
Goldschmidt, R. B. (1940): The material basis of evolution. New Haven, CT: Yale U. Press.
Goodwin, B. C., Holder, N., & Wylie, C. C. (Eds.). (1983): Development and evolution. Cambridge, UK: Cambridge U. Press.
Gould, S. J. (1977): Ontogeny and phylogeny. Cambridge, MA: Harvard U. Press.
Guilford, T., & Dawkins, M. S. (1991): Receiver psychology and the evolution of animal signals. Animal Behaviour, 42, 1-14.
Haldane, J. B. S. (1932): The causes of evolution. London: Longman.
Harvey, I., Husbands, P., & Cliff, D. (1992): Issues in evolutionary robotics. In J.-A. Meyer, H. L. Roitblat, & S. W. Wilson (Eds.), From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior (pp. 364-373). Cambridge, MA: MIT Press/Bradford Books.
Harvey, I., Husbands, P., & Cliff, D. (1993): Genetic convergence in a species of evolved robot control architectures. In S. Forrest (Ed.), Proceedings of the Fifth International Conference on Genetic Algorithms. San Francisco: Morgan Kaufmann.
Hinton, G. E., & Nowlan, S. J. (1987): How learning guides evolution. Complex Systems, 1, 495-502.
Huxley, J. S. (1938): The present standing of the theory of sexual selection. In G. R. de Beer (Ed.), Evolution: Essays on aspects of evolutionary biology (pp. 11-42). Oxford: Clarendon Press.
Huxley, J. S. (1942): Evolution: The modern synthesis. New York: Harper.
Iwasa, Y., Pomiankowski, A., & Nee, S. (1991): The evolution of costly mate preferences. II. The 'handicap' principle. Evolution, 45(6), 1431-1442.
Jablonski, D., & Bottjer, D. J. (1990): The ecology of evolutionary innovation: The fossil record. In M. Nitecki (Ed.), Evolutionary innovations (pp. 253-288). Chicago: U. Chicago Press.
Jensen, J. S. (1990): Plausibility and testability: Assessing the consequences of evolutionary innovation. In M. Nitecki (Ed.), Evolutionary innovations (pp. 171-190). Chicago: U. Chicago Press.
Kauffman, S. A. (1993): Origins of order: Self-organization and selection in evolution. New York: Oxford U. Press.
Kimura, M. (1983): The neutral theory of molecular evolution. In M. Nei & R. K. Koehn (Eds.), Evolution of genes and proteins (pp. 213-233). Sunderland, MA: Sinauer.
Kirkpatrick, M. (1982): Sexual selection and the evolution of female choice. Evolution, 36, 1-12.
Kirkpatrick, M. (1987): The evolutionary forces acting on female preferences in polygynous animals. In J. W. Bradbury & M. B. Andersson (Eds.), Sexual selection: Testing the alternatives (pp. 67-82). New York: John Wiley.
Koza, J. (1993): Genetic programming. Cambridge, MA: MIT Press/Bradford Books.
Lande, R. (1980): Sexual dimorphism, sexual selection and adaptation in polygenic characters. Evolution, 34, 292-305.
202
Lande, R. (1981): Models of speciation by sexual selection on polygenic characters. Proe. Nat. Acad. Sci. USA, 78, 3721-3725. Lande, R. (1987): Genetic correlation between the sexes in the evolution of sexual dimorphism and mating preferences. In J. W. Bradbury & M. B. Andersson (Eds.), Sexual selection: Testing the alternatives (pp. 83-95). New York: John Wiley. Liem, K. F. (1973): Evolutionary strategies and morphological innovations: Cichlid pharyngeal jaws. Systematic zoology, 22, 425-441. Liem, K. F. (1990): Key evolutionary innovations, differential diversity, and symecomorphosis. In M. Nitecki (Ed.), Evolutionary innovations (pp. 14%170). Chicago: U. Chicago Press. May, R. M. (1990): How many species? Phil. Trans..l~oyat Soc. London B, Biological Sciences, 330(1257), 293-304. May, R. M. (1992): How many species inhabit the earth? Scientific American, 267(4), 42-48. Maynard Smith, J. (1978): The evolution o] sex. Cambridge, UK: Cambridge U. Press. Mayr, E. (1942): Systematics and the origin of species. (Reprint edition 1982). New York: Columbia U. Press. Mayr, E. (1954): Change of genetic environment and evolution. In J. Huxley, A. C. Hardy, & E. B. Ford (Eds.), Evolution as a process (pp. 157-180). London: George Allen & Unwin. Mayr, E. (1960): The emergence of evolutionary novelties. In S. Tax (Ed.), Evolution after Darwin, Vol. I (pp. 349-380). Chicago: U. Chicago Press. Mayr, E. (1983): Animal species and evolution. Cambridge, MA: Harvard U. Press. McKinney, F. K. (1988): Multidisciplinary perspectives on evolutionary innovations. Trends in ecology and evolution, 3, 220-222. Miller, G. F. (1993): Evolution of the human brain through runaway sexual selection. Ph.D. thesis, Psychology Department, Stanford University. (To be published in 1995 by MIT Press.) Miller, G. F. (1994): Exploiting mate choice in evolutionary computation: Sexual selection as a process of search, optimization, and diversification. In T. C. 
Fogarty (Ed.), Evolutionary Computing: Proceedings of the t994 Artificial Intelligence and Simulation of Behavior (AISB) Society Workshop (pp. 65-79). Berlin: Springer-Verlag. Miller, G. F. (Accepted, a): Psychological selection in primates: The evolution of adaptive unpredictability in competition and courtship. To appear in A. Whiten & R. W. Byrne (Eds.), Machiavellian Intelligence II. Miller, G. F. (Accepted, b): Sexual selection in human evolution: Review and prospects. To appear in C. Crawford (Ed.), Evolution and human behavior: Ideas, issues~ and applications, ttillsdale, N J: Lawrence Erlbaum. Miller, G. F., & Cliff, D. (1994): Protean behavior in dynamic games: Arguments for the co-evolution of pursuit-evasion tactics in simulated robots. In D. Cliff, P. Husbands, J. A. Meyer, & S. W. Wilson (Eds.), From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior (pp. 411-420). Cambridge, MA: MIT Press/Bradford Books. Miller, G. F. & Freyd, J. J. (1993): Dynamic mental representations of animate motion: The interplay among evolutionary, cognitive, and behavioral dynamics. Cognitive Science Research Paper 290, University of Sussex. Submitted as a target article for Behavioral and Brain Sciences. Miller, G. F.~ K; Todd, P. M. (1993): Evolutionary wanderlust: Sexual selection with directional mate preferences. In J.-A. Meyer, H. L. Roitblat, & S. W. Wilson (Eds.), From Animals to Animats 2: Proceedings of the Second International Conference on
203
Simulation of Adaptive Behavior (pp. 21-30). Cambridge, MA: MIT Press/Bradford Books. Moiler, A. P., & Pomiankowski, A. (1993): Fluctuating asymmetry and sexual selection. Genetica, 89, 267-279. Morgan, C. L. (1888): Natural selection and elimination. Nature, Aug. 16, 370. Muller, G. B. (1990): Developmental mechanisms at the origin of morphological novelty: A side-effect hypothesis. In M. Nitecki (Ed.), Evolutionary innovations, pp. 99-130. Chicago: U. Chicago Press. Nitecki, M. (Ed.). (1990): Evolutionary innovations. Chicago: U. Chicago Press. O'Donald, P. (1980): Genetic models of sexual selection. Cambridge, UK: Cambridge U. Press. Paterson, H. E. H. (1985): The recognition concept of species. In E. S. Vrba (Ed.), Species and speciation, Transvaal Mus. Monogr. 4, 21-29. Patterson, B. D. (1988): Evolutionary innovations: Patterns and processes. Evolutionary trends in plants, 2, 86-87. Petrie, M. (1992): Peacocks with low mating success are more fikely to suffer predation. Animal Behaviour, 44, 585-586. Petrie, M., Halliday, T., & Sanders, C. (1991): Peahens prefer peacocks with elaborate trains. Animal Behaviour~ 41, 323-331. Pimental, D., Smith, G. J. C., & Soans, J. (1967) A population model of sympatric speciation. American Naturalist, 101(92P), 493-504. Pomiankowski, A. (1987): The costs of choice in sexual selection. J. Theoretical Biology, 128, 195-218. Pomiankowski, A. (1988): The evolution of female mate preferences for male genetic quality. Oxford Surveys in Evolutionary Biology, 5, 136-184. Pomiankowski, A. (1990): How to find the top male. Nature, 3$Z 616-617. Pomiankowski, A., Iwasa, Y., & Nee, S. (1991): The evolution of costly mate preferences. I. Fisher and biased mutation. Evolution, 45(6), 1422-1430. Raft, R. A., Par, B., Parks, A., & Wray, G. (1990): Radical evolutionary change in early development. In M. Nitecki (Ed.), Evolutionary innovations (pp. 71-98). Chicago: U. Chicago Press. Raft, R. A., & Raft, E. C. 
(Eds.): (1987): Development as an evolutionary process. New York: Alan R Liss. Romanes, G. J. (1897): Darwin, and after Darwin. IL Post-Darwinian Questions. Heredity and Utility (2nd ed.). Chicago: Open Court Pubfishing. Ryan, M. J. (1986): Neuroanatomy influences speciation rates among anurans. Proc. Nat. Acad. Sci. USA, 83, 1379-1382. Ryan, M. J. (1990): Sexual selection, sensory systems, and sensory exploitation. Oxford Surveys of Evol. Biology, 7, 156-195. Ryan, M. J., & Keddy-Hector, A. (1992): Directional patterns of female mate choice and the role of sensory biases. American Naturalist, 139, $4-$35. Schuster, P. (1988): Stationary mutant distributions and evolutionary optimization. Bull. Mathematical Biology, 50(6), 635-660. Simon, P. (1992): The action plant: Movement and nervous behaviour in plants. Cambridge, MA: Blackwell. Simpson, G. (1944): Tempo and mode in evolution. New York: Columbia U. Press. Simpson, G. (1953): The major features of evolution. New York: Columbia U. Press. Sprengel, C. K. (1793): Das entdeekte Geheimnis der Natur im Bau und in der Befruchtung der Blumen. (The secret of nature revealed in the structure and pollination of flowers.) Berlin: F. Vieweg. (Reprinted 1972 by J. Cramer, Lehre.)
204 Szalay, F. S. & Costello, R. K. (1991): "Evolution of permanent estrus displays in hominids." J. Human Evolution, 20, 439-464. Sullivan, B. K. (1989): Passive and active female choice: A comment. Animal Behaviour, 37(g), 692-694. Thoday, J. M. (1972): Disruptive selection. Proc. of the Royal Soc. of London B, 182, 109-143. Todd, P. M. (in press): Sexual selection and the evolution of learning. To appear in R. Belew & M. Mitchell (Eds.), Adaptive individuals in evolving populations: Models and algorithms. Reading, MA: Addison-Wesley. Todd, P. M., & Miller, G. F. (1991): On the sympatric origin of species: Mercurial mating in the Quicksilver Model. In R. K. Belew & L. B. Booker (Eds.), Proceedings of the Fourth International Conference on Genetic Algorithms (pp. 547-554). San Mateo, CA: Morgan Kaufmann. Todd, P. M. & Miller, G. F. (1993): Parental guidance suggested: How parental imprinting evolves through sexual selection as an adaptive learning mechanism. Adaptive Behavior, 2(1), 5-47. Todd, P. M. & Miller, G. F. (in preparation): The role of mate choice in biocomputation H: Applications of sexual selection in search and optimization Todd, P. M., & Wilson, S. W. (1993): Environment structure and adaptive behavior from the ground up. In J.-A. Meyer, H. L. Roitblat, & S. W. Wilson (Eds.), From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior (pp. 11-20). Cambridge, MA: MIT Press/Bradford Books. Weismann, A. (1917): The selection theory. In Evolution in modern thought, by Haeckel, Thomson, Weismann, and Others (pp. 23-86)~ New York: Boni and Liveright. Williams, G. C. (1975): Sex and evolution. Princeton: Princeton U. Press. Willson, M. F., and Burley, N. (1983): Mate choice in plants: Tactics, mechanisms, and consequences. Princeton: Princeton U. Press. Wright, S. (1932): The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Proc. Sixth Int. Congr. Genetics, 1, 356-366. Wright, S. 
(1982): Character change, speciation, and the higher taxa. Evolution, 36, 427-443. Wyles, J. S., Kunkel, J. G., & Wilson, A. C. (1983): Birds, behavior, and anatomical evolution. Proc. Nat. Acad~ Sci. USA, 80, 4394-4397. Van Valen, L. M. (1971): Adaptive zones and the orders of mammals. Evolution, 16, 125-142. Vrb~, E. S. (1983): Macroevolutionary trends: New perspectives on the roles of adaptation and incidental effect. Science, 221,387-389. Vrba, E, S. (1985): Environment and evolution: Alternative causes of the temporal distribution of evolutionary events. South African Journal of Science, 81,, 229-236. Vrba, E. S., & Gould, S. J. (1986): The hierarchical expansion of sorting and selection: Sorting and selection cannot be equated. Paleobiology, 12, 217-228. Zahavi, A. (1975): Mate selection: A selection for a handicap. Journal of Theoretical Biology, 53, 205-214.
Genome Growth and the Evolution of the Genotype-Phenotype Map

Lee Altenberg*

Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708-0251 U.S.A.

The evolution of new genes is distinct from evolution through allelic substitution in that new genes bring with them new degrees of freedom for genetic variability. Selection in the evolution of new genes can therefore act to sculpt the dimensions of variability in the genome. This "constructional" selection effect is an evolutionary mechanism, in addition to genetic modification, that can affect the variational properties of the genome and its evolvability. One consequence is a form of genic selection: genes with large potential for generating new useful genes when duplicated ought to proliferate in the genome, rendering it ever more capable of generating adaptive variants. A second consequence is that alleles of new genes whose creation produced a selective advantage may be more likely to also produce a selective advantage, provided that gene creation and allelic variation have correlated phenotypic effects. A fitness distribution model is analyzed which demonstrates these two effects quantitatively.

These are effects that select on the nature of the genotype-phenotype map. New genes that perturb numerous functions under stabilizing selection, i.e. with high pleiotropy, are unlikely to be advantageous. Therefore, genes coming into the genome ought to exhibit low pleiotropy during their creation. If subsequent offspring genes also have low pleiotropy, then genic selection can occur. If subsequent allelic variation also has low pleiotropy, then that too should have a higher chance of not being deleterious. The effects on pleiotropy are illustrated with two model genotype-phenotype maps: Wagner's linear quantitative-genetic model with Gaussian selection, and Kauffman's "NK" adaptive landscape model.

Constructional selection is compared with other processes and ideas about the evolution of constraints, evolvability, and the genotype-phenotype map. Empirical phenomena such as dissociability in development, morphological integration, and exon shuffling are discussed in the context of this evolutionary process.

1 Introduction
In this chapter I discuss an evolutionary mechanism whose target is specifically the ability of genomes to generate adaptive variants. It is about the evolution of evolvability. The main focus of action for this process is the genotype-phenotype map (Wagner 1984, 1989), i.e. the way genetic variation maps to phenotypic variation. The genotype-phenotype map is the concept underpinning the classical concepts of pleiotropy, polygeny, epistasis, constraints, and gradualness.

* Internet: [email protected], edu.
The way that genetic variation maps to phenotypic variation is fundamental to whether or not that variation has the possibility of producing adaptive change. Even when strong opportunity exists for new adaptations in an organism, many of its previously evolved functions will remain under stabilizing selection. Adaptation requires variation that is able to move the organismal phenotype toward traits under directional selection without greatly disturbing traits remaining under stabilizing selection. Variation that disturbs existing adaptations as it produces new adaptations, i.e. variation which is pleiotropic, will have difficulty producing an overall fitness advantage. Other aspects of the genotype-phenotype map that affect evolvability include:

- Gradualness: genetic changes with extreme effects are less likely to be advantageous;
- Rugged landscapes: adaptive changes that require the simultaneous altering of several genes are less likely to evolve; and
- Constraints: adaptations for which no genetic variability exists are unable to evolve.
The question of whether the genotype-phenotype map has evolved so as to systematically affect evolvability has been dealt with in a variety of ways in the literature. Approaches include the following:

The genome as fluid: Evolvability is not limited; genetic variation exists within populations for any trait one wishes to select on.

The internalist view: The degree of evolvability is a byproduct of the physics of development. It is fortunate that physics permitted evolvability.

Lineage selection: Different developmental systems may have different evolvabilities; those which happen to have high evolvability will proliferate as species lineages.

Genetic modification: Selection for adaptedness happens to systematically produce high evolvability.

This paper adds an additional hypothesis to this list:

Constructional selection: Selection during the origin of genes provides a filter on the construction of the genotype-phenotype map that naturally produces evolvability.

The internalist viewpoint is what this paper will take issue with most. The internalist viewpoint holds that the variational properties of the genotype-phenotype map are the result of the physics of development (Goodwin 1989). The process of morphogenesis is proposed as a complex dynamical system toward which genes contribute, but which has internal macroscopic properties that determine what kinds of phenotypic variability exist. One can ask, however, whether morphogenetic dynamics could have been shaped by evolutionary forces that systematically affect the nature of developmental constraints, or the smoothness of the adaptive landscape, or its evolvability. Here I discuss an evolutionary mechanism by which selection can come
to act indirectly on evolutionary potential, as a consequence of how genes come into being in the first place.

The main idea, in a nutshell, is this: the genes that stably exist in a genome share the common feature that, when they were created, they produced a selective advantage for the organism. But when a new gene is created, it not only produces its current phenotypic effect, but carries with it a new "neighborhood" in "sequence space": the kinds of variants that it can in turn give rise to. The phenotypic character of this neighborhood depends on the gene's mode of action. Different modes of gene action can be expected to have different overall likelihoods of producing adaptive variants. The fact that a gene's existence is predicated on its having originally produced a selective advantage means that the accumulation of new genes in the genome should be biased toward modes of action whose variants are more likely to be fruitful in adaptation.

Since there is a diversity of modes of gene action, the question remains as to why there are the kinds there are, in the frequencies they are found, within the genomes of organisms. This chapter presents a theory about the statistical properties of genotype-phenotype maps, and how these statistics would be expected to change in the course of the evolutionary construction of the genome toward ways that facilitate the generation of adaptive variants.

There are two basic aspects to the idea of a genotype-phenotype map. One can think of the genotype as a "representation" or description of the phenotype. Representation has two aspects: generative and variational. The generative aspect of a representation is how the representation is actually used to produce the object, which in genetics would be the process of gene expression and its integration in development. It is not the mechanisms by which this map is accomplished that are relevant to evolvability; rather, what matters is the variational aspect of a representation: how changes in the representation map to changes in the object. Variational aspects can be described by their statistical properties without having to deal with the generative mechanisms. The principal variational aspect I will be concerned with is pleiotropy, the constellation of phenotypic effects from single mutations.
1.1 Bonner's Low Pleiotropy Principle
Bonner (1974) has articulated a basic "design principle" for the genotype-phenotype map necessary to allow the generation of adaptive variants through random genetic variation, a principle of low pleiotropy:

"We presume that it is of a distinct advantage to keep a number of the units of gene action of the organism quite independent of one another. The reason for this seems straightforward: mutations that affect a number of construction units are more likely to be lethal than those that affect only one. Or to put it another way, the fewer the interconnections of gene action (the less the pleiotropy), the greater the chances of its being a viable mutant. A viable mutant may be one that appears late in development, such as the pigmentation of hair, eyes, or feathers, or one that acts in a small developmental unit that is independent of the others." (1974, p. 61)

Lewontin (1978) proposed the low pleiotropy principle in a somewhat different manner, as a principle of "quasi-independence", i.e. that there must be "a great variety of alternative paths by which a given characteristic may change, so that some of them will allow selection to act on the characteristic without altering other characteristics of the organism in a countervailing fashion: pleiotropic and allometric relations must be changeable."

However, this design principle suffers from the "for the good of the species" problem. Even though a property might be "good for the species", it can only evolve if organisms bearing it (or "replicators", to be more general (Brandon 1990)) have higher fitness. Although it would be a marvelous design for the organism to have a genome organized for its future adaptive potential, this future advantage does not give an organism the present advantage it needs in order to pass on such a trait.
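Bonner's verbal argument can be put in rough quantitative form. The following is only an illustrative sketch under assumptions that are not in the quoted passage: a variant gains a factor $1+s$ (with $s>0$) on a function under directional selection, it perturbs $k$ functions under Gaussian stabilizing selection by independent amounts $x_i \sim N(0,\sigma^2)$, and fitness before the change is normalized to 1. The symbols $s$, $k$, $\sigma$ and $x_i$ are introduced here for illustration only.

\[
w \;=\; (1+s)\prod_{i=1}^{k} e^{-x_i^{2}/2},
\qquad
w > 1 \;\Longleftrightarrow\; \sum_{i=1}^{k} x_i^{2} \;<\; 2\ln(1+s),
\]
\[
\Pr[\,w>1\,] \;=\; \Pr\!\left[\chi^{2}_{k} \;<\; \frac{2\ln(1+s)}{\sigma^{2}}\right].
\]

Because a chi-square variable with $k+1$ degrees of freedom is stochastically larger than one with $k$, this probability falls as $k$ grows. Under these assumptions, the less pleiotropic variant is the more likely to be a net improvement, which is the content of the low pleiotropy principle.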
2 Constructional Selection
All variational aspects of the genotype-phenotype map face the "good of the species" problem, because variation is not the phenotype of an organism, but a property of genetic transmission between organisms. How, therefore, can organismal selection get a "handle" on the processes that produce variation?

The general answer to this question is that there must be correlations between variational properties and properties affecting organismal fitness. These correlations can come about through diverse means. In the case of variational properties like recombination and mutation rates, correlations can be induced by the evolutionary dynamics of modifier genes, i.e. genes that control recombination, mutation, and so forth. Genes modifying recombination rates, for example, can evolve linkage associations to genes under selection whose transmission they affect. In this case, it is the modifier gene that provides natural selection with the "handle" to change recombination rates (Liberman and Feldman 1986, Altenberg and Feldman 1987).

Modifier genes are rather specialized mechanisms. But here I consider a means by which selection can gain a handle on the variational properties of any gene, through the selective forces operating during the origin of the gene. All genes face the problem of selection during their creation, and those genes that produce a selective disadvantage never become stably incorporated in the genome. Therefore, existing genes share the common history of having once produced a selective advantage to the organism. But new genes bring with them new degrees of freedom for variability in the genome. These new degrees of freedom are of two types:

Type I: new genes serve as new templates for further genome growth, and

Type II: new genes afford new sites at which allelic variation can occur.
The phenotypic effects of either of these new degrees of freedom depend on the physical nature of the gene's action. And the gene's mechanism of action is unlikely to change radically between its creation and subsequent gene duplications and allelic variations. Therefore it is reasonable to expect a correlation to exist between the phenotypic or fitness effects of a newly created gene and subsequent duplications and allelic changes. This then is a means by which variational properties of the genome can become correlated with organismal selection. Therefore, without the postulation of additional modifier genes, selection during the creation of new degrees of freedom for genetic variability can gain a handle on the quality of those degrees of freedom. The strength of this handle depends on the strength of the correlations. When referring to this process, I will summarize it with the term "constructional selection", since it is tied to the construction of new genes (Altenberg 1985).

2.1 Riedl's Theory
Riedl's (1977) theory for the evolution of "genome systemization" is the main earlier example of a constructional selection theory for the genotype-phenotype map. He considers the situation where functional interactions arise in the organism that require the coordinated change of several phenotypic characters in order to produce adaptive variants. When this would require simultaneous mutations at several genes, he argues that the evolution of a new gene that produces the needed coordinated variability, a "superimposed genetic unit", is a far more likely possibility. Thus Riedl is proposing that the genotype-phenotype map can evolve in directions that facilitate adaptation through selective genome growth.

2.2 Fine Points
It is important at this point to be clear that this is not an argument that most adaptive evolution happens through the origin of new genes, as opposed to allelic substitution. Rather, I am proposing that the events surrounding the creation of new genes may play a special role in the evolution of the genotype-phenotype map because of their distinct property of adding new degrees of freedom to the genome. Also, it should be understood that "new genes" can refer equally to new parts of genes or new clusters of genes, i.e. new sections of DNA sequence that are of functional use to the organism. Therefore, the arguments here apply to such elements as exons, promoters, enhancers, operators, other regulatory elements, etc.

Throughout this chapter, pleiotropy must be understood to refer not to multiple effects on arbitrary "characters" of the organism, since these are artifacts of measurement and description, but to organismal functions that are components of adaptation, what Nemeschkal et al. (1992) refer to as a "unit of characters working together to accomplish a common biological role". Moreover, in the case of new genes, the sense of "multiple" effects that is germane as a definition of pleiotropy is that the gene not only produces variability for functions under directional selection, but also disturbs functions under stabilizing selection. "Low pleiotropy" will refer to genes that affect mainly functions under directional selection and leave functions under stabilizing selection unaffected.
2.3 Pleiotropy and Constructional Selection
Let us examine Bonner's low pleiotropy principle in the context of the genome growth process. New genes which have fewer pleiotropic effects when added to the genome, whose action causes the phenotype to change mainly in dimensions that are under directional selection, stand a better chance, by Bonner's principle, of providing a selective advantage. This would hold even if that chance is still slight. Genes which disturb many adapted functions of the phenotype are unlikely to be advantageous, and thus would not be incorporated in the genome. Therefore, selection can filter the pleiotropy of genes as they are added to the genome. If there is any correlation between the pleiotropic effects during the gene's addition and the pleiotropy of subsequent additions or allelic changes in the gene, then the genome will have expanded its degrees of freedom in directions with lower pleiotropy. The effects of constructional selection on the two forms of genetic variation, Type I and II above, are distinct, so each is taken up in turn.

2.4 Type I Effect: The Genome as Population
If there are correlations between the phenotypic effects of duplicated genes and the effects of their subsequent duplications during macroevolutionary time scales, then a novel form of "genic" selection process becomes possible. This selection process is based on looking at the genome as a "population" of genes, as in the case of genic selection in the evolution of transposable elements.

The idea that transposable elements are genetic parasites propagating within the genome (Cavalier-Smith 1977, Doolittle and Sapienza 1980, Orgel and Crick 1980) led to the idea that the genome could be considered a population of genes, within which a new level of selection can operate when certain sequences can proliferate within the genome. Such "genic" selection is usually associated with transposable elements, whose activity is generally in conflict with organismal selection. The type I effect, however, is a form of genic selection in harmony with organismal selection, which, moreover, has organismal selection as a sub-process.

Where do new genes come from? Although there is a certain amount of de novo synthesis of DNA in the genome, most genes originate from template-based duplication of existing sequences. And while the vast majority of gene duplications may go to extinction, the genes currently functioning in an organism will possess an unbroken backward genealogy to earlier, ancestral genes (complicated perhaps by the occasional reactivation or insertion of pseudogene sequences). So there exists an "intra-genomic phylogeny", which is actually beginning to be taken as an object of study as the accumulation of DNA sequences allows the construction of "gene-trees" (Dorit and Gilbert 1990, Dorit et al. 1991, Strong and Gutman 1992, Burt and Paton 1992, Klenova et al. 1992, Streydio et al. 1992, Haefliger et al. 1989).

If one picks any functioning gene in the genome, what would a typical story for its origin be? One could generally list:

1. Sequence duplication;
2. Fixation in the population, through selection or drift;
3. Maintenance of function by selection;
4. Sequence evolution under mutation and selection.
Differences in gene properties that systematically bias the chances of the above events can produce a Darwinian process on the level of genome-as-population. Darwinian processes have three basic elements: viability, fecundity, and heritability. If there exist properties which show heritable variation in viability or fecundity, those properties can evolve over time. Viability, fecundity, and heritability each have their analogs on the level of genome-as-population:
Viability: The viability of a gene is simply its survival as a functioning gene in the genome. This requires its maintenance against mutational degradation, or replacement with other genes, and would occur through organismal selection against deletions or gene silencing mutations.
Fecundity: The fecundity of a gene is the rate at which it gives rise to other functional genes in the genome. This depends on:
1. The overall rate at which duplications of the gene are produced; and
2. The probability that a duplication becomes established in the genome as a new, functional gene. This in turn depends on:
   (a) there being adaptive opportunity for properties of the sequence;
   (b) the sequence having functional properties which are not disrupted by new functional contexts; and
   (c) the sequence having properties that allow its duplication without disrupting existing functions of genes with which it interacts.
Heritability: Heritability here refers to ancestral and offspring genes having correlated properties, and depends on:
1. Conservation of the property of a gene over the time scale on which gene duplications occur; and
2. Carry-over of the property from ancestral to offspring genes.

In each case above, one could just as well substitute "genetic element" for "gene", since the principles apply equally well to exons, promoters, regulatory sequences, and so forth.

If there are systematic differences between sequences in the likelihood that duplications of them give rise to useful new genes (fecundity), and these different likelihoods are conserved between gene origins, and carried from ancestral to offspring genes (heritability), then the genome will become populated with genes that are better able to give rise to other genes. The type I, or "genic selection", effect of constructional selection, therefore, is to increase the genome's ability to evolve new genes. This is an effect on the variational properties of the genome.

The genome-as-population analogs of viability, fecundity and heritability in the type I effect can be contrasted with these analogs in the case of transposable elements. For such "selfish" DNA, viability as genes is low: on a macroevolutionary time scale, individual copies of transposons are transient, since they exist either as transient allelic polymorphisms or, if they ever go to fixation, are deleted or silenced rapidly because as alleles they are usually neutral or deleterious, and genetically unstable. The fecundity of transposons in the genome, however, is unsurpassed, and overcomes their sub-viability in the genome as individual copies. Their fecundity is due not to their probability of being useful to the organism (item 2 under Fecundity, above), but to the sheer rate at which copies are produced (item 1 under Fecundity, above). Furthermore, their heritability as genes is extremely high. Thus the type I effect of constructional selection and "selfish DNA" are two kinds of genic selection, and are in a sense opposite points within a continuum defined by the genome-as-population analogs of the Darwinian elements: viability, fecundity, and heritability.
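The type I effect described above can be illustrated with a toy simulation. This is only a sketch under strong simplifying assumptions, not the fitness distribution model analyzed in this chapter. Each gene carries a single hypothetical number, its probability that a duplicate becomes established as a functional gene. All genes are duplicated at the same rate and are equally viable, and an offspring gene inherits its parent's value with a small amount of noise. The function name `grow_genome` and all parameter values are made up for illustration.

```python
import random


def grow_genome(n_initial=20, n_attempts=3000, noise=0.02, seed=0):
    """Toy genome-as-population process (illustrative assumptions only).

    Each gene is represented by one number in [0, 1]: the probability that
    a duplicate of it becomes established as a new functional gene.
    """
    rng = random.Random(seed)
    # Founding genes differ in their establishment probability.
    genome = [rng.uniform(0.05, 0.5) for _ in range(n_initial)]
    initial_mean = sum(genome) / len(genome)

    for _ in range(n_attempts):
        # Fecundity, item 1: every gene is duplicated at the same rate.
        parent = rng.choice(genome)
        # Fecundity, item 2: the duplicate is established with a
        # probability set by the parent's property.
        if rng.random() < parent:
            # Heritability: the offspring gene resembles its parent.
            child = min(1.0, max(0.0, parent + rng.gauss(0.0, noise)))
            genome.append(child)

    final_mean = sum(genome) / len(genome)
    return initial_mean, final_mean, len(genome)


if __name__ == "__main__":
    start, end, size = grow_genome()
    print(f"mean establishment probability: {start:.3f} -> {end:.3f}")
    print(f"genome size: {size} genes")
```

In this sketch the mean establishment probability rises as the genome grows, because genes whose duplicates are more often useful leave more descendant genes. Setting the establishment probability to a constant while letting the duplication rate vary would instead mimic the transposon-like end of the continuum.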
2.5 Type II Effect: Correlated Allelic Variation
The type II effect is where the genes that are stably incorporated into the genome also have an enhanced likelihood that some of their allelic variants will also produce a selective advantage, by varying the phenotype along the same "lines" as occurred during the gene's original incorporation in the genome. By "enhanced", I mean relative to the effects of allelic variation at all the genes that were generated by duplication processes, but never fixed in the population and maintained by selection.

If the pleiotropy of a gene is a relatively fixed result of its mode of action, then there will be a correlation between the phenotypic effects of the gene's origin and its subsequent allelic variation. If low pleiotropy helped the gene become established in the first place, then the subsequent low pleiotropy of its allelic variants would enhance their likelihood of being adaptive rather than universally deleterious.

An important case of the correlated allelic variation effect is "function splitting", where a gene that has been selected as a compromise for carrying out several organismal functions is duplicated and the separate copies can evolve to specialize in some subset of functions. An example is the duplication of the hemoglobin gene and its specialization for fetal or postnatal oxygen transport conditions. In this case, the duplication causes changes in the genotype-phenotype maps of both resulting genes, with the net result of lowering the pleiotropy of allelic variation at these genes, and better optimization of the adaptive functions. This is an area which has already received a good deal of empirical and theoretical study (Ohta 1988, 1991, Kappen et al. 1989, Li 1985).
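The comparison made above, between allelic variants of established genes and those of genes that never became established, can be sketched numerically. The code below is a hedged illustration and not the model analyzed in this chapter. It assumes that each candidate gene has a pleiotropy `k` fixed by its mode of action, and that any variant offers a gain `s` on a directionally selected function while perturbing `k` functions under Gaussian stabilizing selection. It also assumes that an allelic variant has exactly the same pleiotropy as the gene's creation event, i.e. perfect correlation. All names and parameter values are hypothetical.

```python
import math
import random


def is_advantageous(k, s, sigma, rng):
    """True if a variant disturbing k stabilized functions is a net gain."""
    # Log-fitness cost of Gaussian stabilizing selection on k functions.
    cost = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(k)) / 2.0
    # Log-fitness gain on the function under directional selection.
    return math.log1p(s) - cost > 0.0


def type_two_effect(n_candidates=20000, max_k=8, s=0.1, sigma=0.5, seed=1):
    """Fraction of advantageous allelic variants, by establishment history."""
    rng = random.Random(seed)
    counts = {"established": [0, 0], "never_established": [0, 0]}
    for _ in range(n_candidates):
        k = rng.randint(0, max_k)  # pleiotropy set by the mode of action
        # Creation event: the gene is kept only if it is advantageous.
        created_ok = is_advantageous(k, s, sigma, rng)
        # Later allelic variant at the same locus, with the same pleiotropy.
        variant_ok = is_advantageous(k, s, sigma, rng)
        key = "established" if created_ok else "never_established"
        counts[key][0] += 1
        counts[key][1] += int(variant_ok)
    return {key: good / total for key, (total, good) in counts.items()}


if __name__ == "__main__":
    for history, rate in type_two_effect().items():
        print(f"{history}: {rate:.3f} of allelic variants advantageous")
```

Under these assumptions the established genes show a higher fraction of advantageous allelic variants, because establishment has filtered them toward low `k`. Weakening the correlation, for example by redrawing `k` for the allelic variant with some probability, would shrink the difference.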
The type II effect is entirely dependent on there being correlations between the phenotypic effects of a new gene and the effects of allelic variation at that locus. For genes of recent origin, correlations would be expected. However, over time these correlations would be expected to weaken due to several factors. First, substantial sequence changes may occur as the gene diverges in function from that of its ancestral state. Second, whatever novel advantage the gene may have offered when it first arose will tend to change from being a "luxury" to being a necessity, as other functions evolve conditioned on the current state of that gene. This is what Riedl (1977) calls "burden" (and what Wimsatt and Schank call "generative entrenchment" (Schank and Wimsatt 1987, Wimsatt and Schank 1988)). Histones, polymerases, snRPs, etc., are extreme examples of burdened genes, since effectively all characters of the organism depend on them; their mutations are of necessity highly pleiotropic, and they are extremely well conserved. So over macroevolutionary time scales, the correlated allelic variation effect may become "stale" once a gene is in place. The low pleiotropy might be kept "fresh", however, if changing selection or polymorphism produces a history of variation in the gene to which other genes coadapt.

2.6 An Overall Picture of Genome Growth.
These considerations lead to the following picture of the intra-genomic phylogeny: There should be a static core of genes which have ceased to give rise to new genes in the genome; these may be extremely ancient and functionally burdened, or so highly specialized as to have little adaptive potential for duplications. Once genes enter this core, they should tend to remain there (though they may continue sequence evolution). There should in addition be a "growth front" in the genome consisting of genes that are prolific in generating offspring genes. The growth front would gradually lose genes to the static core once they were created, but would be renewed by the influx of newly created genes, which would be the most likely to give rise to the next set of new genes. On occasion, static genes would be revived into the growth front by new adaptive opportunities conferred by changes in organismal selection. In addition, there would be the various "exceptional" families of genes, including transposable elements, highly repetitive genes selected for quantity production, "junk" and structural DNA, and so forth.

2.7 Constraints and Latent Directional Selection.
An examination of the situations discussed in the literature in which the genotype-phenotype map constrains evolution shows them to be of two basic kinds: kinetic and range constraints. A range constraint is simply where no genetic variation exists for a phenotype or a specific combination of phenotypic changes. Kinetic constraints emerge from the population genetic dynamics when the probability of creating given phenotypic variants is vanishingly low. A softer version of this is a kinetic bias, in which the most probable variant that responds to a selective pressure has specific phenotypic forms. The problem of adaptation on "rugged fitness landscapes" (Kauffman 1989a) is an example of kinetic constraints, in that what keeps a population at a local fitness peak is the improbability of generating fitter variants (in fact it is transmission probabilities that define what a neighborhood is in the sequence space). This includes the situation considered by Riedl (1977), where mutations are needed at several loci to produce a given phenotype.

The general consequence of either range or kinetic constraints is that to varying extents, organisms will be suboptimally adapted. There may be phenotypes that would be more adapted if only the genome could produce them. The population may have reached a mutation-selection balance, in which new variants are all deleterious, and so appear to be at an adaptive peak, when the lack of fitter variants is due to kinetic or range constraints. In such cases one could say that there exists a "latent" directional selection, which would become visible if genetic variation existed in this direction.

Riedl's idea is that much of the adaptive opportunity for the evolution of new genes may come from latent directional selection. But constructional selection effects would apply to conditions of normal directional selection as well. There would be adaptive opportunity for any new gene whose effects on the phenotype were in the direction of the current directional selection on the organism. Therefore, genes may to some extent reflect the historical sequence of directional selection experienced by the organism's lineage. Even ancient and highly functionally burdened genes may reveal the functions they conferred in their origin. For example, homeotic mutations which change insect segment identity are universally deleterious.
But if an alteration of segment identity was what the gene did when it was created (and thereby presupposed to have been selectively advantageous), then the gene's current function may be a reflection of the directional selection that existed at the time of its origin.
2.8 Models Illustrating Constructional Selection
To give explicit mathematical form to the ideas sketched so far about genome growth, several models will be developed. The first is a simple model showing both type I and II effects, which uses probability distributions of fitness effects for gene additions and subsequent allelic variation. The analysis shows the exponential quality of the genic selection effect, and the dependence on correlations in the correlated allelic variation effect. The second and third models are further illustrations of the correlated allelic variation effect, using as concrete examples of genotype-phenotype map functions: 1. Wagner's linear quantitative-genetic model with Gaussian stabilizing selection (Wagner 1989); and 2. Kauffman's (1989a) epistatic "NK" adaptive landscape model. The linear model illustrates latent directional selection arising from constraints on the range of phenotypic variation produced by the genotype, and exhibits selection for new genes that overcome these range constraints. The NK model illustrates latent directional selection arising from kinetic constraints due to the ruggedness of the adaptive landscape, and exhibits selection for genes that overcome the kinetic constraints and produce smoother adaptive landscapes. The Discussion follows, with an overview of the results, an examination of relevant empirical phenomena, and a discussion of the relation of constructional selection to current thinking about the evolution of evolvability.

3 A Fitness Distribution Model
The effects of constructional selection can be described directly in terms of the fitness distributions of new mutations, without having to specify the genotype-phenotype maps that give rise to these distributions. In the case of the genic selection effect, the mutation is a gene duplication; in the case of the correlated allelic variation effect, the mutation is an allelic change. In this model, a new gene is randomly created from the existing genes in the genome. Selection then determines whether the gene is kept in the genome. The model considers what happens when either allelic mutations or subsequent gene duplications occur. The genes in the population come in different types that determine the fitness distribution of their mutations. The main elements in the model are as follows. Let:

- $\mathcal{G}$ be the space of different types;
- $p_i$ be the probability that a newly created gene is of type $i \in \mathcal{G}$;
- $w$ be the fitness of the genome with the new gene, relative to its value before the addition;
- $f_i(w)$ be the probability that a new gene of type $i$ has relative fitness $w$;
- $x_i$ be the probability that a new gene of type $i$ is kept in the genome by selection.

The probability density $f_i(w)$ would be the result of the phenotypic properties of the gene, as described in item 2 under Fecundity in Sect. 2.4, including its pleiotropy, modularity, and adaptive opportunity. A concrete illustration is developed in Sect. 5, on Kauffman's NK adaptive landscapes. In a simple-minded approach, a gene would be kept by selection if it increased fitness, i.e. if $w > 1$. Then the probability that the gene is kept is

$$x_i = \int_{1}^{\infty} f_i(w)\, dw .$$
But in finite populations, or in any population dynamics where there is a chance that a gene will not be passed down to any offspring, even a gene increasing fitness can sometimes be lost from the population. The probability that a new gene is successfully incorporated in the genome will be some increasing function $\phi$ of its fitness $w$. Classical results using branching process models or diffusion approximations give a success probability of 0 if $w < 1$, and $\phi \approx 2(w - 1)$ for $w \gtrsim 1$ (Haldane 1927, Crow and Kimura 1970). So a more general formula for the likelihood that a new gene of type $i$ is fixed is:

$$x_i = \int_{0}^{\infty} \phi(w)\, f_i(w)\, dw . \tag{1}$$
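Formula (1) can be evaluated numerically once a fitness density is chosen. The following is a minimal sketch, not the author's computation: it assumes Haldane's $2(w - 1)$ approximation clipped to $[0, 1]$ for the success probability, and a hypothetical Gaussian density for $f_i(w)$. The function names and parameter values are illustrative only.

```python
import math

def haldane_phi(w):
    """Approximate probability that a new gene with relative fitness w is fixed.

    Assumption: the classical 2(w - 1) approximation, clipped to [0, 1];
    it is only accurate for w slightly above 1.
    """
    return min(1.0, max(0.0, 2.0 * (w - 1.0)))

def gaussian_density(mean, sd):
    """Hypothetical fitness density f_i(w) for one gene type (illustrative only)."""
    norm = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    return lambda w: norm * math.exp(-0.5 * ((w - mean) / sd) ** 2)

def fixation_probability(f, phi, w_max=2.0, steps=20000):
    """Midpoint-rule estimate of x_i = integral from 0 to w_max of phi(w) f(w) dw."""
    h = w_max / steps
    return sum(phi((k + 0.5) * h) * f((k + 0.5) * h) for k in range(steps)) * h

# Two hypothetical gene types: one mostly deleterious, one nearly neutral.
x_deleterious = fixation_probability(gaussian_density(0.90, 0.05), haldane_phi)
x_near_neutral = fixation_probability(gaussian_density(0.99, 0.05), haldane_phi)
```

Under these assumptions the type whose fitness density sits closer to $w = 1$ has the larger fixation probability, although both values remain small because most of each density lies below 1.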
The fixation probability over all randomly created new genes is:

$$\bar{x} = \sum_{i \in \mathcal{G}} x_i\, p_i .$$

With these definitions, results for both the genic selection effect and the correlated allelic variation effect will be derived.

3.1 The Correlated Allelic Variation Effect
Here we will see how selection on the creation of new genes can cause subsequent allelic variation of the genes to be more likely to be adaptive. We will look at the fitness distributions of alleles from all new genes and from only those genes that selection stably incorporates into the genome. Suppose that a newly created gene of type $i$ gives rise to allelic variants. Let the allelic fitnesses, $w'$, be distributed with probability density $a_i(w')$. No assumptions need to be made about this density, so it would certainly include the biologically plausible case in which most of the alleles are deleterious. For a gene of type $i$, we see that the proportion

$$A_i(w) = \int_{w}^{\infty} a_i(y)\, dy$$

of its alleles are fitter than $w$.

Result 1 (Correlated allelic variation) Let $A(w)$ be the proportion of new alleles of randomly created genes that are fitter than $w$, and $A^*(w)$ be the proportion of new alleles of stably incorporated genes that are fitter than $w$. Then

$$A^*(w) = A(w) + \mathrm{Cov}[A_i(w),\, x_i/\bar{x}] . \tag{2}$$

Proof. The proportion of alleles that are fitter than $w$, among randomly created genes, is

$$A(w) = \sum_{i \in \mathcal{G}} A_i(w)\, p_i ,$$

while among genes that are stably incorporated in the genome it is

$$A^*(w) = \Pr[w' > w \mid \text{the gene was fixed}] = \frac{\Pr[w' > w \text{ and the gene was fixed}]}{\Pr[\text{the gene was fixed}]} = \sum_{i \in \mathcal{G}} A_i(w)\, x_i\, p_i / \bar{x} = A(w) + \mathrm{Cov}[A_i(w),\, x_i/\bar{x}] . \qquad \blacksquare$$
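Result 1 is an algebraic identity, so it can be checked on a small numerical example. The sketch below uses three hypothetical gene types with made-up values of $p_i$, $x_i$ and $A_i(w)$ at one fixed threshold $w$; none of these numbers come from the text.

```python
# Hypothetical gene types: (p_i, x_i, A_i(w)) for one fixed fitness threshold w.
# p_i: share of new genes of type i; x_i: fixation probability;
# A_i: proportion of the type's alleles that are fitter than w.
types = [
    (0.70, 0.001, 0.01),
    (0.25, 0.010, 0.05),
    (0.05, 0.100, 0.20),
]

x_bar = sum(p * x for p, x, _ in types)                 # mean fixation probability
A_all = sum(p * A for p, _, A in types)                 # A(w): all new genes
A_fixed = sum(p * x * A for p, x, A in types) / x_bar   # A*(w): fixed genes only

# Covariance of A_i with the relative fixation probability x_i / x_bar,
# taken over the distribution p_i of newly created genes.
mean_rel = sum(p * x / x_bar for p, x, _ in types)      # equals 1 by construction
cov = sum(p * A * x / x_bar for p, x, A in types) - A_all * mean_rel
```

Because the types that fix more often here are also the ones with more adaptive alleles, the covariance is positive and `A_fixed` exceeds `A_all`, as (2) predicts.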
If there is a positive correlation between the fixation probability

$$x_i = \int_{0}^{\infty} \phi(w)\, f_i(w)\, dw$$

of a new gene, and the fitness distribution

$$A_i(w) = \int_{w}^{\infty} a_i(y)\, dy$$

of its alleles, then $A^*(w)$ is greater than $A(w)$. Similarity between the functions $f_i(w)$ and $a_i(w)$ would produce a positive covariance. The biological foundation for a positive covariance would include: 1. there continuing to be adaptive opportunity for variation in the phenotype controlled by the gene, and 2. the same suite of phenotypic characters being affected by the alleles of the gene as were affected during the gene's origin. With these plausible and general provisions, we see how selection on new genes can also select on the fitness distributions of the alleles that these genes generate.
3.2 The Genic Selection Effect
Now we will see how selection on new genes can increase the chance that new genes are adaptive when created. We will examine how genes with a higher chance of producing adaptive variants tend to proliferate as the genome grows, as reflected in the evolution of $p_i$. The model I am considering is this: genes are randomly picked from the genome and copied. Their fitness effect determines whether they are stably incorporated in the genome. If they are, then the pool of genes subject to duplication is increased by one, and the process repeated. In this way genes of different types come to proliferate at different rates within the genome.

Consider the process of sequence duplication that is the starting point for the history of every gene (or part of a gene). One can think of the rate that a gene gives rise to new, successfully incorporated genes as its "constructional fitness". This will be the product of 1. the rate that copies of the gene are produced, and 2. the likelihood that they are fixed in the genome by having provided a selective advantage to the organism. While genetic elements such as transposons or highly repetitive sequences may proliferate because of factor 1, here I wish to consider only factor 2, and assume no systematic differences among sequences in the rate that gene copies are produced.
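The verbal model above (pick a gene at random, copy it, keep the copy with a type-dependent probability) can be sketched as a small stochastic simulation. This is only an illustration under stated assumptions: two hypothetical gene types, copies that always inherit the parent's type, and arbitrary fixation probabilities; the function names are not from the text.

```python
import random

def grow_genome(x, counts, attempts, seed=0):
    """Simulate duplication attempts; return the final gene counts per type.

    x[i]      -- probability that a copy of a type-i gene is fixed
    counts[i] -- initial number of type-i genes
    Copies inherit the parent's type (perfect transmission is assumed).
    """
    rng = random.Random(seed)
    counts = list(counts)
    for _ in range(attempts):
        # Every gene is equally likely to be copied, so a type is picked
        # in proportion to its current count.
        i = rng.choices(range(len(counts)), weights=counts)[0]
        if rng.random() < x[i]:
            counts[i] += 1      # the copy is stably incorporated
    return counts

def mean_constructional_fitness(x, counts):
    """Average of x over the genes currently in the genome."""
    return sum(xi * n for xi, n in zip(x, counts)) / sum(counts)

x = [0.1, 0.9]                  # hypothetical constructional fitnesses
start = [50, 50]
end = grow_genome(x, start, attempts=2000)
```

Comparing `mean_constructional_fitness(x, start)` with `mean_constructional_fitness(x, end)` shows the genome becoming enriched for the type whose copies are more often retained.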
Perfect Transmission of the Gene's Type. I suppose for now that copies of genes of type $i$ are also of type $i$. Because the gene's type is transmitted from a gene to its offspring genes, this provides a correlation between the fitness effects of a new gene and its subsequent duplications. As in (1), a new gene of type $i$ will have probability $x_i$ of fixation due to its yielding a selective advantage. Let $n_i(t)$ be the number of genes in the genome of type $i$ at time $t$, $N(t) = \sum_{i \in \mathcal{G}} n_i(t)$ be the total number of genes in the genome at time $t$, so that the frequency of genes of type $i$ is $p_i(t) = n_i(t)/N(t)$, and $\alpha$ be the rate each gene is duplicated per unit time. One then obtains this differential equation for the change in the composition of the genome (approximating the number of genes with a continuum), using the fixation probability, $x_i$, for new genes of type $i$:

$$\frac{d}{dt} n_i(t) = \alpha\, x_i\, n_i(t) ,$$

which has solution:

$$n_i(t) = e^{\alpha x_i t}\, n_i(0) .$$

The ratio between the frequencies in the genome of sequences with different constructional fitnesses grows exponentially with the degree of difference between them:

$$\frac{n_i(t)}{n_j(t)} = e^{\alpha (x_i - x_j) t}\, \frac{n_i(0)}{n_j(0)} .$$
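Result 2 (Fisher's Theorem applied to genome growth) The average constructional fitness of the genome,

$$\bar{x}(t) = \sum_{i \in \mathcal{G}} x_i\, p_i(t) ,$$

which is the portion of new duplicated genes that go to fixation, increases at rate

$$\frac{d}{dt} \bar{x}(t) = \alpha\, \mathrm{Var}(x) > 0 .$$

Proof.

$$\frac{d}{dt} \bar{x}(t) = \sum_{i \in \mathcal{G}} x_i \left[ \frac{d n_i(t)}{dt} \Big/ N(t) - n_i(t)\, \frac{d N(t)}{dt} \Big/ N(t)^2 \right]$$

$$= \alpha \sum_{i \in \mathcal{G}} x_i \left[ \frac{x_i\, n_i(t)}{N(t)} - \frac{n_i(t)}{N(t)^2} \sum_{j \in \mathcal{G}} x_j\, n_j(t) \right]$$

$$= \alpha \left[ \sum_{i \in \mathcal{G}} x_i^2\, p_i(t) - \bar{x}(t)^2 \right] = \alpha\, \mathrm{Var}(x) > 0 . \qquad \blacksquare$$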
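Result 2 can be checked numerically against the closed-form solution for $n_i(t)$ under perfect transmission. In this sketch the duplication rate, the three constructional fitness values and the initial counts are arbitrary illustrative choices; the derivative of the mean is estimated by a central difference and compared with $\alpha\,\mathrm{Var}(x)$.

```python
import math

alpha = 0.5                          # duplication rate per gene (hypothetical)
x = [0.02, 0.05, 0.20]               # constructional fitness of each type
n0 = [100.0, 100.0, 100.0]           # initial gene counts

def counts(t):
    """Closed-form solution n_i(t) = exp(alpha * x_i * t) * n_i(0)."""
    return [math.exp(alpha * xi * t) * ni for xi, ni in zip(x, n0)]

def mean_x(t):
    """Average constructional fitness of the genome at time t."""
    n = counts(t)
    return sum(xi * ni for xi, ni in zip(x, n)) / sum(n)

def var_x(t):
    """Variance of constructional fitness over genes in the genome at time t."""
    n = counts(t)
    total = sum(n)
    return sum(xi * xi * ni for xi, ni in zip(x, n)) / total - mean_x(t) ** 2

t, h = 10.0, 1e-5
numeric_rate = (mean_x(t + h) - mean_x(t - h)) / (2 * h)   # central difference
predicted_rate = alpha * var_x(t)                          # Result 2
```

The two rates agree to within numerical error, and the rate is positive whenever the types differ in constructional fitness.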
This result is Fisher's fundamental theorem of Natural Selection (Fisher 1930), but here, what is evolving is the probability of gene duplications giving rise to new useful genes.

Imperfect Transmission of the Gene's Type. The model can be extended to less-than-perfect heritability of constructional fitness by defining a transmission function, $T(i \leftarrow j)$, which is the probability that a gene of type $j$ gives rise to a copy of type $i$ (Slatkin 1970, Altenberg and Feldman 1987). It satisfies the conditions

$$\sum_{i \in \mathcal{G}} T(i \leftarrow j) = 1 \text{ for all } j \in \mathcal{G}, \quad \text{and} \quad T(i \leftarrow j) \ge 0 \text{ for all } i, j \in \mathcal{G} .$$

Here, the fraction of the new genes that are of type $i$ is

$$p_i(t) = \sum_{j \in \mathcal{G}} T(i \leftarrow j)\, n_j(t) / N(t) .$$
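The dynamics now become:

$$\frac{d}{dt} n_i(t) = \alpha\, x_i \sum_{j \in \mathcal{G}} T(i \leftarrow j)\, n_j(t) .$$

Price's Covariance and Selection theorem (Price 1970, 1972) emerges when we consider selection in the presence of arbitrary transmission:

Result 3 (Price's Theorem applied to genome growth) For a gene of type $j$, let

$$\xi_j = \sum_{i \in \mathcal{G}} x_i\, T(i \leftarrow j)$$

be the fraction of its duplicate offspring genes that are stably incorporated in the genome. Then the rate of change in the average constructional fitness of the genome evaluates to

$$\frac{d}{dt} \bar{x}(t) = \alpha \left\{ \mathrm{Cov}(\xi, x) + [\bar{\xi}(t) - \bar{x}(t)]\, \bar{x}(t) \right\} ,$$

where

$$\bar{\xi}(t) = \sum_{i \in \mathcal{G}} \xi_i\, p_i(t), \quad \text{and} \quad \mathrm{Cov}(\xi, x) = \sum_{i \in \mathcal{G}} \xi_i\, x_i\, p_i(t) - \bar{\xi}(t)\, \bar{x}(t) .$$

Proof. The portion of gene duplications that go to fixation is

$$\bar{x}(t) = \sum_{i \in \mathcal{G}} x_i\, p_i(t) = \sum_{i,j \in \mathcal{G}} x_i\, T(i \leftarrow j)\, n_j(t) / N(t) = \sum_{j \in \mathcal{G}} \xi_j\, n_j(t) / N(t) .$$

This changes at the rate:

$$\frac{d}{dt} \bar{x}(t) = \sum_{i,j \in \mathcal{G}} x_i\, T(i \leftarrow j) \left[ \frac{d n_j(t)}{dt} \Big/ N(t) - \frac{d N(t)}{dt}\, n_j(t) \Big/ N(t)^2 \right]$$

$$= \alpha \sum_{i,j \in \mathcal{G}} x_i\, T(i \leftarrow j) \left[ x_j \sum_{k \in \mathcal{G}} T(j \leftarrow k)\, n_k(t) / N(t) - n_j(t) \sum_{k,h \in \mathcal{G}} x_k\, T(k \leftarrow h)\, n_h(t) / N(t)^2 \right]$$

$$= \alpha \sum_{j \in \mathcal{G}} \xi_j \left[ x_j \sum_{k \in \mathcal{G}} T(j \leftarrow k)\, n_k(t) / N(t) - n_j(t) \sum_{h \in \mathcal{G}} \xi_h\, n_h(t) / N(t)^2 \right]$$

$$= \alpha \left\{ \mathrm{Cov}(\xi, x) + [\bar{\xi}(t) - \bar{x}(t)]\, \bar{x}(t) \right\} .$$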
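The identity in Result 3 can be verified on a small example by computing the derivative of the average constructional fitness directly from the dynamics and comparing it with the covariance form. The transmission matrix, fixation probabilities and gene counts below are hypothetical, and the variable `xi` stands for the offspring success fraction of each parental type as defined in Result 3.

```python
alpha = 1.0
x = [0.05, 0.10, 0.30]                 # fixation probability of each type
n = [40.0, 35.0, 25.0]                 # current gene counts
# T[i][j]: probability that a copy of a type-j gene is of type i
# (hypothetical values; each column sums to 1).
T = [[0.8, 0.3, 0.2],
     [0.1, 0.6, 0.2],
     [0.1, 0.1, 0.6]]
G = range(len(x))
N = sum(n)

p = [sum(T[i][j] * n[j] for j in G) / N for i in G]     # types among new copies
xi = [sum(x[i] * T[i][j] for i in G) for j in G]        # offspring success of type j
x_bar = sum(x[i] * p[i] for i in G)
xi_bar = sum(xi[i] * p[i] for i in G)
cov = sum(xi[i] * x[i] * p[i] for i in G) - xi_bar * x_bar

# Direct derivative of x_bar = sum_j xi_j n_j / N using the dynamics
# dn_j/dt = alpha * x_j * sum_k T(j<-k) n_k = alpha * x_j * p_j * N.
dn = [alpha * x[j] * p[j] * N for j in G]
dN = sum(dn)
direct = sum(xi[j] * (dn[j] / N - n[j] * dN / N**2) for j in G)
price = alpha * (cov + (xi_bar - x_bar) * x_bar)
```

The values `direct` and `price` coincide up to rounding, which is just the algebra of the proof carried out numerically.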
The covariance term is between a gene's probability of fixation and its offspring genes' average probability of fixation. Note that the frequencies used in the covariance are the frequencies of different types among gene duplications, not the current genes in the genome. A positive correlation between $\xi_i$ and $x_i$ is to be expected if a gene and its offspring genes affect the same sort of phenotypic characters, and the adaptive opportunity that existed for these characters still exists. Genes (or gene parts, e.g. exons) that code for generally useful products, such as promoters, transmembrane linkers, catalytic sites, developmental controls, etc., would have such continuing adaptive opportunity, and they would contribute to making $\mathrm{Cov}(\xi, x) > 0$.

The term $\bar{\xi}(t) - \bar{x}(t)$ is the net bias in the transmission of constructional fitness between a gene and its offspring genes. A conservative assumption is that the transmission bias is negative, i.e. the chance that gene duplications are adaptive is less for a gene's grand-offspring than it is for the gene's offspring. This is a reasonable assumption since duplications of a gene (or gene part) would diverge to various extents from the ancestral gene's effects, selection may change, or the adaptive opportunity for new copies of the gene may get saturated.
But even with a negative transmission bias, the average constructional fitness, $\bar{x}(t)$, increases as long as

$$\bar{\xi}(t) - \bar{x}(t) > -\mathrm{Cov}(\xi, x) / \bar{x}(t) . \tag{3}$$

As an illustrative example, we can set $\xi_i = \beta x_i$ with $\beta < 1$, a downward transmission bias. Still, $\bar{x}(t)$ increases as long as

$$\beta > \frac{1}{1 + \mathrm{Var}(x_i / \bar{x}(t))} . \tag{4}$$

Evaluation of (4) requires evaluating the magnitude of $\mathrm{Var}(x_i/\bar{x}(t))$, which depends on the distribution of constructional fitness values in the genome. Let $g(x)$ be the portion of gene duplications with constructional fitness $x$. The conditions for (4) under a variety of distributions are:

- A uniform distribution, $g(x) = 1$: $\bar{x}$ increases if $\beta > 3/4$;
- An exponential distribution, $g(x) = \nu e^{-\lambda x}$ ($\nu$ is the normalizer): for large $\lambda$, $\bar{x}$ increases if $\beta > 1/2$;
- A Gaussian initial distribution, $g(x) = \nu e^{-\lambda x^2}$: for large $\lambda$, $\bar{x}$ increases if $\beta > 2/\pi$;
- A Gamma distribution, $g(x) = \nu x^{\gamma - 1} e^{-\lambda x}$, $\gamma > 0$, $x > 0$: for large $\lambda$, $\bar{x}$ increases if $\beta > \gamma/(\gamma + 1)$.

Since one can choose $\gamma > 0$ close to 0, distributions can be found for any arbitrarily small $\beta$ in which the average constructional fitness of the genome grows. Thus, even for arbitrarily strong downward transmission bias, where the probability of a gene giving rise to a useful offspring gene decreases by a factor $\beta$ each gene duplication, the average probability in the genome that a gene duplication produces a selective advantage may still increase in time, depending on the initial distribution of these probabilities in the genome.

As $n_i(t)$ evolves, both $\mathrm{Cov}(\xi, x)$ and the net transmission bias will change. Under a wide variety of well-behaved transmission functions, where the net transmission bias initially satisfies (3), the distribution of constructional fitness values will shift upward until the net bias balances the covariance or the covariance is exhausted.

Results 1 and 3 are extensions of a line of theorems in quantitative genetics based on the covariance of different traits with fitness, including Fisher's fundamental theorem, Robertson's "secondary theorem of Natural Selection" (Robertson 1966), and a result by Price (1970) on gene frequency change, which were elaborated upon by Crow and Nagylaki (1976) and Lande and Arnold (1983). Price's theorem has been applied in a number of different contexts in evolutionary genetics, including kin selection (Grafen 1985, Taylor 1988), group selection (Wade 1985), the evolution of mating systems (Uyenoyama 1988), and quantitative genetics (Frank and Slatkin 1990). I have applied it to performance analysis of genetic algorithms in Altenberg (1994, 1995).
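The thresholds on the transmission bias factor quoted above for the uniform, exponential and Gaussian cases can be reproduced numerically. The sketch below rests on the observation that the right-hand side of condition (4) equals the squared mean of $x$ divided by its second moment. It assumes densities on $[0, 1]$, uses a simple midpoint rule, and takes $\lambda = 200$ as an arbitrary stand-in for "large $\lambda$"; the helper name `threshold` is hypothetical.

```python
import math

def threshold(g, steps=200000):
    """Smallest beta for growth, 1 / (1 + Var(x / mean)) = mean^2 / E[x^2].

    g is an unnormalized density of constructional fitness on [0, 1];
    moments are estimated with the midpoint rule.
    """
    h = 1.0 / steps
    xs = [(k + 0.5) * h for k in range(steps)]
    mass = sum(g(x) for x in xs)
    mean = sum(x * g(x) for x in xs) / mass
    second = sum(x * x * g(x) for x in xs) / mass
    return mean * mean / second

lam = 200.0   # "large lambda" (an arbitrary illustrative value)
uniform = threshold(lambda x: 1.0)
exponential = threshold(lambda x: math.exp(-lam * x))
gaussian = threshold(lambda x: math.exp(-lam * x * x))
```

The three values come out close to 3/4, 1/2 and $2/\pi$ respectively, matching the conditions listed for those distributions.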
4 Wagner's Linear Quantitative-Genetic Model with Gaussian Selection
Wagner (1984, 1989) has investigated evolutionary aspects of the genotype-phenotype map through analysis of linear maps combined with a number of different fitness surfaces, including "corridor" and Gaussian fitness functions. In this section I investigate the correlated allelic variation effect of genome growth using a variant of Wagner's (1989) model of "constrained pleiotropy". The model here is a multilayered linear map from the genotype to the organismal phenotype, and from the phenotype to the adaptive functions they carry out. Figure 1 illustrates this model.
[Figure: diagram linking the Genotype through the phenotype map to the Phenotype, and the Phenotype through a second map to the Functions under selection.]

Fig. 1. Wagner's linear model of the genotype-phenotype map with a Gaussian fitness function on the departure, z, from optimality.
What I want to capture with this model is the following idea: genes don't "know" a priori what they are doing, what functions they are carrying out; i.e. there is "universal pleiotropy". Pleiotropic constraints may limit the genotype's ability to optimize simultaneously all the functions it controls, so that the best phenotype achievable, given the genetic variability available, may be a compromise between tradeoffs that represents a departure from the global selective optimum. The genotype may appear to be at a selective peak, but if new dimensions of genetic variability were opened up, this peak would be revealed to be on the slope of a larger selective peak. Therefore, at these constrained peaks there exists a "latent" directional selection to which the population could respond if the proper dimension of genetic variation existed. In such situations, events which make the proper variation possible can be major factors in evolution.

Genetic changes that alter the nature of the pleiotropic constraints can therefore come under selection. In this model, I will show how, when there exists variability in the pleiotropic effects of genes coming into existence, genes which are most aligned with the latent directional selection will have the best chance of being incorporated into the genome, and the genomes that result will be able to simultaneously optimize all the adaptive functions much better than would be expected from the underlying distribution of pleiotropic effects. Moreover, the pattern of phenotypic effects of each gene will tend to reflect the directional selection that existed when the gene came into being. The phenotypic variability present in the genomes will therefore indicate the history of directional selection that the genomes experienced during their evolutionary construction.
4.1 The Adaptive Landscape
The organismal phenotype is defined as a $k$-element long vector, $y \in \mathbb{R}^k$. The organism carries out $f$ different adaptive functions. The optimal organismal phenotype is $y^*$, which would perform each of these functions maximally. For each of the $f$ organismal functions there will be a vector $q_i \in \mathbb{R}^k$ such that when the phenotype $y$ departs from $y^*$ in the direction $q_i$, only the performance of adaptive function $i$ is altered. Thus the set of $\{q_i\}$ must be orthogonal. The amount, $z_i$, of this departure of adaptive function $i$ from its optimum is simply the component of $q_i$ present in $y - y^*$, i.e., the projection of $y - y^*$ onto $q_i$:

$$z_i = q_i^T (y - y^*) .$$

Let the departures from optimality in each adaptive function interact multiplicatively in reducing the fitness of the organism, with the relative importance of function $i$ measured by a value $\lambda_i > 0$. A Gaussian selection scheme satisfies these specifications, giving

$$w(y) = \exp\left[ -(y - y^*)^T Q \Lambda Q^T (y - y^*) \right] = \exp\left[ -\sum_{i=1}^{f} \lambda_i z_i^2 \right] , \tag{5}$$

where $Q = \| q_1, \ldots, q_f \|$ is the matrix whose columns are $q_i$, and $\Lambda$ is the diagonal matrix $\Lambda = \mathrm{diag}[\lambda_i]_{i=1}^{f}$.

Assume that $\{q_i\}$ are linearly independent, which requires $f \le k$. Let them also be normalized, so that $Q^T Q = I$ (if $f = k$ then $Q$ is an orthogonal matrix, hence $Q^T = Q^{-1}$). Together, $y^*$, $Q$, and $\Lambda$ determine the structure of the "adaptive landscape" in terms of the organismal phenotype, $y$.
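The Gaussian fitness function (5) can be illustrated with a small two-dimensional example. The sketch assumes $f = k = 2$, a pair of orthonormal directions obtained by rotating the coordinate axes, and arbitrary weights and optimum; all numerical values and function names are hypothetical. It computes the fitness both from the departures of each function and from the quadratic form, which should agree.

```python
import math

theta = math.pi / 6
# Columns of Q: orthonormal directions q_1, q_2 (hypothetical example, f = k = 2).
q = [(math.cos(theta), math.sin(theta)),
     (-math.sin(theta), math.cos(theta))]
lam = [4.0, 0.5]                 # relative importance of each adaptive function
y_opt = (1.0, 2.0)               # optimal phenotype y*

def departures(y):
    """z_i = q_i^T (y - y*), the departure of function i from its optimum."""
    d = [yi - oi for yi, oi in zip(y, y_opt)]
    return [sum(qc * dc for qc, dc in zip(qi, d)) for qi in q]

def fitness(y):
    """Gaussian fitness (5): w(y) = exp(-sum_i lambda_i z_i^2)."""
    return math.exp(-sum(l * z * z for l, z in zip(lam, departures(y))))

def fitness_matrix_form(y):
    """The same fitness written as exp(-(y - y*)^T Q Lambda Q^T (y - y*))."""
    d = [yi - oi for yi, oi in zip(y, y_opt)]
    M = [[sum(lam[i] * q[i][r] * q[i][c] for i in range(2)) for c in range(2)]
         for r in range(2)]
    quad = sum(d[r] * M[r][c] * d[c] for r in range(2) for c in range(2))
    return math.exp(-quad)
```

Moving the phenotype away from the optimum along one of the directions changes only the corresponding departure, which is the property that motivates the orthogonality requirement in the text.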
4.2 Genetic Control of the Phenotype
Suppose there are $n$ genes, and the allelic state at each gene $i$ determines a genotype $x_i \in \mathbb{R}$. The organismal phenotype, $y$, is the sum of a set of normalized vectors $a_i \in S^k$ on the unit $k$-sphere, weighted by the values $x_i$. Hence

$$y = A x , \tag{6}$$

where $A = \| a_1, \ldots, a_n \|$ is the matrix whose columns are the vectors $a_j$. The gene effects on the phenotype are additive, by the linearity of (6). The magnitude is partitioned from the direction of the gene's effects by normalizing $a_j$, so that

$$a_j^T a_j = \sum_i a_{ij}^2 = 1$$

for all $j$. The allelic value $x_j$ controls the magnitude of the gene's effects. The fitness function for the genotype is:

$$w(x) = \exp\left[ -(A x - y^*)^T Q \Lambda Q^T (A x - y^*) \right] .$$

A note on epistasis: Although the loci interact additively in this model, they are also epistatic in terms of fitness, since the contribution of each allelic value to fitness depends on the value of the alleles at the other loci:

$$\partial w(x) / \partial x_i = -2\, w(x)\, (A x - y^*)^T Q \Lambda Q^T a_i . \tag{7}$$
4.3 "Latent" Directional Selection at Fitness Peaks under Pleiotropic Constraints

I assume that each of the elements of $x$ are free to evolve, and that the population will eventually become fixed, through allelic substitution, on the genotype vector $\hat{x}$ that produces the maximum fitness, i.e. which minimizes

$$\delta(x) = (A x - y^*)^T Q \Lambda Q^T (A x - y^*) . \tag{8}$$

This is illustrated in Fig. 2. The dynamics of the evolution toward this optimum are not critical to what follows, but the gradient ascent model of Via and Lande (1985), extended to arbitrary dimensions, would be applicable. The constraints in this model are therefore entirely range constraints, and not kinetic constraints, on the attainable optima.

Fig. 2. Illustration of the "latent" directional selection remaining when adaptation is constrained by phenotypic variability to be suboptimal. The global optimum phenotype is $y^*$ and the constrained optimum is $\hat{y}$.

To find the minimum of $\delta(x)$ in (8) one differentiates. Let $M = Q \Lambda Q^T$. Then $M$ is positive definite (if $f = k$) or semi-definite (if $f < k$). The system

$$A^T M (A \hat{x} - y^*) = \frac{1}{2} \left. \frac{\partial \delta(x)}{\partial x} \right|_{x = \hat{x}} = 0 \tag{9}$$

represents the "normal equations" for the minimization problem (Luenberger 1968). The closed-form solution is

$$\hat{x} = (A^T M A)^{-1} A^T M y^* , \tag{10}$$

and requires that the matrix $A^T M A$, known as the Gram matrix of $A$, be positive definite. This is assured if: $A$ is full rank, i.e. $a_i$ are linearly independent, $M$ is positive semi-definite, and no $a_i$ is in the null space of $M$, i.e. for all $i$, $Q^T a_i \neq 0$ and $\lambda_i \neq 0$. Note that numerical computation of $\hat{x}$ uses LU decomposition, not the matrix inversion in (10).

In his analysis of variability maintained by a mutation-selection balance in this model, Wagner (1989) changes coordinates so that $y^* = 0$. But then by (10), $\hat{y} = y^*$, so the system evolves to the global fitness peak, and is not constrained by variation to be suboptimal. Although this is of no consequence for the nature of a mutation-selection balance, it eliminates the evolutionary potential afforded by the "latent" directional selection that exists when the population is constrained to be suboptimal, which is what I consider here.

Quantitative genetic models with the kind of constrained optima described here present a number of important features. Adding allelic polymorphism to the current model, as in Wagner (1989), would reveal that there can be additive genetic variance for a trait under directional selection and yet no evolution of that trait. Moreover, if selection is increased on any trait, the population will respond to it and move in the direction of the increase of selection until a new balance is found; upon relaxation of the selection to the former level, the population would return to the previous value.
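A one-gene example makes the constrained optimum concrete. The sketch below assumes two phenotypic characters, a diagonal matrix for the quadratic form, and a single gene whose effect vector does not point at the global optimum; every numerical value is hypothetical. With one column, the closed-form solution reduces to a ratio of two scalars, so no matrix factorization is needed here.

```python
import math

# Hypothetical example: k = 2 phenotypic characters, f = 2 functions, n = 1 gene.
M = [[2.0, 0.0],
     [0.0, 1.0]]                    # M = Q Lambda Q^T with Q = I, lambda = (2, 1)
y_opt = [1.0, 1.0]                  # global optimum y*
a = [1.0 / math.sqrt(5), 2.0 / math.sqrt(5)]   # unit vector of the gene's effects

def quad(u, v):
    """u^T M v for 2-vectors."""
    return sum(u[r] * M[r][c] * v[c] for r in range(2) for c in range(2))

# With a single column, the closed form reduces to a ratio of scalars.
x_hat = quad(a, y_opt) / quad(a, a)
y_hat = [x_hat * ai for ai in a]                 # constrained optimum phenotype
resid = [yh - yo for yh, yo in zip(y_hat, y_opt)]

gradient = quad(a, resid)           # normal equations: should vanish
delta_hat = quad(resid, resid)      # remaining departure from optimality
```

In this example the constrained optimum phenotype is (2/3, 4/3) rather than (1, 1): the gradient along the only available direction of variation vanishes, yet a departure of 1/3 from optimality remains, which is the latent directional selection a new gene could exploit.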
4.4 Constructional Selection
The presence of latent directional selection at a constrained optimum creates adaptive opportunity for new genes that give different directions of phenotypic variability, and so until evolution reaches the global maximum, there is always the opportunity for genome growth. The process of adding new genes to the genome then is modeled as increasing the matrix $A$ column by column. Here this process is examined under very simple evolutionary dynamics, where the population is fixed on its best attainable genotype at the time a new gene is tested in the genome. If the new gene increases fitness, it is added to the genome, and before any new genes are tested, the genotype evolves through allelic substitution to the new optimum that the new gene allows it to attain. This process is then repeated and the genome thus built up.

A new gene is added to the genome according to some random sampling process, producing a random vector, $a_{n+1}$ (its vector of effects on the organismal phenotype), which expands $A$ by one column to yield $A'$. Addition of a new gene increases the length of $\hat{x}$ by one element, $x_{n+1}$, a random variable, to yield $x'$. The number of phenotypic characters, $k$, remains unchanged. Once the new gene is added to the genome, mutations in its allelic value $x_{n+1}$ will change the phenotype along the same vector of variation, $a_{n+1}$, as produced by the gene's creation. Thus there is complete correlation in this model between the phenotypic effects from the creation of the gene and the effects of its subsequent allelic variation, which is what provides the basis of the correlated allelic variation effect of constructional selection.

The departure of the fitness components from the optimum before the addition of the new gene is:

$$\delta(\hat{x}) = z^T \Lambda z = \sum_i \lambda_i z_i^2 ,$$

where $z = Q^T (A \hat{x} - y^*)$, and each $z_i$ is the departure of phenotype from perfect realization of adaptive function $i$. The fitness of the organism after addition of the new gene is $w(x') = e^{-\delta(x')}$, where

$$\delta(x') = (A \hat{x} + x_{n+1} a_{n+1} - y^*)^T Q \Lambda Q^T (A \hat{x} + x_{n+1} a_{n+1} - y^*) . \tag{11}$$

Define:

$$\epsilon = x_{n+1} Q^T a_{n+1} .$$

Then

$$\delta(x') = (z + \epsilon)^T \Lambda (z + \epsilon) . \tag{12}$$

So fitness increases if and only if

$$\delta(x') - \delta(\hat{x}) = 2 x_{n+1} (A \hat{x} - y^*)^T M a_{n+1} + x_{n+1}^2\, a_{n+1}^T M a_{n+1} = \sum_i \lambda_i (2 z_i + \epsilon_i)\, \epsilon_i$$