Lecture Notes in Artificial Intelligence Edited by R. Goebel, J. Siekmann, and W. Wahlster
Subseries of Lecture Notes in Computer Science
5778
George Kampis István Karsai Eörs Szathmáry (Eds.)
Advances in Artificial Life Darwin Meets von Neumann 10th European Conference, ECAL 2009 Budapest, Hungary, September 13-16, 2009 Revised Selected Papers, Part II
Series Editors Randy Goebel, University of Alberta, Edmonton, Canada Jörg Siekmann, University of Saarland, Saarbrücken, Germany Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany
Volume Editors George Kampis Eötvös University Department of History and Philosophy of Science Pázmány P.s. 1/C, 1117 Budapest, Hungary E-mail:
[email protected] István Karsai East Tennessee State University Department of Biological Sciences Johnson City, TN 37614-1710, USA E-mail:
[email protected] Eörs Szathmáry Collegium Budapest, Szentháromság u.2 1014 Budapest, Hungary E-mail:
[email protected] ISSN 0302-9743 ISBN 978-3-642-21313-7 DOI 10.1007/978-3-642-21314-4
e-ISSN 1611-3349 e-ISBN 978-3-642-21314-4
Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2011927893 CR Subject Classification (1998): I.2, J.3, F.1.1-2, G.2, H.5, I.5, J.6 LNCS Sublibrary: SL 7 – Artificial Intelligence © Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Artificial Life, unlike artificial intelligence, had humble beginnings. In the case of the latter, by the time the term itself was born, the first breathtaking results were already out, such as the Logic Theorist, a computer program developed by Allen Newell, Herbert Simon and Cliff Shaw in 1955–56. In Artificial Life, for a long while, ambition seems to have dominated over results, and this was certainly true for the first, formative years. It was a bit unclear what exactly Artificial Life was. As is not uncommon in science, the first definitions were attempted in the negative, just as psychology (the study of the mental) was first defined as “anything not physics” (that is, not natural science) in the nineteenth century. A tempting early definition of Artificial Life was one that distinguished it from theoretical and mathematical biology, modeling, evolution theory, and all the rest of what constituted “an old kind of science” about life. This was perhaps not without grounds, and the parallel with artificial intelligence comes up again. Artificial intelligence was conceptually based on “machine functionalism,” the philosophical idea that all operations, such as the mental operations of humans, can be captured as “mere functions,” in other words, as independent of their actual physical realizations. Functionalism put computers and algorithms at the focus of interest in all dealings with human intelligence, and artificial intelligence was a computerized approach to the mind designed to capture human mental operations in this functional sense. A functionalism of life, however, had not yet been born, and Artificial Life looked like the candidate needed for exactly that: to discover how computers can be used to uncover the secrets of life. There were cellular automata, discovered by John von Neumann around 1950, that are capable of self-reproduction. Perhaps life was just a step away.
This, and a new fascination with functionalism in Artificial Life, put computer scientists (who could technically master cellular automata and computer mathematics) into a central position in Artificial Life as it could be. But Artificial Life became different. Incidentally, the slogan “life as it could be” was coined as a motto for Artificial Life, but now the same conditional applies to Artificial Life itself. The reason is that functionalism turned out to be just one part of the story. There is a well-known and much used (maybe over-used) phrase in biology, “Dobzhansky’s dictum,” which says that “nothing in biology makes sense except in the light of evolution.” Evolution is, as rightly captured in the dictum, central to the understanding of all things alive, and hence, as had to be discovered, it is central to the study of Artificial Life as well. It also soon turned out that evolution cannot be readily reduced to algorithmic problems, or at least not in the pure, detached sense attempted in functionalism. Evolution is complex in a sense recently acknowledged in the sciences of complexity: there is no single principle, no simple set of rules, and no transparency. Instead,
evolution is a combination of many heterogeneous and sometimes contingent factors, many of which have to do with the complex modes of existence of organisms: their bodies, their interactions, their development, their geo-spatial relations, their temporal history, and so on. This brought biology and biologists back into the equation, and Artificial Life has greatly benefited from it. Evo-devo (the interplay between evolutionary and developmental biology), evolutionary robotics, and systems biology are examples of fields where mathematical and algorithmic thinking combined with “wet” reality has started to produce new and fascinating results. (Those who kept an eye on cognitive science and artificial intelligence found that over those years a similar process took place there as well. Embodiment, or situated physical realization, has permeated and changed these fields to the same extent as, or even more than, it did Artificial Life.) Today, as a result of these processes, Artificial Life is a prolific field that combines the best of computer science with the best of theoretical biology, mathematical modeling, and simulation. One way to express this is to say that “Darwin meets von Neumann” at last, where “real” biology and “pure” function are no longer seen as contradictory, or even merely complementary. Rather, they permeate and fertilize each other in a number of fascinating ways. ECAL 2009 was the 10th European Conference on Artificial Life, which means 20 years of history. It was an occasion to celebrate 20 years of development of the field and also the new symbiosis referred to in the title. Famously, 2009 was also the “Darwin year,” celebrating the 200th anniversary of Darwin’s birth and the 150th of On the Origin of Species. Thus it was highly appropriate to dedicate the meeting to Darwin and von Neumann together.
Five keynote lectures were delivered by eminent invited speakers; in order of appearance they were Peter Hammerstein (Humboldt University, Berlin), Hod Lipson (Cornell), Nick Barton (FRS, Edinburgh), Richard Watson (Southampton), and Eva Jablonka (Tel Aviv University). Their choice reflected the spirit of convergence alluded to above. The conference attracted 161 submissions, of which 54 were accepted as talks and 87 as posters (a total of 141 presentations). Adopting the recent practice of many science meetings, submissions could be based either on full papers or on extended abstracts. The meeting was organized in a single track over three days, with parallel (whole-day) poster sessions, to give the best visibility to everyone’s work. We decided to publish the papers of all accepted presentations, making no distinction between posters and talks. Several factors led to this decision: many excellent submissions had to be put into the poster sessions to keep reasonable time limits for the talks, and these often included second or third papers of some of the keynote speakers. Whether a contribution was a poster or a talk was therefore not necessarily a matter of quality. Moreover, we decided to publish all poster papers because we wanted to show the heterogeneity and the full spectrum of activities in the field, in order to provide an authentic overview. The result is this volume in two parts, containing 116 full papers. The conference was sponsored by the Hungarian Academy of Sciences in different ways, one of them being the special rates we enjoyed when using the wonderful historical building of the Academy in the very center of Budapest, just across
from the Castle and the Chain Bridge. It is a building with a unique historical atmosphere, one that has seen many major scientific events. The conference talks were held in the Great Lecture Hall, which added to the impression that Artificial Life, and ECAL, are coming of age. The other sponsor was Aitia International, Inc., whose support is gratefully acknowledged. Aitia is the maker of MEME (Model Exploration ModulE), a platform for DoE (design of experiments) and parameter sweeping, running on a cloud (https://modelexploration.aitia.ai/). The publication process experienced several unexpected difficulties and delays. The proceedings could never have been published without a final push by Márk Jelasity, of Szeged University, a member of the Local Organizing Committee. It was his support, his offer of a hands-on contribution, and equally his expertise in LaTeX and prior experience with Springer LNCS publications that made the essential difference and helped cross the line. Márk was offered an editorship but declined, having neither a scientific contribution to this conference nor a responsibility for the selection process, and this is a decision we had to respect. Nevertheless, here is a “big thank you,” Márk. Several other people provided important help in the production of the volume, of whom Balázs Bálint (Collegium Budapest) and Chrisantha Fernando (Sussex) deserve special mention. We thank TenSi Congress Ltd. for the seamless technical organization of the meeting. In the evaluation phase, important help was given by several members of the Program Committee and also by additional reviewers, listed separately, whose contribution is highly appreciated. The conference and the proceedings have been the work of many people, and we thank all of them for making it happen. Finally, we thank Anna Kramer of Springer for her support and patience. February 2011
George Kampis István Karsai Eörs Szathmáry (Editors)
Organization
Organizing Committee George Kampis and Eörs Szathmáry (Chairs) Chrisantha Fernando, Márk Jelasity, Ferenc Jordán, András Lőrincz, and István Scheuring
Program Committee Wolfgang Banzhaf Xabier Barandiaran Zoltan Barta Mark Bedau Randall Beer Seth Bullock Philippe Capdepuy Peter Cariani Andy Clark Tamás Czárán Ezequiel Di Paolo Dario Floreano Inman Harvey
Takashi Ikegami Istvan Karsai Jozef Kelemen Simon Kirby Doron Lancet Robert Lowe John McCaskill Bruce MacLennan Federico Moran Alvaro Moreno Chrystopher Nehaniv Stefano Nolfi Charles Ofria
Alexandra Penn Daniel Polani Luis Rocha Matthias Scheutz Thomas Schmickl Peter Stadler Elio Tuci Andreas Wegner Franjo Weissing Larry Yaeger Klaus-Peter Zauner
Additional Reviewers Rose Canino-Koning Arthur Covert Tomassino Ferrauto Sherri Goings
Heather Goldsby Laura Grabowski David Knoester Anuraag Pakanati
Bess Walker Luis Zaman
Table of Contents – Part II
Group Selection
Investigations of Wilson’s and Traulsen’s Group Selection Models in Evolutionary Computation . . . . . . . . . . . Shelly X. Wu and Wolfgang Banzhaf
1
The Evolution of Division of Labor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Heather J. Goldsby, David B. Knoester, Jeff Clune, Philip K. McKinley, and Charles Ofria
10
A Sequence-to-Function Map for Ribozyme-Catalyzed Metabolisms . . . . . Alexander Ullrich and Christoph Flamm
19
Can Selfish Symbioses Effect Higher-Level Selection? . . . . . . . . . . . . . . . . . Richard A. Watson, Niclas Palmius, Rob Mills, Simon T. Powers, and Alexandra Penn
27
The Effect of Group Size and Frequency-of-Encounter on the Evolution of Cooperation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Steve Phelps, Gabriel Nevarez, and Andrew Howes
37
Moderate Contact between Sub-populations Promotes Evolved Assortativity Enabling Group Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James R. Snowdon, Simon T. Powers, and Richard A. Watson
45
Evolution of Individual Group Size Preference Can Increase Group-Level Selection and Cooperation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simon T. Powers and Richard A. Watson
53
Ecosystems and Evolution
Implications of the Social Brain Hypothesis for Evolving Human-Like Cognition in Digital Organisms . . . . . . . . . . . Suzanne Sadedin and Greg Paperin
61
Embodiment of Honeybee’s Thermotaxis in a Mobile Robot Swarm . . . . Daniela Kengyel, Thomas Schmickl, Heiko Hamann, Ronald Thenius, and Karl Crailsheim
69
Positively versus Negatively Frequency-Dependent Selection . . . . . . . . . . . Robert Morris and Tim Watson
77
Origins of Scaling in Genetic Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Oliver Obst, Daniel Polani, and Mikhail Prokopenko
85
Adaptive Walk on Fitness Soundscape . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Reiji Suzuki and Takaya Arita
94
Breaking Waves in Population Flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . George Kampis and Istvan Karsai
102
Symbiosis Enables the Evolution of Rare Complexes in Structured Environments . . . . . . . . . . . Rob Mills and Richard A. Watson
110
Growth of Structured Artificial Neural Networks by Virtual Embryogenesis . . . . . . . . . . . Ronald Thenius, Michael Bodi, Thomas Schmickl, and Karl Crailsheim
118
Algorithms and Evolutionary Computation
The Microbial Genetic Algorithm . . . . . . . . . . . Inman Harvey
126
HybrID: A Hybridization of Indirect and Direct Encodings for Evolutionary Computation . . . . . . . . . . . Jeff Clune, Benjamin E. Beckmann, Robert T. Pennock, and Charles Ofria
134
An Analysis of Lamarckian Learning in Changing Environments . . . . . . . Dara Curran and Barry O’Sullivan
142
Linguistic Selection of Language Strategies: A Case Study for Colour . . . Joris Bleys and Luc Steels
150
Robots That Say ‘No’ . . . . . . . . . . . Frank Förster, Chrystopher L. Nehaniv, and Joe Saunders
158
A Genetic Programming Approach to an Appropriation Common Pool Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alan Cunningham and Colm O’Riordan
167
Using Pareto Front for a Consensus Building, Human Based, Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pietro Speroni di Fenizio and Chris Anderson
175
The Universal Constructor in the DigiHive Environment . . . . . . . . . . . . . . Rafal Sienkiewicz and Wojciech Jedruch
183
A Local Behavior Identification Algorithm for Generative Network Automata Configurations . . . . . . . . . . . Burak Özdemir and Hürevren Kılıç
191
Solving a Heterogeneous Fleet Vehicle Routing Problem with Time Windows through the Asynchronous Situated Coevolution Algorithm . . . . . . . . . . . Abraham Prieto, Francisco Bellas, Pilar Caamaño, and Richard J. Duro
200
Philosophy and Arts
Modeling Living Systems . . . . . . . . . . . Peter Andras
208
Facing N-P Problems via Artificial Life: A Philosophical Appraisal . . . . . . . . . . . Carlos E. Maldonado and Nelson Gómez
216
Observer Based Emergence of Local Endo-time in Living Systems: Theoretical and Mathematical Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . Igor Balaz and Dragutin T. Mihailovic
222
Life and Its Close Relatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simon McGregor and Nathaniel Virgo
230
A Loosely Symmetric Model of Cognition . . . . . . . . . . . . . . . . . . . . . . . . . . . Tatsuji Takahashi, Kuratomo Oyo, and Shuji Shinohara
238
Algorithmic Feasibility of Entity Recognition in Artificial Life . . . . . . . . . Janardan Misra
246
Creative Agency: A Clearer Goal for Artificial Life in the Arts . . . . . . . . . Oliver Bown and Jon McCormack
254
Optimization, Action, and Agent Connectivity
Agent-Based Toy Modeling for Comparing Distributive and Competitive Free Market . . . . . . . . . . . Hugues Bersini
262
Swarm Cognition and Artificial Life . . . . . . . . . . . Vito Trianni and Elio Tuci
270
Life Engine - Creating Artificial Life for Scientific and Entertainment Purposes . . . . . . . . . . . Gonçalo M. Marques, António Lorena, João Magalhães, Tânia Sousa, S.A.L.M. Kooijman, and Tiago Domingos
278
An Analysis of New Expert Knowledge Scaling Methods for Biologically Inspired Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jason M. Gilmore, Casey S. Greene, Peter C. Andrews, Jeff Kiralis, and Jason H. Moore
286
Impoverished Empowerment: ‘Meaningful’ Action Sequence Generation through Bandwidth Limitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tom Anthony, Daniel Polani, and Chrystopher L. Nehaniv
294
Influence of Promoter Length on Network Convergence in GRN-Based Evolutionary Algorithms . . . . . . . . . . . Paul Tonelli, Jean-Baptiste Mouret, and Stéphane Doncieux
302
Cellular Automata Evolution of Leader Election . . . . . . . . . . . Peter Banda
310
Update Dynamics, Strategy Exchanges and the Evolution of Cooperation in the Snowdrift Game . . . . . . . . . . . Carlos Grilo and Luís Correia
318
Learning in Minority Games with Multiple Resources . . . . . . . . . . . David Catteeuw and Bernard Manderick
326
Robustness of Market-Based Task Allocation in a Distributed Satellite System . . . . . . . . . . . Johannes van der Horst, Jason Noble, and Adrian Tatnall
334
Hierarchical Behaviours: Getting the Most Bang for Your Bit . . . . . . . . . . . Sander G. van Dijk, Daniel Polani, and Chrystopher L. Nehaniv
342
The Common Stomach as a Center of Information Sharing for Nest Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Istvan Karsai and Andrew Runciman
350
Economics of Specialization in Honeybees: A Multi-agent Simulation Study of Honeybees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thomas Schmickl and Karl Crailsheim
358
How Two Cooperating Robot Swarms Are Affected by Two Conflictive Aggregation Spots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Bodi, Ronald Thenius, Thomas Schmickl, and Karl Crailsheim
367
A Minimalist Flocking Algorithm for Swarm Robots . . . . . . . . . . . . . . . . . . Christoph Moeslinger, Thomas Schmickl, and Karl Crailsheim
375
Networks of Artificial Social Interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peter Andras
383
Swarm Intelligence
An Ant-Based Rule for UMDA’s Update Strategy . . . . . . . . . . . C.M. Fernandes, C.F. Lima, J.L.J. Laredo, A.C. Rosa, and J.J. Merelo
391
Niche Particle Swarm Optimization for Neural Network Ensembles . . . . . Camiel Castillo, Geoff Nitschke, and Andries Engelbrecht
399
Chaotic Patterns in Crowd Simulation . . . . . . . . . . . Blanca Cases, Francisco Javier Olasagasti, Abdelmalik Moujahid, Alicia D’Anjou, and Manuel Graña
408
Simulating Swarm Robots for Collision Avoidance Problem Based on a Dynamic Bayesian Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hiroshi Hirai, Shigeru Takano, and Einoshin Suzuki
416
Hybridizing River Formation Dynamics and Ant Colony Optimization . . . . . . . . . . . Pablo Rabanal and Ismael Rodríguez
424
Landmark Navigation Using Sector-Based Image Matching . . . . . . . . . . . . Jiwon Lee and DaeEun Kim
432
A New Approach for Auto-organizing a Groups of Artificial Ants . . . . . . Hanane Azzag and Mustapha Lebbah
440
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
449
Table of Contents – Part I
Evolutionary Developmental Biology and Hardware
Self-organizing Biologically Inspired Configurable Circuits . . . . . . . . . . . André Stauffer and Joël Rossier
1
Epigenetic Tracking: Biological Implications . . . . . . . . . . . . . . . . . . . . . . . . . Alessandro Fontana
10
The Effect of Proprioceptive Feedback on the Distribution of Sensory Information in a Model of an Undulatory Organism . . . . . . . . . . . . . . . . . . Ben Jones, Yaochu Jin, Bernhard Sendhoff, and Xin Yao
18
Emerged Coupling of Motor Control and Morphological Development in Evolution of Multi-cellular Animats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lisa Schramm, Yaochu Jin, and Bernhard Sendhoff
27
Evolution of the Morphology and Patterning of Artificial Embryos: Scaling the Tricolour Problem to the Third Dimension . . . . . . . . . . . Michal Joachimczak and Borys Wróbel
35
Artificial Cells for Information Processing: Iris Classification . . . . . . . . . . . Enrique Fernandez-Blanco, Julian Dorado, Jose A. Serantes, Daniel Rivero, and Juan R. Rabuñal
44
Digital Organ Cooperation: Toward the Assembly of a Self-feeding Organism . . . . . . . . . . . Sylvain Cussat-Blanc, Hervé Luga, and Yves Duthen
53
Distance Discrimination of Weakly Electric Fish with a Sweep of Tail Bending Movements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Miyoung Sim and DaeEun Kim
59
Adaptation in Tissue Sustained by Hormonal Loops . . . . . . . . . . . . . . . . . . Dragana Laketic and Gunnar Tufte
67
Embodiment of the Game of Life . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kohei Nakajima and Taichi Haruna
75
Metamorphosis and Artificial Development: An Abstract Approach to Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gunnar Tufte
83
Local Ultrastability in a Real System Based on Programmable Springs . . . . . . . . . . . Santosh Manicka and Ezequiel A. Di Paolo
91
Acquisition of Swimming Behavior on Artificial Creature in Virtual Water Environment . . . . . . . . . . . Keita Nakamura, Ikuo Suzuki, Masahito Yamamoto, and Masashi Furukawa
99
Evolutionary Robotics
Evolving Amphibian Behavior in Complex Environment . . . . . . . . . . . Kenji Iwadate, Ikuo Suzuki, Masahito Yamamoto, and Masashi Furukawa
107
Neuro-Evolution Methods for Gathering and Collective Construction . . . Geoff Nitschke
115
On the Dynamics of Active Categorisation of Different Objects Shape through Tactile Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Elio Tuci, Gianluca Massera, and Stefano Nolfi
124
Evolving a Novel Bio-inspired Controller in Reconfigurable Robots . . . . . . . . . . . Jürgen Stradner, Heiko Hamann, Thomas Schmickl, Ronald Thenius, and Karl Crailsheim
132
Functional and Structural Topologies in Evolved Neural Networks . . . . . . Joseph T. Lizier, Mahendra Piraveenan, Dany Pradhana, Mikhail Prokopenko, and Larry S. Yaeger
140
Input from the External Environment and Input from within the Body . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Filippo Saglimbeni and Domenico Parisi
148
Towards Self-reflecting Machines: Two-Minds in One Robot . . . . . . . . . . . Juan Cristobal Zagal and Hod Lipson
156
Swarm-Bots to the Rescue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rehan O’Grady, Carlo Pinciroli, Roderich Groß, Anders Lyhne Christensen, Francesco Mondada, Michael Bonani, and Marco Dorigo
165
Towards an Autonomous Evolution of Non-biological Physical Organisms . . . . . . . . . . . Roderich Groß, Stéphane Magnenat, Lorenz Küchler, Vasili Massaras, Michael Bonani, and Francesco Mondada
173
Acquisition of Adaptive Behavior for Virtual Modular Robot Using Evolutionary Computation . . . . . . . . . . . Keisuke Yoneda, Ikuo Suzuki, Masahito Yamamoto, and Masashi Furukawa
181
Guiding for Associative Learning: How to Shape Artificial Dynamic Cognition . . . . . . . . . . . Kristen Manac’h and Pierre De Loor
189
ALife in the Galapagos: Migration Effects on Neuro-Controller Design . . . . . . . . . . . Christos Ampatzis, Dario Izzo, Marek Ruciński, and Francesco Biscani
197
To Grip, or Not to Grip: Evolving Coordination in Autonomous Robots . . . . . . . . . . . Christos Ampatzis, Francisco C. Santos, Vito Trianni, and Elio Tuci
205
Development of Abstract Categories in Embodied Agents . . . . . . . . . . . Giuseppe Morlino, Andrea Sterbini, and Stefano Nolfi
213
For Corvids Together Is Better: A Model of Cooperation in Evolutionary Robotics . . . . . . . . . . . Michela Ponticorvo, Orazio Miglino, and Onofrio Gigliotta
222
Protocells and Prebiotic Chemistry
Dynamical Systems Analysis of a Protocell Lipid Compartment . . . . . . . . . . . Ben Shirt-Ediss
230
The Role of the Spatial Boundary in Autopoiesis . . . . . . . . . . . . . . . . . . . . . Nathaniel Virgo, Matthew D. Egbert, and Tom Froese
240
Chemo-ethology of an Adaptive Protocell: Sensorless Sensitivity to Implicit Viability Conditions . . . . . . . . . . . Matthew D. Egbert, Ezequiel A. Di Paolo, and Xabier E. Barandiaran
248
On the Transition from Prebiotic to Proto-biological Membranes: From ‘Self-assembly’ to ‘Self-production’ . . . . . . . . . . . Gabriel Piedrafita, Fabio Mavelli, Federico Morán, and Kepa Ruiz-Mirazo
256
SimSoup: Artificial Chemistry Meets Pauling . . . . . . . . . . . . . . . . . . . . . . . . Chris Gordon-Smith
265
Elongation Control in an Algorithmic Chemistry . . . . . . . . . . . . . . . . . . . . . Thomas Meyer, Lidia Yamamoto, Wolfgang Banzhaf, and Christian Tschudin
273
Systems Biology, Artificial Chemistry and Neuroscience
Are Cells Really Operating at the Edge of Chaos?: A Case Study of Two Real-Life Regulatory Networks . . . . . . . . . . . Christian Darabos, Mario Giacobini, Marco Tomassini, Paolo Provero, and Ferdinando Di Cunto
281
Cotranslational Protein Folding with L-Systems . . . . . . . . . . . . . . . . . . . . . . Gemma B. Danks, Susan Stepney, and Leo S.D. Caves
289
Molecular Microprograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simon Hickinbotham, Edward Clark, Susan Stepney, Tim Clarke, Adam Nellis, Mungo Pay, and Peter Young
297
Identifying Molecular Organic Codes in Reaction Networks . . . . . . . . . . . Dennis Görlich and Peter Dittrich
305
An Open-Ended Computational Evolution Strategy for Evolving Parsimonious Solutions to Human Genetics Problems . . . . . . . . . . . . . . . . . Casey S. Greene, Douglas P. Hill, and Jason H. Moore
313
Gene Regulatory Network Properties Linked to Gene Expression Dynamics in Spatially Extended Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . Costas Bouyioukos and Jan T. Kim
321
Adding Vertical Meaning to Phylogenetic Trees by Artificial Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Francesco Cerutti, Luigi Bertolotti, Tony L. Goldberg, and Mario Giacobini
329
Transient Perturbations on Scale-Free Boolean Networks with Topology Driven Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christian Darabos, Mario Giacobini, and Marco Tomassini
337
Agent-Based Model of Dengue Disease Transmission by Aedes aegypti Populations . . . . . . . . . . . Carlos Isidoro, Nuno Fachada, Fábio Barata, and Agostinho Rosa
345
All in the Same Boat: A “Situated” Model of Emergent Immune Response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tom Hebbron, Jason Noble, and Seth Bullock
353
Multi-Agent Model for Simulation at the Subcellular Level . . . . . . . . . . . M. Beurton-Aimar, N. Parisey, and F. Vallée
361
Visualising Random Boolean Network Dynamics: Effects of Perturbations and Canalisation . . . . . . . . . . . Susan Stepney
369
RBN-World: A Sub-symbolic Artificial Chemistry . . . . . . . . . . . . . . . . . . . . Adam Faulconbridge, Susan Stepney, Julian F. Miller, and Leo S.D. Caves
377
Flying over Mount Improbable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pietro Speroni di Fenizio, Naoki Matsumaru, and Peter Dittrich
385
Behaviors of Chemical Reactions with Small Number of Molecules . . . . . . Yasuhiro Suzuki
394
Memory-Based Cognitive Framework: A Low-Level Association Approach to Cognitive Architectures . . . . . . . . . . . Paul Baxter and Will Browne
402
Modelling Coordination of Learning Systems: A Reservoir Systems Approach to Dopamine Modulated Pavlovian Conditioning . . . . . . . . . . . Robert Lowe, Francesco Mannella, Tom Ziemke, and Gianluca Baldassarre
410
Evolving Central Pattern Generators with Varying Number of Neurons . . . . . . . . . . . Jeisung Lee and DaeEun Kim
418
The Evolution of Cooperation

Toward Minimally Social Behavior: Social Psychology Meets Evolutionary Robotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tom Froese and Ezequiel A. Di Paolo
426
Emergence of Cooperation in Adaptive Social Networks with Behavioral Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sven Van Segbroeck, Francisco C. Santos, Tom Lenaerts, and Jorge M. Pacheco
434
Evolving for Creativity: Maximizing Complexity in a Self-organized Multi-particle System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Heiko Hamann, Thomas Schmickl, and Karl Crailsheim
442
Evolving Group Coordination in an N-Player Game . . . . . . . . . . . . . . . . . Enda Barrett, Enda Howley, and Jim Duggan
450
Combining Different Interaction Strategies Reduces Uncertainty When Bootstrapping a Lexicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thomas Cederborg
458
A Chemical Model of the Naming Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joachim De Beule
466
Evolving Virtual Fireflies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . David B. Knoester and Philip K. McKinley
474
Selection of Cooperative Partners in n-Player Games . . . . . . . . . . . . . . . . . Pedro Mariano, Luís Correia, and Carlos Grilo
482
Evolving Social Behavior in Adverse Environments . . . . . . . . . . . . . . . . . . . Brian D. Connelly and Philip K. McKinley
490
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
499
Investigations of Wilson’s and Traulsen’s Group Selection Models in Evolutionary Computation Shelly X. Wu and Wolfgang Banzhaf Computer Science Department, Memorial University of Newfoundland, St John’s, Canada, A1B 3X5 {d72xw,banzhaf}@mun.ca
Abstract. Evolving cooperation by evolutionary algorithms is impossible without introducing extra mechanisms. Group selection theory in biology is a good candidate as it explains the evolution of cooperation in nature. Two biological models, Wilson’s trait group selection model and Traulsen’s group selection model are investigated and compared in evolutionary computation. Three evolutionary algorithms were designed and tested on an n-player prisoner’s dilemma problem; two EAs implement the original Wilson and Traulsen models respectively, and one EA extends Traulsen’s model. Experimental results show that the latter model introduces high between-group variance, leading to more robustness than the other two in response to parameter changes such as group size, the fraction of cooperators and selection pressure. Keywords: the evolution of cooperation, evolutionary computation, Wilson’s trait group selection model, Traulsen’s group selection model.
1 Introduction
Evolutionary computation (EC) is often viewed as an optimization process, as it draws inspiration from the Darwinian principle of variation and natural selection. This implies that EC may fail to solve problems which require a set of cooperative individuals to jointly perform a computational task. When cooperating, individuals may contribute differently, and hence might lead to unequal fitnesses. Individuals with lower fitness will be gradually eliminated from the population, despite their unique contributions to overall performance of the algorithm. Hence, special mechanisms should be implemented in EC that avoid selecting against such individuals. In nature, the success of cooperation is witnessed at all levels of biological organization. A growing number of biologists have come to believe that the theory of group selection is the explanation even though this theory has been unpopular for the past 40 years; hence new models and their applications are investigated
S.W. is grateful to Tom Lenaerts and Arne Traulsen for helpful discussions. W.B. would like to acknowledge support from NSERC Discovery Grants, under RGPIN 283304-07.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 1–9, 2011.
© Springer-Verlag Berlin Heidelberg 2011
[2]. Individuals are divided into groups, and only interact with members of the same group. The emergence of cooperation is due to competition between individuals and between groups. Individual competition selects against individuals with lower fitness, but group competition favors individuals who cooperate with others, regardless of their individual fitness. The group selection model proposed by Wilson and Sober [9,12] and the model by Traulsen and Nowak [10] represent two research strands in this area; groups in Wilson's model are mixed periodically during evolution, while groups in Traulsen's model are isolated. Hence, selection between groups and within groups works differently in these models. Extending these two models to encourage cooperation in artificial evolution is a relatively new research direction; most research [1,3,6,7] so far is based on Wilson's model or its variations, not on Traulsen's. This motivated us to investigate the role each model can play in encouraging cooperation in EC, and to analyze their differences. Three evolutionary algorithms adapting the two models were designed and examined under different parameter settings; these parameters refer to group size, fraction of cooperators and selection pressure, and they directly affect the selection dynamics. Our results show that the algorithm which extends Traulsen's model is more robust towards parameter changes than the algorithms implementing the original Wilson and Traulsen models. The remainder of this paper is organized as follows. Section 2 introduces the three evolutionary algorithms. Section 3 describes the experiments and the results obtained. Section 4 concludes.
2 Algorithms Design
Wilson’s and Traulsen’s models interpret the idea of group selection in a different fashion. In Wilson’s model, groups reproduce proportional to group fitness. Offspring groups are periodically mixed in a migrating pool for another round of group formation. The number of individuals a group contributed to this pool is proportional to its size; so cooperative groups contribute more to the next generation. On the contrary, Traulsen’s model keeps groups isolated. An individual is selected proportional to its fitness from the entire population, and the offspring is added to its parent’s group. When the group reaches its maximal size, either an individual in this group is removed, or the group splits into two, so another group has to be removed. Cooperative groups grow faster, and therefore, split more often. For detailed descriptions of these models, please refer to [9,10,12]. Our study aims to investigate the performance of the two models in extending evolutionary algorithms to evolve cooperation. The investigation is conducted in the context of the n-player prisoner’s dilemma (NPD). The NPD game offers a straightforward way of thinking about the tension between the individual and group level selection [4]; meanwhile it represents many cooperative situations in which fitness depends on both individual and group behavior. In this game, N individuals are randomly divided into m groups. Individuals in a group independently choose to be a cooperator or a defector without knowing the choice of others. The fitness function of cooperators (fCi (x)) and defectors (fDi (x)) in
group i are specified by the following equations:

    f_Ci(x) = base + w · ( b(n_i·q_i − 1)/(n_i − 1) − c ),   (0 ≤ i < m)    (1a)
    f_Di(x) = base + w · ( b·n_i·q_i/(n_i − 1) ),            (0 ≤ i < m)    (1b)
where base is the base fitness of cooperators and defectors; q_i is the fraction of cooperators in group i; n_i the size of group i; b and c are the benefit and cost caused by the altruistic act, respectively; and w is a coefficient. Evidently, cooperators have a lower fitness than defectors, because they not only pay a direct cost, but also receive benefits from fewer cooperators than defectors do. The fitness of group i is defined as the average individual fitness. Although defectors dominate cooperators inside a group, groups with more cooperators have a higher group fitness. Hence, the dynamics between individual and group selection will drive the game in different directions. A simple evolutionary algorithm implementing Wilson's model (denoted as W) is described in Algorithm 1.

Algorithm 1. An Evolutionary Algorithm Based on Wilson's Model
 1: P ← Initialize Population(N, r)
 2: while population does not converge or max generation is not reached do
 3:     P′ ← Disperse Population(P, m)
 4:     Evaluate Fitness(P′)
 5:     for i ← 0 to N′ do
 6:         gn ← Select Group(P′)
 7:         idv ← Select Individual(P′, gn)
 8:         idv′ ← Reproduce Offspring(idv)
 9:         Add Individual(idv′, NP, gn)
10:     end
11:     P ← Mixing Proportionally(NP)
12: end
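The steps of Algorithm 1 can be sketched as a short program. The following is a minimal, illustrative Python version, not the authors' code: individuals are encoded as booleans (True = cooperator), fitness follows Eq. (1) with the paper's default parameters, and the function names, roulette-wheel details, and the assumptions n_offspring ≥ N and group sizes of at least two are ours.

```python
import random

def fitness(is_coop, n_i, q_i, base=10.0, b=5.0, c=1.0, w=1.0):
    """Individual fitness per Eq. (1): a cooperator pays cost c and excludes
    its own contribution from the benefit it receives."""
    if is_coop:
        return base + w * (b * (n_i * q_i - 1) / (n_i - 1) - c)  # Eq. (1a)
    return base + w * (b * n_i * q_i / (n_i - 1))                # Eq. (1b)

def wilson_step(population, m, n_offspring, rng=random):
    """One generation of algorithm W: randomly disperse into m groups,
    reproduce group- then individual-proportionally to fitness, and mix
    the offspring back down to the original population size."""
    N = len(population)
    rng.shuffle(population)
    groups = [population[i::m] for i in range(m)]          # random dispersal
    def f(idv, grp):
        q = sum(grp) / len(grp)                            # fraction of cooperators
        return fitness(idv, len(grp), q)
    group_fit = [sum(f(i, g) for i in g) / len(g) for g in groups]
    offspring_groups = [[] for _ in groups]
    for _ in range(n_offspring):                           # n_offspring plays the role of N'
        gi = rng.choices(range(m), weights=group_fit)[0]   # group chosen prop. to group fitness
        grp = groups[gi]
        idv = rng.choices(grp, weights=[f(i, grp) for i in grp])[0]
        offspring_groups[gi].append(idv)                   # asexual copy, no mutation
    pool = [i for g in offspring_groups for i in g]        # migrating pool mixes the groups
    return rng.sample(pool, N)                             # back to constant size N
```

Because cooperative groups have higher average fitness they are selected more often, so they contribute more offspring to the migrating pool; this is how W lets group selection push back against the within-group advantage of defectors.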
This algorithm starts by randomly initializing a population P with N individuals, r percent of which are cooperators. P is then divided into m groups, and the individual and group fitnesses of the dispersed population P′ are evaluated. Afterwards, reproduction begins: a group with number gn is first selected, from which an individual idv is selected to produce offspring idv′. idv′ is then added to group gn in the new population NP. Reproduction causes groups in NP to vary in size, because the selection of groups is proportional to fitness. In total N′ offspring will be produced, where N′ is normally larger than the population size N. This gives cooperators an opportunity to increase their frequency in the next generation. To maintain the original population size N, groups in NP are mixed and each contributes individuals proportional to its size to the new population P. The above steps are repeated until the population converges or the maximum number of generations is reached.
Algorithm 2. An Evolutionary Algorithm Based on Traulsen's Model
 1: P ← Initialize Population(N, r)
 2: P ← Disperse Population(P, m)
 3: while population does not converge or max generation is not reached do
 4:     Evaluate Fitness(P)
 5:     for i ← 0 to N′ do
 6:         idv ← Select Individual from Population(P)
 7:         idv′ ← Reproduce Offspring(idv)
 8:         Put Back to Group(idv′, gn)
 9:         if Group Size(gn) > n then
10:             rnum ← Generate Random Number(0, 1)
11:             if rnum < q then
12:                 Split Group(gn)
13:                 Remove a Group()
14:             else
15:                 Remove an Individual in Group(gn)
16:             end
17:         end
18:     end
19: end
Similarly, Traulsen's model is embedded into an evolutionary algorithm, shown in Algorithm 2. This algorithm initializes, divides, and evaluates the population the same way algorithm W does. However, there are two major differences. First, the population disperses only once, at the beginning of the process; the groups are kept isolated afterwards. Second, the reproduction step is different. An individual idv is selected from the entire population for reproduction, rather than from a group. Offspring idv′ is put back into its parent's group, group gn. If the size of group gn exceeds the pre-defined group size n, a random number rnum is generated. If rnum is less than the group splitting probability q, group gn splits and its individuals are randomly distributed into two offspring groups; a group then has to be removed to maintain a constant number of groups. Otherwise, an individual from group gn is removed. In Traulsen's model, the group or individual to be eliminated is selected at random. As an extension, we also investigate selecting such a group or individual with probability inversely proportional to its fitness. Therefore, two variations of Algorithm 2 are implemented; one implements the former (denoted as T1) and the other the latter (denoted as T2).
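The birth-death step shared by T1 and T2 can be sketched as follows. This is our own illustrative Python, with Algorithm 2 as the authority: the data layout, names, and the assumption of strictly positive fitness values behind the inverse weights are ours.

```python
import random

def inv_weights(fits):
    """Weights inversely proportional to (strictly positive) fitness values,
    used by T2's death selection."""
    return [1.0 / f for f in fits]

def traulsen_step(groups, n_max, q_split, fit, t2=False, rng=random):
    """One birth-death event of the Traulsen-model EA.  fit(idv, grp) is the
    individual fitness.  With t2=True the group/individual that dies is chosen
    inversely proportional to fitness (T2) instead of uniformly at random (T1)."""
    flat = [(gi, idv) for gi, g in enumerate(groups) for idv in g]
    w = [fit(idv, groups[gi]) for gi, idv in flat]
    gi, parent = rng.choices(flat, weights=w)[0]       # birth: fitness-proportional
    groups[gi].append(parent)                          # offspring stays in parent's group
    if len(groups[gi]) <= n_max:
        return
    if rng.random() < q_split:                         # the group splits in two ...
        grp = groups[gi]
        rng.shuffle(grp)
        groups[gi], new = grp[:len(grp) // 2], grp[len(grp) // 2:]
        groups.append(new)
        gfit = [sum(fit(i, g) for i in g) / len(g) for g in groups]
        die = (rng.choices(range(len(groups)), weights=inv_weights(gfit))[0]
               if t2 else rng.randrange(len(groups)))
        groups.pop(die)                                # ... and one group is removed
    else:                                              # otherwise the group shrinks by one
        grp = groups[gi]
        ifit = [fit(i, grp) for i in grp]
        die = (rng.choices(range(len(grp)), weights=inv_weights(ifit))[0]
               if t2 else rng.randrange(len(grp)))
        grp.pop(die)
```

Note how the number of groups is invariant: a split adds one group and the subsequent removal takes one away, matching the constant-m bookkeeping of Algorithm 2.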
3 Investigations with the Algorithms
The investigations focus on the effects caused by different group size n, initial fraction of cooperators r, and coefficient w. Parameters n and r affect the assortment between cooperators and defectors in groups, and coefficient w affects the individual and group fitness; both cause changes in selection dynamics.
To focus on the selection dynamics, we assume asexual reproduction without the interference of mutation. Roulette wheel selection is adopted in the reproduction step of all algorithms. Parameters common to all experiments are set as follows: runs R = 20, generations gen = 5,000, population size N = 200, base fitness base = 10, benefit b = 5, cost c = 1, group splitting probability q = 0.05, N = 10, and N′ is decided by the following equation [9]:

    N′ = Σ_{i=1}^{m} n_i × (q_i × f_Ci(x) + (1 − q_i) × f_Di(x))    (2)
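Read literally, Eq. (2) sums each group's size times its average member fitness. A small sketch of that reading (the boolean-list encoding, True = cooperator, is our own illustration):

```python
def offspring_count(groups, fC, fD):
    """N' per Eq. (2): sum over groups of n_i times the group's average
    fitness, with a fraction q_i of members at cooperator fitness fC and
    the rest at defector fitness fD."""
    total = 0.0
    for g in groups:
        n_i = len(g)
        q_i = sum(g) / n_i              # fraction of cooperators in this group
        total += n_i * (q_i * fC(n_i, q_i) + (1 - q_i) * fD(n_i, q_i))
    return total
```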
For each algorithm, we measure the success ratio as the number of runs whose population converges to cooperators divided by the total number of runs (20). The larger this ratio, the more likely an algorithm favors cooperation. We also collect the average variance ratio [5], which indicates the composition difference between groups. The higher this ratio, the more prominent the effect of group selection.

3.1 The Effects of Group Size and Initial Fraction of Cooperators
First we investigate how the three algorithms behave under different group sizes. We set r = 0.5 and w = 1. Group size n is varied over {5, 10, 20, 50, 100}. The success ratio and average variance ratio (in brackets) for each setting are listed in the first three columns of Table 1.

Table 1. The effects of group size n and initial fraction of cooperators r on the three algorithms (success ratio, with average variance ratio in brackets)

            r = 0.5                           r = 0.3                           r = 0.1
n     W           T2          T1      W           T2          T1      W           T2          T1
5     1(0.196)    1(0.820)    1       1(0.201)    1(0.853)    0.95    1(0.196)    1(0.893)    0.7
10    1(0.092)    1(0.655)    0.85    1(0.098)    1(0.665)    0.55    1(0.095)    1(0.767)    0.2
20    0.8(0.045)  1(0.291)    0.65    0.55(0.045) 1(0.398)    0.25    0.25(0.042) 0.65(0.465) 0.1
50    0(0.015)    1(0.112)    0.15    0(0.016)    0.8(0.105)  0.1     0(0.015)    0.55(0.049) 0.05
100   0(0.004)    0(0.011)    0       0(0.005)    0(0.014)    0       0(0.005)    0(0.015)    0

As can be seen, the performance of T1 degrades
as n grows. The population in W converges to cooperators when small groups are employed (n = 5 or 10). As n increases, evolving cooperation becomes difficult. In contrast, T2 converges to cooperators in all cases except n = 100. This observation can be explained by Figure 1. Figure 1(a) shows that the variance ratio v in W decreases as n increases, which reduces the effect of group selection. As a result, selection on the individual level becomes the dominant force, so the population converges more quickly to defectors; see Figure 1(b). The same trend between v and n is also observed in T2. However, for n ranging from 5 to 50, its v value is much higher than or equal to the highest v value of W (see Figure 1(c)). This implies that T2 preserves variance between groups better than W, and explains why T2 is more effective than W in evolving cooperation.
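The paper takes the variance ratio from [5] without restating its definition. The sketch below is our assumption of the usual decomposition — between-group variance of the cooperator fraction divided by the total variance of the 0/1 cooperator indicator — which matches the metric's description as the composition difference between groups.

```python
def variance_ratio(groups):
    """Assumed between-group / total variance decomposition: 1.0 when every
    group is internally uniform (pure cooperator or pure defector groups),
    0.0 when every group has the same cooperator fraction."""
    N = sum(len(g) for g in groups)
    p = sum(sum(g) for g in groups) / N              # overall cooperator fraction
    total_var = p * (1 - p)                          # variance of a 0/1 indicator
    if total_var == 0:
        return 0.0                                   # population has converged
    between = sum(len(g) * (sum(g) / len(g) - p) ** 2 for g in groups) / N
    return between / total_var
```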
Fig. 1. The variance ratio v and fraction of cooperators r for algorithms W and T2 under different group sizes (n = 5, 10, 20, 50, 100) when r = 0.5 and w = 1: (a) variance ratio in W; (b) fraction of cooperators in W; (c) variance ratio in T2; (d) fraction of cooperators in T2.
Unlike W, convergence to cooperators in T2 is not accelerated as n gets smaller; for example, runs with n = 10 converge first. When groups are too small or too large, it takes much longer to remove defectors from the population (see Figure 1(d)). We further adjusted the value of r from 0.5 to 0.3 and 0.1. We were curious about the response of the three algorithms to this change, because when r drops, the number of cooperators assigned to groups is smaller, which increases the influence of individual selection within a group. As shown in Table 1, the performance of T1 decreases as r drops. For W and T2, when n is small (5 or 10), due to the strong group selection effects, the decrease of r does not affect the success ratio, but only slows convergence towards cooperation; for larger groups, as n increases (group selection is weaker) and r decreases (individual selection is stronger), group selection can hardly dominate individual selection, so it becomes difficult for both algorithms to preserve cooperation. However, T2 is less affected: for a given group size, similar v values are observed in W despite the changes of r, while relatively high v values are produced by T2 even as r drops.
3.2 Weak vs. Strong Selection
The composition of groups is not the only factor that drives the selection dynamics; the difference between the fitness values of cooperators and defectors is another. It affects the pressure put on groups and individuals. In the next experiment, we use the coefficient w to adjust the selection pressure. If w is small, the selection is called weak selection; otherwise it is called strong selection. We tested the three algorithms with r = 0.5 and w set to {0.1, 0.5, 1, 2, 5, 10}, respectively, on all group sizes. Results are shown in Table 2.

Table 2. The performance of the algorithms under weak and strong selection (success ratio, with average variance ratio in brackets)

            w = 0.1                           w = 0.5                           w = 1
n     W           T2          T1      W           T2          T1      W           T2          T1
5     1(0.197)    1(0.949)    0.6     1(0.201)    1(0.884)    0.9     1(0.196)    1(0.820)    1
10    1(0.095)    1(0.766)    0.6     1(0.096)    1(0.515)    0.8     1(0.092)    1(0.655)    0.85
20    0.85(0.044) 0.95(0.601) 0.55    1(0.046)    1(0.370)    0.65    0.8(0.045)  1(0.291)    0.65
50    0.4(0.015)  0.45(0.174) 0       0(0.015)    1(0.115)    0.3     0(0.015)    1(0.112)    0.15

            w = 2                             w = 5                             w = 10
n     W           T2          T1      W           T2          T1      W           T2          T1
5     1(0.196)    1(0.806)    1       0.9(0.196)  1(0.820)    1       0(0.190)    1(0.875)    1
10    1(0.096)    1(0.543)    0.9     0.1(0.096)  1(0.596)    0.8     0(0.096)    1(0.638)    0.85
20    0.1(0.042)  1(0.309)    0.8     0(0.050)    0.8(0.334)  0.5     0(0.046)    0.8(0.347)  0.15
50    0(0.014)    0.8(0.079)  0.15    0(0.014)    0.45(0.021) 0       0(0.016)    0.1(0.053)  0

One first
notices that the performance of all three algorithms first increases and then decreases as the selection pressure goes from weak to strong. If selection is too weak, the fitness values of the two roles and of the groups are very close. Hence, group and individual selection become nearly neutral, especially if large groups are adopted, so defectors can more easily take over the population. If the selection is too strong, group selection still favors cooperative groups, but because of the larger fitness difference between the two roles, cooperators are less likely to be selected. More specifically, for small groups (n = 5 or 10) only T2 can successfully preserve cooperation under both weak and strong selection. The increase of selection pressure raises the influence of individual selection. In response to this increase, the variance ratio in W for a given group size does not change at all, while T2 still maintains noticeably high variance ratios. This also explains why T2 outperforms W with larger groups.

3.3 Discussion
The above experiments demonstrate that maintaining variance between groups has great impact on group selection models. For W, if groups are randomly formed, small group sizes are desired because small groups increase group variance. This confirms previous investigations (see [5,8,11] for examples). We further show that such a requirement only works if the selection is weak. T2, because it is able to introduce high group variance, is more robust towards parameter
changes. The reason lies in how the two models manage groups. The mixing and reforming of groups in Wilson's model constantly averages out the variance between groups, so in Figure 1(a) we observe the variance between groups fluctuating. In contrast, because groups in Traulsen's model are kept isolated, and the selection step in reproduction is proportional to individual fitness, the fraction of cooperators in a cooperative group grows faster than in a less cooperative group, which gradually increases the variance between groups. T2 performs better than T1 under all settings, because removing an individual or a group with probability inversely proportional to fitness at the death step is very likely to eliminate defectors, which clearly helps cooperators.
4 Conclusion
Wilson’s and Traulsen’s models are possible extensions of EC to evolve cooperation. Here, we investigated evolutionary algorithms that adapt the two models, and analyzed their differences. Our experimental results show that an algorithm which extends Traulsen’s model is less sensitive to parameter changes than the algorithms based on the original Wilson and Traulsen models, because it is able to maintain high between-group variance, which is able to override individual selection arisen by the parameter changes. Future work will consider to theoretically investigate the extended algorithm; its extensions to multilevel selection; and its role in the theory of evolutionary transition.
References

1. Beckmann, B.E., McKinley, P.K., Ofria, C.: Selection for group-level efficiency leads to self-regulation of population size. In: Ryan, C., Keijzer, M. (eds.) Proceedings of the Conference on Genetic and Evolutionary Computation (GECCO), pp. 185–192. ACM, New York (2008)
2. Borrello, M.E.: The rise, fall and resurrection of group selection. Endeavour 29(1), 43–47 (2005)
3. Chang, M., Ohkura, K., Ueda, K., Sugiyama, M.: Group selection and its application to constrained evolutionary optimization. In: Proceedings of the Congress on Evolutionary Computation (CEC), pp. 684–691. IEEE Computer Society, Los Alamitos (2003)
4. Fletcher, J.A., Zwick, M.: N-player prisoner's dilemma in multiple groups: A model of multilevel selection. In: Boudreau, E., Maley, C. (eds.) Proceedings of the Artificial Life VII Workshops (2000)
5. Fletcher, J.A., Zwick, M.: The evolution of altruism: Game theory in multilevel selection and inclusive fitness. Journal of Theoretical Biology 245, 26–36 (2007)
6. Hales, D.: Emergent group level selection in a peer-to-peer network. Complexus 3(1–3), 108–118 (2006)
7. Knoester, D.B., McKinley, P.K., Ofria, C.A.: Using group selection to evolve leadership in populations of self-replicating digital organisms. In: Proceedings of the Conference on Genetic and Evolutionary Computation (GECCO), London, England, pp. 293–300 (2007)
8. Lenaerts, T.: Different Levels of Selection in Artificial Evolutionary Systems. PhD thesis, Computational Modeling Lab, Vrije Universiteit Brussel (2003)
9. Sober, E., Wilson, D.S.: Unto Others: The Evolution and Psychology of Unselfish Behavior. Harvard University Press, Cambridge (1999)
10. Traulsen, A., Nowak, M.A.: Evolution of cooperation by multilevel selection. Proceedings of the National Academy of Sciences of the USA (PNAS) 103, 10952–10955 (2006)
11. Wade, M.J.: A critical review of the models of group selection. The Quarterly Review of Biology 53, 101–114 (1978)
12. Wilson, D.S.: A theory of group selection. Proceedings of the National Academy of Sciences of the USA (PNAS) 72, 143–146 (1975)
The Evolution of Division of Labor Heather J. Goldsby, David B. Knoester, Jeff Clune, Philip K. McKinley, and Charles Ofria Department of Computer Science and Engineering Michigan State University East Lansing, Michigan 48824 {hjg,dk,jclune,mckinley,ofria}@cse.msu.edu
Abstract. We use digital evolution to study the division of labor among heterogeneous organisms under multiple levels of selection. Although division of labor is practiced by many social organisms, the labor roles are typically associated with different individual fitness effects. This fitness variation raises the question of why an individual organism would select a less desirable role. For this study, we provide organisms with varying rewards for labor roles and impose a group-level pressure for division of labor. We demonstrate that a group selection pressure acting on a heterogeneous population is sufficient to ensure role diversity regardless of individual selection pressures, be they beneficial or detrimental. Keywords: digital evolution, cooperative behavior, specialization, altruism.
1 Introduction
Within nature, many organisms live in groups where individuals assume different roles and cooperate to survive [1–4]. For example, in honeybee colonies, among other roles, drones care for the brood, workers forage for pollen, and the queen focuses on reproduction [1]. A notable aspect of these roles is that they do not all accrue the same fitness benefits. For example, leadership is a common role found within multiple species, where the benefits of leadership are significantly greater than those of a follower. In human societies, leaders of organizations commonly earn many times more than the average worker [4]. An open question is why individuals would not all attempt to perform the role associated with the highest fitness benefit, or in other words, why individuals would perform roles that put their genes at an evolutionary disadvantage for survival. Group selection pressures among human tribes have been proposed as one possible explanation for the evolution of leaders and followers [4]. In this paper, we explore whether group selection is sufficient to produce division of labor where individual selection rewards different roles unequally. Numerous evolutionary computation approaches have been used to study the behavior of cooperative groups comprising heterogeneous members [5–9]. Two key differentiating characteristics of these approaches are the level of selection used (i.e., individual or group) and whether or not division of labor

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 10–18, 2011.
© Springer-Verlag Berlin Heidelberg 2011
occurs. Ecological approaches [6] use individual-level selection in concert with limited resources to promote the evolution of specialists. Some coevolutionary approaches [5, 9] evolve cooperative groups using individual selection, where different species are isolated in distinct subpopulations. Cooperation among these species occurs only at the time of fitness evaluation, when individuals from one species are evaluated with representatives from each of the other species. Perez-Uribe et al. [7] and Waibel et al. [8] overview work performed in this area, and also describe the effects of varying the levels of selection and population composition (i.e., heterogeneous or homogeneous populations) [8]. However, these prior studies do not address multi-level selection, where organisms experience individual-level competition to survive and also group-level pressure to cooperate. To explore these conditions, which are pertinent to biological studies of the division of labor in cooperative groups, we apply multi-level selection to a heterogeneous population. For this study, we use Avida [10], a digital-evolution platform previously used to study topics including the origin of complex features [11] and the evolution of cooperation among homogeneous individuals [12]. Within an Avida experiment, a population of self-replicating computer programs exists in a user-defined computational environment and is subject to mutations and natural selection. These digital organisms execute their genome to perform tasks that metabolize resources in the environment, interact with neighboring organisms, and self-replicate. In this paper, we describe how we used Avida and multi-level selection to evolve groups of heterogeneous organisms that perform roles with different fitness benefits. First, we enabled organisms to self-select roles associated with different costs and/or benefits.
We applied a group-level pressure for division of labor that successfully counteracted the individual pressure to perform only the highest rewarded role. Second, rather than having an organism select a role, we conducted experiments in which a role was associated with a labor task that the organism had to perform. Again, we observed the evolution of division of labor. Third, we analyzed one of the successful groups and determined that it used a combination of genotypic diversity, phenotypic plasticity, and cooperation to perform all roles. The model developed for this approach can be used to inform biological studies of cooperation, such as those performed by Dornhaus et al. for honeybees [1]. Additionally, this technique can serve as a means to achieve division of labor within artificial life in order to solve engineering problems, such as developing multiple software components that must interact to achieve an overall objective [12], as well as cooperation among heterogeneous robots [13].
2 Methods
For each experiment, 30 trials were conducted to account for the stochastic nature of evolution. Figure 1 depicts an Avida organism and population. An Avida organism consists of a circular list of instructions (its genome) and a virtual CPU that executes those instructions. The virtual CPU architecture
comprises three general-purpose registers {AX, BX, CX} and two stacks {GS, LS}. The standard Avida instruction set used in this study is Turing complete and is designed so that random mutations will always yield a syntactically correct program, albeit one that may not perform any meaningful computation [14]. This Avida instruction set performs basic computational tasks (addition, multiplication, and bit-shifts), controls execution flow, enables communication, and allows for replication. In this study, the instruction set also included several instructions developed for the evolution of distributed problem solving [12]; these instructions are summarized in Table 1.

Avida organisms can perform tasks that enable them to metabolize resources from their environment. It is typical for these tasks to be logical operations performed on 32-bit integers. Performing tasks increases an organism's merit, which determines the rate at which its virtual CPU will execute instructions relative to the other organisms in the population. For example, an organism with a merit of 2 will, on average, execute twice as many instructions as an organism with a merit of 1. Since organisms self-replicate, an organism with greater merit will generally out-compete other organisms, eventually dominating the population.

Fig. 1. An Avida organism and population

For these experiments, the amount of merit that an organism gains for completing a task depends on a user-defined constant called the task's bonus value. When an organism performs a task, the organism's merit is multiplied by the task's bonus value. For example, if an organism performs a task with a bonus value of 2, its merit is doubled.

An Avida population comprises a number of cells in which at most one organism can live. Thus, the size of an Avida population is bounded by the number of cells in the environment. The cells are divided into a set of distinct subpopulations, called demes. In this study, demes compete every 100 updates in a tournament based on their fitness function, where a deme's fitness is determined by the behavior of its constituent organisms. An update is the unit of experimental time in Avida, corresponding to approximately 30 instructions per organism. Each tournament contains a set of demes selected at random, and the deme with the greatest fitness replaces the other demes (ties are broken randomly). When a deme is replaced, all organisms from the source deme are copied into the target deme, overwriting its previous inhabitants. Within each deme, organisms are still able to self-replicate. Thus, an individual's survival depends not only on its ability to out-compete its neighbors for the limited space available in its deme, but also on the collective behavior of the group. This process is similar to competition among human tribes [3].

For the experiments described in this paper, we created mutually-exclusive tasks for each possible role. An organism fulfills a role by performing its associated
The cells are divided into a set of distinct subpopulations, called demes. In this study, demes compete every 100 updates in a tournament based on their fitness function, where a deme’s fitness is determined by the behavior of its constituent organisms. An update is the unit of experimental time in Avida corresponding to approximately 30 instructions per organism. Each tournament contains a set of demes selected at random, and the deme with greatest fitness replaces the other demes (ties are broken randomly). When a deme is replaced, all organisms from the source deme are copied into the target deme, overwriting its previous inhabitants. Within each deme, organisms are still able to self-replicate. Thus, an individual’s survival is dependent not only on its ability to out-compete its neighbors for the limited space available in its deme, but also on the collective behavior of the group. This process is similar to competition among human tribes [3]. For the experiments described in this paper, we created mutually-exclusive tasks for each possible role. An organism fulfills a role by performing its associated inc get−id halloc hdiv rtrvm rotater rotater
The Evolution of Division of Labor
Table 1. Communication and coordination instructions for this study

  Instruction        Description
  send-msg           Sends a message to the neighbor currently faced by the caller.
  retrieve-msg       Loads the caller's virtual CPU from a previously received message.
  rotate-left-one    Rotates this organism counter-clockwise one step.
  rotate-right-one   Rotates this organism clockwise one step.
  get-role-id        Sets register BX to the value of the caller's role-id register.
  set-role-id        Sets the caller's role-id register to the value in register BX.
  bcast1             Sends a message to all neighboring organisms.
  get-cell-xy        Sets registers BX and CX to the x-y coordinates of the caller.
  collect-cell-data  Sets register BX to the value of the cell data where the caller lives.
task; each organism may have only one role. For some of our experiments, we used role-ids, a mechanism whereby an organism sets a special-purpose virtual CPU register to an integer value indicating the role it performs. For others, we required the organisms to implement logical operations. We tried these two mechanisms to see whether the complexity of the labor tasks affected the division of labor. We varied the benefits of performing a task (and thus performing a role) by changing the task's bonus value. In all experiments presented here, we used 400 demes, each containing 25 organisms on a 5 × 5 toroidal grid. Deme fitness was based on the diversity of tasks performed by the organisms. Thus, our experiments contain both the individual-level pressure to perform the task with the highest reward and the group-level pressure to diversify the tasks performed.
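The deme tournament described earlier (randomly drawn contestants, fittest deme overwrites the others, random tie-breaking) can be sketched as follows. This is an illustrative sketch, not Avida's implementation; the function name and deme representation are our own assumptions.

```python
import copy
import random

def compete_demes(demes, fitness_fn, tournament_size=2, rng=None):
    """Tournament-based deme competition (sketch): a set of demes is
    drawn at random and the deme with the greatest fitness overwrites
    the organisms of the other contestants. Ties are broken randomly
    by shuffling the contestants before taking the maximum."""
    rng = rng or random.Random()
    contestants = rng.sample(range(len(demes)), tournament_size)
    rng.shuffle(contestants)  # random tie-breaking
    winner = max(contestants, key=lambda i: fitness_fn(demes[i]))
    for i in contestants:
        if i != winner:
            # All organisms of the source deme replace the target's inhabitants.
            demes[i] = copy.deepcopy(demes[winner])
    return winner
```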
3 Experimental Results
Varying Roles. Our first experiment was designed to ascertain whether group selection is a strong enough pressure to produce division of labor when all roles have the same fitness benefit. For this experiment, we considered an organism to be performing a given role if it sets its role-id register to a desired value using the set-role-id instruction. We varied the desired number of roles from 2 to 20 and associated each role-id with a task that has a bonus value of 2. For example, when two roles are desired, the rewarded role-ids are 1 and 2. If an organism replicates after setting its role-id to 1, then it has completed task 1 and, as a result, its merit is doubled. Similarly, if five roles are desired, then the rewarded role-ids are {1, 2, 3, 4, 5}. If an organism sets its role-id to a value outside this range, then no reward is granted. Additionally, we impose a group-level pressure for both the number of organisms that have set a role-id and the diversity of the role-ids. Specifically, the deme fitness function used here is:

F = \begin{cases} 1 + n & \text{if } n < 25 \\ 1 + n + r & \text{if } n = 25 \end{cases} \qquad (1)
where F is the fitness of a deme, n is the number of organisms that have set a role-id, and r is the number of unique rewarded role-ids set by organisms in the
H.J. Goldsby et al.
Fig. 2. (a) grand mean and (b) grand maximum number of unique roles over all demes when the number of desired roles was varied from 2 to 20
deme. Experiments described in this paper were repeated with tournaments of size 2, 5, and 10; the results were not qualitatively different. Due to space limitations, we present results for a tournament of size 2, except where noted. Figure 2 depicts the grand mean and maximum performance of all demes across 30 trials for each of {2, 3, 4, 5, 10, 20} roles. The different curves represent the varying numbers of desired roles. In general, the best performing deme achieves the desired number of roles within 5,000 to 15,000 updates. Our analysis of the behavior of the demes indicates that they exploit location awareness (using the instruction get-cell-xy) to determine which role to perform. Specifically, an organism's x (or y) coordinate was used to calculate its role. Thus, we conclude that group selection is strong enough to produce division of labor among equally rewarded roles.

Varying Rewards. In the next set of experiments, we explored whether the group selection pressure for division of labor was strong enough to counteract rewarding roles unequally. The different rewards associated with roles provide an individual pressure to specialize on the most rewarded role, even if this behavior is detrimental to the performance of the group. This setup is designed to reflect a leader/follower situation, where it is desirable for the group to have one leader, and yet the rewards for the leader may be significantly different from those of a follower. To test this, we set the desired number of roles to 2 and conducted trials for different multiplicative benefits of role-id 1 (the leader role). All other role-ids were neutral (i.e., they did not affect merit). We then specified a group-level pressure to limit the leader role to a single organism. The deme fitness function used here was:
F = \begin{cases} 1 + n & \text{if } n < 25 \\ (1 + n - (o_1 - d_{o_1}))^2 & \text{if } n = 25 \end{cases} \qquad (2)
where F is the fitness of a deme, n is the number of organisms that have set a role-id, do1 is the desired number of organisms that perform the leader role, and o1 is the actual number of organisms that perform the leader role.
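The two deme fitness functions, Eqs. (1) and (2), can be sketched as follows. The list-of-role-ids representation and function names are our own assumptions; the arithmetic follows the equations in the text.

```python
def deme_fitness_roles(role_ids, rewarded_roles, deme_size=25):
    """Eq. (1): F = 1 + n while the deme is not full (n < deme_size);
    once full, add r, the number of unique rewarded role-ids present.
    role_ids holds one entry per organism, None if no role-id was set."""
    set_ids = [r for r in role_ids if r is not None]
    n = len(set_ids)
    if n < deme_size:
        return 1 + n
    r = len(set(set_ids) & set(rewarded_roles))
    return 1 + n + r

def deme_fitness_leader(role_ids, desired_leaders=1, deme_size=25):
    """Eq. (2): once the deme is full, fitness depends on the deviation
    of the actual leader count o1 from the desired count d_o1
    (leaders are the organisms whose role-id is 1)."""
    set_ids = [r for r in role_ids if r is not None]
    n = len(set_ids)
    if n < deme_size:
        return 1 + n
    o1 = sum(1 for r in set_ids if r == 1)
    return (1 + n - (o1 - desired_leaders)) ** 2
```

For a full deme of 25 organisms that all set role-id 1 with rewarded roles {1, 2}, Eq. (1) gives F = 1 + 25 + 1 = 27; a full deme with exactly one leader and d_o1 = 1 gives Eq. (2)'s maximum of 26² = 676.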
Figure 3 depicts the results of varying the multiplicative benefit of the leader role from 0.5 (a penalty) to 64 (a significant reward) across 30 trials. Each treatment has two lines: a dashed line to indicate the number of followers, and a solid line to indicate the number of leaders; different symbols are used to indicate different treatments. In general, by 50,000 updates, the average deme has reached an equilibrium between leaders and followers, where the number of leaders is less than five in all treatments. The larger the benefit of leadership, the slower the population was to reach equilibrium and the larger the number of leaders. These results indicate that group selection is able to effectively counteract individual selection pressures.

To assess the generality of these results, we ran a control experiment without group selection, where the fitness of all demes was always equivalent. As expected, the majority of organisms chose to be leaders when there was a reward and followers when there was a penalty. Additionally, we ran two-role experiments where we set the desired number of leaders to be 13 (approximately half of the population). Similar to the results depicted in Figure 3, the population reached equilibrium by 50,000 updates with the population nearly evenly divided between leaders and followers.

Fig. 3. Group selection is strong enough to overcome individual pressures for leadership, which we test by varying the multiplicative bonus value of leadership from 0.5 to 64

Lastly, we conducted treatments where we set the desired number of role-ids to be five and varied the distribution of the benefits among roles. Specifically, we conducted experiments where the benefits: (1) increased linearly; (2) one role was rewarded significantly more than the others; and (3) the majority of task bonus values were penalties.
For these experiments, the deme fitness function was the number of unique tasks performed. In all cases, the best performing deme rapidly evolved to perform all five roles, with the average deme performing three or more roles.

Increasing Complexity. For the last set of experiments, we studied whether this multi-level selection technique was sufficient to evolve division of labor when the complexity of the corresponding tasks varied. In the first two sets of experiments, an organism performed a role by setting its role-id to a specific value. For these experiments, we required the organism to perform a bit-wise logic operation on 32-bit integers. In this case, we used five mutually-exclusive logic operations: not, nand, and, orn, or. For the first experiment in this set, we rewarded all five logic operations equally with a task bonus value of 2. The deme fitness function was set to the number of unique tasks performed plus 1. The best demes varied between 4 and 5 tasks, whereas the average deme consistently performed between 3 and 4 tasks. Thus, this technique was successful in producing division of labor.
Next we examined whether division of labor occurred when the tasks were complex and their rewards were unequal. We rewarded the tasks based on complexity. Task not was assigned a bonus value of 2; tasks nand, and, and orn were assigned a bonus value of 3; and or was assigned a bonus value of 4. This treatment increases the difficulty of the problem because organisms must evolve to perform logic operations and coordinate roles with different benefits. The best performing demes continued to vary between 4 and 5 tasks, and the average deme continued to perform between 3 and 4 tasks. This result indicates that the group selection pressure is strong enough to counteract the varying rewards among complex individual tasks.

Behavioral Analysis. To better understand how group selection maintained the diversity of all five logic tasks in the previous treatment, we analyzed one deme from the end of a trial where all tasks had equivalent rewards. This deme had 18 unique genotypes. After we put the deme through a period of 100 updates without mutations (an ecological period), the deme maintained all 5 tasks and had 6 different genotypes. Figure 4 provides a graphical depiction of the genotype (shading) and phenotype (task-id of the task performed) of the organisms in the deme. Of the 6 genotypes, five exhibited phenotypic plasticity. Specifically, depending on their environmental context, genotypes in these families could perform one of two different tasks, nand/orn and not/and, respectively. For example, the first two blocks are both shaded white, indicating they have the same genotype, and yet one performs task 4 (orn) and the other performs task 2 (and). The remaining genotype (depicted in black) exclusively performed task 5 (or).

Fig. 4. A deme after an ecological period. Genotypes are denoted by different color shading. Phenotypes are indicated by the task-id of the role performed.
It is often the case that explanations from a group selection perspective can also be interpreted from a kin selection perspective; both are equally valid yet provide different intuitions [15]. For this paper, interpreting the results from a group selection perspective provides the most intuitive explanations. Additionally, our analyses revealed that the different genotypes belonged to three distantly-related lineages, with those performing tasks 1-4 more closely related to each other than to the genotype performing task 5.

Lastly, we conducted "knockout" experiments for each communication instruction (send-msg, retrieve-msg) and environment-sensing instruction (get-cell-xy, collect-cell-data). Specifically, all instances of the target instruction were replaced with a placeholder instruction that performs no useful function. We conducted 30 trials for each instruction knockout, and the results indicated that the ability to communicate with neighboring organisms and to sense the environment were critical for the success of the group. Without these instructions, at the end of a 100-update period (that did include mutations), the deme performed an average
of 2.2 tasks. In summary, the most effective deme strategy relied on a combination of genotypic diversity, phenotypic plasticity, and cooperation to achieve all five tasks.
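The knockout procedure described above can be sketched as a simple genome transformation. The placeholder name is hypothetical; Avida's actual no-op instruction may differ.

```python
def knockout(genome, target_instruction, placeholder="nop-X"):
    """Instruction knockout (sketch): every instance of the target
    instruction is replaced by a placeholder that performs no useful
    function, removing the capability without changing genome length."""
    return [placeholder if instr == target_instruction else instr
            for instr in genome]
```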
4 Conclusion
In this paper, we demonstrated that group selection is a sufficient pressure to produce division of labor among heterogeneous organisms. Specifically, we found that group selection can produce demes whose constituent organisms perform five different, mutually-exclusive complex roles, regardless of the underlying reward structure. In future work, we seek to use this technique to better understand the behavior of social organisms and to harness this approach to apply evolutionary computation to complicated problems, such as automatically generating software systems [12].

Acknowledgements. The authors thank Ben Beckmann, Jeff Barrick, Rob Pennock, and Carlos Navarrete for their contributions. This work was supported by the U.S. National Science Foundation grants CCF-0820220, CCF-0643952, and CCF-0523449.
References

1. Dornhaus, A., Klügl, F., Puppe, F., Tautz, J.: Task selection in honeybees - experiments using multi-agent simulation. In: Proc. of the Third German Workshop on Artificial Life (1998)
2. Robinson, G.E.: Regulation of division of labor in insect societies. Annual Review of Entomology 37, 637–665 (1992)
3. Bowles, S., Hopfensitz, A.: The co-evolution of individual behaviors and social institutions. Journal of Theoretical Biology 223(2), 135–147 (2003)
4. Van Vugt, M., Hogan, R., Kaiser, R.: Leadership, followership, and evolution: Some lessons from the past. American Psychologist 63, 182–196 (2008)
5. Potter, M.A., De Jong, K.A.: Cooperative coevolution: An architecture for evolving coadapted subcomponents. Evol. Comput. 8(1), 1–29 (2000)
6. Goings, S., Ofria, C.: Ecological Approaches to Diversity Maintenance in Evolutionary Algorithms. In: IEEE Symposium on Artificial Life, Nashville, TN (2009)
7. Perez-Uribe, A., Floreano, D., Keller, L.: Effects of group composition and levels of selection in the evolution of cooperation in artificial ants. In: Banzhaf, W., Ziegler, J., Christaller, T., Dittrich, P., Kim, J.T. (eds.) ECAL 2003. LNCS (LNAI), vol. 2801, pp. 128–137. Springer, Heidelberg (2003)
8. Waibel, M., Keller, L., Floreano, D.: Genetic team composition and level of selection in the evolution of cooperation. IEEE Transactions on Evolutionary Computation (2009) (to appear)
9. Yong, C.H., Miikkulainen, R.: Cooperative coevolution of multi-agent systems. Technical report, University of Texas at Austin (2001)
10. Ofria, C., Wilke, C.O.: Avida: A software platform for research in computational evolutionary biology. Journal of Artificial Life 10, 191–229 (2004)
11. Lenski, R.E., Ofria, C., Pennock, R.T., Adami, C.: The evolutionary origin of complex features. Nature 423, 139–144 (2003)
12. McKinley, P.K., Cheng, B.H.C., Ofria, C., Knoester, D.B., Beckmann, B., Goldsby, H.J.: Harnessing digital evolution. IEEE Computer 41(1) (2008)
13. Labella, T.H., Dorigo, M., Deneubourg, J.L.: Division of labor in a group of robots inspired by ants' foraging behavior. ACM Transactions on Autonomous and Adaptive Systems (TAAS) (2006)
14. Ofria, C., Adami, C., Collier, T.C.: Design of evolvable computer languages. IEEE Transactions on Evolutionary Computation 6, 420–424 (2002)
15. Foster, K., Wenseleers, T., Ratnieks, F., Queller, D.: There is nothing wrong with inclusive fitness. Trends in Ecology and Evolution 21, 599–600 (2006)
A Sequence-to-Function Map for Ribozyme-Catalyzed Metabolisms

Alexander Ullrich (1) and Christoph Flamm (2)

(1) University of Leipzig, Chair for Bioinformatics, Härtelstr. 16, 04275 Leipzig, Germany
(2) University of Vienna, Institute for Theoretical Chemistry, Währingerstr. 17, 1090 Vienna, Austria
Abstract. We introduce a novel genotype-phenotype mapping, based on the relation between an RNA sequence and its secondary structure, for use in evolutionary studies. Various extensive studies of RNA folding in the context of neutral theory have yielded insights into properties of the structure space and the mapping itself. We intend to gain a better understanding of some of these properties, especially the evolution of RNA molecules and their effect on the evolution of the entire molecular system. We investigate the constitution of the neutral network and compare our mapping with other artificial approaches using cellular automata, random Boolean networks, and others also based on RNA folding. Our mapping yields the highest extent, connectivity, and evolvability of the underlying neutral network. Further, we successfully apply the mapping in an existing model for the evolution of a ribozyme-catalyzed metabolism.

Keywords: genotype-phenotype map, RNA secondary structure, neutral networks, evolution, robustness, evolvability.
1 Introduction
It has been known for many years that neutral mutations have a considerable influence on evolution in molecular systems [14]. The RNA sequence-to-structure map, with its many-to-one property, represents a system entailing considerable neutrality. It has been used successfully to shed light on the mechanisms of evolution [12,13] and to explain basic properties of biological systems, such as resolving the interplay between robustness and evolvability [19]. Several alternative scenarios have been proposed to describe the emergence and evolution of metabolic pathways (for a review see [4]), including the spontaneous evolution of enzymes with novel functions, the "backwards evolution scenario", which suggests that pathways evolve backward from key metabolites, and the "enzyme recruitment scenario" or "patchwork model", which assumes that highly specialized and efficient enzymes evolved from a few inefficient ancestral enzymes

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 19–26, 2011. © Springer-Verlag Berlin Heidelberg 2011
A. Ullrich and C. Flamm
Fig. 1. The RNA sequence-to-structure map: There are many more sequences than structures which brings redundancy into the map. Sequences which fold into the same secondary structure form extended neutral networks in sequence space. The strong interweavement of the neutral networks implies that the sequences in a small volume around an arbitrary sequence realize all possible secondary structures.
with broad specificities. While these scenarios explain certain aspects of the early history of life and the evolution of metabolic pathways, the general mechanisms of the transition from an uncatalysed to a catalysed reaction network and the evolution of novel catalytic functions remain poorly understood. We have proposed a multi-level computational model to study this transition in particular and the evolution of catalyzed reaction networks in general [18]. The model combines a graph-based algebraic chemistry [1] with an RNA-based structure-to-function map within a protocellular entity. The structure-to-function map projects sequence and structure features of the ribozyme into a hierarchical classification scheme for chemical reactions based on Fujita's "imaginary transition structure" concept [8]. The structure-to-function map itself is evolvable and allows the emergence of completely new catalytic functions. Furthermore, it forms the link between the genotype (i.e., the RNA sequence of the ribozymes) and the phenotype (i.e., the (partially) catalyzed reaction network).
2 The RNA Sequence-to-Structure Map
The folding of RNA sequences into their respective secondary structures [11] is a relatively well understood biophysically grounded process. Efficient algorithms [9] have been developed which allow the computation of nearly any desired thermodynamic or kinetic property of RNA sequences at the level of secondary structure. The relation between RNA sequences and their secondary structures can be viewed as a minimal model of a genotype-phenotype map (see Figure 1), where the RNA sequence carries the “heritable” information for the formation of the phenotype, which in turn is the RNA secondary structure upon which selection acts in evolutionary experiments [2,3].
Fig. 2. The structure-to-function map: (lhs) the colored regions of the ribozyme fold determine the catalytic function, i.e., which leaf of the ITS-tree is picked; (middle) along the levels of the ITS-tree the amount of chemical detail increases; (rhs) acidic hydrolysis of ethyl acetate. The ITS of the reaction is obtained from the superposition of educts and products by removing those atoms and bonds which do not directly participate in the reaction (shown in green).
The statistical architecture of the RNA sequence-to-structure map and its implications for evolutionary dynamics [6,7] have been extensively studied over the past decade. In particular, the map possesses a high degree of neutrality, i.e., sequences which fold into the same secondary structure are organized into extended, mutationally connected networks reaching through sequence space. A walk along such a "neutral network" leaves the structure unchanged while the sequence randomizes. The existence of neutral networks in sequence space has been demonstrated in a recent experiment [16]. Because the neutral networks are strongly interwoven, the sequence-to-structure map shows another interesting property called "shape space covering" [17]: within a relatively small volume of sequence space around an arbitrary sequence, any possible secondary structure is realized. Both features of the RNA sequence-to-structure map account for the directionality and the partially punctuated nature of evolutionary change.
3 The Mapping
In the following, we elaborate the aspects of our novel genotype-phenotype mapping. The mapping is realized in two steps. While both steps have a biological motivation, they are also designed with statistical considerations in mind. We will refer to the two steps as the sequence-to-structure and the structure-to-function map, respectively. The sequence-to-structure map is equivalent to the folding of the RNA sequence into its minimum free energy structure. We use the RNA sequence-to-structure map on the one hand to model ribozymes for our artificial metabolism
(Section 4) and on the other hand to add the properties described in Section 2 to the mapping. The second and more interesting part, the structure-to-function map, can vary in some aspects for different applications, since the resulting phenotype spaces may differ in size and structure, as may the constitution of the phenotype itself. Here we focus on the mapping for our artificial metabolism model (Section 4) and therefore map from the RNA secondary structure to a chemical reaction. The reactions are represented by their transition state structure, and the classification of those structures is hierarchical (see Figure 2). Accordingly, we extract features for every level to determine the resulting reaction. We consider structural and sequence information only from the longest loop in the structure and its adjacent stems. The length of the loop specifies the size of the transition state, i.e., the number of involved atoms. The length and starting base-pairs of the adjacent stems define the constitution of the basic reaction, i.e., the abundance and position of single, double, or triple bonds. Finally, the sequence inside the loop region determines the specific reaction, i.e., the position and type of the involved atoms. The decision to focus on the loop region is based on the observation that most catalytic RNA molecules have their active center in a hairpin loop, and enzymes in general have an active site that covers only a small part of the entire structure. Furthermore, some studies suggest that transition-state stabilization is an important catalytic strategy for ribozymes [15], matching our representation of chemical reactions as transition state structures.
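The three-level feature extraction described above can be sketched on a dot-bracket secondary structure as follows. This is a simplified sketch under our own assumptions: the exact encoding of the basic and specific reaction in the authors' model is richer than the tuple returned here.

```python
import re

def structure_features(sequence, structure):
    """Three-level structure-to-function sketch on a dot-bracket string.
    Level 1: length of the longest loop -> transition-state size.
    Level 2: lengths of the stems flanking that loop -> basic reaction.
    Level 3: sequence inside the loop -> specific reaction."""
    # The longest run of unpaired bases ('.') is taken as the longest loop.
    loops = [(m.start(), m.end()) for m in re.finditer(r"\.+", structure)]
    start, end = max(loops, key=lambda se: se[1] - se[0])
    ts_size = end - start                                 # level 1
    # Count the run of '(' immediately left and ')' immediately right of the loop.
    left = len(structure[:start]) - len(structure[:start].rstrip("("))
    right = len(structure[end:]) - len(structure[end:].lstrip(")"))
    basic_reaction = (left, right)                        # level 2
    specific_reaction = sequence[start:end]               # level 3
    return ts_size, basic_reaction, specific_reaction
```

For a hairpin such as `GGGAAAACCC` with structure `(((....)))`, the loop of length 4 gives the transition-state size, the two flanking stems of length 3 encode the basic reaction, and the loop sequence `AAAA` selects the specific reaction.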
4 Ribozyme-Catalyzed Artificial Metabolism
In order to investigate the evolution of biological networks and the emergence of their properties, we have developed a sophisticated model of a ribozyme-catalyzed metabolism. The graph-based model is supported by an artificial chemistry to allow for realistic kinetic behavior of the system. The basic elements of the model are represented as graphs. In the case of the metabolites, this corresponds to the intuitive representation of molecules in chemistry. Similarly, metabolic reactions can be represented as a set of graphs or by one superimposition graph combining the relevant parts of all involved molecules. The reaction process itself is performed by a graph-rewriting mechanism, through which new molecules can be generated. The conjunction of metabolites and their participation in metabolic reactions is captured in the metabolic network. We use several techniques to analyze the properties of the produced networks and compare them with analyses of real-world networks [18]. Besides the metabolism, our model also comprises a genome that is divided into several genes by TATA-boxes. The genes are RNA sequences of fixed length, intended to code for the catalytic elements, the ribozymes, of the model system. The idea for the genotype-phenotype mapping that we introduce in this paper originates from the task of finding a realistic and efficient mapping from these RNA sequences (genotype) to the ribozymes (phenotype), or better, to the chemical reactions they implement.
Fig. 3. Overview of the artificial metabolism model. Each gene of the genome is expressed as a ribozyme. While these catalysts are trapped within the protocellular entities, metabolites under a certain size can pass through the membrane, allowing exchange with the environment.
5 Results
We will now compare our genotype-phenotype mapping with other existing approaches. Mappings based on cellular automata (CA) and random Boolean networks (RBN) have been shown to exhibit desirable properties [5] and thus serve here as references. In previous evolutionary models, and for the statistical analysis in this paper, we developed other mappings based on the RNA sequence-to-structure map but with varying structure-to-function maps. Here we mention only two of them. The first mapping assigns one target structure to each point in the phenotype space; a structure then maps to the phenotype of the closest target structure. The second mapping extracts structural features, similar to our mapping, but from the entire structure.

5.1 Random Neutral Walk
One way of obtaining statistical results for comparing different genotype-phenotype mappings is through a random neutral walk on the genotype space. The walk starts from an arbitrary point of the mapping's genotype space. The first step is to choose randomly one neutral mutation out of the set of all possible one-point mutations. Applying the mutation creates a new genotype that differs from its predecessor in only one position and maps to the same point in phenotype space. This procedure is repeated until the length of the walk reaches a pre-defined limit or no neutral mutation can be applied. In each step, we keep track of several statistics that give insight into the structure of the neutral network. For this aim, we observe the one-point mutation neighborhoods of all genotypes of the random neutral walk. The fraction of neutral mutations within these neighborhoods will be referred to as the neutrality
Fig. 4. Random neutral walk: from an arbitrary point in genotype space, the walk moves to random neutral neighbors until a certain length is reached. In each step, the phenotypes encountered in the one-point mutation neighborhood are recorded.
of the underlying neutral network. In terms of biological systems, this can also be regarded as the robustness gained through the mapping, since neutrality protects against harmful mutations. Further, we count the new phenotypes encountered during the course of the walk, i.e., the number of different phenotypes found in the one-point neighborhoods. This measurement indicates the rate at which the mapping discovers new phenotypes and other neutral components. The faster a system can access different points in the phenotype space, the more evolvable it is. Thus, the discovery rate can be equated with the notion of evolvability. Other measures that we regard as important here are the extent and the connectivity of the neutral networks. The former is given by the maximal distance between genotypes of the neutral walk; the latter is defined as the fraction of possible phenotypes that can be reached from an arbitrary starting configuration.
5.2 Comparison
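As a concrete illustration of the procedure described in Section 5.1, a random neutral walk can be sketched as follows. This is a simplified sketch; the function names and the toy phenotype function in the example are our own, not the mappings used in the study.

```python
import random

def random_neutral_walk(genotype, phenotype_fn, alphabet, max_steps=100, rng=None):
    """Random neutral walk (sketch): from the current genotype, pick a
    random neutral one-point mutation (same phenotype) and apply it;
    stop at max_steps or when no neutral mutation exists. Every
    phenotype seen in the one-point neighborhoods is recorded, which
    yields the neutrality and discovery-rate statistics."""
    rng = rng or random.Random()
    target = phenotype_fn(genotype)
    seen_phenotypes = set()
    walk = [genotype]
    for _ in range(max_steps):
        neighbors = [genotype[:i] + c + genotype[i + 1:]
                     for i in range(len(genotype))
                     for c in alphabet if c != genotype[i]]
        neutral = []
        for nb in neighbors:
            ph = phenotype_fn(nb)
            seen_phenotypes.add(ph)
            if ph == target:
                neutral.append(nb)
        if not neutral:
            break
        genotype = rng.choice(neutral)
        walk.append(genotype)
    return walk, seen_phenotypes
```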
For the following study, we performed 1000 random neutral walks of length 100 for each mapping. The sizes of the genotype spaces differ between mappings: for the RNA-based mappings the size is 4^100, for the random Boolean network mapping it is 2^144, and for the cellular automaton mapping 2^76. The phenotype space has size 2^8 = 256 for all mappings. Figure 5 shows the results of the performed random walks. It can be seen that our mapping (RNA loop) reaches the most phenotypes (≈ 200 of 256), followed by the other RNA-based mappings (≈ 175 and 150), the random Boolean network (145), and the cellular automaton (100). We can also see that the difference in the first steps is even more drastic, indicating a faster discovery rate of our mapping compared to all others. This can be explained by the connectivity: whereas CA and RBN have about 14 and 21 neighboring phenotypes,
Fig. 5. Encountered phenotypes in a random neutral walk of length 100 (left) and for the first 20 steps (right). RNA loop = our mapping; RNA full = RNA-based mapping considering the entire structure; RNA distance = RNA-based mapping using target structures; RBN = random boolean network mapping; CA = cellular automaton mapping.
respectively, our mappings have about 27 distinct phenotypes in their one-point neighborhoods. Furthermore, our mapping travels further in genotype space, allowing a steady discovery of new phenotypes. However, in terms of neutrality the random Boolean network mapping performs best: ≈ 58% of its mutations are neutral, for CA the figure is ≈ 44%, and the RNA-based mappings lie in between at around 50%.

5.3 Discussion
We have proposed a genotype-phenotype mapping that is suitable for evolutionary studies, as shown by the given example and the comparative study with other mappings. Due to its advantageous properties, it is conceivable to use our mapping in studies where so far rather simple artificial genomes have been used and where the RNA sequence-to-structure map can be integrated in a meaningful way, such as the evolution of regulatory networks [10]. First of all, our mapping allows open-ended evolution of the simulation model, since it continuously produces new phenotypes and does not get trapped in local optima. Further, it provides sufficient robustness to evolve a stable system while at the same time achieving high evolvability, allowing the system to react quickly to perturbations or changes in the environment. We have also shown that a good genotype-phenotype mapping needs more than neutrality: the connectivity and extent of the underlying neutral network are at least as important. By observing the mapping within the proposed evolutionary study of ribozyme-catalyzed metabolisms, we hope to gain insights not only into the emergence of pathways but also into the properties of the mapping itself, and thus of the RNA sequence-to-structure map.
Acknowledgments. We thank the VolkswagenStiftung, the Vienna Science and Technology Fund (WWTF), project number MA07-30, and COST Action CM0703 "Systems Chemistry" for financial support.
References
1. Benkö, G., Flamm, C., Stadler, P.F.: A Graph-Based Toy Model of Chemistry. J. Chem. Inf. Comput. Sci. 43, 1085–1093 (2003)
2. Biebricher, C.K.: Darwinian Selection of Self-Replicating RNA Molecules. Evolutionary Biology 16, 1–52 (1983)
3. Biebricher, C.K., Gardiner, W.C.: Molecular Evolution of RNA in vitro. Biophys. Chem. 66, 179–192 (1997)
4. Caetano-Anollés, G., Yafremava, L.S., Gee, H., Caetano-Anollés, D., Kim, H.S., Mittenthal, J.E.: The origin and evolution of modern metabolism. Int. J. Biochem. Cell Biol. 41, 285–297 (2009)
5. Ebner, M., Shackleton, M., Shipman, R.: How neutral networks influence evolvability. Complexity 7, 19–33 (2001)
6. Fontana, W., Schuster, P.: Continuity in evolution: on the nature of transitions. Science 280, 1451–1455 (1998)
7. Fontana, W., Schuster, P.: Shaping Space: the Possible and the Attainable in RNA Genotype-Phenotype Mapping. J. Theor. Biol. 194, 491–515 (1998)
8. Fujita, S.: Description of Organic Reactions Based on Imaginary Transition Structures. J. Chem. Inf. Comput. Sci. 26, 205–212 (1986)
9. Gruber, A.R., Lorenz, R., Bernhart, S.H., Neuböck, R., Hofacker, I.L.: The Vienna RNA Websuite. Nucl. Acids Res. 36, W70–W74 (2008)
10. Hallinan, J., Wiles, J.: Evolving genetic regulatory networks using an artificial genome. In: Proc. 2nd Asia-Pacific Bioinformatics Conference, vol. 29, pp. 291–296 (2004)
11. Hofacker, I.L., Stadler, P.F.: RNA secondary structures. In: Lengauer, T. (ed.) Bioinformatics: From Genomes to Therapies, vol. 1, pp. 439–489 (2007)
12. Huynen, M.A.: Exploring phenotype space through neutral evolution. J. Mol. Evol. 43, 165–169 (1996)
13. Huynen, M.A., Stadler, P.F., Fontana, W.: Smoothness within ruggedness: the role of neutrality in adaptation. Proc. Natl. Acad. Sci. USA 93, 397–401 (1996)
14. Kimura, M.: Evolutionary rate at the molecular level. Nature 217, 624–626 (1968)
15. Rupert, P.B., Massey, A.P., Sigurdsson, S.T., Ferré-D'Amaré, A.R.: Transition State Stabilization by a Catalytic RNA. Science 298, 1421–1424 (2002)
16. Schultes, E.A., Bartel, D.P.: One sequence, two ribozymes: Implications for the emergence of new ribozyme folds. Science 289, 448–452 (2000)
17. Schuster, P., Fontana, W., Stadler, P.F., Hofacker, I.L.: From sequences to shapes and back: a case study in RNA secondary structures. Proc. Roy. Soc. (London) B 255, 279–284 (1994)
18. Ullrich, A., Flamm, C.: Functional Evolution of Ribozyme-Catalyzed Metabolisms in a Graph-Based Toy-Universe. In: Heiner, M., Uhrmacher, A.M. (eds.) CMSB 2008. LNCS (LNBI), vol. 5307, pp. 28–43. Springer, Heidelberg (2008)
19. Wagner, A.: Robustness and evolvability: a paradox resolved. Proc. Roy. Soc. B 275, 91–100 (2008)
Can Selfish Symbioses Effect Higher-Level Selection?

Richard A. Watson, Niclas Palmius, Rob Mills, Simon T. Powers, and Alexandra Penn

Natural Systems Group, University of Southampton, U.K.
[email protected]

Abstract. The role of symbiosis in macro-evolution is poorly understood. On the one hand, symbiosis seems to be a perfectly normal manifestation of individual selection; on the other hand, in some of the major transitions in evolution it seems to be implicated in the creation of new higher-level units of selection. Here we present a model of individual selection for symbiotic relationships where individuals can genetically specify traits which partially control which other species they associate with – i.e. they can evolve species-specific grouping. We find that when the genetic evolution of symbiotic relationships occurs slowly compared to ecological population dynamics, symbioses form which canalise the combinations of species that commonly occur at local ESSs into new units of selection. Thus even though symbioses will only evolve if they are beneficial to the individual, we find that the symbiotic groups that form are selectively significant and result in combinations of species that are more cooperative than would be possible under individual selection. These findings thus provide a systematic mechanism for creating significant higher-level selective units from individual selection, and support the notion of a significant and systematic role of symbiosis in macro-evolution.
1 Introduction: Can Individual Selection Create Higher-Level Selection?

Symbiotic relationships in general are ubiquitous and uncontroversial, but the role of symbiosis in macro-evolutionary processes such as the major evolutionary transitions (1) and symbiogenesis (the creation of new species through symbiosis) (2) is poorly understood. Clearly, the evolution of symbiotic relationships may change the effective selection pressures on individuals in complex ways – but can it enable higher-level selection? When the fitness of individuals is context sensitive (i.e. under frequency-dependent selection), grouping individuals together in small groups can change the average selection pressure on cooperative traits by altering the variance in contexts (3, 4). This effect is stronger when group membership is assortative on behavioural traits (5). In most models, however, the existence of groups is presupposed, and accordingly any group selection effect observed is unsurprising in the sense that it is fully explained by changes in individual selection given the context of these groups. In contrast, we are interested in scenarios where individually selected traits affect the strength of group selection or create group selection de novo (6).

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 27–36, 2011. © Springer-Verlag Berlin Heidelberg 2011

For example, related
work addresses the evolution of individually specified traits that affect group size (7, 8), or the evolution of markers that influence behavioural grouping (9). Here we address a multi-species scenario where species can evolve symbiotic relationships that allow explicit control over whether they group and who they group with. Symbiosis, the living together of different species, implies that one species ‘seeks out’ another, actively controlling (to a limited extent) the species composition of its environmental context. When organisms create their own environments a complex dynamic is created between the traits they evolve that affect their symbiotic relationships, and the ‘ordinary traits’ (traits that do not affect symbioses) they evolve given the context they have created for themselves. Our research question concerns whether it is possible for an individual to evolve symbiotic relationships that cause it to create a significant higher-level unit of selection. This might seem to be a logical impossibility because for a higher-level unit of selection to be significant one would ordinarily assert that it must oppose individual selection. And, if a group opposes individual selection then a defector or selfish individual that exploits the group will be fit and take over. Of course, group selection that acts in alignment with individual selection is possible – e.g. individual selection may cause a mixed population to reach some evolutionarily stable strategy (ESS) (10) and group selection that acts in alignment with individual selection might cause a population to reach this ESS more quickly, but it cannot cause it to go somewhere other than the local ESS. But we show this conclusion is too hasty. We show that in cases where group selection acts in alignment with individual selection it can alter evolutionary outcomes. This requires that we consider a different type of evolutionary game, however. 
The literature on group selection is largely preoccupied with the prisoners' dilemma (11) – a game that has only one ESS (10), 'Defect'. Although a group of cooperative individuals is collectively fitter than a group of defectors, the cooperative group can never be stable, given that the payoff for Defect is higher than the payoff for Cooperate when playing against cooperators. Thus if groups are imposed, Cooperate:Cooperate will beat Defect:Defect, but a Cooperate:Cooperate group cannot be maintained by individual selection. In contrast, a game that has more than one ESS is a different matter. A coordination game of two strategies, for example, has two ESSs – call them A-A and B-B – and these ESSs may have different overall utility; let us say that an A-A group beats a B-B group. The difference is that in a game with multiple ESSs, each ESS can be supported by individual selection (there is no 'cheat' strategy that can invade either ESS), and this means that the two groups need not be externally imposed in order to be stable. Nonetheless, the evolutionary outcome can be significantly different from the outcome of individual selection without grouping. For example, with no grouping, if the utility of A-A is only slightly higher than that of B-B, the population will reach the ESS that is closest to the initial conditions – for example, if B has a significant majority this will be the B-B ESS. But with grouping, the A-A ESS can be reached even if B has a significant majority, because when A's interact disproportionately with other A's they are fitter than B's. In the models that follow we will show that individual selection causes groups to form that represent combinations of species from different ESSs and thus allows the highest-utility ESS to be found. We intend our model to represent the evolution of symbiotic relationships between species, not just assortativity of behaviours within a single species. Thus we permit
competition between heterogeneous groups (e.g. AB vs CD, where A and B are behaviours provided by unrelated species) rather than homogeneous groups (e.g. AA vs BB), as would be more conventional in a single-species assortative grouping model (where relatedness and inclusive fitness concepts straightforwardly apply) (12). By using a poly-species model we can show that the process we model significantly increases the likelihood of reaching a higher-utility ESS even in cases where the basin of attraction for high-utility ESSs is initially very small (13). Note that we do not change the interaction coefficients between species but only change the co-location or interaction probability of species. A species might thus change its fitness by increasing the probability of interacting with another (which is what we mean by a symbiosis), but it cannot change its intrinsic fitness dependency on that species (as might be part of a more general model of coevolution (4,14,15)). There are clearly many ways in which organisms can change interaction probabilities with other organisms, either subtly or radically (16).
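The claim that grouping can move a population between the ESSs of a coordination game can be illustrated with a minimal replicator-dynamics sketch. The payoffs (A-A pays 1.0, B-B pays 0.9, mismatches pay 0) and the assortment parameter r are our own illustrative choices, not part of the authors' model:

```python
def evolve(p, r, steps=200):
    """Discrete replicator dynamics for a 2-strategy coordination game.

    Assumed payoffs: A vs A -> 1.0, B vs B -> 0.9, mismatched pairs -> 0.
    r is the probability of meeting one's own type regardless of its
    frequency (assortative grouping); r = 0 recovers random pairing.
    p is the frequency of strategy A.
    """
    for _ in range(steps):
        f_a = (r + (1.0 - r) * p) * 1.0            # expected payoff to A
        f_b = (r + (1.0 - r) * (1.0 - p)) * 0.9    # expected payoff to B
        mean = p * f_a + (1.0 - p) * f_b
        if mean == 0.0:
            break
        p = p * f_a / mean                          # replicator update
    return p

# With B in the majority, random pairing converges on the inferior B-B ESS...
print(round(evolve(0.2, r=0.0), 3))   # -> 0.0 (all-B)
# ...but assortative grouping lets the superior A-A ESS take over.
print(round(evolve(0.2, r=1.0), 3))   # -> 1.0 (all-A)
```

Under random pairing the majority strategy wins even though its ESS has lower utility; with strong assortment the higher-utility ESS is reached from the same initial conditions, which is the effect the evolved symbioses in this paper exploit.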
2 An Ecosystem Model with Evolved Symbioses

Our abstract model of an ecosystem contains 2N species, each of which contains P individuals. The fitness of each individual in each species will depend on the other species present in its local environmental context. A separation of timescales is crucial in this model (15): on the (fast) ecological timescale, species densities within an environmental context change and quickly reach equilibrium, but on this timescale genetic changes are assumed to be negligible. At a much slower genetic evolution timescale, genetic changes that alter symbiotic relationships are significant. The genotype of an individual specifies partnerships with the other 2N-1 species that can partially (or completely) determine the combination of species it appears with in the environmental context. We assume that the initial composition of the local environment contains a random combination of species, but for the scenarios we investigate the ecological dynamics have only stable attractors, so the composition of the ecology quickly equilibrates to a subset of species that are stable. Although the frequency of a species may go to zero in a particular ecological context, in other contexts it will persist (i.e. no species are lost). Different individuals are evaluated in the environmental context for some time, and at the end of each period we turn attention to a new randomly initialised ecological context. Ours is therefore not an explicitly spatial model, since we have no need to model different environmental contexts simultaneously. We choose a very simple representation of the local environmental context – a binary vector representing which species are present in non-zero frequency. We suppose that each position in the vector is a 'niche' that may be occupied by one of two possible species that are mutually exclusive, such that some species cannot coexist in the same ecological context.
For example, in a forest where deciduous and coniferous trees are competing, patches of the forest may, in simplistic terms, contain either one or the other but not both simultaneously; likewise, a patch may contain one species of ant or another but not both, and moreover the type of tree present may influence which type of ant is fittest, and vice versa. An N-bit vector
thus indicates which N species, of the possible 2N, are present in the environmental context. A species, '------0---', indicates which type it is (e.g. '0') and which environmental niche in the environmental context it occupies (e.g. 6th). This choice of representation has some properties that are required and some that are merely convenient. It is necessary for our purposes that not all species are present in all environmental contexts – otherwise, genetically specifying a symbiotic partnership would be redundant. It is also necessary that there are many different possibilities for the species composition in an environmental context – so the number of species present in any one environment should be large and many combinations of species should be allowed. The fact that species are arranged in mutually exclusive pairs is a contrivance for convenience: having all environmental states contain exactly N species allows us to define the environmental state as an N-dimensional space and to define fitness interactions between species using an energy function discussed below. And the fact that environmental states are defined using a binary 'present or not' representation rather than a continuous species density model is again a convenience – a continuous model would be interesting to investigate in future. Each individual in the ecosystem has a fitness that is a function of the other species present in the current environmental context. In principle, this requires an environmentally sensitive fitness function for each species, and the resultant ecological dynamics could be arbitrarily complex in general. In the experiments that follow we restrict our attention to ecosystems with simple monotone dynamics and point attractors.
Such dynamics can be modelled using an 'energy function' (17,18) over environmental states, e(E), such that the fitness of an individual of species, s, given an environmental context, E, is determined by the change in energy, ∆e(E, s), produced by adding s to E. That is, the fitness of an individual of species s in context E is fitness(s, E) = ∆e(E, s) = e(E+s) - e(E), where 'E+s' is the environmental state E modified by adding species s. (Dynamical systems theory would normally minimise energy, but for familiarity we let positive ∆e correspond to positive fitness, such that selection tends to increase e.)

Each individual has a genotype that defines which other species it forms groups with (see Figure 1). This genotype is simply a binary vector of length 2N defining which of N possible '0' species it groups with, followed by which of N possible '1' species it groups with. Binary relationships of this form are somewhat crude perhaps, but although the partnerships of any one individual are binary, the evolved associations of the species as a whole, as represented by the frequencies of partnerships in the population of individuals for that species, constitute a continuous variable (to the resolution of 1/population-size). We use the term 'association' to refer to this population-level inter-species average and reserve the word 'partnership' for the binary relationships specified by the genotype of an individual. The meaning of the binary partnership vector for an individual is simply that its fitness, already a function of the environmental context, is modified by the inclusion of its symbiotic partners into that context. Specifically, the fitness of an individual genotype, g, belonging to species, s, given a context, E, is defined as fitness(g, E) = ∆e(E, s+S) = e(E+s+S) - e(E), where S is the set of species that g specifies as partners. Using the components introduced above, illustrated in Figure 1, our model operates as defined in Figure 2.
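The fitness definition fitness(g, E) = e(E+s+S) − e(E) can be transcribed directly. In this sketch the energy function is a toy stand-in (a simple bit count), not the paper's, and the variable names are ours:

```python
def fitness(partners, E, energy):
    """fitness(g, E) = e(E + s + S) - e(E): the energy change produced
    when an individual inserts itself and its genetically specified
    partner species into the context E.

    `partners` maps niche index -> type (0 or 1); by convention it
    includes the individual's own species (each individual partners
    with itself).  `energy` is any function of a binary context vector.
    """
    E_new = list(E)
    for niche, species_type in partners.items():
        E_new[niche] = species_type
    return energy(E_new) - energy(list(E))

# Toy energy for illustration only (not the paper's): count of 1-species.
e = lambda E: sum(E)

E = [1, 0, 0, 0, 1]
# A 1-type individual in niche 2 bringing a 1-type partner in niche 3:
print(fitness({2: 1, 3: 1}, E, e))   # -> 2
# A partner that displaces an already-present 1-species is deleterious here:
print(fitness({0: 0}, E, e))         # -> -1
```

The second call illustrates point (b) of the Discussion: attempting to introduce a species that is selected against in the current context yields negative ∆e.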
A species, s: -------0--

may contain an individual genotype: # This example individual specifies partnerships with the following 5 species: ----0-----, -----0----, -------0-- †, -1--------, --------1-

So, if this individual is placed into an environmental context, it and these partner species will be present: i.e. s+S = -1--00-01-. For example, if this individual is placed into E = 1000100000, with e(E) = α, it will create E+S+s = 1100000010, with e(E+S+s) = β, and it will receive a fitness of ∆e(E, S+s) = e(E+S+s) - e(E) = β - α.

Fig. 1. An individual, its partners and its fitness in an environmental context. († For implementational convenience, each individual specifies a partnership with itself. # A comma indicates separation of 0-partnerships from 1-partnerships.)

Initialise Ecosystem containing 2N species, s1...s2N.
  For each species, sn, initialise P individuals, gn1...gnP.
  For each individual, gn, initialise 2N associations: a(i=n) = 1, a(i≠n) = 0. (notes 1 & 2)
Until (stopping-criterion) evolve species:
  For i from 1 to N: Ei = rand({0,1}).  // create random context E.
  t = 1.  // counter to decide when to reinitialise the context.
  // Evaluate g in context E.
  For all g from Ecosystem in random order:
    E' = add(E, g). (note 2)
    Fit(g) = e(E') - e(E).
    // Update environmental context.
    If (Fit(g) > 0) then {E = E'. t = 1.} else t++.
    If (t > T) {For i from 1 to N: Ei = rand({0,1}). t = 1.}
  For each species s: s = reproduce(s).

add(E, S) → E':  // add individual, with partners, to ecosystem state to create E'.
  For n from 1 to N:
    If ((an==1) and (an+N==0)) E'n = 0. (note 4)
    If ((an==0) and (an+N==1)) E'n = 1.
    If ((an==1) and (an+N==1)) E'n = rand({0,1}). (note 5)

reproduce(s) → s':  // reproduce all the individuals in a species s.
  For p from 1 to P:
    Select g1 and g2 from s with uniform probability (note 6).
    If (Fit(g1) > Fit(g2)) s'p = mutate(g1) else s'p = mutate(g2).

Fig. 2. Model details. 1) Initially, each species associates with itself only. 2) Associating with self makes the implementation of E+S+s identical to E+S. 3) This insertion of a species into the ecological state is cumbersome because there are 2N species that fit into N niches. 5) If an individual associates with '1'-species and '0'-species that are mutually exclusive (i.e. occupy the same niche), then either species is added to E with equal probability. 6) Individuals specifying deleterious partnerships (negative fitness) have probability 0 of reproducing, but it is important that individuals specifying neutral partnerships have non-zero probability of reproducing. For the following experiments, N=50, P=100, and T=5N was sufficient to ensure an environmental context found a local ESS before being reinitialised. Mutation is a single bit-flip.
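The add(E, S) procedure of Figure 2, which resolves requests for mutually exclusive niche occupants, can be transcribed almost line for line (variable names ours):

```python
import random

def add(E, a):
    """Insert an individual's partner set into context E (Fig. 2's add).

    a is the 2N-bit partnership vector: a[n] == 1 requests the 0-type
    species in niche n, and a[n + N] == 1 requests the 1-type species.
    Requesting both mutually exclusive species of a niche resolves to a
    coin flip (note 5 of Fig. 2); requesting neither leaves it unchanged.
    """
    N = len(E)
    E2 = list(E)
    for n in range(N):
        if a[n] == 1 and a[n + N] == 0:
            E2[n] = 0
        elif a[n] == 0 and a[n + N] == 1:
            E2[n] = 1
        elif a[n] == 1 and a[n + N] == 1:
            E2[n] = random.randint(0, 1)   # either species, equal probability
    return E2

E = [1, 1, 0, 0]
a = [1, 0, 0, 0,   # request the 0-type species in niche 0
     0, 0, 1, 0]   # request the 1-type species in niche 2
print(add(E, a))   # -> [0, 1, 1, 0]
```

Note that this call is deterministic because no niche is requested with both types; only conflicting requests invoke the coin flip.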
2.1 A Poly-ESS Ecological Dynamics

We define the energy of an environmental state, E, as a sum over B copies of the sub-function, f, applied to disjoint subsets of species, as follows:

e(E) = ∑_{b=0}^{B−1} f(s_{bk+1}, s_{bk+2}, …, s_{(b+1)k})

where

f(G) = 2, if G = k; f(G) = max( 1/(1+G), 1/(1+(k−G)) ), otherwise;

where B is the number of sub-functions, and k = N/B is the number of species in each sub-function. For convenience f is defined as a function of G, the number of 1-species in the subset of species. f defines a simple 'U-shaped' energy function with local optima at all-0s and all-1s, but all-1s has higher energy than all-0s. Concatenating B of these U-shaped sub-functions creates a poly-attractor system. Five subsystems with two attractors each, as used in the following experiments, create a system with 2^5 = 32 point attractors (local optima in the energy function, corresponding to ESSs with N species each). For N=50, B=5, k=10, a local attractor under individual selection is: 11111111111111111111000000000011111111110000000000. The attractor with the globally-maximal e-value is simply the concatenation of the superior solution to each sub-system, i.e. 11111111111111111111111111111111111111111111111111. However, which of the two possible local 'sub-attractors' for each sub-system (e.g. …1111111111… or …0000000000…) will be found depends (under individual selection) on whether the initial environmental conditions have type-0s or type-1s in the majority. The all-type-1 attractor for each sub-system is thus found with probability 0.5 from a random initial condition, and the probability of finding the globally-maximal energy attractor is 0.5^B = 1/32. (This poly-attractor system is identical to a building-block function used in (19) to show that sexual recombination permits selection on sub-functions if genetic linkage is 'tight' – but here we evolve useful linkages.)
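A minimal implementation of this energy function (our own transcription, reading the first case of f as applying when G equals the sub-function size k, so that the all-1s block is the higher of the two local optima) confirms the stated attractor energies:

```python
def f(G, k):
    """Sub-function energy: 'U-shaped' in G, the number of 1-species in
    a k-species subset; the all-1s optimum (f = 2) beats all-0s (f = 1)."""
    if G == k:
        return 2.0
    return max(1.0 / (1 + G), 1.0 / (1 + (k - G)))

def e(E, B):
    """Total energy: sum of f over B disjoint blocks of the context E."""
    k = len(E) // B
    return sum(f(sum(E[b * k:(b + 1) * k]), k) for b in range(B))

B, k = 5, 10
all_ones  = [1] * (B * k)
all_zeros = [0] * (B * k)
mixed = [1] * 20 + [0] * 10 + [1] * 10 + [0] * 10   # the local attractor above
print(e(all_ones, B))    # -> 10.0 (global optimum: 5 blocks x 2)
print(e(all_zeros, B))   # -> 5.0  (5 blocks x 1)
print(e(mixed, B))       # -> 8.0  (three all-1 blocks, two all-0 blocks)
```

Each block is locally optimal in every configuration shown, so individual selection cannot escape the mixed attractor even though its energy is below the global maximum.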
3 Results and Discussion

Figure 3 (left) shows that under individual selection (before associations are evolved) different attractors are found (categorised by G). The globally-maximal energy attractor is not found in any of the 16 samples depicted. Figure 3 (right) shows that after associations have evolved, the globally optimal attractor is reached in every instance of the local ecological dynamics, regardless of the random initial conditions of the environmental context. Thus the basin of attraction of the globally optimal species configuration now absorbs the entire space of possible initial species configurations (Figure 4). To examine the evolved partnerships that have enabled these changes in the ecological dynamics, we can display an association matrix (Figure 5). Figure 5 (left) clearly shows not only that the majority of evolved associations are correct (between species of the same type) but also that they are correctly restricted to partnerships between species in the same sub-systems, not across sub-systems. The evolution of partnerships is therefore successful at identifying sub-systems correctly, and identifying correct (single-type) partnerships
Fig. 3. Ecosystem dynamics (y-axis: G): Left) before associations evolve (initial 10,000 time steps); Right) after associations evolve (around 2.8·10^7 time steps). G=1 → globally optimal attractor; G=0.0, 0.2, 0.4, 0.6, 0.8 → other local attractors. Vertical lines indicate points at which a new random initial ecological condition is created.

Fig. 4. Change in size of basins of attraction (y-axis: attractor basin size; x-axis: time) for different attractor classes over evolutionary time: shades indicate attractor classes grouped by energy values. The proportion of initial conditions that reach the globally optimal attractor is initially 1/32 and eventually becomes 1.
Fig. 5. Evolved associations. Pixel (i,j) depicts the strength and correctness (i.e. 1s with 1s, and 0s with 0s) of the associations between the species i and j (and species i+N and j+N); shades distinguish 'correct' associations, no associations, and 'incorrect' associations. Left) associations in the main experiment reveal the modularity of the fitness dependencies defined in the energy function. Right) a control experiment fails to separate modules, see text.
within those sub-systems. Figure 5 (right) shows that the ability to evolve these correct associations is dependent on the separation of timescales. Specifically, if the ecological dynamics are not allowed to settle to a local attractor (by setting T=1 in Figure 2), i.e. if partnerships evolve in arbitrary ecological contexts, then although useful associations are found within sub-systems, incorrect associations are found between sub-systems.
These results show that individual selection for symbiotic relationships is capable of creating groups that are adaptively significant. After the relationships have evolved, the only attractor of the ecological dynamics is the attractor with the maximal energy. This is surprisingly 'cooperative', since ecological energy corresponds to collective fitness whereas individual selection should just go to the local ESS. The selective pressure that causes individuals to form these groups has two possible components: a) When the ecological context is not yet at an ESS, an individual that brings with it a partner that accelerates approach to the ESS is fitter than one that does not. Thus directional selection on two species promotes symbiosis between them (see (20) for an analogous argument regarding "relational QTLs"). b) When the ecological context is already at an ESS, an individual that brings with it a partner that is also part of the ESS has the same fitness as one that does not (because the species is already present). But an individual that brings a partner that is not part of the ESS will have negative ∆e – the partnership is deleterious because it attempts to introduce a species that is selected against in that context. Thus stabilising selection on two species also promotes symbiosis between them, albeit in a rather subtle manner. We suggest that the former direct effect is less significant than the latter subtle effect, given that the ecosystem spends most of its time at ESSs. This implies that the common form of evolved symbioses is to create associations between species that co-occur most often, and suggests that relationship formation in ecosystems will be basically Hebbian (15,18,21,22) – 'species that fire together wire together'.
This has the effect of reinforcing the future co-occurrence of species that already co-occur, and enlarges the basin of attraction for those species combinations in the same manner as Hebbian learning forms an associative memory (18,22). Note that the groups that form do not represent an entire N-species ESS but only contain 10 species each (Figure 5), as per the interactions in the energy function. These small groups are both sufficient and selectively efficient in the sense that they create B independent competitions between the two sub-ESSs in each sub-function rather than a single competition between all 2^B complete ESSs (19,24). These small groups form because the co-occurrence of species within each sub-function is more reliable than the co-occurrence of species in different sub-functions. In (13) we provide a model where we assume that relationships form in a manner that reflects species co-occurrence at ESSs, and show that this is sufficient to produce the same effects on attractors as those shown here. Using this abstraction we are also able to assess the scalability of the effect and show that it can evolve rare, high-fitness complexes that are unevolvable via non-associative evolution. This suggests a scalable optimisation method for automatic problem decomposition (24), creating algorithmic leverage similar to that demonstrated by (25). How does individual selection create higher-level selection? Well, from one point of view it doesn't. If we take into account the combined genetic space of characters, both those addressed directly in the energy function and the genetic loci that control partnerships, then all that happens in our model is that natural selection finds a local attractor in this space. It is only when we pretend not to know about the evolved partnerships, and examine the attractors in the energy function alone, that we see group selection. However, this separation is biologically meaningful and relates to the separation of timescales.
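The Hebbian analogy can be made concrete with a Hopfield-style sketch of our own (not part of the paper's model): storing a co-occurrence pattern with the rule w_ij = x_i · x_j makes that pattern an attractor, so a perturbed state falls back into its basin, just as evolved associations pull an ecological state back to a canalised species combination:

```python
def hebbian_store(pattern):
    """Hebbian rule: units that fire together wire together (w_ij = x_i * x_j,
    zero diagonal), for states coded as +1 / -1."""
    n = len(pattern)
    return [[pattern[i] * pattern[j] if i != j else 0 for j in range(n)]
            for i in range(n)]

def settle(state, w, sweeps=5):
    """Asynchronous threshold dynamics: each unit aligns with its local field."""
    for _ in range(sweeps):
        for i in range(len(state)):
            h = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if h >= 0 else -1
    return state

stored = [1, 1, -1, -1, 1, -1]          # a co-occurrence pattern (+1/-1)
w = hebbian_store(stored)
corrupted = [1, -1, -1, -1, 1, -1]      # one 'species' flipped
print(settle(corrupted, w))             # -> [1, 1, -1, -1, 1, -1]
```

With a single stored pattern the corrupted state is repaired in one sweep: the enlarged basin of the stored pattern absorbs nearby states, which is the associative-memory effect referenced in (18,22).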
That is, the most obvious characteristics of species are those that are under direct selection – the ones whose frequencies are affected by selection
on short timescales – the ecological population dynamics. But less obvious characteristics are simultaneously under indirect selection – characters that affect co-location of species, for example. These change more slowly, over genetic evolution timescales rather than population dynamic timescales (15). When both systems are taken into account, individual selection explains all the observations (if it did not, we would not be satisfied that an evolutionary explanation had been provided). Specifically, partnerships form when group selection is in alignment with individual selection (at ESSs), but in multi-ESS games these same groupings can cause selection that acts in opposition to (non-associative) individual selection and alter future selective trajectories when individuals are far from that ESS. Because the indirectly selected characters only have fitness consequences via the directly selected characters, their evolution is characterisable by statistics such as co-occurrence of the directly selected characters. This produces systematic consequences on the attractors of directly selected characters – i.e. they enlarge attractors for species combinations reliably found at ESSs. This is equivalent to effecting higher-level selection on these combinations of species. Thus, in our opinion, the two types of language – 'higher levels of selection are created' and 'it is all explained by individual selection' – are entirely reconcilable.

Acknowledgements. Chris Buckley, Seth Bullock, Jason Noble, Ivor Simpson, Nicholas Hayes, Mike Streatfield, David Iclanzan, Chrisantha Fernando.
References
1. Maynard Smith, J., Szathmáry, E.: The Major Transitions in Evolution. Oxford University Press (1995)
2. Margulis, L.: Origins of species: acquired genomes and individuality. BioSystems 31(2-3), 121–125 (1993)
3. Wilson, D.S.: A theory of group selection. PNAS 72(1), 143–146 (1975)
4. Wilson, D.S.: The Natural Selection of Populations and Communities. Benjamin/Cummings, New York (1980)
5. Wilson, D.S., Dugatkin, L.A.: Group selection and assortative interactions. The American Naturalist 149(2), 336–351 (1997)
6. Powers, S.T., Mills, R., Penn, A.S., Watson, R.A.: Social niche construction provides an adaptive explanation for new levels of individuality (abstract). Submitted to Levels of Selection and Individuality in Evolution Workshop, ECAL 2009 (2009)
7. Powers, S.T., Watson, R.A.: Evolution of Individual Group Size Preference can Increase Group-level Selection and Cooperation. In: ECAL 2009 (2009) (to appear)
8. Powers, S.T., Penn, A.S., Watson, R.A.: Individual Selection for Cooperative Group Formation. In: Almeida e Costa, F., Rocha, L.M., Costa, E., Harvey, I., Coutinho, A. (eds.) ECAL 2007. LNCS (LNAI), vol. 4648, pp. 585–594. Springer, Heidelberg (2007)
9. Snowdon, J., Powers, S.T., Watson, R.A.: Moderate contact between sub-populations promotes evolved assortativity enabling group selection. In: ECAL 2009 (2009) (to appear)
10. Maynard Smith, J.: Evolution and the Theory of Games. Cambridge University Press (1982)
11. Axelrod, R.: The Complexity of Cooperation. Princeton University Press (1997)
12. Hamilton, W.D.: The Genetical Evolution of Social Behaviour I and II. J. Theor. Biol. 7, 1–52 (1964)
13. Mills, R., Watson, R.A.: Symbiosis Enables the Evolution of Rare Complexes in Structured Environments. In: ECAL 2009 (2009) (to appear)
14. Noonburg, V.W.: A Neural Network Modeled by an Adaptive Lotka-Volterra System. SIAM Journal on Applied Mathematics 49(6), 1779–1792 (1989)
15. Poderoso, F.C., Fontanari, J.F.: Model ecosystem with variable interspecies interactions. J. Phys. A: Math. Theor. 40, 8723–8738 (2007)
16. Moran, N.A.: Symbiosis as an adaptive process and source of phenotypic complexity. PNAS USA 104(suppl. 1), 8627–8633 (2007)
17. Strogatz, S.H.: Nonlinear Dynamics and Chaos. Westview (1994)
18. Watson, R.A., Buckley, C.L., Mills, R.: The Effect of Hebbian Learning on Optimisation in Hopfield Networks. Technical Report, ECS, University of Southampton (2009)
19. Watson, R.A., Jansen, T.: A Building-Block Royal Road Where Crossover is Provably Essential. In: GECCO 2007, pp. 1452–1459 (2007)
20. Wagner, G.P., Pavlicev, M., Cheverud, J.M.: The road to modularity. Nature Reviews Genetics 8, 921–931 (2007)
21. Watson, R.A., Mills, R., Buckley, C.L., Palmius, N., Powers, S.T., Penn, A.S.: Hebbian Learning in Ecosystems: Species that fire together wire together (abstract). In: ECAL 2009 (2009) (to appear)
22. Watson, R.A., Buckley, C.L., Mills, R.: Implicit Hebbian Learning and Distributed Optimisation in Complex Adaptive Systems (in prep.)
23. Watson, R.A., Pollack, J.B.: Modular Interdependency in Complex Dynamical Systems. Artificial Life 11(4), 445–457 (2005)
24. Mills, R., Watson, R.A.: A Symbiosis Algorithm for Automatic Problem Decomposition (2009) (in prep.)
25. Iclanzan, D., Dumitrescu, D.: Overcoming hierarchical difficulty by hill-climbing the building block structure. In: GECCO 2007, pp. 1256–1263 (2007)
The Effect of Group Size and Frequency-of-Encounter on the Evolution of Cooperation

Steve Phelps¹, Gabriel Nevarez², and Andrew Howes²

¹ Centre for Computational Finance and Economic Agents (CCFEA), University of Essex, UK
[email protected]
² Manchester Business School, University of Manchester, UK
Abstract. We introduce a model of the evolution of cooperation in groups which incorporates both conditional direct reciprocity (“tit-for-tat”) and indirect reciprocity based on public reputation (“conspicuous altruism”). We use ALife methods to quantitatively assess the effect of changing the group size and the frequency with which other group members are encountered. We find that for moderately sized groups, although conspicuous altruism plays an important role in enabling cooperation, it fails to prevent an exponential increase in the level of defection as the group size is increased, suggesting that economic factors may limit group size for cooperative ecological tasks such as foraging.
1 The Model
Pairs of agents (ai, aj), i ≠ j, are drawn at random from A = {a1, a2, ..., an}, and engage in bouts of grooming at different time periods t ∈ {0, 1, ..., N}. We refer to n as the group size and N as the frequency-of-encounter. At each time period t the groomer ai may choose to invest a certain amount of effort u(i,j,t) ∈ [0, U] ⊂ ℝ in grooming their partner aj, where U ∈ ℝ is a parameter determining the maximum rate of grooming. This results in a negative fitness payoff −u to the groomer, and a positive fitness payoff ku to the partner aj:

φ(j,t+1) = φ(j,t) + k · u(i,j,t)
φ(i,t+1) = φ(i,t) − u(i,j,t)

where φ(i,t) ∈ ℝ denotes the fitness of agent ai at time t, and k ∈ ℝ is a constant parameter. In an ecological context, the positive fitness payoff ku might represent, for example, the fitness gains from parasite elimination, whereas the fitness penalty −u would represent the opportunity cost of foregoing other activities, such as foraging, during the time u allocated for grooming. Since we are interested in the evolution of cooperation, we analyse outcomes in which agents choose values of u that maximise their own fitness φi. Provided

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 37–44, 2011.
© Springer-Verlag Berlin Heidelberg 2011
that k > 1, over many bouts of interaction it is possible for agents to enter into reciprocal relationships that are mutually beneficial, since the groomer's initial cost u may be reciprocated with ku, yielding a net benefit ku − u = u(k − 1). Provided that agents reciprocate, they can increase their net benefit by investing larger values of u. However, by increasing their investment they put themselves more at risk from exploitation, since, just as in the alternating prisoner's dilemma [5], defection is the dominant strategy if the total number of bouts N is known: the optimal behaviour is to accept the benefits of being groomed without investing in grooming in return. In the case where N is unknown, and the number of agents is n = 2, it is well known that conditional reciprocation is one of several evolutionarily stable solutions in the form of the so-called tit-for-tat strategy, which copies the action that the opposing agent chose in the preceding bout at t − 1 [4]. However, this result does not generalise to larger groups n > 2. Nowak and Sigmund [7] demonstrate that reciprocity can emerge indirectly in large groups, provided that information about each agent's history of actions is summarised and made publicly available in the form of a reputation or “image-score” r(i,t) ∈ [rmin, rmax] ⊂ ℤ. The image-score ri summarises the propensity to cooperate of agent ai. As in the Nowak and Sigmund model, image scores in our model are initialised ∀i r(i,0) = 0 and are bounded at rmin = −5 and rmax = 5.
An agent's image score is incremented at t + 1 if the agent invests a non-zero amount at time t, otherwise it is decremented:

r(i,t+1) = min(r(i,t) + 1, rmax)  if u(i,x,t) > 0
r(i,t+1) = max(r(i,t) − 1, rmin)  if u(i,x,t) = 0

and agents invest conditionally on their partner's image score:

u(i,j,t) = γ  if r(j,t) ≥ σi
u(i,j,t) = 0  if r(j,t) < σi

where σi is a parameter determining the threshold image score at or above which agent ai will cooperate, and γ ∈ ℝ is a global parameter (as in [7] we use γ = 10⁻¹ and k = 10). Nowak and Sigmund [6] demonstrate that widespread defection is avoided if, and only if, the initial proportion of agents using a discriminatory¹ strategy is above a critical value, implying that strategies based on indirect reciprocity via reputation are an essential prerequisite for the evolution of cooperation in large groups. We are interested in the effect of group size n and interaction frequency N on the evolution of cooperation. The analytical model of Nowak and Sigmund [6] assumes: a) that the group size n is large enough relative to N that strategies based on private history, such as tit-for-tat, are irrelevant (since the probability of encountering previous partners is very small); and b) that we do not need to take into account the fact that an agent cannot cooperate with itself when
¹ Discriminatory strategies cooperate only if their partner's image score is non-negative, that is: σi = 0.
calculating the probability with which any given agent is likely to encounter a particular strategy. However, in order to model changes in group size, and hence interaction in smaller groups, it is necessary to drop both of these assumptions. The resulting model is more complicated, and it is difficult to derive closed-form solutions for the equilibrium behaviour. Therefore we use ALife simulation to estimate payoffs, and numerical methods to compute asymptotic outcomes, as described in the next section.
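The bout-level rules above can be sketched in code; the following is a minimal illustration only, with our own class and function names, using the paper's constants (k = 10, γ = 10⁻¹, image scores bounded in [−5, 5]):

```python
# Sketch of one grooming bout: the groomer invests u conditionally on
# the partner's image score, fitness is updated as phi_j += k*u and
# phi_i -= u, and the groomer's image score is incremented for a
# non-zero investment (decremented otherwise). The Agent container and
# names are illustrative, not taken from the paper's implementation.

K = 10.0          # benefit multiplier k (the paper uses k = 10)
GAMMA = 0.1       # investment level gamma = 10^-1
R_MIN, R_MAX = -5, 5

class Agent:
    def __init__(self, sigma):
        self.sigma = sigma    # image-score threshold for cooperating
        self.fitness = 0.0
        self.image = 0        # image scores are initialised to 0

def bout(groomer, partner):
    """Play one bout: conditional investment, payoffs, image update."""
    u = GAMMA if partner.image >= groomer.sigma else 0.0
    groomer.fitness -= u
    partner.fitness += K * u
    if u > 0:
        groomer.image = min(groomer.image + 1, R_MAX)
    else:
        groomer.image = max(groomer.image - 1, R_MIN)

# A discriminator (sigma = 0) grooming a partner with a clean record:
a, b = Agent(sigma=0), Agent(sigma=0)
bout(a, b)
assert (a.fitness, b.fitness) == (-0.1, 1.0)
assert a.image == 1
```

Repeating such bouts with reciprocation yields the net benefit u(k − 1) per exchange discussed above.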
2 Methodology
In order to study the evolution of populations of agents using the above strategies, we use both ALife methods and mathematical modelling based on evolutionary game theory. However, rather than considering pairs of agents, our analysis concerns interactions amongst groups of size n > 2 assembled from a larger population of individuals. The resulting game-theoretic analysis is complicated by the fact that this results in a many-player game, which presents issues of tractability for the standard methods for computing the equilibria of normal-form games. A popular ALife approach to this issue is to use co-evolutionary algorithms [3,4]. In a co-evolutionary optimisation, the fitness of individuals in the population is evaluated relative to one another in joint interactions (similarly to payoffs in a strategic game), and it is suggested that in certain circumstances the converged population is an approximate Nash solution to the underlying game; that is, the stable states, or equilibria, of the co-evolutionary process are related to the evolutionarily stable strategies (ESS) of the corresponding game. However, there are many caveats to interpreting the equilibrium states of standard co-evolutionary algorithms as approximations of game-theoretic equilibria, as discussed in detail by Ficici and Pollack [1,2]. In order to address this issue, we adopt a methodology called empirical game-theory [10,12], which uses a combination of simulation and rigorous game-theoretic analysis. The empirical game-theory method uses a heuristic payoff matrix which is calibrated by running many simulations, as detailed below. We can make one important simplification by assuming that the game is symmetric, and therefore that the payoff to a given strategy depends only on the number of agents within the group adopting each strategy. Thus for a game with j strategies, we represent entries in the payoff matrix as vectors of the form p = (p1, ..., pj) where pi specifies the number of agents who are playing the i-th strategy. Each entry p ∈ P is mapped onto an outcome vector q ∈ Q of the form q = (q1, ..., qj) where qi specifies the expected payoff to the i-th strategy. For each entry in the payoff matrix we estimate the expected payoff to each strategy by running 10⁵ ALife simulations and taking the mean² fitness. With estimates of the payoffs to each strategy in hand, we are in a position to model the evolution of populations of agents using these strategies. In our evolutionary model, we do not restrict reproduction to within-group mating;
² We take the average fitness of every agent adopting the strategy for which we are calculating the payoff, and then also average across simulations.
rather, we consider a larger population which temporarily forms groups of size n in order to perform some ecological task. Thus we use the standard replicator dynamics equation [11] to model how the frequency of each strategy in the larger population changes over time in response to the within-group payoffs:

ṁi = [u(ei, m) − u(m, m)] mi     (1)
where m is a mixed-strategy vector, u(m, m) is the mean payoff when all players play m, u(ei, m) is the average payoff to pure strategy i when all players play m, and ṁi is the first derivative of mi with respect to time. Strategies that gain above-average payoff become more likely to be played, and this equation models a simple co-evolutionary process of adaptation. Since mixed strategies represent population frequencies, the components of m sum to one. The geometric corollary of this is that the vectors m lie in the unit simplex Δ^(j−1) = {x ∈ ℝ^j : Σ_{i=1}^{j} xi = 1}. In the case of j = 3 strategies the unit simplex Δ² is a two-dimensional triangle embedded in three-dimensional space which passes through the coordinates corresponding to pure-strategy mixes: (1, 0, 0), (0, 1, 0), and (0, 0, 1). We shall use a two-dimensional projection of this triangle to visualise the population dynamics in the next section³. In our experiments we solve this system numerically: we choose 10³ randomly sampled initial values which are chosen uniformly from the unit simplex [9], and for each of these initial mixed strategies we solve Equation 1 as an initial value problem using MATLAB's ode15s solver [8]. This results in 10³ trajectories which either terminate at stationary points, or enter cycles. We consider j = 5 strategies:

1. C, which cooperates unconditionally (σi = rmin);
2. D, which defects unconditionally (σi = rmax + 1);
3. S, which cooperates conditionally with agents who have a good reputation (σi = 0) but cooperates unconditionally in the first round (when reputations have not yet been established);
4. Sd, which cooperates conditionally (σi = 0) but defects unconditionally in the first round of play;
5. T4T, which cooperates conditionally with agents who have cooperated in previous rounds, and cooperates unconditionally against unseen opponents.
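The two computational ingredients of this methodology, enumerating the entries of the symmetric heuristic payoff matrix and integrating the replicator dynamics of Equation 1, can be sketched as follows. All names are our own; we use plain Euler steps in place of MATLAB's ode15s, and an illustrative 2x2 payoff table rather than the simulation-estimated payoffs:

```python
from itertools import combinations_with_replacement

# (1) Entries of the symmetric heuristic payoff matrix: each entry
# p = (p_1, ..., p_j) counts how many of the n group members play each
# of the j strategies.

def profiles(n, j):
    out = []
    for combo in combinations_with_replacement(range(j), n):
        out.append(tuple(combo.count(s) for s in range(j)))
    return out

# For n = 10 agents and j = 5 strategies there are only 1001 entries,
# versus 5^10 pure-strategy profiles in the asymmetric game.
assert len(profiles(10, 5)) == 1001

# (2) Replicator dynamics m_i' = [u(e_i, m) - u(m, m)] m_i, integrated
# with simple Euler steps (the paper uses MATLAB's ode15s instead).

def replicator_step(m, payoff, dt=0.01):
    j = len(m)
    fitness = [sum(payoff[i][k] * m[k] for k in range(j)) for i in range(j)]
    mean = sum(f * x for f, x in zip(fitness, m))        # u(m, m)
    m = [x + dt * (f - mean) * x for f, x in zip(fitness, m)]
    total = sum(m)
    return [x / total for x in m]    # stay on the unit simplex

# An anti-coordination game with a stable interior mix at (2/3, 1/3):
payoff = [[1.0, 3.0],
          [2.0, 1.0]]
m = [0.5, 0.5]
for _ in range(20000):
    m = replicator_step(m, payoff)
assert abs(m[0] - 2 / 3) < 1e-3
```

The trajectory terminates at the interior stationary point because each strategy here does better when rare; with the model's estimated payoffs the same integration instead reveals the basins of attraction discussed in the next section.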
3 Results
Initially we restrict attention to j = 3 strategies and 10² initial values, allowing us to more easily visualise the population dynamics and to compare our results with those of Nowak and Sigmund [6] (who assume a large group size n relative to N).
³ See [11, pp. 3–7] for a more detailed exposition of the geometry of mixed-strategy spaces.
Fig. 1. Direction field for n = 10 agents and N = 13 pairwise interactions per generation (above) compared with N = 100 (below). C denotes unconditional altruists, D unconditional defectors and S discriminators who cooperate in the first round. Each line represents a trajectory whose termination is represented by an open circle. The arrows show the direction of change.
Fig. 2. Mean frequency of each strategy in equilibrium as the number of pairwise interactions per generation N is increased relative to n. The error bars show the confidence interval for p = 0.05. C denotes the proportion of unconditional altruists; D unconditional defectors; S discriminators who cooperate in the first round; Sd discriminators who defect in the first round; T 4T the tit-for-tat strategy.
When we exclude discriminators and consider only cooperators (C), defectors (D) and tit-for-tat (T4T) we find that defection is the dominant strategy regardless of N, implying that conditional reciprocity cannot sustain group cooperation in the absence of reputation. We obtain more subtle results when we introduce reputation-based strategies. Figure 1 shows the phase diagram for the population frequencies when we analyse the interaction between cooperators (C), defectors (D) and discriminators (S) in a small group of n = 10 agents. As in [6] we find that a minimum initial frequency of discriminators (y axis) is necessary to prevent widespread convergence to the defection strategy (in the bottom right of the simplex). However, the results of our model differ in two important respects. Firstly, when the critical threshold of discriminators is reached, our model results in various stationary mixes of discriminators and cooperators with a total absence of defection, and no limit cycles. This is in contrast to [6], where the population cycles endlessly between all three strategies if the critical threshold is exceeded. Secondly, the behaviour of our model is sensitive to the number of pairwise interactions per generation N: as N is increased from N = 13 to N = 100 we see that the basin of attraction of the pure defection equilibrium is significantly decreased, and correspondingly the critical threshold of initial discriminators
necessary to avoid widespread defection. Our intuitive interpretation of these results is that defection is less likely⁴ as we increase the frequency of interaction relative to the group size. As discussed in Section 1, if we increase N relative to n we need to consider the effect of strategies that take into account private interaction history as well as strategies that are based on public reputation. Figure 2 shows the mean frequency in equilibrium of each strategy when we analyse all five strategies and systematically vary N while holding n fixed. We plot the equilibrium population frequency against N/n, and obtain the same graph for both n = 10 and n = 20 agents. This suggests that the ratio N/n determines the asymptotic behaviour.
4 Discussion
The frequency of both unconditional cooperation and discrimination increases with N/n, and the former becomes more prevalent than discriminators for N/n > 2. As we would intuitively expect, for N/n > 1 discriminators become more prevalent as we increase group size or decrease frequency of interaction. However, this is not sufficient to prevent free-riding. Most striking is that the likelihood of defection decreases exponentially as we increase the number of interactions per generation N. Correspondingly, as we increase the group size n we observe an exponential increase in the level of defection. If we consider the possibility of inter-group competition and hence group selection, then since the expected frequency of defectors in equilibrium determines the per-capita fitness of the agents in the overall population⁵, we can interpret our results as showing how group fitness changes as a function of group size (n) and frequency-of-encounter (N). That is, for any given N we can determine the optimum group size n. Our results indicate that the stylised cooperation task described by our model introduces a very strong selection pressure for smaller group sizes. Of course, this task is not the only ecological task which influences per-capita fitness for any given species. For example, our model could be used in conjunction with optimal foraging models to derive a comprehensive model of optimum group size for a particular species in a particular niche. Our main contribution is to highlight that economic factors play a significant role in determining optimal group size, when other ecological tasks such as foraging favour group sizes that are relatively small compared with the frequency-of-encounter.
5 Conclusion
Our model predicts that neither reputation nor conditional punishment is sufficient to prevent free-riding as the group size increases. Although both types of strategy play an important role, as the group size increases the level of
⁴ Assuming that all points in the simplex are equally likely as initial values.
⁵ In the absence of defection all other strategies are able to gain the maximum available surplus.
conspicuous altruism based on reputation rises, but defection rises faster. Thus, when other tasks already favour smaller groups, we predict that economic factors will limit the maximum group size independently of the group size favoured by other niche-specific tasks.
Acknowledgements We are grateful for the comments of the anonymous reviewers, and for financial support received from the UK EPSRC through the Theory for Evolving Sociocognitive Systems project (EP/D05088X/1).
References

1. Ficici, S.G., Pollack, J.B.: Challenges in coevolutionary learning: Arms-race dynamics, open-endedness, and mediocre stable states. In: Proceedings of ALIFE-6 (1998)
2. Ficici, S.G., Pollack, J.B.: A game-theoretic approach to the simple coevolutionary algorithm. In: Schwefel, H.-P., Schoenauer, M., Deb, K., Rudolph, G., Yao, X., Lutton, E., Merelo, J.J. (eds.) PPSN VI. LNCS, vol. 1917, pp. 16–20. Springer, Heidelberg (2000)
3. Hillis, W.D.: Co-evolving parasites improve simulated evolution as an optimization procedure. In: Langton, et al. (eds.) Proceedings of ALIFE-2, pp. 313–324. Addison-Wesley, Reading (1992)
4. Miller, J.H.: The coevolution of automata in the repeated Prisoner's Dilemma. Journal of Economic Behavior and Organization 29(1), 87–112 (1996)
5. Nowak, M.A., Sigmund, K.: The alternating prisoner's dilemma. Journal of Theoretical Biology 168, 219–226 (1994)
6. Nowak, M.A., Sigmund, K.: The dynamics of indirect reciprocity. Journal of Theoretical Biology 194(4), 561–574 (1998)
7. Nowak, M.A., Sigmund, K.: Evolution of indirect reciprocity by image scoring. Nature 393, 573–577 (1998)
8. Shampine, L.F., Reichelt, M.W.: The MATLAB ODE suite (2009), http://www.mathworks.com/access/helpdesk/help/pdf_doc/otherdocs/ode_suite.pdf
9. Stafford, R.: Random vectors with fixed sum (January 2006), http://www.mathworks.com/matlabcentral/fileexchange/9700
10. Walsh, W.E., Das, R., Tesauro, G., Kephart, J.O.: Analyzing complex strategic interactions in multi-agent games. In: AAAI 2002 Workshop on Game Theoretic and Decision Theoretic Agents (2002), http://wewalsh.com/papers/MultiAgentGame.pdf
11. Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1997)
12. Wellman, M.P.: Methods for Empirical Game-Theoretic Analysis. In: Proceedings of the Twenty-First National Conference on Artificial Intelligence, pp. 1152–1155 (2006)
Moderate Contact between Sub-populations Promotes Evolved Assortativity Enabling Group Selection

James R. Snowdon, Simon T. Powers, and Richard A. Watson

School of Electronics and Computer Science, University of Southampton, U.K.
[email protected]

Abstract. Group selection is easily observed when spatial group structure is imposed on a population. In fact, spatial structure is just a means of providing assortative interactions such that the benefits of cooperating are delivered to other cooperators more than to selfish individuals. In principle, assortative interactions could be supported by individually adapted traits without physical grouping. But this possibility seems to be ruled out because any ‘marker’ that cooperators used for this purpose could be adopted by selfish individuals also. However, here we show that stable assortative marking can evolve when sub-populations at different evolutionarily stable strategies (ESSs) are brought into contact. Interestingly, if they are brought into contact too quickly, individual selection causes loss of behavioural diversity before assortative markers have a chance to evolve. But if they are brought into contact slowly, moderate initial mixing between sub-populations produces a pressure to evolve traits that facilitate assortative interactions. Once assortative interactions have become established, group competition between the two ESSs is facilitated without any spatial group structure. This process thus illustrates conditions where individual selection canalises groups that are initially spatially defined into stable groups that compete without the need for continued spatial separation.
1 Introduction
The perspective of group selection is often used to explain altruistic behaviour. Sober and Wilson provide the definition that ‘a behaviour is altruistic when it increases the fitness of others and decreases the fitness of the actor’ [1], which can appear unsupportable by traditional theories of natural selection, as echoed by Dawkins [2], who claims that for a gene to survive it must promote itself at the expense of others. A gene which supports altruistic behaviour would quickly be exploited to extinction by selfish cheaters. As such, selfishness can be described as an Evolutionarily Stable Strategy (ESS). A behaviour is considered to be an ESS if, when all individuals within a population adopt that behavioural strategy, no other type can successfully invade [3]. Altruism is therefore not an ESS since it can be invaded by cheats

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 45–52, 2011.
© Springer-Verlag Berlin Heidelberg 2011
who will exploit the benefit awarded to them and not reciprocate. For natural selection to occur at any level, there must exist a fitness variance between entities [4], where an entity can be an individual or a group composed of individuals forming a meta-population. In this one-ESS system all groups will move towards a situation where every individual holds the same behaviour, resulting in no variance for natural selection to act upon. Wilson [5] uses groups comprised of altruists and cheaters to provide a setting in which altruism can prevail. Each group reproducing individually would ultimately be drawn towards the one stable attractor, which is the all-selfish ESS, but groups are dispersed prior to this occurring and the progeny are mixed. New groups are composed of random samples and the aggregation and dispersal process is repeated indefinitely. Since groups which contain more altruists grow at a faster rate than groups composed of majority cheats, the effect is that the net proportion of altruists rises. However, when practically assessing the model it is found that the required between-group variance in the frequencies of altruistic and selfish types is extremely high, and as such a stronger effect is observed through taking an extremely small sample of the population when creating each new group in the aggregation process. This restriction to small group sizes, coupled with the aggregation and dispersal movement, limits the applicability to real-world situations. Although the conditions for such altruism to evolve are restrictive, Powers et al. [6] have investigated how these conditions can in fact arise by evolution of individual traits that modify aspects of population structure, such as group size. Wilson [7] studies group-level selection in complex meta-communities. His model shows how individuals detrimental to local community productivity can be purged through selection at the community level. This is an example of a multi-ESS system, for the internal community dynamics have many attractors (ESSs), and it proves to be more effective than previous altruist/cheat models, since between-group variance could be preserved even when the groups reached an internal equilibrium. Such a form of group selection is also explored by Boyd and Richerson, who consider selection acting among multiple ESSs [8], where each group reaches an ESS holding an inherent fitness, thus providing a between-group variance. Through competition with other groups based upon a migratory process, the ESSs compete and some are forced into extinction as groups reach the point of highest individual fitness. Their model does not consider the forces which give rise to these groupings, even though they are able to illustrate the selection processes which act at the group level. We are interested in situations where a form of group selection can occur which is not restricted by the need for constant spatial segregation. Consider a scenario where two spatially separated sub-populations develop different behaviours, A and B, which are both ESSs. The sub-populations are then slowly brought into contact, perhaps through natural expansion of each of the sub-populations. One of these behaviours, say A, even though it may have been intrinsically fitter than B, may be lost when the two populations are brought into contact because it fares poorly in interaction with B, because B may be more numerous. In
principle, if individuals within the two sub-populations evolved distinguishable markers, such as the secretion of a unique, identifiable chemical which correlated with behaviour and promoted assortative interactions, then competition between the groups would be enabled when the two are brought into contact. That is, although A loses to B, A-A wins in interaction with B-B. This would mean that the two sub-populations had formed higher-level units of selection. A model provided by McElreath, Boyd and Richerson [9] in the domain of cultural evolution shows that individual selection can evolve markers that are correlated with behaviour and promote assortative interactions. Their work does not address the notion that such marking facilitates higher-level selection - they are interested in the promotion of stably coexisting (ethnically marked) groups. But we show that the conditions they illustrate for evolving behaviourally-correlated markers are also suitable to thereby facilitate inter-group competition.
2 A Model of the Evolution of Assortative Markers under Some Degree of Spatial Segregation
Our model is founded upon the work of McElreath et al. [9]. We describe individuals as exhibiting a behaviour, labelled A or B, and a ‘marker’ trait, labelled 1 or 2. This can be interpreted as an externally apparent phenotype used for the purpose of facilitating assortative interactions (as per Boyd et al.'s work [10]) or any other trait which has the effect of producing assortative interactions, such as a habitat preference [11]. Upon initialization individuals are distributed between two sub-populations, and we start from the scenario where each is already at a stable A or B ESS. The algorithmic operation of the model for each subsequent time step is as follows:

1. Interactions - All individuals interact with each other within the same sub-population in a coordination game with the payoff matrix shown in Table 1.

Table 1. Payoff matrix for interactions between individuals harbouring either behaviour A or behaviour B. This system exhibits both A and B as ESSs.

          A           B
  A   1 + δ + α       1
  B       1         1 + δ
An individual interacting with another holding the same behavioural trait enjoys an advantage δ, but only the marker can influence which other individual is likely to be interacted with. A parameter e describes assortativity, or marker strength: the probability that an individual will interact with another possessing the same marker. When e = 1 individuals will only interact with others carrying the same marker as themselves, and when e = 0 individuals interact totally randomly, with no reliance upon the marker.
48
J.R. Snowdon, S.T. Powers, and R.A. Watson
2. Reproduction - The proportions, N, of each marker, i, and behaviour, j, within the next generation of individuals change according to Equation 1, where Wij is the total payoff awarded to each behaviour/marker type (A1, A2, B1 and B2) and W̄ij is the average payoff awarded to individuals of that type:

Nij = Wij / W̄ij     (1)

3. Migration - Prior to being brought together, there is a degree of migration between the two groups, represented by the parameter m. This relocates a random proportion of each group to within the other, thus providing a metric of spatial segregation. A value of m = 0 implies that no migration exists and the groups are completely segregated, while a larger value of m (maximum 0.5) represents a freely mixed population. In order to generate a between-group fitness variance, A-A interactions are given an additional payoff bonus α, and groups are brought into a competitive state by bringing the sub-populations together after a fixed period by setting the inter-group migration to a maximum, m = 0.5.
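The steps above can be sketched as frequency-based updates. This is a minimal illustration with our own helper names; equal sub-population sizes are assumed for simplicity, unlike the 2000/3800 split used in the simulations:

```python
# Sketch of the model's per-timestep updates on (behaviour, marker)
# type frequencies: coordination-game payoffs (Table 1), reproduction
# proportional to payoff (Equation 1), and symmetric migration of a
# fraction m between the two sub-populations.

DELTA, ALPHA = 0.5, 0.1   # like-for-like bonus and A-A bonus

def game_payoff(b1, b2):
    """Payoff of Table 1 to an individual with behaviour b1
    interacting with behaviour b2."""
    if b1 == b2:
        return 1 + DELTA + (ALPHA if b1 == "A" else 0)
    return 1

def reproduce(freqs, payoffs):
    """Next-generation frequencies proportional to frequency * payoff."""
    weighted = {t: freqs[t] * payoffs[t] for t in freqs}
    total = sum(weighted.values())
    return {t: w / total for t, w in weighted.items()}

def migrate(pop1, pop2, m):
    """Exchange a fraction m of each sub-population's frequency mass
    (equal sub-population sizes assumed here)."""
    mixed1 = {t: (1 - m) * pop1[t] + m * pop2[t] for t in pop1}
    mixed2 = {t: (1 - m) * pop2[t] + m * pop1[t] for t in pop2}
    return mixed1, mixed2

# A type with twice the payoff grows from 1/2 to 2/3 of its population:
new = reproduce({"A1": 0.5, "B2": 0.5}, {"A1": 2.0, "B2": 1.0})
assert abs(new["A1"] - 2 / 3) < 1e-12

# m = 0.5 mixes two fully polarised sub-populations completely:
p1, p2 = migrate({"A1": 1.0, "B2": 0.0}, {"A1": 0.0, "B2": 1.0}, 0.5)
assert p1 == p2 == {"A1": 0.5, "B2": 0.5}
```

In the full model the payoff each type receives depends on the assortativity e, since the marker biases which partners an individual meets.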
3 Exploring the Behaviour of Two Sub-populations
It would of course be possible to set up the initial conditions of the two groups such that markers correlated with behaviours will be reached regardless of interaction. However, we set the initial conditions of the sub-populations such that they have different behaviours in the majority and the same marker in the majority. This means that without some selective pressure to cause markers to diversify, both groups will have the same marker and assortative interactions will not be possible. We also set the size of one sub-population to be slightly larger than the other - the sub-population with the inferior behaviour - for similar reasons. That is, we want to identify conditions where the superior behaviour prevails despite being initially disadvantaged. We will refer to this correlation of marker/behaviour pairs as the evolution of assortative markers. The simulation was run with the initial parameters shown in Table 2, and as will be seen the values of e and m come under further investigation, so these are not fixed.

Table 2. Parameters used for the simulation process

Parameter                          Value
Sub-population 1 Prop. 1 Marker    0.9
Sub-population 1 Prop. 2 Marker    0.1
Sub-population 1 Size              2000
Sub-population 2 Prop. 1 Marker    0.6
Sub-population 2 Prop. 2 Marker    0.4
Sub-population 2 Size              3800
Like-for-like payoff, δ            0.5
A-A payoff bonus, α                0.1

Crucially, groups are initialised such that if the initial migration rate is too high then the B behaviour will overwhelm, before markers have evolved in the
system, due to the larger sub-population 2 holding a higher majority of the B type. If the migration rate is too low then assortative markers will not evolve despite the sub-populations being behaviourally marked. Under these conditions, should assortative marking occur, we would expect sub-population 1 to be the one that adopts marker 1. When the groups are freely mixed after an initial partial mixing phase, we would expect the ultimate results to be predictable from the basins of attraction shown in figure 1. If markers have not been able to evolve (for example from a zero migration rate) then we would expect the distribution of each marker type in a sub-population to be random, as frequencies would drift. The ultimately winning strategy would be B: when the sub-populations are mixed, the majority (65%) of the newly formed group would come from the all-B group 2, and this would overwhelm any advantage which A-A receives, since A types would be unable to successfully interact assortatively. If the migration rate is too high then all individuals in both sub-populations will become type B due to the initial majority of behaviour B in the entire population, and so the mixed group would then already be at the all-B attractor. However, if assortative markers have evolved then at e = 0.5, despite only 35% of the mixed group being of A1 type, the resultant attractor is expected to be A1, as figure 1(b) shows.
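The basin-of-attraction logic of figure 1 can be reproduced with a simple expected-payoff calculation. The closed form below is our own illustration (it assumes markers are fully correlated with behaviour and ignores self-exclusion when drawing a random partner), using the Table 1 payoffs with δ = 0.5 and α = 0.1:

```python
# Expected payoffs to A and B types when a fraction p of the population
# is A and, with probability e, an individual interacts with a partner
# of the same marker (here assumed identical to its own behaviour);
# otherwise the partner is drawn at random. Illustrative sketch only.

DELTA, ALPHA = 0.5, 0.1

def payoff_A(p, e):
    random_part = p * (1 + DELTA + ALPHA) + (1 - p) * 1
    return e * (1 + DELTA + ALPHA) + (1 - e) * random_part

def payoff_B(p, e):
    random_part = p * 1 + (1 - p) * (1 + DELTA)
    return e * (1 + DELTA) + (1 - e) * random_part

def fixation_threshold(e, steps=10**5):
    """Smallest p (on a grid) at which A's expected payoff exceeds B's."""
    for i in range(steps + 1):
        p = i / steps
        if payoff_A(p, e) > payoff_B(p, e):
            return p
    return 1.0

# With full assortativity, A-A beats B-B from any starting proportion;
# with random mixing, A needs roughly 45% of the population to win.
assert fixation_threshold(1.0) == 0.0
assert abs(fixation_threshold(0.0) - 0.5 / 1.1) < 1e-3
```

Under random mixing the crossover solves 1.6p + (1 − p) = p + 1.5(1 − p), i.e. p = 0.5/1.1 ≈ 0.45, which is why a 35% A1 share loses at e = 0 but can win once assortativity raises A's expected payoff.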
[Figure 1: two panels plotting Relative Type Payoff against Proportion of A type, for the A and B types.]

Fig. 1. Relative fitness of A1 and B2 type when in competition in a single population. The line indicates the lowest initial proportion of A1 type required for A1 to reach fixation. (a) Full correlation between behaviour and marker (e = 1). (b) No correlation between behaviour and marker (e = 0).
Figure 2 illustrates system behaviour for a fixed e = 0.5 and varying migration rates. Figure 2(a) introduces a metric of the polarisation of marking present within the sub-populations as the linkage disequilibrium of marker and behaviour after an appropriate period of mixing. A higher value shows assortative markers have evolved, and 0 shows that such marking has not occurred. It is clear that a region exists where a degree of spatial segregation enables the evolution of such assortative marking.
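The polarisation metric can be computed as the standard linkage disequilibrium between marker and behaviour, D = p(A1) − p(A)·p(1). A minimal sketch (the count layout and function name are our illustration; the paper does not give code):

```python
def linkage_disequilibrium(counts):
    """Linkage disequilibrium D between behaviour (A/B) and marker (1/2).

    counts: dict mapping the four types 'A1', 'A2', 'B1', 'B2' to their
    frequencies in the population.
    """
    total = sum(counts.values())
    p_A = (counts["A1"] + counts["A2"]) / total   # freq. of behaviour A
    p_1 = (counts["A1"] + counts["B1"]) / total   # freq. of marker 1
    p_A1 = counts["A1"] / total                   # freq. of the A1 type
    return p_A1 - p_A * p_1

# Fully polarised population (only A1 and B2 present): D = 0.25
print(linkage_disequilibrium({"A1": 500, "A2": 0, "B1": 0, "B2": 500}))
# Uncorrelated population: D = 0
print(linkage_disequilibrium({"A1": 250, "A2": 250, "B1": 250, "B2": 250}))
```

A value of 0.25 is the maximum for equal behaviour and marker frequencies, matching the "polarised markers" region of Fig. 2(a); 0 indicates no marking has evolved.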
J.R. Snowdon, S.T. Powers, and R.A. Watson
[Figure 2: two panels plotting (a) Behaviour/Marker Covariance and (b) Proportion of A1 winning against Migration Rate (0 min, 0.2 max).]

Fig. 2. (a) Analysis over a range of spatial segregation values, when e = 0.5, displaying marker/behaviour covariance evolved with spatial segregation after 600 time steps, averaged over 20 runs. (b) Resulting proportion of A1 types which went on to reach fixation, determined from an average of the same 20 runs.
Figure 2(b) shows the proportion of runs in which, when the groups were then brought together, the A1 behaviour type went on to reach fixation and drive the B2 behaviour extinct. Again we see that there is a region where A1 is able to outcompete B2 - and this corresponds to the region of polarised markers. Taking single points from the graph in figure 3, we can observe the internal behaviour between the two sub-populations which leads to these results:

– No initial contact, m = 0.0: because the migration rate is zero there is no pressure for markers to evolve, so when the groups are brought together the B behaviour types overwhelm the A behaviour.

– Moderate contact, m = 0.01: although sub-population 2 starts with marker 1 in the majority, there is a pressure not to interact with migrants of majority A1 type from sub-population 1. This causes sub-population 2 to switch to marker 2, which becomes associated with behaviour B. When the sub-populations are freely mixed this results in A1 winning.

– High contact, m = 0.2: behaviour B fixates within the entire population due to the higher migration rate, and again there is no pressure for markers to evolve. Accordingly the resultant ‘winner’ is the non-polarised behaviour B.
4 Discussion
We have shown that, as in McElreath et al.'s work [9], a small amount of initial mixing between sub-populations that exhibit different behaviours can produce a selective pressure favouring the evolution of markers that are correlated with those behaviours. With no mixing the markers have no function; with too much mixing one of the behaviours is lost before markers evolve; but a period with a small
[Figure 3: six panels of type counts (A1, A2, B1, B2) against time step, for sub-population 1 (top row) and sub-population 2 (bottom row), under no initial contact (B wins), moderate contact (A1 wins) and high contact (B wins).]

Fig. 3. Illustrating interactions with varying initial contact between sub-populations. Sub-population 1 (above) is fixed at size 2000, and sub-population 2 (below) is fixed at size 3800. At t = 600 the sub-populations are mixed through setting inter-group migration to maximum, resulting in the domination of one behaviour.
amount of mixing produces this effect. In essence, this occurs because it enables a period in which the weak indirect selective pressures on markers can be felt, whilst precluding the strong selective pressures on behaviours that would eliminate behavioural diversity before the markers have evolved. This effect means that partial (but not complete) spatial segregation between sub-populations enables individuals to evolve behaviours that reinforce within-group interactions, and we show that the assortative interactions that result facilitate competition between groups when increased mixing occurs. The evolution of markers in this way can be seen as construction of an individual's social environment [12]. Group selection can be facilitated by such a process, as shown here and in other work [13,14]. In conventional altruist/cheat dynamics involving the evolution of assortative markers there exists a possibility of cheats evolving the phenotypic altruistic marker, such as a green beard [2], signalling altruistic intent without actually possessing a cooperative gene. Because this model holds multiple stable ESSs the issue of cheating does not occur, as an individual falsely advertising its behaviour would receive a lower payoff than if it had been honest. This model illustrates very simple conditions where individual selection can favour the evolution of traits that support between-group competition. Accordingly, from one point of view, the expectation that individual selection cannot create significant group selection is shown to be false. But from another point of view - one which takes into account individual selection on both markers and behaviours under these spatial conditions - individual selection explains the outcomes we observe. Indeed, if this were not the case, we would not have provided an evolutionary explanation at all.
Nonetheless, because the markers differ from behaviours in that they have fitness consequences only via their indirect effects on the assortativity of behaviours, we argue that a two-scale selection theory is conceptually useful.
References

1. Sober, E., Wilson, D.S.: Unto Others: The Evolution and Psychology of Unselfish Behavior. Harvard University Press, Cambridge (1998)
2. Dawkins, R.: The Selfish Gene. Oxford University Press, Oxford (1976)
3. Maynard Smith, J.: Evolution and the Theory of Games. Cambridge University Press, Cambridge (1982)
4. Wilson, D.S., Sober, E.: Reintroducing group selection to the human behavioral sciences. Behavioral and Brain Sciences 17(4), 585–654 (1994)
5. Wilson, D.S.: A theory of group selection. PNAS 72(1), 143–146 (1975)
6. Powers, S.T., Watson, R.A.: Evolution of individual group size preferences can increase group-level selection and cooperation. In: Proceedings of the 10th European Conference on Artificial Life (2009) (to appear)
7. Wilson, D.S.: Complex interactions in metacommunities, with implications for biodiversity and higher levels of selection. Ecology 73, 1984–2000 (1992)
8. Boyd, R., Richerson, P.J.: Group selection among alternative evolutionarily stable strategies. J. Theor. Biol. 145, 331–342 (1990)
9. McElreath, R., Boyd, R., Richerson, P.J.: Shared norms and the evolution of ethnic markers. Current Anthropology 44(1), 122–129 (2003)
10. Boyd, R., Richerson, P.J.: Culture and the Evolutionary Process. University of Chicago Press, Chicago (1985)
11. Wilson, D.S., Dugatkin, L.A.: Group selection and assortative interactions. Am. Nat. 149, 336–351 (1997)
12. Powers, S.T., Mills, R., Penn, A.S., Watson, R.A.: Social environment construction provides an adaptive explanation for new levels of individuality. In: Proceedings of the ECAL 2009 Workshop on Levels of Selection and Individuality in Evolution: Conceptual Issues and the Role of Artificial Life Models (2009)
13. Mills, R., Watson, R.A.: Symbiosis enables the evolution of rare complexes in structured environments. In: Proceedings of the 10th European Conference on Artificial Life (2009) (to appear)
14. Watson, R.A., Palmius, N., Mills, R., Powers, S.T., Penn, A.S.: Can selfish symbioses effect higher-level selection? In: Proceedings of the 10th European Conference on Artificial Life (2009) (to appear)
Evolution of Individual Group Size Preference Can Increase Group-Level Selection and Cooperation

Simon T. Powers and Richard A. Watson

Natural Systems group, University of Southampton, U.K.
[email protected]

Abstract. The question of how cooperative groups can evolve and be maintained is fundamental to understanding the evolution of social behaviour in general, and the major transitions in particular. Here, we show how selection on an individual trait for group size preference can increase variance in fitness at the group level, thereby leading to an increase in cooperation through stronger group selection. We are thus able to show conditions under which a population can evolve from an initial state with low cooperation and only weak group selection, to one where group selection is a highly effective force.
1 Introduction
Understanding how cooperative social groups can evolve and be maintained is a major challenge in both artificial life [1,2] and evolutionary biology [3,4]. In particular, work on cooperative group formation has gained an impetus in recent years through a recognition of the major transitions in evolution [5], and the corresponding realisation that cooperation can be a driving force in evolution and not a mere curio [6]. The major transitions include the evolution of multi-cellular organisms from single cells, and the evolution of societies from solitary organisms. In these transitions, lower-level entities (single cells, solitary organisms) formed groups and donated part of their personal fitness to contribute to shared group success, to the point where cooperation amongst the lower-level entities was so great that the group ultimately became an individual in its own right [5,6] (e.g., somatic cells give up independent reproductive rights in multi-cellular organisms, likewise for sterile workers in eusocial insect colonies). Such transitions therefore represent premier examples of cooperative group formation. From the viewpoint of artificial life, understanding the mechanisms behind these transitions, and cooperative group formation more generally, may help us to understand the evolution of increased complexity [1,7], and to create applied systems where the evolution of a high degree of cooperation is supported [2].

The key difficulty that any explanation for the evolution of cooperative groups must overcome is this: if performing a group-beneficial cooperative act entails some cost, then why should an individual donate a component of its own fitness in order to contribute to the success of its group? Surely selfish cheats who do not themselves cooperate but instead simply reap the benefits of other group

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 53–60, 2011.
© Springer-Verlag Berlin Heidelberg 2011
S.T. Powers and R.A. Watson
members’ cooperation should be expected to be fitter, for they receive all of the group benefits whilst paying none of the individual costs. However, cooperation can nevertheless evolve, even in the face of selfish cheats, if there is competition and hence selection between groups [8]. This is because groups with more cooperators will, all other things being equal, outcompete other groups that are dominated by selfish individuals. In this way, selection acting between groups provides an evolutionary force favouring cooperation. However, because selfish cheats still enjoy a relative fitness advantage within each group through free-riding, any between-group selection for cooperation is opposed by within-group selection for selfishness. In such a scenario selection is therefore multi-level, operating both within and between groups. The degree to which group cooperation evolves then depends on the extent to which selection between groups is stronger than selection within groups [8].

However, it is commonly held that the conditions under which between-group selection can overpower within-group selection are rather limited [9]. In particular, there must be a high variance in the proportion of cooperators between different groups, such that some groups contain many more cooperators, and are hence much more productive, than others. One way that this variance can be generated is if groups are formed assortatively, such that cooperators tend to form groups with other cooperators. This could happen, for example, if groups were founded by kin [10], for the genealogical relatives of a cooperator will themselves tend to be cooperators. Another way a high between-group variance can arise is if groups are founded by individuals sampled at random from the global population, but the number of founding individuals is small [11]. In this case the groups will start out as small, unrepresentative samples of the population and so will tend to have a high variance.
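The effect of founding-group size on between-group variance can be illustrated with a quick sampling sketch (the founder sizes, population parameters and function name here are our own illustrative choices): groups founded by n individuals drawn at random from a population with cooperator frequency p have variance ≈ p(1 − p)/n in their cooperator proportion, so smaller founding groups vary more.

```python
import random
import statistics

def between_group_variance(p, founder_size, n_groups=5000, seed=0):
    """Variance in cooperator proportion across groups founded by
    uniform random sampling from a population with cooperator frequency p."""
    rng = random.Random(seed)
    proportions = [
        sum(rng.random() < p for _ in range(founder_size)) / founder_size
        for _ in range(n_groups)
    ]
    return statistics.pvariance(proportions)

# Smaller founding groups give much higher between-group variance
# (expected values p(1-p)/n: 0.125 for n = 2 vs 0.0125 for n = 20).
print(between_group_variance(0.5, 2))
print(between_group_variance(0.5, 20))
```

This is the binomial-sampling effect the text describes: the tenfold difference in founder size yields roughly a tenfold difference in group-level variance.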
Selection at the group-level will be more effective as a result of this increased variance, and the increased effect of group selection would lead to greater cooperation. However, the initial group size must typically be very small if significant group-level variance and hence selection is to be generated by this mechanism. We consider here how such groups with high variance, and that are consequently much affected by group selection, can evolve from an initial population where between-group variance is low and group selection weak. In order to do so we relax an assumption that is present in nearly all other multi-level selection models, namely, that initial group size must remain fixed over evolutionary time. Rather, we consider here the possibility that many organisms across all taxa may be able to influence the size of their group to some extent, through genetically coded (individual) traits. As an example, bacteria living in biofilms have a trait that controls the amount of extracellular polymeric substance that they secrete [12]. Since this substance allows bacterial cells to bind together, the amount produced could influence microcolony (group) size. In our model we then consider the evolution of both a group size preference trait such as this, and a behavioural trait that determines whether an individual cooperates or not, such that both traits evolve concurrently. If smaller groups lead, through increased group selection, to increased cooperation then individuals with a preference for
Evolution of Group Size Can Increase Cooperation
founding smaller groups could potentially be selectively favoured, for they would experience a greater frequency of the benefits of cooperation in their groups and hence be on average fitter. In our previous work [13] we showed that, in principle, individuals with a preference for a group size very much smaller than the current size could invade if they arose in a sufficiently high frequency. Such a sudden, large decrease in group size is easier to evolve, since the increase in cooperation is greater if there is more of a decrease in group size. However, if group size decreased gradually, which seems a more plausible evolutionary mechanism, then it is not clear whether there could be sufficient immediate individual benefit to be selectively favoured. Here, we examine conditions under which groups of a smaller initial size can in fact evolve from larger ones, simply through gradual unbiased mutations on individual size preference, even when there are some other advantages to being in a larger group (e.g., better predator defence, access to resources that a smaller number of individuals cannot obtain [14]). We find that large jumps in size preference, as modelled by our previous work, are not required under (negative) frequency-dependent within-group selection. Consequently, this process can provide a gradualist explanation for the origin of cooperative groups.
2 Modelling the Concurrent Evolution of Initial Group Size and Cooperative Behaviour
The simulation model presented here considers a population structured as follows. Individuals reproduce within discrete groups for a number of generations, before individuals in all groups disperse and form a global migrant pool (the dispersal stage). New groups are then formed by a uniform random sampling of individuals from this migrant pool, and the process of group growth, dispersal, and reformation repeats. Selection within groups occurs during the group growth stage, where individuals reproduce and have fitness-affecting interactions with other group members (i.e., selfish individuals will have more offspring than cooperators within that same group). On the other hand, selection between groups occurs at the dispersal stage, since those groups that have grown to a larger size (i.e., those with more cooperators) will contribute more individuals to the migrant pool. This kind of population structure is very similar to that considered in some classical group selection models [15,11,10], and especially fits organisms that live on temporary resource patches which become depleted, thereby periodically forcing dispersal and the formation of new groups.

Unlike previous models, we give individuals a genetically coded initial group size preference. An individual genotype then contains two loci: the first codes for cooperative or selfish behaviour, the second for an initial group size preference. Mutation on these loci happens after the dispersal stage; double mutations within a single organism are not allowed. Mutation at the size locus is done by adding 1 to, or subtracting 1 from, the current size preference (with 50% probability each); mutation at the behaviour locus is done by swapping to the other behaviour. Groups are formed randomly with respect to behaviour (cooperative
or selfish), but assortatively on size, such that individuals with a preference for smaller groups tend, on average, to find themselves in such groups, and likewise for individuals with a preference for larger groups. This is implemented by creating a list of individuals in the migrant pool sorted ascendingly by their size preference. Individuals from this list are then added in order to a group, until the size of the group exceeds the mean preference of the group members, at which point a new group is created for the next individual and is populated in the same manner. An algorithmic overview of the model is as follows:

1. Initialisation: Let the migrant pool be the initial population.
2. Group formation: Assign individuals in the migrant pool to groups.
3. Reproduction: Perform reproduction and selection within groups for a fixed number of generations, using the fitness functions defined below.¹
4. Dispersal: Place all individuals into the migrant pool.
5. Mutation: Mutate individuals in the migrant pool.
6. Iteration: Repeat from step 2 onwards until an equilibrium is reached.
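The size-assortative group formation (step 2) can be sketched as follows; representing each individual as a (behaviour, size_preference) tuple is our own simplification, not the authors' code:

```python
def form_groups(migrant_pool):
    """Assign individuals to groups assortatively on size preference.

    migrant_pool: list of (behaviour, size_preference) tuples.
    Individuals are taken in ascending order of size preference; a group
    is closed once its size exceeds the mean preference of its members,
    and the next individual starts a new group.
    """
    ordered = sorted(migrant_pool, key=lambda ind: ind[1])
    groups, current = [], []
    for ind in ordered:
        current.append(ind)
        mean_pref = sum(i[1] for i in current) / len(current)
        if len(current) > mean_pref:
            groups.append(current)   # this group is now full
            current = []             # next individual founds a new group
    if current:
        groups.append(current)       # flush the final partial group
    return groups

pool = [("C", 1), ("S", 1), ("C", 2), ("S", 2)]
print([len(g) for g in form_groups(pool)])  # → [2, 2]
```

Note that individuals preferring small groups end up together in small groups, and individuals preferring large groups in large ones, while behaviour plays no role in assignment.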
2.1 Within-Group Selection: Snowdrift and Prisoner's Dilemma Games
To model within-group selection and reproduction, we assume that group members have fitness-affecting interactions with each other as defined by the payoff matrix in Table 1, where b is the benefit of cooperating and c the cost. If b > c then this represents the Snowdrift game [16], where a coexistence of cooperative and selfish individuals is supported at equilibrium in a freely-mixed population, since both types enjoy a fitness advantage when rare. Such an equilibrium coexistence of behaviours can occur if cooperators are able to internalise some of the benefits of cooperating. For example, if cooperation involves the production of a public good, then cooperators may be able to keep a fraction of the good they produce for themselves [17]. Where this occurs cooperators will receive, on average, a greater per capita share of the benefits of cooperation, giving them a fitness advantage. However, if the benefit of cooperation becomes discounted with additional cooperators, but the cost remains the same, then selfish individuals will become fitter as cooperators increase above a threshold frequency [17]. The Snowdrift game is thus a model of negative frequency-dependent selection leading to a mixed equilibrium; there is a growing realisation that this type of dynamic occurs in many biological systems [16,17]. It should be stressed that despite an equilibrium coexistence of behaviours, it is still the case that mean fitness would be higher if all individuals cooperated. This then means that if the game is played in a group-structured population, groups with more cooperators will still be fitter than those with fewer. Group selection can, therefore, still further increase the global frequency of cooperation. For c > b > c/2, the payoff matrix in Table 1 yields the classical Prisoner's Dilemma [16], where selfish individuals are always fitter and cooperators are

¹ We rescale the groups after each generation to maintain a constant population size.
Table 1. Payoff matrix for within-group interactions

                        Cooperate   Selfish
  Payoff to Cooperate   b − c/2     b − c
  Payoff to Selfish     b           0
driven extinct at equilibrium in a freely-mixed population. Again, however, mean fitness would be higher if all individuals cooperated. We generalise the structure of this 2-player payoff matrix to a group of n players by multiplying the payoff matrix by the proportion of behaviours within the group, as is standard when forming a replicator equation in evolutionary game theory; in our model this corresponds to treating each group as a separate freely-mixed population. Doing so yields the following fitness equations, where w_c is the fitness of cooperators, w_s the fitness of selfish individuals, p the proportion of cooperators within the group, w_0 a baseline fitness in the absence of social interactions, and σ_n a sigmoidal function that provides a benefit depending on group size n, as described below:

  w_c = p(b − c/2) + (1 − p)(b − c) + w_0 + σ_n
  w_s = p·b + w_0 + σ_n
  σ_n = β / (1 + e^(−μn)) − β/2

σ_n is a sigmoidal function which takes the current group size as input, and has as parameters a gradient μ (which determines how quickly the benefit tails off as the group gets larger) and a maximum fitness benefit β. This provides what is known in ecology as an Allee effect, whereby there is an advantage in numbers when groups are small, but as the group grows this is overwhelmed by the negative effects of crowding, i.e., the effect saturates with increasing size [14].
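The fitness equations can be transcribed directly (the function and parameter names are ours; the paper gives only the equations):

```python
import math

def sigma(n, beta, mu):
    """Group-size benefit (Allee effect): 0 at n = 0, saturating as n grows."""
    return beta / (1 + math.exp(-mu * n)) - beta / 2

def fitnesses(p, n, b, c, w0=1.0, beta=0.0, mu=0.4):
    """Return (cooperator fitness, selfish fitness) for a group of size n
    containing a proportion p of cooperators."""
    s = sigma(n, beta, mu)
    w_c = p * (b - c / 2) + (1 - p) * (b - c) + w0 + s
    w_s = p * b + w0 + s
    return w_c, w_s

# Snowdrift (b/c = 1.1): cooperators are fitter when rare (w_c ≈ 1.1 > w_s = 1.0)...
print(fitnesses(p=0.0, n=20, b=1.1, c=1.0))
# ...but in the Prisoner's Dilemma (b/c = 0.9) selfish individuals are always
# fitter, even among all-cooperators (w_c ≈ 1.4 < w_s ≈ 1.9).
print(fitnesses(p=1.0, n=20, b=0.9, c=1.0))
```

With beta = 0 the group-size benefit vanishes, reproducing the first experiment in Sect. 3; beta = 1, mu = 0.4 reproduces the second.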
3 Results
We investigate here the conditions under which an individual preference for groups of a smaller initial size can evolve, and lead to greater group selection and cooperation. In all cases, we start all individuals out with the same size preference (20), and then consider the evolutionary dynamics that occur through mutation and selection. The initial frequency of the cooperation allele is taken to be the global equilibrium frequency that would occur in the model if group size were fixed at the starting size. We then record the changes in mean initial group size and global proportion of cooperators over time. A particular focus of this study is to contrast the effects of within-group selection modelled on the Snowdrift versus the Prisoner's Dilemma game. We set b/c = 1.1 to yield the Snowdrift game, and b/c = 0.9 to produce the Prisoner's Dilemma. The following parameter settings were used throughout: 3 generations within groups prior to dispersal, w_0 = 1, mutation rate 1%, 90% of mutations on the size locus, and population size 1000.
We initially considered a case where there is no intrinsic advantage to larger groups (i.e., set σn = 0). We found that given time for a sufficient number of mutations to accumulate, a population could evolve down to an initial group size of 1, and 100% cooperation, from a large range of initial conditions. The reason initial group size tended towards size 1 is that when there is no intrinsic benefit to larger groups, such an initial size maximises cooperation and hence absolute individual fitness. A representative example of the population dynamics is shown in Fig. 1. Interestingly, our results show that a selective gradient towards smaller groups does not always exist at the start; although it was present in the Snowdrift game, in the Prisoner’s Dilemma the process relied on genetic drift until such a time as very small groups were created. This drift can be seen by individual size preferences spreading out in both directions at the start, whereas in the Snowdrift game the mass of the population moved in one direction, thus showing the presence of an individual selective gradient favouring smaller groups from the outset.
[Figure 1: two heat-map panels (Snowdrift, Prisoner's Dilemma) of group size preference (10–40) against dispersal cycles (500–1500).]

Fig. 1. Change in group size preference under Snowdrift and Prisoner's Dilemma games (lighter shades = greater frequency in population)
Next, we considered the case where there is an intrinsic advantage to larger groups (σ_n parameters: β = 1 and μ = 0.4). In such cases, the optimum initial group size (in terms of absolute individual fitness) is a trade-off between the benefit of cooperation and the intrinsic advantage to larger groups. It will thus be greater than 1, but not so large as to prevent any significant between-group selection and hence cooperation: using the parameters here, the optimum is 4. The question is then whether this optimum size can be reached by mutation. Figure 2 shows a representative case in which mean group size decreases to the optimum in the Snowdrift, but not the Prisoner's Dilemma, game. The reason size preference cannot decrease in the Prisoner's Dilemma is that drift can no longer be effective if there is an intrinsic advantage to larger groups, for the advantage of larger groups provides a selective pressure away from small sizes. By contrast, in the Snowdrift game a counter selective gradient towards smaller groups and increased cooperation exists from a much larger range of initial conditions. This counter gradient is provided by the benefits of increased cooperation that can be realised in smaller groups due to increased group selection. In the Prisoner's
[Figure 2: two heat-map panels (Snowdrift, Prisoner's Dilemma) of group size preference (10–40) against dispersal cycles (1000–3000), and one panel of cooperation frequency over dispersal cycles for both games.]

Fig. 2. Allowing for some benefit to larger groups. Left and centre: change in group size preference (lighter shades = greater frequency). Right: proportion of cooperators.
Dilemma there is no such gradient because group selection is completely ineffective over much of the parameter space, and so a small mutation on size would not lead to any increase in cooperation. The existence of a selective gradient over a larger range of parameters follows from the fact that the Snowdrift game maintains a low frequency of cooperators at (group) equilibrium. Consequently, because one type cannot be driven extinct there is always the possibility that group-level variance can be generated when the groups are reformed, even when groups are large, as shown in some of our previous work [18]. As a result, moving to a slightly smaller group size can further increase this variance, and allow increased group selection and a subsequent increase in cooperation. By contrast, in the Prisoner’s Dilemma game moving to a slightly smaller group size would not increase the effect of group selection over much of the parameter space, for cooperators are driven extinct over a large range of group sizes, thereby destroying the possibility of any group variance. Where this occurs, there cannot be selection for a smaller group size, and so the process must rely on drift, which cannot overcome an intrinsic advantage to larger groups. Our results therefore demonstrated that the evolution of smaller initial group size and greater cooperation is much more plausible if within-group selection takes the form of a Snowdrift game.
4 Concluding Remarks
Many of the major transitions involved the evolution of mechanisms that increased variance and hence selection at the group level, thereby allowing a high degree of cooperation between group members to evolve [6]. One such mechanism could be a reduction in initial group size. For example, multi-cellular organisms are themselves cooperative groups of individual cells. Most multi-cellular organisms develop from a single cell, and it has been argued that this evolved at least partly because it increased between-organism (cell group) variance, and hence increased cooperation between individual cells within the organism [5,6]. We have shown here how selection on an individual trait can lead to the evolution of increased variance in fitness at the group level, and hence a rise in cooperation between group members through increased group selection. Our results
demonstrated that such a process is much more plausible if negative frequency-dependent selection, as modelled here by the Snowdrift game, operates within groups.

Acknowledgements. Thanks to Alex Penn and Seth Bullock for many discussions, and Rob Mills for feedback on the manuscript.
References

1. Stewart, J.: Evolutionary transitions and artificial life. Artificial Life 3(2), 101–120 (1997)
2. Nitschke, G.: Emergence of cooperation: State of the art. Artificial Life 11, 367–396 (2005)
3. Hammerstein, P.: Genetic and Cultural Evolution of Cooperation. MIT Press, Cambridge (2003)
4. West, S.A., Griffin, A.S., Gardner, A.: Evolutionary explanations for cooperation. Current Biology 17(16), R661–R672 (2007)
5. Maynard Smith, J., Szathmáry, E.: Major Transitions in Evolution. Spektrum (1995)
6. Michod, R.E.: Darwinian Dynamics: Evolutionary Transitions in Fitness and Individuality. Princeton University Press, Princeton (1999)
7. Bedau, M.A.: Artificial life: organization, adaptation, and complexity from the bottom up. Trends in Cognitive Science 7, 505–512 (2003)
8. Wilson, D.S., Sober, E.: Reintroducing group selection to the human behavioral sciences. Behavioral and Brain Sciences 17(4), 585–654 (1994)
9. Maynard Smith, J.: Group selection. Quarterly Review of Biology 51, 277–283 (1976)
10. Wilson, D.S.: Altruism in Mendelian populations derived from sibling groups: The haystack model revisited. Evolution 41(5), 1059–1070 (1987)
11. Wilson, D.S., Colwell, R.K.: Evolution of sex ratio in structured demes. Evolution 35(5), 882–897 (1981)
12. Flemming, H.C., Neu, T.R., Wozniak, D.J.: The EPS matrix: The "house of biofilm cells". Journal of Bacteriology 189(22), 7945–7947 (2007)
13. Powers, S.T., Penn, A.S., Watson, R.A.: Individual selection for cooperative group formation. In: Almeida e Costa, F., Rocha, L.M., Costa, E., Harvey, I., Coutinho, A. (eds.) ECAL 2007. LNCS (LNAI), vol. 4648, pp. 585–594. Springer, Heidelberg (2007)
14. Avilés, L.: Cooperation and non-linear dynamics: An ecological perspective on the evolution of sociality. Evolutionary Ecology Research 1, 459–477 (1999)
15. Maynard Smith, J.: Group selection and kin selection. Nature 201, 1145–1147 (1964)
16. Doebeli, M., Hauert, C.: Models of cooperation based on the Prisoner's Dilemma and the Snowdrift game. Ecology Letters 8(7), 748–766 (2005)
17. Gore, J., Youk, H., van Oudenaarden, A.: Snowdrift game dynamics and facultative cheating in yeast. Nature 459, 253–256 (2009)
18. Powers, S.T., Penn, A.S., Watson, R.A.: The efficacy of group selection is increased by coexistence dynamics within groups. In: Proceedings of ALife XI, pp. 498–505. MIT Press, Cambridge (2008)
Implications of the Social Brain Hypothesis for Evolving Human-Like Cognition in Digital Organisms

Suzanne Sadedin and Greg Paperin

Clayton School of Information Technology, Monash University, Vic. 3800, Australia
[email protected]

Abstract. Data show that human-like cognitive traits do not evolve in animals through natural selection. Rather, human-like cognition evolves through runaway selection for social skills. Here, we discuss why social selection may be uniquely effective for promoting human-like cognition, and the conditions that facilitate it. These observations suggest future directions for artificial life research aimed at generating human-like cognition in digital organisms.

Keywords: artificial intelligence, sociality, evolution, social selection, Machiavellian intelligence.
1 Introduction
One of the most ambitious goals of Artificial Life research is the development of a digital organism that exhibits human-like cognitive behaviour [1, 2]. Such human-like behaviours include creative innovation, tool manufacture and use, self-awareness, abstract reasoning, and social learning [3-6]. Early researchers in artificial intelligence (AI) hoped that human-like cognition would rapidly emerge from AIs endowed with inference systems and explicit symbolic representations of their universe [2]. They were too optimistic. These inference systems, composed of search strategies and formal logic, provided good solutions in domains with semantics of limited complexity, such as games, mathematical problems and other formal systems, as well as simple model worlds [7, 8]. However, they failed to exhibit any useful intelligence in real-world settings [1]. Biomimicry is the development of engineered designs using biological principles. In the 1980s, Brooks [9] and others argued that the true challenge of AI was to engineer systems that exhibit not human-like but insect-like intelligence: a robust capability to perceive and respond in complex and dynamic environments. This view sparked many advances in areas such as machine learning, evolutionary computation and artificial life. Yet despite these advances, an engineered design for human-like cognition remains elusive [1]. While many new systems perform well in specialised applications in complex environments, the integration of low-level behavioural intelligence and cognitive reasoning appears to hit a complexity wall [1]. We propose that evolutionary in-silico techniques may provide a way around the complexity wall by recreating conditions that promote human-like cognition in animals. Empirical and theoretical studies are approaching consensus on both the features associated with human-like cognition in animals and the evolutionary conditions required [3-6]. These studies imply that our cognitive abilities are not a collection of independent traits, but a predictable product of selection in specific social environments. This view is termed the social brain or Machiavellian intelligence hypothesis [10, 11]. We discuss how social brain evolution may be embodied in a model system to develop human-like cognition in a digital context.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 61–68, 2011. © Springer-Verlag Berlin Heidelberg 2011
2 Patterns in the Evolution of Human-Like Cognition Numerous studies show that large brains are strongly associated both with human-like cognition and with specific social systems and behaviours, but only weakly related to ecological factors. Thus, human-like cognition evolves as a social, not ecological, adaptation. Today, this view is widely accepted for primate cognition; recent reviews also support extending it to other taxa [3, 6, 12-15]. Brains and ecology. Relative brain size (brain-body mass ratio) differs dramatically among species. Among mammals, exceptionally large brains occur in primates, cetaceans (dolphins and whales) [6], and carnivores [12, 16]. Most bird brains are larger than those of comparable mammals, with the largest non-primate brains occurring in parrots and corvids (crows, jays and magpies) [13, 17, 18]. Ecological selection does not account for these patterns [17, 19-21]: why should birds need bigger brains than mammals? Why should seed-eating parrots need extremely large brains? Physical capacity for complex, flexible behaviour also seems unrelated to brain size: brain size and forelimb dexterity are uncorrelated in terrestrial carnivores [16], and handless dolphins are second only to apes in relative brain size [6]. Brains and social system. The idea that sociality might drive brain evolution was first proposed in the mid-twentieth century [19, 22, 23], but only recently attained wide acceptance [4]. In contrast to the diverse ecologies of large-brained animals, their social behaviour is strikingly similar. Large-brained vertebrates interact with unrelated adults, forming lasting and powerful social relationships with some [13, 14, 24]. Most primates, cetaceans and carnivores share fission-fusion group dynamics, where coalitions of varying size and stability interact, cooperate and compete [21]. Large-brained birds more often form relatively stable socially or genetically monogamous pairs [24]. 
In all of these groups, long-term cooperative relationships among unrelated adults powerfully influence reproductive success [24]. Dunbar [11] noted that group size and relative brain size are strongly related in primates, providing the first evidence that selection for social skills may drive the evolution of primate cognition. However, studies of other taxa suggested the result might be specific to primates: group size in parrots and hoofed mammals does not correlate with brain size [3, 25]. Closer attention reveals that the relation holds for some non-primates: brain size correlates with social group size in carnivorous and insectivorous mammals [26] and cetaceans [6]. However, Emery et al. [24] showed that brain size in birds was related to relationship duration, with social monogamists
having smaller brains than genetic monogamists. Similarly, hoofed mammals with unusually large brains do not aggregate in large, anonymous herds, but form small, close-knit social groups with persistent relationships [14]. Brains and human-like cognition. Many aspects of cognition once thought to be confined to humans have now been documented in a range of species. For example, elephants [27], dolphins [28], corvids [29] and apes can recognize themselves in mirrors, a classical test of self-awareness (although one that favours visually-oriented animals). Social transmission of learned behaviour is also seen in these animals, allowing simple forms of cultural evolution [30]. Creative tool-making and tool-use have also been observed in many of these species [15]. Thus, while none of these species exhibits human-like technology or complex language, their traits suggest evolutionary antecedents to human cognition in diverse groups. Surveys of brain structure and size show that the behaviours described above – self-awareness, flexible problem-solving, tool manufacture and use, innovation and social learning – are consistently associated both with unusually large brains, and with specific social behaviours [3, 6, 20]. Large-brained animals show extended parental care, mature slowly and are playful even in adulthood [25, 31, 32]. They use sound both creatively and playfully, and their songs are wholly or partially learned [33]. Positive relationships between relative brain size, tool use, innovation and social learning are documented across a broad range of species [5, 6, 13, 20, 34]. In addition, early birth/hatching, slow maturation, long lifespan and extended parental care correlate with these behaviours and with relative brain size [32]. Reader and Laland [20] noted that innovation rate and social learning are correlated independent of brain size. 
This result implies that there is no trade-off between social and asocial learning; rather, social cognition provides general cognitive benefits. The same study showed that social group size does not correlate with social learning when other factors are controlled. This observation supports the idea that cognitive evolution is not determined by the volume of social interactions per se, but rather reflects the importance of personal relationships.
3 Social Selection on Social Brains The dominant explanation for the above observations is a theoretical framework termed the social brain hypothesis [4, 10, 22, 23, 35]. A key factor in this framework is social selection, which occurs when the fitness of individuals depends on interactions with others. Social selection can drive the evolution of dramatic displays due to positive feedback which progressively exaggerates signal traits that reliably indicate social benefits [36-38]. For most species, the traits relevant to social success are primarily physical, so displays reflect individual physical ability. Complex strategies may be invoked by social displays without complex cognition, so social selection alone need not lead to brain evolution. However, when power is distributed in a network of cooperative individual relationships, the ability to manipulate this social network may become the target of social selection, resulting in social selection for improved cognition (Figure 1). Formulations of the social brain hypothesis vary in
their emphasis on social complexity, display, innovation, culture, cooperation and cheating. All, however, incorporate these factors in a positive feedback loop that drives increases in cognitive ability until resource limits are met. The social network is a unique selective environment in that fitness depends on the ability to model the minds and relationships of others; as minds and relationships become more complex, so do these models [22]. Through social selection for such cognitive ability, extreme displays of social skill may evolve, such as humour, artistry and song [36]. Increased behavioural flexibility resulting from improved cognition may then reduce the influence of ecological selection further [34, 39]. Cooperation emerges and persists only under restrictive evolutionary conditions [40]; in particular, models suggest that it is likely to evolve in viscous social networks where repeated interactions allow punishment of non-cooperators [41]. For animals with limited cognition, group sizes that are too large to recall individual interactions undermine cooperation. Competition, however, is more intense when there are more competitors. Consequently, it seems plausible that social selection and selection for cooperative relationships combine most forcefully at the largest group sizes and interaction rates at which cooperation can still evolve given existing cognitive limits. This observation may explain why species that live in medium-sized groups often have the largest brains.
[Fig. 1 diagram: a feedback loop linking selection for social status (with relatively weaker selection for adaptation to the physical environment), selection for the ability to manipulate the social network, increased behavioural flexibility, tool use and communication skills, selection for the ability to impress peers and reason about their intentions, and sophisticated cognition, imagination, artistry and song.]
Fig. 1. Conceptual model for the evolution of human-like cognition through social selection. When fitness is determined by social interactions, selection for social skills drives the evolution of displays of cognitive ability in a positive feedback loop which is ultimately inhibited by external resource limitations.
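The runaway character of this loop, bounded by external resource limits, can be illustrated with a toy dynamical model. This is our own illustrative sketch, not a model from the paper: the logistic form and all parameter values are assumptions.

```python
def runaway_selection(c0=0.01, r=0.5, K=1.0, steps=40):
    """Toy dynamics for the feedback loop of Fig. 1: social selection
    amplifies cognitive ability c in proportion to its current level,
    while resource limits (carrying capacity K) cap the runaway.
    The discrete logistic form is an illustrative choice only."""
    c = c0
    trajectory = [c]
    for _ in range(steps):
        c = c + r * c * (1.0 - c / K)  # growth slows as c approaches K
        trajectory.append(c)
    return trajectory

traj = runaway_selection()
```

For r below 1 this trajectory rises monotonically and saturates just below K, mirroring the verbal argument that cognitive ability increases "until resource limits are met".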
Long-term relationships also appear crucial to the evolution of human-like cognition [6, 14, 24]. Social choices made without relationship commitment have immediate consequences but can be easily modified. In contrast, relationships that require large investments impose strong selection for appropriate partner choice; this may be why birds that are genetically rather than socially monogamous have larger brains [24]. A ubiquitous feature of long-term relationships is courtship. Courtship involves playful and coordinated display activities such as dancing, duets, and mutual grooming; it seems likely that these activities allow animals to assess the value of social investment [31].
4 Evolving Artificial Human-Like Cognition
The results described in Sections 2 and 3 imply that features of human-like cognition are likely to evolve only when populations are faced with:
- A complex network of social interactions
- Selection for social status
- The ability to communicate with other members of the social network
- The ability to observe other members of the social network and their actions
Possible ways to instantiate these features in artificial life are suggested below.
Individuals. In most animals, cognition is performed by natural neural networks. Thus, one option is to represent individuals using an artificial neural network. However, as discussed in Section 3, in most animals these networks evolved not for abstract reasoning, but to channel responses to specific stimuli. Clearly, brains can be used for human-like cognition, but some functions may be prerequisite; these may include the ability to represent objects, associative memory, communication, and basic empathy. The natural evolution of such functions took a very long time, and the functions themselves may be extremely complex. Including pre-wired components that perform such basic functions may therefore facilitate the evolution of more interesting properties. Artificial neural networks are not the only possible substrate for cognition. It has been shown that artificial life (although not intelligent life) can arise in a universe of co-evolving computer programs (e.g., [42, 43]). Logical and probabilistic systems can also exhibit limited representational thinking (see the review in [44]). Combining such techniques with evolutionary systems may yield an appropriate representation. The genotype representation in the model must support phenotypes of varying size and complexity. Lenton and Van Oijen [45] define a hierarchy of intrinsic control dynamics in adaptive systems; in their terms, the representational system for individuals must exhibit at least the third order of control. Second-order systems with a fixed representational structure may show learning capabilities in specific domains, but are not expected to develop open-ended cognition.
Fitness. Reproductive success (fitness) must be based on the social standing of an individual relative to its peers. One way to do this is to supply every newborn individual with a unit amount of an abstract resource R.
This represents the amount of social status it can confer on other individuals. Throughout its lifetime, an individual contributes to the social status of its peers by supplying them with an arbitrary proportion of R. An individual must share the entire unit amount of R during its lifetime, but cannot share R that it received from other individuals (it cannot pass on respect received from others). The social status (relative fitness) of an individual is determined by the total R it receives throughout its lifetime.
Environment. We have argued that human-like cognition evolves only when selection targets social network skills. In social selection, individuals must perceive the actions of others (such as donating R or communicating a statement). In the real world, perception depends on individual location, but allowing perception to be determined by social network proximity seems sufficient for the current model.
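A minimal sketch of this status-resource scheme follows. The class and function names are our own, and the random donation decisions are placeholders: in the proposed model, evolved controllers would choose recipients and amounts.

```python
import random

class Individual:
    """Carries one unit of the status resource R to give away, and
    accumulates R received from peers (received R cannot be re-shared)."""
    def __init__(self, ident):
        self.ident = ident
        self.r_to_give = 1.0    # own unit of R, shareable
        self.r_received = 0.0   # accumulated social status (relative fitness)

    def donate(self, peer, amount):
        # Only the individual's own R may be shared, never received R.
        amount = min(amount, self.r_to_give)
        self.r_to_give -= amount
        peer.r_received += amount

def competition_phase(population, eps=1e-9):
    """Run until every individual has shared its entire unit of R.
    Donations here are random placeholders for an evolved policy."""
    while any(ind.r_to_give > eps for ind in population):
        donor = random.choice([i for i in population if i.r_to_give > eps])
        recipient = random.choice([i for i in population if i is not donor])
        donor.donate(recipient, random.uniform(0.0, donor.r_to_give))

population = [Individual(i) for i in range(10)]
competition_phase(population)
total_status = sum(ind.r_received for ind in population)  # all R ends up shared
```

Note that total status is conserved: the population's summed `r_received` equals the number of unit endowments, so fitness is strictly relative, as the text requires.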
Communication. Without a physical environment, there is no explicit communication channel in the model universe. A statement can be encoded as a tuple (S, D, M), where S and D are the source and destination individuals respectively, and M is a message. The role of noise, and the circumstances under which a message between S and D may be observed by others, require further consideration. The encoding of M may be critical for the evolution of meaningful communication and of cognitive abilities: a particular choice must agree with the representation chosen for individuals. Most generally, M may be represented as a bit string with a fixed maximum length. Individuals communicate randomly at the beginning of the model's evolution, but once some individuals begin to be influenced in their actions by perceived communication, others will gain a selective advantage from discovering this and from sending specific messages. This may lead to the evolution of primitive communication, a prerequisite for complex social interactions.
Model lifecycle. The system can be initialised with a set of populations, each consisting of 10 to 30 randomly created individuals. The lifecycle is based on non-overlapping generations, each consisting of two phases: competition and reproduction. During the competition phase, each individual receives a unit amount of resource R. The individuals then communicate by passing messages as described earlier. At any time during this phase, an individual may choose to pass on some of its R to any other individual. This process continues until all individuals have passed on all of their initial R. The reproduction phase then ensues, with individual reproductive success proportional to the amount of R received during the competition phase. Mutation supplies variation. Existing adults are replaced by juveniles and the competition phase begins for the new generation.
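The statement encoding and the generational scheme above might be sketched as follows. This is a sketch under assumptions: the message length, the use of plain identifiers, and the uniform fallback when no status has been distributed are our choices, not specified in the text.

```python
import random

MSG_LEN = 16  # assumed maximum message length in bits

def random_message(src, dst):
    """A statement is a tuple (S, D, M): source id, destination id,
    and a bit-string message of fixed maximum length."""
    return (src, dst, tuple(random.randint(0, 1) for _ in range(MSG_LEN)))

def next_generation(population, status, mutate):
    """Non-overlapping generations: juveniles replace all adults, with
    reproductive success proportional to the R (social status) each
    adult received during the competition phase."""
    weights = [status[ind] for ind in population]
    if sum(weights) == 0:
        weights = [1.0] * len(population)  # degenerate case: uniform choice
    parents = random.choices(population, weights=weights, k=len(population))
    return [mutate(p) for p in parents]
```

For example, with status {a: 1, b: 0, c: 0} and an identity `mutate`, every juvenile descends from a; mutation would then supply the variation the lifecycle requires.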
5 Conclusions We have argued that human-like cognitive traits evolve naturally in a restrictive set of evolutionary conditions when the social network of an organism is its primary selective environment. In particular, organisms that can increase their fitness by adaptively and flexibly forming long-term, high-commitment relationships seem most likely to evolve large brains, creative innovation, tool use, social learning and abstract reasoning. These principles could be embodied in an artificial life system to develop human-like cognition in a digital context.
References [1] Bedau, M., McCaskill, J., Packard, N., Rasmussen, S., Adami, C., Green, D., Ikegami, T., Kaneko, K., Ray, T.: Open problems in artificial life. Artificial Life 6, 363–376 (2000) [2] Dreyfus, H.L.: Why Heideggerian AI failed and how fixing it would require making it more Heideggerian. Artificial Intelligence 171, 1137–1160 (2007) [3] Emery, N., Clayton, N.: The mentality of crows: convergent evolution of intelligence in corvids and apes. Science 306, 1903–1907 (2004) [4] Flinn, M., Geary, D., Ward, C.: Ecological dominance, social competition, and coalitionary arms races: Why humans evolved extraordinary intelligence. Evolution and Human Behavior 26, 10–46 (2005)
[5] Lefebvre, L., Reader, S., Sol, D.: Brains, innovations and evolution in birds and primates. Brain, Behav. Evol. 63, 233–246 (2004) [6] Marino, L.: Convergence of complex cognitive abilities in cetaceans and primates. Brain Behav. Evol. 59, 21–32 (2002) [7] Giunchiglia, F., Villafiorita, A., Walsh, T.: A Theory of Abstraction, Universita di Genova facolta di ingegneria, dipartimento informatica sistemistica telematica, Genova, Technical Report MRG/DIST #97-0051 (November 1997) [8] Nilsson, N.J.: Shakey the robot, SRI International Menlo Park, Menlo Park, CA, Technical note no. 323 A819854 (April 1984) [9] Brooks, R.A.: Achieving artificial intelligence through building robots, Massachusetts Institute of Technology, Boston, MA, Technical report A.I. Memo 899 (May 1986) [10] Byrne, R.: Machiavellian intelligence: Social expertise and the evolution of intellect in monkeys, apes, and humans. Oxford University Press, Oxford (1989) [11] Dunbar, R.: Grooming, gossip, and the evolution of language. Harvard Univ. Pr., Cambridge (1998) [12] Holekamp, K., Sakai, S., Lundrigan, B.: Social intelligence in the spotted hyena (Crocuta crocuta). Philosophical Transactions of the Royal Society B: Biological Sciences 362, 523 (2007) [13] Marler, P.: Social cognition: Are primates smarter than birds. Current Ornithology 13, 1–32 (1996) [14] Schultz, S., Dunbar, R.: Both social and ecological factors predict brain size in ungulates, pp. 207–215 (2006) [15] Seed, A., Clayton, N., Emery, N.: Cooperative problem solving in rooks (Corvus frugilegus). Proceedings of the Royal Society B: Biological Sciences 275, 1421 (2008) [16] Iwaniuk, A., Pellis, S., Whishaw, I.: Brain size is not correlated with forelimb dexterity in fissiped carnivores (Carnivora): a comparative test of the principle of proper mass. Brain Behav. Evol. 54, 167–180 (1999) [17] Bennett, P., Harvey, P.: Relative brain size and ecology in birds. Journal of zoology. 
Series A 207, 151–169 (1985) [18] Iwaniuk, A., Dean, K., Nelson, J.: Interspecific allometry of the brain and brain regions in parrots (Psittaciformes): comparisons with other birds and primates. Brain, Behav. Evol. 65, 40–59 (2005) [19] Chance, M., Mead, A.: Social behaviour and primate evolution. In: Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans (1988) [20] Reader, S., Laland, K.: Social intelligence, innovation, and enhanced brain size in primates. Proceedings of the National Academy of Sciences 99, 4436 (2002) [21] Perez-Barberia, F., Shultz, S., Dunbar, R., Janis, C.: Evidence for coevolution of sociality and relative brain size in three orders of mammals. Evolution 61, 2811–2821 (2007) [22] Humphrey, N.: The social function of intellect. In: Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans, pp. 13–26 (1988) [23] Jolly, A.: Lemur social behavior and primate intelligence. Science 153, 501–506 (1966) [24] Emery, N., Seed, A., von Bayern, A., Clayton, N.: Cognitive adaptations of social bonding in birds. Philosophical Transactions of the Royal Society B: Biological Sciences 362, 489 (2007) [25] Pearce, J.: Animal learning and cognition: An introduction. Psychology Pr., San Diego (1997) [26] Dunbar, R., Bever, J.: Neocortex size predicts group size in carnivores and some insectivores. Ethology 104, 695–708 (1998)
[27] Plotnik, J., de Waal, F., Reiss, D.: Self-recognition in an Asian elephant. Proceedings of the National Academy of Sciences 103, 17053 (2006) [28] Reiss, D., Marino, L.: Mirror self-recognition in the bottlenose dolphin: A case of cognitive convergence. Proceedings of the National Academy of Sciences 98, 5937 (2001) [29] Prior, H., Schwarz, A., Güntürkün, O.: Mirror-induced behavior in the magpie (Pica pica): Evidence of self-recognition. PLoS Biol. 6, e202 (2008) [30] Laland, K., Hoppitt, W.: Do animals have culture? Evolutionary Anthropology: Issues, News, and Reviews 12 (2003) [31] Chick, G.: What is play for? Sexual selection and the evolution of play. Theory in Context and Out, 1 (2001) [32] Iwaniuk, A., Nelson, J.: Developmental differences are correlated with relative brain size in birds: a comparative analysis. Canadian Journal of Zoology 81, 1913–1928 (2003) [33] Fitch, W.: The biology and evolution of music: A comparative perspective. Cognition 100, 173–215 (2006) [34] Sol, D., Duncan, R., Blackburn, T., Cassey, P., Lefebvre, L.: Big brains, enhanced cognition, and response of birds to novel environments. Proceedings of the National Academy of Sciences 102, 5460–5465 (2005) [35] Dunbar, R.: The social brain hypothesis. Evolutionary Anthropology 6, 178–190 (1998) [36] Nesse, R.: Runaway Social Selection for Displays of Partner Value and Altruism. Biological Theory 2, 143–155 (2007) [37] Tanaka, Y.: Social selection and the evolution of animal signals. Evolution, 512–523 (1996) [38] Wolf, J., Brodie III, E., Moore, A.: Interacting phenotypes and the evolutionary process. II. Selection resulting from social interactions. The American Naturalist 153, 254–266 (1999) [39] Sol, D., Timmermans, S., Lefebvre, L.: Behavioural flexibility and invasion success in birds. Animal Behaviour 63, 495–502 (2002) [40] Axelrod, R., Hamilton, W.: The evolution of cooperation.
Science 211, 1390–1396 (1981) [41] Santos, F., Pacheco, J., Lenaerts, T.: Cooperation prevails when individuals adjust their social ties. PLoS Comput. Biol. 2, e140 (2006) [42] Adami, C., Brown, C.T.: Evolutionary learning in the 2D artificial life system “Avida”. In: Fourth International Conference on the Synthesis and Simulation of Living Systems (ALife IV), pp. 377–381. MIT Press, Cambridge (1994) [43] Ray, T.S.: An approach to the synthesis of life. In: Second International Conference on the Synthesis and Simulation of Living Systems (ALife II), vol. 10, pp. 371–408 (1991) [44] Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River (2003) [45] Lenton, T.M., Van Oijen, M.: Gaia as a Complex Adaptive System. Philosophical Transactions of the Royal Society: Biological Sciences 357, 683–695 (2002)
Embodiment of Honeybee's Thermotaxis in a Mobile Robot Swarm
Daniela Kengyel¹, Thomas Schmickl², Heiko Hamann², Ronald Thenius², and Karl Crailsheim²
¹ University of Applied Sciences St. Poelten, Computersimulation, Austria
² Artificial Life Laboratory of the Department of Zoology, Karl-Franzens University Graz, Universitätsplatz 2, A-8010 Graz, Austria
[email protected], {heiko.hamann,thomas.schmickl}@uni-graz.at, {ronald.thenius,karl.crailsheim}@uni-graz.at
Abstract. Searching an area of interest based on environmental cues is a challenging benchmark task for an autonomous robot. It gets even harder if the goal is to aggregate a whole swarm of robots at such a target site after exhaustive exploration of the environment. For searching out gas leakages or heat sources, swarm robotic approaches have been evaluated in recent years, in part inspired by biologically motivated control algorithms. Here we present a bio-inspired control program for swarm robots, which collectively explore the environment for a heat source at which to aggregate. Behaviours of young honeybees were embodied in a robot by adding temperature sensors as 'virtual antennae'. This enables the robot to perform thermotaxis, which was evaluated in a comparative study of an egoistic versus a collective swarm approach.
1 Introduction
1.1 Motivation and Problem Formulation
Eusocial insects, a group including termites, bees, and ants, are a fascinating and inspiring subject; they live in well-organized colonies. Colonies of honeybees are precisely regulated. For example, the temperature in the brood nest is kept constant at approx. 36 °C. This is very important for the brood, because deviations could lead to disorders in development or even the loss of the brood [1]. One challenge for a colony of honeybees in such a hive is that the bees have to organize and orient themselves in a temperature gradient without the help of light. Such a temperature gradient is inhomogeneous and dynamic. This is a complicated task, because the antennae of a honeybee are close to each other, but it is solved collectively by the bees [2]. Finding a location with a special property, such as a heat source or a leakage in a gas pipe, is a common challenge in robotics, especially if the environment changes over time and no global gradient points to the target spot. One approach to tackle such challenges is a bio-inspired one, e.g. using a swarm to perform a parallelized and coordinated search.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 69–76, 2011. © Springer-Verlag Berlin Heidelberg 2011
Following this idea, swarm robotics [3] emerged from combining autonomous robots with techniques originating from the field of swarm intelligence [4]. In swarm-intelligent systems, the collective of agents performs decision making without a central unit that decides. As summarized by [5], swarm-intelligent solutions are robust, flexible, scalable and (computationally) cheap to implement on the individual. Camazine et al. [6] showed that these properties also apply to biological swarms and social insect colonies, which exploit self-organizing processes to achieve swarm intelligence. As an example, natural selection sharpened the individual behaviour of bees in a way that maximizes colony fitness. Self-organisation and swarm intelligence can enhance this colony fitness significantly. In the following, we focus on groups of young honeybees, which are an example of such a swarm-intelligent system. Young honeybees need a certain environmental temperature. The temperature in the hive usually varies between 32 °C and 36 °C, while the optimal temperature for young honeybees is approximately 35 °C [7]. This temperature is found at the brood nest of the hive, which is preferred by the young honeybees [8]. Former experiments also showed that a single bee is mostly not able to detect the area of optimal temperature when the temperature gradient is rather flat [9].
1.2 Derivation of the Bio-inspired Solution
A model that reproduces this behaviour of young bees without assuming any explicit communication between them was reported in [10]. Inspired by the observed behaviour of young honeybees and based on this model, an algorithm was developed for a swarm of robots, which enables the robots to find an area with a special property collectively [2,10]. This typical (swarm) robotic task has frequently been realized with areas marked by light and using light sensors, because this experimental configuration is considered technically the simplest. We adopted the algorithm reported in [2] to consider the specific aspects of a temperature field in contrast to a light gradient field. The algorithm is based on robot-to-robot encounters; for simplicity, we call it the BEECLUST algorithm from here on. By using heat instead of light, the robots' environment gets closer to the original bio-inspired situation young honeybees are faced with. Because heat differs from light in its physical characteristics (e.g., warming-up period, thermal diffusion, turbulences in the airflow), we show here how the physical properties of stimuli are reflected in the swarm's control algorithm. The typical engineering approach is very different from this swarm-intelligent approach: In his book, Braitenberg describes algorithms, such as simple gradient ascent, for robots that are sensitive to their environment [11]. There are several works describing complex approaches with cooperating sensor networks and robot groups (e.g., [12]): the sensor network measures the temperature gradient and communicates it to the robot swarm, and the robots are able to localize themselves globally and perform a simple gradient ascent. A single robot that produces heat trails and follows them is reported in [13].
2 Material and Method
2.1 Hardware and Electronics
For our robot experiments, we needed a small robot with two temperature sensors. A good solution was to upgrade an off-the-shelf robot with a suitable sensor system. In addition to an extensible I/O board, the Hemisson robot has six infrared sensors, which we used for collision avoidance and for the identification of other robots. Three of these sensors point to the front, one to each side and one to the back. Locomotion is realized by two wheels with a differential drive system. In addition, we developed a circuit for processing the measured temperature. The circuit with its temperature sensors is able to detect temperatures over a range of 140 °C. We re-mapped this range to the temperature interval [10 °C, 60 °C] relevant to honeybees, represented by digital sensor values on the interval e ∈ [50, 225]. Our measurements indicate that the sensor's voltage is approximately linear within this temperature interval. Using an instrumentation amplifier, the measured signal was amplified to ease data processing. The advantage of an instrumentation amplifier over an operational amplifier is that the former is regulated by a single resistor without affecting the rest of the circuit. Fig. 1A shows the Hemisson with the temperature sensors.
Fig. 1. A: Robot “Hemisson” with temperature sensors. B: Temperature sensor of the Hemisson. C: Robots underneath the heat lamps in the arena.
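Given the reported approximately linear sensor response, the correspondence between digital readings e ∈ [50, 225] and the interval [10 °C, 60 °C] can be expressed as a simple linear map. This is a sketch: the actual calibration constants of the circuit are not given in the text.

```python
def sensor_to_celsius(e, e_min=50, e_max=225, t_min=10.0, t_max=60.0):
    """Linearly map a digital sensor reading e in [e_min, e_max] onto the
    temperature interval [t_min, t_max] in degrees Celsius, assuming the
    approximately linear sensor response reported in the text."""
    if not e_min <= e <= e_max:
        raise ValueError("reading outside calibrated range")
    return t_min + (e - e_min) * (t_max - t_min) / (e_max - e_min)
```

For instance, the endpoint readings 50 and 225 map to 10 °C and 60 °C, and the midpoint reading 137.5 maps to 35 °C, the optimal temperature for young honeybees.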
2.2 Algorithms
Naturally, there are many possibilities to implement the aggregation of autonomous agents at a certain target area. Here, we want to analyze the properties of the BEECLUST algorithm when working with temperature fields instead of light gradients. This bio-inspired algorithm is different from anything that could be considered a standard approach; it might even be perceived as counter-intuitive. However, we hypothesize that it is robust and efficient in multi-robot scenarios. In order to test this hypothesis, we compare BEECLUST to a rather classic approach, a simple gradient ascent algorithm (GAA).
72
D. Kengyel et al.
BEECLUST Algorithm: BEECLUST is adapted from the navigation behaviour of young honeybees. In principle, there are two strategies by which young honeybees find the optimal temperature: In a rather steep gradient field, the individual behaviour of each single bee is effective and dominates. In a rather flat gradient, a single bee is mostly unable to find the spot of optimal temperature. The following group behaviour of young honeybees in such an environment was observed [2]: The bees walk around randomly until they meet another bee. Then they stop and wait. After a certain time, the bees walk around randomly again. This process forms clusters of bees. The higher the temperature is, the longer the bees wait. Hence, cluster sizes and numbers increase over time in the warmer area, whereas they decrease in the colder area. To realize such a behaviour, the robots have to perform several tasks: random movement, discrimination between walls and other robots, collision avoidance, measurement of the local temperature, and calculation of the waiting time. Our program on the Hemisson robot mimics the minimal individual behaviour extracted from honeybees: A robot drives forward until it detects an object in front of it. If this object is an obstacle (e.g., a wall), the robot turns in a random direction by a random angle, uniformly distributed between 40 and 140 degrees in both directions. If the detected object is another robot, the robot stops, measures the temperature, and calculates a waiting time. Robot-to-robot detection is realized by passive sensing with the infrared sensors. Passive sensing means, in contrast to active sensing, that no infrared light is emitted. If a sensor receives IR without emitting IR, there is only one possibility: the IR light was emitted by another robot. The dependency of the waiting time on the measured temperature is also an important factor and is described by a sigmoid curve.
The following equation is used to map the sensor values e to waiting times (for details see [2]):

    w(e) = wmax · e² / (e² + θ),        (1)
where θ indicates the steepness of the resulting curve. Table 1 lists the characteristics of the robot and the other parameters used in this work, in comparison to the parameters of the original BEECLUST [2].

Table 1. Parameters

                                      temperature    light
    speed of the robot                3 cm/s         30 cm/s
    range of infrared sensors         4 cm           6 cm
    robot size                        12 × 13 cm     3 × 3 cm
    max. waiting time wmax            180 s          66 s
    interval of the sensor value e    [50, 225]      [0, 255]
    waiting time function offset θ    21,000         7,000
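Eq. (1) can be evaluated directly with the temperature-column parameters of Table 1. The following sketch is our illustration (function and constant names are ours, not the authors' controller code):

```python
# Waiting-time function of the BEECLUST algorithm (Eq. 1),
# using the "temperature" parameters from Table 1.
W_MAX = 180.0    # max. waiting time in seconds
THETA = 21000.0  # offset controlling the steepness of the sigmoid

def waiting_time(e):
    """Map a digital sensor value e in [50, 225] to a waiting time in seconds."""
    return W_MAX * e**2 / (e**2 + THETA)

# Near the cold end of the sensor range the robot barely waits,
# near the hot end it waits for most of wmax:
print(round(waiting_time(50), 1))   # -> 19.1
print(round(waiting_time(225), 1))  # -> 127.2
```

This makes the aggregation mechanism explicit: encounters in cold regions dissolve almost immediately, while encounters in warm regions pin robots in place for minutes, so clusters grow preferentially under the heat lamps.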
Embodiment of Honeybee’s Thermotaxis in a Mobile Robot Swarm
73
GAA – Gradient-Ascent Algorithm: The GAA represents a simple, straightforward approach to thermotaxis. Collision avoidance is implemented in the same manner as described above. The only difference is that no robot-to-robot detection is implemented and therefore no waiting time is calculated. The GAA works as follows: The robot measures the temperature twice per second. If one sensor reports a temperature difference greater than a threshold of 2 in the digital sensor value e (to achieve a certain robustness against fluctuations), the robot changes its direction and turns towards the warmer area until the difference between the two sensors has vanished. A difference of 2 units in the digital sensor value corresponds to a temperature difference of about 1 °C.

2.3
Setup of the Experiment
Experiment 1: To compare both algorithms, we used three Hemisson robots and built a rectangular arena as shown in Fig. 1C. The arena had a size of 150 cm × 120 cm. Over the right side of the arena we placed three infrared heat lamps at a height of 44 cm, each with a power of 250 W. The lamps generated 30 °C at a height of 10 cm (where the robots measure the temperature) above the target area. The ambient air temperature in the arena (away from the heat lamps) was around 27 °C. The target zone underneath the heat lamps was marked with a surrounding black line, as seen in Fig. 1C; it covered about an eighth of the arena. Initially, the robots (depending on the scenario: one, two, or three) were positioned at the side of the arena opposite the target zone. In the case of BEECLUST, many robot-to-robot detections occur in the first seconds. However, the robots do not remain there, because the waiting time in the cold part of the arena is very small (mostly no waiting time at all). It was important to ensure that the ambient air temperature did not change significantly during an experiment. Such changes were prevented by heating the room to 27 °C before our experiments started. To ensure the same initial conditions, the temperature was measured before each experiment started. In these experiments, the time each robot spent inside the target zone was measured. Each experiment lasted 14 minutes.

Experiment 2: In an additional experiment, the probability of staying within the target area after a robot-to-robot approach was measured after a time of 25 seconds. In the case of BEECLUST, not every robot-to-robot encounter results in a successful detection of the other robot, due to noise and blind spots of the infrared sensors. In the case of the GAA, two approaching robots interfere with each other and might “push” each other out of the target area when trying to avoid collisions.
In this experiment we had five different setups: a) one GAA robot, b) two GAA robots, c) three GAA robots, d) two BEECLUST robots, and e) three BEECLUST robots. The robots started vis-à-vis and were placed 10 cm outside of the target area. The third robot in setups c and e was placed inside the target area and was programmed to stay immobile; it was therefore not included in the determination of the results.
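The GAA turn decision described in Sect. 2.2 reduces to comparing the two sensor readings against the noise threshold of 2 digital units. The sketch below is our illustration (the function and constant names are hypothetical, not taken from the authors' code):

```python
# Sketch of the GAA steering decision: the robot samples its two temperature
# sensors twice per second and turns toward the warmer side only when the
# reading difference exceeds a noise threshold of 2 digital units (~1 °C).
THRESHOLD = 2  # digital sensor units

def gaa_decision(e_left, e_right):
    """Return the steering action for one pair of digital sensor readings."""
    diff = e_left - e_right
    if abs(diff) <= THRESHOLD:
        return "straight"  # difference lies within the noise band
    return "turn_left" if diff > 0 else "turn_right"

print(gaa_decision(120, 117))  # left side warmer -> turn_left
print(gaa_decision(100, 101))  # within noise band -> straight
```

The hysteresis implied by the text (turning continues "until the difference between the two sensors has vanished") would wrap this decision in the robot's control loop.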
3
Results and Discussion
Fig. 2 shows the results of experiment 1 (N = 6 repetitions per setting). Three robots with the GAA spent 25.4% (mean) of the time inside the target zone. This time span is quite short compared to the mean of 49.7% in the runs with a single GAA robot. This is explained by the competition for space underneath the heat lamps: If one robot is inside the target zone and a second robot tries to get into the target zone as well, they may approach each other. The collision avoidance behaviour might then lead to a turn, and a robot might leave the target area.
Fig. 2. Mean of the percentage of the total time a robot spent inside of the target zone (see Fig. 1C). Error bars show the standard deviation, N = 6 per setting.
One single BEECLUST robot spent an eighth of the time in the target area, because the target area covers an eighth of the arena; this value follows statistically, since a single robot cannot stop underneath the heat lamps. Three BEECLUST robots yielded a mean of 36.7%. Compared to the GAA, the swarm-intelligent BEECLUST thus has a higher success rate. If there is no heat source in the arena, the three robots do not aggregate under either algorithm. In test runs with three robots, three cases of robot-to-robot encounters in the target area can occur; these are tested in experiment 2. Fig. 3 shows the relative frequency with which a robot stays in the target area. The final state was assessed 25 seconds after the start, leaving enough time for a robot to get out of the target zone: One single GAA robot stayed inside the target zone with a frequency of 0.75 per trial. Two GAA robots remained there with a frequency of 0.67 per trial, whereas the frequency was 0.54 per trial in the experiment with three GAA robots. Two BEECLUST robots had a frequency of 0.91 per trial, and three BEECLUST robots remained in the target area with a frequency of 0.83 per trial. Significant differences were found for the comparisons: 1 BEECLUST vs. all other comparable settings (chi-square test, P < 0.0001), 2 GAA vs. 2 BEECLUST (P = 0.0330, χ² = 4.547), and 3 GAA vs. 3 BEECLUST (P = 0.0293, χ² = 4.752).
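The reported χ² statistics can be reproduced if the per-trial frequencies are read as counts over 24 mobile robots per setting (2 counted robots × 12 trials, matching N = 12 in Fig. 3); this back-calculation is our assumption, not stated explicitly by the authors:

```python
# Back-of-the-envelope check of the reported chi-square statistics,
# ASSUMING each frequency is based on 24 mobile robots (2 robots x 12 trials).
def chi_square_2x2(stay_a, leave_a, stay_b, leave_b):
    """Pearson chi-square for a 2x2 contingency table, no continuity correction."""
    table = [[stay_a, leave_a], [stay_b, leave_b]]
    n = stay_a + leave_a + stay_b + leave_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = sum(table[i]) * (table[0][j] + table[1][j]) / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

# 2 GAA (16/24 stayed, ~0.67) vs. 2 BEECLUST (22/24 stayed, ~0.91):
print(round(chi_square_2x2(16, 8, 22, 2), 3))   # -> 4.547
# 3 GAA (13/24 stayed, ~0.54) vs. 3 BEECLUST (20/24 stayed, ~0.83):
print(round(chi_square_2x2(13, 11, 20, 4), 3))  # -> 4.752
```

Both values match the χ² figures quoted in the text, which supports this reading of the sample sizes.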
Fig. 3. Comparison of the relative frequency of robots that stay in the target zone for both algorithms and for all possible robot-to-robot encounters, N = 12 per setting
The probability of the GAA robots remaining in the target zone decreases with an increasing number of robots. It is expected that with more robots the probability will decrease further, because the limited resource “space” in the target area becomes even scarcer. Although this holds for the BEECLUST robots as well, they have a higher probability of staying in the target area.
4
Conclusion and Outlook
As expected, the use of heat instead of light brings significant challenges in designing an efficient robotic swarm, due to different physical characteristics. Differences are observed, e.g., in the shape of the temperature gradient, in the time scale of diffusion, and in the time delays in the measurement of temperature. This had to be reflected in the control algorithm by slower motion and turns and by longer waiting periods. The measurement delay originates from the temperature sensors, because they have to heat up themselves before their resistance decreases. If the robot is not exposed to heat long enough, it measures a lower temperature, so the calculated waiting time is shorter. As a consequence, the aggregated robot swarm is not as robust as it would be in a light gradient. A major challenge was the inhomogeneity in the spreading of heat and in the air flows. Because hot air ascends, the heat lamps generate a lot of waste heat, which heats up the ambient air instead of the desired area in the robot arena. Between experiment runs, the initial conditions (defined temperatures at the target area, in the ambient air, etc.) had to be re-established. Otherwise the air temperature saturates close to the temperature in the target zone and the robots can no longer identify the target area, which would critically reduce the effectiveness of the algorithms. Despite these characteristics, which increase the difficulty of the task, the robots were able to locate the target area beneath the heat lamps using BEECLUST. This was possible even though the configuration of the Hemisson’s infrared sensors, used for collision detection and the detection of other robots, has blind spots.
In this paper we focused on building robots that are able to perform thermotaxis. The experiments served as a proof of concept. An exhaustive analysis would require more robots, a larger arena, and a longer period of time. We will continue to use these robots as a model to emulate natural organisms such as honeybees. Furthermore, we will test the robustness of bio-inspired algorithms against obstacles and disturbances of the gradient.
Acknowledgements. This work was supported by the EU-IST-FET project ‘SYMBRION’, no. 216342; the EU-ICT project ‘REPLICATOR’, no. 216240; and the Austrian Science Fund (FWF) research grants P15961-B06 and P19478-B16.
References

1. Stabentheiner, A., Schmaranzer, S., Heran, H., Ressl, R.: Verändertes Thermopräferendum von Jungbienen durch Intoxikation mit Roxion-S (Dimethoat). Mitteilungen der Deutschen G. f. allg. u. angew. Entomologie 6, 514–520 (1988)
2. Schmickl, T., Thenius, R., Möslinger, C., Radspieler, G., Kernbach, S., Crailsheim, K.: Get in touch: Cooperative decision making based on robot-to-robot collisions. Autonomous Agents and Multi-Agent Systems 18(1), 133–155 (2008)
3. Şahin, E.: Swarm robotics: From sources of inspiration to domains of application. In: Şahin, E., Spears, W.M. (eds.) SAB 2004. LNCS, vol. 3342, pp. 10–20. Springer, Heidelberg (2005)
4. Kennedy, J., Eberhart, R.C.: Swarm Intelligence. Morgan Kaufmann Series in Evolutionary Computation. Morgan Kaufmann, San Francisco (2001)
5. Millonas, M.M.: Swarms, phase transitions, and collective intelligence. In: Langton, C.G. (ed.) Artificial Life III. Addison-Wesley, Reading (1994)
6. Camazine, S., Deneubourg, J.L., Franks, N.R., Sneyd, J., Theraulaz, G., Bonabeau, E.: Self-Organizing Biological Systems. Princeton Univ. Press, Princeton (2001)
7. Jones, J.C., Myerscough, M.R., Graham, S., Oldroyd, B.P.: Honey bee nest thermoregulation: Diversity promotes stability. Science 305, 402–404 (2004)
8. Bujok, B.: Thermoregulation im Brutbereich der Honigbiene Apis mellifera carnica. Dissertation, Universität Würzburg (2005)
9. Szopek, M., Radspieler, G., Schmickl, T., Thenius, R., Crailsheim, K.: Recording and tracking of locomotion and clustering behavior in young honeybees (Apis mellifera). In: Spink, et al. (eds.) Proc. Measuring Behavior, vol. 6, p. 327 (2008)
10. Schmickl, T., Hamann, H.: BEECLUST: A swarm algorithm derived from honeybees. In: Xiao, Y., Hu, F. (eds.) Bio-inspired Computing and Communication Networks. Routledge, New York (2011)
11. Braitenberg, V.: Vehicles: Experiments in Synthetic Psychology. MIT Press, Cambridge (1984)
12. Kantor, G., et al.: Distributed search and rescue with robot and sensor teams. In: Proc. 4th Int. Conf. on Field and Service Robotics, pp. 529–538 (2006)
13. Russell, R.: Heat trails as short-lived navigational markers for mobile robots. In: Proc. IEEE Int. Conf. on Robotics and Automation, vol. 4, pp. 3534–3539 (1997)
Positively versus Negatively Frequency-Dependent Selection Robert Morris and Tim Watson Faculty of Technology, De Montfort University, Leicester, United Kingdom, LE1 9BH
[email protected] Abstract. Frequency-dependent selection (FDS) refers to situations where individual fitnesses are dependent (to some degree) on where the individual’s alleles lie in the proximate allele frequency distribution. If the dependence is negative – that is, if alleles become increasingly detrimental to fitness as they become increasingly common at a given locus – then genetic diversity may be maintained. If the dependence is positive, then alleles may converge at given loci. A hypothetical evolutionary model of FDS is here presented, in which the individuals themselves determined – by means of a gene – whether their fitnesses were positively or negatively frequency-dependent. The population ratio of the two types of individual was monitored in runs with different parameters, and explanations of what happened are offered. Keywords: Frequency-dependent selection, multiple alleles, meta gene.
1
Introduction
Ridley’s textbook Evolution [6] contains a good entry on FDS, the opening of which is reproduced here: “Frequency-dependent selection occurs when the fitness of a genotype depends on its frequency. It is possible for the fitness of a genotype to increase (positively frequency-dependent) or decrease (negatively frequency-dependent) as the genotype frequency in the population increases.” Many abstract models of FDS have been studied, with the principal aims of strengthening the theory underpinning these phenomena and exploring the surrounding space of possibilities. Curtsinger [3] looked at many different selection modes and found a condition that determines whether or not the system will converge stably. Asmussen and Basnayake [1] studied several models, focussing on the potential for the maintenance of genetic diversity, and Roff [7] looked at maintaining both phenotypic and additive variation via FDS. Bürger [2] performed an extensive analysis of a general model (of which previously known models can be considered special cases) and gave a near-complete characterisation of the equilibrium structure. Schneider [8] carried out a similar study, with multiple alleles and loci, and found no equilibria possible with more than two alleles at a locus. And Trotter and Spencer [9] investigated the potential for maintaining polymorphism in the presence of positive FDS.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 77–84, 2011.
© Springer-Verlag Berlin Heidelberg 2011
78
R. Morris and T. Watson
In every previous simulation of FDS, the selection regime has been imposed on the individuals by (essentially) the environment, in order to reproduce real-world conditions. In the present FDS model, the novel step is taken of putting the form of the frequency dependence under the control of the individuals. In more precise terms, the individuals have a meta-gene in their genotype that dictates whether their fitness will be proportional to how similar they are to their neighbours, or to how different they are from them. (Cf. the meta-genes in [4] and [5].) The purpose here is not to model any known natural phenomenon, but to speculatively extend the domain of theoretical FDS in an interesting ‘what if?’ way.
2
Experiments and Analysis
A standard genetic algorithm was written whose genotype consisted of one meta gene plus a chromosome. The meta gene was a bit, and the chromosome consisted of non-negative integers in a given range. A zero allele in the meta gene told the fitness function to reward that individual for similarity, and a one told the function to reward it for difference. The measures were based on the concept of Hamming distance, and were implemented in two different ways: pairwise and population-wide. In the pairwise method, an individual’s fitness was calculated by comparing it to a randomly chosen individual from the population (which could have been itself). If the individual in question had a meta gene of zero, its fitness was given by 1 plus the total number of genes – by locus – that it had in common with the other individual, ignoring the meta genes. If the meta gene was a one, the fitness was given by 1 plus the total number of genes that they did not have in common. For example, for two five-locus individuals sharing three genes, the fitness of the first (meta gene 0) with respect to the second would be 4 (= 1 + 3), whereas the fitness of the second (meta gene 1) with respect to the first would be 3 (= 1 + 2). In the population-wide method, an individual’s fitness was calculated by comparing it to every individual in the population (including itself). This was performed by applying the pairwise method down the whole population and tallying up the points. These were the only components of the fitnesses. Six runs were executed for every combination of the following parameter ranges: population size = 10 and 100; chromosome length = 1, 4, and 40; allele range = 2 (i.e. bitstrings), 3, and 4; fitness calculation = pairwise and population-wide; mutation rate = zero, ‘low’, and ‘high’; crossover = off and on (∼70% across each new population). The chromosomes were initialised randomly, but the meta-bits were set alternately to 0 and 1, to prevent biased starts.
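The pairwise fitness described above can be sketched directly; the representation below (a tuple of meta bit and chromosome list) and the example genotypes are our illustration, chosen to be consistent with the worked example in the text:

```python
# Pairwise fitness from Sect. 2: an individual is (meta, chromosome), where
# meta = 0 rewards similarity and meta = 1 rewards difference; the meta genes
# themselves are ignored in the Hamming comparison.
def pairwise_fitness(ind, other):
    meta, chrom = ind
    _, other_chrom = other
    matches = sum(a == b for a, b in zip(chrom, other_chrom))
    if meta == 0:                       # PFD: reward similarity
        return 1 + matches
    return 1 + len(chrom) - matches     # NFD: reward difference

# A hypothetical five-locus pair sharing three genes, reproducing the
# fitnesses 4 and 3 from the worked example:
a = (0, [1, 2, 3, 4, 0])
b = (1, [1, 2, 3, 0, 4])
print(pairwise_fitness(a, b))  # 1 + 3 matches    -> 4
print(pairwise_fitness(b, a))  # 1 + 2 mismatches -> 3
```

The population-wide method simply applies this comparison against every member of the population and tallies the Hamming points.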
The main datum tracked during every run was the ratio of the two types of individual in the population per generation – this is what the vertical axes represent in most of the figures. (The minimal possible fitness was 1, so no individual could have a zero probability of being selected.) A value of 1 indicates that the population is
dominated by individuals whose fitness is negatively frequency-dependent (hereafter “NFDs”); a value of 0 indicates domination by individuals with positively frequency-dependent fitness (“PFDs”); a value of 0.5 indicates a 50:50 mixture.

2.1
Two Alleles
When the bitstring results were plotted and compared, it was seen that neither the presence of crossover nor the choice of fitness function (pairwise or population wide) made much difference. The mutation rate was not particularly important either, so long as it was neither vanishingly small nor excessively high. And the population size and the chromosome length only changed the destiny of the system when they were very small. Figure 1 shows what usually happened in the experiments for all but the extreme settings. The PFDs made more copies of themselves than the NFDs from the outset, and the population soon became dominated by PFDs. The system entered an evolutionarily stable state, which mutation and crossover could not overturn.
Fig. 1. How the ratio of binary PFDs (meta gene = 0) to NFDs (meta gene = 1) varied over the first 200 generations of two sets of six runs. Plot (a) shows a noisy case, and plot (b) shows a more representative case.
Why the PFDs always defeated the NFDs in base 2 (in non-extreme conditions) can be understood in the following terms. If two bitstrings are generated completely at random, the Hamming distance between them will lie somewhere between zero and their length, according to a bell-shaped probability distribution with a peak at half their length. In other words, they will probably have half their bits in common. This means that in a random initial population of 50% PFDs and 50% NFDs, the mixture of fitnesses found in the PFD group will probably be the same as that of the NFD group. In the first few generations, therefore, selection will effectively be random. Different individuals will thus make varying numbers of copies of themselves, so after the first few generations the population will comprise a number of (near-)homogeneous groups of PFDs, a number of (near-)homogeneous groups of NFDs, and a mixture of unique individuals of both types. (Crossover is temporarily being ignored here, but when it is factored in it does not change the result.) In the pairwise fitness method, a member of a group will be compared to one of three kinds of individual: a fellow group member, a member of another group, or one of the singletons. A PFD will get maximum points from a fellow group member, and
(essentially) random points from the others. An NFD, on the other hand, will get minimum points from a fellow group member and random points from the others. This means that as the population evolves, the PFDs get progressively fitter and the NFDs progressively less fit; at the same time, PFD groups that are similar to each other grow faster than PFD groups that are more genetically isolated. The inevitable outcome is that the NFDs all die out as the population drives towards uniformity. The same occurs with the population-wide fitness method, because the same fitness mixture is present. A converged PFD-only population cannot be invaded by an NFD mutant, because such a mutant would have a minimal (or near-minimal) fitness compared to the maximal (or near-maximal) fitnesses of the PFDs. A diverse PFD-only population resists NFD mutants by an amount that correlates negatively with its diversity: if it is approaching convergence, it will have a high resistance, but if it is very mixed, then an NFD mutant could arise with a fitness comparable to those of the PFDs. However, even if an NFD mutant does manage to get into a mixed population and starts spreading, the mutant group will die down, for the same reason that their kind dies out from the start.

2.2
Three Alleles
When the base-3 results were plotted, it was seen that the mutation rate and fitness function were similarly (ir)relevant as for base 2, but that the chromosome length and the population size were important, as was crossover in certain circumstances. One of the most important differences between the two-allele and the three-allele systems was how they behaved initially, from a random start. Whereas with two alleles the population generally converged quickly to zeroes at the meta-locus (i.e. the PFDs dominated), with three alleles the population generally did the opposite and converged to ones at the meta-locus. This was because in every initial population, any two individuals could expect to have on average only 1/3 of their genes in common (as opposed to the bitstring case, where it was 1/2). The NFDs could thus expect fitnesses of ∼2/3 of the maximum, while the PFDs could only expect ∼1/3 of the maximum. Consequently, the NFDs made around twice as many copies of themselves during the first few generations, and so quickly wiped out the PFDs. Hence in every run (excepting some with extreme parameters) the populations took themselves into a negatively frequency-dependent selection regime. (This is also what happens for all larger allele alphabets, so the two-allele situation is exceptional in this regard.) These NFD convergences did not always last: they often turned out to be merely the first of two phases. When the population was large, and particularly when the chromosome was short as well, the NFD domination seemed very stable. Figure 2 shows two examples of this stability, as well as the fast convergences mentioned previously. No PFDs could invade during the timescales of the experiments, some of which went as far as 5000 generations, so these situations were the complements – in terms of the PFD:NFD ratio – of the two-allele situations (though they were not complementary in terms of diversity, because NFD populations stay diverse while PFD populations converge).
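The 1/k expected-match argument above is easy to check numerically. The simulation below is our illustration (uniform random alleles, names and parameters are ours):

```python
import random

# Monte-Carlo check of the argument above: two random chromosomes over k
# alleles agree at a fraction ~1/k of their loci, so an NFD expects ~(k-1)/k
# of the maximum Hamming score while a PFD expects only ~1/k of it.
def mean_match_fraction(k, length=40, pairs=5000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        x = [rng.randrange(k) for _ in range(length)]
        y = [rng.randrange(k) for _ in range(length)]
        total += sum(a == b for a, b in zip(x, y)) / length
    return total / pairs

print(round(mean_match_fraction(2), 3))  # ~0.5 for bitstrings
print(round(mean_match_fraction(3), 3))  # ~1/3 with three alleles
```

This is why base 2 is the exceptional case: only there do the two meta-gene types start with equal expected fitness, leaving the initial outcome to drift.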
Fig. 2. How the ratio of PFDs to NFDs varied during the first and last 100 generations of two sets of six 2000-generation runs. The NFDs dominate in both cases, partly because the chromosomes were short and the populations large.
Two examples of the second phase are plotted in figure 3. In (a), which is representative of most small-population cases, the NFDs died out nearly as quickly as they took over, whereas in (b) – where the populations were large – it took varying lengths of time for the PFDs to successfully invade, with the longest being around 300 generations. As stated earlier, PFD-domination states, where the genotypes are converged or converging, are global attractors in this kind of system, and a return to NFD domination may only occur via a vastly improbable mutation and/or selection sequence.
Fig. 3. How the ratio of PFDs to NFDs varied over the first 200 and 400 generations of two sets of six runs. The PFDs eventually dominate in both cases.
2.3
Multiple Alleles
The sets of results gathered for base-4 chromosomes were almost the same as their base-3 counterparts, and preliminary runs in even higher bases indicated that the patterns continue. Explanations of why certain multi-allele populations can support long-term NFD domination, while others cannot, are now offered. It was found that the configurations most conducive to stable NFD domination were those with very large populations and allele alphabets but very short chromosomes, ideally single-locus. Reducing the population size, reducing the number of alleles, and increasing the chromosome length all tended to reduce the length of time the NFDs could survive before being displaced by PFDs. This result can be explained with an example. In a population of 200 NFD individuals with base-5 single-gene chromosomes (i.e. the genotype comprises 2 genes: the meta gene [0..1] plus the chromosomal gene [0..4]) where there are 40 of each allele, every individual’s fitness – as measured by the inclusive population-wide method – is 161.
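This single-locus example can be verified directly. We read the inclusive population-wide fitness as 1 plus the tallied Hamming points against all 200 individuals (self included), which is the interpretation consistent with the figures quoted here; the code itself is our sketch:

```python
# Verifying the single-locus example: 200 NFD individuals with base-5
# chromosomes of length 1, 40 copies of each allele. Population-wide fitness
# is taken as 1 + the Hamming points tallied against every individual.
def population_fitness(ind, population):
    meta, chrom = ind
    points = 0
    for _, other in population:
        d = sum(a != b for a, b in zip(chrom, other))
        points += d if meta == 1 else len(chrom) - d  # NFD vs. PFD scoring
    return 1 + points

population = [(1, [allele]) for allele in range(5) for _ in range(40)]
nfd = population[0]            # any resident NFD individual
pfd_mutant = (0, nfd[1])       # same chromosome, meta gene flipped

print(population_fitness(nfd, population))         # 1 + 160 -> 161
print(population_fitness(pfd_mutant, population))  # 1 + 40  -> 41
```

An NFD scores a point against the 160 individuals carrying a different allele, while a PFD mutant scores only against the 40 carrying its own allele.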
If an individual experienced a mutation in its meta gene, its fitness would be 41: with only 1/4 of the fitness of its neighbours, it would struggle to survive. And if it did survive and spread a little, its group would continue to struggle, because no matter how many copies it made, as long as there were at least two other alleles in the population, those others would be fitter. The experimental reality would be that the mutant would disappear within a generation or two, as would any copies it managed to make. An observer would have to wait a long time to see a PFD takeover. Now, if the allele range is increased, the expected fitness of a PFD mutant decreases, and if the population size is increased, the amount of ‘work’ it must do to dominate the population increases with it. This is why those settings have the effect they do. Regarding the chromosome length, the effect of increasing it is perhaps counter-intuitive; one might have thought that longer chromosomes would have more capacity to be different from each other, thereby making NFD-domination more stable and longer lasting. Not only is this not the case, it is the opposite of the case, for the following reason. In a well-spaced-out population, at each locus there should be approximately equal ratios of the alleles across the population (so, for example, in a population of 100 base-4 individuals, ∼25 individuals should have a 0 as their first gene, ∼25 should have a 1, etc.). When the chromosome is short, the low capacity for difference means that every gene is important, so every gene has ‘healthy’ selective pressure on it. But when the chromosome is long, a given gene is less important, as it represents only a very small portion of the distance between individuals, so the selective pressure applied to it is relatively light. Consequently, whereas for short chromosomes the local allele ratios are kept under quite tight control, for long chromosomes they can drift and become skewed.
This tends to reduce the genetic distance between individuals, making it easier (to some degree) for PFD mutants to establish themselves in the population.

2.4
Crossover
The last parameter to be discussed is crossover. This operator did not discernibly change the results in most of the runs, but when the populations and chromosomes were (relatively) large and the number of alleles was greater than two, it made a difference. The disappearance of NFDs shown in figure 3(b) did not happen in that same time period in otherwise identical runs without crossover. This may seem strange at first, considering that crossover does not change population-wide gene frequencies at any locus, and that it is those frequencies that control the fitnesses. Something subtle was happening. In a mutation-only system (with a big population, long chromosomes, and multiple alleles) where NFDs are dominant, the population is usually made up of several roughly equally fit groups, within each of which there is homogeneity or near-homogeneity. If a group happens to expand, its members’ fitnesses dip and the fitnesses of the non-members rise, so the group usually shrinks back down. (This is a standard dynamic found in populations subject to negative FDS.) The crucial observation is that when a group
expands, causing certain alleles to become undesirably frequent at several loci, the undesirable genes are carried by identifiable individuals. Evolution can therefore remove these genes by selecting against the individuals that carry them. When crossover of any type is added, it breaks up the group structure of the population, but this in itself does not have much impact on the system’s behaviour. Crossing over can be described as ‘dispersing’ or ‘mixing up’ the alleles – at each locus – across the population. Thus, if any individuals now make extra copies of themselves, the alleles they add to the population are dispersed up and down the loci, so there are no identifiably bad individuals that can be selected against. In other words, instead of selection having ‘guilty’ individuals it can remove, the guilt is spread across the population, so there are no longer any outstandingly guilty individuals.
Fig. 4. The mean fitnesses over the first 300 generations of three sets of six runs
To see the exact effect crossover had, the mean fitnesses were plotted for the relevant base-3 runs. Figure 4 shows how they varied for 0%, 10%, and 70% uniform crossover, where the other parameters were the same as those of plot (b) in figure 3. Figure 4(a) shows a case where the groups scenario was played out. The fitnesses quickly rose to around 2/3 of the maximum – which was to be expected with three alleles – and stayed there. The variability of the values reflects the changing group sizes as well as the drifting of the groups themselves. Plot (c) – which represents the exact same runs as those in figure 3(b) – differs in two key ways from (a). Firstly, there is a gradual drop after the initial rise, and secondly, the curves ‘take off’ at the times that correspond to the PFD-mutant invasions, as PFD populations are fitter than NFD ones. Plot (b) shows that even a very small amount of crossover changes the system’s behaviour. It is roughly the same as (c), the key differences being a higher resistance to PFD invasions and the shorter durations of those invasions when they occur. The former can be attributed to less ‘dispersion’ of unwanted alleles; the latter shows that crossover slows down invasions, because the dispersals make the PFDs less similar to each other. Further runs with more alleles and other population sizes suggested that crossover causes the mean fitness to decay geometrically from its high early value to a stable value somewhere above the halfway value. (Also, the ‘dip depth’ increased with the number of alleles, so the dips in figure 4 are the least severe examples of their kind.) It appears that it is in those resultant regions of stable fitness that selection can promote the rarer alleles as effectively as crossover can assist the commoner ones.
R. Morris and T. Watson

3 Conclusion
This paper has sought to present and explain a novel hypothetical model of frequency-dependent selection in which the individuals determine for themselves whether the frequency dependence of their fitness is positive or negative. It was found that in this particular artificial evolutionary system, the important parameters are the population size, the chromosome length, the allele range, and the presence or not of crossover. When there are only two alleles (the binary case) a random initial population will become stably dominated by individuals whose fitness is positively frequency-dependent. When there are more than two alleles, the population will become dominated by individuals whose fitness is negatively frequency-dependent. These converged states vary in their stability, with their durations depending on the parameters. When they end, they end with the imposition of stable positively-FDS across the population, a state whose arrival can be hastened by making any of the following parameter changes: reducing the population size, increasing the chromosome length, and enabling crossover.
Origins of Scaling in Genetic Code

Oliver Obst¹, Daniel Polani², and Mikhail Prokopenko¹

¹ CSIRO Information and Communications Technology Centre, P.O. Box 76, Epping, NSW 1710, Australia
² Department of Computer Science, University of Hertfordshire, Hatfield, UK
[email protected]

Abstract. The principle of least effort in communications has been shown, by Ferrer i Cancho and Solé, to explain the emergence of power laws (e.g., Zipf’s law) in human languages. This paper applies the principle and the information-theoretic model of Ferrer i Cancho and Solé to genetic coding. The application of the principle is achieved via equating the ambiguity of signals used by “speakers” with codon usage, on the one hand, and the effort of “hearers” with the needs of amino acid translation mechanics, on the other hand. The re-interpreted model captures the case of the typical (vertical) gene transfer, and confirms that Zipf’s law can be found in the transition between referentially useless systems (i.e., ambiguous genetic coding) and indexical reference systems (i.e., zero-redundancy genetic coding). As with linguistic symbols, arranging genetic codes according to Zipf’s law is observed to be the optimal solution for maximising the referential power under the effort constraints. Thus, the model identifies the origins of scaling in genetic coding — via a trade-off between codon usage and the needs of amino acid translation. Furthermore, the paper extends the model to multiple inputs, reaching out toward the case of horizontal gene transfer (HGT), where multiple contributors may share the same genetic coding. Importantly, the extended model also leads to a sharp transition between ambiguous HGT and zero-redundancy HGT. Zipf’s law is also observed to be the optimal solution in the HGT case.
1 Introduction — Coding Thresholds
The definition and understanding of the genotype-phenotype relationship continues to be one of the most fundamental problems in biology and artificial life. For example, Woese strongly argues against fundamentalist reductionism and presents the real problem of the gene as “how the genotype-phenotype relationship had come to be” [1], pointing out the likelihood of the “coding threshold”. This threshold signifies the development of the capacity to represent a nucleic acid sequence symbolically in terms of an amino acid sequence, and separates the phase of nucleic acid life from an earlier evolutionary stage. Interestingly, the analysis sheds light not only on this transition, but also on saltations that have occurred at other times, e.g. the advent of multicellularity and of language.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 85–93, 2011.
© Springer-Verlag Berlin Heidelberg 2011

The common feature
is “the emergence of higher levels of organization, which bring with them qualitatively new properties, properties that are describable in reductionist terms but that are neither predictable nor fully explainable therein” [1]. The reason for the increase in complexity can be identified as communication within a complex, sophisticated network of interactions: “translationally produced proteins, multicellular organisms, and social structures are each the result of, emerge from, fields of interaction when the latter attain a certain degree of complexity and specificity” [1, 2]. The increase of complexity is also linked to adding new dimensions to the phase space within which the evolution occurs, i.e. expansion of the network of interacting elements that forms the medium within which the new level of organization comes into existence [1, 2]. An increase of complexity is one of the landmarks of self-organization, typically defined as an increase of order within an open system, without explicit external control.

As pointed out by Ferrer i Cancho and Solé [3], the emergence of a complex language is one of the fundamental events of human evolution, and several remarkable features suggest the presence of fundamental principles of organization, common to all languages. The best known is Zipf’s law, which states that the frequency of a word decays as a (universal) power law of its rank. Furthermore, Ferrer i Cancho and Solé observe that “all known human languages exhibit two fully developed distinguishing traits with regard to animal communication systems: syntax [4] and symbolic reference [5]”, and suggest that Zipf’s law is a hallmark of symbolic reference [3]. They adopt the view that a communication system is shaped by both constraints of the system and demands of a task: e.g., the system may be constrained by the limitations of a sender (“speaker”) trying to encode a message that is easy to decode by the receiver (“hearer”).
In particular, speakers want to minimise articulatory effort and hence encourage brevity and phonological reduction, while hearers want to minimise the effort of understanding and hence desire explicitness and clarity [3, 6, 7]. For example, the speaker tends to choose ambiguous words (words with a high number of meanings), and this increases the interpretation effort for the hearer. Zipf referred to this lexical trade-off as the principle of least effort, leading to a well-known power law: if the words within a sample text are ordered by decreasing frequency, then the frequency of the k-th word, P(k), is given by P(k) ∝ k^(−α), with α ≈ 1. The main findings of Ferrer i Cancho and Solé are that (i) Zipf’s law can be found in the transition between referentially useless systems and indexical reference systems, and (ii) arranging symbols according to Zipf’s law is the optimal solution for maximising the referential power under the effort constraints. Combining the terminology of Woese with that of Ferrer i Cancho and Solé allows us to re-phrase these observations as follows: (i) referentially useless systems are separated from indexical reference systems by a coding threshold, and (ii) Zipf’s law, maximising the referential power under the effort constraints, is the optimal solution that is a feature observed at the coding threshold. In this paper we apply the principle of least effort to genetic coding, by equating, on the one hand, the ambiguity of signals used by “speakers” with codon
usage, and, on the other hand, the effort of “hearers” with demands of amino acid translation mechanics. The re-interpreted model confirms that Zipf’s law can be found in the transition (“coding threshold”) between ambiguous genetic coding (i.e., referentially useless systems) and zero-redundancy genetic coding (indexical reference systems). As with linguistic symbols, arranging genetic codes according to Zipf’s law is observed to be the optimal solution for maximising the referential power under the effort constraints. In other words, the model identifies the origins of scaling in genetic coding — via a trade-off between codon usage and needs of amino acid translation. This application captures the case of the typical, vertical, gene transfer. We further extend this case to multiple inputs, reaching out toward the case of horizontal gene transfer (HGT) where multiple contributors may share the same genetic coding. We observe that the extended model also leads to a sharp transition between ambiguous HGT and zero-redundancy HGT, and that Zipf’s law is observed to be the optimal solution again.
2 Horizontal and Stigmergic Gene Transfer
It is important to realize that during the early phase in cellular evolution the proto-cells can be thought of as conglomerates of substrates that exchange components with their neighbours freely — horizontally [8]. The notion of vertical descent from one “generation” to the next is not yet well-defined. This means that the descent with variation from one “generation” to the next is not genealogically traceable but is a descent of a cellular community as a whole. Thus, the genetic code that appears at the coding threshold is “not only a protocol for encoding amino acid sequences in the genome but also an innovation-sharing protocol” [8], as it is used not only as a part of the mechanism for cell replication, but also as a way to encode relevant information about the environment. Different proto-cells may come up with different innovations that make them more fit to the environment, and the “horizontal” exchange of such information may be assisted by an innovation-sharing protocol — a proto-code. With time, the proto-code develops into a universal genetic code. Such innovation-sharing is perceived to have a price: it implies ambiguous translation, where the assignment of codons to amino acids is not unique but spread over related codons and amino acids [8], i.e. accepting innovations from neighbours requires that the receiving proto-cell is sufficiently flexible in translating the incoming fragments of the proto-code. Such flexible translation, of course, would produce imprecise copies. However, a descent of the whole innovation-sharing community may be traceable: i.e., in a statistical sense, the next “generation” should be correlated with the previous one. While any individual protein is only a highly imprecise translation of the underlying gene, a consensus sequence for the various imprecise translations of that gene would closely approximate an exact translation of it. As noted by Polani et al.
[9], the consensus sequence would capture the main information content of the innovation-sharing community.
Moreover, it can be argued that the universality of the code is a generic consequence of early communal evolution mediated by horizontal gene transfer (HGT), and that HGT thus enhances the optimality of the code [8]: HGT of protein coding regions and HGT of translational components ensures the emergence of clusters of similar codes and compatible translational machineries. Different clusters compete for niches, and because of the benefits of the communal evolution, the only stable solution of the cluster dynamics is universality. The work of Piraveenan et al. [10] and Polani et al. [9] investigated particular HGT scenarios where certain fragments necessary for cellular evolution begin to play the role of the proto-code. For example, stigmergic gene transfer (SGT) was considered as an HGT variant. SGT suggests that the proto-code is present in an environmental locality, and is subsequently entrapped by the proto-cells that benefit from such interactions. In other words, there is an indirect exchange of information among the cells via their local environment, which is indicative of stigmergy: proto-cells find matching fragments, use them for coding, modify and evolve their translation machinery, and exchange certain fragments with each other via the local environment. SGT differs from HGT in that the fragments exchanged between two proto-cells may be modified during the transfer process by other cells in the locality. SGT studies concentrated on the information preservation property of evolution in the vicinity of the “coding threshold”, considering a communication channel between a proto-cell and itself at a future time point, and posing the question of the channel capacity constrained by environmental noise. By varying the nature and degree of the noise prevalent in the environment within which such proto-cells exist and evolve, the conditions for self-organization of an efficient coupling between the proto-cell per se and its encoding with “proto-symbols” were identified.
It was shown that the coupling evolves to preserve (within the entrapped encoding) the information about the proto-cell dynamics. This information is preserved across time within the noisy communication channel. The studies verified that the ability to symbolically encode nucleic acid sequences does not develop when environmental noise ϕ is outside a specific noise range. In the current work we depart from the models developed by Piraveenan et al. [10] and Polani et al. [9], and rather than considering proto-cell dynamics defined via specific dynamical systems (e.g., logistic maps) subjected to environmental noise, we focus on constraints determined by (i) ambiguous codon usage, and (ii) the demands of amino acid translation mechanics. This allows us to abstract away the specifics of the employed dynamical systems [10, 9], and explain the emergence of a coding threshold from another standpoint that takes into account codon usage and amino acid translation. This, in turn, allows us to extend the model to HGT/SGT scenarios with multiple inputs. Both types of models — dynamical systems based [10, 9] and the one presented here — are able to identify an (order) parameter corresponding to the coding threshold: the environmental noise ϕ [10, 9] or the effort contribution λ [3]. Both types of models formulate objective functions information-theoretically, following the guidelines
of information-driven self-organisation [11–13, 10, 14–17]. The main difference from the dynamical systems based models lies, however, in the ability to detect a power law in the codon usage (lexicon) at the threshold.
3 Model
The Ferrer i Cancho and Solé model [3] is based on an information-theoretic approach, where a (single) set of signals S and a set of objects R are used to describe the signals between a speaker and a hearer, and the objects of reference. The relation between S and R (the communication channel) is modelled using a binary matrix A, where an element a_ij = 1 if and only if signal s_i refers to object r_j. The effort for the speaker is low when there is a high amount of ambiguity, i.e. when the signal entropy is low. H_n(S) expresses this as a number between 0 and 1:

H_n(S) = −Σ_{i=1}^{n} p(s_i) log_n p(s_i)

The effort for the hearer to decode a particular signal s_i is small if there is little ambiguity, i.e. if the probability of signal s_i referring to one object r_j is high. In [3], this is expressed by the conditional entropy

H_m(R|s_i) = −Σ_{j=1}^{m} p(r_j|s_i) log_m p(r_j|s_i)

The overall hearer effort depends on the probability of each signal and the effort to decode it:

H_m(R|S) = Σ_{i=1}^{n} p(s_i) H_m(R|s_i)
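These quantities can be computed directly from a given matrix A. The sketch below is illustrative only: it assumes (as one plausible reading of [3]) that objects are equiprobable, p(r_j) = 1/m, so that the joint distribution is p(s_i, r_j) = a_ij/(m·ω_j), with ω_j the number of signals referring to object r_j; the function and variable names are ours, not the authors'.

```python
import numpy as np

def least_effort_entropies(A):
    """Speaker effort H_n(S) and hearer effort H_m(R|S) for a binary
    n x m matrix A with A[i, j] = 1 iff signal s_i refers to object r_j.

    Assumes equiprobable objects, p(r_j) = 1/m, so that the joint
    distribution is p(s_i, r_j) = a_ij / (m * omega_j), where omega_j
    is the number of signals referring to r_j (a modelling assumption).
    Both entropies are normalised to lie in [0, 1].
    """
    n, m = A.shape
    omega = A.sum(axis=0)
    joint = A / (m * np.where(omega > 0, omega, 1.0))   # p(s_i, r_j)
    p_s = joint.sum(axis=1)                             # p(s_i)
    nz = np.flatnonzero(p_s > 0)
    H_S = -sum(p_s[i] * np.log(p_s[i]) for i in nz) / np.log(n)
    H_RS = 0.0
    for i in nz:
        q = joint[i][joint[i] > 0] / p_s[i]             # p(r_j | s_i)
        H_RS += p_s[i] * -(q * np.log(q)).sum() / np.log(m)
    return H_S, H_RS
```

For a one-to-one 3×3 code this gives H_n(S) = 1 and H_m(R|S) = 0 (maximal speaker effort, no hearer effort); a single signal covering all objects gives the opposite extreme.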
A cost function Ω(λ) = λ H_m(R|S) + (1 − λ) H_n(S) is introduced to combine the efforts of speaker and hearer, with 0 ≤ λ ≤ 1 trading off the effort between them. To consider the effects of different speaker and hearer efforts, different values of λ are used to compute the accuracy of the communication as the mutual information I_n(S, R) = H_n(S) − H_n(S|R). This uses matrices evolved for minimal cost Ω(λ) with a simple mutation-based genetic algorithm. In addition, L will denote the number of signals referring to at least one object, i.e. the effective lexicon size.

3.1 Extension of the Model and “Readout” Modeling
To model codon usage by several neighbours, we extend the original approach, which used one matrix A, to several matrices A_k, each of which represents a different way of codon usage for the same set of objects (amino acids). In the extended model, a separate matrix A_k is created and evolved to encode a different communication channel. The cost Ω(λ) is computed for a matrix Â that is obtained by averaging over all individual matrices A_k. During evolution,
Fig. 1. One 150×150 matrix. The mutual information as a function of λ.

Fig. 2. HGT – four 40×40 matrices. The mutual information as a function of λ.

if the cost of Â, obtained from the mutated A_k, is reduced, the mutation is kept. Averaging captures SGT between different sources (variables), and is a specific case of “readout”, motivated below.

Shannon information between two variables X and X′ is determined as an optimum of knowledge extracted from the state of X about the state of X′ under the assumption that both variables and their joint distribution have been identified beforehand. In our model, however, the transfer of a message fragment from one proto-cell to another does not, in general, enjoy that advantage, because there are multiple candidates for the source of the message. The stigmergic nature of the message transmission in the HGT scenario does not allow for a priori assumptions of who the sender is and who the receiver is. This implies that there might emerge a pressure to “homogenize” the instantiations of the sender and receiver variables in the sense that, where information is to be shared, an instantiation x in the sender variable X evokes to some extent the same instantiation in the receiver variable X′.
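The evolutionary procedure described above (mutate one A_k at a time, keep the mutation if the cost of the averaged matrix Â decreases) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bit-flip mutation operator, the rates, and the use of the real-valued average Â directly in the entropy formulas are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(A, lam):
    """Omega(lam) = lam*H_m(R|S) + (1 - lam)*H_n(S) for a (possibly
    weighted) n x m signal-object matrix A, assuming equiprobable objects."""
    n, m = A.shape
    omega = A.sum(axis=0)
    joint = A / (m * np.where(omega > 0, omega, 1.0))   # p(s_i, r_j)
    p_s = joint.sum(axis=1)
    nz = np.flatnonzero(p_s > 0)
    H_S = -sum(p_s[i] * np.log(p_s[i]) for i in nz) / np.log(n)
    H_RS = 0.0
    for i in nz:
        q = joint[i][joint[i] > 0] / p_s[i]             # p(r_j | s_i)
        H_RS += p_s[i] * -(q * np.log(q)).sum() / np.log(m)
    return lam * H_RS + (1 - lam) * H_S

def evolve(n_matrices=4, n=8, m=8, lam=0.4, steps=500, flip_p=0.05):
    """Mutate one randomly chosen A_k per step; keep the mutation iff the
    cost of the averaged matrix A_hat does not increase."""
    As = [rng.integers(0, 2, (n, m)) for _ in range(n_matrices)]
    best = cost(np.mean(As, axis=0), lam)
    for _ in range(steps):
        k = rng.integers(n_matrices)
        mask = rng.random((n, m)) < flip_p    # bit-flip mutation
        As[k] = As[k] ^ mask
        c = cost(np.mean(As, axis=0), lam)
        if c <= best:
            best = c
        else:
            As[k] = As[k] ^ mask              # revert the mutation
    return As, best
```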
To formalize this intuition, we model the sending (and analogously receiving) proto-cells as mixtures of probabilities, endowed with the “readout” interpretation suggested in [13], which we sketch in the following. Assume a collection of random variables X_k, indexed by an index k ∈ K for each sender, where all X_k assume values x_k ∈ X in the same sampling set X. For a fixed, given k ∈ K, one can determine informational quantities involving X_k in the usual fashion; however, if, as in the HGT case, the sender is not known in advance, one can model that uncertainty as a probability distribution over the possible indices k ∈ K, defining a random variable K with values in K. If nothing else is known about the sender, one can for instance assume an equidistribution on K. The readout of the collection (X_k)_{k∈K} is then denoted by X_K; it models the random selection of one of the X_k according to the distribution of K, followed by a random selection of an instance x ∈ X according to the distribution of X_k. For a Bayesian network interpretation of the readout, see [13]. Formally, the probability distribution of the readout X_K assuming a value x ∈ X is given by

Pr(X_K = x) = Σ_{k∈K} Pr(K = k) · Pr(X_k = x).
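This mixture can be written down in a few lines; a sketch (names ours), with the equidistribution over senders as the default "nothing known about the sender" case:

```python
import numpy as np

def readout(dists, p_K=None):
    """Readout X_K of a collection of distributions over the same alphabet.

    dists: array of shape (|K|, |X|), row k = distribution of X_k.
    p_K:   distribution over senders K; equidistribution if None.
    Returns Pr(X_K = x) = sum_k Pr(K = k) * Pr(X_k = x).
    """
    dists = np.asarray(dists, dtype=float)
    if p_K is None:
        p_K = np.full(dists.shape[0], 1.0 / dists.shape[0])
    return p_K @ dists   # mixture of the sender distributions
```

For two deterministic, disagreeing senders the equidistributed readout is the uniform distribution, reflecting the receiver's uncertainty about who sent the message.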
Fig. 3. Single 150×150 matrix. Signal normalised frequency, P(k), versus rank, k.

Fig. 4. Four 40×40 matrices. Signal normalised frequency, P(k), versus rank, k.

4 Results
The accuracy of communication I_n(S, R) as a function of λ is shown in Figs. 1 and 2. Figure 1 traces the mutual information for a single 150×150 matrix (staying within the model in [3], and studying a typical vertical gene transfer), while Fig. 2 contrasts the dynamics within HGT, simulated with four 40×40 matrices. In both, three domains are distinguishable in the behavior of I_n(S, R) vs. λ. For small values λ < λ∗, I_n(S, R) grows smoothly, before undergoing a sharp transition in the vicinity of λ ≈ λ∗. Following Ferrer i Cancho and Solé, we observe that single-signal systems (effective lexicon size L ≈ 1/n) dominate for λ < λ∗: “because every object has at least one signal, one signal stands for all the objects” [3]. Low I_n(S, R) indicates that the system is unable to convey information in this domain (totally ambiguous genetic code). Rich vocabularies (genetic codes with some redundancy), L ≈ 1, are found after the transition, for λ > λ∗. Full vocabularies are attained for very high λ. The maximal value of I_n(S, R) indicates that the associations between signals (codons) and objects (amino acids) are one-to-one maps, removing any redundancy in the genetic code. In the HGT case, this is harder to achieve, while the overall tendency is retained. To investigate the transition around λ ≈ λ∗, we focus on the lexicon’s ranked distribution, and consider the signals’ normalised frequency P(k) vs. rank k, for different λ. As expected, the Ferrer i Cancho and Solé model shows that “Zipf’s law is the outcome of the nontrivial arrangement of word-concept associations adopted for complying with hearer and speaker needs” [3]. Contrasting the graphs (Fig. 3) reveals the presence of scaling for λ ≈ λ∗, and suggests a phase transition is taking place at λ∗ ≈ 0.44 in the information dynamics of I_n(S, R). This results in a power law, P(k) = 0.2776 k^(−1.0146), consistent with α ≈ 1 reported in [3]. A similar phenomenon is observed for HGT (Fig. 4).
Scaling at λ∗ = 0.4 results in the power law P(k) = 0.2742 k^(−1.0412), with R² = 0.9474, again consistent with α ≈ 1 in the power law reported in [3] and with the one reported for vertical gene transfer above. Thus, the trade-off between codon usage and the needs of amino acid translation in HGT results in a nontrivial but still redundant genetic code.
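Fits of this kind can be reproduced, for any ranked frequency data, by ordinary least squares in log-log space; a sketch (the fitting method is our assumption; the paper does not state how its fits were obtained):

```python
import numpy as np

def zipf_fit(frequencies):
    """Fit P(k) ~ C * k**(-alpha) to signal frequencies, ranked in
    decreasing order, via least squares in log-log space.
    Returns (C, alpha, R2)."""
    p = np.sort(np.asarray(frequencies, dtype=float))[::-1]
    p = p[p > 0] / p.sum()                    # normalised, descending
    k = np.arange(1, len(p) + 1)              # ranks 1..n
    slope, intercept = np.polyfit(np.log(k), np.log(p), 1)
    resid = np.log(p) - (slope * np.log(k) + intercept)
    r2 = 1.0 - resid.var() / np.log(p).var()
    return np.exp(intercept), -slope, r2
```

On exact Zipf data (frequencies proportional to 1/k) this recovers alpha = 1 with R² = 1; on the simulation data at the threshold it would yield estimates of the kind quoted above.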
5 Conclusions
We applied the principle of least effort in communications and the (extended) information-theoretic model of Ferrer i Cancho and Solé to genetic coding. The ambiguity of signals used by “speakers” was equated with codon usage, while the effort of “hearers” provided an analogue for the needs of amino acid translation mechanics. The re-interpreted model captures the case of the typical (vertical) gene transfer, and confirms the presence of scaling in the transition between referentially useless systems (i.e., ambiguous genetic coding) and indexical reference systems (i.e., zero-redundancy genetic coding). Arranging genetic codes according to Zipf’s law is observed to be the optimal solution for maximising the referential power under the effort constraints. Thus, the model identifies the origins of scaling in genetic coding — via a trade-off between codon usage and the needs of amino acid translation. The extended model includes multiple inputs, representing HGT where multiple contributors may share the same genetic coding. The extended model also leads to a sharp transition between ambiguous HGT and zero-redundancy HGT, and scaling is observed to be the optimal solution in the HGT case as well.
References

1. Woese, C.R.: A new biology for a new century. Microbiology and Molecular Biology Reviews 68(2), 173–186 (2004)
2. Barbieri, M.: The organic codes: an introduction to semantic biology. Cambridge University Press, Cambridge (2003)
3. Ferrer i Cancho, R., Solé, R.V.: Least effort and the origins of scaling in human language. PNAS 100(3), 788–791 (2003)
4. Chomsky, N.: Language and Mind. Harcourt, Brace, and World, New York (1968)
5. Deacon, T.W.: The Symbolic Species: The Co-evolution of Language and the Brain. Norton & Company, New York (1997)
6. Pinker, S., Bloom, P.: Natural language and natural selection. Behavioral and Brain Sciences 13(4), 707–784 (1990)
7. Köhler, R.: System theoretical linguistics. Theor. Ling. 14(2-3), 241–257 (1987)
8. Vetsigian, K., Woese, C., Goldenfeld, N.: Collective evolution and the genetic code. PNAS 103(28), 10696–10701 (2006)
9. Polani, D., Prokopenko, M., Chadwick, M.: Modelling stigmergic gene transfer. In: Bullock, S., Noble, J., et al. (eds.) Artificial Life XI – Proc. 11th Int. Conf. on the Simulation and Synthesis of Living Systems, pp. 490–497. MIT Press, Cambridge (2008)
10. Piraveenan, M., Polani, D., Prokopenko, M.: Emergence of genetic coding: an information-theoretic model. In: Almeida e Costa, F., Rocha, L.M., Costa, E., Harvey, I., Coutinho, A. (eds.) ECAL 2007. LNCS (LNAI), vol. 4648, pp. 42–52. Springer, Heidelberg (2007)
11. Prokopenko, M., Gerasimov, V., Tanev, I.: Evolving spatiotemporal coordination in a modular robotic system. In: Nolfi, S., Baldassarre, G., et al. (eds.) SAB 2006. LNCS (LNAI), vol. 4095, pp. 558–569. Springer, Heidelberg (2006)
12. Polani, D., Nehaniv, C., Martinetz, T., Kim, J.T.: Relevant information in optimized persistence vs. progeny strategies. In: Rocha, L., Yaeger, L., Bedau, M., Floreano, D., Goldstone, R., Vespignani, A. (eds.) Artificial Life X: Proc. 10th International Conference on the Simulation and Synthesis of Living Systems (2006)
13. Klyubin, A., Polani, D., Nehaniv, C.: Representations of space and time in the maximization of information flow in the perception-action loop. Neural Computation 19(9), 2387–2432 (2007)
14. Laughlin, S.B., Anderson, J.C., Carroll, D.C., de Ruyter van Steveninck, R.R.: Coding efficiency and the metabolic cost of sensory and neural information. In: Baddeley, R., Hancock, P., Földiák, P. (eds.) Information Theory and the Brain, pp. 41–61. Cambridge University Press, Cambridge (2000)
15. Bialek, W., de Ruyter van Steveninck, R.R., Tishby, N.: Efficient representation as a design principle for neural coding and computation. In: 2006 IEEE International Symposium on Information Theory, pp. 659–663. IEEE, Los Alamitos (2006)
16. Piraveenan, M., Prokopenko, M., Zomaya, A.Y.: Assortativeness and information in scale-free networks. European Physical Journal B 67, 291–300 (2009)
17. Lizier, J.T., Prokopenko, M., Zomaya, A.Y.: Local information transfer as a spatiotemporal filter for complex systems. Physical Review E 77(2), 026110 (2008)
Adaptive Walk on Fitness Soundscape

Reiji Suzuki and Takaya Arita

Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
{reiji,arita}@nagoya-u.jp
http://www.alife.cs.is.nagoya-u.ac.jp/~{reiji,ari}/
Abstract. We propose a new IEC for musical works based on an adaptive walk on a fitness landscape of sounds. In this system, there is a virtual plane that represents the genetic space of possible musical works, called a fitness soundscape. The user stands on the soundscape, and hears the multiple sounds that correspond to one’s neighboring genotypes at the same time. These sounds come from different directions that correspond to the locations of their genotypes on the soundscape. By using the human abilities for localization and selective listening of sounds, the user can repeatedly walk toward the direction from which more favorite sounds come. This virtual environment can be realized by a home theater system with multiple speakers creating “surround sound”. We report on the basic concept of the system, a simple prototype for musical composition with several functional features for the improvement of evolutionary search, and preliminary evaluations of the system.

Keywords: interactive evolutionary computation, musical composition, fitness landscape, soundscape, virtual reality, artificial life.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 94–101, 2011.
© Springer-Verlag Berlin Heidelberg 2011
1 Introduction
Interactive evolutionary computation (IEC) has been used for optimizing various artifacts which cannot be evaluated mechanically or automatically [1]. Based on subjective evaluations of the artifacts by a human, one’s favorite artifacts in the population are selected as parents for the individuals in the next generation. By iterating this process, one can obtain better artifacts without constructing them by oneself. IECs have been widely applied in various artistic fields. In particular, IECs have been applied to the creation of musical works, such as musical composition, sound design, and so on [2]. For example, Unemi constructed a musical composition system named SBEAT, in which multi-part music can be generated through simulated breeding [3]. Dahlstedt also proposed a system for synthesizing sounds with interactive evolution, in which each sound has a visual representation that corresponds to the set of parameters of the synthesizer for generating the sound [4]. In IECs for graphical works such as Dawkins’s Biomorph [5], the individuals in the population can be evaluated in parallel, as shown in Fig. 1 (a). On the other
Fig. 1. The interactive evolutionary computation with (a) parallel evaluation of graphical works or (b) sequential evaluation of musical works
hand, in IECs for musical works, the user listens to and evaluates each individual in the population one by one, as shown in Fig. 1 (b). This is due to the fact that it is generally considered difficult to evaluate each individual correctly when all individuals are played at once. This is an essential difference between IECs for musical works and those for graphical works. Unfortunately, this sequential evaluation of individuals increases the total evaluation time and thus increases the user’s temporal cost significantly. It also increases one’s cognitive cost, in that one needs to remember the features of individuals so as to compare them. Thus, the population size is often kept small in order to decrease these costs, which brings about the fitness bottleneck [6]. To solve these problems, we focus on the human cognitive abilities for localization and selective listening of sounds. We can localize the direction of sounds, and can concentrate on the sound we want to hear, which is called the cocktail party effect [7] in a broad sense. If we utilize these abilities properly, there is a possibility that we can correctly evaluate the individuals in parallel even in the case of IECs for musical works. We propose a new IEC for musical works based on an adaptive walk on the fitness landscape of sounds. In this system, there is a two-dimensional virtual plane that represents the genotypic space of possible musical works, called a fitness soundscape. The user stands on the soundscape, and hears the multiple sounds that correspond to one’s neighboring genotypes at the same time. The sounds come from different directions that correspond to the locations of their genotypes on the soundscape. This virtual environment can be realized by a home theater system with multiple speakers creating “surround sound”.
By using the abilities of sound localization and selective listening, the user can repeatedly walk toward the direction from which the more favored sounds come, which corresponds to the evolutionary process of the population in standard IECs. In this paper, we introduce the basic concept of the proposed system and a simple prototype system for musical composition. We also propose several functional features for improving the evolutionary search. Preliminary experiments showed that users successfully obtained their favorite musical works with the help of these features.
R. Suzuki and T. Arita
Fig. 2. A conceptual image of adaptive walk on fitness soundscape: (a) fitness soundscape, (b) 3D sound system with multiple speakers
2 Adaptive Walk on Fitness Soundscape
We propose an IEC for musical works based on an adaptive walk on a fitness landscape of sounds. Here, we introduce the basic concept of the model; a conceptual image is shown in Fig. 2. A fitness landscape is often used to visualize and intuitively understand the evolutionary dynamics of a population [8]. We assume a two-dimensional plane of genotypes that determine the individuals' phenotypes. In this model, the phenotypes represent the possible musical works, such as musical pieces, that are to be searched, as shown in Fig. 2 (a). We therefore call this landscape of sounds the fitness soundscape. The user stands on this plane and hears the sounds of the neighboring individuals from their corresponding positions all at once. This virtual sound environment can be realized by a home theater system with multiple speakers creating surround sound, as shown in Fig. 2 (b). The height of a fitness landscape is the fitness value of the corresponding genotype in the genetic space. Here, the fitness of each genotype on the soundscape is determined by the user's subjective evaluation. Thus, the actual shape of the soundscape differs among users depending on their subjective impressions of the individuals, and can change dynamically. The adaptive evolution of a population can be represented by a hill-climbing process on the fitness landscape, and we adopt this process as the evolutionary mechanism of the model. The user can walk toward the direction from which the sounds of favored individuals come, and by repeating this hill-climbing process can obtain more and more favored sounds.
3 Prototype
We constructed a prototype of the system, which includes several functions for improving the adaptive walk on the fitness soundscape. The main part of the system was built using Java 3D with JOAL (the Java Bindings for the OpenAL API)1, and we used the software synthesizer TiMidity++2 for the dynamic
1 https://joal.dev.java.net/
2 http://timidity.sourceforge.net/
Adaptive Walk on Fitness Soundscape
Fig. 3. A snapshot of the system

Fig. 4. The genetic description of a musical piece in the case of T = 3: (a) the genotype (X=1, Y=52), encoding t0=0→C, t1=1→D, t2=6→B, and t3=4; (b) the initial set of notes, repeated T times with a tone offset of t3−2 = +2; (c) the final phenotype
generation of the sound files. The real 3D sound environment was composed of a multi-channel home theater system with 7 speakers, as shown in Fig. 3. A surround headphone for a 5.1-channel home theater system can be used instead of this speaker system. We explain each part of the prototype in detail below.

3.1 Genetic Description of the Individual
We assumed an evolutionary search for a musical piece as shown in Fig. 4, which is quite simple but sufficient to evaluate and demonstrate the prototype system, especially the parallel search utilizing the cocktail party effect. The figure also shows the genetic description of a musical piece, which is represented by a bit string of length 12. Specifically, each genotype is divided into 4 parts, each of which represents an integer value from 0 to 7. These determine the values of the parameters t0, · · ·, t3, respectively, as shown in Fig. 4 (a). The first three parameters determine the tones of the initial set of three notes, as shown in Fig. 4 (b): the possible parameter values "01234567" correspond to the tones "CDEFGABC", respectively. The tones of the notes in the next set are those of the corresponding notes in the previous set, offset by t3−2 within the range of possible tones. Each subsequent set starts playing at the same time as the last note of the previous set. By repeating this process, T sets are generated. Fig. 4 (c) shows the final phenotype of this example.
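The decoding described above can be sketched as follows. This is our own Python illustration (function names are ours, not the authors'), and we assume the t3−2 offset is clamped to the valid tone range rather than wrapped:

```python
# Illustrative sketch of the genetic decoding described above
# (T = 3, 12-bit genotypes). We assume the t3-2 offset is clamped
# to the valid tone range rather than wrapped.

TONES = "CDEFGABC"  # parameter values 0..7 map to these tones

def decode(genotype, T=3):
    """Decode a 12-bit string into T sets of three notes."""
    assert len(genotype) == 12
    # four 3-bit fields -> integers t0..t3, each in 0..7
    t = [int(genotype[3 * i:3 * i + 3], 2) for i in range(4)]
    tones = t[:3]              # tones of the initial set of three notes
    offset = t[3] - 2          # per-set tone offset
    piece = []
    for _ in range(T):
        piece.append([TONES[v] for v in tones])
        tones = [min(7, max(0, v + offset)) for v in tones]
    return piece

# the example of Fig. 4: t0=0 -> C, t1=1 -> D, t2=6 -> B, t3=4 (offset +2)
print(decode("000001110100"))
```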
Fig. 5. An example situation of the soundscape around the user (W: the width of each area; D: the distance which each individual's sound can reach; the marked points are the sound sources of the individuals being played)

Fig. 6. A three-dimensional image of the soundscape

3.2 Fitness Soundscape and Adaptive Walk
We mapped the genotypes described in the previous section into a two-dimensional space in the following simple way. We divided the bit string of each genotype into two parts, and regarded the former part as the x position (0, · · ·, 2^6−1) of the genotype and the latter part as its y position (0, · · ·, 2^6−1), as shown in Fig. 4 (a). The fitness soundscape is thus composed of 2^6 × 2^6 square areas, and each genotype (phenotype) is assigned to its corresponding area. Fig. 5 shows an example situation of the soundscape around the user. The size of each area is W × W in units of the Java 3D environment. When the user stands on the soundscape, at most 8 individuals in the neighboring areas are played repeatedly in parallel. The user can determine which individuals are actually played by specifying their directions relative to one's heading using a control panel (explained later). The sound source of each individual is placed at the center of its area. It is omnidirectional, and its gain decreases linearly to 0 at the distance D. In Fig. 5, the user has specified that the individuals in front center, front right, front left, and rear center can be heard. However, the individual in front left is not heard, because its distance from the user is beyond the range D. Note that we do not play the individual corresponding to the user's own area: being so close to the user, it would make it difficult to localize the sounds of the other neighboring individuals, and it is not needed for deciding the direction of evolution. When the user moves to the next area, the new neighboring individuals around the user's area are played. Thus, the user can repeatedly walk around the whole soundscape, hearing and evaluating the neighboring individuals. Fig. 6 shows the three-dimensional image of the fitness soundscape that is presented to the user during the adaptive walk. The user can grasp the trajectory of the adaptive walk so far, and the relative positions of the individuals currently being played.
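The genotype-to-position mapping and the linear gain falloff can be sketched as follows (a hedged illustration; the function names are ours, and the normalization of the gain to 1 at the source is our assumption):

```python
import math

W, D = 5.0, 7.5  # area width and audible range (the values used in the experiments)

def position(genotype):
    """Map a 12-bit genotype to its (x, y) area on the 2^6 x 2^6 soundscape."""
    return int(genotype[:6], 2), int(genotype[6:], 2)

def gain(user_xy, area_xy):
    """Omnidirectional gain, decreasing linearly to 0 at distance D."""
    # sound sources sit at the centre of each W x W area
    sx, sy = (area_xy[0] + 0.5) * W, (area_xy[1] + 0.5) * W
    dist = math.hypot(user_xy[0] - sx, user_xy[1] - sy)
    return max(0.0, 1.0 - dist / D)
```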
3.3 Functional Features for Enhancement of Adaptive Walk
To improve the ability of the evolutionary search, we introduced the following functional features into the prototype system. Several properties of these functions can be set using the control panel of the system shown in Fig. 7.

The adjustment of the scale of the soundscape. The user can adjust the scale of the soundscape by changing the Hamming distance between the nearest neighboring genotypes, which is realized by eliminating the areas corresponding to the intermediate genotypes. By decreasing the scale ratio, the user can evaluate more diverse individuals at the same time, and walk toward a distant place quickly.

The assignment of the relative positions of the individuals being played. If all neighboring individuals are always played, the user may not be able to discriminate between them because they are too strongly mixed. Thus, we added a set of options that determine the relative directions of the neighboring areas in which individuals are played. The user can adjust the total number of individuals being played, and their positions relative to one's heading, by choosing the activated individuals from the surrounding ones in the control panel.

The random delay in the timing of play. To make it easier for the user to distinguish between individuals, we introduced a randomly determined time interval before playing each individual.

The use of different sound types. We also added an option determining whether all individuals are played using the same sound type or using randomly assigned sound types. The difference in sound types enables the user to listen to each individual more selectively.
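One way to realize the scale adjustment, as we read it, is to sample only every stride-th area along each axis, so that one step of the walk covers more of the genotype space. The sketch below is written under that assumption (the name coarse_neighbors is ours):

```python
def coarse_neighbors(x, y, stride, size=64):
    """Neighboring areas whose individuals are candidates for playing
    at the current scale; stride > 1 skips the intermediate genotypes."""
    cells = []
    for dx in (-stride, 0, stride):
        for dy in (-stride, 0, stride):
            if dx == dy == 0:
                continue  # the user's own area is never played
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                cells.append((nx, ny))
    return cells
```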
4 Preliminary Evaluations
We conducted several preliminary experiments using the prototype explained in the previous section. First, we summarize basic evaluations of the adaptive walk on the soundscape. Using the multiple-speaker system, we were able to localize the positions of individuals and distinguish between them even when they were played in parallel. We were also able to evaluate the individuals and walk toward the favored ones. Thus, we can say that a user can search for favorite musical pieces based on subjective evaluations using this system. Fig. 8 shows example trajectories of the adaptive walk on the soundscape when 5 different subjects searched for their favorite musical pieces. We used the parameter settings W=5, D=7.5 and T=4. The sounds were played with a randomly determined delay, and we adopted a single sound type (piano) for all individuals. The relative directions of the individuals being played were front center, rear center, center right, and center left. The subjects were only allowed to use the adjustment of the scale of the soundscape. Starting from the center of the soundscape, the subjects moved over 10–80 areas, and finally reached their favorite pieces, whose corresponding positions are
Fig. 7. The control panel of the prototype

Fig. 8. Example trajectories of evolutionary search on the soundscape (end points labeled a–e)

Fig. 9. An evolved musical piece
indicated with a circle on the map. As the figure shows, the trajectories varied significantly, which implies that the subjects' aesthetic preferences for musical pieces were substantially different. It is also interesting that two of the subjects finally reached nearby positions (d and e). Fig. 9 shows an evolved musical piece, corresponding to position c (50, 45) in Fig. 8. The experiments also revealed the following points regarding the functional features. By decreasing the scale of the soundscape during the early period of the search, the user could quickly reach favored areas at a long distance from the initial area. It was harder to localize the individuals in positions behind the user than those in other relative positions; thus, the ability for selective listening might be improved by adjusting the number or assignment of the individuals played in backward positions. Also, the random delay in the timing of play and the use of different sound types made it much easier for the users to distinguish between individuals. However, the latter sometimes strongly affected the intuitive impressions of the individuals, and thus affected their evaluations. In addition, the searching process on the soundscape was itself a novel experience, which implies that this system could also be developed into a kind of installation.
5 Conclusion
We have proposed a new interactive genetic algorithm for musical works that utilizes the human abilities of sound localization and selective listening, based on an adaptive walk on the fitness landscape of sounds. We constructed a prototype of the system for simple musical pieces, and conducted a preliminary
evaluation of it. It was confirmed that users were able to search for their favorite pieces using this system. We also proposed several functional features for improving the evolutionary search. Future work includes a more detailed evaluation of the system, further improvement of the evolutionary search, and the application of this system to the content-selection interface of portable music players.
Acknowledgements. This work was supported in part by a grant from the Sound Technology Promotion Foundation.
References
1. Takagi, H.: Interactive Evolutionary Computation: Fusion of the Capabilities of EC Optimization and Human Evaluation. Proceedings of the IEEE 89(9), 1275–1296 (2001)
2. Husbands, P., Copley, P., Eldridge, A., Mandelis, J.: An Introduction to Evolutionary Computing for Musicians. In: Miranda, E.R., Biles, J.A. (eds.) Evolutionary Computer Music, pp. 1–27 (2007)
3. Unemi, T.: SBEAT3: A Tool for Multi-part Music Composition by Simulated Breeding. In: Proceedings of the Eighth International Conference on Artificial Life, pp. 410–413 (2002)
4. Dahlstedt, P.: Creating and Exploring Huge Parameter Spaces: Interactive Evolution as a Tool for Sound Generation. In: Proceedings of the International Computer Music Conference 2001, pp. 235–242 (2001)
5. Dawkins, R.: The Blind Watchmaker. Longman, Harlow (1986)
6. Biles, J.A.: GenJam: A Genetic Algorithm for Generating Jazz Solos. In: Proceedings of the 1994 International Computer Music Conference (1994)
7. Cherry, E.C.: Some Experiments on the Recognition of Speech with One and with Two Ears. Journal of the Acoustical Society of America 25, 975–979 (1953)
8. Wright, S.: The Roles of Mutation, Inbreeding, Crossbreeding, and Selection in Evolution. In: Proceedings of the Sixth International Congress on Genetics, pp. 355–366 (1932)
Breaking Waves in Population Flows George Kampis1,2 and Istvan Karsai3 1
Collegium Budapest, Institute for Advanced Study, Budapest, Hungary 2 Eötvös University, Department of Philosophy of Science, Budapest, Hungary 3 Department BISC, East Tennessee State University, Johnson City, TN, USA
[email protected]
Abstract. We test controversial ideas about the role of corridors in fragmented animal habitats. Using simulation studies, we analyze how fragmentation affects a simple prey-predator system and how introducing openings that connect the habitats changes the situation. Our individual-based model consists of 3 levels: renewable prey food, as well as prey and predators that both have a simple economy. We find, in line with intuition, that the fragmentation of a habitat has a strong negative effect, especially on the predator population. Connecting the fragmented habitats facilitates predator (and hence prey) survival, but also leads to an important counterintuitive effect: in the presence of a high-quality predator, connected fragmented systems fare better in terms of coexistence than do unfragmented systems. Using a frequency domain analysis, we explain how corridors between sub-habitats serve as "wave breakers" in the population flow, preventing deadly density waves from occurring. Keywords: predator-prey systems, animal corridors, frequency domain analysis.
1 Introduction
Wave patterns are common in biology. Spatio-temporal waves have been demonstrated in many population models. In particular, various waveforms, including spiral waves, can be generated in simple predator-prey systems [e.g., 1]. The stability and coexistence of populations is often closely related to the existence and behavior of wave patterns. High-amplitude oscillations or transients tend to destabilize systems by generating depleted resource conditions. Therefore, the understanding and active control of population waves is of high importance in a number of contexts. For example, fragmented habitats tend to produce high densities that often lead to fatal oscillations. General spatial models in ecology, including island biogeographic models [2] as well as meta-population models [3, 4], predict that movements between patches will increase effective population size and persistence in general. Haddad & Tewksbury [5] reviewed major ecology and general biology journals from 1997 to 2003 and found only 20 studies testing the effects of corridors on populations and diversity. They concluded that the current evidence offers only tentative support for the positive
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 102–109, 2011. © Springer-Verlag Berlin Heidelberg 2011
effects of corridors, and that much more work on population and community responses is needed; in particular, it is important to study the mechanisms and conditions under which corridors can be expected to impact populations. The authors also predicted an increasing importance of individual-based models that can aid empirical studies by focusing on the effects of different life history parameters [5]. In this paper we present an agent-based model to study the oscillatory behavior of populations in various fragmented and connected fragmented habitats, respectively. The model utilizes an approach inspired by studies of excitable media, hydrodynamics, and granular flows. Hydrodynamic analogies can often help understand wave phenomena in different domains. Wave control tools (such as wave breakers and barriers) are extensively used in natural flow systems such as rivers and ocean shores. Also, granular flows have been successfully applied to the modeling and control of the behavior of humans in mass situations, such as escaping from fire, high traffic, or mass demonstrations [6]. We utilize a similar metaphor in the present model, where population waves are shown to be controllable by a combination of barriers and restricted passages (i.e., corridors).
2 The Model
The model (written in NetLogo 3.14) uses a simple set of assumptions:
• The system consists of predator and prey individuals; the prey feed on biotic resources. Prey food has an autonomous growth dynamics with saturation.
• The model is spatially explicit, consisting of n × n locations. A prey individual consumes a food token at its current location, if there is one. Similarly, a predator consumes a single prey individual if prey is found at the given location. All these behaviors are translated into a common currency, "energy". Consumption yields a certain amount of energy, i.e., GainPY and GainPD for the prey and predator, respectively. The consumed individual dies and is removed from the system. Death also occurs if the energy level of an individual reaches zero.
• At each time step, every individual is moved forward by a constant amount Fd (this speed is assumed identical for predator and prey). Each move occurs in a uniformly selected random direction taken from the interval ±T (the degree of turning, in degrees, relative to the current orientation). One step costs exactly one energy token (stored "energy" thus directly converts to lifetime).
• The habitat is modeled as a rectangular area with reflecting boundaries. Each position (except the borders) can be empty or occupied by an arbitrary number of individuals (except prey food, which can only be on or off at a given location).
• Reproduction is asexual and occurs with probability RPY and RPD for prey and predator, respectively. Upon reproduction, a new individual of the given type is produced, with a random spatial orientation. The energy tokens of the parent are shared evenly with the offspring. Prey food regenerates in K steps.
• Prey and predators start from random initial positions and orientations, and receive a randomized amount of startup energy (between 0 and EPY or EPD, respectively).
At every discrete time step, the following sequence of actions is performed for each individual organism, in a dynamically randomized order: move, consume available resources, reproduce, and die if energy has run out.
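The update sequence can be sketched roughly as follows. This is our own Python illustration of the step structure (the original model is written in NetLogo); movement on the grid and consumption are elided, and all names are ours:

```python
import random

class Agent:
    """A prey or predator individual with an energy store."""
    def __init__(self, kind, energy, heading=0.0):
        self.kind, self.energy, self.heading = kind, energy, heading

def step(agents, turn=50, repro={"prey": 0.15, "pred": 0.15}):
    """One time step: move, reproduce, die; randomized order each step."""
    random.shuffle(agents)                        # dynamically randomized order
    offspring = []
    for a in agents:
        a.heading += random.uniform(-turn, turn)  # turn within +/-T degrees
        a.energy -= 1                             # one move costs one energy token
        # (forward motion and consumption at the new location elided)
        if random.random() < repro[a.kind]:
            a.energy /= 2                         # parent's energy shared evenly
            offspring.append(Agent(a.kind, a.energy, random.uniform(0, 360)))
    agents += offspring
    agents[:] = [a for a in agents if a.energy > 0]  # death at zero energy
```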
G. Kampis and I. Karsai
Borders with reflecting walls implement fragmentation in the habitat. Borders (similar to real roads, canals, or fences) do not decrease the overall habitat size significantly, and behave in the same way as the outer boundary. Corridors are implemented as openings in the walls through which the organisms can pass freely. Corridors have zero length (beyond the wall thickness) and are wide enough to permit several organisms to cross at the same time.

Fig. 1. Fragmentation schemes studied in the paper (w1c0, w2c0, w3c0, w1c1, w2c2, w3c3): wXcY denotes a system with X walls and Y corridors
The main numerical control parameter of the model is GainPD, i.e., the amount of energy the predator gains when a prey individual is consumed (this can be thought of as a "quality" parameter for the predator: the parameter is used as an umbrella descriptor for many direct and indirect relationships between several individual predator traits). In the model, high levels of predator gain and the resulting higher energy reserves imply a higher expected predator lifetime, which corresponds to an efficient natural predator that can control a large area. An additional, and most important, control parameter is the number of boundaries and openings in the system. We studied several different combinations (see Fig. 1). Baseline settings define an initial area large enough to support a high population size (about 10,000 individuals) of prey and predators (Table 1). Under the baseline conditions, in the unfragmented area (w0c0), the prey population always persists indefinitely without the predators, while prey and predators tend to coexist. In the experiments, the value of GainPD and the number of boundaries and corridors were varied, while all other parameters were kept unchanged. Many of these parameters have various effects on dynamic properties that we cannot study in this paper. Treatments consisted of an exhaustive combination of the values GainPD = 10
(baseline), 30, 50 and 70, with the complete set of boundary-corridor systems presented in Fig. 1. (For each treatment, we started simulation runs with a delay t=50 in initial prey food re-growth, to avoid extreme initial prey densities and follow-up extinctions.) A high number (≥ 50) of simulation runs with different random seeds and t = 4,596 time steps were carried out for each treatment. Data were analyzed and plotted using the R statistical package.

Table 1. Model parameters and initial values used in the simulations
Initial populations: NPY = 1,000 prey; NPD = 100 predators
Area size: n = 200
Fertility: RPY = RPD = 15%
Corridor width: 3
Motion speed: Fd = 0.9
Max. turning: T = 50
Energy gain: GainPY = 4; GainPD varied (10, 30, 50, 70)
Regeneration time: K = 5
Initial energy maximum: EPY = 2*GainPY; EPD = 2*GainPD
3 Results and Discussion
The various effects of fragmentation and corridors on the quantitative behavior of coexistence dynamics have been explored in (Karsai and Kampis 2011, submitted). The basic – and seemingly counterintuitive – finding is that, although fragmentation decreases the chance of survival for the predator (as expected), the combined effect of fragmentation and corridors produces a situation favoring the coexistence and stable dynamics of prey and predator above the chance of coexistence found in the unfragmented system. In the present paper we study the oscillatory behavior and the effect of boundaries and corridors on the emerging spatio-temporal waves, understood as population oscillations. Accordingly, the main results reported here are based on a frequency domain analysis.
3.1 Qualitative Behavior
In an unfragmented habitat (Fig. 2a), massive waves (typical for high-density populations) together with a high risk of extinction are experienced at high values of predator gain. In isolated fragmented habitats (Fig. 2b), under the same conditions, a system of short-lived dynamic transients dominates each isolated fragment, leading to the extinction of the predator or the prey (and then both) in almost every trial, for any degree of fragmentation applied. Wave phenomena are not strongly expressed in these cases, because of the lack of a stationary domain of existence and also due to the strongly limited size of the isolated fragments. In a connected-fragmented system, however, an entirely new phenomenon is found. "Broken" waves appear that can percolate through boundaries via the openings and tend to form seeds for newer waves (Fig. 2c). The typical mechanism is that some prey individuals "escape" into new territories, followed somewhat later by the predator. The new territories often lack predators as a result of earlier overexploitation.
Hence, a new territory can be freely repopulated by the prey, then harvested by the predator, and so on. Unlike in isolated fragments, in connected
fragments the local extinction dynamics does not lead to fatal population-level consequences. New growth and new oscillations can start in neighboring fragments and spread over the whole system in a recursive fashion. In the following, we analyze these behaviors using Fourier analysis to quantify the nature of the summarized effects.
Fig. 2. Population distributions in (a) unfragmented, (b) fragmented, and (c) fragmented and connected habitats. Prey food (grey), prey (white) and predator (black).
3.2 Frequency Domain Analysis
Prey behavior is clearly derived from predatory effects. It will, therefore, be instructive to focus on the behavior of the predator population. Recurrence plots for N=1024 steps are shown in Fig. 3. To appreciate these plots, note that a perfectly regular series (for instance, a sine wave) yields a regular, reticular recurrence pattern, while transients are visible as stripes and smudges. Disregarding transient irregularities, Fig. 3 shows a clear overall pattern. The upper row shows the unfragmented situation. The baseline case GainPD = 10 yields a nearly perfect regular recurrence pattern, showing the dominance of a single frequency and its higher harmonics. Increasing GainPD (going right) leads to slower and less sharp recurrences, visualized as a decay of the original regular reticular pattern. Below the upper row, we show similar plots for a feasible fragmented connected habitat (w2c2). Here, even at higher values of GainPD, the recurrence structure is maintained in an undamaged and sharp condition.
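A recurrence plot of a scalar population series can be computed as below (our sketch, not the authors' code; the closeness threshold eps is an assumption, as the paper does not specify its construction):

```python
import numpy as np

def recurrence_plot(series, eps):
    """Mark (i, j) whenever the values at times i and j are within eps."""
    x = np.asarray(series, dtype=float)
    return (np.abs(x[:, None] - x[None, :]) < eps).astype(int)

# a pure sine wave yields the regular, reticular pattern described above
t = np.arange(1024)
rp = recurrence_plot(np.sin(2 * np.pi * t / 64), eps=0.1)
```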
We also performed a (Fast) Fourier Transformation on the predator records for the various habitat types shown in Fig. 1, using N=4096 steps and discarding the initial transients.

u(x) =
  k              if U(x) = k
  U(x)/2         if k > U(x) > k/2
  (k − U(x))/2   otherwise

Given that each of the Z modules has k variables, x is a configuration of variables within that module, and U(x) is the unitation (the number of variables set to '1'). We set k = Z = √N, such that the size of the modules scales with the size of the system, and thus refer to this problem as the scalable building blocks (SBB) problem. Each variable in the problem corresponds to a niche, and we refer to the two species that can occupy this niche as the '0-species' and '1-species' – giving a total of 2N different species in the ecosystem. The resultant landscape is very rugged, with 2^Z locally optimal configurations. Of these configurations, only one is globally optimal: when all niches are occupied by the 1-species. From any local optimum, the nearest configuration of higher utility differs in k niches – all within one module that is currently entirely occupied with 0-species.
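The garbled module-utility equation above can plausibly be read as a two-peaked function of the unitation; the sketch below implements that reading (treat the middle case and the bonus for the all-1 module as our assumptions, since the original typesetting is damaged):

```python
def module_utility(U, k):
    """Utility of one module with unitation U (number of 1s among k).
    Case structure is our reconstruction of the damaged equation."""
    if U == k:
        return k                # the all-1 module is the global peak
    if U > k / 2:
        return U / 2            # gradient towards the all-1 attractor
    return (k - U) / 2          # gradient towards the all-0 attractor

def total_utility(config, k):
    """Sum of module utilities over a configuration of N = k*Z variables."""
    return sum(module_utility(sum(config[i:i + k]), k)
               for i in range(0, len(config), k))
```

Under this reading, with k = 6 both the all-0 module (utility 3) and the all-1 module (utility 6) are local optima, consistent with the 2^Z locally optimal configurations and the single global optimum described in the text.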
R. Mills and R.A. Watson
Fig. 1. Timesteps that each model takes before visiting the highest-utility configuration in the SBB landscape. The symbiotic model visits this configuration with ease, whereas the control model very rarely does (thus requiring exponentially many epochs to sample the appropriate basin). The parameter settings are as follows: for M-S, we set d = 50, h = 0.6; in both M-S and the control M-C, following the dynamics uses 2N migrations, sampling the initially selected species m without replacement; each model continues to restart until the globally optimal configuration is sampled. 30 repeats are performed for M-S and 100 for M-C (on account of its higher stochasticity).
4 Discussion
As Fig. 1 shows, the symbiosis model provides an efficient process that finds the very particular configuration with the highest overall utility. This is in contrast to the control model, where the number of timesteps required to find the same configuration increases exponentially with the system size. The comparison underlines the significance of M-S being able to find such a configuration. The symbiotic model is very efficient at discovering the globally optimal configuration in this landscape. How is this so? First, consider the attractors visited by the initial set of demes. In each one, some modules will be occupied by all-0 species and some by all-1 species – but no single attractor will have all-1 in all modules. Whenever a 1-species occurs, it always co-occurs with other 1-species in the other niches of that module. Note that it is not the case that 1-species always occur in any particular module: the all-0-species attractor also has a 50% basin. Furthermore, there is no correlation between the attractors that different modules find, since the landscape is separable. Associations are formed between species that co-occur frequently across the set of demes. Thus, within any module, strong symbioses will form among all k of the 1-species, and likewise among all k of the 0-species. No associations will form between 0-species and 1-species that occupy niches in the same module, since at attractors there are no such co-occurrences. The between-module co-occurrences are predicted by the univariate frequencies (i.e., there is little or no surprise), so no associations will form here either. Fig. 2 (a) shows the associations resulting from the described process. Now let us consider the local dynamics with these associations. Each migration is likely to introduce all of the compatible species in a particular module, which transfers competition to the module level.
To start with, either an all-0 group or an all-1 group, once introduced, will remain, since both are of significantly higher utility
Symbiosis Enables the Evolution of Rare Complexes
Fig. 2. Symbiosis matrices for a 72-species (N=36, k=6) ecosystem. (a) The ideal target representing the landscape structure. (b) Calculated with the surprise metric (Eqn. 1) on an example set of local optima, without a threshold (i.e., h=0). This leads to a largely accurate representation, with some spurious interactions. An appropriate threshold can recover the target – see text. Lighter shades indicate stronger associations.
than a random composition in the corresponding niches. However, as soon as an all-1 group is introduced, it has exactly the right variation to move to the highest utility possible in that module, and will thus replace any other composition. When appropriate all-1 groups have migrated into each of the modules, the overall utility is maximised. After the associations develop as described above, this reliably occurs. While each of the configurations discovered in a deme has N species present, the subsequent competition is not directly between these local optima. The associations that evolve form groups of k members, corresponding to each module, as described. This is important for a significant result: because the module sub-functions are independent, these small groups are both sufficient and selectively efficient, in the sense that they create Z independent competitions between the two sub-solutions in each module, rather than a single competition between all 2^Z possible local optima [18]. The success of the symbiosis model depends on the formation of associations appropriate to construct these per-module competitions. Fig. 2 (a) shows the target associations, which comprise strong associations between all compatible species within each module, and no between-module associations. Frame (b) shows an example calculation of the surprise metric, without a threshold applied (Eqn. 1). In addition to the correct within-module interactions, this measure indicates some spurious interactions. Using Eqn. 1 with h = 0.6 is sufficient to recover just the appropriate associations, as in frame (a). There are several control models that we could have used, but would any have a better chance than M-C? Selecting on groups at the level of entire ecosystems does not allow one ecosystem configuration to have any correlation with the next. Without the ability to follow fitness gradients, all possible ecosystems must be enumerated.
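Eqn. 1 itself is not reproduced in this excerpt. For illustration only, the sketch below uses one plausible normalized form of such a surprise measure: the correlation of the two species' occurrence indicators across demes, zeroed below the threshold h. Treat the exact form as our assumption.

```python
import math

def surprise(p_ij, p_i, p_j, h=0.6):
    """Association from occurrence statistics across demes: p_i and p_j
    are individual occurrence frequencies, p_ij the co-occurrence
    frequency; unsurprising pairs (below h) get no association."""
    var_i, var_j = p_i * (1 - p_i), p_j * (1 - p_j)
    if var_i == 0 or var_j == 0:
        return 0.0                         # no variation, no surprise
    corr = (p_ij - p_i * p_j) / math.sqrt(var_i * var_j)
    return corr if corr > h else 0.0
```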
In principle, migrations of uncorrelated groups of species can move between the local optima that defeat single-migrant dynamics. However, to move between local optima in the SBB landscape, the k species that are in one module must all change at once. The number of different possible k-species groups is 2^k (N choose k), so any uncorrelated group formation process will require a prohibitively large number of attempts before finding the exact group necessary to move from one local optimum to another of higher utility. Even if the decomposition
116
R. Mills and R.A. Watson
were somehow known, randomly forming a group of the particular membership required is still exponential in the module size, which scales with the system size. Considering all 2^Z local attractors, as M-C does, is therefore the fastest control when k = Z, despite not requiring any structural information. Therefore, any reasonable control that does not evolve symbiotic associations will not reliably find ecosystems of the same level of utility. As described above, we use a mechanism that reinforces associations at the inter-species level – based on the surprise in the co-occurrence of species across several demes, with respect to the expected frequencies predicted from individual occurrence levels. The proposed mechanism is simpler than those suggested in previous models [10–12], and gives rise to qualitatively distinct results. In other work, we use a model that has an explicit population within each species, and each individual can evolve species-specific symbiotic associations (i.e., the associations are individual traits) [8]. We use this model to explore the types of population structure that lead to the evolution of adaptively significant associations. In particular, we show that the high-level mechanism used in M-S need not be imposed at the species level, but can in fact be manifested via the evolution of individual traits. As in the present model, the evolved associations create higher-level groupings and selective units. Note that due to the transparent simplicity of the SBB landscape, it is actually possible to simplify M-S to only use ‘all-or-nothing’ associations in this case. However, such a simplification narrows the applicability. Elsewhere we have investigated landscapes in which associations with intermediate strengths find high-utility configurations that cannot be found with ‘all-or-nothing’ associations [19, 20].
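To put numbers on the combinatorial argument above (reading the typeset group count as 2^k multiplied by N-choose-k, which is an assumption about the garbled formula), the example ecosystem of Fig. 2 already makes uncorrelated group formation hopeless, while the space of local attractors that M-C enumerates stays tiny:

```python
from math import comb

N, k = 36, 6            # values from the example ecosystem in Fig. 2
Z = N // k              # number of independent modules

# Uncorrelated k-species groups (assumed reading of the formula):
groups = 2**k * comb(N, k)

# Local attractors the exhaustive M-C control must consider:
attractors = 2**Z

print(groups, attractors)
```

Even for this modest system the ratio is roughly two million to one, which is why per-module competitions constructed by evolved associations are so much cheaper than either exhaustive enumeration or random group formation.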
It is worth noting that the algorithm proposed in [16] is able to efficiently solve hierarchical problems with some similar abstractions, in particular by model building from information at local optima. In our model, we apply selection on migration groups such that an entire group is rejected if overall utility is not improved. This effectively causes the units of variation to be synonymous with the units of selection. An alternative scheme might allow groups to migrate together, but select on individual species. We suggest that because the individual selection is performed in the new context, with the entire migration group, the ultimate changes in ecosystem composition will not be significantly different from those obtained when selecting on groups as an entire unit. Recall that migration groups typically comprise species that were frequently found to co-occur in locally stable contexts. Thus, the individual species would be selected in the context of the particular group. We leave the verification that both schemes have qualitatively equivalent results for future work. We have presented a model of the evolution of symbiotic associations where a separation of the temporal scales of changes in ecosystem composition and changes in species associations leads to the evolution of adaptively significant complexes. This is in contrast to previous models of symbiosis with similar timescales for both levels of adaptation, and results in a simpler and more biologically plausible model. Provided that the ecosystem is perturbed at a low frequency such that the transients are shorter than the average time between
Symbiosis Enables the Evolution of Rare Complexes
117
perturbations, the associations that form give rise to specific species groupings that can traverse rugged landscapes.

Acknowledgements. Thanks to Jason Noble, Simon Powers, Johannes van der Horst, and Devin Drown for useful discussions.
References

1. Thompson, J.N.: The Coevolutionary Process. University of Chicago Press, Chicago (1994)
2. Mayr, E.: What Evolution Is. Phoenix, London (2001)
3. Dunbar, H.E., Wilson, A.C.C., Ferguson, N.R., Moran, N.A.: Aphid thermal tolerance is governed by a point mutation in bacterial symbionts. PLoS Biology 5(5), e96 (2007)
4. Margulis, L.: The Symbiotic Planet. Phoenix, London (1998)
5. Khakhina, L.N.: Concepts of Symbiogenesis. Yale University Press, New Haven (1992)
6. Maynard Smith, J., Szathmáry, E.: The Major Transitions in Evolution. Oxford University Press, Oxford (1995)
7. Margulis, L., Dolan, M.F., Guerrero, R.: Origin of the nucleus from the karyomastigont in amitochondriate protists. PNAS 97(13), 6954–6959 (2000)
8. Watson, R.A., Palmius, N., Mills, R., Powers, S., Penn, A.: Can selfish symbioses effect higher-level selection? In: ECAL (2009)
9. Mills, R., Watson, R.A.: Variable discrimination of crossover versus mutation using parameterized modular structure. In: GECCO, pp. 1312–1319 (2007)
10. Watson, R.A., Pollack, J.B.: A computational model of symbiotic composition in evolutionary transitions. Biosystems 69(2-3), 187–209 (2003)
11. Defaweux, A., Lenaerts, T., van Hemert, J.I.: Evolutionary transitions as a metaphor for evolutionary optimisation. In: Capcarrère, M.S., Freitas, A.A., Bentley, P.J., Johnson, C.G., Timmis, J. (eds.) ECAL 2005. LNCS (LNAI), vol. 3630, pp. 342–352. Springer, Heidelberg (2005)
12. Mills, R., Watson, R.A.: Symbiosis, synergy and modularity: Introducing the reciprocal synergy symbiosis algorithm. In: Almeida e Costa, F., Rocha, L.M., Costa, E., Harvey, I., Coutinho, A. (eds.) ECAL 2007. LNCS (LNAI), vol. 4648, pp. 1192–1201. Springer, Heidelberg (2007)
13. de Jong, E.D., Watson, R.A., Thierens, D.: On the complexity of hierarchical problem solving. In: GECCO, pp. 1201–1208 (2005)
14. Philemotte, C., Bersini, H.: A gestalt genetic algorithm: less details for better search. In: GECCO, pp. 1328–1334 (2007)
15. Houdayer, J., Martin, O.C.: Renormalization for discrete optimization. Physical Review Letters 83, 1030–1033 (1999)
16. Iclanzan, D., Dumitrescu, D.: Overcoming hierarchical difficulty by hill-climbing the building block structure. In: GECCO, pp. 1256–1263 (2007)
17. Krasnogor, N., Smith, J.: A tutorial for competent memetic algorithms. IEEE Transactions on Evolutionary Computation 9(5), 474–488 (2005)
18. Watson, R.A., Jansen, T.: A building-block royal road where crossover is provably essential. In: GECCO, pp. 1452–1459 (2007)
19. Mills, R., Watson, R.A.: Adaptive units of selection can evolve complexes that are provably unevolvable under fixed units of selection (abstract). In: ALIFE XI, p. 785 (2008)
20. Mills, R., Watson, R.A.: Dynamic problem decomposition via evolved symbiotic associations (in preparation)
Growth of Structured Artificial Neural Networks by Virtual Embryogenesis

Ronald Thenius, Michael Bodi, Thomas Schmickl, and Karl Crailsheim

Artificial Life Laboratory of the Department of Zoology, Karl-Franzens University Graz, Universitätsplatz 2, A-8010 Graz, Austria
[email protected]

Abstract. In the work at hand, a bio-inspired approach to robot controller evolution is described. Using concepts found in biological embryogenesis, we developed a system of virtual embryogenesis that can be used to shape artificial neural networks. The described virtual embryogenesis has the ability to structure a network regarding the number of nodes, the degree of connectivity between the nodes, and the number and structure of sub-networks. Furthermore, it allows the development of inhomogeneous neural networks by cellular differentiation processes, by evolving predispositions of cells for different learning paradigms or functionalities. The main goal of the described method is the evolution of a logical structure (e.g., an artificial neural network) that is able to control an artificial agent (e.g., a robot). The method of developing, extracting, and consolidating a neural network from a virtual embryo is described. The work at hand demonstrates the ability of the described system to produce functional neural patterns, even after mutations have taken place in the genome.
1 Introduction
Structured artificial neural networks (ANNs) are advantageous in many ways: for example, they provide highly interesting auto-teaching structures, as mentioned by Nolfi in [1] and [2]. In these works an ANN is described which consists of two subnets, a “teaching net” and a “controlling net”. If such a network is shaped by an artificial evolutionary process it has “genetically inherited predispositions to learn”, as described by the authors. The ability of an ANN to evolve such substructures is one of the main goals of the virtual embryogenetic approach described in this work. Other concepts, like the influence of the body (and the shape of the neural controller) on the function of the controller, are described in [3] and [4]. The main idea of these publications is to describe the significant influence of the morphological shape of the agent (or robot) on the learning and controlling process. As described later
Supported by: EU-IST FET project I-Swarm, no. 507006; EU-IST-FET project SYMBRION, no. 216342; EU-ICT project REPLICATOR, no. 216240.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 118–125, 2011. © Springer-Verlag Berlin Heidelberg 2011
in this article, the positions of sensor input nodes and actuator output nodes within the virtual embryo's growing body correspond to the positions of sensors and actuators in the real robot (“embodiment” of the controller). Other interesting approaches to the challenge of shaping ANNs (with focus on the French flag problem described in [5]) are described in [6]: the function of a node is not predetermined there, but can be shaped by the genome. Another work that deals with the problem of differing functions within an ANN is [7], in which different predispositions for learning are implemented by “virtual adaptive neuro-endocrinology”. Within such a network, different types of cells exist: gland-cells, which influence the learning of the network, and regular cells, which are influenced by the gland-cells. In [8] the process of virtual embryogenesis is described (see figure 1). It also describes how the feedbacks between morphogenes, and the interactions between morphogenes and the genome, act together to define the final shape of the embryo. Based on this work we show how to extract, how to optimize, and how to use an ANN from a grown virtual embryo. In addition we discuss the advantages of such networks for future evolutionary adaptation.
Fig. 1. Comparison of virtual embryogenesis in our model and real-world embryogenesis (from [8]): A: Virtual embryo, consisting of cells (dots); B: Morphogene gradient in the embryo; C: Gradient of another morphogene, inducing cell differentiation; D: Embryo consisting of differentiated cells (white dots) and non-differentiated cells (invisible); E: Natural example of gene expression: activity domains of gap genes in larva (lateral view) of Drosophila m. (from [9]; ‘Kr’ and ‘Gt’ indicate gene activity.)
2 Method

2.1 Concept of Adapting Virtual Embryogenesis for Controller Development
In the process of virtual embryogenesis described in [8], an embryo consists of cells, which develop into nodes of the ANN during the embryological process. These cells duplicate, specialise, emit chemical substances (morphogenes), die, or build links to other cells, depending on local status variables. Links between cells represent neural connections between the nodes of the resulting ANN. Cells can be “pushed around” in space due to growth processes of the embryo, but they
have no ability for active movement. Cellular actions are coded in the genome (a collection of single genes) of the embryo. Genes are triggered by the presence of virtual morphogenes. One possible effect of gene activation is the production of another morphogene. This way a network of feedbacks emerges, which leads to a self-organised process that governs the growth of the embryo. The developed network is analysed and translated into a data structure which is compatible with a standard ANN interpreter. The overall process of growing and extracting an ANN from an embryo is a procedure of 9 distinct steps (figure 2): A genome (figure 2a), which is able to perform complex operations including operations regarding the interpretation of the genome (figure 2b), codes for the growth process of a virtual embryo (figure 2c). For details of this process see [8]. Some cells of this virtual embryo are able to develop links to other cells in the embryo and thus become neural cells (figure 2d). After the embryogenetic process, the ANN is extracted from the virtual embryo (figure 2e). “Dead-end” connections within the network that might develop during the embryological process are removed for the sake of calculation speed (figure 2f). The ANN is then linked to sensor nodes and actuator nodes according to the definitions of the robot in which the network has to perform (figure 2g). After this step, the network is translated into a data structure that can be parsed by an ANN interpreter (figure 2h) and tested in the robot or in the simulation environment (figure 2i).
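The feedback idea behind the early steps of this pipeline (genes triggered by morphogenes, in turn producing morphogenes) can be illustrated by a deliberately minimal sketch. The actual model is specified in [8]; every name, threshold, and decay rate below is an illustrative assumption:

```python
class Cell:
    """A single embryo cell holding local morphogene concentrations."""
    def __init__(self):
        self.morphogenes = {}          # morphogene name -> concentration

def step(cell, genome):
    """One update: a gene fires when its trigger morphogene exceeds a
    threshold, and deposits an amount of its product morphogene."""
    produced = {}
    for trigger, threshold, product, amount in genome:
        if cell.morphogenes.get(trigger, 0.0) >= threshold:
            produced[product] = produced.get(product, 0.0) + amount
    for name, amount in produced.items():
        cell.morphogenes[name] = cell.morphogenes.get(name, 0.0) + amount
    for name in cell.morphogenes:      # simple decay keeps levels bounded
        cell.morphogenes[name] *= 0.9

# genome entries: (trigger, threshold, product, amount) -- all hypothetical
genome = [
    ("maternal", 0.5, "m1", 1.0),   # an initial factor switches on m1
    ("m1",       0.5, "m2", 1.0),   # m1 in turn switches on m2
]

cell = Cell()
cell.morphogenes["maternal"] = 1.0
for _ in range(10):
    step(cell, genome)
```

Even this two-gene chain shows the essential property: the initial factor decays away, yet a cascade of downstream morphogenes has been switched on, and it is such cascades that the genome-level feedbacks exploit to pattern the embryo.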
Fig. 2. Concept of shaping ANNs by virtual embryogenesis
3 Results

3.1 Growth of Neurons
One of the most important steps on the way from the initial genome to a robot controller is the definition and linkage of neural cells within the virtual embryo. Some genes trigger the growth of neural links of a cell. In figure 3(c) a simple multi-layered neural structure is shown. Due to locally differing concentrations of morphogenes it is possible that sub-networks grow within different parts of the embryo. Because the internal status variables of neural cells can be modified (as mentioned in [8]), these sub-networks have different properties, for example in their predisposition for learning or in their function. The possibility to structure neural networks this way, in a genome-controlled self-organised process, is one of the main advantages of the method described here. Such processes can be modified by artificial evolution. The genetic structure shown in figure 3(a) contains two genes (“p165” and “p185”), which code for proteins that lead to a linkage of neural cells. The protein coded by the gene “p165” is part of the genetic substructure (beginning with the morphogene-coding gene “m2”) that controls the vertical growth of the embryo (the final shape of the embryo is depicted in figure 3(b)). Due to its position, this gene induces the building of “long distance connections” all over the embryo (depicted by red lines in figure 3(c)). This process takes place very early in the growth of the virtual embryo, so that linked cells move into different directions and the neural connections reach throughout the whole embryo. The protein resulting from the triggering of the gene “p185” leads to the development of the spatially concentrated sub-networks. The sub-networks with high density are depicted by white lines in figure 3(c).

3.2 Translation
After the virtual embryo has stopped growing, it is necessary to extract the ANN from the embryo and to translate it into a structure that is readable by a standard neural network interpreter. Before that, the network gets consolidated: all neural cells that have no input or output are removed from the network. This “consolidation process” is an optimisation step which removes unnecessary structures (like “dead end” connections). In a second step, the ANN has to be linked to the inputs (sensors) and outputs (actuators) of the robot (or other agent) controlled by the network (figure 4). The cells that get connected to inputs and outputs are selected by their position in the virtual embryo, corresponding to the position of sensors and actuators on the robot. In a third step the finalised neural network is translated into a structure of one-dimensional and two-dimensional arrays that can be parsed by a standard neural network interpreter. After this step the embryogenetically shaped ANN is ready for upload to the robot.
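The consolidation step can be sketched as iterative pruning of a directed graph. This is one reading of the description above: a hidden cell counts as dead if it lacks either incoming or outgoing connections, and removing one dead end can expose another, so the pruning repeats until a fixed point:

```python
def consolidate(edges, inputs, outputs):
    """Remove 'dead-end' nodes from a directed network.

    edges: set of (src, dst) connections; inputs/outputs: node sets wired
    to sensors/actuators, which are always kept.  A hidden node is dead
    if it has no incoming or no outgoing connection; pruning one node can
    expose another, so we iterate to a fixed point.  (A sketch of the
    consolidation described in the text, not the authors' exact code.)
    """
    edges = set(edges)
    while True:
        nodes = {n for e in edges for n in e}
        srcs = {s for s, _ in edges}
        dsts = {d for _, d in edges}
        dead = {n for n in nodes
                if n not in inputs and n not in outputs
                and (n not in srcs or n not in dsts)}
        if not dead:
            return edges
        edges = {(s, d) for s, d in edges if s not in dead and d not in dead}

# a -> b -> out carries signal; the chain a -> c -> d goes nowhere and is
# pruned entirely (d first, which then exposes c as a dead end):
net = {("in", "a"), ("a", "b"), ("b", "out"), ("a", "c"), ("c", "d")}
pruned = consolidate(net, inputs={"in"}, outputs={"out"})
```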
(a) Genome
(b) Embryo
(c) Neural cells
Fig. 3. Growth of an ANN in a virtual embryo. (a): The genome defining the number and location of neural sub-networks in the embryo via self-organizing processes (for details please see [8]). (b): Shape of the embryo, neural connections are not drawn. (c): Screenshot of the neural network in the embryo. White lines indicate “short distance connections” within local sub-networks, red lines indicate “long distance connections” that reach throughout the whole embryo and connect the sub-networks with each other.
(a) Network inside embryo
(b) Unfolded network
Fig. 4. Extracting and optimising the ANN. (a): A neural network inside of a virtual embryo. (b): Unfolded neural network. During virtual embryogenesis “dead-end” connections can occur, which consist of neural cells that do not have input and output. These cells are disadvantageous to the efficiency of the resulting ANN. Yellow circles indicate neural cells, lines indicate connections between cells.
4 Discussion

4.1 Usability of Virtual Embryogenesis for Shaping of ANNs and Robot Controllers
One important ability of the described system, on which an artificial evolutionary process can operate, is its stability against lethal mutations, along with its ability to forward changes from the genetic level to the phenotypic (morphological) level. As shown in figure 5, different embryo shapes emerge from one “ancestor” by applying small changes to the genome. One important feature is that these new shapes do not look completely different from their ancestor, but differ only slightly (“self-similarity”). These changes do not only take place on the level of shape, but also on the level of the resulting ANNs. As shown in figure 6, changes in the genome also lead to changes in the structure of the resulting network (i.e., the robot controller). This way the number of layers, the number of cells within a layer, the micro-structure of a layer, etc. can change. Please note that this ability of the described system leads to an increased number of “viable” offspring of a given “ancestor” in an evolutionary system. We expect this self-similarity to allow for exploration of larger areas of the fitness landscape in less time, due to higher possible mutation rates, and for better variation of the offspring due to mutation. Using the described technique of virtual embryogenesis enables us to develop embryos from genomes (see figure 1). Within these embryos we are able to grow ANNs, whereby the shape, the degree of connectedness, and differentiation processes within the ANN are determined during the developmental process. This growth process is controlled by the genome and uses a system of feedbacks and delays. Especially the differentiation of cells within the embryo (see figure 1 D) is important. It allows us to evolve complex structures constructed out of sub-structures: many local networks within one bigger network.
The described system allows not only the evolution of morphological structures: in addition, internal functions or predispositions of cells or groups of cells get optimized. Within a neural network, different kinds of learning methods could be applied (as mentioned in [1] and [2]). In addition, different subnets within one neural network could learn at different speeds, so they are able to adapt to processes that take place on different time scales in the environment of the agent. It is possible to structure an ANN into different sub-networks that are able to specialise for different functionalities by being receptive to different external reward functions that influence the learning function. Different functions within the sub-networks of a structured ANN can also be assigned, for example sensor fusion or task management.
5 Conclusion
Our method of evolving ANNs by simulating embryogenesis elaborates on concepts from the field of evo-devo [10]. This approach produces results that are comparable to the products of natural developmental processes (figure 1). Within an embryo we are able to grow ANNs, consisting of different sub-networks. If
Fig. 5. Variations in the genome lead to variations in shape: If the genome of one “ancestor” (left sub-figure) is changed slightly, resulting embryos differ slightly in shape (right sub-figures). Changes in the genome are marked with red boxes.
Fig. 6. Variations in the genome lead to variations in the shape of the resulting ANN: If the genome of one “ancestor” (left sub-figure, network with 2 layers) is changed, the resulting networks also change (right sub-figures, networks with 1 and 4 layers, respectively)
the genome of a virtual embryo is mutated, the described system leads to functionally intact offspring, slightly differing from their “ancestor” (figures 5 and 6). We also show how to differentiate groups of cells (figure 1D), which is comparable to tissue development in biology. In the next step we plan to combine the presented virtual embryogenesis with artificial evolution. The quality of the neural network (e.g., regarding learning abilities, learning speed, or the ability to adapt to new situations) will be used as a fitness function. In this way we plan to evolve novel and efficient ANN structures. Because the described system can develop structured ANNs, we plan to investigate the processes of evolution of hierarchical brain structures, and in this way to learn more about the biological evolution of brains in real-world animals. Further, we expect this technique to be excellent for shaping robot controllers, due to the fact that the evolved ANNs can be as variable in their structure and function as the virtual embryos in which they develop.
References

1. Nolfi, S., Parisi, D.: Auto-teaching: Networks that develop their own teaching input. In: Deneubourg, J.L., Bersini, H., Goss, S., Nicolis, G., Dagonnier, R. (eds.) Proceedings of the Second European Conference on Artificial Life (1993)
2. Nolfi, S., Parisi, D.: Learning to adapt to changing environments in evolving neural networks. Adaptive Behavior 5(1), 75–98 (1997)
3. Pfeifer, R., Bongard, J.C.: How the Body Shapes the Way We Think: A New View of Intelligence. The MIT Press, Cambridge (2006)
4. Pfeifer, R., Iida, F., Bongard, J.: New robotics: Design principles for intelligent systems. Artificial Life 11(1-2), 99–120 (2005)
5. Wolpert, L.: Principles of Development. Oxford University Press, Oxford (1998)
6. Miller, J.F., Banzhaf, W.: Evolving the program for a cell: From french flags to boolean circuits. In: On Growth, Form and Computers, pp. 278–302. Academic Press, London (2003)
7. Timmis, J., Neal, M., Thorniley, J.: An adaptive neuro-endocrine system for robotic systems. In: IEEE Workshop on Robotic Intelligence in Informationally Structured Space. Part of IEEE Workshops on Computational Intelligence, Nashville (2009) (in press)
8. Thenius, R., Schmickl, T., Crailsheim, K.: Novel concept of modelling embryology for structuring an artificial neural network. In: Troch, I., Breitenecker, F. (eds.) Proceedings of the MATHMOD (2009)
9. Jaeger, J., Surkova, S., Blagov, M., Janssens, H., Kosman, D., Kozlov, K.N., Manu, Myasnikova, E., Vanario-Alonso, C.E., Samsonova, M., Sharp, D.H., Reinitz, J.: Dynamic control of positional information in the early Drosophila embryo. Nature 430, 368–371 (2004)
10. Müller, G.B.: Evo-devo: Extending the evolutionary synthesis. Nature Reviews Genetics 8, 943–949 (2007)
The Microbial Genetic Algorithm Inman Harvey Evolutionary and Adaptive Systems Group, Centre for Computational Neuroscience and Robotics, Department of Informatics, University of Sussex, Brighton BN1 9QH, UK
[email protected] Abstract. We analyse how the conventional Genetic Algorithm can be stripped down and reduced to its basics. We present a minimal, modified version that can be interpreted in terms of horizontal gene transfer, as in bacterial conjugation. Whilst its functionality is effectively similar to the conventional version, it is much easier to program, and recommended for both teaching purposes and practical applications. Despite the simplicity of the core code, it effects Selection, (variable rates of) Recombination, Mutation, Elitism (‘for free’) and Geographical Distribution. Keywords: Genetic Algorithm, Evolutionary Computation, Bacterial Conjugation.
1 Introduction

Darwinian evolution assumes a population of replicating individuals, with three properties: Heredity, Variation, and Selection. If we take it that the population size neither shrinks to zero, nor shoots off to infinity -- and in artificial evolution such as a genetic algorithm we typically keep the population size constant -- then individuals will come and go, whilst the makeup of the population changes over time. Heredity means that new individuals are similar to the old individuals, typically through the offspring inheriting properties from their parents. Variation means that new individuals will not be completely identical to those that they replace. Selection implies some element of direction or discrimination in the choice of which new individuals replace which old ones. In the absence of selection, the population will still change through random genetic drift, but Darwinian evolution is typically directed: in the natural world through the natural differential survival and mating capacities of varied individuals in the environment, in artificial evolution through the fitness function that reflects the desired goal of the program designer. Any system that meets the three basic requirements of Heredity, Variation and Selection will result in evolution. Our aim here is to work out the most minimal framework in which that can be achieved, for two reasons: firstly theoretical, to see what insights can be gained when the evolutionary method is stripped down to its bare bones; secondly didactic, to demystify the Genetic Algorithm (GA) and present a version that is trivially easy to implement.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 126–133, 2011. © Springer-Verlag Berlin Heidelberg 2011

2 GAs Stripped to the Minimum

The most creative and challenging parts of programming a GA are usually the problem-specific aspects. What is the problem space, how can one best design the
genotypic expression of potential solutions or phenotypes, and the genotype-phenotype mapping, and how should one design an appropriate fitness function to achieve one's goals? Though this may be the most interesting part, it is outside the remit of our focus here on those evolutionary mechanisms that can be generic and largely independent of the specific nature of any problem. So we will assume that the programmer has found some suitable form of genotype expression, such that the genetic material of a population can be maintained and updated as the GA progresses; for the purposes of illustration we shall assume that there is a population of size P of binary genotypes of length N. We assume that all the hard problem-specific work of translating any specific genotype in the population into the corresponding phenotype, and evaluating its fitness, has been done for us already. Given these premises, we wish to examine the remaining GA framework to see how it can be minimized.

2.1 Generational and Steady State GAs

Traditionally GAs were first presented in generational form. Given the current population of size P, and their evaluated fitnesses, a complete new generation of the same size was created via some selective and reproductive process, to replace the former generation in one fell swoop. A GA run starts with an initialized population, and then loops through this process for many successive generations. This would roughly correspond to some natural species that has just one breeding season, say once a year, and after breeding the parents die out without a second chance. There are many natural species that do not have such constraints, with birth and death events happening asynchronously across the population, rather than to the beat of some rhythm. Hence the Steady State GA, which in its simplest form has as its basic event the replacement of just one individual from P by a single new one (a variant version that we shall not consider further would replace two old by two new).
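The basic event of the Steady State GA just described can be sketched as follows. The selection rule used here (the better of two random picks breeds, the other dies) is only a convenient placeholder, anticipating the discussion of selection schemes below; the toy fitness function and all names are illustrative:

```python
import random

def steady_state_step(pop, fitness, mutate):
    """One birth/death event of a minimal Steady State GA.

    pop: list of genotypes; fitness: genotype -> score; mutate:
    genotype -> offspring.  P such events are roughly equivalent to one
    generation of a Generational GA.  A sketch, not the paper's final
    algorithm.
    """
    i, j = random.sample(range(len(pop)), 2)
    if fitness(pop[i]) >= fitness(pop[j]):
        winner, loser = i, j
    else:
        winner, loser = j, i
    pop[loser] = mutate(pop[winner])    # offspring replaces the loser

# toy run: maximise the number of 1s in a binary genotype
random.seed(0)
N, P = 8, 10
pop = [[random.randint(0, 1) for _ in range(N)] for _ in range(P)]

def onemax(g):
    return sum(g)

def point_mutate(g):
    g = list(g)
    k = random.randrange(len(g))
    g[k] ^= 1                           # flip one random bit
    return g

for _ in range(500):                    # 500 events ~ 50 generation-equivalents
    steady_state_step(pop, onemax, point_mutate)
best = max(onemax(g) for g in pop)
```

Note that because the winner itself is never overwritten, the best fitness in the population can never decrease -- the elitism "for free" mentioned in the abstract.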
The repetition of this event P times is broadly equivalent to a single generation of the Generational GA, with some subtle differences. In the Generational version, no individual survives to the next generation, although some will pass on genetic material. In the Steady State version, any one individual may, through luck or through being favoured by the selective process, survive unchanged for many generation-equivalents. Others, of course, may disappear much earlier; the average lifetime of each individual will be P birth/death events, as in the generational case, but there will be more variance in these lifetimes. There are pragmatic considerations for the programmer. The core piece of code in the Steady State version will be smaller, although looped through P times more often, than in the Generational GA. The Generational version needs to be coded so that, in effect, P birth/death events occur in parallel, although on a sequential machine this will have to be emulated by storing a succession of consequences in a temporary array; typically two arrays will be needed, for this-generation and next-generation. If the programmer does have access to a cluster of processors working in parallel, then a Steady State version can farm out the evaluation of each individual (usually the computational bottleneck) to a different processor. The communication between
processors can be very minimal, and the asynchronous nature of the Steady State version means that it will not even be necessary to worry too much about keeping different processors in step. These are suggestive reasons, but perhaps rather weak, subjective and indeed even aesthetic ones, for favouring a Steady State version of a GA. A stronger reason for doing so in a minimalist GA is that it can exploit a very simple implementation of selection.

2.2 Selection

Traditionally but regrettably, newcomers to GAs are usually introduced to ‘fitness-proportionate selection’, initially in a Generational GA. After the fitnesses of the whole current population are evaluated by whatever criterion has been chosen, and specified as real values, these values are summed to give the size of the reproductive ‘cake’. Then each individual is allocated a probability of being chosen as a parent according to the size of its own slice of that cake. This method has three virtues: it will indeed selectively favour those with higher fitnesses; it has some tenuous connection with the different biological notion of fitness (although in the biological sense any numbers associated with fitness are based on observing, after the event, just how successful an individual was at leaving offspring; as contrasted with the GA approach, which reverses cause and effect here); and lastly, it happens to facilitate, for the theoreticians, some mathematical theorem-proving. But pragmatically, this method often leads to unnecessary complications. What if the chosen evaluation method allocates some fitnesses that are negative? This would not make sense under the biological definition, but where the experimenter has chosen the evaluation criteria it may well happen. This is one version of the general re-scaling issue: different versions of a fitness function may well agree in the ordering of different members of the population, yet have significantly different consequences.
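A minimal implementation of fitness-proportionate selection makes the negative-fitness complication concrete; the guard clause is our addition for illustration, not part of any standard formulation:

```python
import random

def roulette_pick(pop, fitnesses):
    """Fitness-proportionate ('roulette wheel') selection.

    Each individual's slice of the wheel is its share of the summed
    fitness.  Note the fragility discussed in the text: a negative
    fitness makes the slices meaningless, so this sketch refuses
    rather than silently misbehaving.
    """
    if min(fitnesses) < 0:
        raise ValueError("fitness-proportionate selection needs f >= 0")
    total = sum(fitnesses)
    r = random.uniform(0, total)       # spin the wheel
    acc = 0.0
    for ind, f in zip(pop, fitnesses):
        acc += f
        if r <= acc:
            return ind
    return pop[-1]                     # guard against float round-off

random.seed(1)
pop = ["a", "b", "c"]
picks = [roulette_pick(pop, [1.0, 2.0, 7.0]) for _ in range(10000)]
share_c = picks.count("c") / len(picks)   # should be close to 0.7
```

The re-scaling issue is also visible here: evaluating with [1, 2, 7] versus [101, 102, 107] preserves the ordering but gives wildly different selection pressures.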
One worry that is often mentioned is that, under fitness-proportionate selection, one often sees (especially in a randomized initial population) many individuals scoring zero, whilst others get near-zero scores; though this may reflect nothing more than luck or noise, the latter individuals will dominate the next generation to the complete exclusion of the former. These issues motivated many complex schemes for re-scaling fitnesses before then implementing fitness-proportionate selection. But -- although it was not initially reflected in the standard texts -- many practitioners applying GAs to real-world problems soon abandoned them in favour of rank-based methods.

2.3 Rank-Based and Tournament Selection

The most general method of rescaling is to use the scores given by the fitness function to order all the members of the population from fittest to least fit, and thereafter to ignore the original fitness scores and base the probabilities of having offspring solely on these relative rankings. A common choice is to allocate (at least in principle) 2.0 reproductive units to the fittest, 1.0 units to the median, and 0.0 units to the least fit member, scaling pro rata for intermediate rankings; this is linear rank selection. The probability of being a parent is now proportional to these rank-derived numbers, rather than to the original fitness scores.
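Linear rank selection as just described is straightforward to sketch; the function below (names illustrative, not from the text) maps raw scores to reproductive units and makes clear that only the ordering of the scores matters:

```python
def linear_rank_units(scores):
    """Map raw fitness scores to linear-rank reproductive units:
    2.0 for the fittest, 1.0 for the median, 0.0 for the least fit,
    scaled pro rata in between. Ties are broken arbitrarily by index."""
    P = len(scores)
    order = sorted(range(P), key=lambda i: scores[i])  # worst .. best
    units = [0.0] * P
    for rank, i in enumerate(order):
        units[i] = 2.0 * rank / (P - 1)   # pro rata between 0.0 and 2.0
    return units
```

Any monotone rescaling of the fitness function leaves the units, and hence the selection probabilities, unchanged.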
The Microbial Genetic Algorithm
129
There are in practice some complications. If, as is common practice, P is an even number, the median lies between two individuals. The explicit programming of this technique requires some sorting of the population according to rank, which adds further complexity. Fortunately there is a convenient trick that generates a similar outcome in a much simpler fashion. If two members of the population are chosen at random, their fitnesses compared, and the Winner selected, then the probability of the Winner being any specific member of the population exactly matches the reproductive units allocated under linear rank selection. This is easiest to visualize if one considers P such tournaments in succession, using the same population each time. Two individuals are chosen at random each time, so that each individual can expect to participate in two such tournaments. The top-ranker will win both its tournaments, for 2.0 reproductive units; a median-ranker can expect to win one and lose one, for 1.0 units; and the bottom-ranker will lose both its tournaments, for 0.0 units. A minor difference between the two methods is that there will be more variance in the tournament case, around the same mean values. Tournament selection can be extended to tournaments of different sizes, for instance choosing the fittest member from a randomly selected threesome. This is an example of non-linear rank selection. From both pragmatic and aesthetic perspectives, Tournament Selection with the basic minimum viable tournament size of two has considerable attractions for the design of a minimalist GA.

2.4 Who to Breed, Who to Die?

Selection can be implemented in two very different ways; either is fine, as long as the end result is to bias the choice of those who contribute to future generations in favour of the fitter ones. The usual method in GAs is to focus the selection on who is to become a parent, whilst making an unbiased, unselective choice of who is to die.
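The exact match claimed above between size-two tournaments and linear rank selection can be verified by enumerating all possible pairings; this sketch (illustrative, not from the text) computes the expected number of wins per rank over P tournaments:

```python
from itertools import combinations

def expected_tournament_wins(P):
    """Expected wins per individual (rank 0 = worst .. P-1 = best) over P
    binary tournaments, each drawing a distinct pair uniformly at random."""
    pairs = list(combinations(range(P), 2))
    wins = [0.0] * P
    for a, b in pairs:
        wins[max(a, b)] += 1.0 / len(pairs)  # the higher-ranked member always wins
    return [P * w for w in wins]             # scale to P tournaments per 'generation'
```

For P = 5 this gives [0.0, 0.5, 1.0, 1.5, 2.0]: 2.0 units for the fittest, 1.0 for the median, 0.0 for the least fit, exactly the linear-rank allocation; only the variance around these means differs.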
In the standard Generational GA, every member of the preceding generation is eliminated without any favouritism, so as to make way for the fresh generation reproduced from selected parents. In a Steady State GA, once a single new individual has been bred from selected parents, some other individual has to be removed so as to maintain a constant population size; this individual is often chosen at random, again unbiased. Some people, however, will implement a method of biasing the choice of who is removed towards the less fit. It should be appreciated that this is a second form of selective pressure, that will compound with the selective pressure for fit parents and potentially make the combined selective pressure stronger than is wise. In fact, one can generate the same degree of selective pressure by biasing the culling choice towards the less fit (whilst selecting parents at random) as one gets by the conventional method of biasing the parental choice towards the more fit (whilst culling at random). This leads to an unconventional, but effective, method of implementing Tournament Selection. For each birth/death cycle, generate one new offspring with random parentage; with a standard sexual GA, this means picking both parents at random, but it can similarly work with an asexual GA through picking a single parent at random. A single individual must be culled to be replaced by the new individual; by picking
two at random, and culling the Loser, or least fit of the two, we have the requisite selection pressure. One can argue that this may be closer to many forms of natural selection in the wild than the former method. Going further, we can consider a yet more unconventional method, one that combines the random undirected parent-picking with the directed selection of who is to be culled. Pick two individuals at random to be parents, and generate a new offspring from them; then use the same two individuals for the tournament to select who is culled -- in other words, the weaker parent is replaced by the offspring. It turns out that this is easy to implement and is effective. This is the underlying intuition behind the Microbial GA, so called because we can interpret what is happening here in a different way -- Evolution without Death!

2.5 Microbial Sex

Microbes such as bacteria do not undergo sexual reproduction, but reproduce by binary fission. They have, however, a further method for exchanging genetic material: bacterial conjugation. Chunks of DNA, plasmids, can be transferred from one bacterium to the next when they are in direct contact with each other. In the conventional picture of a family tree, with offspring listed below parents on the page, we talk of vertical transmission of genes ‘down the page’; but what is going on here is horizontal gene transfer ‘across the page’. As long as there is some selection going on, such that the ‘fitter’ bacteria are more likely to be passing on (copies of) such plasmids than they are to be receiving them, then this is a viable way for evolution to proceed. We can reinterpret the Tournament described above so as to somewhat resemble bacterial conjugation. If the two individuals picked at random to be parents are called A and B, whilst the offspring is called C, then we have described what happens as C replacing the weaker one of the parents, say B; B disappears and is replaced by C.
If C is the product of sexual recombination between A and B, however, then ~50% of C’s genetic material (give or take the odd mutation) is from A, ~50% from B. So what has happened is indistinguishable from B remaining in the population, but with ~50% of its original genetic material replaced by material copied and passed over from A. We can consider this as a rather excessive case of horizontal gene transfer from A (the fitter) to B (the weaker).
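The whole cycle, pick two, breed, and let the offspring replace the weaker parent, or equivalently let the Winner ‘infect’ and mutate the Loser in place, can be sketched as follows. This is an illustrative Python rendering under assumed parameter values, not the paper's own code; onemax (counting 1-bits) stands in for a real evaluation function:

```python
import random

def microbial_tournament(pop, fitness, rec=0.5, mut=0.02, rng=random):
    """One birth/death cycle: pick two at random, compare, and let the
    Winner 'infect' the Loser, which is then mutated in place."""
    P, N = len(pop), len(pop[0])
    A = rng.randrange(P)
    B = (A + 1 + rng.randrange(P - 1)) % P       # any other individual
    W, L = (A, B) if fitness(pop[A]) >= fitness(pop[B]) else (B, A)
    for i in range(N):
        if rng.random() < rec:                   # horizontal transfer, ~50%
            pop[L][i] = pop[W][i]
        if rng.random() < mut:                   # mutate the revised Loser
            pop[L][i] = 1 - pop[L][i]

# illustrative run on onemax (number of 1s) as the fitness
random.seed(0)
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(10)]
for _ in range(2000):
    microbial_tournament(pop, sum)
best = max(sum(g) for g in pop)
```

Nothing is ever copied into a separate offspring array: the Loser is overwritten in situ, which is the ‘Evolution without Death’ reading.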
3 The Microbial GA

We now have the basis for a radical, minimalist revision of the normal form of a GA, although functionally, in terms of Heredity, Variation and Selection, it performs just the same job as the standard version. This is illustrated in Fig. 1. Here the recombination is described in terms of ‘infecting’ the Loser with genetic material from the Winner, and we can note that this rate of infection can vary. In bacterial conjugation it will typically be rather a low percentage that is replaced or supplemented; to reproduce the typical effects of sexual reproduction, as indicated in the previous section, this rate should be ~50%. But in principle we may want, for different effects, to choose any value between 0% and 100%. In practice, for normal usage and for comparability with a standard GA, the 50% rate is recommended. The simplest way of doing this, equivalent to Uniform
Fig. 1. The genotypes of the population are represented as a pool of strings. One single cycle of the Microbial GA is represented by the operations PICK (two at random), COMPARE (their fitnesses to determine Winner = W, Loser = L), RECOMBINE (where some proportion of Winner’s genetic material ‘infects’ the Loser), and MUTATE (the revised version of Loser).
Recombination, is, with 50% probability independently at each locus, to copy from Winner to Loser, otherwise leaving the Loser unchanged at that locus. From a programming perspective, this cycle is very easy to implement efficiently. For each such tournament cycle, the Winner genotype can remain unchanged within the genotype-array, and the Loser genotype can be modified (by ‘infection’ and mutation) in situ. We can note that this cycle gives a version of ‘elitism’ for free: since the current fittest member of the population will win any tournament that it participates in, it will remain unchanged in the population -- until eventually overtaken by some new, even fitter individual. Further, it allows us to implement an effective version of geographical clustering for a trivial amount of extra code.

3.1 Trivial Geography

It is sometimes considered undesirable to have a panmictic population, since after many rounds of generations (or, in a Steady State GA, generation-equivalents) the population becomes rather genetically uniform. It is thus fairly common for evolutionary computation schemes to introduce some version of geographical distribution. If the population is considered to be distributed over some virtual space, typically two-dimensional, and any operations of parental choice, reproduction, and placing of offspring with the associated culling are all done in a local context, then this allows more diversity to be maintained across sub-populations. Spector and Klein [3] note that a one-dimensional geography, where the population is considered to be on a (virtual) ring, can be as effective as the 2-D version, and demonstrate the effectiveness in some example domains.
If we consider our array that contains the genotypes to be wrap-around, then we can implement this version by, for each tournament cycle: choose the first member A of the tournament at random from the whole population; then select the next member B at random from a deme, or subpopulation, that starts immediately after A in the array-order. The deme size D is a parameter. The core of the tournament compares the two fitnesses and marks Winner and Loser,

    if (eval(A) > eval(B)) {W=A; L=B;} else {W=B; L=A;}

after which a loop over the loci (for (i=0; ...)) infects and mutates the Loser in situ.

d_PopBaldwin > d_PopLamarck (for gap L > 2).
Therefore, we can say that, on average, Lamarckian adaptation is better suited to situations where the gap between environment changes is more than 2. Note that this is restricted to the situation where environments oscillate between extremes and the learning mechanism moves an individual towards a target by a constant amount.
4 Experiments

Given the above proof, it seems likely that Lamarckian learning should be selected for in a population faced with an environment in which changes occur less frequently. To investigate this, we devised a simple experiment that allows individuals within an evolving population to prefer Lamarckian learning over evolutionary learning depending on the merits of either approach in a specific setting. We chose a simple problem domain, and illustrate the concept using a genetic algorithm. In one environment the target phenotype is one which contains all 1s and in the other, one which contains all 0s. Each individual in the population possesses a genome comprised of 1001 bits, where 1000 bits encode a possible solution and the last bit encodes whether the individual employs Lamarckian learning or not. Each individual is allowed a learning round in which a number of its unfit alleles are improved by setting them either to 1 or 0, depending on the environment. If an individual employs Lamarckian learning, these changes are re-encoded into its genome to be passed on to the next generation. If an individual does not employ Lamarckian learning, all changes are lost to the next generation. A number of experiments were undertaken, examining both the effect of increasing the number of generations between environmental changes and also the number of alleles altered during the learning round applied to both populations. The former can be regarded as controlling the frequency of environmental change, while the latter controls
An Analysis of Lamarckian Learning in Changing Environments
147
Fig. 1. 1 allele change per learning round (Lamarckian proportion vs. generations, for 1 to 5 generations between changes)
learning rate. We examined populations employing 1 and 50 allele changes, in environments changing every 1, 2, 3, 4 and 5 generations. The results from these experiments (averaged over 20 independent runs) are illustrated in Figures 1 and 2. Figure 1 shows populations allowed to change 1 unfit allele each generation. Although the results show a slight trend towards Lamarckian adaptation, the selection process does not truly discriminate between Lamarckian and Baldwinian learning, regardless of the number of generations between changes. The modest amount of adaptation that occurs when only one allele is changed is not enough to bring out the characteristics and relative advantages or disadvantages of either learning mechanism. By contrast, the results obtained for 50 allele changes, illustrated in Figure 2, show a clear divergence between Baldwinian and Lamarckian learning according to the number of generations between changes. Lamarckian adaptation is selected against in environments where there are 3 or fewer generations between changes and selected for in environments with larger gaps between changes. Clearly, increasing the number of allele changes allows the selection process to distinguish between Baldwinian and Lamarckian learning. In summary, these results show that Lamarckian learning is not favoured by selection in environments where changes occur very frequently. In addition, the results show that as the gaps between environmental changes increase, the probability of Lamarckian learning being selected increases. Finally, in the situation where changes occur every 3 generations, the probability is split (slightly disfavouring Lamarckian learning). This is because the relative advantage of Lamarckian learning in this environment is not sufficient to fully take over the population. What is crucial is that individuals employing Lamarckian learning continue to exist in much larger numbers than in environments with smaller gaps between environmental changes.
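A deliberately simplified, single-individual sketch of this mechanism (illustrative Python; it omits the population, selection, and the learning gene of the actual experiment) shows the basic trade-off: writing learned changes back compounds gains within an epoch, but leaves the genotype specialised when the environment flips:

```python
import random

def learn(bits, target_bit, k):
    """Fix up to k wrong alleles toward the current target; return a new list."""
    out = list(bits)
    wrong = [i for i, b in enumerate(out) if b != target_bit]
    for i in wrong[:k]:
        out[i] = target_bit
    return out

def run(N=100, k=10, L=4, epochs=6, lamarckian=True, seed=0):
    """Phenotypic score (alleles matching the current target) per generation
    for one individual, with the target flipping between all-1s and all-0s
    every L generations. A Lamarckian individual writes learning back."""
    rng = random.Random(seed)
    geno = [rng.randint(0, 1) for _ in range(N)]
    scores = []
    for g in range(epochs * L):
        target = 1 if (g // L) % 2 == 0 else 0
        pheno = learn(geno, target, k)
        scores.append(sum(1 for b in pheno if b == target))
        if lamarckian:
            geno = pheno          # inheritance of acquired characteristics
    return scores
```

Within an epoch the Lamarckian score climbs by up to k per generation while the Baldwinian score stays flat; immediately after a flip the Lamarckian, now specialised to the old target, scores worse than the unspecialised Baldwinian.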
These results correlate with both our formal analysis of Lamarckian adaptation and with previous empirical research on Lamarckian learning. A brief examination of the fitness values of the Baldwinian and Lamarckian populations for each environment further illustrates the selection pressure exerted by the environmental conditions on the Lamarckian evolution gene. An experiment was conducted with two populations, one employing Baldwinian learning and the other Lamarckian
148
D. Curran and B. O’Sullivan
Fig. 2. 50 allele changes per learning round (Lamarckian proportion vs. generations, for 1 to 5 generations between changes)
learning. Both populations were allowed 50 allele changes during the lifetime learning stage. The average fitness values \bar{F} over the entire experiment were computed for each population as follows:

\bar{F}_{Baldwin} = \frac{1}{G} \sum_{g=0}^{G} \bar{f}_{Baldwin}(g),   \bar{F}_{Lamarck} = \frac{1}{G} \sum_{g=0}^{G} \bar{f}_{Lamarck}(g),

where G is the number of generations and \bar{f}(g) is the average fitness of the population at generation g. The difference between the Lamarckian and Baldwinian populations, D, was calculated as D = \bar{F}_{Lamarck} - \bar{F}_{Baldwin}. Table 2 shows the average fitness for both Baldwinian and Lamarckian populations as well as the difference measure D for each environmental setting. Negative values indicate fitter Baldwinian populations while positive values indicate fitter Lamarckian populations. It is clear from these results that Lamarckian learning produces fitter populations in environmental conditions where the gap between environmental changes is larger than 2, as predicted by the analysis outlined in Section 3.

Table 2. Average fitnesses and differences for Baldwinian and Lamarckian populations

Change Interval   Baldwin   Lamarck   Difference
1                 584.1     577.7       -6.4
2                 575.5     570.5       -5.0
3                 568.2     575.4        7.2
4                 565.6     577.4       11.8
5                 564.7     581.2       16.5
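The difference column of Table 2 is simply D = \bar{F}_{Lamarck} - \bar{F}_{Baldwin}; a quick check (illustrative Python) confirms the values and the sign flip at a change interval of 3:

```python
# Table 2 rows: (change interval, Baldwin, Lamarck)
rows = [(1, 584.1, 577.7), (2, 575.5, 570.5), (3, 568.2, 575.4),
        (4, 565.6, 577.4), (5, 564.7, 581.2)]
diffs = {interval: round(lam - bald, 1) for interval, bald, lam in rows}
```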
5 Conclusions

This paper presented a formal analysis of Lamarckian learning, showing the conditions in which Lamarckian learning can be a better choice than evolutionary learning. We presented the first formal analytical proof that can be used to show that Lamarckian inheritance can dominate more traditional individual (non-inheritable) learning schemes. The findings of this model were validated through a set of simple experiments showing that the evolutionary pressure to select for or against Lamarckian learning closely follows the conditions predicted by the proof. Future work will examine the suitability of Lamarckian learning in more complex environments.
Linguistic Selection of Language Strategies: A Case Study for Colour

Joris Bleys (1) and Luc Steels (1,2)

(1) Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Brussels, Belgium
[email protected]
(2) Sony Computer Science Laboratory, Paris, France
[email protected]

Abstract. Language evolution takes place at two levels: the level of language strategies, which are ways in which a particular subarea of meaning and function is structured and expressed, and the level of concrete linguistic choices for the meanings, words, or grammatical constructions that instantiate a particular language strategy. It is now reasonably well understood how a shared language strategy enables a population of agents to self-organise a shared language system. But the origins and evolution of strategies have so far been explored less. This paper proposes that linguistic selection, i.e. selection driven by communicative success and cognitive effort, is relevant, and shows in a concrete case study for the domain of colour how different language strategies may cooperate and compete for dominance in a population.
1 Language Strategies and Language Systems
Human languages are complex adaptive systems that are shaped and reshaped by their users, even in the course of a single dialogue [1]. They undergo change in order to remain adaptive to the expressive needs of the community, while maximising communicative success and minimising cognitive effort [2]. Over the past decade, considerable progress has been made in modelling the architecture and behaviour of ‘linguistic’ agents such that symbolic communication systems with properties similar to human languages may arise through language games [3]. In this paper we will use the
Fig. 1. Robotic experiments with the Colour Naming Game. The speaker draws attention to a chip by naming its colour and the hearer points to the chip which has been named. Agents give feedback after this interaction in order to share and align their colour categories and vocabularies.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 150–157, 2011. © Springer-Verlag Berlin Heidelberg 2011
Linguistic Selection of Language Strategies
151
Colour Naming Game, where the speaker uses a colour to draw the attention of the hearer to an object in the world [4]. Two agents drawn randomly from a population are shown a set of Munsell colour chips. The speaker chooses one chip as the topic, categorises the colour of this topic, and then searches in his own private lexicon how this category is named. The hearer gets the name, looks it up in his own lexicon, and then identifies and points to the chip in the context which fits best with the category named. If this is the chip that the speaker had in mind, the game is a success. Otherwise the speaker points to his original choice and the hearer can learn the name and the category expressed by it. Colour Naming is an interesting domain because it has been studied intensely by anthropologists, neuroscientists, and psychologists, and so there is a significant body of empirical data available, including data about the evolution of colour terms and colour categories [5]. These empirical studies of colour naming in humans have shown three facts: (1) There are different strategies by which speakers use colour to draw attention to objects in the world. One common strategy, which has been studied the most, is to use a limited set of basic colour prototypes utilising the full colour space, meaning the two hue opponent channels and brightness (for example: black, white, red, green, blue, yellow, pink, purple, brown, orange) [6]. We further call this the Basic Colour Strategy. Another strategy is to use only brightness, with words like “dark”, “shiny”, or “light”. We will call this the Brightness Colour Strategy. There are still other possibilities: to combine two basic colour prototypes (as in “bluish green” or “reddish orange”), to suggest colours by naming an object that typically has the colour (as in “lila” or “almond”), to combine the latter with basic colours as in “grass green”, “milk white” or “sky blue”, etc. 
(2) There is considerable variation in the way in which a particular strategy is instantiated in a language, and in how it determines the way languages change over time. For example, “red” is the name of a prototypical colour in English, roughly in the 625-740 nanometer range of the colour spectrum, but it used to be called “read” in Old English. The same colour prototype is called “rojo” in Spanish, “aka” in Japanese, “červený” in Czech, or “merah” in Indonesian. The basic colour prototypes used in different languages vary as well and there is also evolution, usually towards more and more refined colour prototypes [6]. For example, English speakers make a rather clear distinction between green and blue, but in Chinese and Japanese there is a single colour category which covers both areas, named “ao” in Japanese or “qīng” in Chinese. It is used for the (green) traffic light (“ao shingo”) or the colour of unripe bananas, but also for a blue sky (“aozora”). The Berinmo, a Papua New Guinea indigenous culture, have a word “wor” which covers some of the green region, a word “nol” which covers much of green, blue and blue/purple, a word “wap” which covers almost all the lightest colours, and a word “kel” which covers almost all dark colours [7]. (3) Interestingly, a kind of switch has often happened or is happening in predominantly brightness-oriented colour languages towards predominantly full-colour-oriented languages which use both brightness and hue [8], showing that
152
J. Bleys and L. Steels
there is not only evolution in how a strategy is instantiated (in other words, which words and categories are used) but also in which strategies a language community employs. Today's hues like “yellow”, “brown”, or “blue” all expressed brightness-based distinctions in Old English before they became used as part of the Basic Colour Strategy in the late Middle English period (1350-1500) [9], showing that the same linguistic elements (e.g. the same words) may be used by different strategies, leading to a kind of competition and mutual influence across strategies. Given that we see variation and evolution at these two levels, we must conclude that individual language users master both language strategies, which are procedures for building, expanding, and adapting form-meaning mappings in order to achieve a particular communicative goal, and language systems, which are the concrete instantiations with respect to meanings (ontology), words (lexicon) or grammatical constructions (grammar) given a particular strategy. The communal language strategies, i.e. the strategies shared by all or most members of a population, and the communal language system, being the shared choices in a particular community, emerge out of the collective activity of all individuals and are neither explicitly accessible nor explicitly represented. The goal of this paper is to understand and model the competition, selection and evolution of language strategies using colour as a concrete case study, specifically the interaction between brightness-based and full-colour-based strategies as attested in the evolution of English and many other languages.
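As a minimal illustration of one round of the Colour Naming Game described above, consider this sketch (illustrative Python; the word labels are borrowed from the invented lexicons of Fig. 3, and all coordinates are made up for the example):

```python
def dist(a, b):
    """Euclidean distance between two colour points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest(chips, point):
    """Index of the chip closest to a colour point (1-nearest neighbour)."""
    return min(range(len(chips)), key=lambda i: dist(chips[i], point))

def naming_game(context, topic_idx, speaker_lex, hearer_lex):
    """One round: the speaker names the topic chip with the word whose
    prototype is nearest to it; the hearer points at the chip nearest to
    his own prototype for that word. Returns (word, success)."""
    topic = context[topic_idx]
    word = min(speaker_lex, key=lambda w: dist(speaker_lex[w], topic))
    if word not in hearer_lex:
        # learning: the hearer adopts the word with the topic as prototype
        hearer_lex[word] = topic
        return word, False
    pointed = nearest(context, hearer_lex[word])
    return word, pointed == topic_idx
```

A failed game still moves the population forward, because the hearer leaves it with a new word-prototype association to be aligned in later games.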
2 Language Strategies
The first step in the investigation is to operationalise the language strategies themselves. We have done this here for two strategies: the Basic Colour Strategy and the Brightness Colour Strategy. We have chosen the CIE L*u*v* space because the distance between two colours in this space accurately represents the psychological distance between these colours as perceived by human subjects [10]. The L* dimension represents brightness (ranging from black to white), the u* dimension represents the red-green opponent channel and the v* dimension the yellow-blue channel. A two-dimensional projection of the Munsell chips is shown in Fig. 2.

Fig. 2. A two-dimensional projection of the Munsell chips on the hue plane. A subset of these chips is used as the stimuli in a language game.

Colour categories are represented by a single point in this colour space, usually called its focal prototype [11], and categorisation can be modeled with a standard one-nearest-neighbour classification algorithm. The prototypes nearest to each chip in the context are computed and categorisation is successful if there is a unique distinctive prototype found for the topic. This prototype is named and, if this leads to a successful game, it is shifted by a very small factor towards the topic stimulus. When there is
another stimulus with the same prototype as the topic, a new prototype is introduced using the topic as seed. This happens at a very low rate, to ensure that new categories and the names to express them become sufficiently shared in the population before another new category is invented. When the hearer encounters a name he has never heard before, he adds a new association between this name and a newly created colour category based on the current topic. The Basic Colour Strategy uses all three dimensions of the colour space (both the brightness and the hue dimensions). Figure 3(a) shows that this strategy enables a population of agents to self-organize a colour lexicon from scratch. Figures 3(b) and 3(c) show the evolution of a colour lexicon in a typical experiment.

Fig. 3. The Basic Colour Strategy allows a population of agents (in this case 10) to self-organise a colour lexicon from scratch (a). The graph shows how steady high communicative success is reached with a lexicon of about 15 colour words. The evolution of a typical lexicon in a smaller population (5 agents) is shown after 400 (b) and 1200 (c) games per agent. Each row represents the lexicon of one agent.
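The categorisation machinery shared by the strategies can be sketched in a few lines (illustrative Python, with invented names; dims selects which channels are compared, so the same code covers the full three-dimensional Basic Colour Strategy and a brightness-only variant restricted to L*):

```python
def categorise(context, topic_idx, prototypes, dims=(0, 1, 2)):
    """Return the index of a distinctive prototype for the topic, or None.
    dims selects the channels used: (0, 1, 2) for all of L*, u*, v*;
    (0,) to compare on brightness L* alone."""
    def d(p, c):
        return sum((p[i] - c[i]) ** 2 for i in dims)
    def best(chip):
        return min(range(len(prototypes)), key=lambda j: d(prototypes[j], chip))
    cat = best(context[topic_idx])
    # distinctive: no other chip in the context maps to the same prototype
    for i, chip in enumerate(context):
        if i != topic_idx and best(chip) == cat:
            return None
    return cat

def shift(prototype, topic, rate=0.05):
    """After a successful game, move the winning prototype a small step
    towards the topic stimulus."""
    return tuple(p + rate * (t - p) for p, t in zip(prototype, topic))
```

When categorisation returns None because two chips fall under the same prototype, the agent may instead seed a new prototype from the topic, which is the invention step described above.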
The Brightness Colour Strategy is similar to the previous strategy, but instead of taking all three dimensions into account, only the L* dimension of both the stimuli in the context and the prototypes of the colour categories is compared. While learning, the prototype of the used colour category is shifted on the L* axis towards the L* value of the topic. During invention, only the L* value of the topic is considered relevant. Figure 4(a) shows that this strategy is also adequate to allow a population of agents to self-organize and coordinate a colour lexicon from scratch. The resulting colour lexicon now consists of different shades of gray (see Figs. 4(b) and 4(c) for the evolution of a typical lexicon).

Fig. 4. The Brightness Colour Strategy also allows a population of agents (in this case 10) to evolve an adequate colour lexicon (a). The evolution of a typical lexicon in a smaller population (5 agents) is shown after 400 (b) and 1600 (c) games per agent. Each row represents the lexicon of one agent.

3 Strategy Selection

We can now turn to the main topic of this paper: how can there be selection and cooperation between different language strategies, and how can there be evolution, in the sense that one strategy overtakes another? Different strategies may
cooperate in the sense that one strategy may be better in certain circumstances than in others. For example, a brightness colour vocabulary is obviously more appropriate when talking about black-and-white photographs. But strategies also compete, because a population will obviously be more successful if each language user adopts the same default strategy for dealing with the same sort of communicative problem. There is also competition for words and meanings among the different strategies. A hearer cannot know whether a particular word (e.g. “yellow”) is to be interpreted using one strategy (brightness-based) or another one (full-colour-based), particularly because both strategies could work in similar circumstances. For example, yellow colour chips are often the brightest ones, and hence both strategies would work. We have operationalised the linguistic selection of strategies in the following way. Each strategy is reified, in the sense that it is an object represented as such in the memory of the agents. A strategy has a score which reflects its ‘fitness’. This fitness is based on the communicative success of the words and meanings that were built or used with this strategy. The meanings, words or grammatical constructions are tagged with the strategy that has been used to invent or learn them. For example, if the word “dark” is acquired with the brightness-based strategy, it is tagged with that strategy. The same word may be tagged with different strategies, because a learning agent does not know which strategy has been used by the speaker for a particular word and hence may have to make different hypotheses. In speaking, agents handle a communicative problem with the solution stored in their language system that had the most success in the past, and this solution implies a particular strategy. When the problem cannot be handled, the speaker has to expand his set of meanings and his lexicon, and he prefers the default strategy, i.e. 
the strategy that had the most success in the past. It is only when this strategy does not work that other alternative strategies are tried out, in decreasing order of fitness. In listening, the hearer first applies his own stored solution to interpret the utterance, which again implies the use of the language strategy associated with this solution. When the hearer is confronted with an unknown word, or with a situation in which his interpretation of the word does not work for the present context (because apparently the speaker used another strategy for the word), he first uses his own default strategy to figure out the meaning of the unknown word and, if that does not work, tries out alternative strategies, again in order of decreasing fitness. Due to space limitations, we can only show the outcome of one of our experiments to study the rich and complex dynamics that result from these behaviours. In this experiment the set of prototypes is kept fixed, namely equal to the focal colours underlying the Spanish colour system [12]. However, we left it open which strategy the agents should use. For example, the word “morado” (purple) can be interpreted both in the full-colour space and in the brightness space. Two situations arise. In one situation, a single strategy becomes clearly dominant in the population. This could be either the brightness or the full-colour strategy, depending on small fluctuations in the early stages (Fig. 5(a)). But we have also observed situations where one strategy becomes dominant first (for example, the brightness strategy) only to be overtaken later by another strategy (in this case the full-colour strategy), as seen in the history of English (Fig. 5(b)). The two strategies continue to co-exist in this case. Brightness is still used in circumstances where it gives a higher chance of communicative success, for example when colour chips are close in hue but distinct in brightness, or when there is a word which has most of its success in the brightness dimension.
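The selection scheme just described can be made concrete in a few lines. The sketch below is our own reconstruction, not the authors' code; the score update rule (a running average of communicative success), the learning rate, and all names are assumptions:

```python
class Strategy:
    """A reified language strategy with a long-term fitness score."""

    def __init__(self, name, fitness=0.5):
        self.name = name
        self.fitness = fitness

    def update(self, success, rate=0.05):
        # Running average: fitness tracks the communicative success of
        # the words and meanings built or used with this strategy.
        self.fitness += rate * ((1.0 if success else 0.0) - self.fitness)


def trial_order(strategies):
    """The order in which an agent tries its strategies: the default
    (highest-fitness) strategy first, then the rest by decreasing fitness."""
    return sorted(strategies, key=lambda s: s.fitness, reverse=True)


# A word is tagged with every strategy hypothesised for it, since a hearer
# cannot know which strategy the speaker used for that word.
lexicon = {"dark": {"brightness"}, "yellow": {"brightness", "full-colour"}}

brightness = Strategy("brightness", fitness=0.6)
full_colour = Strategy("full-colour", fitness=0.4)
order = trial_order([full_colour, brightness])   # brightness is tried first
```

Because the fitness update feeds on communicative success, whichever strategy happens to succeed early is tried first more often, which is one way the winner-take-most dynamics reported in the experiments can arise.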
4 Conclusion
Language strategies can only compete with each other through the use of the language systems that they enable their users to build. And language systems can only be tested through the production and comprehension of concrete utterances. So we get selection at two levels: (1) The application of a language strategy by a population generates possible variation (possible categories, possible words), and those variants that lead to higher communicative success undergo positive selection and are hence re-used in future communications. We have shown that this selectionist process can be orchestrated by coupling communicative success to language adaptation (Figs. 3(a) and 4(a)). (2) The recruitment of cognitive functions generates possible language strategies, and those strategies that lead to the construction of language systems with higher communicative success undergo positive selection, and are thus used even more in the future. We have shown here that this selectionist process can be orchestrated by having the agents keep track of which strategy they used for the building and interpretation of a particular word, as well as of the long-term communicative success of each strategy (its fitness). We have shown that these dynamics allow a population to settle on a dominant default strategy, although there is not necessarily a winner-take-all situation (Figs. 5(a) and 5(b)). The same sort of competition and selection between different strategies has been observed in many areas of lexical and grammatical evolution. For example, there is currently an evolution going on in Spanish clitics (“le”, “la”, “lo”)
Fig. 5. In (a) the Brightness Colour Strategy becomes entirely dominant and is used significantly more, whereas in (b) one strategy overtakes the dominant one, which is reflected in the respective use of each strategy.
whereby the etymological system of Standard Spanish, which uses clitics to express different cases (nominative, dative, accusative), is shifting to a referential system in which case differentiation is lost, but with existing forms recruited to express gender and number distinctions [13]. The two-level selectionist dynamics discussed here is relevant for understanding more broadly how an individual may select the language strategies used in his or her language community, and how language strategies may evolve in the historical evolution of a language.
Linguistic Selection of Language Strategies
Acknowledgements. This research is funded through a fellowship of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen), with additional funding from the EU FP7 ALEAR project and the Sony Computer Science Laboratory in Paris.
References
1. Garrod, S., Doherty, G.: Conversation, co-ordination and convention: An empirical investigation of how groups establish linguistic conventions. Cognition 53, 181–251 (1994)
2. Steels, L.: Language as a complex adaptive system. In: Schoenauer, M. (ed.) Proceedings of PPSN VI. LNCS, pp. 17–26. Springer, Berlin (2000)
3. Steels, L.: A self-organizing spatial vocabulary. Artificial Life 2(3), 319–332 (1995)
4. Steels, L., Belpaeme, T.: Coordinating perceptually grounded categories through language: a case study for colour. Behavioral and Brain Sciences 28(4), 469–489 (2005)
5. Hardin, C., Maffi, L.: Color categories in thought and language. Cambridge University Press, Cambridge (1997)
6. Berlin, B., Kay, P.: Basic color terms: their universality and evolution. University of California Press, Berkeley (1969)
7. Roberson, D., Davies, I., Davidoff, J.: Color categories are not universal: replications and new evidence from a stone-age culture. Journal of Experimental Psychology: General 129(3), 369–398 (2000)
8. MacLaury, R.: From brightness to hue: an explanatory model of color-category evolution. Current Anthropology 33(2), 137–187 (1992)
9. Casson, R.: Color shift: evolution of English color terms from brightness to hue. In: Hardin, C., Maffi, L. (eds.) Color Categories in Thought and Language, pp. 224–239. Cambridge University Press, Cambridge (1997)
10. Fairchild, D.: Color appearance models. Addison Wesley Longman, Reading (1997)
11. Rosch, E.: Cognitive representations of semantic categories. Journal of Experimental Psychology 104(3), 192–233 (1975)
12. Lillo, J., Moreira, H., Vitini, I., Martín, J.: Locating basic Spanish colour categories in CIE L*u*v* space: identification, lightness segregation and correspondence with English equivalents. Psicológica 28, 21–54 (2007)
13. Fernández-Ordóñez, I.: Leísmo, laísmo, loísmo: estado de la cuestión. In: Bosque, I., Demonte, V. (eds.) Gramática descriptiva de la lengua española, vol. 1, pp. 1319–1390. Real Academia Española / Espasa Calpe, Madrid (1999)
Robots That Say ‘No’
Frank Förster, Chrystopher L. Nehaniv, and Joe Saunders
Adaptive Systems Research Group, School of Computer Science, University of Hertfordshire, College Lane, Hatfield, AL10 9AB, United Kingdom
Abstract. This paper reports on foundational considerations for experiments into the acquisition of human-like use and understanding of negation in linguistic utterances via a developmental robotics approach. For this purpose different taxonomies of negation in early child language are analysed in order to show the large variety of communicative functions that these different types of negation have. Requirements for robotic systems that aim at acquiring these utterances in a linguistically unconstrained human-robot dialog are derived from this analysis.
1 Introduction
This paper presents an analysis of negation in early child language which replaces the prevailing paradigm of propositional representation with one that considers language as a means to manipulate the world. Apart from being necessary in order to ground negative utterances, this shift in viewpoint seems more in accordance with evolutionary perspectives on language [9]. The discussion below contributes to ongoing research into achieving human-like language acquisition in robots via developmental processes [16,10,5]. The motivation for the development of existing frameworks is at least partially to tackle the symbol grounding problem by linking the symbolic representations of the system to sensorimotor data [8,15]. Some of the embodied frameworks, such as [19], enable artificial agents to invent their own vocabulary and simple forms of grammar, like word order, for the purpose of communicating with each other in a language constructed by and understandable to the robotic participants of the dialog [18]. Other frameworks, e.g. [6] or [14], enable robots to learn and understand names for objects, actions, or spatial relations, simple propositions, or commands taken from natural human language. From the perspective of pragmatics, all of these frameworks enable robots to engage in one or two types of speech acts: commenting on the state of the world, an assertive speech act, and following orders, a reaction to directive speech acts. As is known from speech act theory [1,17], human language has many more functions than only stating facts about the world or receiving and giving orders. Observations of the earliest language use of children show clearly that the functional scope of early human language transcends these types of speech acts [4]. Already in the pre-grammatical phase of early utterances, children linguistically express attraction or aversion towards objects, actions and persons by requesting or refusing. Such utterances constitute only an alternative means for conveying communicative intentions which in the pre-linguistic phase were conveyed via gestures or prosodically marked non-lexical utterances that accomplish the same function. Negatives play a major role in the so-called “langage de volonté” (language of will) [7], and many types of negation are prototypical examples of utterances that cannot be grounded without being linked to the volitional or affective state of an agent. Pea [13] uses the apt expression “motor-affective sensorimotor intelligence” to characterize the capability that children must have in order to refuse things in a linguistic manner. This highlights the difference compared to the pure sensorimotor grounding of object words like “cup” or “ball” or action words like “grasp” or “throw”. Prohibitive speech acts, acts of refusal, and acts of motivation-dependent denial are examples of negatives that are at least partially distinguished through their affective nature. Other negative speech acts include commenting on the disappearance of objects, the cessation of events, or expressing unfulfilled expectations, all of which could be grounded with the existing frameworks. It can neither be assumed that the last mentioned types constitute the overwhelming majority of early negation, nor can these be expected to be the dominant types of negation in a human-robot dialog with unconstrained language use on the part of the human. We examine the preconditions of this very early form of human speech that must be met by robotic agents in order to ground negatives and engage in negative speech acts in the context of such a dialog.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 158–166, 2011.
© Springer-Verlag Berlin Heidelberg 2011
Less affective utterances that comment on disappearance or express unfulfilled expectations, sometimes grouped together under the term non-existence, have the advantage of being a natural counterpart to positive comments on and descriptions of the world, and thus highlight the difference between recent frameworks and the new approach proposed here, which aims at extending these frameworks towards the grounding of negation, including affect-heavy forms like rejection or prohibition. The presented outline is meant to be a theoretical basis for research dedicated to enabling robots to acquire early pre-grammatical human language. The account tries to stay close to early human language development while at the same time exposing the crucial properties of negation that have to be accommodated by robotic systems in order to engage in dialogues with humans. Section 2 gives a short overview of negative speech acts observed in early child language in the one-word stage and their characteristics. Different taxonomies are presented and compared, with specific regard to properties that are crucial for the grounding and representation of negatives in robotic language acquisition. These acts will be considered from the point of view of pragmatics, speech act theory, semiotics, and accounts of early language acquisition [4,12]. Section 3 outlines the implications of the linguistic analysis for robotic frameworks and describes the cornerstones of a computational model designed to enable humanoid robots to acquire early human language via dialog with a human partner. Section 4 concludes with a short summary.
2 Emergence of Negation in Human Language Development
Compared to words that name objects, actions or even abstract categories, negation is semantically hard to classify and is deeply interwoven with lexical meaning in general [13]. There are several strategies for categorizing negation semantically, with varying granularity (see [13] for a discussion). To the knowledge of the authors, there is, despite many centuries of research on many aspects of negation, no commonly accepted taxonomy with regard to its semantics. This circumstance indicates the difficulty of the enterprise. In the following we adopt the taxonomy proposed by Pea [13], as it is motivated by similarities in the situational context and the child’s behavior, which leads to an intuitive clustering. Moreover, this taxonomy has the nice feature of emphasizing, for the particular negation types, different properties that lend themselves readily to translation into different requirements for the computational model delineated in Section 3 below. Note that this taxonomy does not necessarily reflect the child’s own distinctions (see [13]). Other taxonomies can be found in [2] and [3] and will be compared to the adopted one. Table 1 summarizes the most frequent types of early negation and their characteristics in the period up to 25 months, according to Pea [13]. The negation types in the table roughly emerge in the order listed, whereby the three middle types can also be exchanged with regard to order of emergence. The reason for this variability is simply that different parents, from potentially different social classes, speak very differently to their children, which has a strong impact on their linguistic development. Nonetheless, the tendency from a rather reactive behavior towards a behavior that requires an increasingly complex memory is observable. Negation types not listed in Table 1. Less frequent and therefore omitted types are make-believe, agreement to negative statements, motivation-dependent denial, and perspective-dependent denial (see [13]). Rejection. 
Action-based negations that serve to reject objects, persons, activities or events in the immediate environment fall into this category. Often expressed as a simple “No”, this type of negation is repeatedly reported to be the first type to emerge, and it has gestural and non-gestural predecessors long before the emergence of the first word. Rejections might be the most affective type of negation, as they cannot be interpreted without reference to aversion.

Self-Prohibition. Utterances of negation in the context of objects or actions that have previously been forbidden by the teacher. One can observe these utterances in a scenario where the child approaches a forbidden object, hesitates to touch it while looking at the teacher, approaches it again, and so on. Self-prohibition is more complex than rejection in the sense that it requires an internal representation of the preceding external prohibition.

Disappearance. Typically expressed as “gone” or “all-gone” in English, disappearance negatives signal the disappearance of something in the immediate past. Like self-prohibition, this type requires an internal representation, in this case of objects that disappear, but on a shorter time-scale. Short-term memory might be enough to support this type.

Table 1. Most frequent types of early negation. Taxonomy and characteristics are compiled from Pea [13], except for characteristics tagged with * which were added by the authors. (affective) means that the corresponding negation types are not intrinsically affective but accompanied or caused by affective states. ‘Adjacent’ describes whether or not an utterance is produced in response to another one.

Rejection. Topic: objects, persons, events, activities in the present. Characteristics: affective, adjacent and nonadjacent, action-based, no need for internal representation.

Self-Prohibition. Topic: objects, persons, events, activities. Characteristics: nonadjacent, (affective)*, referent in present but assumes prohibition in past with regard to the same referent on part of the carer.

Disappearance. Topic: objects, persons, events, activities. Characteristics: adjacent and nonadjacent, non-affective*, referent in the immediate past, typically in context of where-questions but also with declaratives, need for internal representation.

Unfulfilled Expectation. Topic: objects, persons, events, activities. Characteristics: nonadjacent, (affective)*, referent in past or present, comment on constraint on activity or absence other than immediately prior disappearance or cessation, need for internal representation, first action-uses then existential and locational uses.

Truth-functional Denial. Topic: facts of the situation to which the proposition that is to be denied refers. Characteristics: adjacent, non-affective*, response to a proposition, need for abstract internal representation.

Unfulfilled Expectation. Like disappearance, unfulfilled expectations might be expressed as “gone” or “all-gone”. They are uttered in contexts where objects are absent from their expected or habitual location without having been present in the immediate past. They also occur when an activity is unsuccessful in contrast to previous success, e.g. caused by broken toys. This type requires representations of objects, actions or events on longer time-scales than disappearance.

Truth-functional Denial. This type of negation is the most abstract in the taxonomy and the last to emerge. Generally it is a response to a proposition not held to be true by the child. To employ this kind of negation a child must be able to conduct logical judgments and use at least some truth-conditional semantics
of language. The negated proposition is independent of the child’s attitude towards it, distinguishing this type from motivation- and perspective-dependent denial. The facts that are referred to might be in the present, past or future. Taxonomy Proposed by Choi. Choi divides cross-linguistic negation up to the age of 40 months into nine categories, which roughly emerge in three developmental phases [3]:
Phase 1: (nonexistence), prohibition, rejection, (failure)
Phase 2: denial, (inability, epistemic negation)
Phase 3: normative negation, inferential negation
The brackets indicate a variation in the time of emergence for the different children observed by Choi: for some children these categories emerged during the indicated phase, for others one phase later. With normative negation Choi describes negatives that occur when the state of affairs differs from the agent’s habitual expectations. Negations of this type are evoked through a deviation from a normative expectation (e.g. persons go in and not on a car) or the unorthodox use of a tool. Inferential negation is related to denial, but in contrast the agent assumes that the conversation partner holds the statement which is to be denied to be true, rather than having actually heard the latter express it. Nonexistence is expressed when the agent expects an entity to be present which is not, or when the entity disappears. Nonexistence according to Choi therefore seems to subsume Pea’s categories of disappearance and unfulfilled expectation. Prohibition is used by the agent to negate action on the part of the interaction partner. Pea does not list this category explicitly but notes that it is tightly linked to rejection. Failure is the reaction to a specific event that does not occur as expected. Like nonexistence, it therefore seems to map to Pea’s category of unfulfilled expectation. Denial probably subsumes all the differentiated types of denial of Pea. 
On the other hand Choi only quotes one example which falls into Pea’s category of truth-functional denial. Inability describes an agent’s negation of its physical ability to accomplish a task. It probably maps best to Pea’s unfulfilled expectation category as this also subsumes constraints on activities. Epistemic negation describes negative responses to requests for information like “I don’t know”. This type does not seem to be captured by Pea’s taxonomy maybe because he did not observe it while monitoring the children of his study. Taxonomy Proposed by Bloom. Bloom is cited by both of the other authors and seems to have provided one of the first accounts of the development of negation with regard to semantics [2]. At the same time her partition is the least elaborate one of the presented taxonomies. She only distinguishes the three types nonexistence, rejection and denial. Another striking difference is the fact that she reports nonexistence as the first negation type to emerge. Pea argues that this might be the case because Bloom focuses on negative meanings in sentences instead of their emergence during the single-word period.
Nonexistence: The referent is not manifest in the context where there is an expectation of its existence, and is correspondingly negated in the linguistic expression. Rejection: The referent actually exists or is imminent within the contextual space of the speech event and is rejected or opposed by the child. Denial: The negative utterance asserts that an actual (or supposed) predication is not the case. The negated referent is not actually manifest in the context as it is in rejection, but it is manifest symbolically in a previous utterance. Bloom’s category of denial is evidently the same as Pea’s category of truth-functional denial. Rejection is defined in the same way as by Pea. Nonexistence maps to Pea’s unfulfilled expectation. Bloom’s focus on sentential negation renders her taxonomy incomparable with regard to developmental emergence, and we therefore will not take it into account in what follows.
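The grounding-relevant properties that the adopted taxonomy (Table 1) assigns to each negation type can be written down as data, which is one straightforward way to drive the requirements derived in the next section. The dictionary keys and value vocabulary below are our own encoding, not Pea's terminology:

```python
# Pea's early negation types with the properties that matter for grounding,
# transcribed from Table 1.  "accompanied" marks the parenthesised
# (affective)* types: not intrinsically affective, but accompanied or
# caused by affective states.
NEGATION_TYPES = {
    "rejection":               {"affect": "intrinsic",   "adjacency": "both",        "memory": "none (reactive)"},
    "self-prohibition":        {"affect": "accompanied", "adjacency": "nonadjacent", "memory": "past prohibition"},
    "disappearance":           {"affect": "none",        "adjacency": "both",        "memory": "short-term"},
    "unfulfilled expectation": {"affect": "accompanied", "adjacency": "nonadjacent", "memory": "long-term"},
    "truth-functional denial": {"affect": "none",        "adjacency": "adjacent",    "memory": "abstract representation"},
}

def needs_affective_grounding(negation_type):
    """Types that cannot be grounded without reference to the agent's
    affective or volitional state, intrinsically or as accompaniment."""
    return NEGATION_TYPES[negation_type]["affect"] != "none"

def purely_adjacent():
    """Types only ever produced in response to another utterance."""
    return [t for t, p in NEGATION_TYPES.items() if p["adjacency"] == "adjacent"]
```

Querying this encoding reproduces the observations made in the text: rejection demands affective grounding, and truth-functional denial is the only purely adjacent type.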
3 Prerequisites for Grumpy Robots
From the linguistic analysis above we infer that there are three crucial properties that distinguish the different types of negation. First, they are distinguished through their relatedness to affect or volition; rejection, for example, cannot be grounded meaningfully without referring to the affective state of the agent. We therefore propose to replace sensorimotor grounding with sensorimotor-affective grounding to take this circumstance into account. An easy way to accomplish this might be to introduce two-dimensional values for affect, denoting the valence (positive/negative) and the degree of affect. These values could in principle be associated with instances of acquired concepts, linked to ongoing ‘needs’ of the robot, treated in the same way as other sensorimotor values, and stored together with the other sensor readings for each frame in two additional dimensions. These values, which constitute the willingness to cooperate or accept (or the opposite thereof), could in the beginning be chosen arbitrarily, indicating a tendency to accept or reject certain actions and objects. For the sole purpose of grounding, the reason why an affective state is the way it is does not seem important. In order to signal the state of affect to the interaction partner in language acquisition games, this state should be mirrored by facial or body gestures of the humanoid. This is necessary in order to provoke negative utterances by the interaction partner in the case of non-cooperation or non-acceptance; these could be utterances like “No? You don’t like this ball?”. In later stages the arbitrary choice of affect could be replaced by more sensible measures, such as the output of a planning module, depending on whether a certain action is useful or harmful to the agent in achieving its goals, satisfying its internal drives, or maintaining its state [11]. 
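A minimal version of this sensorimotor-affective grounding, with the two affect dimensions stored next to the other sensor readings of a frame, might look as follows. All names, the value ranges, and the display threshold are our assumptions, not part of the paper's model:

```python
import random
from dataclasses import dataclass

@dataclass
class Frame:
    """One time-step of experience: sensor readings plus two affect values
    stored alongside them, as proposed above."""
    sensors: dict
    valence: int = 0      # -1 (negative) or +1 (positive); 0 = neutral
    degree: float = 0.0   # strength of the affective response, in [0, 1]

def arbitrary_affect(rng=random):
    """Initially arbitrary tendency to accept or reject an action/object."""
    return rng.choice([-1, 1]), rng.random()

def mirrors_rejection(frame, threshold=0.5):
    """Should the humanoid display negative body language for this frame,
    prompting the tutor to produce negative utterances?  The threshold
    is an assumed parameter."""
    return frame.valence < 0 and frame.degree > threshold

valence, degree = arbitrary_affect()
frame = Frame(sensors={"object": "ball", "action": "offer"},
              valence=valence, degree=degree)
```

Replacing `arbitrary_affect` with the output of a planning or drive module would give the later-stage variant mentioned above without changing the frame layout.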
The second distinction we can observe is the increasing complexity with regard to the required memory: from a purely reactive behaviour in the case of rejection to the need for long-term memory for unfulfilled expectations or normative negation. Interesting issues in this context are the need for an internalization of a previously external physical prohibition in the case of self-prohibition and
Fig. 1. Humanoids iCub (left) and Kaspar (right) used in language acquisition studies. These studies include work on learning vocabulary, holophrases, negation and grammar in a manner where language is grounded in active manipulation of objects, the social environment, and in sensorimotor experience in unconstrained interaction with human teachers. The work is targeted at the acquisition of human language-like capabilities.
the need for an internal representation of habitual locations and habitual functionality in the case of unfulfilled expectations. It is unclear whether these two memory-related requirements should be treated in a differentiated manner or whether it would be advantageous to distinguish them on a higher level but map them to the same underlying memory structure. Experiments with different memory models in which both negation types are acquired simultaneously should shed light on this issue. The third distinction between the types is given by their property of being adjacent or nonadjacent (see Table 1). In order to engage in adjacent negation the agent must know when it is addressed. The only purely adjacent type is truth-functional denial, the latest negation type expected to emerge. In this case turn-taking becomes an issue, as does the analysis of prosody (or an artificial replacement thereof), since the agent is supposed to react to being spoken to and therefore must have the means to find out when it is addressed. The large functional variability of the different negation types suggests that, from a developmental point of view, early negation might rather be considered a family of related but different types of speech acts than a variation of a single phenomenon. Seen from this perspective, an implementation in the spirit of item-based constructions seems natural [20]. The latter support a treatment of these types as if they were entirely independent in the beginning, leaving it up to machine learning algorithms to detect their similarity. Eventually a higher-level schema constructed through these mechanisms might emerge that could be labeled negation by an external observer.
4 Summary
We provided an analysis of early negation types with regard to their functional use which highlighted crucial differences with regard to their treatment in the
context of robotic language acquisition in general, and furthermore in the setting of human-robot dialogues. We propose to introduce a representation of affect or volition into the system in order to enable the acquisition of particular negation types like rejection. We also propose to introduce a differentiated memory architecture in order to support other types of negation like disappearance or unfulfilled expectations. The fact that all but one negation type are also expressed in a nonadjacent manner suggests that the integration of means to detect questions, e.g. through prosody extraction, is important but not of equal priority. Nonetheless, turn-taking and the detection of questions are important in order to maintain the progress of the dialog, which drives the grounded acquisition process itself. Initially they could be replaced by easier means of dialog control until, for example, robust methods for social cue and prosody detection are found. Acknowledgement. The work described in this paper was conducted within the EU Integrated Project ITALK (“Integration and Transfer of Action and Language in Robots”) funded by the European Commission under contract number FP-7-214668.
References 1. Austin, J.L.: How to Do Things with Words. Harvard University Press, Cambridge (1975) 2. Bloom, L.: Language Development: Form and Function in Emerging Grammars. M.I.T. Press, Cambridge (1970) 3. Choi, S.: The semantic development of negation: a cross-linguistic longitudinal study. Journal of Child Language 15(3), 517–531 (1988) 4. Clark, E.V.: First Language Acquisition. Cambridge University Press, Cambridge (2009) 5. Dominey, P.F., Metta, G., Nori, F., Natale, L.: Anticipation and initiative in human-humanoid interaction. In: 8th IEEE-RAS Intl. Conf. Humanoid Robots, pp. 693–699 (2008) 6. Dominey, P.F., Boucher, J.D.: Learning to talk about events from narrated video in a construction grammar framework. Artificial Intelligence 167, 31–61 (2005) 7. Guillaume, P.: Les d´ebuts de la phrase dans le langage de l’enfant. Journal de Psychologie Normale et Pathologique 24, 1–25 (1927) 8. Harnad, S.: The symbol grounding problem. Physica D 42, 335–346 (1990) 9. Maynard Smith, J., Szathm´ ary, E.: The Major Transitions in Evolution. Oxford University Press, Oxford (2002) 10. Nagai, Y., Rohlfing, K.J.: Computational analysis of motionese toward scaffolding robot action learning. IEEE Trans. Autonomous Mental Development 1(1) (2009) 11. Nehaniv, C.L., Dautenhahn, K., Loomes, M.J.: Constructive Biology and Approaches to Temporal Grounding in Post-Reactive Robotics. In: McKee, G.T., Schenker, P. (eds.) Sensor Fusion and Decentralized Control in Robotics Systems II. Proc. of SPIE, vol. 3839, pp. 156–167 (1999) 12. Owens, R.E.: Language Development: An Introduction. Pearson Education, London (2007) 13. Pea, R.: The Development Of Negation In Early Child Language. In: Olson, D.R. (ed.) The Social Foundations of Language & Thought: Essays in Honor of Jerome S. Bruner, pp. 156–186. W.W. Norton, New York (1980)
F. Förster, C.L. Nehaniv, and J. Saunders
A Genetic Programming Approach to an Appropriation Common Pool Game
Alan Cunningham and Colm O’Riordan
National University of Ireland, Galway, Ireland
[email protected],
[email protected] http://www.it.nuigalway.ie/cirg/
Abstract. We investigate the performance of agents co-evolved using genetic programming techniques to play an appropriation common pool game. This game is used to study the behaviours of users participating in scenarios with shared resources or interests, e.g. fisheries. We compare the outcomes achieved by the evolved strategies with those of human players as reported in [6]. Results show that genetic programming techniques are suitable for generating strategies in a repeated investment problem. We find that, using co-evolutionary methods, populations of strategies quickly converge to the Nash equilibrium predicted by game-theoretic analysis, but also lose many adaptive behaviours. Further, by evolving against a set of naive strategies, we show the creation of diverse and adaptive behaviours that play similarly to the humans described in previous experiments.
1 Introduction
Common pool problems deal with multiple agents and shared resources. These games model many real-world economic and social scenarios in which many individuals share resources, e.g. fishing grounds. We are interested in a specific type of common pool problem – the appropriation game. The main characteristic of this game is that the level of yield from the pool depends on all the agents acting in the pool. The problem becomes how the agents should appropriate their investments to achieve an optimal return. Agents must cooperate to some degree in order to achieve a maximally beneficial outcome for both the individual and the group. In this paper we study a co-evolved genetic programming method for creating investment strategies in a common pool resource (CPR). We wish to compare the performance of the generated strategies and the group-level behaviour of the agents with the performance of human players in a similar game [6]. This game provides a difficult problem for the agents and a difficult search space for the genetic programming process. In the agent environments specified, the maximum return from investments is gained by using the CPR; however, over-investment ruins the output of the CPR for the group. For these experiments a fixed number of agents play an iterated appropriation game. Each agent is provided with an endowment of tokens at the beginning of
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 167–174, 2011. © Springer-Verlag Berlin Heidelberg 2011
each round, which must be invested in two markets. The first market offers a fixed return on the tokens invested, while the second, the CPR, provides a level of return based on the total investment by all the agents in that market. The GP-generated tree chooses the investment strategy for each agent every round, and the fitness of the solution is a measure of the success of the agent. The performance of these evolved trees is compared to that of the human players in order to discover: (1) whether the evolved solutions are close to the aggregate group-level Nash equilibrium as predicted by non-cooperative game theory, (2) whether the individual play by the agents approaches the Nash equilibrium through selective pressure, and (3) what decision making processes are adopted by the agents while playing the game against various opponents. The main contributions of this paper are as follows: (1) a genetic programming based study of this CPR game; GP can provide an evolutionary perspective on this problem and, by using decision trees, it can overcome the constraints of other evolutionary methods [7]; (2) the use of co-evolution and a discussion of the findings; (3) the study of strategies employed by agents; and (4) a comparison of the evolutionary results with human players. The paper is divided into several sections. The next section discusses the background and related research, briefly introducing the concepts seen in the rest of the paper. Section three introduces the CPR appropriation game and our GP approach. Section four introduces the experiments that we have conducted and section five contains the results. The final section discusses uses for the techniques in the paper as well as possible future work.
2 Related Research
There is a great deal of literature on specific field-based CPRs such as the fishery [4]; for a review of the literature see [3]. We focus on studies that are more comparable to the computer agent simulations we employ. In [6] CPR problems are discussed from the point of view of lab experiments with human players and a study of common-pool scenarios in the field. The main focus of the experiments is the appropriation game, with a view to discovering how changes to the rules of the games alter the rate of cooperation. They find that both the ability to communicate and sanctioning in games increase the players’ yield. There have been a number of evolutionary approaches to different economic games in previous research. It has been shown that multi-population co-evolution with genetic algorithms performs well when searching for equilibria in simple standard games used in organisational theory [7]. All the games explored were one-shot, simultaneous-move games owing to a representational constraint in GAs. The authors suggest that other techniques, including GP, would provide the ability for conditional choice and thus the exploration of more complex and repeated games. The emergence of cooperation has been demonstrated in a spatial CPR game where the spatial element is represented by agents on a circle [5]. They show that
in the CPR game a cooperative strategy can survive, even when the majority of agents are defecting. As the strategies of players were based on neighbours, some agents could maintain cooperation even when defection was the generally adopted strategy globally. A model using the Swarm multi-agent simulation environment was built [1] to capture how adaptive agents perform in the baseline CPR game as defined in [6]. The agents are randomly allotted a subset of 16 preprogrammed strategies, and the performance of each alternative strategy is measured as the average return that the agent would have received in each round had it utilised it. It is found that the agents perform similarly to the humans in the lab experiments. These findings can be used for comparison with other approaches to the same game.
3 Game Definition
Researchers [6] developed a series of laboratory experiments utilising human subjects and investigated the correlation between the behaviours of the human players and the behaviours predicted by non-cooperative game theory. The baseline experiment comprised the following parameters: eight human participants made finitely repeated investment decisions regarding an amount of tokens with which they were endowed at the beginning of each round. The tokens are then invested in either Market 1, offering a fixed return, or Market 2, the common pool offering a return based on the level of investment, or some combination of both. Participants know the number of other players, their own endowment, their own past actions, the aggregate past actions of others, the payoff per unit for output produced in both markets, the allocation rule for sharing Market 2 output, and the finite nature of the game’s repetitions. Participants also know the mapping from investment decisions into net payoffs. Table 1 shows the parameters for the two baseline experiments undertaken, which are also used for the evolutionary experiments. The number of tokens predicted by the Nash equilibrium is 8 for each agent, or 64 for the group when such comparisons are made, for both experiments. Averaged across several experiments, the average net yield as a percentage of the maximum yield was 37% in the 10-token game and −3% in the 25-token game. The main conclusion from this baseline experiment was that even as users reach the equilibrium point, net yield decays toward 0 and rebounds as subjects alter their investment strategies. In low endowment settings, aggregate behaviour tends toward the Nash equilibrium. In the high endowment setting aggregate behaviour in early rounds is far from the Nash equilibrium, but does approach it in later rounds. At the individual decision level, however, behaviour is inconsistent with the Nash prediction.
The authors of [1] report almost identical performance from their agents, in terms of efficiency, when compared to the human players.
Table 1. Human Experiments

Type of endowment                        Low (10 tokens)      High (25 tokens)
Number of subjects                       8                    8
Individual token endowment               10                   25
Production function
  (xi is the investment by player i)     23(Σxi) − .25(Σxi)²  23(Σxi) − .25(Σxi)²
Market 2 return/unit of output           $.01                 $.01
Market 1 return/unit of output           $.05                 $.05
Earnings/subject at group maximum        $.91                 $1.65
Earnings/subject at Nash equilibrium     $.66                 $1.40
Earnings/subject at zero rent            $.50                 $1.25
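The per-subject earnings in Table 1 can be reproduced directly from the production function and the two unit values. The following sketch is our own reconstruction from those parameters (the function name is ours, not the paper's):

```python
def round_payoff(x_i, x_total, endowment):
    """Per-round net payoff for a player investing x_i of `endowment`
    tokens in Market 2 (the CPR), given x_total tokens invested there
    by the whole group. Market 1 pays a fixed $.05 per token; Market 2
    output, 23*total - 0.25*total**2, is worth $.01 per unit and is
    shared in proportion to each player's investment."""
    market1 = 0.05 * (endowment - x_i)
    output = 23 * x_total - 0.25 * x_total ** 2
    market2 = 0.01 * (x_i / x_total) * output if x_total > 0 else 0.0
    return market1 + market2

# Eight symmetric players with a 10-token endowment:
nash = round_payoff(8, 8 * 8, 10)           # 8 tokens each
group_max = round_payoff(4.5, 8 * 4.5, 10)  # 36 tokens total
zero_rent = round_payoff(9, 8 * 9, 10)      # 72 tokens total
```

The three computed values are $.66, $.905 and $.50, matching the Nash-equilibrium, group-maximum (rounded to $.91) and zero-rent rows of the low-endowment column.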
3.1 Using GP to Co-evolve Agents in the Two-Market Game
We use the GP process to create a tree for each agent representing its investment strategy for the CPR. As all tokens must be invested each round, the remainder is put into the fixed market. The GP creates trees from the set of functions and terminals contained in Table 2, which are divided into different nodesets for the STGP (for determining creation and crossover points in the trees). The functions are denoted with f(n), where n is the number of arguments the function takes. The GP nodesets are as follows. The environment nodeset contains functions and terminals that represent the agent’s perception of the environment. Where a node name refers to a belief, in this game that belief holds with 100% certainty. The function BELIEVEDPROFITFROMM1 returns the amount the agent would receive based on the level of investment given as an argument. BELIEVEDPROFITFROMM2 acts the same but also requires the total group investment as an argument. The constant nodeset contains terminals in the form of constants. It also contains some standard mathematical functions, which take two arguments to be operated on (multiply, divide), and the If function, which takes three arguments: the first from the decision set, the second the true branch and the third the false branch. The decision nodeset has functions which take environmental or constant nodes and compare them (greater, less, equal). We have also included And and Or, which take other decisions as arguments. This allows complex if-then statements to be created. We use a co-evolution method for evaluating the GP trees. We iterate through the population 20 times, each time selecting an individual member and choosing seven others at random to create a group of eight agents, with each tree representing the investment strategy of one agent. As the opponent strategies are chosen at random, each member of the population usually plays far more games than this minimum of 20.
The group then plays the game as the humans did, with one difference: a fixed 20 rounds instead of at least 20, although the agents do not know the exact number of rounds.
For each round, the tree is evaluated, returning the number of tokens which the agent will invest into the CPR. The remainder of the tokens is invested into the fixed market. The fitness of the individual is the cumulative profit of the agent over the 20 investment rounds, with a small penalty for the length of the tree. We also penalise invalid trees by assigning a very poor fitness score. An invalid tree is one that returns a number of tokens to invest in the CPR greater than the round endowment or less than zero. When such a strategy occurs during evaluation, it is replaced by one investing a random percentage of the token endowment each round. The GP parameters that we use, based on experimentation and recommendation, are as follows: a population size of 250, 100 generations, a crossover probability of 90%, a mutation rate of 10%, a creation probability of 2%, and finally elitism is employed.

Table 2. Node Sets for GP

Environment:
  BELIEVEDROUND
  BELIEVEDTOTALGROUPTOKENS
  BELIEVEDAMOUNTOFAGENTS
  BELIEVEDROUNDENDOWMENT
  PROFITLASTROUND
  PROFITM1LASTROUND
  PROFITM2LASTROUND
  TOTALGROUPINVESTMENTM2LASTROUND
  INVESTEDM2
  CUMULATIVEPROFIT
  AVERAGEINVESTEDM2
  AVERAGECUMULATIVEPROFIT
  AVERAGETOTALGROUPINVESTMENTM2
  AVERAGEPROFITEACHROUND
  AVERAGEPROFITM1EACHROUND
  AVERAGEPROFITM2EACHROUND
  BELIEVEDPROFITFROMM1 f(1)
  BELIEVEDPROFITFROMM2 f(2)

Constants:
  Multiply f(2), Mod f(2), Plus f(2), Divide f(2), If f(3), {0, 1, 2, ...}

Decisions:
  Or f(2), Greater f(2), Less f(2), Equal f(2), And f(2)
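The evaluation loop described above can be sketched as follows. This is our own reconstruction, not the authors' code: the strategy interface, the random fallback for invalid investments, and the omission of the tree-length penalty are simplifying assumptions.

```python
import random

def play_game(strategies, endowment, rounds=20):
    """Play one iterated appropriation game and return the cumulative
    profit of each agent. A strategy is a callable (round, history) ->
    tokens to invest in Market 2 (the CPR); the remaining tokens go
    into the fixed-return Market 1."""
    profits = [0.0] * len(strategies)
    history = []  # group CPR investment per past round
    for r in range(rounds):
        bids = []
        for s in strategies:
            x = s(r, history)
            if not 0 <= x <= endowment:           # invalid tree: fall back
                x = random.randint(0, endowment)  # to a random investment
            bids.append(x)
        total = sum(bids)
        output = 23 * total - 0.25 * total ** 2   # CPR production function
        for i, x in enumerate(bids):
            share = 0.01 * (x / total) * output if total else 0.0
            profits[i] += 0.05 * (endowment - x) + share
        history.append(total)
    return profits

def coevolve_fitness(population, endowment, samples=20):
    """Co-evolutionary evaluation: each member is placed in `samples`
    random groups of eight drawn from the population and scored by its
    cumulative profit (the paper's tree-length penalty is omitted)."""
    fitness = [0.0] * len(population)
    for i, member in enumerate(population):
        others = [j for j in range(len(population)) if j != i]
        for _ in range(samples):
            group = [member] + [population[j]
                                for j in random.sample(others, 7)]
            fitness[i] += play_game(group, endowment)[0]
    return fitness
```

As a sanity check, eight constant strategies investing 8 tokens each (the Nash play) earn $.66 per round, i.e. $13.20 over the 20 rounds.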
4 Experiments
We evolve 8 separate populations of solutions. We then select the tree with the best fitness at generation 100 to be the representative of that evolutionary run. We play each of the 8 selected strategies in a game against each other. As a benchmark they also play against a group of pre-coded fixed strategies. We use the game the evolved agents play together for the comparison with the human players’ game. Averaged over 3 iterations of this process, we use the aggregate play as well as the individual investments in each round as the basis of our comparison. We repeat the experiments for both 10- and 25-token endowments per agent per round.
We use a set of fixed naive strategies to assess the similarity in performance between the individual representative evolved strategies. The cumulative profit of each strategy should show the degree of variance between them and also how well each strategy performs. For analysis, we also save the best strategy of each generation of the evolutionary runs. We analyse the resulting population of 100 “best” strategies, creating an overall profile of the evolutionary process in this problem domain. We play each of these strategies against each other. We record the cumulative profits and rank the solutions accordingly to determine whether there is an increase in performance as the evolution time increases. We also analyse the playing of rounds to see if there is a general pattern that agents adopt.
5 Results

5.1 GP Co-evolved Agents
Using the tree with the best fitness at the end of 100 generations of an evolutionary run as a representative, we use eight of these candidates to play the game. We find that, regardless of the token amount, the chosen agents play close or equal to the Nash equilibrium number of tokens. Recall from Table 1 that the individual-investment Nash equilibrium is 8 tokens, and for the group 64 tokens. In the three 10-token scenarios the agents play fixed strategies. The average group token investment for the games is 65.68. This is the same performance as the human players had at an aggregate level. At an individual level, however, we see a different trend. The evolutionary pressure is on the individual to perform at the Nash equilibrium, and this is the result we see from our evolved agents; this deviates from what the human players played. The human performance shows a changeable strategy that tends towards a group aggregate Nash equilibrium in later rounds but is quite varied at the individual level. The evolved agents play a typical non-varying strategy of 8.21 tokens on average. In the 25-token game, taking a candidate agent from eight different evolutionary runs, we see a perfect Nash play in the three chosen instances. During the game between the evolved agents, each agent plays an unchanging strategy of 8 tokens per round. We can see from Fig. 1 that the evolutionary process converges to the Nash number of tokens quite quickly. The standard deviation of the population’s investment decreases over the run. The results are averaged over eight separate evolutions of the problem. This evolution pattern is seen with both token endowment levels. In order to further evaluate the co-evolved strategies we play them against a set of naive strategies that play 0%, 20%, 40%, 60%, 80% or 100% of the token endowment. These opponents have a uniform and unchanging strategy throughout.
We perform this experiment to test whether our evolved strategies have the ability to exploit low pool investment and to counteract over-investment.
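Under the parameters of Table 1, the effect of the endowment bound can be checked with a worked example (ours, not a computation taken from the paper): a fixed 8-token strategy facing seven naive agents who invest their full endowment still earns a positive return in the 10-token game, but takes a loss in the 25-token game.

```python
def net_payoff(x_i, x_total, endowment):
    """Per-round net payoff under the Table 1 parameters: $.05 per
    Market 1 token plus a proportional share of the CPR output,
    23*total - 0.25*total**2, valued at $.01 per unit."""
    output = 23 * x_total - 0.25 * x_total ** 2
    return 0.05 * (endowment - x_i) + 0.01 * (x_i / x_total) * output

# A fixed 8-token agent facing seven naive agents investing 100%:
low = net_payoff(8, 8 + 7 * 10, 10)   # 10-token game: +$0.38 per round
high = net_payoff(8, 8 + 7 * 25, 25)  # 25-token game: -$0.97 per round
```

With a 10-token bound the group total cannot drive the pool's output negative, so returns stay positive; with 25 tokens the over-invested pool produces negative output and the fixed strategy suffers net losses.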
When playing against the naive fixed strategies, the evolved 25-token agents played a mix of fixed and mildly exploitative strategies. The fixed strategies played 8 tokens, or 8 in the first round and then 9 tokens in subsequent rounds. Some exploitative strategies gradually increased the number of tokens from 8 to 10 when playing with the low-investment naive strategies. Once the high-investment naive strategies were in play, none of the evolved agents reduced their usage of the pool, and as such they suffered net losses. In the 10-token game the agents are not as badly exploited as in the 25-token game. This is because the naive agents are bound to a maximum investment of 10 tokens, so we do not see negative returns. We do observe an inability to change strategy as the pool is over-exploited, with agents playing roughly fixed strategies that invest either 8 or 9 tokens. As the co-evolved population of agents converges to play the Nash number of tokens, so that during evaluation a large proportion of opponents will be playing similar strategies, the agents do not evolve very dynamic strategies that counteract being exploited. This results in very fixed strategies, as seen above, where typically the evolved agents do not take into account the actions of others.

5.2 Evolving against Naive Strategies
To explore the possibility of creating a good strategy, we use the naive strategies in the evaluation of our agent trees. Each agent plays 6 different games with the naive strategies as described above. This is the first time that the evolved strategies produce different outcomes between the 10- and 25-token games. As the 10-token game does not get exploited even when all tokens are invested by the 8 players, the strategies learned are consistently close to or at a Nash level of investment. For the 25-token games, the strategies evolved vary. Some play a uniform investment strategy while others play adaptive strategies. When the newly evolved strategies play each other, the performance is similar to human play. The pool utilisation refines itself close to Nash at the aggregate level, but at the individual level it varies wildly. The best performing strategy is one that does not invest in the pool at all, while the two next best are adaptive strategies. The worst performing strategies are ones that constantly invest the total number of tokens. These findings are shown in Fig. 2. The average efficiency of the investments as a percentage of the maximum return is 53.5%, which is much higher than that earned by the human players.
6 Discussion and Future Work
We have shown that genetic programming techniques are suitable for generating strategies in a repeated investment problem. By using co-evolutionary methods we have shown that populations of strategies will quickly converge to the Nash equilibrium, as predicted by game-theoretic analysis, but also lose many adaptive behaviours. By evolving against a set
Fig. 1. Average Investment for 25 tokens
Fig. 2. Evolved Strats playing each other
of naive strategies we have shown the creation of diverse and adaptive behaviours that play similarly to humans. Erratic behaviours emerge when the goals of the opponents differ from those of the evolving agent; this supplies an evolutionary pressure towards more adaptive strategies. We plan to explore and compare single- and multiple-population co-evolution for this problem. In conjunction with this, we wish to extend the fitness function to take into account varying proportions of the group score, in order to ascertain whether there is a tipping point in the agents’ performance towards the optimum for the group. Using multiple populations would allow specific roles to emerge within the population if the group composition is fixed.
References
1. Deadman, P.J.: Modelling individual behaviour and group performance in an intelligent agent-based simulation of the tragedy of the commons. Journal of Environmental Management 56, 159–172 (1999)
2. Ferber, J.: Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence. Addison Wesley Longman, Harlow (1999)
3. Furubotn, E.G., Pejovich, S.: Property Rights and Economic Theory: A Survey of Recent Literature. Journal of Economic Literature 10(4), 1137–1162 (1972)
4. Gordon, H.S.: The Economic Theory of a Common-Property Resource: The Fishery. Journal of Political Economy 62, 124–142 (1954)
5. Noailly, J., Withagen, C.A., van den Bergh, J.C.J.M.: Spatial Evolution of Social Norms in a Common-Pool Resource Game. Environmental and Resource Economics 36(1), 113–141 (2007)
6. Ostrom, E., Gardner, R., Walker, J.: Rules, Games, and Common-Pool Resources. University of Michigan Press, Ann Arbor (1994)
7. Price, T.C.: Using co-evolutionary programming to simulate strategic behaviour in markets. Journal of Evolutionary Economics 7(3), 219–254 (1997)
Using Pareto Front for a Consensus Building, Human Based, Genetic Algorithm
Pietro Speroni di Fenizio¹ and Chris Anderson²
¹ Engenheria Informatica, Coimbra University, Coimbra, Portugal
[email protected]
² Turnfront
[email protected]
Abstract. We present a decision-making procedure for a problem where no solution is known a priori. The decision-making procedure is a human-powered genetic algorithm that uses human beings to produce variations and evaluations of the proposed partial solutions. Following [1], we then pick the Pareto Front of the proposed partial solutions, eliminating the dominated ones. We then feed the partial results back to the human beings, asking them to find alternative proposals that integrate and synthesise the solutions in the Pareto Front. The algorithm is currently being implemented, and some preliminary results are presented. Some possible variations on the algorithm, as well as some of its limits, are also discussed.
1 Introduction
Decision making is a common challenge in any community, independently of its size. Every form of government is essentially a way to solve this problem, but it is also solved (or suffered) in groups that are too small to have any official decision-making structure. Often the way to find an agreement is through a democratic voting system. The options are listed, and everybody votes for their favourite option. The theory of voting is generally considered part of game theory, and it has a long history; for a good review we suggest [2]. The differences between the various voting systems are mostly limited to (1) how many options a member of the community can endorse; (2) whether the chosen options are ranked or not; and (3) how the votes of the various members are integrated to determine the winning proposal. The general structure is then: (a) the possible options are spelled out (often by some member of the community); (b) everybody votes; (c) through an algorithm the votes are counted; (d) the winning proposal is interpreted as representative of the community’s global desire. Of course, unless the decision was taken unanimously, not everybody will have favoured the winning proposal, and thus some people in the community will be forced to accept a decision they do not favour. If this happens often, with the same majority forcing decisions upon the same minority, it is common to speak of a tyranny of the majority. It is also well known in voting theory
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 175–182, 2011. © Springer-Verlag Berlin Heidelberg 2011
(and practice) that requiring unanimity before a decision is accepted tends to freeze a community when its size grows too much. How much is “too much” differs from community to community, but often between 10 and 20 members are enough to freeze a community that is trying to achieve unanimity. Is there really no other option?
2 The Unspoken Assumptions
There are a number of assumptions at the base of all those voting systems. We shall try to list them, and propose an alternative decision system not based on them. Firstly, there are some assumptions made about the size of the space of the possible actions that can be undertaken by a community. It is generally assumed that those possibilities are few (first assumption), are clear (second assumption), and can be easily recognised (third assumption) and acknowledged (fourth assumption). By assuming that there are few possibilities, we also assume that it is possible to list them all (fifth assumption). If we break free from those assumptions, we can instead suppose that the space of possibilities is vast, needs to be explored, and that no single human being can see all the possible solutions. We can even assume that at the starting point no group of people can see all the possible solutions. Now the problem has moved from finding an algorithm that decides which of the few possibilities to implement, based on how many people prefer each of them, to finding a solution that fits as many persons as possible. Posing the problem simply as listing the possibilities and then voting on them ignores the important aspect of mediation. Some, generally few, individuals, when presented with different positions, will try to mediate among the various possibilities, trying to find a solution that fits more people. Not only are those people rare, but communities also need to have one of them in a particular position of power to be able to benefit from their ability. So we could say that in a community where everybody can vote, there are generally only a few individuals who set up the possible proposals on which everybody else will vote, and even fewer who actively work to find a mediation between the various groups. Those mediators are the ones who are effectively looking for possible solutions that can be endorsed by a wider base.
What we suggest here is a system where every member of a community has the opportunity to present proposals, mediate between existing proposals, and endorse others’ proposals, and where this happens in a cyclic way such that an optimal solution is eventually reached.
3 The Algorithm
The algorithm that we propose in this paper is a human-based genetic algorithm [3] that explores the space of possible solutions to the question posed, looking for a solution that can satisfy the maximum number of people, potentially satisfying everybody.
The algorithm starts with a question being posed. It can be posed by one of the users, or by an external person. Then every user is allowed to write a possible solution (called a proposal) to the question. At this stage the proposals are secret, and no user is allowed to see the proposals written by the other participants. When everybody has written their proposal, the writing phase ends, and the algorithm moves to the endorsing phase. Now everybody is allowed to read each other’s proposals, and to endorse all the proposals they agree with. It is important at this stage that each user endorses all of the proposals that he agrees with. No limit should be set on the number of proposals that a participant can endorse. In the worst case a participant will only endorse their own proposal; a participant who is satisfied with all the proposals can endorse them all. Once everybody has endorsed all the proposals they are happy with, a selection process happens. We define a proposal A as dominating a proposal B if the set of participants that endorse proposal A strictly contains the set of participants that endorse proposal B. Of course, if A dominates B, and B dominates C, then A dominates C. To select the winning proposals we eliminate all the proposals that are dominated by some other proposal. What remains is a Pareto Front of the proposals. Note that each participant will be represented in at least one proposal. As such, the Pareto Front can be said to represent every person that has participated so far. As the selection ends, we say that a generation (or a turn) has passed. If the selection process produced a single proposal, it must necessarily be endorsed by everybody. We then decide that the question has been answered, and an acceptable unanimous solution has been found. If the selection process did not produce a single proposal, the process continues with a new generation.
At this new generation the participants are presented with the question that was posed (the same as before), and the Pareto Front of the proposals that won the past round. All the other proposals from the past generation have been eliminated and will be ignored. The participants are now invited to write new proposals, taking the previous Pareto Front as inspiration. They should try to find possible syntheses among them. Although this is the invitation they are presented with, each participant is allowed to write anything they want. They can introduce new solutions, re-propose past solutions that did not make it to the Pareto Front, or rephrase past proposals. If a participant feels that he has nothing to contribute, and that his view is fully represented by the proposals that made it to the Pareto Front, he is also allowed not to contribute at all at this stage. Once the participants (who wanted to write) have written their proposals, we again move to the endorsing phase, and then to the selection stage, and so on. The process continues through writing and endorsing phases until the system has either converged to a single unanimous answer, or the system no longer seems to produce variations and, generation after generation, the Pareto Front stays the same. The system is very simple: it is a genetic algorithm that uses human beings to produce the ‘genetic’ variation. It also uses human beings to evaluate the generated partial solutions. It then limits itself to the Pareto Front of the proposals.
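The selection step admits a very compact implementation. The following sketch is ours, not code from the paper (the data structures and names are illustrative assumptions): proposals are mapped to their sets of endorsers, and every proposal whose endorser set is strictly contained in another's is dropped.

```python
def dominates(a, b):
    """A proposal with endorser set `a` dominates one with endorser set
    `b` iff `a` strictly contains `b`: nobody prefers b to a, and at
    least one participant endorses a but not b."""
    return a > b  # strict-superset comparison on Python sets

def pareto_front(endorsements):
    """Keep only the non-dominated proposals.
    `endorsements` maps proposal -> set of endorsing participants."""
    return {p: e for p, e in endorsements.items()
            if not any(dominates(other, e)
                       for q, other in endorsements.items() if q != p)}

# One generation of selection (illustrative data):
votes = {
    "A": {"joe", "suzanne", "james"},
    "B": {"joe", "suzanne"},   # dominated by A
    "C": {"james", "ada"},     # not dominated: ada endorses only C
}
front = pareto_front(votes)    # keeps A and C
```

Dropping B loses no participant: everyone who endorsed B also endorsed the dominating proposal A, so the front still represents all four participants.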
Fig. 1. A schematic representation of the genetic algorithm: a cycle of Proposing, Endorsing, and Selecting the Pareto Front, starting from the Question and ending in an Answer
We have already discussed in the introduction how we moved from framing the problem as finding the best solution among a given set to finding the best solution in an open context. Once we have framed the problem as a search problem, using a genetic algorithm is a natural choice. What we are looking for is a solution expressed as a text describing a set of actions. If we were trying to produce such solutions automatically we would encounter three different problems, each unsolved so far. First of all, there is no automatic way to translate a text into a set of actions. Solving this problem would be equivalent to solving the general problem of automatically extracting the semantic meaning of a text. Although a lot of research has been done, the results generally apply only to very specific cases. The second problem is to select and filter the actions that are meaningful and possible from the ones that simply do not make any sense. This is a second unsolved problem, which would require a computer program to have an understanding of the physical world and its laws. And finally, the genetic algorithm would have to be able to evaluate which solutions are better than others. None of these problems has been solved so far. Similarly to [4], we decided to let human beings produce the possible solutions, and again to let human beings evaluate them. Evaluating the proposals, and selecting which to pick for the next generation, is in itself not a straightforward task. There are three requirements that need to be satisfied: (1) we want the algorithm
Using Pareto Front for a Consensus Building
to converge to a few solutions that eventually (over several generations) give rise to a single answer; (2) we want the set of solutions to represent the whole panorama of possibilities; and (3) we want every participant to feel that the solutions they endorse are part of that panorama. In the literature there are different threads of research that have studied this problem. If we look at it as a voting problem, there is the whole of voting theory that has been developed; if we look at it as a genetic algorithm problem, there is also a huge literature on the subject. As a voting problem, the assumption has always been that each voter has an equally valid point of view, so it seemed natural and fair to just sum the votes. But each voter really represents a different dimension on which each of the proposals is evaluated. If we sum the votes, we are evaluating a bunch of points in a one-dimensional space, and we just need to decide where to stop: how many proposals should be admitted to the next generation. At this point the proposals are also playing a transitive game: if proposal A wins over proposal B (i.e., A has more votes than B), which wins over proposal C, then A wins over C. But we are effectively throwing away a lot of information; in particular, we are ignoring who has voted for what proposal. So the result is suboptimal, in the sense that many participants are not really represented in the result. If instead we keep this information, the problem moves from a transitive game to an intransitive one: now a proposal A can be better than a proposal B (according to Joe), which can be better than proposal C (according to Suzanne), which can again be better than proposal A (according to James). In this case no proposal is inherently better, and we are faced with what is called an intransitive cycle.
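The Joe/Suzanne/James situation is exactly the classic Condorcet cycle, and a few lines suffice to check that per-voter preferences can make pairwise majority intransitive (the concrete rankings below are illustrative, not from the paper):

```python
# Each participant ranks the proposals, index 0 being the most preferred.
rankings = {
    "Joe":     ["A", "B", "C"],
    "Suzanne": ["B", "C", "A"],
    "James":   ["C", "A", "B"],
}

def prefers(voter, x, y):
    """True if `voter` ranks proposal x above proposal y."""
    r = rankings[voter]
    return r.index(x) < r.index(y)

def majority_prefers(x, y):
    """True if a strict majority of voters rank x above y."""
    return sum(prefers(v, x, y) for v in rankings) * 2 > len(rankings)

# A beats B, B beats C, yet C beats A: pairwise majority is intransitive.
print(majority_prefers("A", "B"),
      majority_prefers("B", "C"),
      majority_prefers("C", "A"))  # True True True
```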
In the field of genetic algorithms, Bucci and Pollack [1] have shown that it is possible to escape those intransitive cycles by using Pareto fronts. This moves us back from the potentially dangerous territory of intransitive cycles to a transitive game: a proposal A dominates a proposal B if and only if A is greater than or equal to B in all dimensions (no participant prefers B to A) and there is at least one dimension in which A is strictly greater than B (at least one person prefers A to B). So if A dominates B, and B dominates C, then A dominates C. This permits us to drop all the proposals that are dominated by another one; what remains is called a Pareto front. In doing so we do not lose the representation of any participant: if proposal A dominates B and we drop B, all the participants that endorsed B will also have endorsed A, so by ignoring B we are not ignoring any participant's input; it is all present in A. Assuming that each person has endorsed at least one proposal (a safe assumption, since at the very least they would have endorsed the proposal they wrote themselves), each person will be represented in the Pareto front. This solution also settles the question of how many proposals to carry over to the next generation: we take all and only the proposals of the Pareto front. If we took fewer, the solution would not be inclusive, and we would run the risk of not representing all the participants; if we took more, we would be keeping a solution that is unnecessarily redundant.
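With each participant treated as a dimension and endorsements as binary coordinates, dominance reduces to strict inclusion of endorser sets. A minimal sketch (the function and sample data are ours, not the paper's implementation):

```python
def pareto_front(endorsements):
    """endorsements: proposal -> set of participants endorsing it.
    A dominates B iff endorsers(A) is a strict superset of endorsers(B):
    with one dimension per voter, set inclusion is coordinate-wise >=."""
    def dominated(b):
        return any(endorsements[b] < endorsements[a]   # strict subset
                   for a in endorsements if a != b)
    return {p for p in endorsements if not dominated(p)}

votes = {
    "A": {"joe", "suzanne"},
    "B": {"joe"},     # dominated: everyone endorsing B also endorsed A
    "C": {"james"},   # kept: james is represented by no other proposal
}
print(sorted(pareto_front(votes)))  # ['A', 'C']
```

Note that every participant who endorsed anything remains represented in the front, and proposals with identical endorser sets are both kept, since neither strictly dominates the other.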
4 Limits, Problems and Variations
While the basic algorithm is quite simple, there are a number of possible variations that should be tested. Some of them are presented here.

Anonymity of the proposals. Should proposals be anonymous, or should the people endorsing them be allowed to know who wrote a particular proposal? If the proposals are anonymous, the participants are forced to read a whole proposal before endorsing it. Will they be able to understand it? This depends on the topic and on the group. The philosophy behind this choice requires each proposal to stand on its own two legs, and not be endorsed thanks to the popularity of the person writing it. On the other side, if the proposals are not anonymous, people who are interested in the topic but lack the technical knowledge to understand its subtleties can still participate in the vote. In a system where the key element is a comparison between sets of voters, rather than the actual counting of endorsers, the anonymous choice seems the natural one.

Anonymity of the endorsements. After the endorsing phase, the Pareto front of the proposals is fed back to the participants, who are asked to write new proposals. When this is done, the names of the participants that have endorsed each proposal can also be made public, or not. Making this information public permits the participants to understand which are the major proposals; it also works as a light social control system to discourage participants from abusing the system in an antisocial way. In this system each user has considerable power: if a participant writes a proposal and then endorses only his own proposal, he can be sure that his proposal will be present in the Pareto front. This behavior is legitimate, and even socially acceptable, when a person honestly disagrees with all the proposals being presented; but it can be abused as a way to protest.
By letting everybody see who has voted for what, this kind of antisocial behavior is exposed and generally ceases. On the other hand, an anonymous system permits everybody to endorse what they truly believe in. So in this case both possibilities make sense, and both should be tested.

Who is allowed to write, who is allowed to endorse? In our description we assumed that everybody who was allowed to write a proposal was also allowed to endorse. This does not necessarily need to be so. If the participants who are allowed to propose are a subset of the participants who are allowed to endorse, we have a situation similar to a modern democracy, where a few people define the options for everybody else. We are not particularly interested in this situation, as it has already been tested enough in modern democracies. If instead the participants who are allowed to suggest proposals are a superset of the set of endorsers, then we have a situation where a community discusses an issue and external people are allowed to insert new ideas. This situation has rarely been tested. Another, different, possibility is a situation where no one who
has proposed something is allowed to participate in the endorsing phase. This can be done by splitting the group into two subgroups at the beginning and keeping every proposal anonymous. This last option might produce interesting results.

Changing the Participants During the Process. So far we have assumed that the same participants that have written proposals in one generation will also write in the next. This would probably be the optimal situation because, since the people participating are also the ones evaluating the results, it defines a static fitness landscape on which the genetic algorithm can climb. Unfortunately this is not always possible. Since this decision system requires multiple voting periods and, in general, a protracted interaction between the users and the system, it is possible (and even common) that the participants change between generations (or even between phases within a generation). If the community is big enough this is not necessarily a problem: provided there are enough participants to represent the various possible ideas, the system can keep on finding an answer. If the community is small, changing the participants half way seems to produce the most unreliable results, as it amounts to running a genetic algorithm where the fitness function changes from one generation to the next. Unfortunately, most of the tests of this method that we have been able to run so far suffered from this problem.

Real Questions versus False Questions. Although the work is still preliminary, we have already noticed an interesting pattern. We tried the algorithm several times, on various questions. Every time the question was a real question, among real participants whose lives would be affected by the result, the algorithm seemed to work better: it would act in a more predictable way and converge more rapidly, and when more than one result was in the Pareto front, the participants would try harder to synthesize an acceptable compromise.
When instead the question was irrelevant, the answers were random, the endorsing was random, and the algorithm did not seem to converge easily (if it converged at all). All this seems to suggest that the algorithm is indeed exploring a space of possibilities. When the question is a real question, there is a definite fitness space to be explored, with peaks, valleys, and neutral ridges; when the question is irrelevant (to the participants), the algorithm is unable to find any real synthesis because no real synthesis is there to be found. This suggests that future work should focus on participants that are really invested in the results of the procedure.
5 Partial Results and Conclusions
At the moment we have only done preliminary studies on the subject. We tried the method among six participants with pen and paper. We then implemented the algorithm on a website (http://pareto.ironfire.org) and invited some testers to try it out. The results are promising, but not consistent enough to make any statistical case; we thus present them only as an anticipation of future work. In the pen and paper example, the question posed was: “We are going for one month
together, on vacation, this summer. What shall we do?”. This test lasted only two generations. In the first generation the answers were: “go camping”; “help my grandmother with her garden” (from participant ‘D.’); “join a construction site and build a house”; “go biking in eastern Europe”; “go to Thailand”; “go to Canada”. After the evaluation process the Pareto front included only two proposals: “go to Canada” and “help my grandmother with her garden”. Then the participants were invited to write new proposals. Five out of six proposals suggested to “go to Canada with D.’s grandmother, and ...”. The sixth proposed to pay a gardener for D.’s grandmother, and then go to Canada. We notice here an interesting result: the proposals seem to get more complex as the generations pass, as if the algorithm started by exploring the space of possibilities in a more general way and then became more precise in successive generations. Everybody is effectively trying to mediate between the elements. It was also interesting that each person tried to reinsert what they really cared for into their next proposal. For example, the participant that first suggested to “go biking in eastern Europe” suggested in the second generation to “go to Canada with D.’s grandmother, and go biking, after leaving D.’s grandmother in a camping site.” In summary, a decision-making algorithm was presented that permits a community to investigate and discover the most widely endorsed proposal answering a given question. A number of possible variations were discussed and some partial results were presented. Future work includes testing with bigger communities, for longer times, as well as testing the effect of the possible variations.
Acknowledgment. We thank Giovani Spagnolo, Manuel Barkau, Christopher Ritter, and the people who participated in our tests for the interesting discussions.
References

1. Bucci, A., Pollack, J.B.: A mathematical framework for the study of coevolution. In: Foundations of Genetic Algorithms, vol. 7, pp. 221–235. Morgan Kaufmann, San Francisco (2003)
2. Taylor, A.D., Pacelli, A.M.: Mathematics and Politics: Strategy, Voting, Power and Proof, pp. 1–377. Springer (2008)
3. Kosorukoff, A.: Human based genetic algorithm. In: 2001 IEEE International Conference on Systems (January 2001)
4. Defaweux, A., Grosche, T., Karapatsiou, M., Moraglio, A., Shenfield, A.: Automated concept evolution. Technical Report, Vrije Universiteit Brussel, Belgium (2003)
The Universal Constructor in the DigiHive Environment Rafał Sienkiewicz and Wojciech Jędruch Gdansk University of Technology, Gdansk, Poland {Rafal.Sienkiewicz,Wojciech.Jedruch}@eti.pg.gda.pl
Abstract. The paper describes a universal constructor model realized in an artificial environment called DigiHive. The environment is a two-dimensional space containing stacks of hexagonal tiles that are able to move, collide, and make bonds between them. On a higher level of organization, a structure of tiles specifies a function whose execution affects other tiles in its neighborhood. After a short description of DigiHive, the paper describes the design of a universal constructor and discusses the possibilities of simulating self-replicating strategies in the environment.
1 Introduction
Individual- and agent-based modeling are natural tools for modeling basic biological phenomena. Using this approach, universal or problem-oriented modeling environments (artificial worlds) can be constructed. The DigiHive environment [1,2] (see also the precursors of the environment [3,4]) is an artificial world aimed at modeling various systems in which a complex global behavior emerges as a result of many simultaneous, spatially distributed, local and simple interactions between its constituent entities. It is especially convenient for modeling basic properties of self-organizing, self-modifying and self-replicating systems. A short description of the DigiHive system is given in the first part of the paper. The various strategies of self-replication usually involve the existence of a universal constructor: a device which can build any entity based on its description, using surrounding building materials. The design of the universal constructor in the DigiHive environment is described in the second part of the paper. The design process must take into account the fact that all entities in the environment are in continuous movement. The design constitutes the base for testing a series of models of various self-reproducing strategies: their efficiency, speed, and ability to evolve.
2 The DigiHive Environment
The environment is a two-dimensional space containing objects which move, collide and change their structure. The constituent objects of the environment, called particles, are represented by hexagonal tiles. The particles are of 256 types G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 183–190, 2011. © Springer-Verlag Berlin Heidelberg 2011
and are characterized by velocity, position, and internal energy. Each type is associated with a set of constant attributes: mass, bond energy (needed to disrupt a bond between particles of given types), and activation energy (needed to initiate any change of bonds). Particles can bond together, forming a complex of particles. The permanence of the complex depends on the bond energies of its constituent particles. Particles can bond vertically, in the directions up and down, forming stacks; the particles on the bottom of a stack can bond horizontally. An example of a stack of particles and a complex formed by horizontal bonds are shown in Fig. 1.

Fig. 1. Examples of complexes: (a) horizontal view of a single stack of particles with directions shown (U, N, NE, SE, S, SW, NW, D), and (b) vertical view of a complex formed by horizontal bonds, where hexagons drawn with single lines represent single particles, hexagons with double lines represent stacks of particles, and black dots mark horizontal bonds between particles
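A minimal data model for what the text describes might look as follows. This is an illustrative sketch, not the DigiHive implementation; the per-type constants (mass, bond energy, activation energy) live in a separate table and are omitted here.

```python
from dataclasses import dataclass, field

# Horizontal neighbours of a hexagon; U(p) and D(own) are the vertical bonds.
DIRECTIONS = ("N", "NE", "SE", "S", "SW", "NW")

@dataclass
class Particle:
    ptype: int                    # one of 256 types, i.e. an 8-bit value
    velocity: tuple = (0.0, 0.0)
    position: tuple = (0.0, 0.0)
    energy: float = 0.0
    up: "Particle" = None         # vertical bonds form stacks
    down: "Particle" = None
    horizontal: dict = field(default_factory=dict)  # direction -> neighbour

def stack_types(bottom):
    """List the 8-bit types in a stack, bottom to top."""
    types, p = [], bottom
    while p is not None:
        types.append(p.ptype)
        p = p.up
    return types
```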
There are two types of collisions of particles and complexes: elastic and inelastic, the type being chosen randomly according to a preset probability. During an elastic collision, the resulting velocities are calculated according to the rules of classical mechanics for the collision of two disks (the circles surrounding the hexagons): conservation of momentum and of kinetic energy is observed. After an inelastic collision the velocities of both particles are equalized: conservation of momentum is observed, but the resulting decrease of the kinetic energy of the particles is compensated by the emission of a photon. Photons move and collide with particles (but not with other photons). The collisions result in creating or removing a bond between the hit particle and another particle close to it. When the probability of inelastic collision is set to 0, the properties described above allow the simulation of classical mechanical processes, the equivalent of simple molecular dynamics. When the probability is set to a non-zero value, some photons will appear as a result of collisions, and the reactions triggered by photons will lead to the creation of various complexes of particles.

2.1 Functions
Besides the reactions resulting from the collision of particles with photons, there also exists an additional class of interactions, in which complexes of particles are
capable of recognizing and manipulating particular structures of particles in the space around them. The description of the function performed by a complex is contained in the types and locations of the particles in the complex: the structure of the complex is interpreted as a program written in a specially defined declarative language. The syntax of the program encoded in a complex of particles is similar to the Prolog language, using the following predicates only: program, search, action, structure, exists, bind, unbind, move, not. The predicates program, search, action and structure help maintain the structure of the program. The other predicates are responsible for the selective recognition of particular structures of particles (exists) and for their manipulation (bind – create bonds, unbind – remove bonds, and move – move particles).
program():–
    search(),
    action().
search():–
    structure(0).
structure(0):–
    exists([0,0,0,0,0,0,×,×], mark V1),
    exists([1,1,1,1,1,1,1,1] bound to V1 on N, mark V2),
    exists([0,0,0,0,0,0,0,0], mark V5),
    not(structure(1)),
    not(structure(2)).
structure(1):–
    exists([1,1,1,1,0,0,0,0] bound to V2 on NW, mark V3),
    exists([1,1,1,1,0,0,0,0] bound to V3 on SW, mark V4),
    not(structure(3)).
structure(3):–
    exists([0,0,0,0,1,1,1,1] bound to V4 on S).
structure(2):–
    exists([1,0,1,0,1,0,1,0]).
action():–
    bind(V2 to V5 on SW).

Fig. 2. Example of a program recognizing the structure shown in Fig. 3
An example of a program is presented in Fig. 2. The program recognizes the structure shown in Fig. 3 and then binds the particle 11111111 to the unbound particle 00000000. The predicate program consists of exactly two predicates, search and action. The first calls the searching predicates, while the second calls the predicates responsible for performing actions in the environment. The predicate search calls the predicate structure. The predicate structure consists of a sequence of exists predicates and/or other structure predicates, the latter always appearing under the negation not. This provides the ability to recognize a particular
Fig. 3. Single particle and a complex of particles recognized by the program listed in Fig. 2 (a), and the structure after action of the program (b)
structure only in case some other structure does not exist. In the example, the particle of type 10101010 plays the role of a reaction inhibitor (its existence would prevent the program from being executed). With the predicate exists it is possible to check various conditions, e.g.: the existence of a particular particle type (exists([0,0,0,0,1,1,1,1] ...)), or whether a particle is bound to some other particle on a given direction (exists(... bound to V2 on N ...)), etc. It is also possible to mark the particle which fulfills the predicate condition with one of 15 labels (variables V1 to V15); e.g., exists([1,1,1,1,0,0,0,0] bound to V2 on NW, mark V3) means: find the particle of type 11110000 which is bound to the particle marked as V2 on direction NW, and store the result in variable V3 (mark the found particle as V3). In fact, the only result of the search predicate is a state of the variables V1 to V15 in which the particles fulfilling the conditions grouped in all the structure predicates are stored. Since the searching is performed via a Prolog-like backtracking algorithm, the order of execution inside this predicate does not matter (besides its performance impact). The predicates grouped in the action predicate are a sequence of predicates that affect the environment. Contrary to the previous case, the order of execution is important: this part of the program acts as an imperative (not declarative) sequence of commands which operates on the data provided by the declarative search part. An important feature of the particle complex language is that small changes in the code of a program (i.e., in the particles in which the program is encoded) usually lead to small changes in its execution effects. Such a property of the language is crucial for the simulation of self-organizing phenomena; a simulation that demonstrates this property was presented in [2]. The program is encoded by a complex of particles.
Each predicate structure is represented by a single stack of particles; such a stack encodes a list of exists predicates. The stack which encodes structure(0) also encodes the predicates bind, unbind and/or move. Adjoining stacks encode the negated forms of the structure predicates. More details can be found at the project website [1]. The property of the environment that complexes of particles perform functions on other particles and, a moment later, can themselves be the object of manipulation by another complex opens wide possibilities for modeling a variety of self-modifying systems and regulation chains.
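As a sketch of the type tests underlying exists, the following toy matcher treats × entries as wildcards and checks an optional bond condition against already-marked particles. It is greedy, whereas the paper's search backtracks over all candidates; the names and data layout here are ours, not DigiHive's.

```python
def matches(pattern, bits):
    """Type test with × wildcards: pattern and bits are 8-element tuples;
    '×' in the pattern matches either bit value."""
    return all(p == "×" or p == b for p, b in zip(pattern, bits))

def exists(particles, pattern, marks, bound_to=None, on=None, mark=None):
    """Greedy sketch of the exists predicate: find a particle whose type
    matches, optionally required to be the neighbour of an already-marked
    particle on direction `on`, and store it under `mark`."""
    for p in particles:
        if not matches(pattern, p["type"]):
            continue
        if bound_to is not None and marks[bound_to]["bonds"].get(on) is not p:
            continue
        if mark is not None:
            marks[mark] = p
        return True
    return False
```

For a particle of type 00000011 bonded on N to a particle of type 11111111, the two calls exists(..., (0,0,0,0,0,0,"×","×"), marks, mark="V1") and exists(..., (1,)*8, marks, bound_to="V1", on="N", mark="V2") succeed and leave both particles marked, mirroring the first two lines of structure(0) in Fig. 2.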
3 The Universal Constructor
The universal constructor is a concept introduced by von Neumann in his famous work on self-replicating cellular automata [5] (see also [6] for more recent ideas). The universal constructor in the DigiHive environment is a structure A (a complex of particles) able to construct another structure X based on its description d(X). It is admissible to construct the structure X via the description d(X′) of an intermediate structure X′ ≠ X that is able to transform itself into X. The universal constructor is a consistent structure (a set of programs) able to fulfill the following tasks: 1. Search for a valid information structure (information string) d(X). The information string encodes the description of some structure X. The description may be viewed as another program, written in a simple universal-constructor language with the following commands: PUT (add the specified particle to the stack), SPLIT (start construction of a new stack connected horizontally to the existing one), NEW (start construction of a new stack not connected to the existing one), and END (end of the information string). The information string is a stack of particles of specified types, as described in Fig. 4a. As an example, the following program: PUT(01010101) PUT(01010101) END
2. connect itself to the found information string and start constructing the structure X. The structure X consists of horizontally joined stacks of particles. There is always exactly one stack of particles being build at the moment, called active stack X , 3. sequentially process the joined information string: (a) if current particle in the information string encodes command PUT – find the particle of specified type which is on the top of stack of building material. The building material is contained in stacks of particles (named M ) marked by the particle at the bottom (material header) of the type 0000××10 (see Fig.4b). The newly found particle is removed from the top of the stack M , and put on the top of the stack X (Fig. 5a). (b) if current particle in the information string encodes command SPLIT – split the stack X into two stacks: remove particle from top, move the trimmed stack in the specified direction, and create horizontal bond between X and removed particle. The particle becomes the bottom of active stack X . This action is presented in the figure 5b,
Fig. 4. Encoding stack (a) and building material (b). (a) The encoding stack, bottom to top: a PUT header (×,×,×,×,×,×,0,1) followed by a particle type; a NEW header (1,1,1,1,1,1,0,1) followed by a particle type; a SPLIT header (×,×,×,×,×,×,1,1) followed by a direction; and END (×,×,×,×,×,×,×,×) at the top. (b) The building material: a material header (0,0,0,0,×,×,1,0) at the bottom with particles (×,×,×,×,×,×,×,×) stacked above it.

Fig. 5. Illustration of the actions of the universal constructor during the processing of the information string: (a) action caused by the command PUT – the particle of the specified type is put on the top of the active stack X′ (drawn using thicker lines); (b) action caused by the command SPLIT – the particle is removed from the top of X′, then a bond is created in the specified direction, and the particle becomes the new active stack X′.
(c) if the current particle in the information string encodes the command NEW – disconnect the structure X and immediately start the construction of a new structure, with the specified particle as the beginning of X (a single information string can thus encode several structures, e.g. both d(B) and d(A); see also the additional note on the use of the NEW command at the end of this section); (d) if the current particle in the information string encodes the command END – release the information string and the constructed structure.

The universal constructor was implemented as a set of 10 cooperating programs able to fulfill the tasks described above, enhanced with 5 helper stacks of particles. The helper stacks are used mainly for synchronization between
the working programs; they also mark some characteristic parts of the universal constructor, e.g. the place where the new structure is built, the place where the universal constructor joins the encoding string, etc. In order to illustrate the abilities of the universal constructor, it was provided with an information string describing a flat, rhombus-shaped complex. As a result, the programmed structure was successfully built. Simulation screenshots are presented in Fig. 6. Note that after the complex is finished, the constructor is immediately ready to connect to the same information string again and start a new translation. Such behavior would cause the constructor to keep producing copies of the same structure. This undesirable side effect can be resolved by encoding a simple program which can disable the information string, i.e. prevent it from being recognized by the constructor. The program can be encoded together with the main structure in one information string; during translation, the program can be separated from the main structure by the NEW command. It is also possible to encode another program which can re-enable the information string (e.g. at the beginning of the string, in order to guarantee the existence of at least one such program after the reaction). Note that the efficiency of the universal constructor reaction can then be adjusted by the concentration of both types of programs.
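The command encoding and its execution can be sketched as follows. The header patterns are our reading of Fig. 4a and the worked example: we assume the END marker 11111111 and the exact NEW header 11111101 are tested before the generic PUT (...01) and SPLIT (...11) patterns, since they would otherwise match those patterns too. This is an illustrative sketch, not the paper's implementation.

```python
def decode(stack):
    """stack: list of 8-character bit strings, bottom to top."""
    cmds, i = [], 0
    while i < len(stack):
        t = stack[i]
        if t == "11111111":                       # assumed END marker
            cmds.append(("END",))
            break
        if t == "11111101":                       # exact NEW header
            cmds.append(("NEW", stack[i + 1])); i += 2
        elif t.endswith("11"):                    # SPLIT header; argument: a direction
            cmds.append(("SPLIT", stack[i + 1])); i += 2
        elif t.endswith("01"):                    # PUT header; argument: a particle type
            cmds.append(("PUT", stack[i + 1])); i += 2
        else:
            break                                 # not a valid header
    return cmds

def construct(cmds):
    """Execute commands; each finished structure is a list of stacks, each
    stack a bottom-to-top list of particle types (the bond directions carried
    by SPLIT are not modelled in this sketch)."""
    structures, current, active = [], [], []
    for cmd in cmds:
        if cmd[0] == "PUT":
            active.append(cmd[1])
        elif cmd[0] == "SPLIT":
            top = active.pop()          # removed top particle ...
            current.append(active)
            active = [top]              # ... starts the new active stack
        elif cmd[0] == "NEW":
            current.append(active)
            structures.append(current)  # release the structure built so far
            current, active = [], [cmd[1]]
        elif cmd[0] == "END":
            current.append(active)
            structures.append(current)
    return structures

cmds = decode(["00000001", "01010101", "00000001", "01010101", "11111111"])
print(cmds)  # [('PUT', '01010101'), ('PUT', '01010101'), ('END',)]
```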
Fig. 6. Construction process after 0 (a), 26 (b) and 88 (c) simulation cycles
4 Strategies of Self-reproduction
During the design of the universal constructor, several technical problems were encountered whose solution would require subtle programming tricks, leading in some ways to a superfluous structure of the constructor. For example, when the constructor A works with its own description d(A), i.e. when the constructor builds its own copy, the problem is to prevent the activity of the partially constructed structure. This structure should not manifest any activity before it is completely finished; otherwise the simulation can become unpredictable (e.g. the unfinished constructor may start producing a copy of itself, etc.). On the other hand, the constructor that builds the structure should not recognize this structure as a part of itself. Another problem is the geometry
of the constructed structure. Sequentially adding subsequent stacks in various directions may lead to a situation where the built structure tries to occupy a place already occupied by the constructor itself. Moreover, sequential adding of stacks produces complexes only in the form of one-dimensional strings of stacks. Taking into consideration the tradeoff between the above problems and the simplicity and efficiency of the solutions, it has been assumed that the constructor need not be fully universal: some structures cannot be built straightforwardly. These limitations can be fully compensated using the wide possibilities offered by the actions of the functions encoded in complexes. The problem of constructing an arbitrary structure can be resolved in one of the following ways:

1. By constructing some intermediate structure I′ able to transform itself, in a finite number of time cycles, into the desired one, I:

A + d(I′) → A + d(I′) + I′ → A + d(I′) + I

2. By constructing a set of programs I1, I2, . . . , In able to build the structure I in a finite number of time cycles:

A + Σ_{i=1}^{n} d(Ii) → A + Σ_{i=1}^{n} d(Ii) + Σ_{i=1}^{n} Ii → A + Σ_{i=1}^{n} d(Ii) + Σ_{i=1}^{n} Ii + I

5 Conclusions and Further Research
A universal constructor in the DigiHive environment has been designed and implemented. The limitations of the design have been discussed, and ways of compensating for them have been shown. The aim of further research is to implement and compare the effectiveness of various possible self-reproduction strategies and their ability to evolve in time.
References

1. Sienkiewicz, R.: The DigiHive website, http://www.swarm.eti.pg.gda.pl/
2. Sienkiewicz, R., Jedruch, W.: Artificial environment for simulation of emergent behaviour. In: Beliczynski, B., Dzielinski, A., Iwanowski, M., Ribeiro, B. (eds.) ICANNGA 2007. LNCS, vol. 4431, pp. 386–393. Springer, Heidelberg (2007)
3. Jędruch, W., Barski, M.: Experiments with a universe for molecular modelling of biological processes. Biosystems 24(2), 99–117 (1990)
4. Jedruch, W., Sampson, J.R.: A universe for molecular modeling of self-replication. Biosystems 20(4), 329–340 (1987)
5. von Neumann, J.: Theory of Self-Reproducing Automata. University of Illinois Press, Champaign (1966)
6. Hutton, T.J.: Evolvable self-reproducing cells in a two-dimensional artificial chemistry. Artificial Life 13(1), 11–30 (2007)
A Local Behavior Identification Algorithm for Generative Network Automata Configurations Burak Özdemir and Hürevren Kılıç Department of Computer Engineering, Atılım University, İncek, Gölbaşı, 06836, Ankara, Turkey
[email protected],
[email protected] http://ceng.atilim.edu.tr
Abstract. The relation between the part and the whole is investigated in the context of complex discrete dynamical systems. For that purpose, an algorithm is developed for identifying local behavior from global data described as Generative Network Automata model configurations. It is shown that one can devise a procedure to simulate finite GNA configurations via Automata Networks with a static rule-space setting. In practice, the algorithm provides an automated approach to model construction and can suitably be used in GNA-based system modeling efforts. Keywords: Generative Network Automata, Automata Networks, Inverse Problem, Identification, Discrete Dynamical Systems.
1 Introduction
In the context of complex dynamical systems, one can speak of two general approaches to the study of behavior. The first is to start with global information describing the state and topology changes of the system and to identify the local state transition/topology transformation functions that generate them (i.e., the inverse problem). The second is to start with some state transition/topology transformation functions at hand and observe the resulting behavior of the constructed system through local function execution (i.e., the forward problem) [1][2][3]. In this study, we consider the first problem in the context of Generative Network Automata (GNA), a varying-topology extended modeling framework for complex dynamical systems [4]. The GNA framework provides a well-defined environment for examining potential relations and mechanisms between the part and the whole. Different from Automata Networks (AN) modeling, in which a large set of locally interconnected cells evolves at discrete time steps through mutual cell interactions [5], the GNA modeling framework supports time-varying connectivity among automata components, which results in autonomous transformations of complex dynamical networks. To the best of the authors' knowledge, the first appearance of Automata Networks in the literature is due to the work
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 191–199, 2011. © Springer-Verlag Berlin Heidelberg 2011
of John von Neumann [6], aimed at modeling various phenomena in physics and biology. When considering discrete systems, one may encode any entity in bits. Here, the coded entity can be any value representing some property of the system, and if the coded entity changes its value over time, the system becomes suitable for modeling by, for example, Cellular Automata [7] or, in general, by AN. In fact, the encoding can also be applied to GNA inter-component connectivity information. So, one can devise a procedure to identify (and then simulate) finite GNA configuration sequences via AN with static rule-space descriptions. This paper describes how this idea can be realized for a restricted type of discrete dynamical system. The existence of the proposed algorithm also proves (by construction) the generatability of seemingly autonomous behavior by designed behavior. Simply put, the output of the algorithm is an instance of AN constituted by i) an identified set of encoded and coupled state/connectivity transition rules and ii) a static AN topology (represented by what we call an overlay level neighborhood graph) describing the structure and flow of state/connectivity information between AN components. In practice, the algorithm provides an automated approach to model construction and can suitably be used in GNA-based automated system modeling efforts. In Section 2, we give the necessary definitions and assumptions made throughout the paper. The proposed algorithm is described in Section 3. Section 4 concludes with discussions of the applicability of the approach to different discrete dynamical network domains and possible future work. Throughout the text, we use the terms component and node interchangeably.
2 Definitions and Assumptions
A GNA is a dynamical network described by a time-varying directed graph with labeled nodes. Each node of the graph represents one system component and takes one of the possible state values defined by a state value set S. The edges are indicators of referential relationships between system components that cause state transitions and topology transformations in time. Below, we give definitions of a GNA configuration and its temporal dynamics [4].

Definition 1: A configuration of a GNA at time t is a triplet ⟨Vt, Ct, Lt⟩ where
Vt : a finite set of nodes at time t describing the dimension of the system,
Ct : a mapping Vt → S defining the global state of the system at time t,
Lt : a mapping Vt → Vt* defining the global topology of the system at time t; this mapping is defined by the outgoing edges of a specific node to its destination nodes.

Definition 2: Let G be the set of all GNAs. The temporal dynamics of a GNA is defined by a triplet ⟨E, R, I⟩ where
E : a mechanism that produces an extraction ge ∈ G from a given GNA instance g ∈ G,
R : a mechanism that replaces the extracted subGNA ge with a new GNA instance gr ∈ G and produces a correspondence Vt^ge → Vt^gr from the nodes of ge to the nodes of gr,
I : an initial configuration of the GNA.
The extraction and replacement mechanisms can be deterministic or stochastic. The extraction mechanism E, replacement mechanism R and initial configuration I are sufficient to define GNA-based dynamic models and reflect a perspective (defined by the ⟨E, R, I⟩ structure) for explaining the complex dynamic behavior of the systems under study. Clearly, depending on the amount of information provided by the given global behavior data, there may be many GNA-based models that an identification algorithm could identify. This fact has also been pointed out in the context of reverse engineering of Automata Networks [7]. So, one needs to state one's assumptions about the characteristics of the system model under automated identification/construction for better precision. The assumptions/restrictions can be treated as a priori knowledge about the system under study, which may vary from biological organisms, ecological communities and human societies to computer communication networks. Our assumptions include the following:
• The local behavior of system components is supposed to be deterministic. As a consequence, the whole system behavior is deterministic, which is reflected by the global behavior data under study.
• Components have minimal memory. In their update, components consider only the one-step-previous state/connectivity of other system components (possibly including themselves).
• The system evolves synchronously in terms of its component state and network connectivity updates.
• The global behavior of the system is supposed to be cyclic. In other words, the data reflecting the last configuration is supposed to be followed by the first one, for completeness.
• We divide the system into two parts: an observable part and a reservoir part (see Figure 1). The whole system is supposed to be isolated and its components distinguishable. However, the observable part of the system, from which the global behavior data input is extracted, is non-isolated, while its components are still distinguishable.
• Components are supposed to appear at least once in the observable part of the system during their lifetime.
• All system components are assumed to be known a priori in the form of a finite ordered set H defining an inter-component neighborhood relation based on the ordering.
• Component behaviors are supposed to be explainable by state/connectivity information fed from the nearest neighbor components defined by H.
• At any time, a component can be in one of three states represented by the symbols 0, 1 or 2 (i.e., S = {0, 1, 2}). A component residing in the observable part of the system may take state value 0 or 1. On the other hand, the state value of a component in the reservoir part can only be 2. This value represents the unknown states of separate/disconnected components.
From Figure 1, we can read off the set of system components H = {p, q, r, s, t, u, v, w, x, y, z}; the finite set of components at time t, Vt = {p, r, t, v, w, y, z}; the component states Ct(r) = Ct(t) = Ct(v) = Ct(z) = 0 and Ct(p) = Ct(w) = Ct(y) = 1; and the destination component sets for each component, Lt(p) = {z}, Lt(r) = {v}, Lt(t) = {p}, Lt(v) = {p, w}, Lt(w) = ∅, Lt(y) = {r, t, w}, Lt(z) = {t, y}. The state of any component in the reservoir part is unknown and symbolized by 2.
Fig. 1. An example system snapshot
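The snapshot of Fig. 1 can be encoded directly as a configuration triplet. The following Python sketch (component names are from the figure; the helper function and its name are ours) checks that the encoded Ct and Lt are consistent with Vt:

```python
# Encoding of the Fig. 1 snapshot as a GNA configuration <Vt, Ct, Lt>.
H = ["p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]  # all components
V_t = {"p", "r", "t", "v", "w", "y", "z"}                    # observable part

# States: 0/1 for observable components, 2 for reservoir components.
C_t = {"p": 1, "r": 0, "t": 0, "v": 0, "w": 1, "y": 1, "z": 0}
C_t.update({c: 2 for c in H if c not in V_t})

# Outgoing-edge mapping Lt (destination sets read off the figure).
L_t = {"p": {"z"}, "r": {"v"}, "t": {"p"}, "v": {"p", "w"},
       "w": set(), "y": {"r", "t", "w"}, "z": {"t", "y"}}

def is_valid_configuration(V, C, L, S=(0, 1, 2)):
    """Check that C maps every observable node into S and that every
    destination set of L stays inside the observable node set V."""
    return (all(C[i] in S for i in V)
            and all(dests <= V for dests in L.values()))

assert is_valid_configuration(V_t, C_t, L_t)
```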
3 The Algorithm
In this section, we define an identification algorithm for the restricted type of systems defined by the assumptions given in Section 2. The inputs of the algorithm are: a sequence of GNA configurations reflecting the restricted system's global behavior; a finite ordered set H of components from which the Vt sets are constructed; and the state/connectivity interaction mode considered in the identification process. A fundamental question we have to answer is: "How should component states and connectivity affect each other so that a given autonomously varying topology and state change sequence can be generated?" For that purpose, the algorithm considers three alternative state/connectivity interaction modes, which characterize the single-component behavior that drives the general GNA system evolution: reader mode, writer mode and transfer mode. For a given system, all components are supposed to behave in the same mode. In reader mode, component i is first supposed to identify its incoming edges and their corresponding source nodes Pt−1(i) (this set may also include the component itself) together with their states Ct−1(j), where j ∈ Pt−1(i), and then decide on its new incoming components Pt(i) and the component i's new state Ct(i).
Similarly, in writer mode, component i first determines its destination components Lt−1(i) through its outgoing edges and their corresponding states Ct−1(j), where j ∈ Lt−1(i); then this state and connectivity information is used to generate the new outgoing components set Lt(i) and the new state of component i, Ct(i). In transfer mode, component i is supposed to generate its new outgoing edges (i.e., destination components Lt(i)) and its state Ct(i) based on its incoming components Pt−1(i) and the states Ct−1(j), where j ∈ Pt−1(i), of the corresponding source components in the previous time step. In this mode, the effect of an individual component i on the whole system behavior is realized through the transformation of incoming connectivity and corresponding state information into outgoing connectivity and the component's state.
Table 1. Mode table - list of [(Connectivity, State)] → [(Connectivity), (State)] based component behavior modes for the minimal-memory restricted systems under study
In computations, Pt is assumed to be a mapping Vt → Vt* defining the global topology of the system at time t by the incoming edges of node i from its source nodes. We can relate Pt and Lt in such a way that for all j ∈ Pt(i) we require i ∈ Lt(j), and for all j ∈ Lt(i) we require i ∈ Pt(j). Clearly, one can obtain all Pt values from all Lt values. None of the proposed modes causes conflicts in terms of the decided state and connectivity values. One can consider
alternative component behavior modes that can explain the restricted system's global behavior in terms of (connectivity, state) input tuples generating (connectivity) and (state) outputs (see Table 1). In the mode table, a component behaving in modes 1 to 4 does not get any input from the others. In other words, the system components are disconnected; these modes have no meaning in our GNA context. Similarly, modes 1, 5, 9 and 13 do not produce any output; they do not define a dynamical system and so have no practical importance in our study. Component behavior modes 4, 8, 12 and 16 update both incoming and outgoing connectivity simultaneously, which may result in conflicts between components. Therefore, we did not consider these modes during the identification process. However, for these modes one could establish negotiation protocols and embedded component strategies as new constraints on the system, and the identification algorithm could be extended to cover them as well. Finally, modes 7, 14 and 15 are not considered here, but they can easily be added to the implementation. In the implementation of the algorithm, information about the existence of incoming or outgoing connectivity between components (defined by the sets Pt(i) and Lt(i)) is encoded as 0 and 1 for simplicity. Depending on the given configuration sequence, the deterministic behavior of an individual component may not be explainable by its own history alone. We call this an inconsistency. Inconsistencies are resolved by neighborhood expansion of the component in question, that is, by the establishment of what we call overlay level component dependency settings, which leads to the identification of an overlay level network topology. Initially, each component is assumed to depend only on itself; the only overlay level neighbor of a component is, initially, itself.
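The 0/1 connectivity encoding mentioned above can be realized as a bit vector over the ordered set H. A minimal illustrative sketch (names and example values are ours):

```python
def encode_connectivity(neighbors, H):
    """Encode the presence (1) or absence (0) of a connection to each
    component of the ordered set H, so that a set Pt(i) or Lt(i) becomes
    a fixed-length bit vector usable inside an AN rule."""
    return [1 if c in neighbors else 0 for c in H]

H = ["p", "q", "r", "s", "t"]
assert encode_connectivity({"p", "t"}, H) == [1, 0, 0, 0, 1]
assert encode_connectivity(set(), H) == [0, 0, 0, 0, 0]
```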
Any neighborhood expansion may continue at most until all system components have been added to the current overlay level neighborhood of the component under construction. Note that any inconsistency is guaranteed to be resolved, at worst, when the identified overlay network becomes fully connected. This is because the given global behavior of the system has been assumed to be deterministic. An inconsistency situation is thus resolved through component overlay neighborhood expansion realized over the finite ordered set H of system components, which is given as input. Below, we give the proposed identification algorithm IDENTIFY_GNA().
Inputs:
• a deterministic sequence of k + 1 GNA configurations ⟨Vi, Ci, Li⟩ labeled from S0 to Sk, where 0 ≤ i < k and S0 = Sk,
• a finite ordered set H of system components such that Vi ⊆ H for all 0 ≤ i < k,
• a state/connectivity interaction mode: reader/writer/transfer.
Outputs:
• an identified set of encoded and coupled state/connectivity transition rules Ti for each system component i ∈ H,
• an overlay level neighborhood graph F describing the flow of state/connectivity information between system components.
Algorithm IDENTIFY_GNA():

    For each component i {
        Set rule set Ti to ∅;
        Set component i's overlay level neighbor set Fi to {i};
    }
    For each component i {
        If (mode = reader or mode = transfer)
            Construct the sets Pt(i) by using the Lt(j) of the other components in the given sequence;
    }
    For each component i {
        t = 0;
        While t < k do {
            Switch (mode) {
                case reader:   Construct next rule u by using Pt(i), Pt+1(i), Ct(j) and current Fi;
                case writer:   Construct next rule u by using Lt(i), Lt+1(i), Ct(j) and current Fi;
                case transfer: Construct next rule u by using Pt(i), Lt+1(i), Ct(j) and current Fi;
            }
            If (u is consistent with all the rules in Ti) {
                If (u ∉ Ti)
                    Set Ti = Ti ∪ {u};
                t = t + 1;
            } Else {
                Set rule set Ti to ∅;
                Find the next overlay level nearest neighbor component n for i by using the ordered set H;
                Set Fi = Fi ∪ {n};
                t = 0;  (start the while loop over)
            }
        }
    }
    Return all sets Ti and the graph F;

Note that the algorithm does not directly produce the ⟨E, R, I⟩ triplet describing the temporal dynamics of the GNA; instead, its outputs are transition rules describing component level local behavior together with the overlay level neighborhood topology. In fact, the outputs describe nothing but an Automata Network. The left-hand sides of the identified AN transition rules describe the parts to be extracted from the GNA as single-component subGNAs, and the right-hand sides define the parts that replace the extracted left-hand sides on the same single-component subGNA scale. The collective extraction and replacement effort of the components results in sequence generation. The AN transition rules are sufficient to generate the given global system behavior because it can be explained by
the collective behavior of individual system components co-evolving through state- and connectivity-based overlay level complex neighborhood interactions. Depending on the nature of the sequence, the identified AN may correspond to instances of cellular automata, random Boolean networks, graph grammars, pure network growth models or other hybrid GNA models. Finally, the identified rule space of the AN model might be only partially defined, since the information provided by the GNA sequence is limited to a finite number of configurations.
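The identification loop described above can be sketched compactly in executable form. The sketch below is a simplification of the paper's procedure: a rule is stored as a left-hand-side → right-hand-side entry per component, an inconsistency is a repeated left-hand side mapped to a different right-hand side, and all function and variable names are illustrative.

```python
def _invert(L):
    """Recover the incoming-edge mapping Pt from the outgoing-edge mapping Lt."""
    P = {i: set() for i in L}
    for j, dests in L.items():
        for i in dests:
            P.setdefault(i, set()).add(j)
    return P

def identify_gna(sequence, H, mode):
    """Identify per-component rule tables T_i and overlay neighbor lists F_i
    from a cyclic, deterministic list of (V, C, L) configurations
    (sequence[0] == sequence[-1]). Simplified stand-in for IDENTIFY_GNA()."""
    k = len(sequence) - 1
    L = [s[2] for s in sequence]
    P = [_invert(l) for l in L]            # needed for reader/transfer modes
    rules = {i: {} for i in H}             # T_i
    overlay = {i: [i] for i in H}          # F_i, initially the component itself
    for i in H:
        src = P if mode in ("reader", "transfer") else L   # lhs connectivity
        dst = P if mode == "reader" else L                 # rhs connectivity
        t = 0
        while t < k:
            C, C_next = sequence[t][1], sequence[t + 1][1]
            lhs = tuple((j, C.get(j, 2), frozenset(src[t].get(j, set())))
                        for j in overlay[i])
            rhs = (frozenset(dst[t + 1].get(i, set())), C_next.get(i, 2))
            if rules[i].get(lhs, rhs) != rhs:
                # inconsistency: expand the overlay neighborhood along H
                # (determinism guarantees this terminates, at worst when the
                # overlay network becomes fully connected) and start over
                overlay[i].append(next(n for n in H if n not in overlay[i]))
                rules[i] = {}
                t = 0
            else:
                rules[i][lhs] = rhs
                t += 1
    return rules, overlay

# tiny writer-mode example: two components whose states flip each step
# while the topology stays fixed; no overlay expansion is needed
C0, C1 = {"a": 0, "b": 1}, {"a": 1, "b": 0}
L0 = {"a": {"b"}, "b": set()}
seq = [({"a", "b"}, C0, L0), ({"a", "b"}, C1, L0), ({"a", "b"}, C0, L0)]
rules, overlay = identify_gna(seq, ["a", "b"], "writer")
```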
4 Conclusions
Identification of the relation between local and global, subsystem and system, or simply the part and the whole is a known fundamental question. An algorithm that identifies the local behavior of the parts of complex discrete dynamical systems from a description of the behavior of the whole has been developed. During identification of the state/connectivity-based transition rules of the systems in question, a priori system knowledge is treated as assumptions/restrictions. An abstraction, which we call overlay level nearest neighbor component interaction, is introduced. Different component behavior modes (namely reader, writer and transfer) are defined to explain the whole system's behavior by simple local component behavior. Clearly, the more information (i.e., state/topology change history) is provided about the system, the more rules can be identified. Finally, we showed the existence of a GNA identification procedure at least for systems satisfying the stated assumptions and restrictions. In fact, the automated identifiability (up to some degree) of systems of different scales, ranging from non-living systems (e.g., communication networks) and cellular/species level biological/ecological systems to human systems, through computer procedures/algorithms is still questionable. This is because of the known limitations of computing based on formal descriptions of algorithms built over the Church–Turing thesis. The existence of alternative/unconventional hypercomputing approaches and their products may hopefully result in contributions to peace and charity. However, it seems that success is technically upper bounded by the human being himself/herself, who is complicated enough not to be identified/understood. The proof is the existence of the authors of this paper.
As a consequence, any identification/decision made about a human being based only on the letter sequences of this paper may not be sufficient for sound model construction, not only for an intelligent machine but also for an expert of the domain. The proof, in this case, is the paper itself.
References
1. Garzon, M.: Models of Massive Parallelism: Analysis of Cellular Automata and Neural Networks. Texts in Theoretical Computer Science. Springer, Heidelberg (1995)
2. Wolfram, S.: Universality and Complexity in Cellular Automata. Physica D 10, 1–35 (1984)
3. Langton, C.: Studying Artificial Life with Cellular Automata. Physica D 22, 120–149 (1986)
4. Sayama, H.: Generative Network Automata: A Generalized Framework for Modeling Complex Dynamical Systems with Autonomously Varying Topologies. In: Proc. of the 2007 IEEE Symposium on Artificial Life (CI-ALife 2007), pp. 214–221 (2007)
5. Goles, E., Martínez, S.: Neural and Automata Networks: Dynamical Behavior and Applications. Kluwer Academic Publishers, The Netherlands (1990)
6. von Neumann, J.: Theory of Self-Reproducing Automata. Edited by A.W. Burks. University of Illinois Press, Urbana (1966)
7. Kılıç, H.: A Methodology for Reverse Engineering Automata Networks. Ph.D. Dissertation, Middle East Technical University, Department of Computer Engineering, Ankara, Turkey (1998), http://www.atilim.edu.tr/~hurevren/PHDTHESIS_HUR.rar
Solving a Heterogeneous Fleet Vehicle Routing Problem with Time Windows through the Asynchronous Situated Coevolution Algorithm Abraham Prieto, Francisco Bellas, Pilar Caamaño, and Richard J. Duro Integrated Group for Engineering Research, Universidade da Coruña, Spain {abprieto,fran,pcsobrino,richard}@udc.es
Abstract. In this work we present a practical application of the Asynchronous Situated Coevolution (ASiCo) algorithm to a special type of vehicle routing problem, the heterogeneous fleet vehicle routing problem with time windows (HVRPTW). It consists in simultaneously determining the composition and the routing of a fleet of heterogeneous vehicles in order to serve a set of time-constrained delivery demands. The ASiCo algorithm performs a situated coevolution process inspired by those typical of the Artificial Life field, improved with a strategy that guides the evolution towards a design objective. This strategy is based on the principled evaluation function selection for evolving coordinated multirobot systems developed by Agogino and Tumer. ASiCo has been designed to solve dynamic, distributed and combinatorial optimization problems in a completely decentralized way, resulting in an alternative approach applicable to several engineering optimization domains where current algorithms perform unsatisfactorily. Keywords: Open-ended Evolution, Optimization, Vehicle Routing Problem.
1 Motivation
Typical engineering optimization problems belonging to the Operational Research field, such as dynamic programming, assignment, scheduling, supply chain flow, network optimization or decision analysis, have been widely used, due to their complexity, as testbeds for new optimization techniques [1]. In fact, solving the real applications behind these problems is still a challenge, mostly because the simplifications assumed in the computational models are usually substantial. Specifically, providing a practical solution for real-time operation implies dealing with three main features: the huge number of interactions between the elements that make up the system, the locality of the available information and the dynamic properties of the environments. In this work, we study the application of a new evolutionary approach based on the algorithms used in Artificial Life simulations, called ASiCo (Asynchronous Situated Coevolution), that includes these three features as design requirements.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 200–207, 2011. © Springer-Verlag Berlin Heidelberg 2011
The idea of applying Artificial Life simulations in fields other than biology is not new [2]. In the particular case of engineering problems, these techniques have not been studied in depth; nevertheless, relevant work centered on function optimization applied to real problems may be found. Examples have been reported by Satoh [3] and Yang [4], among others; these authors propose an emergent colonization algorithm for the optimization of non-convex functions, which was later applied in [5] to design tasks. Another example, hybridizing different approaches, is found in [6], where the authors combine an Artificial Life simulation with a Tabu search algorithm for the optimal design of an engine mount. Within a different field, that of autonomous robotics, we must mention the work by Watson et al. on Embodied Evolution [7]. Basically, the authors seek a robot population that evolves in a completely autonomous manner, without external intervention, to obtain a single controller that makes the robots fulfill their objectives. Recently, we have expanded this idea to consider the possibility of obtaining robot populations that cooperate towards an objective. In fact, this was the origin of the ASiCo algorithm as proposed by our group [8][9] and used in collaboration with Schut et al. [10] for the operation of a set of robots performing a surveillance task. The results show the appropriateness of the approach for real-time operation over a fixed number of robots or agents. Here we go one step further and evaluate the possibility of obtaining differentiated controllers (different species) for a heterogeneous population of agents. In fact, the objective is to test this strategy on a problem that requires dimensioning the agent set, selecting the appropriate features for each agent (size, capabilities) and constructing the individual agent controllers simultaneously.
For formal reasons we have decided to do this on a well-described and established problem within operational research, namely the heterogeneous fleet vehicle routing problem with time windows (HVRPTW).
2 Asynchronous Situated Coevolution
The ASiCo algorithm is inspired by Artificial Life simulations in its use of decentralized and asynchronous open-ended evolution. Unlike other bio-inspired approaches such as genetic algorithms, in which the selection and evaluation of individuals is carried out in a centralized manner at regular processing intervals based on an objective function, this type of evolution is situated. This means that all interactions among individuals of the population are local and depend on the spatial and temporal coincidence of the individuals, implying intrinsic decentralization. Consequently, reproduction, the creation of new individuals and their elimination are driven by events that may occur in the environment in a decentralized way. This type of evolution has usually been employed for analysis purposes, that is, to study how a system evolves in an open-ended manner, and not with an engineering objective in mind; thus, there is no clear procedure for relating the global objective to be achieved to the local objectives of the agents that participate in the process. To this end, we take inspiration from studies of utility functions and their distribution among individuals in order to structure the energy dynamics of
the environment to guide evolution towards the objectives sought. Specifically, we have used the principled evaluation function selection procedure for evolving coordinated multirobot systems developed by Agogino and Tumer [11], which establishes a formal way to obtain the individual utility functions from the global function. With this procedure, ASiCo's open-ended evolution becomes an evolutionary optimization algorithm that provides a distributed solution by means of the whole population, and not only through the best individual as in typical evolutionary algorithms.
Fig. 1. Schematic representation of ASiCo structure
Fig. 1 displays a schematic representation of the algorithm's structure, divided into two parts, and the relations between these procedures and the processes carried out during evolution. On one side (right block) are the procedures that guide evolution towards an objective: individual encoding defines the solutions that can be generated; the objective function is established using the utility functions discussed above; and bipolar crossover, explained in detail below, allows the evolution of a heterogeneous population. On the other side (left block) is the evolution engine, which is based on the interactions among elements in the environment. After the creation of a random population, interaction events execute in a continuous loop, modifying the state of the elements. Under certain conditions, and based on energetic criteria for spatiotemporally coinciding individuals, the procedures that constitute the evolution of the population (selection, evaluation and elimination) take place. The energy/utility association represents the energetic rules of the environment and affects these three procedures. Finally, the objective function defines the energetic criteria, the selection and the elimination. Thus, ASiCo is an interaction-driven algorithm: interactions are a set of rules that make the state of the elements and individuals in the environment change in time due to particular events. This process is independent of the evolution of the population. Two elements are very relevant within ASiCo. On one hand we have the flow of energy, which represents the rules that regulate energy variations and transmissions between the individuals and the environment and vice versa. On the
other, reproductive selection is the set of rules that regulates the reproduction process. This selection process must be defined for each problem and is based on spatial interactions together with energetic criteria. Specifically, it is usually performed by means of a tournament operator, which, in a typical evolutionary algorithm, randomly selects a number of individuals from the population for reproduction. Such centralized behavior is not possible here, so the tournament has been modified to be asynchronous, decentralized and, consequently, based on local interactions between individuals. The reproduction process uses a bipolar crossover operator that we have developed to preserve heterogeneous populations during evolution. When two parent individuals are selected to create an offspring, one of them is randomly labeled as the base individual, and the resulting offspring will be a variation of this base individual. The crossover is performed gene by gene, applying a Gaussian function centered on the genes of the base individual with a deviation that depends on the difference between the parents' genes. The idea is that, when two parent individuals are very different (and could be considered as belonging to different species), this operator tends to create an offspring that is more similar to one of them, with a very small probability of being a mixture, as mixtures would eventually lead to a homogeneous population.
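The bipolar crossover just described can be sketched as follows. The proportionality constant `spread` and all names are illustrative assumptions, since the paper does not give the exact deviation function:

```python
import random

def bipolar_crossover(base, other, spread=0.2, rng=random):
    """Gene-by-gene bipolar crossover: each offspring gene is drawn from a
    Gaussian centred on the base parent's gene, with a standard deviation
    that grows with the distance between the two parents' genes. Identical
    parents therefore yield an identical offspring, while very different
    parents yield an offspring close to the base parent, rarely a mixture."""
    return [rng.gauss(b, spread * abs(b - o)) for b, o in zip(base, other)]

# one parent is randomly labeled as the base individual
rng = random.Random(42)
parent_a, parent_b = [0.1, 0.9, 0.5], [0.9, 0.1, 0.5]
base, other = rng.sample([parent_a, parent_b], 2)
child = bipolar_crossover(base, other, rng=rng)
```

Note that genes shared by both parents (here the third gene, 0.5) get a zero deviation and pass through unchanged, which is what keeps distinct species stable under this operator.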
3 The Shipping Freight Problem
In order to evaluate ASiCo's capability of generating heterogeneous, variable-sized populations in real time, we have solved a heterogeneous fleet vehicle routing problem with time windows (HVRPTW) as a testbed for real applications. It consists in simultaneously determining the composition and the routing controllers of a fleet of heterogeneous vehicles in order to serve a given set of customers with probabilistic, time-constrained delivery demands. Several algorithms designed specifically to solve the HVRPTW may be found; the most successful results have been achieved using classical heuristic methods adapted to the problem, such as [12], or metaheuristics [13][14]. In this work, we apply a more general algorithm that has not been designed for this particular problem. In addition, as we explain later, ASiCo allows the fleet composition and the optimal routing to be obtained simultaneously over a continuous range of possible vehicles; that is, we do not have to use fixed sets of vehicles to make up the fleet. Finally, ASiCo may be used in real time and adapts to changing circumstances in the environment or agents.
3.1 Experimental Setup
The particular HVRPTW we have studied is a shipping freight problem; that is, the fleet is made up of cargo ships that must supply the demands that appear at certain points (representing naval ports). For the sake of realism, two more elements have been added: supply points, where the cargo must be delivered, and maintenance points, which the ships must visit periodically. These four elements make up the simulation environment; they are represented in Fig. 2, which shows a screenshot of this application example in the ASiCo simulator [8].
Fig. 2. Screenshot of the shipping simulation environment. Demand points are represented by small circles, supply points by squares and maintenance points by large circles. The remaining elements are ships of different size.
The objective of the ASiCo algorithm is to obtain a fleet that minimizes the unsatisfied demand level. Specifically, it must provide the composition of the fleet (number and size of ships) and the control system for each ship to reach this objective. To achieve this, the population of the algorithm is made up of individuals representing ships that have the following 5 sensors:
1. Reward: provides the estimated level of resources the ship will have after performing a delivery.
2. Minimal resources: provides the minimum level of resources reached during the itinerary of a given route.
3. Distance: to the destination along a given route.
4. Autonomy: provides the autonomy the ship would have at the nearest maintenance point after the delivery following a given route.
5. Load: provides the load the ship would have after the delivery.
The action the ships can execute is simply the selection of the route they will follow to a demand point: either directly, through a supply point, through a maintenance point, or through both a supply and a maintenance point. The control system of each ship acts as an evaluator that must select the route the ship will follow using the information from the sensor values. This selection is performed through a function that provides an evaluation parameter (E), which depends on the load (CL), autonomy (CA) and resources (CR) coefficients that represent the relevance of these input values in the selection of the route. In addition, E depends on the reward rate (Rrw) of a given route, obtained from the estimated time to reach a demand point, the resources consumed and the estimated reward. Thus, the valuation of each possible route can be obtained using:
E = Rrw · CL · CA · CR
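For illustration, the route-selection step can be sketched in Python. The exact functional form of E is not fully recoverable from the compressed formula above, so the coefficient-weighted combination below, together with all names and the sample values, is an assumption for illustration only, not the authors' implementation:

```python
# Hedged sketch of a route evaluator: each candidate route gets a score E
# that grows with its reward rate (Rrw), weighted by the genotype
# coefficients CL, CA, CR applied to the route's predicted sensor values.
from dataclasses import dataclass

@dataclass
class Route:
    reward_rate: float   # Rrw: estimated reward vs. time and resources spent
    load: float          # predicted load after the delivery (sensor 5)
    autonomy: float      # predicted autonomy at nearest maintenance (sensor 4)
    resources: float     # predicted minimum resource level en route (sensor 2)

def evaluate(route: Route, cl: float, ca: float, cr: float) -> float:
    """Evaluation parameter E (assumed form: Rrw times weighted sensors)."""
    return route.reward_rate * (cl * route.load
                                + ca * route.autonomy
                                + cr * route.resources)

def select_route(routes, cl, ca, cr):
    """The controller simply picks the route with the highest evaluation E."""
    return max(routes, key=lambda r: evaluate(r, cl, ca, cr))

r1 = Route(reward_rate=1.2, load=0.5, autonomy=0.8, resources=0.6)
r2 = Route(reward_rate=0.9, load=0.9, autonomy=0.9, resources=0.9)
best = select_route([r1, r2], cl=1.0, ca=1.0, cr=1.0)
```

The point of the sketch is only that route choice reduces to maximizing a scalar function of per-route predictions, parameterized by the evolved coefficients.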
Solving a HFVRPTW Through the Asynchronous Situated Coevolution Algorithm
205
Consequently, the genotype of each individual is made up of the following parameters: the capacity (Q), which indicates the maximum load a ship can transport; the autonomy (A), which indicates the time between maintenance operations; the velocity (V); and the three coefficients CL, CA and CR.

The energy input in this environment occurs at the supply points. Every time a ship performs a delivery, it receives a reward value associated with the merchandise deposited there. Energy output takes place through the resource consumption that occurs during each ship's life, represented by two economic factors: the fuel consumption and the maintenance costs. The depreciation of the ship's value has also been included in the consumption rate. It depends on the velocity, the autonomy and the fuel consumption according to [15]. Consequently, we take as the global utility the sum of the private utilities of all the ships in terms of their energy or, in economic terms, their available budget.

Finally, the selection process in this problem is highly decentralized and asynchronous because the ships do not coincide spatially. Consequently, we have developed the following selection procedure: when a ship reaches a supply point with an energy level over a given threshold, it leaves a copy of its genotype there. When the number of deposited genotypes exceeds a previously established tournament size, a bipolar crossover occurs using the two best individuals according to their individual utility.

3.2 Results

Using this setup we have created a simulation environment with a set of 20 demand points located relatively near one supply point and one maintenance point. Starting from an initial population of 50 ships, Fig. 3 (left) shows the changes in the fleet under these conditions. Here we have represented the fleet size, fleet resources, fleet consumption and the overall demand level as time passes.
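The deposit-and-tournament selection procedure described in the setup above can be sketched as follows; the threshold value, the uniform per-gene recombination standing in for the paper's bipolar crossover operator (which is not specified in this excerpt), and all names are illustrative assumptions:

```python
# Sketch of the decentralized selection at a supply point: profitable ships
# deposit genotype copies; once more than `tournament_size` copies have
# accumulated, the two fittest deposited genotypes are recombined.
import random

def maybe_deposit(deposit_box, ship_energy, genotype, threshold=100.0):
    """A ship leaves a genotype copy only when sufficiently profitable."""
    if ship_energy > threshold:
        deposit_box.append((ship_energy, list(genotype)))

def maybe_reproduce(deposit_box, tournament_size=5):
    """Cross the two best deposited genotypes once the box is full enough."""
    if len(deposit_box) <= tournament_size:
        return None
    deposit_box.sort(key=lambda entry: entry[0], reverse=True)  # best first
    (_, g1), (_, g2) = deposit_box[0], deposit_box[1]
    child = [random.choice(pair) for pair in zip(g1, g2)]  # assumed operator
    deposit_box.clear()
    return child

box = []
for energy, geno in [(120, [1, 1, 1]), (150, [2, 2, 2]), (90, [0, 0, 0]),
                     (130, [3, 3, 3]), (110, [4, 4, 4])]:
    maybe_deposit(box, energy, geno)
child = maybe_reproduce(box, tournament_size=3)  # 4 deposits > 3: crossover
```

Note how no global synchronization is needed: each supply point runs this procedure independently whenever a ship arrives, which is what makes the selection asynchronous and situated.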
It can be seen that, once the fleet size stabilizes at around 40 homogeneous ships (4000 simulation steps), the demand level starts to decrease, reaching a very low level. Fig. 3 (right) shows the variation over time of the average resource level and the average consumption level of the fleet. The evolution tends towards homogeneous ships with a lower consumption level and a tighter profit margin, as expected.
Fig. 3. Evolution results for the shipping freight problem
Due to the distribution of demand and supply points in this first experiment, we did not obtain a heterogeneous fleet as the optimal solution. To achieve one, we created a second experiment with 17 demand points placed near one supply point and one maintenance point (at a distance of less than 40 spatial units) and 6 demand points placed far from the supply and maintenance points (more than 500 spatial units away). Fig. 4 (top) shows the results obtained after 9950 simulation steps, when the population is stable. The three graphs represent the relations between the three parameters that define the fleet composition: capacity, velocity and number of ships. As shown in the graphs, ASiCo produces a fleet made up of 22 ships, 12 with low capacity and velocity and 10 with higher capacities and velocities, constituting a self-organized heterogeneous solution. This distribution is consistent with that of the demand and supply points: we obtain a fleet of 12 slow and small ships to cover the short-haul routes and 10 faster and larger ships to deliver on the long-haul routes. Finally, in Fig. 4 (bottom) we show the parameters that define the fleet at step 21400 when a variation in the fuel costs is simulated at step 10000. As displayed in the figure, the fleet changes as expected: it is made up of 13 ships, all of them slow (right graph) and 11 of them with a low capacity (center graph). They still cover the demand, and the fleet remains profitable despite the fuel price increase. With this result we show the ability of ASiCo to adapt its solutions to dynamic environments.
Fig. 4. Distribution of ships in the fleet at simulation step 9950 (top) and at step 21400 (bottom), after a variation in the fuel cost at step 10000
4 Conclusions

A heterogeneous fleet vehicle routing problem with time windows (HFVRPTW) has been solved using the Asynchronous Situated Coevolution (ASiCo) algorithm. This problem is a very relevant testbed in the operational research field due to its broad range of practical applications. In this work, we show how an open-ended evolutionary
simulation, improved with a procedure to guide evolution towards a design objective, provides successful results for this problem. The solution obtained is a self-organized fleet of heterogeneous ships whose features are adapted to the spatial distribution of the demand and supply points according to market requirements and restricted by time windows.

Acknowledgments. This work was partially funded by the MEC of Spain through projects DEP2006-56158-C03-02 and DPI2006-15346-C03-01.
References

1. Doerner, K.F., Gendreau, M., Greistorfer, P., Gutjahr, W., Hartl, R.F., Reimann, M.: Metaheuristics, Progress in Complex Systems Optimization. Operations Research/Computer Science Interfaces Series, vol. 39. Springer, Heidelberg (2007)
2. Langton, C.: Artificial Life: An Overview. MIT Press, Cambridge (1997)
3. Satoh, T., Uchibori, A., Tanaka, K.: Artificial life system for optimization of nonconvex functions, vol. 4, pp. 2390–2393 (1999)
4. Yang, B.S., Lee, Y.H.: Artificial life algorithm for function optimization. In: Proceedings of the 2000 ASME Design Engineering Technical Conferences and Computers and Information in Engineering Conference (2000)
5. Yang, B., Lee, Y., Choi, B., Kim, H.: Optimum design of short journal bearings by artificial life algorithm. Tribology International 34(7), 427–435 (2001)
6. Ahn, Y.K., Song, J.D., Yang, B.: Optimal design of engine mount using an artificial life algorithm. Journal of Sound and Vibration 261(2), 309–328 (2003)
7. Watson, R.A., Ficici, S.G., Pollack, J.B.: Embodied Evolution: Distributing an Evolutionary Algorithm in a Population of Robots. Robotics and Autonomous Systems 39(1), 1–18 (2002)
8. Prieto, A., Duro, R.J., Bellas, F.: Obtaining Optimization Algorithms through an Evolutionary Complex System. In: Abstracts Booklet ECCS 2007, p. 132. European Complex Systems Society (2007)
9. Prieto, A.: Estudio de la coevolución asíncrona situada para la resolución de problemas dinámicos descentralizados en ingeniería. Ph.D. Thesis, Universidade da Coruña (2009)
10. Schut, M.C., Haasdijk, E., Prieto, A.: Is Situated Evolution an Alternative for Classical Evolution? To appear in Proceedings of CEC 2009 (2009)
11. Agogino, A., Tumer, K.: Efficient evaluation functions for evolving coordination. Evolutionary Computation 16(2), 257–288 (2008)
12. Imran, A., Salhi, S., Wassan, N.: A variable neighborhood-based heuristic for the heterogeneous fleet vehicle routing problem. European Journal of Operational Research 197(2), 509–518 (2009)
13. Paraskevopoulos, D., Repoussis, P., Tarantilis, C., Ioannou, G., Prastacos, G.: A reactive variable neighborhood tabu search for the heterogeneous fleet vehicle routing problem with time windows. Journal of Heuristics 14(5), 425–455 (2008)
14. Lima, C., Goldbarg, M., Goldbarg, E.: A Memetic Algorithm for the Heterogeneous Fleet Vehicle Routing Problem. Electronic Notes in Discrete Mathematics 18, 171–176 (2004)
15. Stopford, M.: Is the drive for ever bigger containerships irresistible? In: Lloyd's List Shipping Forecasting Conference (2002)
Modeling Living Systems Peter Andras School of Computing Science, Newcastle University, Claremont Tower, Claremont Road, Newcastle upon Tyne, NE1 7RU, United Kingdom
[email protected]

Abstract. A fundamental issue in the evolution of life is the emergence and maintenance of self-referential autocatalytic systems (e.g. living cells). In this paper the problem is analyzed from a computational perspective. It is proposed that such systems have to be infinite autocatalytic systems, which can be considered equivalent to Turing machines. The implication of this is that the search for finite autocatalytic systems is unlikely to be successful, and any such finite system would be maintainable only in a highly stable environment. The infiniteness of autocatalytic systems also implies that a top-down search for the simplest living system is likely to stop at relatively complex cells that are still able to provide a realization of infinite autocatalytic systems.

Keywords: autocatalysis, computation, formal model, self-reference, infinite model.
1 Introduction

The questions of what makes living systems alive and how to create living systems are among the fundamental questions of biology [1-10]. A common feature of several theories of living systems is the requirement of self-reference in the system [7-9], i.e. what happens in the system depends on what happened within the system before, and the system somehow re-creates itself through such self-referential processes. What makes the self-referential recreation of the system problematic is that the references need to go back infinitely, and the recreated system needs to be equivalent to the current one, without shrinking or explosive growth [7]. In terms of the foundations of life, these theories led to the postulation of autocatalytic chemical interaction systems that are assumed to provide the ground-level building blocks for living systems [9].

Experimental work aimed at supporting theories about elementary forms of life showed that a wide range of organic molecules (e.g. amino acids, nucleobases) can be created under the conditions that may have existed on the early Earth [4,5]. Other related research aims to identify the last universal common ancestor (LUCA) organism by analyzing genetic data and experimenting with primitive organisms [10]. Another related line of research aims to produce the most reduced form of life by experimentally simplifying the genome of organisms that already have a small genome [11]. A further approach is to build simple autocatalytic or self-reproducing biochemical systems that are validated experimentally [12].

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 208–215, 2011. © Springer-Verlag Berlin Heidelberg 2011
Here I present a computational interpretation of self-referentially reproducing living systems (in particular, of simple life forms). This interpretation implies the equivalence of such systems with universal computers (Turing machines). A further implication of this interpretation and analysis is that autocatalytic systems of molecular interactions have to be infinite systems. The rest of the paper is organized as follows. First I discuss autocatalytic systems briefly. This is followed by a computational interpretation of simple living systems. Then I discuss the infiniteness of autocatalytic systems. Finally, the paper ends with the conclusions section.
2 Autocatalytic Systems

Various self-regenerating self-referential systems were proposed as models and descriptions of living systems in the second half of the last century [7-9]. While such systems conceptually capture fundamental features of living systems, most of these proposals remain obscure and difficult to understand due to the lack of appropriate formalization and the difficulty of their predictive application. For example, the theory of autopoietic systems of Varela and Maturana [7] provides an attractive interpretation of how living systems work, but lacks the formal analytic representation that would make it practically applicable in the context of experiments. An alternative theory of living systems was developed by Rosen [8] around the same time, following the work of Rashevsky [13]. This theory of M-R systems postulates that living systems are composed of a metabolic (M) and a repair (or reproductive) (R) component, and provides a formal framework in terms of category theory. While this approach is very exciting theoretically, it has not yet found its way into the interests of experimentalists. Another, more experimentally rooted approach was proposed by Ganti [9]. This approach focuses on chemical automata (chemotons), which perform self-reproducing interactions between molecules in a manner similar to how reproducing patterns are generated in cellular automata. In this context cellular automata [14] can also be seen as models of self-reproducing autocatalytic systems. The chemoton theory builds on known chemical catalytic reactions and expands these in principle to describe molecular interaction systems that can maintain and reproduce themselves. However, so far there is no clear experimental proof of such chemoton systems except living systems, in which the interactions are known only partially. Researchers starting from the experimental end first showed that key organic molecules can be generated in abiotic conditions [4,5].
These molecules self-organize in some way to lead to living systems; how this happens is not yet known. The discovery of the catalytic activity of RNA molecules [4,12] led to the formulation of the RNA world hypothesis, according to which interacting RNA molecules may constitute autocatalytic systems that are living systems. Recently it has been shown that small sets of RNA molecules can act alternately as catalysts, reaction components and products such that their system is reproduced if the right prime materials (molecules) and physical and chemical conditions are provided [12]. However, there is as yet no sufficient explanation of how proteins, RNA and DNA molecules are created and kept together in self-reproducing chemical interaction systems that are alive.
3 Computational Interpretation

In some sense, a computational interpretation of living systems is present in the M-R systems theory of Rosen [8] and in the chemoton theory of Ganti [9]. Recent work on DNA computing also points in this direction [15]. Here I present an alternative way of looking at autocatalytic self-referential systems through a computational interpretation.

First, let us consider a living system of interacting molecules that maintains and reproduces itself in the context of its environment, formed of other molecules and molecular interactions. The living system is an open system, which receives molecules from the environment and releases molecules into the environment. The system can maintain itself by reproducing its molecules and molecular interactions in the right spatio-temporal patterns that correspond, for example, to the functional maintenance of the cell membrane and of the various cellular organelles. This means that the system is ready to produce the right molecular interactions to incorporate newly required molecules (e.g. pinocytosis) and to release molecules that are no longer required (e.g. lysosomal defecation). Being ready for such molecular interactions means that the system is able to 'predict' in a sense what these interactions need to be (e.g. by displaying the right receptor and anchor molecules on the cell membrane, and having a sufficient amount of ATP ready to energize the molecular import). This kind of prediction of what the system 'expects' can also be considered a computational process performed by the system. Note that the predictive ability of the system works well only in a range of appropriate environments. If a bacterium is put in an environment with antibiotics present, the molecular interactions in the bacterium no longer fit the environment well and the bacterium dies.
However, if the bacterium has resistance against the antibiotic, this can be interpreted in the sense that it can adapt its prediction machinery to generate the right molecular interactions, which lead to the decomposition of the antibiotic molecules or their expulsion from the cell.

In self-referential autocatalytic systems, such as living cells, the interactions between molecules can be considered as computational operations on the data represented by molecules. Being autocatalytic means that the current molecular interactions in the system make possible the realization of further molecular interactions within the system. This means that the computational process calculates a representation of the system itself as well, besides calculating an approximation (or prediction) of the environment. In other words, the approximation of the environment leads to the regeneration of the system itself. Being self-referential means that the computations depend (reference) on the available data (molecules), on past computational operations (patterns of molecular interactions), and on the earlier data as well. To achieve this, the system needs some way of keeping the information about its past ready for referencing. The requirement of providing reference to past interactions and molecules can be satisfied if all patterns of molecular interactions (reference-able computations) can be represented by molecules (e.g. molecules that formed through these interactions), and if all patterns of molecules (reference-able data) can be represented by ongoing molecular interactions (e.g. interactions which can happen only if the referenced pattern of molecules was present earlier). This circular referencing may appear irresolvable; however, there is a mathematical formalism that can provide a solution: the theory of recursive domain equations [16].
Let us assume that R is a domain (e.g. a set, or possibly a mathematical object that is larger than any set, for example the collection of all sets – i.e. the category Set), which contains objects. There are transformations of the domain R into itself; these can be considered as functions f: R → R. The collection of all these transformations is denoted as [R → R]. The recursive domain equation is stated as

R ≅ A + [R → R]    (1)
where A is a part of R (a sub-domain), which may be empty. In other words, each object of the domain R is either an unrepresented object (part of A) or it is represented by a transformation f: R → R, and each transformation f: R → R is represented by an object of R.

To build up the analogy with autocatalytic self-referential molecular interaction systems, let us consider each molecule type, and also types of combinations of such molecules (e.g. proteins and protein complexes composed of multiple subunits constituting independent molecules), as an object, and the transformations of the cell that result from patterns of molecular interactions as the analogues of the transformations of the domain, which is the collection of all considered molecule and molecule-combination types. It is possible that there are molecules which have no transformation representation (e.g. inorganic ions that flow into the cell from the environment), but all transformations have some representation in terms of molecules or patterns of molecules, and all molecules involved in the self-reproduction have a representation in the form of some transformation representing molecular interactions. If there is a solution of the recursive domain equation (1), then that solution may be used as a model of the autocatalytic self-referential molecular interaction systems that are living cells. Equation (1) has solutions; for example, the category of partial orders satisfies this equation (the collection of all partial orders defined on sets, together with all functions that transform one partial order into another). In general there are many categories which satisfy this equation.
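The self-reference expressed by Eq. (1) can be illustrated with first-class functions, where a transformation of the domain is itself an object of the domain. This toy sketch is our illustration only, not part of the formalism in [16], and all names in it are invented for the example:

```python
# A toy illustration of Eq. (1): the domain R holds plain atoms (the
# sub-domain A) together with transformations of R, and every transformation
# is itself an object of R, so it can be referenced, or acted on, like any
# other object. Python's first-class functions make this easy to show.

R = []          # the (growing) domain: atoms and transformations alike

def add(obj):
    """Register an object, atom or transformation, as an element of R."""
    R.append(obj)
    return obj

ion = add("Ca2+")                   # an atom of A: no transformation role
double = add(lambda xs: xs + xs)    # a transformation f: R -> R, also in R

# the transformation is simultaneously data (an element of R) and behaviour
assert double in R and ion in R
assert double([ion]) == ["Ca2+", "Ca2+"]
```

The analogy: `ion` plays the role of an unrepresented molecule, while `double` plays the role of a molecular interaction pattern that is itself represented by a molecule and can therefore be referenced by later computations.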
These categories are the so-called Cartesian-closed categories, which are defined by the following features: (a) they have a Cartesian product object for each finite set of objects, and (b) for any ordered pair of objects they have an object representing the set of morphisms between the first and the second object (for more information about categories and category theory the reader should consult relevant textbooks – e.g. [17]). This means that Cartesian-closed categories can be considered as models of self-referential autocatalytic systems. In fact, considering the above identified requirements of self-referential autocatalysis, any model of these systems has to be such a category. Notably, Cartesian-closed categories are also models of the λ-calculus, which is a representation of Turing machines [18]. Turing machines are universal computational machines that are able to represent any computational algorithm and consequently can compute anything that is computable. This means that if self-referential autocatalytic molecular interaction systems (i.e. living cells) are representable as Cartesian-closed categories, then they can also be seen as representations of Turing machines that can compute anything computable in principle. This implies that the molecular interactions that happen in a living cell can indeed be seen as a representation of computational processes. I argued earlier that these computations may be considered as computations leading to an approximation (or prediction) of the environment. In principle the Turing machine equivalence suggests
that these computations may compute everything computable, and consequently living cells may approximate their environment arbitrarily correctly. However, high-precision approximation computations may need a great deal of time and space, and time and space limits may bound the attainable precision of the environment approximation. So, in practice the living system is able to achieve only a limited-precision approximation of its environment, which implies that there are environments in which these systems cannot survive (since they cannot approximate or predict them sufficiently correctly) – e.g. consider the earlier example of antibiotics in the environment of non-resistant bacteria.
4 Infinite Autocatalytic Systems

The previously discussed computational interpretation of self-referential autocatalytic systems shows that such living systems of molecular interactions (cells) can be represented as Cartesian-closed categories, and cannot be represented by smaller formal representations. These categories include infinitely many objects, such that the level of this infinity is the same as the level of infinity of the collection of all sets (i.e. more infinite than any infinite set). The morphisms between any ordered pair of objects always form a set, i.e. their number may be infinite, but not more infinite than the most infinite set (note that sets can be countably infinite – i.e. the same as the infinity of the natural numbers – or continuum infinite – i.e. the same as the infinity of the real numbers). This implies that the molecular interaction systems that compose living cells have to be infinite systems in principle.

Infinite autocatalytic systems have to have infinitely many molecules and molecular configurations, corresponding to the infinitely many objects of the representing Cartesian-closed categories. The infinity of molecules and molecular configurations has to be comparable to the infinity of the collection of all sets (i.e. more than the infinity of the real numbers). This may seem contradictory with the evidence of observed molecules in living cells. However, considering the range of macromolecules (e.g. proteins, RNA, DNA), their complexes and polymers, and also the polymers of simpler organic molecules (these polymers may approximate infinitely long molecules in principle), it becomes more plausible to assume that there are infinitely many molecules, complexes and molecular configurations that may represent the infinitely many objects of Cartesian-closed categories.
The morphisms between objects of a corresponding Cartesian-closed category can be represented as interactions between molecules or as spatio-temporal patterns of such interactions. Note that DNA in itself can be seen as a computational machine equivalent to Turing machines [15]. This means that in principle (assuming that the DNA's length approximates countable infinity) it can represent any computational process, and the computational system represented by it is equivalent to the λ-calculus. Consequently, the system of computations that can be represented by the DNA can represent a corresponding Cartesian-closed category. So, what the DNA can encode in principle is equivalent to a system with infinitely many objects and sets of morphisms between these objects. This means that the molecular interaction system encoded by the DNA, i.e. involving proteins, RNA, their complexes and spatio-temporal patterns, is a representation of an infinite Cartesian-closed category. This confirms the above conclusion that living cells form infinite autocatalytic systems.
Still, there is an issue about how to reconcile the observational fact of finite sets of types and instances of molecules in cells with the assumption that these cells represent infinite autocatalytic systems. To resolve this issue, let us start by considering that cells are open systems. This means that molecules flow into and out of the cell. Many of these molecules are essential components of proteins or prime material for molecular components of other macromolecules. In this sense the molecular interaction system of the cell extends beyond the physical boundaries of the cell by involving other molecule types that are not produced in the cell. Next, let us consider the cell together with its ancestor and descendant cells. The system of the cell is open in this temporal sense as well, since molecular interactions in the cell depend on molecules and molecular interactions that composed its ancestor cells and determine the molecular composition and molecular interactions of its descendant cells. While the cell contains a finite set of molecules at any time, considering the molecular interaction system spanning the ancestor and descendant cells as well, the molecules and their interactions constitute an approximation of an infinite fragment of the autocatalytic system represented by the cell. Different cells represent different fragments of the full system. They are adapted to their molecular environment in the sense that the computations represented by them assume the presence of this molecular environment, which may include organic and inorganic molecules required for the proper functioning of the cell that are not produced by the cell. This means that it is sufficient for these cells to represent the appropriate fragment of the full autocatalytic system that can deal with the approximation / prediction of the characteristic environment of the cell.
Since the part of the environment that is not given is at least as complex as cells themselves (e.g. many other cells contribute to it), the fragment of the autocatalytic system represented by the cell has to be infinite itself to enable the cell to compute good approximations / predictions of this environment.

An implication of the infiniteness of the autocatalytic system is that searching for finite autocatalytic systems is likely to be fruitless. Finite variants of autocatalytic systems can be considered at best as finite fragments of an autocatalytic system that can maintain itself in the context of a stable, specific environment providing inputs to, and taking away outputs from, the self-reproducing molecular interaction system (see for example [12]) – significant variation of the environment cannot be dealt with by the finite autocatalytic system, since it is unable to perform the full range of required computations. The computational requirements imposed by a partially given environment mean that any molecular realization of a fragment of an autocatalytic system that can reproduce in this partially given environment should be able to represent an infinite fragment of the autocatalytic system. This means that the bottom-up search for autocatalytic systems by combining molecules and their interactions is unlikely to be successful unless the resulting molecular interaction system can be considered a representation of an infinite autocatalytic system. On the other side, it also means that the top-down search for minimal self-reproducing autocatalytic systems is likely to stop at the level of relatively complex cellular systems that can still realize such infinite autocatalytic systems.
5 Conclusions

A formal computational interpretation of living systems (living cells) is presented in this paper. This interpretation leads to the conclusion that self-referential autocatalytic systems have to be infinite systems that have an equivalent representation in the form of a Cartesian-closed category. It is argued that such autocatalytic systems are equivalent to Turing machines as well, and in principle can compute anything computable. The realizations of such systems are living cells that have temporal and spatial constraints, which imply that their computational ability in practice is limited and that they can predict (or approximate) their environment only to an extent that allows their survival and reproduction. However, this ability depends on the assumptions of the system about its environment. If these assumptions are violated in a critical way, this may imply a drop in the predictive ability of the system in the context of the modified environment and may lead to the termination of the system (e.g. bacteria in the presence of antibiotics). An important implication of this conclusion about autocatalytic systems is that searching for their finite variants is likely to be fruitless, and in the best case finite variants of them will depend on very stable environmental conditions. It also implies that a top-down search for the simplest living organism is likely to stop at relatively complex organisms that can still represent an infinite fragment of the infinite autocatalytic system.
References

1. Miller, J.G.: Living Systems. McGraw-Hill, New York (1978)
2. Lovelock, J.: Gaia: A New Look at Life on Earth. Oxford University Press, New York (1987)
3. Kauffman, S.: The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, New York (1993)
4. Joyce, G.F.: Booting up life. Nature 420, 278–279 (2002)
5. Miller, S.L., Orgel, L.E.: The Origins of Life on the Earth. Prentice Hall, Englewood Cliffs (1974)
6. Segre, D., Lancet, D.: Composing life. EMBO Reports 1, 217–222 (2000)
7. Maturana, H.R., Varela, F.J.: Autopoiesis and Cognition: The Realization of the Living. D. Reidel Publishing Company, Dordrecht (1980)
8. Rosen, R.: Life Itself. Columbia University Press (1991)
9. Ganti, T.: The Principles of Life. Oxford University Press, Oxford (2003)
10. Glansdorff, N., Xu, Y., Labedan, B.: The Last Universal Common Ancestor: emergence, constitution and genetic legacy of an elusive forerunner. Biology Direct 3, 29 (2008)
11. Lartigue, C., Glass, J.I., Alperovich, N., Pieper, R., Parmar, P.P., Hutchison, C.A., Smith, H.O., Venter, J.C.: Genome transplantation in bacteria: changing one species to another. Science 317, 632–638 (2007)
12. Lincoln, T.A., Joyce, G.F.: Self-sustained replication of an RNA enzyme. Science 323, 1229–1232 (2009)
13. Rashevsky, N.: Outline of a unified approach to physics, biology and sociology. Bulletin of Mathematical Biophysics 31, 159–198 (1969)
Modeling Living Systems
Facing N-P Problems via Artificial Life: A Philosophical Appraisal Carlos E. Maldonado and Nelson Gómez Modeling and Simulation Laboratory, Universidad del Rosario, Bogotá, Colombia {carlos.maldonado,nelson.gomez}@urosario.edu.co
Abstract. Life as an N-P problem is a philosophical, scientific and engineering concern. N-P problems can be understood and worked out via artificial life. However, these problems demand a new understanding of engineering, since engineering is basically a way of acting upon the world. Such a new engineering is known as non-conventional engineering or complex systems engineering. Bio-inspired systems are more flexible and allow a higher number of degrees of freedom. As a consequence, AL enlarges our understanding of living systems in general and can be taken as a step forward in grasping the complexity of life. Keywords: Complexity, N-P Problems, Non-conventional Engineering, Non-conventional Computing, Living Systems, Heuristics, Experimentation.
1 Life Is an N-P Problem Generally speaking, life is an N-P problem. A feasible way of dealing with N-P problems, and of trying to solve them, is to think, experiment and work on and with artificial life (AL). Research on and with AL covers three main domains of action, namely philosophy, science, and engineering. The first two deal with understanding and explaining phenomena, systems and dynamics exhibiting life, whether natural or artificial, whereas the third is mainly centered on building, optimizing, predicting and/or controlling engineering-like systems that behave as living beings, or that have living features or characteristics. Two outcomes can be derived from this [1]: firstly, models and simulations for studying life (the scientific orientation); secondly, artificial systems bearing biological properties and capabilities applicable to solving problems (the engineering orientation). In both cases, though, the role of computing is crucial in order to literally be able to see non-linearity in phenomena characterized by emergence, self-organization, growth, adaptation, evolution, and increasing complexity, in one word: life! However, we claim, many if not all examples of AL and its applications in other disciplines and sciences are systems imposing tremendous computational challenges: they are N-P systems. G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 216–221, 2011. © Springer-Verlag Berlin Heidelberg 2011
2 Engineering Life and AL Unlike the techniques used in artificial intelligence, AL does not work solely within the existing computational paradigm (based on the von Neumann architecture). Rather, in many cases it creates new paradigms inspired by the forms of processing, architecture and dynamics of biological, social and evolutionary systems, such as evolutionary computation (among others, J. Holland, J. Koza), swarm intelligence (E. Bonabeau, M. Dorigo, G. Theraulaz), membrane computing (Gh. Paun), DNA computing (L. Adleman), artificial immune systems or immunological computation (D. Dasgupta) and cellular computing (M. Sipper). These new computational paradigms belong to so-called non-conventional computing and intersect paradigms from other areas and contexts, such as physics and logic; such is notoriously the case of quantum computation (P. Benioff, S. Lloyd), fuzzy systems, or hypercomputation (foreseen somehow by A. Turing circa 1938 when working on the idea of non-computable tasks, i.e., tasks that cannot be carried out by a conventional Turing machine [2]). The most generic example of the new computational paradigms arising within the frame of AL has been pointed out by M. Sipper [3] in relation to cellular computing (CC). Here, most of the features and claims of AL, even philosophical ones, are gathered. The core of CC turns on the principles of simplicity, vast parallelism and locality: • Simplicity: unlike current complex processing units, the processing unit in CC, called a cell, can carry out only very simple tasks. • Vast parallelism: this principle is based on the interaction of a large number of cells (around 10^x) that interact in order to carry out complex tasks at a high level that no single cell could achieve on its own (emergence). This is evidently a proposal quite different from those that we normally find within the frame of parallel computing and massive parallelism.
• Locality: the connectivity patterns (interactions) among cells are entirely local, and no single cell is able to see the whole system. In other words, there is no global control; only distributed and local control techniques are implemented. These three principles are strongly connected with each other, and all are necessary to create a CC; if one of them is modified, we converge to another type of computational paradigm (Fig. 1). Nonetheless, within every principle there are diverse degrees that can be chosen according to the context of the problem one wishes to attack. The kind of cells (discrete or continuous), the scheme of connectivity (regular or irregular grid) and the dynamics in time (synchrony/asynchrony, discrete/continuous) are some of the configurable variables in CC [3]. However, the really important feature of CC, and thereby of the remaining computational paradigms of AL, is that among its possible applications are the N-P-complete problems, thus opening a wide horizon for the study of complex systems.
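The three principles above can be made concrete with a minimal sketch (ours, not from Sipper [3]): a one-dimensional binary cellular automaton whose cells are trivially simple, update in parallel, and read only their immediate neighbors. The rule number (110 here) is an arbitrary illustrative choice.

```python
# A minimal sketch of the three cellular-computing principles, using a
# one-dimensional binary cellular automaton: each cell is trivially simple,
# many cells act in parallel, and every update rule sees only a local
# neighborhood -- never the whole system.

def step(cells, rule=110):
    """Synchronously update all cells; each reads only itself and two neighbors."""
    n = len(cells)
    out = []
    for i in range(n):
        # Locality: the neighborhood is just (left, self, right), wrapping around.
        neighborhood = (cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        # Simplicity: the whole "program" of a cell is one bit of the rule table.
        out.append((rule >> neighborhood) & 1)
    return out

def run(width=31, steps=15, rule=110):
    cells = [0] * width
    cells[width // 2] = 1          # single seed cell
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history

if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

No cell here can inspect the global configuration, yet the aggregate behavior (the familiar rule-110 pattern) emerges from purely local, parallel updates.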
Fig. 1. Computing Cube. Adapted from Sipper [3]
3 About Non-conventional Engineering Now, the majority of paradigms in non-conventional computation are inspired by natural systems, whether biological or not, and are gathered under the generic name of natural computing. Here, as for instance in Nunes de Castro [4], swarm intelligence or evolutionary computation are categorized as independent of and parallel to artificial life; beyond such a recognition, which can be debated, what is really important are the underlying goals [4], namely: (i) to develop computation inspired by nature in order to solve complex problems; (ii) to construct tools to synthesize behaviors, patterns, forms, and natural systems; and (iii) to use natural materials in order to carry out computational tasks. These underlying goals set out clearly that paradigms such as cellular computing, or any other from AL or from natural computation in general, are not applicable as such to every kind of problem set out by science, engineering and computation, for in many cases the conventional techniques and models are better suited and work better. In contrast, we use these models and techniques only for problems that entail a large number of variables, non-linearity, multiple goals, N-P time, and a space of solutions or, equivalently, more than one potential solution, whereas standard techniques provide a unique or singular solution. M. Bedau and colleagues [5] and [6] have worked on structuring AL around fourteen open problems. Relevant as they all are, these problems could make more sense if they were set as P or N-P problems. In any case, even if for the sake of agreement we say that these are N-P problems, it remains to be settled whether every open problem is an N-P problem.
Here, we claim that AL is a most viable way to deal with "real" N-P problems of real life; hence its utility and significance. In engineering in general, we have come to talk about non-conventional engineering, complex systems engineering [7], or a "new engineering": three different names for one and the same idea. Non-conventional engineering is rooted in the theory of non-linear dynamic systems, namely the sciences of complexity1, and thereafter in the bio-inspired systems of AL. We strongly believe that AL can and must play a crucial role in structuring the frame of this new engineering, not only from a technical, engineering-like scope, but also from a philosophical, scientific, heuristic and methodological standpoint [9]. Such a new way of thinking about, and doing, engineering has, firstly, made it possible to move away from the basic principles of classical engineering (centralized control, preprogramming, and the search for a unique, optimal solution) towards more flexible and robust principles such as distributed and local control, emergent or bottom-up programming, and a space of solutions; secondly, it has set out previously unforeseen research lines for implementing features and characteristics of living systems, such as autonomy, adaptation, evolution, self-organization, self-synchronization, immunological response and metabolism, among others.
4 P Problems and N-P Problems within the Frame of AL A problem to which any problem in P can be reduced (under suitably weak, e.g., logarithmic-space, reductions) is called P-hard; if it also belongs to P, it is called P-complete. Thus, to show that a problem is P-hard, it is enough to reduce a P-complete problem to it. Similarly, a system is universal if it can simulate a universal Turing machine. If so, then the problem arises of the decidability or undecidability of the system, i.e., whether we can safely say whether and when a program stops, and also whether the program can be compressed. A system is P-hard when it admits a P-hard problem. Any P problem, whether P-hard or P-complete, assesses and presupposes at the same time a polynomial time. Polynomial time is critical to living organisms, as in circadian cycles, or at the scale of developmental biology. The critical case is the study of apoptosis and, hence, of the biological clock [10]. ALife systems, however, do not always "die"; they just vanish in simulated programs. What is it for an AL system to die? It is conspicuous that artificial organisms may die, as indeed they do, but the program does not! Conversely, an AL organism is born as the program creates it, and the very process of giving birth to, developing and evolving an AL system is the very success of the program! The program can differ from case to case: artificial chemistry, swarm intelligence, and the like. This idea could lead us back to the belief in a "programmer of the universe", a dream that has already been dreamt. The reply to this belief can be positive if we think of conventional engineering, but negative if we take into consideration non-conventional engineering, i.e., artificial chemistry and the very possibility of programs that create other programs [11] and [12].
1 For example, the thermodynamics of non-equilibrium, the theory (or science) of chaos, fractals, the theory of bifurcations, self-organized criticality, and the science of complex networks. See Maldonado [8].
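The P-versus-N-P asymmetry that this section trades on can be made concrete with a small sketch (ours, not from the paper). SUBSET-SUM is a standard N-P-complete problem: a proposed certificate can be verified in polynomial time, while the obvious solver must search through exponentially many subsets.

```python
# An illustrative sketch of the N-P asymmetry on SUBSET-SUM, a standard
# N-P-complete problem: a candidate solution can be *verified* in polynomial
# time, while the naive solver must *search* through up to 2^n subsets.
from itertools import combinations

def verify(numbers, target, indices):
    """Polynomial-time check of a proposed certificate (a set of indices)."""
    return sum(numbers[i] for i in set(indices)) == target

def solve(numbers, target):
    """Exhaustive search: up to 2^n candidate subsets in the worst case."""
    for r in range(len(numbers) + 1):
        for idx in combinations(range(len(numbers)), r):
            if verify(numbers, target, idx):
                return idx
    return None

print(solve([3, 34, 4, 12, 5, 2], 9))   # some index subset summing to 9, if any
```

The gap between the cheap `verify` and the expensive `solve` is exactly the gap the text appeals to when it asks whether AL systems ever genuinely face N-P problems.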
What can be thought of as a kind of biological clock within AL systems is simply a matter of programming. The question arises: can we really talk about N-P problems within the frame of AL? So far the answer clearly seems to be: no. If so, then AL is at most a P-complete problem; no more, no less. The implications of such an acknowledgement are twofold. On the one hand, AL is susceptible only to P-treatments; hence, N-P problems are the patrimony of carbon-based life, not to speak of human systems. On the other hand, we could envisage, as a question of possibilities, N-P problems for AL; if so, the best chance comes from engineering AL in that direction. As a consequence, three big axes are useful as references when working with or studying AL, namely the philosophical significance, the scientific endeavor and the engineering character of AL. As for N-P problems, we want to assert that engineering AL is most useful for dealing with, understanding and trying to solve them. Nonetheless, the road remains open and not fully traveled; just some steps, although steady ones, have been taken so far. The crux of the studies on life, whether natural or artificial, concerns the meaning of life. To be sure, any scientific answer to this question necessarily crosses the field of N-P problems. Living systems live by solving P problems, but when solving these kinds of problems they happen to encounter N-P problems, for instance problems of optimization. Hence, the most basic form in which N-P problems appear is via problems of optimization, but N-P problems do not, alas, reduce to questions concerning optimization. Therefore, it can be useful to introduce a sort of phenomenology of N-P problems. Simulation via swarm intelligence or artificial chemistry allows us to truly "see" N-P times. An N-P time should not be taken as a physical stance, namely quantitatively, i.e., as a large or big span of time. Such would be indeed a Newtonian understanding of time, as a "mass" or "force". Grasping N-P time in such a way is misleading; it can simply be taken as a limit, as in mathematics.
5 N-P Time and Complexity An N-P time is, according to the sciences of complexity, rather a surprising time, unexpected in principle and unforeseeable. This opens the door to the Greek notion of kairós, but it can be better grasped as Nietzsche put it: unzeitgemässe (Betrachtungen). Here we get a hint for further exploring N-P time as different from a classical mechanical background (an extensive, long span). Computationally speaking, this leads us necessarily to quantum computing, a most intriguing and fascinating field underlying, to be sure, complexity theory. Life in general, and ALife in particular, obeys from time to time these kinds of N-P times, and they emerge unexpectedly, as it happens. Several accounts of such phenomena have been given by Ch. Langton [13], [14] and [15], T. Ray [16] and [17], or J. Conway [18], among others. We can experience those accounts not just as physical phenomena, but also as temporal happenings not assimilable to, or as, a P time.
N-P time has been conceived in terms of its computability and, thereafter, its complexity (computational complexity) [19]. The question remains whether AL can bring new insights into computational complexity on a broader scope. This, we recall, is at once a scientific, a philosophical and an engineering challenge and endeavor that remains, so far, open.
References
1. Heudin, J.: Artificial Life and the Sciences of Complexity: History and Future. In: Feltz, B., Crommelinck, M., Goujon, P. (eds.) Self-Organization and Emergence in Life Sciences, pp. 227–247. Springer, Heidelberg (2006)
2. Copeland, J., Proudfoot, D.: Un Alan Turing Desconocido. Investigación y Ciencia 273, 14–19 (1999)
3. Sipper, M.: The Emergence of Cellular Computing. IEEE Computer 32(7), 18–26 (1999)
4. Nunes de Castro, L.: Fundamentals of Natural Computing: An Overview. Physics of Life Reviews 4, 1–36 (2007)
5. Bedau, M.: Artificial Life: Organization, Adaptation and Complexity from the Bottom Up. Trends in Cognitive Sciences 7(11), 505–512 (2003)
6. Bedau, M., et al.: Open Problems in Artificial Life. Artificial Life 6(4), 363–376 (2000)
7. Braha, D., Minai, A., Bar-Yam, Y. (eds.): Complex Engineered Systems: Science Meets Technology. Springer Complexity, Cambridge (2006)
8. Maldonado, C.E.: Ciencias de la Complejidad: Ciencias de los Cambios Súbitos. In: Odeón. Observatorio de Economía y Operaciones Numéricas, pp. 85–125. Universidad Externado de Colombia, Bogotá (2005)
9. Maldonado, C.E.: La heurística de la vida artificial. Revista Colombiana de Filosofía de la Ciencia II(4–5), 35–43 (2001)
10. Lane, N.: Power, Sex, Suicide: Mitochondria and the Meaning of Life. Oxford University Press, Oxford (2005)
11. Lloyd, S.: Programming the Universe. Alfred A. Knopf, New York (2006)
12. Kauffman, S.: Reinventing the Sacred: A New View of Science, Reason, and Religion. Basic Books, New York (2008)
13. Langton, C.: Studying Artificial Life with Cellular Automata. Physica D 22, 120–149 (1986)
14. Langton, C.: Artificial Life. In: Boden, M. (ed.) The Philosophy of Artificial Life. Oxford University Press, Oxford (1996)
15. Waldrop, M.: Complexity: The Emerging Science at the Edge of Order and Chaos. Touchstone, New York (1993)
16. Ray, T.: An Approach to the Synthesis of Life. In: Langton, C., Taylor, C., Farmer, D., Rasmussen, S. (eds.) Artificial Life II. Santa Fe Institute Studies in the Sciences of Complexity, vol. XI, pp. 371–408. Addison-Wesley, Redwood City (1991)
17. Ray, T.: Jugué a ser Dios y Creé la Vida en mi Computadora. In: Gutiérrez, C. (ed.) Epistemología e Informática, pp. 257–267. UNED, Costa Rica (1994)
18. Gardner, M.: Mathematical Games: The Fantastic Combinations of John Conway's New Solitaire Game "Life". Scientific American 223, 120–123 (1970)
19. Chaitin, G.: Meta Maths: The Quest for Omega. Atlantic Books, London (2006)
Observer Based Emergence of Local Endo-time in Living Systems: Theoretical and Mathematical Reasoning
Igor Balaz 1 and Dragutin T. Mihailovic 2
1 Scientific Computing Laboratory, Institute of Physics, P.O. Box 57, 11001 Belgrade, Serbia
[email protected]
2 Faculty of Agriculture, University of Novi Sad, Dositej Obradovic Sq. 8, 21000 Novi Sad, Serbia
Abstract. In this paper, we analyze the temporal dimension of the problem of the emergence of functional networks in living systems and give an abstract mathematical framework for dealing with these issues without considering the structure of the particular mappings or the resulting dynamics. We find that formal structures equivalent to the emergence of local ordering and local chronologies can be defined within a framework of topological relations based on local observers. Using category theory, we further represent higher levels of transformations (between closures), with a special focus on observer turnover. Keywords: Time, Emergence, Category theory, Observer, Artificial Life.
1 Introduction The problem of understanding the flow of time has a long tradition and often presents inextricable difficulties derived from disputes over basic conceptual approaches. The demarcation points are numerous: reductionism vs. Platonism, different views of the topology of time, presentism vs. non-presentism, and so on [1, 2]. We do not strive to settle the philosophical considerations of the nature of time itself. Instead, we deal only with the experience of time at the very elementary level of interactions of intracellular proteins. These proteins are the basis for the emergence of functional networks in which transformations and regulations work together to create metabolism. Since each protein acts as a state machine with determined inputs that can be mapped into defined outputs, we will consider proteins as elementary observers. As observers, they reside within some universe and are able to assimilate a segment of external changes with an appropriate set of internal operations to produce reactions. In addition, the operational environment is defined from the perspective of the individual, so that external changes functionally exist for individuals only if they can be observed. Starting from this framework, we develop a theoretical and mathematical treatment of the emergence of these processes, their chronological organization and the possibilities of internal manipulation. G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 222–229, 2011. © Springer-Verlag Berlin Heidelberg 2011
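The introduction's central premise, that a protein is a state machine mapping a fixed set of recognizable inputs to defined outputs, can be sketched as a tiny observer class. All names and the reaction table below are our own illustrative choices, not from the paper.

```python
# A toy sketch of a protein as an elementary observer: a state machine
# mapping a fixed set of recognizable inputs to defined outputs, and
# functionally blind to everything outside its operational environment.

class Observer:
    def __init__(self, name, reactions):
        self.name = name
        self.reactions = dict(reactions)   # recognizable input -> output

    def observe(self, signal):
        # An external change functionally exists for this observer only if
        # it falls inside its reaction table; anything else is invisible.
        return self.reactions.get(signal)  # None means "not observed"

# Hexokinase phosphorylates glucose; sucrose lies outside its operational
# environment, so for this observer it simply does not exist.
hexokinase = Observer("hexokinase", {"glucose": "glucose-6-phosphate"})
print(hexokinase.observe("glucose"))
print(hexokinase.observe("sucrose"))
```

This is the sense in which the paper's "operational environment" is observer-relative: the same external signal exists for one observer and not for another.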
2 Theoretical Treatment When considering the perception of time, changes in the observer's environment are only one basis upon which a perceptive entity, or a group of entities, can construct its own time flow. Such a construction is established locally in three senses: (i) the act of observing is synonymous with the translation of a vast diversity of external physical and/or chemical changes into an "understandable" signal, so it is functionally local; (ii) the life span of observing agents is always limited (temporal locality); and (iii) spatial constraints are inherent to the act of observation. We consolidate these notions into a single idea under the term local endo-time. However, perceptivity itself is not enough for the establishment of a systemic time flow, since the prerequisite for establishing systemic time relations is the ability to compare different systemic states. As long as perceptivity is not incorporated into the network of systemic relations, the system will remain in a state of independent, linear process flows. If, following Luhmann [3], we define a process as a mutually connected succession of events in which the scope of selections is re-established at each stage, then its specific characteristic is anticipation: an accumulation toward less and less probable states (less probable from the perspective of the beginning of the process) that emerges as a necessary consequence of previous stages. However, there is a very important difference between elementary processes (e.g., separate enzymatic transformations) and the processes developed from them (e.g., metabolic pathways). The former can be considered unambiguous transformations toward a determined state (in the ideal case) or a group of very similar states. In living systems, however, that kind of almost indispensable flow enclosed in a rigid structure is only a basis for the further development of functionality.
These irreversible sequences can be rearranged into higher-order structures (metabolic pathways), gradually relativizing process indispensability with every superposed level of functionality constructed. In such a structure, individual events are no longer necessary for continuing the line of transformations (as is the case with elementary processes), but become "one of" the possible realizations of functionality. In spite of such relativization, what remains inherently connected with these processes is their necessary differentiation into former/latter, the primary form of local temporal differentiation. Through this relativization of necessities, the system becomes able to construct anticipatory structures which, from the available (perceptively constructed) context, choose indicators that correlate with changes in the future1, associating them with adequate systems of transformations and preparing themselves for the following events. From such a perspective it is obvious that mutual interactions of anticipatory structures are based not only on the possibility of perceiving signals, but also on the possibility of anticipating future states (e.g., the establishment of regulation based on feedback). Such anticipations are inherently connected with systemic expectations whose realization or non-realization becomes a powerful intrasystemic regulative factor. Therefore, it is not only important to accomplish some function; equally important is temporal compatibility with other, parallel processes. Consequently, only
1 As future changes we consider, in this context, only those which are within the scope of the results of other systemically induced transformations.
a combination of these two factors, the possibility of anticipation and regulation by anticipation, can create a basis for the construction of an autonomous, systemic time flow, as a generalization of the validity of partial intrasystemic time horizons across functional elements and (organizational) structures within certain subsystems. In other words, during interactions, along with perceptive normativity, subsystems also impute time (their own construction of time) to their environment. The organizational dynamics of mutual influences results purely from such imputations. Although such relativizations make organizational manipulations available, we still cannot talk about systemic comparisons (since an abstract measurement of empty intervals in such systems is not possible) and hence about systemic time. The final precondition is to establish parallel and multiple repetitions of transformations which are successively superposed. The system then enters a state where the consequences of transformations are transferred to the following cycles of successive processes. In this way, the dynamics of cycles are not self-sustained but always exist with reference to previous operations in their environment, i.e., by a rudimentary "comparison" of intervals. Only then do processes become autonomous axes for establishing systemic time relations, and consequently for the construction of organizational regulations and controls. However, at the same time, the physical duration of the constitutive elements is limited (in this case it is called protein turnover) [4]. This means that the pattern of observation, with all of the consequences described above, is neither fixed to some location in space nor is the structure of functional relations invariable. Although the theory of autopoietic systems makes it obvious that systemic elements should always be self-reproduced [5], mathematical models of global regulation in organisms mainly neglect this fact and its consequences.
With this in mind, it is necessary to pay particular attention to the development of functionality based on cyclic degradation and reconstruction of systemic material structures. First, it should be emphasized that this is not a mere repetitive circle producing sameness, but rather production with deviations, due to the locality inherent to perception. The continuous decomposition of segments of processes compels them to be in constant reconstruction, thus making space available in the organizational structure for different insertions, divergences and re-routings without the need to construct specific mechanisms (in the form of localized regulators) for each specific case. In addition, through turnover, the system purposefully eliminates groups of elements, concordantly removing them from the possibility of direct reaction (with respect to other elements, subsystems, etc.). In this way, the system's internal structure perpetually reconstitutes the causal basis for its own processes, and the past is not merely a fixed set of preceding events, but rather a dynamic accumulation where, according to its relevancy to the current state (a relevancy constantly updated with causal reconstitution), some elements and structures can be summoned while others disappear without any further functional influence. Through such cycles, the system is forced not to be self-adapted but rather to be in self-adaptation. Finally, it is necessary to answer one more question: what is the main precondition for the turnover of elements without destroying systemic functionality? It is obvious that, since an organism's survival is inherently connected with undisturbed metabolic function, turnover should not influence the continuity of metabolic processes. Is it then justified to assume the existence of elementary functional units which cannot be,
and should not be, perturbed? If we analyze metabolism as a whole, at each stage we can identify some segments that are essential for the survival of the organism and that will be safeguarded from the possibility of internal violations (e.g., excessive synthesis of groups of enzymes coupled with a decrease in the level of the specific chaperone). However, the determination of such "elementary units" is highly context dependent, both materially (e.g., the availability of certain nutrients, the constellation of environmental factors) and functionally (e.g., hierarchical variations regarding the actual distribution of subsystems). Therefore, what is usually considered the central property of elementary units, namely their perseverance through different contexts, is lost. However, if we shift our focus down to the level of concrete, material transformations, we can see that every single step in the processes of metabolic transformation is performed by enzymes whose actions are not susceptible to cutting or dividing into independent phases. In this manner, a single enzymatic transformation is not only an elementary event, but also an inherently unambiguous process: an atom of functionality. Only through the enclosure of single occurrences in a web of functionally meaningful events does it become possible for systems to base their functionality on recursive, reflexive reproduction of elements. The rise of elementary events from the level of discrete, meaningless occurrences to the level of finished processes lays the groundwork for a situation where no kind of interruption (in the sense of physical elimination of functional elements) or rearrangement can violate the fundamentality of such units. Without that kind of organization, temporalizing the constitutive elements of the system would destroy it.
3 Mathematical Treatment In the theoretical treatment of the problem we raised several notions important for understanding the emergence of endo-time in living systems and its consequences for the pattern of functional organization of living systems. Based on these considerations, we now develop an appropriate mathematical framework.

Definition 1 (Relations). If we denote by $M = \{f, g, h, \ldots\}$ the set of proteins and by $U = \{a, b, c, \ldots\}$ the set of environmental objects, then relations on $U$ can be represented as

$$R = (U, G(R)) \tag{1}$$

where $G(R) \subseteq \{U_1 \times \cdots \times U_k \mid U_1, \ldots, U_j, \ldots, U_k \subset U\}$ is determined by $M$ (in any given context the internal determination of $G(R)$ by $U$ itself is not important), such that

$$a\,G(R)\,b, \quad a, b \in U, \quad G(R) \equiv f \in M. \tag{2}$$

The set of all $a \in U$ is called the domain of the relation $R$ and is denoted $\mathrm{dom}f$, while the set of all $b \in U$ is called the codomain of the relation $R$ and is denoted $\mathrm{cod}f$. Since the relation $R$ is structure-preserving, it can be called a morphism. The relation $G(R)$ is:
- irreflexive: $\neg\, a\,G(R)\,a$;
- symmetric: $a\,G(R)\,b \Rightarrow b\,G(R)\,a$;
- nontransitive: $a\,G(R)\,b \wedge b\,G(R)\,c \Rightarrow \neg\, a\,G(R)\,c$;
- with a unique resultant: $a\,G(R)\,b \wedge a\,G(R)\,c \Rightarrow b = c$.
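For a finite relation represented extensionally as a set of ordered pairs, the four required properties can be checked mechanically. The sketch below (ours, for illustration; the example pairing is invented) does exactly that.

```python
# A finite sanity check of the four properties that Definition 1 ascribes
# to an enzyme-induced relation G(R), represented here as a set of ordered
# pairs over U.

def irreflexive(R):
    return all(a != b for a, b in R)

def symmetric(R):
    return all((b, a) in R for a, b in R)

def nontransitive(R):
    elems = {x for pair in R for x in pair}
    return all(not ((a, b) in R and (b, c) in R and (a, c) in R)
               for a in elems for b in elems for c in elems)

def unique_resultant(R):
    return all(b == c for a, b in R for a2, c in R if a == a2)

# A substrate/product pairing satisfying all four properties: each element
# is related to exactly one partner, in both directions.
G = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
print(irreflexive(G), symmetric(G), nontransitive(G), unique_resultant(G))
```

Note how the combination of properties forces $G(R)$ to be a partial matching: symmetry plus the unique resultant means every recognized element has exactly one partner, mirroring the unambiguity of a single enzymatic transformation.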
Here, $\mathrm{dom}f$ represents the observed subset of the environment, while $a\,G(R)\,b$ incorporates the intrinsic unambiguity of elementary processes. In addition, it should be noted that membership of, e.g., $\mathrm{cod}f$ is context dependent: $\mathrm{cod}f$ can become $\mathrm{dom}f$ and vice versa.

Definition 2 (Mapping into Shared Metric Space). If both sets $M$ and $U$ can be mapped into the metric space $(C, d)$ such that

$$C \equiv M \oplus U = \{(0, m) \mid m \in M\} \cup \{(1, u) \mid u \in U\},$$

then $C$ is a coproduct of $M$ and $U$, while $(C, d)$ is the shared metric space over $M$ and $U$.

Definition 3 (Domain Determination). For $\mathrm{dom}f$, $a \in \mathrm{dom}f$ iff there exists $a \in U$ such that:
- if we define a set of many-valued attributes $P = \{p, q, r, \ldots\}$, $\forall p \in P$, $p = \{p_1, p_2, p_3, \ldots, p_n\}$, then each $a \in U$ and $f \in M$ is associated with a $P_n^{\wedge}$, where $n \in \{M \cup U\}$, $P_n \subseteq P$, and $P_n^{\wedge}$ is a strictly ordered set $(P_n, <)$;
- $d(f, a) = 1$, where

$$d(f, a) = \begin{cases} 0, & \text{if } |f - a| > \tfrac{1}{2} p_f^r + \tfrac{1}{2} p_a^r \\ 1, & \text{if } |f - a| \le \tfrac{1}{2} p_f^r + \tfrac{1}{2} p_a^r \end{cases}$$

where $p_f^r$ and $p_a^r$ are called the "activity radius" values for $f$ and $a$, respectively, and are determined by the corresponding attribute values.
In other words, in order to be recognized, each element of the environment must be within the scope of a certain enzyme, and recognizable by it. All elements recognized in such a manner are collected into an equivalence class $[a]_{\mathrm{dom}f}$.

Definition 4 (Topology Based on Observations). If the metrical radius $R(f)$ for $\mathrm{dom}f$ is defined as

$$R(f) = \{a, b \mid a \in \mathrm{dom}f,\ b \in \mathrm{cod}f \wedge d(a, f) = 1,\ d(b, f) = 1\},$$

then the collection of all $R(f)$ for a given metric space $(C, d)$ is a basis for the topology $T$ on $C$. Therefore, a subset $N$ of $C$ is in $T$ (is an open set) if it is a union of members of the collection of all $R(f)$. Such a topology $T$ is induced by the metric $d$, and $(C, T)$ is the induced topological space. It should be emphasized that each open set is defined with respect to a particular $f \in M$.
Observer Based Emergence of Local Endo-time in Living Systems
227
Definition 5 (Emergence of Functional Closure and Boundaries). In order to describe functional relations induced by local perceptions, we will focus our attention on the emergence of closure and boundaries for open sets in an induced topological space: -
for R(f), a point a ∈ C is defined as a limit point of R(f) if every open set of the induced topology containing a contains a point of R(f) different from a. The set of all limit points of R(f) is denoted R(f)'. Then, R(f) ∪ R(f)' is called
the closure of R(f), denoted R̄(f). Within the given framework, R̄(f) signifies the emergence of metabolic processes.
- for an open set R(f), the interior, denoted int(R(f)), is the largest open set contained in R(f). In other words, int(R(f)) is R(f) itself. All points in the closure of R(f) not belonging to int(R(f)) define the boundary of R(f), denoted bdR(f). An element of the boundary of R(f) is called a boundary point of R(f). Less formally, boundary points of R(f) are those points which can be approached both from R(f) and from the outside of R(f). Within the given framework, bdR(f) signifies points of possible divergence or differentiation of emerging metabolic processes.
In order to have an ordering, a relation over the set has to be transitive and asymmetric. However, Definition 1 states that relations induced by enzymes are nontransitive and symmetric. Therefore, it is clear that no single relation determined by f ∈ M can induce an ordering over the environment. Instead, we need to introduce a relation which naturally follows from the existence of a number of different members of the set M within a given range.
Definition 6 (Ordering). If a ∈ (bdR(f) ∧ bdR(g)) and b ∈ (bdR(g) ∧ bdR(h)), then a causal influenceability with respect to R(M), M = {f, g, h, ...}, denoted <R(M), is established between a and b; it is transitive and asymmetric.
Definition 7 (Chronology). a, b ∈ ∪ bdR( M ), M = { f , g , h,...} ordered by the relation < R ( M ) can be naturally mapped into a finite index set I such that
I = {i, j, k, ...}, I ⊂ ℕ. Then, the index set I represents a chronology over the closure. This chronology is not imposed onto the structure but emerges from the bottom up as a consequence of established causal relations.
Definition 8 (Locality of Chronology). The chronology defined by the relation <R(M) is local in the sense that, due to Definitions 3-5, it is applicable only to observable subsets of U determined by structures and metrics. Therefore, the closure chronology of living systems is globally linear. However, if we try to broaden our scope beyond the R(M) over which <R(M) is established, it is
nonlinear in the sense of Robb's axiom [6], which states that "For every element x, there is another element y such that neither x > y nor y > x", and its strengthened form: (∀x)(∀y)(∃z)(y ≺ x → z ≺ x ∧ ¬(z = y ∨ z ≺ y ∨ y ≺ z)) [7].
Definition 9 (Category Based on Observations). If we assume: -
a collection of closures of R ( M ), M = { f , g , h,...} for a given topology is a basis for collections of objects such that each object consists of a quadruplet G = ( A, O, ∂ 0 , ∂1 ) where A is a set of directed edges defined by Eqs. (1) and (2),
O is a set of objects whose members are equivalence classes [a]domM , [b]codM where M = { f , g , h,...} and mappings ∂ i (i = 0,1) from A to O such that ∂ 0 is
a source map that sends each directed edge to its source and ∂1 is a target map that sends each directed edge to its target;
- homomorphisms exist between objects G such that if D : G → G', where G = (A, O, ∂0, ∂1) and G' = (A', O', ∂0', ∂1'), then D consists of two maps D(A) : A → A' and D(O) : O → O' such that the following diagram commutes:
[commuting diagram (3)]
-
to each object G an arrow idG : G → G, called the identity on G, is assigned;
- the composition of homomorphisms is defined as an associative operation assigned to pairs of morphisms such that for D : G → G' and E : G' → G'', a map E ∘ D : G → G'' is defined by (E ∘ D)(G) = E(D(G)),
then the category Γ of graphs and their homomorphisms is defined over interacting agents residing within the subjective environment.

Finally, using categories we are able to represent protein turnover. In category theory a forgetful functor is a functor that "forgets" some or all of the structure of an algebraic object [8]. For example, U : Grp → Set assigns to each G ∈ Grp its underlying set and to each morphism D : G → G' the same function, regarded just as a function between sets. Since in our framework the structure of an object is defined as a collection of closures determined by R(M), M = {f, g, h, ...}, a function that "forgets" any f and the corresponding interior of R(f) will accurately represent protein turnover, and at the same time will cause a reorganization of the structure of the subjective environment.
Definition 10 (Forgetful Function). For an object G = (A, O, ∂0, ∂1), G ∈ Γ, we define the forgetful function Φ as a homomorphism such that in the mapping
Φ : (A, O, ∂0, ∂1) → (A', O', ∂0', ∂1') one or more R(M), M = {f, g, h, ...}, are ignored. In G' the equivalence classes [a]domM, [b]codM are changed and the structure of G' is reconfigured accordingly.
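As a sketch only (our own illustrative encoding, not the authors'), a graph G = (A, O, ∂0, ∂1) can be represented with each directed edge tagged by the enzyme f that induces it; the forgetful function Φ then drops every edge induced by a given f and discards objects left unconnected:

```python
def forget(G, f):
    """Forgetful function Phi: ignore R(f) by dropping edges induced by f,
    then drop objects no longer appearing as a source or a target."""
    edges = [(e, s, t, g) for (e, s, t, g) in G["A"] if g != f]
    used = {s for (_, s, _, _) in edges} | {t for (_, _, t, _) in edges}
    return {"A": edges, "O": [o for o in G["O"] if o in used]}

# Edges are (edge_id, source, target, inducing enzyme); objects stand for
# equivalence classes such as [a]domM.
G = {"A": [(0, "[a]", "[b]", "f"), (1, "[b]", "[c]", "g")],
     "O": ["[a]", "[b]", "[c]"]}
G_prime = forget(G, "f")  # "forgets" f: only the g-induced edge survives
```

After forgetting f, the object "[a]" disappears from G', a crude analogue of the reconfiguration of structure that accompanies protein turnover.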
4 Conclusions

The organization of living systems is unique in the sense that they cannot at any moment exist passively, merely by virtue of being already formed. They are not finished systems, but rather systems in continuous self-construction, which requires them always to be in the process of adapting to both the external and the internal environment. The results presented here are only a part of a global framework which could be comprehensive enough to stand as an evolvable, self-organizing model of an abstract organism. Merging the presented framework with an internal coding system, which would ensure that the generation of observers is autonomously determined, will represent an important step toward that goal; this is a task for future work.
Acknowledgments. The research work described here has been funded by the Serbian Ministry of Science and Technology under the project “Modeling and numerical simulations of complex physical systems”, No. ON141035 for 2006-2010.
References
1. Newton-Smith, W.H.: The Structure of Time. Routledge & Kegan Paul, London (1980)
2. Zimmerman, D.: Persistence and Presentism. Philosophical Papers 25, 115–126 (1996)
3. Luhmann, N.: Soziale Systeme: Grundriss einer allgemeinen Theorie. Suhrkamp Verlag, Frankfurt (1987)
4. Hershko, A.: Ubiquitin: roles in protein modification and breakdown. Cell 34, 11–12 (1983)
5. Zeleny, M. (ed.): Autopoiesis: A Theory of Living Organization. Elsevier North-Holland, Amsterdam (1981)
6. Robb, A.A.: The Absolute Relations of Time and Space. Cambridge University Press, Cambridge (1921)
7. Lucas, J.R.: The Conceptual Roots of Mathematics: An Essay on the Philosophy of Mathematics. Routledge, New York (2002)
8. Mac Lane, S.: Categories for the Working Mathematician. Springer, New York (1998)
Life and Its Close Relatives Simon McGregor and Nathaniel Virgo Centre for Computational Neuroscience and Robotics University of Sussex, Brighton, UK
[email protected]

Abstract. When driven by an external thermodynamic gradient, non-biological physical systems can exhibit a wide range of behaviours usually associated with living systems. Consequently, Artificial Life researchers should be open to the possibility that there is no hard-and-fast distinction between the biological and the physical. This suggests a novel field of research: the application of biologists' methods for studying organisms to simple "near-life" phenomena in non-equilibrium physical systems. We illustrate this with some examples, including natural dynamic phenomena such as hurricanes and human artefacts such as photocopiers. This has implications for the notion of agency, which we discuss.

Keywords: Near-life, Thermodynamics, Dissipative Structures, Autopoiesis, Developmental Systems Theory.

1 Introduction
1.1 A-Life and Near-Life
The premise of Artificial Life is that there is more to being alive than the details of terrestrial biology; that there are abstract principles which underlie those particular instantiations of life we happen to share a planet with. In this paper we will argue that, besides "life as it could be" [6], we should also be considering "near-life as it is": life-like properties in the familiar inanimate world. Indeed, we will see that hurricanes and even photocopiers have thought-provoking features in common with living organisms.

When we use the word "life" in this paper, we do not refer to a definition but to a class of examples, namely those things existing on this planet which biologists consider alive: the eukaryotes, prokaryotes and occasionally viruses. Similarly, we use "life-like" informally to refer to various properties shared by all or most of these examples.

Central to our approach is the study of life-like phenomena that arise naturally within a wider system, either simulated or physical. We propose to study "near-living" systems in the same kind of way that a biologist studies living organisms: by observing them in their natural habitat; by studying their behaviour; and by trying to see how their anatomy functions. The processes behind life-like behaviour in non-living systems are likely to be much easier to understand than in living systems. Our purpose in this paper is not merely to point out similarities between some non-living systems and living systems, but to suggest that principled further studies of such similarities could be a gold-mine of new results.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 230–237, 2011. © Springer-Verlag Berlin Heidelberg 2011
1.2 Non-equilibrium Thermodynamics
Physical systems in thermal equilibrium are not life-like: the second law of thermodynamics means that isolated macroscopic systems tend towards more or less homogeneous equilibrium states. However, this does not apply to open systems in which an externally imposed "thermodynamic gradient" causes a continual flow of matter or energy through the system. In such non-equilibrium systems complex patterns can arise and persist. Authors such as Prigogine, Schneider & Kay and Kauffman [12,14,5] have observed that life is an example of such a "dissipative structure." Life exists within the Earth system, which is an open system with a gradient imposed primarily by the sun¹. Living organisms exist within this system and are constrained by the second law of thermodynamics. Therefore many of their properties have to be realised in non-trivial ways. No organism can maintain its structure by isolating itself from the world: the need to "feed on negative entropy" (Schrödinger [15]) guarantees that interaction with the world is required in order to find food or other sources of energy. Ruiz-Mirazo and Moreno connect this to the theory of autopoiesis [13].

Simulation work in artificial life has often ignored thermodynamics; it is common to see research using physically unrealistic cellular automata, abstract artificial chemistries in which there is no quantity corresponding to entropy, or simulated agents which eat abstract "food" (if indeed they need to eat at all). In contrast, our approach builds upon and extends previous authors' observations about the relationship between life and thermodynamics. An abiotic non-equilibrium system has many parallels with an ecosystem; our focus is on the equivalents to organisms that exist in such systems, and the specific properties that they do or do not share with biological organisms.
These properties vary depending on the system under consideration, so our methodology is to investigate many different types of non-equilibrium system, looking for the general circumstances under which various life-like properties arise.
2 Some Examples of Near-Life

2.1 Hurricanes
A hurricane is a classic exemplar of a dissipative structure. It is instructive to focus on this example because hurricanes exhibit a phenomenon we call individuation. Emanuel [3] gives the following characterisation of a hurricane's operation: "[Air] flows inward at constant temperature within a thin boundary layer [above the sea surface], where it loses angular momentum and gains moist entropy from the sea surface. It then ascends and flows outward to large radii, preserving its angular momentum and moist entropy. Eventually, at large radii, the air loses moist entropy by radiative cooling to space. . . " These processes occur simultaneously. Their rates balance so that the hurricane as a whole is stable.

¹ More precisely, the gradient is formed by the difference in temperatures between the incoming solar radiation and deep space.
The net result of the interactions between these processes is that the hurricane is formed and remains stable to perturbations. Moreover it is formed as a spatially distinct individual, separate from other hurricanes (this can be compared to the definition of autopoiesis by Maturana & Varela [8,9] as a "network of processes" that "constitute" a "unity.").

A hurricane has functionally differentiated parts: near the sea surface the water is drawn towards the eye, picking up moisture from the sea and rotational speed from the Coriolis force; in the eyewall itself the air is moving rapidly upward. Each part, together with its associated processes, is necessary for the whole to persist. This is analogous to an organism's anatomy. A hurricane remains an individual entity because of its vortex structure, whereas a cell is surrounded by a membrane. We see this as the same phenomenon, individuation, occurring by different mechanisms. In both cases the result is that the system is localised in space and distinct from other individuals.

Although the research is so far preliminary, hurricanes may also be able to exhibit behaviour that could be called adaptive. Shimokawa et al. [16] claim that when the prevailing wind is subtracted from the data, tropical cyclones tend to move towards regions which are better able to sustain them, namely those with a greater temperature gradient between the sea and the upper atmosphere².

The hurricane has been given as an example of a "self-organising" system before but its similarities to living cells have not been studied in depth. We feel that such a study would provide many novel scientific insights.

2.2 Reaction-Diffusion Spots
Hurricanes are large, comparatively complex and difficult to study. A much simpler and easier-to-study example of an individuated dissipative system can be found in the patterns that form in reaction-diffusion systems [11]. Reaction-diffusion systems are very simple and easily simulated non-equilibrium chemical systems in which reactions take place among chemical species that are able to diffuse along a plane. The non-equilibrium conditions are maintained by continually adding reactants and removing products from the system. Under some parameter regimes the system can form a pattern in which there are spatially distinct "spots" of an autocatalytic substance, separated by regions in which no autocatalyst is present (see fig. 1, left).

We can observe interesting life-like properties in the form and behaviour of reaction-diffusion spots: like all dissipative systems they export entropy and are dependent on specific thermodynamic gradients; they are patterns in matter and energy; like organisms, they exist mostly as identifiable individuals; under certain parameter regimes they reproduce [11]. Studying reaction-diffusion spots according to our methodology involves treating a single spot as a model agent and studying it from a "spot-centric" point of view. The following simulation-based results have been demonstrated in [17],

² More strictly, higher Maximum Potential Intensity, a measure that takes into account both the temperature and pressure gradients that power hurricanes.
and a paper giving details of the experiments is in preparation. Briefly, we found that reaction-diffusion spots, perhaps like hurricanes, tend to move towards areas where more food is available, a behaviour that could be called “adaptive.” We also found that individuated spots are very likely to arise when there is a negative feedback added between the whole system’s activity and overall supply of food (this situation is common in natural systems and we think the result is suggestive of a possible general phenomenon). Using this as a method for producing individuated entities we were able to produce agents with a more complex ‘anatomy’ than just a single spot (fig. 1, right). Some of these more complex agents exhibited a very limited form of heredity in their reproduction.
Fig. 1. left: “individuated” spots of autocatalyst in a reaction-diffusion system; right: structurally complex individuated entities in a reaction diffusion system. Each consists of spots of two different autocatalysts (represented with diagonal stripes in this image) coexisting symbiotically, mediated by an exchange of nutrients, one of which is shown in grey. Details will be published in a forthcoming paper.
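Individuated spots of the kind shown in fig. 1 can be reproduced in a few lines with the Gray-Scott model, a standard choice for such simulations. The sketch below is ours; the parameter values are illustrative of the spot-forming regime, not necessarily those used in [17]:

```python
import numpy as np

def laplacian(Z):
    # five-point stencil with periodic boundaries
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0) +
            np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

def gray_scott(n=64, steps=2000, Du=0.16, Dv=0.08, F=0.035, k=0.065):
    """u is the continually supplied 'food', v the autocatalyst (u + 2v -> 3v)."""
    u = np.ones((n, n))
    v = np.zeros((n, n))
    u[n//2-5:n//2+5, n//2-5:n//2+5] = 0.5   # seed a small perturbed patch
    v[n//2-5:n//2+5, n//2-5:n//2+5] = 0.25
    for _ in range(steps):
        uvv = u * v * v
        u += Du * laplacian(u) - uvv + F * (1 - u)   # feed replenishes u
        v += Dv * laplacian(v) + uvv - (F + k) * v   # autocatalysis and decay
    return u, v
```

In the spot-forming regime the initial seed splits into spatially distinct, self-maintaining spots, each of which can then be tracked as a model "individual" from a spot-centric point of view.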
2.3 Photocopiers
In this section we will examine a photocopier — a stereotypical inanimate object — from a point of view that might be called “enactive.” That is, it is a point of view in which the photocopier itself is the central player, maintaining its identity and behaving adaptively as a result of dynamical interactions with its environment. This is a difficult viewpoint to take at first, since our intuition tells us that a photocopier is not the sort of thing that can “act.” Rather, we normally think of it as acted upon by human beings. However, this intuition can be stretched, and we hope the reader will agree that it is worthwhile to do so. We chose a photocopier for this example because it is usually repaired when it breaks down; any such machine would have done. In particular, the fact that the photocopier performs a copying task has no special significance. We focus on this example because it shows how far our intuitions can be stretched without reaching a reductio ad absurdum: a photocopier is an archetypal example of an inanimate object, but when considered as an agent engaged in complex interactions with its environment it becomes in many ways the most life-like of our three examples. When trying to support our natural intuition of a qualitative difference between photocopiers and bacteria, we may cite a variety of apparently relevant
facts. Bacteria have DNA, and we can observe the complex process of bacterial reproduction under the microscope, whereas the same is not true of photocopiers. A photocopier, unlike a bacterium, consists of mostly static and chemically inert parts; if a part of the photocopier becomes damaged, it does not reconstitute itself internally. Bacteria display complex adaptive behaviour including chemotaxis and habituation; photocopiers don't appear to. This may seem like a conclusive list of differences between a cell and a photocopier. And some of these differences are genuine, if perhaps arbitrary: photocopiers do indeed lack DNA. However, on closer scrutiny some of the other issues will turn out to be less clear-cut. We think that many if not all of the most significant differences between cells and photocopiers can be seen as differences of degree rather than kind.

Dynamic Identity. One prototypical property of bacteria and other living organisms is their identity as patterns of matter and energy. Individual atoms flow through the organism, and the overall organism is maintained even if all the material parts change. We can consider a photocopier that has this property, although the rate of material turnover is much slower than in a cell: when the parts of this photocopier break they are replaced by an engineer. Like the hammer which retains its identity despite having a new handle and a new head, matter flows through the photocopier leaving its photocopier-ness unchanged. In ordinary discourse, we would not describe the process as self-repair, since we prefer to locate causal primacy in the engineer rather than the photocopier. But seen from a logical point of view, both photocopier and engineer are necessary parts of the physical process; the repair is caused by the interaction between the two.
It is no different for bacteria: ongoing cell repair is caused by the interaction between the cell and its environment, since the organism must be able to absorb relevant nutrients and excrete waste. Note that some important physical principles can be observed in action here. In order for this process to continue, the photocopier’s environment has to be in a very specific state of thermodynamic disequilibrium: it has to contain appropriately competent and motivated engineers. From a more photocopier-centric point of view one could say that the photocopier causes the repair to be carried out: simply by performing a useful office task, the photocopier is able to co-opt the complex behaviour of humans in its environment in just such a way that the raw materials needed to maintain its structure are extracted from the ground, fashioned into the appropriate spare parts and correctly installed. Seen from this perspective the photocopier is a master of manipulating its environment. It needs no deliberative intelligence to perform these feats, however: one is reminded of species of orchid that cause their pollen to be spread by mimicking the form of a female bee. Many of these processes take place outside the physical (spatial) bounds of the photocopier itself, with most of them involving human activity in some way. This is in contrast to the usual conception of an organism, since these systems rely heavily on a network of metabolic or dynamical processes that occur within the system’s physical boundary. We argue however that this difference between
artefacts and organisms is one of degree rather than a difference in kind, since all organisms must rely on some processes that are external to their spatial boundary, some of which will often involve the action of other organisms. Individuation. The phenomenon of individuation is evident in this example: photocopiers are maintained as individual photocopiers. Half a photocopier will not function as one and will not be maintained as one. We can imagine an environment in which half a photocopier might be maintained: we might find one inhabiting a display case in a museum, for example. But in this case it is being maintained as a museum exhibit rather than as a photocopier: it would be a different species of artefact. Reproduction and Evolution. Although there are obvious differences between the process of “evolution” in photocopiers and in living organisms, we can still observe some similarities. The photocopier phenotype has become better adapted to its ecological niche over time, as successful photocopiers are re-produced in factories and successful designs retained and modified. A photocopier’s external casing does not contain its blueprints, whereas we often regard the “design” for a cell as being in its DNA (with developmental influences from the internal and external environment). But this notion is perspectival rather than factual: as Oyama [10] points out, the interaction of environment and genome forms the cell, with both being required. Reproduction and development in any organism relies heavily on environmental machinery external to the organism itself. The photocopier phenotype interacts with its blueprint indirectly through the phenotype’s effects on photocopier sales, which fund the production of more photocopiers from the blueprint. In turn the sales depend on the successful operation of the photocopiers, among other factors.
3 Discussion
When we picture something which symbolises the fundamental properties of a living being, the chances are good that we imagine a biological object: a living organism such as a cell, plant or animal, or perhaps the DNA helix. Alternatively, we may think of our favourite simulation model in Artificial Life. It is unlikely in the extreme that we picture an inanimate object such as a rock, a hurricane or a photocopier.

Due in part to our evolutionary heritage, humans have a definite sense of what is alive and what is not. We assert that scientists should treat this intuition with suspicion. In the past, it has given rise to erroneous theories of an animal spirit or elan vital as an explanation for the remarkable behaviours of living organisms. Although these ideas have long been discarded by formal science, the underlying psychological stance is more enduring, and it has led science astray in the past. In this sense, we echo Oyama's insights in Developmental Systems Theory (e.g. [10]): human beings like to postulate chains of effect which attribute causal primacy to a particular part of a holistically integrated system. In the scientific imagination, part of the system becomes the agent and the rest is the environment.
Oyama argues that attributing causal primacy to genes is a conceptual error within developmental biology; we contend that similar problems arise when we think about living and non-living systems.

Thermodynamic gradients can be found nearly everywhere in the physical universe, and consequently we should not be surprised if near-life is abundant in interesting forms. By "near-life", we simply mean non-biological systems which share important characteristics with living organisms, including any of the following: reproduction, particularly with heritable variation; maintenance of a dynamic pattern of matter and energy; production of spatially separated individuals; "goal-directed" behaviour. All of these properties can be observed in comparatively simple non-equilibrium systems; there is every reason to suppose we will find more such properties. It is quite possible, for example, that one could find life-like structures that exhibit a developmental trajectory over their existence; or show a permanent memory-like change in behaviour in response to a stimulus; or have a more complex form of heredity.

It is not a new idea to look for life-like processes in non-biological systems. Lovelock [7] describes the entire Earth system as a "single living system", using terms such as "physiology" and "anatomy." Our ambitions concern simple and physically numerous systems which can be studied experimentally. While many of the similarities between life and dissipative structures are well-known, most previous commentators (e.g. Prigogine, Schneider & Kay [12,14]) have considered only natural (as opposed to artificial) phenomena. However, artificial structures are part of the same physical world as natural structures. Our perspective allows us to observe physical characteristics associated with biology even in apparently prototypical inanimate objects such as photocopiers.
Of course, it is not a new idea to attribute biological properties to cultural artefacts (this occurs in "meme theory," which concentrates on "selfish replicator" properties, e.g. Blackmore [1]) or to consider physical artefacts as an integral part of a biological system (Clark & Chalmers [2]). We add the observation that ordinary physical artefacts can also be a type of life-like dissipative structure, providing they exist in a human environment which maintains them against decay. This is particularly instructive because it illustrates the extent to which our intuitions can be stretched without breaking.

3.1 Proposed Future Research
Searching for life-like properties of phenomena that arise in physical rather than computational systems is an under-researched area of A-Life that has potential for some very important results. Current "wet" A-Life research (i.e. in vitro chemical experiments) tends to focus on the deliberate design of life-like structures (e.g. synthetic bacteria [4] or formation of lipid vesicles), rather than on open-ended observation of structures that form naturally. However, interesting behaviour can also be observed spontaneously in non-equilibrium systems, including ones which are simple enough for physically realistic computer simulation (e.g. [17] and forthcoming work). More research along these lines would help to bridge the gap between biology and physics.
Acknowledgments. This paper arose from discussions in the Life And Mind seminar series at the University of Sussex. The support of Inman Harvey has been especially helpful.
References
1. Blackmore, S.: The Meme Machine. Oxford University Press, Oxford (1999)
2. Clark, A., Chalmers, D.: The extended mind. Analysis 58, 10–23 (1998)
3. Emanuel, K.A.: The dependence of hurricane intensity on climate. Nature 326, 483–485 (1987)
4. Gibson, B., Hutchison, C., Pfannkock, C., Venter, J.C., et al.: Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319(5867), 1215–1220 (2008)
5. Kauffman, S.: Investigations. Oxford University Press, US (2000)
6. Langton, C.: Introduction: Artificial life. In: Langton, C. (ed.) Artificial Life. Santa Fe Institute, Los Alamos, New Mexico (1989)
7. Lovelock, J.: Gaia: A New Look at Life on Earth. Oxford University Press, Oxford (1987)
8. Maturana, H.R., Varela, F.J.: Autopoiesis and Cognition: The Realization of the Living. Kluwer Academic Publishers, Dordrecht (1980)
9. Maturana, H.R., Varela, F.J.: The Tree of Knowledge: The Biological Roots of Human Understanding. Shambhala Publications, Boston (1987)
10. Oyama, S.: The Ontogeny of Information: Developmental Systems and Evolution. Duke University Press, Durham (2000)
11. Pearson, J.E.: Complex patterns in a simple system. Science 261, 189–192 (1993)
12. Prigogine, I.: Time, structure and fluctuations. Science 201(4358), 777–785 (1978)
13. Ruiz-Mirazo, K., Moreno, A.: Searching for the roots of autonomy: the natural and artificial paradigms revisited. Communication and Cognition–Artificial Intelligence 17, 209–228 (2000)
14. Schneider, E.D., Kay, J.J.: Life as a manifestation of the second law of thermodynamics. Mathematical and Computer Modelling 19(6-8), 25–48 (1994)
15. Schrödinger, E.: What is Life? Cambridge University Press, Cambridge (1944)
16. Shimokawa, S., Kayahara, T., Ozawa, H., Matsuura, T.: On analysis of typhoon activities from a thermodynamic viewpoint (poster). In: Asia Oceania Geosciences Society 4th Annual Meeting, Bangkok (2007)
17. Virgo, N., Harvey, I.: Reaction-diffusion spots as a model for autopoiesis (abstract). In: Bullock, S., Noble, J., Watson, R., Bedau, M.A. (eds.) Artificial Life XI: Proceedings, Winchester, UK, p. 816. MIT Press, Cambridge (2008)
A Loosely Symmetric Model of Cognition
Tatsuji Takahashi1, Kuratomo Oyo1, and Shuji Shinohara2
1 Tokyo Denki University, School of Science and Engineering, Hatoyama, Hiki, Saitama, 350-0394, Japan
[email protected]
http://web.me.com/tatsuji2_takahashi_/
2 Advanced Algorithms & Systems, Co. Ltd., Ebisu IS bldg. 7F, Ebisu 1-13-6, Shibuya, Tokyo 150-0013, Japan
Abstract. Cognitive biases explaining human deviation from formal logic have been studied broadly. Here we take a step toward the still-missing general formalism by introducing a probabilistic formula for causal induction. It has symmetries reflecting human cognitive biases and shows extremely high correlation with experimental results. We apply the formula to learning and decision-theoretic tasks, namely n-armed bandit problems. Searching for the best cause of reward, it exhibits an optimal property, breaking the usual trade-off between speed and accuracy.
Keywords: cognitive bias, heuristics, multi-armed bandit problems, stimulus equivalence, mutual exclusivity.
1 Introduction
Recently, cognitive biases and heuristics have attracted widespread attention, typically in psychology and behavioral economics (e.g. [1–4]). Logicality and rationality must be distinguished, and we are required to inquire into their origins, development, and the relationship between them. Considering the effects of the biases may be the key to constructing artificial agents. It is interesting to see that animals do not appear to have some of these illogical biases. This means that in some senses animals are more logical, or machinelike, than we are. There is considerable circumstantial evidence that the biases are deeply involved in the difference between humans and other animals. In this study, we try to show the importance of loosely symmetric cognition through a constructive approach. We introduce a certain formula as an elementary model of causal inductive cognition and apply it to decision making, or reinforcement learning. The formula, the loosely symmetric (LS) model [5, 6], implements key biases in a flexible way, autonomously adjusting the bias intensity according to the situation. The derivation of LS is given, along with its humanlike behavior and high performance. The model's descriptive validity for causal induction, superior to tens of preexisting models, is shown. Its high performance is presented in binary bandit problems (BBP), one of the simplest classes of decision-theoretic tasks. BBP are the simplest dilemmatic situations where an

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 238–245, 2011.
© Springer-Verlag Berlin Heidelberg 2011
agent has difficulty in balancing exploration (searching for better options) and exploitation (persisting with the known best option). The LS model is shown to break the speed-accuracy trade-offs resulting from the dilemma. Even though it has no parameter, it gives better performance than existing standard models like the epsilon-greedy and softmax methods. First we introduce our LS model in the context of causal induction.
2 Human Biases in Causal Induction and the LS Model
It is critical for survival to inductively reason a causal relationship from event co-occurrence. In psychology, symmetrical biases are considered to be crucial in general [1]. It is also argued that the biases enhance information gain from the co-occurrence [4]. Stimulus equivalence and mutual exclusivity originated in behavior analysis [2] and language acquisition [3]. They can be generalized to the inferential biases that we call the symmetry (S) and mutual exclusivity (MX) biases. From a conditional p → q, the S and MX biases respectively induce q → p and p̄ → q̄ (where p̄ means the negation of p). These conditionals correspond to the following three conditions that are important to our intuition of causality, that p is the cause of q:

1. High conditional probability of q given p, P(q|p).
2. Simultaneously high (a) "converse" probability P(p|q) ≈ P(q|p) (S bias) and (b) "inverse" probability P(q̄|p̄) ≈ P(q|p) (MX bias).

Condition 1 is fundamental in prediction. However, it is not directly connected to causality. Even if the sun always rises after ravens croak, it does not mean that the croaking is the cause of the sunrise.

2.1 The Models
The co-occurrence, or covariation information, of two events is expressed in a 2 × 2 contingency table (Table 1). The cells a, b, c and d respectively denote the joint probabilities P(p, q), P(p, q̄), P(p̄, q) and P(p̄, q̄).

Table 1. A 2 × 2 contingency table for causal induction

                      posterior event
                      q        q̄
prior event   p       a        b
              p̄       c        d

Table 2. The completely symmetric contingency table for RS

                      posterior event
                      q        q̄
prior event   p       a+d      b+c
              p̄       c+b      d+a
If the goal is just to predict a posterior event q from the occurrence of a prior event p, conditional probability (CP) suffices:

CP(q|p) = P(q|p) = P(p, q)/P(p) = P(p, q)/(P(p, q) + P(p, q̄)) = a/(a + b).   (1)
However, it satisfies neither condition (2a) nor (2b). They are satisfied by a transformation of the information. If we identify b and c, the first condition

CP(q|p) = a/(a + b) = a/(a + c) = CP(p|q)   (2)

is satisfied. In a similar way, the second condition holds if a and d are identified as well:

CP(q|p) = a/(a + b) = d/(d + c) = CP(q̄|p̄).   (3)

The transformation P(p, q) → P(p, q) + P(p̄, q̄) satisfies the identities; the result is in Table 2. Denoting the probabilities in Table 2 by P̃(p, q) = a + d and so on, we get a completely biased model RS (rigidly symmetric) as:

RS(q|p) = P̃(p, q)/(P̃(p, q) + P̃(p, q̄)) = (a + d)/((a + d) + (b + c)).   (4)
RS satisfies both of the two symmetry conditions, 2a and 2b. It is important to see that there is another way to obtain it: to change how the untouched experience in Table 1 is referred to. Since Σ_{u,v} P(u, v) = 1.0,

RS(q|p) = (P(p, q) + P(p̄, q̄))/Σ_{u,v} P(u, v) = P(p, q) + P(p̄, q̄).   (5)

In this form, the diagonal elements a and d are summed and the sum is divided by the whole, a + b + c + d = 1.0. Conversely, CP can be expressed in this reference form. Ignoring the c and d cells by giving them the coefficient 0 in CP(q|p) yields

CP(q|p) = (a + 0·d)/(a + b + 0·c + 0·d).   (6)

This ignores p̄, the information in the absence of p. Conversely, for calculating CP(q|p̄), it is required to neglect a and b, the information in the presence of p:

CP(q|p̄) = (c + 0·b)/(c + d + 0·a + 0·b).   (7)

While the focal shift from p to p̄ changes the cells in the contingency table from (a, b, 0, 0) to (0, 0, c, d) and vice versa, the value of the unfocussed information is trivially conserved as zero. This is the figure-ground segregation that is essential to cognition [8]. While the figure information is used directly, the ground is treated in a way invariant against the focal shift. This is ground invariance.

2.2 The LS Model
Here we introduce our loosely symmetric (LS) model.

The Conditions for the Model. LS is defined as a probabilistic formula satisfying the following conditions:

(i) (Loosely symmetric) LS is not always rigidly biased.
(ii) (Flexible) However, its bias intensity is flexibly adjusted according to the situation, here (a, b, c, d).
(iii) (Cognitive) It reflects a cognitive constraint or ability: figure-ground segregation and ground invariance.

Condition (i) can be satisfied by a family of intermediate models between CP and RS, PS (parametrically symmetric), which has two parameters 0 ≤ α, β ≤ 1. We adopt it as the prototype of our LS model.

PSα,β(q|p) = (a + βd)/((a + βd) + (b + αc)).   (8)
However, PS can satisfy neither condition (ii) nor (iii) as long as α and β are constants. Condition (iii) demands CP-like weakening of the ground information, but condition (ii) requires α and β to be functions of the situation, here (a, b, c, d). So we determine α(q|p) = α(a, b, c, d) and β(q|p) = β(a, b, c, d) so as to satisfy condition (iii). The condition is to equate αc and βd in Table 3 with αa and βb in Table 4, which all lie in the row of the unfocussed event in the two tables. Constants cannot be the solution, so the problem is to find functions α and β that satisfy α(q|p)c = α(q|p̄)a and β(q|p)d = β(q|p̄)b.

Table 3. The table for PS(q|p), when focussed on p

              q        q̄
⇒  p          a        b
   p̄          αc       βd

Table 4. The table for PS(q|p̄), when focussed on p̄

              q        q̄
   p          αa       βb
⇒  p̄          c        d

Because 0 ≤ α, β ≤ 1, probability formulae are suitable. The numerators of α(q|p) and α(q|p̄) can respectively be a = P(p, q) and c = P(p̄, q), making the numerator of both products ac. The denominators are those invariant against the exchange of p and p̄. The sum of a and c satisfies the condition, hence α(q|p) = a/(a + c) = P(p|q) and α(q|p̄) = c/(c + a) = P(p̄|q). Similarly, β(q|p) is determined to be b/(b + d) = P(p|q̄) and β(q|p̄) = d/(d + b) = P(p̄|q̄). Now we get the following formula:
LS(q|p) = (a + P(p|q̄)·d)/((a + P(p|q̄)·d) + (b + P(p|q)·c))
        = (a + (b/(b+d))·d)/((a + (b/(b+d))·d) + (b + (a/(a+c))·c)).   (9)
As we see in Table 3, the relation (ratio) between a and c, a/(a + c), is recursively, or self-referentially, introduced as the coefficient α of c.

2.3 Results
We here introduce two models for comparison. One is called the contingency (DP) model [9]. The other is the presently best dual-factor heuristics (DH) model [10]. They are defined as follows:

DP(q|p) = P(q|p) − P(q|p̄)   (10)
DH(q|p) = √(P(q|p)·P(p|q))   (11)

Table 5. The determination coefficient r² of the models with human estimations

        H03 [10]   AS95 [11]   WDK90 [12]
CP      0.000      0.823       0.944
DP      0.000      0.781       0.884
DH      0.964      0.905       0.961
RS      0.158      0.761       0.888
LS      0.969      0.904       0.969
In Table 5, we compare the estimations by the models with the average human rating data from three experiments available in the literature. We can see that LS describes the data as well as DH does, while DH has been comprehensively shown to have descriptive power superior to 41 existing models [4].
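All five models in Table 5 are simple functions of the contingency entries (a, b, c, d); a minimal Python sketch (function names and sample values are ours, for illustration only):

```python
import math

def cp(a, b, c, d):
    """Conditional probability, eq. (1): P(q|p)."""
    return a / (a + b)

def rs(a, b, c, d):
    """Rigidly symmetric model, eq. (4)."""
    return (a + d) / ((a + d) + (b + c))

def dp(a, b, c, d):
    """Contingency model, eq. (10): P(q|p) - P(q|p-bar)."""
    return a / (a + b) - c / (c + d)

def dh(a, b, c, d):
    """Dual-factor heuristics, eq. (11): sqrt(P(q|p) * P(p|q))."""
    return math.sqrt((a / (a + b)) * (a / (a + c)))

def ls(a, b, c, d):
    """Loosely symmetric model, eq. (9)."""
    beta_d = (b / (b + d)) * d   # P(p|q-bar) * d
    alpha_c = (a / (a + c)) * c  # P(p|q) * c
    return (a + beta_d) / (a + beta_d + b + alpha_c)

# Joint probabilities for (p,q), (p,~q), (~p,q), (~p,~q).
a, b, c, d = 0.4, 0.1, 0.2, 0.3
print(cp(a, b, c, d))  # 0.8
print(rs(a, b, c, d))  # 0.7
```

Note how LS needs no free parameter: its coefficients are themselves probabilities read off the same table.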
3 Application to Reinforcement Learning
Findings from mere contemplation cannot be definitive; some experiment or intervention is needed to reach a conclusion. Relationships inferred by causal induction must then be utilized in decision making. So we apply the models to BBP. We can apply the causal inductive models to decision making by reading p and p̄ as "trying A" and "trying B", and q and q̄ as "win" and "lose", in Table 1. The models calculate the value of the options (action values). The option o to choose is determined by a model M as:

o = argmax_{o' ∈ {p, p̄}} M(q|o').   (12)
This way of managing action values for decision making is called the greedy method.

3.1 The Settings for the Simulation
The simulation is executed in the following framework.

Initial Condition. We need some stipulations for the initial setting. To avoid the "division by zero" exception, we initialize the covariation information a, b, c and d each with a uniform random number from [0, 1]. Here they are not joint probabilities but frequencies, so 1 is added to a when trying A wins, and so on. This adoption is justified by the fact that changing the initial condition setting preserves the ordering of the results (the correct rate, explained later) across the models. We could uniformly set a = b = c = d = θ, where θ may be 0, 1, 0.5 or 0.1, or try all the options in the first stage.

Problems. A binary bandit problem is totally defined by a pair of probabilities, (P_A, P_B) = (P(q|p), P(q|p̄)). We generated the problems as follows:

P(q|p) = (Σ_{i=1}^{n} rnd_i)/n   (13)
where rnd_i is a uniform random number from [0, 1]. We fix the parameter n, which controls the variance, to 6. P(q|p̄) is determined in the same fashion.

Index. As the index of the performance of the models, we adopt the correct rate: the percentage, over all 1,000,000 simulations, of choosing the optimal option o* = argmax_{o' ∈ {p, p̄}} P(q|o'), that is, the one with the highest win probability.

3.2 Non-deterministic Models
Here we introduce some non-deterministic models for the bandit problems that are standard in reinforcement learning.

Epsilon-Greedy Methods. There are three ways to probabilistically enhance the effectiveness of the initial search:

– Epsilon-first (EF) method. Randomly choose an option in the initial n steps.
– Epsilon-constant (EC) method. Randomly choose an option with a constant probability ε throughout the whole series of trials.
– Epsilon-decreasing (ED) method. The ε of the EC method is decreased according to a monotonically decreasing function, here defined as

ε(t) = 0.5/(1.0 + τt).   (14)
All the above methods, EF, EC and ED, show a trade-off that corresponds to the exploration-exploitation dilemma, as shown in Figures 1-4. The parameters for the methods are n for EF, ε for EC, and τ for ED.

Softmax Method (SM). The softmax action selection rule manages the action values calculated by the probabilistic formulae in a different way. It determines the probability of choosing an option according to the following formula, which uses the Gibbs distribution:

P̂(q|r) = exp(M(q|r)/τ) / Σ_{s ∈ {p, p̄}} exp(M(q|s)/τ),  ∀r ∈ {p, p̄}.   (15)
Note that P̂(q|p) is the probability of choosing the option p, unlike the other methods, which compare M(q|r) for all r in the argmax function. τ is the temperature.

3.3 Results and Discussion
Among the five causal models in Table 5, LS performs the best with the greedy method, with RS just slightly worse. CP and DP give the same result, and DH chooses the optimal option at the 15th and the 2000th step with the same percentage, 55%. CP, LS and RS are employed in the probabilistic methods. The result of the ED method in Figure 1 most prominently shows the trade-off between the initial (short-term) and the final (long-term) rewards. The average correct rates of CP, LS and RS with the non-probabilistic greedy method are plotted as well. The parameter τ is varied in {0.01, ..., 1.5} (21 variations). The arrow direction means
Basic(m) = J  if (nT(m) > 0 ∧ nF(m) > 0) ∨ nJ(m) > 0
           T  if nT(m) > 0 ∧ nF(m) = 0 ∧ nJ(m) = 0
           F  if nT(m) = 0 ∧ nF(m) > 0 ∧ nJ(m) = 0
           I  if nT(m) = 0 ∧ nF(m) = 0 ∧ nJ(m) = 0

The corresponding combination table over the values {T, F, I, J}:

      T  F  I  J
  T   T  J  T  J
  F   J  F  F  J
  I   T  F  I  J
  J   J  J  J  J
Chaotic Patterns in Crowd Simulation
411
J is transmitted with the highest priority and means something like a warning to others: be careful, I am following one who is still missing. For example, given the following system of self-referential equations and iterating from the initial condition I assigned to all the variables, we have the following stages until the system reaches a fixed point:
R1 = T
R2 = F
R3 = Basic(R3, R1, R4)
R4 = Basic(R4, R3, R5)
R5 = Basic(R5, R4, R6)
R6 = Basic(R6, R2, R5)

      R1  R2  R3  R4  R5  R6
t=0   I   I   I   I   I   I
t=1   T   F   I   I   I   I
t=2   T   F   T   I   I   F
t=3   T   F   T   T   F   F
t=4   T   F   T   J   J   F
t=5   T   F   J   J   J   J
t=6   T   F   J   J   J   J
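The Basic operator and the iteration above can be reproduced with a few lines of Python (a sketch, names ours):

```python
def basic(values):
    """Basic state operator over the referenced agents' values."""
    n_t, n_f, n_j = (values.count(v) for v in "TFJ")
    if (n_t > 0 and n_f > 0) or n_j > 0:
        return "J"                      # conflict, transmitted with highest priority
    if n_t > 0:
        return "T"
    if n_f > 0:
        return "F"
    return "I"                          # no information at all

# References of each non-source agent: itself plus two others.
refs = {3: (3, 1, 4), 4: (4, 3, 5), 5: (5, 4, 6), 6: (6, 2, 5)}
state = {i: "I" for i in range(1, 7)}
for t in range(6):
    new = {1: "T", 2: "F"}              # R1 and R2 are sources
    for i, rs in refs.items():
        new[i] = basic([state[r] for r in rs])
    state = new
print(state)  # {1: 'T', 2: 'F', 3: 'J', 4: 'J', 5: 'J', 6: 'J'}
```

Running the loop reproduces the fixed point reached at t = 5 in the table above.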
3 A Netlogo Implementation of the Model: Comparing the Dynamics of the Mode and the Anti-mode
In order to understand the above concepts, we developed a simulation in Netlogo1. The setup button creates a random graph with as many nodes as number-of-agents and as many sources of each class, T or F, as specified in the respective inputs sources-T and sources-F. The dynamics can be situated geographically or not: the Neighborhood input variable specifies the size of the environment around an agent. The world positions vary between coordinates (-10, -10) and (10, 10), leading to 21 x 21 patch positions in a square. The kind of dynamics produced by the Basic state operator in non-situated experiments (where the neighborhood of an agent is the whole world) is typically: (a) a final configuration where all the agents stop in the conflictive state J; (b) activating Mode as a method of conflict resolution, a typical final state in which the crowd unanimously selects goal T or F, respectively. This is a surprising result, since one could expect heterogeneous groups of agents in final states T, F and J. In the case of the Anti-Mode method of conflict resolution, the mean results are similar to those obtained for Mode, but the crowd normally enters a periodic cycle and individuals oscillate between the two goals. In Figure 2 we compare the extent of information for the operators Basic, Mode and Anti-Mode in a population of 100 agents, varying the number-of-references parameter from 1 to 10 and conducting 100 runs for each experiment. The criterion for stopping the system is either reaching a fixed point or completing 100 steps of iteration. One can observe that while conflict is dominant in the basic dynamics, information becomes dominant in the Mode and Anti-Mode models, with the T value reaching about 50%.

1 Available at http://www.ehu.es/ccwintco/index.php/Sociedades Artificiales -- Artificial Societies
412
B. Cases et al.
Table 1. The first part of the table shows the average results over 100 agents and 100 runs including own opinion. The second part presents the results of the same models excluding own opinion.

Including own opinion:
                        Steps    I       T       F       J
Basic      Mean         4.84     8.70    2.86    2.32    86.12
           Std. dev.    2.72     26.79   14.96   13.58   34.55
Mode       Mean         6.95     7.94    47.09   44.97   0.00
           Std. dev.    3.18     25.44   47.89   47.76   0.00
Anti-Mode  Mean         87.59    7.91    46.74   45.35   0.00
           Std. dev.    31.77    25.36   28.13   27.74   0.00

Excluding own opinion:
                        Steps    I       T       F       J
Basic      Mean         4.97     7.94    3.87    3.98    84.22
           Std. dev.    2.84     25.70   17.97   18.53   36.44
Mode       Mean         11.66    8.32    45.88   45.80   0.00
           Std. dev.    20.82    26.22   48.53   48.47   0.00
Anti-Mode  Mean         85.12    8.51    45.80   45.70   0.00
           Std. dev.    33.98    26.65   28.18   28.29   0.00
Fig. 2. Comparison of the transmission of information according to the model. The independent variable is the number of references. The dependent variable counts the number of agents in state T at the end of the experiment.
Since the T and F values are symmetric, misinformation and over-information are reduced to nearly 0% from 2 references on, avoiding conflict. With respect to the time of convergence, the Mode and Anti-Mode dynamics follow the same curve as Basic, although the maximum shifts to three references, as shown in Figure 3. Concerning the average values over all experiments, Table 1 describes the mean and deviation obtained for each variable. Note that the dynamics Mode and Anti-Mode completely eliminate the conflict and obtain unanimous dynamics navigating to T and to F, as we can infer from the standard deviation values in the first part of Table 1.
Fig. 3. Number of iterations until convergence or termination of the experiment (100 steps) comparing the three models
4 Discussion: On the Role of the Own Opinion
We have studied experimentally the mode and anti-mode mechanisms of conflict resolution. Note that mode amplifies the information value most frequent in the references of the agent, and in this way we can expect a unanimous selection by the group. Anti-mode does just the opposite, selecting the least frequent information, but the emergent behavior is similar on average to that of the mode operator. For more than 3 references, the basic model ends in configurations of total conflict J, while following 1 or 2 references implies less transmission of information values over misinformation I. This suggests that the percolation index of the graph [9] has to do with this behavior of the system. Interpreting our model as a generalization of cellular automata [5], the kinds of computations obtained in the experiments described before correspond to the classes I (short transient lengths and fixed points), II (greater transient lengths and periods) and III (long transient lengths and short periods) of the Wolfram-Langton hierarchy. We wonder whether the exclusion or inclusion of the own opinion in the model matters for obtaining behaviors at the edge of chaos. The number of references is critical to this behavior, and following one or two references with exclusion of the own opinion leads the system to chaotic patterns of convergence. Results of experimentation excluding own opinion (with parameters identical to the experimentation presented before) are given in the second part of Table 1.
Fig. 4. Comparing the periods and the transient lengths following the mode and the anti-mode for 10 agents

Fig. 5. Comparing the periods and the transient lengths following the mode and the anti-mode for 10 agents
To explore the periodic patterns of convergence completely, we repeated the experiments for 10 agents with one source T and one source F and 100 runs per experiment. The state operators Mode and Anti-Mode were compared, excluding and including the own opinion and exploring the whole range from 1 to 9 references. Experimentation has shown that following the anti-mode produces more variability than following the mode, as shown in Figures 4 and 5. The values of the periods
and transient lengths are significantly greater for the anti-mode state operator. Following the Mode with 2 references per agent and excluding the own opinion gives rise to computations at the edge of chaos (class III), while the Anti-Mode produces the same behavior either excluding or including the own opinion. Summarizing, conflict dominates, but paradoxically, solving the problem of splitting the crowd into two equilibrated groups that navigate to the exits occurs in only a tiny fraction of cases.
References

1. Chen, Z.: Computational Intelligence for Decision Support. CRC Press LLC, USA (2000)
2. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, Oxford (1999)
3. Camazine, S., Deneubourg, J.L., Franks, N., Sneyd, J., Bonabeau, E., Theraulaz, G.: Self-Organization in Biological Systems. Princeton University Press, Princeton (2001)
4. Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm Intelligence. Academic Press, USA (2001)
5. Schut, M.C.: Scientific Handbook for Simulation of Collective Intelligence (2000), electronic edition at http://sci.collectivae.net/
6. Langton, C.G.: Life at the Edge of Chaos. In: Langton, C.G. (ed.) Artificial Life II, pp. 41–90. Addison Wesley, Reading (1990)
7. Hellerstein, N.S.: Diamond, a Logic of Paradox. In: Cybernetics, vol. 1(1). American Society for Cybernetics, USA (Summer-Fall 1985)
8. Hellerstein, N.S.: Diamond, a Paradox Logic. World Scientific Publishing Company, Singapore (1997)
9. Grimmett, G.: Percolation, 2nd edn. Springer, Heidelberg (1999)
10. Cases, B., Hernandez, C., Graña, M., D'Anjou, A.: On the Ability of Swarms to Compute the 3-Coloring of Graphs. In: Bullock, S., Noble, J., Watson, R., Bedau, M.A. (eds.) Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems. MIT Press, Cambridge (2008)
Simulating Swarm Robots for Collision Avoidance Problem Based on a Dynamic Bayesian Network

Hiroshi Hirai, Shigeru Takano, and Einoshin Suzuki

Department of Informatics, ISEE, Kyushu University, Fukuoka 819-0395, Japan
[email protected], {takano,suzuki}@inf.kyushu-u.ac.jp
http://www.i.kyushu-u.ac.jp/~suzuki/slabhomee.html
Abstract. This paper presents a simulator for the behaviors of swarm robots based on a Dynamic Bayesian Network (DBN). Our task is to design each robot's controller so that the robot patrols as many regions as possible without collisions. As a first step, we use two swarm robots, each of which has two motors, each connected to a wheel, and three distance-measurement sensors. To design the controllers of these robots, we must determine several parameters such as the motor speed and the thresholds of the three sensors. The simulator is used to reduce the number of real experiments needed to decide the values of such parameters. We first performed measurement experiments with our real robots in order to obtain the probabilistic data for the DBN. The simulator based on the DBN revealed appropriate values of a threshold parameter and interesting phase transitions of the robots' behaviors in terms of those values. Keywords: Swarm Robots, Dynamic Bayesian Network, Simulation.
1 Introduction
Recently, swarm robots have been attracting a lot of attention in the research community, since they possess advantages such as scalability, flexibility, robustness, and cost-performance [4]. It is a difficult problem to design a controller for swarm robots that accomplish given tasks as a system, due to the uncertainties of the robots, the complex interaction among robots, and their limited capabilities. We aim at developing each robot's controller so that the robot patrols as many regions as possible without collisions. As a simple case, we assume two robots and one parameter, a threshold of the robot's sensor. Finding a good value for the parameter requires a large number of experiments, and thus we use a simulator. M. Toussaint [5] recently proposed to use a Dynamic Bayesian Network (DBN) for planning problems of a manipulator robot. The DBN is a probabilistic model, which consists of a set of states and probabilistic information between the states. In his work, the DBN returns the joint probability distribution of the states of the robot. As the final state is given in the planning problem, unlike in our collision

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 416–423, 2011.
© Springer-Verlag Berlin Heidelberg 2011
avoidance problem, we cannot directly apply his method to our problem. We therefore propose another type of DBN to model the behaviors of the swarm robots for our task.
2 Collision Avoidance Problem

2.1 Definition
Our problem is to design controllers for swarm robots so that they patrol as many regions as possible without collisions with other robots or the wall. We put D as an a × a closed domain in the 2-dimensional space in which the swarm robots can move, where a represents the size of one edge of D. At time t ∈ [1, T], a variable x^i_t is the state of the i-th robot and its action is represented by a^i_t. Similarly, a variable s^i_t is the value of the distance-measurement sensors of the i-th robot at time t. The action of each of the swarm robots is determined by a controller δ^i as follows: δ^i(s^i_t) = a^i_t. In order to evaluate the performance of the controller δ^i, we divide the domain D into b × b square sub-domains, where b represents the size of one edge of a sub-domain, and define the probability of visiting each sub-domain as Psearch, and the probability of collision with another robot in each sub-domain as Pcollision. Note that these probabilities are defined in terms of the two robots from t = 1 to the time of a collision. In this paper, each of the swarm robots determines its action based on the values of the distance-measurement sensors, and the values depend on their states. Our task is to find an appropriate value for the threshold so as to increase Psearch and decrease Pcollision. To determine the value, we use a simulator to imitate the real behaviors of the two robots. We store the observed behaviors as log records of their positions and angles in time sequence. From the log data, we calculate Psearch and Pcollision.

2.2 Controller
In this paper, we put the set of the states of the robots at time t as X_t, so X_t = {s^1_t, s^2_t}. Concretely, s^i_t = (p^i_t, d^i_t), where p^i_t and d^i_t ∈ [0, 360) represent the position and the angle of the i-th robot at time t, respectively. In the rest of the paper, we may omit i and t if they are clear from the context. Swarm robots are autonomous agents, each of which interacts with its environment locally based on a relatively simple program, but as a system they exhibit complex collective behaviors. As a first step, we tackle a simple problem to study the feasibility of our approach by understanding its essence without being overwhelmed by a complex problem. We use the same controller for the two robots, i.e., the controller δ^i for determining the action of the i-th robot is given below.
Controller of each robot:

while (1) {
    s = GetSensorValue();
    if (s > threshold)
        turn right;
    else
        move forward;
}

In the controller, GetSensorValue() returns the average value of the distance-measurement sensors. Each of the robots can only perform two kinds of actions: move forward, and turn right to avoid a collision with another robot or a wall.
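As an illustration, the controller rule above and the Psearch bookkeeping of Sect. 2.1 might be sketched in Python as follows (function names, the grid arithmetic, and the default a, b values are our assumptions, not the authors' code):

```python
def control(sensor_values, threshold):
    """One decision of the controller: 'turn_right' or 'forward'."""
    s = sum(sensor_values) / len(sensor_values)  # GetSensorValue()
    return "turn_right" if s > threshold else "forward"

def p_search(log_positions, a=400.0, b=40.0):
    """Fraction of b-by-b sub-domains of the a-by-a domain visited at least once."""
    cells = set()
    for (x, y) in log_positions:
        # Shift the centered domain [-a/2, a/2] to [0, a] before binning.
        cells.add((int((x + a / 2) // b), int((y + a / 2) // b)))
    total = int(a // b) ** 2
    return len(cells) / total

print(control([20, 5, 5], threshold=13))  # 'forward' (average 10 <= 13)
```

Pcollision would be computed analogously, counting the sub-domains in which the two logged trajectories come closer than the robots' size.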
3 Simulator Based on the Dynamic Bayesian Network
A simulator may enable a quick and easy design of the controller of the swarm robots. To create the swarm robots' simulator, we must know how each robot moves for each action. Unfortunately, our robots cannot necessarily move forward or turn right as ordered by the robot's program. Moreover, the values of our sensors include noise, i.e., a sensor may overlook or mis-detect an object. Therefore, we model the uncertain behaviors of such robots as a DBN, which has been successfully used in [5] as a model of a manipulator robot in planning problems.

3.1 Dynamic Bayesian Network
A DBN [3] consists of a pair of Bayesian networks (B1, B→), where B1 defines the prior probability distribution P(Z_1) over the initial state at time t = 1, and B→ defines the conditional probability distribution P(Z_t | Z_{t−1}) over the states from the previous time t − 1 to the current time t. Here Z_t represents all probability variables at time t. In this paper, we assume that P(Z_t | Z_{t−1}) holds at every time t > 1.

3.2 Structure of Our DBN
We propose the following DBN as a model of n swarm robots. The initial Bayesian network B1 has a set of nodes which represent the probabilistic state variables A^1_1, ..., A^n_1, X^1_1, ..., X^n_1, and a set of arrows which represent the state transitions. Here A^i_t represents the set of actions of the i-th robot at time t. The state transitions are given by P(A^i_1 | X^1_1, ..., X^n_1) for each i ∈ [1, n]. Similarly, the Bayesian network B→ has a set of probabilistic state variables A^1_{t−1}, ..., A^n_{t−1}, X^1_{t−1}, ..., X^n_{t−1}, A^1_t, ..., A^n_t, X^1_t, ..., X^n_t, and a set of arrows which represent the state transitions given by P(A^i_{t−1} | X^1_{t−1}, ..., X^n_{t−1}), P(A^i_t | X^1_t, ..., X^n_t) and P(X^i_t | A^i_{t−1}, X^i_{t−1}) for each i ∈ [1, n]. An example of the structure of the proposed DBN model in the case of n = 2 is shown in Fig. 1.
Fig. 1. DBN model in case of 2 robots
Here we introduce the following two assumptions. One is that the states of each robot follow the same conditional probability distribution, that is, the following equation holds: P(A^i_t | X^1_t, ..., X^n_t) = P(A^j_t | X^1_t, ..., X^n_t), where i, j ∈ [1, ..., n], i ≠ j. Similarly, the other assumption is that the following equation holds: P(X^i_t | A^i_{t−1}, X^i_{t−1}) = P(X^j_t | A^j_{t−1}, X^j_{t−1}).
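Under these two assumptions, sampling a trajectory from the DBN alternates draws from the two shared conditionals; a schematic Python sketch (sample_action and sample_next_state are our placeholders for P(A^i_t | X^1_t, ..., X^n_t) and P(X^i_t | A^i_{t−1}, X^i_{t−1}); the Gaussian constants merely echo the estimates of Sect. 4.1 and are illustrative):

```python
import random

def sample_action(all_states, i):
    # Placeholder for P(A_t^i | X_t^1, ..., X_t^n): in the real model this
    # would apply the threshold rule to a noisy sensor reading derived
    # from the joint state.
    return random.choice(["forward", "turn_right"])

def sample_next_state(action, state):
    # Placeholder for P(X_t^i | A_{t-1}^i, X_{t-1}^i): a noisy motion model.
    x, y, d = state
    if action == "forward":
        return (x + random.gauss(0.07, 0.7),
                y + random.gauss(17.2, 2.6),
                d + random.gauss(-1.5, 3.6))
    return (x + random.gauss(-14.4, 4.5),
            y + random.gauss(21.1, 3.8),
            d + random.gauss(-118.5, 11.8))

def rollout(init_states, T):
    """Sample one trajectory of all robots for T steps."""
    states, log = list(init_states), [list(init_states)]
    for t in range(T):
        actions = [sample_action(states, i) for i in range(len(states))]
        states = [sample_next_state(a, s) for a, s in zip(actions, states)]
        log.append(list(states))
    return log
```

The two assumptions are what allow a single pair of functions to serve all n robots.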
4 Experimental Results
In this section, we show the results of two types of experiments. In the first experiment, we measure the performance of the swarm robot to determine the probability distributions used in our DBN model. The second experiment is a simulation of the behavior of the swarm robots based on the DBN.

4.1 Measurement Experiment
This section describes the environment of the first experiment. As a swarm robot, we use a robot kit called Robo Designer, which is commercially available from Japan Robotech Ltd. [7]. This robot consists of two motors, each of which is connected to a wheel, a battery, a controller board, and three distance-measurement sensors, as shown in Fig. 2. The controller board issues the orders to move forward or to turn right, and can read the values of the sensors. In order to observe the behavior of the swarm robots, we use a USB camera (Logicool Qcam Orbit) of 300,000 pixels and we set it at a position of 160 cm
Fig. 2. ROBO DESIGNER, our swarm robot
Fig. 3. Detection of the position and the direction of the robot
Fig. 4. Regression analysis of the position (in case of a move forward)

Fig. 5. Regression analysis of the direction (in case of a move forward): observed versus predicted values of P(dir) over dir [deg]
height with a 45-degree angle to the ground [6]. We cover the robot with a sheet of colored paper, and detect the position and direction of the robot by using the color information. Figure 3 illustrates the detection of the position and the direction of the robot. We observe the move-forward and turn-right actions, and then calculate the changes of the positions and directions of the robots observed by the USB camera. Here we assume that the velocity of the robot is constant. The probability distributions of the positions and directions are computed by applying a regression analysis to the observed log data. In the regression analysis, we assume that p_i^t and d_i^t follow two- and one-dimensional normal distributions as stated in equations (1) and (2), respectively. The parameters µ_p, Σ_p, µ_d, σ_d are estimated from the observed data.

f(p) = \frac{1}{2\pi\sqrt{|\Sigma_p|}} \exp\left(-\frac{1}{2}(p - \mu_p)^T \Sigma_p^{-1} (p - \mu_p)\right)    (1)

f(d) = \frac{1}{\sqrt{2\pi}\,\sigma_d} \exp\left(-\frac{(d - \mu_d)^2}{2\sigma_d^2}\right)    (2)
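As a concrete illustration (not the authors' code), the two densities can be evaluated directly. The parameter values below are the forward-move estimates reported later in this section; since Σ_p is diagonal there, Eq. (1) reduces to a product of two one-dimensional Gaussians.

```python
import math

# Forward-move parameter estimates reported in Sect. 4.1.
MU_P = (0.069, 17.24)          # mean position change
VAR_P = (0.52, 6.76)           # diagonal entries of Sigma_p (off-diagonals are 0)
MU_D, SIGMA_D = -1.47, 3.57    # mean and std dev of the direction change

def f_p(p):
    """Two-dimensional density of Eq. (1) for a diagonal covariance matrix."""
    det = VAR_P[0] * VAR_P[1]
    quad = sum((p[i] - MU_P[i]) ** 2 / VAR_P[i] for i in range(2))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def f_d(d):
    """One-dimensional density of Eq. (2)."""
    return math.exp(-(d - MU_D) ** 2 / (2 * SIGMA_D ** 2)) / (math.sqrt(2 * math.pi) * SIGMA_D)
```

Both densities peak at their respective means, as expected of the fitted model.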
Simulating Swarm Robots for Collision Avoidance Problem Based on a DBN
Table 1. Initial parameters for our simulations

Domain D: [−200, 200] × [−200, 200]
Time limit T: 1000 sec.
Number of robots n: 2
Initial states (x_0^1, x_0^2): ((0, −100, 90), (0, 100, −90))
Threshold θ: 0, 10, ..., 100
Fig. 6. Visiting probabilities of sub-domains (panels for θ = 10, 11, 12, 13, 14, 15, 20, 30)
Results of the regression analysis using the log data of the positions and directions when the robot moves forward are µ_p = (0.069, 17.24), Σ_p = diag(0.52, 6.76), µ_d = −1.47, σ_d = 3.57 (Fig. 4 and Fig. 5). Similarly, the results of the regression analysis when the robot turns right are µ_p = (−14.35, 21.09), Σ_p = diag(19.98, 14.82),
Fig. 7. Collision probabilities of sub-domains (panels for θ = 10, 11, 12, 13, 14, 15, 20, 30)
µ_d = −118.49, σ_d = 11.78. In this experiment, the interval of the time sequence is 1 sec., and the initial position and direction of the robots are (0, 0) and 0 degrees, respectively.

4.2 Simulation
Using our DBN model, we perform simulations of the behavior of the swarm robots. The values of the domain D, time limit T, number of robots n, initial states (x_0^1, x_0^2), and threshold θ are defined as shown in Table 1. We have obtained P_search with the simulation and show it for each sub-domain with varying values of θ in Figure 6. Detailed experiments have been performed for θ = 10, 11, ..., 15. As a result, it is clear that 13, 14, or 15 is an appropriate value for θ, for which the robot is neither too sensitive nor too insensitive. Interestingly, there is a phase transition between θ = 12 and 13, which illustrates the importance of simulation in investigating the behavior of a system composed of swarm robots.
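A minimal Monte Carlo sketch of such a simulation follows (not the authors' implementation). It propagates one robot pose with the Gaussian motion model estimated in Sect. 4.1 and counts visits per sub-domain; the 1-sec step, the time limit, and the initial state come from the text and Table 1, while the 40 × 40 cm grid cells, the frame conventions, and the boundary clamping are assumptions of ours.

```python
import math
import random

MU_FWD = (0.069, 17.24)                      # mean (lateral, forward) change [cm]
SD_FWD = (math.sqrt(0.52), math.sqrt(6.76))  # std devs from the diagonal of Sigma_p
MU_DIR, SD_DIR = -1.47, 3.57                 # direction change [deg]

def step(x, y, deg):
    """One 1-sec forward move: sample the pose change and apply it in the world frame."""
    dx = random.gauss(MU_FWD[0], SD_FWD[0])  # lateral component
    dy = random.gauss(MU_FWD[1], SD_FWD[1])  # forward component
    rad = math.radians(deg)
    x += dy * math.cos(rad) - dx * math.sin(rad)
    y += dy * math.sin(rad) + dx * math.cos(rad)
    return x, y, deg + random.gauss(MU_DIR, SD_DIR)

def visit_counts(T=1000, cell=40):
    """Simulate one robot for T steps and count visits per grid cell of D."""
    counts = {}
    x, y, deg = 0.0, -100.0, 90.0            # initial state of robot 1 (Table 1)
    for _ in range(T):
        x, y, deg = step(x, y, deg)
        x = max(-200.0, min(200.0, x))       # clamp to D = [-200, 200]^2
        y = max(-200.0, min(200.0, y))
        key = (int((x + 200) // cell), int((y + 200) // cell))
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Normalizing the counts by T gives an empirical visiting probability per sub-domain, comparable in spirit to the quantity plotted in Fig. 6.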
Similarly, we have obtained P_collision for each sub-domain and show the results in Figure 7. The results show two phase transitions, and we can conclude that θ = 13 is the best value.
5 Conclusion
This paper has proposed a DBN model to simulate the behavior of swarm robots, and calculated the probability distributions of this model using the observed real behavior of the swarm robots. In particular, in the case of two robots and several thresholds for the sensor, we obtained the probabilities of visits and collisions, and analyzed the results of our simulations. We believe that our simulation helps us to design the controller of the robot by decreasing the number of real experiments. In future work, we will introduce a new method to discover appropriate values for the parameters.
References

1. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 2nd edn. Prentice Hall, Upper Saddle River (2003)
2. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
3. Murphy, K.P.: Dynamic Bayesian Networks: Representation, Inference and Learning. PhD thesis, University of California, Berkeley (2002)
4. Nouyan, S., Campo, A., Dorigo, M.: Path Formation in a Robot Swarm: Self-organized Strategies to Find Your Way Home. Swarm Intelligence 2(1), 1–23 (2008)
5. Toussaint, M., Goerick, C.: Probabilistic inference for structured planning in robotics. In: Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), pp. 3068–3073 (2007)
6. Suzuki, E., Hirai, H., Takano, S.: Toward a Novel Design of Swarm Robots Based on the Dynamic Bayesian Network. In: Ras, Z.W., Dardzinska, A. (eds.) Advances in Data Management. Springer Studies in Computational Intelligence. Springer, Heidelberg (accepted for publication)
7. Japan Robotech, http://www.japan-robotech.com/
Hybridizing River Formation Dynamics and Ant Colony Optimization

Pablo Rabanal and Ismael Rodríguez

Dept. Sistemas Informáticos y Computación, Facultad de Informática, Universidad Complutense de Madrid, 28040 Madrid, Spain
[email protected], [email protected]

Abstract. River Formation Dynamics (RFD) is an evolutionary computation method based on copying how drops form rivers by eroding the ground and depositing sediments. In a rough sense, this method can be seen as a gradient-oriented version of Ant Colony Optimization (ACO). Several experiments have shown that the gradient orientation of RFD makes this method solve problems differently from ACO. In particular, RFD typically performs deeper searches, which in general makes it find worse solutions than ACO in the first execution steps, though RFD solutions surpass ACO solutions after some more time passes. In this paper we try to get the best features of both worlds by hybridizing RFD and ACO, in particular by using a kind of ant-drop hybrid and considering both pheromone trails and altitudes in the environment.

Keywords: River Formation Dynamics, Ant Colony Optimization Algorithms, Heuristic Algorithms, NP-hard problems.
1 Introduction
River Formation Dynamics (RFD) [8–10, 7] is an evolutionary computation method [3, 4] related to Ant Colony Optimization (ACO) [2, 1]. Roughly speaking, RFD can be seen as a gradient-oriented version of ACO in which, in particular, a different nature-inspired metaphor is considered. RFD is based on copying how water forms rivers in nature. Water transforms the environment by eroding the ground when it falls through a steep decreasing slope, and it deposits carried sediments when flatter ground is reached. In this way, the altitudes of places are decreased/increased, and paths of decreasing gradient are dynamically constructed. These slopes are followed by subsequent drops that create new gradients, reinforcing the best ones. Eventually, paths obtained by consecutively taking the steepest decreasing gradients constitute good paths from raining places to the sea. In RFD, drops tend to fall through steep decreasing gradients with higher probability. Thus, the probabilistic decision of where a drop will move next depends
Research partially supported by projects TIN2006-15578-C02-01, CCG08-UCM/ TIC-4124, and UCM-BSCH GR58/08 - group number 910606.
G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 424–431, 2011. © Springer-Verlag Berlin Heidelberg 2011
Hybridizing River Formation Dynamics and Ant Colony Optimization
on the difference of some values (the difference of altitudes between the origin and the destination) rather than on the values themselves (in the case of ACO, the pheromone trail of the edge leading to the destination). From the beginning of the execution of RFD, any path from the origin point to the target point has a gradient that, considering the path as a whole (i.e. from the origin to the target), must be decreasing. Hence, all complete paths from the origin to the target that do not include climbing steps provide some incentive for drops.1 As a result, from the beginning of the execution the number of potential paths providing a non-negligible incentive for drops, and thus simultaneously competing for the drops' preferences, is huge. This improves the depth of the analysis but delays the formation of champion paths. In general, solutions constructed by RFD are better, though ACO obtains acceptable solutions faster. In this paper we develop a hybrid ACO-RFD method aiming at getting the best features of both methods. In order to enable both methods, each graph node is decorated with an altitude value, whereas each edge is endowed with a pheromone trail value. Hybrid drop-ant entities are released in this environment, and both pheromone trails and altitudes have some weight in deciding where these entities move next. After a movement, both the pheromone trail and the altitude of the place are modified according to each method. The resulting hybrid method is thus influenced by some values taken as they are (pheromone trails) and also by some derivative values (differences of altitudes). Though the tendency of RFD to provide better solutions over longer times has been observed in NP-hard problems of different natures, such as the Traveling Salesman Problem [8] and the Minimum Load Sequence [7], it has been observed that RFD performs particularly well in problems that consist in creating a kind of covering tree over a given graph (see [10]).
These are problems where the goal is to construct a tree covering some graph nodes in such a way that a given property is met or a given value is minimized/maximized. In order to adapt RFD to these problems, we assign the lowest altitude to one of the nodes to be covered (it becomes the sea) and we make drops rain at the rest of the nodes to be covered. Eventually, gravity makes drops form a tree of joining tributaries from the departure points to the sea, which plays the role of the tree root. In this paper we consider one such NP-hard problem as a benchmark to test our hybrid method, namely MDV (Minimum Distances tree in a Variable-cost graph, see [9, 10]; this problem will be introduced in Section 3). Let us note that, in this particular problem, RFD does not only provide better solutions than ACO, but it also takes comparable or even lower times to get these solutions. Thus, in this paper the goal of hybridizing RFD and ACO is to improve the results of RFD for a problem where RFD fits particularly well, and ACO plays the role of supporting method in this particular hybridization. Interestingly, the introduction of the ACO part in the RFD scheme does not reduce the times required to get reasonable solutions, as one might expect, but it improves the
1 One of the improvements applied to the basic RFD scheme consists in allowing drops to climb increasing gradients with some low probability (see [8]). In this case, even paths with climbing steps provide some incentive for drops.
P. Rabanal and I. Rodríguez
quality of final solutions indeed. The reasons for this behavior will be discussed in the next sections. Studying the hybridization in problems where neither method is dominant is left as future work. Since the hybridization improves the results in a stress benchmark case where one method dominates the other, we think that the application of the hybrid method to more balanced problems (where the hybridization should fit more naturally) is very promising. The rest of the paper is organized as follows. In the next section we briefly present the RFD method. Then, in Section 3 we describe the MDV problem, while in Section 4 we introduce the hybrid method and compare the results obtained by each method (RFD, ACO, and the RFD-ACO hybrid) when applied to the MDV benchmark. Finally, in Section 5 we present our conclusions.
2 Brief Introduction to River Formation Dynamics
In this section we briefly introduce the basic structure of River Formation Dynamics (RFD) (for further details, see [8, 10]). Given a working graph, we associate altitude values with nodes. Drops erode the ground (they reduce the altitude of nodes) or deposit sediment (they increase it) as they move. The probability that a drop takes a given edge instead of others is proportional to the gradient of the down slope along the edge, which in turn depends on the difference of altitudes between both nodes and the distance (i.e. the cost of the edge). At the beginning, a flat environment is provided, that is, all nodes have the same altitude. The exception is the destination node, which is a hole (the sea). Drops are unleashed (i.e. it rains) at the origin node(s), and they spread around the flat environment until some of them fall into the destination node. This erodes adjacent nodes, which creates new down slopes, and in this way the erosion process is propagated. New drops are inserted at the origin node(s) to transform paths and reinforce the erosion of promising paths. After some steps, good paths from the origin(s) to the destination are found. These paths are given in the form of sequences of decreasing-gradient edges from the origin to the destination. Several improvements are applied to this basic general scheme (see [8, 10]). Compared to a related well-known evolutionary computation method, Ant Colony Optimization, RFD provides some advantages. On the one hand, local cycles are not created and reinforced, because they would imply an ever-decreasing cycle, which is contradictory. Though ants usually take into account their past path to avoid repeating nodes, they cannot avoid being led by pheromone trails through some edges in such a way that a node must be repeated in the next step.2 However, altitudes cannot lead drops to these situations.
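The drop-movement rule just described can be sketched as follows, with probabilities proportional to the (non-negative) decreasing gradient; the function names and graph representation are ours, and uphill or flat edges simply get zero probability here, whereas the full method also lets drops climb with a small probability (see [8]).

```python
import random

def choose_next(node, altitude, edges, rng=random):
    """edges[node]: list of (neighbor, cost); returns the chosen neighbor or None."""
    grads = []
    for nb, cost in edges[node]:
        g = (altitude[node] - altitude[nb]) / cost   # decreasing gradient of the edge
        grads.append((nb, max(g, 0.0)))              # uphill/flat edges get weight 0
    total = sum(g for _, g in grads)
    if total == 0.0:
        return None                                  # flat surroundings: no downhill move
    r = rng.uniform(0.0, total)                      # roulette-wheel selection
    acc = 0.0
    for nb, g in grads:
        acc += g
        if r <= acc:
            return nb
    return grads[-1][0]
```

With equal edge costs, a drop at a node always prefers neighbors that sit strictly lower, which is exactly why reinforced cycles cannot form.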
Moreover, since drops do not have to worry about following cycles, in general drops do not need to be endowed with a memory of previous movements, which releases some computational memory and reduces some execution time. On the other hand, when a shorter path is found in RFD, the subsequent reinforcement of the path is fast: since the same origin and destination are concerned in both the old and
2 Usually, this implies either repeating a node or killing the ant. In both cases, the last movements of the ant were useless.
the new path, the difference of altitude is the same but the distance is different. Hence, the edges of the shorter path necessarily have higher down slopes and are immediately preferred (on average) by subsequent drops. Finally, the erosion process provides a method to avoid inefficient solutions, because sediments tend to accumulate in blind alleys (in our case, in valleys). These nodes are filled until eventually their altitude matches that of adjacent nodes, i.e., the valley disappears. This differs from typical methods to reduce pheromone trails in ACO: usually, the trails of all edges are periodically reduced at the same rate. On the contrary, RFD intrinsically provides a focused punishment of bad paths where, in particular, those nodes blocking alternative paths are modified. When there are several departing points (i.e. it rains at several points), RFD does not in general tend to provide the shortest path (i.e. river) from each point to the sea. Instead, as happens in nature, it tends to provide a tradeoff between quickly gathering individual paths into a small number of main flows (which minimizes the total size of the formed tree of tributaries) and actually forming short paths from each point to the sea. For instance, meanders are caused by the former goal: we deviate from the shortest path just to collect drops from a different area, thus reducing the number of flows. On the other hand, new tributaries are caused by the latter one: by not joining the main flows, we can form tailored short paths from each origin point. Solving the problem considered in this paper, MDV, requires encouraging the latter of these behaviors. Interestingly, we can make RFD tend towards either of these choices by setting a single parameter. In particular, if we reduce the erosion caused by high flows, then the incentive for drops to join each other is partially reduced, and thus each drop tends to follow its own shortest path (see [9] for details).
Thus, adapting RFD to MDV is straightforward.
3 Problem Definition
In this section we briefly describe the problem to be solved in this paper, MDV (Minimum Distances tree in a Variable-cost graph). A formal definition of the problem and a proof of its NP-completeness can be found in [10]. MDV can be stated as follows. Let us consider a variable-cost graph, i.e. a graph where costs are assigned to edges and the cost of traversing a given graph edge depends on the path traversed so far. That is, if we traverse e after following a path σ, then the cost of adding e to the path is c_{e,σ}; in general, we have c_{e,σ} ≠ c_{e,σ'} for another path σ'. Besides, let us consider a subset of graph nodes required to be covered, called origin nodes, and a specific node called the destination node. Then, the goal of MDV is to construct a tree in this graph such that the sum of the costs of the paths connecting the origin nodes with the destination node through the tree is as small as possible. Let us note that nodes that are neither origin nodes nor the destination node may be included in the tree or not, depending on whether they are useful to minimize the cost of paths. The previous minimization problem appears in any engineering domain where a set of origins must be joined with a destination d in such a way that distances
from each origin to d are minimized (in order to improve the quality of service, QoS). Let us suppose that we wish to construct a local area network for a new complex of facilities in such a way that computers in all offices and buildings are connected to the company central server. A tree topology is chosen for the network due to the ease of further extending such a topology with new switches/routers/networks without needing to change networking devices (see the role of the tree topology in networking in e.g. [5, 6]). Improving the QoS implies minimizing distances from each node to the central node. Let us note that using variable-cost graphs allows us to consider that the cost of an edge (in terms of QoS loss) depends on the origin of the path we have traversed before reaching the edge. For instance, the QoS of an edge denoting a fast but not very reliable wire is not high for transmitting data coming from a node typically running real-time applications (in this case, meeting some minimal speed in all situations is critical). However, the QoS of this edge may be high for transmitting data coming from an e-mail server (a high bandwidth is good for transmitting a huge amount of e-mails, and occasional downtime is acceptable). MDV allows us to represent the dependencies of edge costs on the path traversed so far as follows. We assume that the cost of a path of edges e1, ..., en from a given origin node o to a given destination node d depends on the evolution of a variable through the path. Initially, a value v_o is assigned to this variable at node o. Then, the cost added to the path due to the inclusion of edge e1 is an amount depending on v_o. After traversing e1, the value of the variable is updated to a new value v1. Next, the cost of adding e2 to the path depends on v1. After taking e2, the value of the variable is updated again, and the process continues in this way until we obtain the whole cost of the path e1, ..., en.
Thus, an MDV problem instance consists of a directed graph, a set of nodes considered as origin nodes, the destination node, the initial value given to the variable at each origin node, the way in which the variable changes its value at each edge, and the way in which the cost of each edge depends on the variable value (for a formal definition, see [10]).
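Under these definitions, evaluating the cost of a single path is a simple fold over its edges. The sketch below uses a hypothetical toy instance with table-driven cost and update functions; none of the concrete numbers come from the paper.

```python
def path_cost(path_edges, v0, cost_of, update):
    """path_edges: e1..en; cost_of(e, v) -> edge cost; update(e, v) -> new variable value."""
    total, v = 0, v0
    for e in path_edges:
        total += cost_of(e, v)   # cost of adding e depends on the current value
        v = update(e, v)         # traversing e updates the variable
    return total

# Hypothetical toy instance: two edges, two variable values, costs in [1, 10].
cost_table = {('e1', 0): 3, ('e1', 1): 7, ('e2', 0): 2, ('e2', 1): 9}
next_table = {('e1', 0): 1, ('e1', 1): 0, ('e2', 0): 0, ('e2', 1): 1}
cost = path_cost(['e1', 'e2'], 0,
                 lambda e, v: cost_table[(e, v)],
                 lambda e, v: next_table[(e, v)])
# e1 with v=0 costs 3 and sets v=1; e2 with v=1 then costs 9, so cost == 12
```

Note that traversing the same edges in a different order, or arriving with a different initial value, generally yields a different total cost, which is precisely what makes the graph "variable-cost".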
4 Implementation of the Hybrid Method and Experimental Results
In this section we introduce the RFD-ACO hybrid method and present some experimental results. In order to obtain the best characteristics of ACO and RFD, we combine them in the following manner. Instead of using ants and drops to traverse the nodes of the graph, we create a new entity called an ant-drop that contains all the attributes of an ant as well as all the attributes needed for defining a drop. This implies that the ant-drop consumes more memory resources than an ant or a drop, though many attributes are in fact common to both entities. When the ant-drop has to decide its new destination, it takes into account the pheromone trails deposited on all outgoing edges as well as the altitude gradients of these edges. The probability p_ij of going from a node i to another node j through edge ij is calculated as follows: First, we compute the
probability p_ij^ant that an ant would have of taking edge ij, as well as the probability p_ij^drop that a drop would have of taking such an edge. Both probabilities are computed as if the method were pure ACO or pure RFD, respectively (i.e. taking into account only pheromone trails and only altitudes, respectively). Then, p_ij is calculated by taking into account the relative weight we give to each method in our hybridization. Let w_ACO be the weight of the ACO method and w_RFD be the weight of RFD, where w_ACO + w_RFD = 1. The probability for an ant-drop to take the edge ij is p_ij = p_ij^ant · w_ACO + p_ij^drop · w_RFD. After moving, pheromone values and altitudes are updated as each individual method would do. Let us note that the relative weights of each method, i.e. w_RFD and w_ACO, need not be fixed along the execution. On the contrary, we start the execution with w_RFD = 1, and this value is slowly reduced to 0. Then, the value is slowly raised again up to 1. Though we observed that considering fixed values of w_RFD and w_ACO provides worse results than fluctuating like this, we also observed that different patterns of temporal fluctuation provide very similar results. After the hybrid method is executed for some time, we start a post-processing step for constructing the solution of MDV, that is, a tree. We consider any origin node and proceed as follows: we select the outgoing edge with the highest p_ij, we add this edge to the solution tree, and we traverse the edge to reach the node it leads to. We do the same for this node, and so on until the destination is reached. Next we take another origin node and do the same until we reach a node already included in the tree (and we join the branch to the current tree, thus introducing a tributary) or the destination is reached. We do the same for all origin nodes and, finally, a tree leading all origin nodes to the destination is obtained. Next we report some experimental results.
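Before turning to the experiments, the combined transition rule and the fluctuating weight can be sketched as follows; the triangular weight schedule is our assumption, since the paper only states that w_RFD varies slowly between 1 and 0 and back.

```python
def hybrid_probs(p_ant, p_drop, w_rfd):
    """p_ant, p_drop: dicts edge -> probability; returns the combined distribution."""
    w_aco = 1.0 - w_rfd                      # weights sum to 1
    return {e: w_aco * p_ant[e] + w_rfd * p_drop.get(e, 0.0) for e in p_ant}

def w_rfd_schedule(t, period=100):
    """Triangular fluctuation of w_RFD: 1 -> 0 over half a period, then back to 1."""
    phase = (t % period) / period            # in [0, 1)
    return 1.0 - 2 * phase if phase < 0.5 else 2 * phase - 1.0
```

If both input distributions sum to 1 over the same edge set, the combined distribution does too, so the ant-drop's choice remains a proper probabilistic decision for any mixing weight.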
We compare the results obtained by using an ACO method, the solutions given by RFD, and the solutions found by the hybrid method (from now on, HYB). All experiments were performed on an Intel Core Duo T7250 processor at 2.00 GHz. Problem instances are graphs with 100, 150, and 200 nodes. These graphs were randomly generated in such a way that the cost of the edges, which depends on the current value of the variable, is always between 1 and 10. Variables can take up to 5 possible values. For each edge and each variable value, the cost of traversing the edge for this value, as well as the new variable value after traversing the edge, were randomly generated. In particular, features such as monotonicity or injectivity were not required in these correspondences. In order to report solutions that are not biased by a single execution, each algorithm was executed thirty times for each graph. Table 1 summarizes the best solution found by each method among the 30 executions, as well as the arithmetic mean. As we can see, for all graphs the hybrid method obtains the best solutions, considering both mean results and best results. Regarding the time needed to obtain good solutions in specific executions, Figure 1 and Figure 2 show the cost of the best solution found by each algorithm at each execution time (in seconds), where the input of the three algorithms was the graph with 100 and 200 nodes, respectively. As we can see, the hybrid method obtains the best solution after
430
P. Rabanal and I. Rodríguez
Fig. 1. MDV results for a 100-node graph

Fig. 2. MDV results for a 200-node graph
Table 1. Summary of results

Method  Graph size  Best result  Arithmetic mean
RFD     100 nodes   593.75       615.66
ACO     100 nodes   589.11       614.25
HYB     100 nodes   563.59       585.15
RFD     150 nodes   666.33       684.59
ACO     150 nodes   702.68       738.03
HYB     150 nodes   636.88       655.98
RFD     200 nodes   858.83       873.12
ACO     200 nodes   993.81       1082
HYB     200 nodes   848.56       870.98
some time passes. However, it is a little slower than RFD because of the extra cost needed to handle the information of both ACO and RFD. The improvement of the solution quality in the hybrid method is due to the way in which this method analyzes the search space. By combining both methods, the hybrid method avoids getting stuck in solutions that are a local optimum for one method but not for the other one. The alternation of the weight of each method along time allows each method to be dominant for some time. Though the two methods drive their entities in different ways, the underlying evolutionary computation model is related. Hence, the mechanics of both methods are partially compatible, which in turn allows both methods to partially collaborate in the formation of solutions.
5 Conclusions and Future Work
In this paper we have presented a hybrid evolutionary computation approach in which two previous methods, RFD and ACO, are mixed. In brief, the resulting method consists in driving entities in terms of both some absolute values (pheromone trails) and some derivative values (the differences of altitudes). As
a benchmark problem, we have considered MDV, an NP-complete problem where RFD provides better solutions than ACO and takes equal or smaller times to provide them. Interestingly, despite considering a problem where one of the methods dominates the other, the hybrid method improves on both individual methods. In particular, the inclusion of the ACO approach into the RFD scheme allows the resulting hybrid method to improve the quality of solutions. This is particularly remarkable because previous experiments showed that, in other more balanced problems such as the Traveling Salesman Problem (TSP), the strong point of ACO with respect to RFD was not the quality of final solutions (which was lower), but the smaller time needed to construct reasonable solutions. The mixture of RFD and ACO has proved its usefulness for solving MDV, which is a problem where a direct combination of the features of both methods would not be expected to be an improvement (ACO neither provides better solutions nor is faster than RFD for MDV). Thus, we think that the hybrid approach is particularly promising for solving other problems where a combination of the features of both methods would indeed be an improvement (such as TSP). Hence, our next step will be applying the hybrid method to TSP.
References

1. Dorigo, M., Gambardella, L.M.: Ant colonies for the traveling salesman problem. BioSystems 43(2), 73–81 (1997)
2. Dorigo, M., Maniezzo, V., Colorni, A.: Ant system: optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man and Cybernetics, Part B 26(1), 29–41 (1996)
3. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. Springer, Heidelberg (2003)
4. De Jong, K.A.: Evolutionary Computation: A Unified Approach. MIT Press, Cambridge (2006)
5. Perlman, R.: An algorithm for distributed computation of a spanning tree in an extended LAN. In: Symposium on Data Communications, SIGCOMM 1985, pp. 44–53. ACM, New York (1985)
6. Peterson, L.L., Davie, B.S.: Computer Networks: A Systems Approach, 3rd edn. Morgan Kaufmann, San Francisco (2007)
7. Rabanal, P., Rodríguez, I.: Testing restorable systems by using RFD. In: Cabestany, J., Sandoval, F., Prieto, A., Corchado, J.M. (eds.) IWANN 2009. LNCS, vol. 5517, pp. 351–358. Springer, Heidelberg (2009)
8. Rabanal, P., Rodríguez, I., Rubio, F.: Using river formation dynamics to design heuristic algorithms. In: Akl, S.G., Calude, C.S., Dinneen, M.J., Rozenberg, G., Wareham, H.T. (eds.) UC 2007. LNCS, vol. 4618, pp. 163–177. Springer, Heidelberg (2007)
9. Rabanal, P., Rodríguez, I., Rubio, F.: Finding minimum spanning/distances trees by using river formation dynamics. In: Dorigo, M., Birattari, M., Blum, C., Clerc, M., Stützle, T., Winfield, A.F.T. (eds.) ANTS 2008. LNCS, vol. 5217, pp. 60–71. Springer, Heidelberg (2008)
10. Rabanal, P., Rodríguez, I., Rubio, F.: Applying river formation dynamics to solve NP-complete problems. In: Chiong, R. (ed.) Nature-Inspired Algorithms for Optimisation. SCI, vol. 193, pp. 333–368. Springer, Heidelberg (2009)
Landmark Navigation Using Sector-Based Image Matching

Jiwon Lee and DaeEun Kim

Biological Cybernetics Lab, School of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, South Korea
{jiwonlee,daeeun}@yonsei.ac.kr
Abstract. Many insects return home by using environmental landmarks. They remember the image at their nest and find the homeward direction by comparing it with the current image. There has been robotics research modeling landmark navigation, focusing on how the image matching process can lead an agent starting from an arbitrary spot to return to the nest. According to Franz's navigation algorithm, an agent estimates the change of the image for each of its own possible movements, evaluates which directional movement produces the image pattern most similar to the snapshot taken at the nest, and finally chooses the best image-matching direction. Based on this idea, we suggest a new navigation approach in which the image is divided into several sectors and sector-based image matching is applied. It checks the occupancy and the distance variation for each sector. As a result, it shows better performance than Franz's algorithm.

Keywords: landmark navigation, bio-inspired navigation, biorobotics, image matching.
1 Introduction
Many insects and animals can return home accurately by using their own navigation system after exploring to find food. It is well known that desert ants, fiddler crabs, and honeybees use the path integration algorithm, also called dead reckoning [1, 2]. For path integration, the current moving speed and the angular information with respect to a reference, for example a light compass, are measured and integrated throughout traveling, and the resulting direction is selected for returning home. Without the help of any specific external landmark, animals and insects can implement path integration for their navigation. However, insects use both landmark navigation and path integration, because it is difficult to arrive home accurately by using only the path integration algorithm [3]. For landmark navigation, the visual environment near the nest is used as information about the goal position for homing. Various models to explain how insects navigate with environmental landmarks have been suggested [4–7]. According to the snapshot model [7], an agent

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 432–439, 2011. © Springer-Verlag Berlin Heidelberg 2011
Landmark Navigation Using Sector-Based Image Matching
Fig. 1. Estimation process in the image matching method. D_A: dot product H · image(A), where H is the home snapshot and A is one estimated position; D_B: dot product H · image(B), where B is another estimated position. R is the landmark distance and d is the moving distance of the agent.
remembers the snapshot image at its nest and compares it with new images at the current location. Then it chooses the direction best matching the snapshot. Since then, many different navigation models have been introduced. The average landmark vector (ALV) model draws a vector for each landmark, and the agent recognizes the visual environment as the summation of the local vectors [8]. The agent moves in the direction of the difference between two vectors: the landmark vector at the current position and the landmark vector at the nest. The algorithm easily finds an appropriate direction from any position, but if one small landmark is missed because of noise, the navigation direction is significantly influenced. The image matching algorithm suggested by Franz et al. [9] includes an estimation process that predicts new images for possible movements. An agent considers all possible directions from the current position, and each direction implies a new predicted image. This image is compared with the snapshot image at the nest, and the agent chooses the best matching direction. For this approach, all landmarks are assumed to be at an equal distance from the agent in order to generate the predicted image for each directional movement. Fig. 1 shows an example of this image matching method. If there is no reference compass, all possible head directions of the agent should be considered to estimate the best image matching direction. In spite of the equidistance assumption, it was shown that the approach can guide the agent to the nest when the nest is surrounded by landmarks. However, an agent may misjudge the homing direction, depending on the moving distance and the estimation of the landmark distance. Also, if large landmarks are present, the matching process tends to choose a direction biased by the large landmarks. In this paper, we suggest a new navigation algorithm, called the sector-based approach.
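For concreteness, a rough sketch of Franz-style matching over a 1-D panoramic image is shown below: for each candidate movement direction, an image is predicted under the equidistance assumption and the direction whose prediction best matches the home snapshot (by dot product) is chosen. The small-movement bearing warp in predict_image is a simplified placeholder of ours, not Franz et al.'s exact image prediction.

```python
import math

def predict_image(image, move_dir, d, R):
    """Shift each pixel's bearing for a move of distance d (d << R assumed)."""
    n = len(image)
    out = [0.0] * n
    for i, val in enumerate(image):
        theta = 2 * math.pi * i / n
        dtheta = (d / R) * math.sin(theta - move_dir)  # approximate bearing change
        j = int(round((theta + dtheta) * n / (2 * math.pi))) % n
        out[j] = max(out[j], val)
    return out

def best_direction(snapshot, current, d, R, n_dirs=36):
    """Return the candidate movement direction whose prediction best matches the snapshot."""
    dirs = [2 * math.pi * k / n_dirs for k in range(n_dirs)]
    score = lambda img: sum(a * b for a, b in zip(snapshot, img))
    return max(dirs, key=lambda m: score(predict_image(current, m, d, R)))
```

Note that a single wrongly estimated R skews every predicted bearing shift, which is one way to see why the equidistance assumption can make the agent misjudge the homing direction.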
The omnidirectional snapshot image is divided into several sectors. Predictive images are generated at the current position as Franz showed, and the predictive images and the snapshot image are compared sector by sector. Each sector evaluates the occupancy state of landmarks and the landmark distance for both the home snapshot and the predictive image. Then the parameters are compared with each other, and the agent chooses the direction with the best matching score. We will first introduce our sector-based method in detail and then demonstrate simulation experiments in a robotic environment.

J. Lee and D. Kim

Fig. 2. Sector division; the visual environment surrounded by four landmarks is divided into four non-overlapping sectors: (a) the honeybee heads to the front; (b) the orientation of the bee changes the sector occupancy (modified from [5])
2 Method
In this paper, we propose a new navigation algorithm with sector-based image matching. Anderson [5] used a sector-based analysis to unravel the honeybee's navigation. The visual information is divided into separate parts, called sectors. To compare two images, or to estimate how close they are, each image is divided into sectors, and each sector in one image is compared with the corresponding sector in the other image, one by one. Here, occupancy and distance are the factors evaluating the matching degree between a pair of sector images. The occupancy checks whether at least one landmark is found, and the distance factor evaluates the distance difference between objects if a sector is occupied. The distance can be estimated from the bearing angles of objects. By comparing the two images sector by sector, an agent can determine their matching level. The sector-based analysis calculates the matching score over a pair of images, and Anderson [5] applied it to see the distribution of the matching score around the honeybee's home. However, how an agent chooses its movement direction was not handled. In this paper, we use the sector-based image matching process to select the next movement based on the image snapshot. If predictive images are generated for possible directional movements from the current position under the equidistance assumption, each predictive image is compared with the snapshot image at the nest using the sector-based image matching. Here, an agent plans to head towards a place that is similarly surrounded by the landmarks. In Fig. 2(a), the visual environment is divided into four sectors. Four landmarks surround the robot, and when the robot heads to the front, it recognizes that all sectors are occupied, since each sector contains one landmark image. When the robot turns its head direction to the left, as shown in Fig. 2(b), the two front landmarks belong to the same sector, and thus the occupancy changes from 1 1 1 1 to 1 0 1 1 (sectors are counted from the head direction). We can also represent the distance for each occupancy as given below:

            (a)        (b)
Occupancy   1 1 1 1    1 0 1 1
Distance    a b b a    b 0 b a

Landmark Navigation Using Sector-Based Image Matching

Fig. 3. Estimation process in the sector-based matching method
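The per-sector occupancy and distance parameters above can be computed as in the following minimal sketch. The sector count, the use of a mean distance per occupied sector, and the (bearing, distance) input format are our own assumptions for illustration, not the authors' implementation.

```python
import math

def sector_features(landmarks, n_sectors=4, heading=0.0):
    """Return (occupancy, mean distance) per sector for landmarks given as
    (bearing, distance) pairs; sectors are counted from the head direction."""
    occupancy = [0] * n_sectors
    per_sector = [[] for _ in range(n_sectors)]
    width = 2 * math.pi / n_sectors
    for bearing, dist in landmarks:
        rel = (bearing - heading) % (2 * math.pi)   # bearing relative to heading
        s = int(rel // width)                       # which sector it falls in
        occupancy[s] = 1
        per_sector[s].append(dist)
    mean_dist = [sum(d) / len(d) if d else 0.0 for d in per_sector]
    return occupancy, mean_dist
```

With this representation, turning the head (changing `heading`) can merge two landmarks into one sector and empty another, reproducing the 1 1 1 1 to 1 0 1 1 change illustrated in Fig. 2.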
The matching score between the two images is calculated as follows:

    m_i = Occupancy matching × (1 − Distance difference),   (1)

where there are N viewpoints (head directions) and the degree of distance difference is normalized into the range [0, 1]. The matching score m_i is calculated for each head direction i = 1, ..., N, and the maximum score becomes the representative value at a specific point, called the local maximum f_j:

    f_j = max_i m_i,   j = 1, ..., M,   (2)

where M is the number of possible moving directions used to build predictive images. We then estimate the best matching direction, the j*-th angular direction, which is the homing direction for the agent to choose:

    j* = arg max_j f_j.   (3)
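Equations (1)-(3) can be sketched as follows. The paper leaves the occupancy-matching and distance-difference terms abstract, so the concrete choices here (fraction of agreeing occupancy bits; normalized mean absolute distance difference) are our assumptions.

```python
def matching_score(occ_a, occ_b, dist_a, dist_b, max_dist=1.0):
    """Equation (1): occupancy matching times (1 - normalized distance difference)."""
    n = len(occ_a)
    occ_match = sum(a == b for a, b in zip(occ_a, occ_b)) / n
    dist_diff = sum(abs(a - b) for a, b in zip(dist_a, dist_b)) / (n * max_dist)
    return occ_match * (1.0 - dist_diff)

def best_direction(candidates):
    """candidates[j] holds the scores m_i over all head directions for move j.
    Returns j*, the move whose local maximum f_j is globally largest (eqs. 2-3)."""
    f = [max(m) for m in candidates]              # equation (2): local maxima
    return max(range(len(f)), key=f.__getitem__)  # equation (3): arg max over moves
```

Taking the maximum over head directions inside each move (equation (2)) is what makes the method compass-free: the best head direction is searched rather than given.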
Using Franz's predictive model with the equidistance assumption, the robot estimates the matching score for the possible directional movements. The matching score f_j is evaluated for each direction, and the robot chooses the best matching direction by finding the global maximum. Fig. 3 illustrates this estimation process: predictive images are generated for each directional movement under the assumption that all landmarks are at the same distance from the current position, with a fixed moving distance for all directions.

Table 1. Predictive images for possible movements (four directions: north, south, east, west); the robot's current position is P(500, 450) and home is at H(500, 500). Rows: direction (P image, North, South, East, West); columns: ratio 0.1, 0.2, 0.3, 0.4, 0.5 (images omitted).

Table 1 shows how the images are predicted at the current position. The robot can move in any direction from the current position P(500, 450), and the robot
predicts possible images distorted from the snapshot image (the P image in Table 1) at the current position. These images are compared with the snapshot at home. If the robot moves further to the south, the landmark images in the northern area become closer. The ratio ρ = d/R is obtained from the moving distance d and the landmark distance R; we assume R = 100 cm in this paper. Thus, a larger ratio means a larger movement in the preferred direction. The sector-based approach follows the basic scheme of Franz's image matching process and needs to consider all possible head directions to find the best matching direction. If a reference compass is available, the algorithm becomes simpler. Here, the sector-based approach estimates the matching score sector by sector, rather than by pixel-by-pixel comparison, which yields more robust navigation in noisy environments.
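The predictive geometry under the equidistance assumption can be sketched as follows. This is our own geometry in units of R (so the move enters as the ratio ρ = d/R), and it predicts only landmark bearings rather than full images.

```python
import math

def predict_bearings(bearings, rho, theta):
    """Predict landmark bearings after moving distance d = rho * R in direction
    theta, assuming every landmark lies at the same distance R (taken as 1)."""
    new = []
    for b in bearings:
        # landmark position relative to the agent after the move, in units of R
        x = math.cos(b) - rho * math.cos(theta)
        y = math.sin(b) - rho * math.sin(theta)
        new.append(math.atan2(y, x))
    return new
```

A landmark straight ahead keeps its bearing when the agent moves toward it, while off-axis landmarks drift backward, which is the distortion Table 1 tabulates for increasing ratios.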
3 Experiments
We tested our sector-based image matching against Franz's pixel-based image matching. We assumed five cylindrical landmarks in the environment, with the home position at (500, 500), as shown in Fig. 4. A robot is supposed to return home after exploration. According to our sector-based approach, the robot predicts images for all possible directions and then compares the predictive images with the snapshot at the nest. Sector-based comparison between the two images determines the best matching direction. Here, we assumed 72 available directions (5-degree resolution) and four non-overlapping sectors. By the equidistance assumption, the robot assumes all landmarks are at the same distance R = 100 cm. Fig. 4(a)-(b) shows the vector map, where each arrow represents the best matching direction that the robot chooses. When the robot is placed at an arbitrary position, it ultimately returns home by applying the sector-based matching repeatedly. The ratio ρ = d/R also influences the performance: the high ratio ρ = 0.5 triggers more consistent directional movement
Fig. 4. Vector map; 72 moving directions and 72 head directions were tested, and (500, 500) is the home position: (a) ρ = 0.1 with the sector-based approach; (b) ρ = 0.5 with the sector-based approach; (c) ρ = 0.1 with Franz's method; (d) ρ = 0.5 with Franz's method
to the nest, as shown in Fig. 4(b). With Franz's method, there is only a catchment area within which the robot can reach the nest; especially in the boundary areas between objects, it shows low performance. In our experiments, no reference compass is given, so the robot must decide its next directional movement purely from image information. Thus, the robot tries to match the images over all possible head directions. Here, we tested varying numbers of head directions. A new head direction implies a new viewpoint over the landmarks, as shown in Fig. 2; the occupancy and distance difference for each sector change depending on the head direction. Fig. 5 shows that more viewpoints allow the robot to follow the homeward direction more easily. For a more detailed comparison between Franz's method and the sector-based matching method, we calculated the angular errors between the estimated direction and the desired direction. Here, the desired direction is the direction of the direct route from a given position to the nest position (500, 500). The best matching direction was calculated for a number of samples within a given distance from the nest, and this direction was compared with the desired direction. Fig. 6(a) shows that the error tends to decrease at far distances, and that the distance ratio influences the performance. The sector-based approach shows much better
Fig. 5. Vector map by the sector-based algorithm (R = 100, ρ = 0.1, 72 moving directions); (a) 6 head directions tested; (b) 36 head directions tested

Fig. 6. Comparison between the sector-based approach and Franz's approach; the performance is measured as the angular difference between the estimated direction and the desired direction, for ratios r = 0.1, 0.3, 0.5, 0.7 as a function of the distance from home: (a) varying ratios and distances with the sector-based approach; (b) varying ratios and distances with Franz's approach
performance than Franz's approach. In Fig. 6(b), there is a large fluctuation of performance depending on the robot position, and the distance ratio also significantly influences the performance, since the ratio specifies the next position to be checked. A high ratio in the area around the nest yields wrong decisions on the moving direction. Pixel-based matching may therefore not be effective unless prior information about the distance from the nest is available. In contrast, the sector-based approach shows consistently good results at large distances, while its performance degrades in the area close to the nest. In spite of the equidistance assumption, the sector-based approach can guide the robot in the homeward direction over a large range of distance ratios. It is more efficient and robust than Franz's approach.
4 Conclusion
In this paper, we developed a new image matching approach, called sector-based landmark navigation. We follow the predictive image matching suggested by
Franz et al. [9], where the snapshot image at the current position is converted into a collection of predictive images depending on the directions in which the robot moves. All the predictive images are compared with the snapshot image at the nest, sector by sector. This sector-based image matching significantly improves the quality of the trajectories for returning home from an arbitrary spot. We showed that the sector-based image matching is significantly better than Franz's pixel-based matching method. In addition, the sector-based approach makes it possible to reach a desired performance level with low-resolution images. Here, we handled only four sectors in the image matching process. Varying the number of sectors may affect the efficiency of the proposed algorithm. Furthermore, how the sectors are divided may be another factor influencing the performance; for example, an overlap between neighboring sectors could improve the navigation performance in noisy environments. We leave these studies for future work. If a reference compass is available, the sector-based algorithm becomes more efficient, since we do not need to check all head directions to match images; this will substantially reduce the computation time of the algorithm. In future work, we will also compare the sector-based approach and Franz's approach when a reference compass is given.

Acknowledgments. This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No. 2009-0080661).
References
1. Collett, M., Collett, T.: How do insects use path integration for their navigation? Biological Cybernetics 83, 245–259 (2000)
2. Zeil, J., Hemmi, J.: The visual ecology of fiddler crabs. Journal of Comparative Physiology A 192, 1–25 (2006)
3. Collett, M., Collett, T.S., Bisch, S., Wehner, R.: Local and global vectors in desert ant navigation. Nature 394, 269–272 (1998)
4. Collett, T.S., Land, M.F.: Visual spatial memory in a hoverfly. Journal of Comparative Physiology 100, 59–84 (1975)
5. Anderson, A.M.: A model for landmark learning in the honeybee. Journal of Comparative Physiology A 114, 335–355 (1977)
6. Wehner, R., Räber, F.: Visual spatial memory in desert ants, Cataglyphis bicolor (Hymenoptera: Formicidae). Experientia 35, 156–157 (1979)
7. Cartwright, B., Collett, T.: Landmark learning in bees. Journal of Comparative Physiology A 151, 521–543 (1983)
8. Möller, R.: Insect visual homing strategies in a robot with analog processing. Biological Cybernetics 83, 231–243 (2000)
9. Franz, M.O., Schölkopf, B., Mallot, H.A., Bülthoff, H.H.: Where did I take that snapshot? Scene-based homing by image matching. Biological Cybernetics 79, 191–202 (1998)
A New Approach for Auto-organizing Groups of Artificial Ants

Hanane Azzag and Mustapha Lebbah

LIPN-UMR 7030, Université Paris 13 - CNRS, 99, av. J-B Clément, F-93430 Villetaneuse
{Hanane.Azzag,Mustapha.Lebbah}@lipn.univ-paris13.fr
Abstract. We present in this paper a new combined clustering algorithm based on two biomimetic models: artificial ants and self-organizing maps (SOM). We describe the main principles of our method, which aims at auto-organizing a group of homogeneous ants (data items). We show how these principles can be applied to the problem of data clustering.
1 Introduction
The initial goal of this work is to build a new clustering algorithm and to apply it to several domains. Data clustering is identified as one of the major problems in data mining. The popularity of the clustering problem and its many variations have given birth to several methods of resolution [1]. These methods can use either heuristic or mathematical principles. We propose in this paper a new approach which builds a tree-structured clustering of the data. This method simulates a new biological model: the way ants build structures by assembling their bodies together. Ants start from a point (the support) and progressively become connected to this point, and recursively to the firstly connected ants. Ants move in a distributed way over the living structure in order to find the best place to connect. This behavior can be adapted to build a tree from the data to be clustered. To simulate the starting point (support), we provide an original initial organization of the data (ants) by using another biological model which has inspired researchers for many years. The self-organizing map (SOM) is a biomimetic method introduced by Kohonen and inspired by the human neural network [2]. SOM is an unsupervised method which auto-organizes data sets on a 2D grid (map) and which uses a local neighborhood function to preserve the topological properties of the real model.

The numerous abilities of ants have inspired researchers for more than ten years in the design of new clustering algorithms [3][4]. The initial and pioneering work in this area is due to Deneubourg and his colleagues [3]. The authors were interested in the way real ants sort objects in their nest by carrying them and dropping them in appropriate locations without a centralized control policy. The next step toward data clustering was taken in [4], whose authors adapted the previous algorithm by considering that an object is a datum and by tuning the picking/dropping probabilities according to the similarities between data. These ant-based algorithms inherit interesting properties from real ants, such as the local/global optimization of the clustering, the absence of any need for a priori information on an initial clustering or the number of classes, and parallelism. Furthermore, the results are presented as a visualization, a property which is coherent with an important current trend in data mining called "visual data mining", where results are presented in a visual and interactive way to the domain expert.

G. Kampis, I. Karsai, and E. Szathmáry (Eds.): ECAL 2009, Part II, LNCS 5778, pp. 440–447, 2011. © Springer-Verlag Berlin Heidelberg 2011
2 From Real to Artificial Ants: The 'AntTree' Algorithm

2.1 Biological Inspiration

The self-assembly behavior of individuals can be observed in several insects, such as bees or ants (see a survey with spectacular photographs in [5]). We are interested here in the complex structures that are built by ants. We have specifically studied how two species build such structures, namely the Argentine ant Linepithema humile and ants of the genus Oecophylla, which we briefly describe here [6][7]: these insects may become fixed to one another to build living structures with different functions. Ants may thus build "chains of ants" in order to fill a gap between two points. These structures disaggregate after a given time.

2.2 Artificial Ants
In our classic algorithm called AntTree [8], each ant a_i, i ∈ [1, N], represents one datum d_i. An ant a_i moves over the support, denoted by a_0, or over another ant, denoted by a_pos (see the ants colored in gray in Figure 1). The similarity measure used will be defined in Section 3.1; for now we denote it Sim(i, j). The main principles of our deterministic algorithm are the following: at each step, an ant a_i is selected from a sorted list of ants (we will explain how this list is sorted in the following section) and will connect itself or move according to its similarity with its neighbors. While there is still a moving ant a_i, we simulate an action for a_i according to its position (i.e., on the support or on another ant). In the following, a_pos denotes the ant or the support over which the moving ant a_i is located, and a+ is the ant (daughter) connected to a_pos which is the most similar to a_i (see Figure 1). For clarity, consider now that ant a_i is located on an ant a_pos and that a_i is similar to a_pos. As will be seen in the following, when an ant moves toward another one, it means that it is similar enough to that ant. So a_i will become connected to a_pos provided that it is dissimilar enough to the ants already connected to a_pos; a_i will thus form a new sub-category of a_pos which is as dissimilar as possible from the other existing sub-categories. For this purpose, let us denote by T_Dissim(a_pos) the lowest similarity value observed among the daughters of a_pos. a_i is connected to a_pos if and only if the connection of a_i further decreases this value. The test that we perform consists in comparing a_i to the most similar
Fig. 1. Disconnection of a group of ants
ant a+. If these two ants are dissimilar enough (Sim(a_i, a+) < T_Dissim(a_pos)), then a_i is connected to a_pos; otherwise it moves toward a+. Since the minimum value T_Dissim(a_pos) can only be computed with at least two ants, the first two ants are automatically connected without any test. This may result in an "abusive" connection for the second ant. Therefore, the second ant is removed and disconnected as soon as a third ant is connected (for this latter ant, we are certain that the dissimilarity test has been successful). When this second ant is removed, all ants that were connected to it are also dropped, and all these ants are placed back onto the support (see Figure 1). This algorithm can thus be stated, for a given ant a_i, as the following ordered list of behaviour rules:

R1 (no ant or only one ant connected to a_pos): a_i connects to a_pos

R2 (2 ants connected to a_pos, and for the first time):
1. Disconnect the second ant from a_pos (and recursively all ants connected to it)
2. Place all these ants back onto the support a_0
3. Connect a_i to a_pos

R3 (more than 2 ants connected to a_pos, or 2 ants connected to a_pos but for the second time):
1. Let T_Dissim(a_pos) be the lowest dissimilarity value between daughters of a_pos (i.e., T_Dissim(a_pos) = min Sim(a_j, a_k), where a_j and a_k ∈ {ants connected to a_pos})
2. If a_i is dissimilar enough to a+ (Sim(a_i, a+) < T_Dissim(a_pos)), then a_i connects to a_pos
3. Else a_i moves toward a+
When ants are placed back on the support, they may find another place to connect using the same behavior rules. It can be observed that, for any node of the tree, the value T_Dissim(a_pos) only decreases, which ensures the termination and convergence of the algorithm. One should notice that no parameters or predefined thresholds are necessary for using our algorithm: this is one major advantage, because ant-based methods often need parameter tuning, and clustering methods often require a user-defined similarity threshold.
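A minimal sketch of the AntTree placement loop might look as follows. It is an illustration, not the authors' implementation: the temporary disconnection of the second ant is omitted for brevity, and a toy scalar similarity is assumed.

```python
def sim(x, y):
    """Toy similarity between scalar 'ants' (assumption for illustration)."""
    return 1.0 / (1.0 + abs(x - y))

def ant_tree(data):
    tree = {"value": None, "children": []}         # a0, the support
    for x in data:
        pos = tree
        while True:
            kids = pos["children"]
            if len(kids) < 2:                       # fewer than 2 daughters: connect freely
                kids.append({"value": x, "children": []})
                break
            # T_Dissim(a_pos): lowest similarity among the daughters of a_pos
            t_dissim = min(sim(a["value"], b["value"])
                           for i, a in enumerate(kids) for b in kids[i + 1:])
            plus = max(kids, key=lambda k: sim(x, k["value"]))  # a+, most similar daughter
            if sim(x, plus["value"]) < t_dissim:    # dissimilar enough: new sub-category
                kids.append({"value": x, "children": []})
                break
            pos = plus                              # else move toward a+
    return tree
```

With data [0.0, 10.0, 0.1], the first two ants connect to the support and the third descends under the ant holding 0.0, forming a sub-category, which mirrors the connect-or-move rule above.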
3 Artificial Ants and Self-Organizing Maps
To avoid data sorting and to provide an initial organization of the ants, we propose in this work to hybridize the artificial ant algorithm with a clustering method based on the self-organizing map. This clustering approach, named AntTree-N-W (N: neighborhood, W: Ward distance), has the advantage of presenting a topological organization of the data on a 2D map. In addition, it permits us to cluster a large number of data. The self-organizing map provides an initial organization of the ants on the grid, where each group of ants is represented by an ant "prototype". Up to this point, the hybrid model seems standard, but in our case we define a measure of similarity between ant prototypes that takes into account their proximity on the grid (map). Below we present a global definition of the SOM map.

Self-organizing maps are increasingly used as tools for visualization, as they allow projection over small areas that are generally two-dimensional. The basic model proposed by Kohonen consists of a discrete set C of cells called the map. This map has a discrete topology defined by an undirected graph; usually it is a regular grid in 2 dimensions. We denote by p the number of cells. For each pair of cells (c, r) on the map, the distance δ(c, r) is defined as the length of the shortest chain linking cells r and c on the grid. This distance defines the neighborhood of each cell; in order to control the neighborhood area, we introduce a positive kernel function K (K ≥ 0 and lim_{|x|→∞} K(x) = 0). We define the mutual influence of two cells c and r by K(δ(c, r)). In practice, as for traditional topological maps, we use a smooth function to control the size of the neighborhood, such as K(δ(c, r)) = exp(−δ(c, r)/T). Using this kernel function, T becomes a parameter of the model. As in the Kohonen algorithm, we decrease T from an initial value T_max to a final value T_min.

Let R^n be the Euclidean data space and A = {d_i; i = 1, ..., N} a set of observations, where each datum (ant) d_i = (d_i^1, d_i^2, ..., d_i^n) is a continuous vector in R^n. With each cell c of the grid, we associate a referent vector w_c = (w_c^1, w_c^2, ..., w_c^n) of dimension n (the referent ant). We denote by W the set of referent vectors. The set of parameters W has to be estimated from A iteratively by minimizing a cost function defined as follows:

    J(φ, W) = Σ_{d_i ∈ A} Σ_{r ∈ C} K(δ(φ(d_i), r)) ||d_i − w_r||²,   (1)

where φ assigns each observation d_i to a single cell of the map C. In this expression, ||d_i − w_r||² is the square of the Euclidean distance. The cost function (1) is minimized using an iterative process with two steps.

1. Assignment step, which uses the following assignment function:

    ∀d, φ(d) = arg min_c ||d − w_c||²
2. Optimization step, which updates the referent ant vector w_c as the weighted mean:

    w_c = Σ_{d_i ∈ A} K(δ(c, φ(d_i))) d_i / Σ_{d_i ∈ A} K(δ(c, φ(d_i)))
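The two-step iterative minimization can be sketched as a toy batch SOM. A 1-D chain of cells, scalar data, and a fixed temperature T are our own simplifications; the model described above uses a 2-D grid and a decreasing T.

```python
import math

def batch_som(data, n_cells=5, T=1.0, iters=20):
    """Toy batch SOM on a 1-D chain of cells with scalar data."""
    K = lambda d: math.exp(-d / T)                  # neighborhood kernel
    lo, hi = min(data), max(data)
    w = [lo + (hi - lo) * c / (n_cells - 1) for c in range(n_cells)]  # init referents
    for _ in range(iters):
        # assignment step: phi(d_i) = arg min_c ||d_i - w_c||^2
        phi = [min(range(n_cells), key=lambda c: (x - w[c]) ** 2) for x in data]
        # optimization step: weighted mean over all data, weighted by K(delta)
        for c in range(n_cells):
            num = sum(K(abs(c - phi[i])) * x for i, x in enumerate(data))
            den = sum(K(abs(c - phi[i])) for i in range(len(data)))
            w[c] = num / den
    return w
```

On data with two well-separated groups, the referents spread across the data range while keeping their chain order, which is the topology preservation the map relies on.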
3.1 A New Similarity Measure
At the end of learning, the SOM provides a partition of p groups, denoted P = {P_1, ..., P_c, ..., P_p}. Each subset P_c is associated with a referent ant w_c ∈ R^n. This partition is used as the input of the AntTree-N-W algorithm. In order to take into account the ant topology provided by the self-organizing map, we define a new measure between groups of ants. Connecting two "referent" ants implies merging the associated groups of ants. The most widely considered rule for merging groups in a continuous space is the method proposed by Ward [9], defined as follows:

    Dist_W = (n_c n_r) / (n_c + n_r) ||w_c − w_r||²,   (2)

where n_c and n_r represent the sizes of the groups P_c and P_r, respectively. It is necessary to weight the Ward measure by a value that measures the topological modification after merging two subsets. We propose to quantify this topological modification as follows:

    Σ_{u ∈ C} K(δ((c, r), u)),   where δ((c, r), u) = min{δ(c, u), δ(r, u)}.
This quantity measures the topological modification, but it does not take into account the proximity of the referents on the map. Thus, we propose to subtract a value that measures this proximity, as follows:

    Dist_{N−W} = (Σ_{u ∈ C} K(δ((c, r), u))) (n_c n_r) / (n_c + n_r) ||w_c − w_r||² − K(δ(c, r)) (n_c + n_r) ||w_c − w_r||²   (3)

This measure is composed of two terms. The first term computes the loss of inertia after connecting P_c and P_r. The second term brings together subsets corresponding to two neighboring referents on the map, in order to maintain the topological order between subsets. A high proximity between two neighbors c and r implies a small δ(c, r), and thus a high neighborhood value K(δ(c, r)); hence, the second term reduces the first term depending on the neighborhood function. Obviously, for a null neighborhood our measure computes only the Ward criterion. The proposed criterion thus yields a regularized Ward criterion, the regularization being obtained through the topological order. Finally, the new measure
defines a dissimilarity matrix which takes into account the inertia loss and the topological order. AntTree has the advantage of a low complexity (near n log(n)) [8]. This cost is further reduced, since AntTree is applied to the referent ants provided by the SOM algorithm. The tree structure obtained is thus the best representative of the referent ant set W = {w_1, ..., w_p} (with the new similarity measure (3)); each parent node of the tree is representative of its child nodes. The AntTree-N-W algorithm can be described as follows:

– Input: W = {w_1, ..., w_p}, the set of referent ants positioned on the map using the SOM algorithm
  • Compute the new distance defined in expression (3)
  • Build the tree using the AntTree algorithm
– Output: tree structure of the referents
The tree obtained provides a clustering P = {P_1, ..., P_s}, where s is the number of clusters found (each sub-tree represents one cluster) by AntTree-N-W using the map positions of the referent ants.
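As an illustration, the regularized distance of equation (3) might be computed as follows. The 1-D chain of cells and the kernel K(d) = exp(−d/T) are our own toy assumptions; with a 2-D grid, only the `delta` function changes.

```python
import math

def dist_nw(c, r, w, n, cells, T=1.0):
    """Regularized Ward distance of equation (3) between cells c and r;
    w[c] is the (scalar) referent, n[c] the group size, cells the map."""
    K = lambda d: math.exp(-d / T)
    delta = lambda a, b: abs(a - b)                 # grid distance on a 1-D chain
    sq = (w[c] - w[r]) ** 2
    ward = n[c] * n[r] / (n[c] + n[r]) * sq         # equation (2), the Ward term
    topo = sum(K(min(delta(c, u), delta(r, u))) for u in cells)
    return topo * ward - K(delta(c, r)) * (n[c] + n[r]) * sq
```

As expected from the discussion above, referents that are both close on the map and close in data space get a small distance, so they merge first.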
4 Validation
We have evaluated and compared our algorithms on a set of 19 databases (Table 1). Art1 to Art6 are artificial and have been generated with Gaussian and uniform distributions. The others are real and extracted from the machine learning repository [10,11]. To evaluate the quality of the clustering using map positions, we adopt the approach of comparing the results to a "ground truth". We use the clustering accuracy to measure the clustering results. The index is the classification rate, usually named the purity measure, which can be expressed as the percentage of elements of the assigned class in a cluster. We compare our map clustering process AntTree-N-W with the classic version of AntTree. We also evaluate the impact of using the new similarity measure by comparing AntTree-N-W with AntTree-W, which uses the classic Ward index (expression (2)). Concerning the number of clusters, we observe a clear improvement when we use AntTree-N-W, even if there are cases where the number of clusters found is quite poor.

Table 1. Databases used in the experimentation

Dataset   ClR  d  N
Atom      2    3  800
Ring      2    3  1000
Demi-c.   2    2  600
Engyt.    2    2  4096
Glass     7    9  214
Hepta     7    3  212
Lsun      3    2  400
Pima      2    8  768
Target    6    2  770
Tetra     4    3  400
Twodi.    2    2  800
WingN.    2    2  1016
Art1      4    2  400
Art2      2    2  1000
Art4      2    2  200
Art5      9    2  900
Art6      4    8  400
Table 2. Comparison between hierarchical clustering of the map using the Ward criterion and the new measure. The number following the good classification rate (purity rate) indicates the number of subsets provided by the map clustering.

Dataset (classes)  AntTree-W   AntTree-N-W  AntTree
Atom (2)           85.87 (5)   99.9 (7)     83.5 (8)
Ring (2)           97.8 (6)    81.5 (5)     71.5 (6)
Demi-c. (2)        58.833 (2)  72.67 (4)    74 (3)
Engyt. (2)         74.14 (5)   88.04 (7)    95.1 (10)
Glass (7)          38.32 (5)   59.81 (6)    45.33 (9)
Hepta (7)          43.4 (4)    43.4 (4)     72.6 (6)
Lsun (3)           55 (3)      93 (5)       94 (4)
Pima (2)           67 (5)      72.4 (5)     66.67 (8)
Target (6)         83.25 (5)   94.42 (6)    81.81 (9)
Tetra (4)          62.5 (3)    81.75 (5)    93.5 (3)
Twodi. (2)         100 (4)     100 (5)      99.87 (7)
WingN. (2)         95.67 (3)   87.11 (5)    93.60 (6)
Art1 (4)           50.5 (4)    84.75 (4)    76.5 (8)
Art2 (2)           94.9 (4)    97.7 (4)     97.9 (6)
Art4 (2)           100 (3)     100 (5)      98 (4)
Art5 (9)           31.78 (4)   50.33 (6)    51.55 (8)
Art6 (4)           24.25 (2)   78.75 (4)    93 (4)
Table 2 lists the classification accuracy (purity rate) obtained with the different methods. Using the new measure Dist_{N−W}, we observe that the results are generally better than those of the map clustering using the traditional Ward measure: taking the topological order into account significantly improves the results. With our hybrid model (with the new measure (3)), no index is necessary to obtain the optimal partition.
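The purity (classification-rate) index used above can be sketched as follows. This is a standard majority-class formulation; the exact computation used by the authors is not spelled out in the paper.

```python
from collections import Counter

def group_counts(labels, clusters):
    """Count true-class labels within each cluster."""
    by_cluster = {}
    for lab, cl in zip(labels, clusters):
        by_cluster.setdefault(cl, Counter())[lab] += 1
    return by_cluster

def purity(labels, clusters):
    """labels[i] is the true class of point i; clusters[i] its assigned cluster.
    Each cluster votes for its majority class; purity is the fraction of points
    whose cluster's majority class matches their own label."""
    counts = group_counts(labels, clusters)
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(labels)
```

For example, with true labels [0, 0, 1, 1] and clusters [0, 0, 0, 1], the first cluster's majority class covers 2 of its 3 points, giving a purity of 3/4.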
5 Conclusions and Perspectives
We have presented in this paper a new model for data clustering inspired by the self-assembly behavior observed in real ants. This approach is competitive with the preceding standard method, which demonstrates the interest of hybridizing AntTree with the SOM method. The first results we have obtained are encouraging, both in terms of quality and of processing times. The self-assembly model introduces a new ant-based computational method which is very promising. There are many perspectives to study after this series of results. The first consists in improving the similarity measure, which seems to be an important tool in the map clustering process. As far as clustering is concerned, we would also like to introduce some heuristics in order to evaluate their influence on the results, such as removing clusters with a small number of instances.
References
1. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing Surveys 31(3), 264–323 (1999)
2. Kohonen, T.: Self-Organizing Maps. Springer, Berlin (2001)
3. Goss, S., Deneubourg, J.-L.: Harvesting by a group of robots. In: Varela, F., Bourgine, P. (eds.) Proceedings of the First European Conference on Artificial Life, Paris, France, pp. 195–204. Elsevier Publishing, Amsterdam (1991)
4. Lumer, E.D., Faieta, B.: Diversity and adaptation in populations of clustering ants. In: Cliff, D., Husbands, P., Meyer, J.A., Wilson, S.W. (eds.) Proceedings of the Third International Conference on Simulation of Adaptive Behavior, pp. 501–508. MIT Press, Cambridge (1994)
5. Anderson, C., Theraulaz, G., Deneubourg, J.-L.: Self-assemblages in insect societies. Insectes Sociaux 49, 99–110 (2002)
6. Lioni, A., Sauwens, C., Theraulaz, G., Deneubourg, J.-L.: The dynamics of chain formation in Oecophylla longinoda. Journal of Insect Behavior 14, 679–696 (2001)
7. Theraulaz, G., Bonabeau, E., Sauwens, C., Deneubourg, J.-L., Lioni, A., Libert, F., Passera, L., Solé, R.V.: Model of droplet formation and dynamics in the Argentine ant (Linepithema humile Mayr). Bulletin of Mathematical Biology (2001)
8. Azzag, H., Guinot, C., Venturini, G.: Data and text mining with hierarchical clustering ants. In: Swarm Intelligence in Data Mining, pp. 153–189 (2006)
9. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Advanced Reference Series: Computer Science (1988)
10. Blake, C.L., Merz, C.L.: UCI repository of machine learning databases. Technical report, University of California, Department of Information and Computer Science, Irvine, CA (1998), ftp://ftp.ics.uci.edu/pub/machine-learning-databases
11. Ultsch, A.: Clustering with SOM: U*C. In: Proc. Workshop on Self-Organizing Maps, Paris, France, pp. 75–82 (2005), http://www.uni-marburg.de/fb12/datenbionik/
Author Index
Ampatzis, Christos I-197, I-205
Anderson, Chris II-175
Andras, Peter II-208, II-383
Andrews, Peter C. II-286
Anthony, Tom II-294
Arita, Takaya II-94
Azzag, Hanane II-440
Balaz, Igor II-222
Baldassarre, Gianluca I-410
Banda, Peter II-310
Banzhaf, Wolfgang I-273, II-1
Barandiaran, Xabier E. I-248
Barata, Fábio I-345
Barrett, Enda I-450
Baxter, Paul I-402
Beckmann, Benjamin E. II-134
Bellas, Francisco II-200
Bersini, Hugues II-262
Bertolotti, Luigi I-329
Beurton-Aimar, M. I-361
Biscani, Francesco I-197
Bleys, Joris II-150
Bodi, Michael II-118, II-367
Bonani, Michael I-165, I-173
Bouyioukos, Costas I-321
Bown, Oliver II-254
Browne, Will I-402
Bullock, Seth I-353
Caamaño, Pilar II-200
Cases, Blanca II-408
Castillo, Camiel II-399
Catteeuw, David II-326
Caves, Leo S.D. I-289, I-377
Cederborg, Thomas I-458
Cerutti, Francesco I-329
Christensen, Anders Lyhne I-165
Clark, Edward I-297
Clarke, Tim I-297
Clune, Jeff II-10, II-134
Connelly, Brian D. I-490
Correia, Luís I-482, II-318
Crailsheim, Karl I-132, I-442, II-69, II-118, II-358, II-367, II-375
Cunningham, Alan II-167
Curran, Dara II-142
Cussat-Blanc, Sylvain I-53
D'Anjou, Alicia II-408
Danks, Gemma B. I-289
Darabos, Christian I-281, I-337
De Beule, Joachim I-466
De Loor, Pierre I-189
Di Cunto, Ferdinando I-281
Di Paolo, Ezequiel A. I-91, I-248, I-426
Dittrich, Peter I-305, I-385
Domingos, Tiago II-278
Doncieux, Stéphane II-302
Dorado, Julian I-44
Dorigo, Marco I-165
Duggan, Jim I-450
Duro, Richard J. II-200
Duthen, Yves I-53
Egbert, Matthew D. I-240, I-248
Engelbrecht, Andries II-399
Fachada, Nuno I-345
Faulconbridge, Adam I-377
Fernandes, C.M. II-391
Fernandez-Blanco, Enrique I-44
Flamm, Christoph II-19
Fontana, Alessandro I-10
Förster, Frank II-158
Froese, Tom I-240, I-426
Furukawa, Masashi I-99, I-107, I-181
Giacobini, Mario I-281, I-329, I-337
Gigliotta, Onofrio I-222
Gilmore, Jason M. II-286
Goldberg, Tony L. I-329
Goldsby, Heather J. II-10
Gómez, Nelson II-216
Gordon-Smith, Chris I-265
Görlich, Dennis I-305
Graña, Manuel II-408
Greene, Casey S. I-313, II-286
Grilo, Carlos I-482, II-318
Groß, Roderich I-165, I-173
Hamann, Heiko I-132, I-442, II-69
Haruna, Taichi I-75
Harvey, Inman II-126
Hebbron, Tom I-353
Hickinbotham, Simon I-297
Hill, Douglas P. I-313
Hirai, Hiroshi II-416
Howes, Andrew II-37
Howley, Enda I-450
Isidoro, Carlos I-345
Iwadate, Kenji I-107
Izzo, Dario I-197
Jedruch, Wojciech II-183
Jin, Yaochu I-18, I-27
Joachimczak, Michal I-35
Jones, Ben I-18
Kampis, George II-102
Karsai, Istvan II-102, II-350
Kengyel, Daniela II-69
Kılıç, Hürevren II-191
Kim, DaeEun I-59, I-418, II-432
Kim, Jan T. I-321
Kiralis, Jeff II-286
Knoester, David B. I-474, II-10
Kooijman, S.A.L.M. II-278
Küchler, Lorenz I-173
Laketic, Dragana I-67
Laredo, J.L.J. II-391
Lebbah, Mustapha II-440
Lee, Jeisung I-418
Lee, Jiwon II-432
Lenaerts, Tom I-434
Lima, C.F. II-391
Lipson, Hod I-156
Lizier, Joseph T. I-140
Lorena, António II-278
Lowe, Robert I-410
Luga, Hervé I-53
Magalhães, João II-278
Magnenat, Stéphane I-173
Maldonado, Carlos E. II-216
Manac'h, Kristen I-189
Manderick, Bernard II-326
Manicka, Santosh I-91
Mannella, Francesco I-410
Mariano, Pedro I-482
Marques, Gonçalo M. II-278
Massaras, Vasili I-173
Massera, Gianluca I-124
Matsumaru, Naoki I-385
Mavelli, Fabio I-256
McCormack, Jon II-254
McGregor, Simon II-230
McKinley, Philip K. I-474, I-490, II-10
Merelo, J.J. II-391
Meyer, Thomas I-273
Miglino, Orazio I-222
Mihailovic, Dragutin T. II-222
Miller, Julian F. I-377
Mills, Rob II-27, II-110
Misra, Janardan II-246
Moeslinger, Christoph II-375
Mondada, Francesco I-165, I-173
Moore, Jason H. I-313, II-286
Morán, Federico I-256
Morlino, Giuseppe I-213
Morris, Robert II-77
Moujahid, Abdelmalik II-408
Mouret, Jean-Baptiste II-302
Nakajima, Kohei I-75
Nakamura, Keita I-99
Nehaniv, Chrystopher L. II-158, II-294, II-342
Nellis, Adam I-297
Nevarez, Gabriel II-37
Nitschke, Geoff I-115, II-399
Noble, Jason I-353, II-334
Nolfi, Stefano I-124, I-213
Obst, Oliver II-85
Ofria, Charles II-10, II-134
O'Grady, Rehan I-165
Olasagasti, Francisco Javier II-408
O'Riordan, Colm II-167
O'Sullivan, Barry II-142
Oyo, Kuratomo II-238
Özdemir, Burak II-191
Pacheco, Jorge M. I-434
Palmius, Niclas II-27
Paperin, Greg II-61
Parisey, N. I-361
Parisi, Domenico I-148
Pay, Mungo I-297
Penn, Alexandra II-27
Pennock, Robert T. II-134
Phelps, Steve II-37
Piedrafita, Gabriel I-256
Pinciroli, Carlo I-165
Piraveenan, Mahendra I-140
Polani, Daniel II-85, II-294, II-342
Ponticorvo, Michela I-222
Powers, Simon T. II-27, II-45, II-53
Pradhana, Dany I-140
Prieto, Abraham II-200
Prokopenko, Mikhail I-140, II-85
Provero, Paolo I-281
Rabanal, Pablo II-424
Rabuñal, Juan R. I-44
Rivero, Daniel I-44
Rodríguez, Ismael II-424
Rosa, A.C. II-391
Rosa, Agostinho I-345
Rossier, Joël I-1
Ruciński, Marek I-197
Ruiz-Mirazo, Kepa I-256
Runciman, Andrew II-350
Sadedin, Suzanne II-61
Saglimbeni, Filippo I-148
Santos, Francisco C. I-205, I-434
Saunders, Joe II-158
Schmickl, Thomas I-132, I-442, II-69, II-118, II-358, II-367, II-375
Schramm, Lisa I-27
Sendhoff, Bernhard I-18, I-27
Serantes, Jose A. I-44
Shinohara, Shuji II-238
Shirt-Ediss, Ben I-230
Sienkiewicz, Rafal II-183
Sim, Miyoung I-59
Snowdon, James R. II-45
Sousa, Tânia II-278
Speroni di Fenizio, Pietro I-385, II-175
Stauffer, André I-1
Steels, Luc II-150
Stepney, Susan I-289, I-297, I-369, I-377
Sterbini, Andrea I-213
Stradner, Jürgen I-132
Suzuki, Einoshin II-416
Suzuki, Ikuo I-99, I-107, I-181
Suzuki, Reiji II-94
Suzuki, Yasuhiro I-394
Takahashi, Tatsuji II-238
Takano, Shigeru II-416
Tatnall, Adrian II-334
Thenius, Ronald I-132, II-69, II-118, II-367
Tomassini, Marco I-281, I-337
Tonelli, Paul II-302
Trianni, Vito I-205, II-270
Tschudin, Christian I-273
Tuci, Elio I-124, I-205, II-270
Tufte, Gunnar I-67, I-83
Ullrich, Alexander II-19
Vallée, F. I-361
van der Horst, Johannes II-334
van Dijk, Sander G. II-342
Van Segbroeck, Sven I-434
Virgo, Nathaniel I-240, II-230
Watson, Richard A. II-27, II-45, II-53, II-110
Watson, Tim II-77
Wróbel, Borys I-35
Wu, Shelly X. II-1
Yaeger, Larry S. I-140
Yamamoto, Lidia I-273
Yamamoto, Masahito I-99, I-107, I-181
Yao, Xin I-18
Yoneda, Keisuke I-181
Young, Peter I-297
Zagal, Juan Cristobal I-156
Ziemke, Tom I-410