Processes of Emergence of Systems and Systemic Properties – Towards a General Theory of Emergence
Processes of Emergence of Systems and Systemic Properties – Towards a General Theory of Emergence
Proceedings of the International Conference, Castel Ivano, Italy
18 – 20 October 2007
editors
Gianfranco Minati Italian Systems Society, Italy
Mario Abram Italian Systems Society, Italy
Eliano Pessa University of Pavia, Italy
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
PROCESSES OF EMERGENCE OF SYSTEMS AND SYSTEMIC PROPERTIES Towards a General Theory of Emergence Copyright © 2009 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN-13 978-981-279-346-1 ISBN-10 981-279-346-1
Printed in Singapore.
The proceedings of the fourth national conference of the Italian Systems Society (AIRS) are dedicated to the memory of Evelyne Andreewsky, who passed away in December 2007. Several members of AIRS had the honour of being her colleagues and friends.
Evelyne Andreewsky was born in Paris. She earned an engineering degree in Electronics from E.S.E., Paris, and a "Docteur ès Sciences" degree (PhD) in Computer Science (Neurolinguistic Modelling) from Pierre & Marie Curie University, Paris VI. She was Senior Researcher at the French National Research Institute I.N.S.E.R.M. She switched from a straight computer science career (research engineer, head of public information-processing laboratories, consultant on government policies, UNESCO expert...) to pure research, seeking to develop new multidisciplinary systemic approaches to Cognition and Language (over 150 papers in international scientific journals, books and book chapters, in addition to work as book editor and guest editor of journals).
She was founder and honorary president of the Systems Science European Union (UES). She was actively involved in the boards of scientific societies, namely AFSCET (French Systems Science Society) and MCX (European Program for Modelling Complexity). She belonged to the editorial boards of scientific journals, namely Cybernetics and Human Knowing and Res-Systemica. She organized or co-organized a number of national and international congresses, symposia and summer schools. She was elected (1999) to Honorary Fellowship of the World Organisation of General Systems and Cybernetics (WOSC), founded by Professor John Rose, and was invited to give courses and lectures in various countries. We will never forget her and her dedication to systems science. Thank you, Evelyne.
PREFACE
The title of this fourth national conference of the Italian Systems Society (AIRS), Processes of emergence of systems and systemic properties − Towards a general theory of emergence, was proposed to emphasize the importance of processes of emergence within Systemics. The study of this topic has a long-standing tradition within AIRS. Indeed, this conference can be considered a continuation of the previous 2002 conference, Emergence in Complex Cognitive, Social and Biological Systems, and of the 2004 conference, Systemics of Emergence: Research and Applications. In the preface of the 2004 conference the editors wrote: "Emergence is not intended as a process taking place in the domain of any discipline, but as 'trans-disciplinary modeling' meaningful for any discipline. We are now facing the process by which General System Theory is more and more becoming a Theory of Emergence, seeking suitable models and formalizations of its fundamental bases. Correspondingly, we need to envisage and prepare for the establishment of a Second Systemics − a Systemics of Emergence…". We had intense discussions in the periodic meetings of AIRS, focused on the large and increasing amount of contributions available in the scientific literature about emergence. In this regard we remark that AIRS members were, and still are, involved in research projects in several disciplinary fields, with experience in applying the view of emergence outlined above, for instance, in Architecture, Artificial Intelligence, Biology, Cognitive Science, Computer Science, Economics, Education, Engineering, Medicine, Physics, Psychology, and the Social Sciences. As a consequence of this intense activity we felt an increasing need to better specify the principles to be adopted when dealing with this evolving, interdisciplinary study of emergence.
With this point of view in mind (which could be viewed as a generalization of other instances historically at the basis of the birth of the various systems societies in the world, e.g., Cybernetics, General System Theory, Living Systems Theory, Systems Dynamics, Systems Engineering, Systems Theory, etc.), in October 2006 the Italian Systems Society approved a Manifesto, available at our web site www.AIRS.it. It presents our vision of the current role of systems societies worldwide, as well as of the problems and perspectives of Systemics. In the Manifesto we outlined some fundamental aspects of our identity, such as the necessary role of disciplinary knowledge for Systemics, as
well as of inter- and trans-disciplinary knowledge, the meaning of generalization, the need for rigorousness and the non-ideological valence of reductionism. We quote the concluding statements of the Manifesto: "The purpose of systems societies should be to identify and, where possible, produce contributions to Systemics taking place in disciplinary and multidisciplinary research, making them general and producing proposals for structuring and generalizing disciplinary results. Examples of theoretical aspects of such an effort are those related to the establishment of a General Theory of Emergence, a Theory of Generalization, Logical-Philosophical models related to Systemics and the issue of Variety in different disciplinary contexts." The general theory of emergence we envisage is not a unique, case-independent and scale-independent approach having general disciplinary validity. Instead, we have in mind different, dynamical and interacting levels of description within a constructivist view able to model processes of emergence, not in order to reduce all of them to a single description, but to introduce multi-modeling and modeling hierarchies as a general approach to be used in principle. A related approach has been introduced in the literature with the DYnamic uSAge of Models (DYSAM) and logical openness, i.e. meta-level modelling (models of models). We make reference to a constructivist science, as dealing with the constructive role of the observer in processes of emergence. This role is related to the observer's cognitive model allowing the recognition of acquired systemic properties, which occurs when the hierarchical processes generating these properties cannot be modeled by using traditional causal approaches.
In other words, according to a constructivist view, on the one side the observer looks for what is conceivable by using the assumed cognitive model and, on the other, he/she can introduce methodologies that open up the possibility of producing incongruences, unexpected results and inconsistencies. The latter process calls for a new cognitive model, generating paradigm shifts and new theoretical approaches, as in the case of abduction, introduced by Peirce. All this is endowed with a deep, general, cultural meaning when the focus is on scientific aspects, where it is possible to test, compare, validate and formulate new explicative theories. Moreover, we believe that the subject of emergence is a sort of accumulation point of increasing, mutually related conceptual links to disciplinary open questions, such as the ones mentioned in the topics of the conference.
The study of processes of emergence implies the need to model and distinguish, in different disciplinary contexts, the establishment of structures, systems and systemic properties. Examples of processes of emergence of systems are given by the establishment of entities which the observer, constructivistically, detects as possessing properties different from those of the component parts, as in the case of collective behaviors giving rise to ferromagnetism, superconductivity and superfluidity, and of social systems such as markets and industrial districts. It must be noted that in a constructivist view the whole is not constituted by parts; rather, the observer identifies parts by using a model in the attempt to explain the whole (observer and designer coincide only for artificial systems). Different partitionings correspond to different, mutually equivalent or irreducible, models. Systems do not only possess properties, but are also able, in their turn, to make new ones emergent. Examples of the emergence of systemic properties in systems (i.e., complex systems) are given by the establishment of properties such as cognitive abilities in natural and artificial systems, collective learning abilities in social systems such as flocks, swarms, markets and firms, and functionalities in networks of computers (e.g., in the Internet). Evolutionary processes establish properties in living systems. The models of these processes introduced so far are based on theories of phase transitions, bifurcations, dissipative structures, and Multiple Systems (Collective Beings). On the one hand, the ability to identify these processes allows effectiveness without confusing processes of a different nature that merely have in common the macroscopic and generic establishment of systems. This concerns a number of disciplinary contexts such as Physics, Cognitive Science, Biology, Artificial Intelligence and Economics.
On the other hand, the attempt to build a General Theory of Emergence corresponds to von Bertalanffy's project for a General System Theory. The conference thus focused upon these issues from theoretical, experimental, applicative, epistemological and philosophical points of view. We take this opportunity to mention an important, even if not explicit, outcome of the conference. The scientific committee and we, the editors, had the duty and benefit of this outcome, and now we have the pleasure of sharing it with the readers. As is well known, the scientific and cultural level of scientific journals and edited books is assumed to be assured by careful refereeing by the editorial board and the scientific committee. The task is supposed to be quite "easy"
when dealing with topics having general acceptance in academic and research contexts, robust methodologies, and a consolidated literature. Consistency is assumed to be assured, in short, by the complete state of the art, and consequently grounded on the application of well-described approaches, consistent reasoning, supporting examples and validation procedures, so as to reach coherent conclusions. Traditionally, the systemic community (the one we criticize in the Manifesto) has always tolerated low 'grades' in those areas, as balanced by the need to break well-defined disciplinary barriers and approaches and to encourage focus on new aspects not regulated by classic rules of acceptance. The purpose was to avoid the risk of suffocating ideas able to generate interesting cultural processes despite their imprecise formulation, or even their presenting an interesting inconsistency. This was the age when being inter- and trans-disciplinary was a challenge (actually, it still is in several universities). As emphasized in our Manifesto, disciplinary scientific research has needed to become more and more interdisciplinary, independently of the roles, efforts and recommendations of systems societies. The challenge for the systemic movement is, in our view, to convert this need into a theoretical result stemming from a General Theory of Emergence intended as a Theory of Change. The challenge is not only at the theoretical level, but also at the educational level (e.g., in which university department should such research be carried out?). At the same time, we have available today an enormous amount of knowledge, and we have to face the temptation to explain-all-with-previous-knowledge (as in Science). In this context we may lack approaches suitable for recognizing and establishing new paradigms, inhomogeneous in principle with the old ones. At the same time we lack ways to assure quality levels (e.g., "What if Simplicio had had computers available?").
One consequence of the unavailability of a General Theory of Emergence as a Theory of Change is the unavailability of a robust methodology for evaluating contributions having this mission. The attempt to evaluate each contribution as a disciplinary contribution may imply a lack of appreciation of its innovative, inter- and trans-disciplinary systemic meaning. The problem relates to the production of scientific knowledge and to educational systems having to deal with an enormous amount of available knowledge, often by using old approaches, methodologies and technologies. How can we recognize that a wrong but intelligent idea may be more important than a correct, not-so-intelligent idea expected to be accepted because of its homogeneity with established knowledge?
Is the systems community, by virtue of its historical attention and mission related to inter- and trans-disciplinarity, able to face this challenge in general, i.e., to propose innovative approaches and methodologies able to guarantee, test and validate inter- and trans-disciplinary consistency and robustness? We will try to contribute, on the basis of our experience and research activity, to the introduction of such proposals and methodologies. The Italian Systems Society is trying to play a significant role in this process.

The conference was articulated in different sessions able to capture both the theoretical aspects of emergence introduced above and the applicative ones:
1. Emergence in Architecture.
2. Processes of emergence in Economics and Management.
3. Emergence.
4. Emergence in social systems.
5. Emergence in Artificial Intelligence.
6. Emergence in Medicine.
7. Models and systems.
8. Theoretical problems of Systemics.
9. Cognitive Science.

We conclude by emphasizing that we are aware of how much the scientific community focuses on the available knowledge, a very understandable attitude. At the same time, we also have the dream of inter-related forms of knowledge, one represented and modelled into the other, in which meanings have simultaneous multiple significance, contributing to generate hierarchies allowing us to deal with the meaning of human existence. With this dream in mind we use the bricks of science to help make emergent a new multidimensional knowledge.

Gianfranco Minati, AIRS president
Eliano Pessa, Co-Editor
Mario Abram, Co-Editor
PROGRAM COMMITTEE
G. Minati (chairman) – Italian Systems Society
E. Pessa (co-chairman) – University of Pavia
L. Biggiero – LUISS University, Rome
G. Bruno – University of Rome "La Sapienza"
V. Coda – "Bocconi" University, Milan
S. Della Torre – Polytechnic University of Milan
V. Di Battista – Polytechnic University of Milan
S. Di Gregorio – University of Calabria
I. Licata – Institute for Basic Research, Florida, USA
M.P. Penna – University of Cagliari
R. Serra – University of Modena and Reggio Emilia
G. Tascini – University of Ancona
G. Vitiello – University of Salerno
CONTRIBUTING AUTHORS
Abram M.R., Alberti M., Allievi P., Arecchi F.T., Argentero P., Arlati E., Avolio M.V., Battistelli A., Bednar P.M., Bich L., Biggiero L., Bonfiglio N., Bouchard V., Bruno G., Buttiglieri F., Canziani A., Carletti T., Cirina L., Colacci A., Collen A., D'Ambrosio D., Damiani C., David S., Del Giudice E., Dell'Olivo B., Della Torre S., Di Battista V., Di Caprio U., Di Gregorio S., Ferretti M.S., Filisetti A., Giallocosta G., Giunti M., Graudenzi A., Gregory R.L., Guberman S., Ingrami P., Lella L., Licata I., Lupiano V., Magliocca L.A., Marconi P.L., Massa Finoli G., Minati G., Mocci S., Montesanto A., Mura M., Odoardi C., Paoli F., Penna M.P., Percivalle S., Pessa E., Picci P., Pietrocini E., Pinna B., Poli I., Puliti P., Ramazzotti P., Ricciuti A., Rocchi C., Rollo D., Rongo R., Sechi C., Serra R., Setti I., Sevi E., Sforna M., Spataro W., Stara V., Tascini G., Terenzi G., Trotta A., Villani M., Vitiello G.
CONTENTS
Dedication – v
Preface – vii
Program Committee – xiii
Contributing Authors – xv
Contents – xvii

Opening Lecture
Coherence, Complexity and Creativity (Fortunato Tito Arecchi) – 3

Emergence in Architecture
Environment and Architecture – A Paradigm Shift (Valerio Di Battista) – 37
Emergence of Architectural Phenomena in the Human Habitation of Space (Arne Collen) – 51
Questions of Method on Interoperability in Architecture (Ezio Arlati, Giorgio Giallocosta) – 67
Comprehensive Plans for a Culture-Driven Local Development: Emergence as a Tool for Understanding Social Impacts of Projects on Built Cultural Heritage (Stefano Della Torre, Andrea Canziani) – 79
Systemic and Architecture: Current Theoretical Issues (Giorgio Giallocosta) – 91

Processes of Emergence in Economics and Management
Modeling the 360° Innovating Firm as a Multiple System or Collective Being (Véronique Bouchard) – 103
The COD Model: Simulating Workgroup Performance (Lucio Biggiero, Enrico Sevi) – 113
Importance of the Infradisciplinary Areas in the Systemic Approach Towards New Company Organisational Models: the Building Industry (Giorgio Giallocosta) – 135
Systemic Openness of the Economy and Normative Analysis (Paolo Ramazzotti) – 149
Motivational Antecedents of Individual Innovation (Patrizia Picci, Adalgisa Battistelli) – 163
An E-Usability View of the Web: A Systemic Method for User Interfaces (Vera Stara, Maria Pietronilla Penna, Guido Tascini) – 181

Emergence
Evolutionary Computation and Emergent Modeling of Natural Phenomena (R. Rongo, W. Spataro, D. D'Ambrosio, M.V. Avolio, V. Lupiano, S. Di Gregorio) – 195
A New Model for the Organizational Knowledge Life Cycle (Luigi Lella, Ignazio Licata) – 215
On Generalization: Constructing a General Concept from a Single Example (Shelia Guberman) – 229
General Theory of Emergence Beyond Systemic Generalization (Gianfranco Minati) – 241
Uncertainty, Coherence, Emergence (Giordano Bruno) – 257
Emergence and Gravitational Conjectures (Paolo Allievi, Alberto Trotta) – 265

Emergence in Social Systems
Inducing Systems Thinking in Consumer Societies (Gianfranco Minati, Larry A. Magliocca) – 283
Contextual Analysis. A Multiperspective Inquiry into Emergence of Complex Socio-Cultural Systems (Peter M. Bednar) – 299
Job Satisfaction and Organizational Commitment: Affective Commitment Predictors in a Group of Professionals (Maria Santa Ferretti) – 313
Organizational Climate Assessment: A Systemic Perspective (Piergiorgio Argentero, Ilaria Setti) – 331
Environment and Urban Tourism: An Emergent System in Rhetorical Place Identity Definitions (Marina Mura) – 347

Emergence in Artificial Intelligence
Different Approaches to Semantics in Knowledge Representation (S. David, A. Montesanto, C. Rocchi) – 365
Bidimensional Turing Machines as Galilean Models of Human Computation (Marco Giunti) – 383
A Neural Model of Face Recognition: A Comprehensive Approach (Vera Stara, Anna Montesanto, Paolo Puliti, Guido Tascini, Cristina Sechi) – 407
Anticipatory Cognitive Systems: A Theoretical Model (Graziano Terenzi) – 425
Decision Making Models within Incomplete Information Games (Natale Bonfiglio, Simone Percivalle, Eliano Pessa) – 441

Emergence in Medicine
Burnout and Job Engagement in Emergency and Intensive Care Nurses (Piergiorgio Argentero, Bianca Dell'Olivo) – 455
The "Implicit" Ethics of a Systemic Approach to the Medical Praxis (Alberto Ricciuti) – 473
Post Traumatic Stress Disorder in Emergency Workers: Risk Factors and Treatment (Piergiorgio Argentero, Bianca Dell'Olivo, Ilaria Setti) – 487
State Variability and Psychopathological Attractors. The Behavioural Complexity as Discriminating Factor between the Pathology and Normality Profiles (Pier Luigi Marconi) – 503

Models and Systems
Decomposition of Systems and Complexity (Mario R. Abram) – 533
How Many Stars Are There in Heaven? The Results of a Study of Universe in the Light of Stability Theory (Umberto Di Caprio) – 545
Description of a Complex System through Recursive Functions (Guido Massa Finoli) – 561
Issues on Critical Infrastructures (Mario R. Abram, Marino Sforna) – 571

Theoretical Problems of Systemics
Downward Causation and Relatedness in Emergent Systems: Epistemological Remarks (Leonardo Bich) – 591
Towards a General Theory of Change (Eliano Pessa) – 603
Acquired Emergent Properties (Gianfranco Minati) – 625
The Growth of Populations of Protocells (Roberto Serra, Timoteo Carletti, Irene Poli, Alessandro Filisetti) – 641
Investigating Cell Criticality (R. Serra, M. Villani, C. Damiani, A. Graudenzi, P. Ingrami, A. Colacci) – 649
Relativistic Stability. Part 1 – Relation Between Special Relativity and Stability Theory in the Two-Body Problem (Umberto Di Caprio) – 659
Relativistic Stability. Part 2 – A Study of Black-Holes and of the Schwarzschild Radius (Umberto Di Caprio) – 673
The Formation of Coherent Domains in the Process of Symmetry Breaking Phase Transitions (Emilio Del Giudice, Giuseppe Vitiello) – 685

Cognitive Science
Organizations as Cognitive Systems. Is Knowledge an Emergent Property of Information Networks? (Lucio Biggiero) – 697
Communication, Silence and Miscommunication (Maria Pietronilla Penna, Sandro Mocci, Cristina Sechi) – 713
Music: Creativity and Structure Transitions (Emanuela Pietrocini) – 723
The Emergence of Figural Effects in the Watercolor Illusion (Baingio Pinna, Maria Pietronilla Penna) – 745
Continuities and Discontinuities in Motion Perception (Baingio Pinna, Richard L. Gregory) – 765
Mother and Infant Talk about Mental States: Systemic Emergence of Psychological Lexicon and Theory of Mind Understanding (D. Rollo, F. Buttiglieri) – 777
Conflict in Relationships and Perceived Support in Innovative Work Behavior (Adalgisa Battistelli, Patrizia Picci, Carlo Odoardi) – 787
Role Variables vs. Contextual Variables in the Theory of Didactic Systems (Monica Alberti, Lucia Cirina, Francesco Paoli) – 803
OPENING LECTURE
COHERENCE, COMPLEXITY AND CREATIVITY
FORTUNATO TITO ARECCHI
Università di Firenze and INOA, Largo E. Fermi, 6 - 50125 Firenze, Italy
E-mail: [email protected]

We review the ideas and experiments that established the onset of laser coherence beyond a suitable threshold. That threshold is the first of a chain of bifurcations in a nonlinear dynamics, leading eventually to deterministic chaos in lasers. In particular, the so-called HC behavior has striking analogies with the electrical activity of neurons. Based on these considerations, we develop a dynamical model of neuron synchronization leading to coherent global perceptions. Synchronization implies a transitory control of neuron chaos. Depending on the time duration of this control, a cognitive agent has different amounts of awareness. Combining this with a stream of external inputs, one can point at an optimal use of internal resources, which is called cognitive creativity. What is the relation among the three concepts in the title? While coherence is associated with long-range correlations, complexity arises whenever an array of coupled dynamical systems displays multiple paths of coherence. Creativity corresponds to a free selection of a coherence path within a complex nest. As sketched above, it seems dynamically related to chaos control.

Keywords: heteroclinic chaos, homoclinic chaos, quantum uncertainty, feature binding, conscious perception.
1. Introduction – Summary of the presentation

Up to 1960, in order to have a coherent source of light it was necessary to filter a noisy regular lamp. Instead, the laser realizes the dream of shining a vacuum state of the electromagnetic field with a classical antenna, thus inducing a coherent state, which is a translated version of the vacuum state, with a minimum quantum uncertainty. In fact, the laser reaches its coherent state through a threshold transition, starting from a regular incoherent source. Accurate photon statistics measurements proved the coherence quality of the laser as well as the threshold transition phenomena, both in stationary and in transient situations. The threshold is the first of a chain of dynamical bifurcations; in the 1980s the successive bifurcations leading to deterministic chaos were explored. Furthermore, the coexistence of many laser modes in a cavity with high Fresnel
number gives rise to a complex situation, where the modes behave in a nested way, due to their mutual couplings, displaying a pattern of giant intensity peaks whose statistics is by no means Gaussian, as it is in speckles. Among the chaotic scenarios, the so-called HC (heteroclinic chaos), consisting of trains of equal spikes with erratic inter-spike separation, was explored in CO2 and in diode lasers with feedback. It looks like the best implementation of a time code. Indeed, networks of coupled HC systems may reach a state of collective synchronization lasting for a finite time in the presence of a suitable external input. This opens powerful analogies with the feature binding phenomenon characterizing neuron organization in a perceptual task. The dynamics of a single neuron is suitably modeled by a HC laser; hence, the collective dynamics of a network of coupled neurons can be realized in terms of arrays of coupled HC lasers [5,23]. Thus, synchronization of an array of coupled chaotic lasers is a promising tool for a physics of cognition. Exploring a complex situation would require a very large amount of time, in order to classify all possible coherences, i.e. long-range correlations. In cognitive tasks facing a complex scenario, our strategy consists in converging to a decision within a finite short time. Any conscious perception (we define as conscious that which elicits a decision) requires 200 ms, whereas the loss of information in the chaotic spike train of a single neuron takes a few ms. The interaction of a bottom-up signal (external stimulus) with a top-down modification of the control parameters (induced by the semantic memory) leads to a collective synchronization lasting 200 ms: this is the indicator of a conscious perception. The operation is a control of chaos, and it has an optimality: if it lasts less than 200 ms, no decision emerges; if it lasts much longer, there is no room for sequential cognitive tasks.
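The transient synchronization of coupled chaotic units described above can be illustrated with a toy numerical sketch. The following uses two diffusively coupled logistic maps as a stand-in for coupled HC lasers/neurons; the map, the coupling scheme and all parameter values are illustrative assumptions, not taken from the text:

```python
import numpy as np

def coupled_logistic(eps, r=3.9, n_steps=2000, seed=0):
    """Iterate two diffusively coupled chaotic logistic maps and
    return the mean post-transient separation |x - y| (a toy
    stand-in for synchronization of coupled chaotic lasers)."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(2)            # independent initial conditions
    diffs = []
    for i in range(n_steps):
        fx, fy = r * x * (1 - x), r * y * (1 - y)
        # diffusive coupling: each unit is pulled toward the other
        x = (1 - eps) * fx + eps * fy
        y = (1 - eps) * fy + eps * fx
        if i > n_steps // 2:        # discard the transient
            diffs.append(abs(x - y))
    return np.mean(diffs)

# weak coupling: the two chaotic units stay uncorrelated;
# strong coupling: they synchronize (|x - y| -> 0)
print(coupled_logistic(0.05), coupled_logistic(0.45))
```

Below a critical coupling strength the two trajectories remain chaotic and uncorrelated; above it, the difference contracts geometrically and the units lock together, the discrete-map analogue of the synchronization threshold discussed for arrays of HC lasers.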
We call creativity this optimal control of neuronal chaos. It amounts to selecting one among the large number of possible coherences all present in a complex situation. The selected coherence is the meaning of the object under study.

2. Coherence

2.1. Classical notion of coherence

Before the laser, in order to have a coherent source of light it was necessary to filter a noisy regular lamp. Fig. 1 illustrates the classical notion of coherence, with reference to the Young interferometer. A light source with aperture ∆x illuminates a screen with two holes A and B (which we can move to positions A′ and B′). We take the source as made of the superposition of
Figure 1. Young interferometer: a light source of aperture ∆x illuminates a screen with two holes in it. Under suitable conditions, the phase interference between the fields leaking through the two holes gives rise to interference fringes as the point-like detector is moved on a plane transverse to the propagation direction.
independent plane waves, without mutual phase relations. Each single plane wave is called a mode, since it is a solution of the wave equation within the cavity containing the source. Each mode, passing through ∆x, is diffraction-broadened into a cone of aperture θ = λ/∆x. At the left of the screen, the light from A and B is collected on a detector, whose electrical current is proportional to the impinging light power, that is, to the square modulus of the field. The field is the sum of the two fields E1 and E2 from the two holes. The square modulus must be averaged over the observation time, usually much longer than the optical period; we call ⟨|E1 + E2|²⟩ this average. The result is the sum of the two separate intensities I1 = ⟨|E1|²⟩ and I2 = ⟨|E2|²⟩, plus the crossed phase terms ⟨E1*E2 + E2*E1⟩. These last ones increase or reduce I1 + I2, depending on the relative phase; hence interference fringes are observed as we move the detector on a plane transverse to the light propagation, thus changing the path lengths of the two fields. Fringe production implies that the phase difference be maintained during the time of the average ⟨…⟩; this occurs only if the two fields leaking through the two holes belong to the same mode, that is, if the observation angle, given by the distance AB divided by the separation r between screen and detector, is smaller than the diffraction angle θ = λ/∆x. If instead it is larger, as occurs, e.g., when the holes are in positions A′, B′, then the detector receives contributions from distinct modes, whose phases fluctuate over a time much shorter than the averaging time. Hence, the phased
terms are washed out and no fringes appear. We call coherence area the area SAB on the screen which contains pairs of points A, B such that the collection angle is at most equal to the diffraction angle. SAB is given by
SAB = λ²·r²/(∆x)²    (1)
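As a numerical illustration of Eq. (1), the coherence area can be evaluated for illustrative values not taken from the text: a 1 mm source observed at a distance of 1 m in 500 nm light.

```python
# Coherence area of a quasi-monochromatic source, Eq. (1).
# Illustrative numbers (not from the text): a 1 mm source
# observed at r = 1 m with 500 nm light.
wavelength = 500e-9    # lambda, in metres
source_size = 1e-3     # the aperture "delta x", in metres
distance = 1.0         # screen-to-detector separation r, in metres

theta = wavelength / source_size                        # diffraction angle, lambda/dx
S_AB = (wavelength**2 * distance**2) / source_size**2   # coherence area, Eq. (1)

print(f"theta = {theta:.1e} rad, S_AB = {S_AB:.2e} m^2")
```

With these numbers the diffraction angle is 5·10⁻⁴ rad and the coherence area is 2.5·10⁻⁷ m², i.e. a fraction of a square millimetre; the two holes must both sit inside this area for fringes to appear.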
The averaged product of two fields at positions 1 = A and 2 = B is called the first-order correlation function and is denoted as

G(1)(1,2) = ⟨E1* E2⟩    (2)
In particular, for 1 = 2, G(1)(1,1) = ⟨E1* E1⟩ is the local intensity at point 1. Points 1 and 2 correspond to holes A and B of the Young interferometer; their separation is space-like if the detector is at approximately the same distance from the holes. Of course, fringes imply path differences comparable with the wavelength, but anyway much shorter than the coherence time

τ = 1/∆ω    (3)
of a narrowband (quasi-monochromatic) light source; indeed, if the line breadth is much smaller than the optical frequency, ∆ω ≪ ω, then the coherence time is much longer than the optical period, τ ≫ T. In the case of the Michelson interferometer, 1 and 2 are the two mirror positions, which are separated time-like. Fringe disappearance in this case means that the time separation between the two mirrors has become larger than the coherence time.

2.2. Quantum notion of coherence

The laser realizes the dream of shining the current of a classical antenna into the vacuum state of an electromagnetic field mode, thus inducing a coherent state as a translated version of the vacuum state, with a minimum quantum uncertainty (Fig. 2). We know from Maxwell's equations that a single field mode obeys a harmonic oscillator (HO) dynamics. The quantum HO has discrete energy states equally separated by ℏω, starting from a ground (or vacuum) state with energy ℏω/2. Each energy state is denoted by the number (0, 1, 2, …, n, …) of energy quanta ℏω above the ground state. In a coordinate q representation, any state with a fixed n is delocalised, that is, its wavefunction is spread inside the region
Coherence, Complexity and Creativity
7
Figure 2. Quantum harmonic oscillator in energy-coordinate diagram. Discrete levels correspond to photon number states. A coherent state is a translated version of the ground state; its photon number is not sharply defined but is spread with a Poisson distribution.
confined by the parabolic potential (see e.g. the dashed wave for n = 5). Calling p = mv the HO momentum, the n state has an uncertainty in the joint coordinate-momentum measurement increasing as

∆q ∆p = (n + 1/2) ℏ .   (4)
The vacuum state, with n = 0, has the minimum uncertainty
∆q ∆p = ℏ/2 .   (4')
If now we consider a version of the vacuum state translated by α (where α is proportional to q), this is a quantum state still with minimum uncertainty, but with an average photon number equal to the square modulus |α|² (in the example reported in the figure we chose |α|² = 5). It is called a coherent state. It oscillates at the optical frequency in the q interval allowed for by the confining potential. It maintains the instant localization, at variance with a number state. The coherent state pays for this coordinate localization by a Poisson spread of the photon number around its average |α|². The quantum field vacuum state shifted by a classical current had been introduced in 1938 by Bloch and Nordsieck; in 1963 R. Glauber showed that these states have maximal coherence, and that a laser emits such a type of light, since the collective light emission in the laser process can be assimilated to the radiation of a classical current.
8
F.T. Arecchi
While the fringe production is just a test of the modal composition of the light field, the Hanbury Brown and Twiss interferometer (HBT) implies the statistical spread of the field amplitude. HBT was introduced in 1956 as a tool for stellar observation (Fig. 3) in place of the Michelson (M) stellar interferometer. M is based on summing on a detector the fields from two distant mirrors, in order to resolve the angular breadth of a star (that is, its diameter, or the different directions of the two components of a binary star). The more distant the mirrors, the higher the resolution. However, the light beams deflected by the two mirrors undergo strong dephasing in the horizontal propagation and this destroys the fringes. In HBT, the two mirrors are replaced by two detectors, whose output currents feed a correlator; now the horizontal path is within a cable, hence not affected by further dephasing. The working principle is intensity correlation (rather than field correlation), which for Gaussian statistics (as expected from thermal sources such as stars) yields the product of the two intensities plus the square modulus of the field correlation as provided by a standard interferometer, that is,

G^(2)(1,2) = ⟨|E1|² |E2|²⟩ = I1 I2 + |G^(1)|²   (5)
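Eq. (5) at zero delay can be checked with a toy simulation (not from the text): a thermal field is modeled as a complex Gaussian random variable and a coherent field as a fixed-amplitude phasor; the normalized intensity correlation then comes out ≈ 2 and 1 respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# Thermal (Gaussian) field: both quadratures are Gaussian random variables.
E_th = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
# Coherent field: fixed unit amplitude; the overall phase does not affect intensity.
E_coh = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=N))

for name, E in (("thermal", E_th), ("coherent", E_coh)):
    I = np.abs(E) ** 2
    g2 = np.mean(I * I) / np.mean(I) ** 2   # normalized zero-delay intensity correlation
    print(f"{name}: g2(0) = {g2:.3f}")
```

The thermal case shows the HBT excess of eq. (5) (g2(0) ≈ 2), while the coherent field gives g2(0) = 1, i.e. no excess, anticipating the factorization result quoted next.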
Instead, Glauber had proved that for a coherent state, all the higher order correlation functions factor as products of the lowest one, that is,
G^(n)(1,2, …, n) = G^(1)(1) G^(1)(2) ⋯ G^(1)(n)   (6)
in particular, for n = 2 , we have
G^(2)(1,2) = G^(1)(1) G^(1)(2) .   (6')
G^(1) is just the intensity; thus a coherent state should yield only the first term of HBT, without the correlation between the two distant detectors. In 1966 the comparison between a laser and a Gaussian field was measured time-wise rather than space-wise, as shown in Fig. 4. The laser light displays no HBT excess, as one expects from a coherent state. The extra term in the Gaussian case doubles the zero-time value. As we increase the time separation between the two "instantaneous" intensity measurements (by instantaneous we mean that the integration time is much shorter than the characteristic coherence time of the Gaussian fields), the extra HBT term decays and eventually disappears. We have scaled the time axis so that the HBT curves for different coherence times coincide.
Figure 3. Left: the Michelson stellar interferometer M; it consists of two mirrors which collect different angular views of a stellar object and reflect the light to a single photon detector through long horizontal paths (10 to 100 meters) where the light phase is affected by the ground variations of the refractive index (wavy trajectories). Right: the Hanbury Brown and Twiss (HBT) interferometer; mirrors are replaced by detectors and the current signals travel in cables toward an electronic correlator, which performs the product of the two instant field intensities E1* E1, E2* E2 and averages it over a long time [32].
Coherence times are assigned through the velocities of a random scatterer, as explained in the next sub-section. Notice that Fig. 4 reports coherence times of the order of 1 ms. In the electronic HBT correlator, this means storing two short-time intensity measurements (each lasting for example 50 ns) and then comparing them electronically. If we tried to measure such a coherence time by a Michelson interferometer, we would need a mirror separation of the order of 300 km!

2.3. Photon statistics (PS)

As a matter of fact, the laser reaches its coherent state through a threshold transition, starting from a regular incoherent source. Accurate photon statistics measurements proved the coherence quality of the laser as well as the threshold transition phenomena, both in stationary and transient situations. We have seen in Fig. 2 that a coherent state yields a Poisson spread in the photon number, that is, a photon number statistics as
p(n) = (⟨n⟩^n / n!) e^(−⟨n⟩)   (7)
Figure 4. Laboratory measurement of HBT for Gaussian light sources with different coherence times; for each case, the first order correlations between the signals sampled at different times decay with the respective coherence time, and asymptotically only the product of the average intensities (scaled to 1) remains. The laser light displays no HBT, as one expects from a coherent state. [14]
where ⟨n⟩ = |α|² is the average photon number. This provides high order moments, whereas HBT for equal space-time positions 1 = 2 would yield just the first and the second moment. Thus PS is statistically far more accurate; however, it is confined to within a coherence area and a coherence time. If now we couple the coherent state to an environment, we have a spread of coherent states given by a distribution P(α). The corresponding PS is a weighted sum of Poisson distributions with different average values ⟨n⟩ = |α|². In Fig. 5 we report the statistical distributions of photocounts versus the count number. If the detector has high efficiency, they well approximate the photon statistics of the observed light source. A few words on how to build a Gaussian light source. A natural way would be to take a black-body source, since at thermal equilibrium P(α) is Gaussian. However, its average photon number would be given by Planck's formula as
⟨n⟩ = 1 / (exp(ℏω/kT) − 1) .   (8)
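The contrast between the coherent-state statistics of eq. (7) and single-mode thermal statistics can be sketched numerically; the Bose-Einstein form used below for the thermal case is standard physics but is not spelled out in the text, and the mean value is an illustrative assumption.

```python
import math

nbar = 5.0  # average photon number; illustrative (|alpha|^2 = 5, as in Fig. 2)

def p_poisson(n, nbar):
    # Coherent-state photon statistics, eq. (7)
    return nbar ** n / math.factorial(n) * math.exp(-nbar)

def p_thermal(n, nbar):
    # Single-mode thermal (Bose-Einstein) statistics with the same mean,
    # written in an overflow-safe form
    return (nbar / (1.0 + nbar)) ** n / (1.0 + nbar)

for n in range(11):
    print(n, round(p_poisson(n, nbar), 4), round(p_thermal(n, nbar), 4))
```

The coherent distribution peaks near ⟨n⟩ with variance ⟨n⟩, while the thermal one decreases monotonically with variance ⟨n⟩(1 + ⟨n⟩); this larger spread is what photocount measurements of the kind reported in Fig. 5 detect.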
For visible light ℏω ≈ 2 eV and current blackbody temperatures (remember that 10⁴ K ≈ 1 eV) we would have ⟨n⟩ << 1. […] If the gain linewidth ∆ω is larger than the longitudinal mode spacing c/2L, then many longitudinal modes can be simultaneously above threshold. In such a case the nonlinear mode-mode coupling, due to the medium interaction, gives an overall high dimensional dynamical system which may undergo chaos. This explains the random spiking behavior of long lasers. The regular spiking in time associated with mode
Figure 17. Photorefractive oscillator, with the photorefractive effect enhanced by an LCLV (Liquid Crystal Light Valve). Experimental setup; A is an aperture fixing the Fresnel number of the cavity, z = 0 corresponds to the plane of the LCLV; z1, z2, z3 are the three different observation planes. Below: 2-dimensional complex field, with lines of zero real part (solid) and lines of zero imaginary part (dashed). At the intersection points the field amplitude is zero and its phase is not defined, so that the circulation of the phase gradient around these points is non-zero (either ±2π), yielding phase singularities. [15,4,10]
locking is an example of mutual phase synchronization, akin to the regular spiking reported in Fig. 16.

3.2.2. b) transverse case

Inserting a photorefractive crystal in a cavity, the crystal is provided high optical gain by a pump laser beam. As the gain overcomes the cavity losses, we have a coherent light oscillator. Due to the narrow linewidth of the crystal, a single longitudinal mode is excited; however, by an optical adjustment we can have large Fresnel numbers, and hence many transverse modes. We carried out a research line starting from 1990 [15,16; for a review see 4]. Recently we returned to this oscillator, but with a giant photorefractive effect provided by the coupling of a photorefractive slice to a liquid crystal [10,6,12] (Fig. 17). The inset in this figure shows how phase singularities appear in a 2D wave field. A phase gradient circulation of ±2π is called a topological charge of ±1, respectively. A photodetector responds to the modulus square of the field amplitude. To obtain phase information, we superpose a plane wave on the 2D pattern, obtaining the results illustrated in Fig. 18. For a high Fresnel number we have a number of
Figure 18. Left: a phase singularity is visualized by superposing an auxiliary coaxial plane wave on the optical pattern of the photorefractive oscillator; reconstruction of the instantaneous phase surface: perspective and equi-phase plots. Right: if the auxiliary beam is tilted, we obtain interference fringes, interrupted at each phase singularity (± correspond to ±2π circulation, respectively). The digitized fringe plots correspond to: upper plot (Fresnel number about 3): 6 defects of equal topological charge against 1 of opposite charge; lower plot (Fresnel number close to 10): almost 100 singularities with balanced opposite charges, besides a small residual unbalance [16].
singularities scaling as the square of the Fresnel number [9]. Referring to the inset of Fig. 17, when both intersections of the two zero lines are within the boundary, we expect a balance of opposite topological charges. However, for small Fresnel numbers, it is likely that only one intersection is confined within the boundary; this corresponds to an unbalance, as shown in Fig. 18, upper right. The scaling with the Fresnel number is purely geometric and does not imply dynamics. The statistics of zero-field occurrences can be predicted on purely geometric considerations, as done for random speckles. If instead we look at the high intensity peaks in between the zeros, the high fields in a nonlinear medium give a strong mode-mode coupling which goes beyond speckles. This should show up in the statistical occurrence of very large peaks. In order to study that, we collect space-time frames as shown in Fig. 19, with the help of the CCD + grabber set-up shown in Fig. 17. We don't yet have a definite 2D comparison with speckles. However, a 1D experiment in an optical fiber has
Figure 19. Photorefractive oscillator: Spatiotemporal profile extracted from the z2 movie. [10]
produced giant optical spikes with non-Gaussian statistics [43]. The authors draw an analogy with the so-called "rogue" waves in the ocean, which represent a frequent hazard for ships, since satellite inspection has shown that they are more frequent than expected on a purely linear basis. We consider the anomalous statistics of giant spikes as a case of complexity, because the mutual coupling in a nonlinear medium makes the number of possible configurations increase exponentially with the Fresnel number, rather than polynomially. The rest of the paper explores this question: how does it happen that a cognitive agent in a complex situation decides for a specific case before having scanned all possible cases, that is, how do we "cheat" complexity?
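For reference, under linear (Gaussian) statistics the speckle intensity I = |E|² is exponentially distributed, so giant spikes should be exceedingly rare; this back-of-envelope sketch (not from the text) quantifies the baseline against which "rogue" events appear as an excess.

```python
import math

# Gaussian speckle: intensity is exponentially distributed, P(I > k<I>) = exp(-k).
for k in (2, 5, 10, 20):
    print(f"P(I > {k} <I>) = {math.exp(-k):.2e}")
```

A spike 20 times the mean intensity is expected only about twice per 10⁹ samples on this linear basis; heavy-tailed "rogue" statistics show up as counts well above such predictions.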
4. Physics of cognition – Creativity

4.1. Perception and control of chaos

Synchronization of a chain of chaotic lasers provides a promising model for a physics of cognition. Exploration of a complex situation would require a very large amount of time. In cognitive tasks facing a complex scenario, our strategy consists in converging to a decision within a finite short time. Various experiments [36,38] prove that a decision is taken after 200 ms of exposure to a sensory stimulus. Thus, any conscious perception (we define conscious as that
Figure 20. Feature binding: the lady and the cat are respectively represented by the mosaic of empty and filled circles, each one representing the receptive field of a neuron group in the visual cortex. Within each circle the processing refers to a specific detail (e.g. contour orientation). The relations between details are coded by the temporal correlation among neurons, as shown by the same sequences of electrical pulses for two filled circles or two empty circles. Neurons referring to the same individual (e.g. the cat) have synchronous discharges, whereas their spikes are uncorrelated with those referring to another individual (the lady) [42].
eliciting a decision) requires about 200 ms, whereas the loss of information in a chaotic train of neural spikes takes a few ms. Let us consider the visual system; the role of elementary feature detectors has been extensively studied [34]. By now we know that some neurons are specialized in detecting exclusively vertical or horizontal bars, or a specific luminance contrast, etc. However, a problem arises: how do elementary detectors contribute to a holistic (Gestalt) perception? A hint is provided by [42]. Suppose we are exposed to a visual field containing two separate objects. Both objects are made of the same visual elements: horizontal and vertical contour bars, different degrees of luminance, etc. What, then, are the neural correlates of the identification of the two objects? We have one million fibers connecting the retina to the visual cortex. Each fiber results from the merging of approximately 100 retinal detectors (rods and cones), and as a result it has its own receptive field. Each receptive field isolates a specific detail of an object (e.g. a vertical bar). We thus split an image into a mosaic of adjacent receptive fields. Now the "feature binding" hypothesis consists of assuming that all the cortical neurons whose receptive fields are pointing to a specific object synchronize the corresponding spikes, and as a consequence the visual cortex
Figure 21. ART = Adaptive Resonance Theory. Role of bottom-up stimuli from the early visual stages and top-down signals due to expectations formulated by the semantic memory. The focal attention assures the matching (resonance) between the two streams [27].
organizes into separate neuron groups oscillating on two distinct spike trains for the two objects. Direct experimental evidence of this synchronization is obtained by inserting microelectrodes in the cortical tissue of animals, each sensing a single neuron (Fig. 20) [42]. An array of weakly coupled HC (homoclinic chaos) systems represents the simplest model for a physical realization of feature binding. The array can achieve a collective synchronized state lasting for a finite time (corresponding to the physiological 200 ms!) if there is a sparse (non-global) coupling, if the input (bottom-up) is applied to just a few neurons, and if the inter-neuron coupling is suitably adjusted (top-down control of chaos) [5,23]. Fig. 21 shows the scheme of ART [27]. The interaction of a bottom-up signal (external stimulus) with a top-down change of the control parameters (induced by the semantic memory) leads to a collective synchronization lasting 200 ms: this is the indicator of a conscious perception. The operation is a control of chaos, and it has an optimality: if it lasts less than 200 ms, no decision emerges; on the contrary, if it lasts much longer, there is no room for sequential cognitive tasks (Fig. 22). The addition of extra degrees of freedom implies a change of code, thus it can be seen as a new level of description of the same physical system.
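The collective synchronization invoked above can be illustrated with a deliberately minimal stand-in (not the HC neuron model of the text): two diffusively coupled chaotic logistic maps lock onto a common trajectory once the coupling strength is large enough; the parameter values are illustrative assumptions.

```python
import random

def step(x, y, eps, r=3.9):
    """One step of two diffusively coupled chaotic logistic maps."""
    fx, fy = r * x * (1 - x), r * y * (1 - y)
    return fx + eps * (fy - fx), fy + eps * (fx - fy)

for eps in (0.0, 0.3):
    random.seed(1)
    x, y = random.random(), random.random()
    for _ in range(2000):
        x, y = step(x, y, eps)
    print(f"eps={eps}: |x - y| = {abs(x - y):.2e}")
```

With eps = 0 the two chaotic units wander independently; with eps = 0.3 the difference |x − y| contracts to zero, a toy analogue of the spike synchronization invoked for feature binding.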
Figure 22. Chaos is controlled by adding extra dynamical variables, which change the transverse instability without affecting the longitudinal trajectory. In the perceptual case, the most suitable top-down signals are those which provide a synchronized neuron array with an information lifetime sufficient to activate successive decisional areas (e.g. 200 ms), whereas the single HC neuron has a chaotic lifetime of 2 ms. If our attentional-emotional system is excessively cautious, it provides a top-down correction which may stabilize the transverse instability for ever, but then the perceptual area is blocked to further perceptions.
4.2. From perception to cognition - Creativity

We distinguish two types of cognitive task. In Type I, we work within a prefixed framework and readjust the hypotheses at each new cognitive session, by a Bayes strategy. Bayes' theorem [21] consists of the relation:
P(h | data) = P(data | h) P(h) / P(data)   (9)
That is: the probability P(h | data) of a hypothesis h conditioned by the observed data (this is the meaning of the bar |), called the a-posteriori probability of h, is the product of the probability P(data | h) that the data are generated by the hypothesis h, times the a-priori probability P(h) of that hypothesis (we assume to have a package of convenient hypotheses with different probabilities), divided by the probability P(data) of the effectively occurred data. As shown in Fig. 23, starting from an initial observation and formulating a large number of different hypotheses, the one supported by the experiment suggests the most appropriate dynamical explanation. Going a step forward and repeating the Bayes procedure amounts to climbing a probability mountain along a steepest gradient line.
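The repeated application of eq. (9) can be sketched numerically; the hypotheses (three candidate coin biases) and the data below are invented for illustration.

```python
# Repeated Bayes updating, eq. (9): each datum reweighs the probabilities P(h)
# over a finite set of hypotheses h. All names and numbers are hypothetical.
hypotheses = {"fair": 0.5, "biased": 0.8, "very_biased": 0.95}
prior = {h: 1 / 3 for h in hypotheses}          # initial a-priori probabilities

data = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]           # 1 = heads, invented observations

for d in data:
    # P(data | h) for a single toss under each hypothesis
    likelihood = {h: (p if d == 1 else 1 - p) for h, p in hypotheses.items()}
    # unnormalized a-posteriori: P(data | h) * P(h)
    post = {h: likelihood[h] * prior[h] for h in hypotheses}
    norm = sum(post.values())                   # P(data)
    prior = {h: v / norm for h, v in post.items()}  # posterior becomes next prior

print({h: round(p, 3) for h, p in prior.items()})
```

After the data, the probability mass has climbed toward the best-supported hypothesis, which is the "ascent of a single probability mountain" referred to in the text.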
Figure 23. Successive applications of the Bayes theorem to the experiments. The procedure is an ascent of the Probability Mountain through a steepest gradient line. Each point of the line carries information related to the local probability by the Shannon formula. Notice that Darwinian evolution by mutation and successive selection of the best fit mutant is a sequential implementation of Bayes theorem. [19,18]
On the other hand, a complex problem is characterized by a probability landscape with many peaks (Fig. 24). Jumping from one probability hill to another is not Bayesian; I call it Type II cognition. A deterministic computer cannot do it. In human cognition, Type II is driven by hints suggested by the context (semiosis) yet not included in the model. A Type II task is a creativity act because it goes beyond the given model: it implies a change of code, at variance with Type I, which operates within a fixed code. The ascent to a single peak can be automatized in a steepest gradient program; once the peak has been reached, the program stops, since any further step would be a downfall. A non-deterministic computer cannot perform the jumps of Type II either, since it intrinsically lacks semiotic abilities. In order to do that, the computer must be assisted by a human operator. We call "meaning" the multi-peak landscape and "semantic complexity" the number of peaks. However, this is a fuzzy concept, which varies as our comprehension evolves (Fig. 25). Let us discuss in detail the difference between the Type I cognitive task, which implies changing hypothesis h within a model, that is, climbing a single mountain, and the Type II cognitive task, which implies changing model, that is, jumping over to another mountain.
Figure 24. Semantic complexity - A complex system is one with a many-peak probability landscape. The ascent to a single peak can be automatized by a steepest gradient program. On the contrary, to record the other peaks, and thus continue the Bayes strategy elsewhere, is a creativity act, implying a holistic comprehension of the surrounding world (semiosis). We call "meaning" the multi-peak landscape and "semantic complexity" the number of peaks. It has been guessed that semiosis is the property that discriminates living beings from Turing machines [39]; here we show that a non-algorithmic procedure, that is, a non-Bayesian jump from one model to another, is what we have called creativity. Is semiosis equivalent to creativity? [19,18].
We formalize a model as a set of dynamical variables xi (i = 1, 2, …, N), N being the number of degrees of freedom, with the equations of motion

ẋi = Fi(x1, …, xN; µ1, …, µM)   (10)
where the Fi are the force laws and the M numbers µ represent the control parameters. The set {F, x, µ} is the model. Changing hypotheses within a model means varying the control parameters, as we do when exploring the transition from regular to chaotic motion in some model dynamics. Instead, changing code, or model, means selecting a different set {G, y, ν} of equations of motion, degrees of freedom and control parameters, as follows:
ẏi = Gi(y1, …, yR; ν1, …, νL)   (11)
where R and L are respectively different from N and M. The set {G, y, ν} is the new model. While changing hypotheses within a model is an a-semiotic procedure that can be automatized in a computerized expert system, changing model implies catching the meaning of the observed world, and this requires what has been
Figure 25. C-K diagram (C = computational complexity; K = information loss rate in chaotic motion): comparison between the procedure of a computer and a semiotic cognitive agent (say: a scientist). The computer operates within a single code and C increases with K. A scientist explores how, by adding different degrees of freedom, one can reduce the high K of the single-code description. This is equivalent to the control operation of Fig. 22; it corresponds to a new model with reduced C and K. An example is offered by the transition from a molecular dynamics to a thermodynamic description of a many body system. Other examples are listed in Table 1. The BACON program [41] could retrieve automatically Kepler's laws from astronomical data just because the solar system, approximated by Newtonian two-body interactions, is chaos-free.
called embodied cognition [46]. Embodied cognition has been developed over thousands of generations of evolutionary adaptation, and we are so far unable to formalize it as an algorithm. This no-go statement seems to be violated by a class of complex systems which has been dealt with successfully by recursive algorithms. Let us consider a space lattice of spins, with couplings that can be ferro- or anti-ferromagnetic in a disordered but frozen way (a spin glass at zero temperature, with quenched disorder). It will be impossible to find a unique ground state. For instance, take three spins A, B and C on a triangular lattice: if all couplings are ferromagnetic, then the ground state will consist of parallel spins; but if instead one (and only one) of the mutual couplings is anti-ferromagnetic, then there will be no spin orientation compatible with all the couplings (try with A-up, B-up, C-up; it does not work; then try to reverse a single spin, but it does not work either). This model has a cognitive flavor, since a brain region can be modeled as a lattice of coupled neurons with couplings either excitatory or inhibitory, thus resembling a spin glass [33,1,45]. We have a large number of possible ground
Table 1. From complication to complexity: four cases of creativity.

1 - electricity, magnetism, optics → Electromagnetic equations (Maxwell)
2 - Mendeleev table → Quantum atom (Bohr, Pauli)
3 - zoo of 200 elementary particles → Quarks (M. Gell-Mann)
4 - scaling laws in phase transitions → Renormalization group (K. Wilson)
states, all including some frustration. Trying to classify all possible configurations is a task whose computational difficulty (either program length or execution time) diverges exponentially with the size of the system. Sequentially related changes of code have been successfully introduced to arrive at finite-time solutions [37,44]. Can we say that the mentioned solutions realize the reductionistic dream of finding a suitable computer program that not only climbs the single probability peak, but is also able to choose the highest peak? If so, the optimization problem would correspond to understanding the meaning of the object under scrutiny. We should realize, however, that spin glasses are frozen objects, given once for ever. A clever search of symmetries has produced a spin glass theory [37] that, like the Renormalization Group (RG) for critical phenomena [47], discovers a recursive procedure for changing codes in an optimized way. Even though the problem has a large number of potential minima, and hence of probability peaks, a suitable insight into the topology of the abstract space embedding the dynamical system has led to an optimized trajectory across the peaks. In other words, the correlated clusters can be ordered in a hierarchical way and a formalism analogous to RG applied. It must be stressed that this has been possible because the system under scrutiny has a structure assigned once for ever. In everyday tasks, we face a system embedded in an environment, which induces a-priori unpredictable changes in the course of time. This rules out the nice symmetries of hierarchical approaches, and rather requires an adaptive approach. Furthermore, a real-life context-sensitive system has to be understood within a reasonably short time, in order to take vital decisions about it.
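The frustrated three-spin triangle described earlier can be verified by brute-force enumeration of all 2³ configurations; this sketch is illustrative and not from the text.

```python
from itertools import product

# Bonds of a spin triangle: J = +1 ferromagnetic, J = -1 antiferromagnetic.
# A bond (i, j) is satisfied when J * s_i * s_j > 0.
def min_unsatisfied(J):
    # minimum number of unsatisfied bonds over all 2^3 spin configurations
    return min(sum(J[i, j] * s[i] * s[j] < 0 for (i, j) in J)
               for s in product([-1, 1], repeat=3))

ferro = {(0, 1): 1, (1, 2): 1, (0, 2): 1}
frustrated = {(0, 1): 1, (1, 2): 1, (0, 2): -1}  # one antiferromagnetic bond

print(min_unsatisfied(ferro))       # 0: all bonds satisfied by aligned spins
print(min_unsatisfied(frustrated))  # 1: every configuration frustrates one bond
```

For a lattice of many such triangles the exhaustive enumeration grows as 2^N, which is exactly the exponential divergence of computational difficulty noted in the text.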
We find again a role for control of chaos in cognitive strategies, whenever we go beyond the limit of a Bayes strategy. We call creativity this optimal control of neuronal chaos. Four cases of creative science are listed in Table 1. Furthermore, Fig. 24 sketches the reduction of complexity and chaos which results from a creative scientific step.

Appendix. Haken theory of laser threshold [28,29,30,34]

We summarize in Table 2 the Langevin equation for a field E, ruled by a dynamics f(E) corresponding to the atomic polarization and perturbed by a noise ξ. The noise has zero average and a delta-like correlation function with amplitude D given by the spontaneous emission of the N2 atoms in the upper state. The time dependent probability P(E, t) for E obeys a Fokker-Planck equation. In the stationary limit of zero time derivative, the Fokker-Planck equation is easily solved and gives a negative exponential of the potential V(E) of the force f(E). Below laser threshold, f(E) is linear, V quadratic and P(E) Gaussian. Above threshold, f has a cubic correction, V is quartic and P(E) displays two peaks at the minima of the quartic potential. Table 2.
Ė = f(E) + ξ   (Langevin equation)
⟨ξ(0) ξ(t)⟩ = 2D δ(t),  D = γ_spont N2   (noise correlation)
∂P/∂t = −(∂/∂E)[f(E) P] + D ∂²P/∂E²   (Fokker-Planck equation)
P(E) ≈ e^(−V(E)/D),  V(E) = −∫ f(E) dE   (stationary solution)
f(E) = −αE (below threshold);  f(E) = +αE − β|E|²E (above threshold)   (force laws)
References

Papers of which I am author or co-author can be found on my home page: www.inoa.it/home/arecchi , List of publications - Research papers in Physics
1. D.J. Amit, H. Gutfreund, H. Sompolinski, Phys. Rev A 32, 1007 (1985). 2. F.T. Arecchi, Phys. Rev. Lett. 15, 912 (1965). 3. F.T. Arecchi, Proc. E. Fermi School 1967 in Quantum Optics, Ed. R.J. Glauber (Academic Press, New York, 1969), pp. 57-110.
4. F.T. Arecchi, in Nonlinear dynamics and spatial complexity in optical systems (Institute of Physics Publishing, Bristol, 1993), pp. 65-113.
5. F.T. Arecchi, Physica A 338, 218-237 (2004). 6. F.T. Arecchi, in La Fisica nella Scuola, Quaderno 18, (Epistemologia e Didattica della Fisica) Bollettino della Assoc. Insegn. Fisica 40(1), 22-50 (2007).
7. F.T. Arecchi, E. Allaria, A. Di Garbo, R. Meucci, Phys. Rev. Lett 86, 791 (2001). 8. F.T. Arecchi, A. Berné, P. Burlamacchi, Phys. Rev. Lett. 16, 32 (1966). 9. F.T. Arecchi, S. Boccaletti, P.L. Ramazza, S. Residori, Phys. Rev. Lett. 70, 2277, (1993).
10. F.T. Arecchi, U. Bortolozzo, A. Montina, J.P. Huignard, S. Residori, Phys. Rev. Lett. 99, 023901 (2007).
11. F.T. Arecchi, V. Degiorgio, B. Querzola, Phys. Rev. Lett. 19, 1168 (1967). 12. F.T. Arecchi, V. Fano, in Hermeneutica 2007, Annuario di filosofia e teologia, (Morcelliana, Brescia, 2007), pp. 151-174.
13. F.T. Arecchi, W. Gadomski, R. Meucci, Phys. Rev. A 34, 1617 (1986).
14. F.T. Arecchi, E. Gatti, A. Sona, Phys. Lett. 20, 27 (1966).
15. F.T. Arecchi, G. Giacomelli, P.L. Ramazza, S. Residori, Phys. Rev. Lett. 65, 2531-2534 (1990).
16. F.T. Arecchi, G. Giacomelli, P.L. Ramazza, S. Residori, Phys. Rev. Lett. 67, 3749 (1991).
17. F.T. Arecchi, M. Giglio, A. Sona, Phys. Lett. 25A, 341 (1967).
18. F.T. Arecchi, R. Meucci, F. Salvadori, K. Al Naimee, S. Brugioni, B.K. Goswami, S. Boccaletti, Phil. Trans. R. Soc. A, doi:10.1098/rsta, 2104 (2007).
19. F.T. Arecchi, A. Montina, U. Bortolozzo, S. Residori, J.P. Huignard, Phys. Rev. A 76, 033826 (2007).
20. F.T. Arecchi, G.P. Rodari, A. Sona, Phys. Lett. 25A, 59 (1967).
21. T. Bayes, Phil. Trans. Royal Soc. 53, 370-418 (1763).
22. G.J. Chaitin, Algorithmic Information Theory (Cambridge University Press, 1987).
23. M. Ciszak, A. Montina, F.T. Arecchi, arXiv:nlin.CD/0709.1108v1 (2007).
24. R.J. Glauber, Phys. Rev. 130, 2529 (1963).
25. R.J. Glauber, Phys. Rev. 131, 2766 (1963).
26. R.J. Glauber, in Quantum Optics and Electronics, Ed. C. DeWitt et al. (Gordon and Breach, New York, 1965).
27. S. Grossberg, The American Scientist 83, 439 (1995).
28. H. Haken, Zeits. Phys. 181, 96-124 (1964); 182, 346-359 (1964).
29. H. Haken, Phys. Rev. Lett. 13, 329 (1964).
30. H. Haken, Laser Theory (Springer, Berlin, 1984).
31. H. Haken, H. Risken, W. Weidlich, Zeits. Phys. 204, 223 (1967); 206, 355 (1967).
32. R. Hanbury Brown, R.Q. Twiss, Nature 177, 27 (1956).
33. J.J. Hopfield, Proc. Nat. Acad. Sci. USA 79, 2554 (1982).
34. D.H. Hubel, Eye, Brain and Vision, Scientific American Library No. 22 (W.H. Freeman, New York, 1995).
35. M. Lax, Phys. Rev. 145, 110-129 (1966).
36. B. Libet, E.W. Wright, B. Feinstein, D.K. Pearl, Brain 102, 193 (1979).
37. M. Mezard, G. Parisi, M.A. Virasoro, Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
38. E. Rodriguez, N. George, J.P. Lachaux, J. Martinerie, B. Renault, F. Varela, Nature 397, 340-343 (1999).
39. T.A. Sebeok, Semiotica 134(1/4), 61-78 (2001).
40. L.P. Shilnikov, Dokl. Akad. Nauk SSSR 160, 558 (1965).
41. A. Shilnikov, L. Shilnikov, D. Turaev, Int. J. Bif. and Chaos 14, 2143 (2004).
42. H.A. Simon, Cognitive Science 4, 33-46 (1980).
43. W. Singer, E.C.M. Gray, Annu. Rev. Neurosci. 18, 555 (1995).
44. D.R. Solli, C. Ropers, P. Koonath, B. Jalali, Nature 450, 1054 (2007).
45. S. Solomon, in Ann. Rev. of Comp. Physics II (World Scientific, 1995), pp. 243-294.
46. G. Toulouse, S. Dehaene, J.P. Changeux, Proc. Nat. Acad. Sci. USA 83, 1695 (1986).
47. F. Varela, E. Thompson, E. Rosch, The Embodied Mind (MIT Press, Cambridge, MA, 1991).
48. K.G. Wilson, Rev. Mod. Phys. 47, 773 (1975).
EMERGENCE IN ARCHITECTURE
ENVIRONMENT AND ARCHITECTURE – A PARADIGM SHIFT
VALERIO DI BATTISTA
Politecnico di Milano, Dipartimento Building Environment Science and Technology – BEST

The interaction of human cultures and the built environment allows a wide range of interpretations and has been studied inside the domain of many disciplines. This paper discusses three interpretations descending from a systemic approach to the question:
- architecture as an "emergence" of the settlement system;
- place (and space) as an "accumulator" of time and a "flux" of systems;
- landscape as one representation/description of the human settlement.
Architecture emerges as a new physical conformation or layout, or as a change in a specific site, arising from actions and representations of political, religious, economic or social powers, being shaped at all times by the material culture belonging to a specific time and place in the course of human evolution. Any inhabited space becomes over time a place as well as a landscape, i.e. a representation of the settlement and a relationship between setting and people. Therefore, any place owns a landscape which, in turn, is a system of physical systems; it could be defined as a system of sites that builds up its own structure stemming from the orographical features and the geometry of land surfaces that set out the basic characters of its space.

Keywords: Architectural Design, Architecture, Built Environment, Landscape.
1. Introduction

A number of studies, both international (Morin, 1977 [19]; Diamond, 1997 [6]) and national (Bocchi and Ceruti, 2004 [1]; La Cecla, 1988, 1993 [14,15]), have recently highlighted a new and wider understanding of human cultures and their interaction with their settlements and the built environment. A part of the Milanese School of Architecture has been interested in these questions for a long time: I would like to recall, among others, Guido Nardi’s work on dwelling (Nardi, 1986 [21]) and some of our own considerations about the settlement system and the “continuous project” and its double nature – both intentional and unintentional (Di Battista, 1988, 2006 [7,9]). This framework allows a range of interpretations:
• architecture as an “emergence” of the settlement system;
• place (and space) as an “accumulator” of time and a “flux” of systems;
• landscape as one representation/description of the human settlement.
2. Architecture (be it “high” or not) as an “emergence” of the settlement system
If we define architecture as “the set of human artefacts and signs that establish and denote mankind’s settlement system” (Di Battista, 2006 [10]), we agree that architecture always represents the settlement that generates it, under all circumstances and regardless of any artistic intention. Architecture emerges as a new physical conformation or layout, or as a change in a specific site, arising from actions and representations of political, religious, economic or social powers, being shaped at all times by the material culture belonging to a specific time and place in the course of human evolution. As these actions constantly signal our way of “belonging to a place”, they consequently promote cultures of site and dwelling that denote each dimension of the settlements: from the large scale of the landscape and the city to the small scale of homes and workplaces. These cultures of different settlements involve both human history and everyday life. The “settlement culture” (that is, the culture belonging to a specific settlement) reveals itself by means of its own techniques and artefacts – terracings, buildings, service networks, canals… – and their peculiar features, related to religion, rites, symbols and style. Artefacts and techniques derive from a social and economic environment and highlight psychological and cultural peculiarities of the settled population. Therefore, our artefacts shape and mark places for a long time; moreover, they come from the past, continuously reflecting the changes occurring in the settlement and in the built environment. All this means that architecture often outlives its generating system, becoming a heritage to the following ones, thus acting as memory – an identity condition linking people and places to their past systems. This peculiarity, signalling both the continuity and the inertia of the built environment, derives from the many factors that shape the relation between people and places over time.

3. The variable of time and the built environment

Whenever we observe a system of objects – the landscape we are facing – it represents both what has been conserved and what has been transformed; it displays geometric shapes, dimensions, materials and colors in their relationships, and presents a great variety of information about the conditions and means by which every item has been produced and used, at any time. Every description always takes note only of the state of what has been conserved, because the information
about what has been transformed has been irretrievably lost. But even what we perceive as “conservation” is actually the result of transformation; only a very keen anamnesis and a historical and documental reconstruction can recognise the size and distance in time of past transformations; every backward enquiry subjects what we observe to our scientific and cultural models and, paradoxically, the more recent and keener it is, the more questionable it becomes. Moreover, no “case history” will ever be able to describe each and every interaction between the built environment we observe today and the settlement system it once belonged to. Every possible assumption about past events is always an interpretation biased by today’s cultural models and their leading values. This means that memory acquires and processes materials in order to describe a past that always – in different ways – gets to us through our current reality; it unavoidably produces a project – be it intentional or unintentional – that regards the future.

4. The bonds and values of time

Our built environment is the solid outcome of the different lifetimes of all the various things that today represent the system. They represent the “state of reality”, but also refer to the settlements that produced and selected them in time. Therefore, the built environment is the resultant of the many settlements that came one after the other in the same place, the resultant of un-realized imagination and enduring conditions; and it is the summation of all the actions – conservation and transformation, addition and subtraction – that have been performed over time in the same place we now observe. It means that today a place is the way it is (be it anyhow) just because in it a number of things happened, built up and dissolved in a number of times.
Every place is the resultant of a huge quantity of things and times:

N things, N lives, N events, N times = place N

This mound where things and human lives heap together, this summation of times, of every thing and of every human being that ever was in this place, is what we can read today in our landscapes. This huge amount of past lives we perceive, even if confusedly, may be the reason why we are so spellbound by historical and archaeological finds. Maybe we perceive our own brief existence, the continuous change of our landscapes, more keenly when we face those places where the past and the mound of time become more evident. Actually, today every open space is the background of an ever-changing setting of movable things; this transient scene repeats itself with equivalent components, depriving the background of any meaning. This may be the reason
why, in our culture, some monuments and places retain acknowledged values and sometimes become “sacred” in a strange way, being “consumed” by tourism in a sort of due ritual. The hugeness of the past that belongs to every place cannot be perceived anywhere and anytime; it can be lost when there are no – or no more – traces; in these cases, the links between a place and its times wear out in the speed of actions that happen without leaving any mark.

5. Architecture and society

No memory can be recalled when every trace of time past has disappeared, but no trace can reach across time if nobody calls it back by inquiry. What is the filter that defines the time of things? No project, no purpose of duration, no painstaking production can guarantee permanence. Only the strict bond between observed system and observing system gives body and meaning to the time of things in a given place. Our built environments, our settlements, are the references – which are in turn observing and observed – of the meanings connecting places and time. Therefore space receives time: it has received it in the past, it sees it flow in the present, it longs for it and fears it in the future. In the present, the different speeds of change in settlements (for instance, economic values change much faster than social ones) meet the existence cycles of existing buildings; this raises two major issues:
• the difference of speed of different settlements in different places of the world;
• the virtual availability of all places and landscapes of the earth.
This relativization seems to lessen values; indeed, it might offer a new meaning both to “different” conditions and to the material constitution and duration of the places where we live, even the more ordinary ones.
In this new relationship with “virtuality” we still find a condition of “dwelling” that always claims a perceptible, physical relationship between us and the space – very variable in character and dimension – that receives our existence, our time, our observations, our decisions, our actions. How do the various existences of places and their specific things meet the occurrences of the human beings that produce, use, conserve, transform or destroy those places?
To understand what happens in our built environments and dwelling places, we could imagine what happens in some of our familiar micro-landscapes, such as our bedroom and the things it contains. We could consider the reasons – more or less profound – that organize its space, its fittings, its use, the way we enter it, its outlook and so on. We could also consider the meaning of the different things that characterize that place where we live day after day, discovering and giving way to emotions and rationality, needs and moods, functions and symbols: all of these being more or less inextricable. Now, let’s try and move these reasons and actions and emotions to the wider landscape of social places. Let’s consider the number of subjects acting and of things present in our settlement; let’s multiply the spurs and the hindrances for every subject and every thing. Finally, let’s imagine how many actions (conservation, transformation, change of use etc.) could affect every single point and every link in the system. If we imagine all this, we will realize that the configuration and the global working of the system is casual; nevertheless, the organization of that space, the meanings of that place – of that built environment – are real. They can exist in reality only as an emergence (a good or bad one, it does not matter) of the settlement system that inhabits that same place.

6. Built environment and landscape

Any inhabited space becomes over time a place (a touchstone both for dwelling and identity) as well as a landscape, i.e. a representation of the settlement and a relationship between setting and people. Therefore, any place owns a landscape which, in turn, is a system of physical systems; it could be defined as a system of sites that builds up its own structure stemming from the orographical features and the geometry of land surfaces that set out the basic characters of its space.
It is a multiple space that links every place to all its neighbours, and it is characterized by human signs: the agricultural use of land, the regulation of land and water, all the artefacts and devices produced by the settled population over time. Thus every place builds up its own landscape, describing its own track record by means of a complex system of diverse signs and meanings. Every landscape displays a dwelling; it changes its dimensions (it can widen up to a whole region, or shrink to a single room) according to the people it hosts and their needs (identity, symbol, intentions of change, use, image…) and their idea of dwelling. This landscape is made of signs that recall past decisions, projects and actions; it gets its meaning, as a description of material culture, from everything
that has been done and conceived in it up to our age. And as soon as this space becomes a settlement – and therefore is observed, described, acted in – it becomes not only a big “accumulator” of permanencies and past energies, but also a big “condenser” of all the relations that happen and develop in that place and nowhere else. This local peculiarity of relations depends in part upon geography and climate, in part upon the biological environment (plants, animals), and in part upon the characters of local human settlements. At the time t0 of the observation, the place displays the whole range of its current interactions, and that is its identity. The landscape narrates this identity, that is, the whole system of the interactions occurring at the given time – between forms of energy, matter, people, information, behaviors – in that place. Every inhabited place represents, at the time t0 of the observation, the emergence of its settlement system; therefore, as it allows for an infinite number of descriptions, both diachronic and synchronic, it also happens to be – all the time – the “describer” of our settlement system. Every local emergence of a human settlement represents (with regard to the time of observation) both the condition of state t0 of the whole system and the becoming (in the interval t0 … tn) of its conditions, as the systems within and without it change continuously. Therefore, a place is the dynamic emergence of an open system, all the more complex and variable the more interactive with other systems (social, economic, political…) it is. Observing a place during a (variable) length of time allows us to describe not only its permanence and change – with entities appearing and disappearing – but also its existence flow. This idea – the existence flow of a place, or of a settlement – gives a new meaning to the architectural project in the built environment.

7. The existence flow of a settlement system

Every system of relations between human beings and their settlement shapes and gives meaning to its built environment in specific and different ways, according to the different geographic conditions and cultures. We agree that the built environment represents the balance, gained over time, between those environmental conditions and the cultural models belonging to that specific civilization. Some recent studies in environmental anthropology have investigated the feedback from the built environment to social behavior, and it would be useful to consider the cognitive functions that the “built environment”, in a broader sense, could represent.
Anyway, this balance (environment – culture), within the same space, displays a variation in conditions that can be considered as a true flow of elements (and existence) inside the place itself. Resorting to the coincidence “inhabited place/settlement system”, we can describe the space of a place (location and basic physical conditions) as a permanence feature, the unchanging touchstone of all the succeeding systems and their existence cycles. Therefore, could we actually investigate one dynamic settlement system, proceeding in the same place along a continuous existence flow, from its remote foundation to our present time? Or should we analyze by discontinuous methods this same flow as it changes over time and articulates in space? It depends upon the meaning and purpose of our description level. Every place is evidence of the whole of mankind’s history; our history, in turn, changes according to places. The whole flow of events deposits artefacts and signs in places: some of them remain for a long time, some get transformed, some disappear. Generally speaking, natural features such as mountains, hills, plains and rivers change very slowly, while their anthropic environment (signs, meanings, resources) changes quickly. The duration of artefacts depends upon changes in values (use values, financial values, cultural values etc.), and many settlements may follow one another in the same place over time. It is the presence and the variation of the values belonging to the artefacts that establishes their duration over time. On the other side, a built environment crumbles to ruins when people flee from it: in this case, it still retains materials and information, slowly decaying. Radical changes in the built environment, instead, happen when changes in the settlement system establish new needs and requirements.
As a settlement changes its structures (social, economic, cultural ones) by imponderable times and causes, so does the built environment – but in a much slower way and it could probably be investigated as an existence flow. In this flow relevant factors can be observed. Permanent and changing elements rely upon different resources and energies, and refer to different social values. Usually, the “useful” elements are conserved; when such elements embody other meanings (such as religious, symbolic, political ones) that are recognized and shared by a large part of the population, their value increases. Sometimes, elements remain because they become irrelevant or their disposal or replacement is too expensive. Duration, therefore, depends upon the complex weaving over time of the needs and the values belonging to a specific settlement
system, which uses this woven fabric to filter the features (firmitas, utilitas, venustas…) of every artefact and establish its destiny. Landscapes, as systems of signs with different lifespans, have two conditions. On one side, in the flowing of events, symbolic values (both religious and civil ones) have a stronger lasting power than use and economic ones, which are more responsive to the changes in the system. On the other side, what we call “a monument” – that is, an important episode in architecture – is the result (often an emergence) of a specific convergence of willpower, resources, many concurrent operators and other favorable conditions. This convergence only enables the construction of great buildings; only commitment and sharing allow an artefact to survive and last over time. It is the same today: only if a system has a strong potential will it be able to achieve and realize major works; only shared values will guarantee the long duration of artefacts.

8. Characters of the urban micro-scale

The multiple scales of the settlement system allow for many different description levels. There are “personal” spaces, belonging to each individual, and the systems where they relate to one another; these spaces include dwellings and interchange places, thus defining an “urban micro-scale” that can be easily investigated. Such a scale displays itself as a compound sign, a self-representation of the current cultural model which is peculiar to every settlement system; it has different specific features (geographical, historical etc.) characteristic of its meanings, roles and identities inside the wider settlement system.
The structure of images and patterns (Lynch, 1960, 1981 [16,17]; Hillier and Hanson, 1984 [12]), the urban texture and geometric features, and the characters of materials – such as texture, grain and color, their looking fresh or ancient – indicate such things as cultural models, the care devoted to public space by the population, and their public image and self-representation. Urban micro-space always represents an open system, a plastic configuration of meanings, where different flows of values, energy, information, resources, needs and performances disclose themselves as variations in the relationship between long-lasting and short-lived symbols and signs, which together constitute the landscapes where we all live. Such an open system allows for different levels of description; it requires a recognition, an interpretation of its changes, and some partial adjustment and tuning.
9. Inhabited places: use, signs, meanings

It would be necessary to investigate the complex interactions that link some features of the cultural model of a population in the place of its settlement (history and material culture, current uses and customs), the way inhabited places are used, the configuration of the ensuing signs (Muratori, 1959 [20]; Caniggia, 1981 [2]), and the general meaning of the place itself. The ways a space is used, and the conditions and needs of this use, generate a flow of actions; from these the system derives a casual configuration which is continuously – though unnoticeably – changing. This continuous change produces the emergence of configurations, systems of signs, which possess a strong internal coherence. Just think of the characteristics of architecture in great cities, corresponding to the periods of great wealth in their history. This emergence of things and built signs, and their mutual relations with one another and with their geographic environment, is peculiar to all places, but it appeals in different ways to our emotions and actions. The appeal of a place depends upon the different mix of values – linked to use, information, aesthetics, society etc. – that the observer attaches to the place itself; this mix depends upon the observer’s own cultural and cognitive model, as well as his/her needs and conditions (Maturana and Varela, 1980 [18]). In this conceptual framework, every built environment brings forward a set of values which are shared in different ways by the many observing systems inside and outside it. In their turn, such values interfere with the self-regulation of the different flows that run through the environment: flows of activities, personal relationships, interpretations, emotions, personal feelings that continuously interact between humans and things.
This generates a circular flow between actions and values, where the agreement connecting different parties is more or less strong and wide, and in variable ways regulates and affects flows of meaning as well as of activity and values.

10. Project and design

In the open system of the built environment and in the continuous flow of human settlements that inhabit places, there are many reasons, emotions and needs, all of which are constantly operating everywhere in order to transform, preserve, infill, promote or remove things. These intentional actions, every day, change and/or confirm the different levels of our landscape and built environment. This flow records the continuous variation of the complex connections between people and places. This flow represents and produces the implicit project that all
built environments carry out to update the uses, values, conditions and meanings of their places. This project is implicit because it is self-generated by the random summation of many different and distinct needs and intentions, continuously carried out by undefined and changing subjects. It gets carried through in a totally unpredictable way – as it comes to goals, time, conditions and outcomes. It is this project anyway, by chaotic summations which are nevertheless continuous over time, that transforms and/or preserves all built environments. No single project, either modern or contemporary, has ever been and will ever be so powerful as to direct the physical effects and the meanings brought about by the implicit project. Nevertheless, this very awareness might rouse a new project culture, a more satisfactory public policy, a better ethic in social and economic behaviour. A multiplicity of factors affects this project, resulting in turn in positive and negative outcomes (Manhattan or the Brazilian favelas?). What can be the role of intentional projects – and design – in the context of the implicit project, so little known and manageable as it is? How could we connect the relations between time and place deriving from our own interpretation of human settlements to an implicit project that does not seem to even notice them? Moreover, how could today’s practice of architectural design – as we know it – cope with such complex and elusive interrelations, at the various scales of space and time? What signs, what meanings do we need today to build more consistent and shared relationships in our built environments? Is it possible to envisage some objectives and some design method to improve the values of the built environment? How can we select, in an accurate way, what we should conserve, transform, renew or dispose of? How can we improve something that we know so little of?
We need a project of complex processes to organize knowledge and decisions, to find effective answers to many questions, and to bring positive interactions to the flow of actions running through our settlements.

11. Self-regulation, consent, project

The issue of consent and shared agreement about the shape and meaning of the human settlement is quite complex and subtle: it deals with power and control, with democracy, with participation. Decisions, choices and agreement cherish each other and correspond to the cultural and consumption models of the population. Consent, through the mediation of local sets of rules, turns into the customs and icons of local practices in building and rehabilitation activities. This usually
degenerates into the squalor of suburban housing; but it also makes clear that every human place bears the mark of its own architecture, through a sort of homeostatic self-regulation (Ciribini, 1984, 1992 [3,4]). Such self-regulation modifies meanings by means of little signs, while upgrading signs by adding new meanings. The continuous variation of these two factors in our environment is the result of a systemic interaction of a collective kind: it could not be otherwise. Will it be possible to improve our capacity to describe such a continuous, minute action that we all exert upon every dimension of the built environment? Will it be possible to use this description to modify and consciously steer the flow of actions toward a different behavior? Which role can the single intention/action play and, in particular, what could be the role of the project, as a mock description of alternative intentional actions within the collective unintentional action? How does the cultural mediation operate between project and commonplace culture? How can it be transferred into the collective culture – that is, into culture’s average, historically shared condition?

12. Research

A small settlement can represent, better than an urban portion, a very good place to investigate such complex relations; a good place to understand the role, conditions, chances and limits of the process of architecture-making (Di Battista, 2004 [8]). In a small settlement, the flows and variations of needs, intentions and actions seem clearer; their project comes alive as a collective description that signifies and semantically modifies itself according to values and models which are mainly shared by the whole community.
Here the implicit project (Di Battista, 1988 [7]; De Matteis, 1995 [5]) clearly displays itself as a summation of intentions and actions that acknowledges and signifies needs and signs settled over time; in doing this, it preserves some of these signs – giving them new meanings – while discarding others, in a continuous recording of the variations that locally occur in the cultural, social and economic system. This continuous updating links the existent (the memory of time past) to the perspective of possible futures. Moreover, it also links the guess upon possible change in real life and in dreams (that is, in hope) with the unavoidable entropy of the physical system. In this sense, the collective project that knows-acts-signifies the complex environment represents its negentropy (Ciribini, 1984, 1992 [3,4]). It would be advisable for such a project to acquire a better ethical consciousness. Thus, inside the collective project, the individual project would become the local
action confirming and feeding into the whole; or else, it could aim to be the local emergence, finally taking the lead of the collective model. The signs drawn from this continuous, circular osmosis (of models, intentions, predictions, actions, signs and meanings) reorganize, over and over, by local and global actions, the existing frame of the built environment. This osmosis never abruptly upsets the prevalence of present signs and meanings: it makes very slow changes, which can be fully perceived only over a time span longer than a human lifetime. This self-regulation allows the physical system to survive and change, slowly adjusting it to the continuous change of cultural and dwelling models; it preserves the places’ identity while updating their meanings. When the implicit project lumps diverging intentions and actions together, the whole meaning becomes confused and hostile. Today, many economic and social relationships tend to “de-spatialize” themselves; many organizations and structures displace and spread their activities, and individuals and groups tend to take up unstable relations and may also inhabit multiple contexts. Architecture seems to have met a critical point, shattering one of its main reasons: the capability to represent the relationship between the local physical system and the self-acknowledgement of the social system settled in it. Actually, this de-spatialization is one of the possible identities that individuals, groups and social/economic systems are adopting, and this could be the reason why many places are becoming more and more individualized/socialized. Therefore, the problems of globalization and of social and identity multiplicity cause such an uncertain and fragmentary forecast that they urge the need and the quest for places that can balance such upsetting; that is why places with memory and identity are so strongly sought after. “Landscape” can be one of the significant centers for this re-balancing action.
Landscape is perhaps the most powerful collective and individual representation of the many models we use to describe ourselves – the philosophical and religious as well as the consumerist and productive, or the ethical and symbolic ones. It is also the most direct portrayal of many of our desires and fears, both material and ideal. Landscape and architecture are not mere pictures, nor do they embody only aesthetic and construction capabilities; they are meaningful representations of the time and space emergence of the continuous flow of actions; they self-represent the settlement system in space (Norberg-Schulz, 1963, 1979 [22,23]) and time, and the deepest existential and symbolic relationships of mankind (Heidegger, 1951 [11]; Jung, 1967 [13]):
they are so rich and complex that we still find it very hard to describe and even imagine them.

References

1. G. Bocchi and M. Ceruti, Educazione e globalizzazione (Cortina, Milano, 2004).
2. G. Caniggia, Strutture dello spazio antropico (Alinea, Firenze, 1981).
3. G. Ciribini, Tecnologia e progetto (Celid, Torino, 1984).
4. G. Ciribini, Ed., Tecnologie della costruzione (NIS, Roma, 1992).
5. G. De Matteis, Progetto implicito (Franco Angeli, Milano, 1995).
6. J. Diamond, Guns, Germs and Steel: The Fates of Human Societies (1997); It. ed.: Armi, acciaio e malattie (Einaudi, Torino, 1998).
7. V. Di Battista, Recuperare, 36 (Peg, Milano, 1988).
8. V. Di Battista, in Teoria Generale dei Sistemi, Sistemica, Emergenza: un’introduzione, G. Minati (Polimetrica, Monza, 2004), Prefazione.
9. V. Di Battista, Ambiente Costruito (Alinea, Firenze, 2006).
10. V. Di Battista, in Architettura e approccio sistemico, V. Di Battista, G. Giallocosta, G. Minati (Polimetrica, Monza, 2006), Introduzione.
11. M. Heidegger, Costruire, abitare, pensare (1951), in Saggi e discorsi, Ed. G. Vattimo (Mursia, Milano, 1976).
12. B. Hillier and J. Hanson, The Social Logic of Space (Cambridge University Press, 1984).
13. C.G. Jung, Man and His Symbols (1967); It. ed.: L’uomo e i suoi simboli (Longanesi, Milano, 1980).
14. F. La Cecla, Perdersi. L’uomo senza ambiente (Laterza, Bari-Roma, 1988).
15. F. La Cecla, Mente locale. Per un’antropologia dell’abitare (Elèuthera, Milano, 1993).
16. K. Lynch, The Image of the City (1960); It. ed.: L’immagine della città (Marsilio, Padova).
17. K. Lynch, Good City Form (1981); It. ed.: Progettare la città: la qualità della forma urbana (Etaslibri, Milano, 1990).
18. H.R. Maturana and F.J. Varela, Autopoiesis and Cognition (1980); It. ed.: Autopoiesi e cognizione. La realizzazione del vivente (Marsilio, Padova, 1985).
19. E. Morin, La Méthode (1977); It. ed.: Il metodo (Raffaello Cortina, Milano, 2001).
20. S. Muratori, Studi per una operante storia urbana di Venezia (Istituto Poligrafico dello Stato, Roma, 1959).
21. G. Nardi, Le nuove radici antiche (Franco Angeli, Milano, 1986).
22. C. Norberg-Schulz, Intentions in Architecture (1963); It. ed.: Intenzioni in architettura (Lerici, Milano).
23. C. Norberg-Schulz, Genius Loci (Electa, Milano, 1979).
EMERGENCE OF ARCHITECTURAL PHENOMENA IN THE HUMAN HABITATION OF SPACE

ARNE COLLEN
Saybrook Graduate School and Research Center
747 Front Street, San Francisco, CA 94111 USA
E-Mail: [email protected]

Considering the impact on human beings and human activities of architectural decisions in the design of space for human habitation, this chapter discusses the increasingly evident and necessary confluence in contemporary times of many disciplines and human-oriented sciences, with architecture being the meeting ground to know emergent phenomena of human habitation. As both a general rubric and a specific phenomenon, architectural emergence is the chosen focus of discussion, and other phenomena are related to it. Attention is given to the phenomena of architectural induction, emergence, and convergence as having strategic and explanatory value in understanding tensions between two competing mentalities: the globally domineering nature-for-humans attitude, in opposition to the lesser-practiced humans-for-nature attitude.

Keywords: architecture, convergence, design, emergence, human sciences, induction, systemics, transdisciplinarity.
1. Introduction

What brought me to the subject of this chapter is my long-time interest in the occupancy and psychology of space. My approach to the subject is transdisciplinary and systemic, in that I think in contemporary times we have to converge many fields of study and understand their interrelations to know the subject. What I find particularly interesting and relevant are the reciprocal influences between one dynamic body of disciplines associated with architecture, art, design, and the engineering of the construction of human dwellings on the one side, and another body of disciplines associated with psychological and philosophical thought, human creativity and productivity, and well-being on the other side. Decades of research interest have transpired regarding the reciprocal influences between the two bodies of disciplines, but many would argue that the apparent marriage of architecture and psychology (to take one illustrative connection), through such a lens as environmental psychology [1] applied to architectural designs since the middle of the twentieth century, may have ended in divorce, by appearances of our human settlements of the early twenty-first century.

From my reading of the designers, architects and engineers whose jobs are to design and construct the spaces we inhabit, in recent decades the development of our cities and of the living spaces constituting them has become subject to the same homogenizing and globalizing forces shaping our consumer products and human services. But for a minority of exceptions, overwhelmingly, the design and construction of human habitats have accompanied industrialization, the standardization of the processes and products of production, and the blatant exploitation and disregard of the natural order and fabric of the physical world. From our architectural decisions, and the subsequent actions that follow them to organize and construct our living spaces, we have today an accumulation of their physical, psychological, and social effects. Our intentions to live, collaborate, and perform in all kinds of human organizations do matter. We are subject to, and at the effect of, the spaces we occupy. This chapter discusses the relevance of trans-disciplinary and systemic approaches that may inform architectural decision-making, and three architectural phenomena that accompany the dwellings we occupy.

2. Two Attitudes

What we do to our surroundings and each other in the form of architectural decisions has lasting effects. If we believe our surroundings are there only to serve us, to fulfill our needs to live, communicate, work, and breed, we have what may be termed the nature-for-humans attitude. Following this mentality, we freely exploit and redesign the natural world to suit ourselves. This attitude is rampant, and we see the results everywhere on the planet today. The opposite mentality is the minority view. Adopting this critical interpolation of consciousness, if we believe we are here to serve our surroundings in a sustainable fashion while fulfilling our needs, we have the humans-for-nature attitude.
It is a pragmatic attitude in which every action takes into conscious regard the consequences of the action on the environment. Unfortunately, only a small proportion of humankind appears to manifest this mentality at this time in human history. We may increasingly question the dominant attitude, such that we may justifiably ask: What are we doing in the design and construction of our habitats to evidence that the humans-for-nature attitude underlies all that we do? Architectural phenomena and decision-making are foci to explore tensions between the two attitudes.
3. Human Activity Systems and Organized Spaces

I have been involved with systems research and sociocybernetics for three decades [2]. I have been particularly interested in what we may call human activity systems [3]. A group of persons forms this kind of system when we emphasize, as the most important defining quality of such a system, the interactions among these persons. The interactions constitute the activity of the system. Much of the time the system is not very visible, existing only in our imagination. However, when the people meet in person, or communicate by means of technology for example, the system is activated; it comes alive. It is the communications among the persons that make the system visible. In sum, this is what we mean by a human activity system. It is common that we are members of many human activity systems simultaneously and during our lives.

The structures and places associated with human activity systems bring the subject matter of architecture to my research interest, because architecture, I believe, has a tremendous omnipresent influence on human activity systems. Typically today, we are separated from the natural environments that were common for most of humanity several generations ago. Most of us live our lives in cities. We live and work in contained and well-defined spaces. Considering the longevity of human history, the change from agrarian and nomadic non-city ways of life to the industrialized, consumer-oriented and modernized enclosed spaces of contemporary life has come fast. But an alternative way to think about it is to reflect upon the question: In what ways is the architecture of the life of a human being different today from two hundred years ago? This question is important, in that the architectural decisions of the past, as manifested in the dwellings we inhabit today, have, I submit, a profound influence on living, thinking, producing, and self-fulfillment.

The idea of organized spaces need not be confined to physical housing as we know it. Dwellings, such as schools, offices, and homes, and the physical meeting places within them, such as countertops, dining tables, and workstations, are but nodes of vast and complex networks of persons spanning the globe, made possible via our electronic media technology. Thus, we have various levels of complexity for human activity open to us to consider what organized spaces entail, namely both real and virtual spaces. In fact, such devices as the mobile phone have profoundly altered our idea of what an organized space is. The interface between real and virtual space means that wherever we are in the physical world, there is increasingly present the potentiality of an invasive, influential addition (radios, intercoms, cell phones, television and computer screens). These virtual avenues complicate our understanding of our inhabitation
of that physical space, because activation of a medium can at any time compete with as well as complement our activity in that place. Being paged or phoned may distract from current events or facilitate them. The interface has become so important to communication that virtual devices are now included in the architectural decisions to design and construct human habitats, for example in the placement of recreation and media rooms, and of electrical wiring. As a result, various technological media devices are evidence of extensions of human activity systems into virtual realms not immediately visible to us even when a group of persons is physically present at the same time in the same location.

4. Architecture Designs and Organized Space

One form of expression of the design of space is architecture. To make a decision that organizes space is an essential element that creates architecture. To impose architecture on a space is to organize the space for human habitation. Various organizations of space constitute architectural designs. This activity of ordering space, whether by design of the architect or of the inhabitant, can lead to a range of consequences for human activity, from extreme control by others on the one hand to personal expression, happiness, and ornate displays on the other [4,5]. Beyond the basics of the perceptual-cognitive relations involved in constituting design, the art and innovation in architecture tend to embroider and enhance its minimalism. However, contemporary approaches tend to challenge this view as too limiting, as evidenced for example when inhabitants of modernist architecture remodel and innovate to make their dwellings their own. Such secondary effects illustrate that we cannot sufficiently take into account the emergent consequences of imposing a given architectural design on human beings.
Defining architecture, from Vitruvius to the present day, and keeping it relevant to human settlements are challenges informatively described in terms of urbanizing concentrations of humanity as complex systems [6]. Further, a provocative journey through the development of architecture re-visions the aesthetic of architecture and the primacy of beauty in contemporary terms of the pursuit of happiness that we can experience and manifest in the design and inhabitation of constructed spaces [7].

5. Architectural Induction and Experiencing Space

It is a non-controversial fact, an existential given, that the space a living being inhabits has a profound influence on that living being. Where the biologist may
point to primary examples of this fact by means of the phototropic and hydrotropic propensities in life forms, the anthropologist may cite the prevalence and placement of certain raw materials, infusing the artifacts of festivals, ceremonies and other cultural events, that are distinguishing markers among peoples. Interacting with the constituent make-up of a living being, the environment is a determinant reality of that being. Arranging homes about a meeting place, limiting the heights of urban dwellings, and defining room sizes and their configuration to constitute the set of spaces of a dwelling are examples of architectural decisions. Architecture shapes and organizes the environment for human beings; de facto, architecture is a key environmental force.

As a human being, my principal point of reference for existence is my being. To survive, I think in this way and relate to all other persons, things, and places from my personal point of view, my vantage point. Thus, cognition, perception, psychology, and phenomenology are particularly relevant for me to explain, understand, create, design, construct, and change the spaces in which I live, work, and relate with other human beings. At every moment, induction has much to do with my experiencing of the space I inhabit. What sights, sounds, smells, touches and tastes make my space of a place? The objects I perceive and my cognizance of their configuration about me constitute my ongoing experience. My experience is amplified by my movement through space, which also means through time. My interactions with the objects are specific relations and my space a general relation, all of which are inductions. But those aspects of my experiencing of the space that may be attributed to decisions determining the overall design and organization of the space may be termed architectural induction.
By means of perception, cognition and action, we experience space in chiefly four ways: 1) in a fixed body position, we sense what is; 2) we sense what is while the body is in motion; 3) we interact with persons and objects that are what is; and 4) we integrate senses and actions of what is from multiple separate body positions. This internal frame of experiencing is an artificial articulation, of course, because we are doing all four simultaneously most of the time. What becomes experience of a given space is determined in part by the internal frame and in part by the architecture of the space we occupy. The architecture induces and the frame influences. From the resultant confluence, experience emerges.
Figure 1. Framing reconnects separated spaces.
6. Framing and Architectural Phenomena

Framing is a natural, inherent perceptual-cognitive process of being human (Fig. 1). To line out an area of space is to frame. It is to make separations in the space, to break the space into parts. What is included and excluded in the frame is an act of profound importance, having major consequences regarding architectural induction and emergence. One excellent example of framing in architectural design is making the window. The window is an elementary frame, depicted as a square, rectangle, triangle, circle, oval, or other such intended opening in what is otherwise a pure division of space. Let us consider the square window. What does each square window of a building, seen from a given vantage point, communicate? What is its inducement? When a square is made as a window, doorway, recess, or projection, what is induced? Consider some possible relations, not as facts, but only as hypotheses: the open square is separation, openness, or possibility; the double square is solidity, stability, or strength; the black-and-white or colored square is separation; the square with crossbars is confinement, imprisonment, or
control; the square of squares is separateness, security, or safety; and the square in a circle is fluctuation, alternation, tension, or creativity. Consistent with a phenomenology of experiencing space, the examples above serve to illustrate the relevance of the experience of the beholder and occupier of the space regarding the induction of the frame, in this case the square (like the window frame), and the consequent emergent elements of experience.
7. Arena of Inquiry Influences Architecture

Inquiry is often discussed in terms of paradigm. We may also realize that a paradigm is another example of framing. Philosophically, an arena of inquiry (paradigm) comes with an epistemology (knowing), an ontology (being), an axiology (set of values), and a methodology (means of conducting inquiry). We want to know the space. There is knowledge of the place. We can experience the space by being in it, and that is not the same as knowing about it. What we see, hear, touch, smell, and taste while in the place naturally spawns meanings, that is, interpretations of what we feel and think about the place. We bring to the place prior experiences that can influence and bias the framing. There are many ways we may value the place or not. And there are ways to explore, alter, and work the place into what we want or need it to be. But there usually are important elements to respect, preserve, and honor in the place. Finally, there are means to redesign and reconstruct its spaces.

An arena of inquiry is comprised of the basic assumptions and ideas that define the decisions and practices leading to the architecture. As an arena, it influences the work and process of the inquirer, in this case the architect who designs, the builder who constructs, and the human beings who occupy the space. When the architect adopts and works within one arena (paradigm), it is a way (frame) of thinking that influences and guides, but also limits, thinking. But it is necessary to have one to enable the discipline to exist. For the disciplined inquirer, in this case the architect, the frame (paradigm, arena) provides the rules, conceptual relations, principles, and accepted practices to make the architectural decisions required to compose and present the organization of space for human habitation. The paradigm scheme that I find informative is close to one published recently [8]. Four paradigms are described there to study the effects of organized space, and I add a fifth (Systemic) paradigm for a more inclusive application to architecture. In brief, working within the Functional paradigm, we would be
preoccupied with whether the architecture is useful, efficient, and organizes space as intended. Does it work? To design within the Interpretive paradigm, we emphasize how people feel in the space, how they experience it. Is it reflective and enlightening? In the Emancipatory paradigm, we organize space to empower or subdue, liberate or imprison. Does the architecture free or control its occupants? To work in the Postmodern paradigm means to replicate and mimic the diversity and creativity of the human beings who are to occupy the space. We would have a major interest in whether the architecture is heuristic and pluralistic, or delimiting and homogenizing. Finally, working within the Systemic paradigm, we would look for ways to combine, balance, configure, and complement the best features of the other paradigms when applied to a particular space. This broadest paradigm would be multi-methodological rather than restricted to one paradigmatic frame. The Systemic paradigm would be most akin to trans-disciplinary architecture, discussed later in this chapter.

Given the variety of dwellings we see in our cities today, I find meaningful the following relations between paradigm and kind of organized space: the Functional affiliates with the factory that makes a consumer product, the Interpretive with the socializing place of a restaurant, the Emancipatory with the health spa that promotes human healing, the Postmodern with the communal park square that supports the social diversity of the community, and the Systemic with combinations of the above.

To illustrate this progression, take the application of school architecture. During the industrialization of the European and American continents, our public school systems arose for the populace as places to house our children while parents worked in factories. It is often argued that education then was more about control and socialization than learning and personal development. The design and construction of schools served the former ends. Of course, these outdated functionalist ideas cannot serve our present conditions and needs, even though the idea of containment in a space called school still appears to prevail. The architecture of schools has advanced greatly in exploring the design and construction of more open environments [9,10], in fact to the extreme of considering the community the learning laboratory that once was the classroom. Learning is continuous, life-long, and increasingly augmented by the Internet. Places of learning are confined no longer to the metaphors of the one-room schoolhouse, the bricks-and-mortar campus, and local geography.

To decide the inclusion and placement of a rectangular or oval window in a wall is a prime element and architectural decision. The decision is not divorced from the frames we bring to the act, but on the contrary is partly induced by them. Familiarity with the arenas of inquiry in advance, I contend, invites more informed choices and a higher level of awareness to make the architectural
decisions required to design, construct, and alter human habitats to fulfill the range of human interests represented in the arenas.
8. Architectural Emergence

The complexity of framing described in the two previous sections becomes even more profound when we take into consideration that the relations among the elements of the space we perceive change continuously and that multiple paradigms apply. Note that the relations enrich and compound experience, for example, when we smell the changing odors walking through a garden (the passage of the body through space), and when, sitting, we see shadows moving on a wall through the day and feel rising and falling temperatures over days (occupying the same place through time). We are both instruments and recipients of change.
As we move through spaces, the body moves in a constant state of essential incompletion. A determinate point of view necessarily gives way to an indeterminate flow of perspectives. The spectacle of spatial flow is continuously alive . . . It creates an exhilaration, which nourishes the emergence of tentative meanings from the inside. Perception cognition balance the volumetrics of architectural spaces with the understanding of time itself. An ecstatic architecture of the immeasurable emerges. It is precisely at the level of spatial perception that the most architectural meanings come to the fore [11]. As every point of view gives way to the spatial flow of experience, an architecture emerges (Fig. 2). It is inherent in the existent manifest experience of the space occupied. It is a resultant architectural induction. There will likely be an architecture associated with the place one occupies, whether an office, town square, restaurant, or home. But we can also state that the idea of architecture is emergent from the personal experience of the place. That emergent phenomenon from the person is a valid phenomenon. Furthermore, it is justifiably legitimate to name the architecture of one’s experience and communicate it to others. This personal reference point and name of the experience are to be distinguished from the name architecture that is likely associated with the person and design used to construct and organize the space prior to human occupancy. The personal architecture has greatest relevance. From a phenomenological point of view, the totality of organized space experienced personally constitutes the experiential manifestations of consciousness. When lights, sounds, odors, and objects pervade a space, the space, as we experience it, is as much about what is there as what is not. The
Figure 2. Multiple paradigms apply in organizing the spaces of this Parisian indoor emporium for the intended architectural induction to promote emergent behaviors expected in a haven of consumerism.
following are illustrative paired qualities of experience that may become descriptors of our experience of a particular place: empty-full, present-absent, visible-invisible, loud-quiet, black/white-colored, soft-hard, hot-cold, and strong-weak. They represent dimensions of experience, along which we use language to label and communicate experience to others. What is the sight, sound, smell, touch and taste of the space of the place? But descriptors need not be restricted to the sensorial. More complex constructions occupy our experience of space. Are the materials synthetic and artificial, or natural? What and who occupies the space? What interactions among the occupants of the space add to our experience of the place? Our perceptions and cognitions of sounds, lines, shapes, colors, odors and contacts become forces of influence. One may read, reap, interpret, and make meanings: the essential structures and contents of
consciousness of the place. But of great relevance is the relational nature of the space to our perceptions of the space and meaning attributions that constitute the experience we reflect upon, report, and discuss with others. The particular qualities that describe our experience in the most rudimentary and essential respects are emergent phenomena constituting the experience. They are examples of emergence. Regarding those aspects that stem from decisions determining the overall design and organization of a given space, we may use the phrase architectural emergence to refer to them. The phenomena of induction and emergence are complementary processes, like the two sides of the same coin. They are evocations of our existence in context. Which one to highlight is a matter of emphasis. We may focus on the inductive nature of experiencing space. The impact of the place is described in terms of induction. What flows from the habitat to the occupant, so to speak? What is the induction? Alternatively, we may focus on the emergent qualities of our experience of the place. When in the place, what comes forth to become the foreground of consciousness? What is emergent? Generally speaking, we may refer to the two phenomena as the architectural induction and architectural emergence of the organized space, respectively, when we can know the key architectural decisions involved to design and organize the space associated with the induction and emergence. To illustrate succinctly, placement of a stone arch at the entrance/exit joining two spaces (rooms, courts, passages) has an induction/emergence different from that of a heavy horizontal beam.
9. Systemics of Architecture, Emergence, and Attitude

Put people together in a place. Organize the space by means of architecture, via the architect, the occupants, or both. After some time, their interactions will likely induce a human activity system. In other words, a social system of some kind emerges, a human activity system defined not simply by the collective beings per se, but more definitively by their interactions. The nature and qualities of the interactions make the system what it is. But it is important to include in our thinking: the architecture of the space is part of the system. It induces, influencing human interaction, thereby participating in the emergence of the properties that come to characterize the system. Given the many interactive relations of the people with the environment and each other, the concepts and principles applied to describe the designing and organizing of the space for the human beings who occupy it may be termed the systemics of its architecture, that is, those systemic concepts and principles applied to and active in that context.
Figure 3. The office building skyscraper dominates the cityscape.
To illustrate, we may imagine a particular dimension of our experience of place (hot-cold, small-large, still-windy). If we select one element too extremely and focus on it, the whole may go out of balance with the other elements. In other words, a strong force or energy from one element can go so far as to obliterate the presence of others in the space. One element may overshadow the others, as one large tree blocks the sunlight that would nourish the other trees. We witness this spectacle on entering a city square or the living room of a home, immediately noticing a towering building or a large stone floor-to-ceiling fireplace, respectively, with all other entities occupying the space organized around it. The size and intensity of the dominating entity (Fig. 3) tend to command and hold the attention, and to block out or mask other entities. Whether the space is being organized in genesis, such as the design, plan, and construction of a new building, or the built space is being altered, such as in remodeling a home, there are architectural decisions being made. The elements that dominate the space, the emergent qualities, may become particular inducements known to and characteristic of that architecture. The kiva (a half-egg-shaped, oven-like fireplace), for example, has acquired this distinguishing status in the homes of Santa Fe, New Mexico.

As to the systemic nature of architecture, we may wonder what overriding principle influences our thinking to make the architectural decisions by which the prominent qualities emerge. Is ideal architecture balance? Once we have knowledge of the emergent elements of a given architecture, is the task to find the balance of the most favorable inducements for human habitation? Similarly, we may ask: is ideal architecture the integration of those elements known to promote
well-being? Of particular relevance is that the emergence of any element to dominate the experience of the occupants of the place may lead further to concerns of human betterment at one extreme and human detriment at the other extreme. Which attitude (nature-for-humans or humans-for-nature) do the hallmark elements of an architecture support? What hallmarks a “green”, ecologically sustainable architecture?

The thesis developed in this chapter is that the spatial organization we impose through architectural decisions is an inducement in the emergence of the human social systems inhabiting the space. It merits testing, to seek evidence for and against it and to learn whether it might be applied in constructive ways for human betterment. Given current concerns over survivability, it would also support shifts in consciousness from the presently dominant to the advisedly sustainable attitude. Our understanding of this relation seems both obvious and critical to the best of what architecture has to contribute. It should be generally known what inducements favor sustainability, well-being, productivity, and peaceful cohabitation.

There is a powerful feedforward loop prominent in the systemics of architecture in its integral relation with design and technology [2]. Civilization progresses by accretion through novelty, diversity, and necessity [12]. We benefit from the discoveries and achievements of those who precede us. Through our immediate activities of design and construction involving feedback loops, we learn what works and what does not. The process is very pragmatic, requiring invention, innovation, and refinement; practical application; and extensive repetition by trial and error until efficacious action becomes reliable and sustainable. Thereby, we come up to the challenge of what is needed to solve the problems of our day. In the case of architecture, the performance, maintenance and endurance of the spaces we design and occupy come under our scrutiny.
Ideally, our evaluations should lead over subsequent generations to dwellings increasingly superior in their construction [13], and to our healthy living and experience of them [7,14]. As applied to the systemics of architecture, the myriad feedback loops of human activity systems, coupled with the more macro feedforward loop linking generations, are at the heart of second-order systemics [15]. It is from the latter that architectures should emerge to apply to the present challenges we face.
10. Emergence of Trans-disciplinary Architecture

One implication from the design, organization, and construction of the spaces we inhabit is that the emergent qualities bring preeminent importance to the trans-disciplinary nature of architecture. It follows naturally from the systemics of architecture applied to a given space, because making an architectural decision has become an increasingly complex endeavor. Some areas to consider are cultural elements; recognition of the unique qualities of indigenous materials; imaginative perspectives; knowing the physical, physiological, psychological, social, and economic effects of the architecture on living beings; familiarity with current environmental conditions and fauna; knowing the perceiver’s angle of vision; the history of the place; and the preconceptions of the inhabitants. All of these areas have a potential for inclusion in a particular architectural decision. Bringing a set of them together to define in part a given architecture recommends consultation with a range of experts, disciplines, and knowledge domains beyond the principal training and experience of the architect. Thus, to ensure successful completion of a project, the situation commands a systemic approach to organizing the space involved. A confluence of disciplines becomes important to consider, and likely necessary, in order to design both conscientiously and consciously with the humans-for-nature attitude. This means a trans-disciplinary approach to making architectural decisions.

This chapter has considered architectural phenomena and some aspects of architectural decision-making that would recommend organizing space for human habitation based on systemic and trans-disciplinary approaches. But articulation of the aspects often merely introduces the key elements comprising the experience of those who made the original architectural decisions, and later of those who occupy the place. From the relations among elements, specifically those that stem from various fields of study and disciplines of human experience and inquiry, we may see trans-disciplinarity emerge.
Although matters of economics, physical design, perceptual cognitive relations, and engineering of structure are critical to applications of architecture, there are also psychological, socio-cultural, historical, and contextual influences to be included. For a particular place of human habitation, too much weight given to one aspect may have adverse consequences on the other aspects specifically and the entire space generally. Again, we must question the main principles driving the architectural process, such as balance or integration, mentioned earlier in this chapter.
Emergence of Architectural Phenomena in the Human Habitation of Space

11. Summary and Conclusion

Our experience of space influences our state of being, our relationships with others, home and work life, and connectedness to context. The name induction is given to label this phenomenon. Induction is a mediating construct suggesting critical relations between architectures and human activities. The important consequence of induction is termed emergence, another phenomenon, defined as a quality, feature, or characteristic of human interaction with the environment and with others that is associated with, and intentionally attributed to, the inductive influences of the place. Once the influences are known, their intentional confluence in making architectural decisions is termed convergence. When applied to developing human habitats, architectural induction, emergence, and convergence may become advantageous in promoting mutually beneficial humans-for-nature relations. The three architectural phenomena can have strategic and explanatory value for detecting and understanding these consequences. The presumption is that heightened awareness of these phenomena, and the framing we apply to decision-making, may better enable us to perceive acutely the influences of organized space on our well-being, human relations, and activities; to evidence the multiple systems of which we are part; and to design more efficacious spaces for human beings and human activities. This chapter has been written with systemic and trans-disciplinary importance given to the imposition of architecture in a place. Sensitivity to the phenomena of induction, emergence, and convergence is imperative. Well worth studying are the architectural decisions in their relations to architectural designs and consequential evocations. If we are to become more appreciative of and caring for our environments, and thereby gain in quality of life, it is paramount that we understand and apply these relations as wisely as possible.
References
1. D. Lowenthal, J. of Environmental Psychology 7, 337 (1987).
2. A. Collen, Systemic Change Through Praxis and Inquiry (Transaction Publishers, New Brunswick, New Jersey, 2004).
3. P. Checkland, Systems Thinking, Systems Practice (Wiley, New York, 1981).
4. L. Fairweather and S. McConville, Prison Architecture (Architectural Press, New York, 2000).
5. C. Day, Spirit and Place (Architectural Press, New York, 2002).
6. V. Di Battista, Towards a systemic approach to architecture, in Ref. 15, p. 391.
7. A. de Botton, The Architecture of Happiness (Pantheon, New York, 2006).
8. M. Mobach, Systems Research and Behavioral Science 24, 69 (2007).
9. M. Dudek, Architecture of Schools: The New Learning Environments (Architectural Press, New York, 2000).
10. A. Ford, Designing the Sustainable School (Images Publishing Group, Victoria, Australia, 2007).
11. S. Holl, Parallax (Architectural Press, New York, 2000), p. 13.
12. G. Basalla, The Evolution of Technology (Cambridge, New York, 1988).
13. A. Stamps, Psychology and the Aesthetics of the Built Environment (Springer, New York, 2000).
14. J. Hendrix, Architecture and Psychoanalysis: Peter Eisenman and Jacques Lacan (Peter Lang, New York, 2006).
15. G. Minati, Towards a second systemics, in Systemics of Emergence: Research and Applications, Eds. G. Minati, E. Pessa and M. Abram (Springer, New York, 2006), p. 667.
QUESTIONS OF METHOD ON INTEROPERABILITY IN ARCHITECTURE

EZIO ARLATI (1), GIORGIO GIALLOCOSTA (2)
(1) Building Environment Sciences and Technology, Politecnico di Milano, Via Bonardi, 15 - 20133 Milan, Italy
E-mail: [email protected]
(2) Dipartimento di Progettazione e Costruzione dell'Architettura, Università di Genova, Stradone S. Agostino, 37 - 16123 Genoa, Italy
E-mail: [email protected]

Interoperability in architecture illustrates contemporary instances of innovation. It aims, through the standardization of instruments and procedures (and especially through shared languages of/in IT tools and applications), at the optimization of interactions amongst agents and of the work done. Within a consistently non-reductionist systemic approach, it requires: (1) interactions and activities of conscious government in/amongst its fundamental component parts (politics, technical aspects, semantics); (2) the development of shared languages and protocols, to verify that technical, poietic, etc., innovations do not destroy accumulative effects and peculiarities (axiological, fruitional, etc.).

Keywords: systemics, architecture, innovation, sharing, interoperability
1. Introduction

"Some might be filled with wonder watching a flock of birds, but such wonder derives from the impossibility of understanding their means of communication: wonder comes from a lack of comprehension, one cannot understand because the communication codes are unknown or, if one prefers, because there is a lack of interoperability between their language and ours" (Marescotti, in [1, p. 53], author's translation). In a similar way, in architecture, different languages and/or ineffective codes of communication in the transmission of data, information, etc., and in the operational instructions themselves, lead to interpretative difficulties, often resulting, at the least, in inefficiencies and diseconomies in technical and management processes. Interoperability in architecture therefore aims at optimizing interactions amongst agents (as well as the work done), using shared common standards for processing and transmitting documents, information, etc.
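Marescotti's flock-of-birds image can be restated in code: two applications interoperate only if they share a code of communication. The following is a minimal sketch under stated assumptions; the field names and the JSON carrier are our own illustration, not part of any actual standard:

```python
import json

# Hypothetical shared standard: both tools agree on these field names and units.
SHARED_FIELDS = {"id", "type", "length_m"}

def export_element(element):
    """Tool A serializes a building element using the agreed field names."""
    if set(element) != SHARED_FIELDS:
        raise ValueError("element does not follow the shared standard")
    return json.dumps(element)

def import_element(payload):
    """Tool B can interpret the payload only because the code is shared."""
    element = json.loads(payload)
    unknown = set(element) - SHARED_FIELDS
    if unknown:
        # Without a common code of communication, the data is uninterpretable.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return element

wall = {"id": "W-01", "type": "wall", "length_m": 4.5}
assert import_element(export_element(wall)) == wall
```

The round trip succeeds only while both sides hold the same schema; a field outside the shared vocabulary is rejected, which is the "lack of interoperability" the quotation describes.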
Interoperability, however, if consistently intended in a non-reductionist sense [4, pp. 84-86], and [1, p. 23], should "(...) be developed in three modalities: technical interoperability, of which one can clearly state that although IT techniques and tools (hardware and software) present no problems, problems do exist and lie in the ability to plan adequate cognitive and thus cultural models; semantic interoperability, which takes us back to interdisciplinarity and the construction of dictionaries and thesauri; political interoperability, reflected in law, in the value and various aspects of the law, and in data property and certification. On these questions, standards are studied and developed (...) through the activities of agencies such as ISO (International Organization for Standardization) and OGC (Open GIS Consortium) ..." (Marescotti, in [1, pp. 56-57], author's translation). In the same manner, the sharing of standards and protocols (which constitute the instrumental apparatus for interoperability), when used coherently and without detracting from the languages themselves, axiological particularities, etc., ensures:
• development of the cognitive and cultural models already acquired (in the sense of improved relative synergies), or also tendencies towards new ones (more adequate and effective interactions);
• validation of the former models as an interdisciplinary resource [4, pp. 49-51].
Only in this sense can interoperability in architecture find real meaning in an advanced systemic approach: this poses the fundamental questions of method for its use in a non-reductionist key.

2. From the industrialization of building to interoperability in architecture

From a rationalist point of view [3, pp. 30-40], the industrialization of building seals a relationship between architecture and industry in the sense of a possible mass production (in terms of product standardization and of the interchangeability of products and components). This is the result of those instances of innovation, typical of post-artisan choices and logic, pursued through approaches and operational praxis inspired by mass production (prefabricated parts and the industrialization of on-site castings) which, however, carry important implications for the design and management stages. It leads, especially with the use of construction techniques with high levels of prefabrication and/or industrialization of on-site castings, to standardizations which are not always
compatible with project poiesis, nor with more consolidated aspects of construction culture or living culture. Over the past few decades, however, new needs and awareness regarding settlement processes have emerged; often, moreover, they transform obsolete connotations of progress and development into evident dichotomies: from growth driven by localized hyper-population to the containment of urban sprawl, the renovation of the pre-existent, and the sustainability of building activities. New assumptions develop regarding the architecture-industry relationship, deployed mainly:
• according to greater compatibility with the structural peculiarities of the former (the singular nature of buildings, the limited opportunities for product standardization, etc.);
• consistently with the most important developments in the latter (series of moderate quantity, analogical series, etc.).
With such an evolution of the scenario, the traditional connotations of the industrialization of building are modified: they appear less and less characterized by the rigid assumptions of mass production [3, pp. 41-64]. More recent connotations of the industrialization of building, therefore, tend to pursue the standardization of instruments and procedures while minimizing mass-production aspects: and, with the latter, any implications of possible offsets in the design stage and in many of the operational and management aspects of architecture. Amongst these objectives, technical interoperability leads, as mentioned, to a need for optimized integration of agents (and of their respective activities), through shared languages currently developed as specifications (the IFC Standards - Industry Foundation Classes) for the production/application of interoperable software (a). Clearly, the use of the latter:
• represents an optimized scenario of operational integration, supporting programming, design, and life-cycle management of building organisms, plant and infrastructure networks, etc.;
• becomes an important component of an instrumental apparatus for a systemic approach towards architecture, identifying, moreover, shared integrations coherent with effectively non-reductionist approaches to its various aspects (functional, technological, energetic, structural, etc.);
• develops, above all, shared and optimum opportunities, even though only operational ones, for the management and control of some of those factors (design, technical-operational, etc.) inducing processes of emergence in architecture [4, pp. 37-42, 98-99].

(a) The IFC standard, standardized internationally through ISO/PAS (Publicly Available Specification) 16739:2005, is open source, and thus freely available to competent and expert users; it is run by the IAI (International Alliance for Interoperability), an international institution comprising researchers, public-sector managers, industrial organizations, academics and university teachers. IAI-IFC develops applications of BIM (Building Information Model) concepts; it allows the representation, in an explicit, shared and thus interoperable way, of objects and their spatial-functional interrelationships. The BIM model consists of a unified information system whose component parts are explicitly marked in terms of represented entities, geometric and typological correlations, and assigned characteristics; operation is regulated as a function of the tasks and responsibilities given to the various subjects holding given capabilities and decisional roles; each subject, while authorized to operate only on their own activities, can visualize the whole set of transactions in progress in the model. In this way the integration of the various decisions can benefit from the simultaneous

Thus the dichotomy (which still exists) between the unpredictability of emergence and the need for the prefiguration of architecture can, in some respects and to some extent, be reduced, also through the use of shared simulation and modeling for the preliminary management and control of possible interactions amongst those factors: design, operational, etc. More generally (and reiterating some previously mentioned aspects), interoperability in architecture, together with other instruments:
• overcomes previous connotations of predominant product standardization in building industrialization, endorsing late-industrial assumptions of predominant procedure standardization;
• requires/foresees mature connotations of a systemic approach to architecture, replacing traditional structural reductionism.
In the same way, in the development of interoperability, the risks explicit in the previous praxis of building industrialization are still present, although in different forms and ways. As mentioned above, in fact, emphasizing product standardization often implies behavior which potentially ratifies operational approaches and the consequent outcomes of building projects; in a similar way, the standardization of procedures and practices, using decision support systems whose manner of use can wipe out cultural peculiarities in the carrying out and managing of architectural activities, in the safeguarding of memory, etc., may lead to:

checks regarding potential, undesired conflicts and/or interference, allowing adequate solutions to be found rapidly.
• possible removal, with those peculiarities, of the premises regarding the multiple languages, axiologies, distinctive traits, etc., of the various architectural contexts;
• unacceptable breaks in cultural expression and accumulation, and ratification and reductionism at such a level (b).
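The operating regime described in note (a), in which every subject is authorized to modify only the objects within its own decisional role while all transactions in progress remain visible to everyone, can be sketched in a few lines. The class and attribute names below are our own illustration, not part of the IFC specification:

```python
from dataclasses import dataclass, field

@dataclass
class BIMObject:
    oid: str
    domain: str        # discipline responsible for the object, e.g. "structure"
    attributes: dict

@dataclass
class BIMModel:
    objects: dict = field(default_factory=dict)
    transactions: list = field(default_factory=list)  # visible to every subject

    def update(self, subject, subject_domain, oid, **changes):
        obj = self.objects[oid]
        # Each subject is authorized to operate only on its own activities...
        if obj.domain != subject_domain:
            raise PermissionError(f"{subject} cannot edit {obj.domain} objects")
        obj.attributes.update(changes)
        # ...while the whole set of transactions in progress stays visible,
        # so conflicts and interference can be checked as they arise.
        self.transactions.append((subject, oid, dict(changes)))

model = BIMModel()
model.objects["B1"] = BIMObject("B1", "structure", {"material": "concrete"})
model.update("eng-A", "structure", "B1", material="steel")
```

A subject from another domain attempting the same update would be refused, while still being able to read the transaction log: a toy version of the governed sharing the text describes.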
In this sense technical interoperability, when correctly understood, validates together:
• shared languages and operational contributions (and the efficiency of the work done),
• the cultural premises and peculiarities of the many and varied architectural contexts, especially as regards the effectiveness and flexibility of the work done (related to project poiesis, axiological acquisitions, model outcomes, etc.),
and requires the removal of any technicist drift in the set-up and use of standards, protocols, etc. In the same way, as mentioned above, it also requires conscious government of, and interaction with, the modalities (according to Marescotti) of political interoperability and especially semantic interoperability: where one considers, in the validation and more advanced development of those cultural premises and peculiarities, the roles and contributions ascribable to interdisciplinarity and to the construction of dictionaries and thesauri (Marescotti, in [1, pp. 56-57]). Within this framework, interoperability, through consistent interaction between its modalities (technical, semantic and political), as mentioned above, combines advanced definitions from a systemic approach and prepares the ground for a non-reductionist development of architecture. Or rather: when accepted in a rigorously systemic manner, it acts at the operational levels of architecture while conditioning cultural settings and developments (and in this sense one is dealing with the validation of virtuous tendencies).
(b) Here, it should be stressed, more generally, that there is a risk of technological determinism, typical of sophisticated IT structures when not suitably controlled, especially regarding the man-machine interface: clearly, this risk also tends towards an uncontrolled process of technological proxy.
3. Methodologies of technical interoperability

3.1. Cultural origins of methodological equipment and contextual motivations for evolutionary developments

The methodological apparatus for technical interoperability in the processes of design, production and management of architecture has so far essentially been limited by a conservative approach: an approach whose main aim, as for most software companies, has been rapid market success for IT applications. This bears clear witness to the immature nature of supply for the building industry (especially in Italy), compared with other sectors that are particularly competitive in global markets (regarding the optimization of the efficacy/efficiency of the work done), such as electronics, avionics and high-precision engineering. It also bears witness to, and is an expression of, the separate and fragmented nature of the multifarious markets for building products, which preside over and choose, mistakenly rooted in a given geographical site, the main points of their specificity: so far these markets have been able to condition the various actors in building activities and the nature of their initiatives, partly on the basis of effective requirements for the validation of local cultures, acquired values, etc., but also out of unmotivated unwillingness to develop procedures for the integration and sharing of technological and operational resources. It also follows from this (amongst the various doubts surrounding a persistent identity/mass-production dualism) that the existing systems of representation and processing on IT platforms are difficult to integrate, being aimed more at confining the need for cooperation between the various actors involved within the boundaries of specific product series (c).
But it is precisely this scenario which constitutes the object of a progressive change, typical of the current state of the processes of production and management of architecture and mainly due to the rise of two decisive factors:
• the multiplication of the know-how needed to satisfy sets of requirements of increasing breadth and complexity, stimulated by the need to optimize the use of resources at continually higher levels of quality;
• the development of regulatory aspects, aimed at providing guarantees in terms of security and the certainty of quality throughout the whole building production line, faced with the increased attention being paid to economic aspects and social strategies in the creation of the constructed environment (whose sustainability crisis, as mentioned above, is now apparent).
Within this context of rapid transformation, the progressive modification of the building initiative requires the production line of the city and of its buildings to undergo a global re-thinking of its meanings:
• in relation to its economic and social points of reference,
• faced with a system of requisites which are no longer completely part of the traditional knowledge of the architectural disciplines.
Thus there is the need, before any other condition, for a renewed thought scenario capable of representing the wealth of interdependencies and interactions amongst the decisive factors in the design of architecture. In this way emerge the reasons for accepting technical, or rather technological (the better term, given its sense of cultural renewal), interoperability, with its procedural and instrumental corollaries, as an environment of operational resources for reaching the objectives of sharing between knowledge and know-how. The cognitive approach is the fundamental directing criterion for the adoption of the method of technological interoperability; its operational horizon is, in fact, that of a melting pot in which one can:
• contaminate with insight the demands from the various universes of reference (the traditional disciplines and their decision-making instruments, so far separate),
• remodel the nature of the relationships and interactions amongst the factors determining the configuration of processes and products.
In this sense the contribution of expert knowledge is fundamental: that is, the acquired knowledge of those with experience of the past effects and consequences of the in-service behavior of building objects under given design assumptions.

(c) Nevertheless (as previously mentioned) one may often find, due to other aspects (and as an effect of mistaken ideas about innovation transfer and of operational approaches towards interdisciplinarity): (1) direct transfers of procedures and equipment from other sectors towards the building industry, (2) instead of suitable translations into it (and thus coherent with its effective specificities) of practices and outcomes reached elsewhere.

3.2. Technologies for modeling data-flows

Techniques of virtual representation, expressed through reference to an explicit expert knowledge-base and to a declared asset of
requisites and qualitative objectives (d), allow one to render coherent the evaluation of a design model even during its initial conception, through the support of a powerful software simulation environment. Further: the availability of advanced processing and simulation equipment is leading to the progressive loss of the (still significant) aspects of an exclusive and separate nature, allowing one to observe, and cooperate in, the development of a project down to its most minute details. These aspects thus offer operators shared opportunities for representation, modeling and checks on the flow of transactions of meanings and values during the development of the design solutions, simulating their concrete realization as well as the later stages of use and management. In this way in architecture, as mentioned, the unpredictability of emergence can, to a certain extent, be directed by following the positive propensities of the component parts and interactions which produce it: parts and interactions, naturally, which can effectively be managed through forecast probabilistic hypotheses. Thus, in the force field between political, and especially semantic and technical, interoperability (Marescotti, in [1, pp. 56-57]), the structure of the cognitive instances which circumstantiate the design of architecture can take shape. For this, the powerful instrumental support is still that virtual model which configures the objects cooperating in the formation of the project, identifying the main interactions (semantic, functional, material, technical-constructional, maintenance, etc.).
The essential intrinsic novelty in any real project, and its natural, innovative content, lies in the order taken on by the set of significant factors in a given context, at a given cultural and productive point in time, and aiming at a specific scenario of present and future requirements: from this point one proceeds, through successive translations of knowledge into their respective appropriate languages and through progressive transformations of identity, to the construction of a virtual model of solutions to the programme of requisites to be satisfied. Experimental control of the outcome of the design stage (traditionally very late in the life-cycle, and extremely expensive in terms of time and work due to the comparisons which have to be made with alternative solutions) has so far, because of the continuing insufficient development of design-support technologies, limited experimental activities on alternative designs mainly to within the sphere of the expert knowledge of individual designers and executors:

(d) One of the fundamental aspects presiding over the introduction of technological interoperability into the architectural project consists precisely in maintaining the possibility that the motivation/necessity network be rendered explicit, described, feasible and controllable by a community sharing the whole ensemble of aims and interests (or at least significant parts of it).
expert knowledge, moreover, that is predominantly implicit (the tradition lacking a system of representation of design values beyond formal-linguistic, constructive or structural critique), and thus not sufficient (at least in terms of efficiency) as a substitute for interoperable resources which are, as previously mentioned, fundamental to ensuring governance. Properly, the binomial of governance and technical interoperability, the former intended as a factor impeding (amongst other things) technological determinism (e), allows optimum (and efficient) experimental activities. Interoperability offers the various specialists involved in the design the possibility of actually seeing the same identical model through their own specific system of software instruments, into which they have imported, read and processed the original relational database, without the need for re-processing, re-coding, etc., and avoiding approximate interpretations (f). It supports the design stage throughout its development: from its birth, to the definition of the model (in all its spatial and technological components), through the executive and operational definition, to the updating of the data on the edifice as built, down to the management and maintenance stages. It is then possible to proceed with the processing of the component parts (structures, plant, etc.), properly integrated and with optimum control of mutual interferences and overlaps, updating the original model with the grafting of new components, and also processing families of comparable and superimposable models (thus providing possible design-stage alternatives about which choices can be made). The project stage thus described generates original potentialities for reaching the prefixed qualitative objectives. It does, in fact, allow the implementation of a praxis of experimenting with design solutions during their actual formulation, and optimizes the acquisition of gained experience by grafting it onto new projects with much greater rapidity with respect to traditional methods.

(e) See note b.
(f) The BIM modeling environment assumes the presence of a family of specialist software instruments, conceived for cooperating in the definition of the various sets of characteristic values defining the project; each is an expression of specific domains of expert knowledge deriving from the diverse disciplinary traditions, all of them operating on the basis of the ability to recognize and process the objects defined in the relational database, together sharing: (1) the object-oriented concept, (2) 3-dimensional vectorial representations in Euclidean space, (3) the qualitative and parametric characteristics defined in the attributes. A further fundamental assumption is the existence of a conventional code for the description, interpretation and recognition of objects in their effective nature (in the sense of represented entities) as defined in the relational database. See note a.
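The "optimum control of mutual interferences and overlaps" between component parts is, at its simplest, a geometric check on the 3-dimensional representations shared through the relational database (note f). A minimal sketch using axis-aligned bounding boxes, a deliberate simplification of real clash-detection engines, with illustrative object ids and coordinates:

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned bounding box of a component in Euclidean 3-space."""
    oid: str
    lo: tuple  # (x, y, z) minimum corner, metres
    hi: tuple  # (x, y, z) maximum corner, metres

def interfere(a, b):
    """Two boxes overlap if their extents overlap on every axis."""
    return all(a.lo[i] < b.hi[i] and b.lo[i] < a.hi[i] for i in range(3))

def find_interferences(components):
    """Pairwise check of mutual interference between model components."""
    hits = []
    for i, a in enumerate(components):
        for b in components[i + 1:]:
            if interfere(a, b):
                hits.append((a.oid, b.oid))
    return hits

beam = Box("beam-01", (0.0, 0.0, 3.0), (5.0, 0.3, 3.4))
duct = Box("duct-07", (2.0, 0.1, 3.2), (2.4, 0.2, 3.6))
print(find_interferences([beam, duct]))  # [('beam-01', 'duct-07')]
```

Because every specialist tool reads the same geometric representation, a conflict such as the duct crossing the beam can be detected while the design is still being formulated, rather than on site.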
The multitude of objects implemented (each with its attributes) in the relational database, moreover, and its inherent expert-knowledge contributions, especially when explicable ad hoc for different project situations (cultural, contextual, poietic, etc.), ensure significant increases in the possible variations of options and applications. Naturally, however, these do not exhaust the number of theoretically possible alternatives.

3.3. Paradigms of experimental application

Currently, the interoperable technologies of architectural design (based on IAI-IFC standards) require experimental applications in pilot-projects or similar, in order to bring into the experiment the areas of know-how of the various agents who contribute to design and actuative decisions, and in order to verify the effectiveness of the modeling (g). To evaluate this effectiveness, and with it the ability to offer performance advantages to both agents and end-users, the results must be compared with various paradigms. Mainly, the possibilities of:
• increasing the levels of government of a project,
• proceeding to its most important interfaces,
• progressively understanding its main qualitative and economic aspects,
• modeling a project,
• sub-dividing it into its main phases,
• acquiring experience,
• facilitating successive monitoring of behavior during the life-cycle.
The increased possibilities of governing a project, facilitated by the IAI-IFC standard, clearly define prototypes optimized for positive correspondence with end-user requirements and the expected quality levels as defined by regulation: this means, especially, control of the efficacy of the experimental procedures on the basis of their achievable advantages (cost-benefit). Feasibility, and the checking of the important interfaces with ranges of further models on the building and micro-urban scales, allow control over the suitability for habitation and services required by the end-users, and/or the opportunity of implementing innovative aspects. The possibility of progressively understanding, during project modeling, its main qualitative and economic aspects is fundamentally inscribed in the checks made retrospectively with respect to the useful life-cycle of the end-product; in

(g) Applicative experiences of this type are already under way in the BEST Department at the Politecnico di Milano.
this sense the design stages are considered strategic, for logical and temporal priorities, also as a function of their interoperability with the successive ones (quality and economies in construction, maintenance, etc.). The effectiveness of modeling a project also concerns the quality and rapidity of interaction with evaluations of cost and of the technical-operational process (calculations, estimates, etc.). Articulation into its main phases also concerns the identification of building suppliers' products, to be integrated into the construction, management and maintenance cycles (libraries of interoperable products). The acquisition of experience (implemented, coded and shared within the relational database) is also functional to its suitable re-use in future activities. The monitoring of the behavior of the manufactured object during its life-cycle can also be facilitated thanks to the implementation, in a single relational database, of the results of the initial simulations and of those carried out during successive operations.

4. Conclusions

It becomes particularly clear, partly reiterating some of the above aspects, how the theoretical-methodological approaches and the outcomes of the experience of technical interoperability carried out so far suggest, for architecture, the need for and the opportunity of developing its cultural statutes and the sense of its formal and symbolic languages, which are still often reduced, confined within self-referentiality, and conserving supposed purifying objectives and a presumed intangibility of their relevance. The innovative opportunities offered, and especially the reasons inferable from the current historical context (social, economic, environmental sustainability, etc.), can no longer be inscribed within single disciplinary outlooks, the breaking up of disciplines, or separated expertise: those opportunities and reasons, on the contrary, require the fundamental integration of disciplines and expertise, an optimized osmosis of specialist rigor, expert knowledge, etc., capable of shared syntheses at the highest levels of generality (and not of vagueness) and prospectively a harbinger of positive outcomes. In this sense, interoperability in architecture, as seen above, acquires decidedly systemic connotations. In fact, it promotes forms of collective, shared, multidisciplinary and positively desecrating knowledge; it is aware of synergy and does not obliterate, on principle, the specificities and peculiarities of contributions (expert knowledge). On the contrary, it implements the sharing of a cognitive model amongst the agents.
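The shared cognitive model argued for here ultimately travels as files: an IFC export is a STEP physical file (ISO 10303-21) whose data section lists typed entity instances, and any conforming tool can at least enumerate what the model contains. A minimal sketch of such an inspection; the instance ids and attribute placeholders below are illustrative, not taken from a real model:

```python
import re
from collections import Counter

# Lines in the style of an IFC DATA section; contents are illustrative only.
ifc_lines = """
#1=IFCPROJECT('2O2Fr$t4X7Zf8NOew3FL9r',$,'Pilot project',$,$,$,$,$,$);
#2=IFCWALL('1hOSvn6df7F8_7GcBWlR72',$,$,$,$,$,$,$,$);
#3=IFCWALL('0jf0XDTimBLvM2S4fAKuLb',$,$,$,$,$,$,$,$);
"""

def entity_types(step_text):
    """Tally the entity types (#id=TYPE(...);) found in STEP data lines."""
    return Counter(re.findall(r"#\d+\s*=\s*(IFC\w+)\s*\(", step_text))

counts = entity_types(ifc_lines)
print(dict(counts))  # {'IFCPROJECT': 1, 'IFCWALL': 2}
```

That a three-line function can read what any compliant authoring tool wrote is the practical face of the shared, open languages and protocols the chapter calls for.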
E. Arlati and G. Giallocosta
Such an innovative tendency is even more striking when one considers (to a much greater extent than in other productive sectors):
• the traditional (still physiological) heterogeneity of the operators in building activities,
• the frequency of conflict amongst them,
• the lack of willingness towards common codes of communication amongst the various skills,
• the frequent diversification of interests (also due to persistent disciplinary barriers),
• etc.
Moreover, the main objective of past experience in the industrialization of building, namely development in a precisely industrial sense in the building sector, can now also be pursued, although with different strategies, and at least in terms of adopting ideally structured (and shared) operational practices, through the contribution of interoperable resources. The latter, however, still require further experimentation and elaboration, especially, as far as is relevant here (and as observed above), regarding developments in a non-reductionist sense. Technical interoperability, in this case, certainly requires further development and improvement, for example in the optimum rearrangement of cognitive and cultural models (Marescotti, in [1, p. 56]), with more flexible protocols and instrumentation, etc. But above all, and of fundamental importance, confirming once again what has been said above, is coherent interaction amongst the technical, political and semantic components of interoperability in architecture, which will render it completely suitable for:
• encouraging advanced and shared developments,
• favoring positive tendencies of components and interactions which lead to emergence,
• producing efficient practices of diversity management.
And not only on principle.

References

1. V. Di Battista, G. Giallocosta, G. Minati, Eds., Architettura e Approccio Sistemico (Polimetrica, Monza, 2006).
2. C.M. Eastman, Building Product Models: Computer Environments Supporting Design and Construction (CRC Press, Boca Raton, Florida, 1999).
3. G. Giallocosta, Riflessioni sull'innovazione (Alinea, Florence, 2004).
4. G. Minati, Teoria Generale dei Sistemi, Sistemica, Emergenza: un'introduzione (Polimetrica, Monza, 2004).
COMPREHENSIVE PLANS FOR A CULTURE-DRIVEN LOCAL DEVELOPMENT: EMERGENCE AS A TOOL FOR UNDERSTANDING SOCIAL IMPACTS OF PROJECTS ON BUILT CULTURAL HERITAGE
STEFANO DELLA TORRE, ANDREA CANZIANI
Building Environment Science & Technology Department, Polytechnic of Milan
Via Bonardi 3, 20133 Milano, Italy
E-mail:
[email protected],
[email protected]

Cultural Heritage is comprehensible within an integrated vision, involving the economic, cultural and ethical values typical of non-renewable resources. It is an open system that does not correspond merely to monuments, but is made up of the complex interactions of a built environment. The systemic relationships between cultural goods (object, building, landscape) and their environmental context have to be considered as important as the systemic relations established with stakeholders/observers. A first partial answer to the systemic nature of Cultural Heritage has been the creation of "networks" of cultural institutions, which later evolved into "cultural systems" and have recently been followed by "cultural districts". The Cultural District model puts forward a precise application of the theory of emergence, but its systemic nature also presents some problematic identifications. For Cultural Heritage the point is no longer limited to "direct" actions: we must consider stakeholders/observers, feedback circuits, and the emergence of the activation of social/cultural/human capital, more than that linked to the architectural design process.

Keywords: local development, relation object - user, Heritage, network of relationships.
1. Cultural Heritage: between nature and history

1.1. Cultural "things" or cultural "heritage"

A new vision of the role of Nature and Cultural Heritage may be found in the aware, transdisciplinary vision of man and of his works within ecosystems and the environment that arose during the second half of the last century. It is enough to recall the Club of Rome's first report of 1972, The Limits to Growth, or the UNESCO Convention concerning the Protection of the World
a While the ideas expressed in the present paper derive from common analyses and reflections, the writing of the first part (1.1, 1.2) should be attributed to Stefano Della Torre, and the second (1.3, 2, 3) to Andrea Canziani.
Cultural and Natural Heritage (1972), and the ICOMOS Amsterdam Declaration (1975). The idea that Cultural Heritage is formed by "all the goods having a reference to civilization history" and that a Cultural Good is every "material evidence having value of civilisation" [1] was affirmed internationally in 1954 by The Hague Convention [2], and in Italy by the statements of the Commission for the Protection and Enhancement of the Historical, Archaeological, Artistic, and Natural Heritage (Commissione di indagine per la tutela e la valorizzazione del patrimonio storico, archeologico, artistico e del paesaggio), commonly known as the Franceschini Commission after its chairman, established in 1964. This supersedes the aesthetic conception on which former laws had been based, in favour of a wider idea of cultural value: a value that includes every tangible and intangible piece of evidence and is not limited to aesthetic or historic excellence. It is the expression of "a concept of culture that, against the imbalance caused by fast economic and social modifications [...], assumes a broad anthropological sense and puts Nature in touch with History. Therefore artistic and historic heritage is considered as 'concrete entity of site and landscape, of survival and work' uninterruptedly placed over the territory, and therefore not open to be considered separately from the natural environment, and in fact coincident at last with Ecology itself" [3, pp. 30-31]. The relevance of giving an open definition and of bypassing a merely visual approach is due to the fact that persisting in the preservation of "things of great value", handled one by one as a consequence of the aesthetic exceptionality that brought them under preservation laws, means persisting in the vision of Cultural Heritage as belonging to its own separate category, where value is expressed by a "declaration of interest" that cannot and shall not go beyond the merits of that particular "thing".
A first consequence of this state of things is to separate the goods from their territorial and social context: the very context that produced them. A second consequence is to divide a world where cultural value is absolutely prevailing, so that any restoration cost seems admissible, from another world where cultural value is instead absolutely negligible. A third consequence is the division between the object and its history: the "work of art" is considered merely the materialization of a pure artistic value, forgetting its documentary evidence. In this way, preservation based on single value declarations singles out (i.e. divides, isolates) each object to be preserved, which is thereby excluded from the evolution of its active context, social and economic. Only a symbolic and aesthetic value is attributed to goods, and they are used, at most, as tourist attractions. It is
unnecessary to underline the coherence between this preservation model and the idea of restoration una tantum, or better still, once and for all. And it is unnecessary to stress its distance from systemic visions and sustainable strategies. This terminological change should have implied a change in general perspective [4]: both a shift from a passive, static preservation to a proactive one, based on prevention and maintenance, and a different attention to the types and ways of enhancement and management activities, directed to support an open and real common fruition of the cultural value within the good. Nevertheless, this has only partially happened.

1.2. A new idea of culture: heritage as open system

Cultural goods are indeed an open system that does not correspond to monuments. Attention should go beyond single elements, considering their inclusive interactions and "recognizing in the general environment the more important informative document" [3, p. 31]. The idea that attention has to be given not to single goods or to their sum [3, pp. 44 ff.][6, pp. 78 ff.] (the idea of the catalogue as a taxonomic enumeration that should exhaust the global knowledge of a system), but to their interactions, opens onto wider systemic visions. Cultural Heritage is only comprehensible within a global and integrated vision, involving the economic, cultural and ethical values typical of non-renewable resources. Therefore neither restoration as a sum of isolated interventions, nor the separation of protection processes from their territorial context [7][15, p. 13 ff.], makes any sense. This is the basis of a conservation defined as a coherent, coordinated and planned activity. From this perspective the value does not consist in the tangible object, but in its social function, "seen as intellectual development factor for a community and as historical element which defines the identity of local communities" [4,8].
Nowadays we know beyond doubt that Cultural Heritage is a source of exchange, innovation and creativity [9]. Therefore, speaking of enhancement means referring to choices and acts that allow the potentialities of a cultural good to be used to create, principally, a social advantage. Of course, even if enhancement has to do with possibilities of use/fruition by a larger and larger public, in both a synchronic and a diachronic sense [10], it cannot set aside protection. An object that becomes part of heritage must preserve the values that make it historical evidence, a symbol of cultural identification for a community. These values consist in its
authenticity, which is always tangible, related to its material body. The evidence value consists in the fabric, because memory depends on the signs of passing time on the fabric. Without preserving all the richness of the signs in their authenticity, the evidence itself is lost, and there is no emergence process of the value that originates from the meeting between the object and people's histories. The systemic relationships between cultural goods and their environmental context have to be considered as important as the systemic relations established with individuals and society; and perhaps the latter exercise an even deeper influence. This means that we have to give substance to enhancement not only at the level of actions on each object/building/landscape (allowing fruition, minimizing carelessness, providing support services and rights management [11], ...), but also by working on the territorial context (transport system, accommodation facilities, ...), on environmental quality [12] and on the improvement of the social context (comprehension and recognition, involvement of users in care, planned conservation processes, ...). From this viewpoint, the integration between different territorial elements is crucial to reach what we might call "environmental quality". This is one of the main strategic aims for the protection of cultural goods, seen not as separate entities whose aging has to be stopped, but as systemic elements whose co-evolutive potentialities we have to care for [13,17]. Cultural Heritage accessibility also involves the issues of mass democratic society and cultural citizenship rights [14][3, p. 89, p. 255], of social inclusion and cultural democracy [16]. The relationship between use/enhancement and socio-historic values [17, pp. 10-13], with its accent on the user/observer, outlines an idea of heritage that can have both a direct influence on development, as innovation/education, and also, of course, an indirect influence on the economic system.
What may look like an inversion between direct and indirect benefits is only apparent: indeed, if development is knowledge, economic benefits are the main positive externalities of knowledgeb [18]. What it means to conserve cultural goods is wonderfully expressed by Andrea Emiliani when he writes: "It was no longer possible to imagine that a painting was not part of a certain church, and that that church, in turn, was not an integral part of a certain city, a certain landscape, a certain economy and a certain society. It was no longer possible to ignore that, however many the interventions of consolidation and restoration, the overall result would have opened a new phase of precariousness and risk for historic buildings which had indeed been put back in order, but which were destined to suffer the consequences of the ever faster and unstoppable weakening of their territorial contexts. Anyone with even a remote familiarity with such matters must by then have known that restoring a painting in a mountain church, in an area afflicted by depopulation, had at best an interlocutory significance, or one of mere physical conservation: a landslide would later strike that church; deforestation might facilitate that landslide; the socio-economic fragility of the area would accelerate the deforestation... and what use would it then have been to have restored that painting, to have removed it from its cultural context, to have, in the end, subtracted it from its surviving social function, and finally, by its very absence, to have aggravated the conditions of that area? And if Raphael can survive even in the undoubtedly rarefied atmosphere of the museum, for a painting of lesser historical or qualitative interest life in its original context is everything, or almost everything. To remove it from there means to bring about, lightly, that fearsome phenomenon of 'déracinement' which is the most dangerous attack that can ever be organized against a cultural object" [19]. Considering the whole territory as a common good means that its control mechanisms have to deal with heritage conservation and evolution. That means dealing with participatory management, with scenarios of shared design, and with confrontation with the studies of other disciplines, such as politics, economics, and the social or biological sciences. These are the frameworks of governance, of systems like cultural districts, of the transdisciplinary approach.

b "All that is spent during many years in opening the means of higher education to the masses would be well paid for if it called out one more Newton or Darwin, Shakespeare or Beethoven". (A. Marshall, 1920 [5, IV.VI.26]).

1.3.
From integrated cultural systems to cultural districts

The first partial answer to the systemic nature of Cultural Heritage has been the creation of "networks" of cultural institutions, such as museums or libraries, which later evolved into "cultural systems"c [20]. The "cultural system" refers to ideas of programming and management rationalization, expressing a deeper aim of generalization. The addition of an adjective like "integrated" expresses awareness of the importance of connections with the territory and of the diversification of resources. But planned and controlled action still
c "Many of the considerations accompanying initiatives to 'bring museums into a system', such as those concerning the application of museum standards, tend instead to regard the system as an organizational figure capable of making up for the supposedly insufficient dimensions of local museums, allowing them to behave like museums of larger dimensions." (Maggi and Dondona, 2006, [15, p. 6 ff.]).
remains central, as if it were a predictable mechanism of networks and transmissions. The integrated cultural systems have recently been followed by "cultural districts". As Trimarchi said, "Looking carefully, the axiom was easy. Italian industrial production [...] has developed in this complex unit, with a renaissance taste, in which features that for big industry would be defects (informality, the importance of family relations, the role of decoration, etc.) are instead irreplaceable virtuous elements" [21]. The district model at last seems to be the answer to all the enhancement problems of a Cultural Heritage that so often comes out as the basic Italian economic resource [22]. Starting from the success of many cultural districts within urban environments, a set of models has been developed for new scenarios of culture-driven development, trying to deal also with situations really different from urban ones. And so, while the model is still under the analysts' lens, quite a few prototypes are beginning to be applied: usually combinations of cultural institutions, or development initiatives gathered under some cultural symbol, whose establishment as a system is supposed to produce profitable spin-offs, never quite well defined at present, but assumed to be certain and extensive in the future [23]. But bringing together districts, production, development and culture is not easy. A cultural district has been defined, from a theoretical standpoint, as the product of two key factors: "the presence of external agglomeration economies and the awareness of the idiosyncratic nature of culture [which is peculiar to a given place or community and to a specific time]. When these two factors join within a dynamic and creative economic environment, the conditions for having a potential cultural district are satisfied. Adding efficient institutions is the political factor that can transform a potential district in a real outcome" [24].
Attention is concentrated mainly on the creation of a value chain, and on the main role played by the organization of so-called Marshallian capital: that "industrial atmosphere" of continuous and repeated transactions which causes information to circulate. There are actually several connotations of the cultural district [24,25,26,32]. The word recalls an industrial/economic matrix, and therefore the idea of incomes generated by Cultural Heritage, or even of a commercialization of culture. But a more detailed analysis makes clear that such a connotation has been dropped, and the expression is used because of its relationship with local community participation, with responsiveness to government incentives, and with the capability of such a system to produce and spread innovative cultural issues and external
economies connected with innovation [23]. The idea of a district stresses the added value of concentration and localization, but also the emergence of these processes. The cultural district idea is linked to an inclusive vision that can re-discuss and understand, on the one hand, the role of Cultural Heritage within the development economies of a single territory, and on the other hand "the deep change in the role of culture within contemporary society and its present intellectual and emotional metabolisms" [27]. It is possible to recognize in people's mental space the main infrastructure at which programming and planning should aim. Each action has the possibility of improving the cultural capital, i.e. the local community's capability [28]. From the viewpoint of conservation studies, where we investigate the cultural mechanisms and practical procedures that form the basis of architectural heritage preservation, this model is particularly interesting. The systemic links among heritage, territory and society represent the cutting edge of preservation. Moreover, the accent that the district model puts on user participation is in tune with the most up-to-date visions of the government/governance shift, and with conservation as an activity made of studies, capability and prevention.

2. Emergence between cultural districts and architectural heritage

The cultural district model puts forward a precise application of the theory of emergence. Starting from the systemic nature of Cultural Heritage, we observe that preservation and enhancement procedures present complex interactions between elements, and that the properties of heritage as a whole are not deducible from single objects. We need strategies going beyond the search for a local perfection. The same strategies need to be self-controlled and adaptive [29], to respond to the evolutionary nature of Cultural Heritage.
Moreover, we have to consider that what is heritage and what is not is absolutely observer-dependent, with a crucial role played by the observers' awareness of the process. Within this framework the Dynamic Usage of Models (DYSAM) [30] is particularly suitable, having to deal with different stakeholders' expectations and with the modelling of social systems [31]. But the systemic nature of a cultural district, and of every single object, building or landscape, also presents some problematic identifications. A district is clearly an emerging system because of the nature of the interaction and behaviour of its elements; but how can interactions act on a common property? And to which set do these elements belong? It is not obvious that a church
and a vineyard belong to the same set. A mere geographical link is too weak, and the district link would be self-referential. The only real link is due to the sense of belonging to Cultural Heritage, and it requires a conscious act of acknowledgment. Since a system does not behave as a machine, where each component plays a role and can be replaced without the need to act on the others, any change influences the whole system. No intervention on the system is possible by acting on elements alone, and the characteristics of a system can be influenced only by more complex interventions on the interactions of components over time. How does that reflect on the conservation of each single object when we deal with Cultural Heritage? "A building is considered a dynamic system, where technological interventions inspired by the concept of machine and repair are absolutely inadequate, based on reductionist simplifications. The task is to identify the level of description where occur those emergence processes that maintain the materiality of structure and to support them" [30, p. 92]. But what is the communication between a single building and the territorial complex? The possibility of working on a single item without prejudice to the system may indeed be a useful property for the conservation of the system. But for Cultural Heritage we must take into account the impossibility of replicating the material support, and the need for specific processes requiring studies and a non-trivial, non-standardized knowledge. It is therefore evident that the point is no longer limited to "direct" actions on the good: we have to consider stakeholders (the observers) and feedback circuits. According to Crutchfield [33] there are three possible types of emergence: the intuitive definition (something new appears), pattern formation (an observer identifies organization), and intrinsic emergence (the system itself capitalizes on patterns that appear).
If only the last is a correct and complete specification of systemic properties, a few more doubts arise about its applicability to districts. How might we speak of systemic properties not reducible to the properties of single elements, but emerging from their interactions? Is it possible to speak of unpredictable behaviours that lead to the need for a new cognitive model? At first glance, economic interactions between, for instance, the agricultural and construction sectors do not seem to be unpredictable, and the same might hold for interactions between built objects and quality of life. We should more properly speak of "non-trivial consequences" of a model. However, these consequences are predictable by analyzing the model, and this leads us to the second type of emergence. Architecture is something coming from a design process; its basis is the prediction of use, coming from forecasts and intentions. If an emerging behaviour is something that was not in the designer's aims, then there are no such
behaviours for architectural objects. In architecture, a real emergence is possible only when we recognize in a building some values that were not foreseen in the design or construction process. That is exactly what usually happens to something recognized as cultural heritage. From this viewpoint, the emergence linked with the activation of social/cultural/human capital is much more interesting than that of the architectural design process, which still remains unclear [34].

3. From events to processes: integration of preservation and enhancement process with other territorial structures
The increase of human capital is recognized as a fundamental basis for development, and culture can direct growth processes that might otherwise become a threat to heritage. The reference is the basic role of local community participation in the cultural district, the ready response to government incentives, the ability to produce and spread innovative external economies connected with innovation: that is to say, the emerging character of the district. The evolution of culture has always lain in the capability to interact, building a knowledge network. Let us recall here the ideas of "acting/living local and thinking global" [35], at the opposite of predominant projects where the role of intelligence is restricted to immediate and circumscribed problems (thinking microscopically) while consumption is global. That is not only a matter of levels of description. It is the case of the peripheral thinking at the basis of the Internet: "With the Internet, decisions were made to allow the control and intelligence functions to reside largely with users at the 'edges' of the network, rather than in the core of the network itself" [36,37]. Conservation and enhancement processes could act as a catalyst for quality, innovation, creativity and research, also in the peripheral zones of territory and society. Users need to be involved, especially the local community, using a contextual approach that, through narration and participation, leads to new learning and awareness [18]. That is the recognition, or even the construction, of the identity connected to Cultural Heritage. Fixed identities must be re-thought in the present time, where everyone lives a multiplicity of belongings, less and less linked to territorial definitions. We need to develop the idea of cultural value not as a guardian of tradition, but as something emerging from the meeting between heritage elements and people's internal space [38].
Within this framework the meeting is not a single event, but a history in which each point of contact is a new deposit of value, renewed by each event. It is the construction of a
dynamic identity, built not just on consolidated values but on global and hybrid relationships. From this standpoint, cultural diversity is seen as being "as necessary for humankind as biodiversity is for nature. In this sense, it is the common heritage of humanity and should be recognized and affirmed for the benefit of present and future generations" [39]. That is the reference for sharing values and for giving the right weight to cultural performance [40]. Within this frame of reference, the built heritage has a basic catalyst role because of its easily recognizable importance, its use, and its high visibility. But the classical loop (investments, growth, profitability, investment) encounters difficulties when dealing with public goods, characterized by high interconnections between environmental complexity and stakeholders. When investments in Cultural Heritage conservation give rise to new knowledge and education, a new loop is established: the heritage is better understood, the identity is enriched or reformulated, there is a new awareness of the importance of taking care, and the premises are laid for its participatory conservation.

References

1. "Report of the Franceschini Commission on the Protection and Use of Historical,
Archaeological, Artistic and Natural Heritage", Rivista trimestrale di diritto pubblico 16, 119-244 (1966).
2. UNESCO, Convention for the Protection of Cultural Property in the Event of Armed Conflict with Regulations for the Execution of the Convention (The Hague, 14 May 1954).
3. M. Montella, Musei e beni culturali. Verso un modello di governance (Mondadori Electa, Milano, 2003).
4. G. Pitruzzella, Aedon, 1, 2.6 (2000).
5. A. Marshall, Principles of Economics (Macmillan and Co, London, 1920).
6. S. Settis, Italia S.p.A. L'assalto al patrimonio culturale (Einaudi, Torino, 2002).
7. P. Petraroia, "Alle origini della conservazione programmata: gli scritti di Giovanni Urbani", TeMa, 3 (Milano, 2001).
8. C. Fontana, in L'intervento sul costruito. Problemi e orientamenti, Ed. E. Ginelli (Franco Angeli, Milano, 2002), p. 15 ff.
9. UNESCO, Universal Declaration on Cultural Diversity (Paris, 2001).
10. G. Pastori, Aedon, 3, 1.6-8 (2004).
11. G. Guerzoni, S. Stabile, I diritti dei musei. La valorizzazione dei beni culturali nella prospettiva del rights management (Etas, Milano, 2003).
12. A. Cicerchia, Il bellissimo vecchio. Argomenti per una geografia del patrimonio culturale (Franco Angeli, Milano, 2002).
S. Settis, "Le pietre dell'identità", Il Sole 24 ore (13 November 2005), p. 29.
P.A. Valentino, "Strategie innovative per uno sviluppo economico locale fondato sui beni culturali", in La storia al futuro: beni culturali, specializzazione del territorio e
nuova occupazione, Ed. P.A. Valentino, A. Musacchio, F. Perego (Associazione Civita, Giunti, Firenze, 1999), p. 3 ff.
S. Della Torre, in Ripensare alla manutenzione. Ricerche, progettazione, materiali, tecniche per la cura del costruito, Ed. G. Biscontin, G. Driussi (Venezia, 1999).
S. Della Torre, G. Minati, Il Progetto sostenibile, 2 (2004).
S. Della Torre, Arkos, 15 (2006).
13. L. Fusco Girard, P. Nijkamp, Eds., Energia, bellezza, partecipazione: la sfida della sensibilità. Valutazioni integrate tra conservazione e sviluppo (Franco Angeli Editore, Milano, 2004).
14. M. Maggi, Ed., Museo e cittadinanza. Condividere il patrimonio culturale per promuovere la partecipazione e la formazione civica, Quaderni Ires, 108 (Torino, 2005).
15. M. Maggi, C.A. Dondona, Macchine culturali. Reti e sistemi nell'organizzazione dei musei (Ires, Torino, 2006).
16. Economia della Cultura, 14(4) (Il Mulino, Bologna, 2004).
17. L. Fusco Girard, Risorse architettoniche e culturali: valutazioni e strategie di conservazione (Franco Angeli Editore, Milano, 1987).
18. D. Schürch, Nomadismo cognitivo (Franco Angeli, Milano, 2006).
19. A. Emiliani, Dal museo al territorio (Alfa Editoriale, Bologna, 1974), pp. 207-208.
20. L. Zanetti, "Sistemi locali e investimenti culturali", Aedon, 2 (2003).
21. M. Trimarchi, Economia della cultura, 15(2) (Il Mulino, Bologna, 2005), p. 137.
22. "L'arte, 'petrolio d'Italia'", in Settis (2002), p. 30 ff.
23. P.L. Sacco, S. Pedrini, Il Risparmio, 51(3) (2003).
24. W. Santagata, Economia della cultura, 15(2) (Il Mulino, Bologna, 2005), p. 141.
25. P.L. Sacco, G. Tavano Blessi, Global & Local Economic Review, 8(1) (Pescara, 2005).
26. P.A. Valentino, Le trame del territorio. Politiche di sviluppo dei sistemi territoriali e distretti culturali (Sperling & Kupfer, Milano, 2003).
27. M. Trimarchi, Economia della cultura, 15(2) (Il Mulino, Bologna, 2005), p. 138.
28. A. Sen, Rationality and Freedom (Harvard Belknap Press, 2002).
29. S. Guberman, G. Minati, Dialogue about Systems (Polimetrica, Milano, 2007).
30. G. Minati, Teoria Generale dei Sistemi. Sistemica. Emergenza: un'introduzione. Progettare e processi emergenti: frattura o connubio per l'architettura? (Polimetrica, Milano, 2004).
31. G. Becattini, in Il caleidoscopio dello sviluppo locale. Trasformazioni economiche nell'Italia contemporanea, Ed. G. Becattini, M. Bellandi, G. Dei Ottati, F. Sforzi (Rosenberg & Sellier, Torino, 2001).
32. A. Canziani, Beni culturali e governance: il modello dei distretti culturali, Ph.D. dissertation (Politecnico di Milano, Milano, 2007).
33. J.P. Crutchfield, in Physica D, special issue on the Proceedings of the Oji International Seminar Complex Systems: from Complex Dynamics to Artificial Reality, 5-9 April 1993, Numazu, Japan (1994).
34. V. Di Battista, G. Giallocosta, G. Minati, Architettura e approccio sistemico (Polimetrica, Milano, 2006).
35. L. Sartorio, Vivere in nicchia, pensare globale (Bollati Boringhieri, Torino, 2005).
90
S. Della Torre and A. Canziani
36. V. Cerf, U.S. Senate Committee on Commerce, Science, and Transportation Hearing on “Network Neutrality”, (February 7. 2006).
37. F. Carlini, “Io ragiono solo in gruppo”, Il manifesto, 25 luglio 2004. 38. U. Morelli, Ed., Management delle istituzioni dell’arte e della cultura. Formazione,organizzazione e relazioni con le comunità di fruitori (Guerrini, Milano, 2002). 39. UNESCO, Universal Declaration on Cultural Diversity, (Paris,2001). 40. A. Canziani, M. Scaltritti, Il Progetto sostenibile, (2008, in printing).
SYSTEMIC AND ARCHITECTURE: CURRENT THEORETICAL ISSUES

GIORGIO GIALLOCOSTA
Dipartimento di Progettazione e Costruzione dell’Architettura, Università di Genova
Stradone S. Agostino 37, 16123 Genoa, Italy
E-mail: [email protected]

Systemics approaches towards architecture, traditionally within a structuralist framework (especially within a technological environment), may evolve in a non-reductionist way through:
- non-reductive considerations of the role of human requirements in the definition of inhabited spaces;
- acceptance of the use-perception dialogical relationship, and more generally of the art-science nexus, as being characteristic of architecture.
Likewise, there are theoretical issues in the development of systemics, particularly within the discipline of architecture, including:
- the role of the observer, in the constructivist sense and within the acceptations of scientific realism;
- the unpredictability of emergence, with its related limits (of purely ontological significance).

Keywords: systemics, architecture, emergence, observer.
1. Introduction

A great amount of experience with the systemic approach towards architecture distinguishes studies and applications in the various disciplinary environments operating within that context. Such experience, sometimes accepting the more important developments of systemics, in other cases reiterating the classical concepts of Systems Theory, does not, however, appear to be significantly projected toward the objectives of disciplinary recomposition. In effect there remains, especially in Italy, the anti-historical counterposition of scientific culture against artistic culture, which still characterises the relationships between the diverse disciplinary aspects in architecture and any project of an interdisciplinary nature. For example, in Italian Faculties of Architecture, and within the different operational, professional, etc., environments, there are clear separations between project approaches and project cultures (requirements-based, poietic, morphogenetic, etc.).
The architectural project, on the other hand, when oriented towards allowing the optimum use and management of its many implications (social-technical, economic, perceptive, etc.), requires suitable interdisciplinary elaborations/applications to be governed, given the mutual interactions and emergent effects, through transdisciplinarity (Minati, 2004 [12, pp. 37-42 and 49-52]). Similarly, the importance of infradisciplinary research (regarding the epistemological foundations and methodologies which are fundamental in obtaining specialistic rigour) should not be underestimated. The related problems are not only of an operative or applicational nature, but also concern the need/opportunity to put architecture on trial in its multiplicity of components (compositive, technological, social, economic, etc.) and in the interactions produced. These issues, identified here above all as being of a theoretical nature, can lead to just as many problems regarding architecture and the systemic approach. For the former, it is useful to work with conceptually shared definitions of architecture, inferring from these suitable directions/motivations for scenarios of a systemic nature. For the latter, one needs to recognize, within the developments of systemics itself, those problems of major importance for architecture.

2. Possible shared definitions of Architecture

Numerous definitions of architecture can be found in recent contributions, and in the historiography of the sector. For example, amongst those ascribable to various Masters (whose citations are given by Di Battista, in Di Battista et al., Eds., 2006 [5, p. 39]):
• Architecture “(...) can be seen as the twin of agriculture; since hunger, against which men dedicated themselves to agriculture, is coupled to the need for shelter, from which architecture was born ...” (Milizia, author’s translation);
• “(...) construire, pour l’architecte, c’est employer les matériaux en raison de leurs qualités et de leur nature propre, avec l’idée préconçue de satisfaire à un besoin par les moyens les plus simples et les plus solides ...” (Viollet-le-Duc);
• “l’architecture est le jeu savant, correct et magnifique des volumes assemblés sous le soleil (...)”, and also, “(...) the Parthenon is a selected product applied to a standard. Architecture acts upon standards. Standards are a fact of logic, analysis, scrupulous study, and derive from a well-defined problem. Experimentation fixes, in a definitive manner, the standard ...” (Le Corbusier, author’s translation).

Amongst these, a definition by William Morris in 1881 describes architecture as the moulding and altering to human needs of the very face of the earth itself, except in the outermost desert (Morris, 1947 [13], cit. in Benevolo, 1992 [2, p. 2]). One can agree with Di Battista in considering this definition as a compendium of many of the preceding and later ones; “(...) it takes architecture back to the concept of inhabited environment (...) where human activities have changed, utilised, controlled natural situations to historically establish the built-up environment (...) Architecture could therefore be defined as a set of devices and signs (not ephemeral - author’s note) of man which establish and indicate his system of settlement (...) Architecture is applied to a system of settlement as a system of systems: ecosphere (geological, biological, climatic, etc.) and anthroposphere (agricultural/urban, social, economic). Within these systems there are simultaneous actions of:
• observed systems (physical, economic, social, convergent with/in the settlement system, according to the schematisations of Di Battista - author’s note) which present different structures and a multiplicity of exchange interactions;
• observing systems, as subjects or collectives with multiple identities and values, but also as cognitive models (philosophical, religious, scientific, etc.) which explain and offer multiple intentions and judgement criteria” (Di Battista, in Di Battista et al., 2006 [5, pp. 40-41], author’s translation).
Morris’s definition (Morris, 1947 [13]), especially in its most explicit reading of architecture as the built-up environment (or as founding and denoting settlement systems) for satisfying human needs, hence provides a unitary and (at least tendentially) shared conception where, for example, the sphere of human needs is considered not in the reductive sense of material needs alone, but also of axiology, representation, poiesis, amongst others. Another definition (most agreeable, and useful here for our declared purposes) is that of Benjamin who, in his most famous essay (Benjamin, 1936 [3]), stresses the particular nature of architecture as a work of art from which one benefits in two ways, through use and perception: here we also see that dialogical, and highly interactive, relationship between artistic and scientific
culture[a]. The aim of overcoming the art-science dualism was one of the declared objectives of the birth of the Faculties of Architecture, as a disciplinary synthesis of the traditional contributions of the Fine Arts Academies (Accademie di Belle Arti) and the Engineering schools. One must, therefore, develop that relationship in a non-reductionist way (Minati, 2004 [12, pp. 84-86], and Minati, in Di Battista et al., 2006 [5, p. 23]), precisely as a synthesis of the twofold fruitional modes of architecture (Benjamin, 1936 [3]) and the effective indivisibility of its diverse aspects: not only representation, or communication, or use, etc., but dynamic interactions involving the multi-dimensional complex of those modifications and alterations produced (and which produce themselves) in the moulding and altering to human needs of the very face of the earth itself (Morris, 1947 [13]).

[a] The art-science nexus has been well clarified, moreover, since ancient times. It is referred to, for example, and with its own conceptual specificities, in Plato’s Philebus, Vitruvius’s De Architectura, the Augustinian interpretation of architecture as a science based upon the laws of geometry, Le Corbusier’s Le Modulor, etc. (Ungers, in Centi and Lotti, 1999 [4, pp. 85-93]).

The dialogical recovery of that relationship (art-science), moreover, becomes a necessary requirement, above all when faced with the objectives of the optimised management of the multiple complex relationships which distinguish contemporary processes in the production of architecture: transformation and conservation, innovation and recollection, sustainability and technological exaltation, etc. In current practices (and reiterating some of what has been said above) there is, however, a dichotomy which can be schematically attributed to:
• on one hand, activities which are especially regulatory in the formal intentions of the architectural design;
• on the other, the emphasis upon technical virtuosities which stress the role of saviour of modern technology (Di Battista, in Di Battista et al., 2006 [5, p. 38]).

Nor can one ignore, particularly in the classical approach of applying systemics to architecture (and emphasised especially in technological environments), the existence of an essentially structuralist conception, precisely in the sense described by Mounin (Mounin, 1972 [14], cit. in Sassoli, in Giorello, 2006 [8, p. 216])[b]: in this conception, for example, the leg of a table would be characterized, in an analogous manner to the constituent parts of the current concept of building system, “(…) neither by the form nor by the substance, because I could indifferently put an iron or a wooden leg (…) the functions of this part in relation to the structure will remain (in fact - author’s note) invariant (…) Lévi-Strauss (…) defined the method (…) used in his anthropological research in a way clearly inspired by structural linguistics (defining the phenomenon studied as a relationship between real or virtual terms, building the framework of possible permutations amongst them, considering the latter as a general object of an analysis which only at this level can be made, representing the empirical phenomenon as a possible combination amongst others, whose total system must first of all be reconstructed - author’s note) …” (Sassoli, in Giorello, 2006 [8, pp. 216-217], author’s translation).

[b] When a structure (or, for others, a system) can essentially be considered as a construction, in the current sense of the word. In this formulation, analysing a structure means identifying the parts which effectively define the construction being considered (Mounin, 1972 [14], cit. in Sassoli, in Giorello, 2006 [8, p. 216]): and such a definition exists because “(...) a choice is made in the arrangement of the various parts. And the main criterion for that choice is the function which they have ...” (Mounin, 1972 [14], cit. in Sassoli, in Giorello, 2006 [8, p. 216], author’s translation).

In 1964, Lévi-Strauss identified as virtual terms some empirical categories (raw and cooked, fresh and putrid, etc.), showing how, once the possible modes of interaction had been established, such elements can “(...) function as conceptual instruments to make emergent certain abstract notions and concatenate them into propositions ...” (Lévi-Strauss, 1964 [11], cit. in Sassoli, in Giorello, 2006 [8, p. 217], author’s translation) which “(...) explain the structures which regulate the production of myths. In that work, Lévi-Strauss tried to show how ‘if even myths (i.e., apparently the most free and capricious human cultural elaboration) obey a given logic, then it will be proven that the whole mental universe of man is subjected [...] to given norms’. But if all the cultural production of man (...) can be traced back to unconscious structures possessing an internal logic independent of the subject, then structuralist ‘philosophy’ will result in a radical anti-humanism. We can interpret in this way the famous statement by Foucault, according to which the structuralist formulation decrees the ‘death of man’ (Foucault, 1966 [7], author’s note) ...” (Sassoli, in Giorello, 2006 [8, pp. 217-218], author’s translation).

It thus becomes clear how the declared radical anti-humanism of the structuralist approach leads to approaches towards systemics which would obliterate (apparently?) one of its most important current paradigms: the role of the observer as an integral part, a generator of processes (Minati, in Di Battista et al., 2006 [5, p. 154]). But the systemic concept, as applied classically to architecture by the Italian technological schools, does contemplate an observer in the role of user; the latter, in fact, with its own system of requisites, functions as a referent for the requisites-performance approach, despite underlining its role attribution in a reductionist sense (as it fundamentally expresses the requisites
deriving from the use value of the end-product) and lacking significant correlations with other agents (and carriers of their own interests, culture, etc.)[c]. More generally, therefore, although definitions of architecture which tend to be shared can trigger mature trial scenarios of a systemic nature (Di Battista et al., 2006 [5]), within an interdisciplinary, transdisciplinary (and infradisciplinary) framework, unsolved problems still exist. Amongst others: which observer (or, better, which system of observers) is best for activities which put architecture on trial in its multiplicity of components and interactions? This is still an open question even in the most recent developments of systemics. Similarly, there is the problem of the unpredictability of emergent phenomena (especially in the sense of intrinsic emergence)[d], when faced with the objectives/requisites, typical of the architectural disciplines, of prefiguring new arrangements and scenarios.

3. Specific problems in systemics and implications for Architecture

The role of the observer, “(...) an integral part of the process being studied, combines with constructivism (...) This states that reality cannot be effectively considered as objective, independently from the observer detecting it, as it is the observer itself which creates, constructs, invents that which is identified as reality (...) Essentially one passes from the strategy of trying to discover what something is really like to how it is best to think of it” (Minati, in Di Battista et al., 2006 [5, p. 21], author’s translation).
Moreover, the connection between the role of the observer and the Copenhagen interpretation of quantum theory is well known from 20th-century scientific research; amongst others, Heinz Pagels refers to it explicitly (as far as is of interest here), where it is considered senseless to speak of the objective existence of an electron at a certain point in space independently of its concrete observation: thus reality is, at least in part, created by the observer (Pagels, 1982 [15], cit. in Gribbin, 1998 [9]).
[c] In a similar way, the structuralist formulation (Mounin, 1972 [14]), at least in Mounin’s idea (which, in postulating a choice in the arrangement of the various parts, identifies the criteria through their functions, and the latter will, in some way, have to be expected), would seem in any case to assume implicitly the existence of a referent with such an expectation function, even though in a tautologically reductionist sense. Di Battista, moreover, develops a proposal for the evolvement of the classical requisites-performance approach, which integrates use values with cultural, economic, etc., values (Di Battista, in Di Battista et al., 2006 [5, pp. 85-90]).
[d] Where not only can the establishment of a given behavior (even though compatible with the cognitive model adopted) not be foreseen, but its establishment gives rise to profound changes in the structure of the system, requiring a new modelling process (Pessa, 1998 [16], cit. in Minati, 2004 [12, p. 40]).
Clearly, there are also strong reservations regarding the role of the observer in the terms described above. Prigogine, for example, although dealing with more general problems (including unstable systems and the notions of irreversibility, probability, etc.), states that: “(...) the need for introducing an ‘observer’ (much more significant in quantum mechanics than in classical mechanics - author’s note) necessarily leads to having to tackle some difficulties. Is there an ‘unobserved’ nature different from ‘observed’ nature? (...) Effectively, in the universe we observe equilibrium situations, such as, for example, the famous background radiation at 3 K, evidence of the beginning of the universe. But the idea that this radiation is the result of measurements is absurd: who, in fact, could have or should have measured it? There should, therefore, be an intrinsic mechanism in quantum mechanics leading to the statistical aspects observed” (Prigogine, 2003 [17, p. 61], author’s translation). Moreover: the role of the observer would lead to the presence “(...) of a subjective element, the main cause of the lack of satisfaction which Einstein had always expressed regarding quantum mechanics” (Prigogine, 2003 [17, p. 74], author’s translation). Thus, the presence of a subjective element brings with it risks of anthropocentrism. Nor can one avoid similar implications of constructivism for architecture (in which, tautologically, all anthropic processes exist); if the role of the observer, in fact, especially regarding decision-making and managerial activities, entails taking responsibility for safeguarding common interests, it becomes clear from other points of view that dichotomic attitudes can arise from this responsibility, and above all can be theoretically justified through considerations of subjective presences.
But such problems regarding the observer in current developments in systemics also allude to the dichotomy between scientific realism and anti-realist approaches, whose developments (especially regarding the logical-linguistic aspects of science) are efficaciously discussed by Di Francesco[e].

[e] Roughly, scientific realism (as opposed to anti-realist approaches) contemplates a reality without conceptual schemes, languages, etc. (Di Francesco, in Giorello, 2006 [8, p. 127]).
[f] More precisely: Hacking, 1983 [10]; Putnam, 1981 [18]; Putnam, 1987 [19]; Wiggins, 1980 [20]. Roughly speaking, the internal realism of the later Putnam contemplates, for example, how “(...) to ask oneself: of which objects does the world consist? only makes sense within a given theory or description” (Putnam, 1981 [18], cit. in Di Francesco, in Giorello, 2006 [8, p. 133]).

Particularly meaningful in that examination (Di Francesco, in Giorello, 2006 [8, pp. 127-137]), which, moreover, explains the positions of the later Putnam (converted to anti-realism, in the form of so-called internal realism), of Wiggins, and of Hacking[f], is the strategy suggested by the latter regarding the dual dimension of
scientific activity (representation and intervention), and is therefore effectively translatable into architectural processes: reasoning “(...) of scientific realism at the level of theory, control, explanation, predictive success, convergence of theories and so on, means being confined within a world of representations. [...] And thus engineering, and not theorizing, is the best test of scientific realism upon entities. [...] Theoretical entities which do not end up being manipulated are often shown up as stunning errors” (Hacking, 1983 [10], cit. in Di Francesco, in Giorello, 2006 [8, p. 137], author’s translation). Naturally, in architecture, there are still critical implications regarding the role of the observer (and in further aspects, beyond those described here). If the latter could be accepted as regards the interaction between observed systems and observer systems, as already mentioned (Di Battista, in Di Battista et al., 2006 [5, pp. 40-41]), then for the formulation of observer systems the non-reductionist evidence of the multiple (and often different) interests, values, culture, etc., characteristic of the agents in construction projects, is also necessary. Nor is this a question of predominantly computational importance (how to formalise the observer systems); in fact, it also involves, especially within the framework of managing and governing predominantly social interests/values/etc., defining and following systemic, organised and self-organised collectivities (Minati, in Di Battista et al., 2006 [5, p. 21]), avoiding problems such as:
• from shared leadership to unacceptable dirigism;
• from self-organisation to spontaneity.
Then again: who observes the observer systems? Further, more detailed considerations, amongst (and beyond) the problems and hypotheses mentioned so far, are therefore necessary.
The role of the observer, moreover, is considered to be of fundamental importance also for the detection of emergent properties (Minati, in Di Battista et al., 2006 [5, pp. 21-22]). But the unpredictability of the latter (as mentioned above) leads to further problems in systemics approaches to architecture, effectively persisting in every outcome of anthropic processes: including those ex-post arrangements of the moulding and altering to human needs of the very face of the earth itself (Morris, 1947 [13]). This unpredictability, however, can to a certain extent be resolved dialogically with respect to the requisites of the prefiguration of scenarios (typical of the architectural disciplines):
• taking those scenarios to be consistently probabilistic (or as systems of probabilities);
• optimising them through validation of the positive propensities of and amongst their component parts (and minimising negative potentialities), also through suitable formalisations and ex-ante simulations.
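As a purely illustrative computational reading of these two requisites (not taken from the paper), a prefigured scenario can be treated as a system of probabilities and compared ex ante by simulation; all component names and probability values below are invented.

```python
# Hypothetical ex-ante comparison of two design arrangements, each modelled
# as a list of per-component success probabilities (all values invented).
import random

def simulate(success_probs, trials=10_000, seed=42):
    """Monte Carlo estimate of the probability that all components succeed."""
    rng = random.Random(seed)
    ok = sum(
        all(rng.random() < p for p in success_probs)
        for _ in range(trials)
    )
    return ok / trials

# Two arrangements of the same project (say: structure, envelope, plant).
scenario_a = [0.90, 0.80, 0.95]
scenario_b = [0.85, 0.90, 0.95]
best = max(scenario_a, scenario_b, key=simulate)
```

With these invented figures the second arrangement ranks higher; the point is only that scenarios understood as systems of probabilities can be ranked ex ante, before any commitment is made.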
What is more, it is not unusual in architecture to resort to probabilistic formalizations. On the basis of the Bayes theorem, for example, when a number of experimental data (b1, b2, ..., bn) are available and can be formulated in an appropriate manner, suitable networks can be formalized to support, amongst other things, the evaluation of technical risks (intended, even in building, as being connected to reaching the quality and performance planned in the project), where, for example, the experimental evidence represents symptoms of the outbreak of a pathology (the hypothesis a)[g]. In the Bayesian approach, however, there remains the problem of the definition of the a priori probability, even though its aspects of subjective variability can be reduced, according to some, through suitable inductive principles (besides those usually considered in the calculation of probabilities), which would otherwise lead to corresponding variabilities in the attribution of the a posteriori probabilities[h]. Emphasis should therefore be placed, in general, upon those theoretical issues regarding the dichotomy between the unpredictability of emergence and the necessity for ex-ante prefigurations in architecture. The systemic approach in this sense, given the state of current cognitive models of observer systems (Di Battista, in Di Battista et al., 2006 [5, p. 40]), can only direct us towards an appropriate reduction of that dichotomy. But even this question takes on a purely ontological importance.

[g] For the Bayesian approach to risk management in the building industry see, for example, Giretti and Minnucci, in Argiolas, Ed., 2004 [1], pp. 71-102.
[h] As is well known, the Bayes theorem (expressed here in its simplest form) allows calculation of the probability (a posteriori) p(a|b) of a hypothesis a on the basis of its probability (a priori) p(a) and the experimental evidence b:

p(a|b) = p(b|a) p(a) / p(b)

The a priori probability “(...) can be interpreted as the degree of credibility which a given individual assigns to a proposition a in the case where no empirical evidence is possessed (...) Whereas p(a|b), which denotes the epistemic probability assigned to a in the light of b, is said to be the relative probability of a with respect to b. In the case where a is a hypothesis and b describes the available experimental evidence, p(a|b) is the a posteriori probability ...” (Festa, in Giorello, 2006 [8, pp. 297-298], author’s translation). Besides mentioning (even problematically) some of the inductive principles for minimising subjective variability, Festa recalls the subjectivist conception (de Finetti et al.), according to which, notwithstanding the subjective variability in the choice of the a priori probability, “(...) as the experimental evidence (...) available to the scientists grows, the disagreement (amongst the latter regarding the different evaluations of the a posteriori probabilities - author’s note) tends to decrease ...” (Festa, in Giorello, 2006 [8, p. 305], author’s translation).
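The Bayes formula above can be exercised with a short numeric sketch. All figures below are invented for illustration (they are not taken from Giretti and Minnucci): a building pathology a is given a prior of 0.1, and a symptom b is assumed to appear in 80% of pathological cases and 20% of sound ones.

```python
# Illustrative Bayes update for the technical-risk setting discussed here.
# All probabilities are hypothetical; p(b) is obtained by total probability.

def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Return p(a|b) = p(b|a) * p(a) / p(b)."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b

# a: "the pathology is present"; b: "the symptom is observed".
p = posterior(p_a=0.1, p_b_given_a=0.8, p_b_given_not_a=0.2)
print(round(p, 3))  # 0.308: the symptom raises the estimate from 10% to ~31%
```

Observing further independent symptoms b2, ..., bn would simply repeat the update, using each posterior as the next prior; this chaining is what Bayesian risk networks automate.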
References

1. C. Argiolas, Ed., Dalla Risk Analysis al Fault Tolerant Design and Management (LITHOSgrafiche, Cagliari, 2004).
2. L. Benevolo, Storia dell’architettura moderna, 1960 (Laterza, Bari, 1992).
3. W. Benjamin, L’opera d’arte nell’epoca della sua riproducibilità tecnica, 1936 (Einaudi, Turin, 2000).
4. L. Centi, G. Lotti, Eds., Le schegge di Vitruvio (Edicom, Monfalcone, 1999).
5. V. Di Battista, G. Giallocosta, G. Minati, Eds., Architettura e Approccio Sistemico (Polimetrica, Monza, 2006).
6. H. von Foerster, Sistemi che osservano (Astrolabio, Rome, 1987).
7. M. Foucault, Les mots et les choses (Gallimard, Paris, 1966).
8. G. Giorello, Introduzione alla filosofia della scienza, 1994 (Bompiani, Milan, 2006).
9. J. Gribbin, Q is for Quantum (Phoenix Press, London, 1998).
10. I. Hacking, Representing and Intervening (Cambridge University Press, Cambridge, 1983).
11. C. Lévi-Strauss, Le cru et le cuit (Plon, Paris, 1964).
12. G. Minati, Teoria Generale dei Sistemi, Sistemica, Emergenza: un’introduzione (Polimetrica, Monza, 2004).
13. W. Morris, in On Art and Socialism (London, 1947).
14. G. Mounin, Clef pour la linguistique (Seghers, Paris, 1972).
15. H. Pagels, The Cosmic Code (Simon and Schuster, New York, 1982).
16. E. Pessa, in Proceedings of the First Italian Systems Conference, Ed. G. Minati (Apogeo, Milan, 1998).
17. I. Prigogine, Le leggi del caos, 1993 (Laterza, Bari, 2003).
18. H. Putnam, Reason, Truth and History (Cambridge University Press, Cambridge, 1981).
19. H. Putnam, The Many Faces of Realism (Open Court, La Salle, 1987).
20. D. Wiggins, Sameness and Substance (Blackwell, Oxford, 1980).
PROCESSES OF EMERGENCE IN ECONOMICS AND MANAGEMENT
MODELING THE 360° INNOVATING FIRM AS A MULTIPLE SYSTEM OR COLLECTIVE BEING

VÉRONIQUE BOUCHARD
EM LYON, Strategy and Organization Dpt.
23 av. Guy de Collongue, 69132 Ecully Cedex, France
Email: [email protected]

Confronted with fast changing technologies and markets, and with increasing competitive pressures, firms are now required to innovate fast and continuously. In order to do so, several firms superpose an intrapreneurial layer (IL) on their formal organization (FO). The two systems stand in complex relations: the IL is embedded in the FO, sharing human, financial and technical components, but strongly diverges from it when it comes to the representation, structure, values and behavior of the shared components. Furthermore, the two systems simultaneously cooperate and compete. In the long run, the organizational dynamics usually end to the detriment of the intrapreneurial layer, which remains marginal or regresses after an initial period of boom. The concepts of Multiple Systems and Collective Beings, proposed by Minati and Pessa, can help students of the firm adopt a different viewpoint on this issue. These concepts can help them move away from a rigid, Manichean view of the two systems’ respective functions and roles towards a more fluid and elaborate vision of their relations, allowing for greater flexibility and coherence.

Keywords: innovation, organization, intrapreneurship, models, multiple systems, collective beings.
1. Introduction

Confronted with fast changing technologies and markets, and with increasing competitive pressures, business firms are now required to innovate fast and continuously [1,2,3]. Conventional innovation processes led by R&D and Marketing departments are not sufficient to meet these requirements. In effect, conventional innovation processes tend to be rigid, slow and focused on technology and product development, whereas what firms need now is flexible, rapid and broad-scope innovation, encompassing all the key elements of their offer, management and organization [4,2,5,3]. Firms have to improve and transform the way they produce, manage client relations, ensure quality, configure their value chain, manage employees, develop competencies, generate revenues, etc. They have to innovate on all fronts and become “360° innovating
firms”. To this end, more nimble innovation processes are required and, above all, innovation must take place in every department and division of the firm. The 360° innovating firm has to rely on the creativity, talent, energy and informal networks of its employees. In the 360° innovating firm, employees must be able to autonomously identify opportunities and re-combine the resources and competences that are spread throughout the various departments and sites of the firm to seize these opportunities. Sales and service persons, in frequent contact with clients, can identify emerging needs and business opportunities; computer experts can grasp the value creation potential of new IT developments; experts in manufacturing and logistics can propose new solutions to concrete problems; finance experts can help assess costs and benefits; idle machines can be used to produce prototypes; foreign subsidiaries can come up with low-cost solutions; etc.

2. Intrapreneurs and the 360° innovating firm

Opportunities and inventions that are identified and developed outside the conventional innovation track cannot succeed without a champion, someone who strongly believes in the project and is personally committed to its success. 360° innovation relies, therefore, on the emergence of internal entrepreneurs or “intrapreneurs” from the pool of employees [6,7,8,9,10]. Internal entrepreneurs or “intrapreneurs” are employees who identify internal or external value creation opportunities and seize them, relying first and foremost on their own talent, motivation and network. Intrapreneurs can take advantage of the financial and technical resources, as well as the wide array of expertise and competencies, that the firm possesses. However, the life of intrapreneurs is far from easy: they often add the difficulties faced by entrepreneurs (understanding the market, improving the offer, creating a sound economic model, managing a team, making the first sale, etc.) to the difficulties that arise when one pursues an original project within a rigid and risk-averse environment.

3. The intrapreneurial process as a superposed organizational layer

In their quest for 360° innovation, a number of firms try to encourage the emergence of intrapreneurs. To do so, they set up structures, systems and procedures whose goal is to encourage, identify, support and select intrapreneurial initiatives [11,12,13].
Modeling the 360° Innovating Firm as a Multiple System or Collective Being
Figure 1. Two interacting systems, the formal organization (FO) and the intrapreneurial layer (IL).
By doing so, firms de facto superpose a different and potentially conflicting organizational layer (the intrapreneurial process) over the formal organization [14,15,12,11,16,17]. The two layers can be seen as two systems interacting in a complex way (see Figure 1).

3.1. Two different but highly interdependent systems

The formal organization (FO) performs well-defined tasks using well-identified procedures, people and resources, while the intrapreneurial layer (IL) assembles people and resources located anywhere in the organization (even outside the organization) on an ad hoc basis, relying extensively on informal networks (see Table 1). The two systems, however, have numerous contact points, since most people and resources involved in the IL “belong” to the FO. Most of the time, the intrapreneur herself is a member of the FO, where she continues to fulfill her tasks, at least in the initial phases of her project. The relations between the two systems are complex:

1. The formal organization enables the emergence and unfolding of the intrapreneurial process by 1) granting autonomy to the internal entrepreneur, 2) providing most of the resources he uses, and 3) giving legitimacy to his
Table 1. Two very different systems.

  The formal organization (FO)                     The intrapreneurial layer (IL)
  Well-defined set of elements and interactions    Fuzzy, constantly evolving set of elements
  Relatively stable over time                      Temporary
  Planned (top down)                               Emergent (bottom up)
  A priori resources and legitimacy                Resources and legitimacy are acquired on the way
project. In other words, system IL is embedded in system FO, on which it depends for its survival and success.

2. However, system FO is also dependent on system IL. In effect, the intrapreneurial layer allows the formal organization to 1) overcome some of its structural limitations and 2) reach its objective of fast 360° innovation.

3. The intrapreneurial layer often competes for resources and visibility with some parts of the formal organization and often enters into conflict with them. (IL competes with a subsystem of FO.)

4. Finally, the intrapreneur, and more generally all those who contribute significantly to the IL, can be rejected or envied by formal organization members because their values, work styles and status are different. (The culture – norms, values and behaviors – of system IL and that of system FO are conflicting.)
3.2. Managing the intrapreneurial process

The single intrapreneurial initiative is managed – primarily – by the intrapreneur himself. However, the intrapreneurial process as an organizational dynamic – a sustained flow of intrapreneurial initiatives – has to be managed by the top management of the firm. Let us review what goals these actors pursue, the levers they control and some of their main strategic options.

3.2.1. Top management

Top managers pursue several objectives. Among them:
• Multiply the number of intrapreneurial initiatives;
• Improve their success rate;
• Contain the risks and costs;
• Leave the formal organization (FO) “undisturbed”;
• Provide concrete examples of the desired behavior to members of the formal organization.
Some of their most significant control variables are:
• The level and type of support granted to internal entrepreneurs;
• The conditions under which support is granted;
• The definition of desired/undesired, licit/illicit intrapreneurial initiatives;
• The creation of formal links between the two layers.

Their strategic options can be positioned along several continua:
• Granting high autonomy to employees vs. granting moderate autonomy to employees;
• Providing strong formal support to intrapreneurs vs. providing minimal formal support to intrapreneurs;
• Relying essentially on informal links between the IL and the FO vs. relying on both informal and formal linkages.

3.2.2. Intrapreneurs

Internal entrepreneurs generally seek to maximize their chances of success by:
• Securing access to needed resources and competencies;
• Minimizing conflict with the formal organization;
• Getting the support of members of the leading coalition.

Some of their most significant control variables are:
• The level of strategic alignment of their project;
• Their level of self-sufficiency/autonomy vis-à-vis the formal organization (FO);
• The degree of visibility of their project.

Here again, their strategic options can be positioned along various continua:
• Pursuing a strategically aligned project vs. pursuing a not so strategically aligned project;
• Being highly self-sufficient vs. trying to get formal help and support early on;
• Keeping the visibility of the project low vs. giving the project high visibility.

4. A recurrent and troubling problem

There is abundant empirical evidence that, over time, systems dynamics play strongly against the Intrapreneurial Layer, which remains marginal or shrinks after an initial period of boom [12,18,11,16,13,17].
In spite of the declarations and measures taken by top management to encourage intrapreneurial initiatives, many intrapreneurs face so many difficulties that they abandon their project. Some fail for reasons that can be attributed to the weakness of their project or their lack of skills, but many fail because of the insurmountable organizational or political obstacles they face. And without a small but growing number of visible successes, the intrapreneurial dynamic soon comes to a halt. Some recurrent problems faced by intrapreneurs:
• Parts of the formal organization actively or passively oppose the intrapreneur (including the intrapreneur’s boss);
• Excessive workload, no team, no help;
• The intrapreneur cannot obtain the needed financial resources;
• The intrapreneur cannot secure solid and lasting top management support;
• The intrapreneur is isolated and does not benefit from the advice of mentors or fellow intrapreneurs;
• The intrapreneur is not able to simultaneously face external (market) and internal (political) challenges.

A critical issue for firms interested in promoting 360° innovation, therefore, is to realize that such a negative dynamic is at play and to find ways to counteract it. If we understand better the complex interactions between the two systems (FO and IL) and their main agents (top management, intrapreneurs, other managers), we might be able to find ways to reduce the pressures experienced by intrapreneurs, thus favoring innovation and the creative re-deployment of resources within the firm. New concepts in system modeling such as multiple systems (MS) and collective beings (CB) could help us in this endeavor.

5. New concepts in system modeling: multiple systems (MS) and collective beings (CB)
We propose to apply to the FO–IL systems dynamics depicted above the concepts of Multiple Systems (MS) and Collective Beings (CB) developed by Minati and Pessa [20]:
• A MS is a set of systems established by the same elements interacting in different ways, i.e., having multiple simultaneous or dynamic roles. Examples of MS include networked interacting computer systems performing cooperative tasks, as well as the Internet, where different systems play different roles in continuously new, emerging usages.
• A CB is a particular MS, established by agents possessing the same (natural or artificial) cognitive system. Passengers on a bus and queues are examples of CB established dynamically by agents without considering multiple belonging. Workplaces, families and consumers are examples of CB established by agents simultaneously and considering their multiple belonging.
These new concepts can help us reframe the challenges faced by the “360° innovating firm”, which could be approached as a problem of increasing the degrees of freedom of the various systems simultaneously involved in innovation, i.e., increasing the number of representations simultaneously available to the various agents. For instance, we may design the Intrapreneurial Layer not only in opposition to the Formal Organization, but also considering the possibility of:
• redefining problems by distinguishing between conventionally established differences and intentionally established differences between the two systems, for the purpose of systems modeling;
• distinguishing between subsystems and systems of the multiple system;
• not only establishing a distinction between functional relations and emergent relations, but also mixing and managing the two.

The proposed approach can help us move away from a rigid, Manichean view of the systems’ respective functionalities and roles towards a more fluid and elaborate vision of their relations, allowing for greater flexibility and coherence when tackling the organizational and managerial issues facing the 360° innovating firm.

Let us illustrate these new concepts by applying them to the system “firm” in its productive function. Aspects such as production, organization, cost effectiveness, reliability and availability can be viewed:
• as different properties of the firm viewed as a single system or as a set of subsystems, or
• as different elements of the MS “firm”, constituted by different systems established by the same elements interacting in different ways.

In the second eventuality, production will be considered as an autonomous system possessing its own independent representation and dynamics, and not only a property of the system “firm”, itself dependent on organization. In the same way, quality is an autonomous system and not only an effect of production, and so on.
The different dimensions are not only viewed as functionally related aspects of the system or of different subsystems, but also as different
combinations of the same elements (e.g., human resources, machines, energy, rules and facilities) forming different systems (e.g., production, organization and quality). What difference does it make? In this case, we may act on a given system of the MS not only in a functional way, but also via the complex web of interactions that emerge from its elements’ multiple belonging. From a functional standpoint, increasing production may reduce quality, and cost-effectiveness may affect organization. In an MS perspective, quality is not an effect of production, but an autonomous property of elements also involved in production. Quality, in this case, will derive from design rather than from production procedures. It becomes possible to consider properties laterally rather than functionally. Properties such as quality, reliability and cost effectiveness are not properties of a single system, but properties of the whole. In the same way, human resources will be considered as agents able to pursue multiple roles in producing, organizing, communicating, marketing, developing new ideas, controlling quality and so on. In the “360° innovating firm”, no agent has a single specific role but rather multiple, dynamic, context-dependent roles.

6. Applicability of DYSAM

The Dynamic Usage of Models (DYSAM) has been introduced in Minati and Brahms [19] and Minati and Pessa [20] to deal with dynamic entities such as MS and CB. The dynamic aspect of DYSAM relates to the dynamic multiple belonging of components rather than to change over time. DYSAM is based on simultaneously or dynamically modeling a MS or CB by using different non-equivalent models depending on the context.
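The core idea – the same elements simultaneously belonging to different systems, with a different role in each – can be sketched in a few lines of code. The elements, systems and roles below are invented for illustration and are not taken from the paper:

```python
# Illustrative sketch of multiple belonging in a Multiple System "firm":
# the SAME elements appear in several systems, each assigning them a role.
# All names here are hypothetical examples, not the authors' notation.

systems = {
    "production": {"worker_A": "operator", "machine_1": "tool", "procedure_P": "routine"},
    "quality":    {"worker_A": "inspector", "procedure_P": "standard"},
    "innovation": {"worker_B": "intrapreneur", "worker_A": "informal contact"},
}

def belongings(element):
    """Return every system the element belongs to, with its role there."""
    return {name: roles[element] for name, roles in systems.items() if element in roles}

print(belongings("worker_A"))
# worker_A plays three different roles in three different systems
```

Acting on worker_A within the quality system then also touches, through multiple belonging, the production and innovation systems – the lateral, non-functional interactions described above.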
For instance, a workplace may be modeled in a functional way, by considering the processing of input and the production of output; as a sociological unit, by considering only interactions among human agents; as a source of waste, pollution and energy consumption; or as a source of information used for innovation. The concept of DYSAM applies when considering a property in the different systems of a MS or CB. Moreover, in this case, the models must take into account the fact that the different systems are composed of the same elements. In this way, dealing with quality in one system affects the other aspects not only in a functional way, but also because the same elements are involved in both. Depending on which is more effective, a firm may be modeled as a system of subsystems and/or as an MS or CB. For instance, the profitability of a firm cannot be modeled by using a single optimization function, or a linear composition of different optimization functions, but rather by using a population (i.e., a system)
of optimization functions continuously and dynamically established by considering context-sensitive parameters. DYSAM allows considering different, non-equivalent models, such as the ones related to profitability, reliability, availability, flexibility and innovation, as autonomous systems of a MS.

7. Conclusion

Firms are nowadays required 1) to maximize return on assets, which implies strict performance control and efficient use of resources, and 2) to innovate on all fronts (360° innovation), which implies local autonomy, trial and error and patient money. In order to face these simultaneous and apparently contradictory requirements, several firms superpose an intrapreneurial layer over their formal organization. While the formal organization (FO) performs well-defined tasks using well-identified procedures, people and resources, the intrapreneurial layer (IL) assembles people and resources located anywhere in the organization (even outside the organization) on an ad hoc basis, relying extensively on informal networks to develop innovative projects. The two systems are in complex relations: if the IL is, to a large extent, embedded in the FO, sharing its human, financial and technical components, it also strongly diverges from it when it comes to the representation, structure, values and behavior of some shared components. Furthermore, the two systems simultaneously cooperate and compete, and frequently enter into conflict. In the long run, one observes that the organizational dynamic thus set in motion usually evolves to the detriment of intrapreneurial processes, which remain marginal or regress after an initial period of boom.

The concepts of Multiple Systems and Collective Beings, proposed by Minati and Pessa, can help students of the firm adopt another viewpoint on the issues just described and tackle them differently. These concepts can help them move away from a rigid, Manichean view of the two systems’ respective functionalities and roles towards a more fluid and elaborate vision of their relations, allowing for greater flexibility and coherence when tackling the organizational and managerial issues facing the 360° innovating firm. The application of these concepts, together with the related DYSAM techniques, could help students of the firm come to terms with the multiple contradictions that arise from the mandatory adoption of multiple, non-additive roles by the managers of 360° innovating firms.

Acknowledgments

I wish to express my gratitude to Professor Gianfranco Minati for his help and feedback on the paper.
References 1. P. Drucker, Innovation and Entrepreneurship (Harper Business, 1993). 2. G. Hamel, Harvard Business Review 77(5), 70-85 (1999). 3. J.P. Andrew, H.L. Sirkin, and J. Butman, Payback: Reaping the Rewards of Innovation (Harvard Business School Press, Cambridge, 2007).
4. P.S. Adler, A. Mandelbaum et al., Harvard Business Review, March-April, 134-152 (1996).
5. R.M. Kanter, Executive Excellence 17(8), 10-11 (2000). 6. G. Pinchot III, Intrapreneuring: why you don’t have to leave the corporation to become an entrepreneur (Harper and Row, New York, 1985).
7. R.A. Burgelman, Administrative Science Quarterly 28(2), 223-244 (1983). 8. D. Dougherty, C. Hardy, Academy of Management Journal 39(5), 1120-1153 (1996).
9. A.L. Frohman, Organizational Dynamics 25(3), 39-53 (1997). 10. P.G. Greene, C.G. Brush and M.M. Hart, Entrepreneurship Theory and Practice 23(3), 103-122 (1999).
11. Z. Block, I.C. Macmillan, Corporate venturing: creating new businesses within the firm (Harvard Business School Press, Boston, 1993).
12. R.M. Kanter, J. North et al., Journal of Business Venturing 5(6), 415-430 (1990). 13. V. Bouchard, Cahiers de la recherche EM LYON, N. 2002-08 (2002). 14. N. Fast, The rise and fall of corporate new venture divisions (UMI Research Press, Ann Arbor, 1978).
15. R.A. Burgelman, L.R. Sayles, Inside corporate innovation: strategy, structure and managerial skills (Free Press, New York, 1986).
16. P. Gompers, J. Lerner, in R.K. Morck, Ed., Concentrated Corporate Ownership (University of Chicago Press, Chicago, 2000).
17. V. Bouchard, Cahiers de la recherche EM LYON, N. 2001-12 (2001). 18. R.M. Kanter, L. Richardson, J. North and E. Morgan, Journal of Business Venturing 6(1), 63-82 (1991).
19. G. Minati, S. Brahms, in: Emergence in Complex, Cognitive, Social and Biological Systems, G. Minati and E. Pessa, Eds., (Kluwer, New York, 2002), pp. 41-52.
20. G. Minati, E. Pessa, Collective Beings (Springer, New York, 2006).
THE COD MODEL: SIMULATING WORKGROUP PERFORMANCE
LUCIO BIGGIERO (1), ENRICO SEVI (2)
(1) University of L’Aquila, Piazza del Santuario 19, Roio Poggio, 67040, Italy, E-mail: [email protected], [email protected]
(2) LIUC University of Castellanza and University of L’Aquila, Piazza del Santuario 19, Roio Poggio, 67040, Italy, E-mail: [email protected]

Though the question of the determinants of workgroup performance is one of the most central in organization science, precise theoretical frameworks and formal demonstrations are still missing. In order to fill this gap, the COD agent-based simulation model is here presented and used to study the effects of task interdependence and bounded rationality on workgroup performance. The first relevant finding is an algorithmic demonstration of the ordering of interdependencies in terms of complexity, showing that the parallel mode is the simplest, followed by the sequential and then by the reciprocal. This result is far from new in organization science, but what is remarkable is that it now has the strength of an algorithmic demonstration instead of resting on the authoritativeness of some scholar or on some episodic empirical finding. The second important result is that the progressive introduction of realistic limits to agents’ rationality dramatically reduces workgroup performance and leads to a rather interesting result: when agents’ rationality is severely bounded, simple norms work better than complex norms. The third main finding is that when the complexity of interdependence is high, the appropriate coordination mechanism is agents’ direct and active collaboration, which means teamwork.

Keywords: agent-based models, bounded rationality, law of requisite variety, task interdependence, workgroup performance.
1. Introduction

By means of the COD (Computational Organization Design) simulation model, our main goal is to study the effects of the fundamental modes of connection and of bounded rationality on workgroup performance. We are therefore at a very micro-level of analysis of a theory of interdependence and coordination. Technological interdependence is one of five types of interdependence [1], the others being the behavioral, informational, economic, and juridical. Technological interdependence coincides with task (or component) interdependence when it refers to the micro-level of small sets of technologically separable elementary activities. Task interdependence is
Figure 1. Modes of connection.
determined by several factors, which occur at the network, dyad, and node levels [1]. One of the most important factors is the mode of connection, that is, the parallel, sequential or reciprocal ways in which task and/or agent interactions can take place. Two (or more) tasks can be connected by means of one (or more) of these three modes of connection (Fig. 1): (1) parallel connection, when tasks are connected only through their inputs and/or outputs; (2) sequential connection, when the output of one task is the input of the following one; (3) reciprocal connection, when the output of a task is the input of the other and vice versa. This categorization coincides with those that, in various forms and languages, have been proposed by systems science under the name of systemic coupling [2,3]. It is worth recalling that these three modes exhaust any type of coupling and that, as underlined by cybernetics, only the reciprocal mode refers to cybernetic systems, because only in that case is there a feedback. Indeed, in systems science the reciprocal connection is usually called structural coupling, while in organization science it is called reciprocal [4,5]. According to Ashby [6], the elementary and formally rigorous definition of organization is the existence of a functional relationship between two elements. Since some links are more complex than others, the degree of complexity resides in the form and degree of the constraint that connections establish between elements. In fact, in the parallel connection systems are almost independent (Fig. 1), because they are linked only through resource (input) sharing and/or through the contribution to the same output. These are very weak constraints indeed. The strength of the constraint increases moving to the sequential connection, because the following system depends on the output of the preceding one. It is not just a “temporal” sequence, but rather a sequence implied by the fact that the following operation acts on the output of the previous one.
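As an illustrative aside (our sketch, not the authors’ formalization), the three modes of connection can be encoded as predicates over the input and output sets of two tasks:

```python
# Illustrative predicates for the three modes of connection (our sketch).
# Each task is described by a set of inputs and a set of outputs.

def parallel(a_in, a_out, b_in, b_out):
    # linked only through shared inputs and/or contribution to the same output
    no_direct_link = not (a_out & b_in) and not (b_out & a_in)
    return no_direct_link and bool((a_in & b_in) | (a_out & b_out))

def sequential(a_in, a_out, b_in, b_out):
    # the output of the first task is the input of the following one, not vice versa
    return bool(a_out & b_in) and not (b_out & a_in)

def reciprocal(a_in, a_out, b_in, b_out):
    # outputs feed each other in both directions: the feedback loop of cybernetics
    return bool(a_out & b_in) and bool(b_out & a_in)

# two tasks sharing only the resource "r": parallel
print(parallel({"r"}, {"x"}, {"r"}, {"y"}))    # True
# task A's output "m" is task B's input: sequential, not reciprocal
print(sequential({"r"}, {"m"}, {"m"}, {"y"}))  # True
```

Only the reciprocal predicate captures the double constraint discussed here; the other two fail as soon as a feedback link is present.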
Figure 2. The model structure.

Thus, according to
the definition of complexity in terms of the degree of constraint, the sequential connection is more complex than the parallel one. Finally, the reciprocal connection has the highest degree of constraint, because it operates in both directions: system B depends on the input coming from A’s output, and vice versa. Here we see a double constraint, and thus the reciprocal is the most complex connection. Moreover, the double constraint makes a radical difference, because it constitutes the essence of feedback, and therefore the essence of the cybernetic quality. Lacking the feedback relationship, parallel and sequential connections are not cybernetic interdependencies. This reasoning leads to the conclusion that the ranking of the three basic types of interdependencies in ascending order of complexity is the following: parallel, sequential, reciprocal. In this way Thompson’s [4] and Mintzberg’s [5] arguments are supported and clarified by cybernetics. Biggiero and Sevi [7] formalize these concepts and link them to the organization and cybernetics literature. Moreover, they analyze the issue of time ordering, which expands the number of fundamental modes of connection from three to seven. However, notwithstanding these developments and a vast literature, no operative and conclusive demonstration has been supplied, either through empirical data or algorithmically. The first aim of the COD model, therefore, is to supply one in virtual reality. In section three it is shown that, in order to achieve a satisfying performance, a workgroup executing tasks characterized by sequential or, even more, reciprocal connections should employ progressively more complex coordination mechanisms. Indeed, without them performance is very low in the regime of parallel connection, and near zero in the regime of sequential or
reciprocal connection. Moreover, in section four limits to agents’ computational capacity are introduced, and it is shown that they sharply decrease group performance. Finally, it is shown that when such limits are severe, weaker coordination mechanisms perform better.

2. The COD model architecture and the methodology of experiments

2.1. The general structure

The COD model [a] has a hierarchical structure, which sees on top two “objects”: TaskCreator and WorkGroup. The former generates modules, while the latter manages agents’ behavior. At a frequency chosen by the model user, TaskCreator supplies the quantity and quality of modules to be executed. Thus, by simulating the external environment, it operates as an input system for the group of workers. By creating modules with more or fewer tasks, it also defines structural complexity (Fig. 2). In this simple version of the COD model we suppose that neither the structure of modules nor agents’ behavior changes. A module is constituted by tasks, which are made of components. In each interval, that is, in each step of the simulation, one component per task is examined and, possibly, executed. Here we make the simplifying assumption that each single module is characterized by only one mode of connection, and that between modules, whatever their inner interdependence, there is only parallel interdependence. Thus, the mode of connection refers to the relationships between the tasks of a single module. Finally, it is assumed that the components of each single task, regardless of their number, are always connected by a sequential interdependence. This configures a workgroup producing independent modules, which are not given all at once at the beginning, but instead supplied progressively at each interval according to a given frequency. Apparently, this is a situation rather different from that usually described in the literature on technology, modularity or production management.
There, a few complex products and their parts are planned and available as a stock at the beginning. Therefore, in the language of the COD model, many modules are connected in various ways to build a complex output. Conversely, here, many (relatively simple) products (modules)
[a] The program running the model is available at the URL of Knownetlab, the research center where the authors work: www.knownetlab.it. Running the program requires the LSD (Laboratory on Simulation Development) platform and language, available at www.business.auc.dk/lsd. Some indications on its concrete handling are provided with the program. In any case, the authors are available to give support in using it and to answer questions concerning the topics addressed here.
Table 1. Agents’ behaviors.

  Behavior      Description
  Search        Looking for and engaging a component (of a module’s task)
  Execution     Working on that component
  Inactivity    Being locked into a component
are supplied alongside the simulation. Rather than a car or an electric appliance, the situation simulated by the COD model resembles more that of a small library workgroup. Books are modules; cataloguing, indexing, and placing in the right place on the shelves are tasks, with their own components, that is, elementary operations like checking, writing, etc. A task can be in one of the following states:
• not-executable, because – in the case of reciprocal connection – it is waiting for a feedback, or because – in the case of sequential connection – its preceding task has not been executed;
• executable, because there is parallel connection, or – in the case of sequential connection – the preceding task has been executed, or – in the case of reciprocal connection – the feedback is ready.

2.2. Agents’ behavior

At any interval each agent can do one of three things (Tab. 1): searching, that is, looking for a component (of a task) to work on; working on the component in which she is currently engaged; or being inactive, because she can do neither of the previous two things. In this basic version of the model we suppose that all agents are motivated to work and that they all have the same competencies. The latter assumption will be implicitly modified by the introduction of norms, which force agents to follow a specific behavior. These norms can be interpreted as coordination mechanisms, and have been set up so as to improve agents’ behavior and thus increase group performance. Searching is performed by an agent who is not already engaged and is therefore looking for a (component of a task of a) module. Such a search consists in checking all components of all tasks of all modules existing in that specific interval in TaskCreator, and then randomly choosing one of them. If she finds a component, she engages in that same interval, and in the following interval she works it out. If she doesn’t find any free component, she waits for the next step to start a new search.
Hence, searching and engaging activity takes at
least one interval. Of course, only one agent can engage in the same component. The agent executing a certain component will finalize the corresponding task by moving to the next component, until the whole task is completely executed. In this case, while moving from one component to the next within the same task, no intervals are spent searching. Once the last component of the task is completed, the agent is free to search for a new component in a new task of the same module or of another module. The third possible behavior is inactivity. It means that, temporarily or definitively, she cannot work out the component she has chosen and in which she has engaged. This situation occurs in one of three cases: (i) she doesn’t find any free task to engage in; (ii) she engages in a component of a sequential task whose preceding task has not been executed; (iii) she chooses a component of a task which is connected in reciprocal interdependence with other tasks, whose feedback is missing or delayed. In other words, she needs the collaboration of other agents, who at the moment (or definitively) are not available. It is supposed that in each step an agent can operate (work out) no more than one component, while two or all (three) agents can work on the same module.

2.3. The formalization of the modes of connection

Let us consider the two tasks X and Y, and the following formalisms:
• x_t and y_t represent the state at time t of X and Y respectively. They indicate the state of advancement of the work on the two tasks;
• α and β represent the number of components constituting the tasks X and Y respectively. In our model we consider a component length of 1 step;
• p_x and p_y indicate the number of components completed within a task. A task is considered completed when its last component has been completed. In our case task X is executed when p_x = α and task Y when p_y = β;
• C_{a,t} indicates the contribution that agent a provides at time t.
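The three behaviors of Table 1 can be sketched as a single step function per agent. The class names and structure below are our illustration, not the authors’ LSD implementation, and the executability test is simplified to the parallel case:

```python
# Minimal sketch of one simulation step of a COD-style agent (illustrative).
import random

class Task:
    def __init__(self, n_components):
        self.n = n_components      # number of components (alpha or beta)
        self.done = 0              # components completed (p_x or p_y)
        self.engaged_by = None     # at most one agent at a time (simplified)
    def executable(self):
        # simplified: parallel connection only (no preceding task, no feedback)
        return self.done < self.n and self.engaged_by is None

class Agent:
    def __init__(self, name):
        self.name, self.task = name, None
    def step(self, tasks):
        if self.task is None:                    # SEARCH: costs one full interval
            free = [t for t in tasks if t.executable()]
            if not free:
                return "inactive"                # INACTIVITY: nothing to engage
            self.task = random.choice(free)      # engage; work starts next step
            self.task.engaged_by = self
            return "search"
        self.task.done += 1                      # EXECUTION: C = 1 per step
        if self.task.done >= self.task.n:        # task completed: free again
            self.task.engaged_by, self.task = None, None
        return "execute"

tasks = [Task(3), Task(4)]
a1 = Agent("a1")
history = [a1.step(tasks) for _ in range(5)]
# one search interval, then one component executed per step
```

Under sequential or reciprocal connection, executable() would additionally check the completion of the preceding task or the presence of the feedback, which is precisely what makes those regimes harder to coordinate.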
Once engaged in a task, at each step an agent increases the degree of advancement of the task and reduces the number of remaining components. In general, the value of C is not a parameter, because it depends on the characteristics of the agent in each specific step. However, in this basic version of the model, C is assumed to be a stationary value equal to 1.

2.3.1. Parallel connection

This mode of connection is characterised by an indirect connection through the common dependence (complementary or competitive) on the same inputs or
through the contribution to the same module’s output. Tasks are interdependent because they are organised (interested) to achieve (contribute to) the same output. Agents engage in a specific task and execute it without needing any input or feedback from other tasks. Once engaged in a task, the agent proceeds until its end. In formal terms:
for the task X
if p x < α then xt = xt −1 + Ca , t else xt = xt −1
for thetask Y
if p y < β
then yt = yt −1 + Ca , t else yt = yt −1. At each step the state of advancement increases by the value corresponding to the contribution supplied by the agent who works that task. Once all the components are completed, that is when p x = α for X task and p y = β for Y , the state of advancement stops in a stationary value. Tasks indirect dependence on inputs is represented by the fact that once agent ai is engaged into X she cannot engage into Y . It’s clear also the indirect dependence on outputs because the two components contribute to the outcome of the whole module ( xt + yt ). Let’s suppose tasks X and Y were made respectively by three and four components ( α = 3 and β = 4 ). Let’s suppose that agent a1 engages into the former task at time t1 while agent a 2 engages into the latter task at time t3 . Each agent employs one step into searching the tasks before starting working in the next step. Agent a1 starts the execution of the X task at time t 2 and, ending up a component in each step, completes the task after 3 steps at time t 4 , when the number of completed components reaches the number of components of the task ( p x ≥ α = 3 ). In the same way the Y task is ended up after four steps at time t 7 , when p y ≥ β = 4 . The whole module is considered completed at the major time ( t 7 ), with a state of advancement equals the sum of the state of the two tasks ( x7 + y7 = α + β ). 2.3.2. Sequential connection It is characterised by the fact that the output of a system -a task, in the present model- enters as input into the following system. This is a direct asymmetric dependence relationship. In formal terms:
L. Biggiero and E. Sevi

for the task X:
if p_x < α then x_t = x_t−1 + C_a,t, else x_t = x_t−1;

for the task Y:
if p_y < β and p_x ≥ α then y_t = y_t−1 + x_t−1 + C_a,t, else y_t = y_t−1.

As in the parallel connection there is also an indirect interdependence related to resource sharing, because if agent a_i is engaged in X she cannot work in Y. Task Y depends entirely on X, both because it takes into account the state of X at the previous step (y_t = y_t−1 + x_t−1 + C_a,t) and because, if not all components of X have been executed (p_x < α), Y cannot start (y_t = y_t−1). The workflow crosses both tasks sequentially, and the final output is obtained only with the completion of task Y. The asymmetry of the relationship is clear: while task X acts autonomously, task Y depends on (adapts to) X's behaviour.

Let's suppose tasks X and Y are both made of three components (α = 3 and β = 3), that agent a_2 engages in task Y at time t_1, and that agent a_1 engages in X at time t_3. Since the start of task Y needs the output of X, at time t_2 agent a_2 is not able to start working on Y. In fact, because a_1 engages in the former task only at time t_3, the execution of task X starts at time t_4 and, as in the parallel case, is completed after three steps at time t_6 (when p_x ≥ α = 3). Only from the next time t_7 can agent a_2 start the execution of Y, which is completed after three steps at time t_9 (when p_y ≥ β = 3). The whole module is considered executed at the end of the latter task at time t_9, with a state of advancement of the work (y_t) equal to 6.

2.3.3. Reciprocal connection

The reciprocal interdependence is characterised by a situation like the sequential connection, plus at least one feedback from the latter to the former task^b. The output of a task enters as input into the other, and vice versa. Therefore this connection can be truly considered a kind of interdependence, because the dependency relationship acts in both directions and employs the output of the
b A double feedback, from the former to the latter component as well, can be hypothesised (and is indeed rather common). The question concerns which component the final outcome comes out of. This double or single loop becomes rather important and complex when considering more than two systems and when dealing with learning processes.
connected system^c. The greater complexity of this connection with respect to the previous ones is mirrored in the formalisation too. The formalisation of the two tasks is symmetric.
for the task X:
if (p_x = 0 AND p_y = 0) OR (p_x ≤ p_y AND p_x < α) then x_t = x_t−1 + y_t−1 + C_a,t, else x_t = x_t−1;

for the task Y:
if (p_x = 0 AND p_y = 0) OR (p_y ≤ p_x AND p_y < β) then y_t = y_t−1 + x_t−1 + C_a,t, else y_t = y_t−1.
The function of dependence is formalised in the same way as in the sequential connection, but now it appears in both tasks. The execution of a task must take into account what happened in the other (x_t−1 + y_t−1). The tasks must exchange their outputs at the end of each component, so that the workflow crosses both tasks over time. For instance, if task X has worked out component 1, then in order to execute component 2 it needs to get from task Y the output of its component 1. Work on a task cannot take place until at least the same number of components has been executed in the other. In the formalisation this is represented by the conditions p_x ≤ p_y and p_y ≤ p_x. Thus, tasks exchange as many feedbacks as they have components.

To illustrate the model, let's suppose a module made of two tasks, both constituted by three components (α = 3 and β = 3). Let's suppose that agent a_1 engages in the second task at time t_1 and works out its first component in the next step t_2, while agent a_3, also at time t_2, engages in the first task. At the next time t_3, in order to proceed with the second component (y_t = y_t−1 + x_t−1 + C_a,t), task Y needs the output of the first component of task X. The execution of the second component of task Y cannot start until a number of components at least equal to Y's has been worked out on task X - in formal terms, until p_y ≤ p_x. This way the second task is temporarily locked, and agent a_1 can do nothing but remain inactive at time t_3 and try to work again at the next time. At time t_3 agent a_3 executes the first component of the first task and, by giving its output to the second task, allows the execution of the second component by agent a_1 at time t_4. Recall that a module is considered completed only when all its tasks have been worked out.

c Actually both the conditions must hold.
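The locking mechanism of the example above can be replayed in a few lines. This is a Python sketch of our own (not the COD source code): it tracks only the completed-component counters p_x and p_y, updates both tasks simultaneously on the previous step's values, and reproduces the inactivity of agent a_1 at t_3.

```python
# Sketch (our rendering) of the reciprocal lock: a task may advance only if
# the other has completed at least as many components; both states are
# updated on the previous step's values to model simultaneity.
ALPHA = BETA = 3
px = py = 0                       # completed components of X and Y
log = []
for t in range(2, 10):            # a1 works on Y from t2, a3 on X from t3
    px0, py0 = px, py             # states at t-1
    if t >= 3 and px0 < ALPHA and (px0 == py0 == 0 or px0 <= py0):
        px += 1
    if t >= 2 and py0 < BETA and (px0 == py0 == 0 or py0 <= px0):
        py += 1
    log.append((t, px, py))
    if px == ALPHA and py == BETA:
        break

print(log[1])   # (3, 1, 1): at t3 the agent on Y is locked and inactive
print(log[-1])  # (5, 3, 3): both tasks complete together at t5
```

Note how, once the two counters are aligned, the tasks advance in lockstep: this is the simultaneity that, as argued below, makes reciprocal connection so sensitive to delays.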
It is important to underline that, as even a careful look at the formalisation shows, the reciprocal interdependence is much more sensitive to the risk of inactivity and/or delay. Tasks are truly interdependent both in the exchanged output and in the time at which the transfer happens. Though in principle a module characterised by reciprocal connection can be worked out by a single agent moving between tasks, this implies delays for searching and engaging. Therefore, supposing modules of the same length, the best performance occurs when all tasks are taken at the same moment by an agent, because simultaneity eliminates delays.
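For comparison, the worked examples of 2.3.1 and 2.3.2 can also be replayed directly. This is again an illustrative Python sketch of ours; in the sequential case we read the formal rule as absorbing X's output once, when Y starts, which reproduces the final state of 6 given in the text.

```python
# Parallel: alpha = 3, beta = 4; a1 works on X from t2, a2 on Y from t4.
ALPHA, BETA, C = 3, 4, 1
x = y = px = py = 0
for t in range(1, 8):                   # steps t1..t7
    if t >= 2 and px < ALPHA:
        x += C; px += 1
    if t >= 4 and py < BETA:
        y += C; py += 1
par_total = x + y
print(par_total)                        # 7 == alpha + beta: module done at t7

# Sequential: alpha = beta = 3; a1 works on X from t4; Y waits for X.
ALPHA = BETA = 3
x = y = px = py = 0
for t in range(1, 10):                  # steps t1..t9
    if t >= 4 and px < ALPHA:
        x += C; px += 1
    elif px >= ALPHA and py < BETA:     # Y may start only after X is done
        if py == 0:
            y += x                      # assumed: X's output absorbed once
        y += C; py += 1
print(y)                                # 6: module done at t9
```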
2.4. Norms and coordination mechanisms

Norms and coordination mechanisms would not be necessary if pure chance were sufficient to assure a satisfying, even if not maximum, group performance. However, as we will discuss in the next section, with our model we found that without some elementary norms performance is unsatisfying in the regime of parallel connections and nearly zero for the other two types of connection. We have hypothesized six norms (Tab. 2) and their corresponding coordination mechanisms [5]. The Cooperation Norm guarantees that every agent is willing to cooperate: nobody abandons her job or voluntarily defeats the colleagues' purposes. The Finalizing Norm drives agents to complete the task in which they are engaged by moving from the current to the next component. As we have shown in paragraphs 2.2 and 2.3, agents can be engaged in non-executable tasks. The Anti-inactivity Norm resolves this inactivity by prescribing that agents leave a locked task and search for another one. Since this norm resolves the situation of inactivity but doesn't prevent it, we introduce the Anti-trap Norm, able to prevent choosing locked components. This norm works on the sequential connection by forbidding agents to pick tasks that follow tasks not yet executed, and on the reciprocal connection by driving agents to avoid tasks that are waiting for feedback. The Focusing Norm prescribes that agents give precedence to incomplete tasks by choosing tasks of modules in progress. More complex is the Norm of Collaboration, since it recommends that agents choose with priority tasks currently being worked, that is, incomplete tasks on which other agents are currently engaged. The first five norms are forms of weak planning focused on tasks, because agents are told how to search for and cope with tasks, overlooking other agents. However, they are weak forms of planning, because they don't specialize agents on a specific kind of task or module, and neither are they directed by a
Table 2. Norms and coordination mechanisms.

Type of norm | Description | Corresponding coordination mechanism
1. Cooperation Norm | Every agent does work (nobody defeats, free-rides, defects or loafs). | Planning agents' behavior
2. Finalizing Norm (1+2) | Once a task is started, agents must end it, moving from the current to the next component. | Planning agents' behavior
3. Anti-inactivity Norm (1+2+3) | Agents forced into inactivity because engaged in a locked task leave it immediately and move on to search for another task. | Planning agents' behavior
4. Anti-trap Norm (1+2+3+4) | Agents avoid engaging in locked tasks: in sequential connection they avoid tasks following tasks not yet executed, while in reciprocal connection they avoid tasks waiting for feedback. | Planning agents' behavior
5. Focusing Norm (1+2+3+4+5) | Agents give priority to tasks of modules in progress. | Planning agents' behavior
6. Collaboration Norm (1+2+3+4+5+6) | Agents give priority to tasks of modules being worked by other agents. | Favoring reciprocal adaptation
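The cumulative character of the norm sets can be sketched as a chain of filters on the tasks an agent may choose. The function below is our illustrative rendering (field names and the fallback behaviour for "priority" norms are assumptions, not the COD implementation): each level from 4 upwards narrows the candidate set further.

```python
# Sketch: norms 4-6 of Tab. 2 as cumulative filters on candidate tasks.
def choose_task(candidates, level):
    """candidates: dicts with 'name', 'locked', 'in_progress', 'others_working'."""
    if level >= 4:   # Anti-trap: never engage in a locked task
        candidates = [c for c in candidates if not c["locked"]]
    if level >= 5:   # Focusing: priority to tasks of modules in progress
        candidates = [c for c in candidates if c["in_progress"]] or candidates
    if level >= 6:   # Collaboration: priority to modules other agents work on
        candidates = [c for c in candidates if c["others_working"]] or candidates
    return candidates[0]["name"] if candidates else None

tasks = [
    {"name": "T1", "locked": True,  "in_progress": True,  "others_working": True},
    {"name": "T2", "locked": False, "in_progress": False, "others_working": False},
    {"name": "T3", "locked": False, "in_progress": True,  "others_working": False},
    {"name": "T4", "locked": False, "in_progress": True,  "others_working": True},
]
print(choose_task(tasks, 4))  # T2: first non-locked task
print(choose_task(tasks, 6))  # T4: non-locked, in progress, worked by others
```

The `or candidates` fallback reflects the fact that Focusing and Collaboration state priorities rather than prohibitions: if no candidate satisfies them, the agent still picks from the wider set.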
supervisor. In fact, the corresponding configuration of the workgroup is not hierarchical: it is a group of peers who do not coordinate directly with one another. The sixth norm is qualitatively different, because it addresses precisely agents' collaboration, and thus it configures the group as a team. These norms have been applied in a cumulative way, increasing complexity at each level: each norm set includes the previous one as a sub-set. Higher complexity means that each norm implies more constraints than the previous one. This way of measuring norm complexity equals that used to measure the complexity of the modes of connection. These constraints limit agents' behavior by directing their efforts in a more effective and efficient way: by constraining behaviors many wrong choices are prevented, and thus group performance increases. The issue of the role played by norms and coordination mechanisms pertains to the theory of coordination and not to the theory of interdependence: the latter deals with concrete or abstract objects, like systems, tasks, activities, etc., while the former deals with and refers to agents. Tasks (systems) are connected in a certain way, and that way is defined by the technology. Agents are coordinated in a certain way, and that way is defined by the norms that somebody (or the agents themselves) sets up and applies. The rationale for the need for norms and coordination mechanisms is more complex and cannot be extensively discussed here. We can simply say that without them group performance turns out unsatisfying or just null. As we will argue in the next section, the need for progressively more complex norms can be taken as a demonstration that some types of connection are more complex than others. Moreover, norm complexity can be measured in the same way as the complexity of the modes of connection, that is, in terms of the degree of constraint they put on agents' behavior: the more restrictive, that is, the more limiting of agents' choices, the more complex they are. Notice that the COD model does not deal with the issue of how norms are set up, emerge, or change. Moreover, with respect to Mintzberg's categorization [5] of coordination mechanisms, managers' supervision is not considered here. Finally, components are supposed to be totally standardized, tasks differ only in the number of components, and modules only in the modes of connection among their tasks.
2.5. The methodology and working of the model

Our model analyzes the effects of task interdependence by showing how, in order to get a satisfying performance, more complex connections require more complex norms. Group size is fixed at 3 agents, whose performance is measured by the following two indexes:
• effectiveness: the number of executed modules divided by the maximum number of executable modules; this index varies between 0 and 1;
• efficiency: the number of working steps divided by the maximum number of steps that can be employed in working; this index refers to the degree of use of inputs, here constituted by the agents' time actually employed in working divided by the maximum number of steps the group can employ in working, thus excluding the steps spent on engagements.
Two aspects should be underlined: (i) these indexes are normalized on group size and structural complexity, so that effectiveness and efficiency are independent of them; (ii) maximum efficiency doesn't necessarily correspond to maximum effectiveness, because only through adequate coordination can agents' efforts be directed to the right tasks and resources be minimized. Experiments are conducted on specialized groups, that is, groups executing modules characterized by only one of the three modes of connection. Thus, the performance of workgroups specialized on parallel modules (henceforward labeled P), sequential modules (S) and reciprocal modules (R) is analyzed separately. We run 10 simulations per experiment, using different values for the random generator, so as to prevent its possible influence on the results. Data
record the mean of the performance indexes over each series of 10 simulations. Each simulation lasts 900 intervals, and the module creation frequency is fixed at 0.50 and kept constant during the experiments, so that 450 modules are generated in each simulation. Each module is composed of three tasks and each task of three components; that is, each task needs three working steps and one engaging step. Given these environment parameters, a group of three members has a production capacity of 675 tasks, corresponding to 225 modules: when the group performs at its maximum, it completes 225 modules, its maximum productive capacity. According to a satisfying, and hence non-maximizing, approach to social sciences [8,9,10], it is important to distinguish two types of group performance: the conditions under which the maximum and the satisfying performance, respectively, are reached. In the former case maximum effectiveness and efficiency are achieved, while in the latter a satisfying performance can be enough. The rationale is that groups whose members work efficiently - that is, they don't waste time searching in vain or remain trapped in blocked components - and whose effectiveness is acceptable can be judged positively^d. Our small library workgroup is therefore supposed at its maximum to complete the storing of 225 books (modules) per year, that is, 75 books per worker, which means 9 working days per book. At first sight this is not a hard goal to reach if books are simple, and if this were the only activity of the librarians. Saying that books are simple means, as we will show with our simulation results, that tasks (cataloguing, indexing, and placing) can be executed independently, that is, in a parallel regime. In other words, each worker could independently work on one of the tasks related to the same book. The situation would change slightly if task connections were sequential, and dramatically if they were reciprocal.
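The capacity figures above follow directly from the environment parameters, and the effectiveness index can be written in one line. A short worked check (the run completing 150 modules is a hypothetical example of ours, not a result from the paper):

```python
# Worked check of the environment parameters and capacity figures above.
intervals, agents = 900, 3
steps_per_task = 3 + 1                   # three working steps + one engaging step
tasks_per_module = 3

created = int(intervals * 0.50)          # module creation frequency 0.50
max_tasks = intervals * agents // steps_per_task
max_modules = max_tasks // tasks_per_module
print(created, max_tasks, max_modules)   # 450 675 225

# effectiveness index of a hypothetical run completing 150 modules:
print(round(150 / max_modules, 2))       # 0.67
```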
The difficulty would further increase if agents' ability to search among incompletely stored books were limited. In both cases, and of course much more so when these conditions occur simultaneously, it would be really difficult to reach, if not the maximum, at least a satisfying performance without employing a set of complex coordination mechanisms. This is one of the things we discuss in the next section with the results of our experiments.

d Though it is not treated in this basic version of the model, a crucial point is that each norm has its own cost, and that more complex norms are more costly. Adding more and/or more complex (costly) norms increases effectiveness, maybe up to the maximum, but it should be checked in each specific case whether the advantages of maximum effectiveness compensate the disadvantages of managing more numerous and possibly more complex norms.
3. The effects of task interdependence

We base our ordering of the modes of connection in terms of complexity on the following argument: a mode is more complex than another if, ceteris paribus, a workgroup operating under it requires more numerous and more complex coordination mechanisms in order to reach the same performance. This is an indirect demonstration, based on computational experiments. Although this demonstration yields a coherent ordering of the modes of connection, it is different from that used above (see the introduction) and in other works [5], where complexity is defined in terms of the degree of connection constraint. Workgroups facing parallel interdependence are not complex, and don't need special devices to be effective. Conversely, in a regime of sequential or reciprocal interdependence complexity grows, and consequently coordination becomes more complex too. In spite of the model's simplifications, our analysis confirms the main suggestions coming from the consolidated literature on this subject. The radical difference is that such statements are now based not on the "ipse dixit", that is, on the reputation of some scholar, but rather on an algorithmic demonstration. Incidentally, in our case the "scholars" were perfectly right in their main arguments. The results of our simulation model (Tab. 3) show that the Cooperation Norm alone is unable to help groups achieve an adequate performance. Effectiveness is low whatever the mode of connection, while efficiency reaches a satisfactory level only in the parallel connection. The Finalizing Norm guarantees an almost satisfying performance only to the group working on tasks connected by parallel interdependence, while in the other two cases agents get locked into tasks that cannot be executed.
In the sequential regime too many agents engage in tasks successive to those not yet completed, and in the reciprocal interdependence they wait too long for feedback from other tasks. In most simulations almost all agents are locked already in the early steps, so that the group soon enters an irreversible paralysis. The Anti-inactivity Norm prescribes that agents locked into a task leave it immediately (during the same interval in which they engaged in it) and search for another task. Hence, this norm resolves the situation of inactivity but doesn't prevent it, because it intervenes on the effects and not on the causes of inactivity. This norm leaves the performance of the group working in the parallel regime untouched, because there is no inactivity to resolve, and it improves a little the group working in the sequential mode. The performance of the
Table 3. The effects of task interdependence: main results from the simulation model.

Norm (cumulative set) | Mode | Effectiveness | Efficiency
1. Cooperation Norm | P | 0.16 | 0.53
 | S | 0.03 | 0.19
 | R | 0.01 | 0.23
2. Finalizing Norm (1+2) | P | 0.66 | 1.00
 | S | 0 | 0.01
 | R | 0 | 0
3. Anti-inactivity Norm (1+2+3) | P | 0.66 | 1.00
 | S | 0.54 | 0.76
 | R | 0.16 | 0.58
4. Anti-trap Norm (1+2+3+4) | P | 0.66 | 1.00
 | S | 0.65 | 1.00
 | R | 0.29 | 0.74
5. Focusing Norm (1+2+3+4+5) | P | 1.00 | 1.00
 | S | 1.00 | 1.00
 | R | 0.79 | 0.79
6. Collaboration Norm (1+2+3+4+5+6) | P | 1.00 | 1.00
 | S | 1.00 | 1.00
 | R | 1.00 | 1.00
reciprocal group remains definitely unsatisfactory: agents consume a lot of time searching. In order to substantially improve the performance another norm becomes necessary. The Anti-trap Norm prevents choosing locked tasks; its action requires that agents know the right sequence of execution of each task. While the group working in the regime of reciprocal tasks remains at its quasi-satisfying efficiency and poor effectiveness, the group facing sequential tasks reaches the same outcomes as the parallel regime. Through the Focusing Norm, which prescribes that agents give priority to incomplete modules, a sharp increase of performance is realized: it brings the groups in both the parallel and the sequential regime to the maximum. Once agents are focused on the same modules, their efficiency pushes effectiveness. Even the group working with reciprocal tasks benefits substantially from this norm, but it doesn't yet reach the maximum performance. To this aim the (final) Norm of Collaboration is necessary, which forces agents to choose first the modules currently being worked, that is, incomplete modules on which other agents are engaged. This norm is more restrictive than the
previous one because, in order for a module to get priority, it is not enough that it is incomplete: other agents must currently be working on it. By adding this norm all three types of interdependence reach the maximum performance. Notice that this norm is qualitatively different from, and more complex than, the previous ones: it establishes coordination between agents, while the others intervene on the relationships between agents and tasks.
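The progression in Tab. 3 can be condensed into the least complex norm set that yields a satisfying performance per mode. The sketch below uses the effectiveness values from the table; the satisfying threshold of 0.6 is our assumption, chosen to match the ordering stated in the conclusions.

```python
# Tab. 3 condensed: minimal cumulative norm level with effectiveness >= 0.6.
eff = {  # norm level -> (P, S, R) effectiveness, values from Tab. 3
    1: (0.16, 0.03, 0.01), 2: (0.66, 0.00, 0.00), 3: (0.66, 0.54, 0.16),
    4: (0.66, 0.65, 0.29), 5: (1.00, 1.00, 0.79), 6: (1.00, 1.00, 1.00),
}
minimal = {}
for i, mode in enumerate("PSR"):
    minimal[mode] = min(k for k, v in eff.items() if v[i] >= 0.6)
print(minimal)  # {'P': 2, 'S': 4, 'R': 5}
```

Parallel is satisfied by the Finalizing Norm (set 2), sequential by the Anti-trap Norm (set 4), and reciprocal only by the Focusing Norm (set 5): exactly the complexity ordering argued above.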
4. The effects of bounded rationality

Our model is truly non-neoclassical because: a) agents are rule followers [8,9] and not utility maximizers; b) agents' rationality is bounded [10,11,12,13,14]. There is currently a hot debate concerning the ways to operationalize bounded rationality so as to give it a sound and effective scientific status. Among the many ways in which this could be done and the many facets it presents, in our model we chose one of the simplest: agents' computational capacity. The idea is that agents cannot look at and compute all executable modules, because these should be checked and computed in order to decide which one it is better to execute, and in which component of which of its tasks. Fig. 3 shows that the challenge to agents' rationality sharply increases over time: halfway through the group's working life there are at least 112 incomplete circulating modules to be computed by each agent. In particular, the problem is generated by the progressive proliferation of incomplete modules and tasks. Let's consider the best situation for efficiency and effectiveness: a group working in the regime of parallel connections, where agents are coordinated by the Collaboration Norm, the most effective coordination mechanism. And further, let's suppose they have no computational limits in searching, checking, and computing modules and tasks (Fig. 3). Well, even in this most favorable case, after only the first 20% of the group's working life, that is, after 180 intervals, around 45 incomplete modules circulate (Fig. 3). In these best conditions - easiest regime and most effective norms - after 180 intervals each agent should be able to compute, in a single interval, 45 books, that is, 135 tasks (405 components). The size of the decision space becomes too large very soon. The tables and shelves of our small library workgroup soon become progressively filled with incompletely stored books, and the degree of disorder grows accordingly.
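The growth of the decision space is simple arithmetic over the module structure. A worked check of the figures cited from Fig. 3 (the task and component counts for the 112-module case are derived by us with the same multiplication):

```python
# Worked check of the decision-space figures cited from Fig. 3.
tasks_per_module, comps_per_task = 3, 3
sizes = {}
for interval, modules in ((180, 45), (450, 112)):
    tasks = modules * tasks_per_module
    sizes[interval] = (modules, tasks, tasks * comps_per_task)

print(sizes[180])  # (45, 135, 405): after 20% of the 900 intervals
print(sizes[450])  # (112, 336, 1008): halfway through the working life
```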
Figure 3. Complete and incomplete modules circulating in a group working in the parallel regime, coordinated by the Collaboration Norm, with unboundedly rational agents. NMod: number of created modules; NModCompl: number of executed modules; NModToCompl: number of uncompleted modules.

Every day it becomes much harder to compute all the incomplete modules in order to choose the right one. Even in the best conditions, the goal that at first sight appeared so easy to achieve becomes unreachable. If the yearly working time of an average American worker is supposed to be 1800 hours, in our 900-step simulations each interval corresponds to 2 working hours. Now, the problem of agents' computational capacity can be put in the following way: how many modules can be looked at and computed (analyzed) in one interval (2 hours)? This problem is crucial, because group effectiveness and efficiency depend essentially on this ability. An agent, in fact, has to "open" and check all incomplete modules in order to choose the task to be worked out. Recall that each module is made of 3 tasks, each constituted by 3 components. In the end, an agent will choose a specific component of a specific task of a specific module. She will do that taking into account the specific interdependence regime and the norms ruling the group. Even assuming that the librarians' storing work is effectively supported by computer programs, it can reasonably be supposed that in a standard 2-hour interval a highly competent (strongly rational or efficient) agent can check 40 modules (that is, 120 tasks, or 360 components), while a poorly competent or motivated one just 2. The results in Tab. 4 show that if the group is in the parallel or sequential regime and is coordinated through the most effective norm, then the achievement of a satisfying performance requires a computational capacity of at least 20 modules per agent. Consequently, only a very high rationality joined with the best coordination mechanism makes it possible for a group dealing with complex tasks to achieve a satisfying performance. If the regime is the reciprocal mode, then double that capacity is required. Let's say that this regime needs librarians with double the competence required in the other two regimes. If the group is coordinated by a less effective norm, then in the reciprocal regime the performance will
Table 4. Agents' computational capacity: effects on group performance. Group 1: coordination through the Anti-trap Norm. Group 2: coordination through the Norm of Collaboration.

Group 1
Capacity (modules) | Effectiveness P | S | R | Efficiency P | S | R
2 | 0.66 | 0.66 | 0.28 | 1.00 | 1.00 | 0.74
5 | 0.67 | 0.66 | 0.28 | 1.00 | 1.00 | 0.74
10 | 0.65 | 0.64 | 0.28 | 1.00 | 1.00 | 0.74
20 | 0.65 | 0.65 | 0.29 | 1.00 | 1.00 | 0.74
40 | 0.66 | 0.66 | 0.30 | 1.00 | 1.00 | 0.74
80 | 0.66 | 0.65 | 0.29 | 1.00 | 1.00 | 0.74

Group 2
Capacity (modules) | Effectiveness P | S | R | Efficiency P | S | R
2 | 0.37 | 0.43 | 0.17 | 0.37 | 0.44 | 0.32
5 | 0.52 | 0.58 | 0.28 | 0.52 | 0.59 | 0.41
10 | 0.63 | 0.69 | 0.40 | 0.63 | 0.70 | 0.51
20 | 0.75 | 0.79 | 0.58 | 0.75 | 0.80 | 0.64
40 | 0.85 | 0.88 | 0.75 | 0.85 | 0.89 | 0.78
80 | 0.93 | 0.95 | 0.89 | 0.94 | 0.95 | 0.90
never exceed 30%, even supposing a computational capacity of 80 modules per agent. It means that in the presence of complex task interdependence, without collaborating at their best there is no way to reach a satisfying performance. Tab. 4 also shows another interesting result: at the lowest degrees of computational capacity a simpler coordination is more effective. In fact, when the connection is parallel and computational capacity is less than 20 modules, Group 1 performs better than Group 2. Similarly, when the mode of connection is sequential or reciprocal, Group 2 performs better than Group 1 only if computational capacity reaches at least 10. This is due to the fact that once rationality is really bounded, it is bounded also with respect to goal-seeking behaviors. The norms of focusing and collaboration tell agents to search for two specific module categories: modules in progress and modules under execution by other agents. However, if computational capacity is under a certain threshold, then the time consumed searching for those specific module categories is high - so high, in particular, as to cancel the advantages of being more goal-seeking. The effectiveness of goal-seeking behavior is more than compensated by the ineffectiveness of spending a lot of time in the searching activity. In other words, by leading to less specific goals, that is, by allowing a wider range of choices, less complex norms reduce the effort of searching and increase the effectiveness of less rational agents. This explanation is confirmed by the analysis of efficiency, which is indeed inversely dependent on the time spent in the searching activity. When the mode of connection is parallel or sequential, whatever the computational capacity of agents, the efficiency of Group 1 is much higher than that of Group 2. Similarly, when tasks are connected in a reciprocal way, Group 2 scores a higher efficiency only if agents have a high computational capacity.

Figure 4. Effective combinations of bounded rationality and coordination complexity.

Figure 4 summarizes the effective combinations of bounded rationality and coordination complexity. If the workflow arriving at the workgroup from the external environment were better regulated than a flow of 0.5 modules per step, then performance would be, ceteris paribus, much higher, and in particular there would be fewer incomplete modules. In other words, in order to reach satisfying performances, workgroups would need less rationality or less complex coordination norms. At the extreme of a perfect regulation, the number of modules arriving from the external environment would coincide with those completed, and goal achievement would require neither high rationality (just one module per agent per interval) nor all the norms. On the other hand, it is likely (but also left to the future research agenda) that a workflow more uncertain than a constant rate of module creation would require, ceteris paribus, more rationality or more complex coordination mechanisms. Such an increase of complexity associated with an unstable workflow could be more than compensated by introducing agents' learning in one or more of these three forms: i) better searching ability after completing tasks; ii) higher
productivity when working on the same task; iii) greater ability to collaborate as the number of successful collaborations grows over time. Actually, in this basic version of the model agents do not learn, and therefore the corresponding forms of nonlinearity do not take place.
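The crossover discussed above can be restated mechanically from the table. This sketch compares the parallel-regime effectiveness of the two groups (values copied from Tab. 4) and reports which coordination wins at each capacity level:

```python
# The crossover of Tab. 4: parallel-regime effectiveness of Group 1
# (Anti-trap Norm) vs Group 2 (Norm of Collaboration).
g1 = {2: 0.66, 5: 0.67, 10: 0.65, 20: 0.65, 40: 0.66, 80: 0.66}
g2 = {2: 0.37, 5: 0.52, 10: 0.63, 20: 0.75, 40: 0.85, 80: 0.93}

winners = {cap: ("Group 1" if g1[cap] > g2[cap] else "Group 2")
           for cap in sorted(g1)}
print(winners)
# Group 1 wins up to capacity 10; Group 2 wins from capacity 20 onwards
```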
5. Conclusions

Our simulation model tells us that groups working on complex interdependencies can reach an acceptable performance only by means of complex norms. Reciprocal interdependence can be managed satisfactorily only through the Focusing Norm, and reaches the maximum only through the Norm of Collaboration, which actually includes the five simpler norms. Sequential interdependence can be satisfactorily managed by applying the Anti-trap Norm, which includes three norms, and parallel interdependence already with the Finalizing Norm. These results also have a normative side: it is redundant to employ complex norms to coordinate groups working on tasks connected by simple interdependencies. Further, and quite surprisingly, when agents' rationality is severely bounded, the Collaboration Norm becomes not simply redundant but positively disadvantageous. In other words, coordination between agents does not work well when agents' computational capacity is very low; well-focused task-based coordination performs better. Of course these results, and especially their normative side, should be taken with prudence, because our model is still extremely simple. The introduction of knowledge exchange, competencies, personal conflicts, learning processes, and task specificity could change them significantly. However, so far we have obtained four relevant findings: 1) an algorithmic demonstration of the ordering of interdependencies in terms of complexity; 2) an operationalization of bounded rationality in terms of computational capacity; 3) an algorithmic analysis of the effects of bounded rationality on workgroup performance, which also takes task interdependence into account as a moderating factor; and 4) an explanation of why and under what circumstances teamwork is a superior organization. This latter result confirms suggestions proposed, but left theoretically and empirically unproven, in organization science.
This version of the COD model is simple: it supposes that agents have the same competencies and motivation to work, that they do not learn, that they do not make mistakes, that there are no behavioral issues (personal conflicts, leadership problems, etc.), and that there are no differences between tasks. Moreover, there are no externalities, nor other forms of nonlinear phenomena. Despite its simplicity, however, the model is very helpful, both because it is the ground on which more complex and realistic models can be built and because it already shows many interesting effects. Moreover, by the inner logic of simulation models, in order to explain the results coming from rich (complex) models it is necessary to know the behavior of the variables in simple (controllable) models.
IMPORTANCE OF THE INFRADISCIPLINARY AREAS IN THE SYSTEMIC APPROACH TOWARDS NEW COMPANY ORGANISATIONAL MODELS: THE BUILDING INDUSTRY

GIORGIO GIALLOCOSTA
Dipartimento di Progettazione e Costruzione dell'Architettura, Università di Genova
Stradone S. Agostino 37, 16123 Genoa, Italy
E-mail: [email protected]

Infradisciplinary applications, besides interdisciplinary and transdisciplinary ones, form part of the definition of new company organizational models, in particular networked-companies. Their systemic connotations characterize the latter as collective beings, especially regarding the optimization of interactions between agents as well as context-specific interference. Networked-companies in the building industry (chosen here to illustrate the infradisciplinary values of the systemic approach to company organizational models) require, due to their nature and the particularities of their context, certain specifications: behavioral micro-rules of an informal nature, suitable governance of the sector, etc. Their nature and particular context thus determine, especially in the systemic view, the need not only for an interdisciplinary and transdisciplinary approach, but also for an infradisciplinary one.

Keywords: systemics, infradisciplinarity, building, company, organization.
1. Introduction

The Discorso preliminare of Diderot and d'Alembert's Enciclopedia states: “(...) there is not a single academic who would not willingly place the theme of his own study at the centre of all the sciences, in a way similar to primitive human beings who placed themselves at the centre of the world, convinced that the world had been made for them” (Diderot and d'Alembert, 1772 [3], cit. in Dioguardi, 2005 [5, p. 45], author's translation). Even today, within various disciplines, this tendency persists:
• sometimes with each academic emphasising those features of his/her own area of interest that carry an assumed generalist significance;
• sometimes insisting tout court upon the particular collocation of their own expertise;
• in other cases claiming the irreducibility of that science to more general laws of interpretation and management of phenomena, or stressing assumed structural peculiarities, etc.
From the same Discorso there also emerges the role assigned to philosophy in the “(...) encyclopaedic order of our knowledge ...” (Diderot and d'Alembert, 1772 [3], cit. in Dioguardi, 2005 [5, p. 44], author's translation). This encyclopaedic order, in fact, “(...) consists of collecting knowledge within the smallest possible space and putting the philosopher, so to speak, over and above this vast labyrinth, at quite an elevated observation point, from which he can completely embrace all the main arts and sciences ...” (Diderot and d'Alembert, 1772 [3], cit. in Dioguardi, 2005 [5, p. 44], author's translation). This approach leads to the fragmentation of knowledge into various disciplines, a fragmentation which remains common practice in many areas of study and research. Nor is one usually completely free of mistaken interpretations of:
• generalism, where the aim is to recompose knowledge (and/or construct general theories) but with unacceptable simplifications;
• specialism, whenever means and ends inspired by scientific rigour in the interpretation and management of peculiarities (and in the related operational activities) lead to artificial sectorialisms.
In this sense systemics, especially through interdisciplinary and transdisciplinary processes (thus producing interactions, correspondences and theories at higher levels of generalisation), also leads to a recomposition amongst the various disciplines. And it does so not by replacing but by integrating their specialistic knowledge: for this reason, and to avoid mistaken assumptions about centrality in all the sciences (Diderot and d'Alembert, 1772 [3], cit. in Dioguardi, 2005 [5, p. 45]), infradisciplinarity is associated with interdisciplinarity and transdisciplinarity.
It is well known that:
• interdisciplinarity occurs when problems and approaches of one discipline are used in another;
• transdisciplinarity occurs when systemic properties are discussed and studied in a general way, as properties of models and representations (without reference to cases in specific disciplines).
Infradisciplinarity has, in its turn, already been defined epistemologically as, fundamentally, a set of prerequisites necessary for any scientific activity and as a method of investigation regarding the intrinsic aspects of disciplines (Lorenzen, 1974 [11, pp. 133-146]); here it is taken, above all, as the set of resources and assumptions activating and validating specialistic rigour. It is thus important that it be considered in research activities, in order to avoid genericism.
An example of the risks of mistaken generalism, and of insufficient attention to infradisciplinary aspects (even though it is not strictly ascribable to the scientific disciplines), is the case of company organisational systems applied to the construction sector. The latter shows, in fact, significant and substantial peculiarities, as will be seen below: peculiarities which, moreover, are expressed to the same extent in the nature of the construction companies (especially Italian ones), leading to an indication of their particular behavior as collective beings [a], and sometimes precursors (as will be seen) of completely unexpected operational effectiveness. Current theories of company organisational models, converging, above all, towards the concept of networked-company (Dioguardi, 2007 [6]), provide a general reference framework for applications/elaborations over widely differing sectors. In this way, the latter (through the use of models, behavioral rules, etc.) can manage and coordinate innovational leanings and the multiple phenomenologies inherent in various scenarios: local industrial estates, virtual industrial estates, etc. [b]. This does not, however, exclude any specificity of such phenomenologies.
[a] The concept of collective being expresses, above all, situations in which the system which emerges from the interactions amongst the component parts may show behaviour very different from that of its individual components: so different, in fact, as to require the dynamic use of multiple systems (interacting and emerging from the same components). This concept, when applied to the reality of companies, allows the description of new models, and thus novel possibilities of intervening in processes involving the ability to (Minati and Pessa, 2006 [12, pp. 64, 70-75, 89-113, 365-368]): decide, store information, learn, act intelligently, etc. Collective beings also refers to collective behaviour emerging from that of autonomous agents which share a similar cognitive model, or at least a set of common behavioural micro-rules (Minati and Pessa, 2006 [12, pp. 110-111]).
[b] Virtual can mean potential. In the thinking of St. Thomas Aquinas and other scholastics:
- an effect is formally contained within its cause if the nature of the former is present in the latter;
- an effect is virtually present within its cause if, while not containing the nature of the former, it may produce it (Minati and Pessa, 2006 [12, p. 362]).
The concept of virtual company usually refers to an electronic entity, put together by selecting and combining organisational resources of various companies (Minati and Pessa, 2006 [12, pp. 365-368]). This concept also expresses an opportunity for active cooperation amongst several companies, often having the same target. In this sense, the constitution of a virtual company implies the development of a suitable network of relationships and interactions amongst those companies, developed on the basis of customer requirements (Minati and Pessa, 2006 [12, p. 366]).
More generally, the concept of virtual district comprises the simultaneous meanings of:
- potential (and specific) organisational development, where the constituent members appear to have significant proximity only from an IT point of view (Dioguardi, 2005 [5, p. 127]);
- quasi-stability (Garaventa et al., 2000 [8, p. 90]).
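The claim in note [a] — that agents sharing common behavioural micro-rules can produce collective behaviour very different from that of any individual component — can be made concrete with a minimal sketch. The averaging rule and the numbers below are illustrative assumptions, not taken from the cited sources: each agent applies only a local rule (move its own state towards the mean of the others), yet the group converges to a consensus that no individual rule mentions.

```python
# Minimal illustration (hypothetical): one shared micro-rule per agent,
# and a collective property (consensus) that the rule itself never states.

def step(states, alpha=0.5):
    """Each agent moves its state part-way towards the average of the others."""
    n = len(states)
    new = []
    for i, s in enumerate(states):
        others = (sum(states) - s) / (n - 1)  # what the rest of the group does
        new.append(s + alpha * (others - s))  # local micro-rule only
    return new

states = [0.0, 2.0, 7.0, 11.0]   # heterogeneous individual behaviours
for _ in range(30):
    states = step(states)

# Emergent collective property: all agents end up at the common mean.
print([round(s, 3) for s in states])  # → [5.0, 5.0, 5.0, 5.0]
```

The design point is that nothing in `step` refers to "consensus": the system-level property emerges from the interaction of identical micro-rules, which is exactly the sense in which a collective being's behaviour differs from that of its components.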
In this sense, therefore, when coherently adopted within a systemic view, new company organization theories avoid the risks of any pretext regarding the centrality of individual areas of application, thereby reducing the possible effects of self-referentiality.

2. Systemic connotations of networked-companies

Although prior to his later contributions, Dioguardi defines the networked-company as “(...) a series of laboratories (...) expressed as functional areas which overall provide a network of internal operational nodes. Amongst these (...) economic transactions develop, almost leading to an internal quasi-market. The company is also open to external cooperation from other companies, through transactions with suppliers, and these (...) produce a network of supplier companies which are nevertheless independent and able to activate transactions in a real external market which remains, however, an expression of the supplier network of the general company (...) The company is thus structured in a reticular way allowing the coexistence of a hierarchical order together with the efficiency of the market within an organisational harmony like that in Goethe's web of thought: “The web of thought, I'd have you know / Is like a weaver's masterpiece: / The restless shuttles never cease, / The yarn invisibly runs to and fro, / A single treadle governs many a thread, / And at a stroke a thousand strands are wed” (Goethe, 1975 [10, p. 94]; Italian citation: “In realtà, la fabbrica dei pensieri / va come un telaio: / pigi il pedale, mille fili si agitano / le spole volano di qua e di là, / i fili corrono invisibili, / un colpo lega mille maglie”, cit. in Dioguardi, 1995 [4, p. 171], author's note) ... This company model, however, entails centrifugal freedom of movement capable of disaggregating the component companies (but also involving critical elements for the individual supplier companies - author's note) [c].
It is thus necessary (regarding the risks of activating such centrifugal autonomies - author's note) to search for elements of aggregation and homogeneity. And these can be found precisely within the concepts of culture and quality, intended both as internal organizational requirements and as external manifestations capable of expressing a competitive nature” (Dioguardi, 1995 [4, p. 171], author's translation). Especially in Italy, in the building sector (in the more advanced cases), models of networked-companies, or of the general company, can be defined through connections “(...) at three fundamental levels:
[c] See, for example, particularly regarding the building sector, Giallocosta and Maccolini, in Campagnac, 1992 [2, pp. 131-133].
• the general company itself, which essentially takes on the role of managing and orchestrating (...);
• the multinodal company (the operational nodes of the general company) ... responsible for managing production, finance and plant and machinery;
• the macrocompany, consisting of the specialist external companies (...) involved, by the general company, through the multinodal company, in production and supplier activities, by means of quasi-stable relationships ...” (Garaventa et al., 2000 [8, p. 90], author's translation).
More recent conceptual developments, ascribable to modern aspects of company networks [d], could lead to possible developments of traditional districts towards innovative forms having their own technological identity, able to compete in global markets. For the genesis and optimum development of these innovative structures, the themes of governance and of the need for associationism amongst companies are very important; particularly the latter, for which: “(...) associationism amongst companies should be promoted, with the objective of a more qualified presence in the markets, and thus one should promote the formation of networks (...) comprising not only small and medium-sized companies in similar markets, but also companies in other districts having analogous or complementary characteristics, interested in presenting themselves in an adequate manner to large-scale markets ...” (Dioguardi, 2007 [6, pp. 143-144], author's translation). Clearly, the theme of governance becomes central for company networks (and networked-companies), especially where:
• their formation occurs through spontaneous processes (company networks);
• criticality occurs (or there is a risk of it occurring) during useful life-cycles, but also due to more general requirements regarding the definition and realisation of goals (mission) and related management issues.
In this sense, the existence of a visible hand (a coherent substitute for the invisible one of the market evoked by Adam Smith), deriving from the professional competence of the managers and from suitable regulatory strategies for the sector, ensures governance: thus acting as an observer in the systemic sense, and an active one, being an integral part of the processes occurring (Minati and Pessa, 2006 [12, pp. 50-55]).
[d] Such company networks “(...) lead to novel aspects of economic analysis which form units at a third level, in addition to individuals representing first-level and companies second-level elements” (Dioguardi, 2007 [6, p. 138], author's translation).
Further systemic connotations of networked-companies lie in the maximization and, at least in the most advanced cases, the optimization of the interactions amongst the component companies (agents); these interactions, moreover, are typical of a company collective being, expressing the ability to learn, accumulate know-how, follow a strategy, possess style, leadership, etc.: it follows that it possesses intelligence (or better, collective intelligence) in terms of, for example, the ability to make choices on the basis of information and accumulated know-how, to elaborate strategies, etc., also when faced with peculiarities of context (Minati and Pessa, 2006 [12, pp. 110-134, 372-374]) [e]. The explicit role played by the latter already alludes to the significant effects thus produced (but also, as will be seen, to the synergies which arise) regarding the ability/possibility of the companies to make choices, and thus to assume suitable behavior, to follow suitable strategies, etc.: the set of contextual peculiarities is thus considered as an agent (in a dialogical sense with rules and general aspects) in the optimum development of behaviors, company strategies, etc., coherently with innovative theories; these do in fact acquire, precisely because of this (and of the infradisciplinary aspects which it carries), continuous refinement and specification. In resolving the make-or-buy dichotomy, prevalently by way of orientation towards productive decentralisation (whilst keeping strategic activities internal, avoiding a drift towards hollow corporations), the networked-company also stresses its own behavior as an open system: precisely, at least on the basis of its productive performance, in the sense of being able to continually decide amongst various levels of openness or closure with respect to its own context (Minati and Pessa, 2006 [12, pp. 91-124]).
The nature of the latter, and the decisions regarding the ways in which one can relate to it, also induce within the company the possibility of adaptive flexibility (minimal where tendencies toward systemic closure prevail). Such strategies then evolve towards possible forms of dynamic flexibility, such that the company, provided it is suitably prepared in this sense (Tangerini, in Nicoletti, 1994 [13, pp. 387-392]), not only receives market input but modifies and develops it (Garaventa et al., 2000 [8, pp. 125-137]): for example, by anticipating unexpressed requirements, satisfying
[e] The concept of intelligence in company collective beings, coherently with a simplistic definition of the former, may be considered as corresponding to, for example, the ability to find the right answers to questions, assuming (or considering) that the right answers are not so much the true ones as the more useful ones (or rather, those that work). In this sense one can attribute intelligence to collective beings: the intelligence of flocks, swarms, companies, etc., is manifest in the specificity of their collective behavior, where only collectively (as opposed to the inability of the individual members) are they capable of solving problems (Minati and Pessa, 2006 [12, pp. 116-125]).
latent needs, etc., while nevertheless incurring, in an evident manner, unacceptable risks of manipulating the processes of demand formation, of developing induced needs, etc.; the inhibition of such risks, even through governance and shared ethical codes, is therefore necessary (Minati and Pessa, 2006 [12, pp. 336-346]). Thus, there are mutual company-context influences, synergic modifications between them, following non-linear and recursive processes [f]. Above all, the existence of such interactions, their attributes, and the implicit nature of their connotations (closely related to the character and peculiarities of the company and of the other actors involved, and to the context in which they occur) require the use of infradisciplinary applications and improvements for:
• an effective interpretation of such emergent phenomena (Minati and Pessa, 2006 [12, pp. 98-110]),
• the most efficient possible management of the latter.

3. Networked-company in the Building Industry

Specific aspects of the building sector (illustrating in particular its distinctive character compared to other areas of industrial activity, and its direct interference with company organizational models) can be summarized, amongst others, by (Garaventa et al., 2000 [8, pp. 27-40] and Sinopoli, 1997 [16, pp. 46-65]):
• relationships with the particularities of contexts (environmental, terrain, etc.) [g];
• technical and operational activities always carried out in different places;
[f] Non-linear processes, typical of complex systems (distinguished, moreover, by exchanges with the external environment), show behaviors which cannot be formulated in terms of a (linear) function f(x) such that: f(x+y) = f(x) + f(y) and f(a·x) = a·f(x). A formulation of recursive processes, typical of autopoietic organisations (which produce themselves), can occur by means of a program (p) expressed in terms of itself, so that its execution leads to the application of the same algorithm to the output of the previous stage. A recursive program calls itself, generating a sequence of calls which ends on reaching a given condition, a terminating condition.
[g] “Due to the fact of being a building, which occupies a significant portion of land over very long periods of time (...) the product of the construction process has to face up to the problem (...) of relating to the characteristics of its own context: the physical ones (climate, exposure, geology, meteorology), the environmental and historical ones (...) The relationship with its context ensures that the product of the building process adds to its economic role a series of cultural and symbolic meanings, and that the agents involved in this process have to come to terms with a discipline (which industrialists almost never have to face) which deals precisely with these specific meanings, that is, architecture ...” (Sinopoli, 1997 [16, p. 48], author's translation).
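The two definitions in note [f] can be made concrete with a small sketch. The particular functions below are illustrative assumptions: a numerical check of the additivity and homogeneity conditions detects non-linearity, and a recursive program re-applies the same algorithm to its previous output until a terminating condition is reached.

```python
# Illustrative only: f_nl is an arbitrary non-linear map chosen to show
# how the linearity conditions of note [f] can fail at sample points.

def is_linear_at(f, x, y, a):
    """Check f(x+y) == f(x) + f(y) and f(a*x) == a*f(x) at given sample points."""
    return f(x + y) == f(x) + f(y) and f(a * x) == a * f(x)

f_lin = lambda x: 3 * x        # linear: both conditions hold everywhere
f_nl = lambda x: x * x         # non-linear: (x+y)**2 != x**2 + y**2 in general

print(is_linear_at(f_lin, 2, 5, 4))   # True
print(is_linear_at(f_nl, 2, 5, 4))    # False

# A recursive program p: the same algorithm is applied to the output of
# the previous stage, stopping when the terminating condition is reached.
def p(state):
    if state <= 1:             # terminating condition
        return state
    return p(state // 2)       # recursive call on the previous stage's output

print(p(37))                   # halves repeatedly until the condition holds → 1
```

Note that passing the two checks at a few sample points does not prove linearity; failing them at even one point, however, suffices to establish non-linearity, which is the sense in which the footnote's conditions characterize complex-system behavior negatively.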
• the unique nature of the building;
• a tendency towards high costs of construction;
• the maintenance, and often the increase, of the economic value of the end products over the years;
• the presence of fractionated leadership in the management of single initiatives;
• the existence of temporary multi-organisations during the management of activities (as in other events and areas of activity, such as theatre, football or rugby matches, etc.).
The significant impact of the building industry upon regulations at the urbanistic and territorial-planning level, and upon the multiple particularities of the various contexts, tends to place it in a special position within the macroeconomic scenario: there emerges, for example, the need for extremely detailed institutional regulations (Garaventa et al., 2000 [8, pp. 37-41]), with related interests and multi-disciplinary values (social, economic, cultural, etc.). In the building sector, technical and operational activities are always carried out in a discontinuous manner, in different places (building sites) and burdened with significant risks (the unforeseeable nature of climatic and environmental factors, etc.), supplying unique end-products: thus excluding rigorous programming of operational activities, any meaningful production standardization, etc. The tendency towards high costs of the finished product, and the relatively long periods necessary for design and production, often lead to burdensome outlays of economic resources by the construction companies, with a consequent heavy negative cash-flow for the latter which usually persists over the whole period of the contract. The maintenance (and often the increase) over time of the economic value of the end-products also explains the lack of interest of those involved in this sector (compared to other sectors of economic activity) in aspects of productivity, technical innovation, etc. [h]: emphasis is, in fact, placed upon factors more ascribable to rent income than to industrial profit (or the optimization of productive activities).
[h] The building industry, in fact, “(...) is ‘a strange world of suspicious people (...) who often reject (...) innovations which upset (...) behaviour and (...) habits’ (Sinopoli, 1992 [15, p. 12], author's note and translation) ... The mistrust is (...) deeply rooted, since this community has accumulated millennia of experience, producing great works using almost all types of material available in nature (...) This ‘strange world’, thus, takes from experience its criteria for evaluating possible innovations, accepting external stimuli only through a long and not very clear process of ‘metabolism’ through which novel proposals are compared with the order of a system dominated by the persistence of intrinsic conditions, such as (...) the specific nature of the product and of the ways of producing it ...” (Maccolini, in AA. VV., 1996 [1], author's translation).
Unlike other industrial activities, where in almost all cases a single agent (the managing director) guarantees, through her/his own staff, the management and control of the various sub-processes (analysis of demand and of the market, product design, construction, marketing, etc.), in the building industry the functions of leadership are fractionated and assigned to a number of agents (designer, builder, etc.); the latter, moreover, are only formally coordinated by the client (who usually does not possess any significant professional competence). Even when some of those agents (often the construction company, especially in Italy, and notwithstanding initiatives taken to follow European Directives) take on a central role in the productive processes (Garaventa et al., 2000 [8, pp. 74-76, 120]), this situation often leads to conflict, legal wrangling, etc., heightened, moreover, by the existence of temporary multi-organizations. The latter (Sinopoli, 1997 [16, pp. 60-65]):
• are formed for the period necessary for managing a given activity (clearly excluding the possibility of accumulating any common experience, as it is unlikely to be usable on successive occasions);
• are composed of organizations (or agents, such as designers, companies, suppliers of materials and components, etc.) which, although each is independent, make decisions which depend upon (or interact with) those taken by the others.
Situations thus emerge, endogenous to the building sector, which, well beyond the peculiarities ascribable to various social, local or other aspects, lead to structural dissimilarities between processes and production in the building sector and those of other sectors of industrial activity. Together with other so-called non-Fordist sectors (not in the historical sense), the building sector certainly possesses original modes of production and accumulation (Garaventa et al., 2000 [8, p. 28]).
Symptomatic of this, for example, especially during its phases of greatest development, are the significant capital-earning ability but also the low productivity of the work done (which, as is well known, contributes much less to profit formation than in other industrial sectors). The character and specific nature of the building sector thus become the aspects and questions to be faced in order to develop networked-companies in this industry. Above all, governance as a consequence of public policies, besides, naturally, the visible hand of a management possessing a business culture suitable for the building industry, will ensure sufficient compatibility with:
• the significant implications for urbanistic and territorial layouts,
• the processes of the formation and satisfaction of demand,
• harmonic developments (Giallocosta and Maccolini, in Campagnac, 1992 [2, pp. 131-133]) of supply and of the markets (general companies having responsibility for production and orchestration, specialist companies, independent companies active in maintenance, micro-retraining, renovation, etc.).
Naturally, the importance of the existence of business ethical codes is also clear, especially:
• in dynamic flexibility strategies,
• in make-or-buy optimizations which damage neither the competitiveness nor the independence of the supplier network (Giallocosta and Maccolini, in Campagnac, 1992 [2, pp. 131-133]).
More generally, the requirements of governance seem to take on the aspects of an active observer (as recalled above, in a precisely systemic sense), but with leading roles, nature and attributes in the face of the peculiarities of the sector. The latter also concern, as mentioned above, particular methods relating to models of operational organization, planning activities, standardization (only as a tendency, and of limited reliability), etc. Above all, informal procedures become particularly important within the organizational structures of the sector. This phenomenon is of particular importance in the Italian situation: the significant “(...) presence of informal procedures, which often spill over (...) into aspects which are, at least, problematic within a normal framework of productive efficiency (...), does not, however, hinder the realization of buildings of a good qualitative level, especially for small and medium-sized buildings without particularly complex plant (...) One often observes, for instance, the high quality of the finishing of buildings in Italy (work in which precisely those informal procedures are the most widespread - author's note), with respect to the situation in France or in Britain” (Garaventa et al., 2000 [8, p. 93], author's translation). In this country, the “(...) companies work mainly by using techniques and rules of the trade learnt at the individual level (...) This is personal individual knowledge, rather than know-how or operational procedures developed by the company (...) The lack of any formalized rules of the trade leads to processes of professional training during the work itself (...)
which produces a certain homogeneity of the agents at all levels (...) Thus the development of the responsibility of the operatives, (...) their common hands-on training, ends up generating a significant understanding amongst the various operatives. This allows a good product from a largely non-formalized context (...) The operatives seem to have a unity of intention (...) which (...) renders possible the realization
of buildings of acceptable quality” (Garaventa and Pirovano, 1994 [7], cit. in Garaventa et al., 2000 [8, p. 94], author's translation). In this sense, that particular behavior as a collective being derives from the sharing of a cognitive model taken up through training activities which are often not formalized (especially in Italy) but which can establish common behavioral micro-rules (Minati and Pessa, 2006 [12, pp. 110-111]):
• notwithstanding discontinuous and heterogeneous experience,
• provided that customary forms of intervention exist [i].
Thus, in networked-companies in the building industry and, more generally, in the multiple forms of business aggregation to be found there, the dual organizational order, formal and informal, typical of socio-technical systems develops:
• where the latter translates the set of unwritten rules through the former, but originating from the distinct personalities of the operators, and is thus decisive in reaching successful conclusions or in determining unsuccessful ones (Dioguardi, 2005 [5, pp. 87-89]),
• with particular emphasis upon the peculiarity and importance of the informal organization, given the analogous peculiarities of the sector.
Here, therefore, amongst other factors distinguishing the sector from other areas of economic activity, any coherent development of networked-companies requires:
• validation and optimization of the work done, even as effects of informal procedures (insofar as they can be ascribed to compatible types of intervention);
• maximization of operational flexibility (for the work done and/or differentiated activities).
[i] “The unresolved problem of the Italian building sector is that the unity of intention under these conditions is only possible for traditional working practices and for relatively simple buildings. In large and complex buildings (...) the operatives, especially at the operational level on the building site, lose the overall view of the job (...) and, with that, the unity of intention (...) In an analogous manner, the technical quality required for innovative building work cannot be reached using the unwritten rules of the trade (...) On the other hand, the efforts being made to formalize processes and technical rules have the effect of destroying the context which generates artisan know-how and the unity of intention: (...) the Italian building industry finds itself in a difficult and contradictory situation ...” (Garaventa and Pirovano, 1994 [7], cit. in Garaventa et al., 2000 [8, p. 94], author's translation).
G. Giallocosta
Similarly, the diseconomies which still burden the processes of producing buildings, and which to different extents penalize the operators involved (negative cash-flows for the companies, high costs and prices for the customers, users/clients, etc.), demand innovation capable of significantly reducing these phenomena, notwithstanding the emphasis still placed upon rent earnings. In this sense, and also for questions of a more general nature (Giallocosta and Maccolini, in Campagnac, 1992 [2, pp. 131-133]), networked-company models above all, as validating procedures of productive decentralization in the building industry, require regulatory activities regarding:
• appropriate limits and well-defined conditions of suitability for such procedures;
• policies to contain the costs of intermediaries, often exorbitant and inherent in such activities.
Clearly, there is also a need for more innovative tendencies which:
• optimize quality-cost ratios of the work done,
• put into place shared rules and formal procedures, especially for complex and technologically advanced activities (where competition is ensured through the interchangeability of know-how and operators).
For the latter, naturally, there are aspects common to other industrial sectors, but, for the reasons outlined above, they acquire particular importance in the building industry and thus require appropriate governance.
4. Conclusions
Networked-company models are emblematic of the infradisciplinary aspects of the systemic approach. Within these models one can, in fact, verify the effectiveness of the most recent theories of business organization, typical of the late-industrial era. At the same time, however, given the current specific aspects of the building sector, there are other peculiarities ascribable to:
• networked-companies in the building industry,
• their consistent developments.
Thus, the infradisciplinary contributions to systemics (as adjuvant activities which process phenomena having multiple components within a system) do not
lead to reductionism [j], as long as there are no mistaken assumptions of centrality (Diderot and d’Alembert, 1772 [3], cit. in Dioguardi, 2005 [5, p. 45]). Moreover, within the terms mentioned above, the harmonious deployment of transdisciplinarity, interdisciplinarity and infradisciplinarity becomes essential. As observed above, a networked-company in the building sector provides an exemplary case of this.

[j] Reductionism is intended here above all as the unmanageability of emergent phenomena caused by unsuitable convictions about the exhaustive nature of praxis and of cognitive models centered only upon specific details and particularities.

References
1. AA. VV., Nuove strategie per nuovi scenari (Bema, Milan, 1996).
2. E. Campagnac, Ed., Les grands groupes de la construction: de nouveaux acteurs urbains? (L’Harmattan, Paris, 1992).
3. D. Diderot, J.B. d’Alembert, in Enciclopedia o dizionario ragionato delle scienze, delle arti e dei mestieri, 1772 (Laterza, Bari, 1968).
4. G. Dioguardi, L’impresa nella società di Terzo millennio (Laterza, Bari, 1995).
5. G. Dioguardi, I sistemi organizzativi (Mondadori, Milan, 2005).
6. G. Dioguardi, Le imprese rete (Bollati Boringhieri, Turin, 2007).
7. S. Garaventa, A. Pirovano, L’Europa dei progettisti e dei costruttori (Masson, Milan, 1994).
8. S. Garaventa, G. Giallocosta, M. Scanu, G. Syben, C. du Tertre, Organizzazione e flessibilità dell’impresa edile (Alinea, Florence, 2000).
9. P. Gianfaldoni, B. Guilhon, P. Trinquet, La firme-réseau dans le BTP (Plan Construction et Architecture, Paris, 1997).
10. J.W. Goethe, Faust (Penguin Classics, Middlesex, 1975).
11. P. Lorenzen, Ed., Konstruktive Wissenschaftstheorie (Suhrkamp, Frankfurt, 1974).
12. G. Minati, E. Pessa, Collective Beings (Springer, New York, 2006).
13. B. Nicoletti, Ed., Management per l’edilizia (Dei, Rome, 1994).
14. R. Pietroforte, E. De Angelis, F. Polverino, Eds., Construction in the XXI Century: Local and global challenges (Edizioni Scientifiche Italiane, Naples, 2006).
15. N. Sinopoli, L’innovazione tecnica nelle costruzioni, in Sinopie, 6 (1992).
16. N. Sinopoli, La tecnologia invisibile (Angeli, Milan, 1997).
SYSTEMIC OPENNESS OF THE ECONOMY AND NORMATIVE ANALYSIS
PAOLO RAMAZZOTTI
Dipartimento di Istituzioni Economiche e Finanziarie, Università di Macerata
via Crescimbeni 20, Macerata, Italy
E-mail: [email protected]

The paper discusses economic analysis as a normative – as opposed to positive – science. Contrary to conventional economics, it argues that the economy does not consist of markets alone and that both the economy and markets are open systems. The organization of markets and other economic activities therefore depends on the interaction between the economy and the rest of society. What configuration holds in practice is a matter of public policy. In this perspective, public policy is an intrinsic part of economic analysis, not something that follows once the economy has been investigated. The paper also argues that markets have a rationale of their own. As a consequence, public policy must define – or co-determine – the appropriate economic configuration not only by acting upon the institutional setup of markets but also by identifying those sections of the economy that should be coordinated by markets and those that should resort to other economic institutions.

Keywords: openness of economy, markets as open systems, public policy.
1. Introduction This paper discusses economic analysis as a normative science. Contrary to conventional economics, it argues that since the economy does not consist of markets alone and both markets and the economy as a whole are open systems, the organization of markets and other economic activities depends on the interaction between the economy and the society they are a part of. What configuration holds in practice is a matter of public policy. In this perspective, public policy is an intrinsic part of economic analysis, not something that follows once the economy has been investigated. The paper also argues that markets have a rationale of their own. As a consequence, public policy must define – or co-determine – an appropriate economic configuration not only by acting upon the institutional setup of markets but also by identifying those sections of the economy that have to be coordinated by markets and those that have to resort to other economic institutions.
The paper is arranged as follows. The next section argues that, even in a very stylised model of the market, some political decisions are necessary concerning factor endowments and, in more general terms, property rights. This implies that, depending on which decision is actually taken, a whole range of market configurations is possible. Section 3 argues that the choice of configuration depends on the relation between the economy and the way it is perceived and understood by people. To this end, the section focuses on the characteristics of knowledge. It stresses the irreducibility of knowledge to a consistent system and how this feature may affect how people assess the economy. More specifically, the multiple facets of knowledge reinforce the possibility of a significant variety of economic setups. Section 4 contends that how the economy is organized is ultimately a matter of public action. This implies that economics cannot be viewed other than as a normative science. Economic inquiries that neglect the role played by the policy maker either rely on a semi-closed system view of the economy or implicitly assume that the only economy to be taken into account is the status quo. Section 5 provides the conclusions.

2. Capitalist markets as open systems

The conventional notion of a self-regulating market can be traced back to conventional economic theory. Walrasian general equilibrium is based on the assumption that when such “exogenous” variables as technology and preferences are given and known, and when resources are assigned to economic agents, a properly functioning price system provides all the information that is required in order to allocate those resources. Since technology, preferences and endowments are believed to be independent of how the market functions, the market itself can be viewed as a semi-closed system [a]: although it may be altered by exogenous shocks, it is a self-regulating system [b].
A (Walrasian) market is one where prices are determined by preferences, endowments and technology alone. Prices, however, reflect the assignments of property rights, which simply means that someone is assigned the right to use something independently of the possibly negative consequences that this use may have on third parties: if individual A eats her apple, individual B will not be able to eat it. The rule whereby A rather than B has the right to eat the apple –
[a] See Auyang (1988) [1] for a definition of semi-closed system.
[b] This view has been criticized on a number of accounts by a great many authors (see, for instance, Boulding 1968 [2], Georgescu-Roegen 1976 [7], Kapp 1976 [9]; see also Dow 1996 [6]). It is nonetheless appropriate to reassess it in order to appreciate its relation with the Polanyian themes that are discussed below.
even if B is starving and A is not – is anything but natural. It is determined according to some explicit or implicit decision. The assignment of the right – the related decision – is a political, surely not an economic, issue (Schmid 1987 [21]; Bromley 1989 [3]; Samuels, Schmid 1997 [20]; Medema, Samuels 2002 [15]) [c]. The implication of the above is twofold. First, even at the highly abstract level of a Walrasian economy, the market has a political dimension, which obviously contrasts with its claimed independence from other societal instances [d]. Second, depending on how property rights are assigned, a range of possible sets of relative prices is possible. In order for the price allocation mechanism to work, a decision has to be made concerning what the interests to be defended are, i.e. what the social priorities are [e]. Individuals make their economic choices under path-dependent circumstances that are associated with political factors. These circumstances lead to a price set which is only one out of many possible ones. It is the price set that reflects past and present social priorities as they emerge from the existing system of power. A different system of power would not constrain a given market: subject to the profit constraint, it would simply make the market function according to different priorities. Insofar as someone is entitled to charge a price for something, someone else is obliged to pay if she wants that something. Different price sets may be viewed, therefore, as leading to the different possible payoffs of a zero-sum game. There are instances where some sets of payoffs may be deemed superior to others, however. In terms of per capita income, for instance, some distributions may be preferable to others in that they favor a higher rate of income growth. Thus, economic policy – including the assignment of property rights – need not merely reflect the balance of power among conflicting interests.
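The claim that different assignments of property rights yield different sets of relative prices can be illustrated with a toy two-agent, two-good exchange economy. The sketch below is not part of the original paper: the Cobb-Douglas preferences, the numbers and the function name are illustrative assumptions.

```python
# Toy Walrasian exchange economy: two agents, two goods (x, y).
# Agent i has Cobb-Douglas utility a_i*ln(x) + (1 - a_i)*ln(y),
# so she spends the share a_i of her wealth on good x.
# With good y as numeraire (p_y = 1), market clearing for x gives
# a closed-form relative price p_x.

def equilibrium_price(shares, endowments):
    """shares[i]: expenditure share on x; endowments[i]: (e_x, e_y) of agent i."""
    total_x = sum(e[0] for e in endowments)
    # Demand for x: sum_i a_i * (p_x*e_xi + e_yi) / p_x  =  total_x.
    # Solving this linear equation for p_x:
    numerator = sum(a * e[1] for a, e in zip(shares, endowments))
    denominator = total_x - sum(a * e[0] for a, e in zip(shares, endowments))
    return numerator / denominator

shares = [0.8, 0.2]  # agent A values good x highly, agent B values good y

# Same total resources, two different assignments of property rights:
p1 = equilibrium_price(shares, [(8, 8), (2, 2)])  # wealth concentrated on A
p2 = equilibrium_price(shares, [(2, 2), (8, 8)])  # wealth concentrated on B

print(f"p_x when A is rich: {p1:.3f}")  # x is relatively expensive
print(f"p_x when B is rich: {p2:.3f}")  # x is relatively cheap
```

With identical preferences the redistribution would leave relative prices unchanged; it is the combination of heterogeneous preferences and alternative assignments of rights that produces distinct price sets, which is the point made in the text.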
It may also reflect a choice related to some notion of social welfare. The problem is how this social welfare should be defined, i.e. what metric ought to be used to assess social efficiency and the performance of the economy. The above considerations on the variety of price sets suggest that it is rather inappropriate to assess economic outcomes in terms of a price-based indicator.

[c] Efficiency – i.e. finding the best way to achieve some goal such as allocation or growth – is not distinct from that political decision. Reducing slack, for instance, involves a decision over the right that a worker has to take her time when she carries out a task.
[d] Markets are characterized by other institutions, e.g. those that affect the conduct of individuals and organizations. We shall not deal with these here.
[e] “The issue is not government versus not government, but the interests to which government is to lend its protection and the change of interests to which it gives protection” (Medema, Samuels 2002 [15, p. 153]).
Any trade-off would reflect the specific set of relative prices it is based on. So, in general terms, before choosing, one would first have to choose which set of relative prices is appropriate. Physical output may be just as misleading as a price-valued output: decisions concerning what to produce and how to produce it are based on relative prices, thus on those same circumstances that undermine the uniqueness and objectivity of price-based indicators. The information that a given market provides is based on the political decisions that underlie its institutions. Priorities based only on that information would be biased by the status quo. In other terms, any attempt to change a given market according to the priorities set out by that same market would be self-referential. The choice of the priorities that the economy should pursue, therefore, requires some value judgement. A benchmark that transcends those priorities is necessary.
Independently of how a specific market is arranged, however, market transactions are unlikely to occur if they do not meet the requirement of profitability. Despite differences in how a capitalist market is arranged, it is always based on the profit motive. The name of the “game” is profit. Market institutions should be distinguished, in this regard. On the one hand, the profit goal is a key institutional feature of any capitalist market. On the other, this feature needs to be qualified by a range of other institutions, which we have discussed above. These institutions are not given once and for all but depend on explicit or implicit choices as to what priorities or interests should prevail. The profit game may be either expanded to the point that it encompasses all of the economy or it may be restricted. The same political decisions that assign property rights may choose not to assign them, i.e.
they may choose that some good should not be treated as a commodity: this is the case when all members of a community are entitled to medical assistance, which is paid for through taxes. In such a case, medical assistance is not a commodity; it is an entitlement, i.e. a right that derives from being a member of a community. Political priorities underlie not only how the market works but also its boundaries. From this perspective, reliance on a profit-centred benchmark would imply the subsumption of society under the market rather than the other way round. Summing up, institutions determine property rights and entitlements, so they involve a value judgement concerning justice. From this point of view, the real obstacles to change would seem to be related to the political establishment – e.g. whether representative democracy works properly or not. The section that follows will focus on how a benchmark that transcends profit may emerge. This
will provide some insights on whether it is actually possible to separate the economic domain from the political one.

3. Knowledge as an open system

The previous section pointed out that the choice of the priorities that the economy must pursue transcends the market. It has to do with the relation between the society and the economy as well as with the role of the market within the economy. It therefore has to do with what the society values. Contrary to conventional economic theory, individuals are not able to process perfectly all the information required, nor is that information generally available. This means that, whether they have to assess a good they wish to consume or the general performance of the economy, they must avail themselves of some assessment criterion. This requires knowledge. The definition of knowledge is definitely controversial [f]. Drawing on Loasby (1991, 1999, 2005) [10-12], I refer to knowledge as a set of connections – a pattern of relationships – among concepts that is required to make sense of (sections of) reality [g]. Since nobody can take everything into account at the same time, it is part of the learning process to select what is supposed to be relevant, i.e. to trace boundaries between what needs further inquiry and what has to be discarded. How to do this depends on the goals and the aspiration level of the learning actor (Simon 1976) [25]. An aspiration level reflects individual idiosyncrasies as well as the cultural environment of the learning actor, i.e. the range of shared beliefs, interpretative frameworks and learning procedures that other actors in that environment accept. It ultimately is a value judgement concerning relevance. Although everything is connected to everything else, so that one might conceive of a unique learning environment, in practice actors must adapt to their limited cognitive abilities by learning within specific sub-environments [h]: family, school, religious congregation, workplace, trade union, etc.
Specific knowledge depends on the specific problems that arise in each environment and, possibly, in those that are contiguous to it [i]. How those problems are framed – i.e. how problem-solving activities are carried out – depends on the requirements and the priorities that arise within those environments: how you describe a brain depends on whether you are a butcher, an anatomopathologist, etc. (Delorme 1997, 2001 [4,5]). Market-related learning is constrained by profit, in that it would be useless for a businessman – in his capacity as a businessman – to learn something that does not generate an economic gain. Obviously, he may wish to read Shakespeare independently of business considerations – possibly to make sense of life – but, in so doing, he will be pursuing a different type of knowledge, which is unrelated to profit and presumably unconstrained other than by his background knowledge and by the characteristics of the learning process itself [j]: it could be associated with what Veblen referred to as idle curiosity.
In his attempt to make sense of life, an actor may distinguish preferences, which are associated with egoistic goals [k], from commitments, which are associated with non-egoistic goals – be they those of another individual, of another group or of an entire community [l] – or simply with ethical rules. What is important about this distinction is that there may be no common denominator between the two domains. As long as preferences and commitments do not interfere with each other, there may be no problem. Indeed, they may produce positive feedbacks, as may be the case when actors rely on non-egoistic rules in order to find a solution to the Prisoner’s Dilemma. When the two domains do interfere, the actor may face a conflict which is much like a moral dilemma [m]. An example might be an individual who carries out a specific economic activity – e.g. the production of armaments – that clashes with her ethical values – the non-acceptance of military conflicts as the solution to international or domestic disputes.
Owing to bounded rationality, knowledge may well involve coexisting yet potentially inconsistent views of reality. In a capitalist economy, the profit motive provides the rationale for most economic transactions. It therefore underlies views of how to conduct business and of economic activity in general. These views may turn out to be inconsistent with what actors view as appropriate from other perspectives, e.g. ethical, religious, etc. [n]. Preferences and values associated with potentially inconsistent domains may coexist for quite a long time without interfering with each other. Consequently, each domain is likely to lead to the emergence of domain-specific institutions. Markets may therefore coexist with institutions that transcend them: clubs, churches, political parties, etc.
Institutional setups are not merely instrumental to the solution of specific problems. Once they are established they easily become a part of the reality that actors take for granted: they are internalized. This cognitive dimension of institutions (Zucker 1991 [29]; Scott 1995 [22]) suggests that it may not be easy to conceive of their dismantling or to envisage an alternative. One implication of the above discussion is that there is no single “game” being played, and quite a few sets of rules may coexist, interact and sometimes clash. The institutional setup that underlies the market may or may not be consistent with ethical values, as well as with institutions that are associated with those values. Thus, sweatshops may be consistent with profit and its related institutions – e.g. firms, stock markets, etc. – but may be deemed unacceptable from a range of ethical perspectives and inconsistent with the rules associated with their related – e.g. religious, human-rights and political – institutions.

[f] The variety of approaches to the topic emerges in a recent “Symposium on Information and Knowledge in Economics” in the April 2005 issue of the Econ Journal Watch. A discussion of knowledge and public policy is in Rooney et al. (2003) [19].
[g] “A specific report can provide information only if it can be connected to something else, and it is unlikely to provide much information unless this ‘something else’ is a pattern of relationships—how some things fit together. Such patterns constitute what I call knowledge. Knowledge is a set of connections; information is a single element which becomes information only if it can be linked into such a set.” (Loasby 2005 [12, p. 57]).
[h] These environments are the subsystems of what Simon (1981) [26] referred to as a semidecomposable system.
[i] The importance of contiguity is stressed by Nooteboom (1999) [16].
[j] See the distinction that M. Polanyi (1962) [18] provides of different learning processes and of how they can be more or less restricted by the bounds that characterize them.
[k] Preferences may also include sympathy, which occurs when A’s well-being depends on B’s well-being. See Sen (1982) [23].
[l] “Non-egoistic reasons for choosing an action may be based on ‘the possibility of altruism’. They can also be based on specific loyalties or perceived obligations, related to, say, family ties, class relations, caste solidarity, communal demands, religious values, or political commitment.” (Sen 1986 [24, p. 344]).
[m] The typical example of a moral dilemma is Agamemnon, forced to choose between losing his army and losing his daughter. A similar concept in psychology is cognitive dissonance, which arises when an individual is unable to cope with information that is inconsistent with her strongly held views.
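The role of non-egoistic rules in escaping the Prisoner’s Dilemma, mentioned above, can be made concrete with a standard payoff matrix. This is an illustrative sketch: the payoff numbers are the usual textbook values, not taken from the paper.

```python
# One-shot Prisoner's Dilemma with the usual textbook payoffs.
# Moves: 'C' (cooperate) and 'D' (defect).
PAYOFF = {  # (row player's payoff, column player's payoff)
    ('C', 'C'): (3, 3),
    ('C', 'D'): (0, 5),
    ('D', 'C'): (5, 0),
    ('D', 'D'): (1, 1),
}

def egoistic_best_reply(opponent_move):
    """A preference-driven actor maximizes her own payoff only."""
    return max('CD', key=lambda m: PAYOFF[(m, opponent_move)][0])

# Defection is a dominant strategy for egoistic actors...
assert egoistic_best_reply('C') == 'D' and egoistic_best_reply('D') == 'D'
egoistic_outcome = PAYOFF[('D', 'D')]    # both defect: (1, 1)

# ...while actors bound by a shared commitment ("always cooperate") reach
# the jointly superior outcome, although cooperating is not a best reply.
committed_outcome = PAYOFF[('C', 'C')]   # both cooperate: (3, 3)

assert sum(committed_outcome) > sum(egoistic_outcome)
```

The point of the sketch is the one made in the text: the cooperative outcome is unreachable from egoistic preferences alone, but becomes available when both actors follow a non-egoistic rule that overrides best-reply reasoning.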
From a policy perspective, this suggests that making sense of, and somehow dealing with, these inconsistencies should be a priority. Thus, although complexity requires that we devise simulation models that take account of multiple interactions and non-linearities (Louie, Carley 2007 [13]; Law, Kelton 1999 [14]), so as to appreciate the dynamics of a system, the key point of the paper is that these should not be viewed as ever more sophisticated deterministic models. Inconsistencies involve choices and degrees of freedom within the models. They stress that the distinction between positive and normative economics is generally misleading.

[n] Ethics exists in business as well as in other domains of life. It is part of the market-related institutional setup. However, when I refer to ethics in what follows, I refer to values that are independent of market-related activities.
A second implication is that knowledge about one’s reality may consist of a set of autonomous sub-systems, but the boundaries between these sub-systems are never given once and for all. They may be reassessed. So, although actors may accept a distinction between the economic and other domains, thereby adapting to, and situating themselves within, a given economic and societal setup, they may also evaluate those very setups and desire to change them. This involves confronting the profit motive with other values. More specifically, it involves confronting the alternative ways that society can resort to in order to materially reproduce itself [o]. The conclusion the above discussion leads to is that, although economic performance is usually assessed in terms of how profit-centered transactions within markets provide for the allocation of resources, for the rate of growth, or for accumulation, this type of assessment may be misleading for two reasons. First, societal values that clash with market-related values may eventually undermine social cohesion, thereby disrupting economic, as well as social, relations. Second, in so far as the economy is a sub-system of society, its performance should not be expected to prevail a priori over society’s overall performance, however assessed. At the very least they should be on the same standing. More generally, one would expect that the economy’s performance should be assessed in terms of society’s value system rather than in terms of its own criteria. Taking account of societal dimensions such as justice and care [p], however, may be problematic, as the next section will argue.

4. Systemic openness and public policy

Section 2 argued that political choices determine rights, which – together with other institutions – determine the structure of prices and the composition and amount of output and investment. The resulting institutional structure acts upon the choice sets of economic actors.
This is the context where actors learn about the economy and learn to interact in compliance with the constraints that the extant market provides. As the discussion of knowledge in the previous section argued, however, learning actors generally transcend the economy and pursue a knowledge that is independent of market constraints. The interaction between the societal value system that this knowledge leads to and the economy allows for a great variety of potential economic and societal setups. Which one occurs in practice depends on the values that eventually prevail, either explicitly or implicitly.
The above discussion on how markets are structured stressed that the very assignment of property rights affects distribution. It is therefore reasonable to expect that different stakeholders within the economy will try to defend their vested interests or shift the balance of economic power to their advantage. Any economic analysis that acknowledges the openness of a market economy must take into account how different interests may affect the overall performance of the economy. The assumption that the economy should not be interfered with is tantamount to implicitly accepting the balance of economic power that is determined by the status quo. A more active policy generally changes such a balance. An appropriate policy procedure would require the explicit formulation of choices. Any policy reasonably has to decide what weight it must assign to each type of economic activity, thus what boundaries there should be between the market, the welfare state and a broadly defined non-profit sector (including families). A strictly interrelated task is to define the characteristics of these economic sectors, thus how they are expected to interact with each other. It is not enough, however, to argue in favor of principles such as redistribution and reciprocity as if they were alternatives to the market. Depending on a variety of circumstances, they may be either alternative or complementary. The relation between market and non-market values and priorities may vary. In some instances, it may be a positive one. Thus, a rise in employment may be functional to a rise in output, quite independently of any value judgement in favor of full employment per se, and redistribution and reciprocity may be functional to the profitability of the market independently of any reference to social justice or care.

[o] Polanyi (1957) [17] specifically refers to contracted exchange, redistribution and reciprocity as the three available alternatives.
[p] Van Staveren (2001) [28] links the value domains of care and justice to giving – which corresponds to Polanyi’s notion of reciprocity – and distribution.
Post Keynesian economics, for instance, has stressed the relation between distribution and growth, especially emphasizing that a more balanced distribution of income generally has a positive effect on the level of income itself; welfare provisions such as schooling and public health typically create positive externalities that favor economic growth; as for reciprocity, while charities prevent the effects of the market economy from occurring in their most dramatic form, they also help the market in that they prevent social unrest, which would undermine economic activity. On the other hand, distribution may also affect profits – thus investment decisions and growth – negatively, as Kalecki (1943) [8] pointed out. Restrictions on polluting industries may reduce profitability. Under some circumstances – which mainstream economic thought tends to view as permanent – public expenditure
may displace private investment. Any help to the poor may be claimed to provide a disincentive to work, as the defenders of workfare policies contend. The three forms of integration and their related institutions may, therefore, be mutually consistent or inconsistent. What is more, inconsistency may exert its negative consequences even when it does not occur. In a capitalist market economy beliefs affect economic decisions, especially investment, in a significant way. So it may suffice for business just to believe that accumulation is precluded for that expectation to fulfil itself. Uncertainty may cause economic disruption quite independently of action by business to defend its vested interests. Change in how markets are structured may affect those perceptions, leading to reactions that range from uncertainty to cognitive dissonance. The implication is that, although the priorities underlying the inception of change may be generally accepted, the process may lead to the perception of unexpected and unwanted institutional inconsistencies. This is a critical issue in the light of the question: who is to choose? Owing to complexity, actors may change their minds in the process. The bandwagon effects of uncertainty may reinforce these effects. Policy must co-evolve with the parties involved. It must induce institutional change, but unless it allows actors to change their perception of the economy as institutions change, it may determine two types of negative reaction. First, change may not be perceived, so that actors continue behaving as if nothing had happened: for instance, provisions in favor of the weaker sections of society (e.g. consulting rooms provided by the welfare state) may remain under-used, to the advantage of the market for private, costly and often inappropriate services (e.g. abortions). Second, opposition to what is perceived as institutional disruption might lead to reactions that recall Luddite opposition to technological change.
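The self-fulfilling role of business beliefs described above can be sketched in a minimal expectations-driven multiplier model. This is an illustrative toy, not the author’s model: the multiplier form, the parameter values and the adaptive belief rule are all assumptions made for the sketch.

```python
# Toy model of self-fulfilling expectations: firms invest heavily only if
# they expect high demand; realized income is a simple Keynesian multiple
# of investment; beliefs are then revised in the light of realized income.

def simulate(optimistic, periods=20, c=0.5, i_high=40.0, i_low=10.0,
             threshold=50.0):
    """Return final income for an economy starting with the given belief."""
    expect_high = optimistic
    income = 0.0
    for _ in range(periods):
        investment = i_high if expect_high else i_low
        income = investment / (1.0 - c)    # multiplier: Y = I / (1 - c)
        expect_high = income >= threshold  # adaptive belief revision
    return income

boom = simulate(optimistic=True)    # optimism validates itself: Y = 80
slump = simulate(optimistic=False)  # pessimism validates itself: Y = 20

# Same fundamentals, two self-confirming outcomes.
assert boom > slump
```

Under these assumptions the two runs differ only in the initial belief, which is the sense in which it may suffice for business to believe that accumulation is precluded for that expectation to fulfil itself.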
While a utility maximizer who learns only in order to achieve her goal would not be concerned about the general effects of the policy being carried out – she would merely focus on (micro) adaptation – a learning actor who can abstract from specific circumstances of time and place may judge policy at the macro level and either support or oppose it. A bi-directional relation must therefore exist between social actors and policy makers. The former must be aware of what change is occurring and how it may impinge on their lives. The latter must achieve change through consent, which means both that they must prevent actors from perceiving change exclusively in terms of social disruption and that they must be aware of which changes matter most in the eyes of the social actors.
Systemic Openness of the Economy and Normative Analysis
Along with the bi-directional relation between social actors and policy makers, social actors must interact with each other. Precisely because they may wish to change their economic and societal environment, inconsistencies may arise among the metrics each of them adopts. In order to overcome these inconsistencies actors must be able to carry out appropriate search processes, that is, to learn – a policy implication from section 3. In doing so they must also interact with others in order to achieve a generally shared view of what the appropriate metric should be.

5. Concluding remarks

Systemic openness characterizes all markets: they could never work properly if they were not embedded in a broader (institutional) environment. Markets and institutions are interrelated. Not all institutions, however, are functional to the market: some arise within extra-economic domains and may well be inconsistent with the institutions underlying the market, as well as with the profit motive that characterizes modern capitalism. The issue society has to deal with, therefore, is how to avoid institutional inconsistency. This involves choosing what relation must exist between the market, the economy and other societal institutions.

This choice requires a view of how things are and of how they ought to be: it requires knowledge of the reality people are a part of. Knowledge, however, is also an open system: people cannot separate, once and for all, their economic lives from their ethical lives. At the same time, they cannot keep everything together, because they are boundedly rational: they cannot have all-encompassing and consistent views of what is appropriate. Quite the contrary: inconsistencies may occur within individuals as well as among them. A priori there is no reason to believe that economic constraints, which are not technically neutral but ultimately depend on discretionary decisions, should prevail over other societal requirements.
Similarly, there is no reason to believe that the status quo is preferable to other situations. Economic analysis must, therefore, investigate how to direct the economy towards society's ends; it must deal with normative issues. If public policy is concerned with the overall quality of life of the members of society, it must allow them to overcome the inconsistencies discussed above. It must therefore take into account that, along with the values underlying the functioning of the market, a range of different values exists, and only the members of society can judge what the priorities are. But, in order to choose, these members must be provided with the preliminary requirements for free choice.
P. Ramazzotti
The issue is not figuring out what people want but giving rise to a process that will lead people to learn how and what to choose. The process of change that such a policy determines may well lead actors to perceive new inconsistencies. Actors who initially favor one type of change may eventually change their views about what the priorities are. This is consistent with the assumption that actors are not substantively rational and that they learn as their environment evolves. It implies that the choice of priorities is an ongoing process that requires interactive learning and dialogue between policy makers and the actors involved, as well as among the latter.

The general conclusion this discussion leads to is that democracy matters for normative economics. Democracy may be a means for an ongoing learning process by social actors – one that eventually leads to appropriate choices – or it may be a mere counting of votes. Similarly, when institutional inconsistencies prevent governments from choosing, the solution may consist in allowing society to deal with those inconsistencies – at the risk of some social instability – or in restricting the action of minorities and dissenters. The type of action that governments take eventually affects the subsequent ability of society to actually choose the relation between its general values and economic ones, as well as between the status quo and other alternatives.

References

1. S.Y. Auyang, Foundations of Complex-System Theories (Cambridge University Press, Cambridge, 1998).
2. K.E. Boulding, Management Science 2(3), 197-208 (1956); also published in: K.E. Boulding, Beyond Economics. Essays on Society, Religion and Ethics (University of Michigan Press, Ann Arbor, 1968).
3. D.W. Bromley, Economic Interests and Institutions – The Conceptual Foundations of Public Policy (Blackwell, New York, 1989).
4. R. Delorme, in Beyond Market and Hierarchy, Ed. A. Amin and J. Hausner (Elgar, Cheltenham, 1997).
5. R. Delorme, in Frontiers of Evolutionary Economics. Competition, Self-Organization and Innovative Policy, Ed. J. Foster and J.S. Metcalfe (Elgar, Cheltenham, 2001).
6. S.C. Dow, The Methodology of Macroeconomic Thought: A Conceptual Analysis of Schools of Thought in Economics (Elgar, Cheltenham, 1996).
7. N. Georgescu-Roegen, in Energy and Economic Myths. Institutional and Analytical Economic Essays, Ed. N. Georgescu-Roegen (Pergamon Press, New York, 1976).
8. M. Kalecki, Political Quarterly 14 (1943).
9. K.W. Kapp, in Economics in the Future: Towards a New Paradigm, Ed. K. Dopfer (Macmillan, London, 1976).
10. B.J. Loasby, Equilibrium and Evolution. An Exploration of Connecting Principles in Economics (Manchester University Press, Manchester, 1991).
11. B.J. Loasby, Knowledge, Institutions and Evolution in Economics (Routledge, London, 1999).
12. B.J. Loasby, Econ Journal Watch 2(1), 56-65 (2005).
13. M.A. Louie, K.M. Carley, The Role of Dynamic-Network Multi-Agent Models of Socio-Political Systems in Policy, CASOS Technical Report (2007), http://reports-archive.adm.cs.cmu.edu/anon/isri2007/CMU-ISRI-07-102.pdf
14. A.M. Law, D.W. Kelton, Simulation Modelling and Analysis (McGraw-Hill, New York, 1999).
15. S.G. Medema, W.J. Samuels, in Economics, Governance and Law. Essays on Theory and Policy, Ed. W.J. Samuels (Elgar, Cheltenham, 2002), pp. 151-169.
16. B. Nooteboom, Cambridge Journal of Economics 23, 127-150 (1999).
17. K. Polanyi, in Trade and Market in the Early Empires: Economies in History and Theory, Ed. K. Polanyi et al. (The Free Press, New York, 1957), pp. 243-270.
18. M. Polanyi, Personal Knowledge. Towards a Post-Critical Philosophy (Routledge, London, 1962).
19. D. Rooney et al., Public Policy in Knowledge-Based Economies. Foundations and Frameworks (Elgar, Cheltenham, 2003).
20. W.J. Samuels, A.A. Schmid, in The Economy as a Process of Valuation, Ed. W.J. Samuels, S.G. Medema, A.A. Schmid (Elgar, Cheltenham, 1997).
21. A.A. Schmid, Property, Power, and Public Choice. An Inquiry into Law and Economics, 2nd edition (Praeger, New York, 1987).
22. W.R. Scott, Institutions and Organizations (Sage Publications, Thousand Oaks, 1995).
23. A. Sen, in Choice, Welfare and Measurement (Basil Blackwell, Oxford, 1982).
24. A. Sen, in Development, Democracy and the Art of Trespassing. Essays in Honor of Albert O. Hirschman, Ed. A. Foxley, M.S. McPherson, G. O'Donnell (University of Notre Dame Press, Notre Dame, 1986), pp. 343-354.
25. H.A. Simon, in Method and Appraisal in Economics, Ed. S.J. Latsis (Cambridge University Press, Cambridge, 1976), pp. 129-148.
26. H.A. Simon, The Sciences of the Artificial (MIT Press, Cambridge, MA, 1981).
27. A. Newell, H.A. Simon, Human Problem Solving (Prentice Hall, Englewood Cliffs, 1972).
28. I. van Staveren, The Values of Economics. An Aristotelian Perspective (Routledge, London, 2001).
29. L.G. Zucker, American Sociological Review 42, 726-743 (1977).
MOTIVATIONAL ANTECEDENTS OF INDIVIDUAL INNOVATION
PATRIZIA PICCI, ADALGISA BATTISTELLI Department of Psychology and Cultural Anthropology University of Verona - Italy E-mail:
[email protected]

The current work focuses on innovative work behavior and, in particular, on the stage of idea generation. Intrinsic motivation is known to be an important factor that stimulates the individual to carry out the various emergent processes of change and innovation within the organization; under certain conditions, however, the presence of different forms of extrinsic motivation – external regulation, introjection, identification and integration – also positively influences innovative behavior at work, specifically the creative stage of the process. On this evidence, the organizational environment may be capable of stimulating, or indeed inhibiting, the potential creativity and innovation of individuals. About 100 employees of a local government health department in Central Italy were given a questionnaire. The results show that, among external factors that affect the individual, such as control, rewards and recognition for work well done, controlled motivation influences overall innovative behavior, whereas autonomous motivation plays a significant role in the specific behavior of idea generation. At the same time, a clearly articulated task, which allows an individual to identify with it, seems to favor overall innovative behavior, whilst a task which allows a fair degree of autonomy influences the behavior of generating ideas.

Keywords: innovation, antecedents of individual innovation, motivation, self-determination theory, work characteristics.
1. Introduction

One of the most common convictions nowadays is that, in terms of innovation, being innovative is not only the remit of the organization but also of its individual members. While the organization can provide experts and specialists in Research and Development, it is also possible to develop and use individuals' innovative potential in order to respond successfully to the constant challenges of the market and social system. Betting on personnel becomes a deciding factor in competitive advantage, often reflected in a high-quality service based on the idea of continuous improvement. Currently, the focus of research on organizational citizenship behavior, employees' creativity, positive individual initiative and on critical and reflective
behavior patterns at work is the primary motivation of personnel to commit themselves to various proactive behaviors, identified as “extra-role”. It is the same general concept according to which individuals in certain situations will do more than is requested, and it comes under the definition of innovative work behavior (IWB), whereby individuals begin a certain course of action and intentionally introduce new behaviors in anticipation of the benefits of innovative changes (Janssen, Van De Vliert and West, 2004 [26]). For many modern organizations, public or private, competitive advantage depends on their ability to favor and program innovation, by activating and converting ideas within the innovative process and transforming them into marketable products (Salaman and Storey, 2002 [32]).

The emergent character of innovation, when one adopts the framework of theories of emergence (see, for instance, Crutchfield 1994 [12]; Holland 1998 [24]; Minati and Pessa 2006 [28]), appears as a challenge for every theory of this kind. Namely, by its very nature innovation is unpredictable, a circumstance which seems to rule out any attempt to state the conditions guaranteeing the occurrence of this special form of ‘intrinsic’ emergence (to use Crutchfield’s classification). Nevertheless, while a complete model of the emergence of innovation is obviously unfeasible, we can still try to characterize the different psycho-social factors which seem to have some relevant influence on the development of innovation itself. We can then try to answer questions such as: what particular factors favor individuals’ innovative behavior at work? What stimulates them to be creative, or to adhere and contribute to processes of change and improvement in their specific work?
By concentrating on the individual innovative process in general, and on the phase of idea generation in particular, the current paper examines in what way specific motivational factors, linked to the individual and their work, can increase the frequency with which individuals create ideas and express new and better ways of doing things in the workplace. The psychological construct of motivation as an antecedent factor of creativity and innovation at work has been defined on the basis of studies carried out by Amabile (1983; 1988) [2,3], following the distinction between intrinsic and extrinsic motivation. That distinction built upon earlier studies by Deci (1971) [14] and Deci and Ryan (1985) [15] and led to Gagné and Deci’s (2005) [19] theory of self-determination, which represents a recent attempt to operationalize extrinsic motivation (autonomous vs controlled). Furthermore, by
continuing to follow the line taken by Amabile, various characteristics of the organizational/work environment which can stimulate or inhibit the expression of potential individual creativity are described. Among these characteristics, the important role of the various elements of the task (Hackman and Oldham, 1980 [21]) will be examined, highlighting in particular possible influences on the creative phase of the individual innovative process.

2. Motivational factors which influence the emergence of individual innovation and idea generation

Individual or role innovation, understood as “the intentional introduction within the role of new and useful ideas, processes, products and procedures” (Farr and Ford, 1990 [18, p. 63]), is the type of innovative behavior which the individual puts into practice to improve the quality of their work. A recent and widespread current of thought considers individual innovation as a complex process made up of phases, often defined in different ways, which can essentially be divided into two distinct behaviors, namely the generation and the implementation of ideas (Rank, Pace and Frese, 2004 [29]).

Idea generation is the initial phase of innovative work behavior. It is considered the phase most closely linked to creativity, that is, to “the production of new and useful ideas” (Scott and Bruce, 1994 [31, p. 581]), in which it is principally the individual who acts, according to their interpretation of internal environmental factors. This phase, characterized by “subjectivity”, differs from the other, more “social” phases of innovative behavior at work (idea promotion and idea realization), that is, those which give space to the necessary moments of interaction between individuals.
Given its fundamental importance for the development of emerging innovation, idea generation is among the most discussed aspects of individual innovative behavior. Innovations may be born both spontaneously and intentionally from individuals and groups at work, with the aim of making the work at hand better, simpler and more efficient (among the many papers devoted to a theory of this subject we cite West, 1990 [34]; McGahan, 2000 [27]; Alkemade et al., 2007 [1]; Cagan, 2007 [11]). One specific study aimed at identifying the individual characteristics which stimulate or inhibit the creativity expressed by employees in their work is that of Amabile and Gryskiewicz (1987) [8]. They analyzed individual performance in problematic situations in the workplace. Among the qualities of the problem solver which favor
creativity there emerge not only a number of typical personality traits – such as persistence, curiosity, energy, intellectual honesty, emotional involvement in the work per se and willingness to accept a challenge – but also the possession of the cognitive abilities fundamental to certain domains and, finally, characteristics more closely linked to the particular situation. The latter include being part of a team with dependable intellectual and social qualities and showing good social, political and relational skills.

Amabile (1988) [3] proposes a componential model of psychosocial creativity, in which three components necessary for good creative performance are described, namely domain-relevant skills for the task, creativity-relevant skills and intrinsic task motivation. Intrinsic motivation involves people carrying out an activity for its own sake, given that they find the activity interesting and that it gives spontaneous satisfaction. Extrinsic motivation, on the other hand, requires a combination of the activity and certain consequences of it, in such a way that satisfaction does not originate in the activity per se but rather in the consequences that derive from it (Porter and Lawler, 1968 [30]). A number of authors (Woodman, Sawyer and Griffin, 1993 [35]) maintain that it would be preferable to avoid the development of extrinsic motivation among workers, given that it would direct attention “beyond the heuristic aspects of the creative task and towards the technical and normative aspects of performance” [35, p. 300], even if certain conditions exist in which these aspects play a favorable role in the creative carrying out of the work and in which the positive effects may even be necessary and desirable. For example, with the imposed limits of deadlines, expectations, controls and contractual rewards, work tends to be completed on time and well.
Furthermore, people need not only financial recompense for their work but also positive rewards of other types, such as feedback, recognition and behavioral guidelines (Amabile, 1988 [3]). Amabile in particular dedicates a major part of her work to the study of the role of motivation in task performance, positing the hypothesis that intrinsic motivation may favor the emergence of the creative process, whereas extrinsic motivation may actually be destructive, even if at times, in simpler tasks, it can act in synergy with intrinsic motivation and actually increase the expression of creativity, to such an extent that high levels of performance, including innovative performance, clearly emerge (Amabile, 1996, 2000 [5,6]; Amabile, Barsade, Mueller and Staw, 2005 [7]).
The inclusion of intrinsic motivation as a determinant of innovative behavior directs our attention towards the motivational potential of work characteristics, such as the variety of the task and of the competences required, the degree of significance and of identity perceived in the task itself, feedback and autonomy (Hackman and Oldham, 1980 [21]). Farr (1990) [17] confirms that, compared to simplified tasks, complex tasks are more challenging and potentially encourage innovation. Hatcher, Ross and Collins (1989) [23] highlight a positive correlation between task complexity (a comprehensive measure of autonomy, variety and feedback) and the idea generation phase. In order to understand the degree of commitment and the level of motivation that individuals have with regard to their work, it is therefore also useful to consider the nature of the task, which, as previously noted, is strongly related to satisfaction with it (Hackman and Oldham, 1975 [22]).

From what has already been stated it seems clear that, in order to be creative in carrying out tasks at work, it is necessary to be intrinsically motivated, and this only becomes possible if two fundamental conditions hold, namely that people love what they are doing and that their work takes place in a positively motivational context. Only when these conditions are met does the probability of readiness for innovation within organizations increase, thus guaranteeing creative contributions by employees, which in turn produce important benefits in the long term. Taking as given that the process of innovation includes not only the development but also the implementation of creative ideas, the objective of the present paper is to consider how individual work motivation and work characteristics influence the emergent process of idea generation.
It has therefore been decided to concentrate on the initial phase, which also represents the phase of maximum individual creativity in the process.

3. Intrinsic and extrinsic motivation in the idea generation process

The relationship between intrinsic and extrinsic motivation has attracted great interest in the pertinent literature. The most prevalent psychological models proposed to date have tended to posit a deep antagonism between these two forms of motivation, such that as one increases, the other decreases. Nonetheless, as mentioned above, various pieces of evidence point to a more complex and articulated reality. Although the relationship between intrinsic and extrinsic motivation is often taken to be inversely proportional, we will underline how
under certain conditions the synergic presence of these two motivational forms may actually have positive effects on creative performance. It is useful to remember in this regard that an innovative project is made up of various phases and that, whilst in the initial phases it may be helpful to propose as many ideas as possible, in the later phases it is more important to dwell upon the ideas produced, examining them and choosing the most appropriate (Gersick, 1988 [20]). It is generally maintained that the synergic action of extrinsic motivators is more useful in those phases where a high level of new ideas is not required, such as the phase of collecting data or of implementing the chosen solutions. Amabile (1994) treats intrinsic and extrinsic motivation as relatively independent factors, rather than as completely opposing poles of the same dimension. Indeed, certain empirical evidence shows that people can simultaneously maintain a strong orientation towards both intrinsic and extrinsic motivation. An interesting fact has emerged from examining the link between motivation and creativity: reported creativity test scores of professionally creative individuals correlate positively not only with the so-called “challenge” component of intrinsic motivation but also with “external acknowledgement”, a component of extrinsic motivation (Amabile, 1994).

The current research takes as its reference point the Gagné and Deci (2005) theory of self-determination. The theory works within a continuum that distinguishes between autonomous and controlled motivation. These two forms of intentional motivation can, by their nature, be differentiated from “amotivation”, which implies a complete absence of motivation on the part of the subject. Autonomy presupposes a degree of willingness and of choice in the actions to be performed, e.g.
“I am doing this job because I like it”, whereas controlled motivation, being partly intentional, is distinguished by the fact that the subject acts under pressure from external factors, e.g. “I am doing this work for the money”. Intrinsic motivation is the classic example of maximally autonomous motivation. With regard to extrinsic motivation, however, the theory identifies four types of motivation along a continuum, from the complete absence of autonomy to its full presence (self-determination). Of these types of motivation, two belong to controlled motivation (external regulation and introjection) and two belong to autonomous motivation (identification and integration).
Activities which may be of little personal interest require, in order to be undertaken successfully, external motivational forms, such as financial rewards, positive acknowledgements and promotions. This is a form of externally regulated motivation and is the prototype of extrinsic or controlled motivation. Other types of extrinsic motivation are related to behaviors, values and attitudes which people have internalized to differing degrees. It is possible to distinguish between three fundamental processes of internalization – introjection, identification and integration – which are differentiated by the degree of autonomy characterizing them. Introjection is the process by which a value or a behavior is adopted by individuals but is not fully accepted or lived by them as their own; unlike identification and integration, which give rise to autonomous motivation, this type of extrinsic motivation is controlled. Identification is characterized by the fact that a behavior or a value is accepted by the subject because they have judged it to be personally important and coherent with their identity and objectives. For example, if nurses truly have the well-being of their patients at heart, they will be prepared to operate independently, undertaking their own initiatives even with tasks of little interest or highly unpleasant ones. Finally, integration is the form of extrinsic motivation characterized by the greatest degree of autonomy, in which certain values and behaviors are not only tacitly accepted by individuals but also incorporated and integrated into their value system and way of life.
This motivational form, even if it shares many aspects with intrinsic motivation, is still part of extrinsic motivation, because the person acting is not interested in the activity per se but considers the activity at hand to be instrumental in reaching personal objectives “similar to” but “different from” it. As the theory stresses, people who are autonomously motivated, even if the motivation is extrinsic in nature, are potentially more inclined to introduce changes in the way they work, because they constantly wish to do their work in the most efficient manner possible. For this reason the work itself becomes ever more intrinsically motivating, without however excluding the individual’s wish for other forms of external acknowledgement of the quality of their work. This implies that even individuals with a more controlled motivation may potentially develop creative ideas while carrying out their work, simply by introjecting a value which does not belong to them in order to adapt to their organization’s wishes, ultimately to obtain a certain form of positive acknowledgement or to avoid other forms of disapproval.
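The continuum just described can be summarized as a small ordered lookup table. The following sketch is ours, not part of Gagné and Deci's instrument; the identifiers are hypothetical labels for the categories named in the text:

```python
# Sketch of the self-determination continuum as described above,
# ordered from least to most autonomous. Labels are ours.
CONTINUUM = [
    ("amotivation",         None),          # no intentional motivation
    ("external_regulation", "controlled"),  # rewards, punishments, promotions
    ("introjection",        "controlled"),  # value adopted but not owned
    ("identification",      "autonomous"),  # value judged personally important
    ("integration",         "autonomous"),  # value absorbed into the self
    ("intrinsic",           "autonomous"),  # activity enjoyed for its own sake
]

def regulation_style(kind):
    """Return 'controlled', 'autonomous', or None for a motivation type."""
    for name, style in CONTINUUM:
        if name == kind:
            return style
    raise KeyError(kind)
```

The ordering makes the paper's point explicit: the two extrinsic forms grouped with intrinsic motivation under "autonomous" are exactly identification and integration.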
Thus, in line with the salient aspects of self-determination theory as reexamined by Gagné and Deci (2005) [19], and particularly with regard to its as yet unstudied possible relationship to innovative behavior, it appears pertinent to hypothesize that not only autonomous motivation (due to its proximity to intrinsic motivation) but also controlled motivation (externally regulated at differing levels) may have a positive influence on the behavior of idea generation. We therefore put forward the following hypothesis:

H1: Both autonomous motivation (in the form of identification and integration) and controlled motivation (in the form of external regulation and introjection) positively influence idea generation behavior.

4. Work characteristics influencing the emergence of the innovation process

Undoubtedly, the motivation that subjects show towards their work does not depend only on individual personality: various studies have also shown the crucial role that the work environment plays in stimulating creative capacity. In other words, external background and other factors stimulate the individual and condition their creative and innovative capacity. For example, resources include all those aspects which the organization places at the individual's disposal and which make a creative performance practicable: having sufficient time to produce an innovative piece of work, working with competent and prepared personnel, having the availability of funds, materials, systems and adequate processes, along with the relevant information, and having the possibility of learning and training (Siegel and Kaemmerer, 1978 [33]; Ekvall, 1996 [16]). It has been consistently observed over time that the structural organization of work has direct effects on creative performance. The more complex the task, the more motivated, satisfied and productive individuals become (Cummings and Oldham, 1997 [13]).
A greater complexity of task should stimulate a higher creative potential, to the extent that it clearly implies a higher degree of responsibility and autonomy in the choices made by the individual. In this case we are dealing with tasks that require adopting various perspectives and observing the same problem from different points of view. Such tasks are characterized by the fact that they require a high level of ability in order to be carried out. They enable individuals to follow through with the task from beginning to end, in such a manner that the
individual is fully aware of the meaning of their work. They provide important feedback during the execution of the task. Finally, such tasks have a strong impact on people's lives, both within and outside the organization. By contrast, the simplest or most routine tasks tend to inhibit enthusiasm and interest, and consequently they do not stimulate the expression of creative potential (Scott and Bruce, 1994 [31]). Some jobs, in contrast to others, thus offer people a greater opportunity for innovative behavior.

Hackman and Oldham (1980) [21] identified three conditions under which people can feel motivated by their work. Firstly, they must recognize the results of their work. Secondly, they have to experience the sensation of taking responsibility for those results. Finally, they must experience their work as something significant and relevant to their value system. The research literature, in this regard, highlights five work characteristics useful for handling the demands of the task at hand (Hackman and Oldham, 1975, 1980 [22,21]): skill variety, task identity, task significance, autonomy and job feedback. When combined, the five job characteristics determine the motivational potential of a working role/position. For this reason, if a job has a low motivational potential, intrinsic motivation will be correspondingly low and a person's feelings will no longer be positively influenced, even by a job done well. Farr (1990) confirmed that "complex" jobs, compared to simpler ones, are more challenging and require more thought, and that consequently they tend to promote innovation. Studies following this hypothesis generally confirm the existence of a relationship between job characteristics and the creative phase of innovation, known as idea suggestion (Axtell, Holman, Unsworth, Wall, and Waterson, 2000 [10]).
Using job complexity as a measure, based on the Job Diagnostic Survey (Hackman and Oldham, 1980 [21]), Oldham and Cummings (1996) found a significant positive correlation with the creativity scores attributed to employees by their superiors, highlighting the interaction between job complexity and personality characteristics in predicting idea generation. Overall, studies on job characteristics suggest that when individuals are committed to varied tasks with high levels of control, they have a greater propensity to find new solutions for improving their work (Axtell et al., 2000 [10]).
172
P. Picci and A. Battistelli
In line with the objective of this work, we are therefore in a position to outline the following hypothesis: H2: Job characteristics (task identity, task significance, feedback, autonomy and skill variety) positively influence the behavior of idea generation. Finally, in the light of such evidence, and considering the expected relationship between job characteristics and intrinsic motivation, it is quite plausible to hypothesize a synergic role for these factors, not only in the idea generation phase but also within the whole process of individual innovation. It is therefore proposed to test the following hypothesis: H3: Autonomous and controlled motivation and job characteristics positively influence the entire behavior of individual innovation.

5. The method

The context of application of this study was the Health Service, chosen because of the changeable nature of this organization, which underwent a notable number of reforms legislated during the 1990s that transformed hospitals into “Health Service Firms”. The research was carried out according to a quantitative, cross-sectional methodology, through a specifically designed questionnaire. The research questionnaire was presented in a generic manner to the relevant subjects, providing them with general indications while guaranteeing total anonymity and clarifying the final objective of the project. This was done in complete agreement with the Head of the Psychology Unit and the Head of the Quality Business Centre for Health Service Firms operating in a region of Central Italy.

5.1. The sample

The sample is made up of 100 subjects currently employed in the Health Service and in the administrative area of the relative management service, situated in a region of Central Italy. 53% of the total sample are Health Service personnel and the remaining 47% are administrative personnel within the hospital service.
48% of the sample are male and the remaining 52% female. The average age of the participants is 44.7 years. The information regarding the qualifications of those involved in the sample revealed a relatively diverse reality, divided as follows: 40% are in possession of a High School Certificate, 10% are in possession of a Diploma
from Professional Schools/Technical Institutes, 3% hold a 3-year University Diploma, 26% hold a Graduate Degree and, finally, 21% hold a Postgraduate Qualification. The average tenure within the Health Service of those sampled is 15 years. Regarding the function served by the subjects within their respective sectors, 22 are Directors/Managers/Referees in situ or in organizational positions, 20 are part of Management, and the largest part of the sample (58 subjects) declared that they belong to Ward Personnel. Finally, all those involved in the study have officially spent an average of 12 years in service.

5.2. The measure

The questionnaire was composed of two sections, the first comprising a general enquiry into personal details, and the second including three scales whose function was to analyze innovative behavior, motivation and perceived job characteristics. The construct of innovative work behavior (IWB) by Scott and Bruce (1994) [31], revisited by Janssen (2000) [25], was used to measure innovative work behavior. In order to observe the innovative behavior of idea generation, three specific items for this dimension were taken from the nine-item scale of innovative work behavior published by Janssen (2000). This scale, based on three stages of innovation, comprises three items referring to idea generation, three referring to idea promotion and three referring to idea realization. The response format was a 5-point Likert scale, where 1 = never and 5 = always, upon which the subjects indicated the frequency with which they undertook innovative work behavior, e.g. “How often do you come up with original solutions to problems?”. The measurement scale for job characteristics was taken from the Job Diagnostic Survey (JDS) by Hackman and Oldham (1980) [21].
The scale was composed of ten items and had already been used in previous unpublished Italian research that had tested the validity of its structure. Five dimensions were considered: task variety, task identification, task significance, autonomy and feedback. Each dimension was measured by two items of the scale. For example, “My job requires me to use a number of complex capacities at a high level.” (task variety); “My job offers me the opportunity to finish that part of the work which I had previously started.” (task identification); “My job is not very significant or important in my life.” (task
significance); “In doing my job, I am constantly provided with a considerable degree of independence and freedom.” (autonomy); and finally “My job provides me with little or no indication upon which I may judge whether I am doing well or badly.” (feedback from the work itself). The subjects were requested to indicate their response on a 5-point Likert scale, where 1 = absolutely false and 5 = absolutely true, based on their level of agreement/disagreement with potentially descriptive aspects of their job. To observe the forms of motivation that drive individuals to perform their job, a recently constructed 20-item scale, currently in the publishing and evaluation stage in Italy, was used, based on the self-determination theory of Gagné and Deci (2005) [19]. The motivational forms considered refer to intrinsic motivation, which is completely autonomous, and to various types of controlled and autonomous motivation which may be identified along the continuum of extrinsic motivation, as follows: externally regulated motivation (e.g. “I am doing this job because it allows me to have a high salary.”), introjection (e.g. “I am doing this job because the esteem in which my colleagues hold me depends on my work.”), identification (e.g. “I am doing this job because it is important to me.”), and integration (e.g. “I am doing this job because it allows me to reach my goals in life.”). The subjects were asked to indicate their response on a 7-point Likert scale, where 1 = absolutely false and 7 = absolutely true, based on their level of agreement with the form of motivation described in the items.

6. The results

Table 1 summarizes the mean, standard deviation and reliability coefficient (Cronbach’s alpha) of the measures used.
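The reliability coefficients reported in Table 1 are Cronbach’s alphas. As a generic sketch (not the authors’ analysis code; the sample scores below are made up), alpha can be computed from raw item scores as follows:

```python
def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).

    items: list of k sequences, one per item, each holding the scores of
    all respondents (population variances, as in the classical formula).
    """
    k = len(items)
    n = len(items[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

# Three perfectly consistent items over five respondents give alpha = 1.0
print(cronbach_alpha([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]))
```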
With regard to the motivational scale of Gagné (2005) [19], an exploratory analysis of the results showed that the original four-dimensional structure hypothesized by the author (externally controlled motivation, introjection, identification and integration) did not appear in the data obtained from the sample used in this research. The resulting structure is instead three-dimensional, consisting of the following: externally regulated motivation (M=2.67; SD=1.12), introjection (M=2.58; SD=1.32) and identification/integration (M=4.14; SD=1.31). This last dimension incorporates the two motivational types which are nearest to intrinsic motivation, i.e. those which, according to the theory, fall in the autonomous category. In order to obtain this structure, 6 of the 20 items in the original scale were eliminated, due to low factor loadings in the sample.
Table 1. Descriptive analyses of variables.

Variable | N | Range | M | S.D. | α
Autonomy | 100 | 1-5 | 3.77 | .93 | –
Task Variety | 98 | 1-5 | 3.63 | .98 | –
Feedback | 100 | 1-5 | 3.74 | .80 | –
Identification | 100 | 1-5 | 3.81 | .92 | –
Significance | 100 | 1-5 | 3.97 | .77 | –
Job Characteristics | 100 | 1-5 | – | – | .70
Innovative work behavior (IWB) | 100 | 1-5 | 3.26 | .69 | .88
IWB Idea suggestion | 100 | 1-5 | 3.18 | .80 | .85
M_Integration/Identification | 97 | 1-7 | 4.14 | 1.31 | .90
M_Externally Controlled Motivation | 99 | 1-7 | 2.67 | 1.12 | .70
M_Introjection | 99 | 1-7 | 2.58 | 1.32 | .69
Note: the reliability coefficient of the global Job Characteristics scale was calculated over the total of 10 items.
In line with the central hypothesis of this research, we then proceeded with an analysis of the possible specific relationships between each of the variables hypothesized as antecedents (motivation and job characteristics) and the single phase of idea generation. Table 2 shows the results of the regressions carried out in order to study the relationship between idea generation and motivation in its autonomous and controlled forms (hypothesis H1). Within the sample, the results of the regression show a significantly positive influence of the integration/identification dimension on idea generation behavior. It is therefore possible to confirm that motivation, only in the form of identification/integration, influences the emergence of the innovative behavior of idea generation. From these data it clearly emerges that hypothesis H1 cannot be confirmed in its totality, showing once again that only autonomous motivation, which is the nearest to an intrinsic form, appears to be implicated to a greater degree in the creative process. In fact, this process, which underlies the emergence of individual innovation, does not reveal any significant result in relation to the dimensions of controlled motivation (external regulation and introjection). Table 3, on the other hand, shows the possible relationships of influence between job characteristics, according to the job characteristics model (Hackman and Oldham, 1980 [21]), and, again, the specific phase of idea generation. As can be seen from the Table, idea generation behavior turns out to be positively influenced by two specific job characteristics, namely task variety and autonomy.
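Regressions of the kind summarized in Tables 2 and 3 can be sketched as follows. The data here are synthetic and purely illustrative (the study’s raw data are not reproduced); the sketch only shows how coefficients and the adjusted R² reported in the tables are obtained:

```python
import numpy as np

# Synthetic illustration: regress an idea-generation score on three
# motivational dimensions, where only the first (autonomous) one matters.
rng = np.random.default_rng(0)
n = 100
X = rng.uniform(1, 7, size=(n, 3))   # identification/integration, external, introjection
y = 1.0 + 0.6 * X[:, 0] + rng.normal(0, 0.5, n)

A = np.column_stack([np.ones(n), X])  # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

resid = y - A @ coef
r2 = 1 - resid.var() / y.var()
r2_adj = 1 - (1 - r2) * (n - 1) / (n - X.shape[1] - 1)
print(coef.round(2), round(r2_adj, 3))
```

In such a fit, only the coefficient of the autonomous dimension departs clearly from zero, mirroring the pattern described for hypothesis H1.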
Table 2. Job characteristics in the phase of idea generation. Dependent variables; predictors; R² adjusted = .110; F = 13.219; p

[…]

…if the relevant threshold pet is overcome, then the depth of erosion, he, is given by the following formula:

he = min{qr, qe · ppef}

4.2.2. Computation of the “minimizing” debris outflows

In SCIDDICA-S4c, the concept of “minimizing” outflows was introduced. They are simply the outflows computed by applying the minimization algorithm of the
204
R. Rongo et al.
differences, without considering any kind of relaxation factor. Hence, such flows are those that, if distributed to the neighboring cells, lead the neighborhood to the state of equilibrium. The parameters involved in this elementary process are pf and padh. The first specifies a “critical angle”: if the slope angle between two adjacent cells does not overcome pf, the candidate cell for the distribution is eliminated and will not receive any flow. The parameter padh specifies the thickness of flow that cannot leave the cell due to the effect of adherence; it generally depends on the characteristics of the flowing mass. Minimizing outflows are then considered as a “starting point” from which to derive “effective flows”, as described in the following.

4.2.3. Conservation of mass, energy and momentum

This is the elementary process where effective flows are computed. In general, if qo(x,y) denotes the minimizing flow from cell x to cell y, v(x,y) its velocity and d(x,y) the distance between cells x and y, the effective flow, f(x,y), is given by the following formula:
f(x,y) = qo(x,y) · [v(x,y) · pt / d(x,y)]

Here the quantity v(x,y)·pt, where pt is the CA clock, represents the distance that the flow covers on the basis of its velocity v(x,y). Consequently, v(x,y)·pt / d(x,y) can be considered as an index of how much space the flow covers in a CA step with respect to the maximum allowed, i.e. d(x,y), and thus as a variable relaxation rate. Eventually, note that v(x,y)·pt cannot overcome d(x,y), otherwise the flow would exceed the neighborhood; if this happens, the CA clock must be diminished and the simulation restarted. Once effective flows are determined, the associated values of energy and momentum are also computed so that, in the subsequent distribution phase, mass, energy and momentum are preserved.

4.2.4. Energy loss

This elementary process is responsible for velocity drop and, consequently, for energy dissipation. Three parameters are involved, which together form the overall dissipation parameter pd. If v denotes the modulus of the velocity of the mass in a generic cell, the velocity drop is modeled as follows:
v = v − h·pdN − v·pdP − v²·pdQ
Evolutionary Computation and Emergent Modeling of Natural Phenomena
205
where:
• pdN is the “non-dependent” dissipation parameter, which produces a velocity drop only on the basis of the mass weight (in SCIDDICA-S4c modeled in terms of height h);
• pdP is the dissipation parameter which produces a velocity drop proportional to the current velocity;
• pdQ is the dissipation parameter which produces a velocity drop proportional to the square of the current velocity.
Note that such mechanisms of dissipation can be considered either alone or in combination. For instance, if one conjectures that the behavior of the flow to be modeled is essentially turbulent, the proportional velocity drop can be neglected by simply setting pdP to 0. Similarly, if the behavior of the flow is conjectured to be essentially laminar, the dissipation which characterizes turbulent flows (i.e. proportional to v²) can be neglected by simply setting pdQ to 0.

5. Calibration

As previously stated, once an MCA model has been defined, a calibration phase is generally needed to find a set of parameters which allows the model to reproduce the phenomenon of interest in a satisfying manner. For this purpose, maps of real cases can be compared with simulations, and a quantitative measure of the quality of the results can be expressed through suitable “fitness functions”. As discussed in D’Ambrosio et al. (2006) [6], the trivial comparison of the extents of the real and simulated cases can be considered for a simplified, preliminary calibration. However, when proper input data are available, a more “articulated” fitness function, based on a more representative set of characteristics of the phenomenon (e.g. erosion depth or landslide thickness for a debris flow model, or even information about the duration of the real event), is indeed a better choice and allows for a more “refined” calibration. In the following, the application of Genetic Algorithms to the calibration of SCIARA-fv and SCIDDICA-S4c is described with respect to two real case studies.

5.1.
SCIARA-fv Calibration

Among the numerous variants of Genetic Algorithm models proposed in the literature (cf. Mitchell 1996 [16]; Cantù-Paz 2000 [2]), the one employed for the calibration of SCIARA-fv represents (encodes) the parameters to be optimized as bit strings. Moreover, the GA is steady-state and elitist, so that at each step only the
Figure 1. Lava emission rate of the 2001 Etnean eruption started from Mount Calcarazzi, which threatened the towns of Nicolosi and Belpasso (emission rate in m³/sec, over days 1-10).
worst individuals are replaced; the remaining ones, required to form the new population, are copied from the old one by choosing the best. In order to select the individuals to be reproduced, the “binary-tournament without replacement” selection operator was utilized. It consists of a series of “tournaments” in which two individuals are selected at random and the winner is chosen according to a prefixed probability, which must be set greater for the fittest individual; in our case, this probability was set to 0.6. Moreover, as the variation-without-replacement scheme was adopted, individuals cannot be selected more than once. The employed genetic operators are classic Holland’s crossover and mutation, with probabilities of 1.0 and 2/44, respectively. In particular, this mutation probability produced, on average, two mutated bits per individual, as the genotype length (obtained as the sum of the numbers of bits chosen for encoding each considered SCIARA-fv parameter - cf. Table 1) was exactly 44. The number of individuals forming the initial population was set to 256, while the number of individuals to be replaced at each GA step was set to 16. Finally, the original fitness function e1 (cf. Spataro et al. 2004 [17]) was replaced with a new one. The e1 fitness function took into account only the comparison between the areal extents of the real and simulated events; it was defined as:
e1 = m(R ∩ S) / m(R ∪ S)   (2)
where R and S represent the areas affected by the real and simulated event, respectively, while m(A) denotes the measure of the set A. Note that e1 ∈ [0,1] ;
Table 1. The best set of SCIARA-fv parameters as obtained through the calibration phase, together with their explored ranges. The number of bits used for the genetic algorithm encoding is also listed.

Parameter | Explored range | Bits | Best value
ps | [60, 180] | 8 | 155.29 s
pTv | – | – | 1373 °K
pTsol | [1123, 1173] | 8 | 1165.35 °K
padv | [0.1, 2.0] | 4 | 0.7 m
padsol | [6.0, 30.0] | 6 | 12 m
pcool | [10^-16, 10^-13] | 16 | 2.9×10^-14 m °K^-3
pa | – | – | 5 m
its value is 0 if the real and simulated events are completely disjoint, being m(R ∩ S) = 0; it is 1 in case of perfect overlap, being m(R ∩ S) = m(R ∪ S). The first calibration experiments were performed by considering the 2001 Etnean eruption (Sicily, Italy), which started from the fracture of Mount Calcarazzi and pointed southwards, creating the main danger for the towns of Nicolosi and Belpasso (cf. Figure 1 for the lava flow emission rate at the vent). In this preliminary phase the fitness function e1 was adopted. However, even if the results seemed quite satisfactory, the best simulation reached its final shape at the end of the 8th day, instead of the 10th as in the case of the real lava flow. As a consequence, the obtained parameters produced simulated lava flows with different rheological characteristics, e.g. with greater viscosity. Hence, an improved fitness function, f1, was devised, which takes into account both the areal extents of the real and simulated events and their temporal duration. It is defined as follows:

f1 = e1(t1) · e1(t2)
where e1 is defined as before, while t1 and t2 represent two different temporal instants at which it is evaluated. In particular, t1 represents the time at which the real event reaches its stationary state. At this instant the function e1 is evaluated for the first time, giving information about the overlapping ratio of the simulation at that particular moment. However, contrary to the real event, the simulation might not reach its final configuration at the same instant, and its shape may change further in time. In this case, if the function e1 is evaluated again, for instance at a time t2 > t1, its value could differ from the previous one, meaning that the overlapping ratio changed and thus the simulation did not stop when the real event did. Hence, as e1 does, the function f1 also gives values belonging to the interval [0,1], with the difference that the value 1 is obtained when the real and simulated events perfectly overlap, with the further condition
Figure 2. Comparison between the 2001 Nicolosi Etnean Event and the best SCIARA-fv simulation, as obtained by adopting the parameters listed in Table 1. Key: 1) Area affected by the real event; 2) Area affected by the simulation; 3) Area affected by both real and simulated events; 4) Limits of real event, 5) Limits of simulated event.
that the simulation stops exactly at the same time as the real event does. In other words, f1 = 1 if and only if e1(t1) = e1(t2) = 1. Note that, considering the available data for the case studies considered here (limited to the areal extent and duration), f1 can be considered a satisfying objective function for the model calibration phase. A more refined function could certainly be considered, e.g. by evaluating intermediate results along the overall period of evolution and not only at the end. However, its definition is constrained by the availability of reliable information about the real phenomenon, which is usually difficult to obtain. Accordingly, the goal for the GA was to find a set of CA parameters that maximizes f1. On the basis of previous empirical attempts, the ranges within which the values of the CA parameters are allowed to vary were identified in order to define the GA search space (cf. Table 1), and a set of 10 experiments was iterated for 100 steps. As regards the fitness function, t1 was set to 10 days (which corresponds to the duration of the real event), while t2 was set to 13 days. As concerns the prefixed parameters, pTv was set to a value which corresponds to the typical temperature of Etnean lava flows at the vents, while pa was set on the basis of the detail of the available topographic map of the area of interest. Coupled with the prefixed parameters, those obtained through the calibration phase allowed us to satisfactorily reproduce the considered 2001 Nicolosi Etnean lava flow (cf. Figure 2), giving rise to a fitness of 0.72, corresponding to a value of 0.74 in terms of areal comparison (i.e. in terms of e1 – cf. equation 2).
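Both fitness functions can be sketched directly from their definitions, representing the maps of affected cells as coordinate sets. The product form assumed here for f1 (f1 = e1(t1)·e1(t2)) matches the stated property that f1 = 1 if and only if both evaluations equal 1; the grids below are invented for illustration:

```python
def e1(real, sim):
    """Areal fitness e1 = m(R ∩ S) / m(R ∪ S), with maps given as sets of
    affected cell coordinates; cell counts play the role of the measure m."""
    union = len(real | sim)
    return len(real & sim) / union if union else 1.0

def f1(real, sim_at_t1, sim_at_t2):
    """Improved fitness: both the overlap at t1 (end of the real event) and
    at a later t2 must be perfect for f1 to reach 1, penalizing simulations
    that keep evolving after the real event has stopped."""
    return e1(real, sim_at_t1) * e1(real, sim_at_t2)

real = {(0, 0), (0, 1), (1, 0), (1, 1)}
sim_t1 = {(0, 0), (0, 1), (1, 0)}            # simulation still growing at t1
sim_t2 = {(0, 0), (0, 1), (1, 0), (1, 1)}    # final shape reached at t2
print(e1(real, sim_t2))  # 1.0: perfect areal overlap at the end
print(f1(real, sim_t1, sim_t2))  # 0.75: the simulation was not finished at t1
```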
Figure 3. The May 1998 Curti landslide. Key: 1) area affected by the real case; 2) limit of the zones with constant depth of regolith (assumed values in meters, in italics); 3) border of the area considered for comparison between the real and the simulated cases; 4) secondary source locations.
This value exceeds the “classic” threshold (0.7) commonly assumed as “acceptable” for calibration experiments.

5.2. SCIDDICA-S4c Calibration

As for SCIARA-fv, the calibration of SCIDDICA-S4c was performed through a genetic algorithm by considering a real case study, specifically the May 1998 Curti-Sarno (Campania, Italy) debris flow (Del Prete et al., 1998 [8]). In Figure 3, the map of the Curti real case is shown, depicting the location and extent of
Table 2. List of SCIDDICA-S4c parameters, either prefixed or optimized through the GA. Variation ranges and best values are also shown.

Parameter | Explored range | Bits | Best value
pc | – | – | 1.25 m
pt | – | – | 0.25 s
pdN | [0, 5] | 8 | 4.7
pdP | – | – | 0
pdQ | [0, 1] | 8 | 0.74
padh | – | – | 0.001 m
pf | [0, 16] | 8 | 2.3 degrees
pet | [0, 10] | 8 | 0 J
ppef | [0.0001, 5] | 8 | 0.18
sources, and the thickness of erodable regolith. The information concerning topography, sources and erodable regolith, together with the values assigned to the CA parameters, defines the “initial conditions” of each simulation. However, differently from the case of SCIARA-fv, the fitness function e1 was adopted (cf. equation 2), as only the areal extent of the real case was known with sufficient detail. Similarly to SCIARA-fv, the model parameters were encoded as bit strings. The adopted GA is a steady-state and elitist model with a “binary-tournament without replacement” selection operator (with probability 0.6 of selecting the best individual) and classic Holland’s crossover and mutation as genetic operators (with probabilities 0.8 and 1/40, respectively). In particular, this mutation probability produced, on average, one mutated bit per individual, as the genotype length (obtained as the sum of the numbers of bits chosen for encoding each considered SCIDDICA-S4c parameter - cf. Table 2) was exactly 40. The number of individuals forming the initial population was set to 200, while the number of individuals to be replaced at each GA step was set to 15. Previous empirical attempts at simulating the considered case study, manually performed by iteratively assigning reasonable values to the model parameters, helped in hypothesizing the ranges within which the values of the CA parameters are allowed to vary (cf. Table 2). The values of the CA parameters, either optimized by the GA or prefixed, are listed in Table 2. As concerns the prefixed parameters, the value adopted for pc suggested a small value for pt; such values allowed for a quite detailed description of the phenomenon, both in spatial and in temporal terms. The dissipative parameter pdP was set to zero, in order to consider only frictional and turbulent effects.
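The GA machinery just described (bit-string encoding of each parameter within its explored range, binary tournament with a 0.6 probability of picking the fitter individual, and bit-flip mutation with rate 1/genotype length) can be sketched as follows; the function names are illustrative:

```python
import random

def decode(bits, lo, hi):
    """Map a bit string (list of 0/1) to a real value inside the explored
    range [lo, hi], as in an 8-bits-per-parameter encoding."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def binary_tournament(fitnesses, p_best=0.6):
    """Pick two distinct individuals at random and return the index of the
    fitter one with probability p_best (0.6 in the paper), else the other."""
    i, j = random.sample(range(len(fitnesses)), 2)
    best, worst = (i, j) if fitnesses[i] >= fitnesses[j] else (j, i)
    return best if random.random() < p_best else worst

def mutate(genotype, p_mut):
    """Bit-flip mutation; p_mut = 1/len(genotype) flips, on average,
    one bit per individual."""
    return [b ^ 1 if random.random() < p_mut else b for b in genotype]

# Decoding the extremes of an 8-bit gene for a parameter explored in [0, 5]
print(decode([0] * 8, 0.0, 5.0), decode([1] * 8, 0.0, 5.0))  # 0.0 5.0
```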
Figure 4. The May 1998 Curti landslide: comparison between the real case and the best simulation. Key: area affected by 1) real landslide, 2) simulated landslide, 3) both cases; 4) border of the area considered for comparison.
Eventually, the value of padh allowed for simulating the observed effect of thin mud plastering along the trail. As regards the optimized parameters, the values obtained for pdN and pdQ seem to indicate an important role of dissipation for the considered case study: the values are in fact very close to the right end of the explored ranges. The value obtained
for pf well reflects the high assumed fluidity of the event. Although a quite wide range was selected for optimizing pet, its resulting value corresponds well to the extremely erodable type of detrital cover (mainly allophane soils) of the study area. Furthermore, the related factor of progressive erosion, ppef, also ended up with a rather low value, again reflecting the characteristics of the real event, in which erosion occurred progressively. In Figure 4, a graphic comparison between the map of the best simulation (i.e. obtained with the optimized parameters) and the real case is shown. Even though only simplified, the calibration made it possible to satisfactorily reproduce the real phenomenon, both in qualitative and in quantitative terms. In particular, the overall affected area and the depth of regolith erosion along the path are in quite good accordance with the surveyed evidence; the branching of the flow at the base of the slope is fairly well simulated (better than by previous releases of the model), and so is the subsequent merging of the branches right upslope of the urbanized area. The obtained fitness value of 0.76 exceeds the “classic” threshold (0.7) commonly assumed as “acceptable” for calibration experiments.

6. Conclusions

In this review paper we described an evolutionary approach for the calibration of two Macroscopic Cellular Automata models for the simulation of lava and debris flows, respectively. Such phenomena, classified among the most dangerous geological processes both for living beings and for property, are unfortunately very difficult to model through standard approaches. However, Macroscopic Cellular Automata have proved to be a valid alternative, on condition that their great capability to exhibit many different emergent dynamical behaviors is properly governed. As evidenced by the application of the considered simulation models to two real case studies, Genetic Algorithms can perform this task in a satisfying manner.
This is not surprising, as Genetic Algorithms have demonstrated a high and general ability as optimization algorithms in many scientific fields. In our cases too, they were able to properly calibrate the considered simulation models so that the phenomena to be reproduced could be simulated in a satisfying manner. In any case, particular attention must be reserved for the definition of the fitness function. In our experience, the choice of a “poor” fitness function, like one based only on an areal comparison between the real and simulated events, can lead the search for the desired emergent model behavior towards fictitious solutions (local optima). This was confirmed by the first calibration performed on the
SCIARA-fv lava flow model, for which only the adoption of a more refined objective function which, besides information about the areal extent, also considered information on the duration of the real event, allowed us to obtain a new set of model parameters able to produce the desired model behavior. Unfortunately, reliable information about the phenomena discussed here is often difficult to obtain. This is due to several reasons, among which the fact that they are rapid phenomena (in particular debris flows) and therefore difficult to monitor during their evolution. For the Curti debris flow, for instance, it was not possible to reconstruct the duration of the event with sufficient precision, and only a simplified fitness function could be considered. In such cases the model reliability should be further confirmed, and a validation phase which evaluates the goodness of the model against a sufficient number of different case studies would certainly be desirable.

References

1. M.V. Avolio, G.M. Crisci, S. Di Gregorio, R. Rongo, W. Spataro and D. D’Ambrosio, Computers and Geosciences-UK 32, 897-911 (2006).
2. E. Cantù-Paz, Efficient and accurate Parallel Genetic Algorithms (Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000).
3. G.M. Crisci, S. Di Gregorio and G.A. Ranieri, in Proceedings International AMSE Conference Modelling & Simulation, Paris, France, Jul.1-3 1982, (1982), pp. 65-67.
4. G.M. Crisci, R. Rongo, S. Di Gregorio and W. Spataro, Journal of Volcanology and Geothermal Research 132, 253-267 (2004).
5. D. D’Ambrosio, R. Rongo, W. Spataro, M.V. Avolio and V. Lupiano, in LNCS 4173, Ed. S. El Yacoubi, B. Chopard and S. Bandini, (2006), pp. 452-461.
6. D. D’Ambrosio, W. Spataro and G. Iovine, Computers and Geosciences-UK 32, 861-875 (2006).
7. D. D’Ambrosio, G. Iovine, W. Spataro and H. Miyamoto, Environmental Modelling & Software 22, 1417-1436 (2007).
8. M. Del Prete, F.M. Guadagno and A.B. Hawkins, Bulletin of Engineering Geology and the Environment 57, 113-129 (1998).
9. S. Di Gregorio, D.C. Festa, R. Rongo, W. Spataro, G. Spezzano and D. Talia, in Parallel Computing: State-of-the-Art and Perspectives, Ed. E.H. D’Hollander, G.R. Joubert, F.J. Peters and D. Trystam, (1996), pp. 69-76.
10. S. Di Gregorio, R. Serra and M. Villani, in Proceedings of the 3rd Systems Science European Congress, Roma, 1-4 October 1996, Ed. E. Pessa, M.P. Penna, A. Montesanto, (Kappa, Roma, 1996), pp. 1127-1131.
11. S. Di Gregorio and R. Serra, Future Generation Computer Systems 16, 259-271 (1999).
12. S. Di Gregorio, R. Serra and M. Villani, Complex Systems 11, 31-54 (1997).
13. J.H. Holland, Adaptation in Natural and Artificial Systems (University of Michigan Press, Ann Arbor, 1975).
14. G. Iovine, D. D’Ambrosio and S. Di Gregorio, Geomorphology 66, 287-303 (2005).
15. A.R. McBirney and T. Murase, Annual Review of Earth and Planetary Sciences 12, 337-357 (1984).
16. M. Mitchell, An Introduction to Genetic Algorithms (MIT Press, Massachusetts, 1996).
17. W. Spataro, D. D’Ambrosio, R. Rongo and G.A. Trunfio, in Proceedings of the 7th International Conference on Cellular Automata for Research and Industry (Perpignan, France, Sep. 20-23), LNCS 4173, (2004), pp. 725-734.
18. S. Succi, The Lattice Boltzmann Equation for Fluid Dynamics and Beyond (Oxford University Press, Oxford, 2004).
A NEW MODEL FOR THE ORGANIZATIONAL KNOWLEDGE LIFE CYCLE
LUIGI LELLA, IGNAZIO LICATA ISEM, Institute for Scientific Methodology, Palermo, Italy E-mail:
[email protected]
Modern organizations, in particular those operating in evolving and distributed environments, need advanced frameworks for the management of the knowledge life cycle. These systems have to be based on the social relations that constitute the pattern of collaboration ties of the organization. We show here, with the aid of a model taken from the theory of graphs, that it is possible to provide the conditions for effective knowledge management. A promising way is to involve the actors with the highest betweenness centrality in the generation of discussion groups. This solution allows the externalization of tacit knowledge, the preservation of knowledge and the rise of innovation processes. Keywords: organizational knowledge, theory of graphs, network models.
1. The knowledge life cycle
Nowadays every organization must be able to learn quickly and continually from the environment in which it operates (Nonaka, 1994 [18]). New knowledge comes from the experiences of the individuals operating within the organization, and it is constructed through their social and collaborative interactions (Nonaka and Takeuchi, 1995 [19]; Nonaka, Toyama and Konno, 2000 [21]). For this reason technology should focus on finding innovative solutions to improve cooperation among individuals and the awareness of the knowledge and skills reached by each of them (Stenmark, 2003 [24]). As noticed by McElroy (McElroy, 2003 [17]), the knowledge management systems (KMS) of the first generation focused principally on the processes of knowledge diffusion and integration. This means they were based on the assumption that valuable knowledge was already present within the organization. The main purpose of a KMS was therefore to provide the right information to the right people and to codify all the explicit and tacit knowledge embodied in organizational processes and in the beliefs of individuals. But the purpose of a KMS should also be the production of new knowledge, not only the integration of existing organizational knowledge. So, as stated
by McElroy, the KMS of the second generation all have to deal with the problem of the creation of knowledge, favouring the detection of problems and needs and the finding of solutions. This innovation process proceeds in a form that McElroy calls the “Knowledge Life Cycle” or KLC. The KLC is not just a model but, in McElroy’s exact definition, a “framework for placing models in context”: a complex of different competing ways and views of how knowledge can be produced and integrated. The way this framework works is influenced by the following assumptions. First of all, the foundation of learning is the experience of gaps in everyday activities. The detection of these gaps, which are the lack of the knowledge needed to carry out the activities in the right way and in the shortest time, represents a sort of emergence of problems. The detection of gaps is just the first step toward the formulation of problems, which McElroy calls “knowledge claims” and which comprise an analysis, an elaboration of the problems to be solved. Knowledge claims can be conjectures, assertions, reports, guidelines or entire theories on the right processes to follow in order to fill the detected gaps. The formulation of a knowledge claim can involve several individuals, leading to the generation of groups. These communities, in a formal or informal way, share ideas, submitting them to a sort of peer review. This process is required to validate the emergent innovative ideas. Such a process of knowledge claim formulation and evaluation is considered by McElroy a process of knowledge production. Not all knowledge claims survive within the organization by attracting the interest and approval of other individuals. The ones that do not succeed in the evaluation process can be “undecided knowledge claims” or “falsified knowledge claims”. The reports which certify the failure of knowledge claims are called “meta-claims”, i.e. claims about claims.
The knowledge claims which pass the validation are instead integrated into the activities of a wider group of people. The integrated knowledge can take the form of knowledge held mentally by individuals or groups, or of explicit artifacts like documents and files. The first type of container can be considered a special kind of tacit knowledge, but only the explicit forms of knowledge are considered “knowledge claims” by McElroy. The following phase is knowledge use, which regards business processing rather than knowledge processing, even if new problems, and so other knowledge claims, can arise in this last phase as well.
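The three possible outcomes of McElroy's claim-evaluation step can be sketched as a tiny state model. The following Python fragment is our own illustrative encoding, not part of McElroy's framework: the names `ClaimStatus` and `evaluate`, and the idea of a numeric support threshold, are assumptions introduced only to make the life cycle concrete.

```python
from enum import Enum, auto

class ClaimStatus(Enum):
    FORMULATED = auto()   # a knowledge gap elaborated into a claim
    VALIDATED = auto()    # survived peer evaluation: ready for integration
    UNDECIDED = auto()    # evaluation reached no conclusion
    FALSIFIED = auto()    # rejected; the failure is recorded in a meta-claim

def evaluate(claim_support: float, threshold: float = 0.5) -> ClaimStatus:
    """Toy evaluation step: map the share of positive peer reviews
    (0.0 .. 1.0) onto the three outcomes of the KLC evaluation phase."""
    if claim_support > threshold:
        return ClaimStatus.VALIDATED
    if claim_support < 1.0 - threshold:
        return ClaimStatus.FALSIFIED
    return ClaimStatus.UNDECIDED
```

For instance, a claim endorsed by 80% of its reviewers would come out validated, one endorsed by 20% falsified, and a perfectly split review undecided.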
Summing up, the processes of Knowledge Production, Knowledge Integration and Business Processing must not be conceived as isolated: they interact with each other in a complex manner, and their degree of complexity has to be appreciated in order to support the organizational processes of innovation. This means that the capture, coding and deployment of knowledge alone are not sufficient to guarantee the creation of innovation. These efforts are merely examples of information management or information processing, not knowledge management. The main intuition of McElroy is that a KMS has to guarantee strategies and environments where knowledge can also be evaluated, producing knowledge claims and meta-claims. Only an evaluative and critical process can integrate and coordinate the different phases of knowledge management.
2. A new framework to support the KLC
According to the definition of McElroy, in order to support the knowledge life cycle a KMS has to provide and promote knowledge sharing spaces where individuals can discuss problems, conjectures and theories. The system we are going to present tries to achieve this goal in two steps. First the social network of the entire organization is analyzed to detect the points where knowledge and information principally flow. Once these individuals have been detected, they are prompted by the system to create a community to discuss a knowledge claim of common interest. This group can take the form of a community of practice (Hildreth and Kimble, 2000 [10]; Wenger, McDermott and Snyder, 2002 [25]; Saint-Onge and Wallace, 2003 [23]), where individuals meet in face to face encounters, or of a network of practice (Hildreth and Kimble, 2004 [11]), where individuals carry on the debate in virtual environments such as forums, blogs (Jensen, 2003 [14]) or wikis (Ebersbach, Glaser and Heigl, 2005 [9]).
The encounter has to produce a document, for example a report, a guideline or a directive, which summarizes the ideas, problems and solutions that have emerged in the debate. This document has to be structured as a hypertext, meaning that it also has to contain references to other documents and reports. The network of documents can be considered as a network of ideas which have been externalized, in the form of discussion reports, by a single author or a community of authors. It has to be stressed that the present work does not aim to define an opinion formation model (Di Mare and Latora,
2006 [8]; Bordogna and Albano, 2007 [6]). It only aims to present a preliminary study on the effects of the introduction of a new knowledge management platform on the evolution of networks of ideas and knowledge claims within an organization. Figure 1 shows the two different dimensions taken into consideration by the system: the organizational social network, and the network of ideas which emerges from the knowledge production spaces provided by the system and maintained by the individuals who intercept the majority of knowledge and information flows. The emerging network of ideas can be conceived as a complex system which constantly evolves in time. This implies that the structure of the network continuously changes through the addition or removal of nodes and links. In this kind of network the survival of nodes seems to depend on some quality of the nodes, for example the quality, or the perceived interest or utility, of the exposed idea or knowledge claim. Thanks to the innovative value of the claimed ideas it can happen that some research papers acquire in a short timeframe a very large number of citations, many more than other contemporary or older publications. As stated by Bianconi and Barabasi (Bianconi and Barabasi, 2000 [4]), this example suggests that the nodes of a network of ideas have a different ability (fitness) to compete for links. The success of an idea depends also on its popularity and its foundations, which are represented respectively by the number of other documents which reference the externalized idea (for example the in-bound links of an electronic hypertext) and the number of documents referenced by the externalized idea (for example the out-bound links of an electronic hypertext). A good model which considers all these factors is the one presented by Bianconi and Barabasi (Bianconi and Barabasi, 2000 [4]). The process starts with a net consisting of N disconnected nodes.
At every step t = 1, ..., N each node establishes links with m other units. If j is the selected unit, the probability that this node establishes a link with unit i is:
P_i = U_i k_i / Σ_j U_j k_j    (1)
where k_i is the degree of unit i, i.e. the number of links established by it, while U_i is the fitness value associated with the node. We want to build such a network principally by connecting the knowledge claims which have been externalized in debates and encounters promoted by the
Figure 1. Network of externalized ideas at t = 0.
individuals who control the flows of information and knowledge. This choice is due to the fact that these individuals have a broader and more general vision of the problems and needs of the organization, so they are more capable of suggesting knowledge claims which inspire the interest of a wide organizational community. Betweenness centrality has been considered in the literature (Marsden, 2002 [16]; Alony, Whymark and Jones, 2007 [2]) as a way to find the most valuable nodes within a social network. It can be said that a node with a high betweenness centrality plays a “broker” role in the network, i.e. it has great influence over what flows (and what does not) in the network. These nodes indeed play an important role, but at the same time they constitute a failure point of the network, because without their presence some subgroups of individuals within the
organization could be cut off from information and knowledge. The betweenness centrality b_i of a node i belonging to a social network is obtained as:
b_i = Σ_{j,w} g_{jiw} / g_{jw}    (2)
where g_{jw} is the number of shortest paths from node j to node w (j, w ≠ i) and g_{jiw} is the number of shortest paths from node j to node w passing through node i. Betweenness centrality is indeed the best way to select the people who can start and promote discussion among the other individuals within the organization. The purpose of our system is twofold. First of all, our platform has to promote environments where people can share their ideas on topics of common interest. To achieve the best results the system detects the most important people in the knowledge life cycle, that is, those with the highest betweenness centrality. These people have to suggest (possibly with the aid and prompting of the system) a knowledge claim which can engage the largest audience. All the suggestions, problems and solutions which have emerged from the community that grows around the promoters of the discussion are grouped, organized and externalized. In this way a certain amount of tacit knowledge, that is, knowledge principally held in the minds of individuals or embedded in processes (Polanyi, 1967 [22]; Nonaka, 1994 [18]; Nonaka et al., 1998 [20]; Hildreth and Kimble, 2000 [10]; Bhatt, 2001 [3]; Bosua and Scheepers, 2002 [7]), can be exteriorized in an explicit form like a report or a guideline. Thanks to such sharing environments new ideas can arise, promoting the creation of knowledge and innovation. At the same time the system allows another important outcome to be achieved, namely the preservation of knowledge. During the encounters the participants can get in touch with people never seen before, or people with whom they have never collaborated. In this way the knowledge can survive even without the promoter of the discussion. For this reason it has to be considered that the fitness function, that is, the betweenness centrality, is not constant but changes over time.
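Equation (2) can be computed directly by counting shortest paths. The sketch below is a straightforward, unoptimized Python implementation of this definition, under the assumption of an unweighted, undirected graph given as an adjacency list; for large networks a library routine implementing Brandes' algorithm would be preferable.

```python
from collections import deque

def shortest_path_counts(adj, src):
    """BFS from src on an unweighted graph given as an adjacency list.
    Returns (dist, sigma): dist[v] is the distance src -> v, and sigma[v]
    is the number of distinct shortest paths src -> v."""
    dist = {src: 0}
    sigma = {v: 0 for v in adj}
    sigma[src] = 1
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj, i):
    """Equation (2): b_i = sum over ordered pairs (j, w), j, w != i, of
    g_jiw / g_jw.  A shortest j -> w path passes through i exactly when
    d(j, i) + d(i, w) = d(j, w), in which case g_jiw = g_ji * g_iw."""
    dist_i, sigma_i = shortest_path_counts(adj, i)
    b = 0.0
    for j in adj:
        if j == i:
            continue
        dist_j, sigma_j = shortest_path_counts(adj, j)
        if i not in dist_j:
            continue          # i unreachable from j: no path through i
        for w in adj:
            if w in (i, j) or w not in dist_j or w not in dist_i:
                continue
            if dist_j[i] + dist_i[w] == dist_j[w]:
                b += sigma_j[i] * sigma_i[w] / sigma_j[w]
    return b
```

On the three-node path A–B–C, for example, both ordered pairs (A, C) and (C, A) route through B, so B acts as the broker while the endpoints have zero betweenness.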
Participation in a common activity such as a discussion forum, or the updating of the content of a collaborative work environment such as a wiki or a community blog, can be conceived as a form of communication or social relation. So in our model the fitness function we choose for a given externalized idea takes the following form:
b_i(t) = U_i(t) = { 0                                t < t_i
                  { Σ_{j,w} g_{jiw}(t) / g_{jw}(t)   t ≥ t_i    (3)
where t_i is the instant at which the discussion group is constituted. This implies that the betweenness centrality of each node of the social network may vary over time as an effect of the collaborative process of knowledge creation. We assume that b_i(t) is a decreasing function for t > t_i, considering that the creation of a discussion group involves a rewiring process in the social network, localized around the node i which promotes the discussion. This factor has important effects on the evolution of the network of opinions and ideas, as we will show hereafter. In another work Bianconi and Barabasi (Bianconi and Barabasi, 2001 [5]) compared their model to the evolution of a Bose gas, assigning to each node an energy ε_i determined by its fitness U_i and a parameter β acting as an inverse temperature (β = 1/T):
ε_i = −(1/β) log U_i    (4)
According to this mapping, a link between two nodes i and j with different fitnesses U_i and U_j corresponds to two noninteracting particles on the energy levels ε_i and ε_j. The addition of a new node to the network corresponds to the insertion of a new energy level ε_i and of 2m particles into the gas. In particular, m particles, corresponding to the m out-bound links of node i, distribute themselves on the level ε_i, while the other m particles are deposited on the energy levels corresponding to the in-bound links coming from node i. The probability that a particle settles on a level i is given by (1), and deposited particles are not allowed to jump to other energy levels. Each node, added at time t_i and corresponding to an energy level ε_i, is thus characterized by an occupation number k_i(ε_i, t, t_i) representing the number of links (particles) that the node has established at time t. Bianconi and Barabasi (Bianconi and Barabasi, 2001 [5]) made the assumption that each node increases its connectivity following the power law:
k_i(ε_i, t, t_i) = m (t/t_i)^{f(ε_i)}    (5)
By introducing a chemical potential μ_i they also demonstrated that the dynamic exponent f(ε_i) takes the following form:
f(ε_i) = e^{−β(ε_i − μ_i)}    (6)
This mapping has led to the prediction of three different phases in the evolution of their network model. When all the nodes have the same fitness, (6) predicts that f(ε_i) = 1/2, and according to (5) the occupation number, which corresponds to the connectivity of node i, increases as (t/t_i)^{1/2}. This means that old nodes, having smaller t_i, have larger k_i, and the model reduces to the scale-free model (Albert and Barabasi, 2001 [1]). In our case this result indicates that new ideas tend to originate from the most popular ones, which establish more connections than the others, and that old ideas have more chances than the others to become popular and survive. Clearly this phase, called by Bianconi and Barabasi “first-mover-wins” (FMW), does not correspond to a real network of opinions, where the value of an idea influences its success more than its age does. In systems where nodes have different fitnesses, the fittest nodes acquire links at a higher rate even if they have been introduced later than the others. This phase is called by Bianconi and Barabasi “fit-get-rich” (FGR). In our model the value of an externalized idea is approximated by the fitness (3), which can be considered the extent to which the promoter of the knowledge claim is capable of interesting the members of the community which discusses the knowledge claim. In both of these first two phases there is no clear winner, as the fittest node’s share of all links decreases to zero in the thermodynamic limit, leading to the emergence of a hierarchy of a few large hubs surrounded by many less connected nodes. In the “first-mover-wins” phase the relative connectivity of the oldest node follows the law:
k_max(t) / (m t) ≈ t^{(1/2)−1} = t^{−1/2} → 0    (7)
In the “fit-get-rich” phase the relative connectivity of the fittest node decreases as:
k(ε_min, t) / (m t) ≈ t^{f(ε_min)−1} → 0    (8)
considering that f(ε_min) < 1. Bianconi and Barabasi demonstrated that below a given temperature T_{BE} = 1/β_{BE} the fittest node maintains a finite fraction of the total number of connections during the growth of the network. This particular phase has been compared by Bianconi and Barabasi to Bose-Einstein (BE) condensation (Huang, 1987 [12]).
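The growth rule of equation (1) is easy to simulate. The sketch below is our own illustration of the Bianconi-Barabasi mechanism (the function name, the seed-core construction and the parameter names are assumptions, not the authors' code): each new node attaches m links to existing nodes with probability proportional to U_i k_i.

```python
import random

def grow_network(fitnesses, m=2, seed=0):
    """Fitness-driven preferential attachment (eq. 1): start from a fully
    connected core of m+1 nodes; at each step t, node t arrives with
    fitness fitnesses[t] and attaches m links, choosing target i with
    probability U_i * k_i / sum_j U_j * k_j.  Returns the degrees k_i."""
    rng = random.Random(seed)
    n0 = m + 1                            # fully connected seed core
    k = {i: n0 - 1 for i in range(n0)}
    for t in range(n0, len(fitnesses)):
        weights = [fitnesses[i] * k[i] for i in range(t)]
        targets = set()
        while len(targets) < m:           # m distinct targets
            targets.add(rng.choices(range(t), weights=weights)[0])
        k[t] = m
        for i in targets:
            k[i] += 1
    return k
```

Running it with uniform fitnesses (`[1.0] * N`) reproduces the first-mover-wins tendency, where the oldest nodes accumulate the largest degrees; drawing the fitnesses from a broad distribution instead lets fit latecomers catch up, as in the fit-get-rich phase.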
Figure 2. Evolution of the network of externalized ideas at t = 1.
In a network of knowledge claims this phase has to be avoided, because it means that a given idea prevails over the others, limiting the innovation process of the organization. As stressed by Bianconi and Barabasi, real networks have a T-independent fitness distribution, meaning that their status (BE or FGR) is independent of T. Fortunately, our KMS model tends to level the fitnesses of the nodes: the rewiring of social ties around the individuals with the greatest betweenness centralities leads to the appearance of new individuals with the highest betweenness centralities. In this way the BE condensation can be avoided. This assertion cannot be demonstrated mathematically, as the fitness function depends on an unspecified number of variables, but figures 1 and 2 show the way the network of externalized ideas evolves in time. At t = 0 we have three individuals A, B and C with the highest betweenness centralities, who are invited by the system to promote a space of discussion, reporting all the knowledge claims that arise in the documents A, B and C. Some people within the discussion groups A and B notice a correlation between the themes treated by A and B, and a reference is generated between the corresponding reports. A correlation is also detected between the reports B and C, and another reference is added. The spaces of discussion are open to every interested participant, and alerting measures could be adopted in order to spread the invitations over the entire organization. In this way there can arise large
communities which do not include only the strongest ties of the discussion promoter. At t = 1 the pattern of ties within the social network has changed. The betweenness centrality of A, B and C has decreased, and the individuals D and E, now having the highest betweenness centrality, are invited to promote two other discussion spaces. Individuals A, B and C continue to attend the discussions of their groups, but the interest in their knowledge claims has waned, as attested by the decrease of the betweenness centrality of the promoters A, B and C. Thanks to its high fitness, node E within the network of externalized ideas can establish the same number of connections as node B. Probably the individual E promotes a meta-claim on the knowledge claim sustained by the individual B; this justifies the presence of the connection between reports B and E. But the ever changing values of the betweenness centralities of the promoters guarantee that no externalized knowledge claim will prevail over the others.
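The leveling mechanism just described can be illustrated with a toy simulation. This is our own sketch, not the paper's model: the multiplicative `decay` factor is an assumption standing in for the full time-dependent betweenness centrality of equation (3). Every time a node attracts a new link its fitness is reduced, imitating the drop in a promoter's betweenness centrality after a discussion group forms, and as a result no node ends up holding a dominant share of the links.

```python
import random

def grow_with_decay(n, decay=0.5, seed=0):
    """Toy leveling dynamics: each new node t links to an existing node i
    chosen with probability proportional to U_i * k_i (eq. 1); the chosen
    node's fitness is then multiplied by `decay`, mimicking the decrease
    of a promoter's betweenness centrality (eq. 3)."""
    rng = random.Random(seed)
    k = {0: 1, 1: 1}                 # two connected seed nodes
    U = {0: 1.0, 1: 1.0}
    for t in range(2, n):
        weights = [U[i] * k[i] for i in range(t)]
        i = rng.choices(range(t), weights=weights)[0]
        k[i] += 1                    # node i receives the new link
        U[i] *= decay                # ...and its fitness decreases
        k[t], U[t] = 1, 1.0          # the newcomer enters with fresh fitness
    return k

# share of all links held by the best connected node stays small:
degrees = grow_with_decay(400, seed=3)
top_share = max(degrees.values()) / sum(degrees.values())
```

Because a node that has already won links becomes less attractive than a fresh one, the degree distribution stays nearly flat: no condensation onto a single winner occurs, which is the qualitative behaviour claimed for the KMS above.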
3. Conclusion and future work
A KMS needs techniques and strategies to support the entire knowledge life cycle. This process has to lead to the formulation of knowledge claims and meta-claims, which are produced by problem analysis and problem validation processes. These activities cannot be scheduled and structured beforehand in a top-down fashion by the management; they have to arise in an emergent manner, considering the knowledge gaps encountered by the agents in their activities and involving the right people, who can effectively judge and deal with the arising knowledge claims. A possible way is to monitor the evolution of the social network which characterizes the organization, in order to detect the individuals with the highest betweenness centralities and prompt them to detect problems. These individuals intercept the majority of information and knowledge flows, and therefore they are the most appropriate people to suggest knowledge claims, submitting them to the peer review of a large, involved and interested community. In other words, these actors are invited by the system to produce knowledge. The discussion environments promoted and sustained by these individuals drive a complex of social relations oriented toward the generation of meta-claims or further knowledge claims. In this work we presented a knowledge management framework that is designed to externalize tacit knowledge, producing knowledge claims. We have tried to show that our framework is capable of preserving organizational
knowledge from being lost and, most of all, of creating the right conditions for sustaining the innovation processes. We have chosen the model of Bianconi and Barabasi to represent the growth of the network of knowledge claims because it is the only model whose evolutionary process takes into account both the popularity of the externalized ideas, i.e. the number of references made by other knowledge claims to the externalized idea, and the value or fitness of the externalized idea, represented by the betweenness centrality of its promoter. Every time an individual is invited to suggest and promote a knowledge claim, his or her betweenness centrality decreases, favoring the increase of the betweenness centralities of the members of the community generated by the knowledge claim. This sort of leveling effect on the fitnesses of the externalized ideas makes it possible to avoid the situation where a certain knowledge claim prevails over the others. It is important to stress that, without this particular mechanism of involving in discussions the individuals with the highest betweenness centrality, the knowledge gaps perceived by these individuals could remain internalized in a tacit form or, once externalized, could remain limited to a small group of individuals strongly tied to them. A number of issues still have to be treated. First of all, we are going to model and evaluate the effects of the introduction of discussion promoters on the overall structure (Iansiti and Levien, 2004 [13]) of the network of ideas. For example, we will evaluate the robustness of the network of ideas in the presence of specific kinds of perturbations, its productivity in terms of delivery of innovations, its niche creation in terms of variety, i.e. the number of new ideas in a given period of time, and the overall value of the new options created. In this effort we need to take into account both local and global resources of the ecosystem.
For example, the fitness function should not depend exclusively on the betweenness centralities of the nodes, but also on global measures of network health such as the ones introduced above. It has been demonstrated that ecosystems governed by local and global resources can lead to the emergence of stable hubs, which are a strong indicator of system robustness (Lella and Licata, 2007 [15]), and we will try to evaluate whether our knowledge management model follows this particular trend. After this preliminary study of the knowledge model we will choose the communication channels to monitor in order to obtain a good representation of the organizational social network. Many researchers have tried to evaluate the possibility of approximating the pattern of organizational relations principally
by following face to face encounters, telephone communications, teleconferences and email flows. We will review the works regarding organizational social network analysis and we will try alternative ways to reconstruct the pattern of social ties, such as the networks represented by k-logs. The second problem to be solved is the choice of the most suitable environment to promote the creation of knowledge. We will compare the performances of different solutions like face to face debates, forums, community blogs and wikis. Finally we will have to define appropriate mechanisms and strategies to motivate individuals to share their knowledge and experiences, possibly suggesting to them a list of topics to debate with their colleagues. We will also have to define ways to invite individuals to the discussion groups. We will compare the effects of different solutions, such as direct invitation by the promoter, or alerting mechanisms which, for example, suggest the discussion groups potentially interesting for the activities of each individual operating in the organization.
Acknowledgments
This work has been partially funded by the PRIN-2005 research project “Dinamiche della Conoscenza nella Società dell’Informazione”, national coordinator Prof. Cristiano Castelfranchi. One of the authors (IL) thanks Ginestra Bianconi for her valuable suggestions and encouragement.
References
1. R. Albert and A. Barabasi, Rev. Mod. Phys. 74, 47-97 (2001).
2. I. Alony, G. Whymark and M. Jones, Informing Science Journal 10, (2007).
3. G.D. Bhatt, Journal of Knowledge Management 5(2), (2001).
4. G. Bianconi and A.-L. Barabasi, arXiv:cond-mat/0011029v1, (2000).
5. G. Bianconi and A.-L. Barabasi, Physical Review Letters 86(24), (2001).
6. C.-M. Bordogna and E.-V. Albano, Journal of Physics: Condensed Matter 19, (2007).
7. R. Bosua and R. Scheepers, in Proceedings of the 25th Information Systems Research Seminar in Scandinavia (IRIS25), Ed. K. Bodker, M.K. Pedersen, J. Norbjerg, J. Simonsen and M.T. Vendelo, (Roskilde University, Denmark, 2002).
8. A. Di Mare and V. Latora, arXiv:physics/0609127, (2006).
9. A. Ebersbach, M. Glaser and R. Heigl, Wiki: Web Collaboration (Springer, 2005).
10. P. Hildreth and C. Kimble, Journal of Knowledge Management 4(1), 27-38 (2000).
11. P. Hildreth and C. Kimble, Knowledge Networks: Innovation through Communities of Practice (Idea Group, Hershey, PA, 2004).
12. K. Huang, Statistical Mechanics (Wiley, Singapore, 1987).
13. M. Iansiti and R. Levien, The Keystone Advantage: What the New Dynamics of Business Ecosystems Mean for Strategy, Innovation and Sustainability (Harvard Business School Press, Boston, 2004).
14. M. Jensen, Columbia Journalism Review (2003).
15. L. Lella and I. Licata, EJTP 4(14), 31-50 (2007).
16. P.V. Marsden, Social Networks 24(4), 407-422 (2002).
17. M.W. McElroy, The New Knowledge Management: Complexity, Learning, and Sustainable Innovation (KMCI Press, Butterworth-Heinemann, Boston, MA, 2003).
18. I. Nonaka, Organization Science 5(1), (1994).
19. I. Nonaka and H. Takeuchi, The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation (Oxford University Press, New York, 1995).
20. I. Nonaka, P. Reinmoeller and D. Senoo, Euro. Manag. J. 16(6), 673-684 (1998).
21. I. Nonaka, R. Toyama and N. Konno, Long Range Planning 32, 5-34 (2000).
22. M. Polanyi, in Knowledge in Organizations, Ed. L. Prusak, (Butterworth-Heinemann, Boston, MA, 1967), pp. 135-146.
23. H. Saint-Onge and D. Wallace, Leveraging Communities of Practice (Butterworth-Heinemann, Boston, MA, 2003).
24. D. Stenmark, Knowledge and Process Management 10(3), 207-216 (2003).
25. E. Wenger, R. McDermott and W.M. Snyder, Cultivating Communities of Practice (HBS Press, 2002).
ON GENERALIZATION: CONSTRUCTING A GENERAL CONCEPT FROM A SINGLE EXAMPLE
SHELIA GUBERMAN Digital Oil Technologies, Cupertino, California, USA E-mail:
[email protected] Using the linguistic approach it is possible to generalize from a single example. Keywords: linguistic approach, concept formation, generalization.
1. Introduction and background
In Artificial Intelligence it is accepted that a computer can create a pattern by applying a pattern recognition algorithm to a set of examples that represent at least two classes of objects (for example, various representations of the characters “A” and “B”). Because there are no precise definitions of “pattern” and “concept”, these two terms were considered synonyms in the AI context. That substitution does not solve any problems, but it reflects the general tendency in AI to baselessly elevate, by using philosophical vocabulary, the level of “intelligence” achieved at some point in time. So, it was decided that if one can say that the computer creates a pattern, it is correct to say that the computer creates a concept. The reality is quite different. The computer does not create patterns. Pattern recognition programs generate decision rules. The decision rule built by the computer represents the difference between a given class of objects (a “pattern”) and another given class (or a number of classes) – not the essence of the class, i.e., the pattern. This state of the art becomes clear when we consider, for example, that water can be distinguished from ice by density, from oil by electrical resistance, from acid by its effect on living tissue. Here it is important to observe that none of these discernible attributes is sufficient to describe water as a concept. Consequently, because pattern recognition programs do not generate patterns, nobody can say that pattern recognition is a solution for creating concepts (without discussing the equivalence between pattern and concept). At the same time these two problems – pattern recognition and creating a concept – can be treated using a similar approach, i.e., the linguistic approach.
Figure 1 (a)-(c). Winston’s examples for constructing the notion of “an arch”.
2. On the philosophical approach
In philosophy the definition of a concept is as follows: “A concept is an abstract idea or a mental symbol, typically associated with a corresponding representation in language” [13]. In this paper we follow the “representation in language” approach. P.H. Winston was the first to reject the idea of creating a concept from a number of examples [14]. In that paper Winston proposed the use, for machine learning, of the concept of “an arch”. In his approach he emphasized the importance of an adequate “visual language” of description and presented, as a matter of fact, a single example of an arch and a number of examples of what might be considered “not an arch”, each example calling attention to a crucial feature of an arch (see Figure 1). It turns out that this approach too embodies an attempt to define a “concept” through differentiating attributes, and it fails (see the example of the concept of “water” above). Let us now take a close look at the problem from the linguistic point of view, using the same example – an arch. Let us describe an arch in simple English. The result might be something like this:

One block rests on two standing blocks    (1)
We can transform this sentence to a more formal grammatical structure. Specifically: [(4-block) rests] on {[(4-block) stands] [(4-block) stands]}
(2)
Where brackets reflect the levels of the grammatical structure, and “4” indicates that the base of the block is a 4-angle. In our opinion, this relatively simple construction makes a lot of sense and allows us to construct a concept. The reasons are these: 1. The level of the term in that structure reflects the importance of the feature in the definition of the notion “an arch”: the higher the structural level of the term, the more important it is in defining the concept of “an arch”. The
On Generalization: Constructing a General Concept from a Single Example
Figure 2 (panels a–c).
lower the level, the less important it is. For example, the lowest-level term (for a block in our example it is 4) can be altered (to become a 3 or a 10) with very little influence on the nature of the “arch” object (see, for example, Figure 2(a)). The term appearing at the next level (the block) can be changed – say, to a pyramid or to an “H-profile block” – and the structure will still be recognized as an arch, or at least a caricature of one (see Figure 2(b)). Changes at a higher level, however, put an end to perceiving this structure as an arch (see Figure 2(c), where “stands” is changed to “lies”).
2. We note that in sentence (2) there is no indication of the relation between the two upright blocks. This means that there is nothing specific to be said about this relation, i.e., the upright blocks are in a “nonspecific” position. If they were in a particular position (for example, touching each other), this would be reflected in the natural language description (“two upright blocks are touching each other and the third one rests on them”).
3. The use of the terms “upright” and “rests” may trigger the realization that the “upright block” is vertical and the “resting block” is horizontal.
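The role of structural level can be sketched in code. The nesting and the "arch-ness" criterion below are a toy of our own devising (not the author's program); the point is only that perturbing a deep term preserves the structure, while perturbing a shallow one destroys it.

```python
# A sketch of sentence (2) as a nested structure:
#   [(4-block) rests] on {[(4-block) stands] [(4-block) stands]}
# Deeper terms matter less; we perturb terms at different depths and check
# whether the result still counts as an "arch" under a toy criterion.
arch = ("on",
        ("rests",  ("block", 4)),
        ("stands", ("block", 4)),
        ("stands", ("block", 4)))

def is_arch(s):
    """Toy criterion: the top relation and the verbs (high levels) must be
    intact; the base shape of each block (lowest level) is irrelevant."""
    rel, top, left, right = s
    return (rel == "on" and top[0] == "rests"
            and left[0] == "stands" and right[0] == "stands")

assert is_arch(arch)

# Changing the lowest-level term (base 4 -> 3) preserves arch-ness (Fig. 2a):
variant = ("on", ("rests", ("block", 3)),
                 ("stands", ("block", 4)), ("stands", ("block", 4)))
assert is_arch(variant)

# Changing a high-level term ("stands" -> "lies") destroys it (Fig. 2c):
broken = ("on", ("rests", ("block", 4)),
                ("lies", ("block", 4)), ("stands", ("block", 4)))
assert not is_arch(broken)
```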
Nevertheless, the most important information captured in sentence (2) is the hierarchical grammatical structure of the statement. It allows us to emphasize the key features of the object under consideration and, as a consequence, to define the notion of “an arch”. We thus posit that this grammatically structured sentence is the notion of an arch; the word “arch” is merely a label for that structure. So far we have considered a particular example of description. We now consider whether there exists a general approach which allows us to create concepts through the use of an adequate language of description.

2.1. Constructing a description

According to Bongard's “imitation principle” [1], the most effective way of solving any recognition problem is to describe the objects to be recognized in
terms of how the objects were created. We point out that sentence (2) discussed above defines the concept of “an arch”. Moreover, it comprises an instruction on how to build an arch (stand two vertical bars apart and rest a third one across the top of the two vertical ones). If we can develop a program that will “look” at a single example of an arch (Figure 1) and create description (2), this will solve the problem of recognizing arches in general, because structure (2) is an arch. In other words, if an object is described in an adequate language of description, this description contains the description of the class to which the object belongs, and thus the “concept”. To further illustrate the construction of a description, let us now consider the computer recognition of handwriting. A language adequate for describing written scripts has been introduced [3]. In this language, the description of any given character becomes the “name” of the character. We consider it remarkable that the language adequate for handwriting recognition is the language used in describing the process of writing. We note that this is in accordance with the “imitation principle” introduced by Bongard [1]. Most importantly, the script is treated not as a static picture, but as the track of the movement of the writing implement. The language of description for writing recognition consists of 8 basic elements (“words”), which are interrelated: in free-hand writing, one element can be transformed into another, namely its neighbor in the line of elements shown in (3): (3) Therefore, in this language of description, the canonical character is described as a sequence of elements. If we have such a description of the canonical “a”, we can then apply the transformation rule and obtain all possible shapes of a written “a”:
• If the first element is changed into its neighbor in sequence (3), this will produce one written “a”.
• If the second element is changed into its neighbor, this will produce another written “a”.
• If the third element is changed to its neighbor, this will produce yet another written “a”,
• and so on.
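The element glyphs themselves are not reproduced in this text, so a sketch with numeric stand-ins for the eight elements may still convey the generative rule: a canonical character is a sequence of elements, and the class of written forms is generated by replacing any element with a neighbor in the element line (3). The sequence `canonical_a` below is an invented stand-in, not the actual description of “a”.

```python
# Stand in for the eight basic elements with indices 0..7 arranged along
# the neighbor line (3); the glyphs themselves are not reproduced here.
def variants(canonical, n_elements=8):
    """Yield every one-step variant of a canonical element sequence,
    obtained by replacing one element with a neighbor in the line."""
    out = []
    for i, e in enumerate(canonical):
        for nb in (e - 1, e + 1):          # neighbors in the element line
            if 0 <= nb < n_elements:
                out.append(canonical[:i] + (nb,) + canonical[i + 1:])
    return out

canonical_a = (2, 5, 3)                    # invented stand-in for a written "a"
forms = variants(canonical_a)

# Each generated form differs from the canonical one in exactly one position:
assert all(sum(x != y for x, y in zip(v, canonical_a)) == 1 for v in forms)
assert len(forms) == 6                     # 3 positions x 2 neighbors each
```

From a single canonical description the whole class of written forms is enumerated, which is the generalization the text describes.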
Thus, we get exactly what we are after: from the appropriate description of a single example of a written “a” we obtain the description of the complete class of “a”. A number of other applications demonstrate that real solutions to old unsolved problems of pattern recognition are found only when an adequate language of description is used for the objects under consideration. We again emphasize that it is remarkable that when the description is adequate it not only gives the right answer, but the decision rule becomes extremely simple, one may even say primitive [4,5]. What this means is that such an approach may reveal a new understanding of the intellectual processes that occur in the human brain. It seems to us that what intelligence requires is not sophisticated procedures for decision-making and recognition, but the ability to create an adequate description of the world. That leads us to a deeper problem: why is our brain endowed with the ability to adequately reflect the world (the outer one as well as the inner one)? This problem was discussed by Wittgenstein at the beginning of the 20th century [15]. His answer was as follows. There exists an infinite number of potential objects in the world, but there is only a finite number of notions in our language. Every object that is reflected in the language as a notion has a social meaning. The knowledge about these notions as social objects (the structure of the objects as well as their key features) has to be transmitted from one person to another, from one generation to the next. If so, there is no socially important object in the world that cannot be articulated in language. This means that, to a great degree, the world in which we live consciously is a world filtered through our language.
To put it simply (and perhaps at the risk of oversimplification) we may say this: we live in a world which is defined by our language.

3. Implicit generalization

Let us now analyze a very simple image (Figure 3). The description of this image in a “natural language” might be: “big white triangle”
(4)
This sentence can be transformed into its formal structure:
(big) (white) (3) [angle]
(5)
The grammatical structure of (5) is:
Figure 3.
plane figure
    big
    white
    polygon = “n-angle”
        3
(6)
Sentence (5) represents the notion of a “big white triangle”. Now let us try to generalize this notion by eliminating some of the terms in the grammatical structure of (5), shown in (6). If we start at the lowest level of this structure, we may eliminate the term “3”. As a result we get the description of a more general class: “big white polygon (n-angle)”. If we now omit “big” from the second level of the structure in (6), we obtain the broader notion of a “white n-angle”. If instead we leave out “white”, we get the notion of a “big n-angle”. Finally, if we omit “n-angle”, we get the still more general notion of a “big white” plane figure. We now note that in natural language the information embedded in a given sentence comes not only from the terms explicitly used in the sentence but also from related terms that are not explicitly brought up. Consider sentence (5) again. The “big white triangle” implies the existence of a “small white triangle”, of a “big black triangle”, of a small black 4-angle, and so on. Thus a single sentence, which describes a simple object, may represent a world populated by big triangles, small triangles, white quadrangles, black pentagons, and so on. And that is not all. As soon as we contemplate a small object and draw it on paper in a certain way, this may generate a new description, for example: “small white triangle in the left upper corner”. This opens a new dimension in the world of figures (more precisely, two dimensions if we deal with figures on a plane).
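The elimination of terms can be sketched mechanically. This is a toy enumeration of our own: it treats the “3”/“n-angle” pair as a single droppable slot, whereas the text distinguishes dropping “3” (giving “polygon”) from dropping the polygon term altogether.

```python
from itertools import combinations

# Generalize "big white triangle" by omitting terms of structure (6).
terms = ["big", "white", "n-angle"]      # simplified slots of structure (6)

def generalizations(terms):
    """All broader notions obtained by omitting one or more terms
    (but never all of them)."""
    out = []
    for k in range(1, len(terms)):       # number of terms to drop
        for kept in combinations(terms, len(terms) - k):
            out.append(" ".join(kept))
    return out

gens = generalizations(terms)
assert "big white" in gens               # omit "n-angle": big white plane figure
assert "white n-angle" in gens           # omit "big"
assert "big n-angle" in gens             # omit "white"
assert len(gens) == 6                    # 3 two-term + 3 one-term notions
```

Each dropped term widens the class, exactly the move the paragraph performs by hand.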
On Generalization: Constructing a General Concept from a Single Example
Figure 4 (panels a and b).
We observe that in natural language, even for a simple sentence, layers of understanding arise from knowledge about the language and the world. This represents an important difference between natural and algorithmic languages. Another difference between them lies in their goals: the goal of natural language is to explain; the goal of algorithmic language is to ensure program execution. Each subsequent sentence in a text written in natural language has to be understandable; each subsequent sentence written in an algorithmic language has to be executable [6]. We will now show how the linguistic interpretation of a concept can explain some of our mental abilities.

4. Psychological experiments

The experiments concerning visual observation described in this section have been previously presented in detail [10]. The task presented to the human subject of the experiment is to find, in Figures 4(a) and 4(b), an object different from all the others shown. It turns out that the search time for such an object is significantly longer when the subject is presented with Figure 4(b). We will attempt to explain this observation by making use of the approach presented in this paper. We note that the whole image is perceived as a number of similar objects: pairs of parallel lines (in accordance with the proximity principle of Gestalt psychology) [10]. The description of the object in natural language is “two parallel lines”. As with sentence (5), explored in the previous section, we note that it is possible to generate objects dissimilar to the “two parallel lines” description by inserting “non” before each element of the structure:
1. “non-two parallel lines”
2. “two non-parallel lines”
3. “two parallel non-lines” (for example, arcs)
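The “insert non” move just described can be sketched as a one-line generator; the function name is ours, invented for this illustration.

```python
# From the description of the ordinary object, generate candidate
# descriptions of the odd one out by negating one slot at a time.
def dissimilar_candidates(description):
    words = description.split()
    return [" ".join(words[:i] + ["non-" + words[i]] + words[i + 1:])
            for i in range(len(words))]

cands = dissimilar_candidates("two parallel lines")
assert cands == ["non-two parallel lines",
                 "two non-parallel lines",
                 "two parallel non-lines"]
```

Recognizing the ordinary object thus implicitly yields the small search space of possible dissimilar objects, which is what the experiment below exploits.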
Figure 5 (panels a and b).
One can see that the object that is different from the rest, which has to be found, fits description 2. This means that as soon as we recognize that the ordinary object here is “two parallel lines”, we implicitly possess an idea of what we have to find. Option 1 clearly has to be rejected: all objects consist of two lines. Option 3 has to be rejected as well, because all objects in Figure 4(a) are lines. The first proposition tested is therefore 2, and it leads to the right solution. For Figure 4(b), the description of the ordinary object is “two lines”. The dissimilar object is described as:
1. “non-two lines” (for example, “three lines” or “one line”),
2. “two non-lines” (for example, “two arcs”),
3. “two parallel lines”.
For the situation shown in Figure 4(b), the description “two parallel lines” does not correspond to the unconsciously perceived pattern. It is thus reasonable to predict that the decision time for finding the dissimilar object will be longer. The actual experiment [10] is in agreement with this prediction, arrived at by analyzing the structure. The same analysis and conclusion can be obtained for the two images shown in Figures 5(a) and 5(b). The task is the same: find the dissimilar object. The natural language description of the object in Figure 5(a) is “vertical line”. The dissimilar object described by the “non-vertical line” structure turns out to be precisely the object that is the subject of the required task. The description of a standard object in Figure 5(b) is “line”; we observe that the “non-line” description will not generate the description of the dissimilar object – “vertical line”.

4.1. Gestalt as generalization

All basic Gestalt principles (similarity, proximity, good continuation and so on) help in recognizing the organization of the image, i.e., in dividing the image into appropriate parts and finding the relationships between them. In the case
On Generalization: Constructing a General Concept from a Single Example
Figure 6 (panels a and b).
of the simple drawing in Figure 6(a), the image can be described as two crossing lines “ab” and “cd”, or two touching angles “ac” and “bd”, or four lines “aO”, “bO”, “cO”, “dO”. The “good continuation” principle helps describe the image (i.e., represent our perception) as containing two parts: two crossing lines “ab” and “cd”. Why is this choice of representing (describing) the image preferable? Of all potentially possible partitions of the whole, the preferred set of parts is the one with the simplest description [3]. The simplicity of the description reflects 1) the number of parts (the fewer the parts, the simpler the description), 2) the relationships between the parts (touching, crossing, above, to the right), and 3) the simplicity of the description of each of the parts. Thus the hypothesis of creating the image in Figure 6(a) by drawing the lines point-by-point and in random order has to be rejected as extremely complicated and practically impossible. The number of parts in the case of two crossing lines and in the case of two touching corners is the same – two – but creating the whole from the chosen parts is much more difficult in the case of corners. It is simple to draw the first corner, but drawing the second one takes a lot of concentration. First, its vertex has to coincide with the vertex of the first corner. Secondly, the direction of its first leg has to be precisely the same as the direction of the corresponding leg of the first corner; that gives the smooth continuation at the crossing point. The same conditions have to be satisfied for the second leg. Overall, it is a very arduous problem. This means that the relationships between the parts are very complicated. In the case of crossing lines the relationships are described by one condition only – crossing.
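The three simplicity criteria just listed can be rendered as a toy scoring function; the function and all numeric costs below are invented for illustration, not a measure proposed in the paper.

```python
# Toy scoring of the two partitions of Figure 6(a), following the three
# stated criteria: number of parts, relationship cost, per-part cost.
def description_cost(parts, relation_cost, part_cost=1):
    # 1) count of parts  2) cost of their relationship  3) cost of each part
    return len(parts) + relation_cost + part_cost * len(parts)

# Two crossing lines: one relational condition only ("crossing").
lines = description_cost(["ab", "cd"], relation_cost=1)

# Two touching angles: coincident vertices plus two matched leg directions.
angles = description_cost(["ac", "bd"], relation_cost=3)

assert lines < angles   # the simpler description wins: "two crossing lines"
```

Under any such scoring the crossing-lines partition wins because its relational cost is minimal, matching the perceptual preference the text describes.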
The perception of Figure 6(a) as “two crossing lines” in fact represents not only the given image, but also a set of images (see Figure 6(b), first row) which our perception assigns to one class – images which carry the same
pattern, the same Gestalt. One of the important features of that pattern is stability: one can change some parameters of an object (the curvature of the lines, the intersection point, or the length of the lines) and the resulting image will still carry the same Gestalt. On the contrary, should one choose to describe Figure 6(a) as consisting of two angles, changes in the parameters will create a set of images (Figure 6(b), second row) which our perception will not accept as belonging to the same class, the same pattern, the same Gestalt, as the initial image. So the first row in Figure 6(b), containing figures with the same Gestalt, is a correct generalization of Figure 6(a).

4.2. Grammatical structure of a common sentence

Although the examples analyzed above are formally expressions in natural language, they are rather “technical”. Let us now analyze an ordinary sentence in natural language: The black horse jumped over the half-decayed fence.
(7)
As noted before, there are many things we may deduce from this sentence, depending on our knowledge of the language and of the world. This sentence may describe a scene of a chase out of a city, perhaps near an abandoned farm. We may have seen such a scene many times in western movies. Here is its grammatical structure:
[(black) horse] [jumped over] [(half-decayed) fence]
(8)
Let us now begin to change the low level terms in this structure and observe how this may influence our perception of the scene. The white horse jumped over the half-decayed fence.
(9)
Nothing essential changes in our perception, and yet our intellect is sensitive to details of this kind: in our understanding of the world (or at least of the movie world), a white horse is something special and, in most cases, is associated with an important person in the story.
(10)
The essence of the scene is still the same but now it could take place in a different environment – perhaps closer to a populated area, perhaps on the outskirts of a small town.
The black horse kicked the half-decayed fence.
(11)
Our perception, and the pattern, have changed dramatically. There is no longer a chase. We may imagine a rather comical scene: a drunken cowboy on an unhappy horse. Now let us ponder this: The black fly flew over a half-decayed fence.
(12)
This sentence may create a completely different perception of the scene. We may imagine a hot afternoon. Cowboys are sitting in restful repose near the saloon, half-asleep. Silence. Only a black fly makes a loud buzzing noise. We now note that the analysis of the transformations of sentence (7) may offer a glimpse into a cultural phenomenon worth pondering. The history of medicine shows that some diseases are referred to by the name of the physician who first described them (Alzheimer's disease or Korsakov's syndrome, for example). The description of a particular case of the disease was then used by generations of physicians as a description of the disease. However, we know that the same disease may create, in different patients, patterns of symptoms with a number of variations. Therefore, the generalization could be successful only because the description contained not only the list of characteristic symptoms of the disease but also their relative importance. As demonstrated in the previous sections of this paper, such information can be represented in the grammatical structure of the description. The fact that these descriptions were really good means that these physicians were well educated and highly skilled in natural language. The physicians of future generations will have to be linguistically educated to be able to extract knowledge from the grammatical structure. All this shows the importance of wielding a skilful pen not only for physicians but also for many other professionals – a truism which has been, and still is, disputed by many, many students in school.

5. Conclusion

We have demonstrated that if the description of a single object or situation is obtained in an adequate language, its grammatical structure will contain information on the relative importance of the properties of the objects or situations. This allows the creation of generalizations (abstractions) of the object or situation.
This, in turn, allows us to discover relationships between the structure of the language and the system of concepts about the world around us. One of
the manifestations of this kind of relationship is that the name of an object (not a symbol denoting it, but the name expressed in an adequate language of description) reflects its essence. We note here that this theme is the subject of discussion in a specific branch of philosophy – the philosophy of the name [7]. In conclusion, let us quote two authors whose point of view we share. Plato: “Words do imitate Ideal Forms in a perfect and consistent way” [8]. Russell: “For my part, I believe that, partly by means of the study of syntax, we can arrive at considerable knowledge concerning the structure of the world” [9].

References
1. M. Bongard, Pattern Recognition (Spartan Books, New York, 1970).
2. A. Church, Introduction to Mathematical Logic, Vol. 1 (Princeton University Press, Princeton, NJ, 1956).
3. S. Guberman, in Proceedings of the 6th Systems Science European Congress, Sept. 19-22, 2005 (Ecole Nationale Supérieure d'Arts et Métiers (ENSAM), Paris, 2005).
4. S. Guberman and E. Andreevsky, Cybernetics and Human Knowing 3(4), 41-53 (1996).
5. S. Guberman, Y. Pikovskii, E. Rantsman, in Proc. SPE Western Regional Meeting, (Long Beach, California, 1997).
6. S. Guberman, W. Wojtkowski, Res-Systemica, 2005, http://www.afscet.asso.fr/resSystemica/
7. A. Losev, Philosophy of Name (in Russian: Context Publ., Moscow, 1992).
8. Plato, Cratylus, available at http://www.journals.uchicago.edu/ISIS/journal/issues/v94n4/940415010/940415010.web.pdf
9. B. Russell, An Inquiry into Meaning and Truth (Allen & Unwin, London, 1940), Preface.
10. A. Treisman, Scientific American 254(1), 114-125 (1986).
11. M. Wertheimer, Productive Thinking (Harper & Brothers, New York, 1959).
12. M. Wertheimer, Philosophische Zeitschrift für Forschung und Aussprache 1, 39-6 (1924).
13. Wikipedia, “concept”, “gestalt”, www.wikipedia.org.
14. P.H. Winston, Learning Structural Descriptions from Examples, Technical Report (Massachusetts Institute of Technology, 1970), available at https://dspace.mit.edu/bitstream/1721.1/6884/2/AITR-231.pdf.
15. L. Wittgenstein, Tractatus Logico-Philosophicus (Taylor & Francis, London, 2001).
GENERAL THEORY OF EMERGENCE BEYOND SYSTEMIC GENERALIZATION
GIANFRANCO MINATI
Italian Systems Society, Milan, Italy
E-mail: [email protected]

The problem of defining generalization is considered by examining some core aspects, such as (a) the extent of the domain of validity of a property, (b) the transformation between different non-equivalent representations and (c) the respective representations of different observers and their relationships, i.e., a dynamic theory of relationships between levels of observation as introduced by the Dynamic Usage of Models (DYSAM). The purpose of this paper is to clarify the conceptual framework of generalization in order to set the context for a General Theory of Emergence as a meta-theory, using models of models (as for logical openness) and interacting hierarchies. After considering some approaches used to generalize, and focussing upon the purpose of General System Theory with respect to generalizing, we examine some concrete approaches, such as DYSAM, for building up a General Theory of Emergence with specific theories of disciplinary emergence as particular cases.

Keywords: emergence, generalization, meta-theory, models, system, trans-disciplinarity.
1. Introduction

We introduce the concept of generalization by distinguishing between the concepts of transposition and translation of properties. We then specify the meaning of generalizing as: 1) extending the domain of validity of properties; 2) transforming between different non-equivalent representations; 3) relating the respective representations of different observers. The discussion concerns the possibility of using models, representations, methodologies and results obtained in one domain in another one, and of simultaneously using different, non-equivalent models, as in the Dynamic Usage of Models (DYSAM) briefly described below. We present a list of classical approaches, i.e., approaches in non-systemic frameworks, used for generalizing: Abstraction, Analogy, Concept, Homomorphism, Induction, Isomorphism, Knowledge Representation, Language, Learning, Metaphor, Model, Relation and Structure. We briefly discuss what is considered to be the opposite of generalizing, that is, making unique and non-repeatable, focussing upon the level of representation used.
241
G. Minati
We then examine the generalization of systemic properties, i.e., considering them as properties of categories of new entities, i.e., systems. We consider the so-called General System Theory (GST) approach and the prospective framework of a General Theory of Emergence (GTE). While GST is concerned with the generality of properties of systems established through organization, a GTE is expected to focus upon collective phenomena establishing systems, and particularly upon: 1) correspondences between models and representations of phenomena considered emergent; 2) the development of tools for detecting and verifying processes of emergence and de-emergence; 3) the identification and classification of different possible non-equivalent kinds of emergence; 4) the identification of the limitations of its generalization by defining the domain of validity of such a theory. We then introduce DYSAM as an approach for meta-modeling within the framework of the search for a GTE as meta-theory, still unavailable, and we introduce aspects considered relevant for a future GTE.

2. What is generalization?

2.1. An introduction

The word generalization denotes the process of making general. The adjective general identifies the fact that a property is considered suitable for a larger quantity or wider variety of elements than that originally considered. The process of generalizing is dealt with in various disciplines, such as logic when dealing with induction [1], philosophy [2], and psychology and AI in processes of learning [3]. It may take place with or without changing the variables or rules in representations and models. In the first case there is a transposition of the same property into different contexts; in the second there is an adaptation, a translation, which nevertheless maintains the fundamental aspects.
In linguistics, for instance, the same words or expressions may be transposed between different languages, or may be translated in such a way as to keep the same meaning by using different words and concepts. Transposition takes place when the same model is used, with a changed meaning of its variables, to model different kinds of phenomena, as with the Lotka-Volterra equations used to describe both population dynamics and market dynamics. In music, an example of transposition is when an orchestra adapts its playing to a special historical recorded theme or to natural sounds (e.g., the Kindersinfonie, the “toy symphony” attributed to Mozart). Translation takes place when the same conceptual approach is used to model different kinds of
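The Lotka-Volterra transposition can be sketched directly: the same equations are integrated twice, once under a predator-prey reading and once under a market reading, with only the meaning of the variables changed. The Euler integrator and all parameter values below are our own invention for illustration.

```python
# Transposition sketch: one Lotka-Volterra model, two readings.
def lotka_volterra(x, y, a, b, c, d, dt=0.001, steps=5000):
    """dx/dt = a*x - b*x*y ; dy/dt = -c*y + d*x*y (simple Euler steps)."""
    for _ in range(steps):
        x, y = (x + dt * (a * x - b * x * y),
                y + dt * (-c * y + d * x * y))
    return x, y

# Reading 1: x = prey population, y = predator population.
prey, predators = lotka_volterra(10.0, 5.0, a=1.1, b=0.4, c=0.4, d=0.1)

# Reading 2: same model, variables re-labeled as market quantities.
buyers, sellers = lotka_volterra(10.0, 5.0, a=1.1, b=0.4, c=0.4, d=0.1)

assert (prey, predators) == (buyers, sellers)  # identical dynamics, transposed meaning
assert prey > 0 and predators > 0              # both populations remain positive
```

Nothing in the mathematics distinguishes the two readings; the transposition lives entirely in the interpretation of the variables, which is the point of the example.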
phenomena, such as looking for attractors and processes of convergence or equilibrium in different models. In music, an example of translation is when the same score is adapted for another musical instrument. We recall that generalizing is different from making generic. The process of making generic renders properties imprecise and fuzzy: the field of validity is not well-defined and, because of that, extended. A generic property possesses little rigor and is accepted as being imprecise. In everyday usage of the concept there is an unfortunate correspondence between impreciseness and general validity. Similarly, popularizing is based on simplifying complex concepts. Generalizing, in this usage, is intended as an extension of the validity of well-defined properties to other entities in a domain, or to other domains, by reducing accuracy.

2.2. Specifying the concept of generalization

We may now try to specify the concept of generalization more precisely. It relates, for instance, to
1. the problem of extending the domain of validity of a property. In this case the process of generalizing consists of replicating a behavior in any context (success or failure is not cognitively learned, but evolutionarily selected). For instance, the use of pheromone trails by ants is a behavior repeated in any context; it does not work on lava, ice or water because these are unsuitable, non-survivable environments for ants. In this case the behavioral rules of the system are fixed;
2. the transformation between different non-equivalent representations, such as continuous and discrete (allowing interpolation and extrapolation in mathematics), conservative and non-conservative systems [4], or biological and physical modeling [5]. This relates to the relationships between different models. By the way, we are not assuming that it is always possible to transform one model into another, to reduce one to another.
It is possible, however, to find correspondences which may allow reducibility or furnish the reasons for irreducibility;
3. the respective representations of different observers and their relationships. This relates to the lack of a dynamic theory of relationships between levels of observation. One attempt to introduce a suitable approach was the Dynamic Usage of Models (DYSAM) [6,7]. We underline how this view is the opposite of reductionism, which is based on the usage of a single, specific disciplinary model to deal with any kind of phenomenon. An extension of this approach is given by learning, intended as in modern cognitive science, i.e., as the process of suitably changing behavioral rules to better fit the environment.
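A minimal sketch of the DYSAM idea as we read it (this is our own toy illustration, not an implementation from [6,7]): several non-equivalent models of the same phenomenon are kept in use side by side, and the choice among them, or their combination, is left to the observer and the level of use. The model names and data are invented.

```python
# Two deliberately non-equivalent readings of the same observed series.
data = [1.0, 2.1, 2.9, 4.2, 5.1]

models = {
    # linear reading: next value ~ last value plus the mean increment
    "linear": lambda xs: xs[-1] + (xs[-1] - xs[0]) / (len(xs) - 1),
    # persistence reading: next value ~ last value (no trend assumed)
    "persistence": lambda xs: xs[-1],
}

# DYSAM-style usage: all predictions are retained simultaneously rather
# than electing a single "best" model.
predictions = {name: m(data) for name, m in models.items()}

assert predictions["persistence"] == 5.1
assert round(predictions["linear"], 3) == 6.125
```

The anti-reductionist point is carried by the dictionary: no model is discarded, and acting at one level (choosing which prediction to use) is decoupled from the modeling itself.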
These three points may be considered as issues for the fulfillment of strategies for generalizing. The interest in a better understanding of the concept of generalization relates to the possibility of implementing two different strategies:
• the use of models, representations, methodologies and results obtained in one domain in another one, as with inter-disciplinarity, mentioned below in Section 5;
• the simultaneous usage of different non-equivalent models, by adopting multidimensional representations and leaving it to the user to identify a suitable strategy for dealing with multiple levels, based upon acting on one level to influence the others. This is a typical instance of Collective Beings (CBs), particular cases of Multiple Systems (MSs). MSs are established by the same components interacting in different ways, such as interacting networked computer systems performing cooperative tasks and the Internet, where different systems play different roles in continuously new, emerging usages. CBs are established when the same components, interacting in different ways, are autonomous agents, i.e., possess a natural or artificial cognitive system and are able to simultaneously or dynamically decide to interact in different ways [7]. Examples include components which are simultaneously members of families, workplaces, traffic systems, mobile telephone networks or groups of consumers.
How is it possible to combine generalizations? Which combinations of generalized properties produce a property that still possesses general validity (e.g., linear, non-linear and connectionist)? What about the domain of validity? Is it just an intersection of domains, or is it possible to consider more sophisticated solutions? We may think of a kind of algebra of generalized properties. This is the problem of trans-disciplinarity, as mentioned below. We briefly recall another aspect to be considered: theoretical issues that are also obstacles to generalization in real life.
Actually, we have examples of powerful approaches for generalizing which have received little or no interest from researchers, and we need to discover the real reasons for this lack of interest. We face the problem that theoretical research is performed by human beings living in real social systems and affected by several kinds of practical concerns: careers, the need for gratification and support, the stereotyping of approaches, difficulties in publishing, economic and human resources, and interest from students, colleagues and industry. We mention a couple of examples from mathematics, a domain of research that has achieved results for generalization which go largely unconsidered. The first relates to research on
General Theory of Emergence Beyond Systemic Generalization
245
continuity extended from real numbers to cardinal or transfinite numbers, i.e., the supercontinuum [8]. The other relates to the Löwenheim-Skolem theorem in model theory, asserting that if a theory has a model of some transfinite cardinality, then it has models of any other transfinite cardinality. This theorem established the existence of “non-standard” models of arithmetic. It shows that even systems in which we may prove Cantor's Theorem, stating the existence of transfinite numbers, have countable models [9].

3. What is the opposite of generalization?

In the preceding section we listed some crucial characteristics of the concept of generalization in conceptual, non-systemic frameworks. From this discussion we may conclude that a first, general characteristic of the concept of generalization is related to that of repeatability. This correspondence considers, of course, the context, the model, the level of description and the observer. We may say that we apply the concept of generalization to itself. The concept of repeatability may be a good entry point to specify what generalization is not. The opposite of generalizing is to make particular, that is, to consider a specific, non-repeatable situation. In this case the interest is in specifying the uniqueness and how this has been achieved. Depending on the level of description, any event may be represented as unique. Such a situation may be related to (a) the level of description considering a virtually infinite number of details, (b) events having a very low probability of occurring (unique events are more frequent than probable ones) and (c) models representing and allowing simulation of unique events, such as for chaos. Nevertheless, unique events are interesting not only for modeling and, possibly, simulating how they have been generated, but also because they may be represented and memorized. It is then possible to reproduce the representation.
Examples of unique events are recordings of artistic events and pictures of unique astronomical events. It is then possible to scientifically consider representations of unique events. Thanks to modeling and computer techniques it is now possible to simulate unique events, i.e., make them repeatable and interactive. Of course, uniqueness exists only in relation to some specific aspects of the representation, i.e., a level of description. On the one hand, representation, reproduction, modeling and simulating are intrinsically limited by the information available. For instance, pictorial images do not explicitly provide information about temperature and speed, but these may be logically inferred. On the other hand, the representation is related to the constructivist process used by the observer [10].
246
G. Minati
After these brief comments, we can see that the characteristic of being specific and non-repeatable mainly relates to the level of description used by the observer. When considering the opposite of generalizing we can also mention a process usually considered to have this characteristic: the process of specializing. It usually refers to applied, research and educational activities in specific disciplines. Specializing is intended as a process that identifies a well-defined and restricted area where specific disciplinary knowledge and expertise is applied in a repeatable way. As we will see, specialization can be considered as the opposite of generalization once we have a theoretical, and not only operational, definition of generalization.

4. Outline of some classical approaches used for generalizing

As mentioned in Section 2 there are at least two ways of generalizing:
1. the process of generalizing consists of making something applicable to a wider variety of cases. Problems of generalizing in this case regard the repeatability and applicability of cognitive results to cases having a context of validity different from the initial one.
2. the process of generalizing consists of representing the same property in different, equivalent ways. An example of a way of representing the same property in different contexts is isomorphism, a structure-preserving mapping between two algebraic structures. An example of a way of representing the same property in a partial way in different contexts is given by analogy and metaphors (see below).
In the following we list and briefly comment upon some approaches used to generalize, i.e., to extend the validity of properties and operators within a specific domain and from a specific domain to another domain. They are called classical because they do not necessarily apply to systemic properties.
Briefly, we may distinguish between systemic and non-systemic properties according to whether properties are considered, at the level of description of the observer, as related to elements or to the systems established by interacting elements. Elements or parts are constructivistically identified by the observer when using (equivalent or non-equivalent) models to explain the whole, i.e., the system [11]. In short, systemic properties are properties of systems established through organization or collectively. Systemic properties are stationary, i.e., existent only while the process of organization or emergence is active, such as functions for organizations, e.g., assembly lines and electronic circuits, computational functions for connectionist devices (e.g., Neural Networks) and life itself. A system may also have non-systemic properties, whereas a non-system cannot possess systemic properties,
because these are only generated by emergence or organization. Examples of non-systemic properties, i.e., not necessarily applying to systems, are: weight, speed, quantity, position, shape and odd/even. Examples of systemic properties, i.e., necessarily applying to systems, are: complexity, dissipation, openness/closedness and organization. Behavior, on the contrary, may refer both to elements and to systems established by interacting elements. For instance, consider the behavior of a particle and the behavior of a gas of particles in a changing environment (pressure and temperature). Generalization of not necessarily systemic properties may take place through:
• Abstraction. Cases can be considered as special cases of more general ones, as in learning.
• Analogy. At a certain level of description, two different items are considered equivalent.
• Conceptualizing. Cognitive inference producing concepts from abstraction, objects to be used in a generalized way through tools such as language and, in particular, analogy and metaphors. It is possible to have the concept of abstraction whereas it is not possible to have the abstraction of a concept. Concepts, distinguished from abstractions themselves, relate to the usability of the abstractions as elements for other, higher levels of abstraction.
• Morphisms. A homomorphism enables the conservation of an algebraic structure from one domain to another. An isomorphism, a bijective morphism, enables the conservation of an algebraic structure between two different domains in a bijective way.
• Induction. It enables one to assume as probable the extension of a property p1, detected in all examined elements which also possess another property p2, to all elements having the property p2.
• Knowledge representation. Knowledge has a generalizing content per se, referring to the possibility of applying it to different cases. Representing has a generalizing content allowing a higher level of abstraction, i.e., using knowledge to process representations of itself.
• Language. Languages are of crucial interest for generalizing because they use symbols (e.g., words) as representations and, recursively, as elements to generate higher, i.e., more abstract, levels of representation, such as statements and systems of statements, e.g., books, stories, hypotheses and models.
• Learning. Learning may be understood as a representative process of generalizing, being related to cognitive restructuring in such a way as to build up models for larger, i.e., more general, cases.
• Metaphor. It allows description of something less well-known in terms of something better known. It is a kind of hypothetical analogy between the familiar and the less familiar. For this reason the process of producing metaphors may also be very misleading, because the suggested analogy may be completely unsuitable.
• Modeling. It has a particular power of generalization because knowledge is not only represented to be transmitted, recorded, and used to induce the generation of other knowledge, but also for simulation.
• Relations. They constitute a very basic kind of correspondence between elements in different sets. Putting different sets in correspondence with each other is the first basic step towards the possibility of transporting properties from one set to others.
• Structures. They are simple ways to represent symbolic knowledge and may be generalized as properties through transpositions between different sets.
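The conservation of structure by morphisms, listed above, can be made concrete with a minimal sketch (the particular mapping and the test values are illustrative choices, not taken from the text): the logarithm is an isomorphism between the positive reals under multiplication and the reals under addition, so a multiplicative property in one domain can be transported to an additive one in the other domain, and back.

```python
import math

# log: (R+, *) -> (R, +). It maps the operation of the source
# structure onto the operation of the target structure.
def phi(x: float) -> float:
    return math.log(x)

# Structure preservation (homomorphism): phi(a * b) == phi(a) + phi(b).
for a, b in [(2.0, 3.0), (0.5, 8.0), (10.0, 10.0)]:
    assert math.isclose(phi(a * b), phi(a) + phi(b))

# Because exp inverts log, the morphism is bijective (an isomorphism),
# so results can be transported back: a * b == exp(phi(a) + phi(b)).
for a, b in [(2.0, 3.0), (0.5, 8.0)]:
    assert math.isclose(a * b, math.exp(phi(a) + phi(b)))

print("structure preserved in both directions")
```

This is the sense in which a property established in one domain (addition of logarithms) generalizes a property of another domain (multiplication of positive numbers).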
When considering processes of generalization we may also consider the measurability of generalization, to deal with questions such as:
• Which property is more general?
• How general is a property?
Quantifying generalization may relate to the extension of the domain of validity, by considering limitations. For instance, flexible and diluted are properties which may be used in a general way beyond their original domains of validity, physics and chemistry. Which property is more general? The one provided with the larger domain of validity. Probably the property of being flexible applies to a larger number of cases than the property of being diluted, because the latter applies to environments having, even metaphorically, different densities, whereas the former applies to virtually any environment. To answer the question How general is a property? it is necessary to introduce a measure. This is not the interest of this paper. For our purpose it is sufficient to consider that it is possible to define the problem. Another issue is what we may call “artificial generalization”. Examples of artificial processes of generalization include, for instance, areas of artificial intelligence such as pattern recognition, image understanding, language processing and automated diagnosis. Another aspect concerns the ability of artificial systems to learn, i.e., to generalize non-explicitly represented knowledge, as in artificial Neural Networks. Another example concerns reasoning by example. Most of these approaches are based upon Gestalt principles [12,13,14]. Their application is related to methods, focussing on the
general problem of pattern recognition, for generalizing experimental data, allowing computers to use sets of examples, as in geology and medicine. This subject is not the focus of this paper. We only mention that the problem can also be considered from this point of view. We conclude this section with some comments regarding the question “Is it possible not to generalize?” Generalizing is a property of cognitive systems. Systems having cognitive systems of different levels of complexity generalize in different ways and at different levels. This may easily be detected by considering animal behavior. Moreover, any learning process is based on some level of generalization. So the question does not apply to systems provided with cognitive systems. We are interested in how and what to generalize, as discussed in the following section.

5. Generalization for systemic properties

Systemic properties, as introduced above, are general as they not only refer to categories of items, but to properties adopted by categories of new entities, i.e., systems. How the interaction between elements is a necessary condition for the establishment of systems has been widely discussed in the literature [7]. The so-called General System Theory (GST) was introduced by von Bertalanffy [15] for systems design, properties, usage, representation, and inter- and trans-disciplinary applications. Inter-disciplinarity occurs when approaches and models of one discipline are used by another, while trans-disciplinarity arises when properties are studied per se, considered without reference to specific disciplinary cases; trans-disciplinarity also studies the relationships between properties. In this line of thought we had the establishment of various approaches such as Systems Dynamics [16], Systems Theory [17], Systems Engineering [18], Information Systems [19], Living Systems [20], Social Study of Information Systems [21], Soft Systems Approach [22], and Systems Practice [23].
After considering processes such as collective behavior, self-organization, emergence and the constructivist role of the observer, the focus in the literature was then generalized by considering 1) how collective phenomena establish systems, 2) how processes of acquisition of new properties take place within systems, 3) hierarchies of emergence and interactions between them, 4) multimodeling and 5) establishment of Multiple Systems. The new problems relate to the search for a General Theory of Emergence (GTE) to account for any kind of collective phenomena [24]. This expression is used in various disciplinary contexts with the aim of generalizing processes of emergence, as when introducing the concept of an “evolutionary Mechanics” [25] and for an
ontology of levels, and when discussing agent-based computing [26]. While the framework of GST regards systems established through organization, structured into subsystems and possessing properties to be managed by using inter- and trans-disciplinary approaches, GTE deals with processes of the establishment of systems through emergence, where the dynamics do not relate only to processes with respect to time, but also to multi-modeling, hierarchies of interacting levels, the acquisition of new properties and the multiple roles of elements. In this way GTE considers GST as a particular case. Moreover, GST has been controversially named a “Theory” even though it is more of an approach and a cultural framework. Theories are statements of a language having generalized explicative and predictive power. In science a theory consists of hypotheses related to experimental data and logically connected, as in the hypothetico-deductive method. A theory is then a formalization of observations allowing it to be used as a model to explain, predict, simulate, and to be refuted. Examples of very well-known theories are: electromagnetism, game theory, gravitation, quantum theory and relativity. The term theoretical relates to the representation of a result which is predicted by theory, but has not yet been observed. A typical example is the prediction of the existence of black holes. Failed predictions are useful to prove a theory wrong, as in the famous Michelson-Morley experiment performed to detect the aether wind. GTE is expected to become a real theory able to explain collective processes in different disciplinary fields. It is conceptually possible to use theories of phase transitions in any disciplinary field if we are able to properly describe processes using suitable variables in the equations. In the same way we could have either a single GTE or, more likely, different disciplinary approaches expressing in different ways the same principles (at the moment we have neither!).

6. General Theory of Emergence

The study of emergence and emergent phenomena is at the focus of current research interests in several disciplines, such as Physics, Biology, Artificial Intelligence, Economics and Cognitive Science. There are general theories such as Analytical Mechanics, related to a mechanical description of phenomena; Thermodynamics, related to a description of phenomena in terms of energy and the collective motion of particles; theories of phase transitions, related to changes of state of matter (e.g., solid, fluid, gaseous, superconductive and superfluid), the development of biological organisms, learning and aspects of social systems; and Synergetics, related to processes of self-organization (describable through order parameters) of patterns and structures
in open systems far from thermodynamic equilibrium, as in many different physical, biological, chemical and social systems. For such modeling, it is possible to achieve generality by giving different meanings to the variables (e.g., particle, cell, or social agent as buyer in a market). The novelty is that a GTE should deal with hierarchies of processes of emergence, where a property emerges from the interaction of entities emerging from lower-level structures, such as in Baas hierarchies [27,28,29], acquired emergent properties (see the paper presented at this conference) and systemic, emergent properties having current theories as particular cases. It may be expected to be a theory of modeling emergent phenomena, a meta-theory like meta-mathematics (i.e., mathematics used to study the foundations and methods of mathematics) within a formalist approach. The purpose in this case is not to try to demonstrate self-coherence (destroyed by Gödel's theorem), but to study the relationships and interdependence between models. Meta-theorizing is now being focussed upon in various disciplinary fields, including physics [30]. Meta-theories are related to modeling by using models, i.e., meta-modeling. Meta-modeling is a consolidated approach in several disciplines, such as software engineering, which uses meta-languages to describe other languages and to create semantic models [31,32,33,34]. Another example of meta-modeling in systems science is the concept of logical openness, related to the establishment of meta-levels, i.e., models of models [35]. Finally, there is the concept of DYSAM, intended not as a single, procedural, rule-based methodology, but as a systemic general model, a meta-model (i.e., a model of models), used to carry out specific, contextual methodologies [7]. A GTE is thus very closely related to the generalization of processes of establishing collective phenomena possessing and generating multiple systemic properties.
It is expected, at least, to:
• deal, in a systematic way, with correspondences between models and representations of phenomena considered emergent;
• allow the development of tools for detecting and verifying processes of emergence and de-emergence in general [7,36];
• identify and classify possible different non-equivalent kinds of emergence, such as biological and physical [5];
• identify the limits of its generalization. It is necessary to define the domain of validity of such a theory. This is because any attempt to produce a theory having unlimited validity carries in itself the contradiction stated by the well-known Gödel theorem in meta-mathematics.
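As a toy illustration of what a tool for detecting emergence might look like (the data, the threshold and the choice of polarization as order parameter are illustrative assumptions, not a proposal from the text), one can monitor an order parameter in the sense of Synergetics: for agents with headings θ, the modulus of the mean unit vector is close to 0 for incoherent motion and close to 1 when a collectively aligned state has emerged.

```python
import cmath
import random

def polarization(headings):
    """Order parameter: modulus of the mean unit vector of the headings."""
    return abs(sum(cmath.exp(1j * t) for t in headings) / len(headings))

random.seed(0)
N = 1000

# Incoherent population: headings uniform on [0, 2*pi), polarization near 0.
incoherent = [random.uniform(0.0, 2 * cmath.pi) for _ in range(N)]

# Collectively aligned population: headings clustered around one direction.
aligned = [1.0 + random.gauss(0.0, 0.2) for _ in range(N)]

THRESHOLD = 0.5  # illustrative cutoff for flagging an emergent ordered state
for label, pop in [("incoherent", incoherent), ("aligned", aligned)]:
    phi = polarization(pop)
    print(label, round(phi, 2), "emergent" if phi > THRESHOLD else "not emergent")
```

A de-emergence detector would simply be the same measurement observed to fall back below the threshold over time.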
Table 1. Emergence of new properties in emergent systems.

Emergent systems | Properties of the emergent systems | Emergent properties within emergent systems
Social systems   | Language and population dynamics   | Swarm intelligence and collective learning abilities (e.g., industrial districts)
Living systems   | Homeostasis, autopoiesis           | Psychosomatic illnesses
Brain            | Cognitive abilities                | Mental illnesses
GST should be intended as a general theory of processes establishing systems and systemic properties. The general process to be theorized is that of emergence [37], comprising that of organization. In its turn a GTE should have particular theories as specific cases and should be able to multi-model processes of the establishment of systems and of systemic properties. This relates to meta-properties such as the ability to:
• make a process emergent (i.e., induce processes of emergence);
• detect the occurrence of a process of emergence;
• make a process de-emerge (i.e., disappear);
• transform one process of emergence into another;
• manage a process of emergence (e.g., slow down, speed up, split, change parameters);
• mix processes of emergence (e.g., swarming and markets);
• separate processes of emergence;
• find possible categories of non-equivalent processes of emergence;
• describe interactions between levels in hierarchies making new properties emerge;
• distinguish between processes of emergence of systems and processes of emergence of properties in complex systems (i.e., processes of emergence within processes of emergence).
In this view, systems do not only possess properties, but new properties may be established by processes of emergence taking place within them (Tab. 1). GTE may be able to generate meta-models, i.e., models of models of specific processes of emergence as studied in the literature. We may have an anticipation of this theory when considering systemic properties as trans-disciplinary properties and trying to represent relationships between them. Using this language we may, for instance, describe conditions for their compatibility/incompatibility, sequential occurrence over time, randomness, multi-modeling, and power to activate, detect, de-emerge, transform, influence, mix, and separate processes of
emergence. Today we have phenomenological approaches, but not a theory able to model embedded processes of the emergence of properties. Systems do not only objectively possess properties, but are also able, in their turn, to make new ones emerge (complex systems are systems within which processes of emergence occur). Examples of the emergence of systemic properties in systems established by processes of emergence are given by cognitive abilities in natural and artificial systems, collective learning abilities in social systems such as flocks, swarms, markets and firms, and functionalities in networks of computers (e.g., the Internet). Evolutionary processes are assumed to establish properties in living systems together with processes of self-organization [38]. The generality that we have today relates to the validity of a theory in different domains. With GTE we are looking for a theory of properties, taking the former theories as particular cases. We also think that a good approach is both to try to clarify why theories such as, for instance, Synergetics, theories of phase transformations and dissipative structures, and quantum models of emergence are not themselves General Theories of Emergence (i.e., what is missing?) and how it is possible to generalize them in such a way as to extend them towards a more general one. On this point we recall von Bertalanffy's expression about the Unity of Science: “A unitary conception of the world may be based, not upon the possibly futile and certainly farfetched hope finally to reduce all levels of reality to the level of physics, but rather on the isomorphy of laws in different fields” [15]. Is this the prospect for a General Theory of Emergence?

7. Conclusions

Scientific research is a continuous balance between delving deeper and deeper into details and generalizing results in order to apply them to a more general domain and to create relations between specialized results at a suitable level of description.
We have discussed here the definition of generalization with special regard to the context of systems research and emergence. The problem of generalizing as related to systems research, and particularly that of founding a General Theory of Emergence, has been discussed. We have introduced a framework within which it is possible to build up the basis of such a theory as a meta-theory, based on meta-modeling. A first step in this endeavor is to find out why theories such as Synergetics, theories of phase transformations and dissipative structures, and quantum theories are not themselves General Theories of Emergence, and how they may be particular cases of a more general theory. The interest in laying the basis of such a theory is related to the possibility of
managing systemic properties and processes of emergence, as with DYSAM and other approaches. These improvements are expected, for instance, to allow:
• the availability of a theory able to go beyond the cross-disciplinary usage of models (i.e., inter-disciplinarity), towards the usage of multiple-modeling, such as for CBs, and the modeling of embedded processes of emergence, as for the emergence of properties in emergent systems (i.e., trans-disciplinarity);
• the search for systemic properties where they have not yet been considered;
• transfer between different levels: the study of problems in specific disciplines and details used to define systemic properties, and the use of systemic properties for acting upon those details.
The aim of this paper was to introduce:
• the meaning of generalization;
• how the issue of generalization is related to emergence and Systemics;
• a framework for the search for a General Theory of Emergence.

References
1. J.H. Holland, K.Y. Holyoak, R.E. Nisbett and P.R. Thagard, Induction (MIT Press, Cambridge, MA, 1986).
2. C.W. Evers and E.H. Wu, Journal of Philosophy of Education, 511 (2006).
3. T.M. Mitchell, Machine Learning (McGraw-Hill, New York, 1997).
4. G. Nicolis and I. Prigogine, Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order through Fluctuations (Wiley, New York, 1977).
5. E. Pessa, in Systemics of Emergence: Research and Development, Ed. G. Minati and E. Pessa (Springer, New York, 2006), pp. 355-374.
6. G. Minati and S. Brahms, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer, New York, 2002), pp. 41-52.
7. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006).
8. J. Hintikka and G. Sandu, The Journal of Philosophy 290 (1992).
9. A. Rohn, The Journal of Symbolic Logic 25 (1941).
10. R. Butts and J. Brown, Eds., Constructivism and Science (Kluwer, Dordrecht, Holland, 1989).
11. S. Guberman and G. Minati, Dialogue about Systems (Polimetrica, Milan, Italy, 2007).
12. M. Bongard, Pattern Recognition (Spartan Books, New York, 1970).
13. M. Wertheimer, Productive Thinking (Harper, New York, 1943).
14. M. Wertheimer, Social Research, 78 (1944).
15. L. von Bertalanffy, General System Theory: Foundations, Development, Applications (George Braziller, New York, 1968).
16. J.W. Forrester, Industrial Dynamics (MIT Press, Cambridge, MA, 1961).
17. S.A. Umpleby and E.B. Dent, Cybernetics and Systems 79 (1999).
18. W.A. Porter, Modern Foundations of Systems Engineering (MacMillan, New York, 1965).
19. R. Hirschheim, H.K. Klein and K. Lyytinen, Information Systems Development and Data Modeling: Conceptual and Philosophical Foundations (Cambridge University Press, Cambridge, 1995).
20. J.G. Miller, Living Systems (McGraw-Hill, New York, 1978).
21. C. Avgerou, Omega, 567 (2000).
22. P. Checkland and J. Scholes, Soft Systems Methodology in Action (Wiley, New York, 1990).
23. P. Checkland, Systems Thinking, Systems Practice (Wiley, New York, 1981).
24. G. Minati, in Systemics of Emergence: Applications and Development, Ed. G. Minati, E. Pessa and M. Abram (Springer, New York, 2006), pp. 667-682.
25. J.P. Crutchfield, in Complexity: Metaphors, Models, and Reality, Ed. G. Cowan, D. Pines and D. Meltzer (Addison-Wesley, Reading, MA, 1994), pp. 515-537.
26. C. Emmeche, S. Køppe and F. Stjernfelt, Journal for General Philosophy of Science 83 (1997).
27. N.A. Baas, in Alife III, Santa Fe Studies in the Science of Complexity, Proc. Volume XVII, Ed. C.G. Langton (Addison-Wesley, Redwood City, CA, 1994), pp. 515-537.
28. N.A. Baas and C. Emmeche, Intellectica 67 (1997).
29. K. Kitto, Modeling and Generating Complex Emergent Behavior, Ph.D. thesis, The School of Chemistry, Physics and Earth Sciences (The Flinders University of South Australia, 2006).
30. S. Blaha, The Metatheory of Physics Theories, and the Theory of Everything as a Quantum Theory Computer Language (Pingree-Hill Publishing, Auburn, NH, 2005).
31. G. Booch, J. Rumbaugh and I. Jacobson, The Unified Modeling Language User Guide (Addison Wesley Longman, Redwood City, CA, 1999).
32. J.P. van Gigch, System Design Modeling and Metamodeling (Plenum Press, New York, 1991).
33. J.P. van Gigch, Applied General Systems Theory (Harper & Row, New York, 1978).
34. J.P. van Gigch, Metadecisions: Rehabilitating Epistemology (Kluwer, New York, 2003).
35. G. Minati, M.P. Penna and E. Pessa, Systems Research and Behavioral Science 131 (1998).
36. G. Minati, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer, New York, 2002), pp. 85-102.
37. J.P. Crutchfield, Physica D, 11 (1994).
38. S. Kauffman, Investigations (Oxford University Press, New York, 2000).
UNCERTAINTY, COHERENCE, EMERGENCE
GIORDANO BRUNO
Department of MEMOMAT, Sapienza University of Rome
Via A. Scarpa 16, 00164 Rome, Italy
E-mail:
[email protected]

In a previous paper (Uncertainty and the Role of the Observer, co-authored with G. Minati and A. Trotta, Proceedings of the 2004 Conference of the Italian Systems Society, in publication by Springer), we focused on the deep epistemological contribution of the Italian mathematician Bruno de Finetti (1906-1985), from a systemic point of view. He considered the probability of an event to be nothing but the degree of belief of the observer in its occurrence, relating this degree of belief to the information available, at that moment, to the observer. He pointed out how, when considering probability, we need to focus on the role of the observer expressing the degree of belief and how S/He can construct a system of coherent probabilities. The purpose of this paper is to show how this subjective conception of probability is based on assuming a systemic framework, even in cases of conditional events. In this regard, we underline how the fundamental conceptual and methodological tool is the well-known Bayes Theorem. With reference to this theorem, we will introduce examples to show how its usage is not only crucial in generating probabilities suitable for the emergence of a system of coherent evaluations, but is even able to explain some paradoxical aspects.

Keywords: subjective probabilities, Bayes theorem, role of the observer, coherent evaluations.
It would therefore seem natural that the habitual ways of thinking, reasoning and deciding should explicitly and systematically hinge upon the factor of uncertainty as the conceptually preeminent and determining element. (Bruno de Finetti, Teoria delle probabilità, Einaudi, 1970; translated from the Italian)

1. Introduction

In a previous paper [1] we focused on the deep epistemological contribution of the Italian mathematician Bruno de Finetti (1906-1985), from a systemic point of view. He considered the probability of an event to be nothing more than the degree of belief of the observer in its occurrence, relating this degree of belief to the information available, at that moment, to the observer [2]. The goal of the paper was to show how the subjectivist approach to probability has a systemic validity, in the sense that the observer plays a fundamental role in the emergence
G. Bruno
of a system of coherent probabilities, when S/He must assign a set of probabilities to different events relating to a given random phenomenon. We now recall the principal aspects treated there. First of all, it was remarked how it is possible to assign to a family of events a qualitative measure of probability, by means of a natural order relation: not less possible than [3]. This relation lets us construct an axiomatic probability theory, even in the case of conditional events, by introducing a further axiom. Obviously the observer is free, according to S/His opinion, to choose the preferred relation among all those that are admissible. Secondly, it was recalled that de Finetti introduced a numerical measure of the degree of belief in an event. By referring to a betting scheme, he determines such a measure (probability) as the price p which a coherent person is willing to pay in order to get 1 if the event turns out to be true, or 0 otherwise. In this definition of probability, a coherent person is a person who accepts only bets in which S/He does not face an “a priori” loss. This subjective probability approach, founded on this numerical measure of probability, suggests that de Finetti's conception is the one best able to assure a systemic procedure, based on the role of the observer and on coherence as a tool.

2. Conditional events and their probabilities

In this paper we wish to continue our argumentation in the case of conditional events. We recall that, given any two events E and H (H ≠ Φ), we can consider a conditional event E|H, which has the following meaning:

        TRUE,          if H true and E true
E|H  =  FALSE,         if H true and E false
        INDETERMINATE, if H false
So if we want to bet on E/H we must do it, by de Finetti [2], in the following way:
PAY p TO GET
1 0 p
if H true and E true if H true and E false if H false
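The coherence requirement behind this price can be made concrete with a small simulation (our illustration, not part of de Finetti's text; the numerical probabilities are arbitrary): when the price p equals the conditional probability of E given H, the bettor's average net gain tends to zero, while any other price produces a systematic gain or loss.

```python
import random

def conditional_bet_gain(p, p_H, p_E_given_H, trials=200_000, seed=0):
    """Average net gain from paying p to receive 1 if H and E both
    occur, 0 if H occurs but E fails, and the stake p back (bet
    called off, net gain 0) if H fails."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if rng.random() < p_H:                 # H occurs
            e = rng.random() < p_E_given_H     # E given H
            total += (1.0 if e else 0.0) - p
        # if H fails the stake is simply returned
    return total / trials

fair = conditional_bet_gain(p=0.3, p_H=0.5, p_E_given_H=0.3)   # coherent price
cheap = conditional_bet_gain(p=0.1, p_H=0.5, p_E_given_H=0.3)  # underpriced bet
```

With p equal to the conditional probability 0.3 the average gain is statistically indistinguishable from zero; with p = 0.1 the bettor gains on average about 0.5 · (0.3 − 0.1) = 0.1 per bet, so the bookmaker would not be coherent in offering it.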
Uncertainty, Coherence, Emergence
By this definition of a conditional bet, we can estimate the uncertainty of E|H by means of the price p. This measure is named conditional probability and is denoted by P(E|H). De Finetti [2] proves that the conditional probability P(E|H), so introduced, satisfies all the properties (axioms) of a probability, and that it represents the probability of E when H is supposed true. In particular, he shows how the natural condition of coherence (random winnings "not all negative" in a set of bets) leads, for any events E and H (H ≠ Φ), to the theorem:

P(E ∩ H) = P(H) P(E|H);

and to its corollary, the well-known Bayes Theorem (E ≠ Φ):

P(H|E) = K P(H) P(E|H),  with K = 1/P(E), P(E) ≠ 0.

3. Bayes theorem meaning and its applications
Let us dwell on the meaning of Bayes Theorem. If we regard E as an event representing an experimental result of a random phenomenon and H as a hypothesis concerning the same phenomenon, then the theorem asserts that the probability of H conditionally on E is proportional to the probability of the hypothesis H multiplied by the probability of E conditionally on H. To clarify this explanation, let us resort to the classical model of an urn of unknown composition. Consider an urn containing N balls, of which an unknown number (from 0 to N) are red. Let us denote by E the event "h red balls out of n", as a possible result of an extraction of n balls (for example without replacement). Let the hypothesis H be the event "in the urn there are r red balls out of N". Bayes theorem allows us to evaluate the probability of the hypothesis H conditionally on the experiment E (called the final probability): it is proportional, by the factor K, to the probability of H (called the initial probability) multiplied by the probability of E conditionally on H (called the likelihood). In other words, Bayes theorem shows us how we must update our evaluations in the presence of further information (better: by supposing we receive further information): final probability = K × initial probability × likelihood.
Let us observe that initial and final have, in this context, only the meaning of before and after E becomes known. Of course, in the same way we must evaluate the final probability of the contrary hypothesis H^c, obtaining:

P(H^c|E) = K P(H^c) P(E|H^c),  with K = 1/P(E), P(E) ≠ 0.

Much more generally, if we want to evaluate the final probabilities of m different hypotheses H_j, which form a partition of the certain event Ω, we obtain the following expression of Bayes theorem:

P(H_i|E) = K P(H_i) P(E|H_i),  i = 1, 2, …, m;  with K = 1/P(E),

P(E) = Σ_{j=1,…,m} P(H_j) P(E|H_j),  P(E) ≠ 0.
Let us go back to our example. The H_j represent the possible hypotheses on the urn's composition. We may, after an initial guess, formulate, via Bayes Theorem, the final answer. In the general framework of objectivistic probability (classical as well as frequentist) the observer has only to execute the calculations in the correct manner, using symmetries or self-similarity (we need to point out that some subjective choices have nevertheless been made, e.g. all the outcomes are considered equiprobable and the extractions independent). In these cases Bayes Theorem loses part of its importance and stands as a pure mathematical result. In de Finetti's [2] subjectivistic context, Bayes Theorem better shows the meaning of "learning by experience". This is exactly what happens in medical practice. While investigating a possible illness, a medical doctor usually starts from an initial guess, then asks for specific instrumental exams and, on the basis of the exams' results, comes out with the final answer. Even in this case we are in the realm of uncertainty: the event "the patient suffers from such an illness" is only possible, neither certain nor impossible. Fortunately, despite human and instrumental errors, the probabilities of discovering an illness are close to 1 (or 0). In applications the maximal likelihood method is often used to estimate the value of a parameter contributing to some stochastic phenomenon: one computes the probabilities of E (probability densities in the case of a continuous parameter) conditionally on the hypotheses H_j (likelihoods) and takes the value yielding the maximum as the parameter's estimate.
In the urn example we evaluate P(E|H_j) for all j, and we take the largest value: assuming that the maximum of P(E|H_j) is obtained for j = 3, we say that the hypothesis H_3 is an estimate of the "true" urn composition. Let us imagine that the urn contains ten balls, without knowing how many of them are red. We sample (extract one ball and put it back into the urn) five balls; three are red. We call E this outcome. We now evaluate P(E|H_j), j = 0, 1, …, 10, and we get:

P(E|H_j) = (5!/(3! 2!)) (j/10)^3 ((10 − j)/10)^2.

In particular, we have:

P(E|H_0) = 0        P(E|H_1) = 0.0081   P(E|H_2) = 0.0512
P(E|H_3) = 0.1323   P(E|H_4) = 0.2304   P(E|H_5) = 0.3125
P(E|H_6) = 0.3456   P(E|H_7) = 0.3087   P(E|H_8) = 0.2048
P(E|H_9) = 0.0729   P(E|H_10) = 0

The maximum value 0.3456 is attained for H_6, therefore we conclude that the maximal-likelihood estimate of the urn's composition is: six red balls out of ten. There are two side effects in this approach, both of a logical nature. First of all we are using the inverse conditional probabilities P(E|H_j) instead of the P(H_j|E), which are the correct ones; through the P(E|H_j) alone one can only find the hypothesis that makes the observed result most probable, and take it as the estimate of the urn's composition. Moreover, and this is the second side effect, we have made no use of the P(H_j) (hypothesis probabilities), and this may lead to a wrong conclusion. In the urn's composition example, if one uses (as one should) Bayes Theorem in evaluating P(H_j|E), one needs the P(H_j). Every distribution of them, apart from the equiprobable one, may lead to a different result, i.e. the maximum of
the posterior probabilities P(H_j|E) may not be attained at H_6. If we know, by chance, that it is more probable to find in the urn as many red balls as non-red balls, i.e. P(H_5) = 0.6, P(H_4) = P(H_6) = 0.15, and P(H_0) = P(H_1) = P(H_2) = P(H_3) = P(H_7) = P(H_8) = P(H_9) = P(H_10) = 0.0125, we get:

P(H_0|E) = 0        P(H_1|E) = 0.00035   P(H_2|E) = 0.00225
P(H_3|E) = 0.00581  P(H_4|E) = 0.12185   P(H_5|E) = 0.66111
P(H_6|E) = 0.18278  P(H_7|E) = 0.01357   P(H_8|E) = 0.00902
P(H_9|E) = 0.00320  P(H_10|E) = 0

Therefore we now have H_5 as the most probable hypothesis. Let us discuss a different example, maybe even more significant for its "paradoxical" outcomes. Paul is looking for a new job, but he does not show up for the interview. Let us denote this event by E. The human resources director wants to know why, and he comes up with some hypotheses:
H_1 = Paul found a new job
H_2 = Paul went to jail
H_3 = Paul won a lottery
or any other different reason. If the director uses the likelihood to reach a conclusion, he will come out with H_2, since H_2 implies E, therefore P(E|H_2) = 1. On the other hand, using Bayes Theorem, the most probable hypothesis may not be H_2. As a matter of fact, if we assume P(E|H_1) = 0.6, P(E|H_3) = 0.2 and P(H_1) = 0.7, P(H_2) = 0.25, P(H_3) = 0.05, we get:

P(H_1|E) = 0.618   P(H_2|E) = 0.368   P(H_3|E) = 0.014

Hence the most "reasonable" hypothesis is also the most probable.
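The urn computations above can be checked with a few lines of code (a sketch of the same arithmetic, not part of the original text):

```python
from math import comb

# Likelihoods P(E|H_j): 3 red balls in 5 draws with replacement from
# an urn containing j red balls out of 10 (binomial model).
def likelihood(j, n=5, h=3, N=10):
    return comb(n, h) * (j / N) ** h * (1 - j / N) ** (n - h)

lik = [likelihood(j) for j in range(11)]
ml = max(range(11), key=lik.__getitem__)        # maximal-likelihood estimate

# Bayes update with the non-uniform priors of the example.
prior = [0.0125] * 11
prior[4], prior[5], prior[6] = 0.15, 0.6, 0.15
p_E = sum(p * l for p, l in zip(prior, lik))    # total probability of E
post = [p * l / p_E for p, l in zip(prior, lik)]
mph = max(range(11), key=post.__getitem__)      # most probable hypothesis
```

The maximal-likelihood estimate is H_6 (likelihood 0.3456), while the Bayesian most probable hypothesis is H_5 (posterior ≈ 0.661), exactly as in the text.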
4. Conclusions
The examples we have discussed suggest the following remarks. Bayes Theorem is the cornerstone of "coherence" upon which we build the updating of probabilities. By using Bayes Theorem one is able to reduce uncertainty, not to eliminate it; it never leads to certain conclusions. It represents a good example of non-linear thinking. The updating of probabilities must follow a unique principle: coherence. It guarantees that the observer does not fall into contradictions in moving from initial to final guesses. The final probabilities of the m events are admissible if and only if they satisfy Bayes Theorem. The observer, therefore, is able to pick one or more sets of admissible hypotheses, using coherently his level of belief in the events being considered and the available information. The observer models the emerging system (the set of events interacting through the probabilities assigned by the observer himself) taking into account the likelihoods, which are merely probabilities even though they are treated as "certain" data. The observer is ultimately the only one "responsible" for his evaluations, and in any case it makes no sense to say whether he was right or wrong in his prediction. This is because any prediction lives in the realm of uncertainty, and it cannot become a forecast (the realm of certainty). One may only argue whether the observer has been coherent or not. From what we have discussed in the present work, as well as in the previous one, we may deduce that the uncertainty logic of Bruno de Finetti [4] is a significant example of a systemic approach to the study of "reality", and it may be considered an example of logical openness.

The observer, as a matter of fact, in his quest for an admissible system of probabilities, does not need (as in the objectivistic framework) to know the whole space of events, its elements and their probabilities: he can start from the probability evaluation of a single event and proceed step by step in evaluating the probabilities of those events he is interested in. Obviously the observer must consider all the interactions among the events (in the unconditioned as well as in the conditioned cases), and he must respect, as we have mentioned several times, "only" coherence.
References
1. G. Bruno, G. Minati and A. Trotta, in Proceedings of the 2004 Conference of the Italian Systems Society (Springer, New York, 2006).
2. B. de Finetti, Theory of Probability: A Critical Introductory Treatment (translated by A. Machì and A. Smith) (Wiley, London, 1974).
3. B. de Finetti, Annales de l'Institut Henri Poincaré 7(1) (1937).
4. B. de Finetti, La logica dell'incerto (Il Saggiatore, Milano, 1989).
EMERGENCE AND GRAVITATIONAL CONJECTURES
PAOLO ALLIEVI(1), ALBERTO TROTTA(2) (1) Sogin S.p.A., Via Torino 6, 00184 Roma, Italy E-mail:
[email protected] (2) Department of Mathematics, Scientific Lyceum ”Innocenzo XII” Via Ardeatina 87, 00042 Anzio (RM), Italy E-mail:
[email protected]
The behaviour of coherent structures emerging as the outcome of a phase transition can be ruled by classical or by quantum laws. The latter circumstance depends in a critical way on the relative importance of quantum fluctuations, which, in turn, depends on the numerical value of Planck's constant. In this paper we explore the consequences of the hypothesis according to which there are different kinds of Planck's constant, each one related to the kind of interaction entering into play in the specific phase transition. We deal with the simplest case, in which there are only two Planck's constants: the usual one, interpreted as related to electromagnetic interactions, and another, related to gravitational interactions. We feel this framework should be useful to describe cosmological phase transitions, such as galaxy and star formation, as well as the birth of black holes. According to our hypotheses, these emerging coherent structures should be ruled by suitable quantum laws (expressed, for instance, by a suitable kind of Schrödinger equation) including a "gravitational" Planck's constant. Even if the present paper deals with the particular case of gravitational interactions, it seems that its methodology could be useful for studying other kinds of emergent phenomena. Keywords: Planck's constant, gravitational interaction, corpuscular models, gravitational waves.
1. Introduction
The behaviour of coherent structures emerging as the outcome of a phase transition can be ruled by classical or by quantum laws. The latter circumstance depends in a critical way on the relative importance of quantum fluctuations, which, in turn, depends on the numerical value of Planck's constant. The latter is usually assumed to be a universal constant, whose value is very small. It is, however, worthwhile to explore the consequences of the hypothesis according to which there are different kinds of Planck's constant, each one related to the kind of interaction entering into play in the specific phase transition. Were this the case, the whole theory of emergence would have to be reformulated. In this paper we deal with the simplest case, in which we have only two Planck's constants: the usual one, interpreted as related to electromagnetic interactions, and another,
P. Allievi and A. Trotta
related to gravitational interactions. In order to keep the theory as simple as possible, we adopted a semiclassical description, based on a heuristic corpuscular model of long-range interactions, which lets us find the numerical value of the "gravitational" Planck's constant, as well as give a more correct estimate of the frequency of gravitational waves. We feel this framework should be useful to describe cosmological phase transitions, such as galaxy and star formation, as well as the birth of black holes. According to our hypotheses, these emerging coherent structures should be ruled by suitable quantum laws (expressed, for instance, by a suitable kind of Schrödinger equation) including a "gravitational" Planck's constant. We even found a more general formula to compute the value of the associated Planck's constant for whatever kind of long-range interaction, provided it can be described by a corpuscular model. Even if the present paper deals with the particular case of gravitational interactions, it seems that its methodology could be useful for studying other kinds of emergent phenomena. As a consequence of the foregoing, the paper deals with the following topics:
• the order of magnitude of the graviton wavelength λG = 10^13 m and frequency νG = c/λG = (3·10^8)/(10^13) ≅ 10^-5 Hz (section 2),
• the gravitational Schrödinger equation (section 4) and the gravitational Planck's constant hg = 2·10^21 h (section 6, where h = 6.62·10^-34 Js is the electromagnetic Planck's constant),
• the quantized orbits of the solar system (section 7),
• the limit mass Mlimit = 284 solar masses of the black hole (section 8).
The conclusions are important because it is possible to place the frequency range of the gravitational waves around 10^-5 Hz, a circumstance allowing precise experimental tests.

The gravitational Planck's constant hg = 2·10^21 h and the graviton wavelength λG allow us to estimate the order of magnitude (10^100) of the ratio of the largest (lPlanckG = 10^67 cm) to the smallest (lPlanckE = 10^-33 cm) Universe dimension. The topic has been treated in a classical/quantum way in order to find a confirmation of a heuristic corpuscular model of electrodynamics and gravitation (see Table 1, where the orders of magnitude of some parameters of interest are shown, and Table 2). In fact, considering a heuristic corpuscular model of electrodynamics [1], based on the hypothesis that every charge naturally and continually emits particles of energy εf ≅ 10^-24 eV and diameter df ≅ 10^-19 m, it is possible to state the mathematical structures of ε0 (permittivity of free space) and h (electromagnetic Planck's constant).
Table 1. Comparison between the atomic and the solar system (orders of magnitude). The atomic diameters D, multiplied by the scale factor 10^22, reproduce the dimensions of the solar system.

ATOMIC system                                                  SOLAR system
Particle      Mass (kg)  Mass (eV)  Diameter D (m)  λ (m)      10^22·D (m)  Diameter (m)  Mass (kg)  Heavenly body
Atom U238     4E-25      2E+11      1E-10           -          1E+12        1.6E+12       -          Sun-Jupiter
Photon (uv)   2E-35      1E+01      1E-10           1E-07      1E+12        -             -          -
Nucleus U238  4E-25      2E+11      1E-13           -          1E+09        1.4E+09       2E+30      Sun
Nucleon       1.67E-27   9E+08      1E-14           -          1E+08        1.4E+08       2E+27      Jupiter
Graviton      2E-34      1E+02      1E-14           1E-13      1E+08        -             -          -
Electron      9E-31      5E+05      1E-15           -          1E+07        1.3E+07       6E+24      Earth
Fotino        2E-60      1E-24      1E-19           -          1E+03        1.0E+03       1E+10      mountain
Gravitino     2E-84      1E-48      1E-22           -          1            1             1E+03      rock

Table 2. Present universal values (constant over time?).

                  stationary   waves
Gravitation       G, c         hg, c
Electromagnetism  ε0, c        h, c
me = const        e²           e
In particular, for h we have [2]:

h = (2·9π/16) · (kD²/k⁴) · (df/(εf c)) ≅ (2·9π/16) · ((10^21)²/(3·10^13)^4) · ((10^-19 · 1.6·10^-19)/(10^-24 · 3·10^8)) = 6.62·10^-34 Js   (1)

(the factor 1.6·10^-19 converts the energy εf from eV to J). Considering, moreover, a heuristic corpuscular model of gravitation [3], based on the hypothesis that every body naturally and continually emits particles of energy εg ≅ 10^-48 eV and diameter dg ≅ 10^-22 m, with time constant τ = (c Rp²)/(4 mp G) = 2·10^9 years (where Rp and mp are respectively the proton radius and mass), it is possible to state the mathematical values of G (gravitational constant) and hg (gravitational Planck's constant).
In particular, for hg we have [4]:

hg = (2·9π/16) · (kD²/k⁴) · (dg/(εg c)) ≅ (2·9π/16) · ((10^21)²/(3·10^13)^4) · ((10^-22 · 1.6·10^-19)/(10^-48 · 3·10^8)) ≅ 1.32·10^-12 Js   (2)

Some consequences of such a corpuscular model of gravitation are the following: (1) the velocity of light c decreases over time (3 m/s in 10 years); (2) the solar system expands; (3) the Earth's radius increases by 3 mm/year; (4) the recurrence period of earthquakes is 7 years; (5) the Earth's revolution (year) and rotation (day) periods increase respectively by about 8 s and 0.004 s in 100 years. Therefore the ratio between hg and h is:

hg/h = (εf dg)/(εg df) ≅ 2·10^21   (3)

It is noticeable that for any other process X by which particles of energy εx and diameter dx are emitted, it is possible to state the following mathematical structure of the new emerging physical quantity:

hx = (2·9π/16) · (kD²/k⁴) · (dx/(εx c))   (4)
2. Tuning Error
Although the theory of general relativity foresees that an accelerated mass emits gravitational waves, which travel through space at the velocity of light and consist of gravitons, since 1950 scientists have been trying in vain to prove the existence of gravitational waves. Antenna systems to catch gravitational waves have been built for resonance frequencies of the order of some kHz. In the following we suppose that there is a tuning error in the astronomical observations, in that too high frequencies are considered. We moreover suppose that the mass and the dynamical parameters of the heavenly bodies which emit gravitational waves determine only the wave intensity and not its frequency, which is characteristic of the elementary units (accelerated nuclei) constituting the heavenly bodies. If we refer to Table 3, we can see that, proceeding from the strong interaction to the electromagnetic one and then to the gravitational one, the characteristic frequencies associated with the particles transferring the action progressively decrease. We therefore assume that this is a natural manifestation shown by all existing matter, which leads us to place the frequency range of the gravitational waves around 10^-5 Hz.
Table 3. Characteristic frequencies associated with the particles transferring the action.

Emitting source   Source radius (m)  Radiation (particle)           Wavelength (m)   Frequency (Hz)
Nucleus           Rn = 1E-14         Gamma (photon)                 λn = 1E-12       fn = 1E+20
Atom              RA = 1E-10         Light (photon)                 λA = 1E-06       fA = 1E+14
Macromolecules    Rmol = 1E-08       Thermic rays (photon)          λmol = 1E-04     fmol = 1E+12
Heavenly bodies   RHB = 1E+09        Gravitational wave (graviton)  λG (*) = 1E+13   fG = 1E-05

(*) λG = (RHB/RA)·λA = (10^9/10^-10)·10^-6 = 10^13 m.
The frequency νG of the gravitational wave can be calculated as follows. During the gravitational collapse, every elementary system constituting the collapsing mass can (given the relative abundance of hydrogen) prevalently be reduced to a nucleon of mass m which orbits round another nucleon, as shown in Figure 1. The order of magnitude of the wavelength λG of the graviton emitted by such a system is:

λG = 4π (r³c²/(G m))^(1/2) = 4π ((10^-10)³ · (3·10^8)² / (6.67·10^-11 · 1.67·10^-27))^(1/2) ≅ 10^13 meters   (5)

where r = 10^-10 m is the atomic radius, c = 3·10^8 m/s the velocity of light, G = 6.67·10^-11 Jm/kg² the gravitational constant and m = 1.67·10^-27 kg the nucleon mass. The corpuscular model of gravitation, based on the loss of mass [4], leads to the same value for λG. The order of magnitude of the frequency νG of the gravitational wave is then:

νG = c/λG = (3·10^8)/(10^13) ≅ 10^-5 Hz   (6)
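Eqs. (5) and (6) can be reproduced numerically with the rounded constants quoted in the text (our check of the arithmetic, not part of the original):

```python
from math import pi, sqrt

r = 1e-10       # atomic radius, m
c = 3e8         # velocity of light, m/s
G = 6.67e-11    # gravitational constant
m = 1.67e-27    # nucleon mass, kg

lambda_G = 4 * pi * sqrt(r**3 * c**2 / (G * m))   # Eq. (5), order 10^13 m
nu_G = c / lambda_G                               # Eq. (6), order 10^-5 Hz
```

The computed values are λG ≈ 1.1·10^13 m and νG ≈ 2.7·10^-5 Hz, consistent with the orders of magnitude claimed in the text.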
3. Planck's Quantum-Electromagnetic Length (lPlanckE)
After the Big Bang, the following equation of energy conservation is valid:

E0 = Mc² + Up   (7)

where, with reference to the Universe, E0 is the total energy,
Figure 1. A nucleon of mass m which orbits round another nucleon (separation 2r, speed v).
M = Mrest (1 − v²/c²)^(-1/2) ≅ Mrest (1 + (1/2)(v²/c²)) the mass, v the expansion speed, r the radius, c the velocity of light and Up = −(3/5) G M²/r the potential energy (G is the universal attraction constant). Differentiating Eq. (7), we have:

0 = dM − (6/5)(G M/(c² r)) dM + (3/5)(G M²/(c² r²)) dr   (8)

which becomes:

1 = (3/5)(G M/(c² r)) [2 − (M/r)(dr/dM)]   (9)
The quantization of the angular momentum yields:

M v r = h/(2π)   (10)

or:

M = h/(2π v r)   (11)

where h is Planck's constant.
Putting Eq. (11) into Eq. (9), the latter becomes:

1 = (3 G h/(10π c² v r²)) [2 − (h/(2π v r²))(dr/dM)]   (12)

Bearing Eq. (35) in mind (dr/dM = −(7/9)(r/M), so that, by Eq. (11), (h/(2π v r²))(dr/dM) = −7/9), Eq. (12) becomes:

1 = (3 G h/(10π c² v r²)) [2 + 7/9] = 5 G h/(6π c² v r²)   (13)

Extracting r from Eq. (13), we have:

r² = 5 G h/(6π c² v)   (14)

At the beginning of the Big Bang v = c, so Eq. (14) becomes, denoting by lPlanckE the Planck quantum-electromagnetic length:

l²PlanckE = 5 G h/(6π c³)   (15)

and finally:

lPlanckE = (5 G h/(6π c³))^(1/2)   (16)

Utilizing the present values of the constants, G = 6.67·10^-11 Jm/kg², h = 6.62·10^-34 Js, c = 3·10^8 m/s, Eq. (16) yields the following value for lPlanckE:

lPlanckE = ((5 · 6.67·10^-11 · 6.62·10^-34)/(6π · (3·10^8)³))^(1/2) = 2·10^-35 m ≅ 10^-33 cm   (17)
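Eq. (17) can be checked directly (again using the rounded constants of the text):

```python
from math import pi, sqrt

G = 6.67e-11    # gravitational constant, Jm/kg^2
h = 6.62e-34    # electromagnetic Planck's constant, Js
c = 3e8         # velocity of light, m/s

l_planck_E = sqrt(5 * G * h / (6 * pi * c**3))   # Eq. (16), ~2e-35 m
```

The result, about 2.1·10^-35 m, reproduces the value quoted in Eq. (17).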
4. Planck's Quantum-Gravitational Length (lPlanckG)
The gravitational Schrödinger equation, for a system of two bodies of which one, of mass m, gravitates round the other, of mass M, at distance r, with velocity v and momentum p = mv, is:

−(hg²/(8π² mp²)) ΔΨ − (G M/r) Ψ = (E/m) Ψ   (18)

where Δ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² is the Laplacian, mp the nucleon mass,
hg the gravitational Planck's constant, E the total energy and Ψ(x,y,z,t) the state function of the above-mentioned system. Eq. (18) is determined in the following way. Let us consider the total energy of the system:

m v²/2 − G M m/r = E   (19)

or:

(1/2m) (m/mp)² mp² (vx² + vy² + vz²) − G M m/r = E   (20)

Let us assume the following operators (where i² = −1):

p_px = mp vx = (hg/(2πi)) ∂/∂x;  p_py = mp vy = (hg/(2πi)) ∂/∂y;  p_pz = mp vz = (hg/(2πi)) ∂/∂z   (21)

Inserting these operators in Eq. (20) and multiplying both sides by Ψ(x,y,z,t), we obtain:

(1/2m) (m/mp)² [((hg/(2πi)) ∂/∂x)² + ((hg/(2πi)) ∂/∂y)² + ((hg/(2πi)) ∂/∂z)²] Ψ − (G M m/r) Ψ = E Ψ   (22)
or:

−(hg²/(8π²)) (m/mp²) [∂²/∂x² + ∂²/∂y² + ∂²/∂z²] Ψ − (G M m/r) Ψ = E Ψ   (23)

and finally, dividing both sides by m:

−(hg²/(8π² mp²)) [∂²/∂x² + ∂²/∂y² + ∂²/∂z²] Ψ − (G M/r) Ψ = (E/m) Ψ   (24)

which is Eq. (18).
Comparing Eq. (18) with the well-known electromagnetic Schrödinger equation, which we cast in the form:

−(h²/(8π² me²)) ΔΨ − (e²/(4πε0 me r)) Ψ = (E/me) Ψ   (25)

where me and e are respectively the electron mass and charge and ε0 is the permittivity of free space, we deduce that we can utilize the results obtained from the differential equation (25), taking care to substitute into them:

hg/mp for h/me   (26)

G M for e²/(4πε0 me)   (27)
Therefore we can get the Planck quantum-gravitational length lPlanckG by utilizing the following expression of the Bohr radius for the hydrogen atom:

rBohr = h²ε0/(π me e²) = (h/me)² · (1/4π²) · 1/(e²/(4πε0 me))   (28)

By the substitutions (26) and (27), and putting, for the hydrogen atom, M = mp:

lPlanckG = (hg/mp)² · (1/4π²) · 1/(G mp) = hg²/(4π² G mp³)   (29)
Remembering Eq. (44) of section 6, we can finally compute the following value for lPlanckG:

lPlanckG = (2·10^21 h)²/(4π² G mp³) = (2·10^21 · 6.62·10^-34)²/(4π² · 6.67·10^-11 · (1.67·10^-27)³) ≅ 10^65 m = 10^67 cm   (30)

Remembering Eq. (43), lPlanckG can also be computed through the expression:

lPlanckG = h³c/(4π² G² mp⁴ me)   (31)

5. Change of the Radius R of a Heavenly Body when its Mass M Changes
We can show that the volume change dV of a heavenly body, when its mass M changes, is:

dV/V = −3 dM/M + α dM/M + β dM/M   (32)
       (gravitational effect + fusion kinetic effect + mass effect)

where, for example: α = 2, β = 1 for the black hole; α = 0, β = 1 for the Earth; α = 3.3, β = 1 for the star; α = 2/3 ≅ 0.66 for the kinetic effect alone.
For the whole Universe, whose mass M decreases over time after the Big Bang (because part of it changes into energy), Eq. (32) becomes:

dV/V = 3 dR/R = −3 dM/M + (2/3) dM/M = (−3 + 2/3) dM/M = −(7/3) dM/M   (33)
               (gravitational effect + kinetic effect)

consequently:

dR/R = −(7/9) dM/M   (34)

and so:

dR/dM = −(7/9) R/M   (35)
6. Gravitational Planck's Constant hg
Comparing the physical characteristics of the graviton and the photon, it is possible to arrive at the result that the ratio of the gravitational Planck's constant hg to the electromagnetic one h is equal to the ratio of λG (graviton wavelength) to λF (photon wavelength), that is:

hg/h = λG/λF   (36)

This expression is coherent with the supposition that the order of magnitude of the photon energy EF = hνF = hc/λF is equal to that of the graviton energy EG = hgνG = hg c/λG. We know that the wavelength λG of the graviton, emitted by a system of a nucleon orbiting round another nucleon at distance r, is:

λG = (4π² r³/RSchwarzschild)^(1/2) = (4π² r³ c²/(G mp))^(1/2) ≅ 10^13 meters   (37)

where RSchwarzschild = G mp/c², mp is the nucleon mass and r the atomic radius.
On the other hand, we know that the wavelength λF of the photon, emitted again by the hydrogen atom, is, remembering Eq. (27):

λF = (4π² r³ c²/(e²/(4πε0 me)))^(1/2) ≅ (4π² r³ c²/(hc/me))^(1/2)   (38)

where e = 1.6·10^-19 C and me = 9.1·10^-31 kg are respectively the electron charge and mass and ε0 = 8.85·10^-12 F/m is the permittivity of free space. By relations (37) and (38), Eq. (36) becomes:

hg/h = λG/λF = ((hc/me)/(G mp))^(1/2) = (hc/(G mp me))^(1/2)   (39)

Putting into Eq. (39) h = 6.62·10^-34 Js, c = 3·10^8 m/s, G = 6.67·10^-11 Jm/kg², mp = 1.67·10^-27 kg and me = 9.1·10^-31 kg, the ratio hg/h assumes the following value:

hg/h = ((6.62·10^-34 · 3·10^8)/(6.67·10^-11 · 1.67·10^-27 · 9.1·10^-31))^(1/2) = 1.4·10^21   (40)

Therefore the value of the gravitational Planck's constant is given by:

hg = 1.4·10^21 · h = 1.4·10^21 · 6.62·10^-34 Js = 9.3·10^-13 Js   (41)

From Eq. (39) we moreover have, raising it to the 2nd power:

hg²/h² = hc/(G mp me)   (42)

or:

hg² = h³c/(G mp me)   (43)

It is possible to express the gravitational Planck's constant in the following different way:

hg/h = (1/(4πε0 G))^(1/2) · (e/me) = 2·10^21   (44)
This expression is obtained by taking into account that the photon/graviton force has the following mathematical structure:

F = (1/c)(ΔE/Δt) ∝ 1/λ²   (45)

and therefore, remembering Eq. (36):

(hg/h)² = (λG/λF)² = FF/FG = (e²/(4πε0 r²))/(G me²/r²) = e²/(4πε0 G me²)   (46)
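The two estimates of hg/h (1.4·10^21 from Eq. (39) and 2·10^21 from Eq. (44)), together with the length scales of sections 3 and 4 and the 10^100 ratio claimed in the introduction, can be checked numerically (our sketch, using the text's rounded constants):

```python
from math import pi, sqrt

h    = 6.62e-34   # electromagnetic Planck's constant, Js
c    = 3e8        # velocity of light, m/s
G    = 6.67e-11   # gravitational constant
m_p  = 1.67e-27   # nucleon mass, kg
m_e  = 9.1e-31    # electron mass, kg
e    = 1.6e-19    # electron charge, C
eps0 = 8.85e-12   # permittivity of free space, F/m

ratio_39 = sqrt(h * c / (G * m_p * m_e))          # Eq. (39), ~1.4e21
ratio_44 = e / (m_e * sqrt(4 * pi * eps0 * G))    # Eq. (44), ~2e21

# Largest/smallest length scales of the introduction:
h_g = 2e21 * h
l_E = sqrt(5 * G * h / (6 * pi * c**3))           # Eq. (16), ~2e-35 m
l_G = h_g**2 / (4 * pi**2 * G * m_p**3)           # Eq. (30), ~1.4e65 m
```

The ratio l_G/l_E comes out of order 10^100, as stated in the introduction.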
7. Stationary States (Stationary Orbits) of the Solar System
As an example we utilize the results obtained in section 4 for the solar system. Recalling the values of the physical constants mentioned above, and utilizing for the solar mass the value M = MS = 2·10^30 kg, we compute the radius of the 1st stationary orbit and the corresponding total energy (without intrinsic energy) per orbiting unit mass:

r1 = hg²/(4π² mp² G M) = (2·10^21 · 6.62·10^-34)²/(4π² · (1.67·10^-27)² · 6.67·10^-11 · 2·10^30) = 120 000 km   (47)

E1/m = −(1/2)(G M/r1) = −(1/2) · (6.67·10^-11 · 2·10^30)/(1.2·10^8) = −5.5·10^11 J/kg   (48)

or, if we express the energy per unit mass:

(E1/c²)/m = −6·10^-6 kg per orbiting kg   (49)

It is interesting to note that the energy necessary for the orbiting unit mass to go from the energy level E1 (1st stationary orbit) to infinity is:

(En→∞ − E1)/(m c²) = −E1/(m c²) = 6·10^-6 kg per orbiting kg   (50)
The full mathematical description of the system is analytically intractable and is left for future developments.
Table 3 shows the radii rn of the stationary orbits, for some significant values of n, to which the real orbits of the planets correspond. The values of rn are computed by resorting to the following relationship:

rn = n² · r1 = n² · 120 000 km   (51)
8. Limit Mass of a Black Hole
Dirac's equations for an electron in a central field, as for example in hydrogenoid systems, yield the following expressions for the total energy Wn,j,H of the stationary states of the hydrogenoid system (Wn,j,H = me c² + En, that is: electron intrinsic energy + electron kinetic energy + electromagnetic potential energy of the electron/nucleus system, disregarding the intrinsic energy of the nucleus, which is at rest) and for the radii rn,j,H of the stationary orbits:

Wn,j,H = me c² [1 + α²/(((( j + 1/2 )² − α²)^(1/2) + n')²)]^(-1/2)   (52)

rn,j,H = −(1/2) · (Z e²/(4πε0)) · 1/(Wn,j,H − me c²) = (1/2) · (Z e²/(4πε0 me c²)) · 1/(1 − Wn,j,H/(me c²))   (53)

where:

α = (2π/(c h)) · (Z e²/(4πε0)) = (2π me/(c h)) · (Z e²/(4πε0 me)) = Z/137   (54)

is, for Z = 1, the fine-structure constant,
j = l ± 1/2 is the internal quantum number,
l = 0, 1, … is the azimuthal quantum number,
n = j + 1/2 + n' = 1, 2, … is the total quantum number,
and the symbol H, initial letter of Hydrogen, refers to hydrogenoid systems of atomic number Z, while the values of the other constants are defined in sections 4 and 6.
Table 3. Radii rn of the stationary orbits, to which the real orbits of the planets correspond.

n    Planet   Planet mass mP (kg)  Average distance from Sun (10^6 km)  rn (10^6 km)  Orbiting velocity vn (km/s)
1    -        -                    -                                    0.12          1054
2    -        -                    -                                    0.48          527
3    -        -                    -                                    1.08          351
22   Mercury  3.3E+23              58                                   58            48
30   Venus    4.9E+24              108                                  108           35
35   Earth    6.0E+24              150                                  147           30
44   Mars     6.5E+23              228                                  232           24
80   Jupiter  1.9E+27              778                                  768           13
109  Saturn   5.7E+26              1427                                 1426          10
155  Uranus   8.7E+25              2870                                 2883          7
194  Neptune  1.0E+26              4496                                 4516          5
222  Pluto    1.0E+22              5900                                 5914          4.7
Expressions (52) and (53), on the grounds of the above-mentioned analogy between electromagnetic and gravitational phenomena (see section 4), can also be utilized for gravitational phenomena, as in the case of a heavenly body of mass m which gravitates, at a distance r, round another heavenly body, at rest, of greater mass M, when we substitute into them (see relations (26) and (27)): m for me, αg for α, hg/mn for h/me and G M for (Z e²)/(4πε0 me). Relations (52), (53) and (54) therefore become, for gravitational phenomena:

Wn,j = m c² [1 + αg²/(((( j + 1/2 )² − αg²)^(1/2) + n')²)]^(-1/2)   (55)

rn,j = (1/2) · (G M/c²) · 1/(1 − Wn,j/(m c²))   (56)

where:
mn = mp = 1.67·10^-27 kg is the nucleon mass,
hg = 2·10^21 h is the gravitational Planck's constant,

αg = (2π mn/(c hg)) · G M   (57)

is the gravitational fine-structure constant,
j = l ± 1/2 is the internal quantum number,
l = 0, 1, … is the azimuthal quantum number,
n = j + 1/2 + n' = 1, 2, … is the total quantum number.
Wn,j are the values of the total energy of system stationary states (Wn,j = mc2 + T + U = mc2 + E, that is: intrinsic energy of mass m + kinetic energy of mass m + potential energy of the system of two masses m and M ), disregarding intrinsic energy of mass M that is Mc2. The fundamental state of the gravitational system in argument is individuated by the following values of the quantum numbers:
n = 1,  l = 0     (58)

consequently, by conditions (57), we deduce:

j = ½  and  n′ = 0.     (59)
Inserting the values (58) and (59) into Eq. (55), the total energy of the system in the fundamental state, per unit rest energy of the mass orbiting round mass M, is:
W_{1,½} / (mc²) = [1 + α_g² / (1 − α_g²)]^(−1/2) = √(1 − α_g²)     (60)
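The algebraic step in Eq. (60) is easy to check numerically; the snippet below is our verification, not part of the paper:

```python
import math

def W_ratio(alpha_g):
    """W_{1,1/2} / (m c^2) from Eq. (55) with n' = 0 and j = 1/2."""
    return (1 + alpha_g**2 / (1 - alpha_g**2)) ** -0.5

# Eq. (60) states this equals sqrt(1 - alpha_g^2); check a few values
for a in (0.1, 0.5, 0.9):
    assert math.isclose(W_ratio(a), math.sqrt(1 - a**2))
print("Eq. (60) verified")
```

As α_g approaches 1 the ratio goes to 0, consistent with Eq. (62) below.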
If mass M reaches the following limit value:
M_limit = c·h_g / (2π·G·m_n) = (3·10⁸ · 2·10²¹ · 6.62·10⁻³⁴) / (2π · 6.67·10⁻¹¹ · 1.67·10⁻²⁷) = 5.7·10³² kg = 284 M_Sun     (61)
where M_Sun = 2·10³⁰ kg, then, by relation (57), we have α_g = 1 and expression (60) becomes:
W_{1,½} = 0     (62)
from which:
mc² = −E_{1,½}     (63)
while the radius of the 1st stationary (fundamental) orbit is, recalling Eqs. (56), (61) and (62):

r_{1,½} = G·M_limit / (2c²) = h_g / (4π·c·m_n) = (2·10²¹ · 6.62·10⁻³⁴) / (4π · 3·10⁸ · 1.67·10⁻²⁷) = 210 km     (64)
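Eqs. (61) and (64) can be re-derived from the stated constants; the following snippet is our numerical check, not part of the paper:

```python
import math

G = 6.67e-11           # gravitational constant, m^3 kg^-1 s^-2
c = 3e8                # speed of light, m/s
h = 6.62e-34           # Planck's constant, J s
h_g = 2e21 * h         # gravitational Planck's constant
m_n = 1.67e-27         # nucleon mass, kg
M_sun = 2e30           # solar mass, kg

# Eq. (61): limit mass for which alpha_g = 1
M_limit = c * h_g / (2 * math.pi * G * m_n)
print(f"M_limit = {M_limit:.2e} kg = {M_limit / M_sun:.0f} M_sun")

# Eq. (64): radius of the fundamental orbit at the limit mass
r_fund = h_g / (4 * math.pi * c * m_n)
print(f"r_1,1/2 = {r_fund / 1e3:.0f} km")

# consistency check: same radius from G * M_limit / (2 c^2)
assert math.isclose(r_fund, G * M_limit / (2 * c**2))
```

Both printed values reproduce the figures in the text: roughly 284 solar masses and a fundamental radius of about 210 km.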
Relation (63) expresses that mc² is equal to the energy necessary for mass m to leave the fundamental orbit and, overcoming the gravitational field due to mass M, to reach infinity with remaining mass equal to zero. This behaviour is peculiar to masses placed on the surface of a black hole. Therefore, the radius r_{1,½} = 210 km also represents the radius of the limit mass M_limit = 284 solar masses, which then behaves as a black hole.

9. Conclusions

This article shows that it is possible to place the frequency range of gravitational waves around 10⁻⁵ Hz. This allows detailed experimental tests. Moreover, we sketched how, by making use of a corpuscular semiclassical approach, we can compute the values of two fundamental Planck's constants associated with two different interactions, as well as their interrelationships. The method could easily be extended to other contexts, allowing the building of a general theory of emergence with a multiplicity of different "Planck's constants". This could generalize in an interesting way the theories of emergence introduced so far.

References

1. P. Allievi, Theory on the corpuscular nature of the interactions among moving
charged particles, in Proceedings of the Mathesis National Conferences (Anzio/Nettuno, 2004).
2. P. Allievi, Theory on the Photon structure, in Proceedings of the Mathesis National Conferences (Anzio/Nettuno, 2004).
3. P. Allievi, Theory on the corpuscular nature of the gravitation, in Proceedings of the Mathesis National Conferences (Trento, 2006).
4. P. Allievi, Theory on the Graviton structure, in Proceedings of the Mathesis National Conferences (Trento, 2006).
EMERGENCE IN SOCIAL SYSTEMS
This page intentionally left blank
INDUCING SYSTEMS THINKING IN CONSUMER SOCIETIES
GIANFRANCO MINATI(1,2), LARRY A. MAGLIOCCA(2) (1) Italian Systems Society, Milan, Italy E-mail:
[email protected] (2) Ohio State University, Columbus, Ohio, USA E-mail:
[email protected] We introduce some core principles related to systems thinking: interaction, establishment of systems through organization and self-organization (emergence), and the constructivist role of the observer including the use of language. It is not effective to deal with systemic properties in a non-systemic way, by adopting a reductionist way of thinking, i.e., when properties acquired by systems are considered as properties possessed by objects. We consider the reduced language adopted in consumer societies as functional to maintain consumerist attitude. In consumer societies, language is suitable for maintaining people in the role of consumers with a limited ability to design and create. In this context freedom is intended as freedom of choice. To counteract this reduced language, we propose the diffusion of suitable games, cartoons, comics and pictures, images, concepts and words which can enrich everyday language, especially that of young people, and provide an effective way for inducing some elementary aspects of systems thinking in everyday life. The purpose is to have a language to design and develop things and not merely to select from what is already available. We list a number of proposals for the design of such games, stories and pictures. Keywords: consumerism, games, induction, language, systems thinking.
1. Introduction

In this paper we use the term Systemics to refer to the usage of systemic concepts in various activities, such as scientific research, management, education, economics, politics and culture in general. Some currently used expressions and terms are: systems theory; systems thinking; general system theory; and systemic view, principles, approach, properties and problems, as frequently introduced and discussed in the scientific literature. Systems are not observer-independent, but observer-dependent, in the sense that it is the observer who models a phenomenon as a system. This contrasts with the objectivistic view assuming reality to exist as it is. In Systemics, modeling is carried out by the observer through a language used to represent it. This approach is known as constructivism. The opposite of Systemics is reductionism. It is based on assuming that the level of description related to the composing elements is a generally effective strategy for dealing with systems, and that the macroscopic level is a linear
extension of the microscopic one. In this paper we will better specify the concepts mentioned above and use them for dealing with social systems, i.e., communities established by interacting autonomous agents. An agent is said to be autonomous when provided with a cognitive system allowing it to decide how to interact. Examples of social systems are anthills, beehives, teams, temporal communities (queues and markets), corporations, cities and hospitals. This paper focuses on the usage of reductionism in consumer social systems (i.e., societies supporting their economic activities by artificially increasing the consumption of resources and products). Support for this consumer attitude is given through the use of a reductionist language reducing systems to objects and processes to products. This is a way to hide the inherent non-sustainability of the underlying and induced processes. We consider that a positive contribution towards allowing social systems to overcome this non-sustainable consumer phase is to introduce and expand the usage of non-reductionist language in daily life, for instance through games, cartoons, comics, pictures and stories able to induce systems thinking and to make people realize its full complexity and effects.

2. The general view

There are many instances in the scientific literature describing how it is ineffective to deal with social issues (e.g., economic, educational, or related to family, health and safety) without considering the need to model them using a systemic approach, based on the fundamental work by Bertalanffy [1] and other researchers [2,3,4,5]. In more recent times the usage of systemic models based on emergence has been introduced to deal with problems of self-organized social systems, such as industrial districts, markets and organizational learning [6,7,8], and living systems such as flocks and swarms, modeled using, for instance, Artificial Life [9,10] and Swarm Intelligence [11,12,13].
Systemic concepts are more and more widespread at professional and academic levels, whereas common, everyday thinking is based on simplified and reductionist approaches. We believe that one reason may be that consumer societies sustain this approach by diffusing short-term, symptomatic, cause-effect, and local views for the marketing of solutions and functions with no reference to the more general picture. Within the framework of the constructivist approach illustrated later, this is mainly done by imposing a language (through processes of standardization, reduction, and simplification of verbal and pictorial language used in advertising, TV, and other media) which only represents and deals with what is considered suitable for consumers. Induction [14] is a logical inference
allowing one to infer from a finite number of particular cases to another case or to a general conclusion. For instance, if we extract a number of balls from a given box and see that all are white, then we may infer that all the balls in the box are white. We introduce here the idea of using games, comics, cartoons and stories for popularizing the core principles of Systemics, as a contribution towards reproducing systemic problems in the players. In this way it is possible to induce some elementary aspects of systems thinking, suitable for designing behaviors and not just applying, optimizing and using current rules (i.e., designing vs. playing a game with predefined rules) derived from reductionist thinking. In short, the idea is to induce the use of a language suitable for constructivistically representing and managing systems in a non-reductionist way. The purpose is to induce systemic concepts, representations and approaches to be adopted as commonplace when considering many other problems.

3. Constructivism and Language

In this section we will focus upon the constructivist role of the observer based on language, when dealing with natural (i.e., non-artificially designed) systems. The role of the observer is not to perturb or produce relativity, as in classical views, but, following the introduction of Gestalt psychology [15], Cognitive Science [16,17], and Constructivism [18], to create cognitive existence, as when the observer detects/cognitively generates coherence (e.g., dealing with self-organized, emergent phenomena, such as swarming and flocking). The existence of the phenomenon is necessarily related to the cognitive model used by the observer [19,20].

3.1. Constructivism

The constructivist approach, or constructivism [18,21,22,23], has historically been connected with the principles mentioned above.
Von Glasersfeld, for instance, asks: “What is radical constructivism?” He defines it as an unconventional approach to the problem of knowledge and knowing. It starts from the assumption that knowledge, no matter how it is defined, is in the heads of people, and that the thinking subject has no alternative but to construct what he or she knows on the basis of his or her own experience. What we make of experience constitutes the only world we consciously live in. It can be sorted into many kinds, such as things, self, others, and so on. But all kinds of experience are essentially subjective, and though I may find reasons to believe that my experience may not be unlike yours, I have no way of knowing that it is
the same. The experience and interpretation of language are no exception [18]. In the same book he explicitly deals with the issue of language (in Chapter 1, where a paragraph is entitled "Which language tells it 'As It Is'?", p. 2, and in Chapter 7, in a paragraph entitled "Language and Reality", p. 136). The Sapir-Whorf hypothesis (see Section 3.2) is also mentioned there, on p. 3, as an important source for his work. To summarize, we may say that the more extensive, more accurate, more articulated our language is, and the more able it is to express nuances and abstractions, the more effective we can be, because, correspondingly, we may construct sophisticated representations and model our action with increasing accuracy. In a constructivist view, properties depend upon the level of description adopted by the observer [24] (in short, a level of description used by an observer is given by the disciplinary knowledge used, the purposes of the observer in modeling, and the kind and quantity of variables, scaling, relations and interactions used to model a system). Examples are:

• A ballpoint pen may be intended as an object (for suppliers, sellers and users) or as a system of interacting components (for instance, for the designer).
• A device, such as a TV set, may be intended as an object (for suppliers, sellers and users) or as a system of interacting components (for a technician having to fix it).
• An autonomous system (i.e., a system provided with a cognitive system) may be intended as a buyer (i.e., an agent making an economic transaction) or as a system of interacting components (for a physician or a psychologist considering the future usage and effects of what is bought).

3.2. Language

There are controversial theories regarding different research aspects of language, for instance language and human behavior, linguistics and cognitive science, general semantics theory, knowledge representation, the linking of human processes of thinking to human behavior and learning, the linking of the thinking process to language, man-machine interfacing, representing meaning through language, the linguistic construction of reality, and translation [25,26,27,28,29,30,31,32,33,34,35,36,37]. For the purpose of this paper we will limit ourselves to considering some of the most fundamental, general principles which have been the basis of research activity. For instance, how learning involves language and how language
influences learning, as introduced by Vygotsky: "The relationship between thought and word is not a thing but a process, a continual movement back and forth from thought to word and from word to thought: … thought is not merely expressed in words; it comes into existence through them" [38]. This view was subsequently elaborated and formulated as the celebrated Sapir-Whorf hypothesis [39,40] – now accepted in the weaker sense – mentioned in Chapter 10 of Von Bertalanffy's book cited above. In this context we just want to give a general idea of the approach. The general, 'strong' idea introduced by this approach (we mention below the so-called 'weaker' versions) is that what we can think is enabled by the language that we use for describing, hypothesizing, designing, rejecting, and so on. If we do not have the language to say it, it does not exist for us. There are many approaches for dealing with the ideas introduced by the Sapir-Whorf hypothesis. They may be briefly summarized [41] as below:

1. Strong hypothesis – language determines thinking;
2. Weak hypothesis – language influences perception and thinking;
3. Weakest hypothesis – language only influences memory.

Below we consider the weak hypothesis as the most suitable for dealing with processes of influencing and even manipulating the behavior of social systems. The purpose of this paper is not to propose how to teach systemic knowledge, but rather to make its adoption natural at the various social levels, such as in schools, families, workplaces, management, politics, and in particular social systems (i.e., hospitals, prisons, temporary communities – transport, social events, distribution – and so on). As mentioned in the introduction, we think that one possible, effective approach is to improve the language used in social systems for constructivistically representing and managing systems in a non-reductionist way [42,43].
Unfortunately, in consumer societies the purpose is to simplify and standardize [19,44,45] the languages used in everyday life, making people concentrate on marketing and business issues only.

4. How to have the language to imagine it?

As introduced by constructivism, there is a continuous correspondence between real lifestyles and what can be represented by the social language. We are focusing on the fact that the simplified language of consumer societies is, in short, suitable for supporting consumer activities such as selecting, comparing and optimizing. By using this kind of reduced and simplified language people
may only have wishes and projects that can be dealt with by consuming; see the issue related to the economics of consumer credit, invented to support this strategy. Moreover, the increase in consumer credit reduces, as a secondary effect, savings [46,47]. An example of the effects of this simplified language is given by confusing freedom with the freedom to select between pre-defined choices. In consumer societies the strategy is not to explicitly reduce freedom (as in authoritarian societies), but to strongly induce a mono-dimensional freedom, such as that of selecting between pre-established choices. In this way social systems make ineffective (nevertheless, they do not forbid) the selection of something that is not already on offer. Examples include:

• Replacement of a process with a product, such as products replacing diets, physical exercise and active remedies to correct unhealthy lifestyles. The possibility of selecting products is assumed to substitute for the possibility of changing lifestyle;
• Offering ways of spending free time (e.g., in shopping centers and attending organized events) versus autonomously designing activities (for instance, local excursions). Many people do not know their own town or country because there are no standardized micro-tourist offers. This is common for artistic places and beauty spots;
• Offering technologies for producing microclimates. The user can select a specific air-conditioner from the market, but cannot decide or influence the processes producing this need. Air conditioning helps to locally reduce temperature, by using energy and thus contributing to increasing the outside temperature;
• Offering pre-cooked food together with pharmaceutical products to reduce problems deriving from fast food and the preservatives it contains.
The user may select the product, but has no influence on the lifestyle producing those needs;
• Offering evening TV programs that the consumer may select, versus the possibility of reading a book, listening to music, discussing, writing or navigating the Internet. Alternative evenings are not forbidden, but unusual. See the pathological dependence of the young generations upon TV, very useful, on the one hand, to control children (i.e., avoiding having to spend time directly interacting with them) and to establish shared time in families (with possible low-level interactions limited to the issues proposed by TV) and, on the other, to advertise products. The user selects the programs, but not the lifestyle.
The language used by advertisements supports the reductionist view: attention is directed towards objects, and properties are assumed to be self-established and self-consistent, with little or no attention being paid to the underlying and related processes. Examples are:

• Advertisements using a language combining a given food with given effects, inducing the idea, without describing it explicitly, that this food produces those effects;
• Advertisements using a language combining specific products with values, such as naturalness, good health, reliability, beauty and success, inducing one to accept a relationship between them;
• Advertisements using references to science to support the truthfulness and objectivity of the quality of a product, even using scientific language to say something which is certainly not scientific (such as percentages and statistics referring not to variables, but to immeasurable aspects).

In this view it is more important to present a convincing statement than an accurate description of the product or the process. Previous classical strategies for manipulating social systems were based on persuading or convincing people to adopt certain beliefs, ideals, or political, religious or racial views. The strategy for manipulating and controlling social systems in order to establish consumer societies is different. It is sufficient to reduce the possibility of thinking in a different way and, above all, of imagining different scenarios. This strategy is pursued by depriving social systems of a language suitable for imagining and thus designing change. We think that one aspect of this strategy is to relegate systemic thinking to scientific and professional issues, kept separate from everyday thinking. As mentioned above, systemic properties are supported and maintained by continuous interactions of components, with the constructivist role of the observer modeling them in this way.
If we stop interactions, systemic properties disappear (e.g., device functions, teams, organizations and life itself), and if we ignore interactions, systemic properties become just properties. The reductionist way of considering systemic properties is to assume them to be objectivistic properties, ignoring the complex supporting processes of interaction and the constructivist role of the observer. Systemic properties are thus reduced to mere properties of objects and then considered as such.
5. Having a language for designing. The entry point

The ideal strategy, spreading into everyday activities the usage of systemic concepts for designing evolutionary rules and establishing more self-aware behavior, may be pursued by increasing the level of social knowledge. Moreover, in our societies the problem lies not so much with the non-availability of knowledge (available through the Internet, broadcasting, popularization, an increasing offer of books, and in schools), but with its poor or unsuitable usage. Knowledge is more accessible today than it has ever been. We believe that, in spite of this, its social influence has been de-activated by social behavioral models based, for instance, on language-manipulating techniques [19,45]. In order to reverse this de-activation, one possible approach is to diffuse suitable images of the meaning and usage of knowledge:

• to provide examples which tend to induce more general approaches, and
• to enrich everyday language in such a way as to make possible the modeling of life in a systemic way, thus allowing the detection of, representation of and focusing upon interactions and systemic properties.

Systemic knowledge has the power to allow the design of micro and macro evolutionary social rules, more than simply organizational ones. We believe that the most important target is the younger generation, allowing them to design new rules and not just perpetuate the current ones as the only possible choice. In economics, for instance, the perspective may not only be absolute and continuous growth in the same way [48,49,50]. Development may be intended as changing the ways and fields of growth [20,51]. It may be designed, or be a constraint deriving from previous growth processes.
One way to induce systems thinking may be through comics and cartoons, video backgrounds, images, films, linguistic games, interactive on-line videogames and quiz games based on systemic principles, able to enrich everyday language with words, images and concepts. The usual consumer offer of games, especially for young people in the age range 2–13 [52], is mostly for leisure and entertainment, implicitly advertising other products and services and supporting consumer lifestyles. In this section we list some approaches for the design of games and other means which may induce systemic thinking, i.e., lead one to consider the systemic aspect of problems and situations as natural.
5.1. Linguistic games

We propose linguistic games, such as:

• Translation of a phrase from a language into the same language (i.e., by using different words): what should be invariant is the meaning.
• Finding out what is lacking in a word to express a certain meaning and how to overcome it. For instance, how to compensate for the lack of conditional or subjunctive tenses for verbs?
• Finding out how the meaning of words lies not in the words themselves (as labels), but in interactions, within the mind of the observer, with the preceding and subsequent words. The same process occurs for the meaning of phrases within a story or a book.
• Building up ambiguous statements (whose meaning depends upon the observer and the context).
• Building up phrases by using predefined small sets of words and discovering meanings which are not possible to represent using only those words.

One way to play these games is to have different competing players or teams, as typically occurs in a classroom. Systemic content of the game: the issues relate to (1) the systemic nature of language, i.e., how meaning is established through the interactions between words in the mind of the observer, and (2) multiple representations, i.e., the process of re-formulating and translating, and their equivalence or non-equivalence.

5.2. Distinguishing between the composition of elements and the establishment of systems

This game has the purpose of distinguishing between processes of composing elements and of establishing systems. The process of composing is able to establish entities having new properties (new with respect to the component elements). However, in this case, the process of interaction is not required to be continuous. It is expected to produce some effects, giving rise to new properties which may be considered as non-systemic. Examples are the processes of cooking and of mixing colored water or light, giving rise to new colors (for instance, blue and yellow giving green).
On the contrary, the process of establishing systems (through organization and/or emergence) is based on continuous interaction between elements. Examples are properties of devices established when
powered on (i.e., when elements are made to interact, as in electronic circuits). One way to play a game involving these aspects may be to categorize processes as composition, organization and emergence. It may be based on producing pictures, movies, animations, on-line quizzes and discussions in classrooms. Systemic content of the game: the issues relate to distinguishing processes of composition of elements from the ways a system may be established by organizational and/or self-organizational rules. In the first case elements give rise to new structures, whereas in the second, systems are designed by setting explicit organizational rules or by setting behavioral rules for components.

5.3. Acting on a system

It is well known, and illustrated by the cases considered below, that acting at the microscopic level has non-linear effects on the macroscopic one and is an ineffective way of managing it. Acting upon single elements is a poor strategy for acting upon a system. The systemic view requires one to consider the interactions among elements and, specifically, the functional roles of elements in organized systems and the behavioral rules in emergent systems. Ways of influencing systems include:

• influencing interactions among elements, for instance, by perturbing through the introduction of noise and varying its intensity;
• changing the organizational rules followed by elements when interacting, such as changing how electronic components are connected or the organization in an assembly line. The replacement in this case is due to a missing role in the organization;
• changing the behavioral rules followed by elements when interacting, such as how single agents behave in traffic and anthills. The replacement in this case is due to an unorthodox behavior of the agent.

Besides considering specific cases, representing and commenting upon them, a very effective approach could be to act on simulated systems and then detect, comment upon and explain reactions.
It could be a useful idea for designing innovative educational videogames [53], for instance, in the framework of the well-known Game of Life, the cellular automaton devised in 1970 by J. H. Conway. It is a zero-player game because its evolution is determined by the initial state inserted by the player, who then observes how the system evolves. Other games close to this purpose available today are based upon loops of action-reaction, as in models based on System Dynamics, such as SimCity™.
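As an illustration (ours, not from the paper), the Game of Life can be sketched in a few lines of Python; the player only supplies the initial set of live cells and then observes the emergent evolution:

```python
from collections import Counter

def step(live):
    """One generation of Conway's Game of Life.

    `live` is a set of (x, y) coordinates of live cells; the fixed
    rules alone determine the evolution -- a zero-player game.
    """
    # count the live neighbours of every cell adjacent to a live cell
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # birth with exactly 3 live neighbours; survival with 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A 'blinker': the player sets the initial state, then only observes
gen0 = {(0, 0), (1, 0), (2, 0)}
gen1 = step(gen0)
print(sorted(gen1))   # [(1, -1), (1, 0), (1, 1)] -- the row flips to a column
```

Repeated application of `step` alternates between the two configurations, a minimal example of behavior that is a property of the interaction rules rather than of any single cell.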
Systemic content of the game: when any phenomenon is represented as a system, then effective action upon it requires the adoption of a systemic approach, and not merely acting upon individual components and processes assuming that this will produce linear effects upon the system.

5.4. More is not always better

The untying of a knot requires clarity about the entire configuration: applying stronger force usually makes the problem worse. The key point is to balance actions having in mind the global configuration of effects. The addition, for instance, of more energy and resources to a system does not linearly imply an improvement: a system may need to reduce its temperature and the amount of resources to be processed in order to allow reorganization. What is good for an element may not be good at all for the system. Local aspects cannot be extended linearly to the entire system. Games may be invented by contrasting cases where the purpose of the system is accomplished by increasing or by reducing input. Suitable supplies to the system are possible with a good knowledge of the whole and not only of its parts. Systemic content of the game: the expression 'more is better' is usually based on the assumption that such abundance refers to resources essential for the system and that the receiver can always limit and refuse any excess. The general idea is that abundance is better than shortage. Moreover, resources are often information per se. The limitation of one thing and the abundance of something else are important inputs for a system in order to establish, for instance, balancing, compensating and evolutionary processes. The supplier may have a wrong model of the system. Examples are diets and daily activities able to reinforce, model and balance a system.

5.5. Improvement of parts does not always imply improvement of the system

For instance, larger wings do not allow an airplane to fly faster, and higher production may be a disaster for a company lacking a suitable distribution network. The message is that the functioning of a system is based upon keeping a balance between inputs and effects, allowing variations over specific optimal ranges. This relates to the so-called homeostatic principle. Homeostasis is the property of an open system of regulating its internal environment so as to maintain a stable condition in a changing external environment. In biology, for instance, an automatic mechanism produces opposite reactions to external influences in order to maintain equilibrium. The homeostatic principle is the key ordering force for
all self-sustaining systems and, above all, living systems. Games may be based on quizzes and on finding examples where improving the performance of a system has effects upon component parts and their interactions. Systemic content of the game: another way to make evident how the strategy of acting upon an element is ineffective for acting upon the system.

5.6. A problem in a component part is a problem for the entire system

How do negative actions upon component parts affect the system or diffuse through it? We may consider two aspects of the problem by considering the evolutionary scenarios of a social system:

• Perturbing effects of an action upon parts and interactions have unbalancing effects throughout a system considered a long time after their occurrence. This delay worsens the situation because (a) corrective actions are often no longer possible, and (b) the system will have adapted its evolution on the assumption of the persistency of the action. This is the typical situation of non-sustainable processes. The system will adapt to the new situation by producing changes, e.g., adaptation to pollution or a poisoned food chain.
• Some adopt the approach that a negative situation caused by such an imbalance or non-sustainability will be solved by scientific improvements. This is what is called progress. Moreover, the point is that progress, when successful, will enable the system to adjust to the new situation by restructuring its evolutionary rules, not on the basis of its design, but so as to tolerate and adapt to changes as constraints due to particular, local interests. Consider the consumption of exhaustible resources, for example oil, used as a fuel while also being used for producing artificial substances, such as plastic. The arrival point is the same in both cases: a new reconfiguration of the system. Is this the only possible definition of development?
Analogous ways of thinking apply, for instance, to processes related to medicine, psychology and economics. A cartoon may show a house where the first floor is on fire, or a ship with a hole on one side, and people saying, "I am pleased that the fire is not in our room", or "I am pleased that the hole is not on our side". Help may arrive before the disaster, but the system will no longer be the same. Is this the only way to develop a system, by constraints?
Inducing Systems Thinking in Consumer Societies
Systemic content of the game: any negative, destructive effects on parts and interactions will have, in time, evolutionary, non-linear and different effects on the entire system.

6. Conclusions

Some basic concepts have been introduced about processes of the establishment of systems, constructivism and the role of the observer, based on language to model and produce cognitive reality. Dealing with systems requires systemic knowledge, as opposed to reductionist knowledge based on considering systems acquiring properties as objects possessing properties. Social systems should, in our view, be dealt with not only by using systemic knowledge, but also by taking into account that the agents involved are autonomous agents behaving through the use of their cognitive systems and cognitive models. In nature, the cognitive models used are normally fixed, predefined for each species (ants, for instance, build ant-hills, bees build beehives, and felines always hunt in the same way; only evolution can change that). Human beings may vary, invent, and design their own social systems through knowledge and cognitive models. The fundamental tool for designing worlds is language, as introduced by constructivism. Consumer societies are based on a reduced language maintaining usage of a specific cognitive model supporting marketing and consumption, by considering agents merely as economic agents, i.e. buyers. Reductionism applied to social systems has the consequence (indeed, in consumer societies, the purpose) of assuming social evolutionary rules to be fixed and not subject to improvements or changes. We have presented here some ideas for designing games and other approaches able to induce systemic ways of thinking in young people and, consequently, the ability and the freedom to design, i.e., to design the future and not just accept a continuation of the present. What is new in this paper? The approach for understanding the strategy at the basis of consumer societies.

References

1. L. von Bertalanffy, General System Theory: Foundations, Development, Applications (George Braziller, New York, 1968).
2. P. Checkland, Systems Thinking, Systems Practice (Wiley, New York, 1981).
3. C.W. Churchman and M. Verhulst, Eds., Management Sciences (Pergamon, New York, 1960).
4. R.L. Flood and M.C. Jackson, Eds., Critical Systems Thinking: Directed Readings (Wiley, Chichester, UK, 1991).
5. M.C. Jackson, Systems Approaches to Management (Kluwer, New York, 2000).
6. S.Y. Auyang, Foundations of Complex System Theories in Economics, Evolutionary Biology, and Statistical Physics (Cambridge University Press, Cambridge, UK, 1998).
7. F. Belussi and G. Gottardi, Eds., Evolutionary Patterns of Local Industrial Systems: Towards a Cognitive Approach to the Industrial District (Ashgate, Aldershot, UK, 2000).
8. G. Dei Ottati, European Planning Studies, 463 (1994).
9. E. Bonabeau and G. Theraulaz, Artificial Life, 303 (1994).
10. P. Cariani, in Artificial Life II, Ed. C. Langton, D. Farmer and S. Rasmussen (Addison-Wesley, Redwood City, CA, 1992), pp. 775-797.
11. E. Bonabeau, M. Dorigo and G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems (Oxford University Press, Oxford, UK, 1999).
12. M.M. Millonas, in Artificial Life III, Ed. C.G. Langton (Addison-Wesley, Reading, MA, 1994), pp. 417-445.
13. G. Theraulaz, S. Goss, J. Gervet and J.L. Deneubourg, in Proceedings of the 1990 IEEE International Symposium on Intelligent Control, Ed. A. Meystel, J. Herath and S. Gray (IEEE Computer Society Press, Los Alamitos, CA, 1990), pp. 135-143.
14. J.H. Holland, K.J. Holyoak, R.E. Nisbett and P.R. Thagard, Induction (MIT Press, Cambridge, MA, 1986).
15. M. Wertheimer, Philosophische Zeitschrift für Forschung und Aussprache, 39 (1925).
16. P.H. Lindsay and D.A. Norman, Human Information Processing (Academic Press, New York, 1972).
17. D.A. Norman, Cognitive Science, 1 (1980).
18. E. von Glasersfeld, Radical Constructivism: A Way of Knowing and Learning (RoutledgeFalmer, New York, 1995).
19. G. Minati, in Systemics of Emergence: Research and Applications, Ed. G. Minati, E. Pessa and M. Abram (Springer, New York, 2006), pp. 569-584.
20. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006).
21. H.R. Maturana and F.J. Varela, The Tree of Knowledge: The Biological Roots of Human Understanding (Shambhala, Boston, MA, 1992).
22. H. von Foerster, Understanding Understanding: Essays on Cybernetics and Cognition (Springer, New York, 2003).
23. P. Watzlawick, Ed., Invented Reality: How Do We Know What We Believe We Know? (Norton, New York, 1983).
24. S. Guberman and G. Minati, Dialogue about Systems (Polimetrica, Milano, Italy, 2007).
25. J.R. Anderson, Language, Memory, and Thought (Erlbaum, Hillsdale, NJ, 1976).
26. J.R. Anderson, The Architecture of Cognition (Harvard University Press, Cambridge, MA, 1983).
27. J.R. Anderson, Rules of the Mind (Erlbaum, Hillsdale, NJ, 1993).
28. D. Bickerton, Language & Species (University of Chicago Press, Chicago, 1992).
29. D. Bickerton, Language and Human Behavior (The Jessie and John Danz Lectures) (University of Washington Press, Seattle, 1996).
30. N. Chomsky, Knowledge of Language: Its Nature, Origin and Use (Praeger, New York, 1986).
31. T.W. Deacon, The Symbolic Species (W.W. Norton & Company, New York, 1997).
32. T. Givon, Evolution of Communication, 45 (1998).
33. G. Grace, The Linguistic Construction of Reality (Croom Helm, London, 1987).
34. N. Lund, Language and Thought (Routledge, Hove, UK, 2003).
35. R.A. Müller, Behavioral and Brain Sciences, 611 (1996).
36. S. Pinker and P. Bloom, Behavioral and Brain Sciences, 707 (1990).
37. W. Wilkins and J. Wakefield, Behavioral and Brain Sciences, 161 (1995).
38. L.S. Vygotsky, Thought and Language (MIT Press, Cambridge, MA, 1962).
39. E. Sapir, Language, 207 (1929). Reprinted in Selected Writings of Edward Sapir, Ed. D.G. Mandelbaum (University of California Press, Berkeley, 1949), pp. 34-41.
40. B.L. Whorf, Language, Thought and Reality: Selected Writings of B.L. Whorf, Ed. J.B. Carroll (John Wiley & Sons, New York, 1956).
41. E. Hunt and F. Agnoli, Psychological Review, 377 (1991).
42. L.A. Magliocca and G. Minati, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer, New York, 2002), pp. 235-250.
43. L.A. Magliocca and A.N. Christakis, Systems Research and Behavioral Science, 259 (2001).
44. H. Marcuse, One-Dimensional Man (Beacon, Boston, 1964).
45. G. Minati, World Futures, 29 (2004).
46. G. Bertola, R. Disney and C. Grant, Eds., The Economics of Consumer Credit: European Experience and Lessons from the US (MIT Press, Cambridge, MA, 2006).
47. L.C. Thomas, J. Ho and W.T. Scherer, IMA Journal of Management Mathematics, 89 (2001).
48. N. Georgescu-Roegen, in The Political Economy of Food and Energy, Ed. L. Junker (University of Michigan, Ann Arbor, MI, 1977), pp. 105-134.
49. N. Georgescu-Roegen, in Prospects for Growth: Changing Expectations for the Future, Ed. K.D. Wilson (Praeger, New York, 1977), pp. 293-313.
50. N. Georgescu-Roegen, in Energy: International Cooperation on Crisis, Ed. A. Ayoub (Presses de l'Université Laval, Québec, 1979), pp. 95-105.
51. G. Minati, in Proceedings of the First Italian Conference on Systemics, Ed. G. Minati (Apogeo, Milano, Italy, 1998), pp. 93-106.
52. J.B. Schor, Born to Buy: The Commercialized Child and the New Consumer Culture (Scribner, New York, 2004).
53. I. Bogost, Persuasive Games (MIT Press, Cambridge, MA, 2007).
CONTEXTUAL ANALYSIS. A MULTIPERSPECTIVE INQUIRY INTO EMERGENCE OF COMPLEX SOCIO-CULTURAL SYSTEMS
PETER M. BEDNAR (1,2)
(1) Department of Informatics, Lund University, Sweden.
(2) School of Computing, University of Portsmouth, PO1 3HE, Portsmouth, Hampshire, UK.
E-mail: [email protected]

This paper explores the concept of organizations as complex human activity systems, through the perspectives of alternative systemic models. The impact of alternative models on the perception of individual and organizational emergence is highlighted. Using information systems development as an example of management activity, individual and collective sense-making and learning processes are discussed, and their roles in relation to information systems concepts are examined. The main focus of the paper is on individual emergence in the context of organizational systems. A case is made for the importance of attending to individual uniqueness and contextual dependency when carrying out organizational analyses, e.g. information systems analysis. One particular method for contextual inquiry, the framework for Strategic Systemic Thinking, is then introduced. The framework supports stakeholders in owning and controlling their own analyses. This approach provides a vehicle through which multiple levels of contextual dependencies can be explored, and allows for individual emergence to develop.

Keywords: strategic systemic thinking, contextual analysis, individual emergence, contextual dependency.
1. Introduction

Minati [1] suggests that a study of processes of emergence implies a need to model and distinguish the establishment of structures, systems and systemic properties. He goes on to point out that, in a constructivist view, an observer identifies such properties by application of models. Different perceptions of structures and systems correspond to different, irreducible models. Perceived emergence of systemic properties, e.g. functionality in computer systems or collective learning abilities in social systems, then ensues from application of such models. The author of this paper wishes to compare and contrast two alternative models that may be applied in forming constructivist views of organizational systems. The paper shows how one particular model highlights the importance of individual, as well as organizational, emergence. Its contribution is to argue for a
300
P.M. Bednar
move away from reductionist cybernetic models towards critical systemic thinking – from attempts to reduce uncertainties inherent in management of organizations towards approaches which embrace ‘complexification’. Using information systems development as an example, the implications for individual and collective learning in organizations are explored and a case for contextual methods of inquiry to support organizational learning is made. A particular framework for contextual inquiry is then described in outline. An organisation may be viewed as a complex social system, affected by goals and values of the individuals within it [2]. We are reminded by Senge [3] that “Today, systems thinking is needed more than ever because we are becoming overwhelmed by complexity. Perhaps for the first time in history, humankind has the capacity to create far more information than anyone can absorb, to foster far greater interdependency than anyone can manage, and to accelerate change far faster than anyone’s ability to keep pace ... organizations break down, despite individual brilliance and innovative products, because they are unable to pull their diverse functions and talents into a productive whole.” [3, p. 69]. The nature of these social systems, their sub-systemic structures and the relations which sustain them over time vary widely from one organization to another. An organization can also be viewed as a purposeful human activity system [4]. However, objective agreement on the nature of such systems is elusive, since the defining properties of ‘the system’ will depend upon the viewpoint of the individual who considers it. For example, when a person enters a bank as a customer, he is likely to view this organization as a system for providing him with financial services. However, to a person who enters that bank as an employee, it may appear to be a system for providing her with a livelihood. 
Checkland refers to these differing perspectives as "Weltanschauungen" or "worldviews" [4]. Schein [2] suggested that organizational culture is formed over time through shared goals. Such sharing could only be achieved through a negotiation of the differing perspectives held by individuals [4]. For this reason, agreement on a single description of a "real" human activity system will remain elusive and consensus on its goals difficult to achieve. Within any 'organization', an interacting collection of living individuals can be found, each with a unique life history and worldview. Every individual produces her/his own unique understanding of context, constructed through interaction with organizational systems and environment by means of a variety of sense-making strategies [5,6,7]. Those taking on responsibility for
management as an activity need to be aware of the challenges posed by these differing perspectives. One possible definition of 'management' is 'a set of practices and discourses embedded within broader asymmetrical power relations, which systematically privilege the interests and viewpoints of some groups, whilst silencing and marginalizing others' [8]. Langefors [9] discusses the role of organizational information systems. He considered that, in order to manage an organization, it would be necessary to know something about the current state and behaviour of its different parts, and also the environment within which it was interacting. These parts would need to be coordinated and inter-related, i.e. to form a system. Thus, means to obtain information from the different parts of a business would be essential, and these means (information units) would also need to be inter-related. Since the effectiveness of the organization would depend upon the effectiveness of the information units, an organization could be seen as crucially 'tied together' by information. For Langefors, therefore, the organization and its information system could be viewed as one and the same [9]. The next section of the paper sets out some of the theoretical background within which contemporary systemic models have been framed. This is followed by a discussion of learning and knowing in an organizational context. Contrasting models of organizational systems are then set out, showing how different perspectives on emergence result from their application. A role for contextual inquiry in enabling individual, as well as organizational, emergence to be explored is then set out. One possible method of contextual inquiry is explained. The final section of the paper summarizes the arguments.

2. Background

Many attempts have been made in the past to understand and manipulate social phenomena by application of laws derived from the natural world.
Ackoff [10] quotes examples set out by sociologist Sorokin [11] where researchers had attempted to establish laws of 'social physics'. He also notes that philosopher Herbert Spencer referred to the general characteristics of 'life' (accepted in relation to biological phenomena) as no less applicable to society, i.e. characteristics of growth, increasing differentiation of structure and increasing definition of function [10]. A great deal of research is available on systems perspectives in social science (see, for example, West Churchman [12], Simon [13]). However, as Emery [14] points out, these contributions have been fragmented and diverse, often using similar terms to denote quite different concepts. Attempts have been made to liken the operation of social 'systems' to
mechanistic models derived from engineering (see, for example, applications of the Shannon-Weaver [15] model from telecommunications to human interaction and communication) or to organic models from biology (e.g. applications of Maturana and Varela's theory of autopoiesis [16]). Ulrich [17] provides a discussion of the way that root metaphors in systems thinking influence the way in which a person conceives of 'a system'. Without these metaphors, the concept of a system might have remained 'empty'. The scope for systemic research to inform management thinking has therefore been diverse and confused. Perhaps one of the most influential works has been the General Systems Theory of von Bertalanffy [18]. He did not favour direct application of mechanistic models to human problems, suggesting instead: "... systems science, centered in computer technology, cybernetics, automation and systems engineering, appears to make the systems idea into another – and indeed the ultimate – technique to shape man and society ever more into the 'mega machine' ..." [18, p. viii]. In his chapter on 'The Meaning of General System Theory' he points out that models which are essentially quantitative in nature have limited application to phenomena where qualitative interpretations 'may lead to interesting consequences' [18, p. 47]. Nevertheless, cybernetic models derived from GST have had great appeal in management literature. In particular, a concept of sub-optimality has been the focus of attention. Boulding [19], for instance, attempts to establish laws of organization. His law of instability suggests that organizations fail to reach a stable equilibrium in relation to their goals due to cyclic fluctuations resulting from the interaction of sub-systems. Ways to remove sub-optimality, a result of conflict between systemic and sub-systemic goals, have therefore been identified as a key function of management as it attempts Fayol's [20] classic tasks of planning, directing and controlling.
The reflection is that learning must surely be a prerequisite to purposeful activities of the kind Fayol describes [20]. Bateson [21] reminds us that a critical element of learning is reflexivity – awareness of one's own responses to context. Such reflexivity should inform any systemic view of human activities. From an interpretive perspective, an individual's sense-making is co-dependent with the organizational culture within which it takes place, and requires continual construction/re-construction through reflection over time [2]. A perception of organizational life focused on goal-seeking is therefore problematic. Vickers [22] argues that life consists in experiencing relations rather than seeking 'ends'. He challenges a cybernetic paradigm which a goal-seeking model implies, suggesting instead a cyclical process in which experience generates individual norms and values. These in turn create a readiness in people to notice aspects of their situation, measure them against norms and discriminate between them. Our 'appreciative settings' condition our perceptions of new experiences, but are also modified by them. Development of an individual's appreciative system is thus ongoing over time as a backdrop to social life. If individual sense-making is co-dependent with organizational culture, there must be some interaction between them, built on communication. Information can be defined as data which is rendered meaningful in a particular context. The meaning attributed to an item may well vary when understood from the point of view of different individuals. Each individual produces her/his own understanding of contexts within which information is formed, constructed through interaction with organizational systems and their environment by means of a variety of sense-making strategies [5]. During the 1960s, Börje Langefors [23] developed the 'Infological Equation'. This work identifies the significance of interpretations made by unique individuals within specific organizational contexts [9]. The Infological Equation [9,23], "I = i(D, S, t)", shows how meaningful information (I) may be constructed from the data (D) in the light of participants' pre-knowledge (S) by an interpretive process (i) during the time interval (t). The necessary pre-knowledge (S) is generated through the entire previous life experience of the individual. Individuals perform different systemic roles within organizations, and have unique perspectives derived from the sum of previous life experiences. Meanings are constructed by different individuals reflecting their unique world views.
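The role of pre-knowledge in the Infological Equation can be illustrated with a small sketch (purely illustrative; the function name, the dictionary representation of pre-knowledge, and the bank example are assumptions for exposition, not Langefors's own formalism):

```python
# Toy illustration of I = i(D, S, t): the same data D yields different
# information I depending on the interpreter's pre-knowledge S.

def interpret(data, pre_knowledge):
    """i(D, S): read each datum through the interpreter's pre-knowledge.
    Data with no matching pre-knowledge remain uninterpreted ('noise')."""
    return {k: pre_knowledge[k](v) if k in pre_knowledge else "noise"
            for k, v in data.items()}

# One item of data, D.
data = {"balance": -120}

# Two individuals bring different pre-knowledge S to the same data.
customer_S = {"balance": lambda v: "overdrawn" if v < 0 else "in credit"}
clerk_S = {"balance": lambda v: f"flag account, exposure {abs(v)}"}

print(interpret(data, customer_S))  # {'balance': 'overdrawn'}
print(interpret(data, clerk_S))     # {'balance': 'flag account, exposure 120'}
print(interpret(data, {}))          # {'balance': 'noise'}
```

The point of the sketch is only that information is not contained in the data: the interpretive process, parameterized by each individual's pre-knowledge, constructs it.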
While it is possible to construct a 'conduit' through which data may flow around an organization, information is constructed by individuals in their interactions within the organizational context. Logically, therefore, it is possible to develop a data system to support management tasks, but this could only become an information system through direct and interpretive participation from those individuals using it. The logic demonstrated by the Infological Equation suggests that individual learning and organizational development are inextricably bound together. Information systems must therefore provide support for contextually relevant individual learning, and organizational analysis drawing on this learning, as a systemic process over time [24].

3. Learning and Knowing

Those theories that an individual creates through sense-making will be influenced by multiple contextual dependencies arising from her/his experience
and environment [24]. Such dependencies have been derived through the particular experiences of individuals involved, in the context of their own working situations. The distinctiveness of each work situation lies in construction of meanings that individuals attach to it. In relation to systems design in particular, therefore, there is no reason to assume consensus among the different actors as to the desirable properties of a proposed system. Indeed, as the Infological Equation demonstrates [9,23], it is not possible for any individual to know in advance precisely what requirements she/he might have. Instead, actors need support to engage in a collaborative endeavour of requirement shaping. Here individuals partake in a learning spiral through reflection on sense-making in a work context in order to create understanding of those emergent ‘systems’ in their minds. Individual learning may be described as taking place through sense-making processes as a response to messy and uncertain contexts in which resolutions are sought. Different orders of learning may be identified, based on a cycle of experience and reflection on experience [6,25]. Higher orders of learning involve reflection on sense-making processes themselves, i.e. a learning cycle transforms into a spiral. Reflection on sense-making becomes an exercise in practical philosophy. Certain points follow from this. If individual learning is a creative process based in sense-making, then context is clearly important. Any unique individual’s view is based in reflection on experience [6], and experience is context specific. Therefore, an examination of contextual dependencies, as part of analysis, will be important. Knowing, as a creative process, is inextricably linked to learning. Bateson [6] suggests that information may be defined as ‘a difference that makes a difference’, existing only in relation to a mental process. This process is what leads to an individual ‘knowing’. 
Bateson [6] describes a hierarchy of different orders of learning. At level zero, learning represents no change, since the same criteria will be used and reused without reflection. This is the case in rote learning of dates, code words, etc., which is contextually independent and in which repeated instances of the same stimuli produce the same resulting 'product'. All other learning, according to Bateson's hierarchy [6], involves some element of trial and error and reflection. Orders of learning can be classified according to the types of errors and the processes by which correction is achieved. Level I involves some revision using a set of alternatives within a repeatable context, level II represents revision based on revision of context, and so on.
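A minimal sketch can make the distinction between these orders concrete (an illustrative toy only; the class names, the reward signal and the fixed response sets are assumptions layered over Bateson's informal hierarchy, not his own formalism):

```python
import random

class LevelZero:
    """Learning 0: contextually independent; the same stimulus always
    yields the same response, with no trial-and-error correction."""
    def respond(self, stimulus):
        return "fixed-response"

class LevelOne(LevelZero):
    """Learning I: revision by trial and error within a fixed set of
    alternatives, in a context assumed to be repeatable."""
    def __init__(self, alternatives):
        self.alternatives = list(alternatives)
        self.current = self.alternatives[0]
    def respond(self, stimulus):
        return self.current
    def feedback(self, rewarded):
        # On error, try another alternative from the same fixed set.
        if not rewarded:
            self.current = random.choice(self.alternatives)

class LevelTwo(LevelOne):
    """Learning II: revision of the context itself, i.e. the very set
    of alternatives from which Level I choices are made is replaced."""
    def recontextualize(self, new_alternatives):
        self.alternatives = list(new_alternatives)
        self.current = self.alternatives[0]

# Level 0 never changes; Level I reselects within its set;
# Level II can change the set it selects from.
rote = LevelZero()
assert rote.respond("a date") == rote.respond("a code word")

agent = LevelTwo(["duck", "dodge"])
agent.feedback(rewarded=False)          # Level I revision
assert agent.respond(None) in ["duck", "dodge"]
agent.recontextualize(["negotiate"])    # Level II revision
assert agent.respond(None) == "negotiate"
```

The design choice of subclassing mirrors the hierarchy itself: each order contains the corrective repertoire of the one below and adds a further degree of freedom for revision.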
Bateson's hierarchy [6] finds an echo in the work of Argyris and Schön [25] (single- and double-loop learning). Double-loop learning comes about through reflection on learning processes in which individuals may attempt to challenge prejudices and assumptions arising from their experiences [25,26]. When individuals need to solve an immediate problem, i.e. close a perceived gap between expected and actual experience, they may harness their sense-making processes within contexts of existing goals, values, plans and rules (Vickers's appreciative settings [5]), without questioning their appropriateness. However, if individuals challenge received wisdom and critically appraise assumptions previously applied, double-loop learning occurs. The resulting process creates a productive learning spiral, which is at the heart of any successful organizational innovation. As mentioned previously, the Infological Equation [9,23] suggests that individuals develop unique understandings (meaningful information) by examining data in the light of (their own) pre-knowledge gained from reflecting on experience during a previous time interval. Information, and 'knowledge' derived from it, cannot therefore be seen as commodities, to be transmitted from one individual to another (or stored) as containers of objective meaning. Furthermore, it is through these processes of constructing new understandings/meanings, by examining data in light of experience, that organizations, their goals and cultures are constituted. If individual learning is a creative process, organizational learning is so also.

4. Complexification and Emergence

Attempts by students of management to reduce organizational problems to consideration of 'sub-optimality', drawing on mechanistic models from systems science, can be seen as reductionism.
Exploration of multiple levels of contextual dependency may help analysts to avoid entrapment in various types of reductionism, including undue reliance on sociological, psychological or technological concepts. It may also help to eliminate tendencies towards generalization, or substitution of an external analyst's own views for those of the participating stakeholders. A need to promote deep understandings of problem spaces requires us to go beyond grounding of research in phenomenological paradigms. In order to avoid various types of reductionism and achieve deepened understanding, analysts must attempt to incorporate philosophy as an integral part of their research practice [5,17,21,27,28]. As pointed out by Werner Ulrich [29] in his discussion of boundary critique, the perception of a system varies with the stance of the observer, i.e. this
differentiates between an observer's and an actor's picture of reality, which means that anyone wishing to inquire into IS use must continually align themselves with actor perspectives. For example, meaning shaping in particular situations can be described through comparisons of different actors' perspectives within given structural criteria, or 'circling of realities'. This refers to the necessity of acquiring a number of different perspectives (in time-space) in order to be able to get a better and more stable appreciation of an actor reality [30]. The whole person includes dimensions of both 'heart' and 'mind' [31]. Personal perspectives which transcend received, organizational 'common sense thinking' may be encouraged to emerge through methods which emphasise individual uniqueness and contextual dependency. Those engaged in management tasks such as IS design should not forget that they set up personal boundaries for a situation by defining it from their own experiences and preferences. As human beings we all have pre-understandings of phenomena, which are influenced by our own values, 'wishful thinking', and how each of us has been socialized into a particular society. These pre-understandings are reviewed gradually, with the support of our experience. In a continual exchange/interchange between an individual's pre-understanding and experience, a process of inquiry may progress. It follows from the preceding discussion that, from the point of view of each individual's perception, an organization is an emergent property of inter-individual sense-making processes and activities. The organization is continually constructed/reconstructed for each individual as a result of emergence from individual sense-making perspectives. A critically informed approach to research involves recognition/understanding of this emergence. Without recognition of the uniqueness of each particular individual's experience of organizational life, this critical approach may be undermined.
Within a traditional scientific paradigm, the focus of a researcher’s attention rests on increasing the precision and clarity with which a problem situation may be expressed. This can lead to an artificial separation of theory from praxis, of observation from observer and observed. ‘Knowing’ about organizational context (formed by on-going construction of meanings through synthesis of new data with past experience) may be deeply embedded and inaccessible to individuals concerned. The perspective promoted in this paper emphasises self-awareness of human individuals. In research undertaken from this perspective, a focus towards emancipation and transparency, rather than clarity and precision, is adopted. A researcher taking such a perspective will recognize that there are uncertainties and ambiguities
inherent in socially constructed everyday world views (a similar discussion can be found in Radnitzky [32]). In some approaches, a human activity system is regarded as a mental construct derived from an interrelated set of elements, in which the whole has properties greater than the combination of component elements. When such a model is adopted, individual uniqueness is subsumed in perceived emergent properties of a conceptualised system. Even when considered as a duality, seen as a system to be served and a serving system [4,33], individuals remain invisible. In order to take into account unique individual sense-making processes within an organizational problem arena, there is a need for analysts to explore multiple levels of contextual dependencies. Every observation is made from the point of view of a particular observer [32]. Since it is not possible to explore problem spaces from someone else's point of view, it follows that external analysts can only play supportive roles in enabling individuals within given contexts to explore their own sense-making. In an alternative model, an organizational system may be seen as an emergent property of unique, individual sense-making processes and interactions within a particular problem arena [34,35]. When considered in this way, it is possible to perceive some individuals themselves to have emergent properties of their own which can be larger than (e.g. outside of) those of one particular organizational system seen as a whole. Consider, for instance, a football club seeking to recruit skilful players for its team. The manager may perceive a need for a creative, attacking midfielder to play a role as one component part of the team's efforts to win. The Los Angeles Galaxy Club recently experienced such a need but chose to recruit former England captain David Beckham. Beckham can play the role of an attacking midfielder for the team.
However, he brings with him qualities which transcend this role, in terms of his personal celebrity, publicity potential and marketing value for sales of Club products such as replica shirts. Beckham has emergent properties beyond those of any other midfield footballer in relation to the human activity system which is that Club. This model is not, of course, the same as a non-systemic, fragmented view which focuses on individuals but fails to perceive an emergent system arising through their interactions, and hence ignores the impact of norms, values, expectations, communicational acts, etc. on individual sense-making processes [36].

5. Contextual Inquiry

The importance of context for systemic analysis has been widely recognized [4,17,24,28]. Contextual inquiry, as described here, is viewed as a special case
P.M. Bednar
of contextual analysis. This paper describes an application of a framework for contextual inquiry, the Strategic Systemic Thinking (SST) framework [24]. This forms an exploration into the nature of open systems thinking and how systemic identities are maintained and generated within a specific human activity context. SST maintains a particular focus on ways in which human analysts can deal with complexification and uncertainty, although this poses apparently insuperable epistemological problems. Particular emphasis is placed on a multiplicity of individual sense-making processes and the ways these are played out within organizations. SST can support groups of organizational actors in taking contextual dependencies into consideration, and is intended as a means to enable them to cope with escalations in complexity. A cardinal principle of the framework is that actors should own and control their own inquiry, supported but not dominated by a facilitating professional analyst. When an attempt is made to evaluate effectiveness in managing or 'designing' organizational systems, concepts of analysis become important. Good practice requires an understanding that addresses intrinsic and contextually-dependent characteristics of organizational activities. Such an understanding can only come about through relevant evaluative and analytical strategies. Evaluation is a result of both inquiring and reflecting thought processes, i.e. mental activity intrinsically dependent upon a demonstrated, contextually-dependent desire to explore a certain problem space. Analysis is an inquiry into the assumed-to-be unknown and/or a questioning of the assumed-to-be known. Evaluation is a consolidating process, where judgments are made, and assumed 'truths' and 'knowledge' are incorporated into some kind of hierarchy. Together, analysis (i.e. creation of 'new' knowledge) and evaluation (i.e. categorization of 'existing' knowledge) represent the closing of a learning circle.
Any conscious reflection on the requirements for a higher quality learning circle could become a daunting exercise, as it involves raising the quality of 'knowing'. This is why a framework such as SST has an important role to play. SST involves three aspects: intra-analysis, inter-analysis and value-analysis. These should not be regarded as sequential, as it is possible to begin at any point in the framework. SST is intended to be iterative, and it is therefore possible to move from one analysis to another repeatedly and in any direction, at any time. A range of methods is available to the actors, and their facilitating external analyst, in seeking to articulate their worldviews. These methods include rich pictures, brainstorming, mind-maps, diversity networks, drama transfers and role-playing, all of which support the creation, visualization and communication
Contextual Analysis. A Multiperspective Inquiry into Emergence …
of mental models and narratives. Each of the three aspects of the framework helps to guide inquiries with a number of themes. The purpose of intra-analysis is to enable the creation of an individual process for structuring a problem. This analysis aims to create and capture a range of narratives from participating stakeholders by providing an enrichment and visualization process for them. Inter-analysis is the aspect of the inquiry which represents collective reflection on decision-making alternatives. The aim is to have a dialogue and to reflect upon the ranges of narratives derived through intra-analysis. The purpose is not to achieve consensus or to establish common ground, but to produce a richer base upon which further inquiry and decision-making can proceed. Grouping of narratives takes place through consideration and discussion of individually produced narratives. The results of these inquiries might be considered to form a knowledge base relating to the problem spaces under investigation. A critical and reflective approach in considering these results is needed to ensure a basis for 'good' decision-making and to avoid unintended, negative consequences for the actors and organizations concerned. Evaluation could be said to be an examination of the 'known', i.e. what has been learned from the analyses in a socio-cultural context. Here actors may examine the values influencing and constraining the analyses, and consider prioritization from political and cultural perspectives. SST can be explained as enabling groups of professional members of organisations to act as analysts of their own problem spaces under the guidance of expert analysts acting as external facilitators. This includes examination of their activities and their specific use of methodologies, rhetoric and strategies to construct local arguments and findings. By the end of an initial analysis, analysts (e.g.
organisational actors) might, for example, be familiar with some of the strategies available within their organization for further inquiries into contextual dependencies. SST is complementary, rather than an alternative, to traditional approaches to analysis. However, there may be conflicts relating to unproblematized ontological assumptions and logical empiricism (i.e. unquestioned beliefs in 'objectivities and truths'). Other assumptions may also arise which are incompatible with the underlying philosophy of SST, e.g. traditional communication theories focusing on a 'sender-receiver' perspective. To give a simplified example, in a traditional approach an inquiry might ask what a company wants to achieve with its information and communication system. A contextual inquiry, on the other hand, would ask what the people who will use the system want to achieve, and what roles and specific purposes their activities might have in organizational contexts. What makes their
unique situation recognizable for them? What specific role do they give to information (and to the organizational business)? This inquiry is to be seen as an investigation by users themselves into their own assumptions and needs within the space of an open information system (an 'organization', human activity system or socio-cultural system). This is a bottom-up perspective on organisation, information and (technical) communication systems. Systems are envisaged which are shaped with the intention of serving specific organizational actors and their needs, from their own points of view.

6. Conclusions

Contextual inquiry is intended to support analysts in recognizing individual emergence, multiperspectivity and open systems thinking in combination. Two different categories of emergence are highlighted. In the first, each individual's identity is an emergent property of a number of emergent systems of which the individual is a member. In the second category, each organization is an emergent property of the multiple perspectives of all the interacting individuals for whom its existence is relevant. There are multiple views of what comprises the organization, formed from the multiple perspectives of many individuals. From a systems analyst's point of view, many possible descriptions will emerge in any organizational inquiry, through the differing experiences of context among many individuals. The boundaries of an organizational system will be dependent upon multiple perspectives and descriptions from individuals. This requires consideration to be given to the sense-making, emotion and learning processes that those individuals engage in. It is helpful to highlight the different levels of abstraction involved in discussions about systems as emergent properties of socio-cultural phenomena. The Strategic Systemic Thinking framework is discussed as a contemporary version of contextual analysis.
Its aim is to support the application and use of specifically adapted methods by groups of individual stakeholders in their efforts to construct understanding and meaning. Its focus is on the ways in which information needs and information use are created by individuals. The concept of contextual dependency is of interest because it supports a focus of inquiry by unique individuals on their own beliefs, thoughts and actions in specific situations and contexts. Through this kind of inquiry, support is provided for a contextually-dependent creation of necessary knowledge. This has the potential to provide a foundation for more successful communication, systemic analysis and, eventually, information systems development. The
purpose is to create a form of organizational transformation that allows individual emergence to surface.

References

1. G. Minati, Call for Papers, AIRS Congress 2007 (http://www.airs.it/AIRS/indexEN.htm, accessed 18 September 2007).
2. E. Schein, Organizational Culture and Leadership (2nd edition), (Jossey-Bass., San Francisco, 1992).
3. P.M. Senge, The Fifth Discipline: The Art & Practice of the Learning Organization (Doubleday, New York, 1990).
4. P. Checkland, Systems Thinking, Systems Practice (John Wiley & Sons, Chichester, 1981).
5. K. Weick, Sensemaking in Organizations (Sage Publications, Thousand Oaks, Cal., 1995).
6. G. Bateson, Steps to an Ecology of Mind (University of Chicago Press, Chicago, 1972).
7. P.L. Berger and T. Luckmann, The Social Construction of Reality: A Treatise in the Sociology of Knowledge (Anchor Books, New York, 1966).
8. D.L. Levy, M. Alvesson and H. Willmott, in Studying Management Critically, Eds. M. Alvesson and H. Willmott (Sage, London, 2003).
9. B. Langefors, Essays on Infology - Summing up and Planning for the Future (Studentlitteratur, Lund, 1995).
10. R.L. Ackoff, Ackoff's Best: His Classic Writings on Management (Wiley, New York, 1999).
11. P. Sorokin, Contemporary Sociological Theories (Harper, New York, 1928).
12. C.W. Churchman, The Systems Approach (Delacourt Press, New York, 1968).
13. H.A. Simon, The Sciences of the Artificial (MIT Press, Cambridge, Mass., 1969).
14. F.E. Emery, Systems Thinking (Penguin Books Ltd., Harmondsworth, 1969).
15. C.E. Shannon and W. Weaver, The Mathematical Theory of Communication (University of Illinois Press, Champaign, IL, 1999).
16. H.R. Maturana and F.J. Varela, Autopoiesis and Cognition: The Realization of the Living (Reidel, Boston, 1980).
17. W. Ulrich, Critical Heuristics of Social Planning (Wiley, Chichester, 1983).
18. L. von Bertalanffy, General Systems Theory (George Braziller, New York, 1969).
19. K.E. Boulding, The Organizational Revolution (Harper & Row, New York, 1953).
20. H. Fayol, General and Industrial Management (Pitman Publishing Company, London, 1949).
21. G. Bateson, Mind and Nature: A Necessary Unity (Hampton Press, Cresskill, NJ, 2003).
22. G. Vickers, Freedom in a Rocking Boat (Allen Lane-Penguin Books Ltd, London, 1970).
23. B. Langefors, Theoretical Analysis of Information Systems (Studentlitteratur, Lund, 1966).
24. P.M. Bednar, Informing Science 3(3), 145-156 (2000).
25. C. Argyris and D.A. Schon, Organizational Learning (Addison Wesley, Reading Mass., 1978).
26. C. Argyris and D.A. Schon, Organizational Learning II: Theory, Method and Practice (Addison Wesley, Reading, Mass., 1996).
27. H.K. Klein, Fourth Leverhulme Lecture January 12 2007 (Salford Business School, UK, 2007).
28. H.E. Nissen, Informing Science 10, 21-62 (2007).
29. W. Ulrich, The J. of Information Technology Theory and Application (JITTA) 3(3), 55-106 (2001).
30. P.M. Bednar and C. Welch, Informing Science 10, 273-293 (2007).
31. C.U. Ciborra, Getting to the Heart of the Situation: The Phenomenological Roots of Situatedness (Interaction Design Institute, Ivrea, Symposium 2005. Accessed June 2007 at: http://projectsfinal.interaction-ivrea.it/web/2004_2005.html, 2004).
32. G. Radnitzky, Contemporary Schools of Metascience (Akademiforlaget, Gothenburg, 1970).
33. P. Checkland and S. Holwell, Information, Systems and Information Systems: Making Sense of the Field (John Wiley & Sons, Chichester, 1998).
34. G. De Zeeuw, Systemica: J. of the Dutch Systems Group 14(1-6), ix-xi (2007).
35. P.M. Bednar, Systemica: J. of the Dutch Systems Group 14(1-6), 23-38 (2007).
36. N. Hay, The Systemist 29(1), 7-20 (2007).
JOB SATISFACTION AND ORGANIZATIONAL COMMITMENT: AFFECTIVE COMMITMENT PREDICTORS IN A GROUP OF PROFESSIONALS
MARIA SANTA FERRETTI
Dipartimento di Psicologia, Università di Pavia
Piazza Botta 6, 27100 Pavia, Italy
E-mail: mats.larsson@physto.se

Job satisfaction and organizational commitment have long been identified as relevant factors for the well-being of individuals within an organization and for the success of the organization itself. As well-being can, in principle, be considered as emergent from the influence of a number of factors, a main goal of a theory of organizations is to identify these factors and the role they can play. In this regard, job satisfaction and organizational commitment have often been identified with structural factors allowing an organization to be considered as a system, or a holistic entity, rather than a simple aggregate of individuals. Furthermore, recent studies have shown that job satisfaction has a significant, direct effect in determining individuals' attachment to an organization and a significant but indirect effect on their intention to leave a company. However, a complete assessment of the role of these factors in establishing and maintaining the emergence of an organization is still lacking, due to a shortage of measuring instruments and to practical difficulties in interviewing organization members. The present study aims to make a further contribution to what is currently known about the relationship between job satisfaction and affective commitment by using a group of professionals, all at management level. A questionnaire to measure these constructs, following a pilot study, was designed and administered to 1042 participants, all of whom were professionals holding the title of industrial manager or director. The factors relating to job satisfaction and the predictive value of these factors (in predicting an employee's emotional involvement with the organization) were simultaneously tested by a confirmatory factorial model. The results were generalized with a multisample procedure using structural equation models.
This procedure was used to check whether these factors could be considered as causes producing the measured affective commitment. The results showed that the four dimensions of job satisfaction (professional development, information, remuneration and relationship with superiors) are not equally predictive of affective commitment. More specifically, the opportunity for professional development or growth provided by a company was shown to be the best predictor of affective commitment. This seems to suggest that, as expected, the emergence of organizations could be a true emergence, not reducible to a sum of single causes. Implications, future lines of research and limitations are discussed.

Keywords: job satisfaction, affective commitment, professionals.
1. Introduction

In past years, theories of well-being in an organizational context have shown the latter to be an emergent construct influenced by a number of factors, the main ones being job satisfaction and organizational commitment. A complete theory of this kind of emergence should clarify the roles these factors play by resorting to suitable investigations based on the use of questionnaires. The results obtained from the latter should then constitute the main point of departure for building a more complete theory of the emergence of well-being. As regards job satisfaction, we remark that it has a central role in many contemporary models and theories of behavior and attitude to work. It is also an important element for the improvement of the quality of employees' working lives and for organizational efficiency. In the past, satisfaction was studied and observed in an essentially one-dimensional way, but today it is recognized as being a complex and multidimensional structure. Previous observations tended to be fairly general and simplistic, but they have since been developed into detailed conceptual and empirical definitions. The body of research formed by previous studies and theories relating to this topic has allowed the identification of the multiple factors that influence satisfaction. It has also allowed us to understand what the possible consequences of job satisfaction are. The main consequences, according to Locke (1976) [17], are: turnover, absenteeism, physical health, mental health, complaints, attitude to work and self-esteem. On the other hand, the causes of satisfaction, or factors that can influence it, are highlighted in many different models which aim to classify them. Two of the main classifications that should be mentioned are those of Herzberg (1967) [15] and Locke (1976). According to Herzberg (1967), the causes of dissatisfaction are linked to factors inherent in the environmental context (e.g.
company policies or procedures, technical competence of superiors, remuneration, inter-personal relationships, physical conditions of work) and to factors inherent in the content of a job (e.g. the nature of the work itself, responsibility held, professional promotion, recognition gained and results achieved). However, Locke proposes another way of dividing up the factors that influence satisfaction: working conditions (e.g. remuneration, promotion, recognition and benefits), the people one comes into contact with at work, and individual characteristics (age, sex, how long an employee has been with the company, level of training and hierarchical position held). Previous literature presents many different definitions of commitment, but all of them include the attachment of the individual to their organization. The
most commonly studied conception of commitment is based on the work of Mowday et al. (1979; 1982) [26,27], who identify three components: acceptance of the company's objectives, willingness to work for the company and a wish to stay within the organization. More recently, a three-component conception of commitment has been developed (Meyer, Allen & Smith, 1993) [25]. Meyer and Allen (1997) made a distinction between the target of an employee's commitment (e.g. the organization, the job, supervisors and the group in which one works) and the relationship that is created between the employee and the target. This relationship may be of an emotional kind; it is also possible to feel morally obliged not to leave the organization or job, or to recognize the costs associated with leaving it (Meyer & Allen, 1991 [21]; Meyer & Herscovitch, 2001 [24]). Most past research has analyzed organizational commitment, defined as the psychological state that characterizes the relationship between employee and organization (Meyer and Allen, 1991; 1997) [21,22]. As we have seen, Meyer and Allen identified three types of commitment: normative, continuance and affective. In the case of normative commitment, employees stay because they feel a duty to do so. Continuance commitment expresses a utilitarian bond: employees are conscious of the costs involved in changing, so they stay within the organization because it is convenient for them to do so. Affective commitment corresponds to emotional attachment: employees stay within the organization because they want to. Meyer, Allen and Smith (1993) [25] discussed the origin and nature of the three types of commitment. They suggest that various factors are involved. Normative commitment derives from personal values and from the moral sense of obligation that a person feels after having obtained favors or benefits from the organization.
Continuance commitment, instead, is a product of the benefits obtained from working in a certain organization and of a lack of alternative job opportunities. Finally, affective commitment originates from an employee's working conditions and from their expectations of results: whether the job gives the employee what he expects to receive. The model of commitment developed by Penley and Gould (1988) follows a slightly different approach from that taken by Meyer and Allen. Based on Etzioni's (1961) multi-faceted conceptualization of involvement, Penley and Gould argue that individuals' commitment to an organization exists in an emotional and an instrumental form. In fact, it is possible to display an emotional commitment, a calculated commitment or a commitment which is due to reasons alien to the organization. In addition, moral commitment is described as a form of emotion with high levels of positivity, characterized by an acceptance of and
identification with the organization's objectives. Penley and Gould (1988), therefore, seem to conceptualize moral commitment in a similar way to how affective commitment is defined by Meyer and Allen. In the literature relating to organizational commitment, many studies have been carried out with the aim of identifying the factors that precede it (Meyer & Allen, 1990; Meyer, Allen & Smith, 1993 [25]; De Cotis & Summers, 1987 [11]; Wasti, 2003 [32]). A variety of factors that precede organizational commitment have been highlighted, but the form of commitment which has received particular attention is affective commitment. Mowday, Porter & Steers (1982) [27] classified these preceding factors into four categories: individual characteristics (demographic, personality and attitudinal), structural characteristics of the organization, job characteristics and work experience. Many studies have supported the idea that the different forms of commitment have a considerable impact on work performance, absenteeism, turnover and work-related stress (Meyer & Allen, 1991 [21]; Mowday, Porter & Steers, 1982 [27]; Hackett, Bycio & Hausdorf, 1994 [14]; De Cotis & Summers, 1987 [11]). The three forms of commitment were all shown to be negatively correlated with turnover. As for the behavioral consequences of organizational commitment, it was found that affective commitment (more so than normative or continuance commitment) is correlated with a large number of resultant variables (e.g. turnover, work performance, citizenship behavior) and presents stronger correlations with every one of these variables (Meyer & Herscovitch, 2001) [24]. Organizational commitment has been studied in relation to many personal and organizational variables, one of which is job satisfaction. There are many parallels between organizational commitment and job satisfaction. This is no surprise, as many different studies have shown that there is a strong correlation between commitment and job satisfaction.
For example, Mathieu and Zajac (1990) obtained a correlation equal to .49 between the two constructs. Although these two factors are closely correlated they constitute two constructs that are empirically separate (Brooke, Russell, & Price, 1988 [6]; Glisson & Durick, 1988 [13]; Shore, Newton & Thornton, 1990 [29]). Recent studies have shown that job satisfaction is of key importance as it has a direct effect on employees’ commitment to their organization and an indirect effect on employees’ intention to leave a company (Powell & Meyer, 2004) [28].
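The reported association (e.g. the .49 obtained by Mathieu and Zajac) is a Pearson product-moment correlation between scale scores. As a reminder of what that statistic computes, here is a minimal stdlib-only sketch; the function name and the example data are illustrative, not taken from the studies cited.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long
    sequences of scores (e.g. commitment vs. satisfaction scale scores)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Perfectly linearly related scores give r = 1.0
print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 2))  # 1.0
```

A correlation of .49 thus indicates a substantial but far from perfect linear relationship, consistent with the claim that the two constructs are empirically separate.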
2. Field Measures & Targets

The theories and models proposed in this area of research so far have been based upon studies that used participant samples taken from various groups of professionals. However, only a limited number of studies have specifically aimed to study these factors in professionals at a managerial level. Few research projects have attempted to determine the factors preceding organizational commitment in different occupational groups. Ritzer & Trice (1969) suggest that the relationship between organizational commitment and its preceding factors might be stronger for non-professionals than for professionals, as professionals do not direct their expectations toward the organization but toward their job. Therefore, the organization, as an object to which they could become committed, is not as important to them as it is for non-professionals. Nystrom (1990) emphasized the importance of vertical communication exchanges between managers and their subordinate employees. His research supports the expectation that managers who experience low quality communication exchanges with their direct superiors tend to feel less involved in the organization, while managers who take part in good quality vertical communication exchanges express a high level of organizational commitment. The results of the meta-analysis conducted by Cohen (1992) show that the variable 'salary' has a stronger relationship with organizational commitment in professionals. Salary is included in the organizational model as a preceding factor inasmuch as the organization controls this variable. The fact that this calculated factor has a strong effect on professionals shows that their expectations are not only of an intrinsic character. This result indicates that although general intrinsic aspects are important for the commitment of workers with a high-level job status, extrinsic factors also play a key role (Angle & Perry, 1981).
A previous study by Ferretti & Argentero (2006) [12] forms the basis of the current paper. Ferretti & Argentero (2006) showed that job satisfaction in professional work is multi-dimensional: they describe four dimensions which are specific to this type of worker. The first is satisfaction connected to the information an employee receives. The second is satisfaction connected to opportunities for development and professional growth, while the third relates to satisfaction linked to remuneration. The final dimension relates to the employee's relationship with his or her superiors. The present study aims to investigate the relationship between these dimensions of satisfaction and commitment (affective commitment) concerning the organization, while also considering the different types of satisfaction and commitment.
Past research has shed light on the causal effect between job satisfaction and affective commitment. Darwish (2002) [10] suggested that job satisfaction has a noteworthy effect on affective commitment (regression coefficient = .44). Moreover, Jernigan, Beggs & Kohut (2002) carried out a study on a group of nurses and showed that affective commitment was explained by work satisfaction. Their line of research related to the variations in the types of commitment (moral, calculated or alien) due to the level of work satisfaction (autonomy, interaction, remuneration, professional status, organizational politics and required characteristics). Therefore, one could expect that in a sample of professionals these different aspects of job satisfaction could help to explain emotional involvement in the organization. After having revised and expanded Locke's model (1997), Meyer et al. (2004) [23] claimed that workers with strong affective commitment are driven by intrinsic motivation. One could therefore expect that, among the different dimensions of job satisfaction observed in the directors, it is personal development that provides the biggest contribution to an employee's affective commitment. To examine the role that specific aspects of job satisfaction play as predictors of affective commitment, the following hypotheses have been formulated and tested:

H1: Satisfaction due to the opportunity for development, interaction with superiors, remuneration and the information an employee receives all have a positive influence upon affective commitment.

H2: Satisfaction due to the opportunity for development is the best predictor of affective commitment.

STUDY

The present study aims to identify which variables relating to job satisfaction encourage emotional involvement with the company.

3. Method

Participants

1042 individuals participated in the current investigation (men = 553; women = 489), all of whom were professionals (professional is used here to mean an employee who is responsible for managing other employees and who has a title such as
Table 1. Participant characteristics (N = 1042), percentages. (Some row labels were lost in extraction; the recoverable values are listed below.)

Gender: M 53.07; F 46.93
Age (four bands, youngest to oldest): 4.64; 29.51; 32.57; 33.28
Job level: 7.69; Manager 92.31
Organizational seniority (four bands, the last being > 20 years): 13.76; 11.21; 20.36; 54.67
Professional macro-area: Technical 17.27; Production 22.74; Sales 13.81; Staff 26.30; Product and Program Management 1.00; Procurement 3.26; ICT Systems 8.25; Other 7.37
"manager" or "director") in a multinational manufacturing company which has its headquarters in Italy. The participants (Tab. 1) were mostly over 40 years old (65.9%) and had the title "manager" (92.3%). 22.7% worked in production and 26.3% were staff. 75% of the participants had worked for the company for more than 10 years. These characteristics reflect the company's business activity (manufacturing) and the employees' high level (managers or directors) within the investigated company.

3.1. Materials

The instrument (displayed in the appendix) is an anonymous questionnaire made up of 20 items, 16 of which measure the four dimensions of job satisfaction while 4 investigate commitment to the organization. The participants were asked to indicate, for each item, how much they agreed with the statement presented (on the basis of their experience within the company) on a 5-point Likert-type scale, where 1 = do not agree at all and 5 = completely agree. The four dimensions of satisfaction respectively measure:
• Satisfaction with opportunities for development and professional growth (4 items). This relates to the employee's competence, developed through suitable training and mobility programs. An example of this kind of item is: "I think that in this company I have good opportunities to take part in training courses and to develop professionally".
• Satisfaction with information received (3 items). This refers both to company strategies and, consequently, to the company's results. An example of this kind of item is: "I think that I am well-informed of the company's results".
• Satisfaction with pay (4 items). This area refers to the recognition gained at work for the employee's personal contributions. These contributions can be seen in terms of personal ability and willingness to propose alternative solutions to problems. An example of this kind of item is: "I can say, with satisfaction, that my personal contributions are suitably recognized within the company I work for".
• Satisfaction with the relationship with superiors (3 items). This refers to the presence of a professional relationship with a superior who supplies concrete support and who encourages the professional growth of the employees under him. An example of this kind of item is: "When I am in a difficult situation or when I don't know what to do I can count on the support of my direct superior".
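The scoring implied by this materials description can be sketched as follows: each dimension is summarized from the 5-point Likert ratings of its items (here, as a mean). The item-to-dimension mapping below is purely illustrative, since the actual assignment is defined in the appendix questionnaire; only the item counts per dimension follow the text.

```python
def scale_scores(responses, scales):
    """Mean 5-point Likert score per dimension.

    responses: dict mapping item number -> rating in 1..5
    scales:    dict mapping dimension name -> list of item numbers
    """
    assert all(1 <= r <= 5 for r in responses.values()), "5-point Likert scale"
    return {name: sum(responses[i] for i in items) / len(items)
            for name, items in scales.items()}

# Illustrative mapping only; the real one is given by the questionnaire itself.
SCALES = {
    "development":  [1, 2, 3, 4],    # 4 items
    "information":  [5, 6, 7],       # 3 items
    "remuneration": [8, 9, 10, 11],  # 4 items
    "superiors":    [12, 13, 14],    # 3 items
}
```

For example, a respondent who answers 3 ("neither agree nor disagree" style midpoint) on every item obtains a score of 3.0 on each dimension.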
The last dimension is affective commitment (4 items). This refers to emotional commitment and the employee's emotional involvement in the organization. An example of this kind of item is: "The more I work in this company the more I feel I am a part of it". The statements used in the questionnaire were tested in a pilot study which consisted of interviews with employees, both individually and in groups. It was found that the dimensions claimed by past studies to be important for job satisfaction also emerged from the interviews. Therefore, as in previous literature such as Huang & Van de Vliert (2003), it is possible to classify the factors into intrinsic and extrinsic. Characteristics that refer to the job itself and those that relate to interpersonal relationships belong to the first category, while the aspects of the job connected to remuneration, status and career belong to the second. It should be noted that, in the sample used, the aspects of the job cited as a source of satisfaction refer to four areas that significantly reflect the managerial level of the participants. For this reason they are only partially comparable with those found in samples of lower level professionals. For
Job Satisfaction and Organizational Commitment: Affective Commitment …
example, elements such as those connected to the information received and those connected to the relationship with superiors are suggested to be of key importance, while aspects relating to interpersonal relationships with colleagues of an equal level are not identified as important.

3.2. Procedure

The administration of the questionnaire was organized in special meetings, where the professionals were informed of the aims of the questionnaire, its contents and the instructions on how to complete it. It was totally anonymous and a guarantee was given that the company's management would not be able to trace the identity of the participants in any way.

3.3. Analysis of the data

The four dimensions of job satisfaction identified in a previous study (Ferretti & Argentero, 2006) [12] were hypothesized to be predictors of commitment using a structural equation model. A cross-validation procedure (Cudeck & Brown, 1983 [9]; Bagozzi & Baumgartner, 1994 [3]) was followed, which consisted of randomly dividing the sample into two sub-samples, developing a model on the first of the two subgroups and then establishing its generalizability. The analyses were carried out in three steps: 1) formulation and elaboration of the model on an initial subgroup of participants; 2) generalization of the results to the second subgroup; 3) testing the hypothesis of structural invariance according to type. The model subjected to testing is shown in Figure 1. As can be observed, it is composed of a measurement model (20 observable variables that measure 5 latent variables) and a structural model (the causal relationships between the dimensions of satisfaction and commitment). More specifically, the model allows both the validity of the factorial structure of the job satisfaction scale and the strength of the relationship joining the dimensions of job satisfaction to affective commitment to be tested simultaneously.
In this model each item loads only on its own original factor. The structural equation model was analyzed with the statistical software AMOS 5 (Arbuckle, 2003) [2]. The goodness of fit was tested using the χ² test. The goodness of fit is considered sufficient when the χ² is not significant; however, given its dependence on sample size, other indices independent of this characteristic were also considered: in particular the CFI – comparative fit index (Bentler, 1990) [5], the TLI – Tucker-Lewis index (Tucker & Lewis, 1973) [31] and the RMSEA – root mean square error
M.S. Ferretti
[Figure 1: path diagram — five latent variables (Pay, Information, Supervision, Development, Affective Commitment) measured by the observed variables v1–v20, with structural paths from the four satisfaction dimensions to Affective Commitment.]
Figure 1. The hypothesized model.
approximation (Steiger, 1990) [30]. The first two indices range from 0 to 1; values above .90 were considered satisfactory, as suggested by Bentler (1990) [5]. For the RMSEA, the instructions given by Browne (1990) [7] were followed: values below .08 are considered satisfying and values below .05 good (Browne & Cudeck, 1993 [8]; Marsh, Balla & Hau, 1996 [19]). The generalizability of the model was then checked on the second subsample (Bagozzi & Foxall, 1995) [4]. This multi-sample procedure allows the simultaneous analysis of data taken from different samples, forcing all or some of the parameters to be identical in the different groups. We tested four progressively more restrictive hypotheses: a) the equivalence of the factorial weights, b) the equivalence of the regression coefficients, c) the equivalence of the covariances and d) the equivalence of the measurement errors. The test of invariance requires the specification of a model in which certain parameters are forced to be the same, and that this model be compared to another, less restrictive model in which the parameters
Table 2. Reliability of the Job Satisfaction Scale: Cronbach's alpha.

Factor                 Items   Alpha
Pay                    6       0.82
Information            3       0.81
Supervisors            3       0.84
Development            4       0.84
Affective Commitment   4       0.78
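The reliability figures in Table 2 (Cronbach's alpha) can be reproduced from raw item scores; a minimal sketch with synthetic data — the study's dataset is not available here, so the matrix below is purely illustrative:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (subjects x items) score matrix."""
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Perfectly parallel items (identical columns) give alpha = 1.
scores = np.tile(np.array([[1.], [2.], [3.], [4.], [5.]]), (1, 4))
print(round(cronbach_alpha(scores), 2))  # -> 1.0
```

The formula rescales, by k/(k-1), the share of scale variance not attributable to the individual items; values above roughly .70, like those in Table 2, are conventionally considered acceptable.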
Table 3. Fit indices.

Sample                 χ²       df    χ²/df   RMSEA   TLI   CFI
1st sample (N = 521)   387.08   160   2.40    .05     .94   .95
2nd sample (N = 521)   379.26   160   2.40    .05     .94   .95
are free to take any value whatsoever. The comparison between the models was carried out using the chi-squared difference test: a significant value indicates non-invariance. Finally, employing the same procedure, the hypothesis of structural invariance by type was tested.

4. Results

The internal validity of all the items belonging to the scales of the questionnaire was evaluated using Cronbach's coefficient alpha. As displayed in Table 2, the coefficients vary from .78 to .84 and are therefore acceptable. The item-total analysis also showed positive correlations. Table 3 presents the fit indices considered in both samples. As far as the formulation of the model (1st sample) is concerned, the indices prove satisfying according to the criteria suggested by previous literature (Bentler, 1990) [5]: the model therefore gives an appropriate explanation of the data. Even though the value of χ² is significant, the fit indices are over the threshold of .90, while the error (RMSEA) remains at the threshold of .05. As the data and the theoretical model have been shown to be congruent, it is possible to proceed to the interpretation of the parameters. Table 4 shows that the parameters are all high and all significant (p < .001), which is also true of the correlations between the different dimensions of job satisfaction (Table 5).
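The fit statistics reported in Table 3 can be cross-checked from χ², the degrees of freedom and the sample size; the sketch below uses the standard point-estimate formula for the RMSEA (an illustrative check, not the AMOS computation itself):

```python
from math import sqrt

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (point estimate)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Values from Table 3, first sample.
chi2, df, n = 387.08, 160, 521
print(round(chi2 / df, 1))           # -> 2.4
print(round(rmsea(chi2, df, n), 2))  # -> 0.05
```

The χ²/df ratio near 2 and the RMSEA at .05 reproduce the table's verdict that the model fits despite the significant χ².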
Table 4. Model of measurement: standardized parameters.

Pay: V1 .66, V5 .56, V7 .77, V10 .73, V13 .72, V15 .57
Supervisors: V3 .85, V8 .78, V12 .72
Information: V2 .76, V6 .76, V11 .81
Development: V4 .56, V9 .52, V14 .71, V16 .73
Affective Commitment: V17 .79, V18 .82, V19 .64, V20 .50
The structural coefficients (Table 6) are all shown to be significant (p < .05) except one: the quality of the relationship with superior members of staff, which does not seem to produce effects on the participants' affective commitment to the organization. This indicates that the existence of a professional relationship in which the superior encourages open discussion, provides support and encourages the professional growth of the employees under him is not predictive of affective commitment. Of the other three dimensions of job satisfaction, the significant regression coefficients indicate that the component relating to remuneration (defined as recognition gained at work for the personal contribution given in terms of competence and willingness to provide alternative solutions to problems; β = .27, p < .05) positively influences affective commitment, as does the component relating to how available information is to the employee (β = .12, p [...]

[...] 1. Only at baseline of the affective subgroup and at the endpoint of the total group were 5 factors extracted. However a
P.L. Marconi
Table 23. Factors extracted by subgroup (the Quality of Life, Subjective Distress and hetero-evaluated psychopathology factors found in a previous study [3] are not included).

Columns: Total Group (N = 369), Affective Disorder (N = 190), Personality Type-B (N = 34), Other (N = 166).
Rows at baseline: Factors Number, Factor List, Linked Factors, "Central" Factor, Warm Parameters, Type of Warm Parameters; at endpoint the same rows plus Emergent Features and Notes.
4
5 Adjustment Psychopathology Bipolarity Dependence Irritability 4
4
4
Maladjustment Psychopathology Bipolarity Dependence 4 HA-QoL (Social Anxiety) QOL_PHY QOL_SOC PER_AST PER_PAU PER_FID RAI_ANX Socio Relational Anx.-Depressive
Psychopathology Maladjustment Maladjustment Psychopathology Dependence Attachm. Disorder Dysthymia Cyclothymia
4 3 Psychopathology HA-QoL Maladjustment HA-QoL (Subjective Anxiety) QOL_PSY QOL_PHY QOL_PHY QOL_SOC QOL_AMB PER_PAU PER_PAU PER_AST PER_ESP RAI_ANX RAI_DEP PER_AST RAI_ANX Bipolar Socio Relational Anxious-Depress. Adjustment
5 4 4 4 Maladjustment Maladjustment Maladapt. Psychop. Maladjustment Psychopathology Psychop.-Anx-Dep Hypomania Psychopathology Hypomania Psychop.-Hypom. Dependence Dependence Dysthymia Dependence Incongr. Behav. Trusting Dependence 3 3 4 2×2 groups HA-QoL Maladjustment Psychopathology (Social Anxiety) (Social Anxiety) (Social Anxiety) (Cyclothymia) Psychopathology (Hypertimia) QOL_SOC QOL_SOC PER_PAU QOL_PHY RAI_ANX PER_ESP PER_ESP PER_PAU RAI_DEP RAI_ANX RAI_AGG PER_AST 3TR_AFF RAI_AGG 3TR_IDE RAI_DEP Socio Relational Socio Relational Hypomanic Anxious Anx.-Depressive Anx.-Depressive Anxious Hypertimic Obsessive Hypomania QoL & HA Socio-Relat. Fact. Adjust. Distr. Attachment Psychop QoL Anxiety&Adjustment Mood State & RD Pathol. Social Anxiety Mood State&RD Bipolarity Mood State&RD Diagnostic Low Number of Diversity Cases
factor number higher than 4 seems more reliable, even if in the quoted previous study 5 factors were extracted (Tab. 22) [3].
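The number of factors retained in analyses of this kind is typically chosen by keeping components of the correlation matrix with eigenvalue > 1 (the Kaiser criterion). A sketch on synthetic data — the clinical dataset is not reproduced here, and the two-block structure below is an illustrative assumption:

```python
import numpy as np

def n_factors_kaiser(data: np.ndarray) -> int:
    """Count eigenvalues of the correlation matrix greater than 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)
    return int((eigenvalues > 1.0).sum())

# Synthetic sample: two correlated blocks of 3 variables each,
# so two dominant components are expected.
rng = np.random.default_rng(0)
base = rng.normal(size=(300, 2))
data = np.hstack([base[:, [0]] + 0.1 * rng.normal(size=(300, 3)),
                  base[:, [1]] + 0.1 * rng.normal(size=(300, 3))])
print(n_factors_kaiser(data))  # -> 2
```

Applied to different diagnostic subgroups, the same criterion can return different factor counts, which is the effect discussed in the text.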
State Variability and Psychopathological Attractors
The factors extracted are sensitive to the covariation of parameters between subjects. If subjects show different patterns of covariation, this can lead to a splitting of factors, and the use of different populations can lead to different factorial models [4]. What is found here is just this: different diagnostic groups of patients show different factorial models on the basis of a common background structure. The underlying structure seems to be characterized by 6 background "components":
1. The subjective distress linked to the adjustment level and perceived as quality of life. There is a background interference on this perception from the patient's tendency to feel insecure, unable, and fearful of what is not well known about the present and/or the future (linked to Cloninger's HA component).
2. The observed "psychopathology" as dysfunction of behaviour, thought processes and inner feelings.
3. The quality of private relationships, with an inner feeling of anxiety, sadness and aggressiveness linked to the perceived dissatisfaction.
4. The level of the mood as activity level, trend toward exploration and frequency of trail-shift (linked to Cloninger's NS component).
5. The level of the mood as energy level, attention, strength, feeling of competence and social matching capability.
6. The need for an external action for full personal emotional satisfaction (dependence), which can lead the patient to be emotionally dependent on others and to seek others' reward (linked to Cloninger's RD component). This background component may be twisted into a "denied dependence", which actually remains beneath a surface attitude of "independence". The expression of the actual "dependent" or "independent" (different from "autonomic") background attitude may be modulated by the mood level.
These background components are extracted differently among the different subgroups.
The specific pathological process leads to the distinction or to the aggregation of these basic components, building the specific diagnostic pattern as the specific factorial model. However, we also have to take into account that the number and quality of the "background components" listed above may be sensitive to the diagnostic composition of the total population [4]. The pattern of covariation among parameters is evidently more "coherent" at the baseline than at the endpoint. The correlation between factors extracted at baseline and factors extracted at endpoint is also evidently low.
The low correlations between the factors extracted before and after treatment can be caused by the disappearance of interfering psychological factors sensitive to drugs. At the endpoint a persistent "coherence" between parameters and factors is still detected. This finding can be interpreted as the expression of a persistent "psychopathology", as can also be argued from the persistence of a psychopathology factor and of a maladjustment factor in the endpoint factorial model. Results from discriminant analysis confirm the presence of a main interference at the baseline represented by the mood disturbance, which is the best-cured component, while at the endpoint different sources of interference still persist. The common functional factor evidenced is maladjustment and personal discomfort. However, it can be detached from the clinical judgment of "recovery" or "ill state", and it can be linked to the patient's compliance with treatment. In Personality Type-B Disorder this compliance is very unreliable, and the mismatch between the clinician's and the patient's judgment about the presence of an "illness state" is more evident in our data, as it is more frequent in actual clinical experience. The best change in the severity of maladjustment is observed in affective patients, where drug treatment has demonstrated the highest efficacy.

5. Conclusion

Values of clinical parameters (either self-described or externally evaluated) have more variability at the baseline than at the endpoint, where they show a trend converging to "normal" values. These data can be interpreted as a lower stability of "rigid" systems. Four or five factors are usually extracted by factor analysis performed both on the whole selected population and on the three diagnostic subgroups. These factors are the expression of a "coherence" between parameters that can arise when a pathological interference occurs. Pathological states can be judged subjectively by the discomfort and subjective distress caused by maladjustment.
The presence of an adaptive dysfunctionality (in controlled behaviour, perceived feelings and thought processes) can, however, also be judged externally, with evaluations that are not always correlated with the patient's judgment. There are probably some background components linked to "personality", as Cloninger already described (Novelty Seeking, Harm Avoidance, Reward Dependence), that influence the base reactivity of patients [2]. Psychological life-experience-based components (character) have to be added to this
Figure 10. The pathological state is characterized by a "distortion" of the statistical independence between parameters and background components. This condition leads to a lack of flexibility of the system and in turn to a high maladjustment. Such a condition may be represented as a "flat plane" in the "adaptive graph" drawn at the bottom of the figure (see also Figs. 1 and 2). The treatment "reduces" the "coherence" and increases the statistical independence between parameters and components. The unstable state assessed on the abscissa becomes more stable as it becomes more adaptive. At the endpoint the "flat plane" in the "adaptive graph" is changed back to normal adaptability.
temperamental bias; the "character" affects the attitude toward, and the quality of, social relationships, which can also help people manage personal stress. There are therefore three "patterns" which can be overlapped and evidenced in illness states:
1. the acute clinical disorder pattern (in this study mainly affective and/or anxiety disorders),
2. the background temperament,
3. the presence of abnormal personality traits (character), which leads to a kind of disease protection or to a chronic maladjustment.
The different illnesses can be classified not only by the "typicality" of the "syndrome" factorial pattern but also by the intensities of dysfunction (maladjustment), symptoms (subjective discomfort) and signs (psychopathology).
The model proposed is presented in Fig. 10. The presence of a clinical disorder leads to a "distortion" of the "physiologic" pattern, which is characterized by a high statistical independence among parameters, except those linked to temperament features (constitutional and linked to the genome). The pathologic "distortion" is linked to a lack of flexibility and causes a drop in adjustment capabilities and the rise of a greater instability (increase of variance) in the symptoms and signs detected. The clinical picture, however, is complicated by an interference exerted also by coping style and background reactivity linked to the personality factors (character and temperament). As the clinical disorder (in this case affective or anxiety disorder) is cured, the instability (detected as variability) goes down as the system becomes more adaptive. However, at the endpoint the interference exerted by personality factors becomes more relevant, and it can be the main cause of the persistence of maladjustment and personal and/or social discomfort. The present model has to be considered a working model; it needs to be confirmed by further evidence, obtained either from a larger number of cases or with the use of more parameters to increase sensitivity to the different physiopathological components.

References

1. P.L. Marconi, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa and M. Abram, (Springer, New York, 2006).
2. C.R. Cloninger, Arch. Gen. Psychiatry 44(6), 573-588 (1987).
3. E. Marchiori and P.L. Marconi, in VIII Congress of the International Society for the Study of Personality Disorders, ISSPD (International Society for the Study of Personality Disorders), (Florence, 2003).
4. P.L. Marconi, in Psicopatologia dimensionale e trattamento farmacologico, Ed. T. Cantelmi and A. D'Andrea, (Antonio Delfino Editore, Roma, 2003).
5. P. Pancheri and P.L. Marconi, Giornale Italiano di Psicopatologia 2(1), (1996).
6. P.L. Marconi, P. Cancheri and R.M. Petrucci, in X World Congress of Psychiatry, Ed. J.J. Lopez-Ibor, F. Lieh-Mak, H.M. Vistosky and M. Maj, (Hogrefe and Huber Publishers, Kirkland, WA, 1999).
7. P.L. Marconi and C. Gambino, in La clinica dell'ansia. Volume II, Ed. G.B. Cassano, P. Cancheri and L. Ravizza, (Il Pensiero Scientifico Editore, Roma, 1992).
8. P.L. Marconi and F. De Palma, in Ossessioni, Compulsioni e continuum ossessivo, Ed. P. Pancheri (Il Pensiero Scientifico Editore, Roma, 1992).
9. P.L. Marconi and P. Cancheri, in Atti del IX Congresso Nazionale di Informatica Medica (Associazione Italiana di Informatica Medica, Università Cà Foscari, Venezia, 3-5 ottobre 1996).
MODELS AND SYSTEMS
This page intentionally left blank
DECOMPOSITION OF SYSTEMS AND COMPLEXITY
MARIO R. ABRAM
AIRS - Associazione Italiana per la Ricerca sui Sistemi
Milano, Italy

Recalling the decomposition methodology, the complexity of the decomposition process is described. The complexity of a system is connected with the depth reached in the decomposition process. In particular, the number of subsystems and the number of active relations present in a decomposition are the elements used to define a complexity index. Some considerations about decomposition sequences allow us to put in evidence some basic properties useful for defining the maximum values of complexity. Given some hypotheses about the relation patterns due to the starting steps of the decomposition process, the range for each decomposition level is evaluated through computer simulations. In addition, some connections with other knowledge contexts, such as graph theory, are presented.

Keywords: decomposition, subsystem, complexity, graph theory.
1. Introduction

A possible way to describe a system and to put in evidence its structure may lie in developing a method for the characterization of the system, involving all the aspects connected with the definition of the subsystems and with the possible existence of relations between them. Maintaining control of the description process is not easy: some intuitive and elementary considerations show how quickly the number of binary relations between the subsystems increases (with a quadratic law). So it is necessary to apply a rigorous methodology in order to keep complete control of the decomposition of a system into subsystems. In this paper, following again the decomposition approach and using a general decomposition methodology [1] (section 2), we investigate the possibility of defining and computing some complexity indices of a system. Recalling briefly some ideas about complexity, we show how complexity may be defined and evaluated in a decomposition process (section 3). In particular, we define complexity as a property connected with the number of subsystems and the number of active relations between the subsystems. Then we see how the complexity of a system is defined and is meaningful within a specific decomposition level (section 4); in this context
some elementary complexity indices for each decomposition level will be defined, evidencing their maximum values. Not all complexity indices are computable, so some of them will be evaluated through simulations of the decomposition process (section 5). Some methodological remarks are reported in the conclusions (section 6).

2. Decomposition of systems

The decomposition process enables the partition of a system into subsystems, keeping under control the development of the relations between all subsystems. In this way the patterns of relations between the subsystems emerging from the decomposition process are always strictly connected with the decomposition steps for each subsystem. Very briefly, each decomposition step is summarized by the following activities [1,2]:
• A subsystem $S_k^{n-1}$ is duplicated with its relations into two new subsystems $P_k^n$ and $N_k^n$.
• The subsystems $P_k^n$ and $N_k^n$ are respectively identified by a property $P_k^n$ and its negation $N_k^n = \neg P_k^n$.
• The relations between the subsystems $P_k^n$ and $N_k^n$ and the other subsystems are reduced, eliminating those not coherent with the properties $P_k^n$ and $N_k^n = \neg P_k^n$.
• The new subsystems are then labeled $S_k^n = P_k^n$ and $S_{k+1}^n = N_k^n$.
Figure 1 shows a simplified schema of step n in the decomposition process. The matrix of relations is a picture of all binary relations between all subsystems forming the decomposition pattern. Alternatively, the decomposition process may be represented as a directed graph $G^n(S^n(P^n), R^n)$, in which $S^n$ is the set of subsystems, $R^n$ the set of relations between the subsystems, and $P^n$ the definition property used to characterize the subsystem $S^n$. We remark that in this representation the graph structure is enriched by labeling each vertex with the specific property defining the subsystem associated to that node.
Moreover, the matrix of relations is by definition the incidence matrix of the graph $G^n(S^n, R^n)$; so we can investigate which properties of incidence matrices may be useful in describing the relation pattern of the decomposition. The decomposition process is thus a synthesis of graphs: a graph of properties and a graph of subsystems. In this way it is possible to give a further definition of system. A system is a couple $S^n = (T^n, B^n)$ where $T^n$ is the graph of
Figure 1. Decomposition of a system into two subsystems.
properties and $B^n$ is the graph of subsystems. By construction the graph $T^n$ is a tree with n vertices and n + 1 leaves, and the graph $B^n$ is a directed graph with n + 1 vertices and a maximum of n(n + 1) edges.

3. Complexity and decomposition

Different definitions of complexity were developed in order to give a quantitative description and to evaluate the variety of configurations or, more generally, the multiplicity of properties in a system. Some authors have associated the concept of complexity with the idea of redundancy, interpreted as multiplicity of choices. Some of these aspects are taken into consideration when defining a system and become evident when we attempt to decompose a system into subsystems. We are then interested in giving a measure of the complexity involved in the decomposition process. In particular, we will attempt to investigate how the number of subsystems and of the active relations among them affects the complexity of systems. In a decomposition process, the matrices of relations are isomorphic to the incidence matrices of a directed graph. So a first index of complexity is given by counting the number of non-zero elements in the incidence matrix: this complexity index is given by the number of binary relations involved in that decomposition (the number of edges in the graph). When considering the decomposition process, some deeper evaluation of the hypotheses involved in its definition is useful, because we need a more realistic evaluation of the maximum values of the complexity indices related to our applications. An important aspect then emerges from these considerations: the complexity of a system is related to the specific decomposition built for the
description of that system. In particular, the level of complexity of each system is connected with the level of decomposition reached for that system. So speaking about the complexity of a system is improper; it is correct instead to speak about the complexity of the description of a system. Some have attempted to describe the complexity of a system in a general way, but in reality even such a description is given with reference to the background model chosen for the system.
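The decomposition step recalled in section 2 can be sketched in code: a subsystem is duplicated into a P/¬P pair that inherits its relations, after which some relations of the pair are pruned. The function names, the encoding of relations as directed pairs, and the random pruning rule are illustrative assumptions (the paper prunes by coherence with the defining property), not the paper's formal definitions:

```python
import random

def decompose(relations: set[tuple[int, int]], n_subsystems: int,
              k: int, prune: int, seed: int = 0) -> tuple[set[tuple[int, int]], int]:
    """One decomposition step: subsystem k is split into P (which keeps
    index k) and N (new index n_subsystems); N inherits k's relations,
    then `prune` randomly chosen relations of the pair are eliminated."""
    rng = random.Random(seed)
    new = n_subsystems
    inherited = set()
    for (a, b) in relations:
        if a == k:
            inherited.add((new, b))   # copy k's outgoing relations onto N
        if b == k:
            inherited.add((a, new))   # copy k's incoming relations onto N
    relations = relations | inherited
    # lacking the properties themselves, model the reduction of
    # incoherent relations as a random choice among the pair's relations
    candidates = sorted(r for r in relations if k in r or new in r)
    for r in rng.sample(candidates, min(prune, len(candidates))):
        relations.discard(r)
    return relations, n_subsystems + 1

# Split subsystem 0 of a two-subsystem pattern, with no pruning.
rels, n = decompose({(0, 1), (1, 0)}, 2, k=0, prune=0)
print(n, sorted(rels))  # -> 3 [(0, 1), (1, 0), (1, 2), (2, 1)]
```

The matrix of relations of the text corresponds here to the set of directed pairs; counting its elements gives the first complexity index (number of edges).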
4. Evaluating complexity

Complexity may be evaluated in different ways, and it is convenient to introduce a definition and a metric that are meaningful, and hence useful, in connection with the specific aspects of the problem. Three kinds of complexity indices are available; the simplest are direct functions of the decomposition index, and their values follow a linear law. They can be listed as follows:
• Logical. Indices related to the number of properties involved in the definition of a system and to the number of subsystems. These values are intrinsic to the decomposition process.
• Relational. Indices connected with the number of relations between the subsystems.
• Mean. Indices related to average values computed by means of logical and relational indices.
While the logical indices are linear and give the dimension of the decomposition (the basic dimension of the problem), the relational indices give an idea of the number of connections involved and of the interaction or loop structures present in the representation. When evaluating complexity indices it is easy to recognize that the logical indices are directly computable, while the relational indices, being dependent on the choices adopted in the decomposition process, are not computable in a universal way. In this regard it is possible only to compute the maximum values of these indices, which characterize the range of values they can span. Given a decomposition step n and an index X, in defining each metric we will use the following convention:
• Effective value $N_X^n$.
• Maximum value $U_X^n$.
• The only independent variable will be the decomposition index n.
4.1. Logical indices

Logical indices are related to the number of properties involved in the definition of a system and, consequently, to the number of subsystems involved. We introduce the following nomenclature.
• Number of properties $N_P^n$ used for the definition of a system at decomposition level n. A system is defined by a decomposition instantiating n properties, so

$N_P^n = n$.

• Number of subsystems $N_S^n$ in decomposition n (number of vertices):

$N_S^n = N_P^n + 1 = n + 1$.

Substantially it is given by the number of subsystems defined in the decomposition procedure; it is greater than the number of properties because it includes the "environment" subsystem.
4.2. Relational indices

The relational indices involve the number of relations and help to investigate some structural properties of the systems. Some useful indices are:
• Number of relations $N_R^n$ between the subsystems involved in a specific decomposition n (number of edges). We are not able to give a formula for this index because its values come from the number of surviving relations in each decomposition step.
• Number of reducible relations $N_{RR}^n$ between the subsystems involved in a specific decomposition step n (h assumes values related to the hypothesized reductions of the direct relations between the subsystems $P_k^n$ and $N_k^n$ of decomposition step n):

$N_{RR}^n = 4(n - 1) + h \qquad (h = 0, 1, 2)$.   (1)

• Number of interactions $N_I^n$ present in decomposition n (number of cycles); for a graph it is usually named the cyclomatic number $\gamma^n$ [4] and represents the number of cycles present in the incidence matrix of decomposition n. If $N_R^n$ is the number of relations between the subsystems (number of edges) and $N_S^n$ is the number of subsystems (number of vertices), the number of interactions is:

$N_I^n = \gamma^n = N_R^n - N_S^n + 2 = N_R^n - n + 1$.

• Number of simple interactions $N_{I1}^n$ present in decomposition n (number of cycles involving only two subsystems); if $N_{RS}^n$ is the number of binary symmetric relations between the subsystems, it is given by

$N_{I1}^n = N_{RS}^n / 2$.

• Number of non-simple interactions $N_{IK}^n$ present in decomposition n (number of cycles involving more than two subsystems):

$N_{IK}^n = N_I^n - N_{I1}^n$.

• Sparsity, in a decomposition n, is the proportion of active relations with reference to the maximum number of possible relations (see the subsequent definition of $U_R^n$):

$N_{SP}^n = N_R^n / U_R^n = N_R^n / (N_S^n (N_S^n - 1)) = N_R^n / (n(n + 1))$.

Because we are not able to give an explicit formula for the computation of some indices, such as $N_R^n$, $N_{RS}^n$ and hence $N_{I1}^n$, it may be convenient to compute the maximum values of some indices in order to evaluate their range. We can then consider the following values:
• Maximum number of relations involved in a decomposition n (maximum number of edges):

$U_R^n = (N_S^n)^2 - N_S^n = N_S^n (N_S^n - 1) = (n + 1)n$.

• Maximum number of reducible relations, i.e. the maximum number of relations involving the subsystems P and N in decomposition n (maximum number of edges to be eliminated):

$U_{RK}^n = 4(n - 1) + 2 = 4n - 2$.

• Maximum number of interactions involved in a specific decomposition n (maximum number of cycles), which is, for $N_R^n = U_R^n$:

$U_I^n = U_R^n - n + 1 = n(n + 1) - n + 1 = n^2 + 1$.

• Maximum sparsity involved in a specific decomposition n:

$U_{SP}^n = U_R^n / U_R^n = 1$.
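The closed-form maxima above are easy to check numerically; a small sketch of the formulas of section 4.2 (the function names are ours, not the paper's):

```python
def max_relations(n: int) -> int:
    """U_R^n = N_S^n (N_S^n - 1) = n(n + 1): maximum edges among n + 1 subsystems."""
    return n * (n + 1)

def max_interactions(n: int) -> int:
    """U_I^n = U_R^n - n + 1 = n^2 + 1: maximum number of cycles."""
    return max_relations(n) - n + 1

def interactions(n: int, n_r: int) -> int:
    """Cyclomatic number N_I^n = N_R^n - N_S^n + 2 = N_R^n - n + 1."""
    return n_r - n + 1

def sparsity(n: int, n_r: int) -> float:
    """N_SP^n = N_R^n / U_R^n: proportion of active relations."""
    return n_r / max_relations(n)

# For the 50-property example of section 5 (51 subsystems):
n = 50
print(max_relations(n))            # -> 2550
print(max_interactions(n))         # -> 2501
print(round(sparsity(n, 724), 3))  # 724 active relations, as in Figure 2
```

Note how $U_I^n$ grows quadratically with the decomposition level, which is the quadratic law mentioned in the introduction.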
4.3. Mean values

Mean indices are related to the average values computed by means of logical and relational indices. The mean values are referred to the number of subsystems involved in the decomposition step.
• Mean number of relations for each subsystem (mean number of edges):

$\bar{N}_R^n = N_R^n / N_S^n = N_R^n / (n + 1)$.

• Mean number of interactions for each subsystem in a specific decomposition n (mean number of cycles):

$\bar{N}_I^n = N_I^n / N_S^n = N_I^n / (n + 1)$.

• Mean number of simple interactions for each subsystem in a specific decomposition n (mean number of simple cycles):

$\bar{N}_{I1}^n = N_{I1}^n / N_S^n = N_{I1}^n / (n + 1)$.

• Mean number of non-simple interactions for each subsystem in a specific decomposition n (mean number of non-simple cycles):

$\bar{N}_{IK}^n = N_{IK}^n / N_S^n = N_{IK}^n / (n + 1)$.
5. Simulation
In order to evaluate the numerical values of the relational indices it is necessary to use computer simulations. For this reason the decomposition process was implemented in Matlab. The hypotheses about the decomposition process influence the evolution of the decomposition itself. The reduction of the number of relations in defining a subsystem can be performed in the following different ways:
M.R. Abram
[Matrix plots omitted; nz = 724 (left) and nz = 278 (right) nonzero entries in the 51×51 matrices.]
Figure 2. Matrix of relations (left) and matrix of binary relations of simple interactions (right).
• The direct relations between the new subsystems of the last decomposition step are not reduced (h = 0 in equation (1)).
• One of the two direct relations between the subsystems in the decomposition step may be reduced (h = 1 in equation (1)).
• The two direct relations in the decomposition step may be reduced (h = 2 in equation (1)).
• The maximum number of relations to be reduced for each decomposition step is fixed.
• The number of relations to be reduced follows a defined reduction law.
These possibilities were implemented in our simulations, and the identification of the relations to be eliminated was given by a random choice (with uniform distribution) as regards:
• The number k of the subsystem $S_k^n$ to be decomposed into the two subsystems $P_k^{n+1}$ and $N_k^{n+1}$.
• The choice of the relations to be eliminated among the relations involving the two subsystems $P_k^{n+1}$ and $N_k^{n+1}$.
As an example we report the simulation of a case with 50 properties (so that 51 subsystems are present at the end of the decomposition). The figures show the evolution of the previously defined indices and give an idea of the trend of larger decomposition processes. In particular, in figure 2 the matrices of relations
Figure 3. Sequence of decompositions (left) and sparsity index (right) vs. decomposition index n.
and of simple interactions at the end of the decomposition process are shown. Figure 3 depicts the random choices of the subsystem to decompose and the evolution of the sparsity index. The evolution of the relation and interaction indices vs. the decomposition index n is reported in figure 4; the values of the indices, shown in logarithmic scales, are compared with the maximum number of properties $U_P^n$, the maximum number of relations $U_R^n$ and the maximum number of reducible relations $U_{RR}^n$. In figure 5 the mean values of some indices for each subsystem are reported, referred to the same maximum numbers. The importance and the role of the reduction law are evident: its choice drives the evolution of the process toward either a manageable number of relations or an excessive amount of binary relations. In our simulation the reduction law gives the number of relations to eliminate $N_{RE}^n$ and has the form $N_{RE}^n = \mathrm{int}(\alpha n)$, where we set the reduction coefficient $\alpha = 0.4$. The equation gives a value that is bounded by the effective number of reducible relations between the two subsystems. The choice of the parameters in the reduction law directly affects the sparsity index of the decomposition (figure 3). The results obtained from the previous simulations suggest some interesting considerations. They put into evidence the role of the observer, intended as the agent who drives the decomposition process. In the previous numerical experiments the random choice of the subsystem to be decomposed can be interpreted as a simulation of the cognitive system of the observer [8]. Therefore testing different statistical distributions for the choice algorithm, and modeling it probabilistically, may give the possibility of investigating different cognitive strategies
Figure 4. Relations indices (left: properties, maximum relations, maximum reducible relations, total relations, reducible relations, eliminated relations) and interactions indices (right: total interactions, simple interactions, non-simple interactions) vs. decomposition index n.
Figure 5. Mean values of relations indices (left) and mean values of the interactions indices (right) vs. decomposition index n.
and consequently of evaluating the effects produced by different dynamics on the evolution of the decomposition process.
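The decomposition loop just described can be sketched in Python (the original was implemented in Matlab; the inheritance rule for relations, the helper names and the uniform random choices below are our assumptions for illustration):

```python
import random

def decompose(n_props, alpha=0.4, seed=1):
    """Random decomposition sketch: at each step split a randomly chosen
    subsystem S_k into P and N, then eliminate N_RE = int(alpha * step)
    randomly chosen relations among those touching P and N."""
    rng = random.Random(seed)
    relations = set()                    # directed edges between subsystem ids
    subsystems = [0]
    for step in range(1, n_props + 1):
        k = rng.choice(subsystems)       # subsystem to decompose
        p, n_new = k, len(subsystems)    # P keeps id k, N gets a fresh id
        # assumed inheritance rule: N inherits the relations of the split subsystem
        for (a, b) in list(relations):
            if a == k:
                relations.add((n_new, b))
            if b == k:
                relations.add((a, n_new))
        relations |= {(p, n_new), (n_new, p)}   # the two direct relations
        subsystems.append(n_new)
        # reduction law N_RE = int(alpha * step), bounded by the reducible relations
        reducible = [r for r in relations if p in r or n_new in r]
        n_re = min(int(alpha * step), len(reducible))
        for r in rng.sample(reducible, n_re):
            relations.discard(r)
    n_s = len(subsystems)
    return n_s, len(relations), len(relations) / (n_s * (n_s - 1))
```

With 50 properties the sketch always ends with 51 subsystems, as in the reported example; the resulting sparsity depends on the reduction coefficient alpha.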
6. Conclusions
The previous findings suggest some considerations.
• Implementing a decomposition process increases the description detail of systems, and this gives an advantage in acquiring information about system properties; but we may ask whether there is a limit beyond which the decomposition process is useless.
• The decomposition process has a twofold meaning connected with the goal of the decomposition itself. If we decompose a system in order to analyze it, we obtain a description suited to grasping the properties of an existing system, and the increasing complexity of the decomposition coincides with the "analysis" complexity. On the contrary, in a design activity the decomposition is used to dominate the development of a project; the decomposition complexity is then connected with the "project" complexity.
• The advantage of decomposition increases linearly with the number of subsystems, but the number of relations increases quadratically. It can be shown that when we increase the number of subsystems, and with them the number of available functions, the system performance can be higher; an increase of the decomposition level may thus increase the advantages due to the existence of subsystems. These advantages may increase linearly with the number of subsystems, but the costs for implementing the subsystems themselves increase quadratically with the number of relations connecting them. This means that, beyond a given decomposition level, the costs will inevitably overcome the advantages, so that increasing the decomposition level is not always convenient. It is then important to evaluate the cost/benefit ratio related to the number of subsystems and relations.
• The decomposition process is a rigorous methodology by means of which each subsystem is defined by a property strictly connected with the properties recalled in the previous decomposition steps. Each definition of a property is then important, because its activation in the decomposition process may duplicate the subsystem relations or reduce them.
• The definitions of the indices of properties and subsystems show how each decomposition intrinsically gives rise to an environment subsystem, defined through the properties complementary to those of the subsystems. This entails that any kind of decomposition implies the unavoidable introduction of an environment, whose properties are not absolute and objective, but defined in terms of the hypotheses made by the system analyzer.
• It could be possible to describe the behavior of subsystems, independently of the nature of the elements, by resorting to concepts of information theory. This informational approach [5,7], mainly based on quantities such as mutual information, can be useful to evidence the presence of systemic features and therefore the past occurrence of the emergence processes giving rise to the system itself.
• When an effective application must be developed and each relation is instantiated by a more detailed description, the definition of complexity must be specified, and more specific and detailed definitions of the complexity indices can be used. The complexity of a system is then related to the complexity of the description adopted for that system.
• Further research may usefully go deeper into the results coming from the theory of directed graphs, so as to find their possible connections with practical applications [3,6]. This may be advantageous when developing software to manage the amount of available data and to evidence the structural properties of the decomposition process.
Acknowledgments
I would like to thank Prof. Eliano Pessa, who provided information and useful discussions.
References
1. M.R. Abram, in Emergence in Complex, Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer Academic/Plenum Publishers, New York, 2002), pp. 103-116.
2. M.R. Abram, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa and M. Abram (Springer, New York, 2006), pp. 377-390.
3. A.-L. Barabasi, in Handbook of Graphs and Networks, Ed. S. Bornholdt and H.G. Schuster (Wiley-VCH, Weinheim, 2003), pp. 69-84.
4. C. Berge, The Theory of Graphs (Dover, Mineola, NY, 2001).
5. R.C. Conant, IEEE Trans. Sys. Man Cybernetics 6, 240 (1976), (reprinted in [7]).
6. R. Diestel, Graph Theory, 3rd Ed. (Springer-Verlag, Heidelberg, New York, 2005).
7. G.J. Klir, Facets of Systems Science, 2nd Ed. (Kluwer Academic/Plenum Publishers, New York, 2001).
8. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006).
HOW MANY STARS ARE THERE IN HEAVEN? THE RESULTS OF A STUDY OF UNIVERSE IN THE LIGHT OF STABILITY THEORY
UMBERTO DI CAPRIO Stability Analysis s.r.l., Via A. Doria 48/A - 20124 Milano, Italy E-mail:
[email protected]

The "visible universe" is a spherical matter crust that rotates at a convenient speed around a central massive body which represents a black-hole. In addition it expands in all radial directions. Such a structure was first postulated in 2004 and is now fully confirmed by experimental observations from the WMAP (Wilkinson Microwave Anisotropy Probe) released by NASA in March 2007. Using stability theory (ST) we explain the present state and the future evolution up to the final reach of a stable dynamical equilibrium. We present a consistent set of closed form equations that determine basic quantities such as radius, age, Hubble constant, mass, density and "missing mass". At the end of the expansion the number of typical stars of the visible universe will be equal to the Avogadro number.
Keywords: stability theory, structure of universe, emergence of stars.
List of symbols

$G = 6.67258 \times 10^{-11}\ \mathrm{J\,m/kg^2}$ gravitational constant;
$c = 1/\sqrt{\varepsilon_0 \mu_0} = 2.99792458 \times 10^8\ \mathrm{m/s}$ speed of light in empty space;
$\varepsilon_0 = 1/(4\pi k)$ permittivity in vacuum;
$k = 8.987551788 \times 10^9\ \mathrm{J\,m/Coulomb^2}$ Coulomb constant;
$m_0 = 9.1093897 \times 10^{-31}\ \mathrm{kg}$ electron mass;
$m_p = 1.6726231 \times 10^{-27}\ \mathrm{kg}$ proton mass;
$q = 1.60217733 \times 10^{-19}\ \mathrm{Coulomb}$ unitary charge;
$h = 6.6260755 \times 10^{-34}\ \mathrm{J\,s}$ Planck constant;
$\mu_0 = 4\pi \times 10^{-7}\ \mathrm{J\,s^2/(Coulomb^2\,m)}$ magnetic permeability in vacuum;
$r_B = 5.29177249 \times 10^{-11}\ \mathrm{m}$ Bohr radius;
$H_0$ Hubble constant;
$\alpha = 7.297353080 \times 10^{-3}$ fine structure constant;
$N_A = 6.02213607 \times 10^{23}$ Avogadro number;
$\alpha_{mq} = G m_p m_q / (k q^2) = 4.406758406 \times 10^{-40}$ pure number; $1/\alpha_{mq} = 2.269241715 \times 10^{39}$;
$TSM = 3.7459739 \times 10^{30}\ \mathrm{kg}$ Typical Stellar Mass;
$\gamma_r = (1 + \sqrt{5})/2$ golden ratio;
$d = d(t)$ de-acceleration parameter;
$d_0 = d(t_0)$ de-acceleration parameter, present value;
$d_f = d(t_f) = 0.5$ de-acceleration parameter, final value;
$t_0$, $t_f$ present time, final time;
$M_0 = M(t_0)$, $M_f = M(t_f)$ present mass, final mass;
$M_B$, $M_G$ black-hole mass, visible galactic mass;
$\rho_0 = \rho(t_0)$, $\rho_f = \rho(t_f)$ present density, final density;
$\rho_c$ critical density;
$\rho_G$, $\bar\rho_G$ galactic density, seeming galactic density;
$E_p$, $\bar E_p$ Potential energy, equivalent Potential energy;
$\tau_H = H_0^{-1}$ Hubble time;
$H(t)$ Hubble constant at time t;
$R_0 = R(t_0)$, $R_f = R(t_f)$ present radius, final radius;
$\dot R(t)$ expansion speed;
$m_0$, $m_p$, $q$, $\alpha$ are invariant; $G$, $k$, $\mu_0$, $\varepsilon_0$, $c$ vary with time;
$G(t)/c^2(t) = \mathrm{const} = G/c^2 = 7.424257637 \times 10^{-28}$.
Note: In order to avoid confusion we use the symbol $d_0$ (rather than the usual $q_0$) to designate the de-acceleration parameter.
1. Introduction
In March 2007 NASA released a suggestive image of the whole universe as "viewed" by the WMAP (Wilkinson Microwave Anisotropy Probe). This exceptional result is the culmination of years of practical and theoretical research, primarily based upon observations by the Hubble space telescope, from 1995 on, and by interconnected observatories disseminated on earth. A first sensational synthesis was made known in October 2003 and the following months, giving numerical estimates of fundamental quantities such as the "age of Universe", geometric form, radius, density of matter, expansion rate with time, birth of galaxies, Hubble constant. Such data deeply modified our knowledge of Universe and put in crisis the majority of existing cosmological theories. A first innovative study to cope with this new situation was presented at the 2004 AIRS Congress of Castel Ivano, Trento (I) and published by Springer in 2006 [5]. Here we illustrate further developments in the light of the most recent acquisitions.
• The starting point is that (cfr. [5]) the visible Universe has a finite extension and a spheroidal form. This undoubtedly means that Universe has a
geometric center and a center of mass (with the Cosmological Principle's permission). The two properties must agree.
• We propose an original approach based on stability theory (ST) and on the equivalence between Potential energy and mass (in extreme synthesis, special relativity SR).
• The closed and spheroidal form is explained by a two-body structure, in which the visible Universe consists of a matter crust that rotates around a central black-hole and simultaneously undergoes radial expansion. The Newton attraction force is counterbalanced by the centrifugal force. Of course this is a mathematical model only! However it proves to be effective in general and, in particular, for the explanation of the missing mass problem: in the two-body problem the coupling Potential energy (of gravitational nature) is negative and, since this energy is equivalent to mass, visible mass is only a fraction of the effective mass. This fraction grows in time, because the potential energy is inversely proportional to the expansion radius and, by consequence, the absolute value of the energy in question decreases with time. Stabilization is attained when the expansion stops, i.e. when the expansion speed becomes equal to zero.
• By application of Relativistic Stability Theory (RST) we establish a direct relation between the mass $M_B$ of the central black-hole and the final radius $R_f$ of the visible Universe at the end of the expansion. On the other hand the value of $R_f$ is autonomously identified by a quantum gravitational condition that represents the counterpart of the classical Bohr condition in the hydrogen atom. Consequently the relation between $M_B$ and $R_f$ allows us to determine $M_B$ (the black-hole mass) from $R_f$ (the Universe final radius).
• A fundamental scale factor links together gravitational quantization and electromagnetic quantization. Such a factor is defined by the ratio between the electric force and the gravitational force on the rotating electron in hydrogen, and is of the order of $10^{39}$.
Multiplying the Compton radius (about $10^{-13}$ m) by $10^{39}$ we obtain a radius of the order of $10^{26}$ m, which is the order of the observed experimental value of the radius of Universe (in the subsequent illustration we give precise figures). The preceding similarity is in itself of a "static" nature, as it does not account for the expansion of the radius (in Universe, not in hydrogen). The seeming gap is overcome by taking account of the de-acceleration parameter d of classical cosmology (e.g. see Weinberg). We give a formula that connects the present value of the radius $R_0$ to the final value $R_f$ via the present value $d_0$. The latter is determined so that two basic physical
properties are simultaneously verified: the present age of Universe is 13.7 billion years and the missing mass is about 97%. In parallel we show that the final value of the de-acceleration parameter is $d_f = 0.5$ and that the corresponding "age" is $t_f = 40.043$ billion years, while the final missing mass is 0%.
• All in all it is possible to determine closed form equations defining the final mass and density of universe, as well as the transients that lead from the present values to the final ones. An astonishing finding is that the total mass (visible Universe plus central black-hole) is equal to the mass of $N_A$ typical stars (e.g. see Weinberg), with $N_A$ the Avogadro number.
2. Closed form equations of universe
A) We postulate similarity between Universe and the hydrogen atom via the adimensional scale factor

$\alpha_{mq} = \frac{G m_p m_q}{k q^2} = 4.406758406 \times 10^{-40}.$

The present value $H_0$ of the Hubble constant and the final mass of visible Universe are defined by

$H_0 = \tau_H^{-1}\,; \qquad \tau_H^{-1} = \frac{c}{\alpha^{4/3} r_B}\,\alpha_{mq} = (5.669546974 \times 10^{17})^{-1}\ \mathrm{s}^{-1}$  (1)

$\rho_f = \rho_c = \frac{3}{8\pi G}\, H_0^2 = 5.565321257 \times 10^{-27}\ \mathrm{kg/m^3}$  ($\rho_c$ critical density)  (2)

The above equations implicate

$\rho_f = \rho_c = \frac{6\pi}{G}\left[\frac{G m_p m_0}{c\, \mu_0 q^2 r_B}\,\frac{1}{\alpha^{4/3}}\right]^2$

The final radius $R_f$ is given by $R_f = \alpha r_B / (2\alpha_{mq}) = 4.381444219 \times 10^{26}\ \mathrm{m}$. The mass $M_B$ of the central black-hole is derivable from $(2G/c^2)\, M_B = R_f$, which results in $M_B = 2.950762509 \times 10^{53}\ \mathrm{kg}$. In parallel the final mass of visible Universe is $M_f = \rho_f (4\pi/3) R_f^3 = 19.60788 \times 10^{53}\ \mathrm{kg}$. The total Universe mass is then $M_{tot} = M_B + M_f = 2.255864 \times 10^{54}\ \mathrm{kg}$ and satisfies the relation

$M_{tot} = N_A \cdot TSM$  ($N_A$ Avogadro number; $TSM$ typical stellar mass)  (3)
$TSM = \left(\frac{h c}{2\pi G}\right)^{3/2} \frac{1}{m_p^2} \left[\frac{3\,\gamma_r^3}{4\pi\,(1 + m_0/m_p)}\right]^{3/2} = 3.7459739 \times 10^{30}\ \mathrm{kg}.$
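The chain of closed form equations in part A can be retraced numerically; the following sketch (our variable names, with the constants copied from the symbol list) reproduces the quoted values of $H_0$, $\rho_c$, $R_f$, $M_B$ and verifies relation (3) to within rounding:

```python
import math

# constants as listed in the paper's symbol table
G        = 6.67258e-11        # J m / kg^2
c        = 2.99792458e8       # m / s
alpha    = 7.297353080e-3     # fine structure constant
r_B      = 5.29177249e-11     # m, Bohr radius
N_A      = 6.02213607e23      # Avogadro number
alpha_mq = 4.406758406e-40    # gravitational/electric scale factor
TSM      = 3.7459739e30       # kg, typical stellar mass

H_0   = c * alpha_mq / (alpha**(4/3) * r_B)   # eq. (1)
rho_c = 3 / (8 * math.pi * G) * H_0**2        # eq. (2), critical density
R_f   = alpha * r_B / (2 * alpha_mq)          # final radius
M_B   = R_f * c**2 / (2 * G)                  # black-hole mass from (2G/c^2) M_B = R_f
M_f   = rho_c * (4 * math.pi / 3) * R_f**3    # final visible mass
M_tot = M_B + M_f                             # compare with N_A * TSM, eq. (3)
```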
B) The Universe age is determined from the equation

$t_0 = \tau_H \left[(1 - 2d_0)^{-1} - d_0\,(1 - 2d_0)^{-3/2} \cosh^{-1}\!\left(\frac{1}{d_0} - 1\right)\right]$  (4)

with $\tau_H = H_0^{-1} = 5.669546974 \times 10^{17}\ \mathrm{s}$, $\gamma_r = (1 + \sqrt{5})/2 = 1.618033989$, $d_0 = d(t_0)$; $d = -\ddot R R / \dot R^2$ de-acceleration parameter.

Equation (4) represents a significant extension of a noted Weinberg formula referring to the classical analysis of the cosmological problem (i.e. without assuming a two-body structure and, then, without considering the rotation of visible Universe). The extension is represented by the factor $1/\gamma_r$. Imposing that $t_0 \approx 13.7$ billion years (cfr. with experiments), we find $d_0 \cong 0.2342$. Such a value turns out to be equal to the positive solution of the algebraic equation

$\frac{d_0}{\sqrt{1 - 2d_0}} = 0.5 - b_0\sqrt{1 - 2d_0}\,; \qquad b_0 = 0.2452 \approx \frac{1}{1.5\,e}.$  (5)
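Equation (4) can be evaluated directly; the sketch below (the conversion factor for billion years is ours) recovers $t_0 \approx 13.7$ billion years from $d_0 = 0.2342$ and checks the $b_0$ relation (5):

```python
import math

tau_H = 5.669546974e17   # s, Hubble time from eq. (1)
d0    = 0.2342           # present de-acceleration parameter
GYR   = 3.1557e16        # seconds per billion (Julian) years

# eq. (4): t0 = tau_H [ (1-2d0)^-1 - d0 (1-2d0)^(-3/2) arccosh(1/d0 - 1) ]
bracket = 1/(1 - 2*d0) - d0 * (1 - 2*d0)**(-1.5) * math.acosh(1/d0 - 1)
t0_gyr  = tau_H * bracket / GYR

# eq. (5): consistency between d0 and b0 = 0.2452 ~ 1/(1.5 e)
b0  = 0.2452
lhs = d0 / math.sqrt(1 - 2*d0)
rhs = 0.5 - b0 * math.sqrt(1 - 2*d0)
```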
C) The Universe radius at time $t \ge t_0$ is individualized by the equation

$R(t) = \frac{\alpha r_B}{\alpha_{mq}}\left(0.5 - b_0\sqrt{1 - 2d(t)}\right) = R_f\left(1 - 2b_0\sqrt{1 - 2d(t)}\right)$  (6)

which, in particular, gives

$R_0 = \frac{\alpha r_B}{\alpha_{mq}}\,\frac{d_0}{\sqrt{1 - 2d_0}} = 2.8148 \times 10^{26}\ \mathrm{m}$  (7)
The corresponding value of the density $\rho_0$ satisfies the equation $\rho_0 = 2d_0 \rho_c = 2d_0 \rho_f$ and then is numerically equal to $\rho_0 = 2.6068 \times 10^{-27}\ \mathrm{kg/m^3}$. Consequently the present value of the mass of visible Universe is $M_0 = \rho_0 (4\pi/3) R_0^3 = 2.435 \times 10^{53}\ \mathrm{kg}$. Note that $M_0$ has the same order of magnitude as $M_B$ (black-hole mass) and contains about $10^{80}$ protons, in agreement with the most reliable estimates in the literature. The following meaningful relations come into evidence:

$\frac{R_0}{R_f} = \frac{2d_0}{\sqrt{1 - 2d_0}} = 0.6424\,; \qquad \frac{M_0}{M_f} = 2d_0\left(\frac{R_0}{R_f}\right)^3 = 0.1262 = \frac{1}{7.925}$
They put into dramatic evidence the problem of missing mass and simultaneously point out that the mass deficit disappears at $t = t_f$, since then $d \to d_f = 0.5$.
Remark 1. The Hubble time $\tau_H$ defined by equation (1) is proportional, via the cosmological factor $1/\alpha_{mq}$, to the travel time of light when light crosses the hydrogen atom. In fact the radius $\alpha^{4/3} r_B$ is but the average distance of the electron from the proton in hydrogen, as pointed out by the relation
$\alpha\, m_0 c^2 \left(\frac{\hbar}{m_0 c}\right)^3 = m_0 c^2 \left(\alpha^{4/3} r_B\right)^3.$

3. Equations of future expansion
The expansion constant $K_0$ can be computed from the equation

$K_0 = \left(\frac{d_0}{\alpha^{1/3}}\right)^2 c^2 \approx 1.4579\, c^2$  (expansion constant)  (8)
which derives from the Weinberg relation $K_0 = (R_0 H_0)^2 (1 - 2d_0)$ and from our equation (see the preceding analysis)

$R_0 H_0 = \frac{\alpha r_B}{\alpha_{mq}}\,\frac{d_0}{\sqrt{1 - 2d_0}} \cdot \frac{c\,\alpha_{mq}}{\alpha^{4/3} r_B} = \frac{d_0}{\sqrt{1 - 2d_0}}\,\frac{c}{\alpha^{1/3}}$

On the other hand it must be [5]

$\frac{\dot R^2(t_0)}{c^2} = \frac{K_0}{c^2} - \varphi_{B0} - \left(1 - \frac{\dot R^2(t_0)}{c^2}\right)\varphi_0$

with

$\varphi_{B0} = \frac{G M_B}{R_0 c^2} \approx 0.7782\,; \qquad \varphi_0 = \frac{G M_0}{R_0 c^2} \approx 0.6423$

Consequently

$\frac{\dot R^2(t_0)}{c^2} = \frac{1}{1 - \varphi_0}\left(\frac{K_0}{c^2} - \varphi_0 - \varphi_{B0}\right) \approx 0.1043$  (9)

$\dot R(t_0) = \sqrt{0.1043}\; c = 0.323\, c = 9.682 \times 10^7\ \mathrm{m/s}$
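Equations (8)-(9) chain together numerically; a sketch with the quoted values of $\varphi_{B0}$ and $\varphi_0$ (taken as given, not recomputed) recovers the present expansion speed:

```python
import math

c     = 2.99792458e8
d0    = 0.2342
alpha = 7.297353080e-3

K0_over_c2 = (d0 / alpha**(1/3))**2        # eq. (8), in units of c^2
phi_B0     = 0.7782                        # G M_B / (R0 c^2)
phi_0      = 0.6423                        # G M_0 / (R0 c^2)
Rdot2      = (K0_over_c2 - phi_0 - phi_B0) / (1 - phi_0)   # eq. (9)
Rdot0      = math.sqrt(Rdot2) * c          # present expansion speed, m/s
```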
This gives the present-time expansion speed. In addition we take into account the rotation (around the central black-hole). The corresponding equations can be derived as follows: the mass $M$ of visible Universe satisfies the differential equation

$\gamma_{rot} M \ddot R = -\frac{G M M_B}{R^2} + \frac{\gamma_{rot} M\, \nu_{rot}^2}{R}$

which leads to

$-\frac{\ddot R R}{c^2} = \frac{1}{\gamma_{rot}}\,\frac{G M_B}{R c^2} - \frac{\nu_{rot}^2}{c^2}$

i.e.

$d\,\frac{\dot R^2}{c^2} = \frac{\varphi_B}{\gamma_{rot}} - \frac{\nu_{rot}^2}{c^2}\,, \qquad \text{with } \varphi_B = \frac{G M_B}{R c^2}$  (10)
With our values we get $0.2342 \times 0.1043 = (1/\gamma_{rot})\, 0.7782 - \nu_{rot}^2/c^2$, wherefrom $\gamma_{rot} \approx 1.438$, $\nu_{rot}/c \approx 0.7186$. Equation (10) allows us to determine in closed form the final rotation speed. In fact, as $\dot R(t_f) = 0$ and $\varphi_B(t_f) = G M_B/(R_f c^2) = 1/2$, then

$\frac{\gamma_{rot}(t_f)\,\nu_{rot}^2(t_f)}{c^2} = \frac{1}{2} \;\to\; \gamma_{rot}(t_f) = \frac{1 + \sqrt{17}}{4} \;\to\; \frac{\nu_{rot}(t_f)}{c} = 0.6249.$
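The final rotation state follows from $\gamma \nu^2/c^2 = 1/2$ together with the special-relativistic relation $\gamma = 1/\sqrt{1 - \nu^2/c^2}$; a short check (values from the text):

```python
import math

# present-time rotation: the text's fitted relativistic coefficient
gamma_rot = 1.438
beta_rot  = math.sqrt(1 - 1/gamma_rot**2)   # nu_rot / c

# final state: gamma * beta^2 = 1/2  =>  gamma = (1 + sqrt(17)) / 4
gamma_f = (1 + math.sqrt(17)) / 4
beta_f  = math.sqrt(1 - 1/gamma_f**2)
```

The closed form for $\gamma_{rot}(t_f)$ is just the positive root of $\gamma - 1/\gamma = 1/2$.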
A further important step completes the definition of the initial state. Using the equation

$\ddot R(t_0) = -d(t_0)\,\frac{\dot R^2(t_0)}{R(t_0)} = -7.7128 \times 10^{-12}\ \mathrm{m/s^2}$

we determine the initial acceleration. On the other hand we know the final values of radius, expansion speed, expansion acceleration and de-acceleration parameter. Therefore, by a convenient polynomial interpolation we can "reconstruct" the future transients. We use the following

$1 - \frac{R(t)}{R_f} = b_1\left(\frac{t}{t_f}\right)^{1/3} + b_2\left(\frac{t}{t_f}\right)^{2/3} + b_3\,\frac{t}{t_f} + b_4\left(\frac{t}{t_f}\right)^{4/3} + b_5\left(\frac{t}{t_f}\right)^{5/3} + b_6\left(\frac{t}{t_f}\right)^2$

where the $b_i$'s are convenient numerical coefficients chosen so that the boundary conditions (at $t = t_0$ and at $t = t_f$) remain satisfied. An additional condition directly involves the de-acceleration parameter $d(t)$ and is derived from eq. (6):

$\sqrt{1 - 2d(t)} = \frac{1}{2b_0}\left(1 - \frac{R(t)}{R_f}\right).$
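Equations (6) and (7) must give the same $R_0/R_f$ at $t = t_0$, and the additional condition above is simply the inversion of (6); both can be checked in a few lines (values of $d_0$ and $b_0$ from the text):

```python
import math

d0, b0 = 0.2342, 0.2452
s = math.sqrt(1 - 2*d0)

R0_over_Rf_a = 1 - 2*b0*s   # eq. (6) evaluated at t = t0
R0_over_Rf_b = 2*d0 / s     # from eq. (7), since R_f = alpha r_B / (2 alpha_mq)

# the additional condition recovers sqrt(1 - 2 d(t)) from the radius ratio
recovered_s = (1 - R0_over_Rf_a) / (2*b0)
```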
4. Energy and missing mass
The coupling Potential energy is defined by

$E_p(t) = -\frac{G M_B M(t)}{R(t)}.$

It is

$\frac{E_p(t_0)}{c^2} = -\frac{G M_B M_0}{R_0 c^2}\,; \qquad \frac{E_p(t_f)}{c^2} = -\frac{G M_B M_f}{R_f c^2} = -\frac{M_f}{2}$  (11)

$\frac{G M_B}{R_0 c^2} = \frac{G M_B}{R_f c^2}\,\frac{R_f}{R_0} = \frac{1}{2}\,\frac{R_f}{R_0} = 0.7749$  (12)
The Universe total energy is constant and equal to the present value $E(t_0)$:

$E(t_0) = M_B c^2 + \gamma_{rot}(t_0) M_0 c^2 + \bar E_p(t_0) \approx 3.878 \times 10^{70}\ \mathrm{J}$

$\bar E_p(t_0) = \sqrt{1 - \frac{\dot R^2(t_0)}{c^2}}\; E_p(t_0) = -\sqrt{1 - \frac{\dot R^2(t_0)}{c^2}}\;\frac{G M_0 M_B}{R_0} \approx -1.77 \times 10^{70}\ \mathrm{J}$

The experimental galactic density $\rho_G$ is bonded to the critical density $\rho_c$ by the numerical relationship (Weinberg) $\rho_G \approx 0.028\, \rho_c$. In the light of our formulation we give the following theoretical explanation. At present time

$\frac{\rho_G(t_0)}{\rho_0} = 1 + \frac{\bar E_p(t_0)}{M_0 c^2} - \frac{5}{8}(d_f - d_0) = 1 - 0.7749 - 0.1656 = 0.05947$

with $d_f = 0.5$:

$\frac{\rho_G}{\rho_c} = 2d_0\,\frac{\rho_G}{\rho_0} = 2 \times 0.2349 \times 0.05947 = 0.02794.$  (13)

Hence $\bar E_p(t_0)$ represents the dark energy, while $(5/8)(d_f - d_0)$ represents the dark matter. The first accounts for 77.49% of the present density, the second for 16.56%; finally, the visible density amounts to 5.94%. Also, with reference to the critical density (rather than to the effective density), it is

$\frac{\rho_G(t_0)}{\rho_c} = 0.02794 = 1 - 0.7749 - 0.1972$
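The density balance of this section can be retraced numerically (variable names are ours; the percentages are those of the text):

```python
d0, df = 0.2349, 0.5

dark_energy = 0.7749                 # -Ebar_p(t0) / (M0 c^2), cfr. eq. (12)
dark_matter = (5/8) * (df - d0)      # hidden term of the density balance
visible     = 1 - dark_energy - dark_matter

rho_G_over_rho_c = 2 * d0 * visible  # eq. (13)
```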
We assume that at any future time the visible galactic density is determined by

$\frac{\rho_G(t)}{\rho_0(t)} = 1 + \frac{\bar E_p(t)}{M(t)\, c^2(t)} - \frac{5}{8}\left(d_f - d(t)\right)$

In particular

$\frac{\rho_G(t_f)}{\rho(t_f)} = \frac{\rho_G(t_f)}{\rho_c} = 1 + \frac{\bar E_p(t_f)}{M_f c_f^2} + 0 = 50\%$

which means that the final galactic mass is half the total universe mass.
5. Duration of expansion
The Universe total energy is constant (by assumption) while the Potential energy and the kinetic energy vary with time. Consequently the speed of light must vary and, indeed, as shown in [1] this is a general feature of Universe evolution from time zero. In parallel the gravitational constant varies as well, so that

$G(t)/c^2(t) = G/c^2 = \mathrm{const}.$

It must be $E(t) = \mathrm{const} = E(t_0) = 3.878 \times 10^{70}\ \mathrm{J}$ with

$E(t) = M_B c^2(t) + \gamma_{rot}(t) M(t) c^2(t) + \bar E_p(t)$

$\bar E_p(t) = \sqrt{1 - \frac{\dot R^2(t)}{c^2}}\; E_p(t) = -\sqrt{1 - \frac{\dot R^2(t)}{c^2}}\;\frac{G(t) M(t) M_B}{R(t)}$

In particular

$E(t_f) = (M_f + M_B)\, c_f^2 - \frac{G_f M_f M_B}{R_f}$

with $c_f$ and $G_f$ the "final values" of the speed of light in empty space and of the gravitational constant. As

$\frac{G_f M_f M_B}{R_f} = \frac{G M_B}{R_f c^2}\, M_f c_f^2 = \frac{1}{2}\, M_f c_f^2$

it follows from the above equations that

$E(t_f) = \left(\frac{M_f}{2} + M_B\right) c_f^2 \approx \left(\frac{19.60788}{2} + 2.95076\right) \times 10^{53}\, c_f^2 \cong (12.75 \times 10^{53}\ \mathrm{kg})\, c_f^2$
Figure 1. Universe image from WMAP (left). Two-body dynamic structure of Universe (right).
and, in the final analysis, it must be

$15.5 \times 10^{53}\ \mathrm{kg} \times c_f^2 = E(t_0) \approx 3.878 \times 10^{70}\ \mathrm{J}$

$c_f^2 = 2.507 \times 10^{16} = 0.279\, c^2 \;\to\; c_f = 0.5275\, c$

The total expansion time $t_f$ is given by $t_f = \tau_H (2/\gamma_r)(c/c_f) \approx 42.9$ billion years. Such a formula can be derived from (4) by replacing $d_0$ with $d_f$ ($d_f = 0.5$) and $t_0$ with $t_f$, and by introducing the factor $(c/c_f)$. Note that this corresponds to replacing $c$ with $c_f$ in formula (1) (which gives the present value of the Hubble constant). In other words the Hubble constant too varies with time and its final value $H_f$ is smaller than the present value $H_0$. Another point is worth mentioning: even in the final state the visible mass $M_f$ is a fraction (one half) of the existing one, since the remaining part is submerged by the residual dark energy owing to the central black-hole. Hence we can define an "effective de-acceleration factor"

$d_{eff} = d_f\,\frac{2 M_f}{M_f} = 0.5 \times 2 = 1.$

If we further added the black-hole mass we would find

$d_{eff} = d_f\,\frac{2 M_f + M_B}{M_f} = d_f\left(2 + \frac{M_B}{M_f}\right) = 1.075$

in substantial agreement with noted experimental measurements.
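The final energy balance can be retraced as follows (names are ours; note that with these rounded inputs the formula yields $t_f \approx 42$ billion years, close to the quoted 42.9):

```python
E0      = 3.878e70           # J, conserved total energy
c       = 2.99792458e8
M_eff   = 15.5e53            # kg, effective inertial content at t_f per the text
tau_H   = 5.669546974e17     # s
GYR     = 3.1557e16          # seconds per billion (Julian) years
gamma_r = (1 + 5**0.5) / 2   # golden ratio

cf2       = E0 / M_eff               # final value of c^2
cf_over_c = (cf2 / c**2) ** 0.5      # ~0.5275
tf_gyr    = (tau_H / GYR) * (2 / gamma_r) / cf_over_c
```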
6. Variation of the speed of light
The following equation gives the time evolution of the speed of light in empty space:

$c^2(t) = \frac{E(t_0) - \bar E_p(t)}{M_B + M(t)\left(1 - 0.5\,\dfrac{\dot R^2(t)}{c^2(t)}\right)}$  (20)

$\bar E_p(t) = -\sqrt{1 - \frac{\dot R^2(t)}{c^2(t)}}\;\frac{G\, M(t) M_B}{R(t)}\,; \qquad E(t_0) = 3.878 \times 10^{70}\ \mathrm{J}$

The derivation of (20) from the preceding equations is conceptually complex and passes through the computation of the ratio $\dot R^2(t)/c^2(t)$ in an autonomous way.
7. Temperature
The following formula proves to be effective in the computation of temperature at the various stages of evolution:

$\Theta(t) = \left(\frac{9}{25}\,\frac{1}{N_A}\right)^{4/3}\frac{M_0 c^2}{R_{gas}}\,\frac{r_B}{R(t)}$  (21)

where $N_A$ is the Avogadro number, $R_{gas}$ the gas constant 8.31451 J/°K, $r_B$ the Bohr radius. With our numerical values it is

$\Theta(t) = \frac{7.12664 \times 10^{26}}{R(t)}\ \mathrm{m\,°K}$  (22)

At time $t \approx 380000$ years, when the matter dominated era began (e.g. see [6], [7]), the radius was

$R(t_m) = 6.327 \times 10^{-4}\, R_0 = 1.7809 \times 10^{23}\ \mathrm{m}$  (23)

hence

$\Theta(t_m) = \frac{7.12664 \times 10^{26}\ \mathrm{m\,°K}}{1.7809 \times 10^{23}\ \mathrm{m}} = 4000\ \mathrm{°K}$  (24)

Such a temperature perfectly agrees with the theoretical value pointed out in [17]. At time $t_s$, when the visible universe came out of the region of influence of the black-hole, the radius was half the final radius $R_f$ (because of physical constraints deriving from stability [6]):

$R(t_s) = R_f/2 = 2.1907 \times 10^{26}\ \mathrm{m}$  (25)

hence
Figure 2. Regions of influence of the black-hole: $R_s$ radius of instability, $R_{ss}$ radius of stabilization.
$\Theta(t_s) = \frac{7.12664 \times 10^{26}\ \mathrm{m\,°K}}{2.1907 \times 10^{26}\ \mathrm{m}} = 3.253\ \mathrm{°K}$  (26)

Such a value is in excellent agreement with the experimental value of the temperature of the fossil radiation. Using again formula (22) we can derive a reliable estimate of the present temperature and of the final temperature at the end of the expansion. We find

$\Theta(t_0) = \frac{7.12664 \times 10^{26}\ \mathrm{m\,°K}}{2.8269 \times 10^{26}\ \mathrm{m}} = 2.521\ \mathrm{°K}$  (27)

$\Theta(t_f) = \Theta(t_s)/2 = 1.6265\ \mathrm{°K}$  (28)
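Formula (21) can be checked against the compact form (22) and the quoted temperatures (the residual of a few percent on the coefficient comes from rounding of $M_0$; variable names are ours):

```python
C = 7.12664e26      # m K, coefficient of eq. (22)

# eq. (21) coefficient rebuilt from the listed constants
M0    = 2.435e53
c     = 2.99792458e8
r_B   = 5.29177249e-11
N_A   = 6.02213607e23
R_gas = 8.31451
C_eq21 = (9 / (25 * N_A)) ** (4/3) * M0 * c**2 * r_B / R_gas

theta_matter = C / 1.7809e23    # eq. (24): start of the matter dominated era
theta_fossil = C / 2.1907e26    # eq. (26): fossil radiation
theta_now    = C / 2.8269e26    # eq. (27): present temperature
```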
Remark 2. We have already seen that $N_A$ equals the number of typical stars contained in the whole Universe. The same number appears in formula (21), and that should not be considered a mere coincidence.
8. Diagrams
We report a selected set of transients referring to the evolution in $(t_0, t_f)$: radius, radial speed, de-acceleration parameter, visible mass, density (real and visible), speed of light in empty space, gravitational constant. From them one can reascend to correlated quantities such as, e.g., the permittivity in empty space (while
Figure 3. Two body dynamic structure of Universe. (a) Normalized radius vs. time; (b) Radial speed vs. time; (c) Visible mass vs. time; (d) Density vs. time; (e) Speed of light vs. time; (f ) Gravitational constant vs. time
the magnetic permeability is kept constant), rotation speed around the central black-hole, temperature.
9. Recapitulatory description of the mathematical structure
Our formulation of the cosmological problem is complex and innovative. We think it useful to frame it into a mathematical scheme.
• First we have presented 6 equations for the computation of $H_0$, $\rho_c$, $\rho_f$, $R_f$, $M_f$, $M_B$ (i.e. the Hubble constant, the critical density, the final density, the final radius, the final mass, the mass of the central black-hole). Then, using 4 more equations, we have shown that from the above quantities we can derive the present values of $t_0$, $\rho_0$, $R_0$, $M_0$ (age, density, radius, mass), provided that we know $d_0$ (i.e. the present value of the de-acceleration parameter).
• Since $d_0$ is not known directly while $t_0$ and $R_0$ are measured quantities, we went back to a double estimate of $d_0$. The estimates agreed and allowed us to compute $\rho_0$, $R_0$, $M_0$.
• A further coherency check has been provided by the temperature formula (the eleventh equation), which establishes a bond between temperature and radius. Such a formula yields the correct value of the Penzias fossil radiation (by the assumption that the corresponding temperature is that at the time when the visible Universe first went out of the region of influence of the black-hole). In addition the same formula allows us to correlate the temperature 4000 °K at the beginning of the "matter dominated era" (cfr. with Weinberg) with a radius $R(t_m) = 1.7809 \times 10^{23}$ m, from which, using a further condition of Weinberg's, we go back to the present radius $R_0 \approx 2.8148 \times 10^{26}$ m, once more in full agreement with experimental findings.
• A thirteenth equation determines $K_0$ (expansion constant) from $d_0$; and a further 14th equation leads from $K_0$ to $\dot R(t_0)$, the present value of the expansion speed. Knowledge of $\dot R(t_0)$ (and of $R_0$, $M_0$, $M_B$) allowed us to determine the Potential energy $\bar E_p(t_0)$ and the total energy $E(t_0) = 3.878 \times 10^{70}$ J. In addition we were able to compute the visible galactic density as a percentage of the effective density. At this stage we had used 18 equations.
• By means of polynomial interpolation we have "reconstructed" the future transients of Universe expansion. The picture includes the variations of the speed of light and of the gravitational constant.
• The typical stellar mass is that identified by a noted formula pointed out by Weinberg, with a very marginal correction to account for relativistic stability.
10. Conclusions Using Stability Theory (ST), SR and a principle of similarity between microcosm and macrocosm we have set forward closed form equations that give a complete picture of Universe present structure and state as well us of future evolution up to final Equilibrium at the end of expansion. We found several striking results. • At the end of expansion Universe will contain a number of stars equal to the Avogadro number. That should be correlated to the property that all in all
How Many Stars are there in Heaven? The Results of a Study of Universe …
the Universe is similar to an ideal gas. Stars are gas bubbles, each of which represents a particle.
• The Universe has a two-body structure similar to that of the hydrogen atom and, in particular, its final radius is univocally determined by a quantization rule which is the correspondent of the well-known Bohr rule. The final radius is about 40 billion light-years. While in hydrogen the coupling energy is electric, in the Universe the coupling energy is gravitational.
• The scale factor is equal to about 10³⁹ and determines both the radius and the Hubble constant (which amounts to about 54 km/s per Megaparsec). The Hubble constant however varies with time and will become smaller in the future.
• From the measured value of the age, which is about 13.7 billion years, we have determined (via a closed-form equation) the present value of the de-acceleration parameter, d0 = 0.2349. The final value at the end of expansion is df = 0.5.
• The present value of the expansion speed is 0.968×10⁸ m/s; the final value is zero. The present value of the rotation speed is 2.154×10⁸ m/s and the corresponding relativistic coefficient is γrot = 1.434. The final value is γrot = 1.2807 (emigolden value).
• Two basic quantities keep constant in the Universe’s evolution: the total energy E0 = 3.878×10⁷⁰ J and the ratio G(t)/c²(t) = G/c². As shown in [5], such properties are “verified” from time zero (big bang). However, both the speed of light and the gravitational constant vary with time. The final value of the speed of light is 0.5275 times the present value c.
• The time tf, i.e. the total duration of expansion from time zero, is equal to 42.9 billion years. This means that within 29.2 billion years the Universe will reach its dynamical Equilibrium. The formula that gives tf is the same that gives t0, provided that d0 is replaced with df and c with cf.
• Dark energy is determined by the gravitational coupling between the visible Universe and the central black-hole. In addition we put into evidence a hidden energy linked with the variation of the de-acceleration parameter.
As energy is equivalent to mass, both contribute to the reduction of the visible mass with respect to the effective mass and to the reduction of the visible galactic density. Dark energy “eats” 16.5% of the density; hidden energy “eats” 77.5%. All in all, the visible density is 5.9% of the effective density (and 2.79% of the critical density), in agreement with well-known experiments. (Note that the present density is smaller than the critical density.) The ratio between the final mass Mf and the present mass M0 is however smaller, and equal to about 7.92.
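As a quick arithmetic cross-check of the figures quoted in this section (using the values exactly as printed; this check is ours, not part of the original derivation):

```python
# Cross-check of the quoted figures (values as printed in the text).
t0 = 13.7    # present age, billion years
tf = 42.9    # total duration of expansion, billion years
assert round(tf - t0, 1) == 29.2          # time left to dynamical Equilibrium

# Density bookkeeping: dark energy 16.5%, hidden energy 77.5%, visible 5.9%
# of the effective density; the three shares close to 100% up to the
# rounding of the printed percentages.
assert abs((16.5 + 77.5 + 5.9) - 100.0) <= 0.2
```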
U. Di Caprio

• In the final Equilibrium, at time tf, a residual dark energy persists (while the hidden energy becomes zero), owing to the residual gravitational coupling between the rotating Universe and the black-hole. This residual dark energy is equivalent to a mass Mf/2. Consequently the total Universe mass is (Mf + MB) and the effective value of the de-acceleration parameter is about 1.075, in agreement with cosmic observations.
The above results have both practical and conceptual relevance. They revolutionize standard cosmological models and show the effectiveness of the theory of Relativistic Stability [6], furthermore framing GR and SR in an entirely new context.
References
1. G. Arcidiacono, Relatività e Cosmologia (Veschi, Roma, 1973).
2. H. Bondi, Cosmology (Cambridge University Press, Cambridge, UK, 1961).
3. U. Di Caprio, Supplement to Hadronic J. 16(1), 163-182 (2001).
4. U. Di Caprio, Hadronic J. 23, 689 (2000).
5. U. Di Caprio, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa and M. Abram, (Springer, New York, 2006).
6. U. Di Caprio, Relativistic Stability, AIRS Congress, Castel Ivano, Trento (2007).
7. R.H. Dicke, in Relativity, Groups and Topology, C. De Witt and B. De Witt, Eds., (Gordon and Breach, New York, 1964).
8. A.D. Dolgov and Ya. B. Zeldovich, Rev. of Modern Physics 53, 1-41 (1981).
9. G.C. Macvittie, General Relativity and Cosmology (Chapman & Hall, London, UK, 1965).
10. E.A. Milne, Relativity, Gravitation and World Structure (Clarendon Press, Oxford, UK, 1935).
11. J.D. North, The Measure of the Universe (Oxford University Press, Oxford, UK, 1952).
12. H.P. Robertson and T.W. Noonan, Relativity and Cosmology (Saunders, Philadelphia, 1968).
13. M.P. Ryan and L.C. Shepley, Homogeneous Relativistic Cosmologies (Princeton University Press, Princeton, NJ, 1975).
14. D.W. Sciama, Modern Cosmology (Cambridge University Press, Cambridge, UK, 1971).
15. D.N. Schramm and G. Steigman, Scientific American, 6 (1988).
16. Universe Today; http://www.universetoday.com.
17. S. Weinberg, Gravitation and Cosmology (John Wiley & Sons, New York, 1972).
DESCRIPTION OF A COMPLEX SYSTEM THROUGH RECURSIVE FUNCTIONS
GUIDO MASSA FINOLI
Vicolo Arno, 2 - Silvi Marina (Teramo)
E-mail: [email protected]

Starting from a sufficiently shared definition of “complex system”, we describe the hierarchies of levels and the emergent phenomena within a system by resorting to the concepts of measure and measure invariance. Through recursive functions, we introduce a mathematical representation allowing us to show how symmetries, hierarchical structures and emergent properties take place.

Keywords: complex systems, hierarchical levels, measure, measure invariance, recursive functions, symmetries.
1. Introduction

The definition of complex system is still the subject of a number of discussions; anyway, it is possible to base it on the following main features:
1. A complex system can be defined in terms of its elements and of the interactions between these elements and with the external environment but, except for simple (linear and classical) systems, it is never completely separable from the latter. We could say that the relationship with its environment is a crucial feature of its complexity.
2. This relationship cannot be expressed in terms of, or reduced to, linear functions; it emphasizes one of the characterizing aspects of these systems, that is their contextuality, i.e. the complete inseparability from their environment. This contextuality is also linked to the measure operation which, unlike in classical systems, can here influence the system itself [2,3,5,20,21].
3. The relationships with the environment include both relationships among elements of the same level and relationships with the elements of lower and higher levels [6,7]. Therefore, if we denote by ei the i-th element of the K-th hierarchical level of a given system and by ℜ the relation it has with its environment Ai, the system could be described by an expression like:
ei ℜ(Ai), for all i    (1)

4. The latter formally defines a system as an aggregate of elements, taking into account that ℜ includes both relationships with other elements of the K-th level and relationships with the (K−1)-th and (K+1)-th levels [18,19]. If (1) behaves like a single system, this means it assumes some properties which cannot be attributed to those of the components, and these emergent properties can be recognized within a reference system which is not that of the components [14,24].
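As a purely illustrative sketch (the class names and structure below are our own, not the author’s), expression (1) can be read as an aggregate of elements, each carrying the relation with its environment Ai:

```python
# Hypothetical sketch of expression (1): a system as an aggregate of
# elements e_i, each paired with its environment A_i (peers at level K
# plus elements of levels K-1 and K+1).
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    level: int                                         # hierarchical level K
    environment: list = field(default_factory=list)    # A_i

def system(elements):
    """The aggregate {e_i R(A_i)}, here simply the element/environment pairs."""
    return [(e.name, [a.name for a in e.environment]) for e in elements]

e1 = Element("e1", level=1)
e2 = Element("e2", level=1, environment=[e1])
e3 = Element("e3", level=2, environment=[e1, e2])      # element of level K+1
S = system([e1, e2, e3])
```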
A complex system generally is a system which presents a level leap and entails emergent properties. But there can be both systems with emergent properties without level leaps and systems with level leaps without emergent properties.

2. Measure systems and hierarchical levels

One of the most important aspects of contemporary physics is the theory of measure, and the comprehension of emergent phenomena goes through this theory, even if redefined according to the necessary specificities of complexity theory [1,12,17]. We begin with a definition of reference system which is not connected with the spatio-temporal dimensions. In general, a measure is the recursive application of a unit value µ, and this application defines a group structure. This involves a series of consequences: the first is that the lowest possible measure always exists, and it is the generator of the considered group. The second is that all obtainable measures are multiples of µ. The third is that a measure is such only if its recursion ends: a measure that doesn’t end isn’t a measure. The existence of a lowest possible value of the measure is an aspect of great importance, connected with the concepts of indetermination and computation. We will use the definition according to which a measure is equivalent to a Turing machine that stops, so a measure is a computable entity with a certain halt [24]. Starting from this presupposition, once a measure system is given, there will always exist a non-measurable aspect related to all the infinitely many values of the system variables smaller than µ; we deal here with a continuous and uncountable infinity of such values. The consequence is that for them there is no possibility of establishing in a deterministic way the evolution of the system on the basis of the initial measures [25,26,27].
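The idea of a measure as a halting recursion of the unit µ can be sketched as follows (a toy illustration of the definition, not code from the text):

```python
# Toy sketch: a measure as the recursive application of the unit mu.
# The recursion is guaranteed to stop (the "Turing machine with a sure
# halt"); the residue below mu is the non-measurable remainder.
def measure(value, mu):
    """Return (n, residue) with value = n*mu + residue and 0 <= residue < mu."""
    n = 0
    while value >= mu:
        value -= mu
        n += 1
    return n, value

n, residue = measure(7.3, 2.0)   # three whole units of mu fit into 7.3
```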
But we emphasize another aspect: a system whose variables have values obtained as multiples of the lowest generator value implies an infinite series of possible values-elements that we can identify with the elements of a level. For
this level, any employed measure multiple of µ (nµ = k) generates Zk congruence classes. So all the possible measure systems generable with nµ are invariant among them; this means that any combination of the elements of a level can always be expressed through a linear combination of a basic element; therefore the use of a measure system for the elements of a level doesn’t generate any change in the structure of that level’s elements. Our definition of hierarchical level includes the concept of measure invariance. So we say that the elements e1, …, en belong to the K-th level if their measures are invariant with respect to the measure systems µ1, …, µk, where such measure systems are all multiples of a generator system µ. Invariance means that all the elements of a measure system can be expressed as linear combinations of elements of another system, until all can be expressed as multiples of only one basic element. So a level leap, i.e. the passage from the elements of the K-th level to the elements of the (K+1)-th level, occurs when measure invariance is not preserved, i.e. when the elements of the (K+1)-th level cannot be expressed as linear combinations of the elements of the K-th level. The relationship between the basic element of the K-th level and that of the (K+1)-th level cannot be expressed through a rational value: it is expressed through a transcendental value. In the end, the passage from the K-th level to the (K+1)-th level cannot be reduced to a finite computation, i.e. the system as a whole will never be reducible in terms of its components [21].
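The invariance criterion, and its failure at a level leap, can be illustrated by a toy check (our own construction; the text gives no algorithm, and π below merely stands in for the transcendental ratio between levels):

```python
# Toy check of measure invariance: elements of one level are integer
# multiples of the generator mu, hence linear combinations of one basic
# element. An element whose ratio to mu is transcendental (here pi, as a
# stand-in) breaks the invariance, signalling a level leap.
import math

def is_multiple(x, mu, tol=1e-9):
    """True if x is (numerically) an integer multiple of mu."""
    ratio = x / mu
    return abs(ratio - round(ratio)) < tol

mu = 0.5
same_level = is_multiple(3.5, mu)          # True: 3.5 = 7*mu, same level
level_leap = not is_multiple(math.pi * mu, mu)  # True: transcendental ratio
```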
This is the point of view adopted by the “dynamical physics” framework proposed by researchers of the Flinders University (FU) [15,16]. It is summarized in the following equation defining the K-th level of a system:
Bij → Bij − α(B − B⁻¹)ij + ωij

where the Bij are nodes which are generated in conformity with the quoted relation. But this proposal entails a number of issues that make it unsatisfying.
1. The impossibility of making the complex relationship with the environment explicit is reduced to the introduction of a more or less stochastic term, denoted as ωij, which should interfere with the regular development of the system. This is unsatisfying because in this way the relationship ℜ with Ai is viewed as an interference on the system, while ℜ is the constitutive reason of the system itself. A living organism is such because there is a given relationship with the environment, and this relationship is an integral part of the organism itself.
2. The second shortcoming arises from the fact that this approach considers a hierarchical level on a par with a whole system, with the same equation, but enclosed in a node B, and so on for the other nodes. This kind of solution defines the hierarchies of levels through an almost trivial relationship (a simple matrioska) and moreover raises some unsolved issues. If in a node there is an entire system, the problem is where ω takes its origin for that structure, and why the structure present in a node cannot interfere with that of another node: their relationship only occurs at the node level, i.e. at the higher level, and not among the structures of the constituent level. The idea of Leibniz’s “monad” appears unsuitable because in this way the complexity of the levels seems to be the same, in a sort of specular reflection.
3. One of the greater anomalies of the FU proposal is that it doesn’t produce evident and stable symmetries, while the atomic physics of the second half of the XX century emphasizes that the presence and the study of symmetries are the key to understanding such complex and articulated phenomena. The FU approach, although starting from that of QED and QCD, in reality later loses every conservative and symmetric aspect of these theories, because it introduces the ω factor which doesn’t allow any kind of stable symmetry.
4. Another limitation is the assumption that there can be only one general formula to express the intricate jungle of complex phenomena. The specific law of development, which is particular to a given phenomenon and to classes of similar phenomena, cannot be confused with the general conditions (or laws) that are respected by all phenomena.
After all there can be many laws and patterns for complex systems even if all must respect some general conditions that make them so.
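For concreteness, the FU-style update discussed above can be simulated numerically; the sketch below iterates B → B − α(B − B⁻¹) + ω on a 2×2 matrix, read elementwise (this reading of the printed formula, as well as α, the matrix size and the noise scale, are our own assumptions):

```python
# Numerical sketch of the FU-style update B -> B - alpha*(B - B^-1) + omega
# on a 2x2 matrix, with omega a small Gaussian noise term. Purely
# illustrative: all parameter choices here are arbitrary.
import random

ALPHA, NOISE = 0.1, 0.01

def inv2(B):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def fu_step(B, rng):
    """One update B_ij -> B_ij - ALPHA*(B - B^-1)_ij + omega_ij."""
    Binv = inv2(B)
    return [[B[i][j] - ALPHA * (B[i][j] - Binv[i][j]) + rng.gauss(0, NOISE)
             for j in range(2)] for i in range(2)]

rng = random.Random(0)
B = [[1.0, 0.0], [0.0, 1.0]]   # start from the identity
for _ in range(10):
    B = fu_step(B, rng)        # omega keeps perturbing B at every step
```

Note how the noise term ω prevents the iteration from ever settling into an exactly symmetric configuration, which is precisely the criticism raised in point 3 above.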
In a complex phenomenon the relationship ℜ of contextuality cannot be separated from the relationships among the system elements; moreover, there is no possibility of separating the measures carried out later on. The system is an interconnected whole, and therefore it holds in some way the memory of its previous states. In this regard, recursive equations are one of the types of equations that allow one to track the variations of a relationship on the measured element. Moreover, they are introduced as interconnected in their development.
For example, if W0 is an initial vector in the vector space V over the complex numbers, the recursive application of a map ϕ from V to V gives:
ϕ(Wn−1) = Wn,  from which  Wn = ϕⁿ(W0).
Moreover, recursive equations have a property which can be the right representation of a contextuality factor. In a previous essay we emphasized that the deep aspect revealed by quantum physics is that in the physical world phenomena show themselves in states that are intimately superposed, and this superposition is at the origin of the relationship between measure and measured; in order to explain it we make a purely theoretical example [19]. In this regard, let us suppose that we deal with a world in which there is a minimum possible value for the determination of a suitable physical quantity. This occurs because, owing to the uncertainty principle, values smaller than this minimum are not accessible to a measure. In this way we deal with a sort of quantized universe, like the one postulated by Fredkin [9,10,11]. Let us denote this minimum value by Bk. If we denote the action of measure by r(Bk), then it consists, for every value of the physical quantity under consideration, in finding how many times Bk is contained within this value. In turn, this implies a recursive process of application of Bk, and this recursion entails a sort of superposition of the different states reached in the single steps of the measure process. This suggests that only a recursive function could allow us to grasp those aspects of superposition that complex systems have in their intimate structure and that show themselves in the patterns of contextuality.

4. A recursive pattern for complex systems

We start from the hypothesis that a relation ℜ between a system and its environment can occur and that this relation can represent the whole system. Moreover, let us suppose that it remains invariant during the system’s evolution, being representative of its intrinsic nature.
On the other hand, when we take into consideration a measure operation acting on the system, we must recognize that, owing to the previous arguments, it must be characterized by a recursive process based on a minimum accessible value. This does not mean that the real number resulting from this process exhausts the
possibilities with which the system is endowed. Namely, the non-measurable part, i.e. the one beyond the minimum accessible value, can have an influence on the system dynamics, as occurs in quantum mechanics. In order to take this circumstance into account, we represent all the measures related to the system by introducing a vector of measures having complex components, in such a way that each real part represents the actual measured value, while the imaginary part represents the influence described before. In general terms we can therefore write for the measure vector W:

W = M + ig
where M is the real part and g the coefficient of the imaginary part. The latter can also be interpreted as a sort of noise acting on the real measure. Now we can introduce the concepts of local and global symmetry in the following way. Given a system within a spatio-temporal context (i.e. defined by 3 spatial coordinates and 1 temporal coordinate), we speak of a global symmetry when the conditions defining the system are the same at each temporal instant. We instead speak of a local symmetry when the conditions are preserved over a cyclical interval of time, i.e. the conditions present in the interval ∆t recur in each interval n∆t (n integer). However, inside the interval ∆t we can have an asymmetry that spreads following a given law. This circumstance is particularly important and allows us to distinguish between systems with immediate interaction and systems with time-linked interaction [1,12,17]. If we consider the system with reference to relation (1), when a global symmetry is associated to ℜ, the latter is the same for each part of Ai. A local symmetry is a symmetry among the Ai-fields; inside this sphere there is an asymmetry, which is equivalent to a propagation of a local interaction related to the considered sphere [8].

5. An elementary recursive function

In order to illustrate the foregoing concepts we start with a simple example in which we have an initial complex vector W1 given by [19]:
W1 = M1 + ig1.

Let us now consider a set of relations ℜ1, …, ℜk endowed with the structure of a cyclical group. The composition Ψ of all the relations belonging to the set, when acting on the initial vector according to the rules of multiplication between complex numbers, generates in turn a cyclical group whose elements are given
from the successive transforms of this vector. This leads to the individuation of an invariant operator associated to the cycle of this group, once we focus our attention on the system defined by a vector V whose elements coincide with the single transforms of the initial vector. Namely, the action of the group consists only in permuting the components of this general system vector. In other words, the cycle of Ψ generates an invariant operator on the system vector V. Now we introduce a new interpretation according to which this cycle defines a hierarchical level of order K. Within it we have a temporal succession of different system states. However, we can conceive this succession as associated to a superposition of these possible states and, when we treat it like a single object, we can say that it defines a new hierarchical level of order K+1. This passage is equivalent to a sort of temporal reduction, where the local symmetry of Ψ turns into a global symmetry in a new time scale t′. In synthesis, in order to have a level transition within a spatio-temporal reference frame, we need the component vectors to be aggregated into a single vector. The latter has a local symmetry in the time t of the level of order K (components), which becomes a global symmetry in the time t′ of the level of order K+1.

6. More complex recursive patterns

A more complex example is given by a nested recursive system constituted by 3 relationships and 2 initial vectors:
W0 = z1 = x1 + iy1 = M1 + ig1
W1 = z2 = x2 + iy2 = M2 + ig2

and C(z1z2) = Conjugate(z1z2),

ℜ1 = (x2 + 1)/C(z1z2) + iy2/(z1z2);
ℜ2 = 1 − (x2 + 1)/C(z1z2) − iy2/(z1z2);
ℜ3 = 1 − (x2 + 1)/C(z1z2) + iy2/(z1z2)

which are applied as follows:
W2 = ℜ1(W0, W1)
W3 = ℜ2(W1, W2) = ℜ2(W1; ℜ1(W0, W1))
W4 = ℜ3(W2, W3) = ℜ3(ℜ1(W0, W1); ℜ2(W1; ℜ1(W0, W1)))
W5 = ℜ1(W3, W4) = ℜ1(ℜ2(W1; ℜ1(W0, W1)); ℜ3(ℜ1(W0, W1); ℜ2(W1; ℜ1(W0, W1))))
etc.
The result is a cyclic recursive space of order 4 composed of 3 vectors, in which we have:

V1 = (Wx, Wx+1, Wx+2)
V2 = (Wx+3, Wx+4, Wx+5) = Ψ1(V1)
V3 = (Wx+6, Wx+7, Wx+8) = Ψ2Ψ1(V1)
V4 = (Wx+9, Wx+10, Wx+11) = Ψ3Ψ2Ψ1(V1)
V5 = Ψ4Ψ3Ψ2Ψ1(V1) = Ψa(V1) = V1
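The nested recursion above can be run numerically. In the sketch below the three relations are taken in the form we reconstructed from the (garbled) printed formulas, and the initial values are our own arbitrary choices, so the numbers it produces are illustrative only:

```python
# Numerical sketch of the nested recursion, with R1, R2, R3 in our
# reconstructed reading of the printed formulas; initial values are
# arbitrary, so the resulting numbers are illustrative only.
def C(z):                      # Conjugate(.) in the text
    return z.conjugate()

def R1(z1, z2):
    return (z2.real + 1) / C(z1 * z2) + 1j * z2.imag / (z1 * z2)

def R2(z1, z2):
    return 1 - (z2.real + 1) / C(z1 * z2) - 1j * z2.imag / (z1 * z2)

def R3(z1, z2):
    return 1 - (z2.real + 1) / C(z1 * z2) + 1j * z2.imag / (z1 * z2)

def iterate(W0, W1, steps=12):
    """W2 = R1(W0, W1), W3 = R2(W1, W2), W4 = R3(W2, W3), W5 = R1(W3, W4), ...:
    the three relations applied cyclically to the two most recent vectors."""
    W, rels = [W0, W1], (R1, R2, R3)
    for n in range(steps):
        W.append(rels[n % 3](W[-2], W[-1]))
    return W

W = iterate(complex(1.0, 0.3), complex(0.7, 0.5))
norms = [abs(w) for w in W]    # N(z_n): the configurational space of the text
```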
The problem is: are the Wi also interchangeable? In this regard we remark that in many systems a change of the starting conditions ensures the interchangeability of the values of W; in this case V = (Wi, Wj, Wk), with i, j, k any value among the 12 values shown before. So the structure is constant in time and the values are interchangeable according to the starting conditions. If we consider only one relation
ℜ = (x2 + 1)/(z1z2) + iy2/(z1z2)
and we restrict ourselves to a configurational space given by the values of N(zn) (the norm of zn), we see that it assumes some configurations which are functions of the values of M1 and M2 and of g1 and g2. The values of N(zn) have a cycle of order 7; these values are expressed by 7 functions, of which 4 are straight lines. For g1 = g2 = 0 the other 3 functions have their starting points in M1, M2 and M1 + M2 and oscillate in a sinusoidal way between 0 and M1 + M2. We can now ask ourselves whether, when we go to a higher-order level, this system shows emergent properties or not. In this regard we define as emergent properties, in the case under consideration, those values which are not included between 0 and M1 + M2. This allows us to emphasize that emergent states are influenced by the values of g and therefore are not referred to M. In the quoted case, since the states directly depend on the conditions of all the starting components, there is a leap of level but there aren’t any emergent properties. On the contrary, if we consider
ℜ′ = −(x2 + 1)/(z1z2) + iy2/(z1z2)
with values g1 = g2 ≠ 0, the functions fi(x) associated to the cycle keep a cyclic behaviour, but at the same time their values don’t oscillate between 0 and M1 + M2. In this case we have the presence of emergent states associated to the level change. To summarize: we have a level change when a local symmetry of a level changes into a global symmetry of a superior level. A level change is associated also to emergent states when the behaviour of the states isn’t a simple combination of the starting conditions of the elements. This generally happens in recursive systems that converge to states not depending on the initial conditions. Or, if there is a dependence on the initial conditions, the system must oscillate among values that go beyond the initial conditions; this is generally obtained when g1 and g2 are different from zero. The contribution given by the values of g1 and g2, i.e. by the virtual measures of the system elements, is evident. Moreover, we can see that if g ≠ 0 the lower levels influence the level determinations in two fundamental ways:
1. All the variations of the lower levels are synchronized so as to preserve the local cyclicity of the components’ level.
2. There are phase displacements in the lower cyclicity that determine a local non-cyclicity.
If there isn’t a local symmetry at level K, a global symmetry at level K+1 cannot take place, and therefore a level passage doesn’t occur. This means the behaviour at level K+1 depends on the particular, unpredictable situations of the lower level. We call such a situation “chaotic”. An example is obtained if we resort to the relation:
ℜ″ = (x2 + 1)/C(z1z2) − iy2/C(z1z2)
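The emergence criterion used in this section — states whose norms fall outside the band spanned by the starting conditions — can be written as a simple predicate (a schematic rendering of the definition, with hypothetical sample values):

```python
# Schematic rendering of the emergence criterion: a state is "emergent"
# when its norm N(z_n) falls outside the band [0, M1 + M2] spanned by the
# starting conditions. The sample norms below are hypothetical.
def emergent_states(norms, M1, M2):
    return [n for n in norms if not (0 <= n <= M1 + M2)]

M1, M2 = 1.0, 0.7
norms = [0.3, 1.5, 1.69, 2.1, 0.9]        # hypothetical cycle of norms
found = emergent_states(norms, M1, M2)    # [2.1] lies outside [0, 1.7]
```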
7. Conclusions

In synthesis, the study of particular recursive functions in the complex field can reproduce interesting and characterizing aspects of complex systems, including those where there can be emergent systems but without level changes (the case of chaotic systems) and local-asymmetry systems which go towards a global symmetry (dissipative systems).
References
1. K. Brading and E. Castellani, Eds., Symmetries in Physics: Philosophical Reflections (Cambridge University Press, Cambridge, 2003).
2. D. Aerts, J. Broekaert and L. Gabora, in Proceedings of Fundamental Approaches to Consciousness, Tokyo '99, Ed. K. Yasue (Benjamins, Amsterdam, 2000).
3. D. Aerts, B. Coecke and S. Smets, in Metadebates on Science, Ed. G. Cornelis, S. Smets and J.P. van Bendegem, (Kluwer Academic, Dordrecht, 1999), pp. 291-302.
4. D. Aerts and S. Aerts, in Quo Vadis Quantum Mechanics? Possible Developments in Quantum Theory in the 21st Century, Ed. A.C. Elitzur, S. Dolev and N. Kolenda, (Springer, New York, 2004), pp. 153-208.
5. S.Y. Auyang, Foundations of complex-system theories (Cambridge University Press, Cambridge, 1998).
6. N.A. Baas, in Alife III, Santa Fe Studies in the Science of Complexity, Proc. Volume XVII, Ed. C.G. Langton, (Addison-Wesley, Redwood City, CA, 1994), pp. 515-537.
7. N.A. Baas and C. Emmeche, Intellectica 25, 67-83 (1997).
8. P.A.M. Dirac, I principi della meccanica quantistica (Bollati Boringhieri, Torino, 2001).
9. E. Fredkin, Physica D 45(1-3), 254-270 (1990).
10. E. Fredkin, in PhysComp '92: Proc. of the Wkshp. on Physics and Computation, Oct. 2-4, 1992, Dallas, Texas, (IEEE Computer Society Press, 1992).
11. E. Fredkin, in Proceedings of the XXVIIth Rencontre de Moriond (Editions Frontieres, Gif-sur-Yvette, France, 1992).
12. T.P. Cheng, Gauge theory of elementary particle physics (Clarendon, Oxford, 1984).
13. J. Goldstein, Physics Today 51, 42-46 (1998).
14. J. Goldstein, Emergence 1(1), 49-72 (1999).
15. K. Kitto, Modelling and generating complex emergent behaviour, PhD Thesis (The Flinders University of South Australia, 2006).
16. C.M. Klinger, Process Physics: Bootstrapping Reality from the Limitations to Logic, PhD Thesis (The Flinders University of South Australia, 2005).
17. Y. Makeenko, Methods of Contemporary Gauge Theory (Cambridge University Press, Cambridge, 2002).
18. G. Massa Finoli, Per una visione unitaria dell’ecosistema. Un modello logico filosofico per la sociobiologia (Edizioni ETS, Pisa, 1987).
19. G. Massa Finoli, Un modello logico filosofico per i sistemi complessi (Editori Riuniti, Roma, 2006).
20. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006).
21. G. Minati, Systemist 28(2), 200-211 (2006).
22. H.H. Pattee, BioSystems 60, 5-21 (2001).
23. R. Penrose, The emperor's new mind (Oxford University Press, Oxford, 1989).
24. E. Pessa, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa, (Kluwer, New York, 2002), pp. 379-382.
25. I. Prigogine, From Being to Becoming (Freeman, San Francisco, 1980).
26. I. Prigogine and I. Stengers, La nuova alleanza. Metamorfosi della scienza (Einaudi, Torino, 1981).
27. I. Prigogine, Le leggi del caos (Laterza, Roma-Bari, 1993).
ISSUES ON CRITICAL INFRASTRUCTURES
MARIO R. ABRAM(1), MARINO SFORNA(2)
(1) Cesi Ricerca S.p.A., Via Rubattino 54, 20134 Milano, Italy
E-mail: [email protected]
(2) Terna S.p.A., Via Arno 64, 00198 Roma, Italy
E-mail: [email protected]

In the last decades, the interactions between the infrastructures of a country gained increasing importance, and consequently people started to become conscious of their mutual interdependencies. This situation became evident in recent years, when a number of large blackouts occurred in the U.S.A. and in Europe and portions of electric power systems collapsed, forcing other infrastructures to collapse as a consequence. The paper gives a brief description of critical infrastructures, recalling some properties such as safety, security, emergency, vulnerability and stability. Then the implications of these characteristics for control actions are briefly evaluated. The need to build a reference model useful for analyzing and simulating the phenomena connected with critical infrastructures is discussed. Investigating these concepts as emergence of properties may be an opportunity; this point of view may be useful to identify and evaluate a systemic approach for dealing with critical infrastructure problems. Some remarks about the complexity of the problems involved in managing the interaction between critical infrastructures are finally reported.

Keywords: infrastructure, critical infrastructure, interaction, control, security, criticality.
1. Introduction

Modern societies developed a large set of systems and methods that enabled mankind to use different forms of energy. Computer technology accelerated the development of instruments for spreading and using information and enhanced the possibility of controlling more complex systems. These new resources changed the way of life, and they were the chance to overcome disease, illness, hunger and ignorance, allowing a part of mankind to reach a more stable welfare and a potentially peaceful coexistence. These results were reached by increasing the availability of energy in its different forms and by developing the ability to convert energy from one form into another, especially into electricity. Supply networks spread accordingly, and the operation of the main energy systems became a strategic activity for all countries. The situation evolved, the need for energy is still growing, but the world periodically confronted itself with very serious energy crises.
Recently, everybody experienced the effects of severe malfunctioning, such as the large blackouts of the electric power system that occurred in the U.S.A. in 2001 and 2003 and in Europe in 2003 (Italy and United Kingdom) and 2006 (Western Europe). So we became conscious of the existence of criticalities in infrastructures that can lead to large and severe malfunctioning. In addition, the pervasiveness of information technology now gives evidence of the mutual interactions between the different networks of public services. Moreover, tragic criminal events focused attention on terrorist actions, with the necessity to identify and address the uncontrolled critical parts of the infrastructures. The diffuse consciousness of the vulnerability of public services focused the experts’ attention on possibly dangerous instabilities. The defense approach that prevailed in recent years addressed the necessity of solving the security problem by eliminating the vulnerabilities of the infrastructures and mitigating the effects of unavoidable damages. In any case, it is recognized that security problems have many aspects, and often considering only one of them may reduce the attention paid to the others. So the terrorism problem grasped the attention of decision makers, often making them forget the intrinsic vulnerability of the systems. These problems involved economic and political communities and became the object of legislative actions. The main examples are the U.S. Presidential Act [17,18] and the proposal for a European Union Directive on the identification and defense of critical infrastructures [3,4,5]. So, only recently did the vulnerability of infrastructures reach the due importance to get the attention of Governments and Institutions. In this paper we will investigate these themes considering different approaches.
Even if the electric power system is our leading reference, we attempt to develop some general subjects that, hopefully, can be useful in investigating the interactions among different infrastructures and how these interactions may amplify the effects of vulnerabilities. In particular, after a definition of critical infrastructure (Section 2), the paper describes some general concepts useful to identify structural and functional properties of the infrastructures from the viewpoint of Customers and Providers (Section 3). Considering the evaluation of the global properties of an infrastructure, we recall the drivers that influence its operating strategies (Section 4). Then, some aspects involved in the interaction of infrastructures are examined (Section 5), together with the related problems connected with the control strategies (Section 6). In addition, the role of the human factor is briefly evaluated in Section 7. The emergence and the systemic approach related with
Issues on Critical Infrastructures
the problem of interacting infrastructures are considered in Section 8. Finally, methodological remarks are discussed in Section 9. 2. About critical infrastructures Usually an infrastructure is considered as the set of all the physical components which need to be built and operated to supply customers with goods and services, e.g. materials, energy, information, communication, etc. When the extension of these services involves a whole country, or a community of countries, the problems of an infrastructure may become problems for the country, or for many countries. Actually, an infrastructure is a complex organization structured in multilevel hierarchies; its components must work and be operated in an integrated way. Examples of infrastructures are the electric power system, the cold chain, fuel supply, communication networks, transportation, social services, health care, military defense, etc. Each infrastructure uses the services and the resources provided by other infrastructures, and this can create the conditions for a mutual dependence: it is enough to think how many services today depend on electricity and telecommunications [1,13,14]. Recently, this dependence increased quickly and evolved into a strong interaction between the infrastructures. The blackout events mentioned above showed how these interactions become critical, and it is now usual to refer to the main infrastructures as critical. The problem is already present at economic and political levels, so Institutions started to consider, evaluate and plan the protection criteria for critical infrastructures, and some reference definitions have been prepared. Critical infrastructures are: “Systems and assets, whether physical or virtual, so vital to the United States that the incapacity or destruction of such systems and assets would have a debilitating impact on security, national economic security, national public health or safety, or any combination of those matters” [17].
Again: “There exists a number of critical infrastructures in the European Union, which if disrupted or destroyed, would affect two or more Member States. It may also happen that failure of a critical infrastructure in one Member State causes effects in another Member State. Such critical infrastructures with a trans-national dimension should be identified and designated as European Critical Infrastructures (ECI). This can only be done through a common procedure concerning ECI identification and the assessment of the need to improve their protection” [4].
3. Modeling infrastructures Each infrastructure, by means of the related Organization, applies and exercises its activity on specific domains. The Organization in charge of a certain infrastructure controls its components. We can attempt to describe this scenario using a simplified model, providing the relationships among the main states in which the infrastructure, or the technical system representing it, may operate. There are three main operating states for the system (Figure 1):
• (S) System in Service. The system is correctly in operation and the values of the state variables are within the assigned range. All its subsystems are in state (S). The status performance indexes (stability, working point, reserve availability) are within a predefined range.
• (O) System Out of Service. The system is not working. The state values are out of the assigned range. All of its subsystems are in state (O).
• (D) System in Degraded Service. The system is in operation, but some of the state values are out of the assigned range. Its subsystems can be in state (D) or (O). The delivered service is degraded and the whole infrastructure can potentially evolve to the out-of-service state (O).
These definitions of operating states give meaning to a global property of the system: they are an attempt to collect into a single variable a synthesis of all the available information about the system. Following the formalism of superimposed automata, this situation is shown in Figure 1(b), where the stable states (S), (D) and (O) are white and the transition states are gray. In general, the possible system control actions are designed with the goal of managing the states of the system according to planned procedures. For example, many levels of control conditions may be present.
By using a global index of system performance, the service state may range from System in Service (S), with the maximum security index (i.e. the service is continuous and reliable under minimum risk conditions), to System Out of Service (O), when no service is provided. Between these two conditions many intermediate levels of service may be present; they are grouped under the definition of System in Degraded Service (D). This simple representation is also applicable to each subsystem of the infrastructure. With reference to Figure 1(b), transitions (3) and (4) are usually due to free evolution of the system, whereas transitions (1) and (2), back to normal operation, are possible only through the application of manual or automatic control actions. The Organization/Company in charge of a certain infrastructure knows the status of the components, by means of the SCADA apparatuses/systems, during
Figure 1. Simplified state diagrams of an infrastructure: (a) status of the system as it is perceived by the user (local perception of the system status by the customer); (b) status of the system as it is known by the operator in charge of the infrastructure (global perception of the system status).
the system operation, and human operators can infer a general knowledge of the infrastructure operating states (S), (D) or (O). The state diagram for a component, a plant, or the infrastructure can be represented by many automata, the transitions of which are driven by the values of a large quantity of process variables and control signals. Moreover, the perception of the service status is different for the service Company and for each Customer. In fact, Customers usually perceive only the two states System in Service (S) and System Out of Service (O), as represented in Figure 1(a). Only particular Customers can perceive a Degraded Service (D) state, and only if their apparatuses (lamps, motors, computers, etc.) work improperly, detecting a certain subset of the degradation. This is a very simplified description, but it shows how the interaction of the Customer with an infrastructure is usually reduced to a very simple ON/OFF condition. Exercising an infrastructure in the states (S), (D) or (O) is usually a management choice that defines the goals and strategies of the Company. So, for example, an infrastructure may be operated with the goal of working between the states (S) and (D), achieving high operating quality standards; with poorer operating standards the infrastructure works between the states (D) and (O). Each Customer experiences that the same kind of service has different quality levels depending on time and location. Those levels can be a Company choice,
but they also depend on the occurrence of external events such as natural or artificial perturbations/contingencies. Ultimately, the Customer has a local perception of the quality of service, whereas the Company is more interested in a global perception. 4. Evaluating infrastructure global properties In describing the states of an infrastructure it may be useful to identify some of its global properties. It is now usual to consider the following terms:
• Risk. Commonly speaking, it is the possibility of an event occurring that will have an impact on the achievement of objectives. More precisely, it is the product of the probability of an event and the numerical evaluation of the damage it causes.
• Safety. It is the condition in which risks are managed to acceptable levels.
• Security. It is the condition of being protected against danger or loss.
• Emergency. It is a situation which poses an immediate risk to health, life, property or environment. Most emergencies require urgent intervention to prevent a worsening of the situation, or to mitigate its consequences.
• Vulnerability. It is a condition of weakness of the system. Vulnerabilities do not compromise the system by themselves, but they can potentially be exploited by a perturbation, letting the system evolve toward undesirable states.
• Stability. It is the condition in which, after a perturbation, the variations of the system state variables remain within a certain range and the state variables fade to stable final values.
The meaning of these concepts is related to the chosen model. Moreover, a model must be adequately defined, because the local meaning may be very different from the global one. The priorities associated with the previous definitions change drastically with the setting of the main goals, or drivers, that inspire the managing strategies of the system. Some reference drivers are:
• Quality of service. Attention focused on Customer satisfaction.
• Economy. Attention focused on cost reduction.
• Remuneration. Attention focused on Revenue and Stockholders.
• Environment. Attention focused on increasing the respect for the Environment.
• Sustainability. Attention focused on sustainable development over time, i.e. the operating activity of the infrastructure is sustainable for the system and for its environment.
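As a purely illustrative aside, the definition of risk given above (the product of the probability of an event and the numerical evaluation of the damage it causes) can be sketched with a toy computation; all events and figures below are invented for illustration and are not taken from any real infrastructure:

```python
# Toy illustration of the risk definition of this section:
#   risk = probability of an event x numerical evaluation of its damage.
# Events, probabilities and damage figures are hypothetical.

scenarios = [
    # (event, probability per year, damage in arbitrary monetary units)
    ("local line fault",    0.20,      50_000),
    ("regional blackout",   0.01,   5_000_000),
    ("nationwide blackout", 0.001, 80_000_000),
]

for event, p, damage in scenarios:
    print(f"{event:20s} risk = {p * damage:10.0f}")

# A rare event with very large damage (the nationwide blackout) can
# carry more risk than a frequent but mild one (the local fault).
total_risk = sum(p * d for _, p, d in scenarios)
print("total expected damage per year:", total_risk)
```

Note how the product form makes risks from heterogeneous events commensurable, which is what allows an Organization to rank them when setting the drivers discussed above.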
The previous considerations suggest that the use of these drivers is a consequence of the existence and application of a background economic and social model, which is the actual reference in setting priorities. 5. Interactions between infrastructures The analysis and solution of the possible problems related to the operation of infrastructures is generally complex, due to the fact that their interactions involve a large number of components. The problem of interaction among infrastructures then becomes a problem of interactions among different components [12,13]. These could be single components, but also macro-constraints, such as the economic, political, social, environmental and energy sectors. Each component, or macro-constraint, influences each infrastructure and the larger set of external components to which they are connected. Consequently, the interactions among infrastructures should be considered from a global point of view. Figure 2 shows a simplified description of the propagation of causal effects in the case in which an infrastructure (b) goes into state (D) or (O). If the degradation is large enough, it can be perceived locally by the Customer (Figure 2a) as a state of Out of Service (O). In addition, the affected infrastructure (Figure 2b) may force another infrastructure (Figure 2c) into states (D) or (O). If the two infrastructures interact, when the system (2c) goes into state (D) or (O) it can feed back on the infrastructure (2b), amplifying its degradation into state (D) or (O) and eventually pushing the customer (2a) into state (O). Figure 3 shows this sequence. So, it is possible to state that an infrastructure is critical if it can force another infrastructure into states (D) or (O).
This condition has already been recognized in the following sentence: “Determining interdependencies and cascading failure modes in critical infrastructures is a complex problem that is exacerbated further by the diverging characteristics of the interconnected infrastructure types. Services in some types of infrastructure such as telecommunications or the electric grid are provided and consumed instantly. Others, notably oil and gas but also other infrastructures built on physical resources, however, exhibit buffering characteristics” [16].
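The forcing mechanism just described can be sketched, purely as an illustration, as two coupled automata using the (S), (D), (O) states of Section 3. The infrastructure names and the one-step degradation rule below are hypothetical simplifications, not part of the paper's analysis:

```python
# Hypothetical sketch of two mutually dependent infrastructures whose
# states are In Service (S), Degraded (D) or Out of Service (O).
# Degradation rule (an assumption): each update step, an infrastructure
# degrades one step toward (O) if a dependency is worse off than (S).

IN_SERVICE, DEGRADED, OUT = "S", "D", "O"

class Infrastructure:
    def __init__(self, name, dependencies=()):
        self.name = name
        self.state = IN_SERVICE
        self.dependencies = list(dependencies)  # infrastructures relied upon

    def update(self):
        """Degrade one step if the worst dependency is degraded or out."""
        worst = max((d.state for d in self.dependencies),
                    key="SDO".index, default=IN_SERVICE)
        if worst == OUT and self.state != OUT:
            self.state = DEGRADED if self.state == IN_SERVICE else OUT
        elif worst == DEGRADED and self.state == IN_SERVICE:
            self.state = DEGRADED

power = Infrastructure("power grid")
telecom = Infrastructure("telecom", dependencies=[power])
power.dependencies.append(telecom)  # mutual dependence, as in Figure 3

power.state = OUT    # a blackout occurs in infrastructure (b)
telecom.update()     # infrastructure (c) first degrades to (D)...
telecom.update()     # ...then is forced into (O)
print(telecom.state) # prints O
```

In this toy model the power grid is critical by the definition above: its failure alone forces the telecom infrastructure into (D) and then (O), and the mutual dependence prevents any recovery without external control action.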
Figure 2. Simplified state diagrams: (a) local perception of the service of the infrastructure (b) and possible effects on the infrastructure (c).
In addition, the time scales of the phenomena, or processes, involved in each infrastructure play a crucial role. For example, the very fast dynamics of the electric power system interact with the slow dynamics of other infrastructures. Large processes, like those supported by infrastructures, evolve and interact showing a large spectrum of dynamics. This is the case of thermal processes, which usually have time constants of the order of 10² ÷ 10³ sec, while the power system dynamics evolve with time constants in the range 10⁻¹ ÷ 1 sec. The transient properties of infrastructures influence how the interactions evolve in time. The correct evaluation of these data and the identification of the system parameters would be the basic requirements to build and validate realistic models for studying the possible interaction phenomena. 6. Infrastructure control To control a process means to act on some of the variables governing it in order to force a desired evolution. Consequently, each infrastructure is controlled to reach the goals established for a given service. In managing and operating an infrastructure, the control involves different levels of the Organization. For example, the control strategies at the low levels of a factory are implemented and operated by technical personnel, taking into account the operating constraints and the available resources. On the contrary, the strategies and policies that drive the technical operations are forecast by the
Figure 3. Simplified state diagrams: the infrastructure (c) feeds back on infrastructure (b) and on the customer (a).
Top management, instantiated by the Executive levels and designed by the sector Experts. Enlarging the vision: since a power system is intrinsically unstable, its basic processes have a natural tendency to degrade towards Out of Service (O). Permanence in the state In Service (S) is obtained only artificially, with the help of dedicated and complex automatic control systems [15], called primary and secondary regulation in normal operation (Figure 4). Often the automatic control is integrated with manual adjustments, called tertiary regulation. The strategies used to design the control of processes are very refined, and they are usually implemented by means of sophisticated and complex computerized systems that supervise and manage, in real time, a large amount of information coming from an extensive set of variables and measurements. In particular, the procedures for controlling the degradation of service and putting the systems in secure conditions are usually planned, simulated and possibly optimized off-line. Furthermore, the Company can define the operating conditions for each subsystem, and the global status can be composed from the information collected from each subsystem. So, for example, the availability of reliable, ready system reserve, provided by plants in operation, can increase the global level of security of the electric infrastructure: it is a matter of redundancy of resources. Instead, some emergency control strategies may deliberately degrade the service
Figure 4. Simplified state diagram: system with two states of degraded service, Alarm (A) and Emergency (E), between In Service (S) and Out of Service (O).
of a subsystem just to maintain the security of the remaining whole system. Following this approach, it is common in power systems to adopt automatic load-shedding strategies which disconnect a defined and limited portion of Customers, so that all the remaining Customers and subsystems can survive and the service can continue, even if in a temporarily degraded status [7,15]. Similarly, the availability of redundant communication channels increases the probability of surviving and overcoming interruptions, or of mitigating their effects on the information flows. For national infrastructures, emergency/security policies are evaluated, verified and influenced by the Government Authorities or Regulators, which define the constraints and responsibilities. It should be considered that the problem of controlling an infrastructure involves the attribution of responsibility for choosing the best operating conditions (with given goals, constraints and resources). This responsibility is implemented in the control of the given infrastructure (Figure 5). It is possible to split the concept of Responsibility into the following three levels:
• Organization. It is the Company that is in charge of the infrastructure. It is constituted by all the human resources and assets, organized in hierarchical levels, that operate the infrastructure.
Figure 5. Simplified picture of the interaction levels between two infrastructures A and B.
• Control. It is the level that operates the infrastructure. It is a structure in which human experts constantly use the level below to remotely supervise and control the infrastructure. The Control itself can be structured into hierarchical sub-levels.
• SCADA (Supervisory Control And Data Acquisition). It is the set of apparatuses and systems that implements the control and supervision functions. It is the technological structure that implements and actuates the control functions in each component and subsystem of the infrastructure. The SCADA structure can itself be structured into hierarchical sub-levels; it is the level that most relies on the services of other infrastructures, and its effectiveness and quality are a function of those infrastructures' performances.
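The automatic load-shedding strategy mentioned earlier in this section can be sketched as follows; the frequency thresholds and load shares below are invented for illustration and are not actual operating values of any real power system:

```python
# Hypothetical sketch of automatic under-frequency load shedding:
# when frequency drops below preset thresholds, predefined blocks of
# load are disconnected so the remaining system can survive in a
# temporarily degraded status. All figures are assumptions.

NOMINAL_HZ = 50.0

# (frequency threshold in Hz, fraction of total load to shed at that step)
SHEDDING_STEPS = [(49.0, 0.10), (48.7, 0.10), (48.4, 0.15)]

def load_shedding(frequency_hz):
    """Return the cumulative fraction of load to disconnect at this frequency."""
    return sum(share for threshold, share in SHEDDING_STEPS
               if frequency_hz <= threshold)

for f in (49.8, 48.9, 48.3):
    print(f"f = {f:.1f} Hz -> shed {load_shedding(f):.0%} of load")
```

The stepped structure is the point: a small, predefined sacrifice at each threshold bounds the propagation of the perturbation, which is a concrete instance of degrading a subsystem to preserve the whole.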
The interactions emerge by means of the relationships existing among the elements of the infrastructures. These relationships are different in kind and are allocated on different logical and structural planes. At the lowest level the infrastructures interact physically, exchanging matter and energy regulated by commercial contracts. The SCADA systems can interchange information. Between control centers, human operators communicate by language. Organizations interact as market agents. The control problem is strictly connected with the attribution of responsibilities along the hierarchy. As a consequence, each infrastructure is responsible for, and has direct control of, its own structures and assets. Usually the infrastructures are managed as systems disjoint from the environment in which they
are embedded (Figure 5). Moreover, inside one infrastructure there is no perception of how another infrastructure is organized and how it is working. Ultimately, the interactions are only physical, for the exchanged services, and commercial, for the economic compensations. There is no exchange of knowledge on how the related infrastructures work, not to mention information on weak points or, as a minimum, the reciprocal exchange of basic signals about the other system's states. It should be noted that within this scenario of deep knowledge of one's own infrastructure and a lack of information exchange (a sort of protectionism), it is difficult to manage the vulnerabilities arising from the interactions among different critical infrastructures. This is a problem that could be faced by introducing a higher level of control that acts on the interaction processes. From this viewpoint there are three possible approaches, which can be summarized as follows:
• Centralized Control. A unique control center supervises and manages the interactions among the infrastructures. The control strategy can be rigid, vertical, hierarchical. Building a similar structure requires aggregating a very large political and economic consensus for gaining the authority to operate. On the other hand, a centralized control strategy is possible and realistic inside a single Company, where a Security Operation Center can supervise all the aspects concerning safety, security, asset protection, etc., beside the common network operation center that controls the core business grid, i.e. the power system in the case of a power company.
• Distributed Control. The control strategy is structured in decoupled subsystems. The central control delegates the local control function to peripheral systems, while the central operating control authorities give the parameters for security and quality of service. A central SCADA operates the high-level control. The electric power market is an example.
This structure could be implemented with the new agents that can emerge in large aggregations of States, as could happen in the European Community.
• Network Interactions. This is the more realistic situation, a de facto condition. The interactions between the agents are not imposed but negotiated, and the interaction protocols are developed and formalized by standardization committees. Naturally the role of politics, cultures, traditions, etc. can define and allocate the priorities for accomplishing the goals listed in Section 4.
Figure 6. Another way to picture the interactions between two infrastructures A and B.
Beside that, the hoped-for interactions among infrastructures could be enforced as an enlarging superimposition of the domains of influence, so that the control center of each infrastructure can have a glimpse of the operating status of the other interacting infrastructures (Figure 6). There are no physical constraints to implementing this, only confidentiality ones. Moreover, there is the possibility of creating potential conflict situations, to be solved with clear agreements having the security and continuity of common operation as the unique driver. Every attempt to control the interactions substantially causes the redefinition of the chain of responsibilities. However, it should be recognized and stated that each Organization must adopt the following fundamental principle of self-responsibility: “The responsibility for a public service cannot be transferred to an involved Provider”. As a consequence, each Organization must be aware of the vulnerabilities it acquires with its Provider(s) and must set up preventive recovery actions. The realization of the previous approaches may be in contradiction with the goals set by the Companies in charge of the critical services supplied by critical infrastructures; at the very least, it has an economic impact. In a free market scenario, these requirements can unbalance competition, depending on the robustness of each infrastructure and the adopted strategy of quality of service, actual or only promised. If this is the case, further problems arise: how must infrastructures be controlled? Which requirements must be imposed?
In this context, a first step could be the common adoption of international standards that regulate the internal functions and processes of each infrastructure. A relevant example is the widespread application of Quality Standards such as ISO 9001:2000 [8], Environmental Standards such as ISO 14001:2004 [9], Safety Standards such as BS OHSAS 18001:2007 [2], Supply Chain Standards such as ISO 28000:2007 [10], and other special and sector standards. The application of these standards can furthermore contribute to building the basic principles and the culture of the adopted model. In other words, it may be a step toward the building of a common background reference model. Besides, each Country is called to develop a coherent legal system in accordance with the structures and values of the actual technical standards and of the accepted international agreements. In this way, the choice of interaction standards is based on the definition of constraints that are approved, accepted, applied and then supported internationally. 7. The role of the human factor The evaluation of the human factor in operating infrastructures is another relevant aspect. The key role of the operators and experts located at different hierarchical levels may be very critical, and can be the most critical factor in infrastructure performance. Operators, when correctly trained and motivated, can make efficient and effective use of the procedures and of the complex and sophisticated supervision and control tools. In addition, in emergency conditions they can operate in the best way, using all the possibilities and resources intrinsic in the controlled processes and correctly understanding the available information. The role of operators is essential in the supervision and control of emergency states, because they can choose the more adequate strategy to reach a goal, especially when a critical situation is not modeled or planned for automatic control by predefined apparatuses.
For example, in the context of multilevel organizations, Company experts have a large knowledge base and the ability to evaluate and develop successful strategies. They are a fundamental resource in solving critical situations by means of their experience, technical and theoretical knowledge, spirit of invention, and definition and execution of local procedures. They can develop different strategies in accordance, or in opposition, with prescriptions, if needed. They find solutions outside the specific technical context of automatic systems. Another level at which the human factor is very important lies in investigating the processes and designing control functions and systems.
This consideration is even more relevant in the decision-making process. In fact, the choices of the drivers at economic and political levels have the strongest influence on infrastructure performance and security. On the contrary, for the same reasons, experts and managers can be a source of criticality when they operate with lack of experience, fraud, absence of knowledge, carelessness, discontent, or lack of commitment. 8. Systemic Approach and Emergence The arguments described in the previous sections may be read from a systemic point of view. This gives a global perspective from which the increasing importance of the relationships among the systems emerges. The problem of knowing and evaluating the status of the system, and of modeling critical phenomena as emergent properties acquired by the system during processes of emergence occurring within it, is topical and involves all the large and complex systems of all the infrastructures. The concept of emergence is strictly related to the theoretical role of the Observer [11,12]. Only the assumption of a suitable level of description by the observer may allow the detection and modeling of emergent properties. For instance, the level of description considering single bodies, cars or industries is necessary (as for a reductionistic level of description), but it will never be sufficient to detect the emergence of flocks, traffic and industrial districts with their emergent properties. In this view, emergence is intended as a process of acquisition of new properties by a system modeled at a level of description higher than the one used to model what the observer considers as interacting components. Crutchfield introduced the concept of intrinsic emergence, giving rise to profound changes in system structures and requiring the formulation of a new model of the system itself [6].
Following this paradigm, interactions among infrastructures establish a new system possessing new, emergent properties that cannot be modeled by using the level of description used for the structures or the single infrastructures. In this context, even the role of humans can be seen from a different viewpoint if we consider them as Observers. At the various levels, humans may, or should, be able to see the emerging properties of systems and, as a consequence, should be able to use them in designing and operating infrastructures. This position suggests considering the set of infrastructures as a super-system in which we attempt to know and use all the possible relationships among the existing systems. In this sense we can describe the service provided by an
infrastructure as the property emerging from all the elements concurring to form the system. Following this approach, the vulnerability of an infrastructure/system is connected to the disappearance of the emergence processes that constituted or characterized the system, due to a perturbation that cancels the interactions among its elements. In this sense the infrastructure is the emergence of a service given by means of a physical structure. The vulnerability of infrastructures is located in the interactions among subsystems; each interdependence relationship is thus a source of vulnerability. In a systemic approach, controlling a system means being able to control, manage and drive the emergence processes. Controlling and managing vulnerability means controlling the interactions among systems. Recalling Figure 5, this involves controlling all the interactions at the physical, information, supervision and organization levels. These considerations have practical consequences. There is an emerging need for a more extensive call to responsibility for the Regulators, standardization Committees and decision makers in general, who must identify, evaluate, update, communicate and support the application of new operating models and procedures. 9. Conclusions Some general considerations were developed to investigate the emerging phenomena originated by interacting infrastructures. The complexity of the problems is evident. It is difficult to develop an adequate quantitative model, simple enough to be used and complete enough to describe the core of the problem and all the interaction phenomena. Many modeling approaches are available, but a relevant amount of work is necessary to develop effective and useful representations of emerging properties in systems.
The emergence of interactions in heterogeneous systems and in multilevel hierarchies shows how the techniques and representations developed in many disciplines, even if sophisticated and detailed, are often inadequate and give only partial representations of the properties of interacting systems. For this reason it is difficult to develop models of the interaction among infrastructures which give a complete account of the complexity of the problems. The building of an adequate model is the basic requirement for analyzing, and eventually developing, a possible control of the interacting processes. The goal is to build
quantitative models complete enough to be useful but simple enough to be manageable. The problems involved in controlling the infrastructures manifest their importance when the different infrastructures interact at multiple interaction levels. Because the interaction problems involve many technical, organizational and political levels, the effective management of interactions implies the solution of a control problem that involves all the hierarchical levels. In order to bound the propagation of perturbations among systems and to increase the chances of surviving and recovering, a possible strategy could lie in reconsidering the design requirements, the operating procedures and the managing criteria, with the goal of reducing the levels of interaction. The standardization processes evolve quickly, and the acknowledgement of systemic approaches could accelerate the integration and harmonization of the existing norms and laws. These notes could address future research and activities. Acknowledgments We would like to thank Dr. Ing. Dario Lucarella (Cesi Ricerca) and Dr. Ing. Paolo Bossi (Cesi), who provided information, useful discussion and support. References 1. M. Amin, IEEE Control Systems Mag., 22 (2002). 2. BS OHSAS 18001:2007, Occupational health and safety management systems - Requirements (BSI, London, 2007).
3. Commission of the European Communities, Green Paper on a European Programme for Critical Infrastructure Protection, COM 2005/576, 17.11.2005, (Brussels, 2005).
4. Commission of the European Communities, Communication from the Commission on a European Programme for Critical Infrastructure Protection, COM 2006/786, 12.12.2006, (Brussels, 2006).
5. Commission of the European Communities, Proposal for a Directive of the Council on the identification and designation of European Critical Infrastructure and the assessment of the need to improve their protection, COM 2006/787, 12.12.2006, (Brussels, 2006).
6. J.P. Crutchfield, Physica D 75, 11 (1994).
7. U. Di Caprio, in Systemics of Emergence: Research and Applications, Ed. G. Minati, E. Pessa and M. Abram, (Springer, New York, 2006), p. 293.
8. ISO 9001:2000, Quality management systems - Requirements (International Organization for Standardization, 2000).
9. ISO 14001:2004, Environmental management systems - Requirements with guidance for use (International Organization for Standardization, 2004).
M.R. Abram and M. Sforna
10. ISO 28000:2007, Specification for security management systems for the supply chain (International Organization for Standardization, 2007).
11. G. Minati and M.R. Abram, AEI 90, 41 (2003).
12. G. Minati and E. Pessa, Collective Beings (Springer, New York, NY, 2006).
13. S.M. Rinaldi, Proceedings of the Hawaii Int. Conf. on System Science, (2004).
14. S.M. Rinaldi, J.P. Peerenboom and T.K. Kelly, IEEE Contr. Sys. Mag. 21, 11 (2001).
15. M. Sforna, Safety & Security 1, 14 (2007).
16. N.K. Svendsen and S.D. Wolthusen, Inform. Sec. Tech. Rep. 12, 44 (2007).
17. The White House, The National Strategy for the Physical Protection of Critical Infrastructures and Key Assets (Washington, DC, 2003).
18. USA Patriot Act, Public Law 107-56, October 26, 2001, (Washington, DC, 2001).
THEORETICAL PROBLEMS OF SYSTEMICS
DOWNWARD CAUSATION AND RELATEDNESS IN EMERGENT SYSTEMS: EPISTEMOLOGICAL REMARKS
LEONARDO BICH CE.R.CO. – Center for Research on the Anthropology and Epistemology of Complexity, University of Bergamo, Piazzale S. Agostino 2, 24129 Bergamo, Italy Email:
[email protected]
In this article we analyse the problem of downward causation in emergent systems. Our thesis, based on constructivist epistemological remarks, is that downward causation in synchronic emergence cannot be characterized as a direct causal role of the whole on the parts, since these levels belong to two different epistemological domains, but rather by the way the components are related: that is, by their organization. According to these remarks downward causation, considered as relatedness, can be re-expressed as the non-coincidence of the operations of analysis and synthesis performed by the observer on the system.
Keywords: emergence, downward causation, organization, (M,R)-systems, constructivism
1. Introduction

Emergence is usually approached from two different points of view. The first is an ontological one, concerning the formation of new levels of reality in a way that is meant to maintain coherence with the physicalist perspective. The other approach is epistemological and focuses on the observer and on his limits in modelling complex systems. In this article we adopt the second approach. The reason is that we think that the issue of the unpredictability, or the non-deducibility, of the description of a complex system from that of the behaviour of its parts in isolation concerns primarily the interaction between the observer and the system: for example, the limits in the precision of his measurements or the different kinds of observation he can perform at different levels of analysis. Moreover, unpredictability and non-deducibility concern the models the observer builds and not the systems themselves. Following these remarks we assume a constructivist approach, mainly derived from the one proposed by Humberto Maturana and Francisco Varela in the autopoietic theory [9,19,21,22,36,37]. According to this approach, the observer does not have direct access to reality but only to the experiences he performs in interaction with it. In this way knowledge is not characterized by a
registration of the features of an objective reality but occurs in a relational domain where the regularities of these experiences are expressed in the models the observer builds.

The autopoietic epistemological approach is based on the concept of unity, which makes it very suitable for a systemic perspective. The primary operation which characterizes the activity of the observer is the distinction of a unity from its background, an action that relies on his purpose and his point of view, and which specifies the identity of the unity together with its domain of existence [21,33]. Through this epistemic operation at least three levels can be distinguished on a unity:

• its material parts considered in isolation, or better, distinguished from a generic background;
• the composite unity, which corresponds to the internal point of view and which constitutes the level of the interactions of the functional components; differently from the material parts, these are distinguished in relation to the system they integrate;
• the simple unity, which corresponds to the external point of view and is distinguished from the medium it interacts with. It is considered as a whole with given properties.

Each of these observational levels on the same unity determines a domain of existence characterized by the presence of specific elements and relevant properties. The problem of emergence is placed at the level of the relations between these domains [5] and consists in whether the properties of one domain can be expressed in terms of another. The first two levels, which in the case of purely additive interactions can be considered coincident, have an ambiguous status, for they cannot be considered in a hierarchic relation like the one between them and the third level. Their difference depends instead on the direction of the distinction, bottom-up for level 1 and top-down for level 2, and it is crucial for understanding emergence.

2.
Complex Emergence

The use of the term "emergent" to name a non-additive property of a compound, in opposition to "resultant", dates back to George Henry Lewes' Problems of Life and Mind in 1875 [16]. Over time it assumed more precise connotations, and it came to be used for the phenomena of appearance of qualitative novelties in nature, such as properties, relations or levels not present in the pre-existing entities. The acknowledgment of the importance of emergent phenomena finds its first
rigorization during the 1920s, with the rise of a line of thought called British Emergentism [2,7,18,23,34]. Considering the most recent studies since the 1970s and 1980s, which are focused on the role of the observer, it is possible to distinguish between emergence as unpredictability and emergence as non-deducibility from some initial conditions or basic level [4,8,11,24].

To the first group belong the processes of self-organization, like those studied by the thermodynamics of dissipative structures (thermodynamical emergence) of Ilya Prigogine [25] and the computer simulations of Artificial Life (computational emergence): the ordered and unitary behaviour which can be observed in a system is in principle deducible from the laws which characterize the model describing the behaviour of the constituents, but it is not predictable because of the non-linear interactions between them. Infinite precision in the knowledge of the initial conditions would be required of the observer, whose limits have the consequence that it is not possible to determine the evolution of the system after a certain amount of time: even a very slight difference in the initial conditions can give rise to very different behaviours. The emergence characterizing these phenomena, usually called "self-organization", depends on the intrinsic limits in the process of measurement performed by the observer [30]. Therefore, what emerges is not a qualitatively new level or behaviour which needs to be described by a new model, but just an ordered pattern which is recognized by the observer according to his properties as a cognitive agent. According to these remarks, the models of these processes are better described as models of self-ordering than of self-organization [1]. The new level that we observe is only epiphenomenal. Considering the interaction between some objects A_1^i, A_2^i, …, A_n^i with properties P_1^i, P_2^i, …, P_n^i at the basic level of analysis i, the emerging level, characterized by a new object A^{i+1}, is totally determined and describable in terms of the elements belonging to level i and their properties (cf. Humphreys, 1997) [13]:
    A_1^{i+1}(P_1^{i+1})          ⇢          A_2^{i+1}(P_2^{i+1})
        ↑ realizes                               ↑ realizes
    A_1^i(P_1^i) + A_2^i(P_2^i)   →   A_3^i(P_3^i) + A_4^i(P_4^i)

    ⇢ = epiphenomenal causality    → = effective causality    ↑ = realization
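As a computational aside (our illustration, not drawn from the cited literature), emergence as unpredictability can be sketched with the logistic map, a standard example of a fully deterministic rule whose long-term behaviour cannot be forecast by an observer with finite measurement precision; the map and the parameter values below are chosen only for illustration.

```python
# Toy sketch of emergence as unpredictability: the rule is deterministic,
# yet two trajectories whose initial conditions differ by one part in a
# million become macroscopically different, so the observer's finite
# precision prevents long-term prediction.

def logistic(x, r=4.0):
    """One step of the logistic map x -> r*x*(1-x) (chaotic for r = 4)."""
    return r * x * (1.0 - x)

x, y = 0.400000, 0.400001   # initial conditions differing by 1e-6
max_sep = 0.0
for _ in range(50):
    x, y = logistic(x), logistic(y)
    max_sep = max(max_sep, abs(x - y))

print(max_sep)   # the separation grows far beyond the initial 1e-6 gap
```

No qualitatively new level appears here: the pattern is fully described by the model of the constituents, which is exactly why this kind of emergence is only epiphenomenal.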
To the second group, which includes those processes which can be considered emergent in a proper sense, belong the phenomena of spontaneous symmetry breaking studied by Quantum Field Theory [3,24] and autopoietic processes [5]: the formation of a system or of a new behaviour is not deducible in principle from the model which describes the initial dynamics, and requires a new kind of description, a new model which cannot be reduced to the initial one, nor to a more comprehensive one. In this case, where the new level is not describable in terms of the lower one, what emerges is not merely epiphenomenal. The relatedness between components gives rise to a qualitatively new level with its proper characteristics, which needs new kinds of models in order to be described.
    A_1^{i+1}(P_1^{i+1})          →          A_2^{i+1}(P_2^{i+1})       (simple unities)
            |                                        |
    C_a^i(P_a^i) * C_b^i(P_b^i)       C_c^i(P_c^i) * C_d^i(P_d^i)       (composite unities)
        ↑ interaction                       ↓ degradation
          (non-additive)
    A_1^i(P_1^i)   A_2^i(P_2^i)       A_3^i(P_3^i)   A_4^i(P_4^i)       (material parts)
In this scheme, adapted from Humphreys 1997 [13], we can identify the three levels which an observer can distinguish on a unity. According to our approach, the components C_a^i, C_b^i, …, C_n^i integrating the composite unity are denoted differently from the basic material parts, and their interaction, being non-additive, is expressed by *. While in the case of emergence as unpredictability the levels of material parts and components coincide and that of the simple unity is merely epiphenomenal, in the phenomena of emergence as non-deducibility the three levels are qualitatively different and the lower one is not pertinent to the description of the others.

Different terms are usually used to express this kind of emergence. The first one, which points out that the non-deducibility depends on the way the system is organized and not on limits in the measurement process, is "intrinsic emergence" [11], but this term tends to obscure the role of the observer. Another
definition is "emergence relative to a model" [8], which is focused on the limits of the descriptions proposed by the observer but does not express the character of necessity of this epistemological limit: the impossibility could depend on the kind of model, whereas here it is meant as a limit in principle for all models. This problem can be solved using another terminological option, that of "complex emergence", which expresses in a proper way both the limit in principle and the role of the observer. We refer here to Robert Rosen's definition of complexity: "to say that a system is complex […] is to say that we can describe the same system in a variety of distinct ways […]. Complexity then ceases to be an intrinsic property of a system, but it is rather a function of the number of separate descriptions required […]. Therefore, a system is simple to the extent that a single description suffices to account for our interactions with the system; it is complex to the extent that this fails to be true" [30]. In such a way, complex emergence expresses the failure or inadequacy of a single modality of description and the necessity of passing to a new one. The lack of relation between the different modes of description is what makes emergent phenomena, and especially downward causation, so puzzling, as we will see in the next section.

In the analysis of emergent phenomena, and in order to deal with the problem of downward causation, a further distinction is necessary, that between synchronic and diachronic emergence [35]:

• synchronic emergence concerns the recognition by the observer of a unitary system arising from the interaction between components, and the consequent presence of different levels of analysis. It focuses on the hierarchic relation between parts and wholes and is strictly connected to the problem of distinction exposed in the introduction.
• diachronic emergence concerns the appearance in time of new entities in the natural world – for example the origin of living systems or the birth of new species in the evolutionary process – and of new behaviours at the level of the interactions between the system and its environment.

We will mainly deal here with synchronic emergence, in order to face the epistemological problem of downward causation in those systems characterized by complex emergence, but a few remarks will also be made on the diachronic case.
3. Downward Causation

By the term "downward causation" we generically mean the causal effect of the emergent whole upon the elements that constitute it, such that their behaviour inside the system is different from their behaviour in isolation. The direction is the opposite of that of the relation of realization, which goes from the elements to the whole they give rise to. It is possible to distinguish three different ways to consider this relation [12]:

• strong downward causation: it implies an ontological difference between the levels considered, due to the introduction of a non-scientific principle, as in vitalist theories. It violates the physicalist assumption of materialism, because the upper level is not completely realized by the entities of the lower level, and consequently the gap between levels does not depend on the ways the constituents are related. The causality of the whole on its parts in fact takes the form of a non-material principle which directly influences the way the constituents behave.

• weak downward causation: it considers the difference between levels as merely formal. The emergent levels show an ordered pattern, but do not have an effective causal power on the elements of the lower levels. They are just epiphenomenal, in that they do not imply an effective change in the system.

• medium downward causation: the upper level influences the lower ones without appealing to a sort of vitalist explanatory principle. This influence consists in the fact that the constituents behave differently in the system than in isolation, due to the way they are related in order to realize the unitary whole they belong to.

While we exclude the first kind of downward causation as non-scientific, it is easy to associate the other two versions with the two forms of emergence analysed above. Weak downward causation is exhibited by the self-ordering processes which are characterized by emergence as unpredictability.
It is epiphenomenal because the behaviour of the elements of the system does not need a new model to be described; consequently it has just a heuristic value. The third kind of downward causation is the most interesting, because it implies an effective difference in the behaviour of the constituents, which requires the construction of a new model. Problems arise when we try to conceptualize this causal relation: how can we conceive a causal power of the whole on the parts if these belong to two different epistemological levels with
no direct connection, and which are described by different and incommensurable models? According to the epistemological remarks we made on complex emergence, which are based on models and not on natural objects in themselves, the whole cannot directly act at the level of its components, for the two levels belong to different and incompatible domains. They are in fact described by different models, in such a way that a direct interaction between them is not approachable from either the conceptual or the operational point of view. A direct causal relation would imply that the two levels could be described by the same model, in contradiction with the definition of emergence.

This epistemological problem can be solved if we consider downward causation as depending on relatedness [32], that is, on organization. The way components are related in order to integrate and realize the emerging system is what makes an effective difference in their behaviour. A possible way to deal with the problem of downward causation is thus to consider the difference between the two levels of the constituents which are identified by the observer's act of distinguishing a unity. From this epistemological point of view downward causation consists in the irreducibility of the models which describe the material parts of the system and those which describe the components, the latter distinguished against the background of the organization of the unity they give rise to. The former depend on their intrinsic properties and are identified through a bottom-up approach; the latter depend on the organization of the emergent system and are distinguished through a top-down approach. While in self-ordering processes a bottom-up approach is sufficient and allows us to model the system, here it is the top-down approach which captures the internal dynamics of the system and determines the model of the emergent behaviour of components [6].
Downward causation, according to this epistemological approach, can be expressed as the irreducible difference between the bottom-up and the top-down approaches, which are not the inverse of one another. This difference is not considered a direct consequence of the action of the system as a whole, but of the organization (the relatedness, to use an expression of the emergentist Conwy Lloyd Morgan [18]), which determines the identity of the functional components it connects. Emergence comes up as a problem of observational levels characterized by a lack of connection. This makes two different conceptual or experimental activities performed by the observer, that of synthesis from the
material parts and that of analysis from the composite unity, fail to coincide [31]. Two different kinds of models come out of these operations. These models do not deal with the states of a system S in itself but with mappings on them, which express the measurement interaction between the observer and the system. These mappings, called observables, define equivalence relations R on S. Given an observable f:
    f(S) = S / R_f        (1)
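Equation (1) can be made concrete with a small computational sketch (our construction, not Rosen's formalism; the state set and the observable below are invented for illustration): an observable is a map on states, and two states are equivalent under R_f when the observable cannot tell them apart, so f(S) can be identified with the resulting partition of S.

```python
# Illustrative sketch of Eq. (1): an observable f on a state set S induces
# the equivalence relation R_f ("same measured value"); the quotient S / R_f
# is the partition of S into the fibres of f.

S = [(0, 0), (0, 1), (1, 0), (1, 1)]   # states of a hypothetical two-part system

def f(state):
    """A hypothetical observable: the total 'charge' of the two parts."""
    return state[0] + state[1]

def quotient(states, obs):
    """Group states by observable value: value -> equivalence class."""
    classes = {}
    for s in states:
        classes.setdefault(obs(s), []).append(s)
    return classes

print(quotient(S, f))
# {0: [(0, 0)], 1: [(0, 1), (1, 0)], 2: [(1, 1)]}
```

The observable collapses (0, 1) and (1, 0) into a single equivalence class: it measures the system only up to R_f, which is exactly what Eq. (1) expresses.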
In analytic models the observables are defined on the system S and they are characterized as its projections onto its factors, which correspond to the components belonging to the second level of the distinction. The starting point is the observation of the system as a whole, which is the domain of the operation. Analysis, according to Rosen [31], can be expressed formally in Category Theory as a direct product:
    M(S) = ∏_α f_α(S)        (2)
Synthesis, on the contrary, is an assembly operation where S is not the starting point, the domain of the operation, but its range. It is not a projection of S but an injection of its constituents into it. It is a construction of the system from the analytic models describing its subsystems – that is, its material parts – which belong to the first level of distinction. The observables in fact are not defined on S but on its subsystems A_n. The synthetic model of the system S can be expressed through the direct sum of the models describing its parts [31], as the smallest set containing them:
    M(S) = ⊕_n M(A_n)        (3)
In simple systems (see the first scheme in Sec. 2 above) the analytic and synthetic models are the inverse of one another and the system is fractionable: the properties of the system are localised in its material parts and can be expressed by models describing them. In the case of complex emergence (see the second scheme in Sec. 2 above) downward causation can be characterized as the non-coincidence of the analytic and synthetic models of S. Its main consequence is non-fractionability, which consists in the lack of a one-to-one correspondence between the organization of the composite unity, identified by a top-down distinction, and the structure that realizes it through the material parts, which is instead identified by a bottom-up distinction.
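A minimal computational sketch may help here (a toy of our own construction, not Rosen's formal argument): in a fractionable system the descriptions of the parts in isolation determine the description of the whole, while a non-additive interaction term breaks this correspondence, so two systems with indistinguishable parts can differ as wholes.

```python
# Toy illustration of (non-)fractionability. The observables and the
# interaction term are invented for illustration only.

def part_model(x):
    """Observable defined on a part in isolation (bottom-up description)."""
    return x * x

def whole_additive(a, b):
    """Whole-system observable, purely additive case (fractionable)."""
    return part_model(a) + part_model(b)

def whole_nonadditive(a, b):
    """Whole-system observable with a non-additive interaction term a*b."""
    return part_model(a) + part_model(b) + a * b

s1, s2 = (1, -1), (1, 1)

# The parts of s1 and s2 are indistinguishable in isolation...
print([part_model(v) for v in s1], [part_model(v) for v in s2])   # [1, 1] [1, 1]

# ...so any model synthesized from the parts assigns both systems the same
# description; the additive whole agrees, but the non-additive whole does not:
print(whole_additive(*s1), whole_additive(*s2))        # 2 2
print(whole_nonadditive(*s1), whole_nonadditive(*s2))  # 1 3
```

In the additive case the synthetic description recovers the analytic one; in the non-additive case the whole-system observable distinguishes states that the part-models cannot, a simple analogue of the non-coincidence of the two classes of models.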
An observer who does not consider the problem of the incompatibility of models, but assumes just an ontological point of view focused on objects in themselves instead of on the relation between models, theoretically approaches downward causation as a direct action of the whole on the parts. In this way he puts different levels, which are acknowledged as irreducible by the very definition of emergence as non-deducibility and which are the result of different operations of distinction, on the same level, thus making an epistemological mistake. Therefore this approach involves a kind of downward causation which contradicts the very definition of emergence it starts from.

Here we have faced the problem of downward causation from a synchronic point of view, focused on the relation between the whole and its constituents. Assuming a diachronic approach we can identify some differences. In fact, observing the temporal evolution of the system in interaction with its environment, we can identify correlations between the changes in the composite and in the simple unity. In order to do this we need to place ourselves in a metadomain of second-order descriptions, which are symbolic and conceptual and not operational [5]. In this case the causality between the whole and its constituents appears to the observer as the reciprocal modulation between the internal and the external dynamics [20]. Nevertheless this is not a direct causal relation, as the two domains do not intersect and the two levels to which they belong still need to be considered distinct. It is rather a relation of reciprocal selection that the observer infers from the correlations he observes in time between the two different levels. As we said above, a direct causal relation would imply a reduction between those levels, which would then be described together in one more comprehensive model.
Here instead it is just the identification, on a metadomain, of some correspondences which come out of the observation of the two different levels at the same time.

4. Conclusive Remarks

In this article, by assuming an epistemological perspective on emergence, focused on the models built by the observer and on the three levels he can distinguish on a system, we showed that downward causation is more complex than the causal effect of the whole on the parts. It depends in fact on relatedness, and can be considered as the non-congruence, in emergent systems, between the synthetic models, which describe a system starting from the parts in isolation, and the analytic ones, which describe the composite unity starting from its organization. These two different classes of models depend respectively on a bottom-up and a top-down operation which are not the inverse of one another.
We also showed that, in a physicalist perspective, the assumption of an ontological approach to emergence and downward causation can lead to some mistakes, for it makes different levels interact on the same one, and it contradicts the definition of emergence based on non-deducibility.

A further step along this line of research could be to show examples of the non-coincidence between analytic and synthetic models. An interesting case for the application of this theoretical framework is the formal model proposed by Robert Rosen to characterize the identity of living systems: the (M,R)-system [15,26,27,28,29]. His demonstration of the inequality of the operations of analysis and synthesis in this class of systems [31] is obscure and quite controversial [10,17,38], also because it is connected with his thesis about the non-computability of the models describing living systems, which would put serious limitations on the goals of Artificial Life. Studies in this direction would be important for a better understanding of emergence in biological systems. In particular, they would open the way to the development of a non-reductionist approach to the biological study of the organization of living systems, focused on the analysis of their functional components in terms of the unity they integrate.

References

1. D.L. Abel and J.T. Trevors, Phys. Life Rev., 3 (2006).
2. S. Alexander, Space, Time and Deity (Macmillan, London, 1920).
3. P.W. Anderson and D.L. Stein, in Self-Organizing Systems: The Emergence of Order, Ed. E.F. Yates, (Plenum Press, New York, 1985), p. 445.
4. N.A. Baas, in Artificial Life III, A Proceedings Volume in the Santa Fe Institute Studies in the Sciences of Complexity, Ed. C.G. Langton, (Addison-Wesley, Reading, 1994), p. 515.
5. L. Bich, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa and M. Abram, (Springer, Berlin Heidelberg New York, 2006), p. 281.
6. L. Bich and L. Damiano, Orig. Life Evol. Biosph., 37 (2007).
7. C.D. Broad, The Mind and Its Place in Nature (Routledge and Kegan Paul Ltd., London, 1926).
8. P. Cariani, On the Design of Devices with Emergent Semantic Functions (Ph.D. Dissertation, State University of New York at Binghamton, 1989).
9. M. Ceruti, La danza che crea (Feltrinelli, Milano, 1989).
10. D. Chu and W.K. Ho, Artif. Life, 12 (2006).
11. J.P. Crutchfield, Physica D 75 (1994).
12. C. Emmeche, S. Køppe and F. Stjernfelt, in Downward Causation. Minds, Bodies and Matter, Ed. P.B. Andersen, C. Emmeche, N.O. Finnemann and P.V. Christiansen, (Århus University Press, Århus, 2000), p. 13.
13. P. Humphreys, Philos. Sci., 64 (1997).
14. J. Kim, in Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism, Ed. A. Beckermann, H. Flohr and J. Kim, (De Gruyter, Berlin, 1992), p. 119.
15. J-C. Letelier, J. Soto-Andrade, F. Guinez-Abarzua, M-L. Cardenas and A. Cornish-Bowden, J. Theor. Biol., 238 (2006).
16. G.H. Lewes, Problems of Life and Mind (Houghton, Osgood and Company, Boston, 1875).
17. A.H. Louie, J. Integr. Neurosci., 4 (2005).
18. C.L. Morgan, Emergent Evolution (Williams and Norgate, London, 1923).
19. H. Maturana, Irish J. Psychol., 9 (1988).
20. H. Maturana, J. Mpodozis and J.C. Letelier, Biol. Res., 28 (1995).
21. H. Maturana and F. Varela, De Máquinas y Seres Vivos: Una teoría sobre la organización biológica (Editorial Universitaria, Santiago, 1973).
22. H. Maturana and F. Varela, El árbol del conocimiento (Editorial Universitaria, Santiago de Chile, 1984).
23. B.P. McLaughlin, in Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism, Ed. A. Beckermann, H. Flohr and J. Kim, (De Gruyter, Berlin, 1992), p. 49.
24. E. Pessa, in First Italian Conference on Systemics, Ed. G. Minati, (Apogeo, Milano, 1998), p. 59.
25. I. Prigogine and I. Stengers, La Nouvelle Alliance. Métamorphose de la science (Gallimard, Paris, 1979).
26. R. Rosen, Bull. Math. Biophys., 20 (1958).
27. R. Rosen, Bull. Math. Biophys., 20 (1958).
28. R. Rosen, Bull. Math. Biophys., 21 (1959).
29. R. Rosen, in Foundations of Mathematical Biology, Ed. R. Rosen, (Academic Press, New York, 1972), vol. II, p. 217.
30. R. Rosen, Fundamentals of Measurement and Representation of Natural Systems (North-Holland, New York, 1978).
31. R. Rosen, Life Itself: A Comprehensive Inquiry into the Nature, Origin, and Fabrication of Life (Columbia University Press, New York, 1991).
32. J. Schroder, Philos. Quarterly, 48 (143) (1998).
33. G. Spencer Brown, Laws of Form (George Allen and Unwin Ltd, London, 1969).
34. A. Stephan, in Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism, Ed. A. Beckermann, H. Flohr and J. Kim, (De Gruyter, Berlin, 1992), p. 25.
35. A. Stephan, Grazer Philosophische Studien, 65 (2002).
36. F. Varela, Principles of Biological Autonomy (North-Holland, New York, 1979).
37. F. Varela, H. Maturana and R. Uribe, Biosys., 5 (1974).
38. O. Wolkenhauer, Artif. Life, 13 (2007).
TOWARDS A GENERAL THEORY OF CHANGE
ELIANO PESSA Centro Interdipartimentale di Scienze Cognitive, Università di Pavia and Dipartimento di Psicologia, Università di Pavia Piazza Botta 6, 27100 Pavia, Italy E-mail:
[email protected]
This paper deals with the feasibility of a general theory of the changes occurring both in the non-biological and in the biological world. The aim of this theory should be that of classifying, describing, and forecasting the consequences of changes, as well as finding the conditions which ensure the possibility of controlling them. The most important sub-case of this investigation would consist in a general theory of emergence, clarifying whether the latter could or could not be obtained through a suitable generalization of the physical theory of phase transitions. We will argue that this enterprise could be feasible, provided the present theoretical framework of physics is enlarged in a suitable way, so as to include phenomena not reducible to particles interacting through force fields of immutable nature.
Keywords: emergence, phase transitions, biological models, quantum field theory.
1. Introduction

The most important feature of the world of phenomena is the occurrence of changes. All of them occur in time. Some occur in time and space. Others occur in time and configurational variables (describing the inner structure of a given system). Among these changes some appear to be of utmost importance, as they lead to deep structural changes in the system being observed. When not reducible to an observable direct action of the environment, these latter are qualified by the word "emergence". In recent years the topic of emergence has been the subject of an intense debate (see Minati and Pessa, 2006, for a review and a list of references) between anti-reductionist philosophers, claiming that emergence cannot be described by present physical theories, and theoretical physicists, asserting that emergence is nothing but a special kind of phase transition. Whatever the conclusion of this debate may be, a large body of experimental evidence has made clear that a number of features, at first seen as typical of emergent phenomena, depend only on the adopted observational time scale. Thus, while it is commonly felt that the sudden occurrence of a ferromagnetic state at the Curie temperature is an example of an emergent phenomenon, whereas the
evolution of reptiles over billions of years is not, it is easy to acknowledge that, by adopting a time scale whose unit is some tens of millions of years, the time trend of reptile evolution mimics that of the residual magnetization close to the Curie point. Such a simple fact forces us to enlarge the framework adopted for studying emergence so as to include all kinds of change. In this way the difficult question of the possible difference between physical and biological (or psychological, economic, social) emergence is reduced to the simpler question of the difference between physical and biological models of change. And, as is well known, there is a conspicuous body of experience with both kinds of models.

Within this paper we will briefly discuss some aspects of physical models of change, trying to highlight their advantages and shortcomings. The latter will then be compared with some typical models of biological change, in order to understand why present physical models fail to account for some typical features of the biological world. We will argue, in this regard, that there is a possibility of remedying this failure, provided the present framework used in physical theories is generalized in a suitable way. As the search for this generalization appears to be of utmost importance for the development of Systemics and of a general theory of emergence, hereafter the attribute "biological" will be used in its widest sense, that is, referring to all phenomena which are commonly related to the existence of natural living beings. Therefore even psychological, economic and social phenomena will be labeled as "biological".

2. Physical models of change

The framework so far adopted by physicists to describe changes is based on two main components: elementary units and force fields. The latter give rise to the interactions between elementary units which, through a typically linear cause-effect mechanism, drive the dynamics of the units themselves.
In turn, the spatiotemporal dynamics of force fields is ruled by evolution equations including source terms due to elementary units. Both kinds of dynamics fulfill conservation principles, such as the one of total energy, and are therefore based on a Lagrangian or Hamiltonian formalism. This approach is characterized by a number of heavy shortcomings, which will not be mentioned here, and has been seriously criticized since Newton times. However, all alternative proposals so far made didn’t obtain a wide consent. The previous scheme needs some modification when the number of elementary units is so great as to prevent from a detailed description of their
Towards a General Theory of Change
individual dynamics. In these cases physicists resort to a distinction between two levels of description, the microscopic and the macroscopic one. The goal of Statistical Mechanics is, then, to derive the phenomenological laws ruling macroscopic behavior starting from the knowledge of the interactions between elementary units at the microscopic level. Needless to say, this goal is very difficult to reach, so that Statistical Mechanics still cannot be considered a firmly grounded discipline. Before going further, it is to be remarked that the general scheme sketched above still permeates almost all of theoretical physics, in a way which is partly independent of the general principles adopted. Thus, it works in classical as well as in relativistic mechanics, in quantum mechanics as well as in Quantum Field Theory (QFT). We will now focus on the application of the above framework to the most interesting kind of change (at least from the point of view of Systemics): the so-called phase transitions. The theory of the latter has been built by resorting to a suitable combination of phenomenological observations, of phenomenological theories describing them, of suitable theoretical frameworks into which the latter have been embedded, and of statistical arguments supporting all this machinery. As regards phenomenological observations, we will introduce a distinction between two kinds of them. The first kind includes the behaviors observed when the temperature approaches the critical temperature, that is, the divergence of the (generalized) susceptibility, the divergence of the amplitude of fluctuations of the order parameter, the critical slowing down, and the discontinuity in the curve giving the specific heat as a function of temperature. The second kind refers to the existence of universality classes, evidenced by the fact that different phase transitions are associated with (almost) the same set of critical exponents appearing in the laws describing macroscopic behaviors near the critical point.
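The critical behaviors just listed are usually summarized by power laws. As a reminder (in standard textbook notation, not specific to this paper), denoting by $T_c$ the critical temperature, the generalized susceptibility $\chi$, the correlation length $\xi$, and the specific heat $C$ behave near the critical point as

```latex
\chi \sim |t|^{-\gamma}, \qquad
\xi \sim |t|^{-\nu}, \qquad
C \sim |t|^{-\alpha}, \qquad
t = \frac{T - T_c}{T_c} .
```

Universality then means that the exponents $\gamma$, $\nu$, $\alpha$ (together with the other standard critical exponents) take the same values for whole classes of physically very different systems.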
Let us now focus on phenomenological theories, by remarking that in this context we have two main approaches: the Ginzburg-Landau theory and the Renormalization Group. While neglecting any technical description of these topics (which can easily be found in standard textbooks; see, for instance, Goldenfeld, 1992; Benfatto and Gallavotti, 1995; Cardy, 1996; Domb, 1996; Minati and Pessa, 2006, Chap. 5), we will limit ourselves to mentioning the outstanding importance of two main ideas: on the one hand, the identification of phase transitions with symmetry breaking phenomena (and in practice with bifurcation phenomena in the Ginzburg-Landau functional), and, on the other hand, the scaling hypothesis, allowing the elimination of all irrelevant parameters close to the critical point. Although both ideas are correct only in a limited number of cases, they underlie a huge number of models of phase transitions, so
that every concrete computation done within this context is, in a direct or indirect way, based on them. However, the main problem of the theory of phase transitions (hereafter shortly denoted as TPT) is to fit the foregoing ideas into the general framework briefly sketched at the beginning of this section. In a number of cases this reconciliation is attempted by resorting to classical physics. However, this approach entails a commitment to classical thermodynamics and classical statistical physics. The latter, as is well known (cf. [61,1]), is based on the so-called correlation weakening principle, stating that, during a relaxation process, all long-range correlations between the individual motions of single particles tend to vanish when the volume tends to infinity. This, in turn, implies that, while spontaneous symmetry breaking phenomena, like the ones described by Landau theory, are allowed by classical physics without any problem (see, for instance, [33,65,26]), they give rise to new equilibrium states which are unstable with respect to thermal perturbations, even of small amplitude (one of the first proofs of this fact was given in [67]). At first sight, such a circumstance might not appear to be a problem. After all, nobody pretends to build models granting structures absolutely insensitive to any perturbation. However, a deeper analysis shows that instability with respect to thermal perturbations is equivalent to instability with respect to long-wavelength disturbances, and hence entails the impossibility of taking the infinite-volume limit. The latter is a serious flaw, as one of the main pillars of TPT, that is, the divergence of the correlation length at the critical point, due to the occurrence of infinite-range correlations, has a meaning if and only if we take the infinite-volume limit.
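As a reminder of what symmetry breaking in the Landau sense amounts to (a standard textbook form, not a model specific to this paper), the Landau free energy for a scalar order parameter $\varphi$ can be written, with constants $a, b > 0$, as

```latex
F(\varphi) = F_0 + a\,(T - T_c)\,\varphi^2 + \frac{b}{4}\,\varphi^4 ,
\qquad
\varphi_{\min} =
\begin{cases}
0 & T > T_c \\[2pt]
\pm\sqrt{\dfrac{2a\,(T_c - T)}{b}} & T < T_c
\end{cases}
```

Below $T_c$ the symmetric state $\varphi = 0$ becomes unstable and the system must “choose” one of the two equivalent minima: this choice is the spontaneous symmetry breaking, and the instability results discussed above concern precisely the robustness of the chosen state against thermal perturbations.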
Therefore the aforementioned results imply that classical physics is an untenable framework if we are searching for a wholly coherent formulation of TPT, in which phenomenological models are in agreement with statistical physics. How to find an alternative framework? At this point the only remaining possibility is to resort to QFT. The attractiveness of the latter stems from the fact that within QFT, and only within it, there is the possibility of having different, non-equivalent representations of the same physical system (cf. [34,37]; a more recent discussion of the consequences arising from this result, often denoted as the “Haag Theorem”, can be found in [9,8,60]). As each representation is associated with a particular class of macroscopic states of the system (via quantum statistical mechanics) and this class, in turn, can be identified with a particular thermodynamical phase of the system (for a proof of
the correctness of such an identification, see [63]), we are forced to conclude that only QFT allows for the existence of different phases of the system itself. Within this approach everything seems to work very well. Namely, the occurrence of a phase transition through the mechanism of spontaneous symmetry breaking (SSB) implies the appearance of collective excitations, which can be viewed as zero-mass particles carrying long-range interactions. They are generally called Goldstone bosons (cf. [32]; for a more general approach see [70]). Such a circumstance endows these systems with a sort of generalized rigidity, in the sense that an external perturbation acting upon one side of the system can be transmitted, essentially unaltered, to a very distant location. The reason is that Goldstone bosons act as order-preserving messengers, preventing the system from changing the particular ground state chosen at the moment of the SSB transition. Moreover, the interaction between Goldstone bosons explains the existence of macroscopic quantum objects [70,43]. Here it must be stressed that the long-range correlations associated with an SSB arise as a consequence of a disentanglement between the condensate mode and the rest of the system [64]. This finding, related to the fact that within QFT we have a stronger form of entanglement than within Quantum Mechanics [20], explains why the structures arising from an SSB in QFT have a very different origin from those arising from an SSB in classical physics. Namely, while in the latter case we need an exact, and delicate, balance between short-range activation (due to non-linearity) and long-range inhibition (due, for instance, to diffusion), a balance which can be broken even by a small perturbation, in the former case we have systems which, from the very start, lie in an entangled state, with strong correlations which cannot easily be altered by the introduction of perturbations.
Despite these advantages, however, the situation within the QFT-based approach is not as idyllic as it could appear at first sight. Namely, the mathematical machinery of QFT is essentially based on propagators (or Green functions), which allow one to compute only transition probabilities between asymptotic states, without any possibility of describing the transient dynamics occurring during the transitions themselves. And, what is worse, it is easy to prove that, in most cases, such dynamics must be described by classical physics. This raises a number of further problems, because the transition between the quantum regime, existing far from the critical point, and the classical regime, holding close to the critical point, is nothing but a phenomenon of decoherence, due to the interaction with the external environment. But what are the physical features of the latter process (and of the symmetric process of recoherence,
taking place after the phase transition and restoring the quantum regime)? How should we describe the external environment? How can these phenomena influence the formation of structures (like defects) surviving even after the completion of the phase transition and signaling its past occurrence? All these questions cannot, unfortunately, be answered within the traditional framework of QFT. Instead, these problems have been dealt with by resorting to suitable generalizations of it.

3. Beyond TPT

Starting in the Eighties, the need to apply QFT to condensed matter and phase transition theory, rather than to particle physics, stimulated the introduction of suitable generalizations of the old schemata. We will quote, in this regard, three main advances: a) the acknowledgement that Goldstone bosons can undergo a condensation, giving rise to observable macroscopic effects; among the methods introduced to describe such phenomena we mention the so-called boson transformation [70]; in this way it becomes possible to deal with defects arising as a consequence of a symmetry-breaking phase transition (an example is given by solitary waves, or solitons), as well as with macroscopic quantum objects; these developments allowed a deeper understanding of effects which could account for the occurrence of biological coherence, such as the Davydov effect and the Fröhlich effect; the former (for reviews of this topic, see [62,28,22,11,30]) consists in the production of solitons moving along long biomolecular chains, when the metabolic energy inflow is able to produce a localized deformation of the chains themselves; the latter, in its essence (see, for recent discussions, [47,48]), consists in the excitation of a single (collective) vibrational mode within a system of electric dipoles interacting in a nonlinear way with a suitable source of energy, or with a thermal bath; it is to be added that these dipoles are thought to be present both in the water constituting the intracellular and extracellular liquid as
well as in biological macromolecules, owing to the ionization state connected to the existence of high-intensity electric fields close to cellular membranes; the coupling of both effects allowed for the introduction of a quantum theory of biological coherence (for a recent review see [23]) as well as of a quantum brain theory (the huge amount of literature on this topic is summarized in [40,41,72]), which had a number of important applications, such as the study of the role of the cytoskeleton in neural cells [35,69], the operation of memory [71,2,55,29], and the basis of consciousness (see, for instance, [36]);
b) the introduction of a more refined theoretical description of the environment; starting from the pioneering work of Caldeira and Leggett (see, for instance, [12]), theoretical physicists began to introduce more and more explicit models of the environment, in turn described as a set of interacting entities (for instance, quantum Brownian oscillators); this made it possible to deal more correctly with phenomena such as dissipation and decoherence (see, among others, [13,14,15,73]); among the most interesting methods used to take dissipation into account within QFT we quote the so-called doubling mechanism [18]; the latter describes the influence of a dissipating environment by doubling the original dissipative system through the introduction of a time-reversed version of it, which acts as an absorber of the energy dissipated by the original system, so that the whole system, including the environment, can be dealt with as if it were a Hamiltonian system; this framework makes it possible to argue that the presence of a field-mediated interaction (present at the level of biological macromolecules) could work against decoherence, provided the field were of a particular kind; suppose, for instance, that we have a simple quantum system lying in an entangled state (for example the Schrödinger cat state), interacting with a classical field inducing a dissipative dynamics; then (we follow here the argument of Anglin et al.
[6]), owing to the fact that the different degrees of freedom of the system react in different ways to the action of the field, the interference which supported the entanglement disappears and the system state reduces to the product of the single states of its components; in other words, decoherence has occurred; however, as the dynamics is dissipative, the system is forced to evolve, independently of its initial conditions, towards an attractor whose dimensionality is smaller than the number of degrees of freedom of the system; thus, after a suitable relaxation time, some degrees of freedom (or even all of them, when the attractor is an equilibrium point) fall into the same state, just as if the system were in an entangled state; decoherence has disappeared, and recoherence has taken place! This apparent paradox is easily solved if we take into account that the dissipation induced another, stronger kind of entanglement, the one between the system and the environment; and precisely this entanglement was responsible for the relaxation towards the equilibrium state; c) the explicit modeling of phenomena close to the critical point of a phase transition in the presence of realistic conditions, that is, finite volume, finite time, and boundary constraints (see, in this regard, contributions such as those of [42,27,57,45,3,4,44,58,5,7]); this made it possible to distinguish, during a phase transition, a number of different stages: the initial one of decoherence (in which quantum fluctuations become much smaller than thermal fluctuations), the classical one,
close to the critical point, in which the influence of chaos and noise becomes of utmost importance, and the final one of recoherence, in which the quantum regime is re-established; the features of the final phase arising after the completion of the phase transition depend in a crucial way on the dynamics occurring in the classical stage, which, in a sense, structures the landscape for the processes occurring after recoherence; in some cases it is possible to influence the structuring processes occurring within the classical stage through suitable external control actions, so as to transform an intrinsic emergence (in the sense of Crutchfield; see [21]) into a controlled pattern formation. The advances quoted above gave rise to a generalized form of phase transition theory which seems better suited to the description of changes occurring in biological matter at the most basic level, that is, the level of the behavior of biological macromolecules and of processes involving cell membranes. However, one may suspect that this framework is unable to work for processes of biological change occurring at levels beyond the basic one.

4. Biological models of change

Starting near the beginning of the twentieth century, the development of theoretical models of biological change (including economic, psychological, and social change) has been so intense as to give rise to a huge number of different models. However, differently from what occurs in physics, it is impossible to find a common framework to which all models can be reduced. This circumstance (essentially due to a lack of interdisciplinarity) resulted in the low efficiency of most models of this kind. And it is to be recalled that Von Bertalanffy introduced General System Theory precisely to remedy this state of affairs.
As is well known, his work, as well as that of the other founding fathers of Systemics, helped us understand the central role of Dynamical Systems Theory in describing changes in a number of different domains through a sort of unified language. However, this also revealed the intrinsic limitations of this approach, deriving from the special features of models of biological change. These features became easily recognizable after the advent of computer-based simulations of biological models which were analytically intractable. We briefly list them in the following.

4.1. Importance of individuality

Contrary to what occurs in physical models, whose single components have identical features, in biological models each component is endowed with individual features, partly differing from those of the other components. And the
form of the distribution of these individual features among the components is crucial for the operation of the whole system. This entails that most biological models describe disordered systems, that is, systems for which it is often impossible to forecast a priori the nature of their dynamics solely on the basis of their macroscopic statistical features.

4.2. Reactive nature of the environment

In most cases the environment of a biological system (often constituted by other living beings) has a reactive nature, as it counteracts the actions of the system under study by sending it suitable responses and, in some cases, even selecting some features of the system itself (a process which often amounts to changing the very laws ruling the dynamics of the system or the nature of its components). This interplay of action and reaction is at the basis of the adaptation process and can sometimes be described in a shortened (even if unrealistic) way by resorting to concepts such as that of fitness. Nothing similar occurs in physical models of change, where we deal, at best, with passive environments constituted by thermal baths or Brownian oscillators.

4.3. Creation of new kinds of components

In a number of cases the dynamics of biological change can lead to the creation of new constituents, of a kind not existing before. The appearance of these new elements generally modifies in a radical way the very form of the dynamical laws fulfilled by the system. On the contrary, in every physical model the form of the dynamical laws remains unchanged over time.

4.4. Absence of conservation principles

Almost all models of biological change lack general conservation principles, such as that of total energy. Therefore they cannot be put into a Lagrangian or Hamiltonian form, making it difficult to make direct use of methods and results obtained within models of physical change, which rely heavily on the Hamiltonian formalism.

4.5.
Non-equilibrium dynamics

Most, if not all, models of biological change describe non-equilibrium situations, because they deal with systems which are far from reaching adaptation. However, this makes it impossible to resort to general results
holding in most physical models, such as the detailed balance principle, the fluctuation-dissipation theorem, and the whole machinery of equilibrium Thermodynamics.

4.6. Importance of configurational variables

Most constituents of biological models are characterized by a complex inner structure, described by suitable configurational variables, whose values are of crucial importance for the dynamics of the constituents themselves. Nothing similar occurs in physical models, characterized by very simple constituents (typically point particles).

4.7. Multi-level hierarchical structure

Most biological systems are characterized by a multi-level hierarchical structure, which contrasts with the simple two-level (macroscopic and microscopic) description occurring in physical models. Besides, the nature and the very number of levels can change as a function of the interaction with the environment. The presence of inter-level interactions must also be underlined, their most interesting aspect often being the direct influence of higher levels on the behaviours of lower levels. A typical example is given by ant polymorphism (see, for instance, [59,68,39]), in which the number of ant castes (workers, soldiers, etc.) varies as a function of macroscopic variables such as the size of the ant colony and the environmental constraints. While all these features highlight the large differences between models of physical and of biological change, the most obvious question is: why should we attempt to reduce these differences? After all, most models of biological change are based on sophisticated mathematics, and this makes us confident in their efficiency in describing biological phenomena. As the latter seem to be very different from physical phenomena, where is the need for a unification? The answer lies in the need for assessing model reliability (or, in other terms, validity). Within the highly fragmented world of biological models, this is a very difficult enterprise.
Most researchers, in this regard, make use of sophisticated statistical analyses which, however, almost always give significant results only in the presence of a very large number of experimental data. And, as is well known, this is a condition which, in most cases of practical interest, is impossible to fulfill. On the contrary, in models of physical change reliability is based on very general principles which, in turn, are embodied within specific models. The latter give rise to precise predictions (which can be falsified even by a single experimental finding, without the need for statistics). In general, the lack of reliability is due to details of specific
models, which can easily be changed without changing the overall framework. Within this approach, therefore, the assessment of model reliability is a far simpler affair than in the biological case. There is, however, another reason for searching for a general theory of change, including the physical and biological ones as special sub-cases: the hope of identifying and classifying the possible different scenarios of change, each scenario being associated with a set of specific strategies for predicting, detecting, and (when possible) controlling changes, independently of their biological or physical nature. In order to show how far we are from reaching this goal (despite the advances of theoretical physics), it is instructive to look at very simple models of biological change, asking ourselves what should be added to the current framework of theoretical physics in order to include the features of the models themselves.

5. Bridging the gap between physics and biology?

Before starting the analysis of a simple model of biological change, we remark that, among the features quoted in the previous section, one of the most disturbing for models of physical change is the existence of inter-level influences which, in most cases, support the reactions of the environment on the system under study. Typically these influences take the form of a global-local interaction, in which some microscopic variables are influenced by the values of suitable macroscopic variables (describing macroscopic environmental states or system states or both). The usual physical models do not include interactions of this kind, which, on the other hand, could not be described within a Hamiltonian framework. Indeed, in the simplest cases they require integro-differential or even functional equations whose mathematics is still largely unknown.
In order to illustrate the powerful influence of this kind of interaction, we will resort to an almost trivial example, consisting of an artificial neural network containing N totally interconnected units. By adopting a discretized time scale, the output of the i-th unit at time t + 1 is given by:
x_i(t+1) = \tanh\Bigl( \sum_j w_{ij}\, x_j(t) - s_i(t) \Bigr) \qquad (1)
While the weights w_{ij} are initially chosen at random (the only condition being the vanishing of self-connections) and then kept fixed throughout the network evolution, the individual thresholds s_i(t) (also initially chosen at random) are allowed to vary according to the dynamical law:
Fig. 1. The three panels (a), (b), and (c) show the evolution of the average output of a neural network containing 50 units in three conditions, corresponding to α = 0, α = 0.1, and α = 2. For further details see the text.
s_i(t+1) = s_i(t) + \alpha \, [\, m(t) - s_0 \,] \qquad (2)
Here m(t) denotes the average output of the network at time t (a macroscopic variable, while the single outputs and thresholds are microscopic variables) and s_0 is a suitable threshold value for this average. Besides, α is a parameter chosen by the experimenter. Of course, when the value of this parameter is different from zero, (2) gives a simple example of global-local interaction. In order to make this example as simple as possible, we
avoided any external input. Thus, our model describes nothing but a closed disordered system (more precisely, one with quenched disorder) which must do nothing but… evolve. In Figures 1(a), 1(b), and 1(c) we show three different snapshots of computer simulations of the evolution, for 200 time steps, of the average output of a network containing 50 units, all starting from exactly the same initial conditions (same weights, same initial threshold values, same initial output values), with s_0 = 0.1. The only difference lies in the values of α. In 1(a) we have α = 0 (absence of global-local interaction), in 1(b) we have α = 0.1 (weak global-local interaction), and in 1(c) we have α = 2 (strong global-local interaction). As is easy to see, while in the absence of global-local interaction we have a fast (and expected) relaxation towards an equilibrium state, even a weak global-local interaction induces a deep change in the network dynamics (proving that such interactions are very effective in controlling microscopic dynamics), whose equilibrium state is shifted towards s_0. The presence of a strong global-local interaction even changes the very nature of the network dynamics, which becomes quasi-periodic (and presumably chaotic). This elementary example clearly shows the dramatic effect produced by the introduction of a global-local interaction. Therefore a first problem to be solved by theoretical physicists should be that of generalizing the usual formalisms so as to include interactions of this kind. Unfortunately, so far no convincing solution to this problem has been proposed, despite the development of theories such as that of viscoelasticity (standard textbooks are [19,56,38]), which explicitly try to generalize Lagrangian methods to cases in which global-local interactions occur.
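The experiment described above is easy to reproduce. The following sketch implements eqs. (1)-(2) directly; the weight and initial-condition distributions (uniform on [-1, 1]) are assumptions, since the text only says they are chosen at random:

```python
import numpy as np

def simulate(alpha, n_units=50, n_steps=200, s0=0.1, seed=0):
    """Evolve the closed network of eqs. (1)-(2) and return the
    time series of the average output m(t)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, (n_units, n_units))
    np.fill_diagonal(w, 0.0)              # no self-connections
    x = rng.uniform(-1.0, 1.0, n_units)   # initial outputs
    s = rng.uniform(-1.0, 1.0, n_units)   # initial thresholds
    m_hist = []
    for _ in range(n_steps):
        m = x.mean()                      # macroscopic average output
        m_hist.append(m)
        x = np.tanh(w @ x - s)            # eq. (1): microscopic update
        s = s + alpha * (m - s0)          # eq. (2): global-local interaction
    return np.array(m_hist)

# Same seed, hence same weights and initial conditions; only alpha differs,
# exactly as in the three conditions of Fig. 1.
for alpha in (0.0, 0.1, 2.0):
    m = simulate(alpha)
    print(f"alpha = {alpha}: m(T) = {m[-1]:+.4f}")
```

Plotting the three returned time series reproduces the qualitative picture of Fig. 1: relaxation for α = 0, a shifted equilibrium for weak coupling, and sustained oscillations for strong coupling (the precise trajectories depend on the assumed random distributions).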
In order to better illustrate the difficulties arising when trying to connect phenomenological models with the usual framework adopted in models of physical change, we will use a particular version of a very simple model of population evolution, introduced by Michod (see, among others, [49,50]) and applied to describe the evolutionary biology of volvocine algae. The interest of this model stems from the fact that, starting only from a microscopic dynamics, it predicts the occurrence of global correlations, a circumstance often attributed, within physical models, only to QFT-based models. Within Michod's model the i-th individual of a population is characterized by the values of two variables: its generative ability b_i and its survival ability v_i. In general, as shown by biological data, the two variables b and v are not reciprocally independent, but are connected through a general law expressed by a function ν(b) which is decreasing in b. In this regard, we introduced a specific form of ν(b) given by:
\nu(b) = \frac{\gamma \, (1+\gamma)}{b+\gamma} - \gamma \qquad (3)
where γ is a suitable parameter. For reasons of convenience we constrained the values of v and b within the interval [0,1], and (3) tells us that ν(0) = 1, ν(1) = 0. However, as the total number of individuals is not constant (owing to the existence of a reproductive ability) but a function of time N = N(t), we supposed (Michod did not explicitly make this hypothesis) that the value of the parameter γ appearing in (3) (that is, the convexity of the curve) depends on the momentary value of N through a law of the form:
\gamma = \frac{\beta_0}{N} + \alpha_0 \qquad (4)
where β_0 and α_0 are further parameters. What should we expect from the behavior of this simple model? Michod introduced two different kinds of fitness measure: the average individual fitness of the population members, denoted by ŵ, and the total (normalized) population fitness, denoted by W. Their definitions are:
\hat{w} = \frac{1}{N} \sum_i b_i \, \nu_i \,, \qquad
W = \frac{1}{N^2} \, B \, V \,, \qquad
B = \sum_i b_i \,, \qquad
V = \sum_i \nu_i \qquad (5)
From both biological observations and the results of computer simulations, Michod recognized that, while obviously both fitness measures vary with time, for most of the time they have different values. Thus he was led to introduce a quantity, called covariance and here denoted by Cov, which measures this difference through the simple relationship:
\mathrm{Cov} = \hat{w} - W \qquad (6)
When the value of Cov is negative, the total population fitness is greater than the average individual fitness. In other words, we are in a situation in which there seems to be some sort of global cooperation (or coordination) among the individuals, producing an increase in total fitness. Clearly this is not quantum coherence, but it recalls some aspects of the latter. How is this increased total fitness reached? The simulations show that it is sometimes due to an increase in the specialization of the population members. In this regard, we remark that every pair of values (v, b) characterizes a single individual, and that the form of the distribution of these pairs (or better, only of the b values, as (3) lets us find the value of v once the value of b is known) gives a measure of the degree of specialization present within the population at a given time instant. Indeed, a flattened distribution means a strong difference between the individuals, and
Fig. 2. Group fitness (upper curve) and average individual fitness (lower curve) vs time.
therefore a high degree of specialization, while a strongly peaked distribution means small differences between the individuals and a low degree of specialization. In order to check whether these effects are present even under the hypotheses we introduced in (3)-(4), we performed suitable numerical simulations of the evolution of populations. They were based on a preliminary subdivision of the interval [0,1] of values of the continuous variable b into a suitable number of equal sub-intervals, the value of b within each sub-interval being identified with its midpoint. The mechanism of reproduction was random, based on a uniform distribution, and such that each reproductive value was interpreted as the probability, at each generation, of producing a number of offspring which was a fraction of a maximum possible number fixed in advance by the experimenter. The produced descendants were assigned at random to the different generative ability sub-intervals. Besides, the value of v for each individual was interpreted as the probability of its survival in the next generation. In Figure 2 we can see a plot of both kinds of fitness vs. generation number in a “life history” characterized by 100 generations, an initial total number of individuals given by 50, 100 different sub-intervals of generative ability, and a maximum allowable number of descendants for each population member and for each generation given by 5. Moreover, the maximum allowable number of individuals for each generative ability sub-interval was
E. Pessa
fixed at 100, and the initial value of g was 5. The values of the remaining parameters were α0 = 3, β0 = 100. As is immediately evident, the covariance is always negative and the group fitness prevails over the average individual fitness. What, in this case, has been the effect of population evolution on the distribution of the values of b among the individuals? We remark that at the beginning of this simulation we chose to put all individuals within the same generative-ability class, corresponding to the 50th sub-interval. This distribution is depicted in Figure 3(a). For comparison we show in Figure 3(b) the final distribution obtained after 100 generations. As can be seen, not only does the final distribution differ deeply from the initial one, but it also evidences a very high degree of specialization of single individuals. It is easy to understand that this simple model accounts not only for the evolution of populations of volvocine algae but also, more generally, for the fact that most biological organisms survive in a complex environment precisely because their components (cells or organs) are highly specialized and reciprocally cooperating. Now let us deal with the main question: can this model be dealt with through the methods, for instance, of QFT? If yes, through which algorithms? If not, for what reasons? To begin, we could argue that the model was not cast in the form of a system of differential equations and, therefore, cannot be dealt with through Hamiltonian-based methods. As a matter of fact, it can easily be shown that it is not possible to cast the model even in the form of stochastic differential equations (unless we introduce suitable approximations which, however, would destroy the very nature of the model).
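The simulation procedure described above can be sketched in a few lines. The sketch below is a minimal reconstruction, not the author's code: the viability-fecundity trade-off `viability(b) = 1 - b` is an assumed stand-in for the relation referred to as (95), the roles of the parameters α0, β0 and g are not reproduced, and a global population cap is used as a simplification of the per-sub-interval limit of 100 individuals.

```python
import random
import statistics

def viability(b):
    # Hypothetical viability-fecundity trade-off; the paper's actual
    # relation (referred to as (95) in the text) is not reproduced here.
    return 1.0 - b

def simulate(generations=100, n0=50, n_bins=100, max_offspring=5,
             bin_capacity=100, seed=0):
    rng = random.Random(seed)
    # All individuals start in the same generative-ability class
    # (the 50th sub-interval), as in the simulation described in the text.
    b0 = (50 - 0.5) / n_bins
    population = [b0] * n0
    history = []
    for _ in range(generations):
        # v is interpreted as the probability of surviving into the
        # next generation.
        survivors = [b for b in population if rng.random() < viability(b)]
        # b is interpreted as the probability of producing each of up to
        # max_offspring descendants.
        offspring = []
        for b in survivors:
            n_kids = sum(1 for _ in range(max_offspring) if rng.random() < b)
            for _ in range(n_kids):
                # Descendants land in a random generative-ability sub-interval.
                k = rng.randrange(n_bins)
                offspring.append((k + 0.5) / n_bins)
        # Crude global cap, simplifying the per-sub-interval limit.
        population = (survivors + offspring)[: n_bins * bin_capacity]
        if not population:
            break
        vs = [viability(b) for b in population]
        mean_ind = statistics.mean(v * b for v, b in zip(vs, population))
        group = statistics.mean(vs) * statistics.mean(population)
        cov = mean_ind - group   # Cov(v, b); group > mean_ind iff cov < 0
        history.append((group, mean_ind, cov))
    return history
```

Under the assumed linear trade-off the covariance reduces to -Var(b), so it is always non-positive and the group fitness never falls below the average individual fitness, in line with the behaviour reported for Figure 2.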
On the other hand, even methods such as that of Doi-Peliti (see [24,53]; applications and generalizations are described by Cardy and Täuber [17], Pastor-Satorras and Solé [52], and Smith [66]), which transforms a model based on a master equation into a sort of Hamiltonian field theory, cannot be applied, because it is easy to recognize that the master equation of Michod's model is very complicated and not reducible to a reaction-diffusion form. Nor does a mean-field approximation seem useful. This argument is, however, very weak, as we could hypothesize that, in the near future, by using suitable new kinds of tricks and approximations, it would become possible, in one form or another, to cast Michod's model into a mathematical format closer to the ones popular in theoretical physics. In any case, this would be a problem of a merely technical (mathematical) nature, and not a serious conceptual obstacle. A stronger argument takes into consideration the nature of the environment which, as implicitly described in the previous model, has a typically reactive
Fig. 3. (a) Initial distribution of generative abilities; (b) final distribution of generative abilities.
character. On the contrary, QFT models are placed within simpler environments, such as thermal baths, and the very concept of fitness is absent. We can therefore claim that QFT will never give rise to a reliable description of phase transitions in biological matter unless we generalize it so as to include more realistic descriptions of biological environments. Of course, this generalization should also take into account the fact that, owing to the previous reasons, phase transitions in biological matter are often of non-equilibrium type. The doubling mechanism quoted in Section 3 constitutes a first step in this direction, but further steps are probably needed. A third argument deals with the nature of system components. While QFT models can be interpreted as describing assemblies of particles, all having the same nature, the members of the population introduced above are different individuals. In other words, once translated into the language of particle creation and annihilation, the model operating according to the rules (3)-(4) describes creation and annihilation not only of particles, but even of kinds of particles. In the language of fields this is equivalent to a theory describing the birth and the
disappearance of fields. Within QFT this would require the introduction of third quantization. This is still a somewhat exotic topic, dealt with almost exclusively within the domain of quantum theories of gravitation and so far with only very few contacts with the world of QFT models of condensed matter (for a first attempt see [46]; see also [54]). However, the latter domain could be the better context for applying this kind of extension of QFT, owing to the presence of "effective force fields" which appear and disappear as a function of environmental constraints. On the contrary, it would be useless within the world of elementary-particle physics at high energies, where from the outset one searches for evidence of universal force fields whose nature remains unchanged. As a consequence of the previous arguments we can claim that, notwithstanding the theoretical efforts quoted in the previous sections, the gap between models of biological change and models of physical change, in particular QFT, is actually still very large, except in a limited number of cases related to low-level phenomena. However, provided the generalizations mentioned before were included within a larger theoretical framework, the gap could probably be filled.
6. Conclusions
The considerations made within this paper have evidenced the difficulty of the road to be followed to reach at least a first form of a general theory of change. In this regard, we hope to have clarified which generalizations of physical models are needed to fill the gap between models of biological and physical change. We remark that, even if at first sight the problems to be solved appear to be only of a technical nature, in reality they are of a conceptual nature. Indeed, the technical methods adopted depend strongly on the conceptual framework adopted in describing the world, and mostly on the goals underlying this description. In particular, since Newton's time the world has been conceived as populated by systems constituted by elementary (and irreducible) entities plus the interactions between these entities. While this framework can be useful for describing planets revolving around the Sun, or electrons orbiting around a nucleus, it becomes useless when describing other kinds of systems, such as biological ones, in which notions such as cause-effect, elementary entity and system-environment boundary can become devoid of any sense. We could thus say that, even if the traditional physical framework adopts a sort of restricted systemic view, to describe even biological systems we need a generalized systemic view. Present-day Systemics can strongly help in developing the latter, thus
contributing to the revolutionary conceptual transformation needed to build a general theory of change.
References
1. A. Akhiezer, S. Péletminski, Les méthodes de la physique statistique (Mir, Moscow, 1980).
2. E. Alfinito, G. Vitiello, Int. J. Mod. Phys. B 14, 853 (2000).
3. E. Alfinito, G. Vitiello, Phys. Rev. B 65, 054105 (2002).
4. E. Alfinito, O. Romei, G. Vitiello, Mod. Phys. Lett. B 16, 93 (2002).
5. C.A. Almeida, D. Bazeia, L. Losano, J.M.C. Malbouisson, Phys. Rev. D 69, 067702 (2004).
6. J.R. Anglin, R. Laflamme, W.H. Zurek, J.P. Paz, Phys. Rev. D 52, 2221 (1995).
7. N.D. Antunes, P. Gandra, R.J. Rivers, Phys. Rev. D 71, 105006 (2005).
8. A. Arageorgis, J. Earman, L. Ruetsche, Studies in the History and Philosophy of Modern Physics 33, 151 (2002).
9. J. Bain, Erkenntnis 53, 375 (2000).
10. G. Benfatto, G. Gallavotti, Renormalization Group (Princeton University Press, Princeton, NJ, 1995).
11. L. Brizhik, A. Eremko, B. Piette, W. Zakrzewski, Phys. Rev. E 70, 031914 (2004).
12. A.O. Caldeira, A.J. Leggett, Ann. Phys. 149, 374 (1983).
13. E. Calzetta, B.L. Hu, Phys. Rev. D 61, 025012 (2000).
14. E. Calzetta, A. Roura, E. Verdaguer, Phys. Rev. D 64, 105008 (2001).
15. E. Calzetta, A. Roura, E. Verdaguer, Phys. Rev. Lett. 88, 010403 (2002).
16. J. Cardy, Scaling and Renormalization in Statistical Physics (Cambridge University Press, Cambridge, UK, 1996).
17. J.L. Cardy, U.C. Täuber, J. Stat. Phys. 90, 1 (1998).
18. E. Celeghini, M. Rasetti, G. Vitiello, Ann. Phys. 215, 156 (1992).
19. R.M. Christensen, Theory of Viscoelasticity. An Introduction (Academic Press, New York, 1971).
20. R.K. Clifton, H.P. Halvorson, Studies in the History and Philosophy of Modern Physics 32, 1 (2001).
21. J.P. Crutchfield, Physica D 75, 11 (1994).
22. L. Cruzeiro-Hansson, S. Takeno, Phys. Rev. E 56, 894 (1997).
23. E. Del Giudice, A. De Ninno, M. Fleischmann, G. Vitiello, Electromagnetic Biology and Medicine 24, 199 (2005).
24. M. Doi, J. Phys. A 9, 1465 (1976).
25. C. Domb, The Critical Point (Taylor and Francis, London, 1996).
26. J.R. Drugowich de Felício, O. Hipólito, Am. J. Phys. 53, 690 (1985).
27. J. Dziarmaga, P. Laguna, W.H. Zurek, Phys. Rev. Lett. 82, 4749 (1999).
28. W. Förner, Int. J. Quantum Chem. 64, 351 (1997).
29. W.J. Freeman, G. Vitiello, Physics of Life Reviews 3, 93 (2006).
30. D.D. Georgiev, Informatica 30, 221 (2006).
31. N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group (Addison-Wesley, Reading, MA, 1992).
32. J. Goldstone, A. Salam, S. Weinberg, Phys. Rev. 127, 965 (1962).
33. D.M. Greenberger, Am. J. Phys. 46, 394 (1978).
34. R. Haag, in W.E. Brittin, B.W. Downs and J. Downs (Eds.), Lectures in Theoretical Physics, vol. 3 (Wiley, New York, 1961), pp. 353-381.
35. S. Hagan, S.R. Hameroff, J.A. Tuszyński, Phys. Rev. E 65, 061901 (2002).
36. S.R. Hameroff, A. Nip, M.J. Porter, J.A. Tuszyński, Biosystems 64, 149 (2002).
37. K. Hepp, Helvetica Physica Acta 45, 237 (1972).
38. D.E. Hill, Continuum Mechanics: Elasticity, Plasticity, Viscoelasticity (CRC Press, Boca Raton, FL, 2006).
39. W.O. Hughes, S. Sumner, S. Van Borm, J.J. Boomsma, Proc. Nat. Acad. Sci. USA 100, 9394 (2003).
40. M. Jibu, K. Yasue, Quantum Brain Dynamics and Consciousness: An Introduction (Benjamins, Amsterdam, 1995).
41. M. Jibu, K. Yasue, in G.G. Globus, K.H. Pribram and G. Vitiello (Eds.), Brain and Being. At the Boundary between Science, Philosophy, Language and Arts (Benjamins, Amsterdam, 2004), pp. 267-290.
42. P. Laguna, W.H. Zurek, Phys. Rev. Lett. 78, 2519 (1997).
43. H. Leutwyler, Helvetica Physica Acta 70, 275 (1997).
44. F.C. Lombardo, R.J. Rivers, F.D. Mazzitelli, Int. J. Theor. Phys. 41, 2121 (2002).
45. G. Lythe, Int. J. Theor. Phys. 40, 2309 (2001).
46. V.P. Maslov, O.Yu. Shvedov, Phys. Rev. D 60, 105012 (1999).
47. M.V. Mesquita, A.R. Vasconcellos, R. Luzzi, Int. J. Quantum Chem. 66, 177 (1998).
48. M.V. Mesquita, A.R. Vasconcellos, R. Luzzi, Braz. J. Phys. 34, 489 (2004).
49. R.E. Michod, Proc. Nat. Acad. Sci. USA 103, 9113 (2006).
50. R.E. Michod, Y. Viossat, C.A. Solari, M. Hurand, A.M. Nedelcu, J. Theor. Biol. 239, 257 (2006).
51. G. Minati, E. Pessa, Collective Beings (Springer, Berlin, 2006).
52. R. Pastor-Satorras, R.V. Solé, Phys. Rev. E 64, 051909 (2001).
53. L. Peliti, Journal de Physique 46, 1469 (1985).
54. E. Pessa, G. Resconi, in G. Minati, E. Pessa (Eds.), Emergence in Complex, Cognitive, Social, and Biological Systems (Kluwer, New York, 2002), pp. 141-149.
55. E. Pessa, G. Vitiello, Int. J. Mod. Phys. B 18, 841 (2004).
56. Yu.N. Rabotnov, Elements of Hereditary Solid Mechanics (Mir, Moscow, 1980).
57. R.J. Rivers, Int. J. Theor. Phys. 39, 1779 (2000).
58. R.J. Rivers, F.C. Lombardo, F.D. Mazzitelli, Int. J. Theor. Phys. 41, 2145 (2002).
59. G.E. Robinson, Annual Review of Entomology 37, 637 (1992).
60. L. Ruetsche, Philosophy of Science 69, 348 (2002).
61. Yu.B. Rumer, M.Sh. Rivkyn, Thermodynamics, Statistical Physics, and Kinetics (Mir, Moscow, 1980).
62. A.S. Scott, Phys. Rep. 217, 1 (1992).
63. G.L. Sewell, Quantum Theory of Collective Phenomena (Oxford University Press, Oxford, UK, 1986).
64. Y. Shi, Phys. Lett. A 309, 254 (2003).
65. J. Sivardière, Am. J. Phys. 51, 1016 (1983).
66. E. Smith, Santa Fe Institute working paper #06-11-40 (Santa Fe Institute, Santa Fe, NM, 2006).
67. D.L. Stein, J. Chem. Phys. 72, 2869 (1980).
68. W.R. Tschinkel, BioScience 48, 593 (1998).
69. J.A. Tuszyński, Ed., The Emerging Physics of Consciousness (Springer, Berlin, 2006).
70. H. Umezawa, Advanced Field Theory. Micro, Macro, and Thermal Physics (American Institute of Physics, New York, 1993).
71. G. Vitiello, Int. J. Mod. Phys. B 9, 973 (1995).
72. G. Vitiello, My Double Unveiled (Benjamins, Amsterdam, 2001).
73. W.H. Zurek, Rev. Mod. Phys. 74, 715 (2003).
ACQUIRED EMERGENT PROPERTIES
GIANFRANCO MINATI Italian Systems Society, Milan, Italy E-mail:
[email protected]

We first discuss the concept of structure in order to differentiate between structural and systemic properties. Within this framework we discuss processes of establishing structures, such as phase transitions, organization and self-organization. The paper introduces concepts and insights about the process by which systems Acquire Properties (AP), rather than merely possess them. The last point relates to establishing, sustaining and managing new properties in emergent and organizational systems. In the Appendix we briefly discuss this approach for the concept of mind possessed by living matter.

Keywords: acquired property, emergence, self-organization, structure.
1. Introduction
In various disciplinary fields there is increasing interest, witnessed by the increasing number of publications, in concepts such as emergence, self-organization, collective behavior and phase transitions. This interest relates to theories, models and simulations not only in physics, but in a variety of disciplines, such as Artificial Life, Biology, Cognitive Science, Economics and Social Systems. The interest is particularly evident in the fact that the original models developed in physics could not simply be transposed to other disciplines by just changing the meaning of the variables considered, as has, on the contrary, been possible with specific approaches such as Synergetics. We expect that different discipline-specific approaches will be represented in a future generalized Theory of Emergence based on multiple, non-equivalent approaches. How are systems established, or phenomena modeled as such? A systemic approach is based on considering components interacting in a structured way. We first introduce the concept of structure deriving from relations between elements. New properties of structured elements may be produced by the structure itself, such as order. We then distinguish between:
• Structured interactions, occurring when interactions, i.e., cases where the action of one element affects another, follow a predefined structure. Examples of structured interactions are given by crystals and assembly lines; and
• Self-organized interactions, occurring when elements interact in a non-structured way, i.e., when the constraints within which elements interact are variable and self-established by the elements themselves, depending upon boundary conditions and external input and with reference to parameters such as distance, timing, position and number of elements. Examples are flocks and swarms; lasers and ferromagnetism.
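The self-organized coherence exemplified by flocks can be illustrated with a minimal alignment model in the style of Vicsek-type flocking models. This sketch is our illustration, not a model from the paper; all parameter values (interaction radius, noise amplitude, speed) are arbitrary choices.

```python
import math
import random

def vicsek_step(headings, positions, radius=0.2, noise=0.1, speed=0.02, rng=None):
    # Each agent aligns with the mean heading of its neighbours within
    # `radius` (on the periodic unit square), plus a small random
    # perturbation: no structure is imposed from outside, yet coherence
    # can emerge from the interactions alone.
    rng = rng or random
    new_headings = []
    for x, y in positions:
        sx = sy = 0.0
        for (px, py), h in zip(positions, headings):
            dx = min(abs(x - px), 1 - abs(x - px))   # torus distance
            dy = min(abs(y - py), 1 - abs(y - py))
            if dx * dx + dy * dy <= radius * radius:
                sx += math.cos(h)
                sy += math.sin(h)
        new_headings.append(math.atan2(sy, sx) + rng.uniform(-noise, noise))
    new_positions = [((x + speed * math.cos(h)) % 1.0,
                      (y + speed * math.sin(h)) % 1.0)
                     for (x, y), h in zip(positions, new_headings)]
    return new_headings, new_positions

def order_parameter(headings):
    # Magnitude of the mean velocity: near 0 for incoherent motion, near 1
    # for a coherent "flock" - the kind of macroscopic indicator an
    # observer needs in order to detect the collective entity.
    n = len(headings)
    return math.hypot(sum(math.cos(h) for h in headings),
                      sum(math.sin(h) for h in headings)) / n
```

Starting from random headings the order parameter is small; iterating the step with low noise typically drives it close to 1, i.e., a coherent collective entity is established without any predefined structure.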
This is followed by a discussion of specific processes of establishing structures, such as:
• Phase transitions. Second-order phase transitions, for instance, consist of an internal, global and simultaneous process of restructuring.
• Organizing. Organizing is intended as introducing structures for interactions. We refer to the process of introducing organization into a set considered as having no organization at all, or a different organization.
• Self-organizing. The meaning of the prefix self relates to the fact that the variable structure by which elements interact (i.e., organization) is not imposed from without, but adopted autonomously, whether or not as a reaction to an external input.
We are then able to focus upon two ways of establishing systems:
• through organization, or
• through self-organization, considered as emergence when requiring the crucial constructivist role of the observer.
We present this concentrated and necessarily limited review in order to introduce crucial points related to the processes by which systems acquire new properties subsequent to those originally possessed:
• Acquisition of new properties. A system, i.e., the model of a phenomenon as such, may embed the ability to establish new, unexpected properties, i.e., properties not explicitly designed by the observer, which can be modeled as properties of an autonomous system. Moreover, such new properties may influence the original system.
• How to sustain acquired properties. In the hierarchy of levels a property requires the other, lower levels. Is it possible to sustain a property without involving the lower levels?
In the Appendix we consider the prospect of retaining acquired properties in the case of living matter provided with a suitable cognitive system. One hotly
debated property is mind. A process not yet modeled, because of the evident impossibility of obtaining experimental information, is the one related to the new, emergent properties acquired by living matter provided with cognitive systems having higher levels of complexity, following the transition from non-living to living matter.

2. Structures and Systems
In mathematics we consider the structure of sets. This relates, for instance, to additional mathematical aspects, such as algebraic structures (groups, rings and fields), equivalence relations, measures, metric structures (i.e., geometries), orders and topologies. An abstract structure is a formal object defined by a set of composition rules, properties and relationships. These definitions allow us to distinguish between structural and systemic properties. Structures of elements derive from relations between elements (e.g., positions of elements in a configuration, in a network or in relational models for databases). Relations may represent static or dynamic configurations of elements. In this case the new property is given by the structure itself and not by the interaction between elements. Examples are given by elements arranged in an order (e.g., alphabetically, by age, dimension or weight) or placed by following some functional principle (e.g., lamps by wattage, food by expiry date and electronic components by function). A structure of relations is given by relations between relations (e.g., corporate organization), and structures of structures of elements are given by relations between relations between relations between elements (e.g., regional economics). With reference to systems, a structure in general describes the way in which interactions between elements take place. Interactions among elements are organized when they follow a structure. Sufficient conditions for the establishment of systems, in our current knowledge, relate to components suitably interacting [1,2,3] in (a) organized and (b) self-organized ways.
Let us first consider systems established by (a) organized interacting components.

2.1. Structured Interactions
We may distinguish two cases:
• Change of structure - interactions between elements driving towards the establishment of a new structure (e.g., phase transitions). In this case we have a new structure, a new configuration of relationships between elements, and thus between their behaviors, as a result of the process.
• Establishment of a system - interactions between elements driving towards the establishment of a system. In this case the elements, in order to interact continuously and to prevent the system from dissolving into the environment, must respect some structural constraints for the interactions to be effective. For instance, elements must be at such a distance as to allow interaction to be effective when this takes place through an exchange of energy, or elements must be connected when interactions take place through an exchange of information.
The first case has already been well described using the theories of phase transitions available in the literature (see Section 3.1). In the second case, as mentioned above, components are considered able to have (1) a reaction to an external input or (2) a behavior. In case (1) elements only react to inputs by following laws, such as those of physics and chemistry, without any autonomous processing. The conceptual framework is that of stimulus-reaction. In this way structured interactions between elements allow the establishment of systems through collective reactions to an external input, thanks to the fact that inputs, i.e., actions on elements, propagate to all the others, making the system, i.e., the structure of interacting elements, adopt properties that the individual elements do not possess. In this case systems are established through organization - processes of mutual input/output exchange of energy, matter or information between components of a network of pre-established relationships. Examples of systems established by elements interacting in a structure of relationships are given by chemical bonds and electronic circuits. In case (2) elements have a behavior due to the cognitive processing [4] of the input. In this case the interacting components are autonomous agents, i.e., agents possessing a natural or artificial cognitive system (the latter deriving, for instance, from the computational modeling of cognitive processes), such as birds or autonomous robots, allowing them to process the input and not just react to it. Structured interaction between autonomous agents allows the establishment of systems having a behavior due to the organization adopted. Examples of systems established by autonomous agents interacting in a structure of relationships are given by assembly lines, military units and sports teams.
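The point that a fixed structure of purely reactive elements can exhibit properties none of its elements possesses can be made concrete with a toy electronic-circuit example (our illustration, not from the paper): five NAND gates wired as a half-adder.

```python
# Case (1): purely reactive elements in a fixed structure. Each NAND gate
# only reacts to its inputs (stimulus-reaction, no autonomy), yet the
# structured network of gates exhibits a property - binary addition -
# that no single element possesses.

def nand(a, b):
    # A single reactive element.
    return 1 - (a & b)

def half_adder(a, b):
    # A fixed wiring of five NAND gates computing sum and carry bits.
    g1 = nand(a, b)
    g2 = nand(a, g1)
    g3 = nand(b, g1)
    s = nand(g2, g3)   # sum bit   = a XOR b
    c = nand(g1, g1)   # carry bit = a AND b
    return s, c
```

The property "adds two bits" belongs to the structure of interactions, not to any gate; removing one gate (the fault of an element) destroys it, which is the destructuring perturbation mentioned below.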
In both (1) and (2), internal changes act as destructuring and possibly collapse-inducing perturbations (e.g., the breakdown of an electronic circuit or the failure of an element in an assembly line).
2.2. Self-organized interactions
Let us now consider systems established by elements interacting in a non-structured way. Non-structured means that the structure, i.e., the set of constraints by which elements interact, is not pre-established, but is variable and self-established by the elements depending upon boundary conditions and with reference to certain parameters, such as distance, timing, position and number of elements, establishing attractors and order parameters [5,6]. Also in this case components may be able either (1) to react to an external input or (2) to show autonomous behavior. In the first case we have processes of mutual input/output exchange of energy, matter and information between elements in a non-organized way, i.e., without following pre-established rules. The process leads to the establishment, within suitable boundary conditions, of new self-organized collective entities from the coherent behavior of interacting components. Examples of collective entities established by interacting components are lasers, chemical oscillating phenomena such as the Belousov-Zhabotinsky reaction, ferromagnetism and superconductive systems. In the second case we consider interaction as taking place between autonomous agents. Examples of collective entities established by autonomous agents are flocks, swarms, industrial districts and markets. We must stress that the two ways in which suitable processes of interaction take place, i.e., structured and self-organized, may coexist. For instance, social systems may be considered as organizations by ignoring the emergent processes taking place within them, and vice versa. One typical example where these two ways coexist is an anthill, where there are well-defined roles and, simultaneously, processes of emergence.

3. Processes of establishing structures
We may consider three kinds of processes able to establish structures: phase transitions, organizing and self-organizing.

3.1. Phase transitions
A phase in physics relates to the state of matter.
In short, a phase is a set of states of a physical system having uniform properties such as electrical conductivity, density, structure and index of refraction. Examples of phases of matter are the liquid, solid and gas phases. Phases, however, are not thermodynamic states. For instance, two liquids at different temperatures are in different thermodynamic states, but in the same state of matter. The expression
phase transition refers to the process of changing from one phase to another. It is possible to distinguish between first-order and second-order phase transitions. First-order transitions require a finite time, and different phases can coexist in the same system, such as ice in water, or liquid and vapor. A suitable external perturbation (e.g., a change of temperature or pressure) is able to induce the total or partial disappearance of one phase in favor of the other. Examples of first-order phase transitions are solid-liquid-gas transitions. Second-order transitions occur discontinuously and simultaneously within the whole system involved in the process. In this kind of transition there is no coexistence of the two phases: the transition consists of an internal, global and simultaneous process of restructuring, activated by the fact that the structure corresponding to the initial phase instantaneously ceases to be valid and a new structure is established. Examples of second-order transitions are the transitions from paramagnetic to ferromagnetic states and the occurrence of superconductivity and superfluidity. Processes of phase transition have been described in the literature using different approaches and theories [2].

3.2. Organization / Self-organization
Organization may be introduced into a set considered as having no organization, or a different organization. This is a process for building an artificial system [7]. The concept of self-organization has been widely studied and is often adopted as a synonym of emergence. The meaning of the prefix self relates to the fact that the structure by which elements interact (i.e., organization) is not imposed from the outside, but adopted autonomously, whether or not as a reaction to an external input.
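A minimal concrete illustration of elements adopting a global structure without external imposition is the paramagnetic-ferromagnetic second-order transition mentioned in Section 3.1, sketched here with the standard 2D Ising model under Metropolis dynamics. This is our illustration, not from the paper; units are J = k_B = 1, and lattice size, temperatures and sweep count are arbitrary choices.

```python
import math
import random

def ising_magnetization(L=16, T=1.5, sweeps=400, seed=1):
    # Metropolis dynamics for the 2D Ising model on an L x L torus
    # (units J = k_B = 1; the exact critical temperature is ~2.269).
    # Starting from the fully ordered state: below T_c the collective
    # order survives thermal noise, above T_c it is destroyed.
    rng = random.Random(seed)
    spins = [[1] * L for _ in range(L)]
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            nb = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            dE = 2 * spins[i][j] * nb   # energy cost of flipping spin (i, j)
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spins[i][j] = -spins[i][j]
    m = sum(sum(row) for row in spins) / (L * L)
    return abs(m)
```

Below the critical temperature the magnetization (the order parameter) stays close to 1; well above it, it drops to near 0: the ordered structure exists only as a collective, whole-system property.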
Processes of self-organization are processes which can make elements (a) adopt a structure or (b) interact while following a continuously self-established structure. The first case (a) regards, for instance, phase transitions, with particular reference to the so-called order-disorder transitions. Following the work of I. Prigogine [8] and H. Haken [9,10,11], processes of second-order transition, the so-called order-disorder transitions, have been considered as processes of self-organization [12], and the terms emergence and self-organization have been considered synonyms. However, the two concepts are different inasmuch as a system is considered self-organizing when it is able to change its structure in reaction to external inputs [13], whereas emergence, as introduced in the literature [2,14,15,16,17], requires the constructivist role of an observer able
to detect not only structural or ergodic changes, but also the establishment of new properties [18,19,20,21,22,23]. Processes of self-organization are, for instance, described by models formulated in terms of partial differential equations. Such systems may allow for an infinite number of solutions, and their general form cannot be identified using suitable parameters: the difference between two solutions of a partial differential equation is given by an arbitrary function. Moreover, it is possible to find in these models locally stable solutions serving as descriptions of self-organizing processes. The simplest of such models is the so-called Brusselator [8,24]. The model was very useful for studying the Belousov-Zhabotinsky reaction mentioned above [25]. Case (b) relates to the establishment of Collective Behaviors. In this case elements interact by establishing coherence rather than a fixed organizational structure. The coherent behavior (named Collective Behavior) of particles or agents is established not through an explicit design setting functions and roles, but as a consequence of their coherent, rather than structured, interaction. The phenomenon consists of the occurrence of coherence between microscopic behaviors detected by an observer using a model different from that used for the individual components. This means that the level of description used for modeling the microscopic behavior is insufficient for detecting coherent, collective behavior.
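The Brusselator mentioned above can be explored with a few lines of numerical integration. The sketch below is our illustration, using a plain Euler scheme and assumed parameter values; the known behavior is that for b > 1 + a² the homogeneous steady state (x, y) = (a, b/a) loses stability and sustained oscillations (a limit cycle) appear.

```python
def brusselator_trajectory(a=1.0, b=3.0, x0=1.0, y0=1.0,
                           dt=0.001, steps=40000):
    # Euler integration of the Brusselator rate equations:
    #   dx/dt = a + x^2 y - (b + 1) x
    #   dy/dt = b x - x^2 y
    # Fixed point: (x, y) = (a, b/a); unstable when b > 1 + a^2.
    x, y = x0, y0
    xs = []
    for _ in range(steps):
        dx = a + x * x * y - (b + 1.0) * x
        dy = b * x - x * x * y
        x += dt * dx
        y += dt * dy
        xs.append(x)
    return xs
```

With a = 1 and b = 3 the concentration x settles onto a self-organized oscillation, while with b = 1.5 (below the instability threshold 1 + a² = 2) it relaxes to the fixed point: the same equations, under different parameter values, do or do not self-organize.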
Processes of Collective Behavior are processes of self-organization able to give rise to emergent entities (i.e., systems) having new properties, such as in physics with the establishment of lasers, fluids and plasmas [26,27], in biophysics with DNA replication [28], in sociology [29,30,31], in chemistry (pattern formation, dissipative structures), biology (morphogenesis, evolution), economics (markets), urban growth, brain activities, computer science and meteorology, as well as non-linear phenomena involving a macroscopically large number of agents, as in the case of insect societies [32,33] and swarm intelligence [34]. We briefly mention that the following equivalences, often taken as valid, are not completely correct [2]:
• Emergence equivalent to Phase Transitions.
• Emergence equivalent to Self-organization [35].
With reference to the first equivalence, the difference relates to properties acquired by the continuous and coherent changing of structures as opposed to properties acquired by a specific change of structure. The unsuitability of the equivalence also relates to the problem of generalizing the concept of phase transition when
dealing with similar processes taking place in other disciplinary contexts. In physics the indicators relate to physical variables, such as temperature and pressure. If we want to apply the same model to other disciplines we need to find the corresponding indicators in, for instance, cognitive science, social systems and biology. With reference to the second equivalence, its unsuitability relates to the absence of the constructivist role of the observer, who is able not only to detect processes of establishment of coherence, but also to realize new properties, i.e., usages, through modeling and meaning. We conclude this section by mentioning approaches for detecting that a process of emergence is taking place by assuming self-organization as a necessary condition [2,36,37,38].

4. Processes of establishing systems
We need to briefly recall that in a constructivist view a system is a model established by the observer for comprehending a phenomenon. In this way the observer identifies parts while trying to model a phenomenon as a system. Observer and designer are one and the same only for artificial systems; in this case we know the parts, interactions and structure because they have been designed. A different partitioning corresponds to different, mutually equivalent or irreducible, models. Moreover, the same process relates to the identification of interactions among parts and of how they are organized, i.e., their structure or organization. We have a system when we are able to describe the parts, their interactions and their structure [17]. A system is then a model of a phenomenon assumed to be able to represent, explain and simulate it. Designing a system, i.e., new structures, partitions and interactions, is a way to establish new entities having new properties (new with reference to components and interactions).

5. Processes of establishing Acquired Properties (AP)
With reference to what has been introduced above, new properties should correspond to the adoption of (a) a new structure, and/or (b) a new way of interacting, and/or (c) new parts becoming involved. In practice, this means establishing new systems in order to obtain new properties. However, systems do not only possess properties: they are also able, in their turn, to establish new ones in different ways. This means that the same modeling has the power to explain the establishment of new properties not previously considered. There are several ways in which a system may acquire new properties, for instance:
• Establishing new subsystems - this occurs when a system is established by subsystems and a new configuration of the same subsystems establishes a
system with new properties. Examples are the extended new functionalities obtained in electronic devices by adding to or rearranging their configuration.
• Introducing parameter changes - relating to a system's way of working, such as the speed of information exchange among parts, the intensity of interactions and the level of sensitivity to external parameters (e.g., temperature and pressure).
We may assume that a system keeps its identity when keeping the same partitions, structure and interactions between elements. After processes, for instance, of phase transition, re-organization and self-organization the system is considered as no longer being the same. The same phenomenon may be modeled at different levels of description, corresponding to systems differing in their partitions and/or interactions and/or structure. This approach is well known in physics, where multiple descriptions are possible by using, for instance, wave-particle duality, and quantum and non-quantum models.

A system may itself activate the process of establishing new properties thanks to simultaneous or multiple interactions, as in Multiple Systems (MSs) and Collective Beings (CBs) [2]. A MS is a set of systems established by the same elements interacting in different ways, i.e., having multiple simultaneous or dynamical roles. Precisely because of these multiple simultaneous or dynamical roles they give rise to the emergence of different systems having different properties. This concept was introduced several years ago in psychology by considering multiple-memory-system models [39]. Examples in systems engineering include interacting networked computer systems performing cooperative tasks and the Internet, where different systems play different roles, continuously establishing new, emerging usages. CBs are particular MSs established by agents possessing the same (natural or artificial) cognitive system. In CBs multiple belonging is active, i.e., decided by the component autonomous agents. Examples of CBs are Human Social Systems, where agents may belong to different systems (e.g., families, workplaces, traffic systems, mobile telephone networks and markets, as buyers) or give rise to different systems, such as temporary communities (e.g., audiences, queues, passengers on a bus).
In this case a multi-modeling approach, known as the Dynamic Usage of Models (DYSAM), has been introduced, based on approaches already considered in the literature that share a common strategy of not looking for a unique, optimum solution. These include the well-known Bayesian method, Peirce's abduction, Machine Learning, Ensemble Learning and Evolutionary Game Theory [2,40,41]. We consider below some specific processes able to make a system adopt new properties.
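The common strategy of not committing to a unique, optimum model can be illustrated with a minimal sketch. The three toy "models" below and the averaging rule are illustrative assumptions, not the DYSAM algorithm itself: several models of the same phenomenon are kept active simultaneously, and their outputs are combined rather than selecting one of them.

```python
# Illustrative ensemble-style sketch: three simple "models" of the same
# time series are used simultaneously instead of choosing a single best one.

def model_mean(history):      # predicts the running mean of past values
    return sum(history) / len(history)

def model_last(history):      # predicts persistence of the last value
    return history[-1]

def model_trend(history):     # extrapolates the last observed difference
    return history[-1] + (history[-1] - history[-2])

models = [model_mean, model_last, model_trend]

def dysam_style_predict(history):
    # Combine all active models rather than committing to one of them
    predictions = [m(history) for m in models]
    return sum(predictions) / len(predictions)

print(dysam_style_predict([1.0, 2.0, 3.0, 4.0]))
```

Each model is individually inadequate, but keeping all of them in use mirrors the strategy shared by ensemble learning and the other approaches cited above.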
G. Minati
5.1. Acquiring new properties

A system as such, i.e., the model of a phenomenon, may have embedded in it the ability to establish new, unexpected properties, i.e., properties not explicitly designed by the observer. Moreover, such new properties are able to influence the original system. Consider the case of computational systems, i.e., Turing Machines. Since the 1950s researchers have realized how the ability to compute could give rise to properties of different kinds, such as playing chess, learning, and pattern and handwriting recognition processes, establishing so-called intelligent behavior. A new generation of computational systems is able to perform so-called sub-symbolic computation, including Neural Networks, Cellular Automata and Genetic Algorithms, where computational rules are not explicitly established, but emergent (computational emergence). Physically speaking they are all based on an electronic system (the computer) performing machine cycles, i.e., steps performed by the processor unit processing digital data using a program. This system is able to acquire properties not only related to the explicit purpose of the program, but also non-explicitly designed properties, such as computational emergent properties in Neural Networks, e.g., learning, classifying, pattern recognition and game-playing. These are examples of acquired properties due to computational emergence.

Similar processes take place in living matter provided with neurological systems able to perform processes of signaling, leading to cognitive functions able to establish a cognitive system "intended as a complete system of interactions among activities, which cannot be separated from one another, such as those related to attention, perception, language, the affective-emotional sphere, memory and the inferential system" [42]. These different interacting levels establish the cognitive system, and each level influences and is dependent upon the others. So far we have not said anything new.
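The idea that computation can give rise to properties not explicitly programmed can be illustrated with a minimal sketch (an assumed toy example, not taken from the text): a perceptron whose program specifies only a local weight-update rule, yet which acquires the ability to classify according to logical AND, a rule nowhere explicitly coded.

```python
import random

random.seed(0)

# Training pairs for logical AND; the classification rule itself is
# nowhere written in the program below, only the update rule is.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [random.uniform(-1, 1), random.uniform(-1, 1)]
b = random.uniform(-1, 1)

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Perceptron learning rule: purely local corrections to the weights
for _ in range(100):
    for x, target in data:
        err = target - predict(x)
        w[0] += 0.1 * err * x[0]
        w[1] += 0.1 * err * x[1]
        b += 0.1 * err

print([predict(x) for x, _ in data])  # the acquired AND behavior
```

The resulting classification behavior is an acquired property of the trained system, in the sense used above: it is produced by, but not written into, the explicit computational rules.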
Within this conceptual framework, however, we would like to introduce some new conceptual problems able to produce new approaches. One possible way of formalizing the process of acquisition of new properties is based on considering hierarchical levels, as introduced in the Baas formalization of emergence. Consider S1, a set of interacting elements having observable properties at the level of single elements, Obs1(S1). Let S2 be a second-order structure, the result R obtained by applying interactions Int1 to the elements of S1, whose observable properties are Obs1(S1):

$$S_2 = R(S_1, \mathrm{Obs}_1(S_1), \mathrm{Int}_1)$$

In this way a property P of S2 is emergent if and only if it is observable at the S2
Acquired Emergent Properties
level but not at a lower level, i.e., at the S1 level [43,44]. It is then possible to construct hyperstructures [45]. We refer to processes of emergence taking place in emergent systems. The processes of acquisition of new emergent properties may be formalized as a hierarchy of processes of emergence, where a property emerges from the interaction of entities emerging from lower structures, as in Baas hierarchies [43,44,45].

Another way is to consider processes of organization taking place in previously established systems. Examples include further processes of organization taking place in organizations needing to specialize their activity better, such as a corporation which needs to sub-divide its market and adopt different, specialized strategies, and living matter adopting specialized functions, e.g., organs and specializations as in the brain.

It is inappropriate to consider the kinds of processes mentioned above as completely separate; rather, they should be considered as simultaneous and integrated processes. An effective modeling of such systems is not based on separate models, but rather on integrated models able to represent interactions between levels and processes of emergence. We think that a suitable approach is that represented by DYSAM. The problem of managing processes of emergence relates on one hand to the ability to induce, sustain and regulate acquired properties and, on the other, to the ability to deactivate them, i.e., make them de-emergent.

5.2. How to keep acquired properties

In the hierarchy of levels a property requires the lower levels. Cognitive properties, for instance, are based on necessary lower levels such as the physiological ones. This also relates to maintaining the properties of MSs. Is it possible to sustain a property without the lower levels being involved, or the properties of a MS without the original system in which the process of establishing the MS took place?
Lower levels are necessary, while also being influenced by the higher ones. One approach can be based on substituting (by reproducing the same) the necessary lower levels in order to sustain and keep the higher ones. The process of substituting is possible for virtual systems. Virtual systems are temporarily established by components belonging to another system, as in the case of Multiple Systems. For example, a virtual company really exists only as a temporary way of using resources belonging to other companies [46]. Another approach is based on reproducing what is emergent without reproducing the
process of emergence. For instance, it is possible to reproduce effects without reproducing the generating processes, as when recording and reproducing music. We must also consider, in natural systems, the process of reproduction together with the representation and transmission of knowledge. In this case processes of transmission from one supporting system to others take place thanks to educational and representational processes. There are different kinds of processes characterized by gradualism in the replacement of supporting levels. We refer, for instance, to teams replacing their members over time, in the same way as new cells replace dead cells in living matter. This could be considered a new concept, that of re-emergence, referring to the reproduction of similar processes of emergence supported by the presence of new replacement elements.

6. Appendix: The acquired mind

One crucial and interesting case is that of the properties acquired by living matter (several definitions of living exist), provided with a suitable cognitive system, through processes of emergence [47,48]. After centuries of interdisciplinary study of the topic, mind is still controversial and the subject of different approaches [49,50,51]. The subject of mind is closely related to another which has also been the subject of interdisciplinary investigations for many years: consciousness. Science has made tremendous progress in the study of the system considered crucial for the establishment of mind, i.e., the brain. In the philosophy of mind, researchers study the so-called mind-body problem. Among the many varied approaches [52,53] mind may be considered as an emergent, acquired property of the system established by brain and body. There are different approaches for introducing ways of modeling the emergence of mental processes. Here, we briefly mention the so-called Computational Theory of Mind.
In the philosophy of mind, a line of research known as Emergent Materialism [50,54] considers mental phenomena as emergent from interactions occurring at the physical level (i.e., brain and body), in the same way as learning emerges in Artificial Neural Networks through interactions among neurons in the Connectionist Theory of Mind. Another approach is based upon Quantum Field Theory [55], building on the Quantum Field Theory description of living matter, suitable for modeling the physical emergence of the main features of biological systems.

On one hand we have evidence of the autonomy of mind by considering, for instance, its specific illnesses, confronted using specific levels of description, such as those of psychology and psychiatry. Moreover, mind is able to act upon
its supporting, supposedly establishing processes, such as those related to brain and body. Indeed, mind is able to influence brain and body by imposing certain behaviors. One extreme is given by so-called self-destructive behaviors, such as the use of drugs, alcohol abuse and suicide. We may take different possible approaches when trying to model processes of establishment of mind, such as considering it established by processes of organization and self-organization involving both brain and body. It should also be noted how we use mind to study mind. A process that has not been modeled, because of the evident impossibility of obtaining experimental information, is that related to new properties - such as mind itself - acquired by living matter provided with cognitive systems having higher levels of complexity, following the transition from living to non-living matter. Some religions and philosophies mention in different ways the possibility of an eternal life, evidently without the need for a body. We may intend eternal life as the supporting of newly acquired properties, such as mind, without the support of living matter. Death may then be intended as the moment when the change in the supporting process takes place, as discussed in Section 5.2. Of course this is just a conceptual, and not even a hypothetical (how to experiment?), framework.

7. Conclusions

The new contributions introduced in this paper relate to the following points. We considered some insights into the role of structure in the process of establishing systems. We then discussed three points: 1) Processes of establishing structures; 2) Processes of establishing systems; and, as a major contribution, 3) Processes of establishing Acquired Properties (AP). The last point relates to the establishment of new properties in emergent and/or organizational systems. This refers to a kind of hierarchy of properties, each being based upon other preceding ones.
We then mentioned related problems, such as how to acquire and sustain new properties. These problems give a further idea of the richness of a future General Theory of Emergence (GTE) [56], to be intended not as a single theory, but rather as a multi-dimensional theory. One initial approach uses the concept of the Dynamic Usage of Models (DYSAM) to cope with Multiple Systems and Collective Beings. We mentioned, in the Appendix, how this view may allow the conception of a new point of view related to crucial existential problems for humanity, such as the possibility of supporting minds after the death of the supporting biological matter. The creation, supporting and managing of acquired properties are, in sum, perspectives of a GTE. It is also expected to introduce new epistemological
approaches, including multi-dimensionality and the simultaneous usage of irreducible models, in order to deal with problems such as those related to life, mind, development considered as an emergent property of systems of growth, sustainability as an emergent property not reducible to linear combinations of local sustainability, and the supporting of acquired properties in general.

References

1. G. Minati, Multiple Systems, Collective Beings, and the Dynamic Usage of Models, Systemist, 200 (2006).
2. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006). 3. L. von Bertalanffy, General System Theory: Foundations, Development, Applications (George Braziller, New York, 1968).
4. L.W. Barsalou, Cognitive Psychology: An overview for cognitive scientists (Erlbaum, Hillsdale, NJ, 1992).
5. W.R. Ashby, in Principles of Self-Organizing Systems, Ed. H. von Foerster and G.W. Zopf (Pergamon, Oxford, 1962), pp. 255-278.
6. H. von Foerster, in Self-Organizing Systems, Ed. M.C. Yovitts and S. Cameron, (Pergamon, New York, 1960), pp. 31-50.
7. S. Guberman and G. Minati, Dialogue about systems (Polimetrica, Milano, Italy, 2007).
8. G. Nicolis and I. Prigogine, Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order through Fluctuations (Wiley, New York, 1977).
9. H. Haken, Erfolgsgeheimnisse der Natur (Deutsche Verlags-Anstalt, Stuttgart, 1981).
10. H. Haken, Advanced Synergetics (Springer, Berlin-Heidelberg-New York, 1983). 11. H. Haken, in Self-organizing systems: The emergence of order, Ed. F.E. Yates, (Plenum, New York, 1987).
12. J.H. Holland, Emergence from Chaos to Order (Perseus Books, Cambridge, Massachusetts, 1998).
13. W. Banzhaf, in Encyclopaedia of Physical Science and Technology, 3rd edition, vol. 15, Ed. R.A. Meyers (Academic Press, New York, 2001), pp. 589-598.
14. P. Corning, Complexity, 18 (2002). 15. J.P. Crutchfield, Physica D, 11 (1994). 16. E. Pessa, in Proceedings of the First Italian Conference on Systemics, Ed. G. Minati (Apogeo scientifica, Milano, Italy, 1998).
17. E. Pessa, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa, (Kluwer, New York, 2002), pp. 379-382.
18. R. Butts and J. Brown, Eds., Constructivism and Science (Kluwer, Dordrecht, Holland, 1989).
19. E.M.A. Ronald, M. Sipper and M.S. Capcarrère, Artificial Life, 225 (1999). 20. A. Rueger, Synthese, 297 (2000). 21. H. von Foerster, Observing Systems, Selected Papers of Heinz von Foerster (Intersystems Publications, Seaside, CA, 1981).
22. H. von Foerster, Understanding Understanding: Essays on Cybernetics and Cognition (Springer, New York, 2003).
23. E. von Glasersfeld, in The invented reality, Ed. P. Watzlawick (Norton, New York, 1984), pp. 17-40.
24. A. Babloyants, Molecules, Dynamics & Life: An Introduction to Self-Organization of Matter (Wiley, New York, 1986).
25. B.P. Belousov, A periodic chemical reaction and its mechanism (Sbornik Referatov po Radiatsionnoi Meditsine, Medgiz, Moscow, 1959), pp. 145-147.
26. A.S. Iberall and H. Soodak, Collective Phenomena, 9 (1978). 27. G.L. Sewell, Quantum Theory of Collective Phenomena (Oxford University Press, Oxford, 1986).
28. E. Bieberich, BioSystems, 109 (2000). 29. N.J. Smelser, Theory of Collective Behavior (Free Press, New York., 1963). 30. H. Blumer, in New Outline of the Principles of Sociology, Ed. A.M. Lee (Barnes and Noble, New York, 1951), pp.167-222.
31. R.H. Turner, in Handbook of Modern Sociology, Ed. R.E.L. Faris (Rand McNally, Chicago, 1964), pp. 382-425.
32. M.M. Millonas, Journal of Theoretical Biology, 529 (1992). 33. G. Theraulaz and J. Gervet, Psychologie Française, 7 (1992). 34. M.M. Millonas, in Artificial Life III, Ed. C.G. Langton (Addison-Wesley, Reading, MA, 1994), pp. 417-445.
35. P.W. Anderson and D.L. Stein, in Self-Organizing Systems: The Emergence of Order, Ed. F.E. Yates (Plenum, New York, 1985), pp. 445-457.
36. E. Bonabeau and J.-L. Dessalles, Intellectica, 85 (1997).
37. F. Boschetti, M. Prokopenko, I. Macreadie and A.-M. Grisogono, in Proceedings of Knowledge-Based Intelligent Information and Engineering Systems, 9th International Conference, KES, Ed. R. Khosla, R.J. Howlett and L.C. Jain (Melbourne, Australia, Part IV, volume 3684 of Lecture Notes in Computer Science, September 14-16, 2005), pp. 573-580.
38. G. Minati, in Proceedings of the Second Conference of the Italian Systems Society, Ed. G. Minati and E. Pessa (Kluwer Academic/Plenum Publishers, London, 2002), pp. 85-102.
39. E. Tulving, American Psychologist, 385 (1985).
40. G. Minati and S. Brahms, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer, New York, 2002), pp. 41-52.
41. G. Minati and E. Pessa, Eds., Emergence in Complex Cognitive, Social and Biological Systems, Proceedings of the Second Conference of the Italian Systems Society (Kluwer Academic/Plenum Publishers, London, 2002).
42. E. Pessa, La Nuova Critica, 53 (2000).
43. N.A. Baas, in Artificial Life III, Ed. C.G. Langton (Addison-Wesley, Redwood City, 1993).
44. N.A. Baas and C. Emmeche, Intellectica, 67 (1997).
45. K. Kitto, Modeling and generating Complex Emergent Behavior, Ph.D. thesis, The School of Chemistry, Physics and Earth Sciences (The Flinders University of South Australia, 2006), http://scieng.flinders.edu.au/cpes/postgrad/kitto_k/01front.pdf
46. W.H. Davidow and M.S. Malone, The Virtual Corporation: Structuring and Revitalizing the Corporation for the 21st Century (HarperCollins, New York, 1992).
47. E. Pessa, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa and M. Abram, (Springer, New York, 2006), pp. 355-374.
48. E. Pessa, M.P. Penna, and G. Minati, Chaos & Complexity Letters, 137 (2004). 49. J.L. McClelland and D.E. Rumelhart, Eds., Parallel Distributed Processing. Explorations in the microstructure of cognition (MIT Press, Cambridge, MA, 1986).
50. J.R. Searle, Minds, Brains and Science (Harvard University Press, Cambridge, Massachusetts, 1984).
51. J.R. Searle, Mind: A Brief Introduction (Oxford University Press Inc, Oxford, 2005). 52. J. Kim, in Oxford Companion to Philosophy, Ed. T. Honderich, (Oxford University Press, Oxford, 1995).
53. J. Heil, Ed., Philosophy of Mind: A Guide and Anthology (Oxford University Press, Oxford, 2003).
54. P. Churchland, Matter and Consciousness (Massachusetts Institute of Technology, Cambridge, 1988).
55. G. Vitiello, My double unveiled (Benjamins, Amsterdam, 2001).
56. G. Minati, in Systemics of Emergence: Research and Development, Proceedings of the Third Italian Systems Conference, Ed. G. Minati and E. Pessa (Springer, New York, 2006), pp. 667-682.
THE GROWTH OF POPULATIONS OF PROTOCELLS

ROBERTO SERRA (2,1), TIMOTEO CARLETTI (3,1), IRENE POLI (1), ALESSANDRO FILISETTI (2)

(1) Dipartimento di Statistica, Università Ca' Foscari, San Giobbe - Cannaregio 873, 30121 Venezia, Italy
(2) Dipartimento di Scienze Sociali, Cognitive e Quantitative, Università di Modena e Reggio Emilia, Via Allegri 9, 42100 Reggio Emilia, Italy
(3) Département de Mathématique, Université Notre Dame de la Paix Namur, Rempart de la Vierge 8, B-5000 Namur, Belgium

The growth of protocells is discussed under different hypotheses (one or more replicators, linear and nonlinear kinetics) using a class of abstract models (Surface Reaction Models). A method to analyze the dynamics of successive protocell generations is presented, and it is applied to the problem of determining whether the duplication times of the protocell itself and of its genetic material eventually tend to a common value. The importance of the phenomenon of emergent synchronization for sustained protocell population growth and for evolvability is discussed.

Keywords: protocells, Surface Reaction Models, emergent synchronization.
1. Introduction

Protocells are lipid vesicles or micelles which are endowed with some rudimentary metabolism and contain "genetic" material, and which should be able to grow, reproduce and evolve. While viable protocells do not yet exist, their study is important in order to understand possible scenarios for the origin of life, as well as for creating new "protolife" forms which are able to adapt and evolve [8]. This endeavour has an obvious theoretical interest, but it might also lead to an entirely new "living technology", definitely different from conventional biotechnology. Theoretical models can be extremely useful to devise possible protocells and to forecast their behavior.

In this paper we address an important issue in protocell research. The protogenetic material in a protocell is composed of a set of molecules which, collectively, are able to replicate themselves. At the same time, the whole protocell undergoes a growth process (its metabolism) followed by a breakup into two daughter cells. This breakup is a physical phenomenon which is frequently observed in lipid vesicles, and it has nothing to do with life, although it superficially resembles the division of a cell. In order for evolution to
be possible, some genetic molecules should affect the rate of duplication of the whole container. Mechanisms have been proposed whereby this can be achieved (see below). But then a new problem arises: the genetic material duplicates at a certain rate, while the lipid container grows, in general, at another rate. When the container splits into two, it may be that the genetic material has not yet doubled: in this case its density would be lower in the daughter protocells. Through the generations, this density might eventually vanish. On the other hand, if the genetic material were faster than the container, it would accumulate in successive generations. So, in order for a viable population of evolving protocells to form, it is necessary that the rhythms of the two processes be synchronized. In some models (like the Chemoton [2]) this is imposed a priori in the kinetic equations, but it is unlikely that such a set of exactly coupled reactions springs up spontaneously. It is therefore interesting to consider the possibility that such synchronization be an emergent phenomenon, without imposing it a priori.

In the following we will consider this possibility by analyzing an abstract version of the so-called "Los Alamos bug", a model of protocells where the genetic material is composed of strands of PNA [6,7]. These resemble the better-known nucleic acids DNA and RNA, but have a peptide backbone, and it is believed that they might be found in the lipid phase of the protocell. According to this hypothesis, different PNAs may influence the growth rate of their "container" by catalyzing the formation of amphiphiles (which form the protocell membrane) from precursors. The detailed mechanisms whereby this might happen can be found in [6,7]. Inspired by the Los Alamos bug, we developed a more abstract class of models (which can also describe different specific models), called Surface Reaction Models [9].
The simplest case (where the genetic material is composed of a single type of self-replicating molecule) will be described in section 2. This model couples the growth of the genetic material and that of the container, and a mathematical technique can be introduced to study how the quantity of the former varies in successive generations. This is described in section 3, where it is also shown that synchronization is indeed an emergent property, in the case of both linear and nonlinear kinetics. Note that the term "linear" refers to the rate equation of the replicator only: the overall model, with its coupling to the container growth and breakup, is definitely nonlinear. Since there may be different kinds of replicators, with different rates, the case of two coexisting replicators (linear and nonlinear) is discussed in section 4. Section 5 is then devoted to the case where replicators directly interact: a
comprehensive analytical theory can be developed for the linear case, while nonlinear kinetics is approached through simulations. A major consequence of synchronization is that the competition among protocells is Darwinian, even if that of the replicators is not [4]. This aspect is discussed in the final section. This paper aims at presenting a unified view of the major results concerning synchronization; for detailed calculations and demonstrations the reader is referred to [9,10,11,1], where further references to the scientific literature can also be found.

2. Surface reaction models

Let us first consider the case where there is a single replicator in the protocell lipid phase, and let its quantity (mass) be denoted by X. Let also C be the total quantity of "container" (e.g. lipid membrane in vesicles or bulk of the micelle). We suppose that the lipid density is constant, so the volume V of the lipid phase is proportional to C. We assume, according to the Los Alamos bug hypothesis, that the replicator favours the formation of amphiphiles and that, since precursors are found outside the protocell, only the fraction of X which is near the external surface is effective. We assume that the replication of X also takes place near the external surface.
Let us further assume that:
• spontaneous amphiphile formation is negligible;
• the precursors (both of amphiphiles and of genetic molecules) are buffered;
• the surface area S is proportional to V^β, and therefore also to C^β (β ranging between 2/3 for a micelle and 1 for a very thin vesicle);
• diffusion is very fast within the protocell, so concentrations are constant everywhere in the lipid phase;
• the protocell breaks into two identical daughter units when it reaches a certain volume (C = θ);
• the rate-limiting step which may appear in the replicator kinetic equations does not play a significant role when the protocell is smaller than the division threshold;
• the contribution of X to the growth of C is linear;
• the rate of replication of X in the bulk (d[X]/dt) would be proportional to [X]^ν (square brackets indicate concentrations).
Under these hypotheses, as shown in [9], one obtains the following approximate equation which describes the growth of a protocell between two successive divisions:
$$\frac{dC}{dt} = \alpha\, C^{\beta-1} X, \qquad \frac{dX}{dt} = \eta\, C^{\beta-\nu} X^{\nu} \tag{1}$$
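The scaling of Eq. (1) can be read off from the surface-fraction argument above (a sketch under the stated assumptions, not spelled out in the text). The amount of X effective at the surface is proportional to S·[X], so

$$S\,[X] \;\propto\; C^{\beta}\,\frac{X}{C} \;=\; C^{\beta-1}X,$$

which gives the container growth term; similarly, a replication rate proportional to $S\,[X]^{\nu}$ scales as $C^{\beta}(X/C)^{\nu} = C^{\beta-\nu}X^{\nu}$, giving the replicator term.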
When C reaches a critical value θ, the cell breaks into two equal daughter protocells; then, until the next duplication, the system is again ruled by Eq. (1). At the beginning of a new generation, both the initial value of X and that of C equal one half of the value which they had attained at the end of the previous generation, i.e., at the time of cell division. Note that, under the above assumptions, the doubling time at generation i is determined by the initial value of X. Synchronization implies constant division times, so it is achieved if one observes the same initial value of X in two successive generations. Synchronization can of course also be detected by the fact that doubling times become equal in successive generations.

3. One type of replicator per cell

Let us first consider the linear case, i.e., let ν = 1 in Eq. (1). It is then immediate to observe that the quantity

$$Q = \eta\, C - \alpha\, X \tag{2}$$

is conserved during the continuous growth phase, so its value at the end of the growth is the same as it was at the beginning. But since the protocell splits into two equal daughter cells, the initial value of Q, at the next generation, will be exactly one half of the previous value. As generations grow (i.e., as $t \to \infty$), then $Q \to 0$ and therefore the initial value of X approaches a constant value:

$$X_{\infty} = \frac{\eta\,\theta}{2\alpha} \tag{3}$$

It can also be proven that the doubling time asymptotically approaches the value $\ln 2/\eta$. Therefore synchronization is achieved in the case where the replicator follows a linear (i.e., first order) kinetics.

The mathematical technique quickly described above can be applied to more general cases [9,10]. The key ingredient is to find a first integral of the equations which describe the continuous growth phase, and to obtain a recursion map for the initial values of X at successive generations, on the basis of the halving hypothesis. It can then be proven that the above result holds also for equations which are more general than the one considered above, and also for realistic protocell geometries. What is even more important, by renormalizing
time, it can be proven that the asymptotic behavior is not affected by the value of β, so it suffices to consider the simpler β = 1 case. In particular, in the nonlinear case of Eq. (1) the conserved quantity is

$$Q = C(t)^{2-\nu} - \frac{\alpha}{\eta}\, X(t)^{2-\nu} \tag{4}$$

and synchronization can be proven using the same methods as those described above.
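The emergence of synchronization in the linear case can be checked numerically with a short sketch (the parameter values are illustrative assumptions; the generation map uses the exact solution of Eq. (1) for β = ν = 1):

```python
import math

# Illustrative parameters (assumed, not from the text)
alpha, eta, theta = 0.1, 1.0, 2.0

def next_generation(X, C):
    # Exact solution of dC/dt = alpha*X, dX/dt = eta*X (beta = nu = 1):
    #   X(t) = X*exp(eta*t),  C(t) = C + (alpha/eta)*X*(exp(eta*t) - 1)
    # Division occurs when C(t) reaches theta.
    growth = 1.0 + eta * (theta - C) / (alpha * X)
    T = math.log(growth) / eta                 # duration of this generation
    return X * growth / 2.0, theta / 2.0, T    # halving at division

X, C = 0.01, 1.0
for _ in range(60):
    X, C, T = next_generation(X, C)

print(X, T)
print(eta * theta / (2 * alpha), math.log(2) / eta)  # Eq. (3) and ln2/eta
```

The two printed pairs should agree: the initial quantity of X approaches ηθ/(2α) and the doubling time approaches ln 2/η, i.e., synchronization emerges without being imposed a priori.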
4. Coexisting replicators

There may be different replicators in a protocell: this would certainly be the case if they were nucleic acids, which can undergo random mutations, but the remark may hold also for more general hypotheses concerning their chemical nature. Let us then suppose that in the same cell there are two self-replicators X and Y. The generalization of Eq. (1) is then

$$\frac{dC}{dt} = \alpha' X + \alpha'' Y, \qquad \frac{dX}{dt} = \eta'\, X^{\nu} C^{1-\nu}, \qquad \frac{dY}{dt} = \eta''\, Y^{\nu} C^{1-\nu} \tag{5}$$

In this case one finds two first integrals of the continuous Eqs. (5), and one can then prove synchronization with the methods of section 3. It is interesting to consider what happens when the fastest replicator gives a smaller contribution to the growth of the whole container than the other one, e.g., to consider the case α′ < α″ and η′ > η″ [9]. In the linear case (ν = 1) one finds that the fastest replicator displaces the other one, whose quantity per protocell eventually vanishes: the "altruist" gets extinct in the long run. On the other hand, if ν < 1 the two can co-exist, and they tend asymptotically to a situation where their relative ratio is proportional to that of their kinetic coefficients η′ and η″.
5. Interacting replicators

In the case considered in section 4 there were different replicators in the same container, but they did not directly affect each other's synthesis. Let us now consider the case where replicators interact in a linear way. The model equations for the continuous growth between two successive divisions are then
$$\frac{dX}{dt} = C^{\beta-1}\, M X, \qquad \frac{dC}{dt} = C^{\beta-1}\, \alpha \cdot X \tag{6}$$
where the matrix element M ij describes the effect of molecule of type j on the growth rate of molecule of type i. By considering the case β = 1 and using the techniques of section 3 one finds [11] the following conditions for the asymptotic value of the quantity of X at the beginning of each replication cycle X ∞ :
M X∞ = λ X∞ ,   λ = ln 2 / ΔT∞    (7)
therefore X∞ must be an eigenvector of the matrix M belonging to the eigenvalue λ. It can also be proven that
X(T_k) = e^(M (T_k − T_0)) X_0 / 2^(k−1)    (8)
From Eq. (8), by considering the limit of very large times, one finds that the eigenvalue which must be considered in Eq. (7) is the one with the largest real part, let us call it λ1. Physical interpretation of these results requires that λ1 (which is related to the duplication time through ΔT∞ = ln 2 / λ1) be real and positive, and that X∞ be real and nonnegative (some components may vanish in the long time limit). If the matrix M is nonnegative and non-null (i.e. if every M_ij ≥ 0 and there is at least one M_ij ≠ 0) both conditions are guaranteed by the Perron-Frobenius theorem, and in this case it can be proven that synchronization is always achieved. Numerical simulations [1] show that this is also the case whenever λ1 is real and admits a nonnegative eigenvector. When these conditions are not fulfilled, one often finds that some species get extinct, and that the above conditions apply to the remaining reduced system of equations. However, when the eigenvalue with the largest real part does not admit a nonnegative eigenvector, cases where synchronization does not take place may be observed. So the analytical theory is able to describe all those cases where the eigenvector corresponding to λ1 is nonnegative^a, while simulations are required when this condition is not satisfied.
^a And also the trivial cases where there is no eigenvalue with a nonnegative real part.
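The eigenvalue condition of Eq. (7) is easy to check numerically. The sketch below uses plain power iteration on a small, hypothetical nonnegative interaction matrix (the entries are invented for illustration): for a nonnegative irreducible matrix the iteration converges to the Perron root λ1 and to an eigenvector that can be taken nonnegative.

```python
from math import log

# Hypothetical nonnegative interaction matrix M (M[i][j]: effect of
# replicator j on the growth rate of replicator i) -- illustrative values.
M = [[0.5, 0.2, 0.0],
     [0.1, 0.4, 0.3],
     [0.0, 0.2, 0.6]]

def perron(M, iters=500):
    """Power iteration: for a nonnegative irreducible matrix it converges
    to the eigenvalue of largest modulus (the Perron root) together with
    an eigenvector normalized to have unit sum, hence nonnegative."""
    n = len(M)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                  # Rayleigh-like estimate, since sum(v) = 1
        v = [x / lam for x in w]
    return lam, v

lam1, x_inf = perron(M)
dT_inf = log(2) / lam1                # asymptotic division time, from Eq. (7)
```

For a nonnegative matrix the Perron root is bounded by the smallest and largest row sums (here 0.7 and 0.8), so synchronization is predicted, with division time ΔT∞ = ln 2 / λ1.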
The Growth of Populations of Protocells
647
When the replicators interact in a nonlinear way, although analytical theory may provide useful results, simulation is the main tool to explore the system behavior. Preliminary experiments with some models of this kind show that, while synchronization is the most frequent outcome, interesting dynamical phenomena can also be observed, where the system approaches synchronization and looks almost synchronized for fairly long times, but then this apparently stable situation abruptly changes. Synchronization may then be recovered after a turbulent transient. Nonlinear replication kinetics needs further exploration.
6. Conclusions

We have seen in section 4 that two replicators which are found in the same protocell, and which grow in a parabolic way (i.e. with sublinear kinetics, ν < 1), can coexist. This phenomenon is typically observed also in population dynamics (i.e. without containers): sublinear kinetics leads to asymptotic coexistence of several species [3], a phenomenon which has been called “the survival of anybody”. On the other hand, selection pressure is much more effective in the Darwinian case: the survival of only the fittest is guaranteed in population dynamics if the leading term in the kinetic equations is linear. Note that this corresponds to exponential growth, i.e. constant doubling time. But synchronization guarantees that this is exactly the case for protocells: even if the replicators interact in a parabolic way, the containers undergo exponential growth. Therefore, if different types of protocells exist, we can expect Darwinian dynamics among them. While we have proven this for the case of the surface reaction model, the same phenomenon has also been observed in different models [4]. Therefore selection pressure might be much more effective at the protocell level than at the molecular level. While synchronization is an interesting phenomenon per se, this remark shows that it may have profound effects on the evolvability of protocell populations.
Acknowledgments

Support from the EU FET-PACE project within the 6th Framework Program under contract FP6-002035 (Programmable Artificial Cell Evolution) is gratefully acknowledged. We had stimulating and useful discussions during the warm hospitality at the European Center for Living Technology in two workshops which were held on March 16-18, 2006 and on March 18-19, 2007.
648
R. Serra et al.
References

1. A. Filisetti, M.Sc. thesis (Dept. of Social, Cognitive and Quantitative Sciences, Modena and Reggio-Emilia University, 2007).
2. T. Ganti, Chemoton Theory (Vol. I: Theory of Fluid Machineries; Vol. II: Theory of Living Systems) (Kluwer Academic/Plenum Publishers, New York, 2003).
3. J. Maynard-Smith and E. Szathmary, Major transitions in evolution (Oxford University Press, New York, 1997).
4. A. Munteanu, C.S. Attolini, S. Rasmussen, H. Ziock and R.V. Solé, Generic Darwinian selection in protocell assemblies, SFI Working Paper 06-09-032 (Santa Fe Institute, Santa Fe, 2006).
5. T. Oberholzer, R. Wick, P.L. Luisi and C.K. Biebricher, Biochemical and Biophysical Research Communications 207, 250-257 (1995).
6. S. Rasmussen, L. Chen, M. Nilsson and S. Abe, Artificial Life 9, 269-316 (2003).
7. S. Rasmussen, L. Chen, B. Stadler and P.F. Stadler, Origins of Life and Evolution of the Biosphere 34, 171-180 (2004).
8. S. Rasmussen, L. Chen, D. Deamer, D.C. Krakauer, N.H. Packard, P.F. Stadler and M.A. Bedau, Science 303, 963-965 (2004).
9. R. Serra, T. Carletti and I. Poli, Artificial Life 13, 1-16 (2007).
10. R. Serra, T. Carletti and I. Poli, in BIOMAT 2006, Ed. R.P. Mondaini and R. Dilão (World Scientific, Singapore, 2007).
11. R. Serra, T. Carletti, I. Poli, M. Villani and A. Filisetti, Submitted to ECCS-07: European Conference on Complex Systems (2007).
INVESTIGATING CELL CRITICALITY

R. SERRA (1), M. VILLANI (1), C. DAMIANI (1), A. GRAUDENZI (1), P. INGRAMI (1), A. COLACCI (2)
(1) Dipartimento di Scienze Sociali, Cognitive e Quantitative, Università di Modena e Reggio Emilia, Via Allegri 9, 42100 Reggio Emilia, Italia
(2) Excellence Environmental Carcinogenesis, Environmental Protection and Health Prevention Agency Emilia-Romagna, Viale Filopanti 22, Bologna, Italia

Random Boolean networks provide a way to give a precise meaning to the notion that living beings are in a critical state. Some phenomena which are observed in real biological systems (distribution of "avalanches" in gene knock-out experiments) can be modeled using random Boolean networks, and the results can be analytically proven to depend upon the Derrida parameter, which also determines whether the network is critical. By comparing observed and simulated data one can then draw inferences about the criticality of biological cells, although with some care because of the limited number of experimental observations. The relationship between the criticality of a single network and that of a set of interacting networks, which simulate a tissue or a bacterial colony, is also analyzed by computer simulations.

Keywords: Random Boolean networks, cell criticality, interacting networks.
1. Introduction

The idea that complex adaptive systems are driven to a “critical” state has been proposed by different authors [11,3,2,10], although with somewhat different meanings, as a powerful general principle, which could be useful to understand biological as well as social systems. In order to make precise statements about this hypothesis, and to test it against available experimental data, it is convenient to provide a precise, albeit not all-encompassing, definition of criticality. Random Boolean networks (RBNs for short) [8,9] are particularly interesting in this regard as they allow such a precise statement to be made. They represent a well-known model of genetic networks which has proven fruitful, as it has made it possible to uncover some features of the relationship between genome size and number of cell types in multicellular organisms, as well as of the relationship between genome size and typical length of the cell cycle. Two kinds of dynamical regimes are usually observed in RBNs, an “ordered” and a “disordered” one (the name “chaotic” is also sometimes used in
the latter case, although it should be kept in mind that the system attractors, in the case of finite size networks, are always cycles). Networks with different parameters tend to be either in the ordered or in the disordered regime [9,4,17]. RBNs will be briefly described in section 2. Recently, it has been shown that random Boolean networks can also accurately describe the statistical properties of perturbations in gene expression (“avalanches”, defined in section 3) induced by silencing single genes, one at a time, in the yeast S. cerevisiae. It is also possible to relate the distribution of avalanches to a parameter (the Derrida parameter) which determines whether a cell is in the critical state and, by comparing the results of theoretical analyses and computer simulations with those of the actual experiments, it is possible to draw inferences about the value of this parameter in S. cerevisiae cells. Section 3 is dedicated to a discussion of this approach. There is suggestive evidence that the cells which have been examined are in an ordered state, but the value of the Derrida parameter is close to the one which corresponds to criticality. Note that the arguments in favor of the fact that life tends to be found “at the edge of chaos” apply to organisms as a whole, not to isolated cells. Many organisms tend to form colonies, where cells grow close to each other and communicate by transferring molecules to each other. Intercellular communication is even more intense in tissues of multicellular organisms. It is then important to understand the relationship between the dynamics of isolated cells and that of a collection of interacting cells. When a critical cell interacts with others, what is the overall dynamics? Is it more or less ordered? In order to investigate this issue it is possible to use a cellular automaton model, where each cell site is occupied by a RBN, which simulates a single cell. 
The interaction is modeled by letting the expression of some genes be influenced not only by the genes which are in the same cell, but also by the neighboring ones, in a way which mimics the transmembrane transfer of proteins or other molecules. Section 4 is dedicated to a description of the results of these studies. The final section is dedicated to critical comments and indications for further research. Both the results concerning the distribution of avalanches in gene expression, and those concerning the dynamical properties of interacting RBNs, have been to a large extent published in, or submitted to, technical journals, which are quoted in the appropriate sections and where references to other relevant works can also be found. The original aspect of the present paper is that of focusing the discussion on the issue of cell criticality.
2. Random Boolean networks

There are some excellent reviews and books on RBNs [9,10,1] so we will briefly summarize here only their main features. Let us consider a network composed of N genes, or nodes, which can take either the value 0 (inactive) or 1 (active). In a classical RBN each node has the same number of incoming connections k_in, and its k_in input nodes are chosen at random with uniform probability among the remaining N − 1 nodes (multiple connections from the same node being prohibited). It then turns out that the distribution of outgoing connections per node follows a Poisson distribution. The output (i.e. the new value of a node) corresponding to each set of values of the input nodes is determined by a Boolean function, which is associated to that node, and which is also chosen at random, according to some probability distribution. The simplest choice is that of a uniform distribution among all the possible Boolean functions of k_in arguments. However, a careful analysis of some biological control circuits has shown that there is a strong bias in favour of the so-called “canalyzing” functions [6], where at least one value of one of the input nodes uniquely determines the output, independently of the values of the other input nodes. Neither the topology nor the Boolean function associated to each gene changes in time. The network dynamics is discrete and synchronous. In order to analyze the properties of an ensemble of random Boolean networks, different networks are synthesized and their dynamical properties are examined. While individual realizations may differ markedly from the average properties of a given class of networks [4], one of the major results is the discovery of the existence of two different dynamical regimes, an ordered and a disordered one, divided by a “critical zone” in parameter space.
Attractors are always cycles in finite RBNs: in the ordered regime their length scales as a power of N , moreover in this regime the system is stable with respect to small perturbations of the initial conditions. In the disordered regime the length of the cycles grows exponentially with N , and small changes in initial conditions often lead to different attractors. For fixed N , the most relevant parameter which determines the kind of regime is the connectivity per node, k : one typically observes ordered behavior for small k , and a disordered one for larger k . The parameter which determines whether a network is in the ordered or in the disordered regime is the so-called Derrida parameter, which measures the rate at which nearby initial states diverge. For a more detailed discussion, the reader is referred to [9,1,4,17].
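The model just described is compact enough to implement in a few lines. The sketch below (ours, not the authors' code) builds a classical RBN with k_in = 2, updates it synchronously, and finds the length of the cycle reached from a random initial state; the network size and seed are illustrative.

```python
import random

def make_rbn(n, k_in, rng):
    """Classical RBN: each node gets k_in distinct random inputs (self
    excluded) and a random Boolean function, stored as a truth table."""
    inputs = [rng.sample([j for j in range(n) if j != i], k_in)
              for i in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_in)]
              for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update: all nodes read their inputs at the same time."""
    new = []
    for ins, tab in zip(inputs, tables):
        idx = 0
        for j in ins:
            idx = (idx << 1) | state[j]
        new.append(tab[idx])
    return tuple(new)

def cycle_length(state, inputs, tables, max_steps=10000):
    """Iterate until a state recurs; in a finite RBN this always happens,
    so the attractor is necessarily a cycle."""
    seen = {}
    for t in range(max_steps):
        if state in seen:
            return t - seen[state]
        seen[state] = t
        state = step(state, inputs, tables)
    return None

rng = random.Random(42)
n = 12
inputs, tables = make_rbn(n, 2, rng)
state = tuple(rng.randint(0, 1) for _ in range(n))
length = cycle_length(state, inputs, tables)  # 2^12 states < max_steps
```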
3. Avalanches in gene expression data

The experiments discussed below are described in [7], while the theoretical analyses are discussed in depth in [13,14,15]: the reader interested in a deeper understanding of these topics is referred to these works. Hughes and co-workers have performed several experiments where a single gene of S. cerevisiae has been knocked out and, using DNA microarrays, have compared the expression levels of all the genes of such perturbed cells with those of normal, wild type cells. The knock-out experiment can be simulated in silico by comparing the evolution of two RBNs which start from identical initial conditions, except for the fact that one gene (the “knocked-out” one) is clamped permanently to the value 0 in the network which simulates the perturbed cell. The results of both experiments and simulations can be described by the distribution of "avalanches": an avalanche is the number of genes which are modified in a given experiment. In order to compare continuous experimental data with the results of Boolean models it is necessary to define a threshold for the former, so that two expression levels are considered "different" if their ratio exceeds the threshold. The initial simulations were performed using a classical RBN with 2 input connections per node, restricting the set of Boolean functions to the so-called canalyzing ones, for the reasons given in Section 2. The agreement of the simulation results with the experimental distribution of avalanches is very good. This was fairly surprising, since the simplest model with practically no parameters, where all nodes have an equal number of inputs (a condition which is certainly not satisfied in real cells), was used. It was then possible to determine analytically that the distribution of avalanches in RBNs, as long as they involve a number of genes which is much smaller than the total number of genes in the network, depends only upon one relevant parameter.
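The in-silico knock-out can be sketched as follows: evolve a wild-type and a perturbed copy of the same RBN from the same initial state, with the knocked-out gene clamped to 0, and count the genes whose values ever differ. This is our own minimal reading of the procedure (the published, threshold-based criterion on expression levels is more refined); sizes and seed are illustrative.

```python
import random

def make_rbn(n, rng):
    """Classical RBN with 2 inputs per node (as in the simulations
    described in the text) and random truth tables."""
    inputs = [rng.sample([j for j in range(n) if j != i], 2) for i in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(4)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    return tuple(tables[i][(state[a] << 1) | state[b]]
                 for i, (a, b) in enumerate(inputs))

def avalanche_size(inputs, tables, init, ko_gene, t_max=100):
    """A gene counts as 'modified' if its value in the knock-out run ever
    differs from the wild-type run (our simplification)."""
    def clamp(s):
        return tuple(0 if g == ko_gene else v for g, v in enumerate(s))
    wt, ko = init, clamp(init)
    modified = set()
    for _ in range(t_max):
        wt = step(wt, inputs, tables)
        ko = clamp(step(ko, inputs, tables))   # gene kept permanently at 0
        modified |= {g for g in range(len(wt)) if wt[g] != ko[g]}
    return len(modified)

rng = random.Random(1)
n = 30
inputs, tables = make_rbn(n, rng)
init = tuple(rng.randint(0, 1) for _ in range(n))
sizes = [avalanche_size(inputs, tables, init, g) for g in range(n)]
```

Repeating this over many networks and knock-outs yields the avalanche distribution that is compared with the microarray data.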
Let q ≤ 1 be defined as follows. For a node chosen at random, say node A , suppose that one (and only one) of its inputs, also chosen at random, is changed: then q is the probability that node A does not change its value. Let pn be the probability that an avalanche involves n nodes, and let pout (k ) be the probability that a node has k outgoing connections. It can be proven that the distribution of avalanches depends only upon the distribution of outgoing connections: that’s why a simple model with an equal number of input links per node may work well. All the pn can be found from the knowledge of the “outgoing” moment generating function F :
F = Σ_{m=0}^{N−1} q^m p_out(m)    (1)
In classical RBN pout (k ) is Poissonian, and in this case it can be proven that
F = e^(−λ) ,   λ ≡ (1 − q) ⟨k⟩    (2)
where ⟨k⟩ is the average number of outgoing connections per node.
Here λ is indeed the Derrida exponent, which also determines the network dynamical regime (cf. section 2). Therefore the distribution of avalanches depends only upon a single parameter, namely the Derrida exponent. The simple model which we had used in our simulations had a value of this parameter slightly smaller than the critical one, and this turned out to be a fortunate choice. As suggested in [12], the dependency of the distribution of avalanches on λ can then be used to try to infer the value of the parameter which corresponds to it in real cells, and which should discriminate between ordered and disordered states. Among the different cases which have been considered, the best agreement (according to the well-known χ² measure) with experimental data is provided by the case where λ = 6/7, slightly smaller than the critical value 1. This supports the suggestion that life forms tend to exist at the critical state or in the ordered region, close to criticality [18]. Note however that, since only a single data set is available, it would be inappropriate to draw definite conclusions concerning this point.
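For intuition (our gloss, not the authors' derivation): when out-degrees are Poissonian, small avalanches behave like a Galton-Watson branching process in which each modified gene perturbs on average λ other genes, so for the subcritical best-fit value λ = 6/7 the mean avalanche size should approach 1/(1 − λ) = 7. A quick Monte Carlo check:

```python
import random
from math import exp

def poisson(lam, rng):
    """Poisson sample via Knuth's multiplication method (fine for small lam)."""
    L, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def avalanche(lam, rng, cap=100_000):
    """Total progeny of a branching process with Poisson(lam) offspring,
    starting from the single knocked-out gene."""
    total = frontier = 1
    while frontier and total < cap:
        frontier = sum(poisson(lam, rng) for _ in range(frontier))
        total += frontier
    return total

rng = random.Random(0)
lam = 6 / 7                       # best-fit Derrida parameter from the text
n_samples = 20000
mean_size = sum(avalanche(lam, rng) for _ in range(n_samples)) / n_samples
# subcritical branching theory predicts E[size] = 1 / (1 - lam) = 7
```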
4. Interactions among random Boolean networks

The simulations of the interaction among Boolean networks are described in detail in [19,16,5]. Our interest here concerns the behavior of interacting critical networks, and the main question concerns the effects of interaction on the dynamical regime. In order to model the interaction let us consider a 2D square lattice cellular automaton with M^2 cells, each of them being occupied by a complete RBN. The neighborhood is of the von Neumann type (composed of the cell itself and its N, E, S, W neighbors). We assume wrap-around, so the overall topology is toroidal. Every RBN of the automaton is structurally identical, while the initial activation states of the various genes may differ. In particular, each of the RBNs has the following common features:
1. the same number (N) of Boolean nodes;
2. the same topology, i.e. the same ingoing and outgoing connections for each node of the network;
3. the same Boolean function associated to each node.
A key aspect of the model is the representation of interactions: the fact that some proteins can pass from one cell to another is modeled by assuming that a cell can be affected by the activation of some genes of a neighboring cell. Nodes able to interact with other cells are defined as shared nodes, and they are a subset of the total number of nodes of the RBN. Let f be the fraction of interacting nodes. We define as elementary value of a certain node the value computed according to its Boolean function and to the values of its input nodes, belonging to the same RBN. The shared value of a shared node, instead, is calculated taking into account also the activation values of the nodes of its neighboring cells, according to a precise interaction rule. In our initial study we concentrated on the “AT LEAST ONE ACTIVE” (ALOA) rule, where the shared value of a node x in the cell A is 1 if its value or at least one of those of the nodes x in the four neighboring cells is 1 (and it is 0 otherwise). Let a G-automaton (or, equivalently, a G-colony) be a set of interacting cells, defined by:
• the topology of interaction T (in our case this is fixed)
• the interaction rule R
• the interaction strength, measured by the fraction f of shared nodes
• the genome G of the RBNs which are placed in each cell of the automaton
We have considered a number of different indicators of the degree of order of the tissue, and we have observed that there is no common tendency in all the networks towards either a more ordered or a more disordered behavior as the interaction strength grows. So it seems that one cannot simply claim that interaction favors order or disorder. In order to measure the influence of interaction on the degree of order, a useful variable is
Ω = DA + CWA    (3)
where DA is the number of different attractors of a definite G-automaton, while CWA is the number of cells whose RBN reached no attractor. The number of different attractors can be considered as an indicator of the homogeneity of the cells in the G-automaton. Yet, in many cases some cells could reach no attractor, and their number would not be counted in this variable. Adding the number of cells with no attractor to the number of different
attractors is a way to compensate for this effect. Thus, Ω is a variable which measures a kind of order; it is indeed a decreasing function of the degree of order, which attains its minimum value (1) when the order is maximum, and its maximum when the order is minimum (all the cells either reach no attractor or reach different attractors). The analysis of several G-automata demonstrates the presence of three recognizable kinds of behavior, concerning the dependency of Ω upon f:
• Ω constant and equal to 1: all the G-automata reach the same attractor, independently of the value of f and also in absence of interaction. The attractors of this class of G-automata are fixed points
• increasing Ω: it reaches a maximum when f = 1
• bell-shaped Ω: we define as bell-shaped a curve with a single maximum for f ∉ {0, 1}.
It has proven convenient to introduce a further sub-distinction among the G-automata characterized by a bell-shaped curve of Ω:
• left-oriented bell-shaped: the maximum of the curve is for f ≤ 0.5
• right-oriented bell-shaped: the maximum is for f > 0.5.
The above criterion divides the G-automata into classes according to a measure of the way in which their behavior changes as a function of the strength of interactions among neighboring cells. The different classes tend to have a similar behavior also with respect to other order indicators, which are described in [16,5]. The most interesting observation, however, is that the average period of the attractors observed at f = 0 (i.e. non-interacting cells) provides useful information to predict the class to which a particular G-automaton belongs, and therefore to forecast whether increasing interaction leads, for a given genome, to a more ordered or to a more disordered behavior. In particular, RBNs which, in isolation, have long periods tend to become more disordered as the interaction strength increases, while on the other hand cells with short attractors tend to become more ordered.
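A toy version of the G-automaton and of Ω can be written down directly from the definitions above. Everything below is an illustrative sketch under our own assumptions (in particular, the ALOA rule is applied to the neighbors' elementary values, and Ω is computed from the global attractor when one is found within the step budget); sizes and seed are arbitrary.

```python
import random

def make_rbn(n, rng):
    """One genome G, shared by all cells: 2 random inputs per node
    plus a random truth table."""
    inputs = [rng.sample([j for j in range(n) if j != i], 2) for i in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(4)] for _ in range(n)]
    return inputs, tables

def neighbors(x, y, m):
    """von Neumann neighbors on an m x m torus (wrap-around)."""
    return [((x - 1) % m, y), ((x + 1) % m, y),
            (x, (y - 1) % m), (x, (y + 1) % m)]

def automaton_step(grid, inputs, tables, shared, m):
    """Each cell first computes elementary values; shared nodes then take
    their ALOA value: 1 if the node is 1 in the cell or in any neighbor."""
    elem = {c: tuple(tables[i][(s[a] << 1) | s[b]]
                     for i, (a, b) in enumerate(inputs))
            for c, s in grid.items()}
    out = {}
    for (x, y), e in elem.items():
        s = list(e)
        for i in shared:
            if any(elem[nb][i] for nb in neighbors(x, y, m)):
                s[i] = 1
        out[(x, y)] = tuple(s)
    return out

def canon(cycle):
    """Canonical rotation of a cyclic sequence, so that the same attractor
    observed at different phases compares equal."""
    return min(cycle[i:] + cycle[:i] for i in range(len(cycle)))

def omega(grid, inputs, tables, shared, m, cap=2000):
    """Omega = DA + CWA (Eq. (3)): distinct per-cell attractors plus the
    cells that reached no attractor within the step budget."""
    seen, traj = {}, []
    for t in range(cap):
        key = tuple(sorted(grid.items()))
        if key in seen:
            cycle = traj[seen[key]:]
            per_cell = [tuple(st[c] for st in cycle) for c in grid]
            return len({canon(p) for p in per_cell})  # DA, with CWA = 0
        seen[key] = t
        traj.append(grid)
        grid = automaton_step(grid, inputs, tables, shared, m)
    return m * m                                      # DA = 0, CWA = all cells

rng = random.Random(7)
n, m = 6, 2
inputs, tables = make_rbn(n, rng)
grid0 = {(x, y): tuple(rng.randint(0, 1) for _ in range(n))
         for x in range(m) for y in range(m)}
Omega = omega(grid0, inputs, tables, shared=[0, 1], m=m)  # f = 2/6
```

Sweeping f from 0 to 1 and plotting Ω reproduces, on larger grids, the constant, increasing and bell-shaped classes discussed above.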
5. Conclusions

Concerning avalanches, although our best estimate is slightly smaller than the critical value, it must be observed that the available data are not yet conclusive: the best estimate of Ramo et al. [12] for the Derrida parameter coincides exactly with the critical value. Shmulevich, Kauffman and Aldana [18] have studied a
different system (time courses in HeLa cells) and found estimates for λ in the critical or ordered region. Interestingly enough, when one tries to simulate the distribution of avalanches using random Boolean networks with a scale-free distribution of outgoing connections (SFRBNs for short) one obtains a reasonably good agreement with the experimental data, except for the largest avalanches. When the parameters are chosen in such a way that there is a good agreement with the distribution of the (by far most frequent) small avalanches, one finds that in scale-free RBNs the number of large avalanches is greater than that observed in their classical counterparts, and greater than that observed in real data [15]. However this last phenomenon may be due to the limited number of gene knockouts which have been performed (227 in a network with 6325 genes). Since large avalanches are related to the silencing of a hub node, the possibility that such a hub has never been hit, although unlikely, cannot be ruled out. Nonetheless, the comparison of the observed avalanche distribution with theoretical behavior provides the most direct way to investigate the issue of cell criticality so far devised. Concerning the interactions of Boolean networks, the most interesting aspect seems to be that interaction tends to amplify some peculiar features of a given network. Briefly, ordered networks become more ordered when they interact with other networks of the same kind, while disordered networks become more disordered. One could develop interesting speculations about the interplay of evolution with such dynamical properties.
Acknowledgments

This work has been partially supported by the Italian MIUR-FISR project nr. 2982/Ric (Mitica).
References

1. M. Aldana, S. Coppersmith and L.P. Kadanoff, in Perspectives and Problems in Nonlinear Science, Ed. E. Kaplan, J.E. Marsden and K.R. Sreenivasan (Springer, New York, 2003), also available at arXiv:cond-mat/0209571.
2. P. Bak, How Nature Works (Springer, New York, 1996).
3. P. Bak, C. Tang and K. Wiesenfeld, Phys. Rev. A 38, 364 (1988).
4. U. Bastolla and G. Parisi, Physica D 115, 219-233 (1998).
5. C. Damiani, M.Sc. thesis (Dept. of Social, Cognitive and Quantitative Sciences, Modena and Reggio-Emilia University, 2007).
6. S.E. Harris, B.K. Sawhill, A. Wuensche and S.A. Kauffman, Complexity 7, 23-40 (2002).
7. T.R. Hughes et al., Cell 102, 109-126 (2000).
8. S.A. Kauffman, Curr. Top. Dev. Biol. 6, 145-182 (1971).
9. S.A. Kauffman, The Origins of Order (Oxford University Press, New York, 1993).
10. S.A. Kauffman, At Home in the Universe (Oxford University Press, New York, 1995).
11. C.G. Langton, in Emergent Computation, Ed. S. Forrest (MIT Press, Cambridge, MA, 1991).
12. P. Ramo, J. Kesseli and O. Yli-Harja, J. Theor. Biol. 242, 164 (2006).
13. R. Serra, M. Villani and A. Semeria, J. Theor. Biol. 227, 149-157 (2004).
14. R. Serra, M. Villani, A. Graudenzi and S.A. Kauffman, J. Theor. Biol. 246, 449-460 (2007).
15. R. Serra, M. Villani, A. Graudenzi, A. Colacci and S.A. Kauffman, Submitted to ECCS-07: European Conference on Complex Systems (2007).
16. R. Serra, M. Villani, C. Damiani, A. Graudenzi, A. Colacci and S.A. Kauffman, Submitted to ECCS-07: European Conference on Complex Systems (2007).
17. J.E.S. Socolar and S.A. Kauffman, Phys. Rev. Lett. 90 (2003).
18. I. Shmulevich, S.A. Kauffman and M. Aldana, PNAS 102, 13439-13444 (2005).
19. M. Villani, R. Serra, P. Ingrami and S.A. Kauffman, in Cellular Automata, LNCS 4173 (Springer, Berlin/Heidelberg, 2006), pp. 548-556.
RELATIVISTIC STABILITY. PART 1 - RELATION BETWEEN SPECIAL RELATIVITY AND STABILITY THEORY IN THE TWO-BODY PROBLEM
UMBERTO DI CAPRIO
Stability Analysis s.r.l., Via Andrea Doria 48/A - 20124 Milano, Italy
E-mail:
[email protected]

With reference to the restricted two-body problem we show that Stability Theory (ST) and Special Relativity (SR) can be joined together in a new theory that explains a large class of physical phenomena (e.g. black-holes, cosmological dynamics) and overcomes the dualism between SR and General Relativity (GR). After recalling the main features of ST (from the method of Lyapunov to more recent developments up to the analysis of fractals) we determine the canonic relativistic equations of the restricted two-body problem. A substantial novelty with respect to known formulations is pointed out: three state variables (and not only two) are needed for “defining” said equations. They include the variable v (magnitude of the rotation speed) in addition to the radius and to the radial speed. By means of eigenvalue analysis and by application of the Lyapunov theorem on stability in the first approximation we show that linearized system analysis gives only a necessary condition for stability: the radius must be greater than half the Schwarzschild radius. The derivation of a sufficient condition passes through the definition of a convenient Lyapunov function that represents the “local energy” around a given Equilibrium Point. Such derivation is deferred to Part II and results in the proof that the Schwarzschild radius actually represents the reference stable radius of the two-body problem.

Keywords: stability theory, Lyapunov function, special relativity theory.
1. Introduction

Current theories of emergence are shaped on two main frameworks introduced a long time ago: the theory of stability and relativity theory. The importance of the former doesn't need to be emphasized, as it constitutes, already from the times of Von Bertalanffy, one of the pillars of systemics. On the other hand, relativity theory, already in the special case and more strongly in the general one, was the first framework in which it was possible to overcome the traditional cause-effect relationship, owing to the presence of nonlinear terms which, while needing a more complex mathematical apparatus, open the way to theories of pattern formation. It is therefore of utmost importance for systemics to investigate what occurs when stability theory and relativity theory meet in the study of particular large systems, such as stars, galaxies, or the whole universe.
No general study is so far available about the relationships between Stability Theory and Special Relativity. Here we present results referring to the “restricted” two-body problem. We start with the following question: in which way does the fact that the rotation speed cannot exceed the speed of light influence the problem? Also, what is the form of the “canonic” state equations when relativistic effects are taken into account? What constraints are imposed by stability? To introduce the discussion we start with an intuitive reasoning: consider a circular orbit, with constant values of the radius R and of the magnitude of the rotation speed v. Such an orbit identifies a dynamical Equilibrium Point in the sense of Stability Theory, namely a point whose coordinates are R = const, dR/dt = 0, v = const. In addition d²R/dt² = 0, and then the acting forces balance each other, i.e. F = Fc, with F = GMm_0/R² the gravitational force and Fc = m_0 v²/R the centrifugal force. It follows from the above that
GMm_0 / R² = m_0 v² / R   →   GM / R = v²    (1)
and so, the relativistic condition v < c results in
GM / c² ≤ R   →   R ≥ R_MIN ,   with   R_MIN = GM / c²    (2)
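As a quick numerical illustration of Eqs. (1)-(2) (our addition, using standard SI values for the constants), the minimum equilibrium radius for one solar mass is about 1.48 km, and the circular-orbit speed reaches c exactly there:

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8         # speed of light, m s^-1
M_sun = 1.989e30    # solar mass, kg

def r_min(M):
    """Eq. (2): R_MIN = GM/c^2, half the Schwarzschild radius 2GM/c^2."""
    return G * M / c**2

def orbit_speed(M, R):
    """Eq. (1): v = sqrt(GM/R) for a circular orbit of radius R."""
    return (G * M / R) ** 0.5

R = r_min(M_sun)            # about 1.48e3 m for one solar mass
v = orbit_speed(M_sun, R)   # reaches c at the minimum radius
```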
This means that no equilibrium can exist if the radius is smaller than R_MIN. Since the existence of an Equilibrium is a preliminary condition for stability, we can see that SR by itself puts a stability problem into evidence. Also, the aforesaid minimum radius is half the Schwarzschild radius. This lets us foresee a possible connection between the purely mathematical Schwarzschild singularity and a physical condition derivable from stability. Consequently the known theory of black-holes could be reset on more solid grounds, and the connections between SR and GR can be explored in a new light, along lines that represent a significant evolution of studies by various authors in the sixties (e.g. P. Caldirola of the University of Milan, H. Bondi of Cambridge University) and in the eighties (G. Jumarie, University of Quebec); the author of the present work belongs to the school of Milan. Another source of inspiration is the “post-Newtonian approximation” by S. Weinberg (1972) [39]. That being said, we think it important to recall something general about Stability theory and its practical applications.

A complex physical system formed by interacting subsystems possesses defined functional characteristics or satisfies assigned requirements in a steady-state mode (Equilibrium), which must be kept at all times for sudden
“disturbances”, with an adequate margin of “safety”. Stability either represents the fruit of a deliberate design or, in the minimal case, it should be considered a cogent hidden property whose roots need anyhow to be explored (in view, e.g., of detecting the number and kind of forces that in reality operate on the system, or in view of deriving physical constraints on the system parameters). Elementary examples of complex systems are the Atom, the Proton, an Electric Power System. Properties to be maintained are, respectively, the emission spectrum of the Atom, the mass of the Proton, the capability of a multimachine electric power system of meeting the “demand” without losing synchronism (50 or 60 Hz). In our schematization the “disturbances” are either initial conditions in the classical sense or step disturbances that primarily determine a change of the reference Equilibrium Point. Stability analysis has a very solid theoretical foundation and a fairly wide coverage in the literature (please mainly refer to the book by W. Hahn, “Theory and Application of Lyapunov's Direct Method”). It dates back to 1892, before the formulation of Special Relativity and well before the formulation of Quantum theory. We recall, among others, the developments and extensions due to M.A. Aizerman, H.A. Antosiewicz, N.G. Chetaev, G.N. Duboshin, N.P. Erugin, E. Gibson, D.R. Inwerson, R.E. Kalman, S. Lefschetz, S. Letov, A.I. Lure, I.G. Malkin, S.K. Persidskii, K.P. Persidskii, V.M. Popov, G.K. Pozarickii, B.S. Razumchin, J.N. Roiteberg, E.N. Rozenwasser, D.G. Schultz, V.M. Stazinskii, G.P. Szegö, O. Taussky. More recently: E. Barbashin and N.V. Krasovskii [4], J.P. Lasalle [28], V.V. Nemyskii and V.V. Stepanov [32], which includes “dynamical systems defined in metric spaces”, N.N. Krasovskii [25], J.K. Hale [24], R.D. Driver [21], who extended the Lyapunov Method to functional differential equations in infinite dimensional space, V.I.
Zubov [43], who extended the stability theory to partial differential equations, also giving a method for determining the Region of Asymptotic Stability. More recent extensions regard: strange attractors, fractals, the theory of catastrophes, the Whitney theory of singularities. For these subjects it is suitable to quote the following works: H. Whitney, “Singularities” [41]; B.B. Mandelbrot [31], “Fractals”; R. Thom [38], “Theory of catastrophes”; and V.I. Arnol'd [3] for a more rigorous treatment of all the above topics. A variety of applicative works have been published, among which those dealing with electric power systems (EPS) [15] and those dealing with the cosmological problem [16]. The former show a definite way for the study of dissipative systems with n degrees of freedom. The energy function is no longer an “integral” of the motion but represents a Lyapunov function. If in the neighborhood of an Equilibrium Point x₀ the rate of change dE/dt is negative, the energy will continually decrease
662
U. Di Caprio
until it finally assumes its minimum value E(x₀). A Lyapunov function in general possesses local properties: it is positive-definite at x₀ (namely it has a local minimum there) and at the same time its “time-derivative along the system trajectories” V̇(x) is negative-definite at x₀ (namely it has a local maximum there). The study of the stability of the Universe leads to more than interesting results about the present state (age, mass, radius, density, etc.) and future evolution, as well as the past evolution. We make reference to an elementary and intuitive notion of stability: if, after the occurrence of a disturbance, the system returns to the stable state, we say that the system is stable (Asymptotic stability). The system is also termed stable if it converges to another equilibrium in proximity of the initial equilibrium point (Weak stability). If the system “runs away”, so that certain physical variables go on increasing as t → ∞ or leave a convenient bounded region (i.e. the so-called Stability Region), then we say that the system is unstable. Physical systems can be described by their mathematical models. The following are some typical descriptions:

1. ẋ = f(x, u, t)   Nonlinear, time-varying and forced
2. ẋ = f(x, t)   Nonlinear, time-varying and force-free
3. ẋ = f(x, u)   Nonlinear, time-invariant and forced
4. ẋ = f(x)   Nonlinear, time-invariant and force-free (Autonomous)
where x represents the n-dimensional state vector, u represents the r-dimensional input vector and t represents the independent time variable. Here we are interested in systems of type 4). It is then understood that the internal forces are suitably “embodied” in the model, while no external force (vector u) is present. With regard to linear autonomous systems the theory of stability is well known through the methods of Nyquist, Routh-Hurwitz, etc. On the contrary, in the case of nonlinear systems no such systematic procedures exist: closed-form solutions of nonlinear differential equations are the exception rather than the rule. A.M. Lyapunov already in 1892 (“Problème général de la stabilité du mouvement”, Russian ed. 1892, reprinted in Annals of Mathematical Studies No. 17, Princeton University Press, Princeton, N.J., 1949) set forth the general framework for the solution of such a problem. He outlined two approaches, known popularly as Lyapunov's “first method” and the Second Method of Lyapunov, or the Direct Method. The distinction is based on the fact that the “first method” depends on finding approximate solutions to the differential equations, while in the Second Method no such knowledge is necessary. We make reference to the Second Method. The existing connection between the Lyapunov method and the classic techniques of analysis of linear systems is illustrated by
the following Theorem on stability “in the first approximation”, which will be utilized in this Part 1. Consider the system represented by the non-linear time-invariant equation

ẋ = f(x) ;  0 = f(0)   (3)

and the corresponding linear differential equation

ẋ = A x ;  A an (n × n) matrix ;  aᵢⱼ = ∂fᵢ/∂xⱼ evaluated at x₀   (4)
Then:
1. The Equilibrium Point x = 0 is asymptotically stable if all the eigenvalues of matrix A have a negative real part.
2. The Equilibrium Point x = 0 is unstable if at least one eigenvalue of A has a positive real part. Consequently, the condition that none of the eigenvalues of A have a positive real part represents a necessary condition for stability of x = 0.
3. If none of the eigenvalues in question has a positive real part but some eigenvalues have real part equal to zero, the non-linear differential eq. (3) has a “critical behavior” with regard to the Equilibrium. Namely, the eventual stability or instability of x = 0 cannot be derived from the analysis of the stability of the linear system (4), which represents the first-order approximation of ẋ = f(x) at x = 0.

The direct analysis of “nonlinear stability” will be addressed in Part 2 and will lead us to an absolutely original interpretation of the famous Schwarzschild radius of GR. Turning back to the present Part 1, we proceed as follows. In Sec. 2 we derive the canonic equations of the restricted two-body problem under the assumption that the mass of the rotating body varies with the velocity according to the classic Einstein relation. The canonic equations serve us in view of the subsequent stability analysis via the Lyapunov Method. Such equations are partially equivalent to those proposed by Caldirola [6], but differ in this fundamental respect: the state variables are three and not two since, in addition to the radius R and to the radial speed Ṙ, they include the magnitude of the rotation velocity v. Such a result is not surprising and shows its full potential in the study of black-holes (Part 2). In Sec. 3 we illustrate the eigenvalue analysis and apply the aforementioned Lyapunov theorem on “stability in the first approximation”. We conclude that such analysis is absolutely inadequate.
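The first-approximation theorem lends itself to a direct numerical check. The sketch below (helper names are ours, not from the paper) classifies an equilibrium of ẋ = f(x) for a 2-dimensional system by the eigenvalues of its Jacobian, returning "critical" in the zero-real-part case where, as stated in point 3 above, the linearization is inconclusive.

```python
import math

def jacobian(f, x0, eps=1e-6):
    """Numerical Jacobian of f at x0 via central differences."""
    n = len(x0)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x0), list(x0)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def classify_2d(f, x0, tol=1e-6):
    """Apply the 'stability in the first approximation' theorem to a
    2-d system; eigenvalue real parts come from trace/determinant."""
    (a, b), (c, d) = jacobian(f, x0)
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc >= 0:  # real eigenvalues (tr +- sqrt(disc)) / 2
        re = [(tr + math.sqrt(disc)) / 2, (tr - math.sqrt(disc)) / 2]
    else:          # complex conjugate pair, common real part tr / 2
        re = [tr / 2, tr / 2]
    if all(r < -tol for r in re):
        return "asymptotically stable"
    if any(r > tol for r in re):
        return "unstable"
    return "critical"

# Damped pendulum x1' = x2, x2' = -sin(x1) - 0.5 x2: stable origin.
damped = lambda x: [x[1], -math.sin(x[0]) - 0.5 * x[1]]
# Undamped pendulum: purely imaginary eigenvalues, the critical case.
undamped = lambda x: [x[1], -math.sin(x[0])]
```

The undamped pendulum illustrates exactly the situation met below in Sec. 3: a pair of purely imaginary eigenvalues, for which the theorem gives no verdict on the nonlinear system.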
In Sec. 4 we address a crucial issue which looks worthy of greater attention than usual: in the special case in which the two-body problem is formulated in a classical (i.e. non-relativistic) setting, how can we derive from the Lyapunov method sufficient conditions for stability? Moreover we exploit the principle of equivalence Potential energy / mass for deducing from our dynamic model a necessary condition for relativistic stability. The determination of sufficient conditions is deferred to Part 2, together with the study of the Schwarzschild radius.

2. Derivation of the canonic system equations of the two-body relativistic problem
Consider the plane motion of a body of rest mass m₀ under the action of a central attractive force of Newtonian type. The special relativity equation of the motion is given by

d(m v)/dt = F  →  (dm/dv)(dv/dt) v + m a = F   (5)

with

m(v) = γ m₀ = m₀/√(1 − (v/c)²) ;  a = dv/dt   (6)

Setting v = v_R + j v_T ; v = √(v_R² + v_T²) ; v_R = Ṙ ; v_T = R θ̇ and a = a_R + j a_T ; a_R = R̈ − R θ̇² ; a_T = 2Ṙθ̇ + Rθ̈ ; furthermore

F_R = −F = −ρ/R²  (ρ > 0) ;  F_T = 0   (7)

(force F is always oriented along the radius that connects the mobile point to the center of force), we derive from eq.s (5), (6) the three scalar equations

(dm/dv)(dv/dt) v_R + m a_R = F_R ;  (dm/dv)(dv/dt) v_T + m a_T = 0 ;  dm/dv = m v/(c² − v²)   (8)

Since v = √(Ṙ² + R²θ̇²), it is

v v̇ = Ṙ R̈ + R Ṙ θ̇² + R² θ̇ θ̈   (9)

On the other hand eq.s (8) bring about
[m v/(c² − v²)] v̇ v_T + m (2Ṙθ̇ + Rθ̈) = 0  →  [v v̇/(c² − v²)] R θ̇ + 2Ṙθ̇ + Rθ̈ = 0   (10)

From eq.s (9) and (10) we derive

v v̇ [1 + (Rθ̇)²/(c² − v²)] = Ṙ (R̈ − R θ̇²) = v_R a_R   (11)

and, as (Rθ̇)² = v_T² = v² − v_R², then

v v̇ = v_R a_R (c² − v²)/(c² − v_R²)   (12)

On the other hand (from (8)) we have a_R = (F_R/m)(c² − v_R²)/c², and hence equation (12) leads to

v v̇ = v_R (F_R/m)(1 − v²/c²) = (F_R Ṙ/m)(1 − v²/c²)   (13)

Taking into account that R θ̇² = v_T²/R = (v² − v_R²)/R and v_R = Ṙ, we get from the above

R̈ = F_R/m − [v v̇/(c² − v²)] Ṙ + (v² − Ṙ²)/R  →  R̈ = (F_R/m)(1 − Ṙ²/c²) + (v² − Ṙ²)/R   (14)

Eq. (14) represents an equation with the canonic form R̈ = f(R, Ṙ, v) in the three state variables R, Ṙ, v. In order to completely define our state equations we need one more equation, of the form v̇ = g(R, Ṙ, v). It can be obtained from (13):

v̇ = (Ṙ/v)(F_R/m)(1 − v²/c²)   (15)

The system of eq.s (14) and (15), where F_R = −F = −ρ/R² and m(v) is given by eq. (6), defines a dynamic and relativistic model of the 3rd order in the state variables R, Ṙ and v. The Points of Equilibrium of such a model are identified by the solutions of the system of equations
R̈ = 0 ; Ṙ = 0 ; v̇ = 0  →  F_R/m(v) + v²/R = 0 ;  R = R₀ = const ;  v = v₀ = const   (16)

Therefore the Equilibrium Points are defined to within an arbitrary constant that represents the value of the radius R₀. The corresponding value of the speed is obtained from the algebraic eq.s

γ₀ v₀²/c² = ρ/(R₀ m₀ c²) = GM/(R₀ c²) ;  γ₀ v₀²/c² = (γ₀² − 1)/γ₀

which give γ₀ as the positive solution of the 2nd-order eq.

γ₀² − γ₀ GM/(R₀ c²) − 1 = 0

and v₀ as v₀/c = √(GM/(γ₀ R₀ c²)). Note that eq. (14) has the same structure as the corresponding non-relativistic equation (given by Newton's theory) and, indeed, the latter can be directly derived from it simply by replacing m(v) with m₀ and letting c → ∞. On the other hand eq. (15) turns out to be automatically satisfied when c → ∞: in fact eq. (15) leads to

v̇ = (Ṙ/v)(F_R/m)  when c → ∞   (17)

and, since F_R/m = R̈ − Rθ̇², eq. (17) implies

v v̇ = Ṙ R̈ − R Ṙ θ̇²  when c → ∞

which, on the other hand, directly follows from computing the time derivative of both members of the equation v² = Ṙ² + (Rθ̇)² and recalling that, when c → ∞, then (2Ṙθ̇ + Rθ̈) → 0. Finally, by introducing the state variables
x₁ = R, x₂ = Ṙ, x₃ = v we find the canonic representation

ẋ₁ = x₂ ;
ẋ₂ = −[ρ/(m x₁²)] (1 − x₂²/c²) + (x₃² − x₂²)/x₁ ;
ẋ₃ = (x₂/x₃)(1 − x₃²/c²) · [−ρ/(m x₁²)] ;   m = m₀/√(1 − (x₃²/c²))   (18)
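As a sanity check on the canonic representation (18), the sketch below (ours, in normalized units c = 1, m₀ = 1, with GM = ρ/m₀ set to an arbitrary sample value) verifies numerically that the right-hand side vanishes at an Equilibrium Point obtained from the 2nd-order equation in γ₀, and that a slightly perturbed circular orbit stays close to it, in accordance with the weak stability found below in Sec. 3.

```python
import math

C = 1.0    # speed of light (normalized)
MU = 0.1   # GM = rho / m0 (normalized, hypothetical sample value)

def rhs(x):
    """Right-hand side of the canonic system (18), with F_R = -rho/R^2."""
    x1, x2, x3 = x                        # R, dR/dt, v
    inv_gamma = math.sqrt(1.0 - (x3 / C) ** 2)
    f_over_m = -MU * inv_gamma / x1**2    # F_R / m(v)
    dx1 = x2
    dx2 = f_over_m * (1.0 - (x2 / C) ** 2) + (x3**2 - x2**2) / x1
    dx3 = (x2 / x3) * (1.0 - (x3 / C) ** 2) * f_over_m
    return [dx1, dx2, dx3]

def equilibrium(R0):
    """Solve gamma0^2 - gamma0*GM/(R0 c^2) - 1 = 0 for the circular orbit."""
    k = MU / (R0 * C**2)
    g0 = (k + math.sqrt(k * k + 4.0)) / 2.0
    v0 = math.sqrt(MU / (g0 * R0))        # from gamma0 v0^2 = GM/R0
    return [R0, 0.0, v0]

def rk4_step(x, h):
    """One classical Runge-Kutta step for the system (18)."""
    k1 = rhs(x)
    k2 = rhs([xi + h/2 * ki for xi, ki in zip(x, k1)])
    k3 = rhs([xi + h/2 * ki for xi, ki in zip(x, k2)])
    k4 = rhs([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h/6 * (a + 2*b + 2*c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
```

With R₀ = 1 the residual of the right-hand side at the equilibrium is at machine-precision level, and a 1% radial perturbation produces a bounded oscillation rather than a runaway.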
Remark 1: The canonic state equations identify a third-order dynamic model. This represents a fundamental advancement with regard both to Newton and to
Einstein, as well as to other existing formulations. Furthermore the analysis remains unchanged if we postulate the equivalence Potential energy / mass (cfr. Caldirola [6]) and replace m with m̂:

m̂ = γ m̂₀ ;  m̂₀ = m₀ + E_p/c² = m₀ − ρ/(R c²)   (19)

Thirdly, we can deal either with the gravitational problem or with the electrical problem (e.g. the photon problem). In the first case we assume ρ = GMm₀,
while in the second case we assume ρ = kq², with k the Coulomb constant and q the unitary charge.

3. Linearized system equations

Equation (18) has the general form ẋ = f(x). From it we can determine the linearized system equation ẋ = A(x − x₀), with x₀ the Equilibrium Point (R₀, 0, v₀) and A the 3×3 matrix of elements aᵢⱼ = ∂fᵢ/∂xⱼ evaluated at (R₀, 0, v₀):

     | 0    1    0   |
A =  | a₂₁  0    a₂₃ |      (20)
     | 0    a₃₂  0   |

a₂₁ = v₀²/R₀² ;  a₂₃ = (v₀/R₀) (2 − v₀²/c²)/(1 − v₀²/c²)   (21)

a₃₂ = (1 − v₀²/c²) F_R/(v m(v)) evaluated at (R₀, v₀) = −(v₀/R₀)(1 − v₀²/c²)   (22)
The eigenvalue equation Det(A − λI) = 0 (I the identity matrix) results in

−λ [λ² − (a₂₃ a₃₂ + a₂₁)] = 0

which gives λ₁ = 0 ; λ₂ = √(a₂₃ a₃₂ + a₂₁) ; λ₃ = −√(a₂₃ a₃₂ + a₂₁) = −λ₂, with

a₂₃ a₃₂ + a₂₁ = (v₀²/R₀²)(v₀²/c² − 1)

As v₀ ≤ c, the above eq.s entail that a₂₃a₃₂ + a₂₁ ≤ 0 and, also, a₂₃a₃₂ + a₂₁ < 0 if v₀ < c. Therefore if v₀ < c the system eigenvalues are equal to λ₁ = 0 ;
λ₂ = j √(−(a₂₃a₃₂ + a₂₁)) ; λ₃ = −j √(−(a₂₃a₃₂ + a₂₁)), with j = √(−1), while if v₀ = c the system eigenvalues are coincident and equal to λ₁ = λ₂ = λ₃ = 0. In the first case λ₂ and λ₃ identify an undamped oscillatory mode, while λ₁ identifies a constant mode. The linearized system is weakly stable and nothing can be said about the stability of the original non-linear system. In the second case (i.e. v₀ = c) the linearized system is unstable and, by the Viola Theorem, the non-linear system is unstable as well; however such a case is without practical value, since when v₀ → c then (γ₀v₀²/c²) → ∞ and R₀ → 0.
4. A necessary condition for relativistic stability

The condition of equilibrium (16) brings about (ρ/R₀) = m(v₀)v₀². As the potential energy in Equilibrium is defined by E_p = −ρ/R₀, the aforesaid equation results in

E_p = −m(v₀) v₀²  →  E_p = −γ₀ m₀ v₀²   (23)

Postulating that E_p is equivalent to mass (which is primarily justified by the classical Einstein equivalence E = mc²) and using eq. (19), we obtain from (23) and (19)

m̂₀ = (1 − γ₀ v₀²/c²) m₀   (24)

Replacing m₀ with m̂₀ we find the relations

T̂₀ = (γ₀ − 1) m̂₀ c² ;  Ê_p0 = −γ₀ m̂₀ v₀²   (25)

(we call Ê_p0 the relativistic Potential energy and T̂₀ the relativistic Kinetic energy in equilibrium). It is

T̂₀ + Ê_p0 = m₀c² (1 − γ₀ v₀²/c²)(1 − γ₀)/γ₀   (26)

From eq. (26) we derive the following necessary condition for stability (Appendix):

γ₀ v₀²/c² < 1

and consequently the condition (1 − γ₀v₀²/c²) > 0. Further on, as

(1 − γ₀ v₀²/c²) = (1/γ₀)(γ₀ − γ₀² + 1)

it is 1 − γ₀v₀²/c² > 0 in γ₁ < γ₀ < γ₂, with γ₁ = (1 − √5)/2 and γ₂ = (1 + √5)/2; then the condition γ₀ < (1 + √5)/2 is necessary. In parallel, since γ is an increasing function of v and (v/c) = √(γ² − 1)/γ, the condition v₀ < v₂, with (v₂/c) = 0.7862, is a necessary (and equivalent) condition as well. Furthermore, as shown in Part 2, the equivalence Potential energy / mass leads to a reversal of the sign of the system eigenvalues for R < Rmin, so that one eigenvalue becomes real and positive (instability!).
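The equivalence between the two forms of the necessary condition can be verified directly: γ₀v₀²/c² = (γ₀² − 1)/γ₀ < 1 is precisely γ₀² − γ₀ − 1 < 0, whose positive root is the golden ratio. A short check (function names and sample values are ours):

```python
import math

phi = (1.0 + math.sqrt(5.0)) / 2.0   # golden ratio, gamma_2 in the text

def gamma_v2_over_c2(gamma):
    """gamma * v^2/c^2 expressed through gamma alone: (gamma^2 - 1)/gamma."""
    return (gamma**2 - 1.0) / gamma

def v_over_c(gamma):
    """Rotation speed from the mass coefficient: v/c = sqrt(gamma^2-1)/gamma."""
    return math.sqrt(gamma**2 - 1.0) / gamma
```

At γ₀ = φ the quantity γ₀v₀²/c² equals exactly 1 (since φ² = φ + 1), and the corresponding limit speed is v₂/c = 1/√φ ≈ 0.7862.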
RELATIVISTIC STABILITY. PART 2 - A STUDY OF BLACK-HOLES AND OF THE SCHWARZSCHILD RADIUS

UMBERTO DI CAPRIO
Stability Analysis s.r.l., Via A. Doria 48/A - 20124 Milano, Italy
E-mail: [email protected]

We point out a sufficient condition for the existence of a stable attractor in the two-body restricted problem. The result strictly depends on making reference to relativistic equations and could not be derived from classical analysis. The radius of the stable attractor equals the well known Schwarzschild radius of General Relativity (GR). So we establish a bridge between Special Relativity (SR) and GR via Stability Theory (ST). This opens the way to an innovative study of black-holes and of the cosmological problem. A distinguishing feature is that no singularities come into evidence. The application of the Direct Method of Lyapunov (with a special Lyapunov function that represents the local energy) provides the theoretical background.

Keywords: stability theory, Schwarzschild solution, black-holes.
1. Introduction

We have seen in Part 1 that in the two-body relativistic (and restricted) problem there exists a critical radius Rmin such that orbits with radius R < Rmin are unstable. Let us deepen this issue and then expand the analysis. The assumption that Potential energy is equivalent to mass leads to a meaningful change of the system eq.s for R < Rmin. In fact, as shown in Part 1, it is m̂₀ < 0 for R < Rmin, with

m̂₀ = m₀ + E_p0/c² = m₀ (1 − GM/(R₀c²)) ;  Rmin = GM/c²   (1)
Consequently the gravitational force becomes repulsive and the centrifugal force becomes centripetal. Due to such reversal the differential equation (14) in Part 1 is to be replaced with

R̈ = GM/R² − (v² − Ṙ²)/R   (2)
while eq. (15) in Part 1 remains unchanged. By linearization of the system equations, according to the same scheme illustrated in Part 1, we find that the coefficients a₂₁ and a₂₃ change their sign whilst the coefficient a₃₂ remains unchanged. In other words the linearized system matrix becomes equal to Â, with

     | 0    1    0   |
Â =  | â₂₁  0    â₂₃ |      (3)
     | 0    â₃₂  0   |

â₂₁ = −a₂₁ ;  â₂₃ = −a₂₃ ;  â₃₂ = a₃₂   (4)

then

â₂₃ â₃₂ + â₂₁ = −(a₂₃ a₃₂ + a₂₁) = −(v₀²/R₀²)(v₀²/c² − 1) > 0   (5)
Equation (5) brings about that one of the system eigenvalues is real and positive. Of course this means that the linearized system equations are unstable, in conformity with the results shown in Part 1. The circular motion is decomposed into two exponential motions, one of which diverges from the Equilibrium. Hence an Equilibrium does not exist at all for R < Rmin (namely body m₀ falls on body M). This analysis gives us a sound basis for an innovative study of black-holes. The Lyapunov Method proves to be the right tool for addressing the problem. In Sec. 2 we preliminarily study the non-relativistic case and point out the crucial role of the Potential Energy. In Sec. 3 we analyze the relativistic case (with reference to the equations illustrated in Part 1) and answer the following question: does there exist a circular orbit along which the Potential Energy takes a minimum? The answer is affirmative and the radius of the orbit in question is Rs = 2GM/c², i.e. equal to the radius of Schwarzschild; there we also discuss the behavior of the Potential Energy. In Sec. 4 we show the application of the Lyapunov Method.

2. Recall of the stability conditions for the non-relativistic case

In the non-relativistic formulation (which can be obtained from the relativistic one by letting c → ∞) the differential equation of motion is the well known eq.
R̈ = −GM/R² + a²/R³ ;  with  −GM/R² + a²/R³ = (1/m₀)(F_R + F_c)   (6)

F_R = −GMm₀/R² the gravitational force ;  F_c = m₀ a²/R³ the centrifugal force ;  a the areolar velocity.

Equation (6) has the following canonic representation

ẋ = f(x) ;  x = [R, Ṙ]ᵀ ;  f(x) = [Ṙ, −(GM/R²) + a²/R³]ᵀ

and any point x₀ defined by x₀ᵀ = [R₀  0] with R₀ = a²/GM is an Equilibrium point. The linearized system matrix is given by

A = | 0    1 |
    | a₂₁  0 |  ;   a₂₁ = R₀⁻³ [2GM − 3a² R₀⁻¹] = −R₀⁻³ (GM)

and its eigenvalues are the solutions of the equation

λ² = a₂₁  →  λ₁ = j √(−a₂₁) ;  λ₂ = −j √(−a₂₁)

So the linearized system equations are weakly stable. As regards the non-linear system behavior, the following function is a Lyapunov function

V(R, Ṙ) = (1/2) m₀ Ṙ² + ∫ from R₀ to R of [GMm₀/R² − a² m₀/R³] dR   (7)
This function is positive-definite at x₀ and its time derivative along the system trajectories is globally equal to zero. In fact V(R₀, 0) = 0, [∂V/∂R] at R₀ = 0, [∂V/∂Ṙ] at R₀ = 0 and

∂²V/∂R² at R₀ = GMm₀/R₀³ ;  ∂²V/∂R∂Ṙ = 0 ;  ∂²V/∂Ṙ² = m₀   (8)

dV/dt = (∂V/∂R) Ṙ + (∂V/∂Ṙ) R̈ = [GMm₀/R² − a²m₀/R³] Ṙ + m₀ Ṙ [−GM/R² + a²/R³] = 0   (9)

The above equations imply that the function V has a local minimum at x₀ and then V is positive-definite at x₀. Such a property brings about that the surfaces V = const are closed (around x₀). Moreover, as grad V ≠ 0 for x ≠ x₀, the aforesaid closure is kept up to R = ∞ (i.e. stability is of global type). It is
readily seen that V(R, Ṙ) = E − E₀, where E is the classical energy function defined by E = (m₀v²/2) − (GMm₀/R). Of course we can write

V(R, Ṙ) = (1/2) m₀ Ṙ² + (E_p − E_p0)
E_p = −GMm₀/R + (1/2) m₀ a²/R² ;  E_p0 = −(1/2) GMm₀/R₀

The function E_p has a local minimum at R₀ (a property not possessed by the gravitational term −GMm₀/R alone). We could extend our analysis to elliptic rather than circular orbits (and would find that elliptic orbits are weakly stable). However such an extension is beyond the scope of the present work.
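The identity V̇ = 0 along the trajectories, eq. (9), can be checked by direct integration of the radial equation (6). The sketch below (ours, in normalized units GM = 1, m₀ = 1) propagates a perturbed circular orbit with a Runge-Kutta step and verifies that V = ½Ṙ² + (E_p − E_p0) stays constant to integration accuracy and is positive away from the equilibrium.

```python
import math

GM, R0 = 1.0, 1.0
A2 = GM * R0   # a^2 = GM * R0 for the circular orbit at R0

def accel(R):
    """Radial equation (6), per unit mass: R'' = -GM/R^2 + a^2/R^3."""
    return -GM / R**2 + A2 / R**3

def V(R, Rdot):
    """Lyapunov function (7) per unit m0: kinetic term plus E_p - E_p0."""
    Ep = -GM / R + 0.5 * A2 / R**2
    Ep0 = -0.5 * GM / R0
    return 0.5 * Rdot**2 + (Ep - Ep0)

def rk4(R, Rdot, h):
    """One classical Runge-Kutta step for the state (R, Rdot)."""
    def f(s):
        return (s[1], accel(s[0]))
    s = (R, Rdot)
    k1 = f(s)
    k2 = f((s[0] + h/2 * k1[0], s[1] + h/2 * k1[1]))
    k3 = f((s[0] + h/2 * k2[0], s[1] + h/2 * k2[1]))
    k4 = f((s[0] + h * k3[0], s[1] + h * k3[1]))
    return (s[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            s[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
```

Starting from R = 1.2, Ṙ = 0 the value of V is small and positive, and it is conserved along the motion up to the integrator's truncation error, which is the numerical counterpart of eq. (9).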
3. The stable orbit with minimum potential energy in equilibrium

In Equilibrium

Ê_p = −γ₀ (v₀²/c²) [1 − γ₀ (v₀²/c²)] m₀c² ≡ f(γ₀) m₀c²   (10)

with

f(γ₀) = −[(γ₀² − 1)/γ₀] [1 − (γ₀² − 1)/γ₀]   (11)

It is

dÊ_p/dγ₀ = [(2γ₀⁴ − γ₀³ − γ₀ − 2)/γ₀³] m₀c² = [2(γ₀² + 1)/γ₀²] (γ₀ v₀²/c² − 1/2) m₀c²   (12)

Therefore

dÊ_p/dγ₀ = 0  in  γ₀ = γ_s  with  γ_s v_s²/c² = 1/2   (13)

Also

d²Ê_p/dγ₀² = [(2γ₀⁴ + 2γ₀ + 6)/γ₀⁴] m₀c² > 0  for any γ₀ ≥ 1   (14)
We conclude that the function Ê_p(γ₀) has a minimum at γ₀ = γ_s, with γ_s defined by the equations

γ_s v_s²/c² = 1/2 ;  γ_s = 1/√(1 − (v_s²/c²))  →  γ_s = (1 + √17)/4 ≅ 1.28

For convenience we call this number the emi-golden ratio. (The corresponding value of the rotation speed is v_s/c = 0.6249.) From the eq.s

ρ/R_s = γ_s m₀ v_s² ;  ρ = GMm₀

we get the value of the radius R_s of the orbit along which v = v_s and γ = γ_s:

R_s = ρ/[m₀c² (γ_s v_s²/c²)] = 2ρ/(m₀c²) = 2GM/c²   (15)

What is the physical meaning of the aforesaid property? The application of the Direct Method of Lyapunov clarifies the issue.
4. Application of the Lyapunov method

Any point x₀ = (R₀, 0, v₀) with

γ₀ v₀² = GM/R₀ ;  R₀ > 0 ;  Ṙ₀ = 0   (16)

is an Equilibrium Point (and R₀ determines the value of v₀). The special Point x₀ individualized by

R₀ = R_s = 2GM/c² ;  γ₀ v₀²/c² = γ_s v_s²/c² = GM/(c² R_s) = 1/2   (17)
is asymptotically stable. In fact the following function V(x) = V(R, Ṙ, v) can be proven to be a Lyapunov function [8]:

V(R, Ṙ, v) = m̂₀ { ∫ from R₀ to R of [(GM/R²)(1 − 2GM/(Rc²)) + GM/R₀²] dR + (1/2) a²/R²
            + (7/6)(GM/R₀³)(R − R₀)² + (1/2) Ṙ²
            + (b₁/2)(R − R₀)Ṙ + (b₂/2) Ṙ (v − v₀) + (b₃/2)(v − v₀)² }   (18)

with

a² = GM R₀ ;  b₁ = −2 v₀/R₀ ;  b₂ = 4(γ₀² + 1) ;  b₃ = γ₀/(γ₀² + 1)   (19)

m̂₀ = m₀ (1 − γ₀ v₀²/c²) = m₀/2   (20)
More precisely, the V-function (18) is positive-definite at x₀ and its time derivative along the system trajectories is locally negative-definite at x₀. Consequently x₀ is asymptotically stable.
Theorem 1: In the (restricted) two-body relativistic problem there exists an asymptotically stable solution. Such a solution is identified by a circular orbit with radius equal to the radius of Schwarzschild (R_s = 2GM/c²). The corresponding radial speed is (obviously) equal to zero, while the magnitude of the velocity v is equal to the emigolden speed defined by v_s/c = √(γ_s² − 1)/γ_s, with γ_s = 1.2807 the emigolden ratio.
Theorem 2: The condition R > Rmin, with Rmin = GM/c², is necessary for stability, while the condition R = R_s = 2GM/c² is sufficient. Radius R = R_s identifies a stable attractor. The aforesaid conditions impose conditions on the rotation speed (and on the relativistic mass coefficient γ). They are expressed by

v < vmax ;  γ < γmax   necessary condition   (21)

v = v_s ;  γ = γ_s   sufficient condition   (22)

with

γmax = γ_r = (1 + √5)/2 ;  γmax vmax²/c² = 1 ;  γ_s = (1 + √17)/4 ;  γ_s v_s²/c² = 1/2   (23)
Theorem 3: The Lyapunov function (18) represents an Energy and satisfies the eq. V = E_loc − E₀, with E_loc the “local energy” defined by

E_loc = m̂₀ ∫ (GM/R²)(1 − 2GM/(Rc²)) dR + (GM m̂₀/R₀²) R + (1/2) m̂₀ a²/R²
      + (7/6)(GM/R₀³) m̂₀ (R − R₀)²
      + m̂₀ [(1/2) Ṙ² + (b₁/2)(R − R₀)Ṙ + (b₂/2) Ṙ (v − v₀) + (b₃/2)(v − v₀)²]   (24)

γ₀ v₀²/c² = 1/2 ;  m̂₀ = m₀/2 ;  R₀ = 2GM/c² ;  a² = GM R₀   (25)

E₀ = GM m̂₀/R₀   (26)
When c → ∞ the local energy becomes equal to the classical conservation energy.

Proof: The statement is evident. Moreover the V-function satisfies eq. V = E_loc − E₀ with

E₀ = E_loc(x₀) = m̂₀ [∫ (GM/R² − 2(GM)²/(R³c²)) dR] at R₀ + GM m̂₀/R₀ + (1/2) m̂₀ a²/R₀²

and, as

∫ [GM/R² − 2(GM)²/(R³c²)] dR = −GM/R + (GM)²/(R²c²)

[−GM/R + (GM)²/(R²c²)] at R₀ = −GM/R₀ + (GM/R₀)(GM/(R₀c²)) = −(1/2) GM/R₀

then

E₀ = m̂₀ [−(1/2)(GM/R₀) + GM/R₀ + (1/2)(GM/R₀)] = m̂₀ GM/R₀ = (m₀/2)(GM/R₀)
Furthermore, the function E_loc is representable in a form in which all the relativistic corrective terms carry the factor 2GM/(R₀c²); hence, since

lim for c → ∞ of m̂₀ = m₀

it is

lim for c → ∞ of E_loc = m₀ [∫ (GM/R²) dR + (1/2) Ṙ² + (1/2) a²/R²] ≡ classical energy function
Corollary 1: The local energy at the stable Equilibrium Point is positive (and equal to GMm₀/(2R₀)).

Remark 1: When we apply Corollary 1 to the analysis of the two-body electrical problem (photon problem) we find that the local energy in equilibrium is but the photon energy E_Φ = kq²/(2R₀), with k the Coulomb constant.

The preceding analysis allows us to reset the current theory on black holes. Around any massive body M there exists a “forbidden region” such that an external body m₀ entering it “falls” on M. E.g. if M = M_sun the radius R_B of the forbidden region is R_B = GM/c², about equal to 1.5 km (much smaller than the radius of the sun itself). Clearly an anomalous situation arises when the radius R_M of the massive body is smaller than R_B. Such a situation is reducible to a condition on volume and on density:

Volume(M) < (4/3) π R_B³  if  R_M < R_B   (27)

ρ_b = M/[(4/3) π R_B³] = M/[(4/3) π (GM/c²)³] = c⁶/[(4/3) π G³ M²]   (28)

When the above condition is verified we say that M is a black-hole. Equivalently, the critical density is defined by

ρ_b > [c⁶/((4π/3) G³)] (1/M²)   (29)
Such a quantity is “relatively” larger for small black-holes. As an example, ρ_b = 1.46 × 10²⁰ kg/m³ if M = 2 × 10³⁰ kg; ρ_b = 5.83 × 10⁸⁰ kg/m³ if M = 1 kg. An external body m₀ entering the region R_M < R < R_B undergoes chaotic events. Its mass becomes negative and the trajectory cannot be “closed”: it is decomposed into two distinct and contradictory “modes”, an increasing exponential and a decreasing exponential. The body splits into two separate parts, one of which falls on the black-hole (while the other departs from the black-hole and finally comes out of the forbidden region). As the total energy must be conserved, and the energy of the part falling on the black-hole is negative, such an event determines a partial “evaporation”. The outgoing part acquires a positive energy and travels toward the stable attractor, which eventually consists of a matter crust with radius R_s (equal to the Schwarzschild radius). In a subsequent work we study the Region of Attraction of the stable attractor and the related “accretion disk”. Here we confine ourselves to a mention of Hawking's formula on temperature

Θ_b = ℏ c³/(4π GM k_B)   (30)

and of the way by which our results shed further light on such a formula. We find that (30) is equivalent to

Θ_b = (m₀c²/k_B) (α r_B)/(2π R_s) ;  R_s = 2GM/c²

with r_B the Bohr radius, m₀ the electron mass, k_B the Boltzmann constant and α the fine-structure constant. No singularities appear in our formulation: in particular the gravitational force of the black-hole has a finite value.
5. Conclusion

Using the Lyapunov Direct Method and the local energy as the Lyapunov function, we determined a sufficient condition for asymptotic stability in the two-body relativistic (restricted) problem. There exists a special orbit, with radius equal to the Schwarzschild radius, that represents a stable attractor. This result completes the analysis in Part 1 (where we pointed out a necessary condition, which required that the radius of a stable orbit be greater than half the Schwarzschild radius). Thus we have established a bridge between Special Relativity (SR) and General Relativity (GR) via Stability Theory (ST). We have delineated the application to black-holes, setting forward an innovative analysis. The concept of local energy is novel and should be considered a primary contribution. It substantially generalizes the classical conservation energy and reduces to it when the speed of light in vacuum tends to infinity. The local energy is defined with regard to a specific Equilibrium Point and possesses peculiar sign-definiteness properties. With reference to the stable Attractor the time derivative of such energy is negative-definite: this means that the two-body system is locally dissipative. This recalls the theoretical results illustrated in [15] with regard to dissipative Electric Power Systems. Another striking finding is that along the basic stable orbit the local energy is positive (while the classical energy is negative). This explains the positive value of the energy of the photon in the two-body electrodynamical problem. Last but perhaps not least, we have connected the radius of the stable attractor with a special value of the relativistic mass coefficient γ: the latter must be equal to the emigolden ratio, much the same as (see Part 1) the minimum radius for stability turns out to be connected to a value of γ equal to the golden ratio.
6. BIBLIOGRAPHY

1. M.A. Abramowicz, Scientific American 268, 26 (1993).
2. G. Arcidiacono, Relatività e Cosmologia (Veschi, Roma, 1973).
3. V.I. Arnol'd, La teoria delle catastrofi (Boringhieri, Italy, 1987).
4. E.A. Barbashin, N.N. Krasovskii, Prikl. Mat. Mekh. 18, 345-350 (1954).
5. A. Bruce, Nature 347, 615 (1990).
6. P. Caldirola, Teoria quantistica relativistica (Viscontea, Milano, 1963).
7. S. Chandrasekhar, The mathematical theory of black holes (Clarendon Press, Oxford, 1983).
8. U. Di Caprio, Int. J. of EPES 8(4), 225-235 (1986).
9. U. Di Caprio, Int. J. of EPES 8(1), 27-41 (1987).
10. U. Di Caprio, W. Prandoni, Int. J. of EPES 10(1), 41-53 (1988).
11. U. Di Caprio, G. Spavieri, Hadronic J. 22, 675-692 (1999).
12. U. Di Caprio, Hadronic J. 23, 689 (2000).
13. U. Di Caprio, Int. J. of EPES 23(3), 229-235 (2001).
14. U. Di Caprio, Supplement to Hadronic J. 16(1), 163-182 (2001).
15. U. Di Caprio, Int. J. of EPES 24(5), 421-429 (2002).
16. U. Di Caprio, in Emergence in Complex, Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer Academic/Plenum Publishers, New York, 2002), pp. 127-140.
17. U. Di Caprio, in Systemics of Emergence: Research and Development, Ed. G. Minati, E. Pessa, M. Abram (Springer, New York, 2006), pp. 31-66.
18. U. Di Caprio, Application of the Di Caprio-Lyapunov Method to the study of Cosmological Problems, Stability Analysis, Int. Rep. 2006-1 (2006).
19. R.H. Dicke, in Relativity, Groups and Topology, Ed. C. De Witt and B. De Witt (Gordon and Breach, New York, 1964).
20. A.D. Dolgov, Ya. B. Zeldovich, Rev. of Modern Physics 53, 1-41 (1981).
21. R.D. Driver, Arch. Rational Mech. Anal. 10, 401-426 (1962).
22. R. Gautreau, W. Savin, Modern Physics (McGraw-Hill, New York, 1978).
23. W. Hahn, Theory and Application of Liapunov's Direct Method (McGraw-Hill, New York, 1963).
24. J.K. Hale, J. of Diff. Eqs. 1, 452-482 (1965).
25. N.N. Krasovskii, Certain problems of the theory of stability of motion (Russian ed.: Moscow, 1959; American ed.: Stanford, Ca, 1963).
26. G. Jumarie, Subjectivity, Information, Systems: Introduction to a Theory of Relativistic Cybernetics (Gordon and Breach, New York, 1986).
27. L.D. Landau, E.M. Lifshitz, The classical theory of fields (Butterworth-Heinemann, Oxford, 1980).
28. J.P. La Salle, IFAC Congress, art. 415 (1963).
29. G.C. McVittie, General Relativity and Cosmology (Chapman & Hall, London, U.K., 1965).
30. E.A. Milne, Relativity, Gravitation and World Structure (Clarendon Press, Oxford, U.K., 1935).
31. B.B. Mandelbrot, Encyclopedia of Mathematical Sciences, Vol. 1, 4, 5, 39 (Springer-Verlag, Berlin, 1988).
32. V.V. Nemytskii, V.V. Stepanov, Qualitative theory of differential equations (Russian ed.: Moscow, 1947; American ed.: Princeton, 1960).
33. J.D. North, The measure of the Universe (Oxford University Press, Oxford, UK, 1952).
34. H.P. Robertson, T.W. Noonan, Relativity and cosmology (Saunders, Philadelphia, 1968).
35. M.P. Ryan, L.C. Shepley, Homogeneous Relativistic Cosmologies (Princeton University Press, Princeton, N.J., 1975).
36. D.W. Sciama, Modern Cosmology (Cambridge University Press, Cambridge, U.K., 1971).
37. J. Stachel, Ed., Einstein's miraculous year. Five papers that changed the face of physics (Princeton University Press, Princeton, N.J., 1998).
38. R. Thom, Stabilité structurelle et morphogénèse (W.A. Benjamin, Reading, Massachusetts, 1972).
39. S. Weinberg, Gravitation and cosmology (John Wiley & Sons, New York, 1972).
40. S. Weinberg, The quantum theory of fields: Foundations (Vol. I) (Cambridge University Press, Cambridge, 1995).
41. H. Whitney, Ann. Math. 62(3), 374-410 (1955).
42. C.M. Will, Was Einstein right? (Basic Books, New York, 1986).
43. V.I. Zubov, Methods of A.M. Lyapunov and their application (Leningrad, 1959; English transl.: Noordhoff, Groningen, 1964).
THE FORMATION OF COHERENT DOMAINS IN THE PROCESS OF SYMMETRY BREAKING PHASE TRANSITIONS

EMILIO DEL GIUDICE (1), GIUSEPPE VITIELLO (2)
(1) Istituto Nazionale di Fisica Nucleare, Sezione di Milano, Via Celoria 16, I-20133 Milano, Italia
(2) Dipartimento di Matematica e Informatica, Università di Salerno and Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Salerno, I-84100 Salerno, Italia

The emergence of phase locking among the electromagnetic modes and the matter components over an extended space-time region is discussed. The stability of mesoscopic and macroscopic complex systems arising from fluctuating quantum components is considered from this perspective.

Keywords: symmetry breaking, coherent domains, Anderson-Higgs-Kibble mechanism, gauge fields.
The general problem of the stability of mesoscopic and macroscopic complex systems arising from fluctuating quantum components is of great interest in many sectors of condensed matter physics and elementary particle physics, for example in the problem of defect formation during nonequilibrium symmetry breaking phase transitions characterized by an order parameter [1]. Examples of topological defects are vortices in superconductors and superfluids, magnetic domain walls in ferromagnets, and many other extended objects in condensed matter physics, as well as cosmic strings in cosmology, which may have played a role in the phase transition processes in the early Universe [2]. In the study of spontaneously broken symmetry theories in quantum field theory (QFT) the Anderson-Higgs-Kibble (AHK) mechanism is well established [3-5]: the gauge field is expelled out of the ordered domains and confined, through self-focusing propagation, into “normal” regions where the order parameter is vanishing. In this report, our attention is focused on the dynamics governing the radiative gauge field inside the ordered region, in particular on its role in the onset of phase locking among the e.m. modes and the matter components. In our discussion we will closely follow Ref. [6].
At first sight one would say that in the AHK mechanism the gauge field is in competition with the long range correlation established among the system components, as a dynamical consequence of the spontaneous symmetry breakdown, by the Nambu-Goldstone (NG) particles. However, as we show here, the radiative gauge field plays a role also in the ordered regions, where it sustains the emergence of coherence. As we will show, such a role is in some sense complementary to, not in contradiction with, the AHK mechanism. Phase locking among the matter field and the radiative gauge field in an extended space-time region is found to be the dynamical response of the system aimed at preserving the theory's local gauge invariance. The physical meaning of local gauge invariance is to guarantee the stability of the system at mesoscopic and macroscopic space-time scales against the quantum fluctuations characterizing the behavior of the quantum components at the microscopic scale. The QFT solution to the problem of building a stable system out of fluctuating components consists in prescribing that the Lagrangian of the system should indeed be invariant under the local phase transformation of the quantum component field $\psi(x,t) \to \psi'(x,t) = \exp(ig\theta(x,t))\,\psi(x,t)$. Local phase invariance is then achieved by introducing the gauge fields, e.g. the electromagnetic (e.m.) field $A_\mu(x,t)$, such that the Lagrangian is also invariant under the local gauge transformation $A_\mu(x,t) \to A'_\mu(x,t) = A_\mu(x,t) + \partial_\mu\theta(x,t)$. This is devised to compensate terms proportional to $\partial_\mu\theta(x,t)$ arising from the Lagrangian kinetic term for the matter field $\psi(x,t)$. The gauge field may thus be described as a compensating “reservoir” against variations in the many accessible microscopic configurations of the system. Our model system consists of an ensemble of a given number of two-level atoms, say N per unit volume V, which may represent rigid rotators endowed with an electric dipole.
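The compensation just described can be checked in one line with the covariant derivative $D_\mu = \partial_\mu - igA_\mu$ that appears later in the text (sign conventions as assumed here): under $\psi \to e^{ig\theta}\psi$ and $A_\mu \to A_\mu + \partial_\mu\theta$,

$$D'_\mu \psi' = \bigl(\partial_\mu - ig\,[A_\mu + \partial_\mu\theta]\bigr)\, e^{ig\theta}\psi = e^{ig\theta}\bigl(\partial_\mu\psi + ig(\partial_\mu\theta)\psi - igA_\mu\psi - ig(\partial_\mu\theta)\psi\bigr) = e^{ig\theta}\, D_\mu\psi ,$$

so $D_\mu\psi$ transforms exactly like $\psi$ itself, and any Lagrangian built from $\psi$ and $D_\mu\psi$ is locally gauge invariant: the gauge shift of $A_\mu$ absorbs the unwanted $\partial_\mu\theta(x,t)$ term generated by the kinetic term.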
We consider the interaction of these atoms with the e.m. quantum radiative modes generated in the transitions between the atom levels, and disregard the static dipole-dipole interaction. The system is assumed to be spatially homogeneous and in a thermal bath kept at a non-vanishing temperature T. Under such conditions the system is invariant under dipole rotations and, since the atom density is assumed to be spatially uniform, the only relevant variables are the angular ones. In our discussion we use natural units $\hbar = 1 = c$. Closely following the presentation of Ref. [6], we denote by $d\Omega = \sin\theta\, d\theta\, d\phi$ the element of solid angle and by $(r, \theta, \phi)$ the polar coordinates of $\mathbf{r}$. The dipole wave field $\varphi(\mathbf{x},t)$ integrated over the sphere of unit radius $r$ gives:
$$\int d\Omega\; |\varphi(\mathbf{x},t)|^2 = N, \qquad (1)$$

which, in terms of the rescaled field $\chi(\mathbf{x},t) = \frac{1}{\sqrt{N}}\,\varphi(\mathbf{x},t)$, reads as

$$\int d\Omega\; |\chi(\mathbf{x},t)|^2 = 1. \qquad (2)$$
Under the assumed conditions the field $\chi(\mathbf{x},t)$ may be expanded in the unit sphere in terms of spherical harmonics, $\chi(\mathbf{x},t) = \sum_{l,m} \alpha_{l,m}(t)\, Y_l^m(\theta,\phi)$. By setting $\alpha_{l,m}(t) = 0$ for $l \neq 0,1$, this reduces to the expansion in the four levels $(l,m) = (0,0)$ and $(1,m)$, $m = 0,\pm 1$. The populations of these levels are given by $N\,|\alpha_{l,m}(t)|^2$ and, at thermal equilibrium in the absence of interaction, they follow the Boltzmann distribution. Since thermal equilibrium and dipole rotational invariance imply that there is no preferred direction in the dipole orientation, the $|\alpha_{1,m}(t)|$ are independent of m and we may put
$$\alpha_{0,0}(t) \equiv a_0(t) \equiv A_0(t)\,e^{i\delta_0(t)}, \qquad \alpha_{1,m}(t) \equiv A_1(t)\,e^{i\delta_{1,m}(t)}\,e^{-i\omega_0 t} \equiv \bar{\alpha}_{1,m}(t)\,e^{-i\omega_0 t}. \qquad (3)$$

Here $\bar{\alpha}_{1,m}(t) \equiv A_1(t)\,e^{i\delta_{1,m}(t)}$. The amplitudes $A_0(t)$ and $A_1(t)$ and the phases $\delta_0(t)$ and $\delta_{1,m}(t)$ are real quantities; we will also define $\omega(t) \equiv \dot{\delta}_{1,0}(t) - \dot{\delta}_0(t)$. In Eqs. (3) $\omega_0 \equiv 1/I$, where $I$ denotes the moment of inertia of the atom; $\omega_0$ gives a relevant scale for the system, $\omega_0 \equiv k = 2\pi/\lambda$. The eigenvalue of $L^2/2I$ on the states $(1,m)$ is

$$\frac{l(l+1)}{2I} = \frac{1}{I} = \omega_0 ,$$

where $L^2$ denotes the squared angular momentum operator. The three levels $(1,m)$, $m = 0,\pm 1$, are on the average equally populated under normal conditions. This is confirmed by the absence of permanent polarization in the system: indeed, under the assumed conditions, the time average of the polarization $P_n$ along any direction $\mathbf{n}$ is found to be vanishing [6,7]. Therefore, we can safely write

$$\sum_m |\alpha_{1,m}(t)|^2 = 3\,|a_1(t)|^2 .$$

The normalization condition (2) gives
$$Q \equiv |\alpha_{0,0}(t)|^2 + \sum_m |\alpha_{1,m}(t)|^2 = |a_0(t)|^2 + 3\,|a_1(t)|^2 = 1, \quad \forall t, \qquad (4)$$

and therefore $\partial Q/\partial t = 0$, i.e.

$$\frac{\partial}{\partial t}\,|a_1(t)|^2 = -\,\frac{1}{3}\,\frac{\partial}{\partial t}\,|a_0(t)|^2 . \qquad (5)$$
The conservation law $\partial Q/\partial t = 0$ expresses the conservation of the total number N of atoms; Eq. (5) means that, due to the rotational invariance, the rate of change of the population in each of the levels $(1,m)$, $m = 0,\pm 1$, contributes equally, on the average, to the rate of change of the population of the level $(0,0)$ at each time t. Consistently with Eq. (4), we can set the initial conditions at $t = 0$ as
$$|a_0(0)|^2 = \cos^2\theta_0, \qquad |a_1(0)|^2 = \frac{1}{3}\,\sin^2\theta_0, \qquad 0 < \theta_0 < \frac{\pi}{2}. \qquad (6)$$
The values $\theta_0 = 0$ and $\theta_0 = \pi/2$ are excluded, since it is physically unrealistic for the state $(0,0)$ to be completely filled or completely empty, respectively. The parameter $\theta_0$ can be properly tuned within its range of definition; for example, $\theta_0 = \pi/3$ describes the equipartition of the field modes of energy $E(k)$ among the four levels $(0,0)$ and $(1,m)$, $|a_0(0)|^2 \cong |a_{1,m}(0)|^2$, $m = 0, \pm 1$, as given by the Boltzmann distribution when the temperature T is high enough, $k_B T \gg E(k)$. Below we show that the lower bound for the parameter $\theta_0$ is imposed by the dynamics in a self-consistent way. The field equations are [8,9]:
$$i\,\frac{\partial \chi(\mathbf{x},t)}{\partial t} = \frac{L^2}{2I}\,\chi(\mathbf{x},t) - i\,d\sqrt{\rho}\,\sum_{\mathbf{k},r}\sqrt{\frac{k}{2}}\,(\boldsymbol{\varepsilon}_r\cdot\hat{\mathbf{x}})\left[u_r(\mathbf{k},t)\,e^{-ikt} - u_r^{+}(\mathbf{k},t)\,e^{ikt}\right]\chi(\mathbf{x},t),$$

$$i\,\frac{\partial u_r(\mathbf{k},t)}{\partial t} = i\,d\sqrt{\rho}\,\sqrt{\frac{k}{2}}\,e^{ikt}\int d\Omega\;(\boldsymbol{\varepsilon}_r\cdot\hat{\mathbf{x}})\,|\chi(\mathbf{x},t)|^2, \qquad (7)$$

where $u_r(\mathbf{k},t) = (1/\sqrt{N})\,c_r(\mathbf{k},t)$, and $c_r(\mathbf{k},t)$ denotes the radiative e.m. field operator with polarization r; d is the magnitude of the electric dipole moment, $\rho \equiv N/V$, and $\boldsymbol{\varepsilon}_r$ is the polarization vector of the e.m. mode (the transversality condition $\mathbf{k}\cdot\boldsymbol{\varepsilon}_r = 0$ is assumed to hold). Notice the enhancement by the factor $\sqrt{N}$ appearing in the coupling $d\sqrt{\rho}$ in Eqs. (7), due to the rescaling of the fields. In Ref. [6] it has been shown that such a rescaling is actually responsible for the collective behavior of the system in the large-N limit. This is related to the fact that, as is evident from Eqs. (7), the collective interaction time scale is
shorter by the factor $1/\sqrt{N}$ than the time scale of the short range interactions among the atoms; hence the mesoscopic/macroscopic stability of the system against the quantum fluctuations of the microscopic components. In obtaining Eqs. (7) we have restricted ourselves to the resonant radiative e.m. modes, for which $k = 2\pi/\lambda = \omega_0$, and we have used the dipole approximation $\exp(i\mathbf{k}\cdot\mathbf{x}) \approx 1$, since we are interested in the macroscopic behavior of the system. This means that the wavelengths of the e.m. modes we consider, of the order of $2\pi/\omega_0$, are larger than (or comparable to) the system linear size. The amplitude of the e.m. mode coupled to the transition $(1,m) \leftrightarrow (0,0)$ is denoted by
$$u_m(t) = U(t)\,e^{i\varphi_m(t)}, \qquad (8)$$

where $U(t)$ and $\varphi_m(t)$ are real quantities. We remark that Eqs. (7) are not invariant under time-dependent phase transformations of the field amplitudes. Our task is to investigate how the local (in time) gauge symmetry can be recovered. Eqs. (7) are of course consistent with the conservation law $\partial Q/\partial t = 0$, and they also show that

$$\frac{\partial}{\partial t}\,|u_m(t)|^2 = -\,2\,\frac{\partial}{\partial t}\,|a_{1,m}(t)|^2, \qquad (9)$$
from which we see that $u_m(t)$ does not depend on m, since $|\alpha_{1,m}(t)| = |a_{1,m}(t)|$ does not depend on m. We can derive another conservation law,

$$|u(t)|^2 + 2\,|a_1(t)|^2 = \frac{2}{3}\,\sin^2\theta_0, \quad \forall t, \qquad (10)$$

where $u(t) \equiv u_m(t)$, $a_1(t) \equiv a_{1,m}(t)$, the initial condition (6) has been used, and we have set

$$|u(0)|^2 = 0. \qquad (11)$$
Equations (7) give

$$\dot{A}_0(t) = \Omega\,U(t)\,A_1(t)\,\cos\alpha_m(t), \qquad (12)$$

$$\dot{A}_1(t) = -\,\Omega\,U(t)\,A_0(t)\,\cos\alpha_m(t), \qquad (13)$$

$$\dot{U}(t) = 2\,\Omega\,A_0(t)\,A_1(t)\,\cos\alpha_m(t), \qquad (14)$$

$$\dot{\varphi}_m(t) = 2\,\Omega\,\frac{A_0(t)\,A_1(t)}{U(t)}\,\sin\alpha_m(t), \qquad (15)$$
where the dot over a symbol denotes the time derivative,

$$\Omega \equiv \frac{2d}{3}\,\sqrt{\frac{\rho}{2\,\omega_0}}\;\omega_0 \equiv G\,\omega_0,$$

and

$$\alpha_m(t) \equiv \delta_{1,m}(t) - \delta_0(t) - \varphi_m(t). \qquad (16)$$
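As a rough numerical illustration (not taken from the paper), Eqs. (12)-(14) can be integrated for a fixed relative phase $\alpha_m \equiv \alpha$; the values of $\Omega$, $\theta_0$ and $\alpha$ below are arbitrary illustrative choices, and keeping $\alpha$ constant is a simplification, since the equations for the phases $\delta_0$, $\delta_{1,m}$ are not reproduced here. The run checks that the conservation law (10) is respected along the trajectory:

```python
import math

# Illustrative parameters (assumptions, not values from the paper)
Omega = 1.0    # coupling Omega = G * omega_0, arbitrary units
theta0 = 1.2   # initial mixing angle, chosen > pi/4 as the dynamics requires
alpha = 1.0    # relative phase alpha_m, held fixed (simplification)
c = math.cos(alpha)

def rhs(y):
    """Right-hand sides of Eqs. (12)-(14) for (A0, A1, U)."""
    A0, A1, U = y
    return (Omega * U * A1 * c,        # Eq. (12): dA0/dt
            -Omega * U * A0 * c,       # Eq. (13): dA1/dt
            2 * Omega * A0 * A1 * c)   # Eq. (14): dU/dt

# Initial conditions from Eqs. (6) and (11)
y = (math.cos(theta0), math.sin(theta0) / math.sqrt(3.0), 0.0)
invariant0 = y[2] ** 2 + 2 * y[1] ** 2   # Eq. (10): U^2 + 2 A1^2 = (2/3) sin^2(theta0)

dt, steps = 1e-3, 5000
for _ in range(steps):                   # classical 4th-order Runge-Kutta
    k1 = rhs(y)
    k2 = rhs(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = rhs(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = rhs(tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    y = tuple(yi + dt / 6.0 * (a + 2 * b + 2 * g + d)
              for yi, a, b, g, d in zip(y, k1, k2, k3, k4))

invariant = y[2] ** 2 + 2 * y[1] ** 2
print(abs(invariant - invariant0) < 1e-8)   # conservation law (10) holds numerically
```

Eq. (15) is omitted here: it only drives the field phase $\varphi_m$ and does not feed back into the amplitudes once $\alpha$ is held fixed.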
Equations for $\delta_{1,m}$ and $\delta_0$ can be derived in a similar way. Eqs. (12)-(14) show that the phases turn out to be independent of m. Indeed, the right hand sides of these equations have to be independent of m since their left hand sides are; so either $\cos\alpha_m(t) = 0$ for any m at any t, or $\alpha_m$ is independent of m at any t. In both cases, Eq. (15) shows that $\varphi_m$ is then independent of m, which in turn implies, together with Eq. (16), that $\delta_{1,m}(t)$ is independent of m. We therefore put $\varphi \equiv \varphi_m$, $\delta_1(t) \equiv \delta_{1,m}(t)$, $\alpha \equiv \alpha_m$, $u(t) \equiv u_m(t)$ and $a_1(t) \equiv a_{1,m}(t)$. One can always change the phases by arbitrary constants; however, if they are equal in one frame they are unequal in a rotated frame, and gauge invariance is lost. The independence of the phases of m is here of dynamical origin, and the phase locking which we will find (see Eq. (18)) among $\delta_0(t)$, $\delta_1(t)$ and $\varphi(t)$ has indeed the meaning of recovering the gauge symmetry. The study of the system ground states for each of the modes $a_0(t)$, $a_1(t)$ and $u(t)$ shows that spontaneous breakdown of the global SO(2) symmetry (the global phase symmetry) in the plane $(a_{0,R}(t), a_{0,I}(t))$ occurs [6] (the indices R and I denote the real and the imaginary component of the field, respectively). In the semiclassical approximation [5], we find [6] that for the mode $a_0(t)$ there is a quasi-periodic mode with pulsation $m_0 = \sqrt{2\,\Omega^2\,(1+\cos^2\theta_0)}$ (the 'massive' mode with real mass $\sqrt{2\,\Omega^2\,(1+\cos^2\theta_0)}$) and a zero-frequency mode $\delta_0(t)$, corresponding to a massless mode playing the role of the NG field. Note that the value $a_0 = 0$ consistently appears to be a relative maximum of the potential, and therefore an instability point out of which the system (spontaneously) runs away. On the other hand, $a_1(t)$ is found [6] to be a massive field with squared mass (pulsation) $\sigma^2 = 2\,\Omega^2\,(1+\sin^2\theta_0)$. For the $u(t)$ field, the global SO(2) cylindrical symmetry around an axis orthogonal to the plane $(u_R(t), u_I(t))$ can be spontaneously broken or not, according to the negative or positive value of the squared mass
$\mu^2 = 2\,\Omega^2 \cos 2\theta_0$ of the field, respectively, as usual in the semiclassical approximation. In the case $\mu^2 < 0$, i.e. $\theta_0 > \pi/4$, the potential has a relative maximum at $u_0 = 0$ and a (continuum) set of minima given by

$$|u(t)|^2 = -\,\frac{\mu^2}{6\,\Omega^2} = -\,\frac{1}{3}\,\cos 2\theta_0 \equiv v^2(\theta_0), \qquad \theta_0 > \frac{\pi}{4}, \qquad (17)$$
representing (infinitely many) possible vacua for the system. They transform into each other under shifts of the field $\varphi$: $\varphi \to \varphi + \alpha$. The global phase symmetry is broken, the order parameter is given by $v(\theta_0) \neq 0$, and one specific ground state is singled out by fixing the value of the $\varphi$ field. We have a 'massive' mode, as indeed expected in the AHK mechanism [5], with real mass $\sqrt{2\,|\mu^2|} = 2\,\Omega\,\sqrt{|\cos 2\theta_0|}$ (a quasi-periodic mode), and the zero-frequency mode $\varphi(t)$ (the massless NG collective field, also called the “phason” field [10]). The fact that in such a case $u_0 = 0$ is a maximum of the potential means that the system dynamically evolves away from it, consistently with the similar situation noticed for the $a_0$ mode. We therefore find that dynamical consistency requires $\theta_0 > \pi/4$. We now observe that, provided $\theta_0 > \pi/4$, a time-independent amplitude $U(t) \equiv U$ is compatible with the system dynamics (e.g. the ground state value $A_0 \neq 0$ implies $U = \mathrm{const.}$). Equations (14) and (15) indeed show that such a time-independent amplitude $U = \mathrm{const.}$ exists, $\dot U(t) = 0$, if and only if the phase locking relation
$$\alpha = \delta_1(t) - \delta_0(t) - \varphi(t) = \frac{\pi}{2} \qquad (18)$$

holds. Therefore,

$$\dot{\varphi}(t) = \dot{\delta}_1(t) - \dot{\delta}_0(t) = \omega, \qquad (19)$$
and this shows that any change in time of the difference between the phases of the amplitudes $a_1(t)$ and $a_0(t)$ is compensated by the change of the phase of the e.m. field. When Eq. (18) holds we also have $\dot A_0 = 0 = \dot A_1$ (cf. Eqs. (12), (13)). The phase relation (18) shows that, provided $\theta_0 > \pi/4$, $\dot\alpha = 0$: it expresses nothing but the local (in time) gauge invariance of the theory. Since $\delta_0$ and $\varphi$ are the NG modes, Eqs. (18) and (19) exhibit the coherent feature of the collective dynamical regime. The system of N dipoles and of the e.m. field is characterized by the “in phase” dynamics expressed by Eq. (18) (phase locking): the local gauge invariance of the theory is preserved by the
dynamical emergence of the coherence between the matter field and the e.m. field. Finally, we consider the case in which an electric field $\mathbf{E}$, due for example to an impurity or to any other external agent, is applied to the atom system in the phase locking regime. Let us assume $\mathbf{E}$ to be parallel to the z axis. Then the term $\mathcal{E} = -\,\mathbf{d}\cdot\mathbf{E}$, where $\mathbf{d}$ is the electric dipole moment of the atom, has to be added to the system energy. This breaks the dipole rotational symmetry. The polarization $P_n$ is given by [6]
$$P_n = \frac{1}{3}\left(A_0^2 - A_1^2\right)\sin 2\tau \;+\; \frac{2}{3}\,A_0(t)\,A_1(t)\,\cos 2\tau\;\cos\!\left[\left(\omega - \sqrt{\omega_0^2 + 4\,\mathcal{E}^2}\,\right) t\right], \qquad (20)$$
whose time average is nonvanishing:
$$\overline{P_n} = \frac{1}{3}\left(A_0^2 - A_1^2\right)\sin 2\tau .$$
Here τ is given [6] by
$$\tan\tau = \frac{\omega_0 - \sqrt{\omega_0^2 + 4\,\mathcal{E}^2}}{2\,\mathcal{E}} .$$
The non-zero difference in the level populations, $(A_0^2 - A_1^2)$, as indeed found in the phase locking regime (see [6]), is therefore crucial in obtaining the non-zero polarization. As shown by Eq. (20), the polarization persists as long as the field $\mathbf{E}$ is active (i.e. $\mathcal{E} \neq 0$). The finite size of the system indeed prevents a persistent polarization from surviving the $\mathcal{E} \to 0$ limit [11,12]; in such a limit the dipole rotational symmetry is thus restored. In conclusion, the system may be prepared with the initial conditions given by Eqs. (6) and (11), where the value of the parameter $\theta_0$ is in principle arbitrary within reasonable physical conditions. Starting at $t = 0$ from the initial condition $|u(0)|^2 = 0$, the system then evolves towards the minimum energy state where $|a_0(t)|^2 \neq 0$ and the amplitude $|u(t)|^2$ departs from its initial zero value. This implies a succession of (quantum) phase transitions [13] from the initial $|u_0|^2 = 0$ symmetric vacuum to the asymmetric vacuum $|u(t)|^2 \neq 0$, which means that in Eq. (17) $\theta_0$ has to be greater than $\pi/4$. In this way the lower bound for $\theta_0$ is dynamically fixed as an effect of the radiative dipole-dipole interaction. This results in turn in the phase locking (18), which expresses the coherence in the time behavior of the phase fields (cf. Eq. (19)). The role of the phason mode $\varphi$ is to recover the local gauge symmetry, thus re-establishing the
local gauge invariance of the theory. This is done through the emergence of the coherence implied by the phase locking between the matter field and the e.m. field. The gauge arbitrariness of the field $A_\mu$ is meant to compensate exactly the arbitrariness of the phase of the matter field in the covariant derivative $D_\mu = \partial_\mu - igA_\mu$. Should one of the two arbitrarinesses be removed by the dynamics, the invariance of the theory requires that the other be simultaneously removed as well. This is the physical meaning of the phase locking. The link between the phase of the matter field and the gauge of $A_\mu$ is stated by the equation $A_\mu = \partial_\mu\varphi$ ($A_\mu$ is a pure gauge field). When $\varphi(x,t)$ is a regular (continuously differentiable) function, it can easily be shown that $\mathbf{E} = 0 = \mathbf{B}$: namely, the potentials, and not the fields, are present in the coherent region. In agreement with the AHK mechanism we thus find that in the ordered domains the fields $\mathbf{E}$ and $\mathbf{B}$ are vanishing; however, the gauge potentials are there nonvanishing, and they sustain the phase locking in the coherent regime. We also observe that the existence of non-vanishing fields $\mathbf{E} \neq 0$ and $\mathbf{B} \neq 0$ is then connected to the topological singularities of the gauge function $\varphi(x,t)$ [11], as happens, e.g., in the presence of a vortex or of other topologically non-trivial solutions, again in agreement with the AHK mechanism. As already observed, the coupling enhancement by the factor $\sqrt{N}$ implies that for large N the collective interaction time scale is much shorter than the time scale of the short range interactions among the atoms. This in turn implies the mesoscopic/macroscopic stability of the system against the quantum fluctuations of the microscopic components [6]. In a similar way, for sufficiently large N the collective interaction is protected against thermal fluctuations: the larger the energy gap is with respect to $k_B T$, the more robust is the protection against thermal fluctuations.
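The statement that a regular pure gauge configuration carries no fields can be verified directly: with $A_\mu = \partial_\mu\varphi$ and $\varphi(x,t)$ twice continuously differentiable,

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \partial_\mu \partial_\nu \varphi - \partial_\nu \partial_\mu \varphi = 0,$$

so that $\mathbf{E} = \mathbf{B} = 0$ follows from the commuting of the mixed derivatives; nonvanishing fields can only arise where this commuting fails, i.e. at the topological singularities of $\varphi(x,t)$.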
As a final comment, we note that energy losses from the system volume, which we have not considered in the discussion above, do not substantially affect the collective dynamical features. An analysis of energy losses when the system is enclosed in a cavity has been presented in [14], in connection with the problem of efficient cooling of an ensemble of N atoms. Another problem which we have not considered above is how long the system takes to set up the collective regime. This problem is a central one for domain formation in the Kibble-Zurek scenario [2,15,16]. We only remark that, since the correlation among the elementary constituents is kept by a pure gauge field, the communication among them travels at the phase velocity of the gauge field [6].
References
1. Y.M. Bunkov and H. Godfrin, Eds., Topological Defects and the Non-Equilibrium Dynamics of Symmetry Breaking Phase Transitions, NATO Science Series C 549 (Kluwer Acad. Publ., Dordrecht, 2000).
2. T.W.B. Kibble, J. Phys. A 9, 1387 (1976); Phys. Rep. 67, 183 (1980); A. Vilenkin, Phys. Rep. 121, 264 (1985).
3. P. Higgs, Phys. Rev. 145, 1156 (1966).
4. T.W.B. Kibble, Phys. Rev. 155, 1554 (1967).
5. C. Itzykson and J.B. Zuber, Quantum Field Theory (McGraw-Hill Book Co., New York, 1980).
6. E. Del Giudice and G. Vitiello, Phys. Rev. A 74, 022105 (2006).
7. E. Del Giudice, G. Preparata and G. Vitiello, Phys. Rev. Lett. 61, 1085 (1988).
8. C.C. Gerry and P.L. Knight, Introductory Quantum Optics (Cambridge University Press, Cambridge, 2005).
9. W. Heitler, The Quantum Theory of Radiation (Clarendon Press, Oxford, 1954).
10. L. Leplae and H. Umezawa, Nuovo Cimento 44, 410 (1966).
11. E. Alfinito, O. Romei and G. Vitiello, Mod. Phys. Lett. B 16, 93 (2002).
12. E. Alfinito and G. Vitiello, Phys. Rev. B 65, 054105 (2002).
13. E. Del Giudice, R. Manka, M. Milani and G. Vitiello, Phys. Lett. B 206, 661 (1988).
14. A. Beige, P.L. Knight and G. Vitiello, New J. Phys. 7, 96 (2005).
15. T.W.B. Kibble, in Topological Defects and the Non-Equilibrium Dynamics of Symmetry Breaking Phase Transitions, NATO Science Series C 549, Eds. Y.M. Bunkov and H. Godfrin (Kluwer Acad. Publ., Dordrecht, 2000), p. 7.
16. W.H. Zurek, Phys. Rep. 276, 177 (1997), and references therein.
COGNITIVE SCIENCE
ORGANIZATIONS AS COGNITIVE SYSTEMS. IS KNOWLEDGE AN EMERGENT PROPERTY OF INFORMATION NETWORKS?

LUCIO BIGGIERO
University of L'Aquila, Piazza del Santuario 19, Roio Poggio, 67040, Italy
E-mail: [email protected]; [email protected]

The substitution of knowledge for information as the entity that organizations process and deliver raises a number of questions concerning the nature of knowledge. The dispute over the codifiability of tacit knowledge, and the one juxtaposing the epistemology of practice and the epistemology of possession, can be better faced by revisiting two crucial debates: one concerns the nature of cognition, the other the famous mind-body problem. Cognition can be associated with the capability of manipulating symbols, as in the traditional computational view of organizations; with interpreting facts or symbols, as in the narrative approach to organization theory; or with developing mental states (events), as argued in the growing field of organizational cognition. Applied to the study of organizations, the mind-body problem concerns the possibility (if any), and the forms in which, organizational mental events, like trust, identity, cultures, etc., can be derived from the structural aspects (technological, cognitive or communication networks) of organizations. By siding at extreme opposite positions, the two epistemologies appear irreducible to one another, and each pays for its own inner consistency with remarkable difficulties in describing and explaining some empirical phenomena. Conversely, by legitimating the existence of both tacit and explicit knowledge, by emphasizing the space of human interactions, and by assuming that mental events can be explained through the structural aspects of organizations, Nonaka's SECI model seems an interesting middle way between the two rival epistemologies.

Keywords: cognition, emergent properties, knowledge, mental states, organization.
1. Introduction
A growing concern about knowledge, information and data as crucial competitive factors and main drivers of social development has obscured whatever differences exist among them. The possibility of creating and transferring them within and between organizations, with or without the intervention of knowledge management systems, was largely taken for granted. In this “epistemology of possession” (Cook and Brown, 1999) there are (if any) only slight differences among knowledge, information and data, and all of them can be considered “things” producible and transferable within and between organizations. This
697
698
L. Biggiero
view has been challenged by a different approach, one that started as an underground, minority position and has now reached the surface and the legitimation of an alternative paradigm. In this “epistemology of practice” knowledge appears as radically different from information and data, and it refers to the action of knowing rather than to an object. These rival views have many implications at the theoretical, empirical and managerial levels. It is not just a question of replacing information with knowledge, and of considering organizations as knowledge processors instead of information processors. Once it is stated that organizations are cognitive systems (and even this assumption is questionable), their properties should be investigated and become a disputable matter, because they depend on what “cognition” means. In extreme synthesis, cognition can be marked by three types of capabilities, listed in order of growing complexity: A) the recognition, manipulation and production of those special sensorial data which are symbols; B) the interpretation of symbols, objects and events; C) the manifestation and activation of mental states, which are usually identified with speech acts, intentionality, emotions, and purposive behavior. The two epistemologies take univocal positions, for the “A” and the “C” alternatives respectively. The epistemology of possession admits that even artificial cognitive agents can create and transfer knowledge, and it substantially denies the peculiar nature of tacit knowledge. However, this epistemology has been heavily criticized for not taking into account the specificity and the complexity of human interaction (Richardson, 2005; Tsoukas, 2005). Moreover, the theories consistent with that epistemology do not explain well the dynamics of competitiveness of firms and territorial systems (Amin and Cohendet, 2004; Nightingale, 2003).
The epistemology of practice states that only individuals and human organizations can create knowledge, and that tacit knowledge is irreducible to explicit knowledge, whose very existence is questioned. However, the epistemology of practice has serious problems explaining such irreducibility and how tacit knowledge can be stored and transferred. Other approaches allow differentiating the required capabilities between the activities of knowledge creation and transfer, and between the creation of tacit and explicit knowledge. Nobody contends that cognitive agents can have different cognitive capabilities, but the core question is: “where does cognition start?” In other words, does a minimum threshold exist for detecting the presence of cognitive capabilities, or is it just a question of degree, according to which our refrigerator,
Organizations as Cognitive Systems. Is Knowledge an Emergent Property …
699
being designed around a feedback mechanism, would be a very simple but genuine cognitive agent? If we keep the highest threshold (the “C” capability), would agent-based simulation models show mental properties? They certainly have “A”, and likely also “B”, capabilities, and furthermore they can be self-organizing and learning systems. Is all this enough to say that they have mental states? The debate on the nature of tacit knowledge and the limits of codification is fully immersed in the previous issues. The supporters of the epistemology of practice, though claiming a non-reified nature of knowledge, often treat knowledge transfer in a rather traditional way, thus raising the question: transfer of what? The confrontation of the two epistemologies raises a number of other questions: is the juxtaposition between the two paradigms so radical as to exclude any compromise position? If so, what would be the practical consequences for management and organization science? In particular, would it still make sense to speak about knowledge management systems? If yes, in which terms? What could such systems actually manage? The answers can be found only by revisiting and facing two crucial debates which have developed during recent decades, but which have received little attention from economists and organization scholars. The problem of the nature of cognition and the mind-body problem ran parallel in the second half of the last century, though the former dates back much further, to Cartesian philosophy and, to some extent, to ancient Greek and Indian philosophy. The problem of the nature of cognition concerns what “thinking” and computation mean and, paradigmatically, whether computers can think. In organization science, given that knowledge creation is the peculiarity of cognitive (thinking) agents, the issue immediately affected by this debate is that of the possible differences between data, information and knowledge.
The mind-body problem deals with the relationship between the physical and the mental states (events) in cognitive systems. It gives indications for the question of whether organizations can be considered collective minds, and thus whether they can have mental states, and whether they can be assigned socio-cognitive and social-psychological properties, like intentions, identity, trust, reputation, etc. Both debates (cognition and mind-body) supply insights on the nature of tacit knowledge and its codifiability, and on the related issue of a theory of knowledge management systems. The confusion, difficulties and ambiguities marking the plethora of positions in these fields come from the illusion that the confrontation with those fundamental issues could be avoided. The major aim of this paper is to show the strong connections between them and the problems of theorizing in organizational cognition, in simulation modeling, in firm or
700
L. Biggiero
territory competitive analysis, and finally in knowledge management systems. Here, of course, we cannot revisit both problems extensively, but only refer to them for what matters most when considering organizations as cognitive systems. The next section discusses the problem of the nature of cognition, showing the differences between (old and new) cognitivists and constructivists. The third section addresses the mind-body problem, applied to the issue of organizational knowledge creation and transfer, and proposes a correspondence between the philosophical approaches and the main current positions in economics and organization science. The fourth section addresses the codification dispute, relating it to the different positions in the debate on the nature of cognition and the mind-body problem; it is shown how the seemingly distant question of intentionality gives interesting suggestions for the codification problem. The constraining consequences of the rival epistemologies for the crucial question of knowledge transfer are discussed in section five. Finally, in the sixth section, some implications for organizations, territorial systems, and knowledge management systems are developed. The issue of complexity is not treated separately, but it will become evident how it crosses all the others. The discussion of this paper is conducted mostly at the organizational level, with a few indications for the individual and the inter-organizational (and territorial) levels.
2. Constructivism vs. cognitivism and connectionism
Winograd and Flores (1986), Varela et al. (1991), Varela (1992), and Venzin et al. (1998) look at the debate on artificial intelligence as a key reference for understanding knowledge in organizations. They identify three epistemological positions: the cognitivist, the connectionist, and the autopoietic.
In the field of economics and organization science the first perspective is well represented by Simon (1969, 1977, 1997; March and Simon, 1958), Galbraith (1973) and nonevolutionary economists, while the second has been developed in different ways by Nelson and Winter (1982), Kogut and Zander (1992), Kogut (2000), Cohendet and Llerena (2003), and Monge and Contractor (2003). Connectionists differ from cognitivists in that information and knowledge are supposed to be distributed within organizations and computed in parallel, instead of centralized and computed sequentially. Moreover, behaviors are embedded in a set of routines and rules, which are held together by means of socio-economic relationships. Indeed, the cohort of “old cognitivists” has now been replaced by “new cognitivists”, who can be considered as one single group with
Organizations as Cognitive Systems. Is Knowledge an Emergent Property …
701
connectionists. The distinction between old cognitivists and connectionists (or "new cognitivists") is addressed by Casti (1989), who suggests identifying them with, respectively, the supporters of the strong and the weak program of artificial intelligence, or equivalently the top-down and the bottom-up approaches to artificial intelligence. In the autopoietic perspective information is seen as interpreted data, and as such it cannot cross organizational borders; only data can do that. As for knowledge, while in the connectionist position it depends on the organizational network, in the autopoietic epistemology "knowledge is always private". Magalhães (1998) underlines the difference between data and information, which in essence corresponds to that between syntax and semantics: simply manipulating data can never yield information; it is necessary to interpret data through meanings. Human interactions create knowledge, which is seen as a process and not as an object: from knowledge to knowing (Orlikowski, 2002). Magalhães (1998) and Venzin et al. (1998) argue that this autopoietic viewpoint is well represented by Nonaka, Nishiguchi and Takeuchi (Nonaka and Nishiguchi, 2001; Nonaka et al., 1998; Nonaka and Takeuchi, 1995), but these authors actually consider explicit knowledge as one of the forms in which knowledge can not only be obtained or transferred, but also created. On the contrary, for the supporters of the autopoietic view, explicit (codified) knowledge is an oxymoron (Zeleny, 2005): knowledge would be associated only with its tacit nature, which in turn is seen in the act of knowing and not as a state, as something that can be possessed. The supporters of autopoiesis differ quite a bit concerning data and information. Some underline the distinction between information and knowledge (Zeleny, 2000, 2005), thus assuming the conventional view that information is a sort of structured data.
Others (Aadne et al., 1996; Magalhães, 1998; Venzin et al., 1998; Von Krogh et al., 1996) already make a sharp distinction between data and information, the latter being seen as interpreted data. Beyond the criticisms that Biggiero (2001) [4] raised against the fundamental argument that organizations are autopoietic systems, one of the problems in this debate is that constructivists often ascribe to cognitivists overly naive or positivist epistemological positions. Although it is not impossible to find such positions, it is too easy a game to ascribe to cognitivists the most traditional view of extreme rationalism and positivism. That reality is subjectively perceived does not seem a concept so hard for new cognitivists to accept (Biggiero, 2001 [5]). In fighting against the idea of objective perceptions and observers, and in emphasizing the issue of self-reference, constructivists seem to "force an open door".
702
L. Biggiero
Approaches close to social constructivism (Berger and Luckman, 1967; Brown and Duguid, 1991, 1998; Gherardi, 2001; Organization, 2000; Orlikowski, 2002; Weick, 1969, 1995) or to cybernetic constructivism (Magalhães, 1998; Mingers, 1995; Varela et al., 1991; Von Krogh et al., 1998; Watzlawick, 1984; Yolles, 2006; Zeleny, 2005) belong to the epistemology of practice. If cognition were identified only with mental states (the "C" capability), and these were allowed to pertain only to humans, and, finally, if methodological individualism were accepted while connectionism were rejected, then organizations would not be seen as cognitive systems. While we have many contributions towards a theory of organizational knowledge creation, with few exceptions (Yolles, 2006; Zeleny, 2005) we lack indications for knowledge management systems, assuming that such systems are possible and would not result in another oxymoron. If we return to the graduated scale of cognition, ranging from symbol-processing capability to the property of exhibiting and performing mental states, we see that cognitivists and constructivists lie at opposite extremes: the former tend to consider symbol processing a full sign of cognitive ability, while constructivists tend to limit cognition to systems able to exhibit and perform intentionality. Although they have not yet developed a theory of organizational knowledge creation, the supporters of social simulation through artificial societies are naturally consonant with the epistemology of possession.

3. The mind-body problem

As is well known, this problem concerns the relationship between the physical and the mental states of humans (Guttenplan, 1994; Haugeland, 1981) and, we could say by extension, of any cognitive system. It is a very old question, whose controversial systematic development dates back at least to Descartes, who first proposed his famous ontological dualism: thought and consciousness derive from mind, which is of a different substance from body.
There are no laws connecting the phenomena generated by them. Thus, the original position denies any relationship between physical and mental states. Cartesian ontological dualism is anything but anachronistic in the social sciences, and specifically in economics: the technological and economic structure of organizations is taken to have little or nothing to do with its socio-cognitive and social-psychological aspects, like identity, reputation, emotion, etc. Even more radically, the inner structure of organizations was regarded as a black box closed to investigation. Non-economic or non-technological variables were neglected or excluded from the attention of economists. This view is still the mainstream, but the evolutionary
theory of the firm and organizational economics have decided to open the black box, and at least to recognize the existence of the two spheres of physical (structural) and mental events. The second interesting position to be listed here is eliminative materialism (Churchland, 1981), which denies any peculiar existence to mental events: they are pure appearance; there is only body. This view can be found in current organizational economics too, for instance when trust is treated as a phenomenon masking calculativeness (Williamson, 2002). Trust would be considered just and purely an unclear term indicating risk and agents' ability to calculate their most convenient decisions. Analogous treatment is reserved for "epiphenomena" like reputation, identity, etc. Management and organization science do not follow this line of denying real existence to mental states, though they are marked by unilateral approaches reflecting single specialized disciplines, like accounting, marketing, etc. Thus, it is possible to find an accounting perspective on reputation, where this phenomenon is viewed and measured essentially in terms of financial results. However, the difference with economics is that, even in these partial approaches, the complex nature of mental events is recognized, and these events are further supposed to feed back on accounting or other variables as well. What has long been (and partially still is) debated is whether mental states and intentionality can be assigned at the organizational level, rather than being restricted to the individual level, and ultimately whether organizations are cognitive systems. According to methodological individualism the answer is negative, because these are properties peculiar only to individuals. This is also the dominant position in economics, with some differentiation in evolutionary economics.
Conversely, in management and organization science that possibility is increasingly accepted, and the field of study of organizational cognition, identity, trust, reputation, etc. is consolidating. Even more, and following the same idea, these concepts are beginning to be applied also at the inter-organizational and territorial levels (Biggiero and Sammarra, 2003; Sammarra and Biggiero, 2001). The third perspective is reductionism, according to which physical and mental states each have their own ontology and proper concepts and theories, but the mental derives from the physical states. This is currently the dominant view in the natural and social sciences at the individual level, with differentiated positions concerning the organizational level. Reductionism has many variants, two of which matter most for our discussion. Both can be referred to the theory of tokens, which states that to each mental token there corresponds a specific and unique physical token. The two variants differ in the epistemological relationships between physical and mental events. In one variant mental events
are totally predictable by studying the corresponding physical events, while in the other variant this predictability is denied. The latter is the position of Davidson (1980), who argues that the predictability of mental from physical events is precluded because psychology, he says, is not a science but just a categorization and rationalization of behavior. The ontological reducibility of mental to physical events is complete, but the nomological reducibility is definitely incomplete. Physical and mental events are mutually influencing, but the former have greater autonomy, because there can be physical without mental states but not vice versa. Now, besides the philosophical implications and the disputable judgment on the epistemological status of psychology (MacDonald and MacDonald, 1995), Davidson's view, once transferred to the field of organizational cognition, is interesting for our discussion. It would suggest that, although mental events and knowledge as practice come from the cognitive patterns of human interactions within the organization, the created knowledge would remain not completely available, traceable and explainable. As we will see in the next section, this could be the philosophical support for an explanation of why part of knowledge, which can be identified with tacit knowledge, will never be reduced to bytes or codified in any way.

4. The codification dispute

One of the main results of developing an evolutionary theory of the firm has been that of replacing the view of firms as information processors with a view of firms as knowledge processors (Amin and Cohendet, 2004). This change has had very valuable outcomes: a better understanding of firms' behavior, the linking of routines, decision making and ultimately firm competitiveness to capabilities and learning processes, and finally a reduction of the gap between economists and management scholars achieved by looking inside the black box.
However, the lack of a clear distinction between information and knowledge, or the reduction of knowledge to systematized or structured information, diminishes the impact of the evolutionary theory of the firm. The current debate on tacit knowledge and its codifiability mirrors this state of ambiguity and confusion. Connectionists and constructivists seem to share the same reductionist approach to the mind-body problem: mental states can be reduced to and explained by the underlying cognitive networks. However, the degree and extent of reduction and explanation can vary considerably: reduction can be almost entirely rejected by constructivists, or conversely viewed as a strict mapping by some connectionists. The latter is the position taken by many
economists in the current debate on the possibility of codifying tacit knowledge (Cowan, 2001; Cowan et al., 2000; David and Foray, 1995; Foray and Cowan, 1995). They argue that in principle it is always possible for any practice to generate a codebook, which would gather all the necessary tacit knowledge, making it explicit and consequently transferable. The concrete production of such codebooks would depend on economic convenience in a broad sense. Nightingale (2003) suggests an intermediate view between constructivists and connectionists. He shares the connectionist idea of cognitive networks as the generators of mental states, language, and tacit and explicit knowledge. Thus, he accepts the existence of both a reified and an interactive form of knowledge, and he also agrees to consider tacit knowledge as resident in cognitive patterns. He suggests justifying the existence and non-codifiability of tacit knowledge by placing it partially in the unconscious and partially in that part of conscious mental states where non-verbal knowledge resides. The rationale comes from merging studies on neural networks and consciousness by Damasio (1994, 1999) and Edelman (1987, 1989, 1992) with those on intentionality, language and consciousness by Searle (1983, 1993, 1995, 1998). In this way, tacit knowledge remains non-reducible to codebooks, but at the same time it escapes the mist of the non-definable and pseudo-scientific. Nightingale's position appears consistent with Davidson's approach to the mind-body problem. The seeming similarity between connectionists (or new cognitivists) and constructivists, in considering cognitive systems as composed of cognitive networks and their mental states as characterized in terms of emergent properties, can disappear when looking at the different outcomes of their approaches to the nature of knowledge and the mind-body problem.
For connectionists tacit knowledge would present no peculiar status preventing its codification into explicit knowledge. In the end, it would be just a question of bytes. Moreover, knowledge and mental states can be precisely, though laboriously, understood and predicted by studying the structure and evolution of the cognitive networks which are supposed to produce them. Finally, knowledge is reified to the status of an object, like a database or a book. "The range of knowledge is much greater than the range of action" (Carley, 1999: 8), while for most constructivists there is much more knowledge in practicing than in explicit knowledge. According to connectionists and new cognitivists, artificial cognitive agents can create knowledge, because cognition requires just the minimum capability (the alternative "A") and, thanks to their recursive cognitive patterns, such agents are supposed to generate (or simulate) mental events. Not being able to use natural language and being, from an evolutionary point of view, at an infant stage, their thinking is poor, at least compared with that of humans. Agent-based
simulation models (Conte and Castelfranchi, 1995; Gilbert and Terna, 2000; Gilbert and Troitzsch, 2005; Hegselsmann et al., 1996; Pitt, 2005; Sichman et al., 1998) are interesting cases for this issue. Indeed, most theorists in this field are perfectly aware of the social nature of information and knowledge, albeit they usually do not make sharp distinctions between the two. Some of them are fully engaged in steering their scientific communities away from purely computational approaches (Castelfranchi, 2002). When their simulation models are built in an emergentist way, that is, when their agents are able to see their individual and/or aggregate behavior, they possess both the property of self-reference and that of emergent cognition. Thus, they "think" and create knowledge. Is there any tacit knowledge? Computational scientists engaged in social simulation would answer this question positively as well (Falcone et al., 2002). When models are complex enough, it would be possible to obtain the knowledge-creation effects of collective behaviors while lacking the possibility of tracing exactly when, where and how they occurred. If tacit knowledge is considered a new form of the "ghost in the machine", it would be much closer to the supposed ghost operating in human interactions. Some computational scientists (Carley, 1999; Krackhardt, 1992, 1995) often tend to take an even more extreme position than "simple" cognitivists, by assigning cognitive properties also to pure symbol repositories, like databases or books. Conversely, if computers or artificial cognitive networks were denied the ability to think and create knowledge, or if some form of ontological dualism or anti-reductionism between physical and mental states were assumed, then any social simulation obtained by designing and "running" artificial societies would be totally meaningless.
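None of the cited models is reproduced here, but the emergentist design just described, in which agents can "see" their aggregate behavior and feed it back into their individual behavior, can be illustrated with a deliberately minimal sketch. The opinion variable, the averaging rule and all parameter values are illustrative assumptions, not taken from the cited literature:

```python
import random

random.seed(42)

class Agent:
    """A toy agent whose next opinion blends a local interaction with
    the aggregate behavior it can "see" (the self-referential loop)."""
    def __init__(self):
        self.opinion = random.random()  # illustrative state variable

    def update(self, other, aggregate, w=0.5):
        # Blend a pairwise interaction with the observed collective average.
        local = (self.opinion + other.opinion) / 2
        self.opinion = (1 - w) * local + w * aggregate

def step(agents):
    # Each agent observes the current collective average before acting:
    # this is the "seeing the aggregate" ingredient of emergentist designs.
    aggregate = sum(a.opinion for a in agents) / len(agents)
    for a in agents:
        a.update(random.choice(agents), aggregate)

agents = [Agent() for _ in range(50)]
spread_before = max(a.opinion for a in agents) - min(a.opinion for a in agents)
for _ in range(30):
    step(agents)
spread_after = max(a.opinion for a in agents) - min(a.opinion for a in agents)
# The population converges on a shared value that no single agent held
# initially: a (very simple) collective outcome of individual interactions.
print(spread_before, spread_after)
```

The self-reference the text emphasizes enters only through the `aggregate` term: remove it (set `w=0`) and agents still interact locally, but they no longer observe the collective pattern they jointly produce.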
On the other hand, since artificial cognitive networks are based on bytes and (seemingly parallel) computation, if the previous conditions were reversed, then explicit knowledge would make sense. Tacit knowledge could be interpreted as residual knowledge, which would be measurable and explainable but not detectable and codifiable, because embedded in the processes and "ecologies" (Carley, 1999) of interacting patterns. Nonaka seems to be in an intermediate position, which admits the existence of both tacit and explicit knowledge. According to the SECI model, one of the main goals of knowledge management systems is precisely to enable the formation of a space of interaction in which knowledge can be easily created and converted from one form to another.
5. Knowledge transfer

Although related, the questions of creation and transfer can also be considered separately: a system could be able to transfer but not to create knowledge. For example, if knowledge were considered in both its explicit and tacit forms, but at the same time the creation of knowledge were associated only with the highest cognitive capability (the "C" category of cognition), then a simple information technology platform for knowledge management would be a system able to transfer but not to create knowledge. It is noteworthy that this possibility is allowed by Nonaka's SECI model too. The reverse is apparently harder to imagine, because intuitively the capacity to create seems more sophisticated than the capacity to transfer, and so one might think that, if a system is able to create knowledge, then it is also able to transfer it. However, this logical relationship is not so certain, and it depends on the categorization of knowledge. For instance, if knowledge is narratives, if it is action in progress, if it is knowing, how can it be transferred? For radical constructivists the non-reification of knowledge prevents the possibility of transfer, regardless of how distant the supposed senders and receivers are. This limit holds even in the face-to-face relationship between a teacher and her disciples (Von Glasersfeld, 1995): not being an object, knowledge cannot be transferred. This is a very crucial issue, because even those who draw their positions from some form of constructivism, when dealing with the problem of transfer, do accept that possibility. Indeed, almost all the economic, management, and sociological literature on territorial systems, inter-firm networks, innovations and strategies is focused on, and takes for granted, the possibility of knowledge transfer. It would be quite uncomfortable to have a theory of knowledge creation that implies the impossibility of transferring that knowledge.
The epistemology of possession assumes that all three entities (data, information, and knowledge) are transferable, possibly with some difficulty when using computer-mediated communication technologies and/or when treating tacit knowledge. Indeed, in their view of tacitness there is no impossibility, just inefficiency (difficulty) in terms of computational or economic resources. They could also argue that, when knowledge is well codified, computer-mediated communication technologies are more, and not less, efficient than face-to-face interaction. These positions are firmly rejected by constructivists (Magalhães, 1998; Venzin et al., 1998). It is hard to believe that explicit knowledge does not require the intervention of tacit knowledge to produce any useful and effective outcome: no amount of cookbooks guarantees at all becoming a good cook. A reason for interest in
Nonaka's SECI model is that it admits the existence of, and considers, both tacit and explicit knowledge as two complementary forms. Nonaka et al. (2006) seem particularly aware of the problems connected to the definition of knowledge and cognition, and they try to find a way to make their model consistent with constructivism. In their model, Ba represents the contextual condition for the most critical operations of transformation between the various forms of knowledge. It could be seen as covering the area between unconscious and conscious actions where Nightingale (2003) places the limits of codifiability, and where Davidson (1980) could put the breach in the reducibility of mental events to physical events. The reinterpretation of the Japanese concept of Ba refers to that space of interaction where gnosiological and semiotic complexity (Biggiero, 2001 [6]) move perceptions and cognition into the sphere of the unconscious and of non- or para-verbal knowledge.

6. Conclusions

In the light of the debates on the nature of cognition and on the mind-body problem, the two rival epistemologies of possession and practice can be updated and better reformulated, underlining their implications. In the former perspective: (i) cognition is the emergent outcome of complex adaptive (partially self-referential) networks; (ii) knowledge can also be explicit; (iii) artificial societies can have mental states; and (iv) tacit knowledge is, at least in principle, completely codifiable. Conversely, in the epistemology of practice: (i) cognition is associated with mental states in a one-to-one relationship; (ii) explicit knowledge is an oxymoron; (iii) artificial societies, being composed of non-cognitive agents, do not have mental states and thus cannot simulate any mental event in a satisfying way; (iv) (tacit) knowledge is a peculiarity of organizations and is irreducible to information and data, because qualitatively different from them.
The two epistemologies seem counterposed and even have opposite implications. However, especially but not only in organization science, we are in a rather paradoxical situation. On one side, the epistemology of practice is recruiting more and more adherents, giving a strong and exciting impulse to the understanding of organizational mental events. On the other side, the computational approach to social science has been renewed through the development and the extraordinary heuristic power of agent-based simulation models. But, as we have seen, this perspective finds its scientific sense only within the epistemology of possession. Moreover, many social simulation scientists are supposed to be
absolutely sensitive to the question of organizational mental events. Thus, it seems that the research agenda of the coming years has to cope with this question. Being a sort of third way between the two epistemologies, Nonaka's SECI model is an interesting point of reference. It has the great merits of taking into account (or at least being open to considering the relevance of) organizational mental events, of legitimating the existence of explicit knowledge, and of admitting that computational social simulation models can be cognitive systems in the highest sense. Its missing point is that it lacks a clear theory explaining where cognition comes from, what gives space to tacit knowledge, and to what extent mental events can be derived from organizational structure. In short, it is necessary to take clear positions with respect to the two problems of the nature of cognition and the mind-body relationship.

References
1. J.H. Aadne, G. Von Krogh and J. Roos, in Managing Knowledge. Perspectives on Cooperation and Competition, Ed. G. Von Krogh and J. Roos (Sage, London, 1996), pp. 9-31.
2. A. Amin and P. Cohendet, Architectures of Knowledge. Firms, Capabilities and Communities (Oxford UP, Oxford, 2004).
3. P. Berger and T. Luckman, The Social Construction of Reality (Allen Lane, London, 1967).
4. L. Biggiero, in Sociocybernetics. Complexity, Autopoiesis and Observation of Social Systems, Ed. G. van der Zouwen and F. Geyer (Greenwood, Westport (CT), 2001), pp. 125-140.
5. L. Biggiero, Systemica 12, 23-37 (2001) (reprinted in LUISS International Journal, 2000).
6. L. Biggiero, Nonlinear Dynamics and Chaos in Life Sciences 5, 3-19 (2001).
7. L. Biggiero, Entrepreneurship & Regional Development 18(6), 1-29 (2006).
8. L. Biggiero and A. Sammarra, in The Net Evolution of Local Systems. Knowledge Creation, Collective Learning and Variety of Institutional Arrangements, Ed. F. Belussi, G. Gottardi and E. Rullani (Kluwer, Amsterdam, 2003), pp. 205-232.
9. J.S. Brown and P. Duguid, Organization Science 2(1), 40-47 (1991).
10. J.S. Brown and P. Duguid, California Management Review 40(3), 90-111 (1998).
11. J.S. Brown and P. Duguid, The Social Life of Information (Harvard Business School Press, Boston, 2000).
12. K.M. Carley, Research in the Sociology of Organizations 16, 3-30 (1999).
13. K.M. Carley and M. Prietula, Eds., Computational Organization Theory (Lawrence Erlbaum Associates, Hillsdale (NJ), 1994).
14. C. Castelfranchi, International Journal of Cooperative Information Systems 11, 381-403 (2002).
15. J.L. Casti, Paradigms Lost (Avon Books, NY, 1989).
16. P.M. Churchland, Journal of Philosophy 78, 67-90 (1981).
17. P. Cohendet and P. Llerena, Industrial and Corporate Change 12(2), 271-297 (2003).
18. R. Conte and C. Castelfranchi, Cognitive and Social Action (UCL Press, London, 1995).
19. S.D.N. Cook and J.S. Brown, Organization Science 10(4), 381-400 (1999).
20. R. Cowan, Research Policy 23(9), 1355-1372 (2001).
21. R. Cowan, P. David and D. Foray, Industrial and Corporate Change 9 (2000).
22. R. Cowan and D. Foray, Industrial and Corporate Change 6, 592-622 (1997).
23. A. Damasio, Descartes' Error. Emotion, Reason and the Human Brain (Putnam, NY, 1994).
24. A. Damasio, The Feeling of What Happens. Body and Emotion in the Making of Consciousness (William Heinemann, London, 1999).
25. D. Davidson, Essays on Actions and Events (Clarendon Press, Oxford, 1980).
26. G. Edelman, Neural Darwinism. The Theory of Neuronal Group Selection (Basic Books, NY, 1987).
27. G. Edelman, The Remembered Present. A Biological Theory of Consciousness (Basic Books, NY, 1989).
28. G. Edelman, Bright Air, Brilliant Fire. On the Matter of the Mind (Basic Books, NY, 1992).
29. R. Falcone, M. Singh and Y. Tan, Eds., Trust in Cyber-societies. Integrating the Human and Artificial Perspectives (Springer, NY, 2002).
30. D. Foray and R. Cowan, Industrial and Corporate Change 6, 595-622 (1997).
31. B. Gallupe, International Journal of Management Reviews 3(1), 61-77 (2001).
32. J. Galbraith, Designing Complex Organizations (Addison-Wesley, Reading (MA), 1973).
33. S. Gherardi, Human Relations 9 (2001).
34. N. Gilbert and P. Terna, Mind & Society 1, 57-72 (2000).
35. N. Gilbert and K.G. Troitzsch, Simulation for the Social Scientist (Open University, Buckingham, 2005).
36. S. Guttenplan, Ed., A Companion to the Philosophy of Mind (Blackwell, Oxford, 1994).
37. J. Haugeland, Ed., Mind Design. Philosophy, Psychology, Artificial Intelligence (The MIT Press, Cambridge (MA), 1981).
38. R. Hegselsmann, U. Mueller and K.G. Troitzsch, Eds., Modelling and Simulation in the Social Sciences from the Philosophy of Science Point of View (Kluwer Academic, Dordrecht, 1996).
39. B. Kogut, Strategic Management Journal 21, 405-425 (2000).
40. B. Kogut and U. Zander, Organization Science 3, 383-397 (1992).
41. D. Krackhardt, in Networks and Organizations. Structure, Form and Action, Ed. N. Nohria and R. Eccles (Harvard Business School Press, Boston (MA), 1992).
42. D. Krackhardt, Entrepreneurship Theory and Practice 19, 53-69 (1995).
43. J. Liebowitz, Ed., Knowledge Management Handbook (CRC Press, London, 1999).
44. G. MacDonald and C. MacDonald, Eds., Connectionism. Debates on Psychological Explanation (Blackwell, London, 1995).
45. R. Magalhães, in Knowing in Firms. Understanding, Managing and Measuring Knowledge, Ed. G. Von Krogh, J. Roos and D. Kline (Sage, London, 1998), pp. 87-122.
46. J.G. March and H.A. Simon, Organizations (revised edition) (Wiley, NY, 1958).
47. J. Mingers, Self-Producing Systems. Implications and Applications of Autopoiesis (Plenum Press, NY, 1995).
48. P.R. Monge and N.S. Contractor, Theories of Communication Networks (Oxford UP, Oxford, 2003).
49. R.R. Nelson and S. Winter, An Evolutionary Theory of Economic Change (Belknap Press of Harvard UP, Cambridge (MA), 1982).
50. P. Nightingale, Industrial and Corporate Change 12, 149-183 (2003).
51. I. Nonaka and T. Nishiguchi, Eds., Knowledge Emergence. Social, Technical and Evolutionary Dimensions of Knowledge Creation (Oxford UP, Oxford, 2001).
52. I. Nonaka and H. Takeuchi, The Knowledge-Creating Company (Oxford UP, NY, 1995).
53. I. Nonaka, K. Umemoto and K. Sasaki, in Knowing in Firms. Understanding, Managing and Measuring Knowledge, Ed. G. Von Krogh, J. Roos and D. Kline (Sage, London, 1998), pp. 146-172.
54. I. Nonaka, G. Von Krogh and S. Voelpel, Organization Studies 27(8), 1179-1208 (2006).
55. Organization, Special Issue on Knowing in Practice (2000).
56. W.J. Orlikowski, Organization Science 13(3), 249-273 (2002).
57. J. Pitt, Ed., The Open Agent Society. Normative Specifications in Multi-agent Systems (Wiley & Sons, NY, 2005).
58. K. Richardson, Managing Organizational Complexity. Philosophy, Theory, Application (IAP, Greenwich (CT), 2005).
59. A. Sammarra and L. Biggiero, Journal of Management and Governance 5, 61-82 (2001).
60. J.R. Searle, Intentionality (Cambridge UP, Cambridge, 1983).
61. J.R. Searle, The Rediscovery of the Mind (MIT Press, Cambridge (MA), 1993).
62. J.R. Searle, The Construction of Social Reality (Free Press, NY, 1995).
63. J.R. Searle, Mind, Language and Society. Philosophy in the Real World (Basic Books, NY, 1998).
64. J.S. Sichman, R. Conte and N. Gilbert, Eds., Multi-agent Systems and Agent-based Simulation (Springer, Berlin, 1998).
65. H.A. Simon, The Sciences of the Artificial (MIT Press, Cambridge (MA), 1969).
66. H.A. Simon, Models of Discovery (Reidel, Dordrecht, 1977).
67. H.A. Simon, Models of Bounded Rationality, Vol. 3. Empirically Grounded Economic Reason (The MIT Press, NY, 1997).
68. D.S. Staples, K. Greenaway and J.D. McKeen, International Journal of Management Reviews 3(1), 1-20 (2001).
69. H. Tsoukas, Complex Knowledge. Studies in Organizational Epistemology (Oxford UP, Oxford, 2005).
70. F.J. Varela, in Understanding Origins. Contemporary Views on the Origin of Life, Mind and Society, Ed. F. Varela and J. Dupuy (Kluwer Academic, Dordrecht, 1992), pp. 235-263.
71. F.J. Varela, E. Thompson and E. Rosch, The Embodied Mind. Cognitive Science and Human Experience (MIT Press, Cambridge (MA), 1991).
72. M. Venzin, G. von Krogh and J. Roos, in Knowing in Firms. Understanding, Managing and Measuring Knowledge, Ed. G. Von Krogh, J. Roos and D. Kline (Sage, London, 1998), pp. 26-66.
73. H. Von Foerster, Observing Systems (Intersystems Publications, Seaside, 1982).
74. H. Von Foerster, in Self-Organization and Management of Social Systems, Ed. U. Ulrich and G.J.B. Probst (Springer, NY, 1984), pp. 2-24.
75. H. Von Foerster, Understanding Understanding. Essays on Cybernetics and Cognition (Springer, NY, 2003).
76. E. Von Glasersfeld, Radical Constructivism. A Way of Knowing and Learning (The Falmer Press, London, 1995).
77. G. Von Krogh, J. Roos and K. Slocum, in Managing Knowledge. Perspectives on Cooperation and Competition, Ed. G. Von Krogh and J. Roos (Sage, London, 1996), pp. 157-183.
78. G. Von Krogh, J. Roos and D. Kline, Eds., Knowing in Firms. Understanding, Managing and Measuring Knowledge (Sage, London, 1998).
79. P. Watzlawick, Ed., The Invented Reality (Norton, NY, 1984).
80. K.E. Weick, The Social Psychology of Organizing (Award Records Inc., Newberry, 1969).
81. K.E. Weick, Sensemaking in Organizations (Sage, London, 1995).
82. T. Winograd and F. Flores, Understanding Computers and Cognition. A New Foundation for Design (Ablex Publishing Co., NJ, 1986).
83. M. Yolles, Organizations as Complex Systems. An Introduction to Knowledge Cybernetics (IAP, Greenwich (CT), 2006).
84. M. Zeleny, in The IEBM Handbook of Information Technology in Business, Ed. M. Zeleny (Thomson Learning, Padstow (UK), 2000), pp. 162-168.
85. M. Zeleny, Human Systems Management. Integrating Knowledge, Management and Systems (World Scientific, London, 2005).
COMMUNICATION, SILENCE AND MISCOMMUNICATION

MARIA PIETRONILLA PENNA (1), SANDRO MOCCI (1,2), CRISTINA SECHI (1)
(1) Univ. degli Studi di Cagliari, Fac. di Scienze della Formazione, Dip. Psicologia
(2) Univ. degli Studi di Cagliari, Fac. di Scienze della Formazione, Dip. Studi Filosofici
E-mail: [email protected], [email protected], [email protected]

The classical theories of communication hold different views on the relevance of the requirement that the communicative agent be intentional. Combining these views appears problematic, since it leads to incompatible outcomes when we try to classify communicative behaviors. Some approaches, in order to build a synthesis, shift onto the addressee the task of detecting intentionality, and thus cannot account for a number of interesting communicative phenomena. The systemic perspective, instead, through the circularity of the inferences about system elements and the sharing of the attributes and overall communicative characteristics of the system, defines, specifies and generalizes the concept of communication, making it possible to better single out the variety of phenomena connected to it and to capture the emergence of their communicative value.

Keywords: Communication, communicative system, inferential circularity, silence.
1. The multidimensionality of human communication

Human communication is a multidimensional activity: it has cognitive, social, cultural, economic and political implications, and it is closely connected to action. We must therefore deal with communication from different perspectives, according to the particular concern of the specific thematic area that frames the analysis. As a matter of fact, each perspective tends to define communication in the light of its own disciplinary interests and of the phenomena included in its domain. That is why there is no univocally accepted definition of communication. Unfortunately it is not only a matter of definition: it is also linked to issues concerning the contents and the specific features of the communicative act. Every approach tends to regard as communicative those phenomena which appear meaningful and functional from its own point of view. This fact has produced, and still produces, much confusion. Besides, the concept of communication is heavily dependent on the (mostly recent) history of the attempts made to define the concept itself. For example, Shannon and Weaver's mathematical theory identifies communication basically with a unilateral transmission of
information. The subsequent integration of the feedback concept could have constituted an element of generalization in the definition of the communicative act, since it implied the concept of exchange; but feedback is used there only to regulate the informative flow. Since then, the scheme based on directionality has continued to be attached to communication, conceived as addressed from a transmitter (Tx) to a receiver (Rx), and vice versa, through a specific channel. In this way communication has lost the sense of diffusion, of putting the context "in communis", in common, as well as the sense of sharing and of circularity. The requirement of purposiveness is another example of strong conceptual conditioning. Within the psychological approach to linguistic and communicative phenomena, the Palo Alto Group (Watzlawick et al., 1967 [12]) introduced the notion of behavioral interaction, thus widening the domain of analysis so as to include the context. This makes it possible, in principle, to build the basis of an interactive approach to human communication. This point of view lets us go beyond the traditional framework of communication as a unidirectional phenomenon (from the speaker to the listener) and regards as communicative all behaviors occurring within a definite context. In this way communication assumes a pragmatic meaning, enabling the communicative act to affect the future behaviors of the communicating subjects.

2. It is not true that all is communication

Watzlawick's famous metacommunicational axiom "One Cannot Not Communicate" (1967) was meant to emphasize the pragmatic aspect of the message; it was also introduced to stress the need to look beyond the simple semantic value, but it has often been misunderstood. In the end it has brought some confusion between two important concepts: behavior and communication.
It is true that "non-behavior does not exist in an interactive situation" (1967), because behavior has no opposite; but non-communication is not excluded a priori. It can occur in cases of pure randomness of the interaction, states of strong confusion, or the issuer's unconsciousness. Watzlawick nevertheless took an important step in defining the area of communication within an interactive situation. The consequence of his pragmatic approach, however, is that each communicative act is regarded as "preterintentional" (beyond our control, to borrow an Italian legal term), that is, beyond the agent's intentionality. Whoever interacts cannot exempt himself from communicating. Even when intentionally silent, we may communicate in a non-verbal way, through gestures, attitudes and posture. Yet it is not granted that every behavior is communication, while there are many chances that a phenomenon, apparently
without communicative features, is in fact communicative. Behavior and communication are different phenomena, and if we make them coincide: "[…] everything becomes communication (even the most accidental and unaware action) and we no longer have the possibility to understand which are the properties and specificities of communication as such…" (Anolli, 2002 [1]). Anolli further clarifies that communication must be an interactive, observable exchange involving reciprocal intentionality and awareness (Anolli, op. cit.). The exchange requires that the behavior not be unilateral. Shannon and Weaver's theory is therefore no longer sufficient, as interaction presupposes the interdependence and circularity of the relationship: a mutual change of the respective communicative attributes must occur. Anolli also clarifies the requirements of intentionality and awareness. The requirement of intentionality of the communicative act thus appears essential in order to decide when an act is communicative. On the contrary, the whole scientific tradition related to the Palo Alto School defends the non-intentionality of communication. "Popular" psychology holds a similar notion, induced by the high diffusion power of the media, which regard as a good communicator whoever is able to draw attention to himself.

3. The problem of the requirement of intentionality

Many scientists are changing their minds, holding, unlike the Palo Alto supporters, that an act must be intentional in order to be communicative. D.C. Dennett (1987) [3] maintains, in general, that man assumes that every natural agent, whether human being, animal or natural force, bases its behavior on goals. Miller and Steinberg (1975) [5], in particular, hold that every kind of human communication is made to obtain an answer or to influence the interlocutor. They maintain that the communicative act is not possible without the intentional element.
Grice (1975) [4] introduced the distinction between informative intention and communicative intention. While the former only aims to increase the informative content available to the addressee, the latter constitutes a specific will to communicate, because it is based on the addressee's awareness, on the issuer's will to share the knowledge of the diffused informative contents. These different points of view seem to lead to a dilemma: is intention fundamental in order to communicate, or not? The problem is partially overcome by Buller and Burgoon (1994) [2], who add a further qualifying element in order to decide whether we are facing a proper communicative act.
They analyze the perception, by the receiver, of the intention to communicate. On this view there is communication only when the transmitter has the intention to transmit, expresses that intention positively, and the receiver detects this will. In this way a relationship of communicative awareness is established. If the fundamental requirements are two, then we communicate only when both are present. When the intention is evidently absent, the receiver can only attribute communicative processes to the issuer, but cannot properly define them as such. Conversely, if the receiver does not perceive the issuer's intention to communicate, even though it is manifestly present (because it is openly declared, or observable by third parties), then this can only be considered a communicative attempt. Finally, mere behavior results when the issuer does not reveal an intention and the receiver does not try to detect one. Buller and Burgoon's model, in trying to mediate the dispute over intentionality, opens more problems than it solves. It introduces a gray area of attempted or attributed communication, which is the realm of miscommunication: irony and seductive, false or pathological communication. All of these are, however, fully-fledged ways of communicating. In these cases the relation of intentionality is in fact always present; it is modulated or, better, disguised with the purpose of "saying, in order not to say" (Anolli, 2002 [1]). Whoever is unable to detect the communicative intention of the interlocutor has perhaps been deceived by that very interlocutor about his/her own intentions. The communicative content does not appear to be addressed to him/her, although it actually is. Thus the intention carries out a metacommunicational role, because it is informative about the content of the communication itself.
Paul Watzlawick (1967) [12] holds this when claiming that "every communication has a content and a relationship aspect such that the latter classifies the former and is therefore a metacommunication". Sometimes, paradoxically, the true communicative content is the intention itself. In fact the pragmatic valence of the message lies in the agent's will to focus attention on his/her own will to push the addressee towards a definite behavior. That is why the content of the message can easily be inferred from clear contextual indications, as asserted by Gregory Bateson. He maintained that in every communication there exists "a level of report and a level of command" (1951): the transmitted intention, or, from time to time, the intention disguised on purpose, constitutes the command, that is, the pragmatic component of the message which aims at inducing a specific behavior in the interlocutors. On the other hand, when intention is neither transmitted nor received even though there is interaction, we are not in the presence of simple behavior,
as Buller and Burgoon think, but we enter the field of pathological communication. This is characterized by the impossibility of defining clear relationships, even for those interacting with it, as in the case of schizophrenic communication (Selvini Palazzoli et al., 1975 [10]).

4. The circularity of the inferences

The attempt to specify the concept of communication by defining its exclusive characteristics and adding attributes creates a paradoxical situation. The more you attempt to specify, that is, to distinguish the functions of the transmitter and the receiver and to characterize the specific parameters, the more you lose the general sense of communication as "sharing"; in addition, you lose more and more of its resolving power. The latter consists in the ability to characterize, decipher and distinguish all those phenomena that could be communicative but, through those very specifications, are removed from their field of attribution. Such a situation is described by the usual metaphor: you see the trees, but not the forest. An obvious case is that of silence: if the requirement of intentionality is not considered essential, everyone is inclined to recognize silence as a communicative act. But if intention is necessary in order to communicate, then considering silence an act of communication becomes problematic, unless we are satisfied with Watzlawick's (1967) [12] assertion that silence manifests the will "to communicate what one does not want to communicate", which is banally obvious. It is now necessary to turn the paradoxical formulation upside down: specifying is correct, because it avoids theoretical confusions and conceptual conflations among similar phenomena (Anolli, 2002 [1]). We must enlarge the analysis of the interactions to the whole communicative field, by considering it as a system.
In this way communicative interactions do not involve only the transmitter and the receiver: the observers also come into play, being in turn observed, within and outside the same system. Indeed, through an explicit systemic framework we can define communication in a way which is specific, complete, coherent and synthetic. Only in this way does it make sense to speak of the circularity of inference. We have long known that the inferential process is fundamental to the communicative act: it was Peirce (1894) [7] who defined the inferential property of the sign. A message with a rich endowment of signs can express a communicative content richer than that expressed in the interaction itself. Conversely, a message poor in signs, in order to be fully understood, may require a
remarkable inferential activity by the receivers. But if inference is confined to the bilateral interaction, we fall back into the previously mentioned contradictions. It is its circularity among the elements of the system that allows us to recover the sense of the "putting in common" of meanings in communication, which can therefore emerge from the circularity of the inferences and become the emergent result of a relational system with the attributes of interactivity, intentionality, visibility, awareness and expectation. With the help of other paradoxes, it is perhaps the phenomenon of silence that provides the key for reversing the interpretation. We have already analyzed silence as a communicative phenomenon of systemic nature (Penna and Mocci, 2005) [8]. Within this perspective, we found that silence can be a communicational carrier, and this occurs when it is inserted within a process of inferential circularity among the constitutive elements of the system themselves, determining the rise of a global communicative meaning (emergence of meaning). In order to overcome the impasse of intentionality we must then consider the whole communication as the emergent result of the interactions in a communicative system. Such a system is the means which enables a more integrated representation and avoids the localized approach based only on the communicative relationships. The system could be viewed as a whole emerging from the overall elements, their attributes, the possible communicative relations among the elements themselves, and the characteristics of these relations, while considering the elements as endowed with common properties. In this case all elements would be potential communicators, that is, endowed with the faculty of communicating. These abilities, if exercised, would be able to influence the state of all the other elements which constitute the system, modifying attributes and relations.
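The propagation of attributes through relations can be illustrated with a toy model. What follows is only a sketch under our own assumptions (the element names, the "meaning" attributes and the fixpoint propagation rule are invented for illustration, not part of the authors' formalism): it shows how a globally shared meaning can emerge from circular relations even though no single pairwise link carries all the meanings.

```python
# Illustrative sketch only: elements hold sets of "meanings" and pass them
# along their relations until nothing changes any more (a fixpoint).

def propagate(relations, meanings):
    """Spread each element's meanings along the given directed relations
    until no element's attribute set changes any further."""
    changed = True
    while changed:
        changed = False
        for src, dst in relations:
            missing = meanings[src] - meanings[dst]
            if missing:
                meanings[dst] |= missing   # dst acquires src's meanings
                changed = True
    return meanings

# Three actors and one observer linked in a cycle: no single pairwise link
# carries every meaning, yet circular inference distributes them all.
relations = [("A", "B"), ("B", "C"), ("C", "Obs"), ("Obs", "A")]
meanings = {"A": {"m1"}, "B": {"m2"}, "C": set(), "Obs": {"m3"}}

result = propagate(relations, meanings)
shared = set.intersection(*result.values())
print(shared)   # every element ends up sharing all three meanings
```

Note that in this sketch the observer ("Obs") participates in the circulation exactly like the actors, which is the point of the systemic reading: meanings are attested by the network as a whole, not by a single transmitter-receiver pair.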
The change of the attributes of any member of the system, that is, the change of a single one, would be reflected in those of all the others, so the system as a whole would be modified. Such a change would make a systemic property emerge, one not present before and not noticeable by considering only the elementary relations among the members of the system. In this way the relations between the elements (interactivity, visibility, mutual intentionality, awareness, the sharing of meanings and of symbolic systems) are distributed among all the actors of the communicative field. They are not assigned in a deterministic way between the issuer and the addressee. It is mainly the circular inferential process that distributes such characteristics, for example intentionality. Described at the microscopic level of the system, an interactive relation in the classical sense may not appear communicative.
The addressee may lack the awareness of being, for instance, the addressee of a message. However, this knowledge can emerge from another relation, for example between him/her and the other observers. The problem, in Buller and Burgoon's model, of the addressee who does not detect the intentionality of the transmitter would thus be overcome, being solved by the presence in the system of an observer who attests the intentionality. In this case the observer qualifies the action as communicative, making its content come up (emerge). As already said, in this context we are speaking about relations and interactions, and we therefore consider communicative phenomena basically from a pragmatic point of view: we analyze how they are determined by, and reflected in, the actions of the members of the system. Besides specifying the meaning, the systemic character of the phenomenon will also enable predictivity: the more open the system, the more it facilitates communication. Consequently, the communicative content of the system will be a function of its degree of openness. In closed systems there could be only an increase of information, while in open ones, besides detecting the state of the single relations among the elements, communication could be found too. It is in fact important to be able to estimate the communicative content of an act and its pragmatic effects. Such effects often determine substantial modifications even in non-communicative behaviors. For example, in complex organizations, companies and institutions, the communicative system nearly always reflects the global relational structure of the institution itself. A communicative act can therefore foreshadow a substantial organizational modification.
That is why it is important to extend consideration to all the elements of the field, since their role is crucial beyond their specific characteristics and communicational attributes. Naturally, the systemic redefinition of communication can also be considered useful from the semantic point of view, that is, in terms of production and circulation of meanings. In this case the system of relations is isomorphic to the context, as a matrix of meanings. The model of the communicational system can be described in terms of the characteristic levels of open systems (Minati, 2004 [6]). The microscopic level could be articulated in a preliminary identification and description of the communicative system: its members in their respective roles of actors and observers, the relational net itself, and the state and awareness of the interactions. This could be followed by a macro-description, basically constituted by an analysis of the average effects of the system; the average effects are equivalent to the description given by the classical theories of communication. Finally, we
can enter the actual phase of describing the emergence of a communicative content, constituted by the circulation of meaning created through the net of observations, inferences and intentions of the observers-actors. This is the true emergent phase of the systemic properties which, in our case, are connected to the creation and circulation of meaning and of communicative content by means of the interaction between elements.

5. Conclusions

What is the conceptual gain of the systemic approach in comparison with the classical theories? First of all, the systemic approach completes the definition of the phenomenon and allows a wider generalization, because it attributes a communicative value to a wider range of phenomena. Actions such as silence, pauses in conversation and miscommunication are considered communicative because they make meanings come up (emerge) and have pragmatic valence. Secondly, it better distinguishes communicative phenomena from merely informative ones through the concept of the circularity of inferences about awareness and intentionality, and it is coherent, since it uses the same concepts, characteristics and requirements of interaction to explain every communicative act.

References
1. L. Anolli, Ed., Psicologia della Comunicazione (Il Mulino, Bologna, 2002).
2. D.B. Buller and J.K. Burgoon, in Strategic Interpersonal Communication, Ed. Daly and Wiemann, (Erlbaum, Hillsdale, NJ, 1994), pp. 191-223.
3. D.C. Dennett, The Intentional Stance (The MIT Press, Cambridge, Mass., 1987).
4. H.P. Grice, in Syntax and Semantics, Vol. 3, Speech Acts, Ed. P. Cole and J.L. Morgan, (Academic Press, New York, 1975), pp. 41-58.
5. G. Miller and M. Steinberg, Between people: a new analysis of interpersonal communication (Science Research Associates, Chicago, 1975).
6. G. Minati, Teoria Generale dei Sistemi – Sistemica – Emergenza: un’introduzione, (Polimetrica, Monza, 2004).
7. C.S. Peirce, in Collected Papers, Vol. 2 (Harvard University Press, Cambridge, Mass., 1931-1935); the essay cited dates from 1894.
8. M.P. Penna and S. Mocci, in Proceedings of the 6th Systems Science European Congress, Paris, September 19-22, 2005, (Paris, 2005), CD-ROM.
9. J. Ruesch and G. Bateson, Communication: The Social Matrix of Psychiatry (Norton, New York, 1951).
10. M. Selvini Palazzoli, L. Boscolo, G. Cecchin, G. Prata, Paradosso e
controparadosso. Un nuovo modello nella terapia della famiglia a transazione schizofrenica (Feltrinelli, Milano, 1975).
11. C. Shannon and W. Weaver, The Mathematical Theory of Communication (University of Illinois Press, Urbana, 1949).
12. P. Watzlawick, J.H. Beavin and D.D. Jackson, Pragmatics of Human Communication. A Study of Interactional Patterns, Pathologies, and Paradoxes (Norton & Co., New York, 1967).
MUSIC: CREATIVITY AND STRUCTURE TRANSITIONS
EMANUELA PIETROCINI
Accademia Angelica Costantiniana, http://www.accademiacostantiniana.org
Dipartimento di Musica Antica, Piazza A. Tosti 4, Roma RM, Italy
E-mail: [email protected]

Music, compared to other complex forms of representation, is fundamentally characterized by constant evolution and a dynamic succession of structural reference models. This is without taking into account historical perspective, the analysis of forms and styles, or questions of a semantic nature; the observation refers rather to the phenomenology of the music system. The more abstract a compositional model, the greater the number and frequency of variables that are not assimilated into the reference structure. This "interference", which more often than not occurs in an apparently casual manner, modifies the creative process to varying but always substantial degrees: locally, it produces a disturbance in perceptive, formal and structural parameters, often resulting in a synaesthetic experience; globally, it defines the terms of a transition to a new state, in which the relations between elements and components modify the behavior of the entire system from which they originated. Examples of this phenomenon can be found across the whole range of musical production, in particular in improvisation, in the use of the Basso Continuo, and in some contrapuntal works of the baroque period: music whose temporal dimension can depart from the limits of mensurability and symmetry to define an open compositional environment in continuous evolution.

Keywords: music, emergence, complexity, creativity, structure transitions.
1. Introduction “The Changes have no consciousness, no action; they are quiescent and do not move. But if they are stimulated, they penetrate all situations under heaven. If they were not the most divine thing on earth, how could they do this?” (T'uan Chuan, Kung Tsë) In his foreword to the English translation of the Book of Changes (I Ching), a time-honored monument to Chinese thought, C.G. Jung (I Ching, 1950 [23]) puts forward an interesting interpretation of the text, in particular with regard to the philosophy of events and their succession. In this, as in other sacred Taoist and Confucian texts, the principles of causality pass almost unnoticed compared to the great importance attached to chance: “The moment under actual observation appears to the ancient Chinese view more of a chance hit than a
clearly defined result of concurring causal chain processes. The matter of interest seems to be the configuration formed by chance events in the moment of observation, and not at all the hypothetical reasons that seemingly account for the coincidence. While the Western mind carefully sifts, weighs, selects, classifies, isolates, the Chinese picture of the moment encompasses everything down to the minutest nonsensical detail, because all of the ingredients make up the observed moment ”. More significant still is his analysis of the temporal dimension of events, from which Jung extracts the ‘principle of synchronicity’: “… synchronicity takes the coincidence of events in space and time as meaning something more than mere chance, namely, a peculiar interdependence of objective events among themselves as well as with the subjective (psychic) states of the observer or observers …” (I Ching, 1950 [23]). If applied to music, these considerations offer a new key to the reading of musical events, and especially the creative moment. Composition has always made use of procedural models of various degrees of complexity to organise sounds (Bruno and Pietrocini, 2003 [12]): from the basic operations (ordering, classifying, assembling, sequencing), to syntax (melody, harmony, rhythm) and formal systems (modality, tonality, seriality …); these and other models utilised over time can be described mathematically. It is to be pointed out, though, that because of the way the models relate and interrelate, development, as in all complex systems, has been decidedly nonlinear (Bruno, 2002 [11]; Benvenuto, 2002 [9]): the really relevant transformations and changes seem to have happened ex abrupto, not supported by sufficient connections of structural, historical and aesthetic causality. 
Very often, these great changes are ascribed to the genius of a great composer, whose extraordinary intuition produced a turning point that decisively influenced all subsequent production (David and Mendel, 1966 [16]; Apel, 1967 [2]). This theory is certainly valid and supported by numerous examples, although it does not hold at all times or in all cases. However, whatever the historical considerations and aesthetic consequences, it is this phase transition that is the focus of this paper. From the systemic point of view, in fact, we can say that change is the product of an emergence process characterized by a particular form of implicit learning by the observer-composer, and of collective learning when the new system is shared, acknowledged, generalized and reworked (Von Bertalanffy, 1968 [40]; Minati and Pessa, 2006 [26]).
How structural change comes about and exactly when, we cannot know for sure; it is possible, though, to trace signs of anomalies, "interferences", that seem more often than not to appear by chance during the musical discourse; it is even more interesting to see how, even in quick succession, an accident can substantially modify the creative process. In this work I will examine some phenomenological aspects of the emergence process in music, concentrating on transformation and change; the first part will deal with elements of the compositional process connected to structural reference systems; the second part is concerned with variation in relation to the composition and performance of music, quoting some significant examples of works from the baroque period. Given the nature and purpose of the paper, use will not be made of purely musicological analytical models, but rather of an investigatory method pertaining to the field of research.

2. Part I

2.1. "Musica est scientia bene modulandi … et bene movendi"

These are the opening words of Augustine's "De Musica" (4th century AD), one of the most important musical treatises from Late Antiquity. Coherent with the principles of classical aesthetics, Augustine underlines that music should be regarded as a science, since reason is used and "proceeds according to the law of numbers in the proportional respect of time and interval" [1]. Music, therefore, is defined as the "science of well regulated movement, movement sought in itself" [1]. But what is meant by "movement" in music? We may attempt a definition by referring to dynamism: the way elements relate and interact, changes of state, modifications, transitions … Therefore, everything that takes place in the act of "cum ponere" in music is like a continuous becoming, which, taken in absolute terms, harks back to the ideal representation of the "music of the spheres" of Plato and the Pythagoreans: perfect, immanent and imperceptible to the senses.
However, it is because of this very becoming, in the world of the senses, that the idea takes concrete shape in the form of an artistic object. In this case, composition can be seen as the construction of musical architectures. The operational strategies and models utilized refer firstly to discrete space-time, which defines the framework in which the musical event is to be represented; secondly to sound and its physical qualities - pitch, intensity, timbre; thirdly to the duration of the sound, taken as describing the event in space-time (Nattiez, 1977 [30]). The application of models in the compositional process consists
Figure 1.
Figure 2.
Figure 3.
essentially in the organization of sound material into structures of different levels of complexity (Pietrocini, 2006 [33]). To make the principal features of this process clear we will quote a few examples of musical construction. One of the simplest compositions is the sequence: all you need is a certain number of sounds set out in time and ordered according to a criterion (Figs. 1, 2). If each sound in the sequence is then given a duration, we have a melodic-rhythmic structure, a "musical phrase"; to this we can apply different degrees of loudness (Forte, Piano, Crescendo, Diminuendo etc.), determining the dynamics (Figure 3). Naturally, in this case the sounds used were structured a priori. In fact, the chosen sounds belong to a range defined in the tempered system (an octave divided into 12 equal semitones), and the duration values used, which have been part of the musical code since the 13th century (Gallo, 1979 [20]), are the result of time-honoured reflection on rhythmic units and symbolic systems (Cattin, 1979 [14]). Furthermore, it is important to specify that these values represent only duration ratios, not the actual length of the sound in time; the latter is established by the pulse, more commonly defined as the "musical tempo". Simple operations in complex systems … or rather, simple operations that have contributed to the development of complex systems?

2.2. The importance of the number five

To represent the range of sounds in the example quoted we used the image of a simple keyboard (Figure 1). In mechanical keyboard instruments (piano, harpsichord, organ etc.), the pitch of the individual sounds is predefined: each key corresponds to one or more vibrating bodies (strings or pipes) which are
Figure 4.

Table 1.
Sound 1: 1
Sound 2: 3/2 × 3/2 × 1/2 = 9/8
Sound 3: 3/2 × 3/2 × 3/2 × 3/2 × 1/2 × 1/2 = 81/64
Sound 4: 2/3 × 2 = 4/3
Sound 5: 3/2
Sound 6: 3/2 × 3/2 × 3/2 × 1/2 = 27/16
Sound 7: 3/2 × 3/2 × 3/2 × 3/2 × 3/2 × 1/2 × 1/2 = 243/128
Sound 8: 2
tuned to produce a determined frequency. The frequencies are arranged in ascending order, from left to right. From the way the keys are arranged it is easy to identify a pattern that is repeated at different pitches (Figure 4). The sequence defines a framework (the octave) which has 12 sounds, here identified with numbers (white keys) and symbols (black keys). When the sequence is repeated, the same symbols and numbers are identified as sounds whose frequency is in a ratio of 2:1; for example, if sound 6 in the first series has a frequency of 440 Hz, the corresponding sound in the second series is 880 Hz. This relationship between the two sounds, the octave interval (diapason), was first identified by the Pythagoreans in their studies on the monochord: by dividing the string in half you get the next highest octave. Applying the same principle, by reducing the string to 2/3 of its length you get a fifth (diapente). The octave and the fifth have a ratio with the base sound of 2:1 and 3:2 respectively. All the other sounds of the octave can be identified through numerical relationships and the succession of fifths (Righini, 1994 [35]) (Table 1).
Figure 5.
Figure 6.
Figure 7.
Using the names of the notes to identify the sounds of the white keys in the sequence, the ratios involved in the succession of fifths can be seen in Table 2. In the present symbolic system, arranging the sounds on a five-line staff as on the keyboard, we get a scale. The ratios between adjoining sounds are called tones (9/8) and semitones (256/243). If we continue with intervals of a fifth starting with note B (the last obtained with the ratio of 2:3 in the series in Figure 5), we get the sounds of the black keys, which musically are indicated with the signs # (sharp) and b (flat) (Figure 6). By using this method, based on a succession of fifths, we obtain all 12 sounds of the octave, and the cycle should close with the note we started with (Figure 7). However, this does not happen: between the initial sound and the final note there is a discrepancy defined by the ratio 531441/524288, that is about 23.46 cents: the Pythagorean comma (the much smaller gap separating it from the syntonic comma is called the schisma).
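The arithmetic of the cycle of fifths is easy to verify with exact rational numbers. The following sketch (an illustration added here, not part of the original paper) stacks fifths, folds each result back into the octave, and recovers both the ratios of Table 2 and the Pythagorean comma:

```python
from fractions import Fraction

def stack_fifths(n):
    """Stack n perfect fifths (3/2) on a base note, folding each
    result back into the octave [1, 2) by halving as needed."""
    r = Fraction(1)
    for _ in range(n):
        r *= Fraction(3, 2)
        while r >= 2:
            r /= 2
    return r

# White-key ratios of the Pythagorean scale (cf. Table 2):
print(stack_fifths(2))   # D = 9/8
print(stack_fifths(4))   # E = 81/64
print(stack_fifths(5))   # B = 243/128

# After 12 fifths the cycle should close on the starting note,
# but misses it by the Pythagorean comma 531441/524288.
print(stack_fifths(12))  # 531441/524288
```

The loop makes the "no power of two equals a power of three" point concrete: folding by octaves only divides by 2, so twelve multiplications by 3/2 can never land exactly on 1.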
Table 2. Pythagorean scale ratios
C: 1
D: 9/8
E: 81/64
F: 4/3
G: 3/2
A: 27/16
B: 243/128
C: 2
The equal division of the octave (using a ratio of 1/2) in reality is mathematically irreconcilable with the cycle of fifths (based on the ratio of 2/3) because no power of two can ever equal any power of three.

2.3. A blanket that is too short

The problem, first identified by the Pythagorean Archytas of Tarentum (428–347 B.C.), has long been the subject of study for mathematicians and musical theorists (Boyer, 1968 [10]), also because of its wide-ranging scientific and philosophical implications. Still, the Pythagorean system remained in use in Western musical practice until the emergence of polyphony and fixed-note instruments in the 15th century (Apel, 1962 [3]). In this period, increasing use was made of intervals of a third and a sixth in polyphonic compositions, which, however, were particularly unpleasant to the ear; for this reason, organ builders began to “temper” the fifths, that is to tune them in such a way as to distribute the Pythagorean comma and obtain major thirds that were more consonant, that is closer to the ratio of 5/4. It is in the 16th century that “temperament” is first mentioned by writers: this is meantone temperament, in which all fifths are flattened by the same amount by distributing the comma and eliminating the beat on major thirds in the process of tuning (Bellasich, Fadini, Leschiutta and Lindley, 1984 [7]). However, in meantone temperament the cycle of fifths still does not close properly, since it produces a very sharp interval called the “wolf fifth”. Other solutions were found that offered greater consonance, and modifications were made to the construction of some instruments. In 1558 the musical theorist Gioseffo Zarlino proposed a radical reform of the musical scale. To the ratios of 2/1 (octave), 3/2 (fifth) and 4/3 (fourth) he
Table 3.
Unison: 1:1
Major Second: 9:8
Major Third: 5:4
Perfect Fourth: 4:3
Perfect Fifth: 3:2
Major Sixth: 5:3
Major Seventh: 15:8
Octave: 2:1
Figure 8.
added the major and minor third, which had ratios of 5/4 and 6/5 respectively. The remaining intervals were obtained by interpolating the ones that had already been determined: major tone = fifth − fourth = 9/8; sixth = fourth + major third = 5/3; seventh = fifth + major third = 15/8 (Table 3). In Zarlino’s scale (“scala naturale” or natural scale) (Figure 8) there are two different tone intervals, the major tone (9/8) and the minor tone (10/9); it cannot be considered a temperament because it cannot be obtained using a cyclic procedure, and the intervals are perfect only with respect to the base note; thus it was impractical for musical practice, despite the fact that specific instruments were built for the purpose, such as the archicembalo or arciorgano, which had 31 keys per octave, but they soon fell into disuse. Even though Zarlino’s theory was closer than any other to the phenomenon of harmonic sounds, which was not described until Sauveur, in 1700, meantone temperament was used in musical practice for much of the 17th century, while research on the cyclic method continued in parallel with the evolution of instruments and performance techniques (Tuzzi, 1993 [39]) (Figure 9). In 1691, the German Andreas Werckmeister discovered that cyclic tuning using five tempered fifths and seven Pythagorean fifths could close the cycle of fifths and eliminate the “wolf fifth”, so that music could be performed in all tonalities. Numerous variations of this system were introduced, known in Germany as well temperament, today often called unequal temperament. “The
Figure 9. Harmonic sounds; differences between harmonic sounds and sounds of the natural scale expressed in cents
Well Tempered Clavier” of J. S. Bach was the first work to systematically explore its potentialities, although we still do not know for sure to which of these temperaments the author was referring. In “well tempered” tuning systems, the tonalities differ because interval width is not constant; this aspect helps explain the reason for choosing a certain tonal framework to produce a desired expressive effect or rhetorical function, at least until the mid-19th century (Raschl, 1977 [34]). In the 18th and 19th centuries an increasing number of theorists and musicians turned their attention to the problem of temperament: Leibniz, Mersenne, d’Alembert and, among musicians, Rameau, placed the physical-mathematical modelling of acoustic phenomena and the theory of harmonic sounds at the basis of music theory and began to consider the possibility of equal temperament, which would allow music to be played in all tonalities (Fubini, 1976 [19]). Following on from the theories of Werckmeister, in 1706 the mathematician Neidhardt formalised equal temperament with the introduction of a very simple idea: he divided the octave into 12 equal parts using an exponential function. Given the octave ratio of 2/1, a semitone will have a value of the twelfth root of 2 (2^(1/12)), that is, the number which multiplied by itself 12 times gives 2 (Righini, 1994 [35]). Considering the question in acoustic terms, we can say that by multiplying a frequency by the 12th root of two we obtain a frequency that is a semitone higher than the base frequency (Table 4). Equal temperament is a theoretical expedient that became a stable part of musical practice between the 19th and 20th centuries: it eliminated the distinction between major/minor tone and diatonic/chromatic semitone, sharps and flats (for example G# = Ab), dividing the tone into two equal semitones. This simplification eliminates many of the inconveniences for fixed-note instruments.
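The exponential construction can be sketched directly (an illustration added here, not from the paper): twelve successive multiplications by the twelfth root of 2 carry a base frequency exactly to its octave.

```python
SEMITONE = 2 ** (1 / 12)  # ratio of one equal-tempered semitone

# Starting from A = 440 Hz, each multiplication raises the pitch
# by one semitone; twelve steps reach the octave, 880 Hz.
freq = 440.0
for step in range(1, 13):
    freq *= SEMITONE
    print(f"semitone {step:2d}: {freq:8.2f} Hz")
```

Because every step uses the same ratio, the cycle of semitones closes exactly, which is precisely what no tuning built from pure fifths can achieve.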
The only disadvantage is that, compared to natural harmonics, the notes are slightly out of pitch. And this is not exactly a negligible detail, even though the ear has grown almost accustomed to ignoring the difference.
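How far out of pitch can be quantified in cents (1/1200 of an octave). A short sketch (illustrative; the just ratios are those of Table 3, the equal-tempered intervals 2^(k/12) those of Table 4) computes the deviations:

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

# Natural (just) ratio vs the nearest equal-tempered interval 2^(k/12).
intervals = [("major third", 5 / 4, 4), ("perfect fourth", 4 / 3, 5),
             ("perfect fifth", 3 / 2, 7), ("major sixth", 5 / 3, 9)]

for name, just, k in intervals:
    deviation = cents(2 ** (k / 12)) - cents(just)
    print(f"{name}: {deviation:+.2f} cents")
```

The fifths and fourths come out only about 2 cents from pure, but the tempered major third is roughly 14 cents sharp of 5/4: this is the audible price of closing the cycle.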
Table 4.
Unison: 1
Major Second: 2^(2/12)
Major Third: 2^(4/12)
Perfect Fourth: 2^(5/12)
Perfect Fifth: 2^(7/12)
Major Sixth: 2^(9/12)
Major Seventh: 2^(11/12)
Octave: 2
The blanket is still too short but it gives cover, as long as we keep still … a pity that music is in continuous movement.

3. Part II

3.1. The first move on the chessboard

Returning to our simple keyboard, all 12 sounds of the octave finally have a name, a symbol to effectively represent them and a well-defined frequency: the chessboard is ready and the pieces are in place for the great game of composition. The first move is to generate a musical idea: this could be a fairly simple structure, such as an interval, a chord, a rhythmic element, or a more extended sequence, such as a series of intervals, chords or rhythms that make up a musical phrase. But when and how does the idea first come to the composer? This is the most mysterious aspect of the game. For the Romantics, the generative idea was the fruit of artistic inspiration, a manifestation of the infinite spirit which emerges through finite determinations, by virtue of being able to preconceptually sense the noumenal reality beyond phenomenal limits: “music is the artform which is most devoid of corporeal elements, in that it represents a
Table 5.
Natural notes: Do, Re, Mi, Fa, Sol, La, Si
Sharp: Do#, Re#, Fa#, Sol#, La#
Flat: Reb, Mib, Solb, Lab, Sib
German notation: C, D, E, F, G, A; H = Si natural, B = Si flat
Figure 10.
movement in itself, detached from objects and carried by invisible wings, such as the wings of the spirit” [36]. Indeed, if we think of the immortal tunes of great works, which have become part of the shared heritage of civilization and universal language (is there a community that does not identify, for example, with Beethoven’s “Ode to Joy”?) we cannot but recognize in them a transcendent principle. Great themes “speak” to individual and collective consciousness, perhaps because they represent a moment of unified experience, space-time linking the bodily self, consciousness and the mind (Solms and Turnbull, 2002 [38]). We must not forget, however, the perceptive aspect of the musical phenomenon: an idea, understood as an event-object, needs to undergo a construction process to manifest itself. In this sense, a musical theme can take shape simply by selecting sounds according to a logical criterion or even by trusting in chance and, subsequently, using a structure model to combine the various elements (Bent and Drabkin, 1980 [8]). There are numerous examples of these procedures throughout the history of musical production. The most famous of all is the B-A-C-H theme, which associates the alphabet with German musical notation (Table 5). The “Great Bach” (Johann Sebastian) transformed his name into a musical signature, and even used it in his dedications to “God the Creator of everything”. Very often, at the end of his manuscripts we find the letters S.D.G., which stand for Soli Deo Gloria; with the help of gematria, it is immediately clear that SDG (18+4+7 = 29) corresponds to JSB (9+18+2 = 29).
Figure 11. The autograph manuscript of the last page of The Art of Fugue.
The theme B-A-C-H appears in several of his compositions (Hofstadter, 1979 [21]): in the Kleines Harmonisches Labyrinth BWV 591 for organ, in the Canonic Variations on “Vom Himmel hoch” BWV 769, in the Passion according to Matthew (Matthäuspassion) BWV 244, in which the chorus sings “Wahrlich, dieser ist Gottes Sohn” (“Truly, this man was Son of God”), and in the final fugue of The Art of Fugue (Die Kunst der Fuge) BWV 1080. In this last work, which Bach left unfinished at his death, the theme appears five beats from where the score stops; there follows a note, written by his son Carl Philipp Emanuel: “Über dieser Fuge, wo der Nahme BACH im Contrasubject angebracht worden, ist der Verfasser gestorben” (“At the point where the composer introduces the name BACH in the countersubject to this fugue, the composer died”). Beyond the musicological controversies on the truth of this statement, we would like to think that, yet again and for the last time, the great Bach had placed his seal on an inheritance whose very incompleteness is of the most profound significance: the eternal tension of continuous transformation (Figure 11).

3.2. The game of changes

As briefly mentioned before, the initial musical idea can emerge in different ways, not ascribable to a project or to a logical criterion established a priori. There is a particular musical practice in which the theme stems from the performance itself: improvisation (Simpson, 1667 [37]). Improvisation is present in all musical cultures and it is reasonable to believe that it contains the germs of creative production: an intuitive process in which the generative idea emerges through a form of implicit learning which involves both cognitive and affective aspects. “Musical thinking”, in fact, involves an integrated experience of the
perceptive world in every moment of the temporal continuum (Solms and Turnbull, 2002 [38]). In improvisation, each single phrase emerges with the need to continually transform the thematic material; there is no development in the mere repetition or reiteration of models. Only through variation, understood as a dynamic process of revision, can creative production manifest itself fully. The act of variation, which involves modifying the organizational models and structures of musical material, is a conscious, ordered operation (Pietrocini, 2006 [33]). The generative idea is adapted by modifying pitch, duration, and dynamic characteristics in terms of the present, consciously and especially with reference to relational systems. It is impossible in this paper to discuss the incredible number of variation techniques: we will simply try to briefly describe the basic elements and illustrate, through a historical perspective, some examples that can help shed light on different levels of complexity.

3.3. Variations and simple systems of musical organization

3.3.1. Rhythmic variations → rhythmic system

Given the initial rhythmic element, changes can be made to duration values, pulse and/or subdivision.

Original rhythmic element
Example of duration variation
Example of duration variation
3.3.2. Melodic variations → melodic system

Given a series of initial sounds, changes can be made to the order:
Original series
Example of inverse variation (the original is turned upside down and the intervals are inverted)
Example of retrograde variation (the original is reversed, starting from the end). The primary structure or musical phrase is the result of interaction between rhythmic and melodic systems, and their respective forms of variation: a process of rhythmic variation has been implemented by applying duration values, according to an established pulse, to a series of consecutive notes in the octave, as in the example
The original series is still recognizable, but the product of the transformation can no longer be assimilated to it, because their formal characteristics are incompatible. Interaction between rhythmic and melodic systems through variation has led to the configuration of an entity whose relational models belong to another level of complexity and a new reference system (Pessa, 2000, 2002 [31,32]). In fact the original has undergone a process of change.

3.4. Variations and complex systems of musical organization

Historically, the organization of a musical phrase according to pitch and duration arose from the evolutionary needs of language: from the earliest recorded history, the vocal modulation of sounds has been associated with the rhythm and stress of words in Western and Eastern civilization.
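The melodic variations described above, inversion and retrograde, amount to simple operations on a series of pitches. A minimal sketch (the MIDI-style semitone numbering is an assumption of this illustration, not the paper's notation):

```python
def inversion(series):
    """Mirror each interval around the first note (upside down)."""
    first = series[0]
    return [first - (p - first) for p in series]

def retrograde(series):
    """Play the series backwards, starting from the end."""
    return list(reversed(series))

# A short series as MIDI-style semitone numbers: C4 D4 E4 G4.
original = [60, 62, 64, 67]
print(inversion(original))   # [60, 58, 56, 53]  (C4 Bb3 Ab3 F3)
print(retrograde(original))  # [67, 64, 62, 60]
```

Composing the two operations gives the retrograde inversion, the fourth classical form of a theme alongside the original.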
Figure 12. Melisma on the syllable DE.
As regards classical Western tradition, the first documented forms of melodic-rhythmic variation are to be found in mediaeval liturgical chant: it involves ornamental procedures obtained through the performance of several notes for the same syllable of text; these new melodic structures, called melismas, were then sung to other texts, giving rise to Tropes and Sequences, self-contained pieces which soon became accepted parts of the sacred repertoire (Cattin, 1979 [14]) (Figure 12). Melismas are probably one of the most archaic forms of improvisation: ethnomusicologists have identified numerous examples in primitive cultures and in popular tradition. In the same context, in particular in group improvisations, the use of melismas has been ascertained in practices involving polyphony, i.e. the simultaneous performance of different sounds or series of different sounds, which are superimposed and proceed in parallel (de la Motte, 1981 [29]).

3.5. Counterpoint

Polyphony in classical music is documented for the first time in Scotus Eriugena’s De Divisione Naturæ in the ninth century, but we may suppose that before this there was a consolidated custom of sacred and profane vocal practice, arising probably from the performance of the same melodic line with voices of different register (Apel, 1962 [3]; Howen, 1992 [22]). The structures obtained from the simultaneous association of several sounds or several melodic lines are called harmony and counterpoint, respectively. In these frameworks, the procedural models are very complex: the techniques of variation establish intricate networks of relationships between these systems. As described above, the problem of relations between intervals and the correct relationship between sounds has always been at the heart of musical research. Counterpoint (punctum contra punctum) involves a series of interval relationships established between superimposed sounds, which in turn are organised into independent melodic lines.
The relationships and techniques of variation in this system were first formalised in the 12th century, with the development of organum (de la Motte, 1981 [29]). This is a composition for several voices in which one or more
Figure 13. Parallel, note against note, at the same interval distance.
Figure 14. Contrary motion, note against note.
Figure 15. Melisma, several notes against one.
overlying or underlying parallel melodies are added to the vox principalis, a monodic chant from liturgical repertoire; the former are variations of the cantus firmus sung by the vox principalis (Figures 13,14,15). In organum, as in later polyphonic forms, the process of composition consists in the variation of the cantus firmus or tenor (which as of the 13th century was a melodic phrase complete with mensural values) and the superimposition of the original idea and melodic lines or phrases derived from it. Once again, a process of change has taken place. In fact, the contrapuntal process determines the emergence of properties that configure a new dynamic system (Baas and Emmeche, 1997 [4]). Counterpoint has greatly influenced the evolution of musical production, indeed it was the only system of reference for classical music until the 16th century and it has determined irreversible changes to the formal representation of musical thought. In a certain sense, it can be said that the contrapuntal system constitutes, even today, the archetype of musical architecture.
Figure 16.
3.6. Harmony

Harmony, too, can be seen as a “vertical reading” of counterpoint. Probably this new model stemmed, as always, from a practice widespread among musicians who played a polyvocal instrument: the reductio partituræ (Del Sordo, 1996 [17]), which involved playing the polyphonic score on an instrument; inevitably, the musician had to read vertically, concentrating on simultaneous sounds. This practice, adapted to the resources of the instrument, led to chords being formed (Figure 16). Very soon, the process of variation introduced through the vertical reading of polyphony gave rise to a completely autonomous system, based on a succession of chords in relation to the main melody (Caccini, 1614 [13]; Bianconi, 1982 [6]). The most evident aspect of this change is in the multiplication of the different levels of perspective on which the sound material is placed. In counterpoint all voices play an equal part in the musical texture; the generative idea, for example the theme of a fugue, is presented by each in turn and all reworked elements (countersubjects, divertimenti, etc.) underline it, without needing to use musical dynamics; in harmony, hierarchies are defined on the basis of the expressive primacy of the original idea (Bach, 1753 [5]), which appears in a single voice; therefore it is necessary to differentiate the functions and the dynamic levels of the other parts. The new system began to be assimilated as a form from the middle of the 16th century. In his Istituzioni harmoniche, Zarlino (1558) [41] speaks of music as an instrument that “moves the heart”, closely connected to the text: “we now have to try to make the harmony fit the words. I said make the harmony fit the words, because although in the second part … it was said that melody is a mixture of speech, harmony, and number, and that in a composition one should not be before the other, yet speech is the main thing and the other two parts are there to serve it …”.
This relational model took on concrete shape in the practice of the Basso Continuo, an extempore form of accompaniment which was popular in European musical culture until the early 1800s (Ferguson, 1975 [18]). It must be added that the entire harmonic system, in modal and tonal frameworks, is based on hierarchical relationships: each chord has a specific function and can be placed in relationship with the others on the basis of procedural models defined in terms of agogic criteria. We spoke of Zarlino’s important studies and theories in the first part. We may add that the development of the harmonic system kept pace with research and developments in temperament, and work on relational harmonic models has contributed decisively to the configuration of musical macro systems (modality, tonality, seriality …). Nevertheless, we must again underline that the development of harmony is also fundamentally linked to the extemporaneousness of the variation process. At the height of the baroque period, instrumental music for keyboard and lute used harmonic models and redefined them on the basis of the resources of the instruments (Hubbard, 1965 [25]). Improvisation, widespread in all vocal and instrumental music, involves an exploration of technical and expressive possibilities that goes well beyond custom. Even where the relationship is one of subordination to the melody, as in the Continuo, chord agglomerations tend to accumulate extraneous elements such as appoggiaturas, chromaticism and passing notes. In improvised compositions such as toccatas and unmeasured preludes, these “interferences”, which used to be accidental and sporadic, became consolidated practice and then an integral part of the musical discourse (Moroney, 1985 [28]).

3.7. Toccatas and Preludes

Obviously, this brings substantial change to the entire relational system: if we listen to a toccata by Johann Jakob Froberger (a) or an unmeasured prelude by Louis Couperin (b), it is easy to lose one’s musical orientation; in these codified pieces the “interferences” are used in a conscious way to produce stupendous pieces of great expressive strength with incredibly “modern” harmonic solutions.
(a) German composer (Stuttgart 1616 − Héricourt, Montbéliard 1667).
(b) French composer (Chaumes-en-Brie 1626 − Paris 1661).
Figure 17. Beginning of the first Toccata FbWv 101 by Johann Jakob Froberger.
Figure 18. Beginning of the prélude non mesuré in A flat by Louis Couperin (original in Ms Bauyn, Bibliothèque nationale de France, Rés. Vm(7) 674-675; modern edition: Oiseau-Lyre, 1985).
Froberger was active in France between 1652 and 1662: he was certainly acquainted with many musicians at the French court, and in all probability knew Louis Couperin, then organist at the church of Saint-Gervais in Paris. What is certain is that Couperin was well acquainted with the works of Froberger, even entitling one of his Préludes non mesurés “à l’imitation de Monsieur Froberger”. The first beats, in fact, have a strong analogy to the beginning of one of the German composer’s toccatas; almost immediately, though, a different course is taken, and, although we can see the use of assimilable phraseological models, the compositional solutions and harmonic successions denote completely original development (Figures 17, 18).
It is to be noted that the type of notation used by the two composers is completely different: Froberger used mensural notation, with well-defined duration values, although the division into beats and pulses is not always indicated with precision; Couperin, on the other hand, used a white notation without duration values or indications of time, but with long curved lines (tenues) to indicate sound groupings belonging to the same chord structure. These two choices probably denote efforts to effectively represent a method of performance on the one hand, and on the other the desire to leave factors involving the extemporary nature of interpretation up to the performer.

4. Conclusions

What is of greatest interest, for this paper, is the concrete example offered by these two compositions of some nonlinear effects of the transformations that take place in musical structures through variation. Perhaps it is not superfluous to underline that, in music, variation also constitutes a formal structure, much used in classical and popular musical production; however, we have deliberately ignored this aspect to focus on variation understood as a dynamic process involving the reworking and transformation of the original material. There is one thing that all the elements described have in common: variation is used to substantially modify relational systems only under certain critical conditions. One of these seems to be extemporaneity; in improvisation, the temporal dimension is the here and now: there is no planning, correction or going back, but a course is plotted along the lines of intuitive thought (Morin, 1993 [27]). This is a very particular form of implicit learning on the go, which perhaps is analogous to the mysterious cognitive processes of early childhood. A second element of criticality is the saturation of procedural models; extremely high levels of complexity are liable to lead to the complete “collapse” of a compositional system.
This phenomenon can be seen, for example, in some of the contrapuntal works of Johann Sebastian Bach (David, 1972 [15]): right in the middle of a highly intricate and faultless fugue, we may come across what could easily be labeled an “error”: a false relation, a prohibited movement, a dissonant interval that is not resolved in the way it should be. Incredibly, the presumed error is immediately followed by a structure that redefines its function according to a new procedural model, re-establishing overall balance. In any event, what is most interesting is not so much the transformation itself, but the collective learning that it determines (Minati and Pessa, 2006
[26]): in time the new systems are recognized, codified, used by the social community and incorporated into shared heritage. We can say that the very meaning of music emerges from this continuous invention and transformation: in the great game of change, the match never ends.

References
1. Agostino, De musica (Sansoni, Florence, 1969).
2. W. Apel, Geschichte der Orgel- und Klaviermusik bis 1700 (Bärenreiter-Verlag, Kassel, 1967).
3. W. Apel, Die Notation der polyphonen Musik, 900-1600 (Breitkopf & Härtel Musikverlag, Leipzig, 1962).
4. N.A. Baas and C. Emmeche, Intellectica 25(2), 67-83 (1997) (also published as SFI Working Paper 97-02-008, Santa Fe Institute, New Mexico).
5. C.Ph.E. Bach, Versuch über die wahre Art das Clavier zu spielen (1753) (Italian translation: L'interpretazione della musica barocca - un saggio di metodo sulla tastiera, Ed. G. Gentili, Edizioni Curci, Verona, Milan, 1995).
6. L. Bianconi, Il Seicento, in Storia della Musica, Vol. IV (Società Italiana di Musicologia, EDT, Turin, 1982).
7. A. Bellasich, E. Fadini, S. Leschiutta and M. Lindley, Il Clavicembalo (EDT, Turin, 1984).
8. I. Bent and W. Drabkin, Analysis (Macmillan Publishers Ltd., London, 1980) (Italian translation: Analisi musicale, Ed. C. Annibaldi, EDT, Turin, 1990).
9. S. Benvenuto, Lettera Internazionale 73-74, 59-61 (2002).
10. C.B. Boyer, A History of Mathematics (John Wiley & Sons, 1968) (Italian translation: Storia della Matematica, Arnoldo Mondadori Editore, Milan, 1980).
11. G. Bruno, Lettera Internazionale 73-74, 56-58 (2002).
12. G. Bruno and E. Pietrocini, in Mathesis Conference Proceedings, Vasto, April 10-12, 2003, Ed. E. Rossi (Abruzzo Regional Council Presidency, 2003).
13. G. Caccini, Le Nuove Musiche et nuova maniera di scriverle (1614) (facsimile of the original at the Florence National Library, Archivium Musicum S.P.E.S., Florence, 1983).
14. G. Cattin, "Il Medioevo I", in Storia della Musica, Vol. I, Part II (Società Italiana di Musicologia, EDT, Turin, 1979).
15. H.T. David, J.S. Bach's Musical Offering (Dover Publications, New York, 1972).
16. H.T. David and A. Mendel, The Bach Reader (W.W. Norton, New York, 1966).
17. F. Del Sordo, Il Basso Continuo (Armelin Musica - Edizioni Musicali Euganea, Padua, 1996).
18. H. Ferguson, Keyboard Interpretation from the 14th to the 19th Century (Oxford University Press, New York, 1975).
19. E. Fubini, L'estetica musicale dall'antichità al Settecento (Einaudi, Turin, 1976).
20. A. Gallo, in Storia della Musica, Vol. II (Società Italiana di Musicologia, EDT, Turin, 1979).
21. D.R. Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid (Basic Books, New York, 1979) (Italian translation: Gödel, Escher, Bach: un'Eterna Ghirlanda Brillante, Adelphi, Milan, 1984).
22. H. Howen, Modal and Tonal Counterpoint from Josquin to Stravinsky (Wadsworth Group/Thomson Learning, Belmont, CA, 1992).
23. I Ching, The "I Ching" or the Book of Changes (Bollingen Foundation, New York, 1950).
24. Italian translation of [22]: Il Contrappunto modale e tonale da Josquin a Stravinsky (Ed. Curci, Milan, 2003).
25. F. Hubbard, Three Centuries of Harpsichord Making (Harvard University Press, Cambridge, 1965).
26. G. Minati and E. Pessa, Collective Beings (Springer, New York, 2006).
27. E. Morin, Introduzione al pensiero complesso. Gli strumenti per affrontare la sfida della complessità (Sperling e Kupfer, Milan, 1993).
28. D. Moroney, Critical Apparatus, in L. Couperin, Pièces de Clavecin, publiées par Paul Brunold (Éditions de l'Oiseau-Lyre, Monaco, 1985).
29. D. de la Motte, Kontrapunkt - Ein Lese- und Arbeitsbuch (Bärenreiter-Verlag, Kassel, 1981) (Italian translation: Il Contrappunto, un libro da leggere e da studiare, Ricordi & C., Milan, 1991).
30. J.J. Nattiez, Il discorso musicale (Einaudi, Turin, 1977).
31. E. Pessa, La Nuova Critica 35, 53-93 (2000).
32. E. Pessa, in Emergence in Complex Cognitive, Social and Biological Systems, Ed. G. Minati and E. Pessa (Kluwer Academic/Plenum Publishers, New York, 2002), pp. 379-382.
33. E. Pietrocini, in Systemics of Emergence, Ed. G. Minati, E. Pessa and M. Abram (Springer, New York, 2006), pp. 399-415.
34. E. Raschl, Beihefte der Denkmäler der Tonkunst in Österreich 28, 29-103 (1977).
35. P. Righini, L'acustica per il musicista. Fondamenti fisici della musica (Zanibon, Padua, 1994).
36. F.W.J. Schelling, "Philosophie der Kunst", in Schellings Werke, Vol. III (Eckardt Verlag, Leipzig, 1907).
37. C. Simpson, The Division-Viol or The Art of Playing Extempore upon a Ground (1667), lithographic facsimile of the second edition (J. Curwen & Sons, London).
38. M. Solms and O. Turnbull, The Brain and the Inner World (2002) (Italian translation: Il cervello e il mondo interno, Raffaello Cortina Editore, Milan, 2004).
39. C. Tuzzi, Clavicembali e Temperamenti (Bardi Editore, Rome, 1993).
40. L. von Bertalanffy, General Systems Theory (George Braziller, New York, 1968).
41. G. Zarlino, Istituzioni harmoniche (Venice, 1558).
THE EMERGENCE OF FIGURAL EFFECTS IN THE WATERCOLOR ILLUSION

BAINGIO PINNA (1), MARIA PIETRONILLA PENNA (2)

(1) Department of Science of Languages, University of Sassari, Via Roma 151, I-07100 Sassari, Italy, email: [email protected]
(2) Department of Psychology, University of Cagliari, Via Is Mirrionis 1, Cagliari, Italy, email: [email protected]

The watercolor illusion is characterized by a large-scale assimilative color spreading (coloration effect) emanating from thin colored edges. The watercolor illusion enhances the figural properties of the colored areas and imparts to the surrounding area the perceptual status of background. This work explores interactions between cortical boundary and surface processes by presenting displays and psychophysical experiments that exhibit new properties of the watercolor illusion. The watercolor illusion is investigated as supporting a new principle of figure-ground organization when pitted against the principles of surroundedness, relative orientation, and Prägnanz. The work demonstrates that the watercolor illusion probes a unique combination of visual processes that sets it apart from earlier Gestalt principles and can compete successfully against them. This illusion exemplifies how long-range perceptual effects may be triggered by spatially sparse information. All the main effects are explained by the FACADE model of biological vision, which clarifies how local properties control depthful filling-in of surface lightness and color.

Keywords: perceptual organization, grouping principles, color spreading, figure-ground segregation, filling-in.
1. Introduction

The watercolor illusion (Pinna, 1987 [18]; Pinna et al., 2001, 2003 [19,20]) is an assimilative spread of color emanating from a thin colored edge (orange) lining a darker chromatic (purple) contour (see Figure 1). The spread of color (coloration effect) is uniform and extends over large distances. The spatial limit of color spreading is approximately 45 deg. The coloration is complete at 100 ms. All colors can generate a strong coloration effect. The watercolor illusion also occurs on colored and black backgrounds. The optimal line thickness is approx. 6 arcmin. The color spreading effect is
Figure 1. The coloration effect in the watercolor illusion: when a purple contour is flanked by an orange edge, the entire enclosed area appears uniformly colored by the color spreading of the orange edge. The coloration appearance is like a solid surface color.
Figure 2. When the figure of Fig. 1 is physically and entirely colored inside with the same orange used in the fringe, the frame appears as a flat plane with two inner wiggly rectangles lying on it. The frame does not manifest the strong figural effect of Fig. 1, where it appears as a rounded surface with two small wiggly rectangles perceived as holes within the solid frame, revealing the white empty space behind them.
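The wiggly contours typical of watercolor displays are easy to generate programmatically. The sketch below is our own illustrative construction (the original stimuli were hand-drawn, as noted in Section 2.1.2): for simplicity it builds a closed wiggly contour as a sinusoidally modulated circle rather than the wiggly rectangles of Figs. 1-2, together with a slightly inset parallel fringe; `wiggly_contour`, `inner_fringe`, and all parameter values are hypothetical names and settings, not taken from the paper.

```python
import numpy as np

def wiggly_contour(radius=1.0, amp=0.04, cycles=18, n=720):
    """Closed wiggly contour: a circle whose radius is modulated
    sinusoidally. amp and cycles are illustrative values chosen to
    mimic the undulating lines of watercolor displays."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    r = radius * (1.0 + amp * np.sin(cycles * theta))
    return r * np.cos(theta), r * np.sin(theta)

def inner_fringe(x, y, inset=0.03):
    """Shrink the contour toward its centroid to obtain the thin
    inner fringe running parallel to the outer contour."""
    cx, cy = x.mean(), y.mean()
    return cx + (1.0 - inset) * (x - cx), cy + (1.0 - inset) * (y - cy)

# Outer (purple) contour and inner (orange) fringe of one figure:
px, py = wiggly_contour()
ox, oy = inner_fringe(px, py)
```

Plotting `(px, py)` in purple and `(ox, oy)` in orange, each with a stroke of roughly 6 arcmin at the intended viewing distance, yields a display in which the orange appears to spread over the enclosed area.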
much stronger with wiggly lines, but it also occurs with straight lines and with chains of dots (see Pinna et al., 2001 [19]). High luminance contrast between the inducing lines produces the strongest coloration effect; however, the color spreading is still clearly visible at near equiluminance. In high luminance contrast conditions there is an asymmetry in the amount of color spreading from the two lines: the line with the lower luminance contrast relative to the background spreads proportionally more than the line with the higher contrast. The color spreads in directions other than the line orientation. In quasi-equiluminant conditions both lines spread with a similar intensity and in opposite directions; although the two spreads go in opposite directions, they produce a coloration effect that combines both colors in terms of saturation. This reciprocal chromatic influence between lines also operates when there are more than two adjacent and parallel lines. When both lines are replaced by chains of dots and the dots of the inner chain alternate between different colors, they spread less strongly, but the resulting color is a combination (additive mixture) of the component colors. The colors also combine when there are more than two colors or when the dots of the outer
purple chain are replaced by dots of alternating colors. By inserting an empty gap between the two lines, the color spreading is weakened. These phenomenological features, as well as others discussed in the following, suggest that the watercolor illusion should be considered an emergent effect fulfilling constraints which cannot be reduced to the standard ones introduced by Gestalt psychologists (like surroundedness, relative orientation, and Prägnanz). Rather, it evidences the operation of an entirely new figure-ground organization principle. The latter can be understood, from a theoretical point of view, as a consequence of the interaction between two different processes operating within visual cortex, that is, parallel boundary grouping and surface filling-in. The complementarity of these processes, described by a recent model of visual perception (Grossberg, 1994, 1997, 2000 [4,5,6]), seems to account both for the observed visual phenomenology and for the neurobiological findings about visual cortex.

2. Figural Effects in the Watercolor Illusion

The watercolor illusion not only imparts the color of the inner edge onto a large enclosed area (coloration effect); it also enhances the figural property (figural effect) of this area relative to the surrounding (complementary) area, which appears as background (Pinna, 1987 [18]; Pinna et al., 2001, 2003 [19,20]). In Figure 1, the frame surrounding the inner wiggly rectangles manifests a figural appearance and a univocal (poorly reversible) figure-ground segregation that is not comparable with a condition where the same outlined figure is physically and entirely colored inside with the same orange used in the fringe (see Figure 2). In Figure 1, the watercolor illusion strengthens the figural effect of the frame by segregating it in depth and giving it the perceptual property of a rounded surface extending out from an otherwise flat surface (volumetric effect).
On the other hand, the two small tilted wiggly rectangles appear as holes within the solid frame, revealing the white empty space behind them. In contrast to Figure 1, in Figure 2 the frame appears as a flat plane with two inner wiggly rectangles lying on it. While in Figure 1 the figure-ground organization is difficult to reverse in favor of the wiggly rectangles appearing as figures on top of a large flat rectangle, in Figure 2 this latter result may appear more salient than the complementary perceptual result, in which the large frame is perceived in front with two wiggly rectangular holes. In Figure 3, a brighter physical orange within the frame elicits a perceptual figure-ground organization
Figure 3. When the physical orange of Fig. 2 within the frame is made brighter, it is easier than in Fig. 2 to perceive small wiggly rectangles upon a large flat rectangle.
Figure 4. A control for Fig. 1 obtained by replacing the orange line with a purple line like the outer one. Unlike in Fig. 1, the frame does not show any figural salience comparable with the watercolor condition of Fig. 1.
similar to that of Figure 2, but differing even more strongly from Figure 1; that is, it is easier to perceive small wiggly rectangles upon a larger flat rectangle. These percepts can be interpreted in the light of Rubin's principle of relative contrast (Rubin, 1921 [22]): all else being equal, the region with the higher contrast tends to appear as figure. Therefore the frame in Figure 2, having a greater contrast than the one in Figure 3, tends more to appear in front as a figure; however, in contrast with Rubin's principle, it appears much less as a figure than the watercolored frame in Figure 1. Figure 4 shows a control for Figure 1, obtained by replacing the orange fringe with a line of the same purple as the outer boundary of the frame in Figure 1. In Figure 4, the frame does not present the same figural salience as in Figures 1, 2 and 3; rather, it is more clearly perceived as a rectangular background behind the inner wiggly rectangles, which are now more strongly perceived as two solid figures and not as empty spaces or holes, as in Figures 1, 2 and 3. By adding an orange fringe to Figure 3, the strong figure-ground conditions of Figure 1 are restored (see Figure 5). Notice that in Figure 5 the inner orange is physically the same as the one in Figure 3, but it appears darker, much denser, and much more like a surface color (Katz, 1911, 1930 [14,15]) than the one in Figure 3. The darkness is due to the coloration effect, while the surface color appearance depends on the figural effect of the watercolor illusion. Both come
from the darker orange fringe added to the purple outer line, which becomes the boundary of the figure. In Figures 1 and 5, the figural effect of the watercolor illusion is pitted against, and prevails over, the classical Gestalt factors of surroundedness (Rubin, 1921 [22]) and relative orientation (Bozzi, 1975 [1]). In the next Sections, the figural effect of the watercolor illusion is investigated as a new principle of figure-ground organization when pitted against the principles of surroundedness, relative orientation, and Prägnanz. It has been shown (Pinna et al., 2003 [20]) that the watercolor illusion can be considered a distinct and more effective principle of figure-ground segregation than the usual ones of proximity, good continuation, closure, symmetry, convexity, and past experience (Wertheimer, 1923 [23]).

2.1. Experiment 1: Watercolor illusion vs. Surroundedness

In this experiment, the watercolor illusion, assigning to a given region the status of figure against the Gestalt principle of surroundedness, is shown under simpler conditions than those illustrated in Figures 1 and 5. The surroundedness principle states that, all else being equal, a shape surrounded by a larger one tends to be perceived as a figure, while the surrounding shape appears as background.

2.1.1. Subjects

Fourteen undergraduate students who were naive to the purpose of the experiment participated. All had normal or corrected-to-normal vision.

2.1.2. Stimuli

The basic stimulus was made of two squares of different size, one concentrically included within the other (Figure 6). The side of the outer square was fixed (10.2 deg), whereas the side length of the inner square was varied as follows: 7.9, 6.3, 4.0, and 2.3 deg. According to the Gestalt principle of surroundedness, the smaller, enclosed square should be perceived as figure. Five edge conditions were used: (i) Purple contour only. (ii) Orange fringes lining the interspace (“frame”) between the two squares.
This last condition pits the watercolor effect against the Gestalt factor of surroundedness. (iii) Orange fringes lining the inside edge of the small square and the outside edge of the large square. Here, the watercolor effect is synergistic with surroundedness. (iv) Red fringes lining the frame between the two squares. The watercolor illusion should be weaker
Figure 5. By adding an orange fringe to Fig. 3, the strong figure-ground conditions of Fig. 1 are restored. The inner orange is physically the same as the one in Fig. 3, but it appears darker, much denser, and much more as a surface color than the one in Fig. 3.
Figure 6. Stimulus used to test the watercolor illusion against the Gestalt factor of surroundedness in determining figure-ground organization. According to the Gestalt factor of surroundedness, the smaller enclosed square should be perceived as figure, but when this factor is pitted against the figural effect of the watercolor illusion, the small enclosed square appears as a hole and the frame between the two squares is now perceived as a figure.
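As an aside, the stimulus sizes in Experiment 1 are specified in degrees of visual angle; at the 50 cm viewing distance used in this experiment they can be converted to physical extents with the standard full-angle formula. This is a minimal sketch; the function name `deg_to_cm` is ours, not the authors'.

```python
import math

def deg_to_cm(angle_deg, viewing_distance_cm=50.0):
    """Physical extent subtending `angle_deg` of visual angle at the
    given viewing distance: s = 2 * d * tan(theta / 2)."""
    return 2.0 * viewing_distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Outer square (10.2 deg) and inner squares (7.9, 6.3, 4.0, 2.3 deg):
outer_cm = deg_to_cm(10.2)                      # about 8.9 cm
inner_cm = [deg_to_cm(a) for a in (7.9, 6.3, 4.0, 2.3)]
# The approx. 6 arcmin optimal fringe thickness (0.1 deg) is under 1 mm:
fringe_cm = deg_to_cm(6.0 / 60.0)               # about 0.09 cm
```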
because of the smaller luminance contrast between purple and red (Pinna et al., 2001, 2003 [19,20]) than between purple and orange in the previous condition (ii). (v) Finally, the same physical color as the orange fringe uniformly covering the area of the frame between the two squares. This is a control demonstrating that the watercolor illusion cannot be reduced to a condition where coloration is the only property involved; it shows that the figural property is not necessarily linked to the coloration effect. Thus, the results for condition (v) were expected to be similar to those of the purple-only condition (i). The stimuli were hand-drawn using a graphic tablet. The CIE x,y chromaticity coordinates of the chromatic components of the patterns were: purple, 0.30, 0.23; orange, 0.57, 0.42; red, 0.62, 0.34. Stimuli were presented under Osram Daylight fluorescent light (250 lux, 5600 K) and were observed binocularly from a distance of 50 cm with freely moving eyes.

2.1.3. Procedure

There was a training period preceding each experiment to familiarize subjects with the task. During practice, subjects viewed some well-known figures from the literature (e.g., face-vase) to familiarize them with the concepts of figure and
ground. They practiced scaling the relative strength or salience of each figure using percentages. The task was to report what was figure and what was ground. In addition, subjects quantified the relative strength (in percent) with which a given surface was perceived as figure or ground. Observation time was unlimited, but responses were prompt. Each stimulus was presented in a random sequence that was different for each subject.

2.1.4. Results

In Figure 7, mean ratings (in percent) of the frame being perceived as a figure are plotted for the five edge conditions, with frame width as a parameter. In the purple-only condition (i), the frame was perceived as a figure only when its width was smaller than the size of the inner square. This result was expected because of the proximity factor, which groups the stimulus to be perceived as a frame and not as a square inside another square. According to the proximity principle, when the frame was wider than the size of the inner square, the inner square appeared as a figure; the surrounding area became part of the larger square, which completed itself amodally behind the small square. The double organization perceived in the purple-only condition was a good control for the other conditions. When orange fringes were added to the inner edges of the frame to produce watercolor spreading (ii), the frame was always perceived as a figure. The opposite result was obtained when the orange fringes were added to the inside edges of the inner square (iii): under these conditions, the inner square always appeared as a figure, even when the frame was so narrow that the proximity factor should have organized it to be perceived as a figure. Differently from the purple-contour-only condition (i), where the width determined what (frame or square) was perceived as a figure, in the latter two conditions (ii and iii) the watercolor illusion won irrespective of the width. When red fringes were added to the inside edges of the frame, watercolor won again.
However, the figural effect depending on the red line was weaker than the effect depending on the orange fringes (condition ii). Finally, as expected, when the orange color was physically and uniformly added to the area of the frame, the results were not significantly different from the purple-only condition. On the basis of these results, the watercolor illusion imparts not only a coloration effect but also an independent figural effect. A two-way ANOVA revealed that the relative strength (in percent) of the frame being perceived as a figure changed significantly depending on the size of the frame (F(3,260) = 472.891, p